From glider at google.com Mon Dec 26 10:16:00 2011
From: glider at google.com (Alexander Potapenko)
Date: Mon, 26 Dec 2011 20:16:00 +0400
Subject: [llvm-commits] [PATCH] AddressSanitizer: disallow inlining the __asan_report_* functions
Message-ID:

Hi all,

The attached patch marks the __asan_report_{load,store}{1,2,4,8,16}
functions as noinline, thus making sure they'll be present in the
resulting binary (this is necessary e.g. for Chrome).
Without that, it turned out that the calls to those functions from
__asan_force_interface_symbols were inlined.

--
Alexander Potapenko
Software Engineer
Google Moscow
-------------- next part --------------
A non-text attachment was scrubbed...
Name: asan-noinline.patch
Type: text/x-patch
Size: 957 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111226/88662ff0/attachment.bin

From nicholas at mxc.ca Mon Dec 26 14:37:41 2011
From: nicholas at mxc.ca (Nick Lewycky)
Date: Mon, 26 Dec 2011 20:37:41 -0000
Subject: [llvm-commits] [llvm] r147279 - /llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp
Message-ID: <20111226203741.40AA02A6C12C@llvm.org>

Author: nicholas
Date: Mon Dec 26 14:37:40 2011
New Revision: 147279

URL: http://llvm.org/viewvc/llvm-project?rev=147279&view=rev
Log:
Sort includes, canonicalize whitespace, fix typos. No functionality change.

Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp?rev=147279&r1=147278&r2=147279&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Mon Dec 26 14:37:40 2011 @@ -14,11 +14,11 @@ #define DEBUG_TYPE "simplifycfg" #include "llvm/Transforms/Utils/Local.h" #include "llvm/Constants.h" +#include "llvm/DerivedTypes.h" +#include "llvm/GlobalVariable.h" #include "llvm/Instructions.h" #include "llvm/IntrinsicInst.h" #include "llvm/Type.h" -#include "llvm/DerivedTypes.h" -#include "llvm/GlobalVariable.h" #include "llvm/Analysis/InstructionSimplify.h" #include "llvm/Analysis/ValueTracking.h" #include "llvm/Target/TargetData.h" @@ -431,9 +431,9 @@ return 0; } - + static void EraseTerminatorInstAndDCECond(TerminatorInst *TI) { - Instruction* Cond = 0; + Instruction *Cond = 0; if (SwitchInst *SI = dyn_cast(TI)) { Cond = dyn_cast(SI->getCondition()); } else if (BranchInst *BI = dyn_cast(TI)) { @@ -1480,7 +1480,7 @@ // Ignore dbg intrinsics. while (isa(FrontIt)) ++FrontIt; - + // Allow a single instruction to be hoisted in addition to the compare // that feeds the branch. 
We later ensure that any values that _it_ uses // were also live in the predecessor, so that we don't unnecessarily create @@ -1558,7 +1558,7 @@ SmallPtrSet UsedValues; for (Instruction::op_iterator OI = BonusInst->op_begin(), OE = BonusInst->op_end(); OI != OE; ++OI) { - Value* V = *OI; + Value *V = *OI; if (!isa(V)) UsedValues.insert(V); } @@ -2365,7 +2365,7 @@ if (SI->getSuccessor(0) == BB) { std::map > Popularity; for (unsigned i = 1, e = SI->getNumCases(); i != e; ++i) { - std::pair& entry = + std::pair &entry = Popularity[SI->getSuccessor(i)]; if (entry.first == 0) { entry.first = 1; @@ -2677,8 +2677,8 @@ if (ICI->isEquality() && isa(ICI->getOperand(1))) { for (++I; isa(I); ++I) ; - if (I->isTerminator() - && TryToSimplifyUncondBranchWithICmpInIt(ICI, TD, Builder)) + if (I->isTerminator() && + TryToSimplifyUncondBranchWithICmpInIt(ICI, TD, Builder)) return true; } @@ -2755,8 +2755,8 @@ if (FoldCondBranchOnPHI(BI, TD)) return SimplifyCFG(BB) | true; - // If this basic block is ONLY a setcc and a branch, and if a predecessor - // branches to us and one of our successors, fold the setcc into the + // If this basic block is ONLY a compare and a branch, and if a predecessor + // branches to us and one of our successors, fold the comparison into the // predecessor and use logical operations to pick the right destination. if (FoldBranchToCommonDest(BI)) return SimplifyCFG(BB) | true; @@ -2810,7 +2810,7 @@ } /// If BB has an incoming value that will always trigger undefined behavior -/// (eg. null pointer derefence), remove the branch leading here. +/// (eg. null pointer dereference), remove the branch leading here. 
static bool removeUndefIntroducingPredecessor(BasicBlock *BB) { for (BasicBlock::iterator i = BB->begin(); PHINode *PHI = dyn_cast(i); ++i) From nicholas at mxc.ca Mon Dec 26 14:54:14 2011 From: nicholas at mxc.ca (Nick Lewycky) Date: Mon, 26 Dec 2011 20:54:14 -0000 Subject: [llvm-commits] [llvm] r147280 - in /llvm/trunk: lib/Transforms/Utils/SimplifyCFG.cpp test/Transforms/SimplifyCFG/preserve-branchweights.ll Message-ID: <20111226205414.E04622A6C12E@llvm.org> Author: nicholas Date: Mon Dec 26 14:54:14 2011 New Revision: 147280 URL: http://llvm.org/viewvc/llvm-project?rev=147280&view=rev Log: Update the branch weight metadata when reversing the order of a branch. Added: llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp?rev=147280&r1=147279&r2=147280&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Mon Dec 26 14:54:14 2011 @@ -1603,10 +1603,7 @@ } PBI->setCondition(NewCond); - BasicBlock *OldTrue = PBI->getSuccessor(0); - BasicBlock *OldFalse = PBI->getSuccessor(1); - PBI->setSuccessor(0, OldFalse); - PBI->setSuccessor(1, OldTrue); + PBI->swapSuccessors(); } // If we have a bonus inst, clone it into the predecessor block. 
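
As a rough sketch of why the weights have to follow the successors: the toy model below inverts a branch condition and swaps both the successors and their weights, so each weight stays attached to the destination it described. ToyBranch, target and invertAndSwap are hypothetical names, not LLVM API, and this is only an illustration of the idea behind swapSuccessors(), not its implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Toy stand-in for a conditional branch; this is not LLVM's BranchInst.
struct ToyBranch {
  bool condition;       // value the branch tests
  int successors[2];    // [0] is taken when condition is true, [1] when false
  uint32_t weights[2];  // branch weights, in the same order as successors
};

// Block the branch currently transfers control to.
inline int target(const ToyBranch &b) {
  return b.condition ? b.successors[0] : b.successors[1];
}

// Invert the condition and swap the successors so the target is unchanged.
// The weights are swapped as well, so each weight stays with its destination.
inline void invertAndSwap(ToyBranch &b) {
  b.condition = !b.condition;
  std::swap(b.successors[0], b.successors[1]);
  std::swap(b.weights[0], b.weights[1]);
}
```

Starting from weights {1, 2}, the swapped branch carries {2, 1}, which is the shape the CHECK line in the new test expects.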
Added: llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll?rev=147280&view=auto ============================================================================== --- llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll (added) +++ llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll Mon Dec 26 14:54:14 2011 @@ -0,0 +1,26 @@ +; RUN: opt -simplifycfg -S -o - < %s | FileCheck %s + +declare void @helper(i32) + +define void @test1(i1 %a, i1 %b) { +; CHECK @test1 +entry: + br i1 %a, label %Y, label %X, !prof !0 +; CHECK: br i1 %or.cond, label %Z, label %Y, !prof !0 + +X: + %c = or i1 %b, false + br i1 %c, label %Z, label %Y + +Y: + call void @helper(i32 0) + ret void + +Z: + call void @helper(i32 1) + ret void +} + +!0 = metadata !{metadata !"branch_weights", i32 1, i32 2} + +; CHECK: !0 = metadata !{metadata !"branch_weights", i32 2, i32 1} From eli.friedman at gmail.com Mon Dec 26 16:49:33 2011 From: eli.friedman at gmail.com (Eli Friedman) Date: Mon, 26 Dec 2011 22:49:33 -0000 Subject: [llvm-commits] [llvm] r147283 - in /llvm/trunk: lib/CodeGen/SelectionDAG/DAGCombiner.cpp test/CodeGen/ARM/2011-11-29-128bitArithmetics.ll test/CodeGen/X86/2011-12-26-extractelement-duplicate-load.ll Message-ID: <20111226224933.234A62A6C12D@llvm.org> Author: efriedma Date: Mon Dec 26 16:49:32 2011 New Revision: 147283 URL: http://llvm.org/viewvc/llvm-project?rev=147283&view=rev Log: Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2. 
Added: llvm/trunk/test/CodeGen/X86/2011-12-26-extractelement-duplicate-load.ll Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/trunk/test/CodeGen/ARM/2011-11-29-128bitArithmetics.ll Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=147283&r1=147282&r2=147283&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Mon Dec 26 16:49:32 2011 @@ -6905,6 +6905,10 @@ EVT LVT = ExtVT; if (InVec.getOpcode() == ISD::BITCAST) { + // Don't duplicate a load with other uses. + if (!InVec.hasOneUse()) + return SDValue(); + EVT BCVT = InVec.getOperand(0).getValueType(); if (!BCVT.isVector() || ExtVT.bitsGT(BCVT.getVectorElementType())) return SDValue(); @@ -6922,12 +6926,20 @@ } else if (InVec.getOpcode() == ISD::SCALAR_TO_VECTOR && InVec.getOperand(0).getValueType() == ExtVT && ISD::isNormalLoad(InVec.getOperand(0).getNode())) { + // Don't duplicate a load with other uses. + if (!InVec.hasOneUse()) + return SDValue(); + LN0 = cast(InVec.getOperand(0)); } else if ((SVN = dyn_cast(InVec))) { // (vextract (vector_shuffle (load $addr), v2, <1, u, u, u>), 1) // => // (load $addr+1*size) + // Don't duplicate a load with other uses. + if (!InVec.hasOneUse()) + return SDValue(); + // If the bit convert changed the number of elements, it is unsafe // to examine the mask. if (BCNumEltsChanged) @@ -6938,14 +6950,21 @@ int Idx = (Elt > (int)NumElems) ? -1 : SVN->getMaskElt(Elt); InVec = (Idx < (int)NumElems) ? InVec.getOperand(0) : InVec.getOperand(1); - if (InVec.getOpcode() == ISD::BITCAST) + if (InVec.getOpcode() == ISD::BITCAST) { + // Don't duplicate a load with other uses. 
+ if (!InVec.hasOneUse()) + return SDValue(); + InVec = InVec.getOperand(0); + } if (ISD::isNormalLoad(InVec.getNode())) { LN0 = cast(InVec); Elt = (Idx < (int)NumElems) ? Idx : Idx - (int)NumElems; } } + // Make sure we found a non-volatile load and the extractelement is + // the only use. if (!LN0 || !LN0->hasNUsesOfValue(1,0) || LN0->isVolatile()) return SDValue(); @@ -6982,6 +7001,9 @@ // The replacement we need to do here is a little tricky: we need to // replace an extractelement of a load with a load. // Use ReplaceAllUsesOfValuesWith to do the replacement. + // Note that this replacement assumes that the extractvalue is the only + // use of the load; that's okay because we don't want to perform this + // transformation in other cases anyway. SDValue Load = DAG.getLoad(LVT, N->getDebugLoc(), LN0->getChain(), NewPtr, LN0->getPointerInfo().getWithOffset(PtrOff), LN0->isVolatile(), LN0->isNonTemporal(), Modified: llvm/trunk/test/CodeGen/ARM/2011-11-29-128bitArithmetics.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/2011-11-29-128bitArithmetics.ll?rev=147283&r1=147282&r2=147283&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/2011-11-29-128bitArithmetics.ll (original) +++ llvm/trunk/test/CodeGen/ARM/2011-11-29-128bitArithmetics.ll Mon Dec 26 16:49:32 2011 @@ -8,11 +8,11 @@ ; CHECK: movw r1, :lower16:{{.*}} ; CHECK: movt r1, :upper16:{{.*}} -; CHECK: vldmia r1, {[[short0:s[0-9]+]], [[short1:s[0-9]+]], [[short2:s[0-9]+]], [[short3:s[0-9]+]]} -; CHECK: vsqrt.f32 {{s[0-9]+}}, [[short3]] -; CHECK: vsqrt.f32 {{s[0-9]+}}, [[short2]] -; CHECK: vsqrt.f32 {{s[0-9]+}}, [[short1]] -; CHECK: vsqrt.f32 {{s[0-9]+}}, [[short0]] +; CHECK: vldmia r1 +; CHECK: vsqrt.f32 {{s[0-9]+}}, {{s[0-9]+}} +; CHECK: vsqrt.f32 {{s[0-9]+}}, {{s[0-9]+}} +; CHECK: vsqrt.f32 {{s[0-9]+}}, {{s[0-9]+}} +; CHECK: vsqrt.f32 {{s[0-9]+}}, {{s[0-9]+}} ; CHECK: vstmia {{.*}} L.entry: Added: 
llvm/trunk/test/CodeGen/X86/2011-12-26-extractelement-duplicate-load.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2011-12-26-extractelement-duplicate-load.ll?rev=147283&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/2011-12-26-extractelement-duplicate-load.ll (added) +++ llvm/trunk/test/CodeGen/X86/2011-12-26-extractelement-duplicate-load.ll Mon Dec 26 16:49:32 2011 @@ -0,0 +1,16 @@ +; RUN: llc -march=x86-64 -mattr=-sse42,+sse41 < %s | FileCheck %s +; Make sure we don't load from the location pointed to by %p +; twice: it has non-obvious performance implications, and +; the relevant transformation doesn't know how to update +; the chains correctly. +; PR10747 + +; CHECK: test: +; CHECK: pextrd $2, %xmm +define <4 x i32> @test(<4 x i32>* %p) { + %v = load <4 x i32>* %p + %e = extractelement <4 x i32> %v, i32 2 + %cmp = icmp eq i32 %e, 3 + %sel = select i1 %cmp, <4 x i32> %v, <4 x i32> zeroinitializer + ret <4 x i32> %sel +} From rafael.espindola at gmail.com Mon Dec 26 17:12:42 2011 From: rafael.espindola at gmail.com (Rafael Espindola) Date: Mon, 26 Dec 2011 23:12:42 -0000 Subject: [llvm-commits] [llvm] r147284 - /llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp Message-ID: <20111226231242.BD6E12A6C12D@llvm.org> Author: rafael Date: Mon Dec 26 17:12:42 2011 New Revision: 147284 URL: http://llvm.org/viewvc/llvm-project?rev=147284&view=rev Log: Fix warning. 
Modified:
    llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp

Modified: llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp?rev=147284&r1=147283&r2=147284&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp Mon Dec 26 17:12:42 2011
@@ -938,13 +938,14 @@
   void run(AllocaInst *AI, const SmallVectorImpl<Instruction*> &Insts) {
     // Remember which alloca we're promoting (for isInstInList).
     this->AI = AI;
-    if (MDNode *DebugNode = MDNode::getIfExists(AI->getContext(), AI))
+    if (MDNode *DebugNode = MDNode::getIfExists(AI->getContext(), AI)) {
       for (Value::use_iterator UI = DebugNode->use_begin(),
              E = DebugNode->use_end(); UI != E; ++UI)
         if (DbgDeclareInst *DDI = dyn_cast<DbgDeclareInst>(*UI))
           DDIs.push_back(DDI);
         else if (DbgValueInst *DVI = dyn_cast<DbgValueInst>(*UI))
           DVIs.push_back(DVI);
+    }
     LoadAndStorePromoter::run(Insts);
     AI->eraseFromParent();

From nicholas at mxc.ca Mon Dec 26 19:17:40 2011
From: nicholas at mxc.ca (Nick Lewycky)
Date: Tue, 27 Dec 2011 01:17:40 -0000
Subject: [llvm-commits] [llvm] r147285 - /llvm/trunk/lib/VMCore/Metadata.cpp
Message-ID: <20111227011740.6FFCC2A6C12C@llvm.org>

Author: nicholas
Date: Mon Dec 26 19:17:40 2011
New Revision: 147285

URL: http://llvm.org/viewvc/llvm-project?rev=147285&view=rev
Log:
Using Inst->setMetadata(..., NULL) should be safe to remove metadata even when
there is none of that type to remove. This fixes a crasher in the particular case
where the instruction has metadata but no metadata storage in the context (this
is only possible if the instruction has !dbg but no other metadata info).

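
A toy model of the pattern described in this log message, assuming a side table keyed by instruction plus a per-instruction flag saying whether the table has an entry. ToyInst and ToyMetadataStore are hypothetical names and the model ignores !dbg entirely; it only sketches how "remove" can become a harmless no-op when there is nothing stored, and it is not how Metadata.cpp is written.

```cpp
#include <cassert>
#include <map>
#include <unordered_map>

// Toy instruction: only tracks whether the side table has an entry for it.
struct ToyInst {
  bool hasHashEntry = false;
};

// Toy side table mapping an instruction to its (kind -> value) metadata.
class ToyMetadataStore {
  std::unordered_map<const ToyInst *, std::map<unsigned, int>> store;

public:
  void set(ToyInst &inst, unsigned kind, int value) {
    store[&inst][kind] = value;
    inst.hasHashEntry = true;
  }

  // Removing a kind that is absent must be a harmless no-op.
  void remove(ToyInst &inst, unsigned kind) {
    // The flag and the table must agree, whichever state they are in.
    assert(inst.hasHashEntry == (store.count(&inst) != 0) &&
           "hash-entry flag out of date");
    if (!inst.hasHashEntry)
      return; // Nothing to remove.
    std::map<unsigned, int> &info = store[&inst];
    info.erase(kind);
    if (info.empty()) {
      // Last entry gone: drop the table slot and clear the flag.
      store.erase(&inst);
      inst.hasHashEntry = false;
    }
  }

  bool has(const ToyInst &inst, unsigned kind) const {
    auto it = store.find(&inst);
    return it != store.end() && it->second.count(kind) != 0;
  }
};
```

The point of the sketch is the assertion: it checks that the flag and the table agree rather than requiring both to be set, so calling remove on an instruction with no entry passes the check and returns early.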
Modified:
    llvm/trunk/lib/VMCore/Metadata.cpp

Modified: llvm/trunk/lib/VMCore/Metadata.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Metadata.cpp?rev=147285&r1=147284&r2=147285&view=diff
==============================================================================
--- llvm/trunk/lib/VMCore/Metadata.cpp (original)
+++ llvm/trunk/lib/VMCore/Metadata.cpp Mon Dec 26 19:17:40 2011
@@ -470,9 +470,11 @@
   }
 
   // Otherwise, we're removing metadata from an instruction.
-  assert(hasMetadataHashEntry() &&
-         getContext().pImpl->MetadataStore.count(this) &&
+  assert((hasMetadataHashEntry() ==
+          getContext().pImpl->MetadataStore.count(this)) &&
          "HasMetadata bit out of date!");
+  if (!hasMetadataHashEntry())
+    return;  // Nothing to remove!
   LLVMContextImpl::MDMapTy &Info = getContext().pImpl->MetadataStore[this];
 
   // Common case is removing the only entry.

From jcarter at mips.com Mon Dec 26 19:40:52 2011
From: jcarter at mips.com (Carter, Jack)
Date: Tue, 27 Dec 2011 01:40:52 +0000
Subject: [llvm-commits] [Mips] Request for review: redundant code elimination
Message-ID: <86AC779C188FE74F88F6494478B46332E8F6D0@exchdb03.mips.com>

getRegisterNumbering.patch

This patch takes out a redundant table. It does not affect output and thus
there is no attached test case.

contributor: Jack Carter

lib/Target/Mips/MipsAsmPrinter.cpp
lib/Target/Mips/MipsCodeEmitter.cpp
lib/Target/Mips/MipsRegisterInfo.cpp
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/c2b01842/attachment.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: getRegisterNumbering.patch Type: text/x-patch Size: 5704 bytes Desc: getRegisterNumbering.patch Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/c2b01842/attachment.bin From nicholas at mxc.ca Mon Dec 26 22:31:52 2011 From: nicholas at mxc.ca (Nick Lewycky) Date: Tue, 27 Dec 2011 04:31:52 -0000 Subject: [llvm-commits] [llvm] r147286 - in /llvm/trunk: lib/Transforms/Utils/SimplifyCFG.cpp test/Transforms/SimplifyCFG/preserve-branchweights.ll Message-ID: <20111227043152.F2AE42A6C12C@llvm.org> Author: nicholas Date: Mon Dec 26 22:31:52 2011 New Revision: 147286 URL: http://llvm.org/viewvc/llvm-project?rev=147286&view=rev Log: Teach simplifycfg to recompute branch weights when merging some branches, and to discard weights when appropriate. Still more to do (and a new TODO), but it's a start! Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp?rev=147286&r1=147285&r2=147286&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Mon Dec 26 22:31:52 2011 @@ -18,6 +18,8 @@ #include "llvm/GlobalVariable.h" #include "llvm/Instructions.h" #include "llvm/IntrinsicInst.h" +#include "llvm/LLVMContext.h" +#include "llvm/Metadata.h" #include "llvm/Type.h" #include "llvm/Analysis/InstructionSimplify.h" #include "llvm/Analysis/ValueTracking.h" @@ -1462,6 +1464,26 @@ return true; } +/// ExtractBranchMetadata - Given a conditional BranchInstruction, retrieve the +/// probabilities of the branch taking each edge. Fills in the two APInt +/// parameters and return true, or returns false if no or invalid metadata was +/// found. 
+static bool ExtractBranchMetadata(BranchInst *BI,
+                                  APInt &ProbTrue, APInt &ProbFalse) {
+  assert(BI->isConditional() &&
+         "Looking for probabilities on unconditional branch?");
+  MDNode *ProfileData = BI->getMetadata(LLVMContext::MD_prof);
+  if (!ProfileData || ProfileData->getNumOperands() != 3) return 0;
+  ConstantInt *CITrue = dyn_cast<ConstantInt>(ProfileData->getOperand(1));
+  ConstantInt *CIFalse = dyn_cast<ConstantInt>(ProfileData->getOperand(2));
+  if (!CITrue || !CIFalse) return 0;
+  ProbTrue = CITrue->getValue();
+  ProbFalse = CIFalse->getValue();
+  assert(ProbTrue.getBitWidth() == 32 && ProbFalse.getBitWidth() == 32 &&
+         "Branch probability metadata must be 32-bit integers");
+  return true;
+}
+
 /// FoldBranchToCommonDest - If this basic block is simple enough, and if a
 /// predecessor branches to us and one of our successors, fold the block into
 /// the predecessor and use logical operations to pick the right destination.
@@ -1636,6 +1658,51 @@
       PBI->setSuccessor(1, FalseDest);
     }
 
+    // TODO: If BB is reachable from all paths through PredBlock, then we
+    // could replace PBI's branch probabilities with BI's.
+
+    // Merge probability data into PredBlock's branch.
+    APInt A, B, C, D;
+    if (ExtractBranchMetadata(PBI, C, D) && ExtractBranchMetadata(BI, A, B)) {
+      // bbA: br bbB (a% probability), bbC (b% prob.)
+      // bbB: br bbD (c% probability), bbC (d% prob.)
+      // --> bbA: br bbD ((a*c)% prob.), bbC ((b+a*d)% prob.)
+      //
+      // Probabilities aren't stored as ratios directly. Converting to
+      // probability-numerator form, we get:
+      // (a*c)% = A*C, (b+(a*d))% = A*D+B*C+B*D.
+
+      bool Overflow1 = false, Overflow2 = false, Overflow3 = false;
+      bool Overflow4 = false, Overflow5 = false, Overflow6 = false;
+      APInt ProbTrue = A.umul_ov(C, Overflow1);
+
+      APInt Tmp1 = A.umul_ov(D, Overflow2);
+      APInt Tmp2 = B.umul_ov(C, Overflow3);
+      APInt Tmp3 = B.umul_ov(D, Overflow4);
+      APInt Tmp4 = Tmp1.uadd_ov(Tmp2, Overflow5);
+      APInt ProbFalse = Tmp4.uadd_ov(Tmp3, Overflow6);
+
+      APInt GCD = APIntOps::GreatestCommonDivisor(ProbTrue, ProbFalse);
+      ProbTrue = ProbTrue.udiv(GCD);
+      ProbFalse = ProbFalse.udiv(GCD);
+
+      if (Overflow1 || Overflow2 || Overflow3 || Overflow4 || Overflow5 ||
+          Overflow6) {
+        DEBUG(dbgs() << "Overflow recomputing branch weight on: " << *PBI
+                     << "when merging with: " << *BI);
+        PBI->setMetadata(LLVMContext::MD_prof, NULL);
+      } else {
+        LLVMContext &Context = BI->getContext();
+        Value *Ops[3];
+        Ops[0] = BI->getMetadata(LLVMContext::MD_prof)->getOperand(0);
+        Ops[1] = ConstantInt::get(Context, ProbTrue);
+        Ops[2] = ConstantInt::get(Context, ProbFalse);
+        PBI->setMetadata(LLVMContext::MD_prof, MDNode::get(Context, Ops));
+      }
+    } else {
+      PBI->setMetadata(LLVMContext::MD_prof, NULL);
+    }
+
     // Copy any debug value intrinsics into the end of PredBlock.
for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I) if (isa(*I)) Modified: llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll?rev=147286&r1=147285&r2=147286&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll (original) +++ llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll Mon Dec 26 22:31:52 2011 @@ -10,6 +10,45 @@ X: %c = or i1 %b, false + br i1 %c, label %Z, label %Y, !prof !1 + +Y: + call void @helper(i32 0) + ret void + +Z: + call void @helper(i32 1) + ret void +} + +define void @test2(i1 %a, i1 %b) { +; CHECK: @test2 +entry: + br i1 %a, label %X, label %Y, !prof !1 +; CHECK: br i1 %or.cond, label %Z, label %Y, !prof !1 +; CHECK-NOT: !prof + +X: + %c = or i1 %b, false + br i1 %c, label %Z, label %Y, !prof !2 + +Y: + call void @helper(i32 0) + ret void + +Z: + call void @helper(i32 1) + ret void +} + +define void @test3(i1 %a, i1 %b) { +; CHECK: @test3 +; CHECK-NOT: !prof +entry: + br i1 %a, label %X, label %Y, !prof !1 + +X: + %c = or i1 %b, false br i1 %c, label %Z, label %Y Y: @@ -21,6 +60,29 @@ ret void } -!0 = metadata !{metadata !"branch_weights", i32 1, i32 2} +define void @test4(i1 %a, i1 %b) { +; CHECK: @test4 +; CHECK-NOT: !prof +entry: + br i1 %a, label %X, label %Y + +X: + %c = or i1 %b, false + br i1 %c, label %Z, label %Y, !prof !1 + +Y: + call void @helper(i32 0) + ret void + +Z: + call void @helper(i32 1) + ret void +} + +!0 = metadata !{metadata !"branch_weights", i32 3, i32 5} +!1 = metadata !{metadata !"branch_weights", i32 1, i32 1} +!2 = metadata !{metadata !"branch_weights", i32 1, i32 2} -; CHECK: !0 = metadata !{metadata !"branch_weights", i32 2, i32 1} +; CHECK: !0 = metadata !{metadata !"branch_weights", i32 5, i32 11} +; CHECK: !1 = metadata !{metadata !"branch_weights", i32 1, 
i32 5} +; CHECK-NOT: !2 From lostphifunction at gmail.com Mon Dec 26 23:04:36 2011 From: lostphifunction at gmail.com (Alexander Malyshev) Date: Tue, 27 Dec 2011 00:04:36 -0500 Subject: [llvm-commits] [PATCH] SimplifyLibCalls.cpp: Small cosine optimization Message-ID: Adds the pattern for cos(-x) -> cos(x). Test file included. Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/1ce83b12/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: cos.diff Type: text/x-patch Size: 3650 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/1ce83b12/attachment.bin From craig.topper at gmail.com Tue Dec 27 00:27:24 2011 From: craig.topper at gmail.com (Craig Topper) Date: Tue, 27 Dec 2011 06:27:24 -0000 Subject: [llvm-commits] [llvm] r147287 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <20111227062724.1A41C2A6C12C@llvm.org> Author: ctopper Date: Tue Dec 27 00:27:23 2011 New Revision: 147287 URL: http://llvm.org/viewvc/llvm-project?rev=147287&view=rev Log: Add handling of x86_avx2_pmovmskb to computeMaskedBitsForTargetNode for consistency. Add comments and an assert for BMI instructions to PerformXorCombine since the enabling of the combine is conditional on it, but the function itself isn't. 
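
A small sketch of the known-zero-bits reasoning this commit extends, assuming a 32-bit result: if only the low NumLoBits of a movmsk-style result can ever be set, every bit above them is known to be zero. knownZeroHighBits is a hypothetical helper that plays the role of APInt::getHighBitsSet(32, 32 - NumLoBits); it is not code from X86ISelLowering.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Mask of the bits known to be zero in a 32-bit result whose only
// meaningful bits are the low `numLoBits`.
inline uint32_t knownZeroHighBits(unsigned numLoBits) {
  assert(numLoBits <= 32 && "result is modeled as 32 bits wide");
  if (numLoBits == 32)
    return 0; // every bit may be set; shifting by 32 would also be undefined
  return ~uint32_t(0) << numLoBits;
}
```

With the values in the diff that follows, x86_sse2_pmovmskb_128 (NumLoBits = 16) leaves the top 16 bits known zero, while x86_avx2_pmovmskb (NumLoBits = 32) leaves no known-zero bits in a 32-bit result.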
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=147287&r1=147286&r2=147287&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Dec 27 00:27:23 2011 @@ -12621,7 +12621,8 @@ case Intrinsic::x86_sse2_movmsk_pd: case Intrinsic::x86_avx_movmsk_pd_256: case Intrinsic::x86_mmx_pmovmskb: - case Intrinsic::x86_sse2_pmovmskb_128: { + case Intrinsic::x86_sse2_pmovmskb_128: + case Intrinsic::x86_avx2_pmovmskb: { // High bits of movmskp{s|d}, pmovmskb are known zero. switch (IntId) { case Intrinsic::x86_sse_movmsk_ps: NumLoBits = 4; break; @@ -12630,6 +12631,7 @@ case Intrinsic::x86_avx_movmsk_pd_256: NumLoBits = 4; break; case Intrinsic::x86_mmx_pmovmskb: NumLoBits = 8; break; case Intrinsic::x86_sse2_pmovmskb_128: NumLoBits = 16; break; + case Intrinsic::x86_avx2_pmovmskb: NumLoBits = 32; break; } KnownZero = APInt::getHighBitsSet(Mask.getBitWidth(), Mask.getBitWidth() - NumLoBits); @@ -13856,6 +13858,7 @@ return SDValue(); } +// PerformXorCombine - Attempts to turn XOR nodes into BLSMSK nodes static SDValue PerformXorCombine(SDNode *N, SelectionDAG &DAG, TargetLowering::DAGCombinerInfo &DCI, const X86Subtarget *Subtarget) { @@ -13867,6 +13870,8 @@ if (VT != MVT::i32 && VT != MVT::i64) return SDValue(); + assert(Subtarget->hasBMI() && "Creating BLSMSK requires BMI instructions"); + // Create BLSMSK instructions by finding X ^ (X-1) SDValue N0 = N->getOperand(0); SDValue N1 = N->getOperand(1); From nicholas at mxc.ca Tue Dec 27 00:32:19 2011 From: nicholas at mxc.ca (Nick Lewycky) Date: Mon, 26 Dec 2011 22:32:19 -0800 Subject: [llvm-commits] [PATCH] SimplifyLibCalls.cpp: Small cosine optimization In-Reply-To: References: Message-ID: <4EF96673.7030409@mxc.ca> On 12/26/2011 
09:04 PM, Alexander Malyshev wrote:
> Adds the pattern for cos(-x) -> cos(x). Test file included.

+      // cos(-x) -> cos(x)
+      Value *Op1 = CI->getArgOperand(0);
+      if (BinaryOperator *BinExpr = dyn_cast<BinaryOperator>(Op1)) {
+        if (ConstantFP *C = dyn_cast<ConstantFP>(BinExpr->getOperand(0))) {
+          if (BinExpr->getOpcode() == Instruction::FSub &&
+              C->getValueAPF().isZero()) {

I think you can simplify this using BinExpr->isFNeg()?

This looks great overall, if that simplification works please resend an
updated patch!

Nick

+            Value *X = BinExpr->getOperand(1);
+            return B.CreateCall(Callee, X, "");
+          }
+        }
+      }

>
> Alex
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From lostphifunction at gmail.com Tue Dec 27 01:03:34 2011
From: lostphifunction at gmail.com (Alexander Malyshev)
Date: Tue, 27 Dec 2011 02:03:34 -0500
Subject: [llvm-commits] [PATCH] SimplifyLibCalls.cpp: Small cosine optimization
In-Reply-To: <4EF96673.7030409@mxc.ca>
References: <4EF96673.7030409@mxc.ca>
Message-ID:

Thanks, and yeah it looks cleaner now.

Alex

On Tue, Dec 27, 2011 at 1:32 AM, Nick Lewycky wrote:

> On 12/26/2011 09:04 PM, Alexander Malyshev wrote:
>
>> Adds the pattern for cos(-x) -> cos(x). Test file included.
>>
>
> +      // cos(-x) -> cos(x)
> +      Value *Op1 = CI->getArgOperand(0);
> +      if (BinaryOperator *BinExpr = dyn_cast<BinaryOperator>(Op1)) {
> +        if (ConstantFP *C = dyn_cast<ConstantFP>(BinExpr->getOperand(0))) {
> +          if (BinExpr->getOpcode() == Instruction::FSub &&
> +              C->getValueAPF().isZero()) {
>
> I think you can simplify this using BinExpr->isFNeg()?
>
> This looks great overall, if that simplification works please resend an
> updated patch!
>
> Nick
>
> +            Value *X = BinExpr->getOperand(1);
> +            return B.CreateCall(Callee, X, "");
> +          }
> +        }
> +      }
>
>
>> Alex
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/05d96c7a/attachment.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cos.diff
Type: text/x-patch
Size: 3462 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/05d96c7a/attachment.bin

From glider at google.com Tue Dec 27 02:24:48 2011
From: glider at google.com (Alexander Potapenko)
Date: Tue, 27 Dec 2011 12:24:48 +0400
Subject: [llvm-commits] [PATCH] AddressSanitizer: avoid name conflicts between multiple mach_override instances.
Message-ID:

Hi,

The code instrumented with ASan may have its own instance of the
mach_override library. In this case chances are that functions from it
will be called from mach_override_ptr() during ASan initialization.
This may lead to crashes (if those functions are instrumented) or
incorrect behavior (if the implementations differ).

The attached patch renames mach_override_ptr() into
__asan_mach_override_ptr() and makes the rest of the mach_override
internals hidden.

The corresponding AddressSanitizer bug is
http://code.google.com/p/address-sanitizer/issues/detail?id=22

--
Alexander Potapenko
Software Engineer
Google Moscow
-------------- next part --------------
A non-text attachment was scrubbed...
Name: asan-mach-hidden.patch
Type: text/x-patch
Size: 4392 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/cf10a77a/attachment.bin

From baldrick at free.fr Tue Dec 27 04:27:44 2011
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 27 Dec 2011 11:27:44 +0100
Subject: [llvm-commits] [llvm] r147142 - in /llvm/trunk: lib/Analysis/BranchProbabilityInfo.cpp test/Analysis/BranchProbabilityInfo/noreturn.ll
In-Reply-To: <20111222092637.B42A82A6C12C@llvm.org>
References: <20111222092637.B42A82A6C12C@llvm.org>
Message-ID: <4EF99DA0.70804@free.fr>

Hi Chandler,

> Make the unreachable probability much much heavier. The previous
> probability wouldn't be considered "hot" in some weird loop structures
> or other compounding probability patterns. This makes it much harder to
> confuse, but isn't really a principled fix. I'd actually like it if we
> could model a zero probability, as it would make this much easier to
> reason about. Suggestions for how to do this better are welcome.

a call to a function that only throws an exception will usually be followed
by unreachable. Would giving unreachable zero probability mean that throwing
the exception is considered to occur with probability zero? While throwing
exceptions is fairly rare, it does happen.

Ciao, Duncan.
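
For reference, the arithmetic behind the probabilities in the quoted test updates can be sketched as "edge weight over total weight". The helper and constant names below are hypothetical. The non-taken weight 1024*1024 - 1 comes from the quoted diff, while the weight of 1 for the edge leading to unreachable is an assumption inferred from the "1 / 1048576" CHECK lines rather than something shown in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a probability kept as numerator / denominator.
struct EdgeProbability {
  uint64_t numerator;
  uint64_t denominator;
};

// Probability of taking edge `index` = its weight / sum of all edge weights.
inline EdgeProbability edgeProbability(const std::vector<uint32_t> &weights,
                                       std::size_t index) {
  assert(index < weights.size() && "edge index out of range");
  uint64_t total = 0;
  for (uint32_t w : weights)
    total += w;
  return EdgeProbability{weights[index], total};
}

// Weight from the quoted diff for the edge that avoids unreachable.
static const uint32_t kNonTakenWeight = 1024 * 1024 - 1;
// Assumed weight for the edge that leads to unreachable (inferred from the
// "1 / 1048576" CHECK lines, not shown in this message).
static const uint32_t kTakenWeight = 1;
```

With two successors this gives 1048575 / 1048576 for the normal edge, and with one normal edge plus four unreachable-bound switch cases the denominator becomes 1048579, matching the quoted CHECK lines. Any positive weight still yields a small but non-zero probability, which is the distinction the question above turns on.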
> > Modified: > llvm/trunk/lib/Analysis/BranchProbabilityInfo.cpp > llvm/trunk/test/Analysis/BranchProbabilityInfo/noreturn.ll > > Modified: llvm/trunk/lib/Analysis/BranchProbabilityInfo.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BranchProbabilityInfo.cpp?rev=147142&r1=147141&r2=147142&view=diff > ============================================================================== > --- llvm/trunk/lib/Analysis/BranchProbabilityInfo.cpp (original) > +++ llvm/trunk/lib/Analysis/BranchProbabilityInfo.cpp Thu Dec 22 03:26:37 2011 > @@ -65,8 +65,9 @@ > /// > /// This is the weight for a branch not being taken toward a block that > /// terminates (eventually) in unreachable. Such a branch is essentially never > -/// taken. > -static const uint32_t UR_NONTAKEN_WEIGHT = 1023; > +/// taken. Set the weight to an absurdly high value so that nested loops don't > +/// easily subsume it. > +static const uint32_t UR_NONTAKEN_WEIGHT = 1024*1024 - 1; > > static const uint32_t PH_TAKEN_WEIGHT = 20; > static const uint32_t PH_NONTAKEN_WEIGHT = 12; > > Modified: llvm/trunk/test/Analysis/BranchProbabilityInfo/noreturn.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/BranchProbabilityInfo/noreturn.ll?rev=147142&r1=147141&r2=147142&view=diff > ============================================================================== > --- llvm/trunk/test/Analysis/BranchProbabilityInfo/noreturn.ll (original) > +++ llvm/trunk/test/Analysis/BranchProbabilityInfo/noreturn.ll Thu Dec 22 03:26:37 2011 > @@ -8,8 +8,8 @@ > entry: > %cond = icmp eq i32 %a, 42 > br i1 %cond, label %exit, label %abort > -; CHECK: edge entry -> exit probability is 1023 / 1024 > -; CHECK: edge entry -> abort probability is 1 / 1024 > +; CHECK: edge entry -> exit probability is 1048575 / 1048576 > +; CHECK: edge entry -> abort probability is 1 / 1048576 > > abort: > call void @abort() noreturn > @@ -26,11 +26,11 @@ > i32 2, label %case_b > i32 3, label %case_c > i32 4, label %case_d] > 
-; CHECK: edge entry -> exit probability is 1023 / 1027 > -; CHECK: edge entry -> case_a probability is 1 / 1027 > -; CHECK: edge entry -> case_b probability is 1 / 1027 > -; CHECK: edge entry -> case_c probability is 1 / 1027 > -; CHECK: edge entry -> case_d probability is 1 / 1027 > +; CHECK: edge entry -> exit probability is 1048575 / 1048579 > +; CHECK: edge entry -> case_a probability is 1 / 1048579 > +; CHECK: edge entry -> case_b probability is 1 / 1048579 > +; CHECK: edge entry -> case_c probability is 1 / 1048579 > +; CHECK: edge entry -> case_d probability is 1 / 1048579 > > case_a: > br label %case_b > @@ -55,8 +55,8 @@ > entry: > %cond1 = icmp eq i32 %a, 42 > br i1 %cond1, label %exit, label %dom > -; CHECK: edge entry -> exit probability is 1023 / 1024 > -; CHECK: edge entry -> dom probability is 1 / 1024 > +; CHECK: edge entry -> exit probability is 1048575 / 1048576 > +; CHECK: edge entry -> dom probability is 1 / 1048576 > > dom: > %cond2 = icmp ult i32 %a, 42 > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From chandlerc at google.com Tue Dec 27 04:50:30 2011 From: chandlerc at google.com (Chandler Carruth) Date: Tue, 27 Dec 2011 02:50:30 -0800 Subject: [llvm-commits] [llvm] r147286 - in /llvm/trunk: lib/Transforms/Utils/SimplifyCFG.cpp test/Transforms/SimplifyCFG/preserve-branchweights.ll In-Reply-To: <20111227043152.F2AE42A6C12C@llvm.org> References: <20111227043152.F2AE42A6C12C@llvm.org> Message-ID: On Dec 26, 2011 11:40 PM, "Nick Lewycky" wrote: > > Author: nicholas > Date: Mon Dec 26 22:31:52 2011 > New Revision: 147286 > > URL: http://llvm.org/viewvc/llvm-project?rev=147286&view=rev > Log: > Teach simplifycfg to recompute branch weights when merging some branches, and > to discard weights when appropriate. Still more to do (and a new TODO), but > it's a start! 
> > Modified: > llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp > llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll > > Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp?rev=147286&r1=147285&r2=147286&view=diff > ============================================================================== > --- llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp (original) > +++ llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Mon Dec 26 22:31:52 2011 > @@ -18,6 +18,8 @@ > #include "llvm/GlobalVariable.h" > #include "llvm/Instructions.h" > #include "llvm/IntrinsicInst.h" > +#include "llvm/LLVMContext.h" > +#include "llvm/Metadata.h" > #include "llvm/Type.h" > #include "llvm/Analysis/InstructionSimplify.h" > #include "llvm/Analysis/ValueTracking.h" > @@ -1462,6 +1464,26 @@ > return true; > } > > +/// ExtractBranchMetadata - Given a conditional BranchInstruction, retrieve the > +/// probabilities of the branch taking each edge. Fills in the two APInt > +/// parameters and return true, or returns false if no or invalid metadata was > +/// found. > +static bool ExtractBranchMetadata(BranchInst *BI, > + APInt &ProbTrue, APInt &ProbFalse) { > + assert(BI->isConditional() && > + "Looking for probabilities on unconditional branch?"); > + MDNode *ProfileData = BI->getMetadata(LLVMContext::MD_prof); > + if (!ProfileData || ProfileData->getNumOperands() != 3) return 0; return false; // ? > + ConstantInt *CITrue = dyn_cast(ProfileData->getOperand(1)); > + ConstantInt *CIFalse = dyn_cast(ProfileData->getOperand(2)); > + if (!CITrue || !CIFalse) return 0; return false; // ? 
> + ProbTrue = CITrue->getValue(); > + ProbFalse = CIFalse->getValue(); > + assert(ProbTrue.getBitWidth() == 32 && ProbFalse.getBitWidth() == 32 && > + "Branch probability metadata must be 32-bit integers"); > + return true; > +} > + > /// FoldBranchToCommonDest - If this basic block is simple enough, and if a > /// predecessor branches to us and one of our successors, fold the block into > /// the predecessor and use logical operations to pick the right destination. > @@ -1636,6 +1658,51 @@ > PBI->setSuccessor(1, FalseDest); > } > > + // TODO: If BB is reachable from all paths through PredBlock, then we > + // could replace PBI's branch probabilities with BI's. > + > + // Merge probability data into PredBlock's branch. > + APInt A, B, C, D; > + if (ExtractBranchMetadata(PBI, C, D) && ExtractBranchMetadata(BI, A, B)) { > + // bbA: br bbB (a% probability), bbC (b% prob.) > + // bbB: br bbD (c% probability), bbC (d% prob.) I don't understand this comment at all... the association between letters is particularly mysterious. > + // --> bbA: br bbD ((a*c)% prob.), bbC ((b+a*d)% prob.) > + // > + // Probabilities aren't stored as ratios directly. Converting to > + // probability-numerator form, we get: > + // (a*c)% = A*C, (b+(a*d))% = A*D+B*C+B*D. Why is this done with explicit math? At the least it seems like we should be able to form BranchProbability objects to represent the ratio form. Even better would be to use the BranchProbability analysis to compute the ratios from the metadata? 
> + > + bool Overflow1 = false, Overflow2 = false, Overflow3 = false; > + bool Overflow4 = false, Overflow5 = false, Overflow6 = false; > + APInt ProbTrue = A.umul_ov(C, Overflow1); > + > + APInt Tmp1 = A.umul_ov(D, Overflow2); > + APInt Tmp2 = B.umul_ov(C, Overflow3); > + APInt Tmp3 = B.umul_ov(D, Overflow4); > + APInt Tmp4 = Tmp1.uadd_ov(Tmp2, Overflow5); > + APInt ProbFalse = Tmp4.uadd_ov(Tmp3, Overflow6); > + > + APInt GCD = APIntOps::GreatestCommonDivisor(ProbTrue, ProbFalse); > + ProbTrue = ProbTrue.udiv(GCD); > + ProbFalse = ProbFalse.udiv(GCD); > + > + if (Overflow1 || Overflow2 || Overflow3 || Overflow4 || Overflow5 || > + Overflow6) { > + DEBUG(dbgs() << "Overflow recomputing branch weight on: " << *PBI > + << "when merging with: " << *BI); > + PBI->setMetadata(LLVMContext::MD_prof, NULL); > + } else { > + LLVMContext &Context = BI->getContext(); > + Value *Ops[3]; > + Ops[0] = BI->getMetadata(LLVMContext::MD_prof)->getOperand(0); > + Ops[1] = ConstantInt::get(Context, ProbTrue); > + Ops[2] = ConstantInt::get(Context, ProbFalse); > + PBI->setMetadata(LLVMContext::MD_prof, MDNode::get(Context, Ops)); > + } > + } else { > + PBI->setMetadata(LLVMContext::MD_prof, NULL); > + } > + > // Copy any debug value intrinsics into the end of PredBlock. 
> for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I) > if (isa(*I)) > > Modified: llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll?rev=147286&r1=147285&r2=147286&view=diff > ============================================================================== > --- llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll (original) > +++ llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll Mon Dec 26 22:31:52 2011 > @@ -10,6 +10,45 @@ > > X: > %c = or i1 %b, false > + br i1 %c, label %Z, label %Y, !prof !1 > + > +Y: > + call void @helper(i32 0) > + ret void > + > +Z: > + call void @helper(i32 1) > + ret void > +} > + > +define void @test2(i1 %a, i1 %b) { > +; CHECK: @test2 > +entry: > + br i1 %a, label %X, label %Y, !prof !1 > +; CHECK: br i1 %or.cond, label %Z, label %Y, !prof !1 > +; CHECK-NOT: !prof > + > +X: > + %c = or i1 %b, false > + br i1 %c, label %Z, label %Y, !prof !2 > + > +Y: > + call void @helper(i32 0) > + ret void > + > +Z: > + call void @helper(i32 1) > + ret void > +} > + > +define void @test3(i1 %a, i1 %b) { > +; CHECK: @test3 > +; CHECK-NOT: !prof > +entry: > + br i1 %a, label %X, label %Y, !prof !1 > + > +X: > + %c = or i1 %b, false > br i1 %c, label %Z, label %Y > > Y: > @@ -21,6 +60,29 @@ > ret void > } > > -!0 = metadata !{metadata !"branch_weights", i32 1, i32 2} > +define void @test4(i1 %a, i1 %b) { > +; CHECK: @test4 > +; CHECK-NOT: !prof > +entry: > + br i1 %a, label %X, label %Y > + > +X: > + %c = or i1 %b, false > + br i1 %c, label %Z, label %Y, !prof !1 > + > +Y: > + call void @helper(i32 0) > + ret void > + > +Z: > + call void @helper(i32 1) > + ret void > +} > + > +!0 = metadata !{metadata !"branch_weights", i32 3, i32 5} > +!1 = metadata !{metadata !"branch_weights", i32 1, i32 1} > +!2 = metadata !{metadata !"branch_weights", i32 1, i32 2} > > -; CHECK: !0 = metadata 
!{metadata !"branch_weights", i32 2, i32 1} > +; CHECK: !0 = metadata !{metadata !"branch_weights", i32 5, i32 11} > +; CHECK: !1 = metadata !{metadata !"branch_weights", i32 1, i32 5} > +; CHECK-NOT: !2 > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/ff9a49cc/attachment-0001.html From baldrick at free.fr Tue Dec 27 04:53:18 2011 From: baldrick at free.fr (Duncan Sands) Date: Tue, 27 Dec 2011 11:53:18 +0100 Subject: [llvm-commits] [llvm] r147036 - in /llvm/trunk: lib/Analysis/ValueTracking.cpp lib/Transforms/Utils/SimplifyCFG.cpp test/Transforms/SimplifyCFG/SpeculativeExec.ll In-Reply-To: References: <20111221055202.885DE2A6C12C@llvm.org> <05C067F7-5491-4FF6-AA1F-CDCB18D419BB@apple.com> <4EF249CA.9010808@mxc.ca> Message-ID: <4EF9A39E.3070900@free.fr> >>> Just checking mayHaveSideEffects() should cover almost everything here. >> >> Good point! Is there anything that mayHaveSideEffects() would return >> true on that wouldn't be safe to speculate? The comment on >> mayHaveSideEffects claims that a call to malloc would return false, but >> that's a lie, suggesting that an audit of users of these two functions >> is due... > > mayHaveSideEffects is essentially equivalent to mayWriteToMemory, and > that is clearly not the same thing as being safe to speculate. Maybe mayHaveSideEffects should then be eliminated, or changed to check more side-effects? Ciao, Duncan. 
From chandlerc at google.com Tue Dec 27 05:02:12 2011 From: chandlerc at google.com (Chandler Carruth) Date: Tue, 27 Dec 2011 03:02:12 -0800 Subject: [llvm-commits] [llvm] r147142 - in /llvm/trunk: lib/Analysis/BranchProbabilityInfo.cpp test/Analysis/BranchProbabilityInfo/noreturn.ll In-Reply-To: <4EF99DA0.70804@free.fr> References: <20111222092637.B42A82A6C12C@llvm.org> <4EF99DA0.70804@free.fr> Message-ID: On Dec 27, 2011 5:32 AM, "Duncan Sands" wrote: > > Hi Chandler, > > > Make the unreachable probability much much heavier. The previous > > probability wouldn't be considered "hot" in some weird loop structures > > or other compounding probability patterns. This makes it much harder to > > confuse, but isn't really a principled fix. I'd actually like it if we > > could model a zero probability, as it would make this much easier to > > reason about. Suggestions for how to do this better are welcome. > > a call to a function that only throws an exception will usually be followed > by unreachable. Would giving unreachable zero probability mean that throwing > the exception is considered to occur with probability zero? While throwing > exceptions is fairly rare, it does happen. I think to handle this well the probability analysis would need to learn about exceptions and throwing so it can differentiate between abort and throw. I suspect there are other places where we compute poor probabilities in the face of exceptions. Still, treating probably exceptional code paths as having a probability that *approaches* zero (as BPI is never used to prove reachability) seems not entirely unreasonable: it means we will optimize the non-throwing path over the throwing path at any expense. That's the same trade off as zero-cost exceptions? Anyways, I'm not saying it's the right long term model, suggestions here are very welcome. > > Ciao, Duncan.
> > > > > Modified: > > llvm/trunk/lib/Analysis/BranchProbabilityInfo.cpp > > llvm/trunk/test/Analysis/BranchProbabilityInfo/noreturn.ll > > > > Modified: llvm/trunk/lib/Analysis/BranchProbabilityInfo.cpp > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BranchProbabilityInfo.cpp?rev=147142&r1=147141&r2=147142&view=diff > > ============================================================================== > > --- llvm/trunk/lib/Analysis/BranchProbabilityInfo.cpp (original) > > +++ llvm/trunk/lib/Analysis/BranchProbabilityInfo.cpp Thu Dec 22 03:26:37 2011 > > @@ -65,8 +65,9 @@ > > /// > > /// This is the weight for a branch not being taken toward a block that > > /// terminates (eventually) in unreachable. Such a branch is essentially never > > -/// taken. > > -static const uint32_t UR_NONTAKEN_WEIGHT = 1023; > > +/// taken. Set the weight to an absurdly high value so that nested loops don't > > +/// easily subsume it. > > +static const uint32_t UR_NONTAKEN_WEIGHT = 1024*1024 - 1; > > > > static const uint32_t PH_TAKEN_WEIGHT = 20; > > static const uint32_t PH_NONTAKEN_WEIGHT = 12; > > > > Modified: llvm/trunk/test/Analysis/BranchProbabilityInfo/noreturn.ll > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/BranchProbabilityInfo/noreturn.ll?rev=147142&r1=147141&r2=147142&view=diff > > ============================================================================== > > --- llvm/trunk/test/Analysis/BranchProbabilityInfo/noreturn.ll (original) > > +++ llvm/trunk/test/Analysis/BranchProbabilityInfo/noreturn.ll Thu Dec 22 03:26:37 2011 > > @@ -8,8 +8,8 @@ > > entry: > > %cond = icmp eq i32 %a, 42 > > br i1 %cond, label %exit, label %abort > > -; CHECK: edge entry -> exit probability is 1023 / 1024 > > -; CHECK: edge entry -> abort probability is 1 / 1024 > > +; CHECK: edge entry -> exit probability is 1048575 / 1048576 > > +; CHECK: edge entry -> abort probability is 1 / 1048576 > > > > abort: > > call void @abort() noreturn > > @@ -26,11 
+26,11 @@ > > i32 2, label %case_b > > i32 3, label %case_c > > i32 4, label %case_d] > > -; CHECK: edge entry -> exit probability is 1023 / 1027 > > -; CHECK: edge entry -> case_a probability is 1 / 1027 > > -; CHECK: edge entry -> case_b probability is 1 / 1027 > > -; CHECK: edge entry -> case_c probability is 1 / 1027 > > -; CHECK: edge entry -> case_d probability is 1 / 1027 > > +; CHECK: edge entry -> exit probability is 1048575 / 1048579 > > +; CHECK: edge entry -> case_a probability is 1 / 1048579 > > +; CHECK: edge entry -> case_b probability is 1 / 1048579 > > +; CHECK: edge entry -> case_c probability is 1 / 1048579 > > +; CHECK: edge entry -> case_d probability is 1 / 1048579 > > > > case_a: > > br label %case_b > > @@ -55,8 +55,8 @@ > > entry: > > %cond1 = icmp eq i32 %a, 42 > > br i1 %cond1, label %exit, label %dom > > -; CHECK: edge entry -> exit probability is 1023 / 1024 > > -; CHECK: edge entry -> dom probability is 1 / 1024 > > +; CHECK: edge entry -> exit probability is 1048575 / 1048576 > > +; CHECK: edge entry -> dom probability is 1 / 1048576 > > > > dom: > > %cond2 = icmp ult i32 %a, 42 > > > > > > _______________________________________________ > > llvm-commits mailing list > > llvm-commits at cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/b2657a30/attachment.html From baldrick at free.fr Tue Dec 27 05:44:17 2011 From: baldrick at free.fr (Duncan Sands) Date: Tue, 27 Dec 2011 12:44:17 +0100 Subject: [llvm-commits] [llvm] r147142 - in /llvm/trunk: lib/Analysis/BranchProbabilityInfo.cpp test/Analysis/BranchProbabilityInfo/noreturn.ll In-Reply-To: References: <20111222092637.B42A82A6C12C@llvm.org> <4EF99DA0.70804@free.fr> Message-ID: <4EF9AF91.7080003@free.fr> Hi Chandler, On 27/12/11 12:02, Chandler Carruth wrote: > > On Dec 27, 2011 5:32 AM, "Duncan Sands" > wrote: > > > > Hi Chandler, > > > > > Make the unreachable probability much much heavier. The previous > > > probability wouldn't be considered "hot" in some weird loop structures > > > or other compounding probability patterns. This makes it much harder to > > > confuse, but isn't really a principled fix. I'd actually like it if we > > > could model a zero probability, as it would make this much easier to > > > reason about. Suggestions for how to do this better are welcome. > > > > a call to a function that only throws an exception will usually be followed > > by unreachable. Would giving unreachable zero probability mean that throwing > > the exception is considered to occur with probability zero? While throwing > > exceptions is fairly rare, it does happen. > > I think to handle this well the probability analysis would need to learn about > exceptions and throwing so it can differentiate between abort and throw. "abort" should have the nounwind attribute, which tells you that it is not going to throw an exception. > I suspect there are other places where we compute poor probabilities in the face > of exceptions. 
> > Still, treating probably exceptional code paths as having a probability that > *approaches* zero (as BPI is never used to prove reachability) seems not > entirely unreasonable: it means we will optimize the non-throwing path over the > throwing path at any expense. That's the same trade off as zero-cost exceptions? Codegen also supports sj/lj style exception handling in which throwing an exception is much less expensive than with zero-cost exception handling. > Anyways, I'm not saying it's the right long term model, suggestions here are > very welcome. I don't have any useful ideas for the moment :( Ciao, Duncan. From benny.kra at googlemail.com Tue Dec 27 05:41:05 2011 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Tue, 27 Dec 2011 11:41:05 -0000 Subject: [llvm-commits] [llvm] r147289 - in /llvm/trunk/lib/Target: Hexagon/HexagonExpandPredSpillCode.cpp Hexagon/HexagonInstrInfo.cpp Hexagon/HexagonRegisterInfo.cpp PTX/PTXMFInfoExtract.cpp Message-ID: <20111227114105.D174B2A6C12C@llvm.org> Author: d0k Date: Tue Dec 27 05:41:05 2011 New Revision: 147289 URL: http://llvm.org/viewvc/llvm-project?rev=147289&view=rev Log: Clean up some Release build warnings.
Modified: llvm/trunk/lib/Target/Hexagon/HexagonExpandPredSpillCode.cpp llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.cpp llvm/trunk/lib/Target/Hexagon/HexagonRegisterInfo.cpp llvm/trunk/lib/Target/PTX/PTXMFInfoExtract.cpp Modified: llvm/trunk/lib/Target/Hexagon/HexagonExpandPredSpillCode.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonExpandPredSpillCode.cpp?rev=147289&r1=147288&r2=147289&view=diff ============================================================================== --- llvm/trunk/lib/Target/Hexagon/HexagonExpandPredSpillCode.cpp (original) +++ llvm/trunk/lib/Target/Hexagon/HexagonExpandPredSpillCode.cpp Tue Dec 27 05:41:05 2011 @@ -70,7 +70,6 @@ bool HexagonExpandPredSpillCode::runOnMachineFunction(MachineFunction &Fn) { const HexagonInstrInfo *TII = QTM.getInstrInfo(); - const HexagonRegisterInfo *RegInfo = QTM.getRegisterInfo(); // Loop over all of the basic blocks. for (MachineFunction::iterator MBBb = Fn.begin(), MBBe = Fn.end(); @@ -84,7 +83,7 @@ if (Opc == Hexagon::STriw_pred) { // STriw_pred [R30], ofst, SrcReg; unsigned FP = MI->getOperand(0).getReg(); - assert(FP == RegInfo->getFrameRegister() && + assert(FP == QTM.getRegisterInfo()->getFrameRegister() && "Not a Frame Pointer, Nor a Spill Slot"); assert(MI->getOperand(1).isImm() && "Not an offset"); int Offset = MI->getOperand(1).getImm(); @@ -129,7 +128,7 @@ assert(Hexagon::PredRegsRegClass.contains(DstReg) && "Not a predicate register"); unsigned FP = MI->getOperand(1).getReg(); - assert(FP == RegInfo->getFrameRegister() && + assert(FP == QTM.getRegisterInfo()->getFrameRegister() && "Not a Frame Pointer, Nor a Spill Slot"); assert(MI->getOperand(2).isImm() && "Not an offset"); int Offset = MI->getOperand(2).getImm(); Modified: llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.cpp?rev=147289&r1=147288&r2=147289&view=diff 
============================================================================== --- llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.cpp (original) +++ llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.cpp Tue Dec 27 05:41:05 2011 @@ -461,7 +461,7 @@ } else if (VT == MVT::i64) { TRC = Hexagon::DoubleRegsRegisterClass; } else { - assert(0 && "Cannot handle this register class"); + llvm_unreachable("Cannot handle this register class"); } unsigned NewReg = RegInfo.createVirtualRegister(TRC); @@ -553,10 +553,6 @@ case Hexagon::JMPR: return false; - return true; - - default: - return true; } return true; @@ -793,9 +789,8 @@ case Hexagon::DEALLOC_RET_V4: return !invertPredicate ? Hexagon::DEALLOC_RET_cPt_V4 : Hexagon::DEALLOC_RET_cNotPt_V4; - default: - assert(false && "Unexpected predicable instruction"); } + llvm_unreachable("Unexpected predicable instruction"); } @@ -1243,8 +1238,8 @@ return true; } - assert(0 && "No offset range is defined for this opcode. Please define it in \ - the above switch statement!"); + llvm_unreachable("No offset range is defined for this opcode. 
" + "Please define it in the above switch statement!"); } Modified: llvm/trunk/lib/Target/Hexagon/HexagonRegisterInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonRegisterInfo.cpp?rev=147289&r1=147288&r2=147289&view=diff ============================================================================== --- llvm/trunk/lib/Target/Hexagon/HexagonRegisterInfo.cpp (original) +++ llvm/trunk/lib/Target/Hexagon/HexagonRegisterInfo.cpp Tue Dec 27 05:41:05 2011 @@ -58,18 +58,16 @@ }; switch(Subtarget.getHexagonArchVersion()) { + case HexagonSubtarget::V1: + break; case HexagonSubtarget::V2: return CalleeSavedRegsV2; - break; case HexagonSubtarget::V3: case HexagonSubtarget::V4: return CalleeSavedRegsV3; - break; - default: - const char *ErrorString = - "Callee saved registers requested for unknown archtecture version"; - llvm_unreachable(ErrorString); } + llvm_unreachable("Callee saved registers requested for unknown architecture " + "version"); } BitVector HexagonRegisterInfo::getReservedRegs(const MachineFunction &MF) @@ -106,18 +104,16 @@ }; switch(Subtarget.getHexagonArchVersion()) { + case HexagonSubtarget::V1: + break; case HexagonSubtarget::V2: return CalleeSavedRegClassesV2; - break; case HexagonSubtarget::V3: case HexagonSubtarget::V4: return CalleeSavedRegClassesV3; - break; - default: - const char *ErrorString = - "Callee saved register classes requested for unknown archtecture version"; - llvm_unreachable(ErrorString); } + llvm_unreachable("Callee saved register classes requested for unknown " + "architecture version"); } void HexagonRegisterInfo:: Modified: llvm/trunk/lib/Target/PTX/PTXMFInfoExtract.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PTX/PTXMFInfoExtract.cpp?rev=147289&r1=147288&r2=147289&view=diff ============================================================================== --- llvm/trunk/lib/Target/PTX/PTXMFInfoExtract.cpp (original) +++ llvm/trunk/lib/Target/PTX/PTXMFInfoExtract.cpp Tue Dec 27 
05:41:05 2011 @@ -71,6 +71,8 @@ RegType = PTXRegisterType::F32; else if (TRC == PTX::RegF64RegisterClass) RegType = PTXRegisterType::F64; + else + llvm_unreachable("Unkown register class."); MFI->addRegister(Reg, RegType, PTXRegisterSpace::Reg); } From samsonov at google.com Tue Dec 27 07:31:23 2011 From: samsonov at google.com (Alexey Samsonov) Date: Tue, 27 Dec 2011 17:31:23 +0400 Subject: [llvm-commits] Patch for AddressSanitizer [projects/compiler-rt/lib/asan]: interceptors for strcasecmp and strncasecmp Message-ID: Rietveld link: http://codereview.appspot.com/5500082/ -- Alexey Samsonov Software Engineer, Moscow samsonov at google.com -------------- next part -------------- A non-text attachment was scrubbed... Name: asan-strcasecmp.diff Type: text/x-patch Size: 12582 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/18b3ff80/attachment.bin From samsonov at google.com Tue Dec 27 07:34:35 2011 From: samsonov at google.com (Alexey Samsonov) Date: Tue, 27 Dec 2011 17:34:35 +0400 Subject: [llvm-commits] Patch for AddressSanitizer [projects/compiler-rt/lib/asan]: interceptor for memcmp Message-ID: Rietveld link: http://codereview.appspot.com/5501076/ -- Alexey Samsonov Software Engineer, Moscow samsonov at google.com -------------- next part -------------- A non-text attachment was scrubbed... 
Name: asan-memcmp.diff Type: text/x-patch Size: 6000 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/e5d47683/attachment.bin From samsonov at google.com Tue Dec 27 07:37:18 2011 From: samsonov at google.com (Alexey Samsonov) Date: Tue, 27 Dec 2011 17:37:18 +0400 Subject: [llvm-commits] Patch for AddressSanitizer [projects/compiler-rt/lib/asan]: interceptor for strcat Message-ID: Rietveld link: http://codereview.appspot.com/5504087/ -- Alexey Samsonov Software Engineer, Moscow samsonov at google.com -------------- next part -------------- A non-text attachment was scrubbed... Name: asan-strcat.diff Type: text/x-patch Size: 8416 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/aae458df/attachment.bin From stpworld at narod.ru Tue Dec 27 11:58:32 2011 From: stpworld at narod.ru (Stepan Dyatkovskiy) Date: Tue, 27 Dec 2011 21:58:32 +0400 Subject: [llvm-commits] [LLVM, SwitchInst, case ranges] Auxiliary patch #1 In-Reply-To: <4EF37B6B.6000205@narod.ru> References: <4EAA9B5D.802@narod.ru> <4EAA9DE8.80000@free.fr> <485181319805488@web67.yandex.ru> <4EAB079D.6000606@free.fr> <4EB18F12.6060409@narod.ru> <4EB7C319.1000709@narod.ru> <4EDE7D75.704@narod.ru> <4EDFD0F4.1040204@narod.ru> <4EE25B61.9070006@narod.ru> <4EE5C06C.3050705@narod.ru> <333531323974498@web57.yandex.ru> <4EEB9C52.1050301@narod.ru> <4EF37B6B.6000205@narod.ru> Message-ID: <4EFA0748.9080702@narod.ru> ping. Stepan Dyatkovskiy wrote: > Ping. > > Stepan Dyatkovskiy wrote: >> Ping. >> >> -Stepan. 
> From spop at codeaurora.org Tue Dec 27 12:19:03 2011 From: spop at codeaurora.org (Sebastian Pop) Date: Tue, 27 Dec 2011 12:19:03 -0600 Subject: [llvm-commits] [LLVMdev] [PATCH] BasicBlock Autovectorization Pass In-Reply-To: <1324411736.31367.485.camel@sapling> References: <1319909412.23036.851.camel@sapling> <1319914924.23036.852.camel@sapling> <1319919418.23036.881.camel@sapling> <1319928991.23036.957.camel@sapling> <1320108633.23036.1266.camel@sapling> <1320172356.23036.1298.camel@sapling> <4EB0462C.5010209@grosser.es> <1320184739.23036.1334.camel@sapling> <1320191694.23036.1497.camel@sapling> <1320749109.19359.76.camel@sapling> <4EB90E98.4010805@grosser.es> <1320762963.19359.117.camel@sapling> <4EB98207.2070807@grosser.es> <1320791390.19359.262.camel@sapling> <4EBC4B0F.6010609@grosser.es> <1321050998.19359.539.camel@sapling> <4EBDA7F9.9080709@grosser.es> <1321053083.19359.550.camel@sapling> <4EBDB1BF.7090006@grosser.es> <1321400339.19359.782.camel@sapling> <1321486739.19359.1067.camel@sapling> <4EC504B5.2020408@grosser.es> <1321898108.2507.36.camel@sapling> <1321932161.2507.101.camel@sapling> <1322067157.2507.263.camel@sapling> <4ED8F7B0.8050309@grosser.es> <1323822351.590.1687.camel@sapling> <1324411736.31367.485.camel@sapling> Message-ID: Hi, On Tue, Dec 20, 2011 at 2:08 PM, Hal Finkel wrote: > On Tue, 2011-12-20 at 13:57 -0600, Sebastian Pop wrote: >> Hi, >> >> I see that there are two functions in your code that are O(n^2) in >> number of instructions of the program: getCandidatePairs and >> buildDepMap. I think that you could make these two functions faster >> if you work on some form of factored def-use chains for memory, like >> the VUSE/VDEFs of GCC. > > Thanks for the comment! I am not aware of anything along these lines, > although it would be quite helpful. The pass spends a significant amount > of time running the aliasing-analysis queries.
I see no reason against committing the vectorizer pass as it is now: we can
rework the slow parts of the vectorizer once we'll have the factored use-def
chains for memory.

Thanks,
Sebastian
--
Qualcomm Innovation Center, Inc is a member of Code Aurora Forum

From nicholas at mxc.ca Tue Dec 27 12:25:51 2011
From: nicholas at mxc.ca (Nick Lewycky)
Date: Tue, 27 Dec 2011 18:25:51 -0000
Subject: [llvm-commits] [llvm] r147291 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/cos.ll
Message-ID: <20111227182551.383311BE003@llvm.org>

Author: nicholas
Date: Tue Dec 27 12:25:50 2011
New Revision: 147291

URL: http://llvm.org/viewvc/llvm-project?rev=147291&view=rev
Log:
Turn cos(-x) into cos(x). Patch by Alexander Malyshev!

Added:
    llvm/trunk/test/Transforms/SimplifyLibCalls/cos.ll
Modified:
    llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp

Modified: llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp?rev=147291&r1=147290&r2=147291&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Tue Dec 27 12:25:50 2011
@@ -841,6 +841,28 @@
 //===----------------------------------------------------------------------===//
 
 //===---------------------------------------===//
+// 'cos*' Optimizations
+
+struct CosOpt : public LibCallOptimization {
+  virtual Value *CallOptimizer(Function *Callee, CallInst *CI, IRBuilder<> &B) {
+    FunctionType *FT = Callee->getFunctionType();
+    // Just make sure this has 1 argument of FP type, which matches the
+    // result type.
+    if (FT->getNumParams() != 1 || FT->getReturnType() != FT->getParamType(0) ||
+        !FT->getParamType(0)->isFloatingPointTy())
+      return 0;
+
+    // cos(-x) -> cos(x)
+    Value *Op1 = CI->getArgOperand(0);
+    if (BinaryOperator::isFNeg(Op1)) {
+      BinaryOperator *BinExpr = cast<BinaryOperator>(Op1);
+      return B.CreateCall(Callee, BinExpr->getOperand(1), "cos");
+    }
+    return 0;
+  }
+};
+
+//===---------------------------------------===//
 // 'pow*' Optimizations
 
 struct PowOpt : public LibCallOptimization {
@@ -870,7 +892,7 @@
     if (Op2C->isExactlyValue(0.5)) {
       // Expand pow(x, 0.5) to (x == -infinity ? +infinity : fabs(sqrt(x))).
       // This is faster than calling pow, and still handles negative zero
-      // and negative infinite correctly.
+      // and negative infinity correctly.
       // TODO: In fast-math mode, this could be just sqrt(x).
       // TODO: In finite-only mode, this could be just fabs(sqrt(x)).
       Value *Inf = ConstantFP::getInfinity(CI->getType());
@@ -1455,7 +1477,7 @@
     StrToOpt StrTo; StrSpnOpt StrSpn; StrCSpnOpt StrCSpn; StrStrOpt StrStr;
     MemCmpOpt MemCmp; MemCpyOpt MemCpy; MemMoveOpt MemMove; MemSetOpt MemSet;
     // Math Library Optimizations
-    PowOpt Pow; Exp2Opt Exp2; UnaryDoubleFPOpt UnaryDoubleFP;
+    CosOpt Cos; PowOpt Pow; Exp2Opt Exp2; UnaryDoubleFPOpt UnaryDoubleFP;
     // Integer Optimizations
     FFSOpt FFS; AbsOpt Abs; IsDigitOpt IsDigit; IsAsciiOpt IsAscii;
     ToAsciiOpt ToAscii;
@@ -1539,6 +1561,9 @@
   Optimizations["__strcpy_chk"] = &StrCpyChk;
 
   // Math Library Optimizations
+  Optimizations["cosf"] = &Cos;
+  Optimizations["cos"] = &Cos;
+  Optimizations["cosl"] = &Cos;
   Optimizations["powf"] = &Pow;
   Optimizations["pow"] = &Pow;
   Optimizations["powl"] = &Pow;
@@ -2352,9 +2377,6 @@
 // * cbrt(sqrt(x)) -> pow(x,1/6)
 // * cbrt(sqrt(x)) -> pow(x,1/9)
 //
-// cos, cosf, cosl:
-// * cos(-x) -> cos(x)
-//
 // exp, expf, expl:
 // * exp(log(x)) -> x
 //

Added: llvm/trunk/test/Transforms/SimplifyLibCalls/cos.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimplifyLibCalls/cos.ll?rev=147291&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/SimplifyLibCalls/cos.ll (added)
+++ llvm/trunk/test/Transforms/SimplifyLibCalls/cos.ll Tue Dec 27 12:25:50 2011
@@ -0,0 +1,14 @@
+; RUN: opt < %s -simplify-libcalls -S | FileCheck %s
+
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define double @foo(double %d) nounwind readnone {
+; CHECK: @foo
+  %1 = fsub double -0.000000e+00, %d
+  %2 = call double @cos(double %1) nounwind readnone
+; CHECK: call double @cos(double %d)
+  ret double %2
+}
+
+declare double @cos(double) nounwind readnone

From nicholas at mxc.ca Tue Dec 27 12:27:22 2011
From: nicholas at mxc.ca (Nick Lewycky)
Date: Tue, 27 Dec 2011 18:27:22 -0000
Subject: [llvm-commits] [llvm] r147292 - /llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp
Message-ID: <20111227182722.6818E1BE003@llvm.org>

Author: nicholas
Date: Tue Dec 27 12:27:22 2011
New Revision: 147292

URL: http://llvm.org/viewvc/llvm-project?rev=147292&view=rev
Log:
Use false not zero, as a bool.

Modified:
    llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp

Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp?rev=147292&r1=147291&r2=147292&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp (original)
+++ llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Tue Dec 27 12:27:22 2011
@@ -1473,10 +1473,10 @@
   assert(BI->isConditional() &&
          "Looking for probabilities on unconditional branch?");
   MDNode *ProfileData = BI->getMetadata(LLVMContext::MD_prof);
-  if (!ProfileData || ProfileData->getNumOperands() != 3) return 0;
+  if (!ProfileData || ProfileData->getNumOperands() != 3) return false;
   ConstantInt *CITrue = dyn_cast<ConstantInt>(ProfileData->getOperand(1));
   ConstantInt *CIFalse = dyn_cast<ConstantInt>(ProfileData->getOperand(2));
-  if (!CITrue || !CIFalse) return 0;
+  if (!CITrue || !CIFalse) return false;
   ProbTrue = CITrue->getValue();
   ProbFalse = CIFalse->getValue();
   assert(ProbTrue.getBitWidth() == 32 && ProbFalse.getBitWidth() == 32 &&

From nicholas at mxc.ca Tue Dec 27 12:36:58 2011
From: nicholas at mxc.ca (Nick Lewycky)
Date: Tue, 27 Dec 2011 10:36:58 -0800
Subject: [llvm-commits] [PATCH] SimplifyLibCalls.cpp: Small cosine optimization
In-Reply-To:
References: <4EF96673.7030409@mxc.ca>
Message-ID: <4EFA104A.7010607@mxc.ca>

On 12/26/2011 11:03 PM, Alexander Malyshev wrote:
> Thanks, and yeah it looks cleaner now.

I made some truly minor changes and submitted it as r147291. Thanks for
the patch!

Nick

>
> Alex
>
> On Tue, Dec 27, 2011 at 1:32 AM, Nick Lewycky
> wrote:
>
> On 12/26/2011 09:04 PM, Alexander Malyshev wrote:
>
> Adds the pattern for cos(-x) -> cos(x). Test file included.
>
>
> + // cos(-x) -> cos(x)
> + Value *Op1 = CI->getArgOperand(0);
> + if (BinaryOperator *BinExpr = dyn_cast<BinaryOperator>(Op1)) {
> + if (ConstantFP *C =
> dyn_cast<ConstantFP>(BinExpr->getOperand(0))) {
> + if (BinExpr->getOpcode() == Instruction::FSub &&
> + C->getValueAPF().isZero()) {
>
> I think you can simplify this using BinExpr->isFNeg()?
>
> This looks great overall, if that simplification works please resend
> an updated patch!
>
> Nick
>
> + Value *X = BinExpr->getOperand(1);
> + return B.CreateCall(Callee, X, "");
> + }
> + }
> + }
>
>
> Alex
>
>
>
> _________________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>

From kcc at google.com Tue Dec 27 13:52:54 2011
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 27 Dec 2011 19:52:54 -0000
Subject: [llvm-commits] [compiler-rt] r147293 - /compiler-rt/trunk/lib/asan/tests/test_output.sh
Message-ID: <20111227195254.BBCFA1BE003@llvm.org>

Author: kcc
Date: Tue Dec 27 13:52:54 2011
New Revision: 147293

URL: http://llvm.org/viewvc/llvm-project?rev=147293&view=rev
Log:
[asan] make sure frame pointers are not omitted when running asan output tests

Modified:
    compiler-rt/trunk/lib/asan/tests/test_output.sh

Modified: compiler-rt/trunk/lib/asan/tests/test_output.sh
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/test_output.sh?rev=147293&r1=147292&r2=147293&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/tests/test_output.sh (original)
+++ compiler-rt/trunk/lib/asan/tests/test_output.sh Tue Dec 27 13:52:54 2011
@@ -5,7 +5,7 @@
 OS=`uname`
 CXX=$1
 CC=$2
-CXXFLAGS="-mno-omit-leaf-frame-pointer"
+CXXFLAGS="-mno-omit-leaf-frame-pointer -fno-omit-frame-pointer"
 SYMBOLIZER=../scripts/asan_symbolize.py
 
 C_TEST=use-after-free

From baldrick at free.fr Tue Dec 27 14:12:26 2011
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 27 Dec 2011 21:12:26 +0100
Subject: [llvm-commits] [llvm] r146851 - in /llvm/trunk: include/llvm/Analysis/CodeMetrics.h include/llvm/CodeGen/MachineFunction.h lib/Analysis/InlineCost.cpp lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp lib/CodeGen/StackSlotColoring.cpp test/Transforms/Inline/inline_returns_twice.ll
In-Reply-To: <20111218203543.98DEB2A6C12C@llvm.org>
References: <20111218203543.98DEB2A6C12C@llvm.org>
Message-ID: <4EFA26AA.5070400@free.fr>

Hi Joerg,

> Allow inlining of functions with returns_twice calls, if they have the
> attribute themselve.

> --- llvm/trunk/include/llvm/Analysis/CodeMetrics.h (original)
> +++ llvm/trunk/include/llvm/Analysis/CodeMetrics.h Sun Dec 18 14:35:43 2011
> @@ -31,8 +31,9 @@
> /// caller.
> // bool NeverInline;
>
> - // True if this function contains a call to setjmp or _setjmp
> - bool callsSetJmp;
> + // True if this function contains a call to setjmp or other functions
> + // with attribute "returns twice" without having the attribute by itself.

by itself -> itself

> + bool exposesReturnsTwice;
>
> // True if this function calls itself
> bool isRecursive;

Ciao, Duncan.

From jcarter at mips.com Tue Dec 27 14:27:59 2011
From: jcarter at mips.com (Carter, Jack)
Date: Tue, 27 Dec 2011 20:27:59 +0000
Subject: [llvm-commits] [Mips] Patch request
Message-ID: <86AC779C188FE74F88F6494478B46332E8F859@exchdb03.mips.com>

This is a formatting change that somehow got left out of an earlier submission.
It does not change behavior. It only conforms to coding standards.

Jack Carter
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/7adf6663/attachment.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MipsMCInstLower.patch
Type: text/x-patch
Size: 6348 bytes
Desc: MipsMCInstLower.patch
Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/7adf6663/attachment.bin

From benny.kra at googlemail.com Tue Dec 27 14:35:07 2011
From: benny.kra at googlemail.com (Benjamin Kramer)
Date: Tue, 27 Dec 2011 20:35:07 -0000
Subject: [llvm-commits] [llvm] r147294 - in /llvm/trunk: include/llvm/ADT/StringMap.h lib/Support/StringMap.cpp
Message-ID: <20111227203507.9EED41BE003@llvm.org>

Author: d0k
Date: Tue Dec 27 14:35:07 2011
New Revision: 147294

URL: http://llvm.org/viewvc/llvm-project?rev=147294&view=rev
Log:
Switch StringMap from an array of structures to a structure of arrays.

- -25% memory usage of the main table on x86_64 (was wasted in struct padding).
- no significant performance change.

Modified:
    llvm/trunk/include/llvm/ADT/StringMap.h
    llvm/trunk/lib/Support/StringMap.cpp

Modified: llvm/trunk/include/llvm/ADT/StringMap.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/StringMap.h?rev=147294&r1=147293&r2=147294&view=diff
==============================================================================
--- llvm/trunk/include/llvm/ADT/StringMap.h (original)
+++ llvm/trunk/include/llvm/ADT/StringMap.h Tue Dec 27 14:35:07 2011
@@ -51,20 +51,11 @@
 /// StringMapImpl - This is the base class of StringMap that is shared among
 /// all of its instantiations.
 class StringMapImpl {
-public:
-  /// ItemBucket - The hash table consists of an array of these. If Item is
-  /// non-null, this is an extant entry, otherwise, it is a hole.
-  struct ItemBucket {
-    /// FullHashValue - This remembers the full hash value of the key for
-    /// easy scanning.
-    unsigned FullHashValue;
-
-    /// Item - This is a pointer to the actual item object.
-    StringMapEntryBase *Item;
-  };
-
 protected:
-  ItemBucket *TheTable;
+  // Array of NumBuckets pointers to entries, null pointers are holes.
+  // TheTable[NumBuckets] contains a sentinel value for easy iteration. Follwed
+  // by an array of the actual hash values as unsigned integers.
+  StringMapEntryBase **TheTable;
   unsigned NumBuckets;
   unsigned NumItems;
   unsigned NumTombstones;
@@ -320,13 +311,13 @@
   /// insert it and return true.
   bool insert(MapEntryTy *KeyValue) {
     unsigned BucketNo = LookupBucketFor(KeyValue->getKey());
-    ItemBucket &Bucket = TheTable[BucketNo];
-    if (Bucket.Item && Bucket.Item != getTombstoneVal())
+    StringMapEntryBase *&Bucket = TheTable[BucketNo];
+    if (Bucket && Bucket != getTombstoneVal())
       return false; // Already exists in map.
 
-    if (Bucket.Item == getTombstoneVal())
+    if (Bucket == getTombstoneVal())
       --NumTombstones;
-    Bucket.Item = KeyValue;
+    Bucket = KeyValue;
     ++NumItems;
     assert(NumItems + NumTombstones <= NumBuckets);
 
@@ -340,10 +331,11 @@
 
     // Zap all values, resetting the keys back to non-present (not tombstone),
     // which is safe because we're removing all elements.
-    for (ItemBucket *I = TheTable, *E = TheTable+NumBuckets; I != E; ++I) {
-      if (I->Item && I->Item != getTombstoneVal()) {
-        static_cast<MapEntryTy*>(I->Item)->Destroy(Allocator);
-        I->Item = 0;
+    for (unsigned I = 0, E = NumBuckets; I != E; ++I) {
+      StringMapEntryBase *&Bucket = TheTable[I];
+      if (Bucket && Bucket != getTombstoneVal()) {
+        static_cast<MapEntryTy*>(Bucket)->Destroy(Allocator);
+        Bucket = 0;
       }
     }
 
@@ -357,21 +349,21 @@
   template <typename InitTy>
   MapEntryTy &GetOrCreateValue(StringRef Key, InitTy Val) {
     unsigned BucketNo = LookupBucketFor(Key);
-    ItemBucket &Bucket = TheTable[BucketNo];
-    if (Bucket.Item && Bucket.Item != getTombstoneVal())
-      return *static_cast<MapEntryTy*>(Bucket.Item);
+    StringMapEntryBase *&Bucket = TheTable[BucketNo];
+    if (Bucket && Bucket != getTombstoneVal())
+      return *static_cast<MapEntryTy*>(Bucket);
 
     MapEntryTy *NewItem =
       MapEntryTy::Create(Key.begin(), Key.end(), Allocator, Val);
 
-    if (Bucket.Item == getTombstoneVal())
+    if (Bucket == getTombstoneVal())
       --NumTombstones;
     ++NumItems;
     assert(NumItems + NumTombstones <= NumBuckets);
 
     // Fill in the bucket for the hash table. The FullHashValue was already
     // filled in by LookupBucketFor.
-    Bucket.Item = NewItem;
+    Bucket = NewItem;
     RehashTable();
 
     return *NewItem;
@@ -410,21 +402,21 @@
 template<typename ValueTy>
 class StringMapConstIterator {
 protected:
-  StringMapImpl::ItemBucket *Ptr;
+  StringMapEntryBase **Ptr;
 public:
   typedef StringMapEntry<ValueTy> value_type;
 
-  explicit StringMapConstIterator(StringMapImpl::ItemBucket *Bucket,
+  explicit StringMapConstIterator(StringMapEntryBase **Bucket,
                                   bool NoAdvance = false)
   : Ptr(Bucket) {
     if (!NoAdvance) AdvancePastEmptyBuckets();
   }
 
   const value_type &operator*() const {
-    return *static_cast<StringMapEntry<ValueTy>*>(Ptr->Item);
+    return *static_cast<StringMapEntry<ValueTy>*>(*Ptr);
   }
   const value_type *operator->() const {
-    return static_cast<StringMapEntry<ValueTy>*>(Ptr->Item);
+    return static_cast<StringMapEntry<ValueTy>*>(*Ptr);
   }
 
   bool operator==(const StringMapConstIterator &RHS) const {
@@ -445,7 +437,7 @@
 
 private:
   void AdvancePastEmptyBuckets() {
-    while (Ptr->Item == 0 || Ptr->Item == StringMapImpl::getTombstoneVal())
+    while (*Ptr == 0 || *Ptr == StringMapImpl::getTombstoneVal())
       ++Ptr;
   }
 };
@@ -453,15 +445,15 @@
 template<typename ValueTy>
 class StringMapIterator : public StringMapConstIterator<ValueTy> {
 public:
-  explicit StringMapIterator(StringMapImpl::ItemBucket *Bucket,
+  explicit StringMapIterator(StringMapEntryBase **Bucket,
                              bool NoAdvance = false)
     : StringMapConstIterator<ValueTy>(Bucket, NoAdvance) {
   }
   StringMapEntry<ValueTy> &operator*() const {
-    return *static_cast<StringMapEntry<ValueTy>*>(this->Ptr->Item);
+    return *static_cast<StringMapEntry<ValueTy>*>(*this->Ptr);
   }
   StringMapEntry<ValueTy> *operator->() const {
-    return static_cast<StringMapEntry<ValueTy>*>(this->Ptr->Item);
+    return static_cast<StringMapEntry<ValueTy>*>(*this->Ptr);
   }
 };
 

Modified: llvm/trunk/lib/Support/StringMap.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/StringMap.cpp?rev=147294&r1=147293&r2=147294&view=diff
==============================================================================
--- llvm/trunk/lib/Support/StringMap.cpp (original)
+++ llvm/trunk/lib/Support/StringMap.cpp Tue Dec 27 14:35:07 2011
@@ -39,11 +39,13 @@
   NumItems = 0;
   NumTombstones = 0;
 
-  TheTable = (ItemBucket*)calloc(NumBuckets+1, sizeof(ItemBucket));
-
+  TheTable = (StringMapEntryBase **)calloc(NumBuckets+1,
+                                           sizeof(StringMapEntryBase **) +
+                                           sizeof(unsigned));
+
   // Allocate one extra bucket, set it to look filled so the iterators stop at
   // end.
-  TheTable[NumBuckets].Item = (StringMapEntryBase*)2;
+  TheTable[NumBuckets] = (StringMapEntryBase*)2;
 }
 
 
@@ -60,29 +62,29 @@
   }
   unsigned FullHashValue = HashString(Name);
   unsigned BucketNo = FullHashValue & (HTSize-1);
-
+  unsigned *HashTable = (unsigned *)(TheTable + NumBuckets + 1);
+
   unsigned ProbeAmt = 1;
   int FirstTombstone = -1;
   while (1) {
-    ItemBucket &Bucket = TheTable[BucketNo];
-    StringMapEntryBase *BucketItem = Bucket.Item;
+    StringMapEntryBase *BucketItem = TheTable[BucketNo];
     // If we found an empty bucket, this key isn't in the table yet, return it.
     if (BucketItem == 0) {
       // If we found a tombstone, we want to reuse the tombstone instead of an
       // empty bucket. This reduces probing.
       if (FirstTombstone != -1) {
-        TheTable[FirstTombstone].FullHashValue = FullHashValue;
+        HashTable[FirstTombstone] = FullHashValue;
         return FirstTombstone;
       }
 
-      Bucket.FullHashValue = FullHashValue;
+      HashTable[BucketNo] = FullHashValue;
       return BucketNo;
     }
 
     if (BucketItem == getTombstoneVal()) {
       // Skip over tombstones. However, remember the first one we see.
       if (FirstTombstone == -1) FirstTombstone = BucketNo;
-    } else if (Bucket.FullHashValue == FullHashValue) {
+    } else if (HashTable[BucketNo] == FullHashValue) {
       // If the full hash value matches, check deeply for a match. The common
       // case here is that we are only looking at the buckets (for item info
       // being non-null and for the full hash value) not at the items. This
@@ -115,18 +117,18 @@
   if (HTSize == 0) return -1; // Really empty table?
   unsigned FullHashValue = HashString(Key);
   unsigned BucketNo = FullHashValue & (HTSize-1);
-
+  unsigned *HashTable = (unsigned *)(TheTable + NumBuckets + 1);
+
   unsigned ProbeAmt = 1;
   while (1) {
-    ItemBucket &Bucket = TheTable[BucketNo];
-    StringMapEntryBase *BucketItem = Bucket.Item;
+    StringMapEntryBase *BucketItem = TheTable[BucketNo];
     // If we found an empty bucket, this key isn't in the table yet, return.
     if (BucketItem == 0)
       return -1;
 
     if (BucketItem == getTombstoneVal()) {
       // Ignore tombstones.
-    } else if (Bucket.FullHashValue == FullHashValue) {
+    } else if (HashTable[BucketNo] == FullHashValue) {
       // If the full hash value matches, check deeply for a match. The common
       // case here is that we are only looking at the buckets (for item info
       // being non-null and for the full hash value) not at the items. This
@@ -165,8 +167,8 @@
   int Bucket = FindKey(Key);
   if (Bucket == -1) return 0;
 
-  StringMapEntryBase *Result = TheTable[Bucket].Item;
-  TheTable[Bucket].Item = getTombstoneVal();
+  StringMapEntryBase *Result = TheTable[Bucket];
+  TheTable[Bucket] = getTombstoneVal();
   --NumItems;
   ++NumTombstones;
   assert(NumItems + NumTombstones <= NumBuckets);
@@ -180,6 +182,7 @@
 /// the appropriate mod-of-hashtable-size.
 void StringMapImpl::RehashTable() {
   unsigned NewSize;
+  unsigned *HashTable = (unsigned *)(TheTable + NumBuckets + 1);
 
   // If the hash table is now more than 3/4 full, or if fewer than 1/8 of
   // the buckets are empty (meaning that many are filled with tombstones),
@@ -194,19 +197,23 @@
 
   // Allocate one extra bucket which will always be non-empty. This allows the
   // iterators to stop at end.
-  ItemBucket *NewTableArray =(ItemBucket*)calloc(NewSize+1, sizeof(ItemBucket));
-  NewTableArray[NewSize].Item = (StringMapEntryBase*)2;
-
+  StringMapEntryBase **NewTableArray =
+    (StringMapEntryBase **)calloc(NewSize+1, sizeof(StringMapEntryBase *) +
+                                             sizeof(unsigned));
+  unsigned *NewHashArray = (unsigned *)(NewTableArray + NewSize + 1);
+  NewTableArray[NewSize] = (StringMapEntryBase*)2;
+
   // Rehash all the items into their new buckets. Luckily :) we already have
   // the hash values available, so we don't have to rehash any strings.
-  for (ItemBucket *IB = TheTable, *E = TheTable+NumBuckets; IB != E; ++IB) {
-    if (IB->Item && IB->Item != getTombstoneVal()) {
+  for (unsigned I = 0, E = NumBuckets; I != E; ++I) {
+    StringMapEntryBase *Bucket = TheTable[I];
+    if (Bucket && Bucket != getTombstoneVal()) {
       // Fast case, bucket available.
-      unsigned FullHash = IB->FullHashValue;
+      unsigned FullHash = HashTable[I];
       unsigned NewBucket = FullHash & (NewSize-1);
-      if (NewTableArray[NewBucket].Item == 0) {
-        NewTableArray[FullHash & (NewSize-1)].Item = IB->Item;
-        NewTableArray[FullHash & (NewSize-1)].FullHashValue = FullHash;
+      if (NewTableArray[NewBucket] == 0) {
+        NewTableArray[FullHash & (NewSize-1)] = Bucket;
+        NewHashArray[FullHash & (NewSize-1)] = FullHash;
         continue;
       }
 
@@ -214,11 +221,11 @@
       unsigned ProbeSize = 1;
       do {
         NewBucket = (NewBucket + ProbeSize++) & (NewSize-1);
-      } while (NewTableArray[NewBucket].Item);
+      } while (NewTableArray[NewBucket]);
 
       // Finally found a slot. Fill it in.
-      NewTableArray[NewBucket].Item = IB->Item;
-      NewTableArray[NewBucket].FullHashValue = FullHash;
+      NewTableArray[NewBucket] = Bucket;
+      NewHashArray[NewBucket] = FullHash;
     }
   }
 

From rafael.espindola at gmail.com Tue Dec 27 15:37:11 2011
From: rafael.espindola at gmail.com (Rafael Espindola)
Date: Tue, 27 Dec 2011 21:37:11 -0000
Subject: [llvm-commits] [llvm] r147296 - /llvm/trunk/Makefile.rules
Message-ID: <20111227213711.B680A1BE003@llvm.org>

Author: rafael
Date: Tue Dec 27 15:37:11 2011
New Revision: 147296

URL: http://llvm.org/viewvc/llvm-project?rev=147296&view=rev
Log:
PR11642 has been fixed, enable -fvisibility-inlines-hidden everywhere.

Modified:
    llvm/trunk/Makefile.rules

Modified: llvm/trunk/Makefile.rules
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/Makefile.rules?rev=147296&r1=147295&r2=147296&view=diff
==============================================================================
--- llvm/trunk/Makefile.rules (original)
+++ llvm/trunk/Makefile.rules Tue Dec 27 15:37:11 2011
@@ -320,11 +320,8 @@
 endif
 
 ifeq ($(ENABLE_VISIBILITY_INLINES_HIDDEN),1)
-# FIXME: clang's -fvisibility-inlines-hidden is broken for shared libs. PR11642.
-ifneq ($(ENABLE_SHARED),1)
   CXX.Flags += -fvisibility-inlines-hidden
 endif
-endif
 
 ifdef ENABLE_EXPENSIVE_CHECKS
 # GNU libstdc++ uses RTTI if you define _GLIBCXX_DEBUG, which we did above.
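
[Editorial illustration, not part of any message in this archive.] The r147294 StringMap change earlier in this digest packs the bucket pointers, the end sentinel, and the hash values into one calloc'd block. Below is a minimal sketch of that layout, assuming nothing beyond what the diff shows. The names Entry, AoSBucket, Table, makeTable and hashArray are hypothetical stand-ins and are not LLVM API.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for StringMapEntryBase; only pointer identity matters.
struct Entry { unsigned KeyLength; };

// The old array-of-structures bucket, kept only to compare sizes.
struct AoSBucket {
  unsigned FullHashValue;
  Entry *Item;
};

// The new layout: NumBuckets + 1 entry pointers (the last one is the end
// sentinel), followed in the same allocation by the per-bucket hash values.
struct Table {
  Entry **Buckets;
  unsigned NumBuckets;
};

inline Table makeTable(unsigned NumBuckets) {
  Table T;
  T.NumBuckets = NumBuckets;
  // Same sizing as the patch: (NumBuckets + 1) * (pointer + unsigned) bytes,
  // zero-initialised so every bucket starts out as a hole.
  T.Buckets = static_cast<Entry **>(
      std::calloc(NumBuckets + 1, sizeof(Entry *) + sizeof(unsigned)));
  if (!T.Buckets)
    std::abort();
  // Non-null sentinel so an iterator skipping holes stops at the end.
  T.Buckets[NumBuckets] = reinterpret_cast<Entry *>(2);
  return T;
}

// The hash array starts right after the sentinel slot.
inline unsigned *hashArray(const Table &T) {
  return reinterpret_cast<unsigned *>(T.Buckets + T.NumBuckets + 1);
}
```

On a typical x86_64 ABI, sizeof(AoSBucket) is 16 bytes because of padding after the unsigned, while the split layout costs 8 + 4 = 12 bytes per bucket, which is consistent with the 25% figure in the commit log.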

From kcc at google.com Tue Dec 27 15:57:12 2011
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 27 Dec 2011 21:57:12 -0000
Subject: [llvm-commits] [compiler-rt] r147297 - in /compiler-rt/trunk/lib/asan: Makefile.old asan_interface.h asan_internal.h tests/asan_test.cc
Message-ID: <20111227215712.778531BE003@llvm.org>

Author: kcc
Date: Tue Dec 27 15:57:12 2011
New Revision: 147297

URL: http://llvm.org/viewvc/llvm-project?rev=147297&view=rev
Log:
[asan] rely on __has_feature(address_sanitizer) instead of the ADDRESS_SANITIZER macro

Modified:
    compiler-rt/trunk/lib/asan/Makefile.old
    compiler-rt/trunk/lib/asan/asan_interface.h
    compiler-rt/trunk/lib/asan/asan_internal.h
    compiler-rt/trunk/lib/asan/tests/asan_test.cc

Modified: compiler-rt/trunk/lib/asan/Makefile.old
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/Makefile.old?rev=147297&r1=147296&r2=147297&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/Makefile.old (original)
+++ compiler-rt/trunk/lib/asan/Makefile.old Tue Dec 27 15:57:12 2011
@@ -137,7 +137,6 @@
 
 CLANG_ASAN_CXX=$(CLANG_CXX) \
                -faddress-sanitizer \
-               -DADDRESS_SANITIZER=1 \
                $(BLACKLIST) \
                -mllvm -asan-stack=$(ASAN_STACK) \
                -mllvm -asan-globals=$(ASAN_GLOBALS) \
@@ -152,7 +151,6 @@
 GCC_ASAN_PATH=SET_FROM_COMMAND_LINE
 GCC_ASAN_CXX=$(GCC_ASAN_PATH)/g++ \
              -faddress-sanitizer \
-             -DADDRESS_SANITIZER=1 \
              $(COMMON_ASAN_DEFINES)
 
 GCC_ASAN_LD=$(GCC_ASAN_PATH)/g++ -ldl -lpthread

Modified: compiler-rt/trunk/lib/asan/asan_interface.h
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interface.h?rev=147297&r1=147296&r2=147297&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_interface.h (original)
+++ compiler-rt/trunk/lib/asan/asan_interface.h Tue Dec 27 15:57:12 2011
@@ -75,7 +75,7 @@
 void __asan_unpoison_memory_region(void const volatile *addr, size_t size);
 
 // User code should use macro instead of functions.
-#ifdef ADDRESS_SANITIZER
+#if defined(__has_feature) && __has_feature(address_sanitizer)
 #define ASAN_POISON_MEMORY_REGION(addr, size) \
   __asan_poison_memory_region((addr), (size))
 #define ASAN_UNPOISON_MEMORY_REGION(addr, size) \

Modified: compiler-rt/trunk/lib/asan/asan_internal.h
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_internal.h?rev=147297&r1=147296&r2=147297&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_internal.h (original)
+++ compiler-rt/trunk/lib/asan/asan_internal.h Tue Dec 27 15:57:12 2011
@@ -36,7 +36,7 @@
 #include
 #endif
 
-#ifdef ADDRESS_SANITIZER
+#if defined(__has_feature) && __has_feature(address_sanitizer)
 # error "The AddressSanitizer run-time should not be"
         " instrumented by AddressSanitizer"
 #endif

Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=147297&r1=147296&r2=147297&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original)
+++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Tue Dec 27 15:57:12 2011
@@ -219,8 +219,13 @@
   asan_write((T*)(p + off));
 }
 
-TEST(AddressSanitizer, ADDRESS_SANITIZER_MacroTest) {
-  EXPECT_EQ(1, ADDRESS_SANITIZER);
+TEST(AddressSanitizer, HasFeatureAddressSanitizerTest) {
+#if defined(__has_feature) && __has_feature(address_sanitizer)
+  bool asan = 1;
+#else
+  bool asan = 0;
+#endif
+  EXPECT_EQ(true, asan);
 }
 
 TEST(AddressSanitizer, SimpleDeathTest) {

From kcc at google.com Tue Dec 27 17:08:32 2011
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 27 Dec 2011 15:08:32 -0800
Subject: [llvm-commits] [PATCH] asan-rt: getenv() from .preinit_array
In-Reply-To:
References:
Message-ID:

I'd prefer something different:
- Don't put this into sysinfo/sysinfo.cc (we don't use it in some
settings, also we may want to get rid of it eventually).
- I suggest we implement ReadProcSelfEnviron in asan_rtl.cc and then use it
to implement asan_getenv() in asan_linux.cc (and in asan_mac.cc, if needed).
- no need for PLATFORM_WINDOWS section (yet). Such code will need to go to
asan_windows.cc once we have it.
- we want to avoid memchr/memcmp/etc (use __internal* variants).
- don't fallback to libc, just do ASAN_DIE
- do we need HAVE___ENVIRON section?

--kcc

On Wed, Dec 21, 2011 at 3:57 AM, Evgeniy Stepanov wrote:

> Hi,
>
> this patch brings in the implementation of GetenvBeforeMain() from
> google-perftools and uses it in place of getenv(). This is required to
> call __asan_init from .preinit_array.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/f1d81455/attachment.html

From kcc at google.com Tue Dec 27 17:11:09 2011
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 27 Dec 2011 23:11:09 -0000
Subject: [llvm-commits] [compiler-rt] r147300 - /compiler-rt/trunk/lib/asan/asan_rtl.cc
Message-ID: <20111227231109.EEDF81BE003@llvm.org>

Author: kcc
Date: Tue Dec 27 17:11:09 2011
New Revision: 147300

URL: http://llvm.org/viewvc/llvm-project?rev=147300&view=rev
Log:
new() has slightly different signature on Android. This patch adds the

Modified:
    compiler-rt/trunk/lib/asan/asan_rtl.cc

Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_rtl.cc?rev=147300&r1=147299&r2=147300&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_rtl.cc (original)
+++ compiler-rt/trunk/lib/asan/asan_rtl.cc Tue Dec 27 17:11:09 2011
@@ -404,12 +404,17 @@
   GET_STACK_TRACE_HERE_FOR_MALLOC;\
   return asan_memalign(0, size, &stack);
 
+#ifdef ANDROID
+void *operator new(size_t size) { OPERATOR_NEW_BODY; }
+void *operator new[](size_t size) { OPERATOR_NEW_BODY; }
+#else
 void *operator new(size_t size) throw(std::bad_alloc) { OPERATOR_NEW_BODY; }
 void *operator new[](size_t size) throw(std::bad_alloc) { OPERATOR_NEW_BODY; }
 void *operator new(size_t size, std::nothrow_t const&) throw()
 { OPERATOR_NEW_BODY; }
 void *operator new[](size_t size, std::nothrow_t const&) throw()
 { OPERATOR_NEW_BODY; }
+#endif
 
 #define OPERATOR_DELETE_BODY \
   GET_STACK_TRACE_HERE_FOR_FREE(ptr);\

From kcc at google.com Tue Dec 27 17:15:07 2011
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 27 Dec 2011 15:15:07 -0800
Subject: [llvm-commits] [PATCH] asan-rt: wrappers for new() on Android
In-Reply-To:
References:
Message-ID:

r147300

On Wed, Dec 21, 2011 at 3:59 AM, Evgeniy Stepanov wrote:

> Hi,
>
> new() has slightly different signature on Android. This patch adds the
> required wrappers.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/05d77c71/attachment.html

From kcc at google.com Tue Dec 27 17:24:39 2011
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 27 Dec 2011 15:24:39 -0800
Subject: [llvm-commits] [PATCH] asan-rt: discover main thread stack limits without pthread
In-Reply-To:
References:
Message-ID:

I like the idea (it may actually fix the problem reported today on wine),
but we need to guard the code that uses sysinfo/sysinfo.h
with ASAN_USE_SYSINFO==1 (see asan_stack.cc).
Or add some macro machinery to get sysinfo.h from an alternative place as
in perftools (base/sysinfo.h).

--kcc

On Wed, Dec 21, 2011 at 4:04 AM, Evgeniy Stepanov wrote:

> Hi,
>
> if __asan_init is called from .preinit_array, pthread_getattr_np may
> become unsafe. This patch adds a different way of locating the stack
> of the main thread with a combination of getrlimit() and
> /proc/self/maps.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/d855f8ce/attachment.html

From kcc at google.com Tue Dec 27 17:36:22 2011
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 27 Dec 2011 15:36:22 -0800
Subject: [llvm-commits] [PATCH] asan-rt: add definitions for ucontext_t and friends on Android
In-Reply-To:
References:
Message-ID:

You are kidding :)
Let's not put the definition of ucontext into asan-rt.
Can't we really get it from somewhere in the system?
If no, we can simply have

# if defined(__arm__)
  *pc = *sp = *bp = 0;

These used to be required when asan worked through SIGILL. Not any more by
default.
I am actually considering to remove SIGILL-related code altogether (not
100% sure yet).

--kcc

On Wed, Dec 21, 2011 at 4:08 AM, Evgeniy Stepanov wrote:

> Hi,
>
> libc headers on Android miss ucontext_t and friends. This patch adds
> some compatible definitions.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/db840bee/attachment.html

From kcc at google.com Tue Dec 27 17:42:56 2011
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 27 Dec 2011 23:42:56 -0000
Subject: [llvm-commits] [compiler-rt] r147301 - /compiler-rt/trunk/lib/asan/tests/asan_test.cc
Message-ID: <20111227234256.17AB01BE003@llvm.org>

Author: kcc
Date: Tue Dec 27 17:42:55 2011
New Revision: 147301

URL: http://llvm.org/viewvc/llvm-project?rev=147301&view=rev
Log:
[asan] remove the test for cfree. 'man cfree' says: 'This function should never be used.' and this function is not found on many OSes we support.

Modified:
    compiler-rt/trunk/lib/asan/tests/asan_test.cc

Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=147301&r1=147300&r2=147301&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original)
+++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Tue Dec 27 17:42:55 2011
@@ -255,9 +255,6 @@
   delete c;
 
 #ifndef __APPLE__
-  // cfree
-  cfree(Ident(malloc(1)));
-
   // fprintf(stderr, "posix_memalign\n");
   int *pm;
   int pm_res = posix_memalign((void**)&pm, kPageSize, kPageSize);

From kcc at google.com Tue Dec 27 17:47:47 2011
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 27 Dec 2011 15:47:47 -0800
Subject: [llvm-commits] [PATCH] asan-rt: no cfree() on Android
In-Reply-To:
References:
Message-ID:

I removed the test for cfree completely: r147301.
Thanks!

--kcc

On Wed, Dec 21, 2011 at 4:10 AM, Evgeniy Stepanov wrote:

> Hi,
>
> this patch disables the cfree() test on Android.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/dde38c60/attachment.html

From kcc at google.com Tue Dec 27 18:59:39 2011
From: kcc at google.com (Kostya Serebryany)
Date: Wed, 28 Dec 2011 00:59:39 -0000
Subject: [llvm-commits] [compiler-rt] r147302 - /compiler-rt/trunk/lib/asan/asan_rtl.cc
Message-ID: <20111228005939.3D5D41BE003@llvm.org>

Author: kcc
Date: Tue Dec 27 18:59:39 2011
New Revision: 147302

URL: http://llvm.org/viewvc/llvm-project?rev=147302&view=rev
Log:
[asan] make sure __asan_report_* functions are not inlined (so that they are not optimized away and are kept in the resulting library). Patch by glider at google.com

Modified:
    compiler-rt/trunk/lib/asan/asan_rtl.cc

Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_rtl.cc?rev=147302&r1=147301&r2=147302&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_rtl.cc (original)
+++ compiler-rt/trunk/lib/asan/asan_rtl.cc Tue Dec 27 18:59:39 2011
@@ -331,12 +331,12 @@
 }
 
 // exported functions
-#define ASAN_REPORT_ERROR(type, is_write, size) \
-extern "C" void __asan_report_ ## type ## size(uintptr_t addr) \
-  __attribute__((visibility("default"))); \
-extern "C" void __asan_report_ ## type ## size(uintptr_t addr) { \
-  GET_BP_PC_SP; \
-  __asan_report_error(pc, bp, sp, addr, is_write, size); \
+#define ASAN_REPORT_ERROR(type, is_write, size) \
+extern "C" void __asan_report_ ## type ## size(uintptr_t addr) \
+  __attribute__((visibility("default"))) __attribute__((noinline)); \
+extern "C" void __asan_report_ ## type ## size(uintptr_t addr) { \
+  GET_BP_PC_SP; \
+  __asan_report_error(pc, bp, sp, addr, is_write, size); \
 }
 
 ASAN_REPORT_ERROR(load, false, 1)
@@ -355,8 +355,7 @@
 // dynamic libraries access the symbol even if it is not used by the executable
 // itself. This should help if the build system is removing dead code at link
 // time.
-extern "C"
-void __asan_force_interface_symbols() {
+static void force_interface_symbols() {
   volatile int fake_condition = 0; // prevent dead condition elimination.
   if (fake_condition) {
     __asan_report_load1(NULL);
@@ -775,7 +774,7 @@
 
   asanThreadRegistry().Init();
   asanThreadRegistry().GetMain()->ThreadStart();
-  __asan_force_interface_symbols(); // no-op.
+  force_interface_symbols(); // no-op.
 
   if (FLAG_v) {
     Report("AddressSanitizer Init done\n");

From kcc at google.com Tue Dec 27 19:03:41 2011
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 27 Dec 2011 17:03:41 -0800
Subject: [llvm-commits] [PATCH] AddressSanitizer: disallow inlining the __asan_report_* functions
In-Reply-To:
References:
Message-ID:

r147302.

On Mon, Dec 26, 2011 at 8:16 AM, Alexander Potapenko wrote:

> Hi all,
>
> The attached patch marks the __asan_report_{load,store}{1,2,4,8,16}
> functions as noinline, thus making sure they'll be present in the
> resulting binary (this is necessary e.g. for Chrome)
> Without that it turned out that the calls to those functions from
> __asan_force_interface_symbols were inlined.
>
> --
> Alexander Potapenko
> Software Engineer
> Google Moscow
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/94d59589/attachment.html

From kcc at google.com Tue Dec 27 19:08:15 2011
From: kcc at google.com (Kostya Serebryany)
Date: Wed, 28 Dec 2011 01:08:15 -0000
Subject: [llvm-commits] [compiler-rt] r147303 - in /compiler-rt/trunk/lib/asan: asan_interceptors.h mach_override/README.txt mach_override/mach_override.c mach_override/mach_override.h
Message-ID: <20111228010815.4A7141BE003@llvm.org>

Author: kcc
Date: Tue Dec 27 19:08:14 2011
New Revision: 147303

URL: http://llvm.org/viewvc/llvm-project?rev=147303&view=rev
Log:
The code instrumented with ASan may have its own instance of the mach_override library.
In this case chances are that functions from it will be called from mach_override_ptr() during ASan initialization.
This may lead to crashes (if those functions are instrumented) or incorrect behavior (if the implementations differ).

The attached patch renames mach_override_ptr() into __asan_mach_override_ptr() and makes the rest of the mach_override internals hidden.

The corresponding AddressSanitizer bug is http://code.google.com/p/address-sanitizer/issues/detail?id=22

Patch by glider at google.com

Modified:
    compiler-rt/trunk/lib/asan/asan_interceptors.h
    compiler-rt/trunk/lib/asan/mach_override/README.txt
    compiler-rt/trunk/lib/asan/mach_override/mach_override.c
    compiler-rt/trunk/lib/asan/mach_override/mach_override.h

Modified: compiler-rt/trunk/lib/asan/asan_interceptors.h
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.h?rev=147303&r1=147302&r2=147303&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_interceptors.h (original)
+++ compiler-rt/trunk/lib/asan/asan_interceptors.h Tue Dec 27 19:08:14 2011
@@ -39,15 +39,15 @@
 #define WRAPPER_NAME(x) "wrap_"#x
 
 #define OVERRIDE_FUNCTION(oldfunc, newfunc) \
-  CHECK(0 == mach_override_ptr((void*)(oldfunc), \
-                               (void*)(newfunc), \
-                               (void**)&real_##oldfunc)); \
+  CHECK(0 == __asan_mach_override_ptr((void*)(oldfunc), \
+                                      (void*)(newfunc), \
+                                      (void**)&real_##oldfunc)); \
   CHECK(real_##oldfunc != NULL);
 
 #define OVERRIDE_FUNCTION_IF_EXISTS(oldfunc, newfunc) \
-  do { mach_override_ptr((void*)(oldfunc), \
-                         (void*)(newfunc), \
-                         (void**)&real_##oldfunc); } while (0)
+  do { __asan_mach_override_ptr((void*)(oldfunc), \
+                                (void*)(newfunc), \
+                                (void**)&real_##oldfunc); } while (0)
 
 #define INTERCEPT_FUNCTION(func) \
   OVERRIDE_FUNCTION(func, WRAP(func))

Modified: compiler-rt/trunk/lib/asan/mach_override/README.txt
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/mach_override/README.txt?rev=147303&r1=147302&r2=147303&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/mach_override/README.txt (original)
+++ compiler-rt/trunk/lib/asan/mach_override/README.txt Tue Dec 27 19:08:14 2011
@@ -4,4 +4,6 @@
 -- The files are guarded with #ifdef __APPLE__
 -- some opcodes are added in order to parse the library functions on Lion
 -- fixupInstructions() is extended to relocate relative calls, not only jumps
+-- mach_override_ptr is renamed to __asan_mach_override_ptr and
+   other functions are marked as hidden.
 

Modified: compiler-rt/trunk/lib/asan/mach_override/mach_override.c
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/mach_override/mach_override.c?rev=147303&r1=147302&r2=147303&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/mach_override/mach_override.c (original)
+++ compiler-rt/trunk/lib/asan/mach_override/mach_override.c Tue Dec 27 19:08:14 2011
@@ -108,18 +108,18 @@
 allocateBranchIsland(
     BranchIsland **island,
     int allocateHigh,
-    void *originalFunctionAddress);
+    void *originalFunctionAddress) __attribute__((visibility("hidden")));
 
   mach_error_t
 freeBranchIsland(
-    BranchIsland *island );
+    BranchIsland *island ) __attribute__((visibility("hidden")));
 
 #if defined(__ppc__) || defined(__POWERPC__)
   mach_error_t
 setBranchIslandTarget(
     BranchIsland *island,
     const void *branchTo,
-    long instruction );
+    long instruction ) __attribute__((visibility("hidden")));
 #endif
 
 #if defined(__i386__) || defined(__x86_64__)
@@ -127,11 +127,11 @@
 setBranchIslandTarget_i386(
     BranchIsland *island,
     const void *branchTo,
-    char* instructions );
+    char* instructions ) __attribute__((visibility("hidden")));
 void
 atomic_mov64(
     uint64_t *targetAddress,
-    uint64_t value );
+    uint64_t value ) __attribute__((visibility("hidden")));
 static
Boolean eatKnownInstructions( @@ -140,7 +140,7 @@ int *howManyEaten, char *originalInstructions, int *originalInstructionCount, - uint8_t *originalInstructionSizes ); + uint8_t *originalInstructionSizes ) __attribute__((visibility("hidden"))); static void fixupInstructions( @@ -148,7 +148,7 @@ void *escapeIsland, void *instructionsToFix, int instructionCount, - uint8_t *instructionSizes ); + uint8_t *instructionSizes ) __attribute__((visibility("hidden"))); #endif /******************************************************************************* @@ -176,7 +176,7 @@ #endif mach_error_t -mach_override_ptr( +__asan_mach_override_ptr( void *originalFunctionAddress, const void *overrideFunctionAddress, void **originalFunctionReentryIsland ) Modified: compiler-rt/trunk/lib/asan/mach_override/mach_override.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/mach_override/mach_override.h?rev=147303&r1=147302&r2=147303&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/mach_override/mach_override.h (original) +++ compiler-rt/trunk/lib/asan/mach_override/mach_override.h Tue Dec 27 19:08:14 2011 @@ -77,8 +77,10 @@ ************************************************************************************/ +// We're prefixing mach_override_ptr() with "__asan_" to avoid name conflicts with other +// mach_override_ptr() implementations that may appear in the client program. mach_error_t -mach_override_ptr( +__asan_mach_override_ptr( void *originalFunctionAddress, const void *overrideFunctionAddress, void **originalFunctionReentryIsland ); From kcc at google.com Tue Dec 27 19:12:52 2011 From: kcc at google.com (Kostya Serebryany) Date: Tue, 27 Dec 2011 17:12:52 -0800 Subject: [llvm-commits] [PATCH] AddressSanitizer: avoid name conflicts between multiple mach_override instances. In-Reply-To: References: Message-ID: r147303 (I also added a README.txt entry). 
Please talk to the mach_override maintainers, they may want to accept some (or all) of our patches. --kcc On Tue, Dec 27, 2011 at 12:24 AM, Alexander Potapenko wrote: > Hi, > > The code instrumented with ASan may have its own instance of the > mach_override library. > In this case chances are that functions from it will be called from > mach_override_ptr() during ASan initialization. > This may lead to crashes (if those functions are instrumented) or > incorrect behavior (if the implementations differ). > > The attached patch renames mach_override_ptr() into > __asan_mach_override_ptr() and makes the rest of the mach_override > internals hidden. > The corresponding AddressSanitizer bug is > http://code.google.com/p/address-sanitizer/issues/detail?id=22 > > -- > Alexander Potapenko > Software Engineer > Google Moscow > From kcc at google.com Tue Dec 27 20:24:50 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 02:24:50 -0000 Subject: [llvm-commits] [compiler-rt] r147304 - in /compiler-rt/trunk/lib/asan: asan_interceptors.cc asan_interceptors.h tests/asan_test.cc Message-ID: <20111228022450.A6EAB1BE003@llvm.org> Author: kcc Date: Tue Dec 27 20:24:50 2011 New Revision: 147304 URL: http://llvm.org/viewvc/llvm-project?rev=147304&view=rev Log: [asan] interceptors for strcasecmp and strncasecmp.
patch by samsonov at google.com Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc compiler-rt/trunk/lib/asan/asan_interceptors.h compiler-rt/trunk/lib/asan/tests/asan_test.cc Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.cc?rev=147304&r1=147303&r2=147304&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interceptors.cc (original) +++ compiler-rt/trunk/lib/asan/asan_interceptors.cc Tue Dec 27 20:24:50 2011 @@ -20,8 +20,10 @@ #include "asan_stack.h" #include "asan_stats.h" +#include #include #include +#include namespace __asan { @@ -29,11 +31,13 @@ memcpy_f real_memcpy; memmove_f real_memmove; memset_f real_memset; +strcasecmp_f real_strcasecmp; strchr_f real_strchr; strcmp_f real_strcmp; strcpy_f real_strcpy; strdup_f real_strdup; strlen_f real_strlen; +strncasecmp_f real_strncasecmp; strncmp_f real_strncmp; strncpy_f real_strncpy; strnlen_f real_strnlen; @@ -123,11 +127,13 @@ INTERCEPT_FUNCTION(memcpy); INTERCEPT_FUNCTION(memmove); INTERCEPT_FUNCTION(memset); + INTERCEPT_FUNCTION(strcasecmp); INTERCEPT_FUNCTION(strchr); INTERCEPT_FUNCTION(strcmp); INTERCEPT_FUNCTION(strcpy); // NOLINT INTERCEPT_FUNCTION(strdup); INTERCEPT_FUNCTION(strlen); + INTERCEPT_FUNCTION(strncasecmp); INTERCEPT_FUNCTION(strncmp); INTERCEPT_FUNCTION(strncpy); #ifndef __APPLE__ @@ -202,6 +208,26 @@ return (c1 == c2) ? 0 : (c1 < c2) ? 
-1 : 1; } +static inline int CharCaseCmp(unsigned char c1, unsigned char c2) { + int c1_low = tolower(c1); + int c2_low = tolower(c2); + return c1_low - c2_low; +} + +int WRAP(strcasecmp)(const char *s1, const char *s2) { + ENSURE_ASAN_INITED(); + unsigned char c1, c2; + size_t i; + for (i = 0; ; i++) { + c1 = (unsigned char)s1[i]; + c2 = (unsigned char)s2[i]; + if (CharCaseCmp(c1, c2) != 0 || c1 == '\0') break; + } + ASAN_READ_RANGE(s1, i + 1); + ASAN_READ_RANGE(s2, i + 1); + return CharCaseCmp(c1, c2); +} + int WRAP(strcmp)(const char *s1, const char *s2) { // strcmp is called from malloc_default_purgeable_zone() // in __asan::ReplaceSystemAlloc() on Mac. @@ -259,6 +285,20 @@ return length; } +int WRAP(strncasecmp)(const char *s1, const char *s2, size_t size) { + ENSURE_ASAN_INITED(); + unsigned char c1 = 0, c2 = 0; + size_t i; + for (i = 0; i < size; i++) { + c1 = (unsigned char)s1[i]; + c2 = (unsigned char)s2[i]; + if (CharCaseCmp(c1, c2) != 0 || c1 == '\0') break; + } + ASAN_READ_RANGE(s1, Min(i + 1, size)); + ASAN_READ_RANGE(s2, Min(i + 1, size)); + return CharCaseCmp(c1, c2); +} + int WRAP(strncmp)(const char *s1, const char *s2, size_t size) { // strncmp is called from malloc_default_purgeable_zone() // in __asan::ReplaceSystemAlloc() on Mac. 
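Editorial sketch (not part of r147304): the strncasecmp interceptor above walks both strings until the first case-insensitive mismatch, a terminating NUL, or the size limit, and then checks exactly the bytes it touched. The standalone version below reproduces only that loop so the bounds arithmetic can be followed outside the ASan runtime. The names SketchStrNCaseCmp and CaseCmpResult are hypothetical, and the bytes_read field stands in for the length that the real code passes to ASAN_READ_RANGE.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Same idea as CharCaseCmp in the patch: compare two bytes after lowering.
static inline int CharCaseCmp(unsigned char c1, unsigned char c2) {
  return std::tolower(c1) - std::tolower(c2);
}

// Hypothetical result type: the comparison order plus the number of bytes
// of each string that the loop actually inspected.
struct CaseCmpResult {
  int order;
  size_t bytes_read;
};

// Illustrative stand-in for the interceptor body (no ASan checks here).
CaseCmpResult SketchStrNCaseCmp(const char *s1, const char *s2, size_t size) {
  unsigned char c1 = 0, c2 = 0;
  size_t i;
  for (i = 0; i < size; i++) {
    c1 = (unsigned char)s1[i];
    c2 = (unsigned char)s2[i];
    // Stop at the first difference or at the end of the first string.
    if (CharCaseCmp(c1, c2) != 0 || c1 == '\0') break;
  }
  // Mirrors Min(i + 1, size) in the patch: never report more than size bytes.
  size_t read = (i + 1 < size) ? i + 1 : size;
  return {CharCaseCmp(c1, c2), read};
}
```

With size 0 the loop never runs, so nothing is read and the result is 0, which is why the tests in this commit can call strncasecmp on out-of-bounds pointers with a zero length without tripping a report.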
Modified: compiler-rt/trunk/lib/asan/asan_interceptors.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.h?rev=147304&r1=147303&r2=147304&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interceptors.h (original) +++ compiler-rt/trunk/lib/asan/asan_interceptors.h Tue Dec 27 20:24:50 2011 @@ -70,11 +70,13 @@ void *WRAP(memcpy)(void *to, const void *from, size_t size); void *WRAP(memmove)(void *to, const void *from, size_t size); void *WRAP(memset)(void *block, int c, size_t size); +int WRAP(strcasecmp)(const char *s1, const char *s2); char *WRAP(strchr)(const char *string, int c); int WRAP(strcmp)(const char *s1, const char *s2); char *WRAP(strcpy)(char *to, const char *from); // NOLINT char *WRAP(strdup)(const char *s); size_t WRAP(strlen)(const char *s); +int WRAP(strncasecmp)(const char *s1, const char *s2, size_t n); int WRAP(strncmp)(const char *s1, const char *s2, size_t size); char *WRAP(strncpy)(char *to, const char *from, size_t size); #endif @@ -85,11 +87,13 @@ typedef void* (*memcpy_f)(void *to, const void *from, size_t size); typedef void* (*memmove_f)(void *to, const void *from, size_t size); typedef void* (*memset_f)(void *block, int c, size_t size); +typedef int (*strcasecmp_f)(const char *s1, const char *s2); typedef char* (*strchr_f)(const char *str, int c); typedef int (*strcmp_f)(const char *s1, const char *s2); typedef char* (*strcpy_f)(char *to, const char *from); typedef char* (*strdup_f)(const char *s); typedef size_t (*strlen_f)(const char *s); +typedef int (*strncasecmp_f)(const char *s1, const char *s2, size_t n); typedef int (*strncmp_f)(const char *s1, const char *s2, size_t size); typedef char* (*strncpy_f)(char *to, const char *from, size_t size); typedef size_t (*strnlen_f)(const char *s, size_t maxlen); @@ -99,11 +103,13 @@ extern memcpy_f real_memcpy; extern memmove_f real_memmove; extern memset_f real_memset; +extern 
strcasecmp_f real_strcasecmp; extern strchr_f real_strchr; extern strcmp_f real_strcmp; extern strcpy_f real_strcpy; extern strdup_f real_strdup; extern strlen_f real_strlen; +extern strncasecmp_f real_strncasecmp; extern strncmp_f real_strncmp; extern strncpy_f real_strncpy; extern strnlen_f real_strnlen; Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=147304&r1=147303&r2=147304&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original) +++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Tue Dec 27 20:24:50 2011 @@ -14,6 +14,7 @@ #include #include #include +#include #include #include #include @@ -1050,11 +1051,16 @@ free(heap_string); } +static inline char* MallocAndMemsetString(size_t size) { + char *s = Ident((char*)malloc(size)); + memset(s, 'z', size); + return s; +} + #ifndef __APPLE__ TEST(AddressSanitizer, StrNLenOOBTest) { size_t size = Ident(123); - char *str = Ident((char*)malloc(size)); - memset(str, 'z', size); + char *str = MallocAndMemsetString(size); // Normal strnlen calls. Ident(strnlen(str - 1, 0)); Ident(strnlen(str, size)); @@ -1073,9 +1079,8 @@ TEST(AddressSanitizer, StrDupOOBTest) { size_t size = Ident(42); - char *str = Ident((char*)malloc(size)); + char *str = MallocAndMemsetString(size); char *new_str; - memset(str, 'z', size); // Normal strdup calls. 
str[size - 1] = '\0'; new_str = strdup(str); @@ -1157,8 +1162,7 @@ typedef char*(*PointerToStrChr)(const char*, int); void RunStrChrTest(PointerToStrChr StrChr) { size_t size = Ident(100); - char *str = Ident((char*)malloc(size)); - memset(str, 'z', size); + char *str = MallocAndMemsetString(size); str[10] = 'q'; str[11] = '\0'; EXPECT_EQ(str, StrChr(str, 'z')); @@ -1181,81 +1185,113 @@ // strcmp EXPECT_EQ(0, strcmp("", "")); EXPECT_EQ(0, strcmp("abcd", "abcd")); - EXPECT_EQ(-1, strcmp("ab", "ac")); - EXPECT_EQ(-1, strcmp("abc", "abcd")); - EXPECT_EQ(1, strcmp("acc", "abc")); - EXPECT_EQ(1, strcmp("abcd", "abc")); + EXPECT_GT(0, strcmp("ab", "ac")); + EXPECT_GT(0, strcmp("abc", "abcd")); + EXPECT_LT(0, strcmp("acc", "abc")); + EXPECT_LT(0, strcmp("abcd", "abc")); // strncmp EXPECT_EQ(0, strncmp("a", "b", 0)); EXPECT_EQ(0, strncmp("abcd", "abcd", 10)); EXPECT_EQ(0, strncmp("abcd", "abcef", 3)); - EXPECT_EQ(-1, strncmp("abcde", "abcfa", 4)); - EXPECT_EQ(-1, strncmp("a", "b", 5)); - EXPECT_EQ(-1, strncmp("bc", "bcde", 4)); - EXPECT_EQ(1, strncmp("xyz", "xyy", 10)); - EXPECT_EQ(1, strncmp("baa", "aaa", 1)); - EXPECT_EQ(1, strncmp("zyx", "", 2)); -} - -static inline char* MallocAndMemsetString(size_t size) { - char *s = Ident((char*)malloc(size)); - memset(s, 'z', size); - return s; + EXPECT_GT(0, strncmp("abcde", "abcfa", 4)); + EXPECT_GT(0, strncmp("a", "b", 5)); + EXPECT_GT(0, strncmp("bc", "bcde", 4)); + EXPECT_LT(0, strncmp("xyz", "xyy", 10)); + EXPECT_LT(0, strncmp("baa", "aaa", 1)); + EXPECT_LT(0, strncmp("zyx", "", 2)); + + // strcasecmp + EXPECT_EQ(0, strcasecmp("", "")); + EXPECT_EQ(0, strcasecmp("zzz", "zzz")); + EXPECT_EQ(0, strcasecmp("abCD", "ABcd")); + EXPECT_GT(0, strcasecmp("aB", "Ac")); + EXPECT_GT(0, strcasecmp("ABC", "ABCd")); + EXPECT_LT(0, strcasecmp("acc", "abc")); + EXPECT_LT(0, strcasecmp("ABCd", "abc")); + + // strncasecmp + EXPECT_EQ(0, strncasecmp("a", "b", 0)); + EXPECT_EQ(0, strncasecmp("abCD", "ABcd", 10)); + EXPECT_EQ(0, 
strncasecmp("abCd", "ABcef", 3)); + EXPECT_GT(0, strncasecmp("abcde", "ABCfa", 4)); + EXPECT_GT(0, strncasecmp("a", "B", 5)); + EXPECT_GT(0, strncasecmp("bc", "BCde", 4)); + EXPECT_LT(0, strncasecmp("xyz", "xyy", 10)); + EXPECT_LT(0, strncasecmp("Baa", "aaa", 1)); + EXPECT_LT(0, strncasecmp("zyx", "", 2)); } -TEST(AddressSanitizer, StrCmpOOBTest) { +typedef int(*PointerToStrCmp)(const char*, const char*); +void RunStrCmpTest(PointerToStrCmp StrCmp) { size_t size = Ident(100); char *s1 = MallocAndMemsetString(size); char *s2 = MallocAndMemsetString(size); s1[size - 1] = '\0'; s2[size - 1] = '\0'; - // Normal strcmp calls - Ident(strcmp(s1, s2)); - Ident(strcmp(s1, s2 + size - 1)); - Ident(strcmp(s1 + size - 1, s2 + size - 1)); + // Normal StrCmp calls + Ident(StrCmp(s1, s2)); + Ident(StrCmp(s1, s2 + size - 1)); + Ident(StrCmp(s1 + size - 1, s2 + size - 1)); s1[size - 1] = 'z'; s2[size - 1] = 'x'; - Ident(strcmp(s1, s2)); + Ident(StrCmp(s1, s2)); // One of arguments points to not allocated memory. - EXPECT_DEATH(Ident(strcmp)(s1 - 1, s2), LeftOOBErrorMessage(1)); - EXPECT_DEATH(Ident(strcmp)(s1, s2 - 1), LeftOOBErrorMessage(1)); - EXPECT_DEATH(Ident(strcmp)(s1 + size, s2), RightOOBErrorMessage(0)); - EXPECT_DEATH(Ident(strcmp)(s1, s2 + size), RightOOBErrorMessage(0)); + EXPECT_DEATH(Ident(StrCmp)(s1 - 1, s2), LeftOOBErrorMessage(1)); + EXPECT_DEATH(Ident(StrCmp)(s1, s2 - 1), LeftOOBErrorMessage(1)); + EXPECT_DEATH(Ident(StrCmp)(s1 + size, s2), RightOOBErrorMessage(0)); + EXPECT_DEATH(Ident(StrCmp)(s1, s2 + size), RightOOBErrorMessage(0)); // Hit unallocated memory and die. 
s2[size - 1] = 'z'; - EXPECT_DEATH(Ident(strcmp)(s1, s1), RightOOBErrorMessage(0)); - EXPECT_DEATH(Ident(strcmp)(s1 + size - 1, s2), RightOOBErrorMessage(0)); + EXPECT_DEATH(Ident(StrCmp)(s1, s1), RightOOBErrorMessage(0)); + EXPECT_DEATH(Ident(StrCmp)(s1 + size - 1, s2), RightOOBErrorMessage(0)); free(s1); free(s2); } -TEST(AddressSanitizer, StrNCmpOOBTest) { +TEST(AddressSanitizer, StrCmpOOBTest) { + RunStrCmpTest(&strcmp); +} + +TEST(AddressSanitizer, StrCaseCmpOOBTest) { + RunStrCmpTest(&strcasecmp); +} + +typedef int(*PointerToStrNCmp)(const char*, const char*, size_t); +void RunStrNCmpTest(PointerToStrNCmp StrNCmp) { size_t size = Ident(100); char *s1 = MallocAndMemsetString(size); char *s2 = MallocAndMemsetString(size); s1[size - 1] = '\0'; s2[size - 1] = '\0'; - // Normal strncmp calls - Ident(strncmp(s1, s2, size + 2)); + // Normal StrNCmp calls + Ident(StrNCmp(s1, s2, size + 2)); s1[size - 1] = 'z'; s2[size - 1] = 'x'; - Ident(strncmp(s1 + size - 2, s2 + size - 2, size)); + Ident(StrNCmp(s1 + size - 2, s2 + size - 2, size)); s2[size - 1] = 'z'; - Ident(strncmp(s1 - 1, s2 - 1, 0)); - Ident(strncmp(s1 + size - 1, s2 + size - 1, 1)); + Ident(StrNCmp(s1 - 1, s2 - 1, 0)); + Ident(StrNCmp(s1 + size - 1, s2 + size - 1, 1)); // One of arguments points to not allocated memory. - EXPECT_DEATH(Ident(strncmp)(s1 - 1, s2, 1), LeftOOBErrorMessage(1)); - EXPECT_DEATH(Ident(strncmp)(s1, s2 - 1, 1), LeftOOBErrorMessage(1)); - EXPECT_DEATH(Ident(strncmp)(s1 + size, s2, 1), RightOOBErrorMessage(0)); - EXPECT_DEATH(Ident(strncmp)(s1, s2 + size, 1), RightOOBErrorMessage(0)); + EXPECT_DEATH(Ident(StrNCmp)(s1 - 1, s2, 1), LeftOOBErrorMessage(1)); + EXPECT_DEATH(Ident(StrNCmp)(s1, s2 - 1, 1), LeftOOBErrorMessage(1)); + EXPECT_DEATH(Ident(StrNCmp)(s1 + size, s2, 1), RightOOBErrorMessage(0)); + EXPECT_DEATH(Ident(StrNCmp)(s1, s2 + size, 1), RightOOBErrorMessage(0)); // Hit unallocated memory and die. 
- EXPECT_DEATH(Ident(strncmp)(s1 + 1, s2 + 1, size), RightOOBErrorMessage(0)); - EXPECT_DEATH(Ident(strncmp)(s1 + size - 1, s2, 2), RightOOBErrorMessage(0)); + EXPECT_DEATH(Ident(StrNCmp)(s1 + 1, s2 + 1, size), RightOOBErrorMessage(0)); + EXPECT_DEATH(Ident(StrNCmp)(s1 + size - 1, s2, 2), RightOOBErrorMessage(0)); free(s1); free(s2); } +TEST(AddressSanitizer, StrNCmpOOBTest) { + RunStrNCmpTest(&strncmp); +} + +TEST(AddressSanitizer, StrNCaseCmpOOBTest) { + RunStrNCmpTest(&strncasecmp); +} + static const char *kOverlapErrorMessage = "strcpy-param-overlap"; TEST(AddressSanitizer, StrArgsOverlapTest) { From kcc at google.com Tue Dec 27 20:28:26 2011 From: kcc at google.com (Kostya Serebryany) Date: Tue, 27 Dec 2011 18:28:26 -0800 Subject: [llvm-commits] Patch for AddressSanitizer [projects/compiler-rt/lib/asan]: interceptors for strcasecmp and strncasecmp In-Reply-To: References: Message-ID: r147304. thanks! On Tue, Dec 27, 2011 at 5:31 AM, Alexey Samsonov wrote: > Rietveld link: http://codereview.appspot.com/5500082/ > > -- > Alexey Samsonov > Software Engineer, Moscow > samsonov at google.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111227/8cb287ea/attachment.html From raghesh.a at gmail.com Tue Dec 27 20:48:26 2011 From: raghesh.a at gmail.com (Raghesh Aloor) Date: Wed, 28 Dec 2011 02:48:26 -0000 Subject: [llvm-commits] [polly] r147305 - /polly/trunk/lib/CodeGeneration.cpp Message-ID: <20111228024826.F36941BE003@llvm.org> Author: raghesh Date: Tue Dec 27 20:48:26 2011 New Revision: 147305 URL: http://llvm.org/viewvc/llvm-project?rev=147305&view=rev Log: Memaccess: Using isl_map_dim_max Use isl_map_dim_max to extract the details of the changed access relation. Only constant access functions are supported now. 
Modified: polly/trunk/lib/CodeGeneration.cpp Modified: polly/trunk/lib/CodeGeneration.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/CodeGeneration.cpp?rev=147305&r1=147304&r2=147305&view=diff ============================================================================== --- polly/trunk/lib/CodeGeneration.cpp (original) +++ polly/trunk/lib/CodeGeneration.cpp Tue Dec 27 20:48:26 2011 @@ -44,6 +44,8 @@ #include "cloog/cloog.h" #include "cloog/isl/cloog.h" +#include "isl/aff.h" + #include #include @@ -83,6 +85,11 @@ typedef DenseMap ValueMapT; typedef DenseMap CharMapT; typedef std::vector VectorValueMapT; +typedef struct { + Value *BaseAddress; + Value *Result; + IRBuilder<> *Builder; +}IslPwAffUserInfo; // Create a new loop. // @@ -315,6 +322,58 @@ return vector; } + static Value* islAffToValue(__isl_take isl_aff *Aff, + IslPwAffUserInfo *UserInfo) { + assert(isl_aff_is_cst(Aff) && "Only constant access functions supported"); + + IRBuilder<> *Builder = UserInfo->Builder; + + isl_int OffsetIsl; + mpz_t OffsetMPZ; + + isl_int_init(OffsetIsl); + mpz_init(OffsetMPZ); + isl_aff_get_constant(Aff, &OffsetIsl); + isl_int_get_gmp(OffsetIsl, OffsetMPZ); + + Value *OffsetValue = NULL; + APInt Offset = APInt_from_MPZ(OffsetMPZ); + OffsetValue = ConstantInt::get(Builder->getContext(), Offset); + + mpz_clear(OffsetMPZ); + isl_int_clear(OffsetIsl); + isl_aff_free(Aff); + + return OffsetValue; + } + + static int mergeIslAffValues(__isl_take isl_set *Set, + __isl_take isl_aff *Aff, void *User) { + IslPwAffUserInfo *UserInfo = (IslPwAffUserInfo *)User; + + assert((UserInfo->Result == NULL) && "Result is already set." 
+ "Currently only single isl_aff is supported"); + assert(isl_set_plain_is_universe(Set) + && "Code generation failed because the set is not universe"); + + UserInfo->Result = islAffToValue(Aff, UserInfo); + + isl_set_free(Set); + return 0; + } + + Value* islPwAffToValue(__isl_take isl_pw_aff *PwAff, Value *BaseAddress) { + IslPwAffUserInfo UserInfo; + UserInfo.BaseAddress = BaseAddress; + UserInfo.Result = NULL; + UserInfo.Builder = &Builder; + isl_pw_aff_foreach_piece(PwAff, mergeIslAffValues, &UserInfo); + assert(UserInfo.Result && "Code generation for isl_pw_aff failed"); + + isl_pw_aff_free(PwAff); + return UserInfo.Result; + } + /// @brief Get the memory access offset to be added to the base address std::vector getMemoryAccessIndex(__isl_keep isl_map *AccessRelation, Value *BaseAddress) { @@ -324,13 +383,9 @@ assert((isl_map_dim(AccessRelation, isl_dim_out) == 1) && "Only single dimensional access functions supported"); - if (isl_map_plain_is_fixed(AccessRelation, isl_dim_out, - 0, &OffsetMPZ) == -1) - errs() << "Only fixed value access functions supported\n"; + isl_pw_aff *PwAff = isl_map_dim_max(isl_map_copy(AccessRelation), 0); + Value *OffsetValue = islPwAffToValue(PwAff, BaseAddress); - // Convert the offset from MPZ to Value*. - APInt Offset = APInt_from_MPZ(OffsetMPZ); - Value *OffsetValue = ConstantInt::get(Builder.getContext(), Offset); PointerType *BaseAddressType = dyn_cast( BaseAddress->getType()); Type *ArrayTy = BaseAddressType->getElementType(); From nicholas at mxc.ca Wed Dec 28 00:57:32 2011 From: nicholas at mxc.ca (Nick Lewycky) Date: Wed, 28 Dec 2011 06:57:32 -0000 Subject: [llvm-commits] [llvm] r147307 - /llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Message-ID: <20111228065732.B91E61BE003@llvm.org> Author: nicholas Date: Wed Dec 28 00:57:32 2011 New Revision: 147307 URL: http://llvm.org/viewvc/llvm-project?rev=147307&view=rev Log: Demystify this comment. 
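Editorial sketch (not part of r147307): the comment rewritten by this revision boils down to a small piece of arithmetic on branch weights. If one branch has weights (A, B) and the other (C, D), the merged branch gets A*C for the edge to %bbD and A*D + B*C + B*D for the edge to %bbC, reduced by their GCD. The code below is a minimal illustration of that formula only. The names Weights and MergeBranchWeights are hypothetical, and it assumes the weights are small enough that 64-bit arithmetic cannot overflow, whereas the code in SimplifyCFG.cpp uses APInt with explicit overflow tracking and drops the metadata on overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair of branch weights: (taken edge, not-taken edge).
struct Weights {
  uint64_t taken;
  uint64_t not_taken;
};

// Euclid's algorithm; returns 0 only when both inputs are 0.
static uint64_t Gcd(uint64_t x, uint64_t y) {
  while (y != 0) {
    uint64_t t = x % y;
    x = y;
    y = t;
  }
  return x;
}

// Merge weights (A, B) and (C, D) as described in the comment:
//   to %bbD: A*C
//   to %bbC: A*D + B*C + B*D
// Assumption: no overflow checks, unlike the real patch.
Weights MergeBranchWeights(Weights ab, Weights cd) {
  uint64_t to_d = ab.taken * cd.taken;
  uint64_t to_c = ab.taken * cd.not_taken + ab.not_taken * cd.taken +
                  ab.not_taken * cd.not_taken;
  uint64_t g = Gcd(to_d, to_c);
  if (g == 0) return {0, 0};  // both weights were zero; nothing to reduce
  return {to_d / g, to_c / g};
}
```

As a check against the test file touched by r147286: weights (1, 2) merged with (1, 1) give 1 and 1 + 2 + 2 = 5, which matches the "i32 1, i32 5" expected for test2 in preserve-branchweights.ll.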
Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp?rev=147307&r1=147306&r2=147307&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Wed Dec 28 00:57:32 2011 @@ -1664,12 +1664,23 @@ // Merge probability data into PredBlock's branch. APInt A, B, C, D; if (ExtractBranchMetadata(PBI, C, D) && ExtractBranchMetadata(BI, A, B)) { - // bbA: br bbB (a% probability), bbC (b% prob.) - // bbB: br bbD (c% probability), bbC (d% prob.) - // --> bbA: br bbD ((a*c)% prob.), bbC ((b+a*d)% prob.) + // Given IR which does: + // bbA: + // br i1 %x, label %bbB, label %bbC + // bbB: + // br i1 %y, label %bbD, label %bbC + // Let's call the probability that we take the edge from %bbA to %bbB + // 'a', from %bbA to %bbC, 'b', from %bbB to %bbD 'c' and from %bbB to + // %bbC probability 'd'. // - // Probabilities aren't stored as ratios directly. Converting to - // probability-numerator form, we get: + // We transform the IR into: + // bbA: + // br i1 %z, label %bbD, label %bbC + // where the probability of going to %bbD is (a*c) and going to bbC is + // (b+a*d). + // + // Probabilities aren't stored as ratios directly. Using branch weights, + // we get: // (a*c)% = A*C, (b+(a*d))% = A*D+B*C+B*D. 
bool Overflow1 = false, Overflow2 = false, Overflow3 = false; From nicholas at mxc.ca Wed Dec 28 01:05:41 2011 From: nicholas at mxc.ca (Nick Lewycky) Date: Tue, 27 Dec 2011 23:05:41 -0800 Subject: [llvm-commits] [llvm] r147286 - in /llvm/trunk: lib/Transforms/Utils/SimplifyCFG.cpp test/Transforms/SimplifyCFG/preserve-branchweights.ll In-Reply-To: References: <20111227043152.F2AE42A6C12C@llvm.org> Message-ID: <4EFABFC5.2030905@mxc.ca> On 12/27/2011 02:50 AM, Chandler Carruth wrote: > > On Dec 26, 2011 11:40 PM, "Nick Lewycky" > wrote: > > > > Author: nicholas > > Date: Mon Dec 26 22:31:52 2011 > > New Revision: 147286 > > > > URL: http://llvm.org/viewvc/llvm-project?rev=147286&view=rev > > > Log: > > Teach simplifycfg to recompute branch weights when merging some > branches, and > > to discard weights when appropriate. Still more to do (and a new > TODO), but > > it's a start! > > > > Modified: > > llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp > > llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll > > > > Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp > > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp?rev=147286&r1=147285&r2=147286&view=diff > > > > ============================================================================== > > --- llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp (original) > > +++ llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Mon Dec 26 > 22:31:52 2011 > > @@ -18,6 +18,8 @@ > > #include "llvm/GlobalVariable.h" > > #include "llvm/Instructions.h" > > #include "llvm/IntrinsicInst.h" > > +#include "llvm/LLVMContext.h" > > +#include "llvm/Metadata.h" > > #include "llvm/Type.h" > > #include "llvm/Analysis/InstructionSimplify.h" > > #include "llvm/Analysis/ValueTracking.h" > > @@ -1462,6 +1464,26 @@ > > return true; > > } > > > > +/// ExtractBranchMetadata - Given a conditional BranchInstruction, > retrieve the > > +/// probabilities of the branch taking each edge. 
Fills in the two APInt > > +/// parameters and return true, or returns false if no or invalid > metadata was > > +/// found. > > +static bool ExtractBranchMetadata(BranchInst *BI, > > + APInt &ProbTrue, APInt &ProbFalse) { > > + assert(BI->isConditional() && > > + "Looking for probabilities on unconditional branch?"); > > + MDNode *ProfileData = BI->getMetadata(LLVMContext::MD_prof); > > + if (!ProfileData || ProfileData->getNumOperands() != 3) return 0; > > return false; // ? Done. > > > + ConstantInt *CITrue = > dyn_cast(ProfileData->getOperand(1)); > > + ConstantInt *CIFalse = > dyn_cast(ProfileData->getOperand(2)); > > + if (!CITrue || !CIFalse) return 0; > > return false; // ? Done. > > + ProbTrue = CITrue->getValue(); > > + ProbFalse = CIFalse->getValue(); > > + assert(ProbTrue.getBitWidth() == 32 && ProbFalse.getBitWidth() == > 32 && > > + "Branch probability metadata must be 32-bit integers"); > > + return true; > > +} > > + > > /// FoldBranchToCommonDest - If this basic block is simple enough, > and if a > > /// predecessor branches to us and one of our successors, fold the > block into > > /// the predecessor and use logical operations to pick the right > destination. > > @@ -1636,6 +1658,51 @@ > > PBI->setSuccessor(1, FalseDest); > > } > > > > + // TODO: If BB is reachable from all paths through PredBlock, > then we > > + // could replace PBI's branch probabilities with BI's. > > + > > + // Merge probability data into PredBlock's branch. > > + APInt A, B, C, D; > > + if (ExtractBranchMetadata(PBI, C, D) && > ExtractBranchMetadata(BI, A, B)) { > > + // bbA: br bbB (a% probability), bbC (b% prob.) > > + // bbB: br bbD (c% probability), bbC (d% prob.) > > I don't understand this comment at all... the association between > letters is particularly mysterious. Fair enough. I've replaced this with almost a page of text. > > + // --> bbA: br bbD ((a*c)% prob.), bbC ((b+a*d)% prob.) > > + // > > + // Probabilities aren't stored as ratios directly. 
Converting to > > + // probability-numerator form, we get: > > + // (a*c)% = A*C, (b+(a*d))% = A*D+B*C+B*D. > > Why is this done with explicit math? At the least it seems like we > should be able to form BranchProbability objects to represent the ratio > form. Even better would be to use the BranchProbability analysis to > compute the ratios from the metadata? I'm not sure what you mean by BranchProbability analysis? I've looked at the BranchProbability class, and it doesn't have any mathematical operations on it, just comparisons. I could add them if you want? The other thing is that I'd still have to compute the denominator needlessly, but that's only a uadd_ov away. If you were thinking of BlockFrequency, I'd appreciate some documentation for it. I don't know what it computes (what is a "block frequency"?) or how it relates to the problem I'm solving. I noticed that it doesn't have any way to turn into a BranchProbability or a MD_prof node, so I hadn't given it too much thought. Nick > > + > > + bool Overflow1 = false, Overflow2 = false, Overflow3 = false; > > + bool Overflow4 = false, Overflow5 = false, Overflow6 = false; > > + APInt ProbTrue = A.umul_ov(C, Overflow1); > > + > > + APInt Tmp1 = A.umul_ov(D, Overflow2); > > + APInt Tmp2 = B.umul_ov(C, Overflow3); > > + APInt Tmp3 = B.umul_ov(D, Overflow4); > > + APInt Tmp4 = Tmp1.uadd_ov(Tmp2, Overflow5); > > + APInt ProbFalse = Tmp4.uadd_ov(Tmp3, Overflow6); > > + > > + APInt GCD = APIntOps::GreatestCommonDivisor(ProbTrue, ProbFalse); > > + ProbTrue = ProbTrue.udiv(GCD); > > + ProbFalse = ProbFalse.udiv(GCD); > > + > > + if (Overflow1 || Overflow2 || Overflow3 || Overflow4 || > Overflow5 || > > + Overflow6) { > > + DEBUG(dbgs() << "Overflow recomputing branch weight on: " << > *PBI > > + << "when merging with: " << *BI); > > + PBI->setMetadata(LLVMContext::MD_prof, NULL); > > + } else { > > + LLVMContext &Context = BI->getContext(); > > + Value *Ops[3]; > > + Ops[0] = 
BI->getMetadata(LLVMContext::MD_prof)->getOperand(0); > > + Ops[1] = ConstantInt::get(Context, ProbTrue); > > + Ops[2] = ConstantInt::get(Context, ProbFalse); > > + PBI->setMetadata(LLVMContext::MD_prof, MDNode::get(Context, > Ops)); > > + } > > + } else { > > + PBI->setMetadata(LLVMContext::MD_prof, NULL); > > + } > > + > > // Copy any debug value intrinsics into the end of PredBlock. > > for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; > ++I) > > if (isa(*I)) > > > > Modified: > llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll > > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll?rev=147286&r1=147285&r2=147286&view=diff > > > > ============================================================================== > > --- llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll > (original) > > +++ llvm/trunk/test/Transforms/SimplifyCFG/preserve-branchweights.ll > Mon Dec 26 22:31:52 2011 > > @@ -10,6 +10,45 @@ > > > > X: > > %c = or i1 %b, false > > + br i1 %c, label %Z, label %Y, !prof !1 > > + > > +Y: > > + call void @helper(i32 0) > > + ret void > > + > > +Z: > > + call void @helper(i32 1) > > + ret void > > +} > > + > > +define void @test2(i1 %a, i1 %b) { > > +; CHECK: @test2 > > +entry: > > + br i1 %a, label %X, label %Y, !prof !1 > > +; CHECK: br i1 %or.cond, label %Z, label %Y, !prof !1 > > +; CHECK-NOT: !prof > > + > > +X: > > + %c = or i1 %b, false > > + br i1 %c, label %Z, label %Y, !prof !2 > > + > > +Y: > > + call void @helper(i32 0) > > + ret void > > + > > +Z: > > + call void @helper(i32 1) > > + ret void > > +} > > + > > +define void @test3(i1 %a, i1 %b) { > > +; CHECK: @test3 > > +; CHECK-NOT: !prof > > +entry: > > + br i1 %a, label %X, label %Y, !prof !1 > > + > > +X: > > + %c = or i1 %b, false > > br i1 %c, label %Z, label %Y > > > > Y: > > @@ -21,6 +60,29 @@ > > ret void > > } > > > > -!0 = metadata !{metadata !"branch_weights", i32 1, i32 2} > > +define 
void @test4(i1 %a, i1 %b) { > > +; CHECK: @test4 > > +; CHECK-NOT: !prof > > +entry: > > + br i1 %a, label %X, label %Y > > + > > +X: > > + %c = or i1 %b, false > > + br i1 %c, label %Z, label %Y, !prof !1 > > + > > +Y: > > + call void @helper(i32 0) > > + ret void > > + > > +Z: > > + call void @helper(i32 1) > > + ret void > > +} > > + > > +!0 = metadata !{metadata !"branch_weights", i32 3, i32 5} > > +!1 = metadata !{metadata !"branch_weights", i32 1, i32 1} > > +!2 = metadata !{metadata !"branch_weights", i32 1, i32 2} > > > > -; CHECK: !0 = metadata !{metadata !"branch_weights", i32 2, i32 1} > > +; CHECK: !0 = metadata !{metadata !"branch_weights", i32 5, i32 11} > > +; CHECK: !1 = metadata !{metadata !"branch_weights", i32 1, i32 5} > > +; CHECK-NOT: !2 > > > > > > _______________________________________________ > > llvm-commits mailing list > > llvm-commits at cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From elena.demikhovsky at intel.com Wed Dec 28 02:14:01 2011 From: elena.demikhovsky at intel.com (Elena Demikhovsky) Date: Wed, 28 Dec 2011 08:14:01 -0000 Subject: [llvm-commits] [llvm] r147308 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/avx-shuffle.ll Message-ID: <20111228081401.E07221BE003@llvm.org> Author: delena Date: Wed Dec 28 02:14:01 2011 New Revision: 147308 URL: http://llvm.org/viewvc/llvm-project?rev=147308&view=rev Log: Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR. Matching MOVLP mask for AVX (256-bit vectors) was wrong. The failure was detected by conformance tests.
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/avx-shuffle.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=147308&r1=147307&r2=147308&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Dec 28 02:14:01 2011 @@ -3448,6 +3448,11 @@ /// isMOVLPMask - Return true if the specified VECTOR_SHUFFLE operand /// specifies a shuffle of elements that is suitable for input to MOVLP{S|D}. bool X86::isMOVLPMask(ShuffleVectorSDNode *N) { + EVT VT = N->getValueType(0); + + if (VT.getSizeInBits() != 128) + return false; + unsigned NumElems = N->getValueType(0).getVectorNumElements(); if (NumElems != 2 && NumElems != 4) @@ -3666,6 +3671,8 @@ static bool isMOVLMask(const SmallVectorImpl &Mask, EVT VT) { if (VT.getVectorElementType().getSizeInBits() < 32) return false; + if (VT.getSizeInBits() == 256) + return false; int NumElts = VT.getVectorNumElements(); @@ -5158,16 +5165,30 @@ return DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT, Item); } else if (ExtVT == MVT::i32 || ExtVT == MVT::f32 || ExtVT == MVT::f64 || (ExtVT == MVT::i64 && Subtarget->is64Bit())) { + if (VT.getSizeInBits() == 256) { + + EVT VT128 = EVT::getVectorVT(*DAG.getContext(), ExtVT, NumElems / 2); + Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT128, Item); + SDValue ZeroVec = getZeroVector(VT, true, DAG, dl); + return Insert128BitVector(ZeroVec, Item, DAG.getConstant(0, MVT::i32), + DAG, dl); + } Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT, Item); // Turn it into a MOVL (i.e. movss, movsd, or movd) to a zero vector. 
return getShuffleVectorZeroOrUndef(Item, 0, true,Subtarget->hasXMMInt(), DAG); } else if (ExtVT == MVT::i16 || ExtVT == MVT::i8) { Item = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i32, Item); - unsigned NumBits = VT.getSizeInBits(); - assert((NumBits == 128 || NumBits == 256) && - "Expected an SSE or AVX value type!"); - EVT MiddleVT = NumBits == 128 ? MVT::v4i32 : MVT::v8i32; + if (VT.getSizeInBits() == 256) { + + EVT VT128 = EVT::getVectorVT(*DAG.getContext(), ExtVT, NumElems / 2); + Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT128, Item); + SDValue ZeroVec = getZeroVector(VT, true, DAG, dl); + return Insert128BitVector(ZeroVec, Item, DAG.getConstant(0, MVT::i32), + DAG, dl); + } + assert (VT.getSizeInBits() == 128 || "Expected an SSE value type!"); + EVT MiddleVT = MVT::v4i32; Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, MiddleVT, Item); Item = getShuffleVectorZeroOrUndef(Item, 0, true, Subtarget->hasXMMInt(), DAG); Modified: llvm/trunk/test/CodeGen/X86/avx-shuffle.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-shuffle.ll?rev=147308&r1=147307&r2=147308&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-shuffle.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-shuffle.ll Wed Dec 28 02:14:01 2011 @@ -13,8 +13,22 @@ define <3 x i64> @test2(<2 x i64> %v) nounwind readnone { ; CHECK: test2: ; CHECK: vxorpd -; CHECK: vmovsd +; CHECK: vperm2f128 %1 = shufflevector <2 x i64> %v, <2 x i64> %v, <3 x i32> %2 = shufflevector <3 x i64> zeroinitializer, <3 x i64> %1, <3 x i32> ret <3 x i64> %2 } + +define <4 x i64> @test3(<4 x i64> %a, <4 x i64> %b) nounwind { + %c = shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> + ret <4 x i64> %c +; CHECK: test3: +; CHECK: vperm2f128 +} + +define <8 x float> @test4(float %a) nounwind { + %b = insertelement <8 x float> zeroinitializer, float %a, i32 0 + ret <8 x float> %b +; CHECK: test4: +; CHECK: vinsertf128 +} \ No newline at end 
of file From clattner at apple.com Wed Dec 28 02:29:12 2011 From: clattner at apple.com (Chris Lattner) Date: Wed, 28 Dec 2011 00:29:12 -0800 Subject: [llvm-commits] [patch] Remove the old ELF writer In-Reply-To: <4EF66438.8060406@gmail.com> References: <4EF66438.8060406@gmail.com> Message-ID: <17C96F93-AF42-469B-8038-BE71F777B918@apple.com> On Dec 24, 2011, at 3:46 PM, Rafael Ávila de Espíndola wrote: > Currently we have two ELF writers in LLVM. One is based on MC and is > used for direct object emission and MCJIT. The old one is only used to > provide debug info for the old JIT. > > With MCJIT for ELF coming along, I would like to propose removing the > old ELF writer. It is a fairly large chunk of code, and it blocks some > cleanup in the debug generation that has to work with both it and MC. > > The downside is that the old JIT will no longer produce debug info. > > The attached patch deletes just the ELF writer and direct callers. It is > already a fairly big cleanup: > > 10 files changed, 3 insertions(+), 2223 deletions(-) > > Is it OK? Should I propose this on llvmdev? Just MHO, I would really love to see this happen. Does this also remove the JIT EH code? -Chris From glider at google.com Wed Dec 28 02:54:15 2011 From: glider at google.com (Alexander Potapenko) Date: Wed, 28 Dec 2011 12:54:15 +0400 Subject: [llvm-commits] [PATCH] AddressSanitizer: avoid name conflicts between multiple mach_override instances. In-Reply-To: References: Message-ID: I've sent some of them to the author, although we're a bit ahead of him again. I'll try to ping him. On Wed, Dec 28, 2011 at 5:12 AM, Kostya Serebryany wrote: > r147303 (I also added a README.txt entry). > Please talk to the mach_override maintainers, they may want to accept some > (or all) of our patches. > > --kcc > > > On Tue, Dec 27, 2011 at 12:24 AM, Alexander Potapenko > wrote: >> >> Hi, >> >> The code instrumented with ASan may have its own instance of the >> mach_override library.
>> In this case chances are that functions from it will be called from >> mach_override_ptr() during ASan initialization. >> This may lead to crashes (if those functions are instrumented) or >> incorrect behavior (if the implementations differ). >> >> The attached patch renames mach_override_ptr() into >> __asan_mach_override_ptr() and makes the rest of the mach_override >> internals hidden. >> The corresponding AddressSanitizer bug is >> http://code.google.com/p/address-sanitizer/issues/detail?id=22 >> >> -- >> Alexander Potapenko >> Software Engineer >> Google Moscow > > -- Alexander Potapenko Software Engineer Google Moscow From stpworld at narod.ru Wed Dec 28 04:42:59 2011 From: stpworld at narod.ru (Stepan Dyatkovskiy) Date: Wed, 28 Dec 2011 14:42:59 +0400 Subject: [llvm-commits] [LLVM, loop-unswitch, bugfix for #11429] Wrong behaviour for switches. In-Reply-To: <4EE8F7C4.2040002@narod.ru> References: <4ECF4A5A.9030607@narod.ru> <4ED36D99.1080306@narod.ru> <4ED51EEC.6050000@narod.ru> <4ED749D0.8030101@narod.ru> <4ED88CDF.2020104@narod.ru> <4EDCC136.1040903@narod.ru> <566DBA1B-7099-44E0-B2EB-3413A054F5E3@apple.com> <4EDF700A.3080605@narod.ru> <9F9EFCC2-3F32-49F8-97D6-B8BAC580B6F7@apple.com> <4EE36B3F.3090307@narod.ru> <0E3FA92F-92F2-45F5-9247-0A1934901F01@apple.com> <4EE8F7C4.2040002@narod.ru> Message-ID: <4EFAF2B3.1010806@narod.ru> Hi. I made some fixes that improve compile time: 1. The size heuristics changed. Now we calculate the number of unswitching branches only once per loop. 2. Some checks were moved from UnswitchIfProfitable to processCurrentLoop, since what they check does not change during a processCurrentLoop iteration. This allows us to skip some loops at an early stage. I checked the compile time on the test MultiSource/Benchmarks/Prolangs-C++/shapes/shapes (there was a compile-time regression after my previous patch). Relative to the previous patch, compile time improved by ~8.5%. Relative to old revisions (before r146578), compile time improved by ~2%.
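For reference, the size rule from the earlier revision of this patch (quoted further down in this message) can be sketched roughly as follows. This is only an illustration under stated assumptions: LoopSize and WithinUnswitchBudget are made-up names, not the LoopUnswitch code, and the sketch does not include the changes described above.

```cpp
#include <cassert>

// Hypothetical type for illustration only: the size of one loop.
struct LoopSize {
  unsigned NumInsts;   // instructions in the loop
  unsigned NumBlocks;  // basic blocks in the loop
};

// Returns true if producing `UnswitchedCount` copies of the loop stays within
// the budget. Mirrors the quoted rule: stop unswitching when
//   insts * count > Threshold  ||  blocks * count * 5 > Threshold.
// The default of 50 is the default threshold mentioned in the thread.
bool WithinUnswitchBudget(LoopSize L, unsigned UnswitchedCount,
                          unsigned Threshold = 50) {
  assert(UnswitchedCount > 0 && "expected at least one copy of the loop");
  // Widen before multiplying so large loops cannot overflow.
  unsigned long long Insts =
      static_cast<unsigned long long>(L.NumInsts) * UnswitchedCount;
  unsigned long long Blocks =
      static_cast<unsigned long long>(L.NumBlocks) * UnswitchedCount;
  return !(Insts > Threshold || Blocks * 5 > Threshold);
}
```

With the default threshold, a 10-instruction, 2-block loop fits the budget for a single copy but not for six copies.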
Please find the patch in attachment for review. -Stepan. Stepan Dyatkovskiy wrote: > Commited as r146578. > > Thanks. > -Stepan. > > Dan Gohman wrote: >> Thanks. The patch looks ok to me. >> >> Dan >> >> On Dec 10, 2011, at 6:22 AM, Stepan Dyatkovskiy wrote: >> >>> I fixed code heuristics. How it works now: >>> >>> 1. Calculate average number of produced instructions and basics blocks: >>> number-of-instructions=curret-loop->number-of-instructions * unswitched-number >>> number-of-bb=curret-loop->number-of-bb * unswitched-number >>> >>> 2. If number-of-instructions> Threshold || number-of-bb*5> Threshold, stop unswitching. >>> >>> By default Threshold is 50. But user can set custom threshold using -loop-unswitch-threshold option. This option existed before my patch, and I kept it without changes though. >>> >>> I compiled ffmpeg (ffmpeg.org) with llvm + clang toolchain (with and without my patch). I got the next results: >>> >>> Without patch: ffmpeg size is 7369612 bytes. >>> Threshold = 50: ffmpeg size is 7349132 bytes (less then in ToT version). >>> Threshold = 200: ffmpeg size is 7439244 bytes. >>> >>> I also checked the ffmpeg build time in all cases it was 2 mins, 35 secs. >>> Transcoding time with ffmpeg (mp3 -> mp2) in all cases was also the same: process took 1 min, 19 secs. >>> >>> I added unit test that checks unswitch kicking in case of insufficient size. Unit tests and fixed patch (default threshold is 50) are attached to this post. >>> >>> Thanks. >>> -Stepan. >> > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits -------------- next part -------------- A non-text attachment was scrubbed... 
Name: loop-unswitch-unswitchvals-1.1.patch Type: text/x-patch Size: 9194 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111228/bcba4db0/attachment.bin From eugeni.stepanov at gmail.com Wed Dec 28 05:36:25 2011 From: eugeni.stepanov at gmail.com (Evgeniy Stepanov) Date: Wed, 28 Dec 2011 15:36:25 +0400 Subject: [llvm-commits] [PATCH] asan-rt: add definitions for ucontext_t and friends on Android In-Reply-To: References: Message-ID: On Wed, Dec 28, 2011 at 3:36 AM, Kostya Serebryany wrote: > You are kidding :) > Let's not put the definition of ucontext into asan-rt. Can't we really get > it from somewhere in the system? > If no, we can simply have > # if defined(__arm__) > *pc = *sp = *bp = 0; > These used to be required when asan worked through SIGILL. Not any more by > default. > I am actually considering to remove SIGILL-related code altogether (not 100% > sure yet). > > --kcc > > > > > On Wed, Dec 21, 2011 at 4:08 AM, Evgeniy Stepanov > wrote: >> >> Hi, >> >> libc headers on Android miss ucontext_t and friends. This patch adds >> some compatible definitions. > > -------------- next part -------------- A non-text attachment was scrubbed... Name: ucontext.patch Type: text/x-patch Size: 1069 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111228/35352518/attachment.bin From eugeni.stepanov at gmail.com Wed Dec 28 05:38:38 2011 From: eugeni.stepanov at gmail.com (Evgeniy Stepanov) Date: Wed, 28 Dec 2011 15:38:38 +0400 Subject: [llvm-commits] [PATCH] asan-rt: discover main thread stack limits without pthread In-Reply-To: References: Message-ID: In fact, I'd like to copy or reimplement ProcMapsIterator as part of ASan to escape the ifdef hell. But this should be good enough, let's land it first.
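A rough sketch of the approach from the patch description quoted below (a stack size limit combined with /proc/self/maps) might look like this. It is not the code from stack.patch: StackBounds and FindMainStack are hypothetical names, and the maps text and the limit are passed in as arguments so that the lookup itself is easy to follow.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical result type for illustration only.
struct StackBounds {
  uintptr_t bottom;  // lowest address of the stack
  uintptr_t top;     // one past the highest address of the stack
};

// Scans text in the /proc/self/maps format ("start-end perms ...", addresses
// in hex) for the mapping that contains `addr_on_stack`. The end of that
// mapping is taken as the stack top; the bottom is top minus `stack_limit`,
// clamped at zero. Returns false if no mapping contains the address.
bool FindMainStack(const std::string &maps_text, uintptr_t addr_on_stack,
                   uintptr_t stack_limit, StackBounds *out) {
  assert(out && "output pointer must not be null");
  std::istringstream in(maps_text);
  std::string line;
  while (std::getline(in, line)) {
    unsigned long long lo = 0, hi = 0;
    // Skip lines that do not start with an address range.
    if (std::sscanf(line.c_str(), "%llx-%llx", &lo, &hi) != 2) continue;
    if (addr_on_stack < lo || addr_on_stack >= hi) continue;
    out->top = static_cast<uintptr_t>(hi);
    out->bottom = stack_limit < out->top ? out->top - stack_limit : 0;
    return true;
  }
  return false;
}
```

In a real runtime the limit would presumably come from getrlimit(RLIMIT_STACK), the text from reading /proc/self/maps, and the probe address from a local variable in the main thread. The sketch uses the C++ standard library for brevity, which runtime code called this early would likely need to avoid.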
On Wed, Dec 28, 2011 at 3:24 AM, Kostya Serebryany wrote: > I like the idea (it may actually fix the problem reported today on wine), > but we need to guard the code that uses sysinfo/sysinfo.h > with ASAN_USE_SYSINFO==1 (see asan_stack.cc). > Or add some macro machinery to get sysinfo.h from an alternative place as in > perftools (base/sysinfo.h). > > --kcc > > > On Wed, Dec 21, 2011 at 4:04 AM, Evgeniy Stepanov > wrote: >> >> Hi, >> >> if __asan_init is called from .preinit_array, pthread_getattr_np may >> become unsafe. This patch adds a different way of locating the stack >> of the main thread with a combination of getrlimit() and >> /proc/self/maps. > > -------------- next part -------------- A non-text attachment was scrubbed... Name: stack.patch Type: text/x-patch Size: 1598 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111228/271fb67e/attachment.bin From eugeni.stepanov at gmail.com Wed Dec 28 05:41:18 2011 From: eugeni.stepanov at gmail.com (Evgeniy Stepanov) Date: Wed, 28 Dec 2011 15:41:18 +0400 Subject: [llvm-commits] [PATCH] asan-rt: missing PTHREAD_DESTRUCTOR_ITERATIONS on Android Message-ID: Hi, PTHREAD_DESTRUCTOR_ITERATIONS is missing from Android headers. -------------- next part -------------- A non-text attachment was scrubbed... Name: dtor.patch Type: text/x-patch Size: 616 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111228/bb0fe514/attachment.bin From eugeni.stepanov at gmail.com Wed Dec 28 06:43:34 2011 From: eugeni.stepanov at gmail.com (Evgeniy Stepanov) Date: Wed, 28 Dec 2011 16:43:34 +0400 Subject: [llvm-commits] [PATCH] asan-rt: getenv() from .preinit_array In-Reply-To: References: Message-ID: On Wed, Dec 28, 2011 at 3:08 AM, Kostya Serebryany wrote: > I'd prefer something different: > > - Don't put this into sysinfo/sysinfo.cc (we don't use it in some settings, > also we may want to get rid of it eventually).
> - I suggest we implement ReadProcSelfEnviron in asan_rtl.cc and then use it > to implement asan_getenv() in asan_linux.cc (and in asan_mac.cc, if needed). > - no need for PLATFORM_WINDOWS section (yet). Such code will need to go to > asan_windows.cc once we have it. > - we want to avoid memchr/memcmp/etc (use __internal* variants). > - don't fallback to libc, just do ASAN_DIE > - do we need HAVE___ENVIRON section? > > --kcc > > On Wed, Dec 21, 2011 at 3:57 AM, Evgeniy Stepanov > wrote: >> >> Hi, >> >> this patch brings in the implementation of GetenvBeforeMain() from >> google-perftools and uses it in place of getenv(). This is required to >> call __asan_init from .preinit_array. > > -------------- next part -------------- A non-text attachment was scrubbed... Name: getenv.patch Type: text/x-patch Size: 4877 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111228/839f6602/attachment-0001.bin From nadav.rotem at intel.com Wed Dec 28 07:08:20 2011 From: nadav.rotem at intel.com (Nadav Rotem) Date: Wed, 28 Dec 2011 13:08:20 -0000 Subject: [llvm-commits] [llvm] r147309 - in /llvm/trunk: lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp test/CodeGen/X86/2011-12-28-vselecti8.ll Message-ID: <20111228130820.C6C312A6C12C@llvm.org> Author: nadav Date: Wed Dec 28 07:08:20 2011 New Revision: 147309 URL: http://llvm.org/viewvc/llvm-project?rev=147309&view=rev Log: PR11662. Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage.
Added: llvm/trunk/test/CodeGen/X86/2011-12-28-vselecti8.ll Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp?rev=147309&r1=147308&r2=147309&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp Wed Dec 28 07:08:20 2011 @@ -489,7 +489,11 @@ } SDValue DAGTypeLegalizer::PromoteIntRes_VSELECT(SDNode *N) { - SDValue Mask = GetPromotedInteger(N->getOperand(0)); + SDValue Mask = N->getOperand(0); + EVT OpTy = N->getOperand(1).getValueType(); + + // Promote all the way up to the canonical SetCC type. + Mask = PromoteTargetBoolean(Mask, TLI.getSetCCResultType(OpTy)); SDValue LHS = GetPromotedInteger(N->getOperand(1)); SDValue RHS = GetPromotedInteger(N->getOperand(2)); return DAG.getNode(ISD::VSELECT, N->getDebugLoc(), Added: llvm/trunk/test/CodeGen/X86/2011-12-28-vselecti8.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2011-12-28-vselecti8.ll?rev=147309&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/2011-12-28-vselecti8.ll (added) +++ llvm/trunk/test/CodeGen/X86/2011-12-28-vselecti8.ll Wed Dec 28 07:08:20 2011 @@ -0,0 +1,20 @@ +; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=corei7 | FileCheck %s +; ModuleID = '' +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" +target triple = "x86_64-apple-darwin11.2.0" + +; CHECK: @foo8 +; CHECK: psll +; CHECK: psraw +; CHECK: pblendvb +; CHECK: ret +define void @foo8(float* nocapture %RET) nounwind { +allocas: + %resultvec.i = select <8 x i1> , <8 x i8> , 
<8 x i8> + %uint2float = uitofp <8 x i8> %resultvec.i to <8 x float> + %ptr = bitcast float * %RET to <8 x float> * + store <8 x float> %uint2float, <8 x float>* %ptr, align 4 + ret void +} + + From glider at google.com Wed Dec 28 07:18:51 2011 From: glider at google.com (Alexander Potapenko) Date: Wed, 28 Dec 2011 17:18:51 +0400 Subject: [llvm-commits] [PATCH] AddressSanitizer: allow disabling __cxa_throw at runtime Message-ID: The attached patch introduces the wrap_cxa_throw flag that should help us to build Chrome while http://code.google.com/p/address-sanitizer/issues/detail?id=23 is not fixed (tl;dr: wrapping __cxa_throw possibly affects stack unwinding and exception handling). -- Alexander Potapenko Software Engineer Google Moscow From rafael.espindola at gmail.com Wed Dec 28 08:44:14 2011 From: rafael.espindola at gmail.com (Rafael Ávila de Espíndola) Date: Wed, 28 Dec 2011 09:44:14 -0500 Subject: [llvm-commits] [patch] Remove the old ELF writer In-Reply-To: <17C96F93-AF42-469B-8038-BE71F777B918@apple.com> References: <4EF66438.8060406@gmail.com> <17C96F93-AF42-469B-8038-BE71F777B918@apple.com> Message-ID: <4EFB2B3E.3000408@gmail.com> >> Is it OK? Should I propose this on llvmdev? > > Just MHO, I would really love to see this happen. Does this also remove the JIT EH code? Not yet. The patch still keeps JITDwarfEmitter.cpp. > -Chris Cheers, Rafael From glider at google.com Wed Dec 28 09:56:31 2011 From: glider at google.com (Alexander Potapenko) Date: Wed, 28 Dec 2011 19:56:31 +0400 Subject: [llvm-commits] [PATCH] AddressSanitizer: allow disabling __cxa_throw at runtime In-Reply-To: References: Message-ID: On Wed, Dec 28, 2011 at 5:18 PM, Alexander Potapenko wrote: > The attached patch introduces the wrap_cxa_throw flag that should help > us to build Chrome while > http://code.google.com/p/address-sanitizer/issues/detail?id=23 is not > fixed (tl;dr: wrapping __cxa_throw possibly affects stack unwinding > and exception handling).
> Actually it looks like Chrome does not work with wrap___cxa_throw either, so we'll need to disable it on Mac. Kostya, is it safe to do so, or this will lead to false positives? From rafael.espindola at gmail.com Wed Dec 28 11:08:00 2011 From: rafael.espindola at gmail.com (Rafael Espindola) Date: Wed, 28 Dec 2011 17:08:00 -0000 Subject: [llvm-commits] [llvm] r147312 - in /llvm/trunk: autoconf/configure.ac configure Message-ID: <20111228170800.E2BCA2A6C12C@llvm.org> Author: rafael Date: Wed Dec 28 11:08:00 2011 New Revision: 147312 URL: http://llvm.org/viewvc/llvm-project?rev=147312&view=rev Log: Add support for mipsel in configure. Fixes PR11669. Patch by Sylvestre Ledru. Modified: llvm/trunk/autoconf/configure.ac llvm/trunk/configure Modified: llvm/trunk/autoconf/configure.ac URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/autoconf/configure.ac?rev=147312&r1=147311&r2=147312&view=diff ============================================================================== --- llvm/trunk/autoconf/configure.ac (original) +++ llvm/trunk/autoconf/configure.ac Wed Dec 28 11:08:00 2011 @@ -360,6 +360,7 @@ powerpc*-*) llvm_cv_target_arch="PowerPC" ;; arm*-*) llvm_cv_target_arch="ARM" ;; mips-*) llvm_cv_target_arch="Mips" ;; + mipsel-*) llvm_cv_target_arch="Mips" ;; xcore-*) llvm_cv_target_arch="XCore" ;; msp430-*) llvm_cv_target_arch="MSP430" ;; hexagon-*) llvm_cv_target_arch="Hexagon" ;; @@ -638,6 +639,7 @@ powerpc) TARGETS_TO_BUILD="PowerPC $TARGETS_TO_BUILD" ;; arm) TARGETS_TO_BUILD="ARM $TARGETS_TO_BUILD" ;; mips) TARGETS_TO_BUILD="Mips $TARGETS_TO_BUILD" ;; + mipsel) TARGETS_TO_BUILD="Mips $TARGETS_TO_BUILD" ;; spu) TARGETS_TO_BUILD="CellSPU $TARGETS_TO_BUILD" ;; xcore) TARGETS_TO_BUILD="XCore $TARGETS_TO_BUILD" ;; msp430) TARGETS_TO_BUILD="MSP430 $TARGETS_TO_BUILD" ;; Modified: llvm/trunk/configure URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/configure?rev=147312&r1=147311&r2=147312&view=diff 
============================================================================== --- llvm/trunk/configure (original) +++ llvm/trunk/configure Wed Dec 28 11:08:00 2011 @@ -3886,6 +3886,7 @@ powerpc*-*) llvm_cv_target_arch="PowerPC" ;; arm*-*) llvm_cv_target_arch="ARM" ;; mips-*) llvm_cv_target_arch="Mips" ;; + mipsel-*) llvm_cv_target_arch="Mips" ;; xcore-*) llvm_cv_target_arch="XCore" ;; msp430-*) llvm_cv_target_arch="MSP430" ;; hexagon-*) llvm_cv_target_arch="Hexagon" ;; @@ -5308,6 +5309,7 @@ powerpc) TARGETS_TO_BUILD="PowerPC $TARGETS_TO_BUILD" ;; arm) TARGETS_TO_BUILD="ARM $TARGETS_TO_BUILD" ;; mips) TARGETS_TO_BUILD="Mips $TARGETS_TO_BUILD" ;; + mipsel) TARGETS_TO_BUILD="Mips $TARGETS_TO_BUILD" ;; spu) TARGETS_TO_BUILD="CellSPU $TARGETS_TO_BUILD" ;; xcore) TARGETS_TO_BUILD="XCore $TARGETS_TO_BUILD" ;; msp430) TARGETS_TO_BUILD="MSP430 $TARGETS_TO_BUILD" ;; @@ -10500,7 +10502,7 @@ lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext < Author: bwilson Date: Wed Dec 28 12:51:08 2011 New Revision: 147314 URL: http://llvm.org/viewvc/llvm-project?rev=147314&view=rev Log: Update OCaml bindings for the new half float type. Patch by Jonathan Ragan-Kelley! 
Modified: llvm/trunk/bindings/ocaml/llvm/llvm.ml llvm/trunk/bindings/ocaml/llvm/llvm.mli Modified: llvm/trunk/bindings/ocaml/llvm/llvm.ml URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/bindings/ocaml/llvm/llvm.ml?rev=147314&r1=147313&r2=147314&view=diff ============================================================================== --- llvm/trunk/bindings/ocaml/llvm/llvm.ml (original) +++ llvm/trunk/bindings/ocaml/llvm/llvm.ml Wed Dec 28 12:51:08 2011 @@ -20,6 +20,7 @@ module TypeKind = struct type t = | Void + | Half | Float | Double | X86fp80 @@ -1234,5 +1235,6 @@ | TypeKind.X86fp80 -> "x86_fp80" | TypeKind.Double -> "double" | TypeKind.Float -> "float" + | TypeKind.Half -> "half" | TypeKind.Void -> "void" | TypeKind.Metadata -> "metadata" Modified: llvm/trunk/bindings/ocaml/llvm/llvm.mli URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/bindings/ocaml/llvm/llvm.mli?rev=147314&r1=147313&r2=147314&view=diff ============================================================================== --- llvm/trunk/bindings/ocaml/llvm/llvm.mli (original) +++ llvm/trunk/bindings/ocaml/llvm/llvm.mli Wed Dec 28 12:51:08 2011 @@ -53,6 +53,7 @@ module TypeKind : sig type t = Void + | Half | Float | Double | X86fp80 From kcc at google.com Wed Dec 28 12:56:43 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 18:56:43 -0000 Subject: [llvm-commits] [compiler-rt] r147315 - in /compiler-rt/trunk/lib/asan: asan_interceptors.cc asan_interceptors.h asan_stack.cc tests/asan_test.cc Message-ID: <20111228185643.37FE92A6C12C@llvm.org> Author: kcc Date: Wed Dec 28 12:56:42 2011 New Revision: 147315 URL: http://llvm.org/viewvc/llvm-project?rev=147315&view=rev Log: [asan] interceptor for memcmp. 
Patch by samsonov at google.com Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc compiler-rt/trunk/lib/asan/asan_interceptors.h compiler-rt/trunk/lib/asan/asan_stack.cc compiler-rt/trunk/lib/asan/tests/asan_test.cc Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.cc?rev=147315&r1=147314&r2=147315&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interceptors.cc (original) +++ compiler-rt/trunk/lib/asan/asan_interceptors.cc Wed Dec 28 12:56:42 2011 @@ -28,6 +28,7 @@ namespace __asan { index_f real_index; +memcmp_f real_memcmp; memcpy_f real_memcpy; memmove_f real_memmove; memset_f real_memset; @@ -124,6 +125,7 @@ #else OVERRIDE_FUNCTION(index, WRAP(strchr)); #endif + INTERCEPT_FUNCTION(memcmp); INTERCEPT_FUNCTION(memcpy); INTERCEPT_FUNCTION(memmove); INTERCEPT_FUNCTION(memset); @@ -149,6 +151,32 @@ // ---------------------- Wrappers ---------------- {{{1 using namespace __asan; // NOLINT +static inline int CharCmp(unsigned char c1, unsigned char c2) { + return (c1 == c2) ? 0 : (c1 < c2) ? -1 : 1; +} + +static inline int CharCaseCmp(unsigned char c1, unsigned char c2) { + int c1_low = tolower(c1); + int c2_low = tolower(c2); + return c1_low - c2_low; +} + +int WRAP(memcmp)(const void *a1, const void *a2, size_t size) { + ENSURE_ASAN_INITED(); + unsigned char c1 = 0, c2 = 0; + const unsigned char *s1 = (const unsigned char*)a1; + const unsigned char *s2 = (const unsigned char*)a2; + size_t i; + for (i = 0; i < size; i++) { + c1 = s1[i]; + c2 = s2[i]; + if (c1 != c2) break; + } + ASAN_READ_RANGE(s1, Min(i + 1, size)); + ASAN_READ_RANGE(s2, Min(i + 1, size)); + return CharCmp(c1, c2); +} + void *WRAP(memcpy)(void *to, const void *from, size_t size) { // memcpy is called during __asan_init() from the internals // of printf(...). 
@@ -204,16 +232,6 @@ return result; } -static inline int CharCmp(unsigned char c1, unsigned char c2) { - return (c1 == c2) ? 0 : (c1 < c2) ? -1 : 1; -} - -static inline int CharCaseCmp(unsigned char c1, unsigned char c2) { - int c1_low = tolower(c1); - int c2_low = tolower(c2); - return c1_low - c2_low; -} - int WRAP(strcasecmp)(const char *s1, const char *s2) { ENSURE_ASAN_INITED(); unsigned char c1, c2; Modified: compiler-rt/trunk/lib/asan/asan_interceptors.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.h?rev=147315&r1=147314&r2=147315&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interceptors.h (original) +++ compiler-rt/trunk/lib/asan/asan_interceptors.h Wed Dec 28 12:56:42 2011 @@ -67,6 +67,7 @@ #endif #ifdef __APPLE__ +int WRAP(memcmp)(const void *a1, const void *a2, size_t size); void *WRAP(memcpy)(void *to, const void *from, size_t size); void *WRAP(memmove)(void *to, const void *from, size_t size); void *WRAP(memset)(void *block, int c, size_t size); @@ -84,6 +85,7 @@ namespace __asan { typedef void* (*index_f)(const char *string, int c); +typedef int (*memcmp_f)(const void *a1, const void *a2, size_t size); typedef void* (*memcpy_f)(void *to, const void *from, size_t size); typedef void* (*memmove_f)(void *to, const void *from, size_t size); typedef void* (*memset_f)(void *block, int c, size_t size); @@ -100,6 +102,7 @@ // __asan::real_X() holds pointer to library implementation of X(). 
extern index_f real_index; +extern memcmp_f real_memcmp; extern memcpy_f real_memcpy; extern memmove_f real_memmove; extern memset_f real_memset; Modified: compiler-rt/trunk/lib/asan/asan_stack.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_stack.cc?rev=147315&r1=147314&r2=147315&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_stack.cc (original) +++ compiler-rt/trunk/lib/asan/asan_stack.cc Wed Dec 28 12:56:42 2011 @@ -230,8 +230,8 @@ // |res| may be greater than check_stack.size, because // UncompressStack(CompressStack(stack)) eliminates the 0x0 frames. CHECK(res >= check_stack.size); - CHECK(0 == memcmp(check_stack.trace, stack->trace, - check_stack.size * sizeof(uintptr_t))); + CHECK(0 == real_memcmp(check_stack.trace, stack->trace, + check_stack.size * sizeof(uintptr_t))); #endif return res; Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=147315&r1=147314&r2=147315&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original) +++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Wed Dec 28 12:56:42 2011 @@ -1220,6 +1220,14 @@ EXPECT_LT(0, strncasecmp("xyz", "xyy", 10)); EXPECT_LT(0, strncasecmp("Baa", "aaa", 1)); EXPECT_LT(0, strncasecmp("zyx", "", 2)); + + // memcmp + EXPECT_EQ(0, memcmp("a", "b", 0)); + EXPECT_EQ(0, memcmp("ab\0c", "ab\0c", 4)); + EXPECT_GT(0, memcmp("\0ab", "\0ac", 3)); + EXPECT_GT(0, memcmp("abb\0", "abba", 4)); + EXPECT_LT(0, memcmp("ab\0cd", "ab\0c\0", 5)); + EXPECT_LT(0, memcmp("zza", "zyx", 3)); } typedef int(*PointerToStrCmp)(const char*, const char*); @@ -1292,6 +1300,30 @@ RunStrNCmpTest(&strncasecmp); } +TEST(AddressSanitizer, MemCmpOOBTest) { + size_t size = Ident(100); + char *s1 = MallocAndMemsetString(size); + char *s2 = 
MallocAndMemsetString(size); + // Normal memcmp calls. + Ident(memcmp(s1, s2, size)); + Ident(memcmp(s1 + size - 1, s2 + size - 1, 1)); + Ident(memcmp(s1 - 1, s2 - 1, 0)); + // One of arguments points to not allocated memory. + EXPECT_DEATH(Ident(memcmp)(s1 - 1, s2, 1), LeftOOBErrorMessage(1)); + EXPECT_DEATH(Ident(memcmp)(s1, s2 - 1, 1), LeftOOBErrorMessage(1)); + EXPECT_DEATH(Ident(memcmp)(s1 + size, s2, 1), RightOOBErrorMessage(0)); + EXPECT_DEATH(Ident(memcmp)(s1, s2 + size, 1), RightOOBErrorMessage(0)); + // Hit unallocated memory and die. + EXPECT_DEATH(Ident(memcmp)(s1 + 1, s2 + 1, size), RightOOBErrorMessage(0)); + EXPECT_DEATH(Ident(memcmp)(s1 + size - 1, s2, 2), RightOOBErrorMessage(0)); + // Zero bytes are not terminators and don't prevent from OOB. + s1[size - 1] = '\0'; + s2[size - 1] = '\0'; + EXPECT_DEATH(Ident(memcmp)(s1, s2, size + 1), RightOOBErrorMessage(0)); + free(s1); + free(s2); +} + static const char *kOverlapErrorMessage = "strcpy-param-overlap"; TEST(AddressSanitizer, StrArgsOverlapTest) { From kcc at google.com Wed Dec 28 13:00:31 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 11:00:31 -0800 Subject: [llvm-commits] Patch for AddressSanitizer [projects/compiler-rt/lib/asan]: interceptor for memcmp In-Reply-To: References: Message-ID: r147315. On Tue, Dec 27, 2011 at 5:34 AM, Alexey Samsonov wrote: > Rietveld link: http://codereview.appspot.com/5501076/ > > -- > Alexey Samsonov > Software Engineer, Moscow > samsonov at google.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111228/a40cf280/attachment.html From kcc at google.com Wed Dec 28 13:08:49 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 19:08:49 -0000 Subject: [llvm-commits] [compiler-rt] r147316 - in /compiler-rt/trunk/lib/asan: asan_interceptors.cc asan_interceptors.h mach_override/mach_override.c tests/asan_test.cc Message-ID: <20111228190849.7BEDB2A6C12C@llvm.org> Author: kcc Date: Wed Dec 28 13:08:49 2011 New Revision: 147316 URL: http://llvm.org/viewvc/llvm-project?rev=147316&view=rev Log: [asan] interceptor for strcat. Patch by samsonov at google.com Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc compiler-rt/trunk/lib/asan/asan_interceptors.h compiler-rt/trunk/lib/asan/mach_override/mach_override.c compiler-rt/trunk/lib/asan/tests/asan_test.cc Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.cc?rev=147316&r1=147315&r2=147316&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interceptors.cc (original) +++ compiler-rt/trunk/lib/asan/asan_interceptors.cc Wed Dec 28 13:08:49 2011 @@ -33,6 +33,7 @@ memmove_f real_memmove; memset_f real_memset; strcasecmp_f real_strcasecmp; +strcat_f real_strcat; strchr_f real_strchr; strcmp_f real_strcmp; strcpy_f real_strcpy; @@ -80,18 +81,17 @@ // Behavior of functions like "memcpy" or "strcpy" is undefined // if memory intervals overlap. We report error in this case. // Macro is used to avoid creation of new frames. 
-static inline bool RangesOverlap(const char *offset1, const char *offset2, - size_t length) { - return !((offset1 + length <= offset2) || (offset2 + length <= offset1)); +static inline bool RangesOverlap(const char *offset1, size_t length1, + const char *offset2, size_t length2) { + return !((offset1 + length1 <= offset2) || (offset2 + length2 <= offset1)); } -#define CHECK_RANGES_OVERLAP(_offset1, _offset2, length) do { \ +#define CHECK_RANGES_OVERLAP(_offset1, length1, _offset2, length2) do { \ const char *offset1 = (const char*)_offset1; \ const char *offset2 = (const char*)_offset2; \ - if (RangesOverlap((const char*)offset1, (const char*)offset2, \ - length)) { \ + if (RangesOverlap(offset1, length1, offset2, length2)) { \ Report("ERROR: AddressSanitizer strcpy-param-overlap: " \ "memory ranges [%p,%p) and [%p, %p) overlap\n", \ - offset1, offset1 + length, offset2, offset2 + length); \ + offset1, offset1 + length1, offset2, offset2 + length2); \ PRINT_CURRENT_STACK(); \ ShowStatsAndAbort(); \ } \ @@ -130,6 +130,7 @@ INTERCEPT_FUNCTION(memmove); INTERCEPT_FUNCTION(memset); INTERCEPT_FUNCTION(strcasecmp); + INTERCEPT_FUNCTION(strcat); // NOLINT INTERCEPT_FUNCTION(strchr); INTERCEPT_FUNCTION(strcmp); INTERCEPT_FUNCTION(strcpy); // NOLINT @@ -185,7 +186,7 @@ } ENSURE_ASAN_INITED(); if (FLAG_replace_intrin) { - CHECK_RANGES_OVERLAP(to, from, size); + CHECK_RANGES_OVERLAP(to, size, from, size); ASAN_WRITE_RANGE(from, size); ASAN_READ_RANGE(to, size); } @@ -246,6 +247,21 @@ return CharCaseCmp(c1, c2); } +char *WRAP(strcat)(char *to, const char *from) { // NOLINT + ENSURE_ASAN_INITED(); + if (FLAG_replace_str) { + size_t from_length = real_strlen(from); + ASAN_READ_RANGE(from, from_length + 1); + if (from_length > 0) { + size_t to_length = real_strlen(to); + ASAN_READ_RANGE(to, to_length); + ASAN_WRITE_RANGE(to + to_length, from_length + 1); + CHECK_RANGES_OVERLAP(to, to_length + 1, from, from_length + 1); + } + } + return real_strcat(to, from); +} + int 
WRAP(strcmp)(const char *s1, const char *s2) { // strcmp is called from malloc_default_purgeable_zone() // in __asan::ReplaceSystemAlloc() on Mac. @@ -273,7 +289,7 @@ ENSURE_ASAN_INITED(); if (FLAG_replace_str) { size_t from_size = real_strlen(from) + 1; - CHECK_RANGES_OVERLAP(to, from, from_size); + CHECK_RANGES_OVERLAP(to, from_size, from, from_size); ASAN_READ_RANGE(from, from_size); ASAN_WRITE_RANGE(to, from_size); } @@ -339,7 +355,7 @@ ENSURE_ASAN_INITED(); if (FLAG_replace_str) { size_t from_size = Min(size, internal_strnlen(from, size) + 1); - CHECK_RANGES_OVERLAP(to, from, from_size); + CHECK_RANGES_OVERLAP(to, from_size, from, from_size); ASAN_READ_RANGE(from, from_size); ASAN_WRITE_RANGE(to, size); } Modified: compiler-rt/trunk/lib/asan/asan_interceptors.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.h?rev=147316&r1=147315&r2=147316&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interceptors.h (original) +++ compiler-rt/trunk/lib/asan/asan_interceptors.h Wed Dec 28 13:08:49 2011 @@ -72,6 +72,7 @@ void *WRAP(memmove)(void *to, const void *from, size_t size); void *WRAP(memset)(void *block, int c, size_t size); int WRAP(strcasecmp)(const char *s1, const char *s2); +char *WRAP(strcat)(char *to, const char *from); // NOLINT char *WRAP(strchr)(const char *string, int c); int WRAP(strcmp)(const char *s1, const char *s2); char *WRAP(strcpy)(char *to, const char *from); // NOLINT @@ -90,6 +91,7 @@ typedef void* (*memmove_f)(void *to, const void *from, size_t size); typedef void* (*memset_f)(void *block, int c, size_t size); typedef int (*strcasecmp_f)(const char *s1, const char *s2); +typedef char* (*strcat_f)(char *to, const char *from); typedef char* (*strchr_f)(const char *str, int c); typedef int (*strcmp_f)(const char *s1, const char *s2); typedef char* (*strcpy_f)(char *to, const char *from); @@ -107,6 +109,7 @@ extern memmove_f 
real_memmove; extern memset_f real_memset; extern strcasecmp_f real_strcasecmp; +extern strcat_f real_strcat; extern strchr_f real_strchr; extern strcmp_f real_strcmp; extern strcpy_f real_strcpy; Modified: compiler-rt/trunk/lib/asan/mach_override/mach_override.c URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/mach_override/mach_override.c?rev=147316&r1=147315&r2=147316&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/mach_override/mach_override.c (original) +++ compiler-rt/trunk/lib/asan/mach_override/mach_override.c Wed Dec 28 13:08:49 2011 @@ -627,6 +627,7 @@ { 0x2, {0xFF, 0xFF}, {0x31, 0xC0} }, // xor %eax, %eax { 0x3, {0xFF, 0x4F, 0x00}, {0x8B, 0x45, 0x00} }, // mov $imm(%ebp), %reg { 0x3, {0xFF, 0x4C, 0x00}, {0x8B, 0x40, 0x00} }, // mov $imm(%eax-%edx), %reg + { 0x3, {0xFF, 0xCF, 0x00}, {0x8B, 0x4D, 0x00} }, // mov $imm(%rpb), %reg { 0x3, {0xFF, 0x4F, 0x00}, {0x8A, 0x4D, 0x00} }, // mov $imm(%ebp), %cl { 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x8B, 0x4C, 0x24, 0x00} }, // mov $imm(%esp), %ecx { 0x4, {0xFF, 0x00, 0x00, 0x00}, {0x8B, 0x00, 0x00, 0x00} }, // mov r16,r/m16 or r32,r/m32 Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=147316&r1=147315&r2=147316&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original) +++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Wed Dec 28 13:08:49 2011 @@ -1324,6 +1324,39 @@ free(s2); } +TEST(AddressSanitizer, StrCatOOBTest) { + size_t to_size = Ident(100); + char *to = MallocAndMemsetString(to_size); + to[0] = '\0'; + size_t from_size = Ident(20); + char *from = MallocAndMemsetString(from_size); + from[from_size - 1] = '\0'; + // Normal strcat calls. 
+ strcat(to, from); + strcat(to, from); + strcat(to + from_size, from + from_size - 2); + // Catenate empty string is not always an error. + strcat(to - 1, from + from_size - 1); + // One of arguments points to not allocated memory. + EXPECT_DEATH(strcat(to - 1, from), LeftOOBErrorMessage(1)); + EXPECT_DEATH(strcat(to, from - 1), LeftOOBErrorMessage(1)); + EXPECT_DEATH(strcat(to + to_size, from), RightOOBErrorMessage(0)); + EXPECT_DEATH(strcat(to, from + from_size), RightOOBErrorMessage(0)); + + // "from" is not zero-terminated. + from[from_size - 1] = 'z'; + EXPECT_DEATH(strcat(to, from), RightOOBErrorMessage(0)); + from[from_size - 1] = '\0'; + // "to" is not zero-terminated. + memset(to, 'z', to_size); + EXPECT_DEATH(strcat(to, from), RightOOBErrorMessage(0)); + // "to" is too short to fit "from". + to[to_size - from_size + 1] = '\0'; + EXPECT_DEATH(strcat(to, from), RightOOBErrorMessage(0)); + // length of "to" is just enough. + strcat(to, from + 1); +} + static const char *kOverlapErrorMessage = "strcpy-param-overlap"; TEST(AddressSanitizer, StrArgsOverlapTest) { @@ -1357,6 +1390,18 @@ strncpy(str + 11, str, 20); EXPECT_DEATH(strncpy(str + 10, str, 20), kOverlapErrorMessage); + // Check "strcat". 
+ memset(str, 'z', size); + str[10] = '\0'; + str[20] = '\0'; + strcat(str, str + 10); + strcat(str, str + 11); + str[10] = '\0'; + strcat(str + 11, str); + EXPECT_DEATH(strcat(str, str + 9), kOverlapErrorMessage); + EXPECT_DEATH(strcat(str + 9, str), kOverlapErrorMessage); + EXPECT_DEATH(strcat(str + 10, str), kOverlapErrorMessage); + free(str); } From kcc at google.com Wed Dec 28 13:12:27 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 11:12:27 -0800 Subject: [llvm-commits] Patch for AddressSanitizer [projects/compiler-rt/lib/asan]: interceptor for strcat In-Reply-To: References: Message-ID: r147316 On Tue, Dec 27, 2011 at 5:37 AM, Alexey Samsonov wrote: > Rietveld link: http://codereview.appspot.com/5504087/ > > -- > Alexey Samsonov > Software Engineer, Moscow > samsonov at google.com > From kcc at google.com Wed Dec 28 13:24:31 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 19:24:31 -0000 Subject: [llvm-commits] [compiler-rt] r147317 - in /compiler-rt/trunk/lib/asan: asan_interceptors.cc tests/asan_test.cc Message-ID: <20111228192431.8DF7E2A6C12C@llvm.org> Author: kcc Date: Wed Dec 28 13:24:31 2011 New Revision: 147317 URL: http://llvm.org/viewvc/llvm-project?rev=147317&view=rev Log: [asan] better message for parameter overlap bugs Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc compiler-rt/trunk/lib/asan/tests/asan_test.cc Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.cc?rev=147317&r1=147316&r2=147317&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interceptors.cc (original) +++ compiler-rt/trunk/lib/asan/asan_interceptors.cc Wed Dec 28 13:24:31 2011 @@ -85,13
+85,13 @@ const char *offset2, size_t length2) { return !((offset1 + length1 <= offset2) || (offset2 + length2 <= offset1)); } -#define CHECK_RANGES_OVERLAP(_offset1, length1, _offset2, length2) do { \ +#define CHECK_RANGES_OVERLAP(name, _offset1, length1, _offset2, length2) do { \ const char *offset1 = (const char*)_offset1; \ const char *offset2 = (const char*)_offset2; \ if (RangesOverlap(offset1, length1, offset2, length2)) { \ - Report("ERROR: AddressSanitizer strcpy-param-overlap: " \ + Report("ERROR: AddressSanitizer %s-param-overlap: " \ "memory ranges [%p,%p) and [%p, %p) overlap\n", \ - offset1, offset1 + length1, offset2, offset2 + length2); \ + name, offset1, offset1 + length1, offset2, offset2 + length2); \ PRINT_CURRENT_STACK(); \ ShowStatsAndAbort(); \ } \ @@ -186,7 +186,7 @@ } ENSURE_ASAN_INITED(); if (FLAG_replace_intrin) { - CHECK_RANGES_OVERLAP(to, size, from, size); + CHECK_RANGES_OVERLAP("memcpy", to, size, from, size); ASAN_WRITE_RANGE(from, size); ASAN_READ_RANGE(to, size); } @@ -256,7 +256,7 @@ size_t to_length = real_strlen(to); ASAN_READ_RANGE(to, to_length); ASAN_WRITE_RANGE(to + to_length, from_length + 1); - CHECK_RANGES_OVERLAP(to, to_length + 1, from, from_length + 1); + CHECK_RANGES_OVERLAP("strcat", to, to_length + 1, from, from_length + 1); } } return real_strcat(to, from); @@ -289,7 +289,7 @@ ENSURE_ASAN_INITED(); if (FLAG_replace_str) { size_t from_size = real_strlen(from) + 1; - CHECK_RANGES_OVERLAP(to, from_size, from, from_size); + CHECK_RANGES_OVERLAP("strcpy", to, from_size, from, from_size); ASAN_READ_RANGE(from, from_size); ASAN_WRITE_RANGE(to, from_size); } @@ -355,7 +355,7 @@ ENSURE_ASAN_INITED(); if (FLAG_replace_str) { size_t from_size = Min(size, internal_strnlen(from, size) + 1); - CHECK_RANGES_OVERLAP(to, from_size, from, from_size); + CHECK_RANGES_OVERLAP("strncpy", to, from_size, from, from_size); ASAN_READ_RANGE(from, from_size); ASAN_WRITE_RANGE(to, size); } Modified: 
compiler-rt/trunk/lib/asan/tests/asan_test.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=147317&r1=147316&r2=147317&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original) +++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Wed Dec 28 13:24:31 2011 @@ -1357,7 +1357,9 @@ strcat(to, from + 1); } -static const char *kOverlapErrorMessage = "strcpy-param-overlap"; +static string OverlapErrorMessage(const string &func) { + return func + "-param-overlap"; +} TEST(AddressSanitizer, StrArgsOverlapTest) { size_t size = Ident(100); @@ -1368,27 +1370,28 @@ memset(str, 'z', size); Ident(memcpy)(str + 1, str + 11, 10); Ident(memcpy)(str, str, 0); - EXPECT_DEATH(Ident(memcpy)(str, str + 14, 15), kOverlapErrorMessage); - EXPECT_DEATH(Ident(memcpy)(str + 14, str, 15), kOverlapErrorMessage); - EXPECT_DEATH(Ident(memcpy)(str + 20, str + 20, 1), kOverlapErrorMessage); + EXPECT_DEATH(Ident(memcpy)(str, str + 14, 15), OverlapErrorMessage("memcpy")); + EXPECT_DEATH(Ident(memcpy)(str + 14, str, 15), OverlapErrorMessage("memcpy")); + EXPECT_DEATH(Ident(memcpy)(str + 20, str + 20, 1), + OverlapErrorMessage("memcpy")); #endif // Check "strcpy". memset(str, 'z', size); str[9] = '\0'; strcpy(str + 10, str); - EXPECT_DEATH(strcpy(str + 9, str), kOverlapErrorMessage); - EXPECT_DEATH(strcpy(str, str + 4), kOverlapErrorMessage); + EXPECT_DEATH(strcpy(str + 9, str), OverlapErrorMessage("strcpy")); + EXPECT_DEATH(strcpy(str, str + 4), OverlapErrorMessage("strcpy")); strcpy(str, str + 5); // Check "strncpy". 
memset(str, 'z', size); strncpy(str, str + 10, 10); - EXPECT_DEATH(strncpy(str, str + 9, 10), kOverlapErrorMessage); - EXPECT_DEATH(strncpy(str + 9, str, 10), kOverlapErrorMessage); + EXPECT_DEATH(strncpy(str, str + 9, 10), OverlapErrorMessage("strncpy")); + EXPECT_DEATH(strncpy(str + 9, str, 10), OverlapErrorMessage("strncpy")); str[10] = '\0'; strncpy(str + 11, str, 20); - EXPECT_DEATH(strncpy(str + 10, str, 20), kOverlapErrorMessage); + EXPECT_DEATH(strncpy(str + 10, str, 20), OverlapErrorMessage("strncpy")); // Check "strcat". memset(str, 'z', size); @@ -1398,9 +1401,9 @@ strcat(str, str + 11); str[10] = '\0'; strcat(str + 11, str); - EXPECT_DEATH(strcat(str, str + 9), kOverlapErrorMessage); - EXPECT_DEATH(strcat(str + 9, str), kOverlapErrorMessage); - EXPECT_DEATH(strcat(str + 10, str), kOverlapErrorMessage); + EXPECT_DEATH(strcat(str, str + 9), OverlapErrorMessage("strcat")); + EXPECT_DEATH(strcat(str + 9, str), OverlapErrorMessage("strcat")); + EXPECT_DEATH(strcat(str + 10, str), OverlapErrorMessage("strcat")); free(str); } From kcc at google.com Wed Dec 28 13:55:30 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 19:55:30 -0000 Subject: [llvm-commits] [compiler-rt] r147319 - in /compiler-rt/trunk/lib/asan: asan_rtl.cc asan_stack.cc tests/asan_test.cc Message-ID: <20111228195530.B32882A6C12C@llvm.org> Author: kcc Date: Wed Dec 28 13:55:30 2011 New Revision: 147319 URL: http://llvm.org/viewvc/llvm-project?rev=147319&view=rev Log: [asan] enable memset/memcpy/memmove interceptors in asan-rt (in addition to those in the compiler module) Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc compiler-rt/trunk/lib/asan/asan_stack.cc compiler-rt/trunk/lib/asan/tests/asan_test.cc Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_rtl.cc?rev=147319&r1=147318&r2=147319&view=diff ============================================================================== --- 
compiler-rt/trunk/lib/asan/asan_rtl.cc (original) +++ compiler-rt/trunk/lib/asan/asan_rtl.cc Wed Dec 28 13:55:30 2011 @@ -676,7 +676,7 @@ FLAG_replace_cfallocator = IntFlagValue(options, "replace_cfallocator=", 1); FLAG_fast_unwind = IntFlagValue(options, "fast_unwind=", 1); FLAG_replace_str = IntFlagValue(options, "replace_str=", 1); - FLAG_replace_intrin = IntFlagValue(options, "replace_intrin=", 0); + FLAG_replace_intrin = IntFlagValue(options, "replace_intrin=", 1); FLAG_use_fake_stack = IntFlagValue(options, "use_fake_stack=", 1); FLAG_exitcode = IntFlagValue(options, "exitcode=", EXIT_FAILURE); FLAG_allow_user_poisoning = IntFlagValue(options, Modified: compiler-rt/trunk/lib/asan/asan_stack.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_stack.cc?rev=147319&r1=147318&r2=147319&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_stack.cc (original) +++ compiler-rt/trunk/lib/asan/asan_stack.cc Wed Dec 28 13:55:30 2011 @@ -183,9 +183,8 @@ compressed[i] = stack->trace[i]; res++; } - for (size_t i = stack->size; i < size; i++) { - compressed[i] = 0; - } + if (stack->size < size) + compressed[stack->size] = 0; #else // 64 bits, compress. 
uintptr_t prev_pc = 0; const uintptr_t kMaxOffset = (1ULL << 30) - 1; @@ -214,9 +213,10 @@ res++; prev_pc = pc; } - for (size_t i = c_index; i < size; i++) { - compressed[i] = 0; - } + if (c_index < size) + compressed[c_index] = 0; + if (c_index + 1 < size) + compressed[c_index + 1] = 0; #endif // __WORDSIZE // debug-only code Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=147319&r1=147318&r2=147319&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original) +++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Wed Dec 28 13:55:30 2011 @@ -1365,7 +1365,6 @@ size_t size = Ident(100); char *str = Ident((char*)malloc(size)); -#if 0 // Check "memcpy". Use Ident() to avoid inlining. memset(str, 'z', size); Ident(memcpy)(str + 1, str + 11, 10); @@ -1374,7 +1373,6 @@ EXPECT_DEATH(Ident(memcpy)(str + 14, str, 15), OverlapErrorMessage("memcpy")); EXPECT_DEATH(Ident(memcpy)(str + 20, str + 20, 1), OverlapErrorMessage("memcpy")); -#endif // Check "strcpy". memset(str, 'z', size); From kcc at google.com Wed Dec 28 14:22:21 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 20:22:21 -0000 Subject: [llvm-commits] [compiler-rt] r147320 - /compiler-rt/trunk/lib/asan/asan_rtl.cc Message-ID: <20111228202221.4DD252A6C12C@llvm.org> Author: kcc Date: Wed Dec 28 14:22:21 2011 New Revision: 147320 URL: http://llvm.org/viewvc/llvm-project?rev=147320&view=rev Log: [asan] no ucontext on Android. 
patch by eugeni.stepanov at gmail.com Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_rtl.cc?rev=147320&r1=147319&r2=147320&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_rtl.cc (original) +++ compiler-rt/trunk/lib/asan/asan_rtl.cc Wed Dec 28 14:22:21 2011 @@ -39,7 +39,9 @@ #include #include #include +#ifndef ANDROID #include +#endif #include #include #include @@ -254,7 +256,9 @@ // -------------------------- Run-time entry ------------------- {{{1 void GetPcSpBpAx(void *context, uintptr_t *pc, uintptr_t *sp, uintptr_t *bp, uintptr_t *ax) { +#ifndef ANDROID ucontext_t *ucontext = (ucontext_t*)context; +#endif #ifdef __APPLE__ # if __WORDSIZE == 64 *pc = ucontext->uc_mcontext->__ss.__rip; @@ -268,7 +272,9 @@ *ax = ucontext->uc_mcontext->__ss.__eax; # endif // __WORDSIZE #else // assume linux -# if defined(__arm__) +# if defined (ANDROID) + *pc = *sp = *bp = *ax = 0; +# elif defined(__arm__) *pc = ucontext->uc_mcontext.arm_pc; *bp = ucontext->uc_mcontext.arm_fp; *sp = ucontext->uc_mcontext.arm_sp; From kcc at google.com Wed Dec 28 14:26:19 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 12:26:19 -0800 Subject: [llvm-commits] [PATCH] asan-rt: add definitions for ucontext_t and friends on Android In-Reply-To: References: Message-ID: r147320. On Wed, Dec 28, 2011 at 3:36 AM, Evgeniy Stepanov wrote: > On Wed, Dec 28, 2011 at 3:36 AM, Kostya Serebryany wrote: > > You are kidding :) > > Let's not put the definition of ucontext into asan-rt. Can't we really > get > > it from somewhere in the system? > > If no, we can simply have > > # if defined(__arm__) > > *pc = *sp = *bp = 0; > > These used to be required when asan worked through SIGILL. Not any more > by > > default. 
> > I am actually considering to remove SIGILL-related code altogether (not > 100% > > sure yet). > > > > --kcc > > > > > > > > > > On Wed, Dec 21, 2011 at 4:08 AM, Evgeniy Stepanov > > wrote: > >> > >> Hi, > >> > >> libc headers on Android miss ucontext_t and friends. This patch adds > >> some compatible definitions. > > > > > From kcc at google.com Wed Dec 28 14:34:30 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 20:34:30 -0000 Subject: [llvm-commits] [compiler-rt] r147321 - /compiler-rt/trunk/lib/asan/asan_thread.cc Message-ID: <20111228203430.B29342A6C12C@llvm.org> Author: kcc Date: Wed Dec 28 14:34:30 2011 New Revision: 147321 URL: http://llvm.org/viewvc/llvm-project?rev=147321&view=rev Log: [asan] discover main thread stack limits without pthread. patch by eugeni.stepanov at gmail.com Modified: compiler-rt/trunk/lib/asan/asan_thread.cc Modified: compiler-rt/trunk/lib/asan/asan_thread.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_thread.cc?rev=147321&r1=147320&r2=147321&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_thread.cc (original) +++ compiler-rt/trunk/lib/asan/asan_thread.cc Wed Dec 28 14:34:30 2011 @@ -17,7 +17,13 @@ #include "asan_thread_registry.h" #include "asan_mapping.h" +#if ASAN_USE_SYSINFO == 1 +#include "sysinfo/sysinfo.h" +#endif + #include +#include +#include #include #include #include @@ -121,6 +127,36 @@ int local; CHECK(AddrIsInStack((uintptr_t)&local)); #else +#if ASAN_USE_SYSINFO == 1 + if (tid() == 0) { + // This is the main thread. Libpthread may not be initialized yet. + struct rlimit rl; + CHECK(getrlimit(RLIMIT_STACK, &rl) == 0); + + // Find the mapping that contains a stack variable.
+ ProcMapsIterator it(0); + uint64_t start, end; + uint64_t prev_end = 0; + while (it.Next(&start, &end, NULL, NULL, NULL, NULL)) { + if ((uintptr_t)&rl < end) + break; + prev_end = end; + } + CHECK((uintptr_t)&rl >= start && (uintptr_t)&rl < end); + + // Get stacksize from rlimit, but clip it so that it does not overlap + // with other mappings. + size_t stacksize = rl.rlim_cur; + if (stacksize > end - prev_end) + stacksize = end - prev_end; + if (stacksize > kMaxThreadStackSize) + stacksize = kMaxThreadStackSize; + stack_top_ = end; + stack_bottom_ = end - stacksize; + CHECK(AddrIsInStack((uintptr_t)&rl)); + return; + } +#endif pthread_attr_t attr; CHECK(pthread_getattr_np(pthread_self(), &attr) == 0); size_t stacksize = 0; From kcc at google.com Wed Dec 28 14:39:32 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 12:39:32 -0800 Subject: [llvm-commits] [PATCH] asan-rt: discover main thread stack limits without pthread In-Reply-To: References: Message-ID: On Wed, Dec 28, 2011 at 3:38 AM, Evgeniy Stepanov wrote: > In fact, I'd like to copy or reimplement ProcMapsIterator as part of ASan > to escape the ifdef hell. > Yea, I've got tired of it myself too. I'll try to re-implement the linux version next week. > But this should be good enough, let's land it first. > r147321 (fixed style) Thanks (let's see if this fixed the wine problem) --kcc > On Wed, Dec 28, 2011 at 3:24 AM, Kostya Serebryany wrote: > > I like the idea (it may actually fix the problem reported today on wine), > > but we need to guard the code that uses sysinfo/sysinfo.h > > with ASAN_USE_SYSINFO==1 (see asan_stack.cc). > > Or add some macro machinery to get sysinfo.h from an alternative place > as in > > perftools (base/sysinfo.h). > > > > --kcc > > > > > > On Wed, Dec 21, 2011 at 4:04 AM, Evgeniy Stepanov > > wrote: > >> > >> Hi, > >> > >> if __asan_init is called from .preinit_array, pthread_getattr_np may > >> become unsafe.
This patch adds a different way of locating the stack > >> of the main thread with a combination of getrlimit() and > >> /proc/self/maps. > > > > > From kcc at google.com Wed Dec 28 14:47:22 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 20:47:22 -0000 Subject: [llvm-commits] [compiler-rt] r147322 - /compiler-rt/trunk/lib/asan/asan_thread_registry.cc Message-ID: <20111228204722.350842A6C12C@llvm.org> Author: kcc Date: Wed Dec 28 14:47:21 2011 New Revision: 147322 URL: http://llvm.org/viewvc/llvm-project?rev=147322&view=rev Log: [asan] missing PTHREAD_DESTRUCTOR_ITERATIONS on Android. patch by eugeni.stepanov at gmail.com Modified: compiler-rt/trunk/lib/asan/asan_thread_registry.cc Modified: compiler-rt/trunk/lib/asan/asan_thread_registry.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_thread_registry.cc?rev=147322&r1=147321&r2=147322&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_thread_registry.cc (original) +++ compiler-rt/trunk/lib/asan/asan_thread_registry.cc Wed Dec 28 14:47:21 2011 @@ -27,6 +27,12 @@ return asan_thread_registry; } +#ifdef ANDROID +#ifndef PTHREAD_DESTRUCTOR_ITERATIONS +#define PTHREAD_DESTRUCTOR_ITERATIONS 4 +#endif +#endif + // Dark magic below. In order to be able to notice that we're not handling // some thread creation routines (e.g.
on Mac OS) we want to distinguish the // thread that used to have a corresponding AsanThread object from the thread From kcc at google.com Wed Dec 28 14:50:52 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 12:50:52 -0800 Subject: [llvm-commits] [PATCH] asan-rt: missing PTHREAD_DESTRUCTOR_ITERATIONS on Android In-Reply-To: References: Message-ID: r147322 On Wed, Dec 28, 2011 at 3:41 AM, Evgeniy Stepanov wrote: > Hi, > > PTHREAD_DESTRUCTOR_ITERATIONS is missing from Android headers. > From eli.friedman at gmail.com Wed Dec 28 15:24:44 2011 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 28 Dec 2011 21:24:44 -0000 Subject: [llvm-commits] [llvm] r147323 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/vec_fpext.ll Message-ID: <20111228212444.85BB22A6C12C@llvm.org> Author: efriedma Date: Wed Dec 28 15:24:44 2011 New Revision: 147323 URL: http://llvm.org/viewvc/llvm-project?rev=147323&view=rev Log: Fix type-checking for load transformation which is not legal on floating-point types. PR11674. Added: llvm/trunk/test/CodeGen/X86/vec_fpext.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=147323&r1=147322&r2=147323&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Dec 28 15:24:44 2011 @@ -13924,7 +13924,8 @@ // shuffle. We need SSE4 for the shuffles. // TODO: It is possible to support ZExt by zeroing the undef values // during the shuffle phase or after the shuffle.
- if (RegVT.isVector() && Ext == ISD::EXTLOAD && Subtarget->hasSSE41()) { + if (RegVT.isVector() && RegVT.isInteger() && + Ext == ISD::EXTLOAD && Subtarget->hasSSE41()) { assert(MemVT != RegVT && "Cannot extend to the same type"); assert(MemVT.isVector() && "Must load a vector from memory"); Added: llvm/trunk/test/CodeGen/X86/vec_fpext.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vec_fpext.ll?rev=147323&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/vec_fpext.ll (added) +++ llvm/trunk/test/CodeGen/X86/vec_fpext.ll Wed Dec 28 15:24:44 2011 @@ -0,0 +1,14 @@ +; RUN: llc < %s -march=x86 -mattr=+sse41,-avx | FileCheck %s + +; PR11674 +define void @fpext_frommem(<2 x float>* %in, <2 x double>* %out) { +entry: +; TODO: We should be able to generate cvtps2pd for the load. +; For now, just check that we generate something sane. +; CHECK: cvtss2sd +; CHECK: cvtss2sd + %0 = load <2 x float>* %in, align 8 + %1 = fpext <2 x float> %0 to <2 x double> + store <2 x double> %1, <2 x double>* %out, align 1 + ret void +} From kcc at google.com Wed Dec 28 16:58:01 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 22:58:01 -0000 Subject: [llvm-commits] [compiler-rt] r147326 - in /compiler-rt/trunk/lib/asan: asan_allocator.cc asan_interceptors.cc asan_interceptors.h asan_internal.h asan_linux.cc asan_mac.cc asan_printf.cc asan_rtl.cc Message-ID: <20111228225801.D15042A6C12C@llvm.org> Author: kcc Date: Wed Dec 28 16:58:01 2011 New Revision: 147326 URL: http://llvm.org/viewvc/llvm-project?rev=147326&view=rev Log: [asan] use custom libc-free getenv; a bit of refactoring around mmap calls Modified: compiler-rt/trunk/lib/asan/asan_allocator.cc compiler-rt/trunk/lib/asan/asan_interceptors.cc compiler-rt/trunk/lib/asan/asan_interceptors.h compiler-rt/trunk/lib/asan/asan_internal.h compiler-rt/trunk/lib/asan/asan_linux.cc compiler-rt/trunk/lib/asan/asan_mac.cc 
    compiler-rt/trunk/lib/asan/asan_printf.cc
    compiler-rt/trunk/lib/asan/asan_rtl.cc

Modified: compiler-rt/trunk/lib/asan/asan_allocator.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_allocator.cc?rev=147326&r1=147325&r2=147326&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_allocator.cc (original)
+++ compiler-rt/trunk/lib/asan/asan_allocator.cc Wed Dec 28 16:58:01 2011
@@ -60,32 +60,15 @@
 static const size_t kMaxAllowedMallocSize = 8UL << 30;  // 8G
 #endif

-static void OutOfMemoryMessage(const char *mem_type, size_t size) {
-  AsanThread *t = asanThreadRegistry().GetCurrent();
-  CHECK(t);
-  Report("ERROR: AddressSanitizer failed to allocate "
-         "0x%lx (%lu) bytes (%s) in T%d\n",
-         size, size, mem_type, t->tid());
-}
-
 static inline bool IsAligned(uintptr_t a, uintptr_t alignment) {
   return (a & (alignment - 1)) == 0;
 }

-static inline bool IsPowerOfTwo(size_t x) {
-  return (x & (x - 1)) == 0;
-}
-
 static inline size_t Log2(size_t x) {
   CHECK(IsPowerOfTwo(x));
   return __builtin_ctzl(x);
 }

-static inline size_t RoundUpTo(size_t size, size_t boundary) {
-  CHECK(IsPowerOfTwo(boundary));
-  return (size + boundary - 1) & ~(boundary - 1);
-}
-
 static inline size_t RoundUpToPowerOfTwo(size_t size) {
   CHECK(size);
   if (IsPowerOfTwo(size)) return size;
@@ -132,14 +115,7 @@

 static uint8_t *MmapNewPagesAndPoisonShadow(size_t size) {
   CHECK(IsAligned(size, kPageSize));
-  uint8_t *res = (uint8_t*)asan_mmap(0, size,
-                                     PROT_READ | PROT_WRITE,
-                                     MAP_PRIVATE | MAP_ANON, -1, 0);
-  if (res == (uint8_t*)-1) {
-    OutOfMemoryMessage(__FUNCTION__, size);
-    PRINT_CURRENT_STACK();
-    ASAN_DIE;
-  }
+  uint8_t *res = (uint8_t*)AsanMmapSomewhereOrDie(size, __FUNCTION__);
   PoisonShadow((uintptr_t)res, size, kAsanHeapLeftRedzoneMagic);
   if (FLAG_debug) {
     Printf("ASAN_MMAP: [%p, %p)\n", res, res + size);
@@ -929,8 +905,7 @@
     if (mem) {
       PoisonShadow(mem, ClassMmapSize(i), 0);
       allocated_size_classes_[i] = 0;
-      int munmap_res = munmap((void*)mem, ClassMmapSize(i));
-      CHECK(munmap_res == 0);
+      AsanUnmapOrDie((void*)mem, ClassMmapSize(i));
     }
   }
 }
@@ -941,10 +916,8 @@

 void FakeStack::AllocateOneSizeClass(size_t size_class) {
   CHECK(ClassMmapSize(size_class) >= kPageSize);
-  uintptr_t new_mem = (uintptr_t)asan_mmap(0, ClassMmapSize(size_class),
-                                           PROT_READ | PROT_WRITE,
-                                           MAP_PRIVATE | MAP_ANON, -1, 0);
-  CHECK(new_mem != (uintptr_t)-1);
+  uintptr_t new_mem = (uintptr_t)AsanMmapSomewhereOrDie(
+      ClassMmapSize(size_class), __FUNCTION__);
   // Printf("T%d new_mem[%ld]: %p-%p mmap %ld\n",
   //        asanThreadRegistry().GetCurrent()->tid(),
   //        size_class, new_mem, new_mem + ClassMmapSize(size_class),

Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.cc?rev=147326&r1=147325&r2=147326&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_interceptors.cc (original)
+++ compiler-rt/trunk/lib/asan/asan_interceptors.cc Wed Dec 28 16:58:01 2011
@@ -119,6 +119,23 @@
   return i;
 }

+void* internal_memchr(const void* s, int c, size_t n) {
+  const char* t = (char*)s;
+  for (size_t i = 0; i < n; ++i, ++t)
+    if (*t == c)
+      return (void*)t;
+  return NULL;
+}
+
+int internal_memcmp(const void* s1, const void* s2, size_t n) {
+  const char* t1 = (char*)s1;
+  const char* t2 = (char*)s2;
+  for (size_t i = 0; i < n; ++i, ++t1, ++t2)
+    if (*t1 != *t2)
+      return *t1 < *t2 ? -1 : 1;
+  return 0;
+}
+
 void InitializeAsanInterceptors() {
 #ifndef __APPLE__
   INTERCEPT_FUNCTION(index);

Modified: compiler-rt/trunk/lib/asan/asan_interceptors.h
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.h?rev=147326&r1=147325&r2=147326&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_interceptors.h (original)
+++ compiler-rt/trunk/lib/asan/asan_interceptors.h Wed Dec 28 16:58:01 2011
@@ -123,6 +123,8 @@
 // __asan::internal_X() is the implementation of X() for use in RTL.
 size_t internal_strlen(const char *s);
 size_t internal_strnlen(const char *s, size_t maxlen);
+void* internal_memchr(const void* s, int c, size_t n);
+int internal_memcmp(const void* s1, const void* s2, size_t n);

 // Initializes pointers to str*/mem* functions.
 void InitializeAsanInterceptors();

Modified: compiler-rt/trunk/lib/asan/asan_internal.h
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_internal.h?rev=147326&r1=147325&r2=147326&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_internal.h (original)
+++ compiler-rt/trunk/lib/asan/asan_internal.h Wed Dec 28 16:58:01 2011
@@ -83,11 +83,20 @@
 // asan_malloc_linux.cc / asan_malloc_mac.cc
 void ReplaceSystemMalloc();

+void OutOfMemoryMessageAndDie(const char *mem_type, size_t size);
+
 // asan_linux.cc / asan_mac.cc
 void *AsanDoesNotSupportStaticLinkage();
+int AsanOpenReadonly(const char* filename);
 void *asan_mmap(void *addr, size_t length, int prot, int flags,
                 int fd, uint64_t offset);
-ssize_t asan_write(int fd, const void *buf, size_t count);
+
+void *AsanMmapSomewhereOrDie(size_t size, const char *where);
+void AsanUnmapOrDie(void *ptr, size_t size);
+
+ssize_t AsanRead(int fd, void *buf, size_t count);
+ssize_t AsanWrite(int fd, const void *buf, size_t count);
+int AsanClose(int fd);

 // asan_printf.cc
 void RawWrite(const char *buffer);
@@ -109,7 +118,6 @@
                                      uintptr_t redzone_size,
                                      uint8_t value);

-
 extern size_t FLAG_quarantine_size;
 extern int FLAG_demangle;
 extern bool FLAG_symbolize;
@@ -185,6 +193,16 @@
 static const uintptr_t kCurrentStackFrameMagic = 0x41B58AB3;
 static const uintptr_t kRetiredStackFrameMagic = 0x45E0360E;

+// --------------------------- Bit twiddling ------- {{{1
+inline bool IsPowerOfTwo(size_t x) {
+  return (x & (x - 1)) == 0;
+}
+
+inline size_t RoundUpTo(size_t size, size_t boundary) {
+  CHECK(IsPowerOfTwo(boundary));
+  return (size + boundary - 1) & ~(boundary - 1);
+}
+
 // -------------------------- LowLevelAllocator ----- {{{1
 // A simple low-level memory allocator for internal use.
 class LowLevelAllocator {

Modified: compiler-rt/trunk/lib/asan/asan_linux.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_linux.cc?rev=147326&r1=147325&r2=147326&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_linux.cc (original)
+++ compiler-rt/trunk/lib/asan/asan_linux.cc Wed Dec 28 16:58:01 2011
@@ -17,6 +17,8 @@

 #include
 #include
+#include
+#include
 #include

 extern char _DYNAMIC[];
@@ -28,22 +30,49 @@
   return &_DYNAMIC;
 }

-#ifdef ANDROID
-#define SYS_mmap2 __NR_mmap2
-#define SYS_write __NR_write
-#endif
-
 void *asan_mmap(void *addr, size_t length, int prot, int flags,
                 int fd, uint64_t offset) {
 # if __WORDSIZE == 64
-  return (void *)syscall(SYS_mmap, addr, length, prot, flags, fd, offset);
+  return (void *)syscall(__NR_mmap, addr, length, prot, flags, fd, offset);
 # else
-  return (void *)syscall(SYS_mmap2, addr, length, prot, flags, fd, offset);
+  return (void *)syscall(__NR_mmap2, addr, length, prot, flags, fd, offset);
 # endif
 }

-ssize_t asan_write(int fd, const void *buf, size_t count) {
-  return (ssize_t)syscall(SYS_write, fd, buf, count);
+void *AsanMmapSomewhereOrDie(size_t size, const char *mem_type) {
+  size = RoundUpTo(size, kPageSize);
+  void *res = asan_mmap(0, size,
+                        PROT_READ | PROT_WRITE,
+                        MAP_PRIVATE | MAP_ANON, -1, 0);
+  if (res == (void*)-1) {
+    OutOfMemoryMessageAndDie(mem_type, size);
+  }
+  return res;
+}
+
+void AsanUnmapOrDie(void *addr, size_t size) {
+  if (!addr || !size) return;
+  int res = syscall(__NR_munmap, addr, size);
+  if (res != 0) {
+    Report("Failed to unmap\n");
+    ASAN_DIE;
+  }
+}
+
+ssize_t AsanWrite(int fd, const void *buf, size_t count) {
+  return (ssize_t)syscall(__NR_write, fd, buf, count);
+}
+
+int AsanOpenReadonly(const char* filename) {
+  return open(filename, O_RDONLY);
+}
+
+ssize_t AsanRead(int fd, void *buf, size_t count) {
+  return (ssize_t)syscall(__NR_read, fd, buf, count);
+}
+
+int AsanClose(int fd) {
+  return close(fd);
 }

 }  // namespace __asan

Modified: compiler-rt/trunk/lib/asan/asan_mac.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_mac.cc?rev=147326&r1=147325&r2=147326&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_mac.cc (original)
+++ compiler-rt/trunk/lib/asan/asan_mac.cc Wed Dec 28 16:58:01 2011
@@ -49,6 +49,38 @@
   return write(fd, buf, count);
 }

+void *AsanMmapSomewhereOrDie(size_t size, const char *mem_type) {
+  size = RoundUpTo(size, kPageSize);
+  void *res = asan_mmap(0, size,
+                        PROT_READ | PROT_WRITE,
+                        MAP_PRIVATE | MAP_ANON, -1, 0);
+  if (res == (void*)-1) {
+    OutOfMemoryMessageAndDie(mem_type, size);
+  }
+  return res;
+}
+
+void AsanUnmapOrDie(void *addr, size_t size) {
+  if (!addr || !size) return;
+  int res = munmap(addr, size);
+  if (res != 0) {
+    Report("Failed to unmap\n");
+    ASAN_DIE;
+  }
+}
+
+int AsanOpenReadonly(const char* filename) {
+  return open(filename, O_RDONLY);
+}
+
+ssize_t AsanRead(int fd, void *buf, size_t count) {
+  return read(fd, buf, count);
+}
+
+int AsanClose(int fd) {
+  return close(fd);
+}
+
 // Support for the following functions from libdispatch on Mac OS:
 // dispatch_async_f()
 // dispatch_async()

Modified: compiler-rt/trunk/lib/asan/asan_printf.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_printf.cc?rev=147326&r1=147325&r2=147326&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_printf.cc (original)
+++ compiler-rt/trunk/lib/asan/asan_printf.cc Wed Dec 28 16:58:01 2011
@@ -24,8 +24,8 @@
 void RawWrite(const char *buffer) {
   static const char *kRawWriteError = "RawWrite can't output requested buffer!";
   ssize_t length = (ssize_t)internal_strlen(buffer);
-  if (length != asan_write(2, buffer, length)) {
-    asan_write(2, kRawWriteError, internal_strlen(kRawWriteError));
+  if (length != AsanWrite(2, buffer, length)) {
+    AsanWrite(2, kRawWriteError, internal_strlen(kRawWriteError));
     ASAN_DIE;
   }
 }

Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_rtl.cc?rev=147326&r1=147325&r2=147326&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_rtl.cc (original)
+++ compiler-rt/trunk/lib/asan/asan_rtl.cc Wed Dec 28 16:58:01 2011
@@ -122,6 +122,56 @@
   Printf("\n");
 }

+// Opens the file 'file_name" and reads up to 'max_len' bytes.
+// The resulting buffer is mmaped and stored in '*buff'.
+// Returns the number of read bytes or -1 if file can not be opened.
+static ssize_t ReadFileToBuffer(const char *file_name, char **buff,
+                                size_t max_len) {
+  const size_t kMinFileLen = kPageSize;
+  ssize_t read_len = -1;
+  *buff = 0;
+  size_t maped_size = 0;
+  // The files we usually open are not seekable, so try different buffer sizes.
+  for (size_t size = kMinFileLen; size <= max_len; size *= 2) {
+    int fd = AsanOpenReadonly(file_name);
+    if (fd < 0) return -1;
+    AsanUnmapOrDie(*buff, maped_size);
+    maped_size = size;
+    *buff = (char*)AsanMmapSomewhereOrDie(size, __FUNCTION__);
+    read_len = AsanRead(fd, *buff, size);
+    AsanClose(fd);
+    if (read_len < size)  // We've read the whole file.
+      break;
+  }
+  return read_len;
+}
+
+// Like getenv, but reads env directly from /proc and does not use libc.
+// This function should be called first inside __asan_init.
+static const char* GetEnvFromProcSelfEnviron(const char* name) {
+  static char *environ;
+  static ssize_t len;
+  static bool inited;
+  if (!inited) {
+    inited = true;
+    len = ReadFileToBuffer("/proc/self/environ", &environ, 1 << 20);
+  }
+  if (!environ || len <= 0) return NULL;
+  size_t namelen = internal_strlen(name);
+  const char *p = environ;
+  while (*p != '\0') {  // will happen at the \0\0 that terminates the buffer
+    // proc file has the format NAME=value\0NAME=value\0NAME=value\0...
+    const char* endp =
+        (char*)internal_memchr(p, '\0', len - (p - environ));
+    if (endp == NULL)  // this entry isn't NUL terminated
+      return NULL;
+    else if (!internal_memcmp(p, name, namelen) && p[namelen] == '=')  // Match.
+      return p + namelen + 1;  // point after =
+    p = endp + 1;
+  }
+  return NULL;  // Not found.
+}
+
 // ---------------------- Thread ------------------------- {{{1
 static void *asan_thread_start(void *arg) {
   AsanThread *t= (AsanThread*)arg;
@@ -130,10 +180,12 @@
 }

 // ---------------------- mmap -------------------- {{{1
-static void OutOfMemoryMessage(const char *mem_type, size_t size) {
+void OutOfMemoryMessageAndDie(const char *mem_type, size_t size) {
   Report("ERROR: AddressSanitizer failed to allocate "
          "0x%lx (%ld) bytes of %s\n",
          size, size, mem_type);
+  PRINT_CURRENT_STACK();
+  ShowStatsAndAbort();
 }

 static char *mmap_pages(size_t start_page, size_t n_pages, const char *mem_type,
@@ -144,8 +196,7 @@
   // Printf("%p => %p\n", (void*)start_page, res);
   char *ch = (char*)res;
   if (res == (void*)-1L && abort_on_failure) {
-    OutOfMemoryMessage(mem_type, n_pages * kPageSize);
-    ShowStatsAndAbort();
+    OutOfMemoryMessageAndDie(mem_type, n_pages * kPageSize);
   }
   CHECK(res == (void*)start_page || res == (void*)-1L);
   return ch;
@@ -175,10 +226,8 @@
   CHECK((size & (size - 1)) == 0 && "size must be a power of two");
   if (allocated_end_ - allocated_current_ < size) {
     size_t size_to_allocate = Max(size, kPageSize);
-    allocated_current_ = (char*)asan_mmap(0, size_to_allocate,
-                                          PROT_READ | PROT_WRITE,
-                                          MAP_PRIVATE | MAP_ANON, -1, 0);
-    CHECK((allocated_current_ != (char*)-1) && "Can't mmap");
+    allocated_current_ =
+        (char*)AsanMmapSomewhereOrDie(size_to_allocate, __FUNCTION__);
     allocated_end_ = allocated_current_ + size_to_allocate;
     PoisonShadow((uintptr_t)allocated_current_, size_to_allocate,
                  kAsanInternalHeapMagic);
@@ -306,7 +355,7 @@
     return;
   }
   // Write the first message using the bullet-proof write.
-  if (13 != asan_write(2, "ASAN:SIGSEGV\n", 13)) ASAN_DIE;
+  if (13 != AsanWrite(2, "ASAN:SIGSEGV\n", 13)) ASAN_DIE;
   uintptr_t pc, sp, bp, ax;
   GetPcSpBpAx(context, &pc, &sp, &bp, &ax);
   Report("ERROR: AddressSanitizer crashed on unknown address %p"
@@ -321,7 +370,7 @@

 static void ASAN_OnSIGILL(int, siginfo_t *siginfo, void *context) {
   // Write the first message using the bullet-proof write.
-  if (12 != asan_write(2, "ASAN:SIGILL\n", 12)) ASAN_DIE;
+  if (12 != AsanWrite(2, "ASAN:SIGILL\n", 12)) ASAN_DIE;
   uintptr_t pc, sp, bp, ax;
   GetPcSpBpAx(context, &pc, &sp, &bp, &ax);

@@ -656,7 +705,7 @@
   AsanDoesNotSupportStaticLinkage();

   // flags
-  const char *options = getenv("ASAN_OPTIONS");
+  const char *options = GetEnvFromProcSelfEnviron("ASAN_OPTIONS");
   FLAG_malloc_context_size =
       IntFlagValue(options, "malloc_context_size=", kMallocContextSize);
   CHECK(FLAG_malloc_context_size <= kMallocContextSize);

From nicholas at mxc.ca Wed Dec 28 17:24:21 2011
From: nicholas at mxc.ca (Nick Lewycky)
Date: Wed, 28 Dec 2011 23:24:21 -0000
Subject: [llvm-commits] [llvm] r147327 - in /llvm/trunk: include/llvm/Analysis/CaptureTracking.h lib/Analysis/CaptureTracking.cpp lib/Analysis/MemoryDependenceAnalysis.cpp lib/Transforms/IPO/FunctionAttrs.cpp test/Analysis/TypeBasedAliasAnalysis/functionattrs.ll test/Transforms/FunctionAttrs/nocapture.ll
Message-ID: <20111228232421.A6C612A6C12C@llvm.org>

Author: nicholas
Date: Wed Dec 28 17:24:21 2011
New Revision: 147327

URL: http://llvm.org/viewvc/llvm-project?rev=147327&view=rev
Log:
Change CaptureTracking to pass a Use* instead of a Value* when a value is
captured. This allows the tracker to look at the specific use, which may be
especially interesting for function calls.

Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does
not iterate until a fixpoint and does not guarantee that it produces the same
result regardless of iteration order.
The new implementation builds up a graph of how arguments are passed from
function to function, and uses a bottom-up walk on the argument-SCCs to assign
nocapture. This gets us nocapture more often, and does so rather efficiently
and independent of iteration order.

Modified:
    llvm/trunk/include/llvm/Analysis/CaptureTracking.h
    llvm/trunk/lib/Analysis/CaptureTracking.cpp
    llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
    llvm/trunk/lib/Transforms/IPO/FunctionAttrs.cpp
    llvm/trunk/test/Analysis/TypeBasedAliasAnalysis/functionattrs.ll
    llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll

Modified: llvm/trunk/include/llvm/Analysis/CaptureTracking.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/CaptureTracking.h?rev=147327&r1=147326&r2=147327&view=diff
==============================================================================
--- llvm/trunk/include/llvm/Analysis/CaptureTracking.h (original)
+++ llvm/trunk/include/llvm/Analysis/CaptureTracking.h Wed Dec 28 17:24:21 2011
@@ -50,10 +50,10 @@
     /// U->getUser() is always an Instruction.
     virtual bool shouldExplore(Use *U) = 0;

-    /// captured - The instruction I captured the pointer. Return true to
-    /// stop the traversal or false to continue looking for more capturing
-    /// instructions.
-    virtual bool captured(Instruction *I) = 0;
+    /// captured - Information about the pointer was captured by the user of
+    /// use U. Return true to stop the traversal or false to continue looking
+    /// for more capturing instructions.
+    virtual bool captured(Use *U) = 0;
   };

   /// PointerMayBeCaptured - Visit the value and the values derived from it and

Modified: llvm/trunk/lib/Analysis/CaptureTracking.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/CaptureTracking.cpp?rev=147327&r1=147326&r2=147327&view=diff
==============================================================================
--- llvm/trunk/lib/Analysis/CaptureTracking.cpp (original)
+++ llvm/trunk/lib/Analysis/CaptureTracking.cpp Wed Dec 28 17:24:21 2011
@@ -30,8 +30,8 @@

     bool shouldExplore(Use *U) { return true; }

-    bool captured(Instruction *I) {
-      if (isa(I) && !ReturnCaptures)
+    bool captured(Use *U) {
+      if (isa(U->getUser()) && !ReturnCaptures)
         return false;

       Captured = true;
@@ -117,7 +117,7 @@
       for (CallSite::arg_iterator A = B; A != E; ++A)
         if (A->get() == V && !CS.doesNotCapture(A - B))
           // The parameter is not marked 'nocapture' - captured.
-          if (Tracker->captured(I))
+          if (Tracker->captured(U))
             return;
       break;
     }
@@ -130,7 +130,7 @@
     case Instruction::Store:
       if (V == I->getOperand(0))
         // Stored the pointer - conservatively assume it may be captured.
-        if (Tracker->captured(I))
+        if (Tracker->captured(U))
           return;
       // Storing to the pointee does not cause the pointer to be captured.
       break;
@@ -158,12 +158,12 @@
           break;
       // Otherwise, be conservative. There are crazy ways to capture pointers
       // using comparisons.
-      if (Tracker->captured(I))
+      if (Tracker->captured(U))
         return;
       break;
     default:
       // Something else - be conservative and say it is captured.
-      if (Tracker->captured(I))
+      if (Tracker->captured(U))
         return;
       break;
     }

Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=147327&r1=147326&r2=147327&view=diff
==============================================================================
--- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
+++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Wed Dec 28 17:24:21 2011
@@ -349,7 +349,8 @@
       return true;
     }

-    bool captured(Instruction *I) {
+    bool captured(Use *U) {
+      Instruction *I = cast(U->getUser());
       if (BeforeHere != I && DT->dominates(BeforeHere, I))
         return false;
       Captured = true;

Modified: llvm/trunk/lib/Transforms/IPO/FunctionAttrs.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/FunctionAttrs.cpp?rev=147327&r1=147326&r2=147327&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/IPO/FunctionAttrs.cpp (original)
+++ llvm/trunk/lib/Transforms/IPO/FunctionAttrs.cpp Wed Dec 28 17:24:21 2011
@@ -27,6 +27,7 @@
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/Analysis/CaptureTracking.h"
+#include "llvm/ADT/SCCIterator.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/ADT/UniqueVector.h"
@@ -225,31 +226,247 @@
   return MadeChange;
 }

+namespace {
+  // For a given pointer Argument, this retains a list of Arguments of functions
+  // in the same SCC that the pointer data flows into. We use this to build an
+  // SCC of the arguments.
+  struct ArgumentGraphNode {
+    Argument *Definition;
+    SmallVector Uses;
+  };
+
+  class ArgumentGraph {
+    // We store pointers to ArgumentGraphNode objects, so it's important that
+    // that they not move around upon insert.
+    typedef std::map ArgumentMapTy;
+
+    ArgumentMapTy ArgumentMap;
+
+    // There is no root node for the argument graph, in fact:
+    //   void f(int *x, int *y) { if (...) f(x, y); }
+    // is an example where the graph is disconnected. The SCCIterator requires a
+    // single entry point, so we maintain a fake ("synthetic") root node that
+    // uses every node. Because the graph is directed and nothing points into
+    // the root, it will not participate in any SCCs (except for its own).
+    ArgumentGraphNode SyntheticRoot;
+
+  public:
+    ArgumentGraph() { SyntheticRoot.Definition = 0; }
+
+    typedef SmallVectorImpl::iterator iterator;
+
+    iterator begin() { return SyntheticRoot.Uses.begin(); }
+    iterator end() { return SyntheticRoot.Uses.end(); }
+    ArgumentGraphNode *getEntryNode() { return &SyntheticRoot; }
+
+    ArgumentGraphNode *operator[](Argument *A) {
+      ArgumentGraphNode &Node = ArgumentMap[A];
+      Node.Definition = A;
+      SyntheticRoot.Uses.push_back(&Node);
+      return &Node;
+    }
+  };
+
+  // This tracker checks whether callees are in the SCC, and if so it does not
+  // consider that a capture, instead adding it to the "Uses" list and
+  // continuing with the analysis.
+  struct ArgumentUsesTracker : public CaptureTracker {
+    ArgumentUsesTracker(const SmallPtrSet &SCCNodes)
+      : Captured(false), SCCNodes(SCCNodes) {}
+
+    void tooManyUses() { Captured = true; }
+
+    bool shouldExplore(Use *U) { return true; }
+
+    bool captured(Use *U) {
+      CallSite CS(U->getUser());
+      if (!CS.getInstruction()) { Captured = true; return true; }
+
+      Function *F = CS.getCalledFunction();
+      if (!F || !SCCNodes.count(F)) { Captured = true; return true; }
+
+      Function::arg_iterator AI = F->arg_begin(), AE = F->arg_end();
+      for (CallSite::arg_iterator PI = CS.arg_begin(), PE = CS.arg_end();
+           PI != PE; ++PI, ++AI) {
+        if (AI == AE) {
+          assert(F->isVarArg() && "More params than args in non-varargs call");
+          Captured = true;
+          return true;
+        }
+        if (PI == U) {
+          Uses.push_back(AI);
+          break;
+        }
+      }
+      assert(!Uses.empty() && "Capturing call-site captured nothing?");
+      return false;
+    }
+
+    bool Captured;  // True only if certainly captured (used outside our SCC).
+    SmallVector Uses;  // Uses within our SCC.
+
+    const SmallPtrSet &SCCNodes;
+  };
+}
+
+namespace llvm {
+  template<> struct GraphTraits {
+    typedef ArgumentGraphNode NodeType;
+    typedef SmallVectorImpl::iterator ChildIteratorType;
+
+    static inline NodeType *getEntryNode(NodeType *A) { return A; }
+    static inline ChildIteratorType child_begin(NodeType *N) {
+      return N->Uses.begin();
+    }
+    static inline ChildIteratorType child_end(NodeType *N) {
+      return N->Uses.end();
+    }
+  };
+  template<> struct GraphTraits
+    : public GraphTraits {
+    static NodeType *getEntryNode(ArgumentGraph *AG) {
+      return AG->getEntryNode();
+    }
+    static ChildIteratorType nodes_begin(ArgumentGraph *AG) {
+      return AG->begin();
+    }
+    static ChildIteratorType nodes_end(ArgumentGraph *AG) {
+      return AG->end();
+    }
+  };
+}
+
 /// AddNoCaptureAttrs - Deduce nocapture attributes for the SCC.
 bool FunctionAttrs::AddNoCaptureAttrs(const CallGraphSCC &SCC) {
   bool Changed = false;

+  SmallPtrSet SCCNodes;
+
+  // Fill SCCNodes with the elements of the SCC. Used for quickly
+  // looking up whether a given CallGraphNode is in this SCC.
+  for (CallGraphSCC::iterator I = SCC.begin(), E = SCC.end(); I != E; ++I) {
+    Function *F = (*I)->getFunction();
+    if (F && !F->isDeclaration() && !F->mayBeOverridden())
+      SCCNodes.insert(F);
+  }
+
+  ArgumentGraph AG;
+
   // Check each function in turn, determining which pointer arguments are not
   // captured.
   for (CallGraphSCC::iterator I = SCC.begin(), E = SCC.end(); I != E; ++I) {
     Function *F = (*I)->getFunction();

     if (F == 0)
-      // External node - skip it;
+      // External node - only a problem for arguments that we pass to it.
       continue;

     // Definitions with weak linkage may be overridden at linktime with
-    // something that writes memory, so treat them like declarations.
+    // something that captures pointers, so treat them like declarations.
     if (F->isDeclaration() || F->mayBeOverridden())
       continue;

+    // Functions that are readonly (or readnone) and nounwind and don't return
+    // a value can't capture arguments. Don't analyze them.
+    if (F->onlyReadsMemory() && F->doesNotThrow() &&
+        F->getReturnType()->isVoidTy()) {
+      for (Function::arg_iterator A = F->arg_begin(), E = F->arg_end();
+           A != E; ++A) {
+        if (A->getType()->isPointerTy() && !A->hasNoCaptureAttr()) {
+          A->addAttr(Attribute::NoCapture);
+          ++NumNoCapture;
+          Changed = true;
+        }
+      }
+      continue;
+    }
+
     for (Function::arg_iterator A = F->arg_begin(), E = F->arg_end(); A!=E; ++A)
-      if (A->getType()->isPointerTy() && !A->hasNoCaptureAttr() &&
-          !PointerMayBeCaptured(A, true, /*StoreCaptures=*/false)) {
-        A->addAttr(Attribute::NoCapture);
+      if (A->getType()->isPointerTy() && !A->hasNoCaptureAttr()) {
+        ArgumentUsesTracker Tracker(SCCNodes);
+        PointerMayBeCaptured(A, &Tracker);
+        if (!Tracker.Captured) {
+          if (Tracker.Uses.empty()) {
+            // If it's trivially not captured, mark it nocapture now.
+            A->addAttr(Attribute::NoCapture);
+            ++NumNoCapture;
+            Changed = true;
+          } else {
+            // If it's not trivially captured and not trivially not captured,
+            // then it must be calling into another function in our SCC. Save
+            // its particulars for Argument-SCC analysis later.
+            ArgumentGraphNode *Node = AG[A];
+            for (SmallVectorImpl::iterator UI = Tracker.Uses.begin(),
+                   UE = Tracker.Uses.end(); UI != UE; ++UI)
+              Node->Uses.push_back(AG[*UI]);
+          }
+        }
+        // Otherwise, it's captured. Don't bother doing SCC analysis on it.
+      }
+  }
+
+  // The graph we've collected is partial because we stopped scanning for
+  // argument uses once we solved the argument trivially. These partial nodes
+  // show up as ArgumentGraphNode objects with an empty Uses list, and for
+  // these nodes the final decision about whether they capture has already been
+  // made. If the definition doesn't have a 'nocapture' attribute by now, it
+  // captures.
+
+  for (scc_iterator I = scc_begin(&AG), E = scc_end(&AG);
+       I != E; ++I) {
+    std::vector &ArgumentSCC = *I;
+    if (ArgumentSCC.size() == 1) {
+      if (!ArgumentSCC[0]->Definition) continue;  // synthetic root node
+
+      // eg. "void f(int* x) { if (...) f(x); }"
+      if (ArgumentSCC[0]->Uses.size() == 1 &&
+          ArgumentSCC[0]->Uses[0] == ArgumentSCC[0]) {
+        ArgumentSCC[0]->Definition->addAttr(Attribute::NoCapture);
         ++NumNoCapture;
         Changed = true;
       }
+      continue;
+    }
+
+    bool SCCCaptured = false;
+    for (std::vector::iterator I = ArgumentSCC.begin(),
+           E = ArgumentSCC.end(); I != E && !SCCCaptured; ++I) {
+      ArgumentGraphNode *Node = *I;
+      if (Node->Uses.empty()) {
+        if (!Node->Definition->hasNoCaptureAttr())
+          SCCCaptured = true;
+      }
+    }
+    if (SCCCaptured) continue;
+
+    SmallPtrSet ArgumentSCCNodes;
+    // Fill ArgumentSCCNodes with the elements of the ArgumentSCC. Used for
+    // quickly looking up whether a given Argument is in this ArgumentSCC.
+    for (std::vector::iterator I = ArgumentSCC.begin(),
+           E = ArgumentSCC.end(); I != E; ++I) {
+      ArgumentSCCNodes.insert((*I)->Definition);
+    }
+
+    for (std::vector::iterator I = ArgumentSCC.begin(),
+           E = ArgumentSCC.end(); I != E && !SCCCaptured; ++I) {
+      ArgumentGraphNode *N = *I;
+      for (SmallVectorImpl::iterator UI = N->Uses.begin(),
+             UE = N->Uses.end(); UI != UE; ++UI) {
+        Argument *A = (*UI)->Definition;
+        if (A->hasNoCaptureAttr() || ArgumentSCCNodes.count(A))
+          continue;
+        SCCCaptured = true;
+        break;
+      }
+    }
+    if (SCCCaptured) continue;
+
+    for (unsigned i = 0, e = ArgumentSCC.size(); i != e && !SCCCaptured; ++i) {
+      Argument *A = ArgumentSCC[i]->Definition;
+      A->addAttr(Attribute::NoCapture);
+      ++NumNoCapture;
+      Changed = true;
+    }
   }

   return Changed;

Modified: llvm/trunk/test/Analysis/TypeBasedAliasAnalysis/functionattrs.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/TypeBasedAliasAnalysis/functionattrs.ll?rev=147327&r1=147326&r2=147327&view=diff
==============================================================================
--- llvm/trunk/test/Analysis/TypeBasedAliasAnalysis/functionattrs.ll (original)
+++ llvm/trunk/test/Analysis/TypeBasedAliasAnalysis/functionattrs.ll Wed Dec 28 17:24:21 2011
@@ -24,7 +24,7 @@

 ; Add the readonly attribute, since there's just a call to a function which
 ; TBAA says doesn't modify any memory.
-; CHECK: define void @test1_yes(i32* %p) nounwind readonly {
+; CHECK: define void @test1_yes(i32* nocapture %p) nounwind readonly {
 define void @test1_yes(i32* %p) nounwind {
   call void @callee(i32* %p), !tbaa !1
   ret void

Modified: llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll?rev=147327&r1=147326&r2=147327&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll (original)
+++ llvm/trunk/test/Transforms/FunctionAttrs/nocapture.ll Wed Dec 28 17:24:21 2011
@@ -115,3 +115,64 @@
   call void %f(i8* nocapture %p)
   ret void
 }
+
+; CHECK: define void @test1_1(i8* nocapture %x1_1, i8* %y1_1)
+define void @test1_1(i8* %x1_1, i8* %y1_1) {
+  call i8* @test1_2(i8* %x1_1, i8* %y1_1)
+  store i32* null, i32** @g
+  ret void
+}
+
+; CHECK: define i8* @test1_2(i8* nocapture %x1_2, i8* %y1_2)
+define i8* @test1_2(i8* %x1_2, i8* %y1_2) {
+  call void @test1_1(i8* %x1_2, i8* %y1_2)
+  store i32* null, i32** @g
+  ret i8* %y1_2
+}
+
+; CHECK: define void @test2(i8* nocapture %x2)
+define void @test2(i8* %x2) {
+  call void @test2(i8* %x2)
+  store i32* null, i32** @g
+  ret void
+}
+
+; CHECK: define void @test3(i8* nocapture %x3, i8* nocapture %y3, i8* nocapture %z3)
+define void @test3(i8* %x3, i8* %y3, i8* %z3) {
+  call void @test3(i8* %z3, i8* %y3, i8* %x3)
+  store i32* null, i32** @g
+  ret void
+}
+
+; CHECK: define void @test4_1(i8* %x4_1)
+define void @test4_1(i8* %x4_1) {
+  call i8* @test4_2(i8* %x4_1, i8* %x4_1, i8* %x4_1)
+  store i32* null, i32** @g
+  ret void
+}
+
+; CHECK: define i8* @test4_2(i8* nocapture %x4_2, i8* %y4_2, i8* nocapture %z4_2)
+define i8* @test4_2(i8* %x4_2, i8* %y4_2, i8* %z4_2) {
+  call void @test4_1(i8* null)
+  store i32* null, i32** @g
+  ret i8* %y4_2
+}
+
+declare i8* @test5_1(i8* %x5_1)
+
+; CHECK: define void @test5_2(i8* %x5_2)
+define void @test5_2(i8* %x5_2) {
+  call i8* @test5_1(i8* %x5_2)
+  store i32* null, i32** @g
+  ret void
+}
+
+declare void @test6_1(i8* %x6_1, i8* nocapture %y6_1, ...)
+
+; CHECK: define void @test6_2(i8* %x6_2, i8* nocapture %y6_2, i8* %z6_2)
+define void @test6_2(i8* %x6_2, i8* %y6_2, i8* %z6_2) {
+  call void (i8*, i8*, ...)* @test6_1(i8* %x6_2, i8* %y6_2, i8* %z6_2)
+  store i32* null, i32** @g
+  ret void
+}
+

From kcc at google.com Wed Dec 28 17:28:55 2011
From: kcc at google.com (Kostya Serebryany)
Date: Wed, 28 Dec 2011 23:28:55 -0000
Subject: [llvm-commits] [compiler-rt] r147328 - in /compiler-rt/trunk/lib/asan: asan_allocator.cc asan_internal.h asan_linux.cc asan_mac.cc asan_rtl.cc asan_thread.cc
Message-ID: <20111228232855.359D82A6C12C@llvm.org>

Author: kcc
Date: Wed Dec 28 17:28:54 2011
New Revision: 147328

URL: http://llvm.org/viewvc/llvm-project?rev=147328&view=rev
Log:
[asan] refactoring: don't #include in non-os-specific files

Modified:
    compiler-rt/trunk/lib/asan/asan_allocator.cc
    compiler-rt/trunk/lib/asan/asan_internal.h
    compiler-rt/trunk/lib/asan/asan_linux.cc
    compiler-rt/trunk/lib/asan/asan_mac.cc
    compiler-rt/trunk/lib/asan/asan_rtl.cc
    compiler-rt/trunk/lib/asan/asan_thread.cc

Modified: compiler-rt/trunk/lib/asan/asan_allocator.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_allocator.cc?rev=147328&r1=147327&r2=147328&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_allocator.cc (original)
+++ compiler-rt/trunk/lib/asan/asan_allocator.cc Wed Dec 28 17:28:54 2011
@@ -35,7 +35,6 @@
 #include "asan_thread.h"
 #include "asan_thread_registry.h"

-#include
 #include
 #include
 #include

Modified: compiler-rt/trunk/lib/asan/asan_internal.h
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_internal.h?rev=147328&r1=147327&r2=147328&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_internal.h (original)
+++ compiler-rt/trunk/lib/asan/asan_internal.h Wed Dec 28 17:28:54 2011
@@ -88,9 +88,10 @@
 // asan_linux.cc / asan_mac.cc
 void *AsanDoesNotSupportStaticLinkage();
 int AsanOpenReadonly(const char* filename);
-void *asan_mmap(void *addr, size_t length, int prot, int flags,
-                int fd, uint64_t offset);

+void *AsanMmapFixedNoReserve(uintptr_t fixed_addr, size_t size);
+void *AsanMmapFixedReserve(uintptr_t fixed_addr, size_t size);
+void *AsanMprotect(uintptr_t fixed_addr, size_t size);
 void *AsanMmapSomewhereOrDie(size_t size, const char *where);
 void AsanUnmapOrDie(void *ptr, size_t size);


Modified: compiler-rt/trunk/lib/asan/asan_linux.cc
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_linux.cc?rev=147328&r1=147327&r2=147328&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/asan_linux.cc (original)
+++ compiler-rt/trunk/lib/asan/asan_linux.cc Wed Dec 28 17:28:54 2011
@@ -30,7 +30,7 @@
   return &_DYNAMIC;
 }

-void *asan_mmap(void *addr, size_t length, int prot, int flags,
+static void *asan_mmap(void *addr, size_t length, int prot, int flags,
                 int fd, uint64_t offset) {
 # if __WORDSIZE == 64
   return (void *)syscall(__NR_mmap, addr, length, prot, flags, fd, offset);
@@ -50,6 +50,27 @@
   return res;
 }

+void *AsanMmapFixedNoReserve(uintptr_t fixed_addr, size_t size) {
+  return asan_mmap((void*)fixed_addr, size,
+                   PROT_READ | PROT_WRITE,
+                   MAP_PRIVATE | MAP_ANON | MAP_FIXED | MAP_NORESERVE,
+                   0, 0);
+}
+
+void *AsanMmapFixedReserve(uintptr_t fixed_addr, size_t size) {
+  return asan_mmap((void*)fixed_addr, size,
+                   PROT_READ | PROT_WRITE,
+                   MAP_PRIVATE | MAP_ANON | MAP_FIXED,
+                   0, 0);
+}
+
+void *AsanMprotect(uintptr_t fixed_addr, size_t size) {
+  return asan_mmap((void*)fixed_addr, size,
+                   PROT_NONE,
+                   MAP_PRIVATE | MAP_ANON | MAP_FIXED | MAP_NORESERVE,
+ 0, 0); +} + void AsanUnmapOrDie(void *addr, size_t size) { if (!addr || !size) return; int res = syscall(__NR_munmap, addr, size); Modified: compiler-rt/trunk/lib/asan/asan_mac.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_mac.cc?rev=147328&r1=147327&r2=147328&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_mac.cc (original) +++ compiler-rt/trunk/lib/asan/asan_mac.cc Wed Dec 28 17:28:54 2011 @@ -22,6 +22,7 @@ #include "asan_thread_registry.h" #include +#include #include #include @@ -40,12 +41,12 @@ return NULL; } -void *asan_mmap(void *addr, size_t length, int prot, int flags, +static void *asan_mmap(void *addr, size_t length, int prot, int flags, int fd, uint64_t offset) { return mmap(addr, length, prot, flags, fd, offset); } -ssize_t asan_write(int fd, const void *buf, size_t count) { +ssize_t AsanWrite(int fd, const void *buf, size_t count) { return write(fd, buf, count); } @@ -60,6 +61,27 @@ return res; } +void *AsanMmapFixedNoReserve(uintptr_t fixed_addr, size_t size) { + return asan_mmap((void*)fixed_addr, size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED | MAP_NORESERVE, + 0, 0); +} + +void *AsanMmapFixedReserve(uintptr_t fixed_addr, size_t size) { + return asan_mmap((void*)fixed_addr, size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED, + 0, 0); +} + +void *AsanMprotect(uintptr_t fixed_addr, size_t size) { + return asan_mmap((void*)fixed_addr, size, + PROT_NONE, + MAP_PRIVATE | MAP_ANON | MAP_FIXED | MAP_NORESERVE, + 0, 0); +} + void AsanUnmapOrDie(void *addr, size_t size) { if (!addr || !size) return; int res = munmap(addr, size); Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_rtl.cc?rev=147328&r1=147327&r2=147328&view=diff ============================================================================== --- 
compiler-rt/trunk/lib/asan/asan_rtl.cc (original) +++ compiler-rt/trunk/lib/asan/asan_rtl.cc Wed Dec 28 17:28:54 2011 @@ -16,9 +16,7 @@ #include "asan_interface.h" #include "asan_internal.h" #include "asan_lock.h" -#ifdef __APPLE__ #include "asan_mac.h" -#endif #include "asan_mapping.h" #include "asan_stack.h" #include "asan_stats.h" @@ -36,7 +34,6 @@ #include #include #include -#include #include #include #ifndef ANDROID @@ -188,37 +185,13 @@ ShowStatsAndAbort(); } -static char *mmap_pages(size_t start_page, size_t n_pages, const char *mem_type, - bool abort_on_failure = true) { - void *res = asan_mmap((void*)start_page, kPageSize * n_pages, - PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANON | MAP_FIXED | MAP_NORESERVE, 0, 0); - // Printf("%p => %p\n", (void*)start_page, res); - char *ch = (char*)res; - if (res == (void*)-1L && abort_on_failure) { - OutOfMemoryMessageAndDie(mem_type, n_pages * kPageSize); - } - CHECK(res == (void*)start_page || res == (void*)-1L); - return ch; -} - -// mmap range [beg, end] -static char *mmap_range(uintptr_t beg, uintptr_t end, const char *mem_type) { +// Reserve memory range [beg, end]. 
+static void ReserveShadowMemoryRange(uintptr_t beg, uintptr_t end) { CHECK((beg % kPageSize) == 0); CHECK(((end + 1) % kPageSize) == 0); - // Printf("mmap_range %p %p %ld\n", beg, end, (end - beg) / kPageSize); - return mmap_pages(beg, (end - beg + 1) / kPageSize, mem_type); -} - -// protect range [beg, end] -static void protect_range(uintptr_t beg, uintptr_t end) { - CHECK((beg % kPageSize) == 0); - CHECK(((end+1) % kPageSize) == 0); - // Printf("protect_range %p %p %ld\n", beg, end, (end - beg) / kPageSize); - void *res = asan_mmap((void*)beg, end - beg + 1, - PROT_NONE, - MAP_PRIVATE | MAP_ANON | MAP_FIXED | MAP_NORESERVE, 0, 0); - CHECK(res == (void*)beg); + size_t size = end - beg + 1; + void *res = AsanMmapFixedNoReserve(beg, size); + CHECK(res == (void*)beg && "ReserveShadowMemoryRange failed"); } // ---------------------- LowLevelAllocator ------------- {{{1 @@ -349,9 +322,7 @@ // this address. const uintptr_t chunk_size = kPageSize << 10; // 4M uintptr_t chunk = addr & ~(chunk_size - 1); - asan_mmap((void*)chunk, chunk_size, - PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANON | MAP_FIXED, 0, 0); + AsanMmapFixedReserve(chunk, chunk_size); return; } // Write the first message using the bullet-proof write. @@ -813,13 +784,14 @@ if (!FLAG_lazy_shadow) { if (kLowShadowBeg != kLowShadowEnd) { // mmap the low shadow plus one page. - mmap_range(kLowShadowBeg - kPageSize, kLowShadowEnd, "LowShadow"); + ReserveShadowMemoryRange(kLowShadowBeg - kPageSize, kLowShadowEnd); } // mmap the high shadow. 
- mmap_range(kHighShadowBeg, kHighShadowEnd, "HighShadow"); + ReserveShadowMemoryRange(kHighShadowBeg, kHighShadowEnd); } // protect the gap - protect_range(kShadowGapBeg, kShadowGapEnd); + void *prot = AsanMprotect(kShadowGapBeg, kShadowGapEnd - kShadowGapBeg + 1); + CHECK(prot == (void*)kShadowGapBeg); } // On Linux AsanThread::ThreadStart() calls malloc() that's why asan_inited Modified: compiler-rt/trunk/lib/asan/asan_thread.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_thread.cc?rev=147328&r1=147327&r2=147328&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_thread.cc (original) +++ compiler-rt/trunk/lib/asan/asan_thread.cc Wed Dec 28 17:28:54 2011 @@ -21,7 +21,6 @@ #include "sysinfo/sysinfo.h" #endif -#include #include #include #include From kcc at google.com Wed Dec 28 17:36:26 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 15:36:26 -0800 Subject: [llvm-commits] [PATCH] asan-rt: getenv() from .preinit_array In-Reply-To: References: Message-ID: r147326 This is slightly different, but the idea is the same. I'm glad to get rid of getenv. On Wed, Dec 28, 2011 at 4:43 AM, Evgeniy Stepanov wrote: > On Wed, Dec 28, 2011 at 3:08 AM, Kostya Serebryany wrote: > > I'd prefer something different: > > > > - Don't put this into sysinfo/sysinfo.cc (we don't use it in some settings, > > also we may want to get rid of it eventually). > > - I suggest we implement ReadProcSelfEnviron in asan_rtl.cc and then use it > > to implement asan_getenv() in asan_linux.cc (and in asan_mac.cc, if needed). > > - no need for PLATFORM_WINDOWS section (yet). Such code will need to go to > > asan_windows.cc once we have it. > > - we want to avoid memchr/memcmp/etc (use __internal* variants). > > - don't fall back to libc, just do ASAN_DIE > > - do we need HAVE___ENVIRON section? 
> > > > --kcc > > > > On Wed, Dec 21, 2011 at 3:57 AM, Evgeniy Stepanov > > wrote: > >> > >> Hi, > >> > >> this patch brings in the implementation of GetenvBeforeMain() from > >> google-perftools and uses it in place of getenv(). This is required to > >> call __asan_init from .preinit_array. > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111228/d28c664f/attachment.html From kcc at google.com Wed Dec 28 17:35:46 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 23:35:46 -0000 Subject: [llvm-commits] [compiler-rt] r147329 - /compiler-rt/trunk/lib/asan/asan_rtl.cc Message-ID: <20111228233546.6ABE12A6C12C@llvm.org> Author: kcc Date: Wed Dec 28 17:35:46 2011 New Revision: 147329 URL: http://llvm.org/viewvc/llvm-project?rev=147329&view=rev Log: [asan] force the __asan_unregister_globals to reside in the runtime library Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_rtl.cc?rev=147329&r1=147328&r2=147329&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_rtl.cc (original) +++ compiler-rt/trunk/lib/asan/asan_rtl.cc Wed Dec 28 17:35:46 2011 @@ -396,6 +396,7 @@ __asan_report_store16(NULL); __asan_register_global(0, 0, NULL); __asan_register_globals(NULL, 0); + __asan_unregister_globals(NULL, 0); } } From kcc at google.com Wed Dec 28 17:48:16 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 15:48:16 -0800 Subject: [llvm-commits] [PATCH] AddressSanitizer: allow disabling __cxa_throw at runtime In-Reply-To: References: Message-ID: If we don't wrap __cxa_throw, we will have stack-buffer-overflow false positives (the stack will be poisoned on entry and never unpoisoned on exit). 
Maybe, if we build asan-rt w/ exceptions (remove -fno-exceptions from compiler-rt/make/config.mk) the bug will get fixed, but I'd really like to find another solution (ideally, asan-rt should not require libstdc++ at all, and on linux this seems to work) --kcc On Wed, Dec 28, 2011 at 7:56 AM, Alexander Potapenko wrote: > On Wed, Dec 28, 2011 at 5:18 PM, Alexander Potapenko > wrote: > > The attached patch introduces the wrap_cxa_throw flag that should help > > us to build Chrome while > > http://code.google.com/p/address-sanitizer/issues/detail?id=23 is not > > fixed (tl;dr: wrapping __cxa_throw possibly affects stack unwinding > > and exception handling). > > > Actually it looks like Chrome does not work with wrap___cxa_throw > either, so we'll need to disable it on Mac. > Kostya, is it safe to do so, or this will lead to false positives? > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111228/b596f902/attachment.html From kcc at google.com Wed Dec 28 18:18:47 2011 From: kcc at google.com (Kostya Serebryany) Date: Wed, 28 Dec 2011 16:18:47 -0800 Subject: [llvm-commits] [PATCH] asan-rt: fix signal wrapper on Android In-Reply-To: References: Message-ID: Can this be made with fewer ifdefs? e.g. #ifdef ANDROID # define SIGNAL_FUNC bsd_signal #else # define SIGNAL_FUNC signal #endif What's wrong with __cxa_throw on android? --kcc On Wed, Dec 21, 2011 at 3:52 AM, Evgeniy Stepanov wrote: > Hi, > > this patch wraps bsd_signal instead of signal on Android. Signal is a > macro there. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111228/b544966f/attachment.html From rafael.espindola at gmail.com Wed Dec 28 20:15:06 2011 From: rafael.espindola at gmail.com (Rafael Espindola) Date: Thu, 29 Dec 2011 02:15:06 -0000 Subject: [llvm-commits] [llvm] r147333 - /llvm/trunk/include/llvm/Analysis/CodeMetrics.h Message-ID: <20111229021506.A36802A6C12C@llvm.org> Author: rafael Date: Wed Dec 28 20:15:06 2011 New Revision: 147333 URL: http://llvm.org/viewvc/llvm-project?rev=147333&view=rev Log: Fix grammar error noticed by Duncan. Modified: llvm/trunk/include/llvm/Analysis/CodeMetrics.h Modified: llvm/trunk/include/llvm/Analysis/CodeMetrics.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/CodeMetrics.h?rev=147333&r1=147332&r2=147333&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/CodeMetrics.h (original) +++ llvm/trunk/include/llvm/Analysis/CodeMetrics.h Wed Dec 28 20:15:06 2011 @@ -32,7 +32,7 @@ // bool NeverInline; // True if this function contains a call to setjmp or other functions - // with attribute "returns twice" without having the attribute by itself. + // with attribute "returns twice" without having the attribute itself. bool exposesReturnsTwice; // True if this function calls itself From craig.topper at gmail.com Wed Dec 28 21:09:33 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 03:09:33 -0000 Subject: [llvm-commits] [llvm] r147335 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <20111229030934.1060D2A6C12C@llvm.org> Author: ctopper Date: Wed Dec 28 21:09:33 2011 New Revision: 147335 URL: http://llvm.org/viewvc/llvm-project?rev=147335&view=rev Log: Remove trailing spaces. Fix an assert to use && instead of || before string. Add same assert on similar code path. 
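The || to && change described in this log matters because a string literal converts to a non-null pointer and is therefore always true, so an assert written as assert(cond || "msg") can never fire. A minimal standalone sketch of the difference follows; the helper names condition_with_or and condition_with_and are hypothetical and are not part of X86ISelLowering.cpp, they only return the value the corresponding assert condition would evaluate to.

```cpp
#include <cassert>

// Hypothetical helpers, not LLVM code: each returns the value that the
// corresponding assert() condition would evaluate to.

// A string literal decays to a non-null pointer, so `cond || "msg"` is
// true even when cond is false: the assert can never fire.
inline bool condition_with_or(bool cond) {
  return cond || "Expected an SSE value type!";
}

// With `&&` the always-true literal no longer masks cond, so the
// expression is true exactly when cond is true.
inline bool condition_with_and(bool cond) {
  return cond && "Expected an SSE value type!";
}
```

Under these assumptions, condition_with_or(false) is still true while condition_with_and(false) is false, which is why only the && form lets the message-carrying assert catch a non-128-bit type.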
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=147335&r1=147334&r2=147335&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Dec 28 21:09:33 2011 @@ -5166,13 +5166,13 @@ } else if (ExtVT == MVT::i32 || ExtVT == MVT::f32 || ExtVT == MVT::f64 || (ExtVT == MVT::i64 && Subtarget->is64Bit())) { if (VT.getSizeInBits() == 256) { - EVT VT128 = EVT::getVectorVT(*DAG.getContext(), ExtVT, NumElems / 2); Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT128, Item); - SDValue ZeroVec = getZeroVector(VT, true, DAG, dl); + SDValue ZeroVec = getZeroVector(VT, true, DAG, dl); return Insert128BitVector(ZeroVec, Item, DAG.getConstant(0, MVT::i32), DAG, dl); } + assert (VT.getSizeInBits() == 128 && "Expected an SSE value type!"); Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT, Item); // Turn it into a MOVL (i.e. movss, movsd, or movd) to a zero vector. 
return getShuffleVectorZeroOrUndef(Item, 0, true,Subtarget->hasXMMInt(), @@ -5180,16 +5180,14 @@ } else if (ExtVT == MVT::i16 || ExtVT == MVT::i8) { Item = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i32, Item); if (VT.getSizeInBits() == 256) { - EVT VT128 = EVT::getVectorVT(*DAG.getContext(), ExtVT, NumElems / 2); Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT128, Item); - SDValue ZeroVec = getZeroVector(VT, true, DAG, dl); + SDValue ZeroVec = getZeroVector(VT, true, DAG, dl); return Insert128BitVector(ZeroVec, Item, DAG.getConstant(0, MVT::i32), DAG, dl); } - assert (VT.getSizeInBits() == 128 || "Expected an SSE value type!"); - EVT MiddleVT = MVT::v4i32; - Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, MiddleVT, Item); + assert (VT.getSizeInBits() == 128 && "Expected an SSE value type!"); + Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, MVT::v4i32, Item); Item = getShuffleVectorZeroOrUndef(Item, 0, true, Subtarget->hasXMMInt(), DAG); return DAG.getNode(ISD::BITCAST, dl, VT, Item); From craig.topper at gmail.com Wed Dec 28 21:20:51 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 03:20:51 -0000 Subject: [llvm-commits] [llvm] r147336 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <20111229032051.9893D2A6C12C@llvm.org> Author: ctopper Date: Wed Dec 28 21:20:51 2011 New Revision: 147336 URL: http://llvm.org/viewvc/llvm-project?rev=147336&view=rev Log: Remove some elses after returns. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=147336&r1=147335&r2=147336&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Dec 28 21:20:51 2011 @@ -5161,9 +5161,10 @@ // the rest of the elements. 
This will be matched as movd/movq/movss/movsd // depending on what the source datatype is. if (Idx == 0) { - if (NumZero == 0) { + if (NumZero == 0) return DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT, Item); - } else if (ExtVT == MVT::i32 || ExtVT == MVT::f32 || ExtVT == MVT::f64 || + + if (ExtVT == MVT::i32 || ExtVT == MVT::f32 || ExtVT == MVT::f64 || (ExtVT == MVT::i64 && Subtarget->is64Bit())) { if (VT.getSizeInBits() == 256) { EVT VT128 = EVT::getVectorVT(*DAG.getContext(), ExtVT, NumElems / 2); @@ -5172,12 +5173,14 @@ return Insert128BitVector(ZeroVec, Item, DAG.getConstant(0, MVT::i32), DAG, dl); } - assert (VT.getSizeInBits() == 128 && "Expected an SSE value type!"); + assert(VT.getSizeInBits() == 128 && "Expected an SSE value type!"); Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT, Item); // Turn it into a MOVL (i.e. movss, movsd, or movd) to a zero vector. - return getShuffleVectorZeroOrUndef(Item, 0, true,Subtarget->hasXMMInt(), - DAG); - } else if (ExtVT == MVT::i16 || ExtVT == MVT::i8) { + return getShuffleVectorZeroOrUndef(Item, 0, true, + Subtarget->hasXMMInt(), DAG); + } + + if (ExtVT == MVT::i16 || ExtVT == MVT::i8) { Item = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i32, Item); if (VT.getSizeInBits() == 256) { EVT VT128 = EVT::getVectorVT(*DAG.getContext(), ExtVT, NumElems / 2); @@ -5186,7 +5189,7 @@ return Insert128BitVector(ZeroVec, Item, DAG.getConstant(0, MVT::i32), DAG, dl); } - assert (VT.getSizeInBits() == 128 && "Expected an SSE value type!"); + assert(VT.getSizeInBits() == 128 && "Expected an SSE value type!"); Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, MVT::v4i32, Item); Item = getShuffleVectorZeroOrUndef(Item, 0, true, Subtarget->hasXMMInt(), DAG); From craig.topper at gmail.com Wed Dec 28 21:34:55 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 03:34:55 -0000 Subject: [llvm-commits] [llvm] r147337 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <20111229033455.1D0D42A6C12C@llvm.org> Author: ctopper 
Date: Wed Dec 28 21:34:54 2011 New Revision: 147337 URL: http://llvm.org/viewvc/llvm-project?rev=147337&view=rev Log: Make LowerBUILD_VECTOR keep node vector types consistent when creating MOVL for v16i16 and v32i8. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=147337&r1=147336&r2=147337&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Dec 28 21:34:54 2011 @@ -5182,17 +5182,16 @@ if (ExtVT == MVT::i16 || ExtVT == MVT::i8) { Item = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i32, Item); + Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, MVT::v4i32, Item); if (VT.getSizeInBits() == 256) { - EVT VT128 = EVT::getVectorVT(*DAG.getContext(), ExtVT, NumElems / 2); - Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, VT128, Item); - SDValue ZeroVec = getZeroVector(VT, true, DAG, dl); - return Insert128BitVector(ZeroVec, Item, DAG.getConstant(0, MVT::i32), - DAG, dl); + SDValue ZeroVec = getZeroVector(MVT::v8i32, true, DAG, dl); + Item = Insert128BitVector(ZeroVec, Item, DAG.getConstant(0, MVT::i32), + DAG, dl); + } else { + assert(VT.getSizeInBits() == 128 && "Expected an SSE value type!"); + Item = getShuffleVectorZeroOrUndef(Item, 0, true, + Subtarget->hasXMMInt(), DAG); } - assert(VT.getSizeInBits() == 128 && "Expected an SSE value type!"); - Item = DAG.getNode(ISD::SCALAR_TO_VECTOR, dl, MVT::v4i32, Item); - Item = getShuffleVectorZeroOrUndef(Item, 0, true, - Subtarget->hasXMMInt(), DAG); return DAG.getNode(ISD::BITCAST, dl, VT, Item); } } From clattner at apple.com Thu Dec 29 01:31:11 2011 From: clattner at apple.com (Chris Lattner) Date: Wed, 28 Dec 2011 23:31:11 -0800 Subject: [llvm-commits] [patch] Remove the old ELF writer In-Reply-To: <4EFB2B3E.3000408@gmail.com> 
References: <4EF66438.8060406@gmail.com> <17C96F93-AF42-469B-8038-BE71F777B918@apple.com> <4EFB2B3E.3000408@gmail.com> Message-ID: <949B3840-A5CC-40D2-8F6A-A50343779DE5@apple.com> On Dec 28, 2011, at 6:44 AM, Rafael Ávila de Espíndola wrote: >>> Is it OK? Should I propose this on llvmdev? >> >> Just MHO, I would really love to see this happen. Does this also remove the JIT EH code? > > Not yet. The patch still keeps JITDwarfEmitter.cpp. So sad... Still good progress! :) -Chris From glider at google.com Thu Dec 29 03:36:16 2011 From: glider at google.com (Alexander Potapenko) Date: Thu, 29 Dec 2011 13:36:16 +0400 Subject: [llvm-commits] [PATCH] AddressSanitizer: allow disabling __cxa_throw at runtime In-Reply-To: References: Message-ID: On Thu, Dec 29, 2011 at 3:48 AM, Kostya Serebryany wrote: > If we don't wrap __cxa_throw, we will have stack-buffer-overflow false > positives (the stack will be poisoned on entry and never unpoisoned on > exit). > > Maybe, if we build asan-rt w/ exceptions (remove -fno-exceptions > from compiler-rt/make/config.mk) the bug will get fixed, > but I'd really like to find another solution (ideally, asan-rt should not > require libstdc++ at all, and on linux this seems to work) > > --kcc > Please disregard the above patch. I suppose we can use -fno-exceptions together with -funwind-tables to fix the issue. From STPWORLD at narod.ru Thu Dec 29 03:37:48 2011 From: STPWORLD at narod.ru (Stepan Dyatkovskiy) Date: Thu, 29 Dec 2011 13:37:48 +0400 Subject: [llvm-commits] [LLVM, loop-unswitch, bugfix for #11429] Wrong behaviour for switches. 
In-Reply-To: <4EFAF2B3.1010806@narod.ru> References: <4ECF4A5A.9030607@narod.ru> <4ED36D99.1080306@narod.ru> <4ED51EEC.6050000@narod.ru> <4ED749D0.8030101@narod.ru> <4ED88CDF.2020104@narod.ru> <4EDCC136.1040903@narod.ru> <566DBA1B-7099-44E0-B2EB-3413A054F5E3@apple.com> <4EDF700A.3080605@narod.ru> <9F9EFCC2-3F32-49F8-97D6-B8BAC580B6F7@apple.com> <4EE36B3F.3090307@narod.ru> <0E3FA92F-92F2-45F5-9247-0A1934901F01@apple.com> <4EE8F7C4.2040002@narod.ru> <4EFAF2B3.1010806@narod.ru> Message-ID: <108781325151469@web84.yandex.ru> Ping. -Stepan 28.12.2011, 14:42, "Stepan Dyatkovskiy" : > Hi. I made some fixes that improve compile time: > > 1. Size heuristics changed. Now we calculate the number of unswitching > branches only once per loop. > 2. Some checks were moved from UnswitchIfProfitable to > processCurrentLoop, since their outcome does not change during a processCurrentLoop > iteration. This allows deciding to skip some loops at an early stage. > > I checked the compile time on the test > > MultiSource/Benchmarks/Prolangs-C++/shapes/shapes > (there was a compile-time regression after my previous patch). > > Relative to the previous patch the compile time improved by ~8.5%. Relative > to old revisions (before r146578) the compile time improved by ~2%. > > Please find the patch in the attachment for review. > > -Stepan. > > Stepan Dyatkovskiy wrote: > >> Committed as r146578. >> >> Thanks. >> -Stepan. >> >> Dan Gohman wrote: >>> Thanks. The patch looks ok to me. >>> >>> Dan >>> >>> On Dec 10, 2011, at 6:22 AM, Stepan Dyatkovskiy wrote: >>>> I fixed the code heuristics. How it works now: >>>> >>>> 1. Calculate the average number of produced instructions and basic blocks: >>>> number-of-instructions = current-loop->number-of-instructions * unswitched-number >>>> number-of-bb = current-loop->number-of-bb * unswitched-number >>>> >>>> 2. If number-of-instructions > Threshold || number-of-bb*5 > Threshold, stop unswitching. >>>> >>>> By default Threshold is 50. 
But the user can set a custom threshold using the -loop-unswitch-threshold option. This option existed before my patch, and I kept it without changes. >>>> >>>> I compiled ffmpeg (ffmpeg.org) with the llvm + clang toolchain (with and without my patch). I got the following results: >>>> >>>> Without patch: ffmpeg size is 7369612 bytes. >>>> Threshold = 50: ffmpeg size is 7349132 bytes (less than in the ToT version). >>>> Threshold = 200: ffmpeg size is 7439244 bytes. >>>> >>>> I also checked the ffmpeg build time: in all cases it was 2 mins, 35 secs. >>>> Transcoding time with ffmpeg (mp3 -> mp2) in all cases was also the same: the process took 1 min, 19 secs. >>>> >>>> I added a unit test that checks unswitch kicking in case of insufficient size. Unit tests and the fixed patch (default threshold is 50) are attached to this post. >>>> >>>> Thanks. >>>> -Stepan. >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From baldrick at free.fr Thu Dec 29 04:06:28 2011 From: baldrick at free.fr (Duncan Sands) Date: Thu, 29 Dec 2011 11:06:28 +0100 Subject: [llvm-commits] [llvm] r147337 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp In-Reply-To: <20111229033455.1D0D42A6C12C@llvm.org> References: <20111229033455.1D0D42A6C12C@llvm.org> Message-ID: <4EFC3BA4.3040504@free.fr> > Make LowerBUILD_VECTOR keep node vector types consistent when creating MOVL for v16i16 and v32i8. Testcase? Ciao, Duncan. From baldrick at free.fr Thu Dec 29 04:09:13 2011 From: baldrick at free.fr (Duncan Sands) Date: Thu, 29 Dec 2011 11:09:13 +0100 Subject: [llvm-commits] [LLVM, loop-unswitch, bugfix for #11429] Wrong behaviour for switches. 
In-Reply-To: <108781325151469@web84.yandex.ru> References: <4ECF4A5A.9030607@narod.ru> <4ED36D99.1080306@narod.ru> <4ED51EEC.6050000@narod.ru> <4ED749D0.8030101@narod.ru> <4ED88CDF.2020104@narod.ru> <4EDCC136.1040903@narod.ru> <566DBA1B-7099-44E0-B2EB-3413A054F5E3@apple.com> <4EDF700A.3080605@narod.ru> <9F9EFCC2-3F32-49F8-97D6-B8BAC580B6F7@apple.com> <4EE36B3F.3090307@narod.ru> <0E3FA92F-92F2-45F5-9247-0A1934901F01@apple.com> <4EE8F7C4.2040002@narod.ru> <4EFAF2B3.1010806@narod.ru> <108781325151469@web84.yandex.ru> Message-ID: <4EFC3C49.6090006@free.fr> On 29/12/11 10:37, Stepan Dyatkovskiy wrote: > Ping. Please wait more than one day before pinging. Ciao, Duncan. From tobias at grosser.es Thu Dec 29 08:00:49 2011 From: tobias at grosser.es (Tobias Grosser) Date: Thu, 29 Dec 2011 15:00:49 +0100 Subject: [llvm-commits] [LLVMdev] [PATCH] BasicBlock Autovectorization Pass In-Reply-To: <1323822351.590.1687.camel@sapling> References: <1319914924.23036.852.camel@sapling> <1319919418.23036.881.camel@sapling> <1319928991.23036.957.camel@sapling> <1320108633.23036.1266.camel@sapling> <1320172356.23036.1298.camel@sapling> <4EB0462C.5010209@grosser.es> <1320184739.23036.1334.camel@sapling> <1320191694.23036.1497.camel@sapling> <1320749109.19359.76.camel@sapling> <4EB90E98.4010805@grosser.es> <1320762963.19359.117.camel@sapling> <4EB98207.2070807@grosser.es> <1320791390.19359.262.camel@sapling> <4EBC4B0F.6010609@grosser.es> <1321050998.19359.539.camel@sapling> <4EBDA7F9.9080709@grosser.es> <1321053083.19359.550.camel@sapling> <4EBDB1BF.7090006@grosser.es> <1321400339.19359.782.camel@sapling> <1321486739.19359.1067.camel@sapling> <4EC504B5.2020408@grosser.es> <1321898108.2507.36.camel@sapling> <1321932161.2507.101.camel@sapling> <1322067157.2507.263.camel@sapling> <4ED8F7B0.8050309@grosser.es> <1323822351.590.1687.camel@sapling> Message-ID: <4EFC7291.9040808@grosser.es> On 12/14/2011 01:25 AM, Hal Finkel wrote: > Tobias, > > I've attached an updated copy of the patch. 
I believe that I accounted > for all of your suggestions except for: > > 1. You said that I could make AA a member of the class and initialize it > for each basic block. I suppose that I'd need to make it a pointer, but > more generally, what is the thread-safety model that I should have in > mind for the analysis passes (will multiple threads always use distinct > instances of the passes)? Furthermore, if I am going to make AA a class > variable that is re-initialized on every call to runOnBasicBlock, then I > should probably do the same with all of the top-level maps, etc. Do you > agree? I am actually not sure about the thread safety conventions in LLVM. I personally always assumed that a transformation class will only be instantiated and used within a single thread and in case of multiple threads, multiple class instances would be created. The pattern of assigning analysis results to object variables is common (e.g. lib/Analysis/IVUsers.cpp). So we should be able to use it. I personally do not have any strong opinion about the other top-level maps. Just decide yourself what is easier to read. > 2. I have not (yet) changed the types of the maps from holding Value* to > Instruction*. Doing so would eliminate a few casts, but would cause > even more casts (and maybe checks) in computePairsConnectedTo. Even so, > it might be worthwhile to make the change for clarity (conceptually, > those maps do hold only pointers to instructions). I believe if code relies on the assumption that some data structures hold only objects of a certain type, we should use that type to define the data structures. Though, I do not think this change is required to commit the patch. It can be addressed after the code is committed. One thing that I would still like to have is a test case where bb-vectorize-search-limit is needed to avoid exponential compile time growth and another test case that is not optimized if bb-vectorize-search-limit is set to a value less than 4000. 
I think those cases are very valuable to understand the performance behavior of this code. Especially, as I am not yet sure why we need a value as high as 4000. This is my last issue on this patch, that should be addressed before committing it. After we discussed this, I propose to post a final patch stating that you will commit after three days to give other people a chance to veto. Tobi From craig.topper at gmail.com Thu Dec 29 09:45:40 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 09:45:40 -0600 Subject: [llvm-commits] [llvm] r147337 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp In-Reply-To: <4EFC3BA4.3040504@free.fr> References: <20111229033455.1D0D42A6C12C@llvm.org> <4EFC3BA4.3040504@free.fr> Message-ID: Somehow it worked either way, but it looked weird to put a 32-bit value into a SCALAR_TO_VECTOR with a result type of v8i16 or v16i8. Looking at the DAG it seemed to show a SCALAR_TO_VECTOR of v4i32 and then a bitcast to v8i16 and v16i8. So I don't actually know how to write a test case. On Thu, Dec 29, 2011 at 4:06 AM, Duncan Sands wrote: > > Make LowerBUILD_VECTOR keep node vector types consistent when creating > MOVL for v16i16 and v32i8. > > Testcase? > > Ciao, Duncan. > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111229/75c9774a/attachment.html From craig.topper at gmail.com Thu Dec 29 09:51:45 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 15:51:45 -0000 Subject: [llvm-commits] [llvm] r147339 - /llvm/trunk/lib/Target/X86/X86.td Message-ID: <20111229155145.6D55D2A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 09:51:45 2011 New Revision: 147339 URL: http://llvm.org/viewvc/llvm-project?rev=147339&view=rev Log: Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on its own without disabling SSE4.2 or SSE4A. Modified: llvm/trunk/lib/Target/X86/X86.td Modified: llvm/trunk/lib/Target/X86/X86.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147339&r1=147338&r2=147339&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86.td (original) +++ llvm/trunk/lib/Target/X86/X86.td Thu Dec 29 09:51:45 2011 @@ -55,7 +55,7 @@ [FeatureSSSE3]>; def FeatureSSE42 : SubtargetFeature<"sse42", "X86SSELevel", "SSE42", "Enable SSE 4.2 instructions", - [FeatureSSE41, FeaturePOPCNT]>; + [FeatureSSE41]>; def Feature3DNow : SubtargetFeature<"3dnow", "X863DNowLevel", "ThreeDNow", "Enable 3DNow! instructions", [FeatureMMX]>; @@ -77,8 +77,7 @@ "IsUAMemFast", "true", "Fast unaligned memory access">; def FeatureSSE4A : SubtargetFeature<"sse4a", "HasSSE4A", "true", - "Support SSE 4a instructions", - [FeaturePOPCNT]>; + "Support SSE 4a instructions">; def FeatureAVX : SubtargetFeature<"avx", "HasAVX", "true", "Enable AVX instructions">; From 6yearold at gmail.com Thu Dec 29 10:09:25 2011 From: 6yearold at gmail.com (arrowdodger) Date: Thu, 29 Dec 2011 19:09:25 +0300 Subject: [llvm-commits] [PATCH][CMake] PR10050. In-Reply-To: References: Message-ID: Updated patches attached. Changes: 1. All generating targets are now running only when user is invoking install rule. Credits go to @sakra on SO. 2. 
Wrapped some common code into macros, put them into cmake/modules/AddLLVM.cmake. 3. Create .tar.gz's only if both tar and gzip are found, otherwise warn user and don't even generate targets. 4. Code style fixes. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111229/e8086ea5/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... Name: llvm.cmake.docs.patch Type: text/x-patch Size: 8558 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111229/e8086ea5/attachment-0002.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: clang.cmake.docs.patch Type: text/x-patch Size: 4830 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111229/e8086ea5/attachment-0003.bin From hfinkel at anl.gov Thu Dec 29 11:32:09 2011 From: hfinkel at anl.gov (Hal Finkel) Date: Thu, 29 Dec 2011 11:32:09 -0600 Subject: [llvm-commits] [LLVMdev] [PATCH] BasicBlock Autovectorization Pass In-Reply-To: <4EFC7291.9040808@grosser.es> References: <1319914924.23036.852.camel@sapling> <1319919418.23036.881.camel@sapling> <1319928991.23036.957.camel@sapling> <1320108633.23036.1266.camel@sapling> <1320172356.23036.1298.camel@sapling> <4EB0462C.5010209@grosser.es> <1320184739.23036.1334.camel@sapling> <1320191694.23036.1497.camel@sapling> <1320749109.19359.76.camel@sapling> <4EB90E98.4010805@grosser.es> <1320762963.19359.117.camel@sapling> <4EB98207.2070807@grosser.es> <1320791390.19359.262.camel@sapling> <4EBC4B0F.6010609@grosser.es> <1321050998.19359.539.camel@sapling> <4EBDA7F9.9080709@grosser.es> <1321053083.19359.550.camel@sapling> <4EBDB1BF.7090006@grosser.es> <1321400339.19359.782.camel@sapling> <1321486739.19359.1067.camel@sapling> <4EC504B5.2020408@grosser.es> <1321898108.2507.36.camel@sapling> <1321932161.2507.101.camel@sapling> 
<1322067157.2507.263.camel@sapling> <4ED8F7B0.8050309@grosser.es> <1323822351.590.1687.camel@sapling> <4EFC7291.9040808@grosser.es> Message-ID: <1325179929.13080.2839.camel@sapling> On Thu, 2011-12-29 at 15:00 +0100, Tobias Grosser wrote: > On 12/14/2011 01:25 AM, Hal Finkel wrote: > > Tobias, > > > > I've attached an updated copy of the patch. I believe that I accounted > > for all of your suggestions except for: > > > > 1. You said that I could make AA a member of the class and initialize it > > for each basic block. I suppose that I'd need to make it a pointer, but > > more generally, what is the thread-safety model that I should have in > > mind for the analysis passes (will multiple threads always use distinct > > instances of the passes)? Furthermore, if I am going to make AA a class > > variable that is re-initialized on every call to runOnBasicBlock, then I > > should probably do the same with all of the top-level maps, etc. Do you > > agree? > > I am actually not sure about the thread safety conventions in LLVM. I > personally always assumed that a transformation class will only be > instantiated and used within a single thread and in case of multiple > threads, multiple class instances would be created. > > The pattern of assigning analysis results to object variables is common > (e.g. lib/Analysis/IVUsers.cpp). So we should be able to use it. Fair enough; I'll make the change. > I > personally do not have any strong opinion about the other top-level > maps. Just decide yourself what is easier to read. > > > 2. I have not (yet) changed the types of the maps from holding Value* to > > Instruction*. Doing so would eliminate a few casts, but would cause > > even more casts (and maybe checks) in computePairsConnectedTo. Even so, > > it might be worthwhile to make the change for clarity (conceptually, > > those maps do hold only pointers to instructions). 
> > I believe if code relies on the assumption that some data structures > only hold objects of a certain type, we should use that type to define the > data structures. Though, I do not think this change is required to > commit the patch. It can be addressed after the code is committed. This makes sense; I'll probably change it prior to commit. > > One thing that I would still like to have is a test case where > bb-vectorize-search-limit is needed to avoid exponential compile time > growth and another test case that is not optimized, if > bb-vectorize-search-limit is set to a value less than 4000. I think > those cases are very valuable to understand the performance behavior of > this code. Good idea, I'll add these test cases. > Especially, as I am not yet sure why we need a value as high > as 4000. I am not exactly sure why that turned out to be the best number, but I'll try this again in combination with my load/store reordering patch and see if such a large value still seems best. > > This is my last issue on this patch, that should be addressed before > committing it. After we discussed this, I propose to post a final patch > stating that you will commit after three days to give other people a > chance to veto. Sounds like a good plan. I had actually introduced some bugs during refactoring in the last patch I had posted; I'm writing test cases for the fixes and then I'll post an updated patch. 
Thanks again, Hal > > Tobi -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory From kcc at google.com Thu Dec 29 11:29:20 2011 From: kcc at google.com (Kostya Serebryany) Date: Thu, 29 Dec 2011 17:29:20 -0000 Subject: [llvm-commits] [compiler-rt] r147341 - in /compiler-rt/trunk: lib/asan/Makefile.old make/config.mk Message-ID: <20111229172920.CE3E02A6C12C@llvm.org> Author: kcc Date: Thu Dec 29 11:29:20 2011 New Revision: 147341 URL: http://llvm.org/viewvc/llvm-project?rev=147341&view=rev Log: [asan] build asan-rt with -funwind-tables Modified: compiler-rt/trunk/lib/asan/Makefile.old compiler-rt/trunk/make/config.mk Modified: compiler-rt/trunk/lib/asan/Makefile.old URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/Makefile.old?rev=147341&r1=147340&r2=147341&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/Makefile.old (original) +++ compiler-rt/trunk/lib/asan/Makefile.old Thu Dec 29 11:29:20 2011 @@ -262,7 +262,8 @@ $(ASAN_CXX) $(GTEST_INCLUDE) -I. 
-g -c $< -O2 -o $@ -ObjC $(PIE) $(CFLAGS) $(BIN)/%$(SUFF).o: %.cc $(RTL_HDR) $(MAKEFILE) - $(CXX) $(PIE) $(CFLAGS) -fPIC -c -O2 -fno-exceptions -o $@ -g $< -Ithird_party \ + $(CXX) $(PIE) $(CFLAGS) -fPIC -c -O2 -fno-exceptions -funwind-tables \ + -o $@ -g $< -Ithird_party \ -DASAN_USE_SYSINFO=1 \ -DASAN_NEEDS_SEGV=$(ASAN_NEEDS_SEGV) \ -DASAN_HAS_EXCEPTIONS=$(ASAN_HAS_EXCEPTIONS) \ Modified: compiler-rt/trunk/make/config.mk URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/make/config.mk?rev=147341&r1=147340&r2=147341&view=diff ============================================================================== --- compiler-rt/trunk/make/config.mk (original) +++ compiler-rt/trunk/make/config.mk Thu Dec 29 11:29:20 2011 @@ -42,5 +42,5 @@ ### # Common compiler options -COMMON_CXXFLAGS=-fno-exceptions -fPIC +COMMON_CXXFLAGS=-fno-exceptions -fPIC -funwind-tables COMMON_CFLAGS=-fPIC From craig.topper at gmail.com Thu Dec 29 11:41:56 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 17:41:56 -0000 Subject: [llvm-commits] [llvm] r147342 - /llvm/trunk/lib/Target/X86/X86InstrSSE.td Message-ID: <20111229174156.CF7322A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 11:41:56 2011 New Revision: 147342 URL: http://llvm.org/viewvc/llvm-project?rev=147342&view=rev Log: Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns. 
Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=147342&r1=147341&r2=147342&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Thu Dec 29 11:41:56 2011 @@ -7022,8 +7022,7 @@ !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, $src2}"), !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), [(set VR128:$dst, - (IntId128 VR128:$src1, - (bitconvert (memopv2i64 addr:$src2))))]>, OpSize; + (IntId128 VR128:$src1, (memopv2i64 addr:$src2)))]>, OpSize; } // Perform One Round of an AES Encryption/Decryption Flow @@ -7049,44 +7048,6 @@ int_x86_aesni_aesdeclast>; } -let Predicates = [HasAES] in { - def : Pat<(v2i64 (int_x86_aesni_aesenc VR128:$src1, VR128:$src2)), - (AESENCrr VR128:$src1, VR128:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesenc VR128:$src1, (memop addr:$src2))), - (AESENCrm VR128:$src1, addr:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesenclast VR128:$src1, VR128:$src2)), - (AESENCLASTrr VR128:$src1, VR128:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesenclast VR128:$src1, (memop addr:$src2))), - (AESENCLASTrm VR128:$src1, addr:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesdec VR128:$src1, VR128:$src2)), - (AESDECrr VR128:$src1, VR128:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesdec VR128:$src1, (memop addr:$src2))), - (AESDECrm VR128:$src1, addr:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesdeclast VR128:$src1, VR128:$src2)), - (AESDECLASTrr VR128:$src1, VR128:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesdeclast VR128:$src1, (memop addr:$src2))), - (AESDECLASTrm VR128:$src1, addr:$src2)>; -} - -let Predicates = [HasAVX, HasAES], AddedComplexity = 20 in { - def : Pat<(v2i64 (int_x86_aesni_aesenc VR128:$src1, VR128:$src2)), - (VAESENCrr VR128:$src1, VR128:$src2)>; - def : Pat<(v2i64 
(int_x86_aesni_aesenc VR128:$src1, (memop addr:$src2))), - (VAESENCrm VR128:$src1, addr:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesenclast VR128:$src1, VR128:$src2)), - (VAESENCLASTrr VR128:$src1, VR128:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesenclast VR128:$src1, (memop addr:$src2))), - (VAESENCLASTrm VR128:$src1, addr:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesdec VR128:$src1, VR128:$src2)), - (VAESDECrr VR128:$src1, VR128:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesdec VR128:$src1, (memop addr:$src2))), - (VAESDECrm VR128:$src1, addr:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesdeclast VR128:$src1, VR128:$src2)), - (VAESDECLASTrr VR128:$src1, VR128:$src2)>; - def : Pat<(v2i64 (int_x86_aesni_aesdeclast VR128:$src1, (memop addr:$src2))), - (VAESDECLASTrm VR128:$src1, addr:$src2)>; -} - // Perform the AES InvMixColumn Transformation let Predicates = [HasAVX, HasAES] in { def VAESIMCrr : AES8I<0xDB, MRMSrcReg, (outs VR128:$dst), @@ -7098,8 +7059,7 @@ def VAESIMCrm : AES8I<0xDB, MRMSrcMem, (outs VR128:$dst), (ins i128mem:$src1), "vaesimc\t{$src1, $dst|$dst, $src1}", - [(set VR128:$dst, - (int_x86_aesni_aesimc (bitconvert (memopv2i64 addr:$src1))))]>, + [(set VR128:$dst, (int_x86_aesni_aesimc (memopv2i64 addr:$src1)))]>, OpSize, VEX; } def AESIMCrr : AES8I<0xDB, MRMSrcReg, (outs VR128:$dst), @@ -7111,8 +7071,7 @@ def AESIMCrm : AES8I<0xDB, MRMSrcMem, (outs VR128:$dst), (ins i128mem:$src1), "aesimc\t{$src1, $dst|$dst, $src1}", - [(set VR128:$dst, - (int_x86_aesni_aesimc (bitconvert (memopv2i64 addr:$src1))))]>, + [(set VR128:$dst, (int_x86_aesni_aesimc (memopv2i64 addr:$src1)))]>, OpSize; // AES Round Key Generation Assist @@ -7127,8 +7086,7 @@ (ins i128mem:$src1, i8imm:$src2), "vaeskeygenassist\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set VR128:$dst, - (int_x86_aesni_aeskeygenassist (bitconvert (memopv2i64 addr:$src1)), - imm:$src2))]>, + (int_x86_aesni_aeskeygenassist (memopv2i64 addr:$src1), imm:$src2))]>, OpSize, VEX; } def AESKEYGENASSIST128rr : 
AESAI<0xDF, MRMSrcReg, (outs VR128:$dst), @@ -7141,8 +7099,7 @@ (ins i128mem:$src1, i8imm:$src2), "aeskeygenassist\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set VR128:$dst, - (int_x86_aesni_aeskeygenassist (bitconvert (memopv2i64 addr:$src1)), - imm:$src2))]>, + (int_x86_aesni_aeskeygenassist (memopv2i64 addr:$src1), imm:$src2))]>, OpSize; //===----------------------------------------------------------------------===// From stpworld at narod.ru Thu Dec 29 11:59:06 2011 From: stpworld at narod.ru (Stepan Dyatkovskiy) Date: Thu, 29 Dec 2011 21:59:06 +0400 Subject: [llvm-commits] [LLVM, SwitchInst, case ranges] Auxiliary patch #1 In-Reply-To: <4EFA0748.9080702@narod.ru> References: <4EAA9B5D.802@narod.ru> <4EAA9DE8.80000@free.fr> <485181319805488@web67.yandex.ru> <4EAB079D.6000606@free.fr> <4EB18F12.6060409@narod.ru> <4EB7C319.1000709@narod.ru> <4EDE7D75.704@narod.ru> <4EDFD0F4.1040204@narod.ru> <4EE25B61.9070006@narod.ru> <4EE5C06C.3050705@narod.ru> <333531323974498@web57.yandex.ru> <4EEB9C52.1050301@narod.ru> <4EF37B6B.6000205@narod.ru> <4EFA0748.9080702@narod.ru> Message-ID: <4EFCAA6A.20203@narod.ru> Ping. -Stepan. Stepan Dyatkovskiy wrote: > ping. > Stepan Dyatkovskiy wrote: >> Ping. >> >> Stepan Dyatkovskiy wrote: >>> Ping. >>> >>> -Stepan. >> > From craig.topper at gmail.com Thu Dec 29 12:00:08 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 18:00:08 -0000 Subject: [llvm-commits] [llvm] r147344 - /llvm/trunk/lib/Target/X86/X86InstrFormats.td Message-ID: <20111229180008.AD3D72A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 12:00:08 2011 New Revision: 147344 URL: http://llvm.org/viewvc/llvm-project?rev=147344&view=rev Log: Mark non-VEX forms of AES instructions as requiring SSE2 to be enabled along with AES. Since that's required for the XMM registers to be valid for integer data. Doesn't change any behavior though since you can't use an intrinsic with an illegal type anyway. Just makes it consistent with the VEX forms. 
Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFormats.td?rev=147344&r1=147343&r2=147344&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFormats.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFormats.td Thu Dec 29 12:00:08 2011 @@ -483,12 +483,12 @@ class AES8I<bits<8> o, Format F, dag outs, dag ins, string asm, list<dag>pattern> : I<o, F, outs, ins, asm, pattern>, T8, - Requires<[HasAES]>; + Requires<[HasSSE2, HasAES]>; class AESAI<bits<8> o, Format F, dag outs, dag ins, string asm, list<dag> pattern> : Ii8<o, F, outs, ins, asm, pattern>, TA, - Requires<[HasAES]>; + Requires<[HasSSE2, HasAES]>; // CLMUL Instruction Templates class CLMULIi8<bits<8> o, Format F, dag outs, dag ins, string asm, From craig.topper at gmail.com Thu Dec 29 12:08:37 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 18:08:37 -0000 Subject: [llvm-commits] [llvm] r147345 - /llvm/trunk/lib/Target/X86/X86InstrFormats.td Message-ID: <20111229180837.181F12A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 12:08:36 2011 New Revision: 147345 URL: http://llvm.org/viewvc/llvm-project?rev=147345&view=rev Log: Mark non-VEX forms of PCLMUL instructions as requiring SSE2 to be enabled along with CLMUL. That's required for the XMM registers to be valid for integer data. Doesn't change any behavior since the CLMUL instructions don't have patterns yet. 
Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFormats.td?rev=147345&r1=147344&r2=147345&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFormats.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFormats.td Thu Dec 29 12:08:36 2011 @@ -494,7 +494,7 @@ class CLMULIi8<bits<8> o, Format F, dag outs, dag ins, string asm, list<dag>pattern> : Ii8<o, F, outs, ins, asm, pattern>, TA, - OpSize, Requires<[HasCLMUL]>; + OpSize, Requires<[HasSSE2, HasCLMUL]>; class AVXCLMULIi8<bits<8> o, Format F, dag outs, dag ins, string asm, list<dag>pattern> From hfinkel at anl.gov Thu Dec 29 12:14:08 2011 From: hfinkel at anl.gov (Hal Finkel) Date: Thu, 29 Dec 2011 12:14:08 -0600 Subject: [llvm-commits] [PATCH] Add missing TargetData defaults for 256-bit vector types Message-ID: <1325182448.13080.2841.camel@sapling> Any objections? diff --git a/lib/Target/TargetData.cpp b/lib/Target/TargetData.cpp index 2b39f13..0f85aa6 100644 --- a/lib/Target/TargetData.cpp +++ b/lib/Target/TargetData.cpp @@ -152,6 +152,7 @@ void TargetData::init() { setAlignment(FLOAT_ALIGN, 8, 8, 64); // double setAlignment(VECTOR_ALIGN, 8, 8, 64); // v2i32, v1i64, ... setAlignment(VECTOR_ALIGN, 16, 16, 128); // v16i8, v8i16, v4i32, ... + setAlignment(VECTOR_ALIGN, 32, 32, 256); // v4i64, ... 
setAlignment(AGGREGATE_ALIGN, 0, 8, 0); // struct } Thanks again, Hal -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory From benny.kra at googlemail.com Thu Dec 29 12:20:50 2011 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Thu, 29 Dec 2011 19:20:50 +0100 Subject: [llvm-commits] [llvm] r147339 - /llvm/trunk/lib/Target/X86/X86.td In-Reply-To: <20111229155145.6D55D2A6C12C@llvm.org> References: <20111229155145.6D55D2A6C12C@llvm.org> Message-ID: <9C7D3B66-A085-41FF-9276-0F67AE85C0B7@googlemail.com> On 29.12.2011, at 16:51, Craig Topper wrote: > Author: ctopper > Date: Thu Dec 29 09:51:45 2011 > New Revision: 147339 > > URL: http://llvm.org/viewvc/llvm-project?rev=147339&view=rev > Log: > Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on its own without disabling SSE4.2 or SSE4A. Hi Craig, What's your intention here? popcnt is part of sse42 and sse4a so you won't find a processor that has sse42/a and not popcnt. It simplifies the model descriptions in this file. You can still selectively enable sse42 and disable the popcnt feature (e.g. "llc -mattr=+sse42,-popcnt" or "clang -msse4.2 -mno-popcnt") and it won't emit popcnt instructions. This patch disables popcnt emission unless it's explicitly enabled (think of the JIT). Please revert this patch. 
- Ben > > Modified: > llvm/trunk/lib/Target/X86/X86.td > > Modified: llvm/trunk/lib/Target/X86/X86.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147339&r1=147338&r2=147339&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86.td (original) > +++ llvm/trunk/lib/Target/X86/X86.td Thu Dec 29 09:51:45 2011 > @@ -55,7 +55,7 @@ > [FeatureSSSE3]>; > def FeatureSSE42 : SubtargetFeature<"sse42", "X86SSELevel", "SSE42", > "Enable SSE 4.2 instructions", > - [FeatureSSE41, FeaturePOPCNT]>; > + [FeatureSSE41]>; > def Feature3DNow : SubtargetFeature<"3dnow", "X863DNowLevel", "ThreeDNow", > "Enable 3DNow! instructions", > [FeatureMMX]>; > @@ -77,8 +77,7 @@ > "IsUAMemFast", "true", > "Fast unaligned memory access">; > def FeatureSSE4A : SubtargetFeature<"sse4a", "HasSSE4A", "true", > - "Support SSE 4a instructions", > - [FeaturePOPCNT]>; > + "Support SSE 4a instructions">; > > def FeatureAVX : SubtargetFeature<"avx", "HasAVX", "true", > "Enable AVX instructions">; > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From craig.topper at gmail.com Thu Dec 29 12:23:17 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 12:23:17 -0600 Subject: [llvm-commits] [llvm] r147339 - /llvm/trunk/lib/Target/X86/X86.td In-Reply-To: <9C7D3B66-A085-41FF-9276-0F67AE85C0B7@googlemail.com> References: <20111229155145.6D55D2A6C12C@llvm.org> <9C7D3B66-A085-41FF-9276-0F67AE85C0B7@googlemail.com> Message-ID: Doesn't having SSE42 explicitly imply popcnt make disabling popcnt through mattr also disable SSE42? 
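The question above comes down to how an "implies" edge between subtarget features is treated when a feature is switched off. A minimal sketch of the semantics being asked about follows, assuming that enabling a feature enables everything it implies and that disabling a feature disables everything that implies it. The names FeatureTable, makeTable, enable, disable and apply are hypothetical and are not LLVM's API; the table only mirrors the sse42/sse41/popcnt entries quoted in this thread.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical feature table (not LLVM's data structure): each feature maps
// to the list of features it implies.
using FeatureTable = std::map<std::string, std::vector<std::string>>;

// Builds a tiny table modelled on the X86.td entries quoted in the thread.
// The flag selects whether sse42 lists popcnt as implied (the state before
// r147339) or not (the state after it).
FeatureTable makeTable(bool sse42ImpliesPopcnt) {
  FeatureTable table = {{"popcnt", {}}, {"sse41", {}}, {"sse42", {"sse41"}}};
  if (sse42ImpliesPopcnt)
    table["sse42"].push_back("popcnt");
  return table;
}

// Assumed rule 1: enabling a feature also enables, transitively, every
// feature it implies.
void enable(const FeatureTable &table, std::set<std::string> &on,
            const std::string &name) {
  if (!on.insert(name).second)
    return; // already enabled, nothing more to do
  for (const std::string &implied : table.at(name))
    enable(table, on, implied);
}

// Assumed rule 2: disabling a feature also disables, transitively, every
// enabled feature that implies it.
void disable(const FeatureTable &table, std::set<std::string> &on,
             const std::string &name) {
  on.erase(name);
  for (const auto &entry : table)
    for (const std::string &implied : entry.second)
      if (implied == name && on.count(entry.first))
        disable(table, on, entry.first);
}

// Applies "+feat" / "-feat" entries left to right, in the style of an
// -mattr list, and returns the set of features left enabled.
std::set<std::string> apply(const FeatureTable &table,
                            const std::vector<std::string> &attrs) {
  std::set<std::string> on;
  for (const std::string &attr : attrs) {
    if (attr.empty())
      continue;
    if (attr[0] == '+')
      enable(table, on, attr.substr(1));
    else if (attr[0] == '-')
      disable(table, on, attr.substr(1));
  }
  return on;
}
```

Under these assumed rules, applying {"+sse42", "-popcnt"} to makeTable(true) leaves only sse41 enabled, while the same list applied to makeTable(false) keeps both sse42 and sse41.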
On Thu, Dec 29, 2011 at 12:20 PM, Benjamin Kramer wrote: > > On 29.12.2011, at 16:51, Craig Topper wrote: > > > Author: ctopper > > Date: Thu Dec 29 09:51:45 2011 > > New Revision: 147339 > > > > URL: http://llvm.org/viewvc/llvm-project?rev=147339&view=rev > > Log: > > Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be > disabled on its own without disabling SSE4.2 or SSE4A. > > Hi Craig, > > What's your intention here? popcnt is part of sse42 and sse4a so you won't > find a processor that has sse42/a and not popcnt. It simplifies the model > descriptions in this file. You can still selectively enable sse42 and > disable the popcnt feature (e.g. "llc -mattr=+sse42,-popcnt" or "clang > -msse4.2 -mno-popcnt") and it won't emit popcnt instructions. > > This patch disables popcnt emission unless it's explicitly enabled (think > of the JIT). Please revert this patch. > > - Ben > > > > > Modified: > > llvm/trunk/lib/Target/X86/X86.td > > > > Modified: llvm/trunk/lib/Target/X86/X86.td > > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147339&r1=147338&r2=147339&view=diff > > > ============================================================================== > > --- llvm/trunk/lib/Target/X86/X86.td (original) > > +++ llvm/trunk/lib/Target/X86/X86.td Thu Dec 29 09:51:45 2011 > > @@ -55,7 +55,7 @@ > > [FeatureSSSE3]>; > > def FeatureSSE42 : SubtargetFeature<"sse42", "X86SSELevel", "SSE42", > > "Enable SSE 4.2 instructions", > > - [FeatureSSE41, FeaturePOPCNT]>; > > + [FeatureSSE41]>; > > def Feature3DNow : SubtargetFeature<"3dnow", "X863DNowLevel", > "ThreeDNow", > > "Enable 3DNow! 
instructions", > > [FeatureMMX]>; > > @@ -77,8 +77,7 @@ > > "IsUAMemFast", "true", > > "Fast unaligned memory access">; > > def FeatureSSE4A : SubtargetFeature<"sse4a", "HasSSE4A", "true", > > - "Support SSE 4a instructions", > > - [FeaturePOPCNT]>; > > + "Support SSE 4a instructions">; > > > > def FeatureAVX : SubtargetFeature<"avx", "HasAVX", "true", > > "Enable AVX instructions">; > > > > > > _______________________________________________ > > llvm-commits mailing list > > llvm-commits at cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111229/147d4a1a/attachment.html From benny.kra at googlemail.com Thu Dec 29 12:30:16 2011 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Thu, 29 Dec 2011 19:30:16 +0100 Subject: [llvm-commits] [llvm] r147339 - /llvm/trunk/lib/Target/X86/X86.td In-Reply-To: References: <20111229155145.6D55D2A6C12C@llvm.org> <9C7D3B66-A085-41FF-9276-0F67AE85C0B7@googlemail.com> Message-ID: <4BAD6A7A-26A5-40C0-92A5-F5166BF3BE38@googlemail.com> On 29.12.2011, at 19:23, Craig Topper wrote: > Doesn't having SSE42 explicitly imply popcnt make disabling popcnt through mattr also disable SSE42? No. it just means that enabling sse42 will also enable popcnt. You can still selectively disable popcnt without disabling all of sse42. - Ben > > On Thu, Dec 29, 2011 at 12:20 PM, Benjamin Kramer wrote: > > On 29.12.2011, at 16:51, Craig Topper wrote: > > > Author: ctopper > > Date: Thu Dec 29 09:51:45 2011 > > New Revision: 147339 > > > > URL: http://llvm.org/viewvc/llvm-project?rev=147339&view=rev > > Log: > > Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on its own without disabling SSE4.2 or SSE4A. > > Hi Craig, > > What's your intention here? popcnt is part of sse42 and sse4a so you won't find a processor that has sse42/a and not popcnt. 
It simplifies the model descriptions in this file. You can still selectively enable sse42 and disable the popcnt feature (e.g. "llc -mattr=+sse42,-popcnt" or "clang -msse4.2 -mno-popcnt") and it won't emit popcnt instructions. > > This patch disables popcnt emission unless it's explicitly enabled (think of the JIT). Please revert this patch. > > - Ben > > > > > Modified: > > llvm/trunk/lib/Target/X86/X86.td > > > > Modified: llvm/trunk/lib/Target/X86/X86.td > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147339&r1=147338&r2=147339&view=diff > > ============================================================================== > > --- llvm/trunk/lib/Target/X86/X86.td (original) > > +++ llvm/trunk/lib/Target/X86/X86.td Thu Dec 29 09:51:45 2011 > > @@ -55,7 +55,7 @@ > > [FeatureSSSE3]>; > > def FeatureSSE42 : SubtargetFeature<"sse42", "X86SSELevel", "SSE42", > > "Enable SSE 4.2 instructions", > > - [FeatureSSE41, FeaturePOPCNT]>; > > + [FeatureSSE41]>; > > def Feature3DNow : SubtargetFeature<"3dnow", "X863DNowLevel", "ThreeDNow", > > "Enable 3DNow! 
instructions", > > [FeatureMMX]>; > > @@ -77,8 +77,7 @@ > > "IsUAMemFast", "true", > > "Fast unaligned memory access">; > > def FeatureSSE4A : SubtargetFeature<"sse4a", "HasSSE4A", "true", > > - "Support SSE 4a instructions", > > - [FeaturePOPCNT]>; > > + "Support SSE 4a instructions">; > > > > def FeatureAVX : SubtargetFeature<"avx", "HasAVX", "true", > > "Enable AVX instructions">; > > > > > > _______________________________________________ > > llvm-commits mailing list > > llvm-commits at cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > > > > -- > ~Craig From craig.topper at gmail.com Thu Dec 29 12:35:38 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 12:35:38 -0600 Subject: [llvm-commits] [llvm] r147339 - /llvm/trunk/lib/Target/X86/X86.td In-Reply-To: <4BAD6A7A-26A5-40C0-92A5-F5166BF3BE38@googlemail.com> References: <20111229155145.6D55D2A6C12C@llvm.org> <9C7D3B66-A085-41FF-9276-0F67AE85C0B7@googlemail.com> <4BAD6A7A-26A5-40C0-92A5-F5166BF3BE38@googlemail.com> Message-ID: With the patch reverted the following test case does not pass. I believe LLVM assumes that implied enabling of another feature also means the implied feature is required by the feature that implied it. There are comments suggesting that near the definition for Feature64Bit. ; RUN: llc -march=x86-64 -mattr=+sse42,-popcnt < %s | FileCheck %s define <2 x i64> @test_x86_sse42_pcmpgtq(<2 x i64> %a0, <2 x i64> %a1) { ; CHECK: pcmpgtq %res = call <2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1] ret <2 x i64> %res } declare <2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64>, <2 x i64>) nounwind readnone On Thu, Dec 29, 2011 at 12:30 PM, Benjamin Kramer wrote: > > On 29.12.2011, at 19:23, Craig Topper wrote: > > > Doesn't having SSE42 explicitly imply popcnt make disabling popcnt > through mattr also disable SSE42? > > No. it just means that enabling sse42 will also enable popcnt. 
You can > still selectively disable popcnt without disabling all of sse42. > > - Ben > > > > > On Thu, Dec 29, 2011 at 12:20 PM, Benjamin Kramer < > benny.kra at googlemail.com> wrote: > > > > On 29.12.2011, at 16:51, Craig Topper wrote: > > > > > Author: ctopper > > > Date: Thu Dec 29 09:51:45 2011 > > > New Revision: 147339 > > > > > > URL: http://llvm.org/viewvc/llvm-project?rev=147339&view=rev > > > Log: > > > Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be > disabled on its own without disabling SSE4.2 or SSE4A. > > > > Hi Craig, > > > > What's your intention here? popcnt is part of sse42 and sse4a so you > won't find a processor that has sse42/a and not popcnt. It simplifies the > model descriptions in this file. You can still selectively enable sse42 and > disable the popcnt feature (e.g. "llc -mattr=+sse42,-popcnt" or "clang > -msse4.2 -mno-popcnt") and it won't emit popcnt instructions. > > > > This patch disables popcnt emission unless it's explicitly enabled > (think of the JIT). Please revert this patch. > > > > - Ben > > > > > > > > Modified: > > > llvm/trunk/lib/Target/X86/X86.td > > > > > > Modified: llvm/trunk/lib/Target/X86/X86.td > > > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147339&r1=147338&r2=147339&view=diff > > > > ============================================================================== > > > --- llvm/trunk/lib/Target/X86/X86.td (original) > > > +++ llvm/trunk/lib/Target/X86/X86.td Thu Dec 29 09:51:45 2011 > > > @@ -55,7 +55,7 @@ > > > [FeatureSSSE3]>; > > > def FeatureSSE42 : SubtargetFeature<"sse42", "X86SSELevel", "SSE42", > > > "Enable SSE 4.2 instructions", > > > - [FeatureSSE41, FeaturePOPCNT]>; > > > + [FeatureSSE41]>; > > > def Feature3DNow : SubtargetFeature<"3dnow", "X863DNowLevel", > "ThreeDNow", > > > "Enable 3DNow! 
instructions", > > > [FeatureMMX]>; > > > @@ -77,8 +77,7 @@ > > > "IsUAMemFast", "true", > > > "Fast unaligned memory > access">; > > > def FeatureSSE4A : SubtargetFeature<"sse4a", "HasSSE4A", "true", > > > - "Support SSE 4a instructions", > > > - [FeaturePOPCNT]>; > > > + "Support SSE 4a instructions">; > > > > > > def FeatureAVX : SubtargetFeature<"avx", "HasAVX", "true", > > > "Enable AVX instructions">; > > > > > > > > > _______________________________________________ > > > llvm-commits mailing list > > > llvm-commits at cs.uiuc.edu > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > > > > > > > > > -- > > ~Craig > > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111229/f1f85883/attachment.html From craig.topper at gmail.com Thu Dec 29 12:37:04 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 12:37:04 -0600 Subject: [llvm-commits] [llvm] r147339 - /llvm/trunk/lib/Target/X86/X86.td In-Reply-To: References: <20111229155145.6D55D2A6C12C@llvm.org> <9C7D3B66-A085-41FF-9276-0F67AE85C0B7@googlemail.com> <4BAD6A7A-26A5-40C0-92A5-F5166BF3BE38@googlemail.com> Message-ID: I believe that behavior is also what makes disabling SSE1, disable all other SSE levels. On Thu, Dec 29, 2011 at 12:35 PM, Craig Topper wrote: > With the patch reverted the following test case does not pass. I believe > LLVM assumes that implied enabling of another feature also means the > implied feature is required by the feature that implied it. There are > comments suggesting that near the definition for Feature64Bit. 
> > ; RUN: llc -march=x86-64 -mattr=+sse42,-popcnt < %s | FileCheck %s > > define <2 x i64> @test_x86_sse42_pcmpgtq(<2 x i64> %a0, <2 x i64> %a1) { > ; CHECK: pcmpgtq > %res = call <2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64> %a0, <2 x i64> > %a1) ; <<2 x i64>> [#uses=1] > ret <2 x i64> %res > } > declare <2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64>, <2 x i64>) nounwind > readnone > > > On Thu, Dec 29, 2011 at 12:30 PM, Benjamin Kramer < > benny.kra at googlemail.com> wrote: > >> >> On 29.12.2011, at 19:23, Craig Topper wrote: >> >> > Doesn't having SSE42 explicitly imply popcnt make disabling popcnt >> through mattr also disable SSE42? >> >> No. it just means that enabling sse42 will also enable popcnt. You can >> still selectively disable popcnt without disabling all of sse42. >> >> - Ben >> >> > >> > On Thu, Dec 29, 2011 at 12:20 PM, Benjamin Kramer < >> benny.kra at googlemail.com> wrote: >> > >> > On 29.12.2011, at 16:51, Craig Topper wrote: >> > >> > > Author: ctopper >> > > Date: Thu Dec 29 09:51:45 2011 >> > > New Revision: 147339 >> > > >> > > URL: http://llvm.org/viewvc/llvm-project?rev=147339&view=rev >> > > Log: >> > > Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be >> disabled on its own without disabling SSE4.2 or SSE4A. >> > >> > Hi Craig, >> > >> > What's your intention here? popcnt is part of sse42 and sse4a so you >> won't find a processor that has sse42/a and not popcnt. It simplifies the >> model descriptions in this file. You can still selectively enable sse42 and >> disable the popcnt feature (e.g. "llc -mattr=+sse42,-popcnt" or "clang >> -msse4.2 -mno-popcnt") and it won't emit popcnt instructions. >> > >> > This patch disables popcnt emission unless it's explicitly enabled >> (think of the JIT). Please revert this patch. 
>> > >> > - Ben >> > >> > > >> > > Modified: >> > > llvm/trunk/lib/Target/X86/X86.td >> > > >> > > Modified: llvm/trunk/lib/Target/X86/X86.td >> > > URL: >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147339&r1=147338&r2=147339&view=diff >> > > >> ============================================================================== >> > > --- llvm/trunk/lib/Target/X86/X86.td (original) >> > > +++ llvm/trunk/lib/Target/X86/X86.td Thu Dec 29 09:51:45 2011 >> > > @@ -55,7 +55,7 @@ >> > > [FeatureSSSE3]>; >> > > def FeatureSSE42 : SubtargetFeature<"sse42", "X86SSELevel", "SSE42", >> > > "Enable SSE 4.2 instructions", >> > > - [FeatureSSE41, FeaturePOPCNT]>; >> > > + [FeatureSSE41]>; >> > > def Feature3DNow : SubtargetFeature<"3dnow", "X863DNowLevel", >> "ThreeDNow", >> > > "Enable 3DNow! instructions", >> > > [FeatureMMX]>; >> > > @@ -77,8 +77,7 @@ >> > > "IsUAMemFast", "true", >> > > "Fast unaligned memory >> access">; >> > > def FeatureSSE4A : SubtargetFeature<"sse4a", "HasSSE4A", "true", >> > > - "Support SSE 4a instructions", >> > > - [FeaturePOPCNT]>; >> > > + "Support SSE 4a instructions">; >> > > >> > > def FeatureAVX : SubtargetFeature<"avx", "HasAVX", "true", >> > > "Enable AVX instructions">; >> > > >> > > >> > > _______________________________________________ >> > > llvm-commits mailing list >> > > llvm-commits at cs.uiuc.edu >> > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> > >> > >> > >> > >> > -- >> > ~Craig >> >> > > > -- > ~Craig > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111229/9565cb42/attachment.html From benny.kra at googlemail.com Thu Dec 29 12:44:10 2011 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Thu, 29 Dec 2011 19:44:10 +0100 Subject: [llvm-commits] [llvm] r147339 - /llvm/trunk/lib/Target/X86/X86.td In-Reply-To: References: <20111229155145.6D55D2A6C12C@llvm.org> <9C7D3B66-A085-41FF-9276-0F67AE85C0B7@googlemail.com> <4BAD6A7A-26A5-40C0-92A5-F5166BF3BE38@googlemail.com> Message-ID: <26828045-27E0-42EE-83F3-953CBB9399E2@googlemail.com> On 29.12.2011, at 19:37, Craig Topper wrote: > I believe that behavior is also what makes disabling SSE1, disable all other SSE levels. I guess you're right then, sorry. In that case please add FeaturePOPCNT to all cpu models that support them (i.e. on all CPUs where SSE42 or SSE4A is currently specified). - Ben > > On Thu, Dec 29, 2011 at 12:35 PM, Craig Topper wrote: > With the patch reverted the following test case does not pass. I believe LLVM assumes that implied enabling of another feature also means the implied feature is required by the feature that implied it. There are comments suggesting that near the definition for Feature64Bit. > > ; RUN: llc -march=x86-64 -mattr=+sse42,-popcnt < %s | FileCheck %s > > define <2 x i64> @test_x86_sse42_pcmpgtq(<2 x i64> %a0, <2 x i64> %a1) { > ; CHECK: pcmpgtq > %res = call <2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1] > ret <2 x i64> %res > } > declare <2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64>, <2 x i64>) nounwind readnone > > > On Thu, Dec 29, 2011 at 12:30 PM, Benjamin Kramer wrote: > > On 29.12.2011, at 19:23, Craig Topper wrote: > > > Doesn't having SSE42 explicitly imply popcnt make disabling popcnt through mattr also disable SSE42? > > No. it just means that enabling sse42 will also enable popcnt. You can still selectively disable popcnt without disabling all of sse42. 
> > - Ben > > > > > On Thu, Dec 29, 2011 at 12:20 PM, Benjamin Kramer wrote: > > > > On 29.12.2011, at 16:51, Craig Topper wrote: > > > > > Author: ctopper > > > Date: Thu Dec 29 09:51:45 2011 > > > New Revision: 147339 > > > > > > URL: http://llvm.org/viewvc/llvm-project?rev=147339&view=rev > > > Log: > > > Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on its own without disabling SSE4.2 or SSE4A. > > > > Hi Craig, > > > > What's your intention here? popcnt is part of sse42 and sse4a so you won't find a processor that has sse42/a and not popcnt. It simplifies the model descriptions in this file. You can still selectively enable sse42 and disable the popcnt feature (e.g. "llc -mattr=+sse42,-popcnt" or "clang -msse4.2 -mno-popcnt") and it won't emit popcnt instructions. > > > > This patch disables popcnt emission unless it's explicitly enabled (think of the JIT). Please revert this patch. > > > > - Ben > > > > > > > > Modified: > > > llvm/trunk/lib/Target/X86/X86.td > > > > > > Modified: llvm/trunk/lib/Target/X86/X86.td > > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147339&r1=147338&r2=147339&view=diff > > > ============================================================================== > > > --- llvm/trunk/lib/Target/X86/X86.td (original) > > > +++ llvm/trunk/lib/Target/X86/X86.td Thu Dec 29 09:51:45 2011 > > > @@ -55,7 +55,7 @@ > > > [FeatureSSSE3]>; > > > def FeatureSSE42 : SubtargetFeature<"sse42", "X86SSELevel", "SSE42", > > > "Enable SSE 4.2 instructions", > > > - [FeatureSSE41, FeaturePOPCNT]>; > > > + [FeatureSSE41]>; > > > def Feature3DNow : SubtargetFeature<"3dnow", "X863DNowLevel", "ThreeDNow", > > > "Enable 3DNow! 
instructions", > > > [FeatureMMX]>; > > > @@ -77,8 +77,7 @@ > > > "IsUAMemFast", "true", > > > "Fast unaligned memory access">; > > > def FeatureSSE4A : SubtargetFeature<"sse4a", "HasSSE4A", "true", > > > - "Support SSE 4a instructions", > > > - [FeaturePOPCNT]>; > > > + "Support SSE 4a instructions">; > > > > > > def FeatureAVX : SubtargetFeature<"avx", "HasAVX", "true", > > > "Enable AVX instructions">; > > > > > > > > > _______________________________________________ > > > llvm-commits mailing list > > > llvm-commits at cs.uiuc.edu > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > > > > > > > > > -- > > ~Craig > > > > > -- > ~Craig > > > > -- > ~Craig From craig.topper at gmail.com Thu Dec 29 12:46:37 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 12:46:37 -0600 Subject: [llvm-commits] [llvm] r147339 - /llvm/trunk/lib/Target/X86/X86.td In-Reply-To: <26828045-27E0-42EE-83F3-953CBB9399E2@googlemail.com> References: <20111229155145.6D55D2A6C12C@llvm.org> <9C7D3B66-A085-41FF-9276-0F67AE85C0B7@googlemail.com> <4BAD6A7A-26A5-40C0-92A5-F5166BF3BE38@googlemail.com> <26828045-27E0-42EE-83F3-953CBB9399E2@googlemail.com> Message-ID: Will do. Not sure why I didn't think of that when I removed it in the first place. On Thu, Dec 29, 2011 at 12:44 PM, Benjamin Kramer wrote: > > On 29.12.2011, at 19:37, Craig Topper wrote: > > > I believe that behavior is also what makes disabling SSE1, disable all > other SSE levels. > > I guess you're right then, sorry. > > In that case please add FeaturePOPCNT to all cpu models that support them > (i.e. on all CPUs where SSE42 or SSE4A is currently specified). > > - Ben > > > > > On Thu, Dec 29, 2011 at 12:35 PM, Craig Topper > wrote: > > With the patch reverted the following test case does not pass. I believe > LLVM assumes that implied enabling of another feature also means the > implied feature is required by the feature that implied it. 
There are > comments suggesting that near the definition for Feature64Bit. > > > > ; RUN: llc -march=x86-64 -mattr=+sse42,-popcnt < %s | FileCheck %s > > > > define <2 x i64> @test_x86_sse42_pcmpgtq(<2 x i64> %a0, <2 x i64> %a1) { > > ; CHECK: pcmpgtq > > %res = call <2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64> %a0, <2 x i64> > %a1) ; <<2 x i64>> [#uses=1] > > ret <2 x i64> %res > > } > > declare <2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64>, <2 x i64>) nounwind > readnone > > > > > > On Thu, Dec 29, 2011 at 12:30 PM, Benjamin Kramer < > benny.kra at googlemail.com> wrote: > > > > On 29.12.2011, at 19:23, Craig Topper wrote: > > > > > Doesn't having SSE42 explicitly imply popcnt make disabling popcnt > through mattr also disable SSE42? > > > > No. it just means that enabling sse42 will also enable popcnt. You can > still selectively disable popcnt without disabling all of sse42. > > > > - Ben > > > > > > > > On Thu, Dec 29, 2011 at 12:20 PM, Benjamin Kramer < > benny.kra at googlemail.com> wrote: > > > > > > On 29.12.2011, at 16:51, Craig Topper wrote: > > > > > > > Author: ctopper > > > > Date: Thu Dec 29 09:51:45 2011 > > > > New Revision: 147339 > > > > > > > > URL: http://llvm.org/viewvc/llvm-project?rev=147339&view=rev > > > > Log: > > > > Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be > disabled on its own without disabling SSE4.2 or SSE4A. > > > > > > Hi Craig, > > > > > > What's your intention here? popcnt is part of sse42 and sse4a so you > won't find a processor that has sse42/a and not popcnt. It simplifies the > model descriptions in this file. You can still selectively enable sse42 and > disable the popcnt feature (e.g. "llc -mattr=+sse42,-popcnt" or "clang > -msse4.2 -mno-popcnt") and it won't emit popcnt instructions. > > > > > > This patch disables popcnt emission unless it's explicitly enabled > (think of the JIT). Please revert this patch. 
> > > > > > - Ben > > > > > > > > > > > Modified: > > > > llvm/trunk/lib/Target/X86/X86.td > > > > > > > > Modified: llvm/trunk/lib/Target/X86/X86.td > > > > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147339&r1=147338&r2=147339&view=diff > > > > > ============================================================================== > > > > --- llvm/trunk/lib/Target/X86/X86.td (original) > > > > +++ llvm/trunk/lib/Target/X86/X86.td Thu Dec 29 09:51:45 2011 > > > > @@ -55,7 +55,7 @@ > > > > [FeatureSSSE3]>; > > > > def FeatureSSE42 : SubtargetFeature<"sse42", "X86SSELevel", > "SSE42", > > > > "Enable SSE 4.2 instructions", > > > > - [FeatureSSE41, > FeaturePOPCNT]>; > > > > + [FeatureSSE41]>; > > > > def Feature3DNow : SubtargetFeature<"3dnow", "X863DNowLevel", > "ThreeDNow", > > > > "Enable 3DNow! instructions", > > > > [FeatureMMX]>; > > > > @@ -77,8 +77,7 @@ > > > > "IsUAMemFast", "true", > > > > "Fast unaligned memory > access">; > > > > def FeatureSSE4A : SubtargetFeature<"sse4a", "HasSSE4A", "true", > > > > - "Support SSE 4a instructions", > > > > - [FeaturePOPCNT]>; > > > > + "Support SSE 4a > instructions">; > > > > > > > > def FeatureAVX : SubtargetFeature<"avx", "HasAVX", "true", > > > > "Enable AVX instructions">; > > > > > > > > > > > > _______________________________________________ > > > > llvm-commits mailing list > > > > llvm-commits at cs.uiuc.edu > > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > > > > > > > > > > > > > > -- > > > ~Craig > > > > > > > > > > -- > > ~Craig > > > > > > > > -- > > ~Craig > > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111229/fc81ebb4/attachment.html From craig.topper at gmail.com Thu Dec 29 12:47:31 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 18:47:31 -0000 Subject: [llvm-commits] [llvm] r147347 - /llvm/trunk/lib/Target/X86/X86.td Message-ID: <20111229184731.AA8CA2A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 12:47:31 2011 New Revision: 147347 URL: http://llvm.org/viewvc/llvm-project?rev=147347&view=rev Log: Add FeaturePOPCNT to all CPU types that lost it when it was removed from SSE42/SSE4A in r147339. Modified: llvm/trunk/lib/Target/X86/X86.td Modified: llvm/trunk/lib/Target/X86/X86.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147347&r1=147346&r2=147347&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86.td (original) +++ llvm/trunk/lib/Target/X86/X86.td Thu Dec 29 12:47:31 2011 @@ -146,31 +146,34 @@ FeatureSlowBTMem]>; // "Arrandale" along with corei3 and corei5 def : Proc<"corei7", [FeatureSSE42, FeatureCMPXCHG16B, - FeatureSlowBTMem, FeatureFastUAMem, FeatureAES]>; + FeatureSlowBTMem, FeatureFastUAMem, + FeaturePOPCNT, FeatureAES]>; def : Proc<"nehalem", [FeatureSSE42, FeatureCMPXCHG16B, - FeatureSlowBTMem, FeatureFastUAMem]>; + FeatureSlowBTMem, FeatureFastUAMem, + FeaturePOPCNT]>; // Westmere is a similar machine to nehalem with some additional features. // Westmere is the corei3/i5/i7 path from nehalem to sandybridge def : Proc<"westmere", [FeatureSSE42, FeatureCMPXCHG16B, - FeatureSlowBTMem, FeatureFastUAMem, FeatureAES, - FeatureCLMUL]>; + FeatureSlowBTMem, FeatureFastUAMem, + FeaturePOPCNT, FeatureAES, FeatureCLMUL]>; // Sandy Bridge // SSE is not listed here since llvm treats AVX as a reimplementation of SSE, // rather than a superset. // FIXME: Disabling AVX for now since it's not ready.
-def : Proc<"corei7-avx", [FeatureSSE42, FeatureCMPXCHG16B, +def : Proc<"corei7-avx", [FeatureSSE42, FeatureCMPXCHG16B, FeaturePOPCNT, FeatureAES, FeatureCLMUL]>; // Ivy Bridge -def : Proc<"core-avx-i", [FeatureSSE42, FeatureCMPXCHG16B, +def : Proc<"core-avx-i", [FeatureSSE42, FeatureCMPXCHG16B, FeaturePOPCNT, FeatureAES, FeatureCLMUL, FeatureRDRAND, FeatureF16C, FeatureFSGSBase]>; // Haswell // FIXME: Disabling AVX/AVX2 for now since it's not ready. -def : Proc<"core-avx2", [FeatureSSE42, FeatureCMPXCHG16B, FeatureAES, - FeatureCLMUL, FeatureRDRAND, FeatureF16C, - FeatureFSGSBase, FeatureFMA3, FeatureMOVBE, - FeatureLZCNT, FeatureBMI, FeatureBMI2]>; +def : Proc<"core-avx2", [FeatureSSE42, FeatureCMPXCHG16B, FeaturePOPCNT, + FeatureAES, FeatureCLMUL, FeatureRDRAND, + FeatureF16C, FeatureFSGSBase, FeatureFMA3, + FeatureMOVBE, FeatureLZCNT, FeatureBMI, + FeatureBMI2]>; def : Proc<"k6", [FeatureMMX]>; def : Proc<"k6-2", [Feature3DNow]>; @@ -196,15 +199,15 @@ FeatureSlowBTMem]>; def : Proc<"amdfam10", [FeatureSSE3, FeatureSSE4A, Feature3DNowA, FeatureCMPXCHG16B, FeatureLZCNT, - FeatureSlowBTMem]>; + FeaturePOPCNT, FeatureSlowBTMem]>; // FIXME: Disabling AVX for now since it's not ready. 
def : Proc<"bdver1", [FeatureSSE42, FeatureSSE4A, FeatureCMPXCHG16B, FeatureAES, FeatureCLMUL, FeatureFMA4, - FeatureXOP, FeatureLZCNT]>; + FeatureXOP, FeatureLZCNT, FeaturePOPCNT]>; def : Proc<"bdver2", [FeatureSSE42, FeatureSSE4A, FeatureCMPXCHG16B, FeatureAES, FeatureCLMUL, FeatureFMA4, - FeatureXOP, FeatureF16C, FeatureLZCNT, - FeatureBMI]>; + FeatureXOP, FeatureF16C, FeatureLZCNT, + FeaturePOPCNT, FeatureBMI]>; def : Proc<"winchip-c6", [FeatureMMX]>; def : Proc<"winchip2", [Feature3DNow]>; From craig.topper at gmail.com Thu Dec 29 13:25:56 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 19:25:56 -0000 Subject: [llvm-commits] [llvm] r147348 - /llvm/trunk/lib/Target/X86/X86Subtarget.cpp Message-ID: <20111229192556.6D6312A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 13:25:56 2011 New Revision: 147348 URL: http://llvm.org/viewvc/llvm-project?rev=147348&view=rev Log: Change XOP detection to use the correct CPUID bit instead of using the FMA4 bit. Modified: llvm/trunk/lib/Target/X86/X86Subtarget.cpp Modified: llvm/trunk/lib/Target/X86/X86Subtarget.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.cpp?rev=147348&r1=147347&r2=147348&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86Subtarget.cpp (original) +++ llvm/trunk/lib/Target/X86/X86Subtarget.cpp Thu Dec 29 13:25:56 2011 @@ -266,15 +266,19 @@ HasLZCNT = true; ToggleFeature(X86::FeatureLZCNT); } - if (IsAMD && ((ECX >> 6) & 0x1)) { - HasSSE4A = true; - ToggleFeature(X86::FeatureSSE4A); - } - if (IsAMD && ((ECX >> 16) & 0x1)) { - HasFMA4 = true; - ToggleFeature(X86::FeatureFMA4); - HasXOP = true; - ToggleFeature(X86::FeatureXOP); + if (IsAMD) { + if ((ECX >> 6) & 0x1) { + HasSSE4A = true; + ToggleFeature(X86::FeatureSSE4A); + } + if ((ECX >> 11) & 0x1) { + HasXOP = true; + ToggleFeature(X86::FeatureXOP); + } + if ((ECX >> 16) & 0x1) { + HasFMA4 = true; + 
ToggleFeature(X86::FeatureFMA4); + } } } } From craig.topper at gmail.com Thu Dec 29 13:46:19 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 19:46:19 -0000 Subject: [llvm-commits] [llvm] r147349 - /llvm/trunk/lib/Target/X86/X86.td Message-ID: <20111229194619.F3B372A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 13:46:19 2011 New Revision: 147349 URL: http://llvm.org/viewvc/llvm-project?rev=147349&view=rev Log: Make FMA3 imply AVX needs to be enabled. Particularly because 256-bit types aren't valid unless AVX is enabled. Modified: llvm/trunk/lib/Target/X86/X86.td Modified: llvm/trunk/lib/Target/X86/X86.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147349&r1=147348&r2=147349&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86.td (original) +++ llvm/trunk/lib/Target/X86/X86.td Thu Dec 29 13:46:19 2011 @@ -87,7 +87,8 @@ def FeatureCLMUL : SubtargetFeature<"clmul", "HasCLMUL", "true", "Enable carry-less multiplication instructions">; def FeatureFMA3 : SubtargetFeature<"fma3", "HasFMA3", "true", - "Enable three-operand fused multiple-add">; + "Enable three-operand fused multiple-add", + [FeatureAVX]>; def FeatureFMA4 : SubtargetFeature<"fma4", "HasFMA4", "true", "Enable four-operand fused multiple-add">; def FeatureXOP : SubtargetFeature<"xop", "HasXOP", "true", @@ -168,10 +169,10 @@ FeatureRDRAND, FeatureF16C, FeatureFSGSBase]>; // Haswell -// FIXME: Disabling AVX/AVX2 for now since it's not ready. +// FIXME: Disabling AVX/AVX2/FMA3 for now since it's not ready. 
 def : Proc<"core-avx2",       [FeatureSSE42, FeatureCMPXCHG16B, FeaturePOPCNT,
                                FeatureAES, FeatureCLMUL, FeatureRDRAND,
-                               FeatureF16C, FeatureFSGSBase, FeatureFMA3,
+                               FeatureF16C, FeatureFSGSBase,
                                FeatureMOVBE, FeatureLZCNT, FeatureBMI,
                                FeatureBMI2]>;

From spop at codeaurora.org Thu Dec 29 14:06:09 2011
From: spop at codeaurora.org (Sebastian Pop)
Date: Thu, 29 Dec 2011 14:06:09 -0600
Subject: [llvm-commits] [zorg] r146460 - /zorg/trunk/lnt/lnt/db/runinfo.py
In-Reply-To: <20111213005906.382A61BE003@llvm.org>
References: <20111213005906.382A61BE003@llvm.org>
Message-ID:

Hi Daniel,

I just started reading your changes to zorg: Tobi pointed me to your
recent patches.

On Mon, Dec 12, 2011 at 6:59 PM, Daniel Dunbar wrote:
> Author: ddunbar
> Date: Mon Dec 12 18:59:05 2011
> New Revision: 146460
>
> URL: http://llvm.org/viewvc/llvm-project?rev=146460&view=rev
> Log:
> lnt: Take two small steps to reduce higher-than-manageable number of significant
> changes in reports...
>
>  - First, when using an estimated standard deviation, only treat the change as
>    significant if it is above the threshold from the estimated mean. I *think*
>    this is somewhat statistically sound, based on how we do the estimation.
>
>  - Second, don't report any changes with delta's under 0.01 in ignore_small
>    mode. This obviously has no mathematical basis, but appears to be useful in
>    practice.
>
> Modified:
>    zorg/trunk/lnt/lnt/db/runinfo.py
>
> Modified: zorg/trunk/lnt/lnt/db/runinfo.py
> URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/lnt/lnt/db/runinfo.py?rev=146460&r1=146459&r2=146460&view=diff
> ==============================================================================
> --- zorg/trunk/lnt/lnt/db/runinfo.py (original)
> +++ zorg/trunk/lnt/lnt/db/runinfo.py Mon Dec 12 18:59:05 2011
> @@ -10,7 +10,8 @@
>
>  class ComparisonResult:
>      def __init__(self, cur_value, prev_value, delta, pct_delta, stddev, MAD,
> -                 cur_failed, prev_failed, samples):
> +                 cur_failed, prev_failed, samples, stddev_mean = None,
> +                 stddev_is_estimated = False):
>          self.current = cur_value
>          self.previous = prev_value
>          self.delta = delta
> @@ -20,6 +21,8 @@
>          self.failed = cur_failed
>          self.prev_failed = prev_failed
>          self.samples = samples
> +        self.stddev_mean = stddev_mean
> +        self.stddev_is_estimated = stddev_is_estimated
>
>      def get_samples(self):
>          return self.samples
> @@ -65,10 +68,27 @@
>          if ignore_small and abs(self.pct_delta) < .01:
>              return UNCHANGED_PASS
>
> +        # Always ignore changes with small deltas. There is no mathematical
> +        # basis for this, it should be obviated by appropriate statistical
> +        # checks, but practical evidence indicates what we currently have isn't
> +        # good enough (for reasons I do not yet understand).
> +        if ignore_small and abs(self.delta) < .01:
> +            return UNCHANGED_PASS

I guess I am ok with this smoothing "hack" to filter out tests that
are not running long enough. I see that you are using this computation:

    # Compute the comparison status for the test value.
    delta = run_value - prev_value

and so I assume that the values are in seconds. I would say that
differences of less than 0.01 seconds are "unnoticeable" unless they
are for testcases that run less than 1 second, in which case a 0.01s
difference is more than what the previous test would discard:

    if ignore_small and abs(self.pct_delta) < .01:
        return UNCHANGED_PASS

> +
>          # If we have a comparison window, then measure using a symmetic
>          # confidence interval.
>          if self.stddev is not None:
> -            if abs(self.delta) > self.stddev * confidence_interval:
> +            is_significant = abs(self.delta) > (self.stddev *
> +                                                confidence_interval)
> +
> +            # If the stddev is estimated, then it is also only significant if
> +            # the delta from the estimate mean is above the confidence interval.
> +            if self.stddev_is_estimated:
> +                is_significant &= (abs(self.current - self.stddev_mean) >
> +                                   self.stddev * confidence_interval)

I think that using this threshold is fine. From what I see you are
computing the Manhattan distance from the current value of a test to
the mean value and comparing against the standard deviation scaled by
the magic constant (btw, I think that 2.576 is a reasonable default: I
was using 2.2 in
http://repo.or.cz/w/gcc-perf-regression-tester.git/blob/34640748810602004e265ad6927b095155ca9772:/analyze-core.R
)

One small problem in here is that the stddev is the Euclidean distance:

    def standard_deviation(l):
        m = mean(l)
        means_sqrd = sum([(v - m)**2 for v in l]) / len(l)
        rms = math.sqrt(means_sqrd)
        return rms

So what do you think about changing the is_significant test to also
use the Euclidean distance?

I would have to understand better the code, but probably you can tell
me: what is the size of the window of past results that you are using
to compute the mean and the noise level. Based on the size of this
window, I can see another potential problem: it could be that you are
computing the mean and stddev over all the past results that you have,
in which case, supposing that there are several improvements and
degradations of the performance of the compiler, you would end up
having these speed-ups and slow-downs represented in the mean and in
the stddev.

The way I solved this problem is having a sliding window of (say a
magic number) 10 measures. If you compute the mean and stddev of this
sliding window, you end up again with a temporal series of mean and
stddev numbers that are a bit less prone to average the speed-ups and
slow-downs of the compiler.
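A minimal Python sketch of the first stage of that sliding-window scheme, for readers who want to try it on their own numbers. The names `sliding_stats` and `window` are assumptions made for illustration and are not part of LNT; `standard_deviation` follows the population formula quoted in this message, and the timings are made-up values.

```python
import math


def mean(values):
    return sum(values) / len(values)


def standard_deviation(values):
    # Population standard deviation: root mean square deviation from the mean.
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def sliding_stats(samples, window=10):
    """Return (mean, stddev) for every run of `window` consecutive samples.

    Hypothetical helper, not an LNT API. Yields an empty list when there
    are fewer samples than the window size.
    """
    stats = []
    for start in range(len(samples) - window + 1):
        chunk = samples[start:start + window]
        stats.append((mean(chunk), standard_deviation(chunk)))
    return stats


# Made-up timings in seconds with a step change after the fourth run.
timings = [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
for m, s in sliding_stats(timings, window=4):
    print(f"mean={m:.3f} stddev={s:.3f}")
```

In this toy series only the windows that straddle the step report a non-zero stddev, which is one way to see why a short window mixes in fewer unrelated speed-ups and slow-downs than statistics taken over the whole history.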
You then compute a second mean and stddev of this new temporal series and you use that as your mean and stddev. That's like computing a second derivative. Sebastian -- Qualcomm Innovation Center, Inc is a member of Code Aurora Forum From craig.topper at gmail.com Thu Dec 29 14:03:15 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 20:03:15 -0000 Subject: [llvm-commits] [llvm] r147351 - in /llvm/trunk: lib/Target/X86/X86InstrFMA.td test/MC/Disassembler/X86/simple-tests.txt Message-ID: <20111229200315.354C72A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 14:03:14 2011 New Revision: 147351 URL: http://llvm.org/viewvc/llvm-project?rev=147351&view=rev Log: Expose FMA3 instructions to the disassembler. Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=147351&r1=147350&r2=147351&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Thu Dec 29 14:03:14 2011 @@ -41,23 +41,21 @@ defm r231 : fma_rm; } -let isAsmParserOnly = 1 in { - // Fused Multiply-Add - defm VFMADDPS : fma_forms<0x98, 0xA8, 0xB8, "vfmadd", "ps">; - defm VFMADDPD : fma_forms<0x98, 0xA8, 0xB8, "vfmadd", "pd">, VEX_W; - defm VFMADDSUBPS : fma_forms<0x96, 0xA6, 0xB6, "vfmaddsub", "ps">; - defm VFMADDSUBPD : fma_forms<0x96, 0xA6, 0xB6, "vfmaddsub", "pd">, VEX_W; - defm VFMSUBADDPS : fma_forms<0x97, 0xA7, 0xB7, "vfmsubadd", "ps">; - defm VFMSUBADDPD : fma_forms<0x97, 0xA7, 0xB7, "vfmsubadd", "pd">, VEX_W; - defm VFMSUBPS : fma_forms<0x9A, 0xAA, 0xBA, "vfmsub", "ps">; - defm VFMSUBPD : fma_forms<0x9A, 0xAA, 0xBA, "vfmsub", "pd">, VEX_W; - - // Fused Negative Multiply-Add - defm VFNMADDPS : fma_forms<0x9C, 0xAC, 0xBC, "vfnmadd", "ps">; - defm VFNMADDPD : 
fma_forms<0x9C, 0xAC, 0xBC, "vfnmadd", "pd">, VEX_W; - defm VFNMSUBPS : fma_forms<0x9E, 0xAE, 0xBE, "vfnmsub", "ps">; - defm VFNMSUBPD : fma_forms<0x9E, 0xAE, 0xBE, "vfnmsub", "pd">, VEX_W; -} +// Fused Multiply-Add +defm VFMADDPS : fma_forms<0x98, 0xA8, 0xB8, "vfmadd", "ps">; +defm VFMADDPD : fma_forms<0x98, 0xA8, 0xB8, "vfmadd", "pd">, VEX_W; +defm VFMADDSUBPS : fma_forms<0x96, 0xA6, 0xB6, "vfmaddsub", "ps">; +defm VFMADDSUBPD : fma_forms<0x96, 0xA6, 0xB6, "vfmaddsub", "pd">, VEX_W; +defm VFMSUBADDPS : fma_forms<0x97, 0xA7, 0xB7, "vfmsubadd", "ps">; +defm VFMSUBADDPD : fma_forms<0x97, 0xA7, 0xB7, "vfmsubadd", "pd">, VEX_W; +defm VFMSUBPS : fma_forms<0x9A, 0xAA, 0xBA, "vfmsub", "ps">; +defm VFMSUBPD : fma_forms<0x9A, 0xAA, 0xBA, "vfmsub", "pd">, VEX_W; + +// Fused Negative Multiply-Add +defm VFNMADDPS : fma_forms<0x9C, 0xAC, 0xBC, "vfnmadd", "ps">; +defm VFNMADDPD : fma_forms<0x9C, 0xAC, 0xBC, "vfnmadd", "pd">, VEX_W; +defm VFNMSUBPS : fma_forms<0x9E, 0xAE, 0xBE, "vfnmsub", "ps">; +defm VFNMSUBPD : fma_forms<0x9E, 0xAE, 0xBE, "vfnmsub", "pd">, VEX_W; //===----------------------------------------------------------------------===// // FMA4 - AMD 4 operand Fused Multiply-Add instructions Modified: llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt?rev=147351&r1=147350&r2=147351&view=diff ============================================================================== --- llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt (original) +++ llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt Thu Dec 29 14:03:14 2011 @@ -647,3 +647,27 @@ # CHECK: shrxq %r12, %r11, %r10 0xc4 0x42 0x9b 0xf7 0xd3 + +# CHECK: vfmadd132ps %xmm11, %xmm12, %xmm10 +0xc4 0x42 0x19 0x98 0xd3 + +# CHECK: vfmadd132pd %xmm11, %xmm12, %xmm10 +0xc4 0x42 0x99 0x98 0xd3 + +# CHECK: vfmadd132ps %ymm11, %ymm12, %ymm10 +0xc4 0x42 0x1d 0x98 0xd3 + +# CHECK: vfmadd132pd %ymm11, %ymm12, %ymm10 +0xc4 0x42 0x9d 0x98 
0xd3 + +# CHECK: vfmadd132ps (%rax), %xmm12, %xmm10 +0xc4 0x62 0x19 0x98 0x10 + +# CHECK: vfmadd132pd (%rax), %xmm12, %xmm10 +0xc4 0x62 0x99 0x98 0x10 + +# CHECK: vfmadd132ps (%rax), %ymm12, %ymm10 +0xc4 0x62 0x1d 0x98 0x10 + +# CHECK: vfmadd132pd (%rax), %ymm12, %ymm10 +0xc4 0x62 0x9d 0x98 0x10 From hfinkel at anl.gov Thu Dec 29 14:20:24 2011 From: hfinkel at anl.gov (Hal Finkel) Date: Thu, 29 Dec 2011 14:20:24 -0600 Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc Message-ID: <1325190024.13080.2848.camel@sapling> This small patch fixes a compatibility problem between the ppc linux asm printer and recent versions of binutils (as explained in the test case). With this patch, llvm's behavior will again match that of gcc. It touches CodeGen/AsmPrinter; please review. Thanks again, Hal -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- A non-text attachment was scrubbed... Name: llvm_ppc64_sizedir.diff Type: text/x-patch Size: 3540 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111229/2f50c133/attachment.bin From rafael.espindola at gmail.com Thu Dec 29 14:24:47 2011 From: rafael.espindola at gmail.com (Rafael Espindola) Date: Thu, 29 Dec 2011 20:24:47 -0000 Subject: [llvm-commits] [llvm] r147352 - in /llvm/trunk: include/llvm/MC/MCDwarf.h include/llvm/MC/MCStreamer.h lib/MC/MCDwarf.cpp lib/MC/MCParser/AsmParser.cpp lib/MC/MCStreamer.cpp test/MC/ELF/cfi-escape.s Message-ID: <20111229202447.D60372A6C12C@llvm.org> Author: rafael Date: Thu Dec 29 14:24:47 2011 New Revision: 147352 URL: http://llvm.org/viewvc/llvm-project?rev=147352&view=rev Log: Implement .cfi_escape. Patch by Brian Anderson! 
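Judging from the parser in this commit, `.cfi_escape` takes a comma-separated list of absolute expressions and keeps the low byte of each one. A minimal Python sketch of that behaviour follows; `parse_cfi_escape` is a hypothetical helper name (not an LLVM API), and it assumes the operands are plain integer literals rather than full assembler expressions.

```python
def parse_cfi_escape(operands: str) -> bytes:
    """Turn the operand list of a .cfi_escape directive into raw bytes.

    Hypothetical helper for illustration only. It accepts plain integer
    literals (decimal, or hex with a 0x prefix), not assembler expressions.
    """
    out = bytearray()
    for token in operands.split(","):
        value = int(token.strip(), 0)  # base 0: accepts 7 as well as 0x7f
        out.append(value & 0xFF)       # keep the low byte, like the (uint8_t) cast
    return bytes(out)


# Operands taken from the cfi-escape.s test added by this commit.
escape = parse_cfi_escape("0x15, 7, 0x7f")
print(escape.hex())
```

The same three bytes appear as `1507 7f` near the end of the `.eh_frame` section data that the new test checks.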
Added: llvm/trunk/test/MC/ELF/cfi-escape.s Modified: llvm/trunk/include/llvm/MC/MCDwarf.h llvm/trunk/include/llvm/MC/MCStreamer.h llvm/trunk/lib/MC/MCDwarf.cpp llvm/trunk/lib/MC/MCParser/AsmParser.cpp llvm/trunk/lib/MC/MCStreamer.cpp Modified: llvm/trunk/include/llvm/MC/MCDwarf.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCDwarf.h?rev=147352&r1=147351&r2=147352&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCDwarf.h (original) +++ llvm/trunk/include/llvm/MC/MCDwarf.h Thu Dec 29 14:24:47 2011 @@ -271,13 +271,14 @@ class MCCFIInstruction { public: - enum OpType { SameValue, Remember, Restore, Move, RelMove }; + enum OpType { SameValue, Remember, Restore, Move, RelMove, Escape }; private: OpType Operation; MCSymbol *Label; // Move to & from location. MachineLocation Destination; MachineLocation Source; + std::vector Values; public: MCCFIInstruction(OpType Op, MCSymbol *L) : Operation(Op), Label(L) { @@ -296,10 +297,17 @@ : Operation(Op), Label(L), Destination(D), Source(S) { assert(Op == RelMove); } + MCCFIInstruction(OpType Op, MCSymbol *L, StringRef Vals) + : Operation(Op), Label(L), Values(Vals.begin(), Vals.end()) { + assert(Op == Escape); + } OpType getOperation() const { return Operation; } MCSymbol *getLabel() const { return Label; } const MachineLocation &getDestination() const { return Destination; } const MachineLocation &getSource() const { return Source; } + const StringRef getValues() const { + return StringRef(&Values[0], Values.size()); + } }; struct MCDwarfFrameInfo { Modified: llvm/trunk/include/llvm/MC/MCStreamer.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCStreamer.h?rev=147352&r1=147351&r2=147352&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCStreamer.h (original) +++ llvm/trunk/include/llvm/MC/MCStreamer.h Thu Dec 29 14:24:47 2011 @@ -549,6 +549,7 
@@ virtual void EmitCFISameValue(int64_t Register); virtual void EmitCFIRelOffset(int64_t Register, int64_t Offset); virtual void EmitCFIAdjustCfaOffset(int64_t Adjustment); + virtual void EmitCFIEscape(StringRef Values); virtual void EmitWin64EHStartProc(const MCSymbol *Symbol); virtual void EmitWin64EHEndProc(); Modified: llvm/trunk/lib/MC/MCDwarf.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCDwarf.cpp?rev=147352&r1=147351&r2=147352&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCDwarf.cpp (original) +++ llvm/trunk/lib/MC/MCDwarf.cpp Thu Dec 29 14:24:47 2011 @@ -987,6 +987,10 @@ Streamer.EmitULEB128IntValue(Reg); return; } + case MCCFIInstruction::Escape: + if (VerboseAsm) Streamer.AddComment("Escape bytes"); + Streamer.EmitBytes(Instr.getValues(), 0); + return; } llvm_unreachable("Unhandled case in switch"); } Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCParser/AsmParser.cpp?rev=147352&r1=147351&r2=147352&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCParser/AsmParser.cpp (original) +++ llvm/trunk/lib/MC/MCParser/AsmParser.cpp Thu Dec 29 14:24:47 2011 @@ -301,6 +301,8 @@ &GenericAsmParser::ParseDirectiveCFIRestoreState>(".cfi_restore_state"); AddDirectiveHandler< &GenericAsmParser::ParseDirectiveCFISameValue>(".cfi_same_value"); + AddDirectiveHandler< + &GenericAsmParser::ParseDirectiveCFIEscape>(".cfi_escape"); // Macro directives. 
AddDirectiveHandler<&GenericAsmParser::ParseDirectiveMacrosOnOff>( @@ -334,6 +336,7 @@ bool ParseDirectiveCFIRememberState(StringRef, SMLoc DirectiveLoc); bool ParseDirectiveCFIRestoreState(StringRef, SMLoc DirectiveLoc); bool ParseDirectiveCFISameValue(StringRef, SMLoc DirectiveLoc); + bool ParseDirectiveCFIEscape(StringRef, SMLoc DirectiveLoc); bool ParseDirectiveMacrosOnOff(StringRef, SMLoc DirectiveLoc); bool ParseDirectiveMacro(StringRef, SMLoc DirectiveLoc); @@ -2812,6 +2815,30 @@ return false; } +/// ParseDirectiveCFIEscape +/// ::= .cfi_escape expression[,...] +bool GenericAsmParser::ParseDirectiveCFIEscape(StringRef IDVal, + SMLoc DirectiveLoc) { + std::string Values; + int64_t CurrValue; + if (getParser().ParseAbsoluteExpression(CurrValue)) + return true; + + Values.push_back((uint8_t)CurrValue); + + while (getLexer().is(AsmToken::Comma)) { + Lex(); + + if (getParser().ParseAbsoluteExpression(CurrValue)) + return true; + + Values.push_back((uint8_t)CurrValue); + } + + getStreamer().EmitCFIEscape(Values); + return false; +} + /// ParseDirectiveMacrosOnOff /// ::= .macros_on /// ::= .macros_off Modified: llvm/trunk/lib/MC/MCStreamer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCStreamer.cpp?rev=147352&r1=147351&r2=147352&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCStreamer.cpp (original) +++ llvm/trunk/lib/MC/MCStreamer.cpp Thu Dec 29 14:24:47 2011 @@ -408,6 +408,15 @@ CurFrame->Instructions.push_back(Instruction); } +void MCStreamer::EmitCFIEscape(StringRef Values) { + EnsureValidFrame(); + MCDwarfFrameInfo *CurFrame = getCurrentFrameInfo(); + MCSymbol *Label = getContext().CreateTempSymbol(); + EmitLabel(Label); + MCCFIInstruction Instruction(MCCFIInstruction::Escape, Label, Values); + CurFrame->Instructions.push_back(Instruction); +} + void MCStreamer::setCurrentW64UnwindInfo(MCWin64EHUnwindInfo *Frame) { W64UnwindInfos.push_back(Frame); CurrentW64UnwindInfo = 
W64UnwindInfos.back(); Added: llvm/trunk/test/MC/ELF/cfi-escape.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ELF/cfi-escape.s?rev=147352&view=auto ============================================================================== --- llvm/trunk/test/MC/ELF/cfi-escape.s (added) +++ llvm/trunk/test/MC/ELF/cfi-escape.s Thu Dec 29 14:24:47 2011 @@ -0,0 +1,42 @@ +// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s -o - | elf-dump --dump-section-data | FileCheck %s + +f: + .cfi_startproc + nop + .cfi_escape 0x15, 7, 0x7f # DW_CFA_val_offset_sf, %esp, 8/-8 + nop + .cfi_endproc + +// CHECK: # Section 4 +// CHECK-NEXT: (('sh_name', 0x00000011) # '.eh_frame' +// CHECK-NEXT: ('sh_type', 0x00000001) +// CHECK-NEXT: ('sh_flags', 0x0000000000000002) +// CHECK-NEXT: ('sh_addr', 0x0000000000000000) +// CHECK-NEXT: ('sh_offset', 0x0000000000000048) +// CHECK-NEXT: ('sh_size', 0x0000000000000030) +// CHECK-NEXT: ('sh_link', 0x00000000) +// CHECK-NEXT: ('sh_info', 0x00000000) +// CHECK-NEXT: ('sh_addralign', 0x0000000000000008) +// CHECK-NEXT: ('sh_entsize', 0x0000000000000000) +// CHECK-NEXT: ('_section_data', '14000000 00000000 017a5200 01781001 1b0c0708 90010000 14000000 1c000000 00000000 02000000 00411507 7f000000') +// CHECK-NEXT: ), +// CHECK-NEXT: # Section 5 +// CHECK-NEXT: (('sh_name', 0x0000000c) # '.rela.eh_frame' +// CHECK-NEXT: ('sh_type', 0x00000004) +// CHECK-NEXT: ('sh_flags', 0x0000000000000000) +// CHECK-NEXT: ('sh_addr', 0x0000000000000000) +// CHECK-NEXT: ('sh_offset', 0x0000000000000390) +// CHECK-NEXT: ('sh_size', 0x0000000000000018) +// CHECK-NEXT: ('sh_link', 0x00000007) +// CHECK-NEXT: ('sh_info', 0x00000004) +// CHECK-NEXT: ('sh_addralign', 0x0000000000000008) +// CHECK-NEXT: ('sh_entsize', 0x0000000000000018) +// CHECK-NEXT: ('_relocations', [ +// CHECK-NEXT: # Relocation 0 +// CHECK-NEXT: (('r_offset', 0x0000000000000020) +// CHECK-NEXT: ('r_sym', 0x00000002) +// CHECK-NEXT: ('r_type', 0x00000002) +// CHECK-NEXT: ('r_addend', 
0x0000000000000000) +// CHECK-NEXT: ), +// CHECK-NEXT: ]) +// CHECK-NEXT: ), From craig.topper at gmail.com Thu Dec 29 14:43:41 2011 From: craig.topper at gmail.com (Craig Topper) Date: Thu, 29 Dec 2011 20:43:41 -0000 Subject: [llvm-commits] [llvm] r147353 - in /llvm/trunk: lib/Target/X86/X86InstrFMA.td lib/Target/X86/X86InstrFormats.td test/MC/Disassembler/X86/simple-tests.txt Message-ID: <20111229204341.55F8E2A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 14:43:40 2011 New Revision: 147353 URL: http://llvm.org/viewvc/llvm-project?rev=147353&view=rev Log: Fix execution domains for PS/PD FMA3 instructions. Add SS/SD forms of FMA3 instructions. Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td llvm/trunk/lib/Target/X86/X86InstrFormats.td llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=147353&r1=147352&r2=147353&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Thu Dec 29 14:43:40 2011 @@ -15,7 +15,7 @@ // FMA3 - Intel 3 operand Fused Multiply-Add instructions //===----------------------------------------------------------------------===// -multiclass fma_rm opc, string OpcodeStr> { +multiclass fma3p_rm opc, string OpcodeStr> { def r : FMA3; } -multiclass fma_forms opc132, bits<8> opc213, bits<8> opc231, - string OpcodeStr, string PackTy> { - defm r132 : fma_rm; - defm r213 : fma_rm; - defm r231 : fma_rm; +multiclass fma3p_forms opc132, bits<8> opc213, bits<8> opc231, + string OpcodeStr, string PackTy> { + defm r132 : fma3p_rm; + defm r213 : fma3p_rm; + defm r231 : fma3p_rm; } // Fused Multiply-Add -defm VFMADDPS : fma_forms<0x98, 0xA8, 0xB8, "vfmadd", "ps">; -defm VFMADDPD : fma_forms<0x98, 0xA8, 0xB8, "vfmadd", "pd">, VEX_W; -defm VFMADDSUBPS : fma_forms<0x96, 0xA6, 0xB6, "vfmaddsub", 
"ps">; -defm VFMADDSUBPD : fma_forms<0x96, 0xA6, 0xB6, "vfmaddsub", "pd">, VEX_W; -defm VFMSUBADDPS : fma_forms<0x97, 0xA7, 0xB7, "vfmsubadd", "ps">; -defm VFMSUBADDPD : fma_forms<0x97, 0xA7, 0xB7, "vfmsubadd", "pd">, VEX_W; -defm VFMSUBPS : fma_forms<0x9A, 0xAA, 0xBA, "vfmsub", "ps">; -defm VFMSUBPD : fma_forms<0x9A, 0xAA, 0xBA, "vfmsub", "pd">, VEX_W; +let ExeDomain = SSEPackedSingle in { + defm VFMADDPS : fma3p_forms<0x98, 0xA8, 0xB8, "vfmadd", "ps">; + defm VFMSUBPS : fma3p_forms<0x9A, 0xAA, 0xBA, "vfmsub", "ps">; + defm VFMADDSUBPS : fma3p_forms<0x96, 0xA6, 0xB6, "vfmaddsub", "ps">; + defm VFMSUBADDPS : fma3p_forms<0x97, 0xA7, 0xB7, "vfmsubadd", "ps">; +} + +let ExeDomain = SSEPackedDouble in { + defm VFMADDPD : fma3p_forms<0x98, 0xA8, 0xB8, "vfmadd", "pd">, VEX_W; + defm VFMSUBPD : fma3p_forms<0x9A, 0xAA, 0xBA, "vfmsub", "pd">, VEX_W; + defm VFMADDSUBPD : fma3p_forms<0x96, 0xA6, 0xB6, "vfmaddsub", "pd">, VEX_W; + defm VFMSUBADDPD : fma3p_forms<0x97, 0xA7, 0xB7, "vfmsubadd", "pd">, VEX_W; +} // Fused Negative Multiply-Add -defm VFNMADDPS : fma_forms<0x9C, 0xAC, 0xBC, "vfnmadd", "ps">; -defm VFNMADDPD : fma_forms<0x9C, 0xAC, 0xBC, "vfnmadd", "pd">, VEX_W; -defm VFNMSUBPS : fma_forms<0x9E, 0xAE, 0xBE, "vfnmsub", "ps">; -defm VFNMSUBPD : fma_forms<0x9E, 0xAE, 0xBE, "vfnmsub", "pd">, VEX_W; +let ExeDomain = SSEPackedSingle in { + defm VFNMADDPS : fma3p_forms<0x9C, 0xAC, 0xBC, "vfnmadd", "ps">; + defm VFNMSUBPS : fma3p_forms<0x9E, 0xAE, 0xBE, "vfnmsub", "ps">; +} +let ExeDomain = SSEPackedDouble in { + defm VFNMADDPD : fma3p_forms<0x9C, 0xAC, 0xBC, "vfnmadd", "pd">, VEX_W; + defm VFNMSUBPD : fma3p_forms<0x9E, 0xAE, 0xBE, "vfnmsub", "pd">, VEX_W; +} + +multiclass fma3s_rm opc, string OpcodeStr, X86MemOperand x86memop> { + def r : FMA3; + def m : FMA3; +} + +multiclass fma3s_forms opc132, bits<8> opc213, bits<8> opc231, + string OpcodeStr> { + defm SSr132 : fma3s_rm; + defm SSr213 : fma3s_rm; + defm SSr231 : fma3s_rm; + defm SDr132 : fma3s_rm, VEX_W; + defm SDr213 : 
fma3s_rm, VEX_W; + defm SDr231 : fma3s_rm, VEX_W; +} + +defm VFMADD : fma3s_forms<0x99, 0xA9, 0xB9, "vfmadd">; +defm VFMSUB : fma3s_forms<0x9B, 0xAB, 0xBB, "vfmsub">; + +defm VFNMADD : fma3s_forms<0x9D, 0xAD, 0xBD, "vfnmadd">; +defm VFNMSUB : fma3s_forms<0x9F, 0xAF, 0xBF, "vfnmsub">; //===----------------------------------------------------------------------===// // FMA4 - AMD 4 operand Fused Multiply-Add instructions Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFormats.td?rev=147353&r1=147352&r2=147353&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFormats.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFormats.td Thu Dec 29 14:43:40 2011 @@ -504,7 +504,7 @@ // FMA3 Instruction Templates class FMA3 o, Format F, dag outs, dag ins, string asm, listpattern> - : I, T8, + : I, T8, OpSize, VEX_4V, Requires<[HasFMA3]>; // FMA4 Instruction Templates Modified: llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt?rev=147353&r1=147352&r2=147353&view=diff ============================================================================== --- llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt (original) +++ llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt Thu Dec 29 14:43:40 2011 @@ -671,3 +671,15 @@ # CHECK: vfmadd132pd (%rax), %ymm12, %ymm10 0xc4 0x62 0x9d 0x98 0x10 + +# CHECK: vfmadd132ss %xmm11, %xmm12, %xmm10 +0xc4 0x42 0x19 0x99 0xd3 + +# CHECK: vfmadd132sd %xmm11, %xmm12, %xmm10 +0xc4 0x42 0x99 0x99 0xd3 + +# CHECK: vfmadd132ss (%rax), %xmm12, %xmm10 +0xc4 0x62 0x19 0x99 0x10 + +# CHECK: vfmadd132sd (%rax), %xmm12, %xmm10 +0xc4 0x62 0x99 0x99 0x10 From rafael.espindola at gmail.com Thu Dec 29 15:09:08 2011 From: rafael.espindola at gmail.com (Rafael Espindola) Date: Thu, 29 Dec 2011 21:09:08 -0000 Subject: 
[llvm-commits] [llvm] r147354 - in /llvm/trunk: include/llvm/MC/MCDwarf.h lib/MC/MCDwarf.cpp lib/MC/MCStreamer.cpp Message-ID: <20111229210908.CD2422A6C12C@llvm.org> Author: rafael Date: Thu Dec 29 15:09:08 2011 New Revision: 147354 URL: http://llvm.org/viewvc/llvm-project?rev=147354&view=rev Log: Rename Remember and Restore to RememberState and RestoreState for consistency. Modified: llvm/trunk/include/llvm/MC/MCDwarf.h llvm/trunk/lib/MC/MCDwarf.cpp llvm/trunk/lib/MC/MCStreamer.cpp Modified: llvm/trunk/include/llvm/MC/MCDwarf.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCDwarf.h?rev=147354&r1=147353&r2=147354&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCDwarf.h (original) +++ llvm/trunk/include/llvm/MC/MCDwarf.h Thu Dec 29 15:09:08 2011 @@ -271,7 +271,7 @@ class MCCFIInstruction { public: - enum OpType { SameValue, Remember, Restore, Move, RelMove, Escape }; + enum OpType { SameValue, RememberState, RestoreState, Move, RelMove, Escape }; private: OpType Operation; MCSymbol *Label; @@ -282,7 +282,7 @@ public: MCCFIInstruction(OpType Op, MCSymbol *L) : Operation(Op), Label(L) { - assert(Op == Remember || Op == Restore); + assert(Op == RememberState || Op == RestoreState); } MCCFIInstruction(OpType Op, MCSymbol *L, unsigned Register) : Operation(Op), Label(L), Destination(Register) { Modified: llvm/trunk/lib/MC/MCDwarf.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCDwarf.cpp?rev=147354&r1=147353&r2=147354&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCDwarf.cpp (original) +++ llvm/trunk/lib/MC/MCDwarf.cpp Thu Dec 29 15:09:08 2011 @@ -971,11 +971,11 @@ } return; } - case MCCFIInstruction::Remember: + case MCCFIInstruction::RememberState: if (VerboseAsm) Streamer.AddComment("DW_CFA_remember_state"); Streamer.EmitIntValue(dwarf::DW_CFA_remember_state, 1); return; - case 
MCCFIInstruction::Restore: + case MCCFIInstruction::RestoreState: if (VerboseAsm) Streamer.AddComment("DW_CFA_restore_state"); Streamer.EmitIntValue(dwarf::DW_CFA_restore_state, 1); return; Modified: llvm/trunk/lib/MC/MCStreamer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCStreamer.cpp?rev=147354&r1=147353&r2=147354&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCStreamer.cpp (original) +++ llvm/trunk/lib/MC/MCStreamer.cpp Thu Dec 29 15:09:08 2011 @@ -385,7 +385,7 @@ MCDwarfFrameInfo *CurFrame = getCurrentFrameInfo(); MCSymbol *Label = getContext().CreateTempSymbol(); EmitLabel(Label); - MCCFIInstruction Instruction(MCCFIInstruction::Remember, Label); + MCCFIInstruction Instruction(MCCFIInstruction::RememberState, Label); CurFrame->Instructions.push_back(Instruction); } @@ -395,7 +395,7 @@ MCDwarfFrameInfo *CurFrame = getCurrentFrameInfo(); MCSymbol *Label = getContext().CreateTempSymbol(); EmitLabel(Label); - MCCFIInstruction Instruction(MCCFIInstruction::Restore, Label); + MCCFIInstruction Instruction(MCCFIInstruction::RestoreState, Label); CurFrame->Instructions.push_back(Instruction); } From rafael.espindola at gmail.com Thu Dec 29 15:43:03 2011 From: rafael.espindola at gmail.com (Rafael Espindola) Date: Thu, 29 Dec 2011 21:43:03 -0000 Subject: [llvm-commits] [llvm] r147356 - in /llvm/trunk: include/llvm/MC/MCDwarf.h include/llvm/MC/MCStreamer.h lib/MC/MCDwarf.cpp lib/MC/MCParser/AsmParser.cpp lib/MC/MCStreamer.cpp test/MC/ELF/cfi-restore.s Message-ID: <20111229214303.9F6F32A6C12C@llvm.org> Author: rafael Date: Thu Dec 29 15:43:03 2011 New Revision: 147356 URL: http://llvm.org/viewvc/llvm-project?rev=147356&view=rev Log: Implement cfi_restore. Patch by Brian Anderson! 
Added: llvm/trunk/test/MC/ELF/cfi-restore.s Modified: llvm/trunk/include/llvm/MC/MCDwarf.h llvm/trunk/include/llvm/MC/MCStreamer.h llvm/trunk/lib/MC/MCDwarf.cpp llvm/trunk/lib/MC/MCParser/AsmParser.cpp llvm/trunk/lib/MC/MCStreamer.cpp Modified: llvm/trunk/include/llvm/MC/MCDwarf.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCDwarf.h?rev=147356&r1=147355&r2=147356&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCDwarf.h (original) +++ llvm/trunk/include/llvm/MC/MCDwarf.h Thu Dec 29 15:43:03 2011 @@ -271,7 +271,8 @@ class MCCFIInstruction { public: - enum OpType { SameValue, RememberState, RestoreState, Move, RelMove, Escape }; + enum OpType { SameValue, RememberState, RestoreState, Move, RelMove, Escape, + Restore}; private: OpType Operation; MCSymbol *Label; @@ -286,7 +287,7 @@ } MCCFIInstruction(OpType Op, MCSymbol *L, unsigned Register) : Operation(Op), Label(L), Destination(Register) { - assert(Op == SameValue); + assert(Op == SameValue || Op == Restore); } MCCFIInstruction(MCSymbol *L, const MachineLocation &D, const MachineLocation &S) Modified: llvm/trunk/include/llvm/MC/MCStreamer.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCStreamer.h?rev=147356&r1=147355&r2=147356&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCStreamer.h (original) +++ llvm/trunk/include/llvm/MC/MCStreamer.h Thu Dec 29 15:43:03 2011 @@ -547,6 +547,7 @@ virtual void EmitCFIRememberState(); virtual void EmitCFIRestoreState(); virtual void EmitCFISameValue(int64_t Register); + virtual void EmitCFIRestore(int64_t Register); virtual void EmitCFIRelOffset(int64_t Register, int64_t Offset); virtual void EmitCFIAdjustCfaOffset(int64_t Adjustment); virtual void EmitCFIEscape(StringRef Values); Modified: llvm/trunk/lib/MC/MCDwarf.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCDwarf.cpp?rev=147356&r1=147355&r2=147356&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCDwarf.cpp (original) +++ llvm/trunk/lib/MC/MCDwarf.cpp Thu Dec 29 15:43:03 2011 @@ -987,6 +987,15 @@ Streamer.EmitULEB128IntValue(Reg); return; } + case MCCFIInstruction::Restore: { + unsigned Reg = Instr.getDestination().getReg(); + if (VerboseAsm) { + Streamer.AddComment("DW_CFA_restore"); + Streamer.AddComment(Twine("Reg ") + Twine(Reg)); + } + Streamer.EmitIntValue(dwarf::DW_CFA_restore | Reg, 1); + return; + } case MCCFIInstruction::Escape: if (VerboseAsm) Streamer.AddComment("Escape bytes"); Streamer.EmitBytes(Instr.getValues(), 0); Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCParser/AsmParser.cpp?rev=147356&r1=147355&r2=147356&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCParser/AsmParser.cpp (original) +++ llvm/trunk/lib/MC/MCParser/AsmParser.cpp Thu Dec 29 15:43:03 2011 @@ -302,6 +302,8 @@ AddDirectiveHandler< &GenericAsmParser::ParseDirectiveCFISameValue>(".cfi_same_value"); AddDirectiveHandler< + &GenericAsmParser::ParseDirectiveCFIRestore>(".cfi_restore"); + AddDirectiveHandler< &GenericAsmParser::ParseDirectiveCFIEscape>(".cfi_escape"); // Macro directives. 
@@ -336,6 +338,7 @@ bool ParseDirectiveCFIRememberState(StringRef, SMLoc DirectiveLoc); bool ParseDirectiveCFIRestoreState(StringRef, SMLoc DirectiveLoc); bool ParseDirectiveCFISameValue(StringRef, SMLoc DirectiveLoc); + bool ParseDirectiveCFIRestore(StringRef, SMLoc DirectiveLoc); bool ParseDirectiveCFIEscape(StringRef, SMLoc DirectiveLoc); bool ParseDirectiveMacrosOnOff(StringRef, SMLoc DirectiveLoc); @@ -2815,6 +2818,19 @@ return false; } +/// ParseDirectiveCFIRestore +/// ::= .cfi_restore register +bool GenericAsmParser::ParseDirectiveCFIRestore(StringRef IDVal, + SMLoc DirectiveLoc) { + int64_t Register = 0; + if (ParseRegisterOrRegisterNumber(Register, DirectiveLoc)) + return true; + + getStreamer().EmitCFIRestore(Register); + + return false; +} + /// ParseDirectiveCFIEscape /// ::= .cfi_escape expression[,...] bool GenericAsmParser::ParseDirectiveCFIEscape(StringRef IDVal, Modified: llvm/trunk/lib/MC/MCStreamer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCStreamer.cpp?rev=147356&r1=147355&r2=147356&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCStreamer.cpp (original) +++ llvm/trunk/lib/MC/MCStreamer.cpp Thu Dec 29 15:43:03 2011 @@ -408,6 +408,15 @@ CurFrame->Instructions.push_back(Instruction); } +void MCStreamer::EmitCFIRestore(int64_t Register) { + EnsureValidFrame(); + MCDwarfFrameInfo *CurFrame = getCurrentFrameInfo(); + MCSymbol *Label = getContext().CreateTempSymbol(); + EmitLabel(Label); + MCCFIInstruction Instruction(MCCFIInstruction::Restore, Label, Register); + CurFrame->Instructions.push_back(Instruction); +} + void MCStreamer::EmitCFIEscape(StringRef Values) { EnsureValidFrame(); MCDwarfFrameInfo *CurFrame = getCurrentFrameInfo(); Added: llvm/trunk/test/MC/ELF/cfi-restore.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ELF/cfi-restore.s?rev=147356&view=auto ============================================================================== --- 
llvm/trunk/test/MC/ELF/cfi-restore.s (added) +++ llvm/trunk/test/MC/ELF/cfi-restore.s Thu Dec 29 15:43:03 2011 @@ -0,0 +1,42 @@ +// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s -o - | elf-dump --dump-section-data | FileCheck %s + +f: + .cfi_startproc + nop + .cfi_restore %rbp + nop + .cfi_endproc + +// CHECK: # Section 4 +// CHECK-NEXT: (('sh_name', 0x00000011) # '.eh_frame' +// CHECK-NEXT: ('sh_type', 0x00000001) +// CHECK-NEXT: ('sh_flags', 0x0000000000000002) +// CHECK-NEXT: ('sh_addr', 0x0000000000000000) +// CHECK-NEXT: ('sh_offset', 0x0000000000000048) +// CHECK-NEXT: ('sh_size', 0x0000000000000030) +// CHECK-NEXT: ('sh_link', 0x00000000) +// CHECK-NEXT: ('sh_info', 0x00000000) +// CHECK-NEXT: ('sh_addralign', 0x0000000000000008) +// CHECK-NEXT: ('sh_entsize', 0x0000000000000000) +// CHECK-NEXT: ('_section_data', '14000000 00000000 017a5200 01781001 1b0c0708 90010000 14000000 1c000000 00000000 02000000 0041c600 00000000') +// CHECK-NEXT: ), +// CHECK-NEXT: # Section 5 +// CHECK-NEXT: (('sh_name', 0x0000000c) # '.rela.eh_frame' +// CHECK-NEXT: ('sh_type', 0x00000004) +// CHECK-NEXT: ('sh_flags', 0x0000000000000000) +// CHECK-NEXT: ('sh_addr', 0x0000000000000000) +// CHECK-NEXT: ('sh_offset', 0x0000000000000390) +// CHECK-NEXT: ('sh_size', 0x0000000000000018) +// CHECK-NEXT: ('sh_link', 0x00000007) +// CHECK-NEXT: ('sh_info', 0x00000004) +// CHECK-NEXT: ('sh_addralign', 0x0000000000000008) +// CHECK-NEXT: ('sh_entsize', 0x0000000000000018) +// CHECK-NEXT: ('_relocations', [ +// CHECK-NEXT: # Relocation 0 +// CHECK-NEXT: (('r_offset', 0x0000000000000020) +// CHECK-NEXT: ('r_sym', 0x00000002) +// CHECK-NEXT: ('r_type', 0x00000002) +// CHECK-NEXT: ('r_addend', 0x0000000000000000) +// CHECK-NEXT: ), +// CHECK-NEXT: ]) +// CHECK-NEXT: ), From hfinkel at anl.gov Thu Dec 29 18:34:00 2011 From: hfinkel at anl.gov (Hal Finkel) Date: Fri, 30 Dec 2011 00:34:00 -0000 Subject: [llvm-commits] [llvm] r147359 - in /llvm/trunk: 
lib/Target/PowerPC/PPCFrameLowering.cpp lib/Target/PowerPC/PPCRegisterInfo.cpp test/CodeGen/PowerPC/2010-02-12-saveCR.ll test/CodeGen/PowerPC/Frames-large.ll Message-ID: <20111230003401.06D3A2A6C12C@llvm.org> Author: hfinkel Date: Thu Dec 29 18:34:00 2011 New Revision: 147359 URL: http://llvm.org/viewvc/llvm-project?rev=147359&view=rev Log: Cleanup stack/frame register define/kill states. This fixes two bugs: 1. The ST*UX instructions that store and update the stack pointer did not set define/kill on R1. This became a problem when I activated post-RA scheduling (and had incorrectly adjusted the Frames-large test). 2. eliminateFrameIndex did not kill its scavenged temporary register, and this could cause the scavenger to exhaust all available registers (and its emergency spill slot) when there were a lot of CR values to spill. The 2010-02-12-saveCR test has been adjusted to check for this. Modified: llvm/trunk/lib/Target/PowerPC/PPCFrameLowering.cpp llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp llvm/trunk/test/CodeGen/PowerPC/2010-02-12-saveCR.ll llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll Modified: llvm/trunk/lib/Target/PowerPC/PPCFrameLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCFrameLowering.cpp?rev=147359&r1=147358&r2=147359&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCFrameLowering.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCFrameLowering.cpp Thu Dec 29 18:34:00 2011 @@ -367,8 +367,8 @@ .addReg(PPC::R0, RegState::Kill) .addImm(NegFrameSize); BuildMI(MBB, MBBI, dl, TII.get(PPC::STWUX)) - .addReg(PPC::R1) - .addReg(PPC::R1) + .addReg(PPC::R1, RegState::Kill) + .addReg(PPC::R1, RegState::Define) .addReg(PPC::R0); } else if (isInt<16>(NegFrameSize)) { BuildMI(MBB, MBBI, dl, TII.get(PPC::STWU), PPC::R1) @@ -382,8 +382,8 @@ .addReg(PPC::R0, RegState::Kill) .addImm(NegFrameSize & 0xFFFF); BuildMI(MBB, MBBI, dl, TII.get(PPC::STWUX)) 
- .addReg(PPC::R1) - .addReg(PPC::R1) + .addReg(PPC::R1, RegState::Kill) + .addReg(PPC::R1, RegState::Define) .addReg(PPC::R0); } } else { // PPC64. @@ -400,8 +400,8 @@ .addReg(PPC::X0) .addImm(NegFrameSize); BuildMI(MBB, MBBI, dl, TII.get(PPC::STDUX)) - .addReg(PPC::X1) - .addReg(PPC::X1) + .addReg(PPC::X1, RegState::Kill) + .addReg(PPC::X1, RegState::Define) .addReg(PPC::X0); } else if (isInt<16>(NegFrameSize)) { BuildMI(MBB, MBBI, dl, TII.get(PPC::STDU), PPC::X1) @@ -415,8 +415,8 @@ .addReg(PPC::X0, RegState::Kill) .addImm(NegFrameSize & 0xFFFF); BuildMI(MBB, MBBI, dl, TII.get(PPC::STDUX)) - .addReg(PPC::X1) - .addReg(PPC::X1) + .addReg(PPC::X1, RegState::Kill) + .addReg(PPC::X1, RegState::Define) .addReg(PPC::X0); } } Modified: llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp?rev=147359&r1=147358&r2=147359&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp Thu Dec 29 18:34:00 2011 @@ -299,8 +299,9 @@ DebugLoc dl = MI->getDebugLoc(); if (isInt<16>(CalleeAmt)) { - BuildMI(MBB, I, dl, TII.get(ADDIInstr), StackReg).addReg(StackReg). - addImm(CalleeAmt); + BuildMI(MBB, I, dl, TII.get(ADDIInstr), StackReg) + .addReg(StackReg, RegState::Kill) + .addImm(CalleeAmt); } else { MachineBasicBlock::iterator MBBI = I; BuildMI(MBB, MBBI, dl, TII.get(LISInstr), TmpReg) @@ -308,9 +309,8 @@ BuildMI(MBB, MBBI, dl, TII.get(ORIInstr), TmpReg) .addReg(TmpReg, RegState::Kill) .addImm(CalleeAmt & 0xFFFF); - BuildMI(MBB, MBBI, dl, TII.get(ADDInstr)) - .addReg(StackReg) - .addReg(StackReg) + BuildMI(MBB, MBBI, dl, TII.get(ADDInstr), StackReg) + .addReg(StackReg, RegState::Kill) .addReg(TmpReg); } } @@ -407,12 +407,12 @@ if (requiresRegisterScavenging(MF)) // FIXME (64-bit): Use "true" part. 
BuildMI(MBB, II, dl, TII.get(PPC::STDUX)) .addReg(Reg, RegState::Kill) - .addReg(PPC::X1) + .addReg(PPC::X1, RegState::Define) .addReg(MI.getOperand(1).getReg()); else BuildMI(MBB, II, dl, TII.get(PPC::STDUX)) .addReg(PPC::X0, RegState::Kill) - .addReg(PPC::X1) + .addReg(PPC::X1, RegState::Define) .addReg(MI.getOperand(1).getReg()); if (!MI.getOperand(1).isKill()) @@ -428,7 +428,7 @@ } else { BuildMI(MBB, II, dl, TII.get(PPC::STWUX)) .addReg(Reg, RegState::Kill) - .addReg(PPC::R1) + .addReg(PPC::R1, RegState::Define) .addReg(MI.getOperand(1).getReg()); if (!MI.getOperand(1).isKill()) @@ -681,7 +681,7 @@ unsigned StackReg = MI.getOperand(FIOperandNo).getReg(); MI.getOperand(OperandBase).ChangeToRegister(StackReg, false); - MI.getOperand(OperandBase + 1).ChangeToRegister(SReg, false); + MI.getOperand(OperandBase + 1).ChangeToRegister(SReg, false, false, true); } unsigned PPCRegisterInfo::getFrameRegister(const MachineFunction &MF) const { Modified: llvm/trunk/test/CodeGen/PowerPC/2010-02-12-saveCR.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/2010-02-12-saveCR.ll?rev=147359&r1=147358&r2=147359&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/PowerPC/2010-02-12-saveCR.ll (original) +++ llvm/trunk/test/CodeGen/PowerPC/2010-02-12-saveCR.ll Thu Dec 29 18:34:00 2011 @@ -6,16 +6,22 @@ define void @foo() nounwind { entry: +;CHECK: mfcr r2 ;CHECK: lis r3, 1 +;CHECK: rlwinm r2, r2, 8, 0, 31 ;CHECK: ori r3, r3, 34524 +;CHECK: stwx r2, r1, r3 +; Make sure that the register scavenger returns the same temporary register. 
;CHECK: mfcr r2 -;CHECK: rlwinm r2, r2, 8, 0, 31 +;CHECK: lis r3, 1 +;CHECK: rlwinm r2, r2, 12, 0, 31 +;CHECK: ori r3, r3, 34520 ;CHECK: stwx r2, r1, r3 %x = alloca [100000 x i8] ; <[100000 x i8]*> [#uses=1] %"alloca point" = bitcast i32 0 to i32 ; [#uses=0] %x1 = bitcast [100000 x i8]* %x to i8* ; [#uses=1] call void @bar(i8* %x1) nounwind - call void asm sideeffect "", "~{cr2}"() nounwind + call void asm sideeffect "", "~{cr2},~{cr3}"() nounwind br label %return return: ; preds = %entry Modified: llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll?rev=147359&r1=147358&r2=147359&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll (original) +++ llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll Thu Dec 29 18:34:00 2011 @@ -15,9 +15,9 @@ ; PPC32-NOFP: _f1: ; PPC32-NOFP: lis r0, -1 -; PPC32-NOFP: addi r3, r1, 68 ; PPC32-NOFP: ori r0, r0, 32704 ; PPC32-NOFP: stwux r1, r1, r0 +; PPC32-NOFP: addi r3, r1, 68 ; PPC32-NOFP: lwz r1, 0(r1) ; PPC32-NOFP: blr @@ -25,10 +25,10 @@ ; PPC32-FP: _f1: ; PPC32-FP: lis r0, -1 ; PPC32-FP: stw r31, -4(r1) -; PPC32-FP: mr r31, r1 ; PPC32-FP: ori r0, r0, 32704 -; PPC32-FP: addi r3, r31, 64 ; PPC32-FP: stwux r1, r1, r0 +; PPC32-FP: mr r31, r1 +; PPC32-FP: addi r3, r31, 64 ; PPC32-FP: lwz r1, 0(r1) ; PPC32-FP: lwz r31, -4(r1) ; PPC32-FP: blr @@ -36,9 +36,9 @@ ; PPC64-NOFP: _f1: ; PPC64-NOFP: lis r0, -1 -; PPC64-NOFP: addi r3, r1, 116 ; PPC64-NOFP: ori r0, r0, 32656 ; PPC64-NOFP: stdux r1, r1, r0 +; PPC64-NOFP: addi r3, r1, 116 ; PPC64-NOFP: ld r1, 0(r1) ; PPC64-NOFP: blr @@ -46,10 +46,10 @@ ; PPC64-FP: _f1: ; PPC64-FP: lis r0, -1 ; PPC64-FP: std r31, -8(r1) -; PPC64-FP: mr r31, r1 ; PPC64-FP: ori r0, r0, 32640 -; PPC64-FP: addi r3, r31, 124 ; PPC64-FP: stdux r1, r1, r0 +; PPC64-FP: mr r31, r1 +; PPC64-FP: addi r3, r31, 124 ; PPC64-FP: ld r1, 0(r1) ; PPC64-FP: ld r31, 
-8(r1) ; PPC64-FP: blr From craig.topper at gmail.com Thu Dec 29 19:49:53 2011 From: craig.topper at gmail.com (Craig Topper) Date: Fri, 30 Dec 2011 01:49:53 -0000 Subject: [llvm-commits] [llvm] r147360 - in /llvm/trunk: lib/Target/X86/X86InstrFMA.td test/CodeGen/X86/fma4-intrinsics-x86_64.ll Message-ID: <20111230014953.CDC732A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 19:49:53 2011 New Revision: 147360 URL: http://llvm.org/viewvc/llvm-project?rev=147360&view=rev Log: Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere. Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td llvm/trunk/test/CodeGen/X86/fma4-intrinsics-x86_64.ll Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=147360&r1=147359&r2=147360&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Thu Dec 29 19:49:53 2011 @@ -98,23 +98,22 @@ //===----------------------------------------------------------------------===// -multiclass fma4s opc, string OpcodeStr> { +multiclass fma4s opc, string OpcodeStr, Operand memop> { def rr : FMA4, XOP_W; def rm : FMA4, XOP_W; def mr : FMA4; - } multiclass fma4p opc, string OpcodeStr> { @@ -151,20 +150,20 @@ } let isAsmParserOnly = 1 in { - defm VFMADDSS4 : fma4s<0x6A, "vfmaddss">; - defm VFMADDSD4 : fma4s<0x6B, "vfmaddsd">; + defm VFMADDSS4 : fma4s<0x6A, "vfmaddss", ssmem>; + defm VFMADDSD4 : fma4s<0x6B, "vfmaddsd", sdmem>; defm VFMADDPS4 : fma4p<0x68, "vfmaddps">; defm VFMADDPD4 : fma4p<0x69, "vfmaddpd">; - defm VFMSUBSS4 : fma4s<0x6E, "vfmsubss">; - defm VFMSUBSD4 : fma4s<0x6F, "vfmsubsd">; + defm VFMSUBSS4 : fma4s<0x6E, "vfmsubss", ssmem>; + defm VFMSUBSD4 : fma4s<0x6F, "vfmsubsd", sdmem>; defm 
VFMSUBPS4 : fma4p<0x6C, "vfmsubps">; defm VFMSUBPD4 : fma4p<0x6D, "vfmsubpd">; - defm VFNMADDSS4 : fma4s<0x7A, "vfnmaddss">; - defm VFNMADDSD4 : fma4s<0x7B, "vfnmaddsd">; + defm VFNMADDSS4 : fma4s<0x7A, "vfnmaddss", ssmem>; + defm VFNMADDSD4 : fma4s<0x7B, "vfnmaddsd", sdmem>; defm VFNMADDPS4 : fma4p<0x78, "vfnmaddps">; defm VFNMADDPD4 : fma4p<0x79, "vfnmaddpd">; - defm VFNMSUBSS4 : fma4s<0x7E, "vfnmsubss">; - defm VFNMSUBSD4 : fma4s<0x7F, "vfnmsubsd">; + defm VFNMSUBSS4 : fma4s<0x7E, "vfnmsubss", ssmem>; + defm VFNMSUBSD4 : fma4s<0x7F, "vfnmsubsd", sdmem>; defm VFNMSUBPS4 : fma4p<0x7C, "vfnmsubps">; defm VFNMSUBPD4 : fma4p<0x7D, "vfnmsubpd">; defm VFMADDSUBPS4 : fma4p<0x5C, "vfmaddsubps">; @@ -178,21 +177,17 @@ // VFMADD def : Pat<(int_x86_fma4_vfmadd_ss VR128:$src1, VR128:$src2, VR128:$src3), (VFMADDSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ss VR128:$src1, VR128:$src2, - (alignedloadv4f32 addr:$src3)), - (VFMADDSS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ss VR128:$src1, (alignedloadv4f32 addr:$src2), - VR128:$src3), - (VFMADDSS4mr VR128:$src1, addr:$src2, VR128:$src3)>; +def : Pat<(int_x86_fma4_vfmadd_ss VR128:$src1, VR128:$src2, sse_load_f32:$src3), + (VFMADDSS4rm VR128:$src1, VR128:$src2, sse_load_f32:$src3)>; +def : Pat<(int_x86_fma4_vfmadd_ss VR128:$src1, sse_load_f32:$src2, VR128:$src3), + (VFMADDSS4mr VR128:$src1, sse_load_f32:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmadd_sd VR128:$src1, VR128:$src2, VR128:$src3), (VFMADDSD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_sd VR128:$src1, VR128:$src2, - (alignedloadv2f64 addr:$src3)), - (VFMADDSD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_sd VR128:$src1, (alignedloadv2f64 addr:$src2), - VR128:$src3), - (VFMADDSD4mr VR128:$src1, addr:$src2, VR128:$src3)>; +def : Pat<(int_x86_fma4_vfmadd_sd VR128:$src1, VR128:$src2, sse_load_f64:$src3), + (VFMADDSD4rm VR128:$src1, VR128:$src2, 
sse_load_f64:$src3)>; +def : Pat<(int_x86_fma4_vfmadd_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), + (VFMADDSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmadd_ps VR128:$src1, VR128:$src2, VR128:$src3), (VFMADDPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; @@ -235,21 +230,17 @@ // VFMSUB def : Pat<(int_x86_fma4_vfmsub_ss VR128:$src1, VR128:$src2, VR128:$src3), (VFMSUBSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ss VR128:$src1, VR128:$src2, - (alignedloadv4f32 addr:$src3)), - (VFMSUBSS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ss VR128:$src1, (alignedloadv4f32 addr:$src2), - VR128:$src3), - (VFMSUBSS4mr VR128:$src1, addr:$src2, VR128:$src3)>; +def : Pat<(int_x86_fma4_vfmsub_ss VR128:$src1, VR128:$src2, sse_load_f32:$src3), + (VFMSUBSS4rm VR128:$src1, VR128:$src2, sse_load_f32:$src3)>; +def : Pat<(int_x86_fma4_vfmsub_ss VR128:$src1, sse_load_f32:$src2, VR128:$src3), + (VFMSUBSS4mr VR128:$src1, sse_load_f32:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmsub_sd VR128:$src1, VR128:$src2, VR128:$src3), (VFMSUBSD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_sd VR128:$src1, VR128:$src2, - (alignedloadv2f64 addr:$src3)), - (VFMSUBSD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_sd VR128:$src1, (alignedloadv2f64 addr:$src2), - VR128:$src3), - (VFMSUBSD4mr VR128:$src1, addr:$src2, VR128:$src3)>; +def : Pat<(int_x86_fma4_vfmsub_sd VR128:$src1, VR128:$src2, sse_load_f64:$src3), + (VFMSUBSD4rm VR128:$src1, VR128:$src2, sse_load_f64:$src3)>; +def : Pat<(int_x86_fma4_vfmsub_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), + (VFMSUBSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmsub_ps VR128:$src1, VR128:$src2, VR128:$src3), (VFMSUBPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; @@ -292,21 +283,17 @@ // VFNMADD def : Pat<(int_x86_fma4_vfnmadd_ss VR128:$src1, VR128:$src2, VR128:$src3), 
(VFNMADDSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ss VR128:$src1, VR128:$src2, - (alignedloadv4f32 addr:$src3)), - (VFNMADDSS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ss VR128:$src1, (alignedloadv4f32 addr:$src2), - VR128:$src3), - (VFNMADDSS4mr VR128:$src1, addr:$src2, VR128:$src3)>; +def : Pat<(int_x86_fma4_vfnmadd_ss VR128:$src1, VR128:$src2, sse_load_f32:$src3), + (VFNMADDSS4rm VR128:$src1, VR128:$src2, sse_load_f32:$src3)>; +def : Pat<(int_x86_fma4_vfnmadd_ss VR128:$src1, sse_load_f32:$src2, VR128:$src3), + (VFNMADDSS4mr VR128:$src1, sse_load_f32:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmadd_sd VR128:$src1, VR128:$src2, VR128:$src3), (VFNMADDSD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_sd VR128:$src1, VR128:$src2, - (alignedloadv2f64 addr:$src3)), - (VFNMADDSD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_sd VR128:$src1, (alignedloadv2f64 addr:$src2), - VR128:$src3), - (VFNMADDSD4mr VR128:$src1, addr:$src2, VR128:$src3)>; +def : Pat<(int_x86_fma4_vfnmadd_sd VR128:$src1, VR128:$src2, sse_load_f64:$src3), + (VFNMADDSD4rm VR128:$src1, VR128:$src2, sse_load_f64:$src3)>; +def : Pat<(int_x86_fma4_vfnmadd_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), + (VFNMADDSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmadd_ps VR128:$src1, VR128:$src2, VR128:$src3), (VFNMADDPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; @@ -349,21 +336,17 @@ // VFNMSUB def : Pat<(int_x86_fma4_vfnmsub_ss VR128:$src1, VR128:$src2, VR128:$src3), (VFNMSUBSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_ss VR128:$src1, VR128:$src2, - (alignedloadv4f32 addr:$src3)), - (VFNMSUBSS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_ss VR128:$src1, (alignedloadv4f32 addr:$src2), - VR128:$src3), - (VFNMSUBSS4mr VR128:$src1, addr:$src2, VR128:$src3)>; +def : 
Pat<(int_x86_fma4_vfnmsub_ss VR128:$src1, VR128:$src2, sse_load_f32:$src3), + (VFNMSUBSS4rm VR128:$src1, VR128:$src2, sse_load_f32:$src3)>; +def : Pat<(int_x86_fma4_vfnmsub_ss VR128:$src1, sse_load_f32:$src2, VR128:$src3), + (VFNMSUBSS4mr VR128:$src1, sse_load_f32:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_sd VR128:$src1, VR128:$src2, VR128:$src3), (VFNMSUBSD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_sd VR128:$src1, VR128:$src2, - (alignedloadv2f64 addr:$src3)), - (VFNMSUBSD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_sd VR128:$src1, (alignedloadv2f64 addr:$src2), - VR128:$src3), - (VFNMSUBSD4mr VR128:$src1, addr:$src2, VR128:$src3)>; +def : Pat<(int_x86_fma4_vfnmsub_sd VR128:$src1, VR128:$src2, sse_load_f64:$src3), + (VFNMSUBSD4rm VR128:$src1, VR128:$src2, sse_load_f64:$src3)>; +def : Pat<(int_x86_fma4_vfnmsub_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), + (VFNMSUBSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_ps VR128:$src1, VR128:$src2, VR128:$src3), (VFNMSUBPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; Modified: llvm/trunk/test/CodeGen/X86/fma4-intrinsics-x86_64.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fma4-intrinsics-x86_64.ll?rev=147360&r1=147359&r2=147360&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/fma4-intrinsics-x86_64.ll (original) +++ llvm/trunk/test/CodeGen/X86/fma4-intrinsics-x86_64.ll Thu Dec 29 19:49:53 2011 @@ -6,6 +6,20 @@ %res = call < 4 x float > @llvm.x86.fma4.vfmadd.ss(< 4 x float > %a0, < 4 x float > %a1, < 4 x float > %a2) ; [#uses=1] ret < 4 x float > %res } +define < 4 x float > @test_x86_fma4_vfmadd_ss_load(< 4 x float > %a0, < 4 x float > %a1, float* %a2) { + ; CHECK: vfmaddss (%{{.*}}) + %x = load float *%a2 + %y = insertelement <4 x float> undef, float %x, i32 0 + %res = call < 4 x float > 
@llvm.x86.fma4.vfmadd.ss(< 4 x float > %a0, < 4 x float > %a1, < 4 x float > %y) ; [#uses=1] + ret < 4 x float > %res +} +define < 4 x float > @test_x86_fma4_vfmadd_ss_load2(< 4 x float > %a0, float* %a1, < 4 x float > %a2) { + ; CHECK: vfmaddss %{{.*}}, (%{{.*}}) + %x = load float *%a1 + %y = insertelement <4 x float> undef, float %x, i32 0 + %res = call < 4 x float > @llvm.x86.fma4.vfmadd.ss(< 4 x float > %a0, < 4 x float > %y, < 4 x float > %a2) ; [#uses=1] + ret < 4 x float > %res +} declare < 4 x float > @llvm.x86.fma4.vfmadd.ss(< 4 x float >, < 4 x float >, < 4 x float >) nounwind readnone define < 2 x double > @test_x86_fma4_vfmadd_sd(< 2 x double > %a0, < 2 x double > %a1, < 2 x double > %a2) { @@ -13,6 +27,20 @@ %res = call < 2 x double > @llvm.x86.fma4.vfmadd.sd(< 2 x double > %a0, < 2 x double > %a1, < 2 x double > %a2) ; [#uses=1] ret < 2 x double > %res } +define < 2 x double > @test_x86_fma4_vfmadd_sd_load(< 2 x double > %a0, < 2 x double > %a1, double* %a2) { + ; CHECK: vfmaddsd (%{{.*}}) + %x = load double *%a2 + %y = insertelement <2 x double> undef, double %x, i32 0 + %res = call < 2 x double > @llvm.x86.fma4.vfmadd.sd(< 2 x double > %a0, < 2 x double > %a1, < 2 x double > %y) ; [#uses=1] + ret < 2 x double > %res +} +define < 2 x double > @test_x86_fma4_vfmadd_sd_load2(< 2 x double > %a0, double* %a1, < 2 x double > %a2) { + ; CHECK: vfmaddsd %{{.*}}, (%{{.*}}) + %x = load double *%a1 + %y = insertelement <2 x double> undef, double %x, i32 0 + %res = call < 2 x double > @llvm.x86.fma4.vfmadd.sd(< 2 x double > %a0, < 2 x double > %y, < 2 x double > %a2) ; [#uses=1] + ret < 2 x double > %res +} declare < 2 x double > @llvm.x86.fma4.vfmadd.sd(< 2 x double >, < 2 x double >, < 2 x double >) nounwind readnone define < 4 x float > @test_x86_fma4_vfmadd_ps(< 4 x float > %a0, < 4 x float > %a1, < 4 x float > %a2) { From rafael.espindola at gmail.com Thu Dec 29 20:09:55 2011 From: rafael.espindola at gmail.com 
(Rafael Ávila de Espíndola)
Date: Thu, 29 Dec 2011 21:09:55 -0500
Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc
In-Reply-To: <1325190024.13080.2848.camel@sapling>
References: <1325190024.13080.2848.camel@sapling>
Message-ID: <4EFD1D73.5070407@gmail.com>

On 29/12/11 03:20 PM, Hal Finkel wrote:
> This small patch fixes a compatibility problem between the ppc linux asm
> printer and recent versions of binutils (as explained in the test case).
> With this patch, llvm's behavior will again match that of gcc. It
> touches CodeGen/AsmPrinter; please review.

Do you know when this was changed in binutils? Maybe we should always use the local symbol instead of making this a ppc only change.

> Thanks again,
> Hal

Cheers,
Rafael

From craig.topper at gmail.com Thu Dec 29 20:18:37 2011
From: craig.topper at gmail.com (Craig Topper)
Date: Fri, 30 Dec 2011 02:18:37 -0000
Subject: [llvm-commits] [llvm] r147361 - in /llvm/trunk: lib/Target/X86/X86InstrFMA.td test/CodeGen/X86/fma4-intrinsics-x86_64.ll
Message-ID: <20111230021837.12B072A6C12C@llvm.org>

Author: ctopper
Date: Thu Dec 29 20:18:36 2011
New Revision: 147361

URL: http://llvm.org/viewvc/llvm-project?rev=147361&view=rev
Log:
Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms.
Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td llvm/trunk/test/CodeGen/X86/fma4-intrinsics-x86_64.ll Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=147361&r1=147360&r2=147361&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Thu Dec 29 20:18:36 2011 @@ -192,38 +192,36 @@ def : Pat<(int_x86_fma4_vfmadd_ps VR128:$src1, VR128:$src2, VR128:$src3), (VFMADDPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmadd_ps VR128:$src1, VR128:$src2, - (alignedloadv4f32 addr:$src3)), + (memopv4f32 addr:$src3)), (VFMADDPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ps VR128:$src1, (alignedloadv4f32 addr:$src2), +def : Pat<(int_x86_fma4_vfmadd_ps VR128:$src1, (memopv4f32 addr:$src2), VR128:$src3), (VFMADDPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmadd_pd VR128:$src1, VR128:$src2, VR128:$src3), (VFMADDPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmadd_pd VR128:$src1, VR128:$src2, - (alignedloadv2f64 addr:$src3)), + (memopv2f64 addr:$src3)), (VFMADDPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_pd VR128:$src1, (alignedloadv2f64 addr:$src2), +def : Pat<(int_x86_fma4_vfmadd_pd VR128:$src1, (memopv2f64 addr:$src2), VR128:$src3), (VFMADDPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmadd_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFMADDPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmadd_ps_256 VR256:$src1, VR256:$src2, - (alignedloadv8f32 addr:$src3)), + (memopv8f32 addr:$src3)), (VFMADDPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ps_256 VR256:$src1, - (alignedloadv8f32 addr:$src2), +def : Pat<(int_x86_fma4_vfmadd_ps_256 VR256:$src1, 
(memopv8f32 addr:$src2), VR256:$src3), (VFMADDPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmadd_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFMADDPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmadd_pd_256 VR256:$src1, VR256:$src2, - (alignedloadv4f64 addr:$src3)), + (memopv4f64 addr:$src3)), (VFMADDPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_pd_256 VR256:$src1, - (alignedloadv4f64 addr:$src2), +def : Pat<(int_x86_fma4_vfmadd_pd_256 VR256:$src1, (memopv4f64 addr:$src2), VR256:$src3), (VFMADDPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; @@ -245,38 +243,36 @@ def : Pat<(int_x86_fma4_vfmsub_ps VR128:$src1, VR128:$src2, VR128:$src3), (VFMSUBPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmsub_ps VR128:$src1, VR128:$src2, - (alignedloadv4f32 addr:$src3)), + (memopv4f32 addr:$src3)), (VFMSUBPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ps VR128:$src1, (alignedloadv4f32 addr:$src2), +def : Pat<(int_x86_fma4_vfmsub_ps VR128:$src1, (memopv4f32 addr:$src2), VR128:$src3), (VFMSUBPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmsub_pd VR128:$src1, VR128:$src2, VR128:$src3), (VFMSUBPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmsub_pd VR128:$src1, VR128:$src2, - (alignedloadv2f64 addr:$src3)), + (memopv2f64 addr:$src3)), (VFMSUBPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_pd VR128:$src1, (alignedloadv2f64 addr:$src2), +def : Pat<(int_x86_fma4_vfmsub_pd VR128:$src1, (memopv2f64 addr:$src2), VR128:$src3), (VFMSUBPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmsub_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFMSUBPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmsub_ps_256 VR256:$src1, VR256:$src2, - (alignedloadv8f32 addr:$src3)), + (memopv8f32 addr:$src3)), (VFMSUBPS4rmY VR256:$src1, 
VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ps_256 VR256:$src1, - (alignedloadv8f32 addr:$src2), +def : Pat<(int_x86_fma4_vfmsub_ps_256 VR256:$src1, (memopv8f32 addr:$src2), VR256:$src3), (VFMSUBPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmsub_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFMSUBPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmsub_pd_256 VR256:$src1, VR256:$src2, - (alignedloadv4f64 addr:$src3)), + (memopv4f64 addr:$src3)), (VFMSUBPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_pd_256 VR256:$src1, - (alignedloadv4f64 addr:$src2), +def : Pat<(int_x86_fma4_vfmsub_pd_256 VR256:$src1, (memopv4f64 addr:$src2), VR256:$src3), (VFMSUBPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; @@ -298,38 +294,36 @@ def : Pat<(int_x86_fma4_vfnmadd_ps VR128:$src1, VR128:$src2, VR128:$src3), (VFNMADDPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmadd_ps VR128:$src1, VR128:$src2, - (alignedloadv4f32 addr:$src3)), + (memopv4f32 addr:$src3)), (VFNMADDPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ps VR128:$src1, (alignedloadv4f32 addr:$src2), +def : Pat<(int_x86_fma4_vfnmadd_ps VR128:$src1, (memopv4f32 addr:$src2), VR128:$src3), (VFNMADDPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmadd_pd VR128:$src1, VR128:$src2, VR128:$src3), (VFNMADDPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmadd_pd VR128:$src1, VR128:$src2, - (alignedloadv2f64 addr:$src3)), + (memopv2f64 addr:$src3)), (VFNMADDPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_pd VR128:$src1, (alignedloadv2f64 addr:$src2), +def : Pat<(int_x86_fma4_vfnmadd_pd VR128:$src1, (memopv2f64 addr:$src2), VR128:$src3), (VFNMADDPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmadd_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFNMADDPS4rrY VR256:$src1, VR256:$src2, 
VR256:$src3)>; def : Pat<(int_x86_fma4_vfnmadd_ps_256 VR256:$src1, VR256:$src2, - (alignedloadv8f32 addr:$src3)), + (memopv8f32 addr:$src3)), (VFNMADDPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ps_256 VR256:$src1, - (alignedloadv8f32 addr:$src2), +def : Pat<(int_x86_fma4_vfnmadd_ps_256 VR256:$src1, (memopv8f32 addr:$src2), VR256:$src3), (VFNMADDPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfnmadd_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFNMADDPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfnmadd_pd_256 VR256:$src1, VR256:$src2, - (alignedloadv4f64 addr:$src3)), + (memopv4f64 addr:$src3)), (VFNMADDPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_pd_256 VR256:$src1, - (alignedloadv4f64 addr:$src2), +def : Pat<(int_x86_fma4_vfnmadd_pd_256 VR256:$src1, (memopv4f64 addr:$src2), VR256:$src3), (VFNMADDPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; @@ -351,38 +345,38 @@ def : Pat<(int_x86_fma4_vfnmsub_ps VR128:$src1, VR128:$src2, VR128:$src3), (VFNMSUBPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_ps VR128:$src1, VR128:$src2, - (alignedloadv4f32 addr:$src3)), + (memopv4f32 addr:$src3)), (VFNMSUBPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_ps VR128:$src1, (alignedloadv4f32 addr:$src2), +def : Pat<(int_x86_fma4_vfnmsub_ps VR128:$src1, (memopv4f32 addr:$src2), VR128:$src3), (VFNMSUBPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_pd VR128:$src1, VR128:$src2, VR128:$src3), (VFNMSUBPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_pd VR128:$src1, VR128:$src2, - (alignedloadv2f64 addr:$src3)), + (memopv2f64 addr:$src3)), (VFNMSUBPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_pd VR128:$src1, (alignedloadv2f64 addr:$src2), +def : Pat<(int_x86_fma4_vfnmsub_pd VR128:$src1, (memopv2f64 addr:$src2), 
VR128:$src3), (VFNMSUBPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFNMSUBPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_ps_256 VR256:$src1, VR256:$src2, - (alignedloadv8f32 addr:$src3)), + (memopv8f32 addr:$src3)), (VFNMSUBPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_ps_256 VR256:$src1, - (alignedloadv8f32 addr:$src2), + (memopv8f32 addr:$src2), VR256:$src3), (VFNMSUBPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFNMSUBPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_pd_256 VR256:$src1, VR256:$src2, - (alignedloadv4f64 addr:$src3)), + (memopv4f64 addr:$src3)), (VFNMSUBPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; def : Pat<(int_x86_fma4_vfnmsub_pd_256 VR256:$src1, - (alignedloadv4f64 addr:$src2), + (memopv4f64 addr:$src2), VR256:$src3), (VFNMSUBPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; @@ -390,38 +384,36 @@ def : Pat<(int_x86_fma4_vfmaddsub_ps VR128:$src1, VR128:$src2, VR128:$src3), (VFMADDSUBPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmaddsub_ps VR128:$src1, VR128:$src2, - (alignedloadv4f32 addr:$src3)), + (memopv4f32 addr:$src3)), (VFMADDSUBPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_ps VR128:$src1, (alignedloadv4f32 addr:$src2), +def : Pat<(int_x86_fma4_vfmaddsub_ps VR128:$src1, (memopv4f32 addr:$src2), VR128:$src3), (VFMADDSUBPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmaddsub_pd VR128:$src1, VR128:$src2, VR128:$src3), (VFMADDSUBPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmaddsub_pd VR128:$src1, VR128:$src2, - (alignedloadv2f64 addr:$src3)), + (memopv2f64 addr:$src3)), (VFMADDSUBPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_pd VR128:$src1, 
(alignedloadv2f64 addr:$src2), +def : Pat<(int_x86_fma4_vfmaddsub_pd VR128:$src1, (memopv2f64 addr:$src2), VR128:$src3), (VFMADDSUBPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmaddsub_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFMADDSUBPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmaddsub_ps_256 VR256:$src1, VR256:$src2, - (alignedloadv8f32 addr:$src3)), + (memopv8f32 addr:$src3)), (VFMADDSUBPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_ps_256 VR256:$src1, - (alignedloadv8f32 addr:$src2), +def : Pat<(int_x86_fma4_vfmaddsub_ps_256 VR256:$src1, (memopv8f32 addr:$src2), VR256:$src3), (VFMADDSUBPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmaddsub_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFMADDSUBPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmaddsub_pd_256 VR256:$src1, VR256:$src2, - (alignedloadv4f64 addr:$src3)), + (memopv4f64 addr:$src3)), (VFMADDSUBPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_pd_256 VR256:$src1, - (alignedloadv4f64 addr:$src2), +def : Pat<(int_x86_fma4_vfmaddsub_pd_256 VR256:$src1, (memopv4f64 addr:$src2), VR256:$src3), (VFMADDSUBPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; @@ -429,37 +421,35 @@ def : Pat<(int_x86_fma4_vfmsubadd_ps VR128:$src1, VR128:$src2, VR128:$src3), (VFMSUBADDPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmsubadd_ps VR128:$src1, VR128:$src2, - (alignedloadv4f32 addr:$src3)), + (memopv4f32 addr:$src3)), (VFMSUBADDPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_ps VR128:$src1, (alignedloadv4f32 addr:$src2), +def : Pat<(int_x86_fma4_vfmsubadd_ps VR128:$src1, (memopv4f32 addr:$src2), VR128:$src3), (VFMSUBADDPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmsubadd_pd VR128:$src1, VR128:$src2, VR128:$src3), (VFMSUBADDPD4rr VR128:$src1, VR128:$src2, 
VR128:$src3)>; def : Pat<(int_x86_fma4_vfmsubadd_pd VR128:$src1, VR128:$src2, - (alignedloadv2f64 addr:$src3)), + (memopv2f64 addr:$src3)), (VFMSUBADDPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_pd VR128:$src1, (alignedloadv2f64 addr:$src2), +def : Pat<(int_x86_fma4_vfmsubadd_pd VR128:$src1, (memopv2f64 addr:$src2), VR128:$src3), (VFMSUBADDPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; def : Pat<(int_x86_fma4_vfmsubadd_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFMSUBADDPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmsubadd_ps_256 VR256:$src1, VR256:$src2, - (alignedloadv8f32 addr:$src3)), + (memopv8f32 addr:$src3)), (VFMSUBADDPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_ps_256 VR256:$src1, - (alignedloadv8f32 addr:$src2), +def : Pat<(int_x86_fma4_vfmsubadd_ps_256 VR256:$src1, (memopv8f32 addr:$src2), VR256:$src3), (VFMSUBADDPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmsubadd_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), (VFMSUBADDPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; def : Pat<(int_x86_fma4_vfmsubadd_pd_256 VR256:$src1, VR256:$src2, - (alignedloadv4f64 addr:$src3)), + (memopv4f64 addr:$src3)), (VFMSUBADDPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_pd_256 VR256:$src1, - (alignedloadv4f64 addr:$src2), +def : Pat<(int_x86_fma4_vfmsubadd_pd_256 VR256:$src1, (memopv4f64 addr:$src2), VR256:$src3), (VFMSUBADDPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; Modified: llvm/trunk/test/CodeGen/X86/fma4-intrinsics-x86_64.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fma4-intrinsics-x86_64.ll?rev=147361&r1=147360&r2=147361&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/fma4-intrinsics-x86_64.ll (original) +++ llvm/trunk/test/CodeGen/X86/fma4-intrinsics-x86_64.ll Thu Dec 29 20:18:36 2011 @@ -48,6 +48,18 
@@ %res = call < 4 x float > @llvm.x86.fma4.vfmadd.ps(< 4 x float > %a0, < 4 x float > %a1, < 4 x float > %a2) ; [#uses=1] ret < 4 x float > %res } +define < 4 x float > @test_x86_fma4_vfmadd_ps_load(< 4 x float > %a0, < 4 x float > %a1, < 4 x float >* %a2) { + ; CHECK: vfmaddps (%{{.*}}) + %x = load <4 x float>* %a2 + %res = call < 4 x float > @llvm.x86.fma4.vfmadd.ps(< 4 x float > %a0, < 4 x float > %a1, < 4 x float > %x) ; [#uses=1] + ret < 4 x float > %res +} +define < 4 x float > @test_x86_fma4_vfmadd_ps_load2(< 4 x float > %a0, < 4 x float >* %a1, < 4 x float > %a2) { + ; CHECK: vfmaddps %{{.*}}, (%{{.*}}) + %x = load <4 x float>* %a1 + %res = call < 4 x float > @llvm.x86.fma4.vfmadd.ps(< 4 x float > %a0, < 4 x float > %x, < 4 x float > %a2) ; [#uses=1] + ret < 4 x float > %res +} declare < 4 x float > @llvm.x86.fma4.vfmadd.ps(< 4 x float >, < 4 x float >, < 4 x float >) nounwind readnone define < 2 x double > @test_x86_fma4_vfmadd_pd(< 2 x double > %a0, < 2 x double > %a1, < 2 x double > %a2) { @@ -55,6 +67,18 @@ %res = call < 2 x double > @llvm.x86.fma4.vfmadd.pd(< 2 x double > %a0, < 2 x double > %a1, < 2 x double > %a2) ; [#uses=1] ret < 2 x double > %res } +define < 2 x double > @test_x86_fma4_vfmadd_pd_load(< 2 x double > %a0, < 2 x double > %a1, < 2 x double >* %a2) { + ; CHECK: vfmaddpd (%{{.*}}) + %x = load <2 x double>* %a2 + %res = call < 2 x double > @llvm.x86.fma4.vfmadd.pd(< 2 x double > %a0, < 2 x double > %a1, < 2 x double > %x) ; [#uses=1] + ret < 2 x double > %res +} +define < 2 x double > @test_x86_fma4_vfmadd_pd_load2(< 2 x double > %a0, < 2 x double >* %a1, < 2 x double > %a2) { + ; CHECK: vfmaddpd %{{.*}}, (%{{.*}}) + %x = load <2 x double>* %a1 + %res = call < 2 x double > @llvm.x86.fma4.vfmadd.pd(< 2 x double > %a0, < 2 x double > %x, < 2 x double > %a2) ; [#uses=1] + ret < 2 x double > %res +} declare < 2 x double > @llvm.x86.fma4.vfmadd.pd(< 2 x double >, < 2 x double >, < 2 x double >) nounwind readnone define < 8 x float > 
@test_x86_fma4_vfmadd_ps_256(< 8 x float > %a0, < 8 x float > %a1, < 8 x float > %a2) { From craig.topper at gmail.com Thu Dec 29 21:17:15 2011 From: craig.topper at gmail.com (Craig Topper) Date: Fri, 30 Dec 2011 03:17:15 -0000 Subject: [llvm-commits] [llvm] r147364 - /llvm/trunk/lib/Target/X86/X86InstrFMA.td Message-ID: <20111230031716.098D72A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 21:17:15 2011 New Revision: 147364 URL: http://llvm.org/viewvc/llvm-project?rev=147364&view=rev Log: Combine FMA4 PS/PD patterns with the instruction definitions. Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=147364&r1=147363&r2=147364&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Thu Dec 29 21:17:15 2011 @@ -116,64 +116,86 @@ []>; } -multiclass fma4p opc, string OpcodeStr> { +multiclass fma4p opc, string OpcodeStr, + Intrinsic Int128, Intrinsic Int256, + PatFrag ld_frag128, PatFrag ld_frag256> { def rr : FMA4, XOP_W; + [(set VR128:$dst, + (Int128 VR128:$src1, VR128:$src2, VR128:$src3))]>, XOP_W; def rm : FMA4, XOP_W; + [(set VR128:$dst, (Int128 VR128:$src1, VR128:$src2, + (ld_frag128 addr:$src3)))]>, XOP_W; def mr : FMA4; + [(set VR128:$dst, + (Int128 VR128:$src1, (ld_frag128 addr:$src2), VR128:$src3))]>; def rrY : FMA4, XOP_W; + [(set VR256:$dst, + (Int256 VR256:$src1, VR256:$src2, VR256:$src3))]>, XOP_W; def rmY : FMA4, XOP_W; + [(set VR256:$dst, (Int256 VR256:$src1, VR256:$src2, + (ld_frag256 addr:$src3)))]>, XOP_W; def mrY : FMA4; + [(set VR256:$dst, + (Int256 VR256:$src1, (ld_frag256 addr:$src2), VR256:$src3))]>; } let isAsmParserOnly = 1 in { defm VFMADDSS4 : fma4s<0x6A, "vfmaddss", ssmem>; defm VFMADDSD4 : fma4s<0x6B, "vfmaddsd", sdmem>; - defm VFMADDPS4 : fma4p<0x68, "vfmaddps">; - defm 
VFMADDPD4 : fma4p<0x69, "vfmaddpd">; + defm VFMADDPS4 : fma4p<0x68, "vfmaddps", int_x86_fma4_vfmadd_ps, + int_x86_fma4_vfmadd_ps_256, memopv4f32, memopv8f32>; + defm VFMADDPD4 : fma4p<0x69, "vfmaddpd", int_x86_fma4_vfmadd_pd, + int_x86_fma4_vfmadd_pd_256, memopv2f64, memopv4f64>; defm VFMSUBSS4 : fma4s<0x6E, "vfmsubss", ssmem>; defm VFMSUBSD4 : fma4s<0x6F, "vfmsubsd", sdmem>; - defm VFMSUBPS4 : fma4p<0x6C, "vfmsubps">; - defm VFMSUBPD4 : fma4p<0x6D, "vfmsubpd">; + defm VFMSUBPS4 : fma4p<0x6C, "vfmsubps", int_x86_fma4_vfmsub_ps, + int_x86_fma4_vfmsub_ps_256, memopv4f32, memopv8f32>; + defm VFMSUBPD4 : fma4p<0x6D, "vfmsubpd", int_x86_fma4_vfmsub_pd, + int_x86_fma4_vfmsub_pd_256, memopv2f64, memopv4f64>; defm VFNMADDSS4 : fma4s<0x7A, "vfnmaddss", ssmem>; defm VFNMADDSD4 : fma4s<0x7B, "vfnmaddsd", sdmem>; - defm VFNMADDPS4 : fma4p<0x78, "vfnmaddps">; - defm VFNMADDPD4 : fma4p<0x79, "vfnmaddpd">; + defm VFNMADDPS4 : fma4p<0x78, "vfnmaddps", int_x86_fma4_vfnmadd_ps, + int_x86_fma4_vfnmadd_ps_256, memopv4f32, memopv8f32>; + defm VFNMADDPD4 : fma4p<0x79, "vfnmaddpd", int_x86_fma4_vfnmadd_pd, + int_x86_fma4_vfnmadd_pd_256, memopv2f64, memopv4f64>; defm VFNMSUBSS4 : fma4s<0x7E, "vfnmsubss", ssmem>; defm VFNMSUBSD4 : fma4s<0x7F, "vfnmsubsd", sdmem>; - defm VFNMSUBPS4 : fma4p<0x7C, "vfnmsubps">; - defm VFNMSUBPD4 : fma4p<0x7D, "vfnmsubpd">; - defm VFMADDSUBPS4 : fma4p<0x5C, "vfmaddsubps">; - defm VFMADDSUBPD4 : fma4p<0x5D, "vfmaddsubpd">; - defm VFMSUBADDPS4 : fma4p<0x5E, "vfmsubaddps">; - defm VFMSUBADDPD4 : fma4p<0x5F, "vfmsubaddpd">; + defm VFNMSUBPS4 : fma4p<0x7C, "vfnmsubps", int_x86_fma4_vfnmsub_ps, + int_x86_fma4_vfnmsub_ps_256, memopv4f32, memopv8f32>; + defm VFNMSUBPD4 : fma4p<0x7D, "vfnmsubpd", int_x86_fma4_vfnmsub_pd, + int_x86_fma4_vfnmsub_pd_256, memopv2f64, memopv4f64>; + defm VFMADDSUBPS4 : fma4p<0x5C, "vfmaddsubps", int_x86_fma4_vfmaddsub_ps, + int_x86_fma4_vfmaddsub_ps_256, memopv4f32, memopv8f32>; + defm VFMADDSUBPD4 : fma4p<0x5D, "vfmaddsubpd", 
int_x86_fma4_vfmaddsub_pd, + int_x86_fma4_vfmaddsub_pd_256, memopv2f64, memopv4f64>; + defm VFMSUBADDPS4 : fma4p<0x5E, "vfmsubaddps", int_x86_fma4_vfmsubadd_ps, + int_x86_fma4_vfmsubadd_ps_256, memopv4f32, memopv8f32>; + defm VFMSUBADDPD4 : fma4p<0x5F, "vfmsubaddpd", int_x86_fma4_vfmsubadd_pd, + int_x86_fma4_vfmsubadd_pd_256, memopv2f64, memopv4f64>; } // FMA4 Intrinsics patterns +let Predicates = [HasFMA4] in { + // VFMADD def : Pat<(int_x86_fma4_vfmadd_ss VR128:$src1, VR128:$src2, VR128:$src3), (VFMADDSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; @@ -189,42 +211,6 @@ def : Pat<(int_x86_fma4_vfmadd_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), (VFMADDSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ps VR128:$src1, VR128:$src2, VR128:$src3), - (VFMADDPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ps VR128:$src1, VR128:$src2, - (memopv4f32 addr:$src3)), - (VFMADDPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ps VR128:$src1, (memopv4f32 addr:$src2), - VR128:$src3), - (VFMADDPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfmadd_pd VR128:$src1, VR128:$src2, VR128:$src3), - (VFMADDPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_pd VR128:$src1, VR128:$src2, - (memopv2f64 addr:$src3)), - (VFMADDPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_pd VR128:$src1, (memopv2f64 addr:$src2), - VR128:$src3), - (VFMADDPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfmadd_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFMADDPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ps_256 VR256:$src1, VR256:$src2, - (memopv8f32 addr:$src3)), - (VFMADDPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ps_256 VR256:$src1, (memopv8f32 addr:$src2), - VR256:$src3), - (VFMADDPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; - 
-def : Pat<(int_x86_fma4_vfmadd_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFMADDPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_pd_256 VR256:$src1, VR256:$src2, - (memopv4f64 addr:$src3)), - (VFMADDPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_pd_256 VR256:$src1, (memopv4f64 addr:$src2), - VR256:$src3), - (VFMADDPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; - // VFMSUB def : Pat<(int_x86_fma4_vfmsub_ss VR128:$src1, VR128:$src2, VR128:$src3), (VFMSUBSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; @@ -240,42 +226,6 @@ def : Pat<(int_x86_fma4_vfmsub_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), (VFMSUBSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ps VR128:$src1, VR128:$src2, VR128:$src3), - (VFMSUBPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ps VR128:$src1, VR128:$src2, - (memopv4f32 addr:$src3)), - (VFMSUBPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ps VR128:$src1, (memopv4f32 addr:$src2), - VR128:$src3), - (VFMSUBPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfmsub_pd VR128:$src1, VR128:$src2, VR128:$src3), - (VFMSUBPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_pd VR128:$src1, VR128:$src2, - (memopv2f64 addr:$src3)), - (VFMSUBPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_pd VR128:$src1, (memopv2f64 addr:$src2), - VR128:$src3), - (VFMSUBPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfmsub_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFMSUBPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ps_256 VR256:$src1, VR256:$src2, - (memopv8f32 addr:$src3)), - (VFMSUBPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ps_256 VR256:$src1, (memopv8f32 addr:$src2), - VR256:$src3), - (VFMSUBPS4mrY VR256:$src1, addr:$src2, 
VR256:$src3)>; - -def : Pat<(int_x86_fma4_vfmsub_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFMSUBPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_pd_256 VR256:$src1, VR256:$src2, - (memopv4f64 addr:$src3)), - (VFMSUBPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_pd_256 VR256:$src1, (memopv4f64 addr:$src2), - VR256:$src3), - (VFMSUBPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; - // VFNMADD def : Pat<(int_x86_fma4_vfnmadd_ss VR128:$src1, VR128:$src2, VR128:$src3), (VFNMADDSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; @@ -291,42 +241,6 @@ def : Pat<(int_x86_fma4_vfnmadd_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), (VFNMADDSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ps VR128:$src1, VR128:$src2, VR128:$src3), - (VFNMADDPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ps VR128:$src1, VR128:$src2, - (memopv4f32 addr:$src3)), - (VFNMADDPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ps VR128:$src1, (memopv4f32 addr:$src2), - VR128:$src3), - (VFNMADDPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfnmadd_pd VR128:$src1, VR128:$src2, VR128:$src3), - (VFNMADDPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_pd VR128:$src1, VR128:$src2, - (memopv2f64 addr:$src3)), - (VFNMADDPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_pd VR128:$src1, (memopv2f64 addr:$src2), - VR128:$src3), - (VFNMADDPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfnmadd_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFNMADDPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ps_256 VR256:$src1, VR256:$src2, - (memopv8f32 addr:$src3)), - (VFNMADDPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ps_256 VR256:$src1, (memopv8f32 addr:$src2), - VR256:$src3), - 
(VFNMADDPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; - -def : Pat<(int_x86_fma4_vfnmadd_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFNMADDPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_pd_256 VR256:$src1, VR256:$src2, - (memopv4f64 addr:$src3)), - (VFNMADDPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_pd_256 VR256:$src1, (memopv4f64 addr:$src2), - VR256:$src3), - (VFNMADDPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; - // VFNMSUB def : Pat<(int_x86_fma4_vfnmsub_ss VR128:$src1, VR128:$src2, VR128:$src3), (VFNMSUBSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; @@ -342,114 +256,23 @@ def : Pat<(int_x86_fma4_vfnmsub_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), (VFNMSUBSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_ps VR128:$src1, VR128:$src2, VR128:$src3), - (VFNMSUBPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_ps VR128:$src1, VR128:$src2, - (memopv4f32 addr:$src3)), - (VFNMSUBPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_ps VR128:$src1, (memopv4f32 addr:$src2), - VR128:$src3), - (VFNMSUBPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfnmsub_pd VR128:$src1, VR128:$src2, VR128:$src3), - (VFNMSUBPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_pd VR128:$src1, VR128:$src2, - (memopv2f64 addr:$src3)), - (VFNMSUBPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_pd VR128:$src1, (memopv2f64 addr:$src2), - VR128:$src3), - (VFNMSUBPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfnmsub_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFNMSUBPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_ps_256 VR256:$src1, VR256:$src2, - (memopv8f32 addr:$src3)), - (VFNMSUBPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_ps_256 
VR256:$src1, - (memopv8f32 addr:$src2), - VR256:$src3), - (VFNMSUBPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; - -def : Pat<(int_x86_fma4_vfnmsub_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFNMSUBPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_pd_256 VR256:$src1, VR256:$src2, - (memopv4f64 addr:$src3)), - (VFNMSUBPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_pd_256 VR256:$src1, - (memopv4f64 addr:$src2), - VR256:$src3), - (VFNMSUBPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; - // VFMADDSUB -def : Pat<(int_x86_fma4_vfmaddsub_ps VR128:$src1, VR128:$src2, VR128:$src3), - (VFMADDSUBPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_ps VR128:$src1, VR128:$src2, - (memopv4f32 addr:$src3)), - (VFMADDSUBPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; def : Pat<(int_x86_fma4_vfmaddsub_ps VR128:$src1, (memopv4f32 addr:$src2), VR128:$src3), (VFMADDSUBPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_pd VR128:$src1, VR128:$src2, VR128:$src3), - (VFMADDSUBPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_pd VR128:$src1, VR128:$src2, - (memopv2f64 addr:$src3)), - (VFMADDSUBPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; def : Pat<(int_x86_fma4_vfmaddsub_pd VR128:$src1, (memopv2f64 addr:$src2), VR128:$src3), (VFMADDSUBPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFMADDSUBPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_ps_256 VR256:$src1, VR256:$src2, - (memopv8f32 addr:$src3)), - (VFMADDSUBPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_ps_256 VR256:$src1, (memopv8f32 addr:$src2), - VR256:$src3), - (VFMADDSUBPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; - -def : Pat<(int_x86_fma4_vfmaddsub_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFMADDSUBPD4rrY 
VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_pd_256 VR256:$src1, VR256:$src2, - (memopv4f64 addr:$src3)), - (VFMADDSUBPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmaddsub_pd_256 VR256:$src1, (memopv4f64 addr:$src2), - VR256:$src3), - (VFMADDSUBPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; // VFMSUBADD -def : Pat<(int_x86_fma4_vfmsubadd_ps VR128:$src1, VR128:$src2, VR128:$src3), - (VFMSUBADDPS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_ps VR128:$src1, VR128:$src2, - (memopv4f32 addr:$src3)), - (VFMSUBADDPS4rm VR128:$src1, VR128:$src2, addr:$src3)>; def : Pat<(int_x86_fma4_vfmsubadd_ps VR128:$src1, (memopv4f32 addr:$src2), VR128:$src3), (VFMSUBADDPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_pd VR128:$src1, VR128:$src2, VR128:$src3), - (VFMSUBADDPD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_pd VR128:$src1, VR128:$src2, - (memopv2f64 addr:$src3)), - (VFMSUBADDPD4rm VR128:$src1, VR128:$src2, addr:$src3)>; def : Pat<(int_x86_fma4_vfmsubadd_pd VR128:$src1, (memopv2f64 addr:$src2), VR128:$src3), (VFMSUBADDPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_ps_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFMSUBADDPS4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_ps_256 VR256:$src1, VR256:$src2, - (memopv8f32 addr:$src3)), - (VFMSUBADDPS4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_ps_256 VR256:$src1, (memopv8f32 addr:$src2), - VR256:$src3), - (VFMSUBADDPS4mrY VR256:$src1, addr:$src2, VR256:$src3)>; - -def : Pat<(int_x86_fma4_vfmsubadd_pd_256 VR256:$src1, VR256:$src2, VR256:$src3), - (VFMSUBADDPD4rrY VR256:$src1, VR256:$src2, VR256:$src3)>; -def : Pat<(int_x86_fma4_vfmsubadd_pd_256 VR256:$src1, VR256:$src2, - (memopv4f64 addr:$src3)), - (VFMSUBADDPD4rmY VR256:$src1, VR256:$src2, addr:$src3)>; -def : 
Pat<(int_x86_fma4_vfmsubadd_pd_256 VR256:$src1, (memopv4f64 addr:$src2), - VR256:$src3), - (VFMSUBADDPD4mrY VR256:$src1, addr:$src2, VR256:$src3)>; +} // Predicates = [HasFMA4] From craig.topper at gmail.com Thu Dec 29 21:33:59 2011 From: craig.topper at gmail.com (Craig Topper) Date: Fri, 30 Dec 2011 03:33:59 -0000 Subject: [llvm-commits] [llvm] r147365 - /llvm/trunk/lib/Target/X86/X86InstrFMA.td Message-ID: <20111230033359.D6C822A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 21:33:59 2011 New Revision: 147365 URL: http://llvm.org/viewvc/llvm-project?rev=147365&view=rev Log: Combine FMA4 SS/SD patterns with the instruction definitions. Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=147365&r1=147364&r2=147365&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Thu Dec 29 21:33:59 2011 @@ -98,22 +98,26 @@ //===----------------------------------------------------------------------===// -multiclass fma4s opc, string OpcodeStr, Operand memop> { +multiclass fma4s opc, string OpcodeStr, Operand memop, + ComplexPattern mem_cpat, Intrinsic Int> { def rr : FMA4, XOP_W; + [(set VR128:$dst, + (Int VR128:$src1, VR128:$src2, VR128:$src3))]>, XOP_W; def rm : FMA4, XOP_W; + [(set VR128:$dst, + (Int VR128:$src1, VR128:$src2, mem_cpat:$src3))]>, XOP_W; def mr : FMA4; + [(set VR128:$dst, + (Int VR128:$src1, mem_cpat:$src2, VR128:$src3))]>; } multiclass fma4p opc, string OpcodeStr, @@ -158,26 +162,34 @@ } let isAsmParserOnly = 1 in { - defm VFMADDSS4 : fma4s<0x6A, "vfmaddss", ssmem>; - defm VFMADDSD4 : fma4s<0x6B, "vfmaddsd", sdmem>; + defm VFMADDSS4 : fma4s<0x6A, "vfmaddss", ssmem, sse_load_f32, + int_x86_fma4_vfmadd_ss>; + defm VFMADDSD4 : fma4s<0x6B, "vfmaddsd", sdmem, sse_load_f64, + 
int_x86_fma4_vfmadd_sd>; defm VFMADDPS4 : fma4p<0x68, "vfmaddps", int_x86_fma4_vfmadd_ps, int_x86_fma4_vfmadd_ps_256, memopv4f32, memopv8f32>; defm VFMADDPD4 : fma4p<0x69, "vfmaddpd", int_x86_fma4_vfmadd_pd, int_x86_fma4_vfmadd_pd_256, memopv2f64, memopv4f64>; - defm VFMSUBSS4 : fma4s<0x6E, "vfmsubss", ssmem>; - defm VFMSUBSD4 : fma4s<0x6F, "vfmsubsd", sdmem>; + defm VFMSUBSS4 : fma4s<0x6E, "vfmsubss", ssmem, sse_load_f32, + int_x86_fma4_vfmsub_ss>; + defm VFMSUBSD4 : fma4s<0x6F, "vfmsubsd", sdmem, sse_load_f64, + int_x86_fma4_vfmsub_sd>; defm VFMSUBPS4 : fma4p<0x6C, "vfmsubps", int_x86_fma4_vfmsub_ps, int_x86_fma4_vfmsub_ps_256, memopv4f32, memopv8f32>; defm VFMSUBPD4 : fma4p<0x6D, "vfmsubpd", int_x86_fma4_vfmsub_pd, int_x86_fma4_vfmsub_pd_256, memopv2f64, memopv4f64>; - defm VFNMADDSS4 : fma4s<0x7A, "vfnmaddss", ssmem>; - defm VFNMADDSD4 : fma4s<0x7B, "vfnmaddsd", sdmem>; + defm VFNMADDSS4 : fma4s<0x7A, "vfnmaddss", ssmem, sse_load_f32, + int_x86_fma4_vfnmadd_ss>; + defm VFNMADDSD4 : fma4s<0x7B, "vfnmaddsd", sdmem, sse_load_f64, + int_x86_fma4_vfnmadd_sd>; defm VFNMADDPS4 : fma4p<0x78, "vfnmaddps", int_x86_fma4_vfnmadd_ps, int_x86_fma4_vfnmadd_ps_256, memopv4f32, memopv8f32>; defm VFNMADDPD4 : fma4p<0x79, "vfnmaddpd", int_x86_fma4_vfnmadd_pd, int_x86_fma4_vfnmadd_pd_256, memopv2f64, memopv4f64>; - defm VFNMSUBSS4 : fma4s<0x7E, "vfnmsubss", ssmem>; - defm VFNMSUBSD4 : fma4s<0x7F, "vfnmsubsd", sdmem>; + defm VFNMSUBSS4 : fma4s<0x7E, "vfnmsubss", ssmem, sse_load_f32, + int_x86_fma4_vfnmsub_ss>; + defm VFNMSUBSD4 : fma4s<0x7F, "vfnmsubsd", sdmem, sse_load_f64, + int_x86_fma4_vfnmsub_sd>; defm VFNMSUBPS4 : fma4p<0x7C, "vfnmsubps", int_x86_fma4_vfnmsub_ps, int_x86_fma4_vfnmsub_ps_256, memopv4f32, memopv8f32>; defm VFNMSUBPD4 : fma4p<0x7D, "vfnmsubpd", int_x86_fma4_vfnmsub_pd, @@ -191,88 +203,3 @@ defm VFMSUBADDPD4 : fma4p<0x5F, "vfmsubaddpd", int_x86_fma4_vfmsubadd_pd, int_x86_fma4_vfmsubadd_pd_256, memopv2f64, memopv4f64>; } - -// FMA4 Intrinsics patterns - -let 
Predicates = [HasFMA4] in { - -// VFMADD -def : Pat<(int_x86_fma4_vfmadd_ss VR128:$src1, VR128:$src2, VR128:$src3), - (VFMADDSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ss VR128:$src1, VR128:$src2, sse_load_f32:$src3), - (VFMADDSS4rm VR128:$src1, VR128:$src2, sse_load_f32:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_ss VR128:$src1, sse_load_f32:$src2, VR128:$src3), - (VFMADDSS4mr VR128:$src1, sse_load_f32:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfmadd_sd VR128:$src1, VR128:$src2, VR128:$src3), - (VFMADDSD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_sd VR128:$src1, VR128:$src2, sse_load_f64:$src3), - (VFMADDSD4rm VR128:$src1, VR128:$src2, sse_load_f64:$src3)>; -def : Pat<(int_x86_fma4_vfmadd_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), - (VFMADDSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; - -// VFMSUB -def : Pat<(int_x86_fma4_vfmsub_ss VR128:$src1, VR128:$src2, VR128:$src3), - (VFMSUBSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ss VR128:$src1, VR128:$src2, sse_load_f32:$src3), - (VFMSUBSS4rm VR128:$src1, VR128:$src2, sse_load_f32:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_ss VR128:$src1, sse_load_f32:$src2, VR128:$src3), - (VFMSUBSS4mr VR128:$src1, sse_load_f32:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfmsub_sd VR128:$src1, VR128:$src2, VR128:$src3), - (VFMSUBSD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_sd VR128:$src1, VR128:$src2, sse_load_f64:$src3), - (VFMSUBSD4rm VR128:$src1, VR128:$src2, sse_load_f64:$src3)>; -def : Pat<(int_x86_fma4_vfmsub_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), - (VFMSUBSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; - -// VFNMADD -def : Pat<(int_x86_fma4_vfnmadd_ss VR128:$src1, VR128:$src2, VR128:$src3), - (VFNMADDSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ss VR128:$src1, VR128:$src2, sse_load_f32:$src3), - (VFNMADDSS4rm 
VR128:$src1, VR128:$src2, sse_load_f32:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_ss VR128:$src1, sse_load_f32:$src2, VR128:$src3), - (VFNMADDSS4mr VR128:$src1, sse_load_f32:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfnmadd_sd VR128:$src1, VR128:$src2, VR128:$src3), - (VFNMADDSD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_sd VR128:$src1, VR128:$src2, sse_load_f64:$src3), - (VFNMADDSD4rm VR128:$src1, VR128:$src2, sse_load_f64:$src3)>; -def : Pat<(int_x86_fma4_vfnmadd_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), - (VFNMADDSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; - -// VFNMSUB -def : Pat<(int_x86_fma4_vfnmsub_ss VR128:$src1, VR128:$src2, VR128:$src3), - (VFNMSUBSS4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_ss VR128:$src1, VR128:$src2, sse_load_f32:$src3), - (VFNMSUBSS4rm VR128:$src1, VR128:$src2, sse_load_f32:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_ss VR128:$src1, sse_load_f32:$src2, VR128:$src3), - (VFNMSUBSS4mr VR128:$src1, sse_load_f32:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfnmsub_sd VR128:$src1, VR128:$src2, VR128:$src3), - (VFNMSUBSD4rr VR128:$src1, VR128:$src2, VR128:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_sd VR128:$src1, VR128:$src2, sse_load_f64:$src3), - (VFNMSUBSD4rm VR128:$src1, VR128:$src2, sse_load_f64:$src3)>; -def : Pat<(int_x86_fma4_vfnmsub_sd VR128:$src1, sse_load_f64:$src2, VR128:$src3), - (VFNMSUBSD4mr VR128:$src1, sse_load_f64:$src2, VR128:$src3)>; - -// VFMADDSUB -def : Pat<(int_x86_fma4_vfmaddsub_ps VR128:$src1, (memopv4f32 addr:$src2), - VR128:$src3), - (VFMADDSUBPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -def : Pat<(int_x86_fma4_vfmaddsub_pd VR128:$src1, (memopv2f64 addr:$src2), - VR128:$src3), - (VFMADDSUBPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; - - -// VFMSUBADD -def : Pat<(int_x86_fma4_vfmsubadd_ps VR128:$src1, (memopv4f32 addr:$src2), - VR128:$src3), - (VFMSUBADDPS4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -def : 
Pat<(int_x86_fma4_vfmsubadd_pd VR128:$src1, (memopv2f64 addr:$src2), - VR128:$src3), - (VFMSUBADDPD4mr VR128:$src1, addr:$src2, VR128:$src3)>; - -} // Predicates = [HasFMA4]

From hfinkel at anl.gov Thu Dec 29 21:52:55 2011
From: hfinkel at anl.gov (Hal Finkel)
Date: Thu, 29 Dec 2011 21:52:55 -0600
Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc
In-Reply-To: <4EFD1D73.5070407@gmail.com>
References: <1325190024.13080.2848.camel@sapling> <4EFD1D73.5070407@gmail.com>
Message-ID: <1325217175.13080.2933.camel@sapling>

On Thu, 2011-12-29 at 21:09 -0500, Rafael Ávila de Espíndola wrote:
> On 29/12/11 03:20 PM, Hal Finkel wrote:
> > This small patch fixes a compatibility problem between the ppc linux asm
> > printer and recent versions of binutils (as explained in the test case).
> > With this patch, llvm's behavior will again match that of gcc. It
> > touches CodeGen/AsmPrinter; please review.
>
> Do you know when this was changed in binutils? Maybe we should always
> use the local symbol instead of making this a ppc only change.

I don't know exactly what version of binutils made the change, but I believe it was sometime in 2011 (guessing from online bug reports); binutils 2.21.1 displays the new behavior, 2.20.51.0.2 (which is from 2009) does not. I think that always using the local symbol is going to be a safe choice (apparently safer than using the global symbol name). Do we have a uniform way of getting that symbol?

Thanks again,
Hal

>
> > Thanks again,
> > Hal
>
> Cheers,
> Rafael
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

From craig.topper at gmail.com Thu Dec 29 22:48:55 2011 From: craig.topper at gmail.com (Craig Topper) Date: Fri, 30 Dec 2011 04:48:55 -0000 Subject: [llvm-commits] [llvm] r147366 - in /llvm/trunk/lib/Target/X86: MCTargetDesc/X86BaseInfo.h MCTargetDesc/X86MCCodeEmitter.cpp X86InstrFMA.td X86InstrFormats.td X86InstrXOP.td Message-ID: <20111230044855.1E5912A6C12C@llvm.org> Author: ctopper Date: Thu Dec 29 22:48:54 2011 New Revision: 147366 URL: http://llvm.org/viewvc/llvm-project?rev=147366&view=rev Log: Separate the concept of having memory access in operand 4 from the concept of having the W bit set for XOP instructions. Removes ORing W-bits in the encoder and will similarly simplify the disassembler implementation. Modified: llvm/trunk/lib/Target/X86/MCTargetDesc/X86BaseInfo.h llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp llvm/trunk/lib/Target/X86/X86InstrFMA.td llvm/trunk/lib/Target/X86/X86InstrFormats.td llvm/trunk/lib/Target/X86/X86InstrXOP.td Modified: llvm/trunk/lib/Target/X86/MCTargetDesc/X86BaseInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/MCTargetDesc/X86BaseInfo.h?rev=147366&r1=147365&r2=147366&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/MCTargetDesc/X86BaseInfo.h (original) +++ llvm/trunk/lib/Target/X86/MCTargetDesc/X86BaseInfo.h Thu Dec 29 22:48:54 2011 @@ -426,10 +426,9 @@ /// this flag to indicate that the encoder should do the wacky 3DNow! thing. Has3DNow0F0FOpcode = 1U << 7, - /// XOP_W - Same bit as VEX_W. Used to indicate swapping of - /// operand 3 and 4 to be encoded in ModRM or I8IMM. 
This is used - /// for FMA4 and XOP instructions. - XOP_W = 1U << 8, + /// MemOp4 - Used to indicate swapping of operand 3 and 4 to be encoded in + /// ModRM or I8IMM. This is used for FMA4 and XOP instructions. + MemOp4 = 1U << 8, /// XOP - Opcode prefix used by XOP instructions. XOP = 1U << 9 @@ -503,11 +502,11 @@ return 0; case X86II::MRMSrcMem: { bool HasVEX_4V = (TSFlags >> X86II::VEXShift) & X86II::VEX_4V; - bool HasXOP_W = (TSFlags >> X86II::VEXShift) & X86II::XOP_W; + bool HasMemOp4 = (TSFlags >> X86II::VEXShift) & X86II::MemOp4; unsigned FirstMemOp = 1; if (HasVEX_4V) ++FirstMemOp;// Skip the register source (which is encoded in VEX_VVVV). - if (HasXOP_W) + if (HasMemOp4) ++FirstMemOp;// Skip the register source (which is encoded in I8IMM). // FIXME: Maybe lea should have its own form? This is a horrible hack. Modified: llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp?rev=147366&r1=147365&r2=147366&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp (original) +++ llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp Thu Dec 29 22:48:54 2011 @@ -431,10 +431,6 @@ // opcode extension, or ignored, depending on the opcode byte) unsigned char VEX_W = 0; - // XOP_W: opcode specific, same bit as VEX_W, but used to - // swap operand 3 and 4 for FMA4 and XOP instructions - unsigned char XOP_W = 0; - // XOP: Use XOP prefix byte 0x8f instead of VEX. unsigned char XOP = 0; @@ -477,9 +473,6 @@ if ((TSFlags >> X86II::VEXShift) & X86II::VEX_W) VEX_W = 1; - if ((TSFlags >> X86II::VEXShift) & X86II::XOP_W) - XOP_W = 1; - if ((TSFlags >> X86II::VEXShift) & X86II::XOP) XOP = 1; @@ -669,7 +662,7 @@ // 3 byte VEX prefix EmitByte(XOP ? 
0x8F : 0xC4, CurByte, OS); EmitByte(VEX_R << 7 | VEX_X << 6 | VEX_B << 5 | VEX_5M, CurByte, OS); - EmitByte(LastByte | ((VEX_W | XOP_W) << 7), CurByte, OS); + EmitByte(LastByte | (VEX_W << 7), CurByte, OS); } /// DetermineREXPrefix - Determine if the MCInst has to be encoded with a X86-64 @@ -929,8 +922,8 @@ // It uses the VEX.VVVV field? bool HasVEX_4V = (TSFlags >> X86II::VEXShift) & X86II::VEX_4V; bool HasVEX_4VOp3 = (TSFlags >> X86II::VEXShift) & X86II::VEX_4VOp3; - bool HasXOP_W = (TSFlags >> X86II::VEXShift) & X86II::XOP_W; - unsigned XOP_W_I8IMMOperand = 2; + bool HasMemOp4 = (TSFlags >> X86II::VEXShift) & X86II::MemOp4; + const unsigned MemOp4_I8IMMOperand = 2; // Determine where the memory operand starts, if present. int MemoryOperand = X86II::getMemoryOperandNo(TSFlags, Opcode); @@ -1003,14 +996,14 @@ if (HasVEX_4V) // Skip 1st src (which is encoded in VEX_VVVV) SrcRegNum++; - if(HasXOP_W) // Skip 2nd src (which is encoded in I8IMM) + if(HasMemOp4) // Skip 2nd src (which is encoded in I8IMM) SrcRegNum++; EmitRegModRMByte(MI.getOperand(SrcRegNum), GetX86RegNum(MI.getOperand(CurOp)), CurByte, OS); - // 2 operands skipped with HasXOP_W, comensate accordingly - CurOp = HasXOP_W ? SrcRegNum : SrcRegNum + 1; + // 2 operands skipped with HasMemOp4, comensate accordingly + CurOp = HasMemOp4 ? SrcRegNum : SrcRegNum + 1; if (HasVEX_4VOp3) ++CurOp; break; @@ -1022,7 +1015,7 @@ ++AddrOperands; ++FirstMemOp; // Skip the register source (which is encoded in VEX_VVVV). } - if(HasXOP_W) // Skip second register source (encoded in I8IMM) + if(HasMemOp4) // Skip second register source (encoded in I8IMM) ++FirstMemOp; EmitByte(BaseOpcode, CurByte, OS); @@ -1113,7 +1106,7 @@ // The last source register of a 4 operand instruction in AVX is encoded // in bits[7:4] of a immediate byte. if ((TSFlags >> X86II::VEXShift) & X86II::VEX_I8IMM) { - const MCOperand &MO = MI.getOperand(HasXOP_W ? XOP_W_I8IMMOperand + const MCOperand &MO = MI.getOperand(HasMemOp4 ? 
MemOp4_I8IMMOperand : CurOp); CurOp++; bool IsExtReg = X86II::isX86_64ExtendedReg(MO.getReg()); Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=147366&r1=147365&r2=147366&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Thu Dec 29 22:48:54 2011 @@ -105,13 +105,13 @@ !strconcat(OpcodeStr, "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), [(set VR128:$dst, - (Int VR128:$src1, VR128:$src2, VR128:$src3))]>, XOP_W; + (Int VR128:$src1, VR128:$src2, VR128:$src3))]>, VEX_W, MemOp4; def rm : FMA4, XOP_W; + (Int VR128:$src1, VR128:$src2, mem_cpat:$src3))]>, VEX_W, MemOp4; def mr : FMA4, XOP_W; + (Int128 VR128:$src1, VR128:$src2, VR128:$src3))]>, VEX_W, MemOp4; def rm : FMA4, XOP_W; + (ld_frag128 addr:$src3)))]>, VEX_W, MemOp4; def mr : FMA4, XOP_W; + (Int256 VR256:$src1, VR256:$src2, VR256:$src3))]>, VEX_W, MemOp4; def rmY : FMA4, XOP_W; + (ld_frag256 addr:$src3)))]>, VEX_W, MemOp4; def mrY : FMA4 opcod, Format f, ImmType i, dag outs, dag ins, string AsmStr, Domain d = GenericDomain> @@ -161,7 +161,7 @@ bit hasVEX_L = 0; // Does this inst use large (256-bit) registers? bit ignoresVEX_L = 0; // Does this instruction ignore the L-bit bit has3DNow0F0FOpcode =0;// Wacky 3dNow! encoding? - bit hasXOP_WPrefix = 0; // Same bit as VEX_W, but used for swapping operands + bit hasMemOp4Prefix = 0; // Same bit as VEX_W, but used for swapping operands bit hasXOP_Prefix = 0; // Does this inst require an XOP prefix? // TSFlags layout should be kept in sync with X86InstrInfo.h. 
@@ -184,7 +184,7 @@ let TSFlags{38} = hasVEX_L; let TSFlags{39} = ignoresVEX_L; let TSFlags{40} = has3DNow0F0FOpcode; - let TSFlags{41} = hasXOP_WPrefix; + let TSFlags{41} = hasMemOp4Prefix; let TSFlags{42} = hasXOP_Prefix; } Modified: llvm/trunk/lib/Target/X86/X86InstrXOP.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrXOP.td?rev=147366&r1=147365&r2=147366&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrXOP.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrXOP.td Thu Dec 29 22:48:54 2011 @@ -169,7 +169,7 @@ (ins VR128:$src1, VR128:$src2, f128mem:$src3), !strconcat(OpcodeStr, "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), - []>, VEX_4V, VEX_I8IMM, XOP_W; + []>, VEX_4V, VEX_I8IMM, VEX_W, MemOp4; def mr : IXOPi8, VEX_4V, VEX_I8IMM, XOP_W; + []>, VEX_4V, VEX_I8IMM, VEX_W, MemOp4; def mrY : IXOPi8, XOP_W; + []>, VEX_W, MemOp4; def mr : IXOP5, XOP_W; + []>, VEX_W, MemOp4; def mrY : IXOP5 Author: ctopper Date: Thu Dec 29 23:20:36 2011 New Revision: 147367 URL: http://llvm.org/viewvc/llvm-project?rev=147367&view=rev Log: Add FMA4 instructions to disassembler. 
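
For anyone checking the new disassembler tests by hand, here is a minimal sketch that unpacks the two payload bytes of a three-byte VEX prefix and the register field of the trailing immediate. It is an illustration only, not code from this commit: the names Vex3Fields, decodeVex3 and regFromImm8 are made up for this note. The input bytes are taken from the two vfmaddss vectors this revision adds to simple-tests.txt (0xc4 0xe3 0xf9 0x6a 0x01 0x10 and 0xc4 0xe3 0x79 0x6a 0x01 0x10). The two prefixes differ only in the W bit, which, going by the expected strings in the test file, appears to be what selects whether the memory operand is src3 or src2.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration; not part of LLVM.
struct Vex3Fields {
  unsigned r, x, b;  // register-extension bits, stored inverted; inversion undone here
  unsigned mmmmm;    // opcode map selector (3 selects the 0F 3A map)
  unsigned w;        // W bit
  unsigned vvvv;     // extra source register, stored inverted; inversion undone here
  unsigned l;        // vector length bit (0 = 128-bit)
  unsigned pp;       // implied legacy prefix (1 means 0x66)
};

// Decode the two bytes that follow the 0xC4 escape byte.
constexpr Vex3Fields decodeVex3(std::uint8_t b1, std::uint8_t b2) {
  return Vex3Fields{
      ((b1 >> 7) & 1u) ^ 1u,      // R
      ((b1 >> 6) & 1u) ^ 1u,      // X
      ((b1 >> 5) & 1u) ^ 1u,      // B
      b1 & 0x1Fu,                 // mmmmm
      (b2 >> 7) & 1u,             // W
      ((b2 >> 3) & 0xFu) ^ 0xFu,  // vvvv
      (b2 >> 2) & 1u,             // L
      b2 & 3u};                   // pp
}

// The fourth register operand lives in bits [7:4] of the trailing imm8.
constexpr unsigned regFromImm8(std::uint8_t imm) { return (imm >> 4) & 0xFu; }

// First vector: 0xc4 0xe3 0xf9 0x6a 0x01 0x10
static_assert(decodeVex3(0xE3, 0xF9).mmmmm == 3, "0F 3A opcode map");
static_assert(decodeVex3(0xE3, 0xF9).w == 1, "W is set in the first vector");
static_assert(decodeVex3(0xE3, 0xF9).vvvv == 0, "vvvv names xmm0");
// Second vector: 0xc4 0xe3 0x79 0x6a 0x01 0x10
static_assert(decodeVex3(0xE3, 0x79).w == 0, "W is clear in the second vector");
static_assert(regFromImm8(0x10) == 1, "imm8[7:4] names xmm1");
```

Read against the expected output, W set puts the ModRM memory operand (%rcx) in the src3 position and %xmm1 from imm8[7:4] in src2; W clear swaps the two.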
Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp llvm/trunk/utils/TableGen/X86RecognizableInstr.h Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=147367&r1=147366&r2=147367&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Thu Dec 29 23:20:36 2011 @@ -118,6 +118,12 @@ "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), [(set VR128:$dst, (Int VR128:$src1, mem_cpat:$src2, VR128:$src3))]>; +// For disassembler +let isCodeGenOnly = 1 in + def rr_REV : FMA4; } multiclass fma4p opc, string OpcodeStr, @@ -159,47 +165,56 @@ "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), [(set VR256:$dst, (Int256 VR256:$src1, (ld_frag256 addr:$src2), VR256:$src3))]>; +// For disassembler +let isCodeGenOnly = 1 in { + def rr_REV : FMA4; + def rrY_REV : FMA4; +} // isCodeGenOnly = 1 } -let isAsmParserOnly = 1 in { - defm VFMADDSS4 : fma4s<0x6A, "vfmaddss", ssmem, sse_load_f32, - int_x86_fma4_vfmadd_ss>; - defm VFMADDSD4 : fma4s<0x6B, "vfmaddsd", sdmem, sse_load_f64, - int_x86_fma4_vfmadd_sd>; - defm VFMADDPS4 : fma4p<0x68, "vfmaddps", int_x86_fma4_vfmadd_ps, - int_x86_fma4_vfmadd_ps_256, memopv4f32, memopv8f32>; - defm VFMADDPD4 : fma4p<0x69, "vfmaddpd", int_x86_fma4_vfmadd_pd, - int_x86_fma4_vfmadd_pd_256, memopv2f64, memopv4f64>; - defm VFMSUBSS4 : fma4s<0x6E, "vfmsubss", ssmem, sse_load_f32, - int_x86_fma4_vfmsub_ss>; - defm VFMSUBSD4 : fma4s<0x6F, "vfmsubsd", sdmem, sse_load_f64, - int_x86_fma4_vfmsub_sd>; - defm VFMSUBPS4 : fma4p<0x6C, "vfmsubps", int_x86_fma4_vfmsub_ps, - int_x86_fma4_vfmsub_ps_256, memopv4f32, memopv8f32>; - defm VFMSUBPD4 : fma4p<0x6D, "vfmsubpd", int_x86_fma4_vfmsub_pd, - int_x86_fma4_vfmsub_pd_256, memopv2f64, 
memopv4f64>; - defm VFNMADDSS4 : fma4s<0x7A, "vfnmaddss", ssmem, sse_load_f32, - int_x86_fma4_vfnmadd_ss>; - defm VFNMADDSD4 : fma4s<0x7B, "vfnmaddsd", sdmem, sse_load_f64, - int_x86_fma4_vfnmadd_sd>; - defm VFNMADDPS4 : fma4p<0x78, "vfnmaddps", int_x86_fma4_vfnmadd_ps, - int_x86_fma4_vfnmadd_ps_256, memopv4f32, memopv8f32>; - defm VFNMADDPD4 : fma4p<0x79, "vfnmaddpd", int_x86_fma4_vfnmadd_pd, - int_x86_fma4_vfnmadd_pd_256, memopv2f64, memopv4f64>; - defm VFNMSUBSS4 : fma4s<0x7E, "vfnmsubss", ssmem, sse_load_f32, - int_x86_fma4_vfnmsub_ss>; - defm VFNMSUBSD4 : fma4s<0x7F, "vfnmsubsd", sdmem, sse_load_f64, - int_x86_fma4_vfnmsub_sd>; - defm VFNMSUBPS4 : fma4p<0x7C, "vfnmsubps", int_x86_fma4_vfnmsub_ps, - int_x86_fma4_vfnmsub_ps_256, memopv4f32, memopv8f32>; - defm VFNMSUBPD4 : fma4p<0x7D, "vfnmsubpd", int_x86_fma4_vfnmsub_pd, - int_x86_fma4_vfnmsub_pd_256, memopv2f64, memopv4f64>; - defm VFMADDSUBPS4 : fma4p<0x5C, "vfmaddsubps", int_x86_fma4_vfmaddsub_ps, +defm VFMADDSS4 : fma4s<0x6A, "vfmaddss", ssmem, sse_load_f32, + int_x86_fma4_vfmadd_ss>; +defm VFMADDSD4 : fma4s<0x6B, "vfmaddsd", sdmem, sse_load_f64, + int_x86_fma4_vfmadd_sd>; +defm VFMADDPS4 : fma4p<0x68, "vfmaddps", int_x86_fma4_vfmadd_ps, + int_x86_fma4_vfmadd_ps_256, memopv4f32, memopv8f32>; +defm VFMADDPD4 : fma4p<0x69, "vfmaddpd", int_x86_fma4_vfmadd_pd, + int_x86_fma4_vfmadd_pd_256, memopv2f64, memopv4f64>; +defm VFMSUBSS4 : fma4s<0x6E, "vfmsubss", ssmem, sse_load_f32, + int_x86_fma4_vfmsub_ss>; +defm VFMSUBSD4 : fma4s<0x6F, "vfmsubsd", sdmem, sse_load_f64, + int_x86_fma4_vfmsub_sd>; +defm VFMSUBPS4 : fma4p<0x6C, "vfmsubps", int_x86_fma4_vfmsub_ps, + int_x86_fma4_vfmsub_ps_256, memopv4f32, memopv8f32>; +defm VFMSUBPD4 : fma4p<0x6D, "vfmsubpd", int_x86_fma4_vfmsub_pd, + int_x86_fma4_vfmsub_pd_256, memopv2f64, memopv4f64>; +defm VFNMADDSS4 : fma4s<0x7A, "vfnmaddss", ssmem, sse_load_f32, + int_x86_fma4_vfnmadd_ss>; +defm VFNMADDSD4 : fma4s<0x7B, "vfnmaddsd", sdmem, sse_load_f64, + int_x86_fma4_vfnmadd_sd>; 
+defm VFNMADDPS4 : fma4p<0x78, "vfnmaddps", int_x86_fma4_vfnmadd_ps, + int_x86_fma4_vfnmadd_ps_256, memopv4f32, memopv8f32>; +defm VFNMADDPD4 : fma4p<0x79, "vfnmaddpd", int_x86_fma4_vfnmadd_pd, + int_x86_fma4_vfnmadd_pd_256, memopv2f64, memopv4f64>; +defm VFNMSUBSS4 : fma4s<0x7E, "vfnmsubss", ssmem, sse_load_f32, + int_x86_fma4_vfnmsub_ss>; +defm VFNMSUBSD4 : fma4s<0x7F, "vfnmsubsd", sdmem, sse_load_f64, + int_x86_fma4_vfnmsub_sd>; +defm VFNMSUBPS4 : fma4p<0x7C, "vfnmsubps", int_x86_fma4_vfnmsub_ps, + int_x86_fma4_vfnmsub_ps_256, memopv4f32, memopv8f32>; +defm VFNMSUBPD4 : fma4p<0x7D, "vfnmsubpd", int_x86_fma4_vfnmsub_pd, + int_x86_fma4_vfnmsub_pd_256, memopv2f64, memopv4f64>; +defm VFMADDSUBPS4 : fma4p<0x5C, "vfmaddsubps", int_x86_fma4_vfmaddsub_ps, int_x86_fma4_vfmaddsub_ps_256, memopv4f32, memopv8f32>; - defm VFMADDSUBPD4 : fma4p<0x5D, "vfmaddsubpd", int_x86_fma4_vfmaddsub_pd, +defm VFMADDSUBPD4 : fma4p<0x5D, "vfmaddsubpd", int_x86_fma4_vfmaddsub_pd, int_x86_fma4_vfmaddsub_pd_256, memopv2f64, memopv4f64>; - defm VFMSUBADDPS4 : fma4p<0x5E, "vfmsubaddps", int_x86_fma4_vfmsubadd_ps, +defm VFMSUBADDPS4 : fma4p<0x5E, "vfmsubaddps", int_x86_fma4_vfmsubadd_ps, int_x86_fma4_vfmsubadd_ps_256, memopv4f32, memopv8f32>; - defm VFMSUBADDPD4 : fma4p<0x5F, "vfmsubaddpd", int_x86_fma4_vfmsubadd_pd, +defm VFMSUBADDPD4 : fma4p<0x5F, "vfmsubaddpd", int_x86_fma4_vfmsubadd_pd, int_x86_fma4_vfmsubadd_pd_256, memopv2f64, memopv4f64>; -} Modified: llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt?rev=147367&r1=147366&r2=147367&view=diff ============================================================================== --- llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt (original) +++ llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt Thu Dec 29 23:20:36 2011 @@ -683,3 +683,9 @@ # CHECK: vfmadd132sd (%rax), %xmm12, %xmm10 0xc4 0x62 0x99 0x99 0x10 + +# CHEDCK: vfmaddss (%rcx), %xmm1, 
%xmm0, %xmm0 +0xc4 0xe3 0xf9 0x6a 0x01 0x10 + +# CHEDCK: vfmaddss %xmm1, (%rcx), %xmm0, %xmm0 +0xc4 0xe3 0x79 0x6a 0x01 0x10 Modified: llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp?rev=147367&r1=147366&r2=147367&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp (original) +++ llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp Thu Dec 29 23:20:36 2011 @@ -221,6 +221,7 @@ HasVEX_4VPrefix = Rec->getValueAsBit("hasVEX_4VPrefix"); HasVEX_4VOp3Prefix = Rec->getValueAsBit("hasVEX_4VOp3Prefix"); HasVEX_WPrefix = Rec->getValueAsBit("hasVEX_WPrefix"); + HasMemOp4Prefix = Rec->getValueAsBit("hasMemOp4Prefix"); IgnoresVEX_L = Rec->getValueAsBit("ignoresVEX_L"); HasLockPrefix = Rec->getValueAsBit("hasLockPrefix"); IsCodeGenOnly = Rec->getValueAsBit("isCodeGenOnly"); @@ -690,6 +691,9 @@ // in ModRMVEX and the one above the one in the VEX.VVVV field HANDLE_OPERAND(vvvvRegister) + if (HasMemOp4Prefix) + HANDLE_OPERAND(immediate) + HANDLE_OPERAND(rmRegister) if (HasVEX_4VOp3Prefix) @@ -717,6 +721,9 @@ // in ModRMVEX and the one above the one in the VEX.VVVV field HANDLE_OPERAND(vvvvRegister) + if (HasMemOp4Prefix) + HANDLE_OPERAND(immediate) + HANDLE_OPERAND(memory) if (HasVEX_4VOp3Prefix) Modified: llvm/trunk/utils/TableGen/X86RecognizableInstr.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/X86RecognizableInstr.h?rev=147367&r1=147366&r2=147367&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/X86RecognizableInstr.h (original) +++ llvm/trunk/utils/TableGen/X86RecognizableInstr.h Thu Dec 29 23:20:36 2011 @@ -62,7 +62,9 @@ bool HasVEX_WPrefix; /// Inferred from the operands; indicates whether the L bit in the VEX prefix is set bool HasVEX_LPrefix; - // The ignoreVEX_L field from the record + /// The 
hasMemOp4Prefix field from the record + bool HasMemOp4Prefix; + /// The ignoreVEX_L field from the record bool IgnoresVEX_L; /// The hasLockPrefix field from the record bool HasLockPrefix; From craig.topper at gmail.com Fri Dec 30 00:23:40 2011 From: craig.topper at gmail.com (Craig Topper) Date: Fri, 30 Dec 2011 06:23:40 -0000 Subject: [llvm-commits] [llvm] r147368 - in /llvm/trunk: lib/Target/X86/Disassembler/X86DisassemblerDecoder.c lib/Target/X86/X86InstrXOP.td test/MC/Disassembler/X86/simple-tests.txt utils/TableGen/X86RecognizableInstr.cpp Message-ID: <20111230062340.5CE4B2A6C12C@llvm.org> Author: ctopper Date: Fri Dec 30 00:23:39 2011 New Revision: 147368 URL: http://llvm.org/viewvc/llvm-project?rev=147368&view=rev Log: Add disassembler support for VPERMIL2PD and VPERMIL2PS. Modified: llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c llvm/trunk/lib/Target/X86/X86InstrXOP.td llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp Modified: llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c?rev=147368&r1=147367&r2=147368&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c (original) +++ llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c Fri Dec 30 00:23:39 2011 @@ -1472,6 +1472,7 @@ static int readOperands(struct InternalInstruction* insn) { int index; int hasVVVV, needVVVV; + int sawRegImm = 0; dbgprintf(insn, "readOperands()"); @@ -1500,11 +1501,20 @@ dbgprintf(insn, "We currently don't hande code-offset encodings"); return -1; case ENCODING_IB: + if (sawRegImm) { + // saw a register immediate so don't read again and instead split the previous immediate + // FIXME: This is a hack + insn->immediates[insn->numImmediatesConsumed++] = 
insn->immediates[insn->numImmediatesConsumed - 1] & 0xf; + break; + } if (readImmediate(insn, 1)) return -1; if (insn->spec->operands[index].type == TYPE_IMM3 && insn->immediates[insn->numImmediatesConsumed - 1] > 7) return -1; + if (insn->spec->operands[index].type == TYPE_XMM128 || + insn->spec->operands[index].type == TYPE_XMM256) + sawRegImm = 1; break; case ENCODING_IW: if (readImmediate(insn, 2)) Modified: llvm/trunk/lib/Target/X86/X86InstrXOP.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrXOP.td?rev=147368&r1=147367&r2=147368&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrXOP.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrXOP.td Fri Dec 30 00:23:39 2011 @@ -237,7 +237,5 @@ []>; } -let isAsmParserOnly = 1 in { - defm VPERMIL2PD : xop5op<0x49, "vpermil2pd">; - defm VPERMIL2PS : xop5op<0x48, "vpermil2ps">; -} +defm VPERMIL2PD : xop5op<0x49, "vpermil2pd">; +defm VPERMIL2PS : xop5op<0x48, "vpermil2ps">; Modified: llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt?rev=147368&r1=147367&r2=147368&view=diff ============================================================================== --- llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt (original) +++ llvm/trunk/test/MC/Disassembler/X86/simple-tests.txt Fri Dec 30 00:23:39 2011 @@ -684,8 +684,11 @@ # CHECK: vfmadd132sd (%rax), %xmm12, %xmm10 0xc4 0x62 0x99 0x99 0x10 -# CHEDCK: vfmaddss (%rcx), %xmm1, %xmm0, %xmm0 +# CHECK: vfmaddss (%rcx), %xmm1, %xmm0, %xmm0 0xc4 0xe3 0xf9 0x6a 0x01 0x10 -# CHEDCK: vfmaddss %xmm1, (%rcx), %xmm0, %xmm0 +# CHECK: vfmaddss %xmm1, (%rcx), %xmm0, %xmm0 0xc4 0xe3 0x79 0x6a 0x01 0x10 + +# CHECK: vpermil2ps $1, 4(%rax), %xmm2, %xmm3, %xmm0 +0xc4 0xe3 0xe1 0x48 0x40 0x04 0x21 Modified: llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp?rev=147368&r1=147367&r2=147368&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp (original) +++ llvm/trunk/utils/TableGen/X86RecognizableInstr.cpp Fri Dec 30 00:23:39 2011 @@ -559,7 +559,7 @@ bool hasFROperands = false; - assert(numOperands < X86_MAX_OPERANDS && "X86_MAX_OPERANDS is not large enough"); + assert(numOperands <= X86_MAX_OPERANDS && "X86_MAX_OPERANDS is not large enough"); for (operandIndex = 0; operandIndex < numOperands; ++operandIndex) { if (OperandList[operandIndex].Constraints.size()) { @@ -678,7 +678,7 @@ // Operand 3 (optional) is an immediate. if (HasVEX_4VPrefix || HasVEX_4VOp3Prefix) - assert(numPhysicalOperands >= 3 && numPhysicalOperands <= 4 && + assert(numPhysicalOperands >= 3 && numPhysicalOperands <= 5 && "Unexpected number of operands for MRMSrcRegFrm with VEX_4V"); else assert(numPhysicalOperands >= 2 && numPhysicalOperands <= 3 && @@ -699,7 +699,9 @@ if (HasVEX_4VOp3Prefix) HANDLE_OPERAND(vvvvRegister) - HANDLE_OPTIONAL(immediate) + if (!HasMemOp4Prefix) + HANDLE_OPTIONAL(immediate) + HANDLE_OPTIONAL(immediate) // above might be a register in 7:4 break; case X86Local::MRMSrcMem: // Operand 1 is a register operand in the Reg/Opcode field. @@ -708,7 +710,7 @@ // Operand 3 (optional) is an immediate. 
if (HasVEX_4VPrefix || HasVEX_4VOp3Prefix) - assert(numPhysicalOperands >= 3 && numPhysicalOperands <= 4 && + assert(numPhysicalOperands >= 3 && numPhysicalOperands <= 5 && "Unexpected number of operands for MRMSrcMemFrm with VEX_4V"); else assert(numPhysicalOperands >= 2 && numPhysicalOperands <= 3 && @@ -729,7 +731,9 @@ if (HasVEX_4VOp3Prefix) HANDLE_OPERAND(vvvvRegister) - HANDLE_OPTIONAL(immediate) + if (!HasMemOp4Prefix) + HANDLE_OPTIONAL(immediate) + HANDLE_OPTIONAL(immediate) // above might be a register in 7:4 break; case X86Local::MRM0r: case X86Local::MRM1r: From craig.topper at gmail.com Fri Dec 30 01:16:00 2011 From: craig.topper at gmail.com (Craig Topper) Date: Fri, 30 Dec 2011 07:16:00 -0000 Subject: [llvm-commits] [llvm] r147369 - /llvm/trunk/lib/Target/X86/X86.td Message-ID: <20111230071600.4435D2A6C12C@llvm.org> Author: ctopper Date: Fri Dec 30 01:16:00 2011 New Revision: 147369 URL: http://llvm.org/viewvc/llvm-project?rev=147369&view=rev Log: Make FMA4 imply AVX so that YMM registers would be available. Necessitates removing from Bulldozer CPU types since it would enable AVX code generation implicitly. Also make SSE4A imply SSE3. Without some level of SSE implied, XMM registers wouldn't be legal. 
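To illustrate why making FMA4 imply AVX forces FMA4 out of the Bulldozer CPU definitions, here is a minimal C++ sketch of transitive feature implication. It is not LLVM's subtarget code: ImpliesMap, impliesFromThisCommit, expandFeatures, and the lowercase feature strings are hypothetical names chosen for this example. Only the two implications (fma4 implies avx, sse4a implies sse3) are taken from the log message.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical table: each feature maps to the features it directly implies.
using ImpliesMap = std::map<std::string, std::vector<std::string>>;

// The two implications described in the log message above (illustrative only).
ImpliesMap impliesFromThisCommit() {
  return {{"fma4", {"avx"}}, {"sse4a", {"sse3"}}};
}

// Expand a requested feature list with everything it implies, transitively.
std::set<std::string> expandFeatures(const ImpliesMap &implies,
                                     const std::vector<std::string> &requested) {
  std::set<std::string> enabled;
  std::vector<std::string> worklist(requested.begin(), requested.end());
  while (!worklist.empty()) {
    std::string feature = worklist.back();
    worklist.pop_back();
    // Skip features that were already enabled; this also guards against cycles.
    if (!enabled.insert(feature).second)
      continue;
    ImpliesMap::const_iterator it = implies.find(feature);
    if (it == implies.end())
      continue;
    for (const std::string &implied : it->second)
      worklist.push_back(implied);
  }
  return enabled;
}
```

Under these assumptions, requesting "sse4a" and "fma4" also turns on "sse3" and "avx", so a CPU definition that must not enable AVX code generation can no longer list FMA4.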
Modified: llvm/trunk/lib/Target/X86/X86.td Modified: llvm/trunk/lib/Target/X86/X86.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=147369&r1=147368&r2=147369&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86.td (original) +++ llvm/trunk/lib/Target/X86/X86.td Fri Dec 30 01:16:00 2011 @@ -77,7 +77,8 @@ "IsUAMemFast", "true", "Fast unaligned memory access">; def FeatureSSE4A : SubtargetFeature<"sse4a", "HasSSE4A", "true", - "Support SSE 4a instructions">; + "Support SSE 4a instructions", + [FeatureSSE3]>; def FeatureAVX : SubtargetFeature<"avx", "HasAVX", "true", "Enable AVX instructions">; @@ -90,8 +91,9 @@ "Enable three-operand fused multiple-add", [FeatureAVX]>; def FeatureFMA4 : SubtargetFeature<"fma4", "HasFMA4", "true", - "Enable four-operand fused multiple-add">; -def FeatureXOP : SubtargetFeature<"xop", "HasXOP", "true", + "Enable four-operand fused multiple-add", + [FeatureAVX]>; +def FeatureXOP : SubtargetFeature<"xop", "HasXOP", "true", "Enable XOP instructions">; def FeatureVectorUAMem : SubtargetFeature<"vector-unaligned-mem", "HasVectorUAMem", "true", @@ -201,12 +203,12 @@ def : Proc<"amdfam10", [FeatureSSE3, FeatureSSE4A, Feature3DNowA, FeatureCMPXCHG16B, FeatureLZCNT, FeaturePOPCNT, FeatureSlowBTMem]>; -// FIXME: Disabling AVX for now since it's not ready. +// FIXME: Disabling AVX/FMA4 for now since it's not ready. 
def : Proc<"bdver1", [FeatureSSE42, FeatureSSE4A, FeatureCMPXCHG16B, - FeatureAES, FeatureCLMUL, FeatureFMA4, + FeatureAES, FeatureCLMUL, FeatureXOP, FeatureLZCNT, FeaturePOPCNT]>; def : Proc<"bdver2", [FeatureSSE42, FeatureSSE4A, FeatureCMPXCHG16B, - FeatureAES, FeatureCLMUL, FeatureFMA4, + FeatureAES, FeatureCLMUL, FeatureXOP, FeatureF16C, FeatureLZCNT, FeaturePOPCNT, FeatureBMI]>; From tobias at grosser.es Fri Dec 30 03:09:39 2011 From: tobias at grosser.es (Tobias Grosser) Date: Fri, 30 Dec 2011 10:09:39 +0100 Subject: [llvm-commits] [LLVMdev] [PATCH] BasicBlock Autovectorization Pass In-Reply-To: <1325179929.13080.2839.camel@sapling> References: <1319928991.23036.957.camel@sapling> <1320108633.23036.1266.camel@sapling> <1320172356.23036.1298.camel@sapling> <4EB0462C.5010209@grosser.es> <1320184739.23036.1334.camel@sapling> <1320191694.23036.1497.camel@sapling> <1320749109.19359.76.camel@sapling> <4EB90E98.4010805@grosser.es> <1320762963.19359.117.camel@sapling> <4EB98207.2070807@grosser.es> <1320791390.19359.262.camel@sapling> <4EBC4B0F.6010609@grosser.es> <1321050998.19359.539.camel@sapling> <4EBDA7F9.9080709@grosser.es> <1321053083.19359.550.camel@sapling> <4EBDB1BF.7090006@grosser.es> <1321400339.19359.782.camel@sapling> <1321486739.19359.1067.camel@sapling> <4EC504B5.2020408@grosser.es> <1321898108.2507.36.camel@sapling> <1321932161.2507.101.camel@sapling> <1322067157.2507.263.camel@sapling> <4ED8F7B0.8050309@grosser.es> <1323822351.590.1687.camel@sapling> <4EFC7291.9040808@grosser.es> <1325179929.13080.2839.camel@sapling> Message-ID: <4EFD7FD3.8040800@grosser.es> On 12/29/2011 06:32 PM, Hal Finkel wrote: > On Thu, 2011-12-29 at 15:00 +0100, Tobias Grosser wrote: >> On 12/14/2011 01:25 AM, Hal Finkel wrote: >> One thing that I would still like to have is a test case where >> bb-vectorize-search-limit is needed to avoid exponential compile time >> growth and another test case that is not optimized, if >> bb-vectorize-search-limit is set to a value less than 
4000. I think >> those cases are very valuable to understand the performance behavior of >> this code. > > Good idea, I'll add these test cases. > >> Especially, as I am not yet sure why we need a value as high >> as 4000. > > I am not exactly sure why that turned out to be the best number, but > I'll try this again in combination with my load/store reordering patch > and see if such a large value still seems best. The reason I am surprised by this value is that I believe partial loop unrolling would not yield basic blocks of this size. Code size limits should prevent that. However, loop unrolling seems to be the major reason why two accesses to adjacent memory may be placed far apart. Without loop unrolling, at a distance of 4000 the fact that two instructions access adjacent memory locations seems to be completely random, and the probability that the following instructions perform the same calculations seems low. Also, I believe at 4000 the compile time should already be significantly higher. As it seems my intuition is wrong, I am very eager to see and understand an example where a search limit of 4000 is really needed. Cheers Tobi From stpworld at narod.ru Fri Dec 30 06:14:36 2011 From: stpworld at narod.ru (Stepan Dyatkovskiy) Date: Fri, 30 Dec 2011 16:14:36 +0400 Subject: [llvm-commits] [LLVM, opt, LoopUnswitch] Compile-time improvements. Message-ID: <4EFDAB2C.5000606@narod.ru> Hi. I made some fixes that improve compile time: 1. The size heuristics changed. Now we calculate the number of unswitching branches only once per loop. 2. Some checks were moved from UnswitchIfProfitable to processCurrentLoop, since their results do not change during a processCurrentLoop iteration. This allows deciding to skip some loops at an early stage. I checked the compile time on the test MultiSource/Benchmarks/Prolangs-C++/shapes/shapes (there was a compile-time regression after my previous patch). Relative to my previous patch the compile time improved by ~8.5%. 
Relative to old revisions (before r146578) the compile time is improved by ~2%. Please find the patch attached for review. -Stepan. -------------- next part -------------- A non-text attachment was scrubbed... Name: loop-unswitch-unswitchvals-1.1.patch Type: text/x-patch Size: 9194 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111230/8bf30ffa/attachment.bin From stpworld at narod.ru Fri Dec 30 06:15:50 2011 From: stpworld at narod.ru (Stepan Dyatkovskiy) Date: Fri, 30 Dec 2011 16:15:50 +0400 Subject: [llvm-commits] [LLVM, loop-unswitch, bugfix for #11429] Wrong behaviour for switches. In-Reply-To: <108781325151469@web84.yandex.ru> References: <4ECF4A5A.9030607@narod.ru> <4ED36D99.1080306@narod.ru> <4ED51EEC.6050000@narod.ru> <4ED749D0.8030101@narod.ru> <4ED88CDF.2020104@narod.ru> <4EDCC136.1040903@narod.ru> <566DBA1B-7099-44E0-B2EB-3413A054F5E3@apple.com> <4EDF700A.3080605@narod.ru> <9F9EFCC2-3F32-49F8-97D6-B8BAC580B6F7@apple.com> <4EE36B3F.3090307@narod.ru> <0E3FA92F-92F2-45F5-9247-0A1934901F01@apple.com> <4EE8F7C4.2040002@narod.ru> <4EFAF2B3.1010806@narod.ru> <108781325151469@web84.yandex.ru> Message-ID: <4EFDAB76.9070102@narod.ru> Moved to another thread: "[LLVM, opt, LoopUnswitch] Compile-time improvements". -Stepan. Stepan Dyatkovskiy wrote: > Hi. I made some fixes that improve compile time: >> >> 1. The size heuristics changed. Now we calculate the number of unswitching >> branches only once per loop. >> 2. Some checks were moved from UnswitchIfProfitable to >> processCurrentLoop, since their results do not change during a processCurrentLoop >> iteration. This allows deciding to skip some loops at an early stage. >> >> I checked the compile time on the test >> >> MultiSource/Benchmarks/Prolangs-C++/shapes/shapes >> (there was a compile-time regression after my previous patch). >> >> Relative to my previous patch the compile time improved by ~8.5%. Relative >> to old revisions (before r146578) the compile time is improved by ~2%. 
>> >> Please find the patch attached for review. From rafael.espindola at gmail.com Fri Dec 30 08:33:23 2011 From: rafael.espindola at gmail.com (Rafael Ávila de Espíndola) Date: Fri, 30 Dec 2011 09:33:23 -0500 Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc In-Reply-To: <1325217175.13080.2933.camel@sapling> References: <1325190024.13080.2848.camel@sapling> <4EFD1D73.5070407@gmail.com> <1325217175.13080.2933.camel@sapling> Message-ID: <4EFDCBB3.4070207@gmail.com> > I don't know exactly what version of binutils made the change, but I > believe it was sometime in 2011 (guessing from online bug reports); > binutils 2.21.1 displays the new behavior, 2.20.51.0.2 (which is from > 2009) does not. I have been unable to reproduce this with 6d17265fc58ec069dd1079317f14c9cdcbb60e06 (git://sourceware.org/git/binutils.git) targeting powerpc-pc-linux. I was able to assemble ----------------------------------------------- .file "test.c" .text .globl f .type f,@function f: # @f .Ltmp0: .cfi_startproc .Ltmp1: .size f, .Ltmp1-f .Ltmp2: .cfi_endproc .Leh_func_end0: ------------------------------------------------ I wonder if it was a temporary bug in gas. Could you please provide a .s and target triple that gas rejects? > I think that always using the local symbol is going to be a safe choice > (apparently safer than using the global symbol name). Do we have a > uniform way of getting that symbol? 
> > Thanks again, > Hal Thanks, Rafael From joerg at britannica.bec.de Fri Dec 30 08:50:35 2011 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Fri, 30 Dec 2011 15:50:35 +0100 Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc In-Reply-To: <4EFD1D73.5070407@gmail.com> References: <1325190024.13080.2848.camel@sapling> <4EFD1D73.5070407@gmail.com> Message-ID: <20111230145035.GA25733@britannica.bec.de> On Thu, Dec 29, 2011 at 09:09:55PM -0500, Rafael Ávila de Espíndola wrote: > On 29/12/11 03:20 PM, Hal Finkel wrote: > > This small patch fixes a compatibility problem between the ppc linux asm > > printer and recent versions of binutils (as explained in the test case). > > With this patch, llvm's behavior will again match that of gcc. It > > touches CodeGen/AsmPrinter; please review. > > Do you know when this was changed in binutils? Maybe we should always > use the local symbol instead of making this a ppc only change. Or more importantly, why it was changed. There seems to be a plethora of questionable changes in binutils lately. Joerg From hfinkel at anl.gov Fri Dec 30 08:58:41 2011 From: hfinkel at anl.gov (Hal Finkel) Date: Fri, 30 Dec 2011 08:58:41 -0600 Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc In-Reply-To: <4EFDCBB3.4070207@gmail.com> References: <1325190024.13080.2848.camel@sapling> <4EFD1D73.5070407@gmail.com> <1325217175.13080.2933.camel@sapling> <4EFDCBB3.4070207@gmail.com> Message-ID: <1325257121.13080.2963.camel@sapling> On Fri, 2011-12-30 at 09:33 -0500, Rafael Ávila de Espíndola wrote: > > I don't know exactly what version of binutils made the change, but I > > believe it was sometime in 2011 (guessing from online bug reports); > > binutils 2.21.1 displays the new behavior, 2.20.51.0.2 (which is from > > 2009) does not. > > I have been unable to reproduce this with > 6d17265fc58ec069dd1079317f14c9cdcbb60e06 > (git://sourceware.org/git/binutils.git) targeting powerpc-pc-linux. 
I > was able to assemble > > ----------------------------------------------- > .file "test.c" > .text > .globl f > .type f,@function > f: # @f > .Ltmp0: > .cfi_startproc > .Ltmp1: > .size f, .Ltmp1-f > .Ltmp2: > .cfi_endproc > .Leh_func_end0: > ------------------------------------------------ > > I wonder if it was a temporary bug in gas. Could you please provide a .s > and target triple that gas rejects? I think that you can just use the output from llc with the test case in my patch. I apologize as I may not have said this explicitly, but the problem only appears on powerpc64 for linux (powerpc64-unknown-linux-gnu). If that does not reproduce the problem for you, I'll send you a failing .s file so that we can compare. Thanks again, Hal > > > I think that always using the local symbol is going to be a safe choice > > (apparently safer than using the global symbol name). Do we have a > > uniform way of getting that symbol? > > > > Thanks again, > > Hal > > Thanks, > Rafael -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory From hfinkel at anl.gov Fri Dec 30 10:01:10 2011 From: hfinkel at anl.gov (Hal Finkel) Date: Fri, 30 Dec 2011 10:01:10 -0600 Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc In-Reply-To: <20111230145035.GA25733@britannica.bec.de> References: <1325190024.13080.2848.camel@sapling> <4EFD1D73.5070407@gmail.com> <20111230145035.GA25733@britannica.bec.de> Message-ID: <1325260870.13080.2970.camel@sapling> On Fri, 2011-12-30 at 15:50 +0100, Joerg Sonnenberger wrote: > On Thu, Dec 29, 2011 at 09:09:55PM -0500, Rafael Ávila de Espíndola wrote: > > On 29/12/11 03:20 PM, Hal Finkel wrote: > > > This small patch fixes a compatibility problem between the ppc linux asm > > > printer and recent versions of binutils (as explained in the test case). > > > With this patch, llvm's behavior will again match that of gcc. It > > > touches CodeGen/AsmPrinter; please review. 
> > > > Do you know when this was changed in binutils? Maybe we should always > > use the local symbol instead of making this a ppc only change. > > Or more importantly, why it was changed. There seems to be a plethora of > questionable changes in binutils lately. To be fair, the current behavior produces assembly like: test1: .quad .L.test1,.TOC.@tocbase .previous .L.test1: ... .Ltmp0: .size test1, .Ltmp0-test1 so this is asking gas to produce a size constant by subtracting two labels from different (sub)sections. Is that generally legal? Once I realized what it was doing, I was surprised that it had ever worked. -Hal > > Joerg > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory From joerg at britannica.bec.de Fri Dec 30 10:12:01 2011 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Fri, 30 Dec 2011 17:12:01 +0100 Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc In-Reply-To: <1325260870.13080.2970.camel@sapling> References: <1325190024.13080.2848.camel@sapling> <4EFD1D73.5070407@gmail.com> <20111230145035.GA25733@britannica.bec.de> <1325260870.13080.2970.camel@sapling> Message-ID: <20111230161201.GA27811@britannica.bec.de> On Fri, Dec 30, 2011 at 10:01:10AM -0600, Hal Finkel wrote: > On Fri, 2011-12-30 at 15:50 +0100, Joerg Sonnenberger wrote: > > On Thu, Dec 29, 2011 at 09:09:55PM -0500, Rafael Ávila de Espíndola wrote: > > > On 29/12/11 03:20 PM, Hal Finkel wrote: > > > > This small patch fixes a compatibility problem between the ppc linux asm > > > > printer and recent versions of binutils (as explained in the test case). > > > > With this patch, llvm's behavior will again match that of gcc. It > > > > touches CodeGen/AsmPrinter; please review. > > > > > > Do you know when this was changed in binutils? 
Maybe we should always > > > use the local symbol instead of making this a ppc only change. > > > > Or more importantly, why it was changed. There seems to be a plethora of > > questionable changes in binutils lately. > > To be fair, the current behavior produces assembly like: > test1: > .quad .L.test1,.TOC.@tocbase > .previous > .L.test1: > ... > .Ltmp0: > .size test1, .Ltmp0-test1 > > so this is asking gas to produce a size constant by subtracting two > labels from different (sub)sections. Is that generally legal? Once I > realized what it was doing, I was surprised that it had ever worked. OK, this is a bit questionable. What's the purpose of the .previous again? Could we use the more modern .pushsection/.popsection instead? They are a bit less nasty in the backend... I don't think there is a good reason for not just always creating the local label to avoid the difference. Joerg From rafael.espindola at gmail.com Fri Dec 30 10:37:24 2011 From: rafael.espindola at gmail.com (Rafael Ávila de Espíndola) Date: Fri, 30 Dec 2011 11:37:24 -0500 Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc In-Reply-To: <1325257121.13080.2963.camel@sapling> References: <1325190024.13080.2848.camel@sapling> <4EFD1D73.5070407@gmail.com> <1325217175.13080.2933.camel@sapling> <4EFDCBB3.4070207@gmail.com> <1325257121.13080.2963.camel@sapling> Message-ID: <4EFDE8C4.7080500@gmail.com> > I think that you can just use the output from llc with the test case in > my patch. I apologize as I may not have said this explicitly, but the > problem only appears on powerpc64 for linux > (powerpc64-unknown-linux-gnu). If that does not reproduce the problem > for you, I'll send you a failing .s file so that we can compare. OK, I can reproduce this, but it looks like a bug on our end. We are printing something like .text .section ".opd","aw" test1: .quad .L.test1,.TOC.@tocbase 
.previous .Ltmp0: .size test1, .Ltmp0-test1 So test1 ends up in section ".opd". Is this really what we want? What does gcc print for int test1(int x) {return x;} ? > Thanks again, > Hal Cheers, Rafael From hfinkel at anl.gov Fri Dec 30 10:41:00 2011 From: hfinkel at anl.gov (Hal Finkel) Date: Fri, 30 Dec 2011 10:41:00 -0600 Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc In-Reply-To: <20111230161201.GA27811@britannica.bec.de> References: <1325190024.13080.2848.camel@sapling> <4EFD1D73.5070407@gmail.com> <20111230145035.GA25733@britannica.bec.de> <1325260870.13080.2970.camel@sapling> <20111230161201.GA27811@britannica.bec.de> Message-ID: <1325263260.13080.2981.camel@sapling> On Fri, 2011-12-30 at 17:12 +0100, Joerg Sonnenberger wrote: > On Fri, Dec 30, 2011 at 10:01:10AM -0600, Hal Finkel wrote: > > On Fri, 2011-12-30 at 15:50 +0100, Joerg Sonnenberger wrote: > > > On Thu, Dec 29, 2011 at 09:09:55PM -0500, Rafael Ávila de Espíndola wrote: > > > > On 29/12/11 03:20 PM, Hal Finkel wrote: > > > > > This small patch fixes a compatibility problem between the ppc linux asm > > > > > printer and recent versions of binutils (as explained in the test case). > > > > > With this patch, llvm's behavior will again match that of gcc. It > > > > > touches CodeGen/AsmPrinter; please review. > > > > > > > > Do you know when this was changed in binutils? Maybe we should always > > > > use the local symbol instead of making this a ppc only change. > > > > > > Or more importantly, why it was changed. There seems to be a plethora of > > > questionable changes in binutils lately. > > > > To be fair, the current behavior produces assembly like: > > test1: > > .quad .L.test1,.TOC.@tocbase > > .previous > > .L.test1: > > ... > > .Ltmp0: > > .size test1, .Ltmp0-test1 > > > > so this is asking gas to produce a size constant by subtracting two > > labels from different (sub)sections. Is that generally legal? 
Once I > > realized what it was doing, I was surprised that it had ever worked. > > OK, this is a bit questionable. What's the purpose of the .previous > again? Could we use the more modern .pushsection/.popsection instead? > They are a bit less nasty in the backend... My experience here is limited, so I'm not sure. Does the ".quad .L.test1,.TOC.@tocbase" induce an implied section change? > > I don't think there is a good reason for not just always creating the > local label to avoid the difference. I agree, this sounds like a good idea. -Hal > > Joerg > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory From hfinkel at anl.gov Fri Dec 30 10:53:31 2011 From: hfinkel at anl.gov (Hal Finkel) Date: Fri, 30 Dec 2011 10:53:31 -0600 Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc In-Reply-To: <4EFDE8C4.7080500@gmail.com> References: <1325190024.13080.2848.camel@sapling> <4EFD1D73.5070407@gmail.com> <1325217175.13080.2933.camel@sapling> <4EFDCBB3.4070207@gmail.com> <1325257121.13080.2963.camel@sapling> <4EFDE8C4.7080500@gmail.com> Message-ID: <1325264011.13080.2985.camel@sapling> On Fri, 2011-12-30 at 11:37 -0500, Rafael Ávila de Espíndola wrote: > > I think that you can just use the output from llc with the test case in > > my patch. I apologize as I may not have said this explicitly, but the > > problem only appears on powerpc64 for linux > > (powerpc64-unknown-linux-gnu). If that does not reproduce the problem > > for you, I'll send you a failing .s file so that we can compare. > > OK, I can reproduce this, but it looks like a bug on our end. We are > printing something like > > .text > .section ".opd","aw" > test1: > .quad .L.test1,.TOC.@tocbase 
> .previous > .Ltmp0: > .size test1, .Ltmp0-test1 > > So test1 ends up in section ".opd". Is this really what we want? What > does gcc print for I am not sure, but that seems to be what gcc does: .file "h.c" .section ".toc","aw" .section ".text" .align 2 .p2align 4,,15 .globl test1 .section ".opd","aw" .align 3 test1: .quad .L.test1,.TOC.@tocbase .previous .type test1, @function .L.test1: blr .long 0 .byte 0,0,0,0,0,0,0,0 .size test1,.-.L.test1 .ident "GCC: (GNU) 4.4.4 20100726 (Red Hat 4.4.4-13)" .section .note.GNU-stack,"",@progbits -Hal > > int test1(int x) {return x;} > > ? > > > Thanks again, > > Hal > > Cheers, > Rafael -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory From rafael.espindola at gmail.com Fri Dec 30 11:30:07 2011 From: rafael.espindola at gmail.com (Rafael Ávila de Espíndola) Date: Fri, 30 Dec 2011 12:30:07 -0500 Subject: [llvm-commits] [PATCH] Use an alternate symbol for function-size calc In-Reply-To: <1325264011.13080.2985.camel@sapling> References: <1325190024.13080.2848.camel@sapling> <4EFD1D73.5070407@gmail.com> <1325217175.13080.2933.camel@sapling> <4EFDCBB3.4070207@gmail.com> <1325257121.13080.2963.camel@sapling> <4EFDE8C4.7080500@gmail.com> <1325264011.13080.2985.camel@sapling> Message-ID: <4EFDF51F.4000609@gmail.com> > I am not sure, but that seems to be what gcc does: > .section ".opd","aw" ... > test1: ... > .previous > .L.test1: ... > .size test1,.-.L.test1 Same behavior, interesting. I think your original patch is OK then. This is needed because of a particularity of how ppc64 works. 
> -Hal > Cheers, Rafael From nicholas at mxc.ca Fri Dec 30 13:17:24 2011 From: nicholas at mxc.ca (Nick Lewycky) Date: Fri, 30 Dec 2011 19:17:24 -0000 Subject: [llvm-commits] [llvm] r147379 - /llvm/trunk/tools/llvm-extract/llvm-extract.cpp Message-ID: <20111230191724.2B1ED2A6C12C@llvm.org> Author: nicholas Date: Fri Dec 30 13:17:23 2011 New Revision: 147379 URL: http://llvm.org/viewvc/llvm-project?rev=147379&view=rev Log: Remove extraneous ".get()->" which is just "->". No functionality change. Modified: llvm/trunk/tools/llvm-extract/llvm-extract.cpp Modified: llvm/trunk/tools/llvm-extract/llvm-extract.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-extract/llvm-extract.cpp?rev=147379&r1=147378&r2=147379&view=diff ============================================================================== --- llvm/trunk/tools/llvm-extract/llvm-extract.cpp (original) +++ llvm/trunk/tools/llvm-extract/llvm-extract.cpp Fri Dec 30 13:17:23 2011 @@ -99,7 +99,7 @@ // Figure out which globals we should extract. for (size_t i = 0, e = ExtractGlobals.size(); i != e; ++i) { - GlobalValue *GV = M.get()->getNamedGlobal(ExtractGlobals[i]); + GlobalValue *GV = M->getNamedGlobal(ExtractGlobals[i]); if (!GV) { errs() << argv[0] << ": program doesn't contain global named '" << ExtractGlobals[i] << "'!\n"; @@ -117,8 +117,8 @@ "invalid regex: " << Error; } bool match = false; - for (Module::global_iterator GV = M.get()->global_begin(), - E = M.get()->global_end(); GV != E; GV++) { + for (Module::global_iterator GV = M->global_begin(), + E = M->global_end(); GV != E; GV++) { if (RegEx.match(GV->getName())) { GVs.insert(&*GV); match = true; @@ -133,7 +133,7 @@ // Figure out which functions we should extract. 
for (size_t i = 0, e = ExtractFuncs.size(); i != e; ++i) { - GlobalValue *GV = M.get()->getFunction(ExtractFuncs[i]); + GlobalValue *GV = M->getFunction(ExtractFuncs[i]); if (!GV) { errs() << argv[0] << ": program doesn't contain function named '" << ExtractFuncs[i] << "'!\n"; @@ -151,7 +151,7 @@ "invalid regex: " << Error; } bool match = false; - for (Module::iterator F = M.get()->begin(), E = M.get()->end(); F != E; + for (Module::iterator F = M->begin(), E = M->end(); F != E; F++) { if (RegEx.match(F->getName())) { GVs.insert(&*F); From stpworld at narod.ru Fri Dec 30 13:41:32 2011 From: stpworld at narod.ru (Stepan Dyatkovskiy) Date: Fri, 30 Dec 2011 23:41:32 +0400 Subject: [llvm-commits] [LLVM, PR11652 PATCH]: Fixed Bug 11652 - assertion failures when Type.cpp is compiled with -Os Message-ID: <4EFE13EC.2010801@narod.ru> The problem is in Type.h. The fields in the Type class are declared in the following order: TypeID ID : 8; unsigned SubclassData : 24; unsigned NumContainedTys; An attempt to set a new SubclassData value overwrites the lowest byte of NumContainedTys when -Os is set. GCC bug? Anyway, setting SubclassData with two workaround lines fixes the problem: void setSubclassData(unsigned val) { unsigned tmp = NumContainedTys; // Workaround for GCC -Os SubclassData = val; NumContainedTys = tmp; // Workaround for GCC -Os // Ensure we don't have any accidental truncation. assert(SubclassData == val && "Subclass data too large for field"); } Perhaps there are other ways to protect NumContainedTys from being overwritten? Please find the patch attached for review. -Stepan. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 11652.patch Type: text/x-patch Size: 738 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111230/e5ea0279/attachment.bin From bruno.cardoso at gmail.com Fri Dec 30 15:07:21 2011 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Fri, 30 Dec 2011 19:07:21 -0200 Subject: [llvm-commits] [PATCH] AVX vmovaps +vxoprs + vinsertf128 DAG combine to vmovaps In-Reply-To: <9B9D99A1-48D5-4E4A-A32B-C97F167EFB51@apple.com> References: <9B9D99A1-48D5-4E4A-A32B-C97F167EFB51@apple.com> Message-ID: Hi Chad, On Thu, Dec 22, 2011 at 12:12 AM, Chad Rosier wrote: > This patch is for an AVX specific DAGcombine optimization. > > The following code: > > __m256 foo(float *f) { > return _mm256_castps128_ps256 (_mm_load_ps(f)); > } > > generates this assembly: > > vmovaps (%rdi), %xmm0 > vxorps %ymm1, %ymm1, %ymm1 > vinsertf128 $0, %xmm0, %ymm1, %ymm0 > > On AVX enabled processors, the vmovaps will zero the upper bits (255:128) of the corresponding YMM register. Therefore, the vxorps and vinsertf128 instructions are not necessary. > > This patch implements a DAG combine that removes the unnecessary vxorps and vinsertf128 instructions. Currently, this is only working as an enhancement to one of Bruno's DAGcombines (r135727), but I do plan on making this more general in the future. LGTM, just a few comments: + return DAG.getNode(ISD::BITCAST, dl, VT, ResNode); Since you're early returning here, + } else { + // Emit a zeroed vector and insert the desired subvector on its + // first half. + SDValue Zeros = getZeroVector(VT, true /* HasXMMInt */, DAG, dl); + SDValue InsV = Insert128BitVector(Zeros, V1.getOperand(0), + DAG.getConstant(0, MVT::i32), DAG, dl); + return DCI.CombineTo(N, InsV); + } to follow llvm coding style, you don't need the "else". +def X86vzload128 : SDNode<"X86ISD::VZEXT_LOAD128", SDTLoad, + [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>; + // VZEXT_LOAD128 - Load vector and zero extend.
+ VZEXT_LOAD128, + Why not use the previous X86vzload and VZEXT_LOAD instead? You can use it and still match correctly by using v4i64 in the pattern. -- Bruno Cardoso Lopes http://www.brunocardoso.cc From bruno.cardoso at gmail.com Fri Dec 30 15:04:30 2011 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Fri, 30 Dec 2011 21:04:30 -0000 Subject: [llvm-commits] [llvm] r147382 - in /llvm/trunk/lib/Target/Mips: MipsCodeEmitter.cpp MipsJITInfo.cpp MipsRelocations.h Message-ID: <20111230210430.E9FEF2A6C12C@llvm.org> Author: bruno Date: Fri Dec 30 15:04:30 2011 New Revision: 147382 URL: http://llvm.org/viewvc/llvm-project?rev=147382&view=rev Log: Improve Mips JIT. Implement encoder methods getJumpTargetOpValue and getBranchTargetOpValue for jmptarget and brtarget Mips tablegen operand types in the code emitter for old-style JIT. Rename the pc relative relocation for branches - new name is Mips::reloc_mips_pc16. Patch by Sasa Stankovic Modified: llvm/trunk/lib/Target/Mips/MipsCodeEmitter.cpp llvm/trunk/lib/Target/Mips/MipsJITInfo.cpp llvm/trunk/lib/Target/Mips/MipsRelocations.h Modified: llvm/trunk/lib/Target/Mips/MipsCodeEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsCodeEmitter.cpp?rev=147382&r1=147381&r2=147382&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsCodeEmitter.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsCodeEmitter.cpp Fri Dec 30 15:04:30 2011 @@ -163,7 +163,7 @@ return Mips::reloc_mips_26; if ((Form == MipsII::FrmI || Form == MipsII::FrmFI) && MI.isBranch()) - return Mips::reloc_mips_branch; + return Mips::reloc_mips_pc16; if (Form == MipsII::FrmI && MI.getOpcode() == Mips::LUi) return Mips::reloc_mips_hi; return Mips::reloc_mips_lo; @@ -171,13 +171,22 @@ unsigned MipsCodeEmitter::getJumpTargetOpValue(const MachineInstr &MI, unsigned OpNo) const { - // FIXME: implement + MachineOperand MO = MI.getOperand(OpNo); + if
(MO.isGlobal()) + emitGlobalAddress(MO.getGlobal(), getRelocation(MI, MO), true); + else if (MO.isSymbol()) + emitExternalSymbolAddress(MO.getSymbolName(), getRelocation(MI, MO)); + else if (MO.isMBB()) + emitMachineBasicBlock(MO.getMBB(), getRelocation(MI, MO)); + else + llvm_unreachable("Unexpected jump target operand kind."); return 0; } unsigned MipsCodeEmitter::getBranchTargetOpValue(const MachineInstr &MI, unsigned OpNo) const { - // FIXME: implement + MachineOperand MO = MI.getOperand(OpNo); + emitMachineBasicBlock(MO.getMBB(), getRelocation(MI, MO)); return 0; } Modified: llvm/trunk/lib/Target/Mips/MipsJITInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsJITInfo.cpp?rev=147382&r1=147381&r2=147382&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsJITInfo.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsJITInfo.cpp Fri Dec 30 15:04:30 2011 @@ -200,7 +200,7 @@ intptr_t ResultPtr = (intptr_t) MR->getResultPointer(); switch ((Mips::RelocationType) MR->getRelocationType()) { - case Mips::reloc_mips_branch: + case Mips::reloc_mips_pc16: ResultPtr = (((ResultPtr - (intptr_t) RelocPos) - 4) >> 2) & 0xffff; *((unsigned*) RelocPos) |= (unsigned) ResultPtr; break; Modified: llvm/trunk/lib/Target/Mips/MipsRelocations.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsRelocations.h?rev=147382&r1=147381&r2=147382&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsRelocations.h (original) +++ llvm/trunk/lib/Target/Mips/MipsRelocations.h Fri Dec 30 15:04:30 2011 @@ -20,10 +20,10 @@ namespace llvm { namespace Mips{ enum RelocationType { - // reloc_mips_branch - pc relative relocation for branches. The lower 18 + // reloc_mips_pc16 - pc relative relocation for branches. 
The lower 18 // bits of the difference between the branch target and the branch // instruction, shifted right by 2. - reloc_mips_branch = 1, + reloc_mips_pc16 = 1, // reloc_mips_hi - upper 16 bits of the address (modified by +1 if the // lower 16 bits of the address is negative). From bruno.cardoso at gmail.com Fri Dec 30 15:09:41 2011 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Fri, 30 Dec 2011 21:09:41 -0000 Subject: [llvm-commits] [llvm] r147383 - in /llvm/trunk/lib/Target/Mips: MipsAsmPrinter.cpp MipsCodeEmitter.cpp MipsMCInstLower.cpp MipsRegisterInfo.cpp Message-ID: <20111230210941.BA4ED2A6C12C@llvm.org> Author: bruno Date: Fri Dec 30 15:09:41 2011 New Revision: 147383 URL: http://llvm.org/viewvc/llvm-project?rev=147383&view=rev Log: Cleanup Mips code and rename some variables. Patch by Jack Carter Modified: llvm/trunk/lib/Target/Mips/MipsAsmPrinter.cpp llvm/trunk/lib/Target/Mips/MipsCodeEmitter.cpp llvm/trunk/lib/Target/Mips/MipsMCInstLower.cpp llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp Modified: llvm/trunk/lib/Target/Mips/MipsAsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsAsmPrinter.cpp?rev=147383&r1=147382&r2=147383&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsAsmPrinter.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsAsmPrinter.cpp Fri Dec 30 15:09:41 2011 @@ -66,10 +66,10 @@ } void MipsAsmPrinter::EmitInstruction(const MachineInstr *MI) { - SmallString<128> Str; - raw_svector_ostream OS(Str); - if (MI->isDebugValue()) { + SmallString<128> Str; + raw_svector_ostream OS(Str); + PrintDebugValueComment(MI, OS); return; } @@ -178,7 +178,7 @@ if (Mips::CPURegsRegisterClass->contains(Reg)) break; - unsigned RegNum = MipsRegisterInfo::getRegisterNumbering(Reg); + unsigned RegNum = getMipsRegisterNumbering(Reg); if (Mips::AFGR64RegisterClass->contains(Reg)) { FPUBitmask |= (3 << RegNum); CSFPRegsSize += AFGR64RegSize; @@ 
-193,7 +193,7 @@ // Set CPU Bitmask. for (; i != e; ++i) { unsigned Reg = CSI[i].getReg(); - unsigned RegNum = MipsRegisterInfo::getRegisterNumbering(Reg); + unsigned RegNum = getMipsRegisterNumbering(Reg); CPUBitmask |= (1 << RegNum); } Modified: llvm/trunk/lib/Target/Mips/MipsCodeEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsCodeEmitter.cpp?rev=147383&r1=147382&r2=147383&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsCodeEmitter.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsCodeEmitter.cpp Fri Dec 30 15:09:41 2011 @@ -216,7 +216,7 @@ unsigned MipsCodeEmitter::getMachineOpValue(const MachineInstr &MI, const MachineOperand &MO) const { if (MO.isReg()) - return MipsRegisterInfo::getRegisterNumbering(MO.getReg()); + return getMipsRegisterNumbering(MO.getReg()); else if (MO.isImm()) return static_cast(MO.getImm()); else if (MO.isGlobal()) { Modified: llvm/trunk/lib/Target/Mips/MipsMCInstLower.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsMCInstLower.cpp?rev=147383&r1=147382&r2=147383&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsMCInstLower.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsMCInstLower.cpp Fri Dec 30 15:09:41 2011 @@ -213,102 +213,102 @@ SmallVector& MCInsts) { unsigned Opc = MI->getOpcode(); - MCInst instr1, instr2, instr3, move; + MCInst Instr1, Instr2, Instr3, Move; - bool two_instructions = false; + bool TwoInstructions = false; assert(MI->getNumOperands() == 3); assert(MI->getOperand(0).isReg()); assert(MI->getOperand(1).isReg()); - MCOperand target = LowerOperand(MI->getOperand(0)); - MCOperand base = LowerOperand(MI->getOperand(1)); - MCOperand atReg = MCOperand::CreateReg(Mips::AT); - MCOperand zeroReg = MCOperand::CreateReg(Mips::ZERO); - - MachineOperand unloweredName = MI->getOperand(2); - MCOperand name = 
LowerOperand(unloweredName); - - move.setOpcode(Mips::ADDu); - move.addOperand(target); - move.addOperand(atReg); - move.addOperand(zeroReg); + MCOperand Target = LowerOperand(MI->getOperand(0)); + MCOperand Base = LowerOperand(MI->getOperand(1)); + MCOperand ATReg = MCOperand::CreateReg(Mips::AT); + MCOperand ZeroReg = MCOperand::CreateReg(Mips::ZERO); + + MachineOperand UnLoweredName = MI->getOperand(2); + MCOperand Name = LowerOperand(UnLoweredName); + + Move.setOpcode(Mips::ADDu); + Move.addOperand(Target); + Move.addOperand(ATReg); + Move.addOperand(ZeroReg); switch (Opc) { case Mips::ULW: { // FIXME: only works for little endian right now - MCOperand adj_name = LowerOperand(unloweredName, 3); - if (base.getReg() == (target.getReg())) { - instr1.setOpcode(Mips::LWL); - instr1.addOperand(atReg); - instr1.addOperand(base); - instr1.addOperand(adj_name); - instr2.setOpcode(Mips::LWR); - instr2.addOperand(atReg); - instr2.addOperand(base); - instr2.addOperand(name); - instr3 = move; + MCOperand AdjName = LowerOperand(UnLoweredName, 3); + if (Base.getReg() == (Target.getReg())) { + Instr1.setOpcode(Mips::LWL); + Instr1.addOperand(ATReg); + Instr1.addOperand(Base); + Instr1.addOperand(AdjName); + Instr2.setOpcode(Mips::LWR); + Instr2.addOperand(ATReg); + Instr2.addOperand(Base); + Instr2.addOperand(Name); + Instr3 = Move; } else { - two_instructions = true; - instr1.setOpcode(Mips::LWL); - instr1.addOperand(target); - instr1.addOperand(base); - instr1.addOperand(adj_name); - instr2.setOpcode(Mips::LWR); - instr2.addOperand(target); - instr2.addOperand(base); - instr2.addOperand(name); + TwoInstructions = true; + Instr1.setOpcode(Mips::LWL); + Instr1.addOperand(Target); + Instr1.addOperand(Base); + Instr1.addOperand(AdjName); + Instr2.setOpcode(Mips::LWR); + Instr2.addOperand(Target); + Instr2.addOperand(Base); + Instr2.addOperand(Name); } break; } case Mips::ULHu: { // FIXME: only works for little endian right now - MCOperand adj_name = LowerOperand(unloweredName, 
1); - instr1.setOpcode(Mips::LBu); - instr1.addOperand(atReg); - instr1.addOperand(base); - instr1.addOperand(adj_name); - instr2.setOpcode(Mips::LBu); - instr2.addOperand(target); - instr2.addOperand(base); - instr2.addOperand(name); - instr3.setOpcode(Mips::INS); - instr3.addOperand(target); - instr3.addOperand(atReg); - instr3.addOperand(MCOperand::CreateImm(0x8)); - instr3.addOperand(MCOperand::CreateImm(0x18)); + MCOperand AdjName = LowerOperand(UnLoweredName, 1); + Instr1.setOpcode(Mips::LBu); + Instr1.addOperand(ATReg); + Instr1.addOperand(Base); + Instr1.addOperand(AdjName); + Instr2.setOpcode(Mips::LBu); + Instr2.addOperand(Target); + Instr2.addOperand(Base); + Instr2.addOperand(Name); + Instr3.setOpcode(Mips::INS); + Instr3.addOperand(Target); + Instr3.addOperand(ATReg); + Instr3.addOperand(MCOperand::CreateImm(0x8)); + Instr3.addOperand(MCOperand::CreateImm(0x18)); break; } case Mips::USW: { // FIXME: only works for little endian right now - assert (base.getReg() != target.getReg()); - two_instructions = true; - MCOperand adj_name = LowerOperand(unloweredName, 3); - instr1.setOpcode(Mips::SWL); - instr1.addOperand(target); - instr1.addOperand(base); - instr1.addOperand(adj_name); - instr2.setOpcode(Mips::SWR); - instr2.addOperand(target); - instr2.addOperand(base); - instr2.addOperand(name); + assert (Base.getReg() != Target.getReg()); + TwoInstructions = true; + MCOperand AdjName = LowerOperand(UnLoweredName, 3); + Instr1.setOpcode(Mips::SWL); + Instr1.addOperand(Target); + Instr1.addOperand(Base); + Instr1.addOperand(AdjName); + Instr2.setOpcode(Mips::SWR); + Instr2.addOperand(Target); + Instr2.addOperand(Base); + Instr2.addOperand(Name); break; } case Mips::USH: { - MCOperand adj_name = LowerOperand(unloweredName, 1); - instr1.setOpcode(Mips::SB); - instr1.addOperand(target); - instr1.addOperand(base); - instr1.addOperand(name); - instr2.setOpcode(Mips::SRL); - instr2.addOperand(atReg); - instr2.addOperand(target); - 
instr2.addOperand(MCOperand::CreateImm(8)); - instr3.setOpcode(Mips::SB); - instr3.addOperand(atReg); - instr3.addOperand(base); - instr3.addOperand(adj_name); + MCOperand AdjName = LowerOperand(UnLoweredName, 1); + Instr1.setOpcode(Mips::SB); + Instr1.addOperand(Target); + Instr1.addOperand(Base); + Instr1.addOperand(Name); + Instr2.setOpcode(Mips::SRL); + Instr2.addOperand(ATReg); + Instr2.addOperand(Target); + Instr2.addOperand(MCOperand::CreateImm(8)); + Instr3.setOpcode(Mips::SB); + Instr3.addOperand(ATReg); + Instr3.addOperand(Base); + Instr3.addOperand(AdjName); break; } default: @@ -316,8 +316,8 @@ assert(0 && "unaligned instruction not processed"); } - MCInsts.push_back(instr1); - MCInsts.push_back(instr2); - if (!two_instructions) MCInsts.push_back(instr3); + MCInsts.push_back(Instr1); + MCInsts.push_back(Instr2); + if (!TwoInstructions) MCInsts.push_back(Instr3); } Modified: llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp?rev=147383&r1=147382&r2=147383&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp Fri Dec 30 15:09:41 2011 @@ -45,98 +45,6 @@ const TargetInstrInfo &tii) : MipsGenRegisterInfo(Mips::RA), Subtarget(ST), TII(tii) {} -/// getRegisterNumbering - Given the enum value for some register, e.g. -/// Mips::RA, return the number that it corresponds to (e.g. 31). 
-unsigned MipsRegisterInfo:: -getRegisterNumbering(unsigned RegEnum) -{ - switch (RegEnum) { - case Mips::ZERO: case Mips::ZERO_64: case Mips::F0: case Mips::D0_64: - case Mips::D0: - return 0; - case Mips::AT: case Mips::AT_64: case Mips::F1: case Mips::D1_64: - return 1; - case Mips::V0: case Mips::V0_64: case Mips::F2: case Mips::D2_64: - case Mips::D1: - return 2; - case Mips::V1: case Mips::V1_64: case Mips::F3: case Mips::D3_64: - return 3; - case Mips::A0: case Mips::A0_64: case Mips::F4: case Mips::D4_64: - case Mips::D2: - return 4; - case Mips::A1: case Mips::A1_64: case Mips::F5: case Mips::D5_64: - return 5; - case Mips::A2: case Mips::A2_64: case Mips::F6: case Mips::D6_64: - case Mips::D3: - return 6; - case Mips::A3: case Mips::A3_64: case Mips::F7: case Mips::D7_64: - return 7; - case Mips::T0: case Mips::T0_64: case Mips::F8: case Mips::D8_64: - case Mips::D4: - return 8; - case Mips::T1: case Mips::T1_64: case Mips::F9: case Mips::D9_64: - return 9; - case Mips::T2: case Mips::T2_64: case Mips::F10: case Mips::D10_64: - case Mips::D5: - return 10; - case Mips::T3: case Mips::T3_64: case Mips::F11: case Mips::D11_64: - return 11; - case Mips::T4: case Mips::T4_64: case Mips::F12: case Mips::D12_64: - case Mips::D6: - return 12; - case Mips::T5: case Mips::T5_64: case Mips::F13: case Mips::D13_64: - return 13; - case Mips::T6: case Mips::T6_64: case Mips::F14: case Mips::D14_64: - case Mips::D7: - return 14; - case Mips::T7: case Mips::T7_64: case Mips::F15: case Mips::D15_64: - return 15; - case Mips::S0: case Mips::S0_64: case Mips::F16: case Mips::D16_64: - case Mips::D8: - return 16; - case Mips::S1: case Mips::S1_64: case Mips::F17: case Mips::D17_64: - return 17; - case Mips::S2: case Mips::S2_64: case Mips::F18: case Mips::D18_64: - case Mips::D9: - return 18; - case Mips::S3: case Mips::S3_64: case Mips::F19: case Mips::D19_64: - return 19; - case Mips::S4: case Mips::S4_64: case Mips::F20: case Mips::D20_64: - case Mips::D10: - return 20; - 
case Mips::S5: case Mips::S5_64: case Mips::F21: case Mips::D21_64: - return 21; - case Mips::S6: case Mips::S6_64: case Mips::F22: case Mips::D22_64: - case Mips::D11: - return 22; - case Mips::S7: case Mips::S7_64: case Mips::F23: case Mips::D23_64: - return 23; - case Mips::T8: case Mips::T8_64: case Mips::F24: case Mips::D24_64: - case Mips::D12: - return 24; - case Mips::T9: case Mips::T9_64: case Mips::F25: case Mips::D25_64: - return 25; - case Mips::K0: case Mips::K0_64: case Mips::F26: case Mips::D26_64: - case Mips::D13: - return 26; - case Mips::K1: case Mips::K1_64: case Mips::F27: case Mips::D27_64: - return 27; - case Mips::GP: case Mips::GP_64: case Mips::F28: case Mips::D28_64: - case Mips::D14: - return 28; - case Mips::SP: case Mips::SP_64: case Mips::F29: case Mips::D29_64: - case Mips::HWR29: - return 29; - case Mips::FP: case Mips::FP_64: case Mips::F30: case Mips::D30_64: - case Mips::D15: - return 30; - case Mips::RA: case Mips::RA_64: case Mips::F31: case Mips::D31_64: - return 31; - default: llvm_unreachable("Unknown register number!"); - } - return 0; // Not reached -} - unsigned MipsRegisterInfo::getPICCallReg() { return Mips::T9; } //===----------------------------------------------------------------------===// From bruno.cardoso at gmail.com Fri Dec 30 15:13:38 2011 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Fri, 30 Dec 2011 19:13:38 -0200 Subject: [llvm-commits] [Mips] Patch request In-Reply-To: <86AC779C188FE74F88F6494478B46332E8F859@exchdb03.mips.com> References: <86AC779C188FE74F88F6494478B46332E8F859@exchdb03.mips.com> Message-ID: Committed r147383 On Tue, Dec 27, 2011 at 6:27 PM, Carter, Jack wrote: > This is a formatting change that somehow got left out of an earlier > submission. > It does not change behavior. It only conforms to coding standards. 
> > Jack Carter -- Bruno Cardoso Lopes http://www.brunocardoso.cc From bruno.cardoso at gmail.com Fri Dec 30 15:13:57 2011 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Fri, 30 Dec 2011 19:13:57 -0200 Subject: [llvm-commits] [Mips] Request for review: redundant code elimination In-Reply-To: <86AC779C188FE74F88F6494478B46332E8F6D0@exchdb03.mips.com> References: <86AC779C188FE74F88F6494478B46332E8F6D0@exchdb03.mips.com> Message-ID: Committed r147383 On Mon, Dec 26, 2011 at 11:40 PM, Carter, Jack wrote: > getRegisterNumbering.patch > > This patch takes out a redundant table. It does not affect output and thus > there is no attached test case. > . > contributor: Jack Carter > > lib/Target/Mips/MipsAsmPrinter.cpp > lib/Target/Mips/MipsCodeEmitter.cpp > lib/Target/Mips/MipsRegisterInfo.cpp > -- Bruno Cardoso Lopes http://www.brunocardoso.cc From bruno.cardoso at gmail.com Fri Dec 30 15:21:50 2011 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Fri, 30 Dec 2011 19:21:50 -0200 Subject: [llvm-commits] [llvm] r147366 - in /llvm/trunk/lib/Target/X86: MCTargetDesc/X86BaseInfo.h MCTargetDesc/X86MCCodeEmitter.cpp X86InstrFMA.td X86InstrFormats.td X86InstrXOP.td In-Reply-To: <20111230044855.1E5912A6C12C@llvm.org> References: <20111230044855.1E5912A6C12C@llvm.org> Message-ID: On Fri, Dec 30, 2011 at 2:48 AM, Craig Topper wrote: > Author: ctopper > Date: Thu Dec 29 22:48:54 2011 > New Revision: 147366 > > URL: http://llvm.org/viewvc/llvm-project?rev=147366&view=rev > Log: > Separate the concept of having memory access in operand 4 from the concept of having the W bit set for XOP instructions. Removes ORing W-bits in the encoder and will similarly simplify the disassembler implementation. > > Modified: > llvm/trunk/lib/Target/X86/MCTargetDesc/X86BaseInfo.h > llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp > llvm/trunk/lib/Target/X86/X86InstrFMA.td > llvm/trunk/lib/Target/X86/X86InstrFormats.td >
llvm/trunk/lib/Target/X86/X86InstrXOP.td > > Modified: llvm/trunk/lib/Target/X86/MCTargetDesc/X86BaseInfo.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/MCTargetDesc/X86BaseInfo.h?rev=147366&r1=147365&r2=147366&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/MCTargetDesc/X86BaseInfo.h (original) > +++ llvm/trunk/lib/Target/X86/MCTargetDesc/X86BaseInfo.h Thu Dec 29 22:48:54 2011 > @@ -426,10 +426,9 @@ > /// this flag to indicate that the encoder should do the wacky 3DNow! thing. > Has3DNow0F0FOpcode = 1U << 7, > > - /// XOP_W - Same bit as VEX_W. Used to indicate swapping of > - /// operand 3 and 4 to be encoded in ModRM or I8IMM. This is used > - /// for FMA4 and XOP instructions. > - XOP_W = 1U << 8, > + /// MemOp4 - Used to indicate swapping of operand 3 and 4 to be encoded in > + /// ModRM or I8IMM. This is used for FMA4 and XOP instructions. > + MemOp4 = 1U << 8, Given the comment, maybe this should have a more descriptive name instead: what about SwpMemImmOp4, or something like that? > /// XOP - Opcode prefix used by XOP instructions. > XOP = 1U << 9 > @@ -503,11 +502,11 @@ > return 0; > case X86II::MRMSrcMem: { > bool HasVEX_4V = (TSFlags >> X86II::VEXShift) & X86II::VEX_4V; > - bool HasXOP_W = (TSFlags >> X86II::VEXShift) & X86II::XOP_W; > + bool HasMemOp4 = (TSFlags >> X86II::VEXShift) & X86II::MemOp4; > unsigned FirstMemOp = 1; > if (HasVEX_4V) > ++FirstMemOp;// Skip the register source (which is encoded in VEX_VVVV). > - if (HasXOP_W) > + if (HasMemOp4) > ++FirstMemOp;// Skip the register source (which is encoded in I8IMM). > > // FIXME: Maybe lea should have its own form? This is a horrible hack.
> > Modified: llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp?rev=147366&r1=147365&r2=147366&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp (original) > +++ llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp Thu Dec 29 22:48:54 2011 > @@ -431,10 +431,6 @@ > // opcode extension, or ignored, depending on the opcode byte) > unsigned char VEX_W = 0; > > - // XOP_W: opcode specific, same bit as VEX_W, but used to > - // swap operand 3 and 4 for FMA4 and XOP instructions > - unsigned char XOP_W = 0; > - > // XOP: Use XOP prefix byte 0x8f instead of VEX. > unsigned char XOP = 0; > > @@ -477,9 +473,6 @@ > if ((TSFlags >> X86II::VEXShift) & X86II::VEX_W) > VEX_W = 1; > > - if ((TSFlags >> X86II::VEXShift) & X86II::XOP_W) > - XOP_W = 1; > - > if ((TSFlags >> X86II::VEXShift) & X86II::XOP) > XOP = 1; > > @@ -669,7 +662,7 @@ > // 3 byte VEX prefix > EmitByte(XOP ? 0x8F : 0xC4, CurByte, OS); > EmitByte(VEX_R << 7 | VEX_X << 6 | VEX_B << 5 | VEX_5M, CurByte, OS); > - EmitByte(LastByte | ((VEX_W | XOP_W) << 7), CurByte, OS); > + EmitByte(LastByte | (VEX_W << 7), CurByte, OS); > } > > /// DetermineREXPrefix - Determine if the MCInst has to be encoded with a X86-64 > @@ -929,8 +922,8 @@ > // It uses the VEX.VVVV field? > bool HasVEX_4V = (TSFlags >> X86II::VEXShift) & X86II::VEX_4V; > bool HasVEX_4VOp3 = (TSFlags >> X86II::VEXShift) & X86II::VEX_4VOp3; > - bool HasXOP_W = (TSFlags >> X86II::VEXShift) & X86II::XOP_W; > - unsigned XOP_W_I8IMMOperand = 2; > + bool HasMemOp4 = (TSFlags >> X86II::VEXShift) & X86II::MemOp4; > + const unsigned MemOp4_I8IMMOperand = 2; > > // Determine where the memory operand starts, if present. >
int MemoryOperand = X86II::getMemoryOperandNo(TSFlags, Opcode); > @@ -1003,14 +996,14 @@ > if (HasVEX_4V) // Skip 1st src (which is encoded in VEX_VVVV) > SrcRegNum++; > > - if(HasXOP_W) // Skip 2nd src (which is encoded in I8IMM) > + if(HasMemOp4) // Skip 2nd src (which is encoded in I8IMM) > SrcRegNum++; > > EmitRegModRMByte(MI.getOperand(SrcRegNum), > GetX86RegNum(MI.getOperand(CurOp)), CurByte, OS); > > - // 2 operands skipped with HasXOP_W, comensate accordingly > - CurOp = HasXOP_W ? SrcRegNum : SrcRegNum + 1; > + // 2 operands skipped with HasMemOp4, comensate accordingly > + CurOp = HasMemOp4 ? SrcRegNum : SrcRegNum + 1; > if (HasVEX_4VOp3) > ++CurOp; > break; > @@ -1022,7 +1015,7 @@ > ++AddrOperands; > ++FirstMemOp; // Skip the register source (which is encoded in VEX_VVVV). > } > - if(HasXOP_W) // Skip second register source (encoded in I8IMM) > + if(HasMemOp4) // Skip second register source (encoded in I8IMM) > ++FirstMemOp; > > EmitByte(BaseOpcode, CurByte, OS); > @@ -1113,7 +1106,7 @@ > // The last source register of a 4 operand instruction in AVX is encoded > // in bits[7:4] of a immediate byte. > if ((TSFlags >> X86II::VEXShift) & X86II::VEX_I8IMM) { > - const MCOperand &MO = MI.getOperand(HasXOP_W ? XOP_W_I8IMMOperand > + const MCOperand &MO = MI.getOperand(HasMemOp4 ? MemOp4_I8IMMOperand > : CurOp); > CurOp++; >
bool IsExtReg = X86II::isX86_64ExtendedReg(MO.getReg()); > > Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=147366&r1=147365&r2=147366&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original) > +++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Thu Dec 29 22:48:54 2011 > @@ -105,13 +105,13 @@ > !strconcat(OpcodeStr, > "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), > [(set VR128:$dst, > - (Int VR128:$src1, VR128:$src2, VR128:$src3))]>, XOP_W; > + (Int VR128:$src1, VR128:$src2, VR128:$src3))]>, VEX_W, MemOp4; > def rm : FMA4 (ins VR128:$src1, VR128:$src2, memop:$src3), > !strconcat(OpcodeStr, > "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), > [(set VR128:$dst, > - (Int VR128:$src1, VR128:$src2, mem_cpat:$src3))]>, XOP_W; > + (Int VR128:$src1, VR128:$src2, mem_cpat:$src3))]>, VEX_W, MemOp4; > def mr : FMA4 (ins VR128:$src1, memop:$src2, VR128:$src3), > !strconcat(OpcodeStr, > @@ -128,13 +128,13 @@ > !strconcat(OpcodeStr, > "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), > [(set VR128:$dst, > - (Int128 VR128:$src1, VR128:$src2, VR128:$src3))]>, XOP_W; > + (Int128 VR128:$src1, VR128:$src2, VR128:$src3))]>, VEX_W, MemOp4; > def rm : FMA4 (ins VR128:$src1, VR128:$src2, f128mem:$src3), > !strconcat(OpcodeStr, > "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), > [(set VR128:$dst, (Int128 VR128:$src1, VR128:$src2, > - (ld_frag128 addr:$src3)))]>, XOP_W; > + (ld_frag128 addr:$src3)))]>, VEX_W, MemOp4; > def mr : FMA4
(ins VR128:$src1, f128mem:$src2, VR128:$src3), > !strconcat(OpcodeStr, > @@ -146,13 +146,13 @@ > !strconcat(OpcodeStr, > "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), > [(set VR256:$dst, > - (Int256 VR256:$src1, VR256:$src2, VR256:$src3))]>, XOP_W; > + (Int256 VR256:$src1, VR256:$src2, VR256:$src3))]>, VEX_W, MemOp4; > def rmY : FMA4 (ins VR256:$src1, VR256:$src2, f256mem:$src3), > !strconcat(OpcodeStr, > "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), > [(set VR256:$dst, (Int256 VR256:$src1, VR256:$src2, > - (ld_frag256 addr:$src3)))]>, XOP_W; > + (ld_frag256 addr:$src3)))]>, VEX_W, MemOp4; > def mrY : FMA4 (ins VR256:$src1, f256mem:$src2, VR256:$src3), > !strconcat(OpcodeStr, > > Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFormats.td?rev=147366&r1=147365&r2=147366&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86InstrFormats.td (original) > +++ llvm/trunk/lib/Target/X86/X86InstrFormats.td Thu Dec 29 22:48:54 2011 > @@ -120,7 +120,7 @@ > class VEX_L { bit hasVEX_L = 1; } > class VEX_LIG { bit ignoresVEX_L = 1; } > class Has3DNow0F0FOpcode { bit has3DNow0F0FOpcode = 1; } > -class XOP_W { bit hasXOP_WPrefix = 1; } > +class MemOp4 { bit hasMemOp4Prefix = 1; } > class XOP { bit hasXOP_Prefix = 1; } > class X86Inst opcod, Format f, ImmType i, dag outs, dag ins, > string AsmStr, Domain d = GenericDomain> > @@ -161,7 +161,7 @@ > bit hasVEX_L = 0; // Does this inst use large (256-bit) registers? > bit ignoresVEX_L = 0; // Does this instruction ignore the L-bit > bit has3DNow0F0FOpcode =0;// Wacky 3dNow! encoding? > - bit hasXOP_WPrefix = 0;
// Same bit as VEX_W, but used for swapping operands > + bit hasMemOp4Prefix = 0; // Same bit as VEX_W, but used for swapping operands > bit hasXOP_Prefix = 0; // Does this inst require an XOP prefix? > > // TSFlags layout should be kept in sync with X86InstrInfo.h. > @@ -184,7 +184,7 @@ > let TSFlags{38} = hasVEX_L; > let TSFlags{39} = ignoresVEX_L; > let TSFlags{40} = has3DNow0F0FOpcode; > - let TSFlags{41} = hasXOP_WPrefix; > + let TSFlags{41} = hasMemOp4Prefix; > let TSFlags{42} = hasXOP_Prefix; > } > > > Modified: llvm/trunk/lib/Target/X86/X86InstrXOP.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrXOP.td?rev=147366&r1=147365&r2=147366&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86InstrXOP.td (original) > +++ llvm/trunk/lib/Target/X86/X86InstrXOP.td Thu Dec 29 22:48:54 2011 > @@ -169,7 +169,7 @@ > (ins VR128:$src1, VR128:$src2, f128mem:$src3), > !strconcat(OpcodeStr, > "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), > - []>, VEX_4V, VEX_I8IMM, XOP_W; > + []>, VEX_4V, VEX_I8IMM, VEX_W, MemOp4; > def mr : IXOPi8 (ins VR128:$src1, f128mem:$src2, VR128:$src3), > !strconcat(OpcodeStr, > @@ -192,7 +192,7 @@ > (ins VR256:$src1, VR256:$src2, f256mem:$src3), > !strconcat(OpcodeStr, > "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), > - []>, VEX_4V, VEX_I8IMM, XOP_W; > + []>, VEX_4V, VEX_I8IMM, VEX_W, MemOp4; > def mrY : IXOPi8 (ins VR256:$src1, f256mem:$src2, VR256:$src3), > !strconcat(OpcodeStr, > @@ -214,7 +214,7 @@ > (ins VR128:$src1, VR128:$src2, f128mem:$src3, i8imm:$src4), > !strconcat(OpcodeStr, > "\t{$src4, $src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3, $src4}"), > - []>, XOP_W; > +
[]>, VEX_W, MemOp4; > def mr : IXOP5 (ins VR128:$src1, f128mem:$src2, VR128:$src3, i8imm:$src4), > !strconcat(OpcodeStr, > @@ -229,7 +229,7 @@ > (ins VR256:$src1, VR256:$src2, f256mem:$src3, i8imm:$src4), > !strconcat(OpcodeStr, > "\t{$src4, $src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3, $src4}"), > - []>, XOP_W; > + []>, VEX_W, MemOp4; > def mrY : IXOP5 (ins VR256:$src1, f256mem:$src2, VR256:$src3, i8imm:$src4), > !strconcat(OpcodeStr, > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits -- Bruno Cardoso Lopes http://www.brunocardoso.cc From geek4civic at gmail.com Fri Dec 30 21:33:08 2011 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Sat, 31 Dec 2011 12:33:08 +0900 Subject: [llvm-commits] [PATCH] Happy new year 2012! Message-ID: --- clang/LICENSE.TXT | 2 +- compiler-rt/LICENSE.TXT | 4 ++-- libcxx/LICENSE.TXT | 4 ++-- llvm/LICENSE.TXT | 2 +- llvm/autoconf/configure.ac | 4 ++-- llvm/configure | 6 +++--- llvm/docs/doxygen.footer | 2 +- polly/LICENSE.txt | 2 +- 8 files changed, 13 insertions(+), 13 deletions(-) I will commit them after 1st-Jan-2012 came in PST. ...Takumi -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-Happy-new-year-2012.patch.txt Type: text/x-patch Size: 5428 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111231/a4ae3602/attachment.bin From rafael.espindola at gmail.com Sat Dec 31 00:09:40 2011 From: rafael.espindola at gmail.com (=?ISO-8859-1?Q?Rafael_=C1vila_de_Esp=EDndola?=) Date: Sat, 31 Dec 2011 01:09:40 -0500 Subject: [llvm-commits] [pr11677][patch] Eagerly materialize functions whose BBs are used in global variables inits Message-ID: <4EFEA724.3090104@gmail.com> I think this is the simplest non hack fix for PR11677. 
The problem is that when a user sees a global variable, it has no way of knowing if that variable is "complete" or if materializing a function will change its initialization. To prevent this from happening, this patch materializes all functions that could have an impact. I intend to add a unit test before committing, but it is a bit cumbersome as I couldn't find any infrastructure for using .ll or .bc files in the unit tests. Cheers, Rafael -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pr11677.patch Url: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111231/18caf92b/attachment.pl From bob.wilson at apple.com Sat Dec 31 01:06:57 2011 From: bob.wilson at apple.com (Bob Wilson) Date: Sat, 31 Dec 2011 07:06:57 -0000 Subject: [llvm-commits] [llvm-gcc-4.2] r147389 - /llvm-gcc-4.2/trunk/gcc/config/arm/t-slibgcc-iphoneos Message-ID: <20111231070657.65D832A6C12C@llvm.org> Author: bwilson Date: Sat Dec 31 01:06:57 2011 New Revision: 147389 URL: http://llvm.org/viewvc/llvm-project?rev=147389&view=rev Log: Stop including the default arm slice in libgcc_s.1.dylib. This is redundant as of svn r145650, which makes the default be armv7, since there is already an explicit armv7 multilib. 
Modified: llvm-gcc-4.2/trunk/gcc/config/arm/t-slibgcc-iphoneos Modified: llvm-gcc-4.2/trunk/gcc/config/arm/t-slibgcc-iphoneos URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/config/arm/t-slibgcc-iphoneos?rev=147389&r1=147388&r2=147389&view=diff ============================================================================== --- llvm-gcc-4.2/trunk/gcc/config/arm/t-slibgcc-iphoneos (original) +++ llvm-gcc-4.2/trunk/gcc/config/arm/t-slibgcc-iphoneos Sat Dec 31 01:06:57 2011 @@ -47,7 +47,7 @@ MLIBS=`$(GCC_FOR_TARGET) --print-multi-lib \ | sed -e 's/;.*$$//' -e '/^\.$$/d'` ; \ if [ -n "$$MLIBS" ] ; then \ - for mlib in '' $$MLIBS ; do \ + for mlib in $$MLIBS ; do \ cp ./$${mlib}/libgcc_s.$(SHLIB_SOVERSION)$(SHLIB_EXT).tmp \ ./libgcc_s.$(SHLIB_SOVERSION)$(SHLIB_EXT)_T_$${mlib} || exit 1 ; \ done ; \ From ed at 80386.nl Sat Dec 31 06:55:19 2011 From: ed at 80386.nl (Ed Schouten) Date: Sat, 31 Dec 2011 13:55:19 +0100 Subject: [llvm-commits] [Patch] Unbreak compiler-rt on FreeBSD/sparc64 and FreeBSD/mips64 Message-ID: <20111231125519.GU1895@hoeg.nl> Hello all, On FreeBSD/sparc64 and FreeBSD/mips64, compiler-rt runs into endless recursion when using __clzdi2() or __ctzdi2(), as described in bug #11663. This is due to the fact that __builtin_c?z() emit function calls to __c?zdi2() instead of __c?zsi2(). The following (evil) workaround fixes it for us. I'd rather not have too many local changes to our version of compiler-rt, so would it be possible to get this fix upstreamed? Index: lib/int_lib.h =================================================================== --- lib/int_lib.h (revision 229003) +++ lib/int_lib.h (working copy) @@ -153,4 +153,23 @@ long double f; } long_double_bits; +/* + * Workaround for LLVM bug 11663. Prevent endless recursion in + * __c?zdi2(), where calls to __builtin_c?z() are expanded to + * __c?zdi2() instead of __c?zsi2(). 
+ * + * Instead of placing this workaround in c?zdi2.c, put it in this + * global header to prevent other C files from making the detour + * through __c?zdi2() as well. + * + * This problem has only been observed on FreeBSD for sparc64 and + * mips64 with GCC 4.2.1. + */ +#if defined(__FreeBSD__) && (defined(__sparc64__) || defined(__mips_n64)) +si_int __clzsi2(si_int); +si_int __ctzsi2(si_int); +#define __builtin_clz __clzsi2 +#define __builtin_ctz __ctzsi2 +#endif + #endif /* INT_LIB_H */ Thanks, -- Ed Schouten WWW: http://80386.nl/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 834 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111231/75d81f8e/attachment.bin From ed at 80386.nl Sat Dec 31 07:04:25 2011 From: ed at 80386.nl (Ed Schouten) Date: Sat, 31 Dec 2011 14:04:25 +0100 Subject: [llvm-commits] [Patch] assembly-written implementation of {u, }{div, mod}si3 for SPARC64 Message-ID: <20111231130425.GV1895@hoeg.nl> Hello all, According to a developer at the FreeBSD project, FreeBSD's total compilation time increases by 2.6% when the host system is built against compiler-rt instead of libgcc. This is likely due to the fact that GCC has assembly-written versions of the division and modulo routines, while compiler-rt does not. The division and modulo routines used by GCC can easily be re-used by compiler-rt. They are provided for free in The SPARC Architecture Manual Version 8. Attached to this bug report is a patch that I have written for compiler-rt. It contains the M4 file that is listed in the manual, with some small modifications: - The M4 file uses exponentiation (2^N). This seems to be a Sun-specific extension to M4, as I cannot reproduce it with GNU and BSD m4. Fix this similar to OpenBSD's version by replacing 2^N with TWOSUPN. - Use the same register layout as GCC's version. 
- Integrate into compiler-rt's codebase by using DEFINE_COMPILERRT_FUNCTION(). To generate the proper assembly files, simply run the `generate.sh' script. The patch lacks modifications to CMake files, but this is due to the fact that we use a custom BSD Makefile. As I have very little experience with CMake, I hope one of you can add this glue. The u*si3.S files in this patch are simply added because our own Makefile is written in such a way that it prefers the MI assembly files over the MD C files. -- Ed Schouten WWW: http://80386.nl/ -------------- next part -------------- A non-text attachment was scrubbed... Name: sparc64.diff Type: text/x-diff Size: 8195 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111231/3ade395c/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 834 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111231/3ade395c/attachment-0001.bin From ed at 80386.nl Sat Dec 31 07:11:19 2011 From: ed at 80386.nl (Ed Schouten) Date: Sat, 31 Dec 2011 14:11:19 +0100 Subject: [llvm-commits] [Patch] Unbreak compiler-rt on FreeBSD/sparc64 and FreeBSD/mips64 In-Reply-To: <20111231125519.GU1895@hoeg.nl> References: <20111231125519.GU1895@hoeg.nl> Message-ID: <20111231131119.GW1895@hoeg.nl> Hi, It seems the MIPS64 check is incorrect, as we must test against both __mips_n64 and __mips_o64. Updated patch below: Index: lib/int_lib.h =================================================================== --- lib/int_lib.h (revision 229003) +++ lib/int_lib.h (working copy) @@ -153,4 +153,24 @@ long double f; } long_double_bits; +/* + * Workaround for LLVM bug 11663. Prevent endless recursion in + * __c?zdi2(), where calls to __builtin_c?z() are expanded to + * __c?zdi2() instead of __c?zsi2(). 
+ * + * Instead of placing this workaround in c?zdi2.c, put it in this + * global header to prevent other C files from making the detour + * through __c?zdi2() as well. + * + * This problem has only been observed on FreeBSD for sparc64 and + * mips64 with GCC 4.2.1. + */ +#if defined(__FreeBSD__) && (defined(__sparc64__) || \ + defined(__mips_n64) || defined(__mips_o64)) +si_int __clzsi2(si_int); +si_int __ctzsi2(si_int); +#define __builtin_clz __clzsi2 +#define __builtin_ctz __ctzsi2 +#endif + #endif /* INT_LIB_H */ -- Ed Schouten WWW: http://80386.nl/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 834 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111231/d1476c14/attachment.bin From nobled at dreamwidth.org Sat Dec 31 07:58:59 2011 From: nobled at dreamwidth.org (Dylan Noblesmith) Date: Sat, 31 Dec 2011 13:58:59 -0000 Subject: [llvm-commits] [llvm] r147390 - /llvm/trunk/lib/VMCore/Type.cpp Message-ID: <20111231135859.331792A6C12C@llvm.org> Author: nobled Date: Sat Dec 31 07:58:58 2011 New Revision: 147390 URL: http://llvm.org/viewvc/llvm-project?rev=147390&view=rev Log: VMCore: add assert for miscompile See PR11652. Trying to add this assert to setSubclassData() itself actually prevented the miscompile entirely, so it has to be here. This makes the source of the bug more obvious than the other asserts triggering later on did. 
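The r147390 log message above describes a guard against a suspected miscompile: remember the value of a neighbouring field before writing a bitfield, then assert that it is unchanged afterwards. A minimal C sketch of that pattern follows. The `type_header` struct and the `set_subclass_data_checked` helper are hypothetical names made up for illustration; they are not LLVM's actual `Type` layout or API.

```c
#include <assert.h>

/* Hypothetical layout for illustration only: two bitfields sharing one
 * word, followed by an ordinary field that must not be disturbed. */
struct type_header {
    unsigned id : 8;
    unsigned subclass_data : 24;
    unsigned num_contained;
};

/* Write the bitfield, then check that the neighbouring field kept its
 * value. With a correct compiler the assert can never fire. */
void set_subclass_data_checked(struct type_header *t, unsigned value) {
#ifndef NDEBUG
    /* Snapshot only exists in assert-enabled builds, as in the patch. */
    const unsigned old_num_contained = t->num_contained;
#endif
    t->subclass_data = value;
    assert(old_num_contained == t->num_contained &&
           "bitfield written out of bounds?");
}
```

The `#ifndef NDEBUG` guard mirrors the patch: the snapshot is only needed when asserts are compiled in, and guarding it avoids an unused-variable warning in release builds.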
Modified: llvm/trunk/lib/VMCore/Type.cpp Modified: llvm/trunk/lib/VMCore/Type.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Type.cpp?rev=147390&r1=147389&r2=147390&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Type.cpp (original) +++ llvm/trunk/lib/VMCore/Type.cpp Sat Dec 31 07:58:58 2011 @@ -707,7 +707,12 @@ PointerType::PointerType(Type *E, unsigned AddrSpace) : SequentialType(PointerTyID, E) { +#ifndef NDEBUG + const unsigned oldNCT = NumContainedTys; +#endif setSubclassData(AddrSpace); + // Check for miscompile. PR11652. + assert(oldNCT == NumContainedTys && "bitfield written out of bounds?"); } PointerType *Type::getPointerTo(unsigned addrs) { From rafael.espindola at gmail.com Sat Dec 31 09:05:47 2011 From: rafael.espindola at gmail.com (=?ISO-8859-1?Q?Rafael_=C1vila_de_Esp=EDndola?=) Date: Sat, 31 Dec 2011 10:05:47 -0500 Subject: [llvm-commits] [pr11677][patch] Eagerly materialize functions whose BBs are used in global variables inits In-Reply-To: <4EFEA724.3090104@gmail.com> References: <4EFEA724.3090104@gmail.com> Message-ID: <4EFF24CB.2020407@gmail.com> > I intend to add a unit test before committing, but it is a bit > cumbersome as I couldn't find any infrastructure for using .ll or .bc > files in the unit tests. A version of the patch with the included test is attached. Is it OK? Cheers, Rafael -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pr11677.patch Url: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111231/178c14f4/attachment-0001.pl From ed at 80386.nl Sat Dec 31 13:07:56 2011 From: ed at 80386.nl (Ed Schouten) Date: Sat, 31 Dec 2011 20:07:56 +0100 Subject: [llvm-commits] [Patch] FreeBSD-specific modifications to compiler-rt Message-ID: <20111231190756.GY1895@hoeg.nl> Hi, The C runtime library that is shipped with FreeBSD is an almost unmodified copy of compiler-rt. 
It would be nice if we could get the two small local changes upstreamed as well. Change #1: clear_cache.c can call compilerrt_abort() on non-x86 non-Apple architectures. Prevent a build failure on those architectures by properly including "int_lib.h". --- lib/clear_cache.c +++ lib/clear_cache.c @@ -8,6 +8,8 @@ * ===----------------------------------------------------------------------=== */ +#include "int_lib.h" + #if __APPLE__ #include #endif Change #2: The code provided in trampoline_setup.c only seems to apply to 32-bit PowerPC systems -- not 64-bit systems. Don't cause a build failure on those systems. --- lib/trampoline_setup.c +++ lib/trampoline_setup.c @@ -20,7 +20,7 @@ * and then jumps to the target nested function. */ -#if __ppc__ +#if __ppc__ && !defined(__powerpc64__) void __trampoline_setup(uint32_t* trampOnStack, int trampSizeAllocated, const void* realFunc, void* localsPtr) { -- Ed Schouten WWW: http://80386.nl/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 834 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20111231/146d8f9d/attachment.bin From nicholas at mxc.ca Sat Dec 31 15:30:23 2011 From: nicholas at mxc.ca (Nick Lewycky) Date: Sat, 31 Dec 2011 21:30:23 -0000 Subject: [llvm-commits] [llvm] r147391 - in /llvm/trunk: lib/Transforms/InstCombine/InstCombineShifts.cpp test/Transforms/InstCombine/shift.ll Message-ID: <20111231213023.1E8B72A6C12C@llvm.org> Author: nicholas Date: Sat Dec 31 15:30:22 2011 New Revision: 147391 URL: http://llvm.org/viewvc/llvm-project?rev=147391&view=rev Log: Make use of the exact bit when optimizing '(X >>exact 3) << 1' to eliminate the 'and' that would zero out the trailing bits, and to produce an exact shift ourselves. 
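The fold described in the r147391 log above can be checked with plain integer arithmetic. The sketch below models only the logical-shift case (`lshr`) on 32-bit unsigned values, because right-shifting a negative signed value is implementation-defined in C. The function names are made up for this example. "Exact" is taken to mean that the bits shifted out are all zero, which is the assumption the fold relies on.

```c
#include <assert.h>
#include <stdint.h>

/* The unoptimized sequence: (x >> 3) << 1. */
uint32_t lshr3_then_shl1(uint32_t x) { return (x >> 3) << 1; }

/* General fold: one shift by C1-C2 = 2, then mask off the low C2 = 1 bit.
 * ~1 is the 32-bit value of (-1 << 1). */
uint32_t fold_masked(uint32_t x) { return (x >> 2) & ~UINT32_C(1); }

/* Fold that is valid only when the original shift right was exact,
 * i.e. the low three bits of x are zero: the mask becomes a no-op. */
uint32_t fold_exact(uint32_t x) { return x >> 2; }

/* Checks the exact fold for one input that satisfies the precondition. */
void check_exact_fold(uint32_t x) {
    assert((x & 7u) == 0 && "precondition: the shift right by 3 is exact");
    assert(fold_exact(x) == lshr3_then_shl1(x));
}
```

For an input such as 44 (binary 101100), where the shift by 3 is not exact, `fold_exact` and `lshr3_then_shl1` disagree, which is why the masked form is still needed in the general case.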
Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp llvm/trunk/test/Transforms/InstCombine/shift.ll Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp?rev=147391&r1=147390&r2=147391&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineShifts.cpp Sat Dec 31 15:30:22 2011 @@ -536,12 +536,11 @@ if (ShiftAmt1 == 0) return 0; // Will be simplified in the future. Value *X = ShiftOp->getOperand(0); - uint32_t AmtSum = ShiftAmt1+ShiftAmt2; // Fold into one big shift. - IntegerType *Ty = cast(I.getType()); // Check for (X << c1) << c2 and (X >> c1) >> c2 if (I.getOpcode() == ShiftOp->getOpcode()) { + uint32_t AmtSum = ShiftAmt1+ShiftAmt2; // Fold into one big shift. // If this is oversized composite shift, then unsigned shifts get 0, ashr // saturates. if (AmtSum >= TypeBits) { @@ -603,9 +602,16 @@ // (X >>? C1) << C2 --> X >>? 
(C1-C2) & (-1 << C2) if (I.getOpcode() == Instruction::Shl && ShiftOp->getOpcode() != Instruction::Shl) { - Value *Shift = Builder->CreateBinOp(ShiftOp->getOpcode(), X, - ConstantInt::get(Ty, ShiftDiff)); - + ConstantInt *ShiftDiffCst = ConstantInt::get(Ty, ShiftDiff); + if (ShiftOp->isExact()) { + // (X >>?exact C1) << C2 --> X >>?exact (C1-C2) + BinaryOperator *NewShr = BinaryOperator::Create(ShiftOp->getOpcode(), + X, ShiftDiffCst); + NewShr->setIsExact(true); + return NewShr; + } + Value *Shift = Builder->CreateBinOp(ShiftOp->getOpcode(), + X, ShiftDiffCst); APInt Mask(APInt::getHighBitsSet(TypeBits, TypeBits - ShiftAmt2)); return BinaryOperator::CreateAnd(Shift, ConstantInt::get(I.getContext(),Mask)); Modified: llvm/trunk/test/Transforms/InstCombine/shift.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/shift.ll?rev=147391&r1=147390&r2=147391&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/shift.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/shift.ll Sat Dec 31 15:30:22 2011 @@ -542,3 +542,21 @@ ; CHECK-NEXT: %y = lshr i32 %a, 5 ; CHECK-NEXT: ret i32 %y } + +define i32 @test46(i32 %a) { + %y = ashr exact i32 %a, 3 + %z = shl i32 %y, 1 + ret i32 %z +; CHECK: @test46 +; CHECK-NEXT: %z = ashr exact i32 %a, 2 +; CHECK-NEXT: ret i32 %z +} + +define i32 @test47(i32 %a) { + %y = lshr exact i32 %a, 3 + %z = shl i32 %y, 1 + ret i32 %z +; CHECK: @test47 +; CHECK-NEXT: %z = lshr exact i32 %a, 2 +; CHECK-NEXT: ret i32 %z +} From craig.topper at gmail.com Sat Dec 31 17:15:11 2011 From: craig.topper at gmail.com (Craig Topper) Date: Sat, 31 Dec 2011 23:15:11 -0000 Subject: [llvm-commits] [llvm] r147392 - in /llvm/trunk: lib/Target/X86/X86InstrSSE.td test/CodeGen/X86/avx-vshufp.ll Message-ID: <20111231231511.E37EF2A6C12C@llvm.org> Author: ctopper Date: Sat Dec 31 17:15:11 2011 New Revision: 147392 URL: 
http://llvm.org/viewvc/llvm-project?rev=147392&view=rev Log: Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected. Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td llvm/trunk/test/CodeGen/X86/avx-vshufp.ll Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=147392&r1=147391&r2=147392&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Sat Dec 31 17:15:11 2011 @@ -2348,7 +2348,7 @@ (SHUFPDrri VR128:$src1, VR128:$src2, (SHUFFLE_get_shuf_imm VR128:$src3))>; // Generic SHUFPD patterns - def : Pat<(v2f64 (X86Shufps VR128:$src1, + def : Pat<(v2f64 (X86Shufpd VR128:$src1, (memopv2f64 addr:$src2), (i8 imm:$imm))), (SHUFPDrmi VR128:$src1, addr:$src2, imm:$imm)>; def : Pat<(v2i64 (X86Shufpd VR128:$src1, VR128:$src2, (i8 imm:$imm))), @@ -2397,7 +2397,7 @@ (VSHUFPDrri VR128:$src1, VR128:$src2, (SHUFFLE_get_shuf_imm VR128:$src3))>; - def : Pat<(v2f64 (X86Shufps VR128:$src1, + def : Pat<(v2f64 (X86Shufpd VR128:$src1, (memopv2f64 addr:$src2), (i8 imm:$imm))), (VSHUFPDrmi VR128:$src1, addr:$src2, imm:$imm)>; def : Pat<(v2i64 (X86Shufpd VR128:$src1, VR128:$src2, (i8 imm:$imm))), Modified: llvm/trunk/test/CodeGen/X86/avx-vshufp.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-vshufp.ll?rev=147392&r1=147391&r2=147392&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-vshufp.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-vshufp.ll Sat Dec 31 17:15:11 2011 @@ -7,7 +7,7 @@ ret <8 x float> %shuffle } -; CHECK: vshufps $-53, (% +; CHECK: vshufps $-53, (%{{.*}}), %ymm define <8 x float> @A2(<8 x float>* %a, <8 x float>* %b) nounwind uwtable readnone ssp { entry: %a2 = load <8 x float>* %a @@ -23,7 +23,7 @@ ret <4 x 
double> %shuffle } -; CHECK: vshufpd $10, (% +; CHECK: vshufpd $10, (%{{.*}}), %ymm define <4 x double> @B2(<4 x double>* %a, <4 x double>* %b) nounwind uwtable readnone ssp { entry: %a2 = load <4 x double>* %a @@ -59,3 +59,35 @@ %shuffle = shufflevector <4 x double> %a, <4 x double> %b, <4 x i32> ret <4 x double> %shuffle } + +; CHECK: vshufps $-53, %xmm +define <4 x float> @A128(<4 x float> %a, <4 x float> %b) nounwind uwtable readnone ssp { +entry: + %shuffle = shufflevector <4 x float> %a, <4 x float> %b, <4 x i32> + ret <4 x float> %shuffle +} + +; CHECK: vshufps $-53, (%{{.*}}), %xmm +define <4 x float> @A2128(<4 x float>* %a, <4 x float>* %b) nounwind uwtable readnone ssp { +entry: + %a2 = load <4 x float>* %a + %b2 = load <4 x float>* %b + %shuffle = shufflevector <4 x float> %a2, <4 x float> %b2, <4 x i32> + ret <4 x float> %shuffle +} + +; CHECK: vshufpd $1, %xmm +define <2 x double> @B128(<2 x double> %a, <2 x double> %b) nounwind uwtable readnone ssp { +entry: + %shuffle = shufflevector <2 x double> %a, <2 x double> %b, <2 x i32> + ret <2 x double> %shuffle +} + +; CHECK: vshufpd $1, (%{{.*}}), %xmm +define <2 x double> @B2128(<2 x double>* %a, <2 x double>* %b) nounwind uwtable readnone ssp { +entry: + %a2 = load <2 x double>* %a + %b2 = load <2 x double>* %b + %shuffle = shufflevector <2 x double> %a2, <2 x double> %b2, <2 x i32> + ret <2 x double> %shuffle +} From craig.topper at gmail.com Sat Dec 31 17:24:50 2011 From: craig.topper at gmail.com (Craig Topper) Date: Sat, 31 Dec 2011 23:24:50 -0000 Subject: [llvm-commits] [llvm] r147393 - in /llvm/trunk: lib/Target/X86/X86InstrSSE.td test/CodeGen/X86/avx-vshufp.ll Message-ID: <20111231232450.328502A6C12C@llvm.org> Author: ctopper Date: Sat Dec 31 17:24:49 2011 New Revision: 147393 URL: http://llvm.org/viewvc/llvm-project?rev=147393&view=rev Log: Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load. 
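As background for the r147393 log above, here is a rough C model of what the 128-bit SHUFPD instruction does with its two low immediate bits. The `vec2x64` type and `shufpd_model` function are made up for this sketch and are not part of LLVM. The instruction only moves 64-bit lanes and never interprets their contents, which is presumably why the same instruction can back both the `v2f64` and the `v2i64` patterns.

```c
#include <stdint.h>

/* Two 64-bit lanes; the payload may be a double or an integer bit pattern. */
typedef struct { uint64_t lane[2]; } vec2x64;

/* Model of 128-bit SHUFPD: bit 0 of imm picks the lane of a that goes to
 * result lane 0, bit 1 picks the lane of b that goes to result lane 1. */
vec2x64 shufpd_model(vec2x64 a, vec2x64 b, unsigned imm) {
    vec2x64 r;
    r.lane[0] = a.lane[imm & 1u];
    r.lane[1] = b.lane[(imm >> 1) & 1u];
    return r;
}
```

With `a = {10, 11}` and `b = {20, 21}`, an immediate of 1 selects the high lane of `a` and the low lane of `b`.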
Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td llvm/trunk/test/CodeGen/X86/avx-vshufp.ll Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=147393&r1=147392&r2=147393&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Sat Dec 31 17:24:49 2011 @@ -2348,6 +2348,9 @@ (SHUFPDrri VR128:$src1, VR128:$src2, (SHUFFLE_get_shuf_imm VR128:$src3))>; // Generic SHUFPD patterns + def : Pat<(v2i64 (X86Shufpd VR128:$src1, + (memopv2i64 addr:$src2), (i8 imm:$imm))), + (SHUFPDrmi VR128:$src1, addr:$src2, imm:$imm)>; def : Pat<(v2f64 (X86Shufpd VR128:$src1, (memopv2f64 addr:$src2), (i8 imm:$imm))), (SHUFPDrmi VR128:$src1, addr:$src2, imm:$imm)>; @@ -2397,6 +2400,9 @@ (VSHUFPDrri VR128:$src1, VR128:$src2, (SHUFFLE_get_shuf_imm VR128:$src3))>; + def : Pat<(v2i64 (X86Shufpd VR128:$src1, + (memopv2i64 addr:$src2), (i8 imm:$imm))), + (VSHUFPDrmi VR128:$src1, addr:$src2, imm:$imm)>; def : Pat<(v2f64 (X86Shufpd VR128:$src1, (memopv2f64 addr:$src2), (i8 imm:$imm))), (VSHUFPDrmi VR128:$src1, addr:$src2, imm:$imm)>; Modified: llvm/trunk/test/CodeGen/X86/avx-vshufp.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-vshufp.ll?rev=147393&r1=147392&r2=147393&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-vshufp.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-vshufp.ll Sat Dec 31 17:24:49 2011 @@ -16,6 +16,22 @@ ret <8 x float> %shuffle } +; CHECK: vshufps $-53, %ymm +define <8 x i32> @A3(<8 x i32> %a, <8 x i32> %b) nounwind uwtable readnone ssp { +entry: + %shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> + ret <8 x i32> %shuffle +} + +; CHECK: vshufps $-53, (%{{.*}}), %ymm +define <8 x i32> @A4(<8 x i32>* %a, <8 x i32>* %b) nounwind uwtable readnone ssp { 
+entry: + %a2 = load <8 x i32>* %a + %b2 = load <8 x i32>* %b + %shuffle = shufflevector <8 x i32> %a2, <8 x i32> %b2, <8 x i32> + ret <8 x i32> %shuffle +} + ; CHECK: vshufpd $10, %ymm define <4 x double> @B(<4 x double> %a, <4 x double> %b) nounwind uwtable readnone ssp { entry: @@ -32,6 +48,22 @@ ret <4 x double> %shuffle } +; CHECK: vshufpd $10, %ymm +define <4 x i64> @B3(<4 x i64> %a, <4 x i64> %b) nounwind uwtable readnone ssp { +entry: + %shuffle = shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> + ret <4 x i64> %shuffle +} + +; CHECK: vshufpd $10, (%{{.*}}), %ymm +define <4 x i64> @B4(<4 x i64>* %a, <4 x i64>* %b) nounwind uwtable readnone ssp { +entry: + %a2 = load <4 x i64>* %a + %b2 = load <4 x i64>* %b + %shuffle = shufflevector <4 x i64> %a2, <4 x i64> %b2, <4 x i32> + ret <4 x i64> %shuffle +} + ; CHECK: vshufps $-53, %ymm define <8 x float> @C(<8 x float> %a, <8 x float> %b) nounwind uwtable readnone ssp { entry: @@ -76,6 +108,22 @@ ret <4 x float> %shuffle } +; CHECK: vshufps $-53, %xmm +define <4 x i32> @A3128(<4 x i32> %a, <4 x i32> %b) nounwind uwtable readnone ssp { +entry: + %shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> + ret <4 x i32> %shuffle +} + +; CHECK: vshufps $-53, (%{{.*}}), %xmm +define <4 x i32> @A4128(<4 x i32>* %a, <4 x i32>* %b) nounwind uwtable readnone ssp { +entry: + %a2 = load <4 x i32>* %a + %b2 = load <4 x i32>* %b + %shuffle = shufflevector <4 x i32> %a2, <4 x i32> %b2, <4 x i32> + ret <4 x i32> %shuffle +} + ; CHECK: vshufpd $1, %xmm define <2 x double> @B128(<2 x double> %a, <2 x double> %b) nounwind uwtable readnone ssp { entry: @@ -91,3 +139,19 @@ %shuffle = shufflevector <2 x double> %a2, <2 x double> %b2, <2 x i32> ret <2 x double> %shuffle } + +; CHECK: vshufpd $1, %xmm +define <2 x i64> @B3128(<2 x i64> %a, <2 x i64> %b) nounwind uwtable readnone ssp { +entry: + %shuffle = shufflevector <2 x i64> %a, <2 x i64> %b, <2 x i32> + ret <2 x i64> %shuffle +} + +; CHECK: vshufpd $1, (%{{.*}}), %xmm 
+define <2 x i64> @B4128(<2 x i64>* %a, <2 x i64>* %b) nounwind uwtable readnone ssp { +entry: + %a2 = load <2 x i64>* %a + %b2 = load <2 x i64>* %b + %shuffle = shufflevector <2 x i64> %a2, <2 x i64> %b2, <2 x i32> + ret <2 x i64> %shuffle +} From craig.topper at gmail.com Sat Dec 31 17:50:22 2011 From: craig.topper at gmail.com (Craig Topper) Date: Sat, 31 Dec 2011 23:50:22 -0000 Subject: [llvm-commits] [llvm] r147394 - in /llvm/trunk/lib/Target/X86: X86ISelLowering.cpp X86ISelLowering.h X86InstrFragmentsSIMD.td X86InstrSSE.td Message-ID: <20111231235022.5478D2A6C12C@llvm.org> Author: ctopper Date: Sat Dec 31 17:50:21 2011 New Revision: 147394 URL: http://llvm.org/viewvc/llvm-project?rev=147394&view=rev Log: Merge X86 SHUFPS and SHUFPD node types. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/lib/Target/X86/X86ISelLowering.h llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td llvm/trunk/lib/Target/X86/X86InstrSSE.td Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=147394&r1=147393&r2=147394&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sat Dec 31 17:50:21 2011 @@ -2869,9 +2869,8 @@ case X86ISD::PSHUFD: case X86ISD::PSHUFHW: case X86ISD::PSHUFLW: - case X86ISD::SHUFPD: + case X86ISD::SHUFP: case X86ISD::PALIGN: - case X86ISD::SHUFPS: case X86ISD::MOVLHPS: case X86ISD::MOVLHPD: case X86ISD::MOVHLPS: @@ -2923,8 +2922,7 @@ switch(Opc) { default: llvm_unreachable("Unknown x86 shuffle node"); case X86ISD::PALIGN: - case X86ISD::SHUFPD: - case X86ISD::SHUFPS: + case X86ISD::SHUFP: case X86ISD::VPERM2X128: return DAG.getNode(Opc, dl, VT, V1, V2, DAG.getConstant(TargetMask, MVT::i8)); @@ -4495,8 +4493,7 @@ SDValue ImmN; switch(Opcode) { - case X86ISD::SHUFPS: - case X86ISD::SHUFPD: + case X86ISD::SHUFP: 
ImmN = N->getOperand(N->getNumOperands()-1); DecodeSHUFPMask(VT, cast(ImmN)->getZExtValue(), ShuffleMask); @@ -6346,22 +6343,6 @@ return getTargetShuffleNode(X86ISD::MOVHLPS, dl, VT, V1, V2, DAG); } -static inline unsigned getSHUFPOpcode(EVT VT) { - switch(VT.getSimpleVT().SimpleTy) { - case MVT::v8i32: // Use fp unit for int unpack. - case MVT::v8f32: - case MVT::v4i32: // Use fp unit for int unpack. - case MVT::v4f32: return X86ISD::SHUFPS; - case MVT::v4i64: // Use fp unit for int unpack. - case MVT::v4f64: - case MVT::v2i64: // Use fp unit for int unpack. - case MVT::v2f64: return X86ISD::SHUFPD; - default: - llvm_unreachable("Unknown type for shufp*"); - } - return 0; -} - static SDValue getMOVLP(SDValue &Op, DebugLoc &dl, SelectionDAG &DAG, bool HasXMMInt) { SDValue V1 = Op.getOperand(0); @@ -6415,7 +6396,7 @@ assert(VT != MVT::v4i32 && "unsupported shuffle type"); // Invert the operand order and use SHUFPS to match it. - return getTargetShuffleNode(getSHUFPOpcode(VT), dl, VT, V2, V1, + return getTargetShuffleNode(X86ISD::SHUFP, dl, VT, V2, V1, X86::getShuffleSHUFImmediate(SVOp), DAG); } @@ -6557,7 +6538,7 @@ if (HasXMMInt && (VT == MVT::v4f32 || VT == MVT::v4i32)) return getTargetShuffleNode(X86ISD::PSHUFD, dl, VT, V1, TargetMask, DAG); - return getTargetShuffleNode(getSHUFPOpcode(VT), dl, VT, V1, V1, + return getTargetShuffleNode(X86ISD::SHUFP, dl, VT, V1, V1, TargetMask, DAG); } @@ -6707,7 +6688,7 @@ DAG); if (isSHUFPMask(M, VT)) - return getTargetShuffleNode(getSHUFPOpcode(VT), dl, VT, V1, V2, + return getTargetShuffleNode(X86ISD::SHUFP, dl, VT, V1, V2, X86::getShuffleSHUFImmediate(SVOp), DAG); if (isUNPCKL_v_undef_Mask(M, VT, HasAVX2)) @@ -6736,7 +6717,7 @@ // Handle VSHUFPS/DY permutations if (isVSHUFPYMask(M, VT, HasAVX)) - return getTargetShuffleNode(getSHUFPOpcode(VT), dl, VT, V1, V2, + return getTargetShuffleNode(X86ISD::SHUFP, dl, VT, V1, V2, getShuffleVSHUFPYImmediate(SVOp), DAG); 
//===--------------------------------------------------------------------===// @@ -11031,8 +11012,7 @@ case X86ISD::PSHUFHW_LD: return "X86ISD::PSHUFHW_LD"; case X86ISD::PSHUFLW: return "X86ISD::PSHUFLW"; case X86ISD::PSHUFLW_LD: return "X86ISD::PSHUFLW_LD"; - case X86ISD::SHUFPS: return "X86ISD::SHUFPS"; - case X86ISD::SHUFPD: return "X86ISD::SHUFPD"; + case X86ISD::SHUFP: return "X86ISD::SHUFP"; case X86ISD::MOVLHPS: return "X86ISD::MOVLHPS"; case X86ISD::MOVLHPD: return "X86ISD::MOVLHPD"; case X86ISD::MOVHLPS: return "X86ISD::MOVHLPS"; @@ -14639,8 +14619,7 @@ case X86ISD::VZEXT_MOVL: return PerformVZEXT_MOVLCombine(N, DAG); case ISD::ZERO_EXTEND: return PerformZExtCombine(N, DAG); case X86ISD::SETCC: return PerformSETCCCombine(N, DAG); - case X86ISD::SHUFPS: // Handle all target specific shuffles - case X86ISD::SHUFPD: + case X86ISD::SHUFP: // Handle all target specific shuffles case X86ISD::PALIGN: case X86ISD::UNPCKH: case X86ISD::UNPCKL: Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.h?rev=147394&r1=147393&r2=147394&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.h (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.h Sat Dec 31 17:50:21 2011 @@ -258,8 +258,7 @@ PSHUFLW, PSHUFHW_LD, PSHUFLW_LD, - SHUFPD, - SHUFPS, + SHUFP, MOVDDUP, MOVSHDUP, MOVSLDUP, Modified: llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td?rev=147394&r1=147393&r2=147394&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td Sat Dec 31 17:50:21 2011 @@ -112,8 +112,7 @@ def X86PShufhw : SDNode<"X86ISD::PSHUFHW", SDTShuff2OpI>; def X86PShuflw : SDNode<"X86ISD::PSHUFLW", 
SDTShuff2OpI>; -def X86Shufpd : SDNode<"X86ISD::SHUFPD", SDTShuff3OpI>; -def X86Shufps : SDNode<"X86ISD::SHUFPS", SDTShuff3OpI>; +def X86Shufp : SDNode<"X86ISD::SHUFP", SDTShuff3OpI>; def X86Movddup : SDNode<"X86ISD::MOVDDUP", SDTShuff1Op>; def X86Movshdup : SDNode<"X86ISD::MOVSHDUP", SDTShuff1Op>; Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=147394&r1=147393&r2=147394&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Sat Dec 31 17:50:21 2011 @@ -2306,15 +2306,15 @@ } let Predicates = [HasSSE1] in { - def : Pat<(v4f32 (X86Shufps VR128:$src1, + def : Pat<(v4f32 (X86Shufp VR128:$src1, (memopv4f32 addr:$src2), (i8 imm:$imm))), (SHUFPSrmi VR128:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v4f32 (X86Shufps VR128:$src1, VR128:$src2, (i8 imm:$imm))), + def : Pat<(v4f32 (X86Shufp VR128:$src1, VR128:$src2, (i8 imm:$imm))), (SHUFPSrri VR128:$src1, VR128:$src2, imm:$imm)>; - def : Pat<(v4i32 (X86Shufps VR128:$src1, + def : Pat<(v4i32 (X86Shufp VR128:$src1, (bc_v4i32 (memopv2i64 addr:$src2)), (i8 imm:$imm))), (SHUFPSrmi VR128:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v4i32 (X86Shufps VR128:$src1, VR128:$src2, (i8 imm:$imm))), + def : Pat<(v4i32 (X86Shufp VR128:$src1, VR128:$src2, (i8 imm:$imm))), (SHUFPSrri VR128:$src1, VR128:$src2, imm:$imm)>; // vector_shuffle v1, v2 <4, 5, 2, 3> using SHUFPSrri (we prefer movsd, but // fall back to this for SSE1) @@ -2348,28 +2348,28 @@ (SHUFPDrri VR128:$src1, VR128:$src2, (SHUFFLE_get_shuf_imm VR128:$src3))>; // Generic SHUFPD patterns - def : Pat<(v2i64 (X86Shufpd VR128:$src1, + def : Pat<(v2i64 (X86Shufp VR128:$src1, (memopv2i64 addr:$src2), (i8 imm:$imm))), (SHUFPDrmi VR128:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v2f64 (X86Shufpd VR128:$src1, + def : Pat<(v2f64 (X86Shufp VR128:$src1, (memopv2f64 
addr:$src2), (i8 imm:$imm))), (SHUFPDrmi VR128:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v2i64 (X86Shufpd VR128:$src1, VR128:$src2, (i8 imm:$imm))), + def : Pat<(v2i64 (X86Shufp VR128:$src1, VR128:$src2, (i8 imm:$imm))), (SHUFPDrri VR128:$src1, VR128:$src2, imm:$imm)>; - def : Pat<(v2f64 (X86Shufpd VR128:$src1, VR128:$src2, (i8 imm:$imm))), + def : Pat<(v2f64 (X86Shufp VR128:$src1, VR128:$src2, (i8 imm:$imm))), (SHUFPDrri VR128:$src1, VR128:$src2, imm:$imm)>; } let Predicates = [HasAVX] in { - def : Pat<(v4f32 (X86Shufps VR128:$src1, + def : Pat<(v4f32 (X86Shufp VR128:$src1, (memopv4f32 addr:$src2), (i8 imm:$imm))), (VSHUFPSrmi VR128:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v4f32 (X86Shufps VR128:$src1, VR128:$src2, (i8 imm:$imm))), + def : Pat<(v4f32 (X86Shufp VR128:$src1, VR128:$src2, (i8 imm:$imm))), (VSHUFPSrri VR128:$src1, VR128:$src2, imm:$imm)>; - def : Pat<(v4i32 (X86Shufps VR128:$src1, + def : Pat<(v4i32 (X86Shufp VR128:$src1, (bc_v4i32 (memopv2i64 addr:$src2)), (i8 imm:$imm))), (VSHUFPSrmi VR128:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v4i32 (X86Shufps VR128:$src1, VR128:$src2, (i8 imm:$imm))), + def : Pat<(v4i32 (X86Shufp VR128:$src1, VR128:$src2, (i8 imm:$imm))), (VSHUFPSrri VR128:$src1, VR128:$src2, imm:$imm)>; // vector_shuffle v1, v2 <4, 5, 2, 3> using SHUFPSrri (we prefer movsd, but // fall back to this for SSE1) @@ -2400,39 +2400,39 @@ (VSHUFPDrri VR128:$src1, VR128:$src2, (SHUFFLE_get_shuf_imm VR128:$src3))>; - def : Pat<(v2i64 (X86Shufpd VR128:$src1, + def : Pat<(v2i64 (X86Shufp VR128:$src1, (memopv2i64 addr:$src2), (i8 imm:$imm))), (VSHUFPDrmi VR128:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v2f64 (X86Shufpd VR128:$src1, + def : Pat<(v2f64 (X86Shufp VR128:$src1, (memopv2f64 addr:$src2), (i8 imm:$imm))), (VSHUFPDrmi VR128:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v2i64 (X86Shufpd VR128:$src1, VR128:$src2, (i8 imm:$imm))), + def : Pat<(v2i64 (X86Shufp VR128:$src1, VR128:$src2, (i8 imm:$imm))), (VSHUFPDrri VR128:$src1, VR128:$src2, 
imm:$imm)>; - def : Pat<(v2f64 (X86Shufpd VR128:$src1, VR128:$src2, (i8 imm:$imm))), + def : Pat<(v2f64 (X86Shufp VR128:$src1, VR128:$src2, (i8 imm:$imm))), (VSHUFPDrri VR128:$src1, VR128:$src2, imm:$imm)>; // 256-bit patterns - def : Pat<(v8i32 (X86Shufps VR256:$src1, VR256:$src2, (i8 imm:$imm))), + def : Pat<(v8i32 (X86Shufp VR256:$src1, VR256:$src2, (i8 imm:$imm))), (VSHUFPSYrri VR256:$src1, VR256:$src2, imm:$imm)>; - def : Pat<(v8i32 (X86Shufps VR256:$src1, + def : Pat<(v8i32 (X86Shufp VR256:$src1, (bc_v8i32 (memopv4i64 addr:$src2)), (i8 imm:$imm))), (VSHUFPSYrmi VR256:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v8f32 (X86Shufps VR256:$src1, VR256:$src2, (i8 imm:$imm))), + def : Pat<(v8f32 (X86Shufp VR256:$src1, VR256:$src2, (i8 imm:$imm))), (VSHUFPSYrri VR256:$src1, VR256:$src2, imm:$imm)>; - def : Pat<(v8f32 (X86Shufps VR256:$src1, + def : Pat<(v8f32 (X86Shufp VR256:$src1, (memopv8f32 addr:$src2), (i8 imm:$imm))), (VSHUFPSYrmi VR256:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v4i64 (X86Shufpd VR256:$src1, VR256:$src2, (i8 imm:$imm))), + def : Pat<(v4i64 (X86Shufp VR256:$src1, VR256:$src2, (i8 imm:$imm))), (VSHUFPDYrri VR256:$src1, VR256:$src2, imm:$imm)>; - def : Pat<(v4i64 (X86Shufpd VR256:$src1, + def : Pat<(v4i64 (X86Shufp VR256:$src1, (memopv4i64 addr:$src2), (i8 imm:$imm))), (VSHUFPDYrmi VR256:$src1, addr:$src2, imm:$imm)>; - def : Pat<(v4f64 (X86Shufpd VR256:$src1, VR256:$src2, (i8 imm:$imm))), + def : Pat<(v4f64 (X86Shufp VR256:$src1, VR256:$src2, (i8 imm:$imm))), (VSHUFPDYrri VR256:$src1, VR256:$src2, imm:$imm)>; - def : Pat<(v4f64 (X86Shufpd VR256:$src1, + def : Pat<(v4f64 (X86Shufp VR256:$src1, (memopv4f64 addr:$src2), (i8 imm:$imm))), (VSHUFPDYrmi VR256:$src1, addr:$src2, imm:$imm)>; } From proljc at gmail.com Sat Dec 31 20:21:30 2011 From: proljc at gmail.com (Jia Liu) Date: Sun, 1 Jan 2012 10:21:30 +0800 Subject: [llvm-commits] [PATCH] Update "--enable-targets" list. 
In-Reply-To: References: Message-ID: <1325384490-3217-1-git-send-email-proljc@gmail.com> Removed alpha, added arm, hexagon, mips and so on. Signed-off-by: Jia Liu Modified: docs/GettingStarted.html --- docs/GettingStarted.html | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/docs/GettingStarted.html b/docs/GettingStarted.html index d78344a..d90dbc0 100644 --- a/docs/GettingStarted.html +++ b/docs/GettingStarted.html @@ -1110,7 +1110,7 @@ script to configure the build system:

selected as the target of the build host. You can also specify a comma separated list of target names that you want available in llc. The target names use all lower case. The current set of targets is:
- alpha, ia64, powerpc, skeleton, sparc, x86.
+ arm, cbe, cpp, hexagon, mblaze, mips, mipsel, msp430, powerpc, ptx, sparc, spu, x86, x86_64, xcore.

--enable-doxygen
Look for the doxygen program and enable construction of doxygen based -- 1.7.5.4 From geek4civic at gmail.com Sun Jan 1 02:16:56 2012 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Sun, 01 Jan 2012 08:16:56 -0000 Subject: [llvm-commits] [compiler-rt] r147395 - /compiler-rt/trunk/LICENSE.TXT Message-ID: <20120101081656.C08772A6C12C@llvm.org> Author: chapuni Date: Sun Jan 1 02:16:56 2012 New Revision: 147395 URL: http://llvm.org/viewvc/llvm-project?rev=147395&view=rev Log: Happy new year 2012! Modified: compiler-rt/trunk/LICENSE.TXT Modified: compiler-rt/trunk/LICENSE.TXT URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/LICENSE.TXT?rev=147395&r1=147394&r2=147395&view=diff ============================================================================== --- compiler-rt/trunk/LICENSE.TXT (original) +++ compiler-rt/trunk/LICENSE.TXT Sun Jan 1 02:16:56 2012 @@ -14,7 +14,7 @@ University of Illinois/NCSA Open Source License -Copyright (c) 2009-2010 by the contributors listed in CREDITS.TXT +Copyright (c) 2009-2012 by the contributors listed in CREDITS.TXT All rights reserved. @@ -55,7 +55,7 @@ ============================================================================== -Copyright (c) 2009-2010 by the contributors listed in CREDITS.TXT +Copyright (c) 2009-2012 by the contributors listed in CREDITS.TXT Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal From geek4civic at gmail.com Sun Jan 1 02:16:57 2012 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Sun, 01 Jan 2012 08:16:57 -0000 Subject: [llvm-commits] [polly] r147395 - /polly/trunk/LICENSE.txt Message-ID: <20120101081657.14F842A6C131@llvm.org> Author: chapuni Date: Sun Jan 1 02:16:56 2012 New Revision: 147395 URL: http://llvm.org/viewvc/llvm-project?rev=147395&view=rev Log: Happy new year 2012! 
Modified: polly/trunk/LICENSE.txt Modified: polly/trunk/LICENSE.txt URL: http://llvm.org/viewvc/llvm-project/polly/trunk/LICENSE.txt?rev=147395&r1=147394&r2=147395&view=diff ============================================================================== --- polly/trunk/LICENSE.txt (original) +++ polly/trunk/LICENSE.txt Sun Jan 1 02:16:56 2012 @@ -4,7 +4,7 @@ University of Illinois/NCSA Open Source License -Copyright (c) 2009-2011 Polly Team +Copyright (c) 2009-2012 Polly Team All rights reserved. Developed by: From geek4civic at gmail.com Sun Jan 1 02:16:56 2012 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Sun, 01 Jan 2012 08:16:56 -0000 Subject: [llvm-commits] [llvm] r147395 - in /llvm/trunk: LICENSE.TXT autoconf/configure.ac configure docs/doxygen.footer Message-ID: <20120101081657.01AD12A6C130@llvm.org> Author: chapuni Date: Sun Jan 1 02:16:56 2012 New Revision: 147395 URL: http://llvm.org/viewvc/llvm-project?rev=147395&view=rev Log: Happy new year 2012! Modified: llvm/trunk/LICENSE.TXT llvm/trunk/autoconf/configure.ac llvm/trunk/configure llvm/trunk/docs/doxygen.footer Modified: llvm/trunk/LICENSE.TXT URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/LICENSE.TXT?rev=147395&r1=147394&r2=147395&view=diff ============================================================================== --- llvm/trunk/LICENSE.TXT (original) +++ llvm/trunk/LICENSE.TXT Sun Jan 1 02:16:56 2012 @@ -4,7 +4,7 @@ University of Illinois/NCSA Open Source License -Copyright (c) 2003-2011 University of Illinois at Urbana-Champaign. +Copyright (c) 2003-2012 University of Illinois at Urbana-Champaign. All rights reserved. 
Developed by: Modified: llvm/trunk/autoconf/configure.ac URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/autoconf/configure.ac?rev=147395&r1=147394&r2=147395&view=diff ============================================================================== --- llvm/trunk/autoconf/configure.ac (original) +++ llvm/trunk/autoconf/configure.ac Sun Jan 1 02:16:56 2012 @@ -35,8 +35,8 @@ dnl Provide a copyright substitution and ensure the copyright notice is included dnl in the output of --version option of the generated configure script. -AC_SUBST(LLVM_COPYRIGHT,["Copyright (c) 2003-2011 University of Illinois at Urbana-Champaign."]) -AC_COPYRIGHT([Copyright (c) 2003-2011 University of Illinois at Urbana-Champaign.]) +AC_SUBST(LLVM_COPYRIGHT,["Copyright (c) 2003-2012 University of Illinois at Urbana-Champaign."]) +AC_COPYRIGHT([Copyright (c) 2003-2012 University of Illinois at Urbana-Champaign.]) dnl Indicate that we require autoconf 2.60 or later. AC_PREREQ(2.60) Modified: llvm/trunk/configure URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/configure?rev=147395&r1=147394&r2=147395&view=diff ============================================================================== --- llvm/trunk/configure (original) +++ llvm/trunk/configure Sun Jan 1 02:16:56 2012 @@ -9,7 +9,7 @@ # This configure script is free software; the Free Software Foundation # gives unlimited permission to copy, distribute and modify it. # -# Copyright (c) 2003-2011 University of Illinois at Urbana-Champaign. +# Copyright (c) 2003-2012 University of Illinois at Urbana-Champaign. ## --------------------- ## ## M4sh Initialization. ## ## --------------------- ## @@ -1541,7 +1541,7 @@ This configure script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it. -Copyright (c) 2003-2011 University of Illinois at Urbana-Champaign. +Copyright (c) 2003-2012 University of Illinois at Urbana-Champaign. 
_ACEOF exit fi @@ -1903,7 +1903,7 @@ -LLVM_COPYRIGHT="Copyright (c) 2003-2011 University of Illinois at Urbana-Champaign." +LLVM_COPYRIGHT="Copyright (c) 2003-2012 University of Illinois at Urbana-Champaign." Modified: llvm/trunk/docs/doxygen.footer URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/doxygen.footer?rev=147395&r1=147394&r2=147395&view=diff ============================================================================== --- llvm/trunk/docs/doxygen.footer (original) +++ llvm/trunk/docs/doxygen.footer Sun Jan 1 02:16:56 2012 @@ -3,7 +3,7 @@ Generated on $datetime for $projectname by Doxygen$doxygenversion
-Copyright © 2003-2009 University of Illinois at Urbana-Champaign.
+Copyright © 2003-2012 University of Illinois at Urbana-Champaign. All Rights Reserved.


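The review thread that follows (on r147391) turns on why a right shift marked "exact" lets `(X >>exact C1) << C2` be folded without the usual mask. A minimal sketch of that arithmetic, not LLVM code: it models only the lshr case on 32-bit unsigned values, and `lshrIsExact`, `shrThenShl`, `folded`, `foldHolds` and `selfCheck` are hypothetical helper names introduced here for illustration. The C1 < C2 and C1 == C2 branches mirror the question raised in the thread; whether InstCombine already handles them is not claimed here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): a logical right shift of X by C is
// "exact" when it shifts out only zero bits. Requires C < 32.
bool lshrIsExact(uint32_t X, unsigned C) {
  return (X & ((1u << C) - 1u)) == 0;
}

// The unfolded pattern: (X >> C1) << C2. Requires C1, C2 < 32.
uint32_t shrThenShl(uint32_t X, unsigned C1, unsigned C2) {
  return (X >> C1) << C2;
}

// The single-shift form that is valid when the right shift is exact:
//   C1 >  C2:  X >> (C1 - C2)   (the case the patch handles)
//   C1 <  C2:  X << (C2 - C1)   (the case asked about in the thread)
//   C1 == C2:  X                (the pair of shifts is a no-op)
uint32_t folded(uint32_t X, unsigned C1, unsigned C2) {
  if (C1 > C2) return X >> (C1 - C2);
  if (C1 < C2) return X << (C2 - C1);
  return X;
}

// The fold is only claimed for exact shifts; other inputs are vacuously fine.
bool foldHolds(uint32_t X, unsigned C1, unsigned C2) {
  return !lshrIsExact(X, C1) || shrThenShl(X, C1, C2) == folded(X, C1, C2);
}

// Checks every pair of shift amounts on a value built to be exact for C1.
void selfCheck() {
  for (unsigned C1 = 0; C1 < 32; ++C1)
    for (unsigned C2 = 0; C2 < 32; ++C2)
      assert(foldHolds(0xABCDu << C1, C1, C2));
}
```

Calling `selfCheck()` from a `main` exercises all 32x32 shift pairs. Without the exact guarantee the fold can be wrong: for X = 44, C1 = 3, C2 = 1 the two-shift form gives 10 while `X >> 2` gives 11, which is why the non-exact path in the patch keeps the `and` with a mask.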
From baldrick at free.fr Sun Jan 1 08:24:26 2012 From: baldrick at free.fr (Duncan Sands) Date: Sun, 01 Jan 2012 15:24:26 +0100 Subject: [llvm-commits] [llvm] r147391 - in /llvm/trunk: lib/Transforms/InstCombine/InstCombineShifts.cpp test/Transforms/InstCombine/shift.ll In-Reply-To: <20111231213023.1E8B72A6C12C@llvm.org> References: <20111231213023.1E8B72A6C12C@llvm.org> Message-ID: <4F006C9A.4080709@free.fr> Hi Nick, > @@ -603,9 +602,16 @@ > // (X>>? C1)<< C2 --> X>>? (C1-C2)& (-1<< C2) > if (I.getOpcode() == Instruction::Shl&& > ShiftOp->getOpcode() != Instruction::Shl) { > - Value *Shift = Builder->CreateBinOp(ShiftOp->getOpcode(), X, > - ConstantInt::get(Ty, ShiftDiff)); > - > + ConstantInt *ShiftDiffCst = ConstantInt::get(Ty, ShiftDiff); > + if (ShiftOp->isExact()) { > + // (X>>?exact C1)<< C2 --> X>>?exact (C1-C2) what happens if C2 is bigger than C1? Ciao, Duncan. > + BinaryOperator *NewShr = BinaryOperator::Create(ShiftOp->getOpcode(), > + X, ShiftDiffCst); > + NewShr->setIsExact(true); > + return NewShr; > + } > + Value *Shift = Builder->CreateBinOp(ShiftOp->getOpcode(), > + X, ShiftDiffCst); > APInt Mask(APInt::getHighBitsSet(TypeBits, TypeBits - ShiftAmt2)); > return BinaryOperator::CreateAnd(Shift, > ConstantInt::get(I.getContext(),Mask)); > > Modified: llvm/trunk/test/Transforms/InstCombine/shift.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/shift.ll?rev=147391&r1=147390&r2=147391&view=diff > ============================================================================== > --- llvm/trunk/test/Transforms/InstCombine/shift.ll (original) > +++ llvm/trunk/test/Transforms/InstCombine/shift.ll Sat Dec 31 15:30:22 2011 > @@ -542,3 +542,21 @@ > ; CHECK-NEXT: %y = lshr i32 %a, 5 > ; CHECK-NEXT: ret i32 %y > } > + > +define i32 @test46(i32 %a) { > + %y = ashr exact i32 %a, 3 > + %z = shl i32 %y, 1 > + ret i32 %z > +; CHECK: @test46 > +; CHECK-NEXT: %z = ashr exact i32 %a, 2 > +; CHECK-NEXT: ret i32 %z > +} > + > +define i32 
@test47(i32 %a) { > + %y = lshr exact i32 %a, 3 > + %z = shl i32 %y, 1 > + ret i32 %z > +; CHECK: @test47 > +; CHECK-NEXT: %z = lshr exact i32 %a, 2 > +; CHECK-NEXT: ret i32 %z > +} > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From nicholas at mxc.ca Sun Jan 1 09:04:25 2012 From: nicholas at mxc.ca (Nick Lewycky) Date: Sun, 01 Jan 2012 07:04:25 -0800 Subject: [llvm-commits] [llvm] r147391 - in /llvm/trunk: lib/Transforms/InstCombine/InstCombineShifts.cpp test/Transforms/InstCombine/shift.ll In-Reply-To: <4F006C9A.4080709@free.fr> References: <20111231213023.1E8B72A6C12C@llvm.org> <4F006C9A.4080709@free.fr> Message-ID: <4F0075F9.5080709@mxc.ca> On 01/01/2012 06:24 AM, Duncan Sands wrote: > Hi Nick, > >> @@ -603,9 +602,16 @@ >> // (X>>? C1)<< C2 --> X>>? (C1-C2)& (-1<< C2) >> if (I.getOpcode() == Instruction::Shl&& >> ShiftOp->getOpcode() != Instruction::Shl) { >> - Value *Shift = Builder->CreateBinOp(ShiftOp->getOpcode(), X, >> - ConstantInt::get(Ty, ShiftDiff)); >> - >> + ConstantInt *ShiftDiffCst = ConstantInt::get(Ty, ShiftDiff); >> + if (ShiftOp->isExact()) { >> + // (X>>?exact C1)<< C2 --> X>>?exact (C1-C2) > > what happens if C2 is bigger than C1? We won't reach here. It extracts C1 and C2 into ShiftAmt1 and ShiftAmt2 and does if (ShiftAmt1 == ShiftAmt2) { ... } else if (ShiftAmt1 < ShiftAmt2) { ... } else { our transform here }. Nick > > Ciao, Duncan. 
> >> + BinaryOperator *NewShr = BinaryOperator::Create(ShiftOp->getOpcode(), >> + X, ShiftDiffCst); >> + NewShr->setIsExact(true); >> + return NewShr; >> + } >> + Value *Shift = Builder->CreateBinOp(ShiftOp->getOpcode(), >> + X, ShiftDiffCst); >> APInt Mask(APInt::getHighBitsSet(TypeBits, TypeBits - ShiftAmt2)); >> return BinaryOperator::CreateAnd(Shift, >> ConstantInt::get(I.getContext(),Mask)); >> >> Modified: llvm/trunk/test/Transforms/InstCombine/shift.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/shift.ll?rev=147391&r1=147390&r2=147391&view=diff >> ============================================================================== >> --- llvm/trunk/test/Transforms/InstCombine/shift.ll (original) >> +++ llvm/trunk/test/Transforms/InstCombine/shift.ll Sat Dec 31 15:30:22 2011 >> @@ -542,3 +542,21 @@ >> ; CHECK-NEXT: %y = lshr i32 %a, 5 >> ; CHECK-NEXT: ret i32 %y >> } >> + >> +define i32 @test46(i32 %a) { >> + %y = ashr exact i32 %a, 3 >> + %z = shl i32 %y, 1 >> + ret i32 %z >> +; CHECK: @test46 >> +; CHECK-NEXT: %z = ashr exact i32 %a, 2 >> +; CHECK-NEXT: ret i32 %z >> +} >> + >> +define i32 @test47(i32 %a) { >> + %y = lshr exact i32 %a, 3 >> + %z = shl i32 %y, 1 >> + ret i32 %z >> +; CHECK: @test47 >> +; CHECK-NEXT: %z = lshr exact i32 %a, 2 >> +; CHECK-NEXT: ret i32 %z >> +} >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From baldrick at free.fr Sun Jan 1 09:06:42 2012 From: baldrick at free.fr (Duncan Sands) Date: Sun, 01 Jan 2012 16:06:42 +0100 Subject: [llvm-commits] [llvm] r147391 - in /llvm/trunk: lib/Transforms/InstCombine/InstCombineShifts.cpp test/Transforms/InstCombine/shift.ll In-Reply-To: <4F0075F9.5080709@mxc.ca> 
References: <20111231213023.1E8B72A6C12C@llvm.org> <4F006C9A.4080709@free.fr> <4F0075F9.5080709@mxc.ca> Message-ID: <4F007682.9020905@free.fr> Hi Nick, >>> @@ -603,9 +602,16 @@ >>> // (X>>? C1)<< C2 --> X>>? (C1-C2)& (-1<< C2) >>> if (I.getOpcode() == Instruction::Shl&& >>> ShiftOp->getOpcode() != Instruction::Shl) { >>> - Value *Shift = Builder->CreateBinOp(ShiftOp->getOpcode(), X, >>> - ConstantInt::get(Ty, ShiftDiff)); >>> - >>> + ConstantInt *ShiftDiffCst = ConstantInt::get(Ty, ShiftDiff); >>> + if (ShiftOp->isExact()) { >>> + // (X>>?exact C1)<< C2 --> X>>?exact (C1-C2) >> >> what happens if C2 is bigger than C1? > > We won't reach here. It extracts C1 and C2 into ShiftAmt1 and ShiftAmt2 and does > if (ShiftAmt1 == ShiftAmt2) { ... } else if (ShiftAmt1 < ShiftAmt2) { ... } else > { our transform here }. OK, thanks. However if C2 is bigger than C1 then, thanks to the exact flag, you can just turn it into a left shift (without an "and"). Likewise, for C1 == C2 the pair of shifts becomes a no-op. But maybe those are handled already? Ciao, Duncan. > > Nick > >> >> Ciao, Duncan. 
>> >>> + BinaryOperator *NewShr = BinaryOperator::Create(ShiftOp->getOpcode(), >>> + X, ShiftDiffCst); >>> + NewShr->setIsExact(true); >>> + return NewShr; >>> + } >>> + Value *Shift = Builder->CreateBinOp(ShiftOp->getOpcode(), >>> + X, ShiftDiffCst); >>> APInt Mask(APInt::getHighBitsSet(TypeBits, TypeBits - ShiftAmt2)); >>> return BinaryOperator::CreateAnd(Shift, >>> ConstantInt::get(I.getContext(),Mask)); >>> >>> Modified: llvm/trunk/test/Transforms/InstCombine/shift.ll >>> URL: >>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/shift.ll?rev=147391&r1=147390&r2=147391&view=diff >>> >>> ============================================================================== >>> --- llvm/trunk/test/Transforms/InstCombine/shift.ll (original) >>> +++ llvm/trunk/test/Transforms/InstCombine/shift.ll Sat Dec 31 15:30:22 2011 >>> @@ -542,3 +542,21 @@ >>> ; CHECK-NEXT: %y = lshr i32 %a, 5 >>> ; CHECK-NEXT: ret i32 %y >>> } >>> + >>> +define i32 @test46(i32 %a) { >>> + %y = ashr exact i32 %a, 3 >>> + %z = shl i32 %y, 1 >>> + ret i32 %z >>> +; CHECK: @test46 >>> +; CHECK-NEXT: %z = ashr exact i32 %a, 2 >>> +; CHECK-NEXT: ret i32 %z >>> +} >>> + >>> +define i32 @test47(i32 %a) { >>> + %y = lshr exact i32 %a, 3 >>> + %z = shl i32 %y, 1 >>> + ret i32 %z >>> +; CHECK: @test47 >>> +; CHECK-NEXT: %z = lshr exact i32 %a, 2 >>> +; CHECK-NEXT: ret i32 %z >>> +} >>> >>> >>> _______________________________________________ >>> llvm-commits mailing list >>> llvm-commits at cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> > From victor.umansky at intel.com Sun Jan 1 10:18:04 2012 From: victor.umansky at intel.com (Umansky, Victor) Date: Sun, 1 Jan 2012 16:18:04 +0000 Subject: [llvm-commits] X86SSELevel for AVX architecture Message-ID: Hi Evan, I noticed that in X86Subtarget 
constructor you set the 'X86SSELevel' member of the class to 'NoMMXSSE' when the HasAVX member is set to 'true'. Effectively, that disables the SSE features for the AVX architecture, because the hasSSEn() accessors return 'false' when HasAVXn() is 'true'.

I wonder whether this is the behavior you'd like to enforce, since conceptually the AVX architecture complements SSE rather than replacing it completely.

I noticed this problem after discovering that LLVM fails to lower the "sse2.fence" intrinsic when generating code for the AVX architecture, because this intrinsic is conditioned on hasSSE2() being 'true'. Was that case somehow missed by regression testing, or is there another way to lower that intrinsic?

I'd appreciate your clarification.

Best Regards,
Victor Umansky

From elena.demikhovsky at intel.com Sun Jan 1 10:22:47 2012
From: elena.demikhovsky at intel.com (Elena Demikhovsky)
Date: Sun, 01 Jan 2012 16:22:47 -0000
Subject: [llvm-commits] [llvm] r147399 - in /llvm/trunk: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp test/CodeGen/X86/avx-shuffle-x86_32.ll
Message-ID: <20120101162247.B625F2A6C12C@llvm.org>

Author: delena
Date: Sun Jan 1 10:22:47 2012
New Revision: 147399

URL: http://llvm.org/viewvc/llvm-project?rev=147399&view=rev
Log:
Fixed a bug in SelectionDAG.cpp.
The failure seen on win32, when i64 type is illegal.
It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR.
The failure message is: llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed. I added a special test that checks vector shuffle on win32. Added: llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll (with props) Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp?rev=147399&r1=147398&r2=147399&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Sun Jan 1 10:22:47 2012 @@ -2795,15 +2795,48 @@ Node->getOperand(2), dl)); break; case ISD::VECTOR_SHUFFLE: { - SmallVector Mask; + SmallVector Mask(32U, -1); cast(Node)->getMask(Mask); EVT VT = Node->getValueType(0); EVT EltVT = VT.getVectorElementType(); - if (!TLI.isTypeLegal(EltVT)) + SDValue Op0 = Node->getOperand(0); + SDValue Op1 = Node->getOperand(1); + if (!TLI.isTypeLegal(EltVT)) { + EltVT = TLI.getTypeToTransformTo(*DAG.getContext(), EltVT); + + // Convert shuffle node + // If original node was v4i64 and the new EltVT is i32, + // cast operands to v8i32 and re-build the mask + unsigned OldNumElems = VT.getVectorNumElements(); + // Calculate new VT + VT = EVT::getVectorVT(*DAG.getContext(), EltVT, VT.getSizeInBits()/EltVT.getSizeInBits()); + + // cast operands to new VT + Op0 = DAG.getNode(ISD::BITCAST, dl, VT, Op0); + Op1 = DAG.getNode(ISD::BITCAST, dl, VT, Op1); + + // Convert the shuffle mask + unsigned int factor = VT.getVectorNumElements()/OldNumElems; + // assume that EltVT gets smaller + assert(factor > 0); + SmallVector NewMask(32U, -1); + + for (unsigned i = 0; i < OldNumElems; ++i) { + if (Mask[i] < 0) { + for 
(unsigned fi = 0; fi < factor; ++fi) + NewMask[i*factor+fi] = Mask[i]; + } + else { + for (unsigned fi = 0; fi < factor; ++fi) + NewMask[i*factor+fi] = Mask[i]*factor+fi; + } + Mask = NewMask; + } + } unsigned NumElems = VT.getVectorNumElements(); - SmallVector Ops; + SmallVector Ops; for (unsigned i = 0; i != NumElems; ++i) { if (Mask[i] < 0) { Ops.push_back(DAG.getUNDEF(EltVT)); @@ -2812,13 +2845,14 @@ unsigned Idx = Mask[i]; if (Idx < NumElems) Ops.push_back(DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, EltVT, - Node->getOperand(0), + Op0, DAG.getIntPtrConstant(Idx))); else Ops.push_back(DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, EltVT, - Node->getOperand(1), + Op1, DAG.getIntPtrConstant(Idx - NumElems))); } + Tmp1 = DAG.getNode(ISD::BUILD_VECTOR, dl, VT, &Ops[0], Ops.size()); Results.push_back(Tmp1); break; Added: llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll?rev=147399&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll (added) +++ llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll Sun Jan 1 10:22:47 2012 @@ -0,0 +1,8 @@ +; RUN: llc < %s -mtriple=i686-pc-win32 -mcpu=corei7-avx -mattr=+avx | FileCheck %s + +define <4 x i64> @test1(<4 x i64> %a) nounwind { + %b = shufflevector <4 x i64> %a, <4 x i64> undef, <4 x i32> + ret <4 x i64>%b + ; CHECK test1: + ; CHECK: vinsertf128 + } Propchange: llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll ------------------------------------------------------------------------------ svn:executable = * From rafael.espindola at gmail.com Sun Jan 1 11:36:23 2012 From: rafael.espindola at gmail.com (Rafael Espindola) Date: Sun, 01 Jan 2012 17:36:23 -0000 Subject: [llvm-commits] [llvm] r147400 - in /llvm/trunk: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp test/CodeGen/X86/avx-shuffle-x86_32.ll Message-ID: <20120101173623.6F0302A6C12C@llvm.org> Author: rafael 
Date: Sun Jan 1 11:36:23 2012 New Revision: 147400 URL: http://llvm.org/viewvc/llvm-project?rev=147400&view=rev Log: Revert 147399. It broke CodeGen/ARM/vext.ll. Removed: llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp?rev=147400&r1=147399&r2=147400&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Sun Jan 1 11:36:23 2012 @@ -2795,48 +2795,15 @@ Node->getOperand(2), dl)); break; case ISD::VECTOR_SHUFFLE: { - SmallVector Mask(32U, -1); + SmallVector Mask; cast(Node)->getMask(Mask); EVT VT = Node->getValueType(0); EVT EltVT = VT.getVectorElementType(); - SDValue Op0 = Node->getOperand(0); - SDValue Op1 = Node->getOperand(1); - if (!TLI.isTypeLegal(EltVT)) { - + if (!TLI.isTypeLegal(EltVT)) EltVT = TLI.getTypeToTransformTo(*DAG.getContext(), EltVT); - - // Convert shuffle node - // If original node was v4i64 and the new EltVT is i32, - // cast operands to v8i32 and re-build the mask - unsigned OldNumElems = VT.getVectorNumElements(); - // Calculate new VT - VT = EVT::getVectorVT(*DAG.getContext(), EltVT, VT.getSizeInBits()/EltVT.getSizeInBits()); - - // cast operands to new VT - Op0 = DAG.getNode(ISD::BITCAST, dl, VT, Op0); - Op1 = DAG.getNode(ISD::BITCAST, dl, VT, Op1); - - // Convert the shuffle mask - unsigned int factor = VT.getVectorNumElements()/OldNumElems; - // assume that EltVT gets smaller - assert(factor > 0); - SmallVector NewMask(32U, -1); - - for (unsigned i = 0; i < OldNumElems; ++i) { - if (Mask[i] < 0) { - for (unsigned fi = 0; fi < factor; ++fi) - NewMask[i*factor+fi] = Mask[i]; - } - else { - for (unsigned fi = 0; fi < factor; ++fi) - NewMask[i*factor+fi] = Mask[i]*factor+fi; 
- } - Mask = NewMask; - } - } unsigned NumElems = VT.getVectorNumElements(); - SmallVector Ops; + SmallVector Ops; for (unsigned i = 0; i != NumElems; ++i) { if (Mask[i] < 0) { Ops.push_back(DAG.getUNDEF(EltVT)); @@ -2845,14 +2812,13 @@ unsigned Idx = Mask[i]; if (Idx < NumElems) Ops.push_back(DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, EltVT, - Op0, + Node->getOperand(0), DAG.getIntPtrConstant(Idx))); else Ops.push_back(DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, EltVT, - Op1, + Node->getOperand(1), DAG.getIntPtrConstant(Idx - NumElems))); } - Tmp1 = DAG.getNode(ISD::BUILD_VECTOR, dl, VT, &Ops[0], Ops.size()); Results.push_back(Tmp1); break; Removed: llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll?rev=147399&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-shuffle-x86_32.ll (removed) @@ -1,8 +0,0 @@ -; RUN: llc < %s -mtriple=i686-pc-win32 -mcpu=corei7-avx -mattr=+avx | FileCheck %s - -define <4 x i64> @test1(<4 x i64> %a) nounwind { - %b = shufflevector <4 x i64> %a, <4 x i64> undef, <4 x i32> - ret <4 x i64>%b - ; CHECK test1: - ; CHECK: vinsertf128 - } From rafael.espindola at gmail.com Sun Jan 1 11:44:19 2012 From: rafael.espindola at gmail.com (=?ISO-8859-1?Q?Rafael_=C1vila_de_Esp=EDndola?=) Date: Sun, 01 Jan 2012 12:44:19 -0500 Subject: [llvm-commits] [llvm] r147399 - in /llvm/trunk: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp test/CodeGen/X86/avx-shuffle-x86_32.ll In-Reply-To: <20120101162247.B625F2A6C12C@llvm.org> References: <20120101162247.B625F2A6C12C@llvm.org> Message-ID: <4F009B73.70103@gmail.com> On 01/01/12 11:22 AM, Elena Demikhovsky wrote: > Author: delena > Date: Sun Jan 1 10:22:47 2012 > New Revision: 147399 > > URL: http://llvm.org/viewvc/llvm-project?rev=147399&view=rev > Log: > Fixed a bug in SelectionDAG.cpp. 
> The failure seen on win32, when i64 type is illegal. > It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR. > > The failure message is: > llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed. > > I added a special test that checks vector shuffle on win32. > Hi Elena, I reverted this because it broke CodeGen/ARM/vext.ll, which made the bots red. Cheers, Rafael From benny.kra at googlemail.com Sun Jan 1 11:55:23 2012 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Sun, 01 Jan 2012 17:55:23 -0000 Subject: [llvm-commits] [llvm] r147402 - /llvm/trunk/include/llvm/Support/PatternMatch.h Message-ID: <20120101175523.C03C02A6C12C@llvm.org> Author: d0k Date: Sun Jan 1 11:55:23 2012 New Revision: 147402 URL: http://llvm.org/viewvc/llvm-project?rev=147402&view=rev Log: PatternMatch: Simplify code by reusing the Operator class. 
Modified: llvm/trunk/include/llvm/Support/PatternMatch.h Modified: llvm/trunk/include/llvm/Support/PatternMatch.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/PatternMatch.h?rev=147402&r1=147401&r2=147402&view=diff ============================================================================== --- llvm/trunk/include/llvm/Support/PatternMatch.h (original) +++ llvm/trunk/include/llvm/Support/PatternMatch.h Sun Jan 1 11:55:23 2012 @@ -31,6 +31,7 @@ #include "llvm/Constants.h" #include "llvm/Instructions.h" +#include "llvm/Operator.h" namespace llvm { namespace PatternMatch { @@ -529,10 +530,8 @@ template bool match(OpTy *V) { - if (CastInst *I = dyn_cast(V)) - return I->getOpcode() == Opcode && Op.match(I->getOperand(0)); - if (ConstantExpr *CE = dyn_cast(V)) - return CE->getOpcode() == Opcode && Op.match(CE->getOperand(0)); + if (Operator *O = dyn_cast(V)) + return O->getOpcode() == Opcode && Op.match(O->getOperand(0)); return false; } }; @@ -585,12 +584,9 @@ template bool match(OpTy *V) { - if (Instruction *I = dyn_cast(V)) - if (I->getOpcode() == Instruction::Xor) - return matchIfNot(I->getOperand(0), I->getOperand(1)); - if (ConstantExpr *CE = dyn_cast(V)) - if (CE->getOpcode() == Instruction::Xor) - return matchIfNot(CE->getOperand(0), CE->getOperand(1)); + if (Operator *O = dyn_cast(V)) + if (O->getOpcode() == Instruction::Xor) + return matchIfNot(O->getOperand(0), O->getOperand(1)); return false; } private: @@ -615,12 +611,9 @@ template bool match(OpTy *V) { - if (Instruction *I = dyn_cast(V)) - if (I->getOpcode() == Instruction::Sub) - return matchIfNeg(I->getOperand(0), I->getOperand(1)); - if (ConstantExpr *CE = dyn_cast(V)) - if (CE->getOpcode() == Instruction::Sub) - return matchIfNeg(CE->getOperand(0), CE->getOperand(1)); + if (Operator *O = dyn_cast(V)) + if (O->getOpcode() == Instruction::Sub) + return matchIfNeg(O->getOperand(0), O->getOperand(1)); return false; } private: @@ -644,12 +637,9 @@ template bool match(OpTy *V) { - 
if (Instruction *I = dyn_cast<Instruction>(V)) - if (I->getOpcode() == Instruction::FSub) - return matchIfFNeg(I->getOperand(0), I->getOperand(1)); - if (ConstantExpr *CE = dyn_cast<ConstantExpr>(V)) - if (CE->getOpcode() == Instruction::FSub) - return matchIfFNeg(CE->getOperand(0), CE->getOperand(1)); + if (Operator *O = dyn_cast<Operator>(V)) + if (O->getOpcode() == Instruction::FSub) + return matchIfFNeg(O->getOperand(0), O->getOperand(1)); return false; } private: From benny.kra at googlemail.com Sun Jan 1 11:55:30 2012 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Sun, 01 Jan 2012 17:55:30 -0000 Subject: [llvm-commits] [llvm] r147403 - in /llvm/trunk: include/llvm/Support/PatternMatch.h lib/Analysis/InstructionSimplify.cpp lib/Analysis/ValueTracking.cpp Message-ID: <20120101175531.03CDD2A6C12C@llvm.org> Author: d0k Date: Sun Jan 1 11:55:30 2012 New Revision: 147403 URL: http://llvm.org/viewvc/llvm-project?rev=147403&view=rev Log: PatternMatch: Introduce a matcher for instructions with the "exact" bit. Use it to simplify a few matchers. Modified: llvm/trunk/include/llvm/Support/PatternMatch.h llvm/trunk/lib/Analysis/InstructionSimplify.cpp llvm/trunk/lib/Analysis/ValueTracking.cpp Modified: llvm/trunk/include/llvm/Support/PatternMatch.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/PatternMatch.h?rev=147403&r1=147402&r2=147403&view=diff ============================================================================== --- llvm/trunk/include/llvm/Support/PatternMatch.h (original) +++ llvm/trunk/include/llvm/Support/PatternMatch.h Sun Jan 1 11:55:30 2012 @@ -442,6 +442,26 @@ } //===----------------------------------------------------------------------===// +// Class that matches exact binary ops.
+// +template<typename SubPattern_t> +struct Exact_match { + SubPattern_t SubPattern; + + Exact_match(const SubPattern_t &SP) : SubPattern(SP) {} + + template<typename OpTy> + bool match(OpTy *V) { + if (PossiblyExactOperator *PEO = dyn_cast<PossiblyExactOperator>(V)) + return PEO->isExact() && SubPattern.match(V); + return false; + } +}; + +template<typename T> +inline Exact_match<T> m_Exact(const T &SubPattern) { return SubPattern; } + +//===----------------------------------------------------------------------===// // Matchers for CmpInst classes // Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=147403&r1=147402&r2=147403&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original) +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Sun Jan 1 11:55:30 2012 @@ -812,14 +812,10 @@ return Op0; // (X / Y) * Y -> X if the division is exact. - Value *X = 0, *Y = 0; - if ((match(Op0, m_IDiv(m_Value(X), m_Value(Y))) && Y == Op1) || // (X / Y) * Y - (match(Op1, m_IDiv(m_Value(X), m_Value(Y))) && Y == Op0)) { // Y * (X / Y) - PossiblyExactOperator *Div = - cast<PossiblyExactOperator>(Y == Op1 ? Op0 : Op1); - if (Div->isExact()) - return X; - } + Value *X = 0; + if (match(Op0, m_Exact(m_IDiv(m_Value(X), m_Specific(Op1)))) || // (X / Y) * Y + match(Op1, m_Exact(m_IDiv(m_Value(X), m_Specific(Op0))))) // Y * (X / Y) + return X; // i1 mul -> and.
if (MaxRecurse && Op0->getType()->isIntegerTy(1)) @@ -1162,8 +1158,7 @@ // (X >> A) << A -> X Value *X; - if (match(Op0, m_Shr(m_Value(X), m_Specific(Op1))) && - cast<PossiblyExactOperator>(Op0)->isExact()) + if (match(Op0, m_Exact(m_Shr(m_Value(X), m_Specific(Op1))))) return X; return 0; } Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=147403&r1=147402&r2=147403&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original) +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Sun Jan 1 11:55:30 2012 @@ -811,11 +811,9 @@ // An exact divide or right shift can only shift off zero bits, so the result // is a power of two only if the first operand is a power of two and not // copying a sign bit (sdiv int_min, 2). - if (match(V, m_LShr(m_Value(), m_Value())) || - match(V, m_UDiv(m_Value(), m_Value()))) { - PossiblyExactOperator *PEO = cast<PossiblyExactOperator>(V); - if (PEO->isExact()) - return isPowerOfTwo(PEO->getOperand(0), TD, OrZero, Depth); + if (match(V, m_Exact(m_LShr(m_Value(), m_Value()))) || + match(V, m_Exact(m_UDiv(m_Value(), m_Value())))) { + return isPowerOfTwo(cast<Operator>(V)->getOperand(0), TD, OrZero, Depth); } return false; @@ -879,10 +877,8 @@ return true; } // div exact can only produce a zero if the dividend is zero. - else if (match(V, m_IDiv(m_Value(X), m_Value()))) { - PossiblyExactOperator *BO = cast<PossiblyExactOperator>(V); - if (BO->isExact()) - return isKnownNonZero(X, TD, Depth); + else if (match(V, m_Exact(m_IDiv(m_Value(X), m_Value())))) { + return isKnownNonZero(X, TD, Depth); } // X + Y.
else if (match(V, m_Add(m_Value(X), m_Value(Y)))) { From benny.kra at googlemail.com Sun Jan 1 11:55:37 2012 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Sun, 01 Jan 2012 17:55:37 -0000 Subject: [llvm-commits] [llvm] r147404 - /llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c Message-ID: <20120101175537.2AB792A6C12C@llvm.org> Author: d0k Date: Sun Jan 1 11:55:36 2012 New Revision: 147404 URL: http://llvm.org/viewvc/llvm-project?rev=147404&view=rev Log: X86Disassembler: Fix undefined behavior found by GCC 4.6 Modified: llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c Modified: llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c?rev=147404&r1=147403&r2=147404&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c (original) +++ llvm/trunk/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c Sun Jan 1 11:55:36 2012 @@ -1502,9 +1502,11 @@ return -1; case ENCODING_IB: if (sawRegImm) { - // saw a register immediate so don't read again and instead split the previous immediate - // FIXME: This is a hack - insn->immediates[insn->numImmediatesConsumed++] = insn->immediates[insn->numImmediatesConsumed - 1] & 0xf; + // Saw a register immediate so don't read again and instead split the + // previous immediate. FIXME: This is a hack + insn->immediates[insn->numImmediatesConsumed] = + insn->immediates[insn->numImmediatesConsumed - 1] & 0xf; + ++insn->numImmediatesConsumed; break; } if (readImmediate(insn, 1)) From craig.topper at gmail.com Sun Jan 1 13:06:31 2012 From: craig.topper at gmail.com (Craig Topper) Date: Sun, 1 Jan 2012 13:06:31 -0600 Subject: [llvm-commits] X86SSELevel for AVX architecture In-Reply-To: References: Message-ID: This is similar to the fix for the prefetch instruction in r146163. 
I think the fence instructions and clflush are similarly broken. I'll see if I can find any others and I'll commit a fix. On Sun, Jan 1, 2012 at 10:18 AM, Umansky, Victor wrote: > Hi Evan, > > I noticed that in X86Subtarget constructor you set "X86SSELevel" member > of the class as "NoMMXSSE" in the case when HasAVX member is set to > "true". > Effectively that invalidates SSE features for AVX architecture - because > hasSSEn() accessors return "false" when HasAVXn() is "true". > I wonder whether this is the behavior which you'd like to enforce - as > conceptually AVX architecture complements SSE rather than replaces it > completely. > > I noticed this problem after discovering that LLVM fails to lower > "sse2.fence" intrinsic when generating a code for AVX architecture - > because this intrinsic is conditioned on hasSSE2() being "true". > Is that case was somehow missed from regression testing, or there is > another way to lower that intrinsic? > > I'd appreciate your clarifications. > > Best Regards, > Victor Umansky > > --------------------------------------------------------------------- > Intel Israel (74) Limited > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies. > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120101/bd21e245/attachment.html From craig.topper at gmail.com Sun Jan 1 13:40:22 2012 From: craig.topper at gmail.com (Craig Topper) Date: Sun, 01 Jan 2012 19:40:22 -0000 Subject: [llvm-commits] [llvm] r147409 - in /llvm/trunk: lib/Target/X86/X86InstrFormats.td lib/Target/X86/X86InstrInfo.td lib/Target/X86/X86InstrSSE.td test/CodeGen/X86/apm.ll test/CodeGen/X86/avx-intrinsics-x86.ll Message-ID: <20120101194023.15B981BE003@llvm.org> Author: ctopper Date: Sun Jan 1 13:40:22 2012 New Revision: 147409 URL: http://llvm.org/viewvc/llvm-project?rev=147409&view=rev Log: Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers. Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td llvm/trunk/lib/Target/X86/X86InstrInfo.td llvm/trunk/lib/Target/X86/X86InstrSSE.td llvm/trunk/test/CodeGen/X86/apm.ll llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFormats.td?rev=147409&r1=147408&r2=147409&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFormats.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFormats.td Sun Jan 1 13:40:22 2012 @@ -339,10 +339,6 @@ list pattern> : I, TB, Requires<[HasAVX]>; -class VoPSI o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, TB, - Requires<[HasXMM]>; // SSE2 Instruction Templates: // Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.td?rev=147409&r1=147408&r2=147409&view=diff ============================================================================== --- 
llvm/trunk/lib/Target/X86/X86InstrInfo.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Sun Jan 1 13:40:22 2012 @@ -475,6 +475,7 @@ def HasAVX2 : Predicate<"Subtarget->hasAVX2()">; def HasXMM : Predicate<"Subtarget->hasXMM()">; def HasXMMInt : Predicate<"Subtarget->hasXMMInt()">; +def HasSSE3orAVX : Predicate<"Subtarget->hasSSE3orAVX()">; def HasPOPCNT : Predicate<"Subtarget->hasPOPCNT()">; def HasAES : Predicate<"Subtarget->hasAES()">; Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=147409&r1=147408&r2=147409&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Sun Jan 1 13:40:22 2012 @@ -3253,19 +3253,21 @@ //===----------------------------------------------------------------------===// // Prefetch intrinsic. -def PREFETCHT0 : VoPSI<0x18, MRM1m, (outs), (ins i8mem:$src), - "prefetcht0\t$src", [(prefetch addr:$src, imm, (i32 3), (i32 1))]>; -def PREFETCHT1 : VoPSI<0x18, MRM2m, (outs), (ins i8mem:$src), - "prefetcht1\t$src", [(prefetch addr:$src, imm, (i32 2), (i32 1))]>; -def PREFETCHT2 : VoPSI<0x18, MRM3m, (outs), (ins i8mem:$src), - "prefetcht2\t$src", [(prefetch addr:$src, imm, (i32 1), (i32 1))]>; -def PREFETCHNTA : VoPSI<0x18, MRM0m, (outs), (ins i8mem:$src), - "prefetchnta\t$src", [(prefetch addr:$src, imm, (i32 0), (i32 1))]>; +let Predicates = [HasXMM] in { +def PREFETCHT0 : I<0x18, MRM1m, (outs), (ins i8mem:$src), + "prefetcht0\t$src", [(prefetch addr:$src, imm, (i32 3), (i32 1))]>, TB; +def PREFETCHT1 : I<0x18, MRM2m, (outs), (ins i8mem:$src), + "prefetcht1\t$src", [(prefetch addr:$src, imm, (i32 2), (i32 1))]>, TB; +def PREFETCHT2 : I<0x18, MRM3m, (outs), (ins i8mem:$src), + "prefetcht2\t$src", [(prefetch addr:$src, imm, (i32 1), (i32 1))]>, TB; +def PREFETCHNTA : I<0x18, MRM0m, (outs), (ins i8mem:$src), + 
"prefetchnta\t$src", [(prefetch addr:$src, imm, (i32 0), (i32 1))]>, TB; +} // Flush cache def CLFLUSH : I<0xAE, MRM7m, (outs), (ins i8mem:$src), "clflush\t$src", [(int_x86_sse2_clflush addr:$src)]>, - TB, Requires<[HasSSE2]>; + TB, Requires<[HasXMMInt]>; // Pause. This "instruction" is encoded as "rep; nop", so even though it // was introduced with SSE2, it's backward compatible. @@ -3273,11 +3275,11 @@ // Load, store, and memory fence def SFENCE : I<0xAE, MRM_F8, (outs), (ins), - "sfence", [(int_x86_sse_sfence)]>, TB, Requires<[HasSSE1]>; + "sfence", [(int_x86_sse_sfence)]>, TB, Requires<[HasXMM]>; def LFENCE : I<0xAE, MRM_E8, (outs), (ins), - "lfence", [(int_x86_sse2_lfence)]>, TB, Requires<[HasSSE2]>; + "lfence", [(int_x86_sse2_lfence)]>, TB, Requires<[HasXMMInt]>; def MFENCE : I<0xAE, MRM_F0, (outs), (ins), - "mfence", [(int_x86_sse2_mfence)]>, TB, Requires<[HasSSE2]>; + "mfence", [(int_x86_sse2_mfence)]>, TB, Requires<[HasXMMInt]>; def : Pat<(X86SFence), (SFENCE)>; def : Pat<(X86LFence), (LFENCE)>; @@ -5463,17 +5465,19 @@ let usesCustomInserter = 1 in { def MONITOR : PseudoI<(outs), (ins i32mem:$src1, GR32:$src2, GR32:$src3), - [(int_x86_sse3_monitor addr:$src1, GR32:$src2, GR32:$src3)]>; + [(int_x86_sse3_monitor addr:$src1, GR32:$src2, GR32:$src3)]>, + Requires<[HasSSE3orAVX]>; def MWAIT : PseudoI<(outs), (ins GR32:$src1, GR32:$src2), - [(int_x86_sse3_mwait GR32:$src1, GR32:$src2)]>; + [(int_x86_sse3_mwait GR32:$src1, GR32:$src2)]>, + Requires<[HasSSE3orAVX]>; } let Uses = [EAX, ECX, EDX] in def MONITORrrr : I<0x01, MRM_C8, (outs), (ins), "monitor", []>, TB, - Requires<[HasSSE3]>; + Requires<[HasSSE3orAVX]>; let Uses = [ECX, EAX] in def MWAITrr : I<0x01, MRM_C9, (outs), (ins), "mwait", []>, TB, - Requires<[HasSSE3]>; + Requires<[HasSSE3orAVX]>; def : InstAlias<"mwait %eax, %ecx", (MWAITrr)>, Requires<[In32BitMode]>; def : InstAlias<"mwait %rax, %rcx", (MWAITrr)>, Requires<[In64BitMode]>; Modified: llvm/trunk/test/CodeGen/X86/apm.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/apm.ll?rev=147409&r1=147408&r2=147409&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/apm.ll (original) +++ llvm/trunk/test/CodeGen/X86/apm.ll Sun Jan 1 13:40:22 2012 @@ -1,5 +1,5 @@ -; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s -; RUN: llc < %s -mtriple=x86_64-win32 | FileCheck %s -check-prefix=WIN64 +; RUN: llc < %s -mtriple=x86_64-linux -mattr=+sse3 | FileCheck %s +; RUN: llc < %s -mtriple=x86_64-win32 -mattr=+sse3 | FileCheck %s -check-prefix=WIN64 ; PR8573 ; CHECK: foo: Modified: llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll?rev=147409&r1=147408&r2=147409&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll Sun Jan 1 13:40:22 2012 @@ -2481,4 +2481,52 @@ } declare void @llvm.x86.avx.vzeroupper() nounwind +; Make sure instructions with no AVX equivalents, but are associated with SSEX feature flags still work +; CHECK: monitor +define void @monitor(i8* %P, i32 %E, i32 %H) nounwind { +entry: + tail call void @llvm.x86.sse3.monitor(i8* %P, i32 %E, i32 %H) + ret void +} +declare void @llvm.x86.sse3.monitor(i8*, i32, i32) nounwind + +; CHECK: mwait +define void @mwait(i32 %E, i32 %H) nounwind { +entry: + tail call void @llvm.x86.sse3.mwait(i32 %E, i32 %H) + ret void +} +declare void @llvm.x86.sse3.mwait(i32, i32) nounwind + +; CHECK: sfence +define void @sfence() nounwind { +entry: + tail call void @llvm.x86.sse.sfence() + ret void +} +declare void @llvm.x86.sse.sfence() nounwind + +; CHECK: lfence +define void @lfence() nounwind { +entry: + tail call void @llvm.x86.sse2.lfence() + ret void +} +declare void @llvm.x86.sse2.lfence() nounwind + +; CHECK: mfence +define void 
@mfence() nounwind { +entry: + tail call void @llvm.x86.sse2.mfence() + ret void +} +declare void @llvm.x86.sse2.mfence() nounwind + +; CHECK: clflush +define void @clflush(i8* %p) nounwind { +entry: + tail call void @llvm.x86.sse2.clflush(i8* %p) + ret void +} +declare void @llvm.x86.sse2.clflush(i8*) nounwind From craig.topper at gmail.com Sun Jan 1 13:44:39 2012 From: craig.topper at gmail.com (Craig Topper) Date: Sun, 1 Jan 2012 13:44:39 -0600 Subject: [llvm-commits] X86SSELevel for AVX architecture In-Reply-To: References: Message-ID: Fixed sfence, mfence, lfence, clflush, monitor, and mwait in r147409. On Sun, Jan 1, 2012 at 1:06 PM, Craig Topper wrote: > This is similar to the fix for the prefetch instruction in r146163. I > think the fence instructions and clflush are similarly broken. I'll see if > I can find any others and I'll commit a fix. > > On Sun, Jan 1, 2012 at 10:18 AM, Umansky, Victor > wrote: > >> Hi Evan, >> >> I noticed that in X86Subtarget constructor you set "X86SSELevel" member >> of the class as "NoMMXSSE" in the case when HasAVX member is set to >> "true". >> Effectively that invalidates SSE features for AVX architecture - because >> hasSSEn() accessors return "false" when HasAVXn() is "true". >> I wonder whether this is the behavior which you'd like to enforce - as >> conceptually AVX architecture complements SSE rather than replaces it >> completely. >> >> I noticed this problem after discovering that LLVM fails to lower >> "sse2.fence" intrinsic when generating a code for AVX architecture - >> because this intrinsic is conditioned on hasSSE2() being "true". >> Is that case was somehow missed from regression testing, or there is >> another way to lower that intrinsic? >> >> I'd appreciate your clarifications.
>> >> Best Regards, >> Victor Umansky >> >> --------------------------------------------------------------------- >> Intel Israel (74) Limited >> >> This e-mail and any attachments may contain confidential material for >> the sole use of the intended recipient(s). Any review or distribution >> by others is strictly prohibited. If you are not the intended >> recipient, please contact the sender and delete all copies. >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> >> > > > -- > ~Craig > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120101/6b6a02ec/attachment.html From craig.topper at gmail.com Sun Jan 1 13:51:58 2012 From: craig.topper at gmail.com (Craig Topper) Date: Sun, 01 Jan 2012 19:51:58 -0000 Subject: [llvm-commits] [llvm] r147411 - in /llvm/trunk: lib/Target/X86/X86InstrFormats.td lib/Target/X86/X86InstrInfo.td test/CodeGen/X86/avx-intrinsics-x86.ll Message-ID: <20120101195158.8B2EB2A6C12C@llvm.org> Author: ctopper Date: Sun Jan 1 13:51:58 2012 New Revision: 147411 URL: http://llvm.org/viewvc/llvm-project?rev=147411&view=rev Log: Allow CRC32 instructions to be selected when AVX is enabled. Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td llvm/trunk/lib/Target/X86/X86InstrInfo.td llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFormats.td?rev=147411&r1=147410&r2=147411&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFormats.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFormats.td Sun Jan 1 13:51:58 2012 @@ -436,8 +436,8 @@ // SS42FI - SSE 4.2 instructions with T8XD prefix. 
class SS42FI o, Format F, dag outs, dag ins, string asm, list pattern> - : I, T8XD, Requires<[HasSSE42]>; - + : I, T8XD, Requires<[HasSSE42orAVX]>; + // SS42AI = SSE 4.2 instructions with TA prefix class SS42AI o, Format F, dag outs, dag ins, string asm, list pattern> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.td?rev=147411&r1=147410&r2=147411&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrInfo.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Sun Jan 1 13:51:58 2012 @@ -476,6 +476,7 @@ def HasXMM : Predicate<"Subtarget->hasXMM()">; def HasXMMInt : Predicate<"Subtarget->hasXMMInt()">; def HasSSE3orAVX : Predicate<"Subtarget->hasSSE3orAVX()">; +def HasSSE42orAVX : Predicate<"Subtarget->hasSSE42orAVX()">; def HasPOPCNT : Predicate<"Subtarget->hasPOPCNT()">; def HasAES : Predicate<"Subtarget->hasAES()">; Modified: llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll?rev=147411&r1=147410&r2=147411&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll Sun Jan 1 13:51:58 2012 @@ -2530,3 +2530,24 @@ ret void } declare void @llvm.x86.sse2.clflush(i8*) nounwind + +; CHECK: crc32b +define i32 @crc32_32_8(i32 %a, i8 %b) nounwind { + %tmp = call i32 @llvm.x86.sse42.crc32.32.8(i32 %a, i8 %b) + ret i32 %tmp +} +declare i32 @llvm.x86.sse42.crc32.32.8(i32, i8) nounwind + +; CHECK: crc32w +define i32 @crc32_32_16(i32 %a, i16 %b) nounwind { + %tmp = call i32 @llvm.x86.sse42.crc32.32.16(i32 %a, i16 %b) + ret i32 %tmp +} +declare i32 @llvm.x86.sse42.crc32.32.16(i32, i16) nounwind + +; CHECK: crc32l +define i32 @crc32_32_32(i32 %a, i32 %b) nounwind { + %tmp = call i32 
@llvm.x86.sse42.crc32.32.32(i32 %a, i32 %b) + ret i32 %tmp +} +declare i32 @llvm.x86.sse42.crc32.32.32(i32, i32) nounwind From craig.topper at gmail.com Sun Jan 1 13:56:00 2012 From: craig.topper at gmail.com (Craig Topper) Date: Sun, 1 Jan 2012 13:56:00 -0600 Subject: [llvm-commits] X86SSELevel for AVX architecture In-Reply-To: References: Message-ID: Missed the CRC32 instructions. They have now been fixed in r147411. On Sun, Jan 1, 2012 at 1:44 PM, Craig Topper wrote: > Fixed sfence, mfence, lfence, clflush, monitor, and mwait in r147409. > > > On Sun, Jan 1, 2012 at 1:06 PM, Craig Topper wrote: > >> This is similar to the fix for the prefetch instruction in r146163. I >> think the fence instructions and clflush are similarly broken. I'll see if >> I can find any others and I'll commit a fix. >> >> On Sun, Jan 1, 2012 at 10:18 AM, Umansky, Victor < >> victor.umansky at intel.com> wrote: >> >>> Hi Evan, >>> >>> I noticed that in X86Subtarget constructor you set "X86SSELevel" member >>> of the class as "NoMMXSSE" in the case when HasAVX member is set to >>> "true". >>> Effectively that invalidates SSE features for AVX architecture - because >>> hasSSEn() accessors return "false" when HasAVXn() is "true". >>> I wonder whether this is the behavior which you'd like to enforce - as >>> conceptually AVX architecture complements SSE rather than replaces it >>> completely. >>> >>> I noticed this problem after discovering that LLVM fails to lower >>> "sse2.fence" intrinsic when generating a code for AVX architecture - >>> because this intrinsic is conditioned on hasSSE2() being "true". >>> Is that case was somehow missed from regression testing, or there is >>> another way to lower that intrinsic? >>> >>> I'd appreciate your clarifications.
>>> >>> Best Regards, >>> Victor Umansky >>> >>> --------------------------------------------------------------------- >>> Intel Israel (74) Limited >>> >>> This e-mail and any attachments may contain confidential material for >>> the sole use of the intended recipient(s). Any review or distribution >>> by others is strictly prohibited. If you are not the intended >>> recipient, please contact the sender and delete all copies. >>> >>> _______________________________________________ >>> llvm-commits mailing list >>> llvm-commits at cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >>> >>> >> >> >> -- >> ~Craig >> > > > > -- > ~Craig > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120101/2ed98d05/attachment.html
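
The "X86SSELevel for AVX architecture" thread above turns on how the SSE-level accessors interact with the AVX flag. The following is a minimal C++ sketch of that interaction, not the actual LLVM code. ToySubtarget and its SSELevel enum are hypothetical stand-ins for X86Subtarget, and the bodies of hasSSE3orAVX() and hasSSE42orAVX() are assumed from the predicate names added in r147409 and r147411; the commits quoted here do not show their definitions.

```cpp
#include <cassert>

// Hypothetical stand-in for the SSE level tracked by the subtarget.
enum SSELevel { NoMMXSSE, MMX, SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42 };

// Toy model only; this is NOT llvm::X86Subtarget.
struct ToySubtarget {
  SSELevel Level;
  bool HasAVX;

  // Models the behaviour reported in the thread: when AVX is enabled the
  // SSE level is reset to NoMMXSSE, so the plain hasSSEn() accessors
  // return false.
  ToySubtarget(SSELevel L, bool AVX)
      : Level(AVX ? NoMMXSSE : L), HasAVX(AVX) {}

  bool hasSSE2() const { return Level >= SSE2; }
  bool hasSSE3() const { return Level >= SSE3; }
  bool hasSSE42() const { return Level >= SSE42; }
  bool hasAVX() const { return HasAVX; }

  // Assumed definitions: accept either the SSE level or AVX, so that
  // instructions without a VEX-encoded form remain selectable under AVX.
  bool hasSSE3orAVX() const { return hasSSE3() || hasAVX(); }
  bool hasSSE42orAVX() const { return hasSSE42() || hasAVX(); }
};
```

Under this model, an instruction guarded by hasSSE2() alone cannot be selected once AVX is on, while one guarded by a combined predicate can. That is the situation the thread describes for the fence, clflush, monitor/mwait and CRC32 instructions.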