From clattner at apple.com Mon Jan 30 00:05:30 2012
From: clattner at apple.com (Chris Lattner)
Date: Sun, 29 Jan 2012 22:05:30 -0800
Subject: [llvm-commits] [llvm] r149216 - in /llvm/trunk/lib/Target/X86: X86ISelLowering.cpp X86ISelLowering.h X86InstrFragmentsSIMD.td X86InstrXOP.td
In-Reply-To: <20120130011016.71DFF2A6C12C@llvm.org>
References: <20120130011016.71DFF2A6C12C@llvm.org>
Message-ID:

On Jan 29, 2012, at 5:10 PM, Craig Topper wrote:

> Author: ctopper
> Date: Sun Jan 29 19:10:15 2012
> New Revision: 149216
>
> URL: http://llvm.org/viewvc/llvm-project?rev=149216&view=rev
> Log:
> Move some XOP patterns into instruction definition. Replae VPCMOV intrinsic patterns with custom lowering to a target specific nodes.

Hi Craig,

Would it make sense to replace the intrinsics with something that more closely matches what the code generator wants? It would be really nice to reduce the # intrinsics and that huge switch.

-Chris

From craig.topper at gmail.com Mon Jan 30 00:16:04 2012
From: craig.topper at gmail.com (Craig Topper)
Date: Sun, 29 Jan 2012 22:16:04 -0800
Subject: [llvm-commits] [llvm] r149216 - in /llvm/trunk/lib/Target/X86: X86ISelLowering.cpp X86ISelLowering.h X86InstrFragmentsSIMD.td X86InstrXOP.td
In-Reply-To:
References: <20120130011016.71DFF2A6C12C@llvm.org>
Message-ID:

On Sun, Jan 29, 2012 at 10:05 PM, Chris Lattner wrote:

>
> On Jan 29, 2012, at 5:10 PM, Craig Topper wrote:
>
> > Author: ctopper
> > Date: Sun Jan 29 19:10:15 2012
> > New Revision: 149216
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=149216&view=rev
> > Log:
> > Move some XOP patterns into instruction definition. Replae VPCMOV intrinsic patterns with custom lowering to a target specific nodes.
>
> Hi Craig,
>
> Would it make sense to replace the intrinsics with something that more
> closely matches what the code generator wants? It would be really nice to
> reduce the # intrinsics and that huge switch.
>

Currently we have no code generator support for these instructions.
But we were previously wasting an awful lot of patterns on these. Hopefully this is at least better than the patterns.

I was wondering if it made more sense to do the immediate selection on the clang side?

>
> -Chris
>
>

--
~Craig

From viridia at gmail.com Mon Jan 30 00:16:21 2012
From: viridia at gmail.com (Talin)
Date: Sun, 29 Jan 2012 22:16:21 -0800
Subject: [llvm-commits] PATCH: DenseMap::find_as()
Message-ID:

This patch adds a new method to DenseMap, "find_as()", which allows more efficient lookups in cases where the map keys are expensive to construct.

Motivation: There are several examples within LLVM of maps which have keys that contain instances of std::vector. Doing a find() on these maps requires allocating memory to construct a new key. The "find_as()" method is identical in behavior to "find()", except that the key type is a template parameter to the function, allowing the lookup key to have a different type than the keys stored in the map. This alternate key type can be one that is much cheaper to construct, such as an ArrayRef instead of a vector.

The alternate key type must be equality-comparable with the primary key type, and it must hash to the same value. This is handled by the DenseMapInfo for the map. For each alternate key type used, two additional methods must be defined in the DenseMapInfo:

   unsigned getHashValue(AltKeyType);
   bool isEqual(AltKeyType, PrimaryKeyType);

This also means that maps that use find_as() must use a custom DenseMapInfo, instead of the template default.

With this change, it will be possible to make some fairly significant optimizations in ConstantUniqueMap and other places.

Note that I didn't come up with the name "find_as", I got it from another STL implementation that I worked on while I was at Electronic Arts.
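
For readers who want to see the idea in isolation, here is a minimal sketch of an alternate-key lookup. It is not the attached patch and not LLVM's DenseMap: TinyMap, IntSpan, and VecKeyInfo are hypothetical names invented for this illustration, the table has a fixed size with linear probing, and IntSpan merely stands in for an ArrayRef-like view. The only point it tries to show is the contract described above: the traits class supplies getHashValue() and isEqual() overloads for the alternate key type, and both key types hash to the same value for equal contents.

```cpp
// Hypothetical, self-contained sketch of lookup with an alternate key type.
// None of these names come from LLVM; they exist only for illustration.
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Cheap, non-owning view used as the alternate lookup key.
struct IntSpan {
  const int *Data;
  std::size_t Size;
};

// Key traits: equal contents must hash identically for both key types.
struct VecKeyInfo {
  static unsigned hashRange(const int *P, std::size_t N) {
    unsigned H = 0;
    for (std::size_t I = 0; I != N; ++I)
      H = H * 37U + static_cast<unsigned>(P[I]);
    return H;
  }
  static unsigned getHashValue(const std::vector<int> &K) {
    return hashRange(K.data(), K.size());
  }
  static unsigned getHashValue(const IntSpan &K) {
    return hashRange(K.Data, K.Size);
  }
  static bool isEqual(const std::vector<int> &L, const std::vector<int> &R) {
    return L == R;
  }
  // Alternate key on the left, stored key on the right.
  static bool isEqual(const IntSpan &L, const std::vector<int> &R) {
    if (L.Size != R.size())
      return false;
    for (std::size_t I = 0; I != L.Size; ++I)
      if (L.Data[I] != R[I])
        return false;
    return true;
  }
};

// Fixed-capacity open-addressing table (no growth, no erase) to stay short.
template <class KeyT, class ValueT, class KeyInfoT>
class TinyMap {
  struct Bucket {
    bool Used = false;
    KeyT Key;
    ValueT Value;
  };
  static constexpr unsigned NumBuckets = 16; // must be a power of two
  std::vector<Bucket> Buckets;

  // Shared probe loop, templated on the lookup key type.
  template <class LookupKeyT>
  Bucket *lookupBucket(const LookupKeyT &Val) {
    unsigned BucketNo = KeyInfoT::getHashValue(Val);
    for (unsigned Probe = 0; Probe != NumBuckets; ++Probe) {
      Bucket &B = Buckets[(BucketNo + Probe) & (NumBuckets - 1)];
      if (!B.Used)
        return nullptr; // an unused bucket ends the probe sequence
      if (KeyInfoT::isEqual(Val, B.Key))
        return &B;
    }
    return nullptr;
  }

public:
  TinyMap() : Buckets(NumBuckets) {}

  // Assumes the key is not already present and the table is not full.
  void insert(KeyT Key, ValueT Value) {
    unsigned BucketNo = KeyInfoT::getHashValue(Key);
    while (Buckets[BucketNo & (NumBuckets - 1)].Used)
      ++BucketNo;
    Bucket &B = Buckets[BucketNo & (NumBuckets - 1)];
    B.Used = true;
    B.Key = std::move(Key);
    B.Value = std::move(Value);
  }

  // Returns a pointer to the value, or nullptr if the key is absent.
  ValueT *find(const KeyT &Key) {
    Bucket *B = lookupBucket(Key);
    return B ? &B->Value : nullptr;
  }

  // Same as find(), but the caller never has to build a KeyT.
  template <class LookupKeyT>
  ValueT *find_as(const LookupKeyT &Key) {
    Bucket *B = lookupBucket(Key);
    return B ? &B->Value : nullptr;
  }
};
```

With a map keyed on std::vector<int>, a caller holding a plain array can then write M.find_as(IntSpan{Ptr, Len}) and avoid allocating a temporary vector just to perform the lookup.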
-- -- Talin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120129/541857db/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: findas.patch Type: application/octet-stream Size: 3874 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120129/541857db/attachment.obj From sabre at nondot.org Mon Jan 30 00:21:21 2012 From: sabre at nondot.org (Chris Lattner) Date: Mon, 30 Jan 2012 06:21:21 -0000 Subject: [llvm-commits] [llvm] r149226 - /llvm/trunk/lib/VMCore/Constants.cpp Message-ID: <20120130062121.7B36E2A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 00:21:21 2012 New Revision: 149226 URL: http://llvm.org/viewvc/llvm-project?rev=149226&view=rev Log: First step of flipping on ConstantDataSequential: enable ConstantDataVector to be formed whenever ConstantVector::get is used. Modified: llvm/trunk/lib/VMCore/Constants.cpp Modified: llvm/trunk/lib/VMCore/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Constants.cpp?rev=149226&r1=149225&r2=149226&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Constants.cpp (original) +++ llvm/trunk/lib/VMCore/Constants.cpp Mon Jan 30 00:21:21 2012 @@ -836,11 +836,89 @@ return ConstantAggregateZero::get(T); if (isUndef) return UndefValue::get(T); + + // Check to see if all of the elements are ConstantFP or ConstantInt and if + // the element type is compatible with ConstantDataVector. If so, use it. + if (ConstantDataSequential::isElementTypeCompatible(C->getType()) && + (isa(C) || isa(C))) { + // We speculatively build the elements here even if it turns out that there + // is a constantexpr or something else weird in the array, since it is so + // uncommon for that to happen. 
+ if (ConstantInt *CI = dyn_cast(C)) { + if (CI->getType()->isIntegerTy(8)) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantInt *CI = dyn_cast(V[i])) + Elts.push_back(CI->getZExtValue()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataVector::get(C->getContext(), Elts); + } else if (CI->getType()->isIntegerTy(16)) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantInt *CI = dyn_cast(V[i])) + Elts.push_back(CI->getZExtValue()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataVector::get(C->getContext(), Elts); + } else if (CI->getType()->isIntegerTy(32)) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantInt *CI = dyn_cast(V[i])) + Elts.push_back(CI->getZExtValue()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataVector::get(C->getContext(), Elts); + } else if (CI->getType()->isIntegerTy(64)) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantInt *CI = dyn_cast(V[i])) + Elts.push_back(CI->getZExtValue()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataVector::get(C->getContext(), Elts); + } + } + if (ConstantFP *CFP = dyn_cast(C)) { + if (CFP->getType()->isFloatTy()) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantFP *CFP = dyn_cast(V[i])) + Elts.push_back(CFP->getValueAPF().convertToFloat()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataVector::get(C->getContext(), Elts); + } else if (CFP->getType()->isDoubleTy()) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantFP *CFP = dyn_cast(V[i])) + Elts.push_back(CFP->getValueAPF().convertToDouble()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataVector::get(C->getContext(), Elts); + } + } + } + + // Otherwise, the element type isn't compatible with ConstantDataVector, or + // the 
operand list constants a ConstantExpr or something else strange. return pImpl->VectorConstants.getOrCreate(T, V); } Constant *ConstantVector::getSplat(unsigned NumElts, Constant *V) { + // If this splat is compatible with ConstantDataVector, use it instead of + // ConstantVector. + if ((isa(V) || isa(V)) && + ConstantDataSequential::isElementTypeCompatible(V->getType())) + return ConstantDataVector::getSplat(NumElts, V); + SmallVector Elts(NumElts, V); return get(Elts); } @@ -2196,14 +2274,18 @@ return get(V->getContext(), Elts); } - ConstantFP *CFP = cast(V); - if (CFP->getType()->isFloatTy()) { - SmallVector Elts(NumElts, CFP->getValueAPF().convertToFloat()); - return get(V->getContext(), Elts); + if (ConstantFP *CFP = dyn_cast(V)) { + if (CFP->getType()->isFloatTy()) { + SmallVector Elts(NumElts, CFP->getValueAPF().convertToFloat()); + return get(V->getContext(), Elts); + } + if (CFP->getType()->isDoubleTy()) { + SmallVector Elts(NumElts, + CFP->getValueAPF().convertToDouble()); + return get(V->getContext(), Elts); + } } - assert(CFP->getType()->isDoubleTy() && "Unsupported ConstantData type"); - SmallVector Elts(NumElts, CFP->getValueAPF().convertToDouble()); - return get(V->getContext(), Elts); + return ConstantVector::getSplat(NumElts, V); } From clattner at apple.com Mon Jan 30 00:29:49 2012 From: clattner at apple.com (Chris Lattner) Date: Sun, 29 Jan 2012 22:29:49 -0800 Subject: [llvm-commits] [llvm] r149216 - in /llvm/trunk/lib/Target/X86: X86ISelLowering.cpp X86ISelLowering.h X86InstrFragmentsSIMD.td X86InstrXOP.td In-Reply-To: References: <20120130011016.71DFF2A6C12C@llvm.org> Message-ID: <122D82E5-6C24-4515-8DE4-22A356426D66@apple.com> On Jan 29, 2012, at 10:16 PM, Craig Topper wrote: > > URL: http://llvm.org/viewvc/llvm-project?rev=149216&view=rev > > Log: > > Move some XOP patterns into instruction definition. Replae VPCMOV intrinsic patterns with custom lowering to a target specific nodes. 
> > Hi Craig, > > Would it make sense to replace the intrinsics with something that more closely matches what the code generator wants? It would be really nice to reduce the # intrinsics and that huge switch. > > Currently we have no code generator support for these instructions. But we were previously wasting an awful lot of patterns on these. Hopefully this is at least better than the patterns. I agree that this is a lot better than having all those patterns! > I was wondering if it made more sense to do the immediate selection on the clang side? Yeah, I think that would be best. Clang controls the definition of it's builtins and the corresponding avxintrin.h file, so we can just make the builtins take the immediate, factoring all of this into just a few intrinsics. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120129/0caf4927/attachment.html From clattner at apple.com Mon Jan 30 00:31:26 2012 From: clattner at apple.com (Chris Lattner) Date: Sun, 29 Jan 2012 22:31:26 -0800 Subject: [llvm-commits] PATCH: DenseMap::find_as() In-Reply-To: References: Message-ID: <17E850DC-1D93-4F50-9745-3EEFC09DCA62@apple.com> On Jan 29, 2012, at 10:16 PM, Talin wrote: > This patch adds a new method to DenseMap, "find_as()", which allows more efficient lookups in cases where the map keys are expensive to construct. Sounds great to me, please commit. Please add some of this explanation to the datastructure section of the programmer's manual though! -Chris > > Motivation: There are several examples within LLVM of maps which have keys that contain instances of std::vector. Doing a find() on these maps requires allocating memory to construct a new key. The "find_as()" method is identical in behavior to "find()", except that the key type is a template parameter to the function, allowing the lookup key to have a different type than the keys stored in the map. 
This alternate key type can be one that is much cheaper to construct, such as an ArrayRef instead of a vector. > > The alternate key type must be equality-comparable with the primary key type, and it must hash to the same value. This is handled by the DenseMapInfo for the map. For each alternate key type used, two additional methods must be defined in the DenseMapInfo: > > unsigned getHashValue(AltKeyType); > bool isEqual(AltKeyType, PrimaryKeyType); > > This also means that maps that use find_as() must use a custom DenseMapInfo, instead of the template default. > > With this change, it will be possible to make some fairly significant optimizations in ConstantUniqueMap and other places. > > Note that I didn't come up with the name "find_as", I got it from another STL implementation that I worked on while I was at Electronic Arts. > > -- > -- Talin > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From craig.topper at gmail.com Mon Jan 30 00:47:38 2012 From: craig.topper at gmail.com (Craig Topper) Date: Sun, 29 Jan 2012 22:47:38 -0800 Subject: [llvm-commits] [llvm] r149216 - in /llvm/trunk/lib/Target/X86: X86ISelLowering.cpp X86ISelLowering.h X86InstrFragmentsSIMD.td X86InstrXOP.td In-Reply-To: <122D82E5-6C24-4515-8DE4-22A356426D66@apple.com> References: <20120130011016.71DFF2A6C12C@llvm.org> <122D82E5-6C24-4515-8DE4-22A356426D66@apple.com> Message-ID: On Sun, Jan 29, 2012 at 10:29 PM, Chris Lattner wrote: > > On Jan 29, 2012, at 10:16 PM, Craig Topper wrote: > > > URL: http://llvm.org/viewvc/llvm-project?rev=149216&view=rev >> > Log: >> > Move some XOP patterns into instruction definition. Replae VPCMOV >> intrinsic patterns with custom lowering to a target specific nodes. >> >> Hi Craig, >> >> Would it make sense to replace the intrinsics with something that more >> closely matches what the code generator wants? 
It would be really nice to >> reduce the # intrinsics and that huge switch. >> > > Currently we have no code generator support for these instructions. But we > were previously wasting an awful lot of patterns on these. Hopefully this > is at least better than the patterns. > > > I agree that this is a lot better than having all those patterns! > On a related subject, we also have a ton of patterns for VPCMOV right now because we have intrinsics for every vector type, but the opcode itself does a bitwise conditional move based on a mask. The builtins around these intrinsics currently match gcc. How best to simplify this? Custom lower to bitcasts and a target specific node in llvm? Keep the builtins, and type cast to a single intrinsic in CGBuiltins in clang? Put the casts in xopintrin.h and use a single builtin? > I was wondering if it made more sense to do the immediate selection on > the clang side? > > > Yeah, I think that would be best. Clang controls the definition of it's > builtins and the corresponding avxintrin.h file, so we can just make the > builtins take the immediate, factoring all of this into just a few > intrinsics. > > -Chris > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120129/7b4756a1/attachment.html From viridia at gmail.com Mon Jan 30 00:55:43 2012 From: viridia at gmail.com (Talin) Date: Mon, 30 Jan 2012 06:55:43 -0000 Subject: [llvm-commits] [llvm] r149229 - in /llvm/trunk: docs/ProgrammersManual.html include/llvm/ADT/DenseMap.h unittests/ADT/DenseMapTest.cpp Message-ID: <20120130065544.0509A2A6C12C@llvm.org> Author: talin Date: Mon Jan 30 00:55:43 2012 New Revision: 149229 URL: http://llvm.org/viewvc/llvm-project?rev=149229&view=rev Log: DenseMap::find_as() and unit tests. 
Modified: llvm/trunk/docs/ProgrammersManual.html llvm/trunk/include/llvm/ADT/DenseMap.h llvm/trunk/unittests/ADT/DenseMapTest.cpp Modified: llvm/trunk/docs/ProgrammersManual.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ProgrammersManual.html?rev=149229&r1=149228&r2=149229&view=diff ============================================================================== --- llvm/trunk/docs/ProgrammersManual.html (original) +++ llvm/trunk/docs/ProgrammersManual.html Mon Jan 30 00:55:43 2012 @@ -1753,7 +1753,7 @@

 There are several aspects of DenseMap that you should be aware of, however. The
-iterators in a densemap are invalidated whenever an insertion occurs, unlike
+iterators in a DenseMap are invalidated whenever an insertion occurs, unlike
 map. Also, because DenseMap allocates space for a large number of key/value
 pairs (it starts with 64 by default), it will waste a lot of space if your keys
 or values are large. Finally, you must implement a partial specialization of
@@ -1761,6 +1761,14 @@
 is required to tell DenseMap about two special marker values (which can never be
 inserted into the map) that it needs internally.

+

+DenseMap's find_as() method supports lookup operations using an alternate key
+type. This is useful in cases where the normal key type is expensive to
+construct, but cheap to compare against. The DenseMapInfo is responsible for
+defining the appropriate comparison and hashing methods for each alternate
+key type used.
+

+

Modified: llvm/trunk/include/llvm/ADT/DenseMap.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/DenseMap.h?rev=149229&r1=149228&r2=149229&view=diff
==============================================================================
--- llvm/trunk/include/llvm/ADT/DenseMap.h (original)
+++ llvm/trunk/include/llvm/ADT/DenseMap.h Mon Jan 30 00:55:43 2012
@@ -147,6 +147,26 @@
     return end();
   }
 
+  /// Alternate version of find() which allows a different, and possibly
+  /// less expensive, key type.
+  /// The DenseMapInfo is responsible for supplying methods
+  /// getHashValue(LookupKeyT) and isEqual(LookupKeyT, KeyT) for each key
+  /// type used.
+  template<class LookupKeyT>
+  iterator find_as(const LookupKeyT &Val) {
+    BucketT *TheBucket;
+    if (LookupBucketFor(Val, TheBucket))
+      return iterator(TheBucket, Buckets+NumBuckets, true);
+    return end();
+  }
+  template<class LookupKeyT>
+  const_iterator find_as(const LookupKeyT &Val) const {
+    BucketT *TheBucket;
+    if (LookupBucketFor(Val, TheBucket))
+      return const_iterator(TheBucket, Buckets+NumBuckets, true);
+    return end();
+  }
+
   /// lookup - Return the entry for the specified key, or a default
   /// constructed value if no such entry exists.
   ValueT lookup(const KeyT &Val) const {
@@ -309,6 +329,10 @@
   static unsigned getHashValue(const KeyT &Val) {
     return KeyInfoT::getHashValue(Val);
   }
+  template<class LookupKeyT>
+  static unsigned getHashValue(const LookupKeyT &Val) {
+    return KeyInfoT::getHashValue(Val);
+  }
   static const KeyT getEmptyKey() {
     return KeyInfoT::getEmptyKey();
   }
@@ -320,7 +344,8 @@
   /// FoundBucket. If the bucket contains the key and a value, this returns
   /// true, otherwise it returns a bucket with an empty marker or tombstone and
   /// returns false.
-  bool LookupBucketFor(const KeyT &Val, BucketT *&FoundBucket) const {
+  template<class LookupKeyT>
+  bool LookupBucketFor(const LookupKeyT &Val, BucketT *&FoundBucket) const {
     unsigned BucketNo = getHashValue(Val);
     unsigned ProbeAmt = 1;
     BucketT *BucketsPtr = Buckets;
@@ -341,7 +366,7 @@
     while (1) {
       BucketT *ThisBucket = BucketsPtr + (BucketNo & (NumBuckets-1));
       // Found Val's bucket? If so, return it.
-      if (KeyInfoT::isEqual(ThisBucket->first, Val)) {
+      if (KeyInfoT::isEqual(Val, ThisBucket->first)) {
         FoundBucket = ThisBucket;
         return true;
       }

Modified: llvm/trunk/unittests/ADT/DenseMapTest.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ADT/DenseMapTest.cpp?rev=149229&r1=149228&r2=149229&view=diff
==============================================================================
--- llvm/trunk/unittests/ADT/DenseMapTest.cpp (original)
+++ llvm/trunk/unittests/ADT/DenseMapTest.cpp Mon Jan 30 00:55:43 2012
@@ -176,4 +176,45 @@
   EXPECT_TRUE(cit == cit2);
 }
 
+// Key traits that allows lookup with either an unsigned or char* key;
+// In the latter case, "a" == 0, "b" == 1 and so on.
+struct TestDenseMapInfo {
+  static inline unsigned getEmptyKey() { return ~0; }
+  static inline unsigned getTombstoneKey() { return ~0U - 1; }
+  static unsigned getHashValue(const unsigned& Val) { return Val * 37U; }
+  static unsigned getHashValue(const char* Val) {
+    return (unsigned)(Val[0] - 'a') * 37U;
+  }
+  static bool isEqual(const unsigned& LHS, const unsigned& RHS) {
+    return LHS == RHS;
+  }
+  static bool isEqual(const char* LHS, const unsigned& RHS) {
+    return (unsigned)(LHS[0] - 'a') == RHS;
+  }
+};
+
+// find_as() tests
+TEST_F(DenseMapTest, FindAsTest) {
+  DenseMap<unsigned, unsigned, TestDenseMapInfo> map;
+  map[0] = 1;
+  map[1] = 2;
+  map[2] = 3;
+
+  // Size tests
+  EXPECT_EQ(3u, map.size());
+
+  // Normal lookup tests
+  EXPECT_EQ(1, map.count(1));
+  EXPECT_EQ(1u, map.find(0)->second);
+  EXPECT_EQ(2u, map.find(1)->second);
+  EXPECT_EQ(3u, map.find(2)->second);
+  EXPECT_TRUE(map.find(3) == map.end());
+
+  // find_as() tests
+  EXPECT_EQ(1u, map.find_as("a")->second);
+  EXPECT_EQ(2u, map.find_as("b")->second);
+  EXPECT_EQ(3u, map.find_as("c")->second);
+  EXPECT_TRUE(map.find_as("d") == map.end());
+}
+
 }

From marina.yatsina at intel.com Mon Jan 30 01:19:58 2012
From: marina.yatsina at intel.com (Yatsina, Marina)
Date: Mon, 30 Jan 2012 07:19:58 +0000
Subject: [llvm-commits] [llvm] r134741 - in /llvm/trunk/lib/Target/X86: MCTargetDesc/X86MCTargetDesc.cpp X86Subtarget.cpp
References: <20110708211415.2D93E2A6C12C@llvm.org>
 <7DE70FDACDE4CD4887C4278C12A2E30506DE39@HASMSX104.ger.corp.intel.com>
Message-ID:

Hi,

I did not get a confirmation that my previous mail got distributed to the llvm-commits ML.
I wanted to know the status of my patch.

Thank you,
Marina.

-----Original Message-----
From: Yatsina, Marina
Sent: Wednesday, January 25, 2012 13:57
To: 'llvm-commits at cs.uiuc.edu'
Subject: RE: [llvm-commits] [llvm] r134741 - in /llvm/trunk/lib/Target/X86: MCTargetDesc/X86MCTargetDesc.cpp X86Subtarget.cpp

Hi,

I have found a bug introduced by commit 134741.
The commit added use of macros that are not defined on Windows and they are causing X86Subtarget to choose "generic" as the CPUName. I've opened Bug #11834 on the problem: http://www.llvm.org/bugs/show_bug.cgi?id=11834 I've also attached a fix to this mail and to the bug opened in bugzilla. Thank you, Marina. -----Original Message----- From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Evan Cheng Sent: Saturday, July 09, 2011 00:14 To: llvm-commits at cs.uiuc.edu Subject: [llvm-commits] [llvm] r134741 - in /llvm/trunk/lib/Target/X86: MCTargetDesc/X86MCTargetDesc.cpp X86Subtarget.cpp Author: evancheng Date: Fri Jul 8 16:14:14 2011 New Revision: 134741 URL: http://llvm.org/viewvc/llvm-project?rev=134741&view=rev Log: For non-x86 host, used generic as CPU name. Modified: llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp llvm/trunk/lib/Target/X86/X86Subtarget.cpp Modified: llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp?rev=134741&r1=134740&r2=134741&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp (original) +++ llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp Fri Jul 8 16:14:14 2011 @@ -140,8 +140,13 @@ } std::string CPUName = CPU; - if (CPUName.empty()) + if (CPUName.empty()) { +#if defined (__x86_64__) || defined(__i386__) CPUName = sys::getHostCPUName(); +#else + CPUName = "generic"; +#endif + } if (ArchFS.empty() && CPUName.empty() && hasX86_64()) // Auto-detect if host is 64-bit capable, it's the default if true. 
Modified: llvm/trunk/lib/Target/X86/X86Subtarget.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.cpp?rev=134741&r1=134740&r2=134741&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86Subtarget.cpp (original) +++ llvm/trunk/lib/Target/X86/X86Subtarget.cpp Fri Jul 8 16:14:14 2011 @@ -258,12 +258,17 @@ ArchFS = FS; } - std::string CPUName = CPU; - if (CPUName.empty()) - CPUName = sys::getHostCPUName(); - // Determine default and user specified characteristics - if (!CPUName.empty() || !ArchFS.empty()) { + if (!ArchFS.empty()) { + std::string CPUName = CPU; + if (CPUName.empty()) { +#if defined (__x86_64__) || defined(__i386__) + CPUName = sys::getHostCPUName(); +#else + CPUName = "generic"; +#endif + } + // If feature string is not empty, parse features string. ParseSubtargetFeatures(CPUName, ArchFS); // All X86-64 CPUs also have SSE2, however user might request no SSE via _______________________________________________ llvm-commits mailing list llvm-commits at cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. From sabre at nondot.org Mon Jan 30 01:36:01 2012 From: sabre at nondot.org (Chris Lattner) Date: Mon, 30 Jan 2012 07:36:01 -0000 Subject: [llvm-commits] [llvm] r149230 - /llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Message-ID: <20120130073601.B01052A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 01:36:01 2012 New Revision: 149230 URL: http://llvm.org/viewvc/llvm-project?rev=149230&view=rev Log: fix a major oversight that is breaking some llvm-test tests. 
Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=149230&r1=149229&r2=149230&view=diff ============================================================================== --- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp (original) +++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Mon Jan 30 01:36:01 2012 @@ -900,6 +900,7 @@ AbbrevToUse = CString7Abbrev; } else if (const ConstantDataSequential *CDS = dyn_cast(C)) { + Code = bitc::CST_CODE_DATA; Type *EltTy = CDS->getType()->getElementType(); if (isa(EltTy)) { for (unsigned i = 0, e = CDS->getNumElements(); i != e; ++i) From clattner at apple.com Mon Jan 30 01:42:25 2012 From: clattner at apple.com (Chris Lattner) Date: Sun, 29 Jan 2012 23:42:25 -0800 Subject: [llvm-commits] [llvm] r149216 - in /llvm/trunk/lib/Target/X86: X86ISelLowering.cpp X86ISelLowering.h X86InstrFragmentsSIMD.td X86InstrXOP.td In-Reply-To: References: <20120130011016.71DFF2A6C12C@llvm.org> <122D82E5-6C24-4515-8DE4-22A356426D66@apple.com> Message-ID: <10D2C4A7-6029-4631-AC7D-7F1E57D27967@apple.com> On Jan 29, 2012, at 10:47 PM, Craig Topper wrote: >> Currently we have no code generator support for these instructions. But we were previously wasting an awful lot of patterns on these. Hopefully this is at least better than the patterns. > > I agree that this is a lot better than having all those patterns! > > On a related subject, we also have a ton of patterns for VPCMOV right now because we have intrinsics for every vector type, but the opcode itself does a bitwise conditional move based on a mask. The builtins around these intrinsics currently match gcc. Ick. It is occasionally convenient to use the same builtins as GCC (e.g. the gcc_builtin logic), but not if it makes the compiler more gross. It sounds like these should all be merged into a single builtin. 
Matching GCC's vector __builtins is a non-goal, and we already deviate strongly from it in the SSE and ARM intrinsics.

> How best to simplify this? Custom lower to bitcasts and a target specific node in llvm? Keep the builtins, and type cast to a single intrinsic in CGBuiltins in clang? Put the casts in xopintrin.h and use a single builtin?

The latter. Push the complexity into the header file if possible.

-Chris

From craig.topper at gmail.com Mon Jan 30 01:50:31 2012
From: craig.topper at gmail.com (Craig Topper)
Date: Mon, 30 Jan 2012 07:50:31 -0000
Subject: [llvm-commits] [llvm] r149232 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp lib/Target/X86/X86InstrSSE.td test/CodeGen/X86/avx-shuffle.ll
Message-ID: <20120130075031.E27B52A6C12C@llvm.org>

Author: ctopper
Date: Mon Jan 30 01:50:31 2012
New Revision: 149232

URL: http://llvm.org/viewvc/llvm-project?rev=149232&view=rev
Log:
Fix pattern for memory form of PSHUFD for use with FP vectors to remove bitcast to an integer vector that normal code wouldn't have. Also remove bitcasts from code that turns splat vector loads into a shuffle as it was making the broken pattern necessary.

Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/lib/Target/X86/X86InstrSSE.td llvm/trunk/test/CodeGen/X86/avx-shuffle.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149232&r1=149231&r2=149232&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Jan 30 01:50:31 2012 @@ -4846,21 +4846,16 @@ int EltNo = (Offset - StartOffset) >> 2; int NumElems = VT.getVectorNumElements(); - EVT CanonVT = VT.getSizeInBits() == 128 ? MVT::v4i32 : MVT::v8i32; EVT NVT = EVT::getVectorVT(*DAG.getContext(), PVT, NumElems); SDValue V1 = DAG.getLoad(NVT, dl, Chain, Ptr, LD->getPointerInfo().getWithOffset(StartOffset), false, false, false, 0); - // Canonicalize it to a v4i32 or v8i32 shuffle. SmallVector Mask; for (int i = 0; i < NumElems; ++i) Mask.push_back(EltNo); - V1 = DAG.getNode(ISD::BITCAST, dl, CanonVT, V1); - return DAG.getNode(ISD::BITCAST, dl, NVT, - DAG.getVectorShuffle(CanonVT, dl, V1, - DAG.getUNDEF(CanonVT),&Mask[0])); + return DAG.getVectorShuffle(NVT, dl, V1, DAG.getUNDEF(NVT), &Mask[0]); } return SDValue(); Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=149232&r1=149231&r2=149232&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Mon Jan 30 01:50:31 2012 @@ -3998,8 +3998,7 @@ def : Pat<(v4i32 (X86PShufd (bc_v4i32 (memopv2i64 addr:$src1)), (i8 imm:$imm))), (VPSHUFDmi addr:$src1, imm:$imm)>; - def : Pat<(v4i32 (X86PShufd (bc_v4i32 (memopv4f32 addr:$src1)), - (i8 imm:$imm))), + def : Pat<(v4f32 (X86PShufd (memopv4f32 addr:$src1), (i8 imm:$imm))), (VPSHUFDmi addr:$src1, imm:$imm)>; def 
: Pat<(v4f32 (X86PShufd VR128:$src1, (i8 imm:$imm))), (VPSHUFDri VR128:$src1, imm:$imm)>; @@ -4051,8 +4050,7 @@ def : Pat<(v4i32 (X86PShufd (bc_v4i32 (memopv2i64 addr:$src1)), (i8 imm:$imm))), (PSHUFDmi addr:$src1, imm:$imm)>; - def : Pat<(v4i32 (X86PShufd (bc_v4i32 (memopv4f32 addr:$src1)), - (i8 imm:$imm))), + def : Pat<(v4f32 (X86PShufd (memopv4f32 addr:$src1), (i8 imm:$imm))), (PSHUFDmi addr:$src1, imm:$imm)>; def : Pat<(v4f32 (X86PShufd VR128:$src1, (i8 imm:$imm))), (PSHUFDri VR128:$src1, imm:$imm)>; Modified: llvm/trunk/test/CodeGen/X86/avx-shuffle.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-shuffle.ll?rev=149232&r1=149231&r2=149232&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-shuffle.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-shuffle.ll Mon Jan 30 01:50:31 2012 @@ -96,3 +96,16 @@ %r = extractelement <8 x i32> %b, i32 2 ret i32 %r } + +define <4 x float> @test11(<4 x float> %a) nounwind { +; CHECK: pshufd $27 + %tmp1 = shufflevector <4 x float> %a, <4 x float> undef, <4 x i32> + ret <4 x float> %tmp1 +} + +define <4 x float> @test12(<4 x float>* %a) nounwind { +; CHECK: pshufd $27, ( + %tmp0 = load <4 x float>* %a + %tmp1 = shufflevector <4 x float> %tmp0, <4 x float> undef, <4 x i32> + ret <4 x float> %tmp1 +} From tobias at grosser.es Mon Jan 30 02:36:49 2012 From: tobias at grosser.es (Tobias Grosser) Date: Mon, 30 Jan 2012 09:36:49 +0100 Subject: [llvm-commits] [LLVMdev] [PATCH] BasicBlock Autovectorization Pass In-Reply-To: <1327764041.2489.820.camel@sapling> References: <1320762963.19359.117.camel@sapling> <4EB98207.2070807@grosser.es> <1320791390.19359.262.camel@sapling> <4EBC4B0F.6010609@grosser.es> <1321050998.19359.539.camel@sapling> <4EBDA7F9.9080709@grosser.es> <1321053083.19359.550.camel@sapling> <4EBDB1BF.7090006@grosser.es> <1321400339.19359.782.camel@sapling> <1321486739.19359.1067.camel@sapling> <4EC504B5.2020408@grosser.es> 
<1321898108.2507.36.camel@sapling> <1321932161.2507.101.camel@sapling> <1322067157.2507.263.camel@sapling> <4ED8F7B0.8050309@grosser.es> <1323822351.590.1687.camel@sapling> <4EFC7291.9040808@grosser.es> <1325179929.13080.2839.camel@sapling> <4EFD7FD3.8040800@grosser.es> <1327378420.32397.1603.camel@sapling> <4F1ED40C.80306@grosser.es> <1327421849.11266.69.camel@sapling> <1327438907.11266.134.camel@sapling> <4F1FD5A4.9040301@grosser.es> <1327764041.2489.820.camel@sapling> Message-ID: <4F2656A1.6040409@grosser.es> On 01/28/2012 04:20 PM, Hal Finkel wrote: > On Wed, 2012-01-25 at 11:12 +0100, Tobias Grosser wrote: >> On 01/24/2012 10:01 PM, Hal Finkel wrote: >>> I have attached the latest version of my basic-block autovectorization >>> pass. >> >> Nice. >> >> >>> With regard to the non-trivial cycle checking I had mentioned >>> previously, I implemented the "late abort" solution and made it the >>> default for cases where the full cycle check would be expensive (for >>> blocks that have many candidate pairs). For blocks with fewer candidate >>> pairs, the full cycle check is used. >> >> Good. >> >>> I believe that I have addressed all concerns raised thus far (except for >>> the container Value* -> Instruction* type changes, which Tobias said he >>> would be okay with having changed post commit). If I receive no >>> objections over the next few days, I'll commit. >> >> Alright. >> >>> I would like to thank everyone who has provided feedback, many of the >>> suggestions have proved quite valuable.
>> >> A final nitpick: >> >>> + if (CallInst *C = dyn_cast<CallInst>(I)) { >>> + if (!isVectorizableIntrinsic(C)) >>> + return false; >>> + } else if (LoadInst *L = dyn_cast<LoadInst>(I)) { >>> + // Vectorize simple loads if possible: >>> + IsSimpleLoadStore = L->isSimple(); >>> + if (!IsSimpleLoadStore || NoMemOps) >>> + return false; >>> + } else if (StoreInst *S = dyn_cast<StoreInst>(I)) { >>> + // Vectorize simple stores if possible: >>> + IsSimpleLoadStore = S->isSimple(); >>> + if (!IsSimpleLoadStore || NoMemOps) >>> + return false; >>> + } else if (CastInst *C = dyn_cast<CastInst>(I)) { >>> + // We can vectorize casts, but not casts of pointer types, etc. >>> + if (NoCasts) >>> + return false; >>> + >>> + Type *SrcTy = C->getSrcTy(); >>> + if (!SrcTy->isSingleValueType() || SrcTy->isPointerTy()) >>> + return false; >>> + >>> + Type *DestTy = C->getDestTy(); >>> + if (!DestTy->isSingleValueType() || DestTy->isPointerTy()) >>> + return false; >>> + } else if (!(I->isBinaryOp() || isa(I) || >>> + isa(I) || isa(I))) >>> + return false; >> >> You may want to add braces to a single-statement branch if the other >> branches also have braces. (I think I have seen this happening a couple >> of times). > > Good, I prefer it this way (updated patch attached). I'll commit this > today or tomorrow. Has anyone checked the CMake build with this patch > recently? I checked on Friday and at least for me it compiles fine. Also 'make check' finishes without errors. Cheers Tobi From craig.topper at gmail.com Mon Jan 30 02:33:36 2012 From: craig.topper at gmail.com (Craig Topper) Date: Mon, 30 Jan 2012 08:33:36 -0000 Subject: [llvm-commits] [llvm] r149234 - /llvm/trunk/include/llvm/IntrinsicsX86.td Message-ID: <20120130083336.D41922A6C12D@llvm.org> Author: ctopper Date: Mon Jan 30 02:33:36 2012 New Revision: 149234 URL: http://llvm.org/viewvc/llvm-project?rev=149234&view=rev Log: Add GCCBuiltin declarations for cmpsd/cmpss/cmppd/cmpps to allow custom code to be removed from clang.
Modified: llvm/trunk/include/llvm/IntrinsicsX86.td Modified: llvm/trunk/include/llvm/IntrinsicsX86.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IntrinsicsX86.td?rev=149234&r1=149233&r2=149234&view=diff ============================================================================== --- llvm/trunk/include/llvm/IntrinsicsX86.td (original) +++ llvm/trunk/include/llvm/IntrinsicsX86.td Mon Jan 30 02:33:36 2012 @@ -145,10 +145,10 @@ // Comparison ops let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". - def int_x86_sse_cmp_ss : + def int_x86_sse_cmp_ss : GCCBuiltin<"__builtin_ia32_cmpss">, Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_v4f32_ty, llvm_i8_ty], [IntrNoMem]>; - def int_x86_sse_cmp_ps : + def int_x86_sse_cmp_ps : GCCBuiltin<"__builtin_ia32_cmpps">, Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_v4f32_ty, llvm_i8_ty], [IntrNoMem]>; def int_x86_sse_comieq_ss : GCCBuiltin<"__builtin_ia32_comieq">, @@ -281,10 +281,10 @@ // FP comparison ops let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". - def int_x86_sse2_cmp_sd : + def int_x86_sse2_cmp_sd : GCCBuiltin<"__builtin_ia32_cmpsd">, Intrinsic<[llvm_v2f64_ty], [llvm_v2f64_ty, llvm_v2f64_ty, llvm_i8_ty], [IntrNoMem]>; - def int_x86_sse2_cmp_pd : + def int_x86_sse2_cmp_pd : GCCBuiltin<"__builtin_ia32_cmppd">, Intrinsic<[llvm_v2f64_ty], [llvm_v2f64_ty, llvm_v2f64_ty, llvm_i8_ty], [IntrNoMem]>; def int_x86_sse2_comieq_sd : GCCBuiltin<"__builtin_ia32_comisdeq">, From grosser at fim.uni-passau.de Mon Jan 30 03:07:45 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 09:07:45 -0000 Subject: [llvm-commits] [polly] r149239 - /polly/trunk/lib/JSON/json_value.cpp Message-ID: <20120130090745.79D332A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 03:07:45 2012 New Revision: 149239 URL: http://llvm.org/viewvc/llvm-project?rev=149239&view=rev Log: Disable some clang warnings in imported JSON code. 
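For readers unfamiliar with the warning that r149239 silences, the following is a minimal sketch of how -Wcovered-switch-default arises and how a file-scope pragma suppresses it. The enum Color and the functions colorCode and checkColorCodes are hypothetical names invented for illustration; they are not part of Polly or the imported JSON code. The __clang__ guard is also an addition made here so the sketch stays quiet on other compilers; the commit itself uses the pragma unguarded.

```cpp
#include <cassert>

// Hypothetical enum, for illustration only (not from Polly or the JSON code).
enum class Color { Red, Green };

// Suppress clang's -Wcovered-switch-default for the rest of this file.
// The guard is an assumption of this sketch, not part of the commit.
#if defined(__clang__)
#pragma clang diagnostic ignored "-Wcovered-switch-default"
#endif

// Every enumerator has its own case, so the 'default' label is what
// clang would report under -Wcovered-switch-default.
int colorCode(Color C) {
  switch (C) {
  case Color::Red:
    return 1;
  case Color::Green:
    return 2;
  default:
    return -1;
  }
}

// Small self-check: the pragma changes diagnostics only, not behaviour.
void checkColorCodes() {
  assert(colorCode(Color::Red) == 1);
  assert(colorCode(Color::Green) == 2);
}
```

The pragma only affects which diagnostics are emitted, so the imported file can stay byte-for-byte close to its upstream source, which is the motivation given in the commit.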
Modified: polly/trunk/lib/JSON/json_value.cpp Modified: polly/trunk/lib/JSON/json_value.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/JSON/json_value.cpp?rev=149239&r1=149238&r2=149239&view=diff ============================================================================== --- polly/trunk/lib/JSON/json_value.cpp (original) +++ polly/trunk/lib/JSON/json_value.cpp Mon Jan 30 03:07:45 2012 @@ -13,6 +13,10 @@ # include "json_batchallocator.h" #endif // #ifndef JSON_USE_SIMPLE_INTERNAL_ALLOCATOR +// Disable warnings. We do not fix these warnings, as this is a file imported +// into Polly and we do not want to diverge from the original source. +#pragma clang diagnostic ignored "-Wcovered-switch-default" + #define JSON_ASSERT_UNREACHABLE assert( false ) #define JSON_ASSERT( condition ) assert( condition ); // @todo <= change this into an exception throw // Do not use throw when exception is disable. From grosser at fim.uni-passau.de Mon Jan 30 03:07:51 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 09:07:51 -0000 Subject: [llvm-commits] [polly] r149240 - in /polly/trunk: include/polly/RegisterPasses.h lib/RegisterPasses.cpp Message-ID: <20120130090751.210C22A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 03:07:50 2012 New Revision: 149240 URL: http://llvm.org/viewvc/llvm-project?rev=149240&view=rev Log: RegisterPass: Expose functions to register Polly passes Added: polly/trunk/include/polly/RegisterPasses.h Modified: polly/trunk/lib/RegisterPasses.cpp Added: polly/trunk/include/polly/RegisterPasses.h URL: http://llvm.org/viewvc/llvm-project/polly/trunk/include/polly/RegisterPasses.h?rev=149240&view=auto ============================================================================== --- polly/trunk/include/polly/RegisterPasses.h (added) +++ polly/trunk/include/polly/RegisterPasses.h Mon Jan 30 03:07:50 2012 @@ -0,0 +1,27 @@ +//===------ polly/RegisterPasses.h - Register the Polly passes *- C++ -*-===// +// +// The LLVM 
Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// Functions to register the Polly passes in a LLVM pass manager. +// +//===----------------------------------------------------------------------===// + +namespace llvm { + class PassManagerBase; +} + +// Register the Polly preoptimization passes. Preoptimizations are used to +// prepare the LLVM-IR for Polly. They increase the amount of code that can be +// optimized. +// (These passes are automatically included in registerPollyPasses). +void registerPollyPreoptPasses(llvm::PassManagerBase &PM); + +// Register the Polly optimizer (including its preoptimizations). +void registerPollyPasses(llvm::PassManagerBase &PM, + bool DisableScheduler = false, + bool DisableCodegen = false); Modified: polly/trunk/lib/RegisterPasses.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/RegisterPasses.cpp?rev=149240&r1=149239&r2=149240&view=diff ============================================================================== --- polly/trunk/lib/RegisterPasses.cpp (original) +++ polly/trunk/lib/RegisterPasses.cpp Mon Jan 30 03:07:50 2012 @@ -10,14 +10,7 @@ // Add the Polly passes to the optimization passes executed at -O3. 
// //===----------------------------------------------------------------------===// -#include "llvm/Analysis/Passes.h" -#include "llvm/InitializePasses.h" -#include "llvm/PassManager.h" -#include "llvm/PassRegistry.h" -#include "llvm/Transforms/Scalar.h" -#include "llvm/Transforms/IPO/PassManagerBuilder.h" -#include "llvm/Support/CommandLine.h" - +#include "polly/RegisterPasses.h" #include "polly/LinkAllPasses.h" #include "polly/Cloog.h" @@ -26,6 +19,14 @@ #include "polly/ScopInfo.h" #include "polly/TempScopInfo.h" +#include "llvm/Analysis/Passes.h" +#include "llvm/InitializePasses.h" +#include "llvm/PassManager.h" +#include "llvm/PassRegistry.h" +#include "llvm/Transforms/Scalar.h" +#include "llvm/Transforms/IPO/PassManagerBuilder.h" +#include "llvm/Support/CommandLine.h" + #include using namespace llvm; @@ -109,8 +110,7 @@ static StaticInitializer InitializeEverything; -static void registerPollyPreoptPasses(const llvm::PassManagerBuilder &Builder, - llvm::PassManagerBase &PM) { +void registerPollyPreoptPasses(llvm::PassManagerBase &PM) { // A standard set of optimization passes partially taken/copied from the // set of default optimization passes. It is used to bring the code into // a canonical form that can than be analyzed by Polly. This set of passes is @@ -142,38 +142,12 @@ PM.add(polly::createRegionSimplifyPass()); } -static void registerPollyPasses(const llvm::PassManagerBuilder &Builder, - llvm::PassManagerBase &PM) { - - if (Builder.OptLevel == 0) - return; - - if (PollyOnlyPrinter || PollyPrinter || PollyOnlyViewer || PollyViewer || - ExportJScop || ImportJScop) - PollyEnabled = true; - - if (!PollyEnabled) { - if (DisableCodegen) - errs() << "The option -polly-no-codegen has no effect. " - "Polly was not enabled\n"; - - if (DisableScheduler) - errs() << "The option -polly-no-optimizer has no effect. 
" - "Polly was not enabled\n"; - - return; - } - - // Polly is only enabled at -O3 - if (Builder.OptLevel != 3) { - errs() << "Polly should only be run with -O3. Disabling Polly.\n"; - return; - } - +void registerPollyPasses(llvm::PassManagerBase &PM, bool DisableScheduler, + bool DisableCodegen) { bool RunScheduler = !DisableScheduler; bool RunCodegen = !DisableCodegen; - registerPollyPreoptPasses(Builder, PM); + registerPollyPreoptPasses(PM); if (PollyViewer) PM.add(polly::createDOTViewerPass()); @@ -213,6 +187,44 @@ PM.add(polly::createCodeGenerationPass()); } +static +void registerPollyEarlyAsPossiblePasses(const llvm::PassManagerBuilder &Builder, + llvm::PassManagerBase &PM) { + + if (Builder.OptLevel == 0) + return; + + if (PollyOnlyPrinter || PollyPrinter || PollyOnlyViewer || PollyViewer || + ExportJScop || ImportJScop) + PollyEnabled = true; + + if (!PollyEnabled) { + if (DisableCodegen) + errs() << "The option -polly-no-codegen has no effect. " + "Polly was not enabled\n"; + + if (DisableScheduler) + errs() << "The option -polly-no-optimizer has no effect. " + "Polly was not enabled\n"; + + return; + } + + // Polly is only enabled at -O3 + if (Builder.OptLevel != 3) { + errs() << "Polly should only be run with -O3. Disabling Polly.\n"; + return; + } + + registerPollyPasses(PM, DisableScheduler, DisableCodegen); +} + +static void registerPollyOptLevel0Passes(const llvm::PassManagerBuilder &, + llvm::PassManagerBase &PM) { + registerPollyPreoptPasses(PM); +} + + // Execute Polly together with a set of preparing passes. 
// // We run Polly that early to run before loop optimizer passes like LICM or @@ -221,7 +233,7 @@ static llvm::RegisterStandardPasses PassRegister(llvm::PassManagerBuilder::EP_EarlyAsPossible, - registerPollyPasses); + registerPollyEarlyAsPossiblePasses); static llvm::RegisterStandardPasses PassRegisterPreopt(llvm::PassManagerBuilder::EP_EnabledOnOptLevel0, - registerPollyPreoptPasses); + registerPollyOptLevel0Passes); From asl at math.spbu.ru Mon Jan 30 04:21:23 2012 From: asl at math.spbu.ru (Anton Korobeynikov) Date: Mon, 30 Jan 2012 10:21:23 -0000 Subject: [llvm-commits] [compiler-rt] r149241 - in /compiler-rt/trunk/lib/arm: aeabi_memcmp.S aeabi_memcpy.S aeabi_memmove.S aeabi_memset.S Message-ID: <20120130102123.6688D2A6C12C@llvm.org> Author: asl Date: Mon Jan 30 04:21:23 2012 New Revision: 149241 URL: http://llvm.org/viewvc/llvm-project?rev=149241&view=rev Log: Provide aeabi_mem* functions. Added: compiler-rt/trunk/lib/arm/aeabi_memcmp.S compiler-rt/trunk/lib/arm/aeabi_memcpy.S compiler-rt/trunk/lib/arm/aeabi_memmove.S compiler-rt/trunk/lib/arm/aeabi_memset.S Added: compiler-rt/trunk/lib/arm/aeabi_memcmp.S URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/arm/aeabi_memcmp.S?rev=149241&view=auto ============================================================================== --- compiler-rt/trunk/lib/arm/aeabi_memcmp.S (added) +++ compiler-rt/trunk/lib/arm/aeabi_memcmp.S Mon Jan 30 04:21:23 2012 @@ -0,0 +1,19 @@ +//===-- aeabi_memcmp.S - EABI memcmp implementation -----------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is dual licensed under the MIT and the University of Illinois Open +// Source Licenses. See LICENSE.TXT for details. 
+// +//===----------------------------------------------------------------------===// + +#include "../assembly.h" + +// void __aeabi_memcmp(void *dest, void *src, size_t n) { memcmp(dest, src, n); } + + .align 2 +DEFINE_COMPILERRT_FUNCTION(__aeabi_memcmp) + b memcmp + +DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_memcmp4, __aeabi_memcmp) +DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_memcmp8, __aeabi_memcmp) Added: compiler-rt/trunk/lib/arm/aeabi_memcpy.S URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/arm/aeabi_memcpy.S?rev=149241&view=auto ============================================================================== --- compiler-rt/trunk/lib/arm/aeabi_memcpy.S (added) +++ compiler-rt/trunk/lib/arm/aeabi_memcpy.S Mon Jan 30 04:21:23 2012 @@ -0,0 +1,19 @@ +//===-- aeabi_memcpy.S - EABI memcpy implementation -----------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is dual licensed under the MIT and the University of Illinois Open +// Source Licenses. See LICENSE.TXT for details. 
+// +//===----------------------------------------------------------------------===// + +#include "../assembly.h" + +// void __aeabi_memcpy(void *dest, void *src, size_t n) { memcpy(dest, src, n); } + + .align 2 +DEFINE_COMPILERRT_FUNCTION(__aeabi_memcpy) + b memcpy + +DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_memcpy4, __aeabi_memcpy) +DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_memcpy8, __aeabi_memcpy) Added: compiler-rt/trunk/lib/arm/aeabi_memmove.S URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/arm/aeabi_memmove.S?rev=149241&view=auto ============================================================================== --- compiler-rt/trunk/lib/arm/aeabi_memmove.S (added) +++ compiler-rt/trunk/lib/arm/aeabi_memmove.S Mon Jan 30 04:21:23 2012 @@ -0,0 +1,19 @@ +//===-- aeabi_memmove.S - EABI memmove implementation --------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is dual licensed under the MIT and the University of Illinois Open +// Source Licenses. See LICENSE.TXT for details. 
+// +//===---------------------------------------------------------------------===// + +#include "../assembly.h" + +// void __aeabi_memmove(void *dest, void *src, size_t n) { memmove(dest, src, n); } + + .align 2 +DEFINE_COMPILERRT_FUNCTION(__aeabi_memmove) + b memmove + +DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_memmove4, __aeabi_memmove) +DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_memmove8, __aeabi_memmove) Added: compiler-rt/trunk/lib/arm/aeabi_memset.S URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/arm/aeabi_memset.S?rev=149241&view=auto ============================================================================== --- compiler-rt/trunk/lib/arm/aeabi_memset.S (added) +++ compiler-rt/trunk/lib/arm/aeabi_memset.S Mon Jan 30 04:21:23 2012 @@ -0,0 +1,32 @@ +//===-- aeabi_memset.S - EABI memset implementation -----------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is dual licensed under the MIT and the University of Illinois Open +// Source Licenses. See LICENSE.TXT for details. 
+// +//===----------------------------------------------------------------------===// + +#include "../assembly.h" + +// void __aeabi_memset(void *dest, size_t n, int c) { memset(dest, c, n); } +// void __aeabi_memclr(void *dest, size_t n) { __aeabi_memset(dest, n, 0); } + + .align 2 +DEFINE_COMPILERRT_FUNCTION(__aeabi_memset) + mov r3, r1 + mov r1, r2 + mov r2, r3 + b memset + +DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_memset4, __aeabi_memset) +DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_memset8, __aeabi_memset) + +DEFINE_COMPILERRT_FUNCTION(__aeabi_memclr) + mov r2, r1 + mov r1, #0 + b memset + +DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_memclr4, __aeabi_memclr) +DEFINE_AEABI_FUNCTION_ALIAS(__aeabi_memclr8, __aeabi_memclr) + From asl at math.spbu.ru Mon Jan 30 04:21:51 2012 From: asl at math.spbu.ru (Anton Korobeynikov) Date: Mon, 30 Jan 2012 10:21:51 -0000 Subject: [llvm-commits] [compiler-rt] r149242 - in /compiler-rt/trunk/lib: arm/aeabi_idivmod.S arm/aeabi_ldivmod.S arm/aeabi_uidivmod.S arm/aeabi_uldivmod.S divmoddi4.c udivmoddi4.c Message-ID: <20120130102151.DC14D2A6C12C@llvm.org> Author: asl Date: Mon Jan 30 04:21:51 2012 New Revision: 149242 URL: http://llvm.org/viewvc/llvm-project?rev=149242&view=rev Log: Proper divmod implementation Added: compiler-rt/trunk/lib/arm/aeabi_idivmod.S compiler-rt/trunk/lib/arm/aeabi_ldivmod.S compiler-rt/trunk/lib/arm/aeabi_uidivmod.S compiler-rt/trunk/lib/arm/aeabi_uldivmod.S Modified: compiler-rt/trunk/lib/divmoddi4.c compiler-rt/trunk/lib/udivmoddi4.c Added: compiler-rt/trunk/lib/arm/aeabi_idivmod.S URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/arm/aeabi_idivmod.S?rev=149242&view=auto ============================================================================== --- compiler-rt/trunk/lib/arm/aeabi_idivmod.S (added) +++ compiler-rt/trunk/lib/arm/aeabi_idivmod.S Mon Jan 30 04:21:51 2012 @@ -0,0 +1,27 @@ +//===-- aeabi_idivmod.S - EABI idivmod implementation ---------------------===// +// +// The LLVM Compiler Infrastructure +// 
+// This file is dual licensed under the MIT and the University of Illinois Open +// Source Licenses. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// + +#include "../assembly.h" + +// struct { int quot, int rem} __aeabi_idivmod(int numerator, int denominator) { +// int rem, quot; +// quot = __divmodsi4(numerator, denominator, &rem); +// return {quot, rem}; +// } + + .syntax unified + .align 2 +DEFINE_COMPILERRT_FUNCTION(__aeabi_idivmod) + push { lr } + sub sp, sp, #4 + mov r2, sp + bl SYMBOL_NAME(__divmodsi4) + ldr r1, [sp] + add sp, sp, #4 + pop { pc } Added: compiler-rt/trunk/lib/arm/aeabi_ldivmod.S URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/arm/aeabi_ldivmod.S?rev=149242&view=auto ============================================================================== --- compiler-rt/trunk/lib/arm/aeabi_ldivmod.S (added) +++ compiler-rt/trunk/lib/arm/aeabi_ldivmod.S Mon Jan 30 04:21:51 2012 @@ -0,0 +1,30 @@ +//===-- aeabi_ldivmod.S - EABI ldivmod implementation ---------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is dual licensed under the MIT and the University of Illinois Open +// Source Licenses. See LICENSE.TXT for details. 
+// +//===----------------------------------------------------------------------===// + +#include "../assembly.h" + +// struct { int64_t quot, int64_t rem} +// __aeabi_ldivmod(int64_t numerator, int64_t denominator) { +// int64_t rem, quot; +// quot = __divmoddi4(numerator, denominator, &rem); +// return {quot, rem}; +// } + + .syntax unified + .align 2 +DEFINE_COMPILERRT_FUNCTION(__aeabi_ldivmod) + push {r11, lr} + sub sp, sp, #16 + add r12, sp, #8 + str r12, [sp] + bl SYMBOL_NAME(__divmoddi4) + ldr r2, [sp, #8] + ldr r3, [sp, #12] + add sp, sp, #16 + pop {r11, pc} Added: compiler-rt/trunk/lib/arm/aeabi_uidivmod.S URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/arm/aeabi_uidivmod.S?rev=149242&view=auto ============================================================================== --- compiler-rt/trunk/lib/arm/aeabi_uidivmod.S (added) +++ compiler-rt/trunk/lib/arm/aeabi_uidivmod.S Mon Jan 30 04:21:51 2012 @@ -0,0 +1,28 @@ +//===-- aeabi_uidivmod.S - EABI uidivmod implementation -------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is dual licensed under the MIT and the University of Illinois Open +// Source Licenses. See LICENSE.TXT for details. 
+// +//===----------------------------------------------------------------------===// + +#include "../assembly.h" + +// struct { unsigned quot, unsigned rem} +// __aeabi_uidivmod(unsigned numerator, unsigned denominator) { +// unsigned rem, quot; +// quot = __udivmodsi4(numerator, denominator, &rem); +// return {quot, rem}; +// } + + .syntax unified + .align 2 +DEFINE_COMPILERRT_FUNCTION(__aeabi_uidivmod) + push { lr } + sub sp, sp, #4 + mov r2, sp + bl SYMBOL_NAME(__udivmodsi4) + ldr r1, [sp] + add sp, sp, #4 + pop { pc } Added: compiler-rt/trunk/lib/arm/aeabi_uldivmod.S URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/arm/aeabi_uldivmod.S?rev=149242&view=auto ============================================================================== --- compiler-rt/trunk/lib/arm/aeabi_uldivmod.S (added) +++ compiler-rt/trunk/lib/arm/aeabi_uldivmod.S Mon Jan 30 04:21:51 2012 @@ -0,0 +1,30 @@ +//===-- aeabi_uldivmod.S - EABI uldivmod implementation -------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is dual licensed under the MIT and the University of Illinois Open +// Source Licenses. See LICENSE.TXT for details. 
+// +//===----------------------------------------------------------------------===// + +#include "../assembly.h" + +// struct { uint64_t quot, uint64_t rem} +// __aeabi_uldivmod(uint64_t numerator, uint64_t denominator) { +// uint64_t rem, quot; +// quot = __udivmoddi4(numerator, denominator, &rem); +// return {quot, rem}; +// } + + .syntax unified + .align 2 +DEFINE_COMPILERRT_FUNCTION(__aeabi_uldivmod) + push {r11, lr} + sub sp, sp, #16 + add r12, sp, #8 + str r12, [sp] + bl SYMBOL_NAME(__udivmoddi4) + ldr r2, [sp, #8] + ldr r3, [sp, #12] + add sp, sp, #16 + pop {r11, pc} \ No newline at end of file Modified: compiler-rt/trunk/lib/divmoddi4.c URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/divmoddi4.c?rev=149242&r1=149241&r2=149242&view=diff ============================================================================== --- compiler-rt/trunk/lib/divmoddi4.c (original) +++ compiler-rt/trunk/lib/divmoddi4.c Mon Jan 30 04:21:51 2012 @@ -16,8 +16,6 @@ extern COMPILER_RT_ABI di_int __divdi3(di_int a, di_int b); -ARM_EABI_FNALIAS(ldivmod, divmoddi4); - /* Returns: a / b, *rem = a % b */ COMPILER_RT_ABI di_int Modified: compiler-rt/trunk/lib/udivmoddi4.c URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/udivmoddi4.c?rev=149242&r1=149241&r2=149242&view=diff ============================================================================== --- compiler-rt/trunk/lib/udivmoddi4.c (original) +++ compiler-rt/trunk/lib/udivmoddi4.c Mon Jan 30 04:21:51 2012 @@ -20,8 +20,6 @@ /* Translated from Figure 3-40 of The PowerPC Compiler Writer's Guide */ -ARM_EABI_FNALIAS(uldivmod, udivmoddi4); - COMPILER_RT_ABI du_int __udivmoddi4(du_int a, du_int b, du_int* rem) { From glider at google.com Mon Jan 30 04:40:22 2012 From: glider at google.com (Alexander Potapenko) Date: Mon, 30 Jan 2012 10:40:22 -0000 Subject: [llvm-commits] [llvm] r149243 - /llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Message-ID: <20120130104022.A06F72A6C12C@llvm.org> Author: 
glider Date: Mon Jan 30 04:40:22 2012 New Revision: 149243 URL: http://llvm.org/viewvc/llvm-project?rev=149243&view=rev Log: Fix compilation of ASan tests on OS X Lion (see http://code.google.com/p/address-sanitizer/issues/detail?id=32) The redzones emitted by AddressSanitizer for CFString instances confuse the linker and are of little use, so we shouldn't add them. Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp?rev=149243&r1=149242&r2=149243&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Mon Jan 30 04:40:22 2012 @@ -422,16 +422,28 @@ continue; } - // Ignore the globals from the __OBJC section. The ObjC runtime assumes - // those conform to /usr/lib/objc/runtime.h, so we can't add redzones to - // them. if (G->hasSection()) { StringRef Section(G->getSection()); + // Ignore the globals from the __OBJC section. The ObjC runtime assumes + // those conform to /usr/lib/objc/runtime.h, so we can't add redzones to + // them. if ((Section.find("__OBJC,") == 0) || (Section.find("__DATA, __objc_") == 0)) { DEBUG(dbgs() << "Ignoring ObjC runtime global: " << *G); continue; } + // See http://code.google.com/p/address-sanitizer/issues/detail?id=32 + // Constant CFString instances are compiled in the following way: + // -- the string buffer is emitted into + // __TEXT,__cstring,cstring_literals + // -- the constant NSConstantString structure referencing that buffer + // is placed into __DATA,__cfstring + // Therefore there's no point in placing redzones into __DATA,__cfstring. 
+ // Moreover, it causes the linker to crash on OS X 10.7 + if (Section.find("__DATA,__cfstring") == 0) { + DEBUG(dbgs() << "Ignoring CFString: " << *G); + continue; + } } GlobalsToChange.push_back(G); From chandlerc at gmail.com Mon Jan 30 05:29:42 2012 From: chandlerc at gmail.com (Chandler Carruth) Date: Mon, 30 Jan 2012 03:29:42 -0800 Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple Message-ID: This predicate doesn't make much sense inside of LLVM currently because all of the backends track 64-bitness separately for some (likely historical) reason. However, Clang could benefit greatly from this predicate, it would remove one of the most commonly repeated queries about a triple. It also seems likely that targets could move toward relying more heavily on the triple to deduce these things, or at least start asserting that the explicit 64-bitness of the subtarget objects matches the triple. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/28452436/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: triple-predicates2.patch Type: text/x-patch Size: 1581 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/28452436/attachment.bin From anton at korobeynikov.info Mon Jan 30 05:51:26 2012 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Mon, 30 Jan 2012 15:51:26 +0400 Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple In-Reply-To: References: Message-ID: > This predicate doesn't make much sense inside of LLVM currently because all > of the backends track 64-bitness separately for some (likely historical) > reason. However, Clang could benefit greatly from this predicate, it would > remove one of the most commonly repeated queries about a triple. 
It also > seems likely that targets could move toward relying more heavily on the > triple to deduce these things, or at least start asserting that the explicit > 64-bitness of the subtarget objects matches the triple. This looks ok to me. One random thought - msp430 is 16 bit target. And I have some plans to do something in msp430 world again. Does this make sense to query for "bitness" of the platform instead? So, we can return e.g. whether the platform is 16 or 32 or 64 bits? -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From james.molloy at arm.com Mon Jan 30 05:53:18 2012 From: james.molloy at arm.com (James Molloy) Date: Mon, 30 Jan 2012 11:53:18 -0000 Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple In-Reply-To: References: Message-ID: <000001ccdf45$c27b74f0$47725ed0$@molloy@arm.com> Hi Chandler, The predicate looks good, however I'd suggest adding a few more for orthogonality while you're there... * isArch32Bit() - this is not necessarily !isArch64Bit() (although probably currently is for the architectures we currently support). * get[32,64]BitArch() - take the current arch, and return the [32,64] bit version of it. This could be used to massively simplify -m32/-m64 in Clang. What do you think? James From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Chandler Carruth Sent: 30 January 2012 11:30 To: Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple This predicate doesn't make much sense inside of LLVM currently because all of the backends track 64-bitness separately for some (likely historical) reason. However, Clang could benefit greatly from this predicate, it would remove one of the most commonly repeated queries about a triple.
It also seems likely that targets could move toward relying more heavily on the triple to deduce these things, or at least start asserting that the explicit 64-bitness of the subtarget objects matches the triple. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/0ad23e27/attachment.html From chandlerc at gmail.com Mon Jan 30 05:56:27 2012 From: chandlerc at gmail.com (Chandler Carruth) Date: Mon, 30 Jan 2012 03:56:27 -0800 Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple In-Reply-To: References: Message-ID: On Mon, Jan 30, 2012 at 3:51 AM, Anton Korobeynikov wrote: > > This predicate doesn't make much sense inside of LLVM currently because > all > > of the backends track 64-bitness separately for some (likely historical) > > reason. However, Clang could benefit greatly from this predicate, it > would > > remove one of the most commonly repeated queries about a triple. It also > > seems likely that targets could move toward relying more heavily on the > > triple to deduce these things, or at least start asserting that the > explicit > > 64-bitness of the subtarget objects matches the triple. > This looks ok to me. One random thought - msp430 is 16 bit target. And > I have some plans to do something in msp430 world again. > Does this make sense to query for "bitness" of the platform instead? > So, we can return e.g. whether the platform is 16 or 32 or 64 bits? I'd be down with having is32Bit and is16Bit.... but I hesitate to go too far in this direction. At a certain point, you want to query the TargetData that we have properly built up. This isn't the right layer of abstraction for that. This should be extremely coarse grained, requiring the query to have utility both to code generation and to potential frontends, etc. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/174880e0/attachment.html From chandlerc at gmail.com Mon Jan 30 05:57:14 2012 From: chandlerc at gmail.com (Chandler Carruth) Date: Mon, 30 Jan 2012 03:57:14 -0800 Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple In-Reply-To: <4f26853a.06a1ec0a.5cf8.ffff889dSMTPIN_ADDED@mx.google.com> References: <4f26853a.06a1ec0a.5cf8.ffff889dSMTPIN_ADDED@mx.google.com> Message-ID: On Mon, Jan 30, 2012 at 3:53 AM, James Molloy wrote: > Hi Chandler, > > The predicate looks good, however I'd suggest adding a few more for > orthogonality while you're there... > > * isArch32Bit() - this is not necessarily !isArch64Bit() > (although probably currently is for the architectures we currently support). > Yea, eventually we should have this. As Anton mentioned, we have a 16-bit arch. > > * get[32,64]BitArch() - take the current arch, and return the > [32,64] bit version of it. This could be used to massively simplify > -m32/-m64 in Clang. > This was going to be my next patch. =p -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/f8c5dd21/attachment.html From glider at google.com Mon Jan 30 06:49:00 2012 From: glider at google.com (Alexander Potapenko) Date: Mon, 30 Jan 2012 12:49:00 -0000 Subject: [llvm-commits] [compiler-rt] r149245 - /compiler-rt/trunk/lib/asan/mach_override/mach_override.c Message-ID: <20120130124900.AA7FD2A6C12C@llvm.org> Author: glider Date: Mon Jan 30 06:49:00 2012 New Revision: 149245 URL: http://llvm.org/viewvc/llvm-project?rev=149245&view=rev Log: Add a mask for "cmpb $imm, (%rdi)" on x86_64 Modified: compiler-rt/trunk/lib/asan/mach_override/mach_override.c Modified: compiler-rt/trunk/lib/asan/mach_override/mach_override.c URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/mach_override/mach_override.c?rev=149245&r1=149244&r2=149245&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/mach_override/mach_override.c (original) +++ compiler-rt/trunk/lib/asan/mach_override/mach_override.c Mon Jan 30 06:49:00 2012 @@ -695,6 +695,7 @@ { 0x3, {0xFF, 0xFF, 0x00}, {0xFF, 0x77, 0x00} }, // pushq $imm(%rdi) { 0x2, {0xFF, 0xFF}, {0x31, 0xC0} }, // xor %eax, %eax { 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0x25, 0x00, 0x00, 0x00, 0x00} }, // and $imm, %eax + { 0x3, {0xFF, 0xFF, 0xFF}, {0x80, 0x3F, 0x00} }, // cmpb $imm, (%rdi) { 0x8, {0xFF, 0xFF, 0xCF, 0xFF, 0x00, 0x00, 0x00, 0x00}, {0x48, 0x8B, 0x04, 0x25, 0x00, 0x00, 0x00, 0x00}, }, // mov $imm, %{rax,rdx,rsp,rsi} From tobias at grosser.es Mon Jan 30 07:05:22 2012 From: tobias at grosser.es (Tobias Grosser) Date: Mon, 30 Jan 2012 14:05:22 +0100 Subject: [llvm-commits] [LLVMdev] [PATCH] BasicBlock Autovectorization Pass In-Reply-To: <1327764041.2489.820.camel@sapling> References: <1320762963.19359.117.camel@sapling> <4EB98207.2070807@grosser.es> <1320791390.19359.262.camel@sapling> <4EBC4B0F.6010609@grosser.es> <1321050998.19359.539.camel@sapling> 
<4EBDA7F9.9080709@grosser.es> <1321053083.19359.550.camel@sapling> <4EBDB1BF.7090006@grosser.es> <1321400339.19359.782.camel@sapling> <1321486739.19359.1067.camel@sapling> <4EC504B5.2020408@grosser.es> <1321898108.2507.36.camel@sapling> <1321932161.2507.101.camel@sapling> <1322067157.2507.263.camel@sapling> <4ED8F7B0.8050309@grosser.es> <1323822351.590.1687.camel@sapling> <4EFC7291.9040808@grosser.es> <1325179929.13080.2839.camel@sapling> <4EFD7FD3.8040800@grosser.es> <1327378420.32397.1603.camel@sapling > <4F1ED40C.80306@grosser.es> <1327421849.11266.69.camel@sapling> <1327438907.11266.134.camel@s apling> <4F1FD5A4.9040301@grosser.es> <1327764041.2489.820.camel@s apling> Message-ID: <4F269592.2030008@grosser.es> On 01/28/2012 04:20 PM, Hal Finkel wrote: > On Wed, 2012-01-25 at 11:12 +0100, Tobias Grosser wrote: >> On 01/24/2012 10:01 PM, Hal Finkel wrote: > Good, I prefer it this way (updated patch attached). I'll commit this > today or tomorrow. Has anyone checked the CMake build with this patch > recently? I also got some new warnings, when compiling with a very new clang. 
/home/grosser/Projekte/polly/git/lib/Transforms/Vectorize/BBVectorize.cpp:1552:31: warning: unused variable 'E' [-Wunused-variable] for (BasicBlock::iterator E = BB.end(); ^ /home/grosser/Projekte/polly/git/lib/Transforms/Vectorize/BBVectorize.cpp:1575:31: warning: unused variable 'E' [-Wunused-variable] for (BasicBlock::iterator E = BB.end(); Cheers Tobi From hfinkel at anl.gov Mon Jan 30 07:29:23 2012 From: hfinkel at anl.gov (Hal Finkel) Date: Mon, 30 Jan 2012 07:29:23 -0600 Subject: [llvm-commits] [LLVMdev] [PATCH] BasicBlock Autovectorization Pass In-Reply-To: <4F269592.2030008@grosser.es> References: <1320762963.19359.117.camel@sapling> <4EB98207.2070807@grosser.es> <1320791390.19359.262.camel@sapling> <4EBC4B0F.6010609@grosser.es> <1321050998.19359.539.camel@sapling> <4EBDA7F9.9080709@grosser.es> <1321053083.19359.550.camel@sapling> <4EBDB1BF.7090006@grosser.es> <1321400339.19359.782.camel@sapling> <1321486739.19359.1067.camel@sapling> <4EC504B5.2020408@grosser.es> <1321898108.2507.36.camel@sapling> <1321932161.2507.101.camel@sapling> <1322067157.2507.263.camel@sapling> <4ED8F7B0.8050309@grosser.es> <1323822351.590.1687.camel@sapling> <4EFC7291.9040808@grosser.es> <1325179929.13080.2839.camel@sapling> <4EFD7FD3.8040800@grosser.es> <1327378420.32397.1603.camel@sapling > <4F1ED40C.80306@grosser.es> <1327421849.11266.69.camel@sapling> <1327438907.11266.134.camel@s apling> <4F1FD5A4.9040301@grosser.es> <1327764041.2489.820.camel@s apling> <4F269592.2030008@grosser.es> Message-ID: <1327930164.2489.889.camel@sapling> On Mon, 2012-01-30 at 14:05 +0100, Tobias Grosser wrote: > On 01/28/2012 04:20 PM, Hal Finkel wrote: > > On Wed, 2012-01-25 at 11:12 +0100, Tobias Grosser wrote: > >> On 01/24/2012 10:01 PM, Hal Finkel wrote: > > Good, I prefer it this way (updated patch attached). I'll commit this > > today or tomorrow. Has anyone checked the CMake build with this patch > > recently? > > I also got some new warnings, when compiling with a very new clang. Will fix. 
Thanks! -Hal > > /home/grosser/Projekte/polly/git/lib/Transforms/Vectorize/BBVectorize.cpp:1552:31: > warning: unused variable 'E' > [-Wunused-variable] > for (BasicBlock::iterator E = BB.end(); > ^ > /home/grosser/Projekte/polly/git/lib/Transforms/Vectorize/BBVectorize.cpp:1575:31: > warning: unused variable 'E' > [-Wunused-variable] > for (BasicBlock::iterator E = BB.end(); > > Cheers > > Tobi -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory From baldrick at free.fr Mon Jan 30 07:32:45 2012 From: baldrick at free.fr (Duncan Sands) Date: Mon, 30 Jan 2012 13:32:45 -0000 Subject: [llvm-commits] [dragonegg] r149246 - /dragonegg/trunk/include/x86/dragonegg/Target.h Message-ID: <20120130133245.7CFFC2A6C12C@llvm.org> Author: baldrick Date: Mon Jan 30 07:32:45 2012 New Revision: 149246 URL: http://llvm.org/viewvc/llvm-project?rev=149246&view=rev Log: Remove default case: all values are covered. Modified: dragonegg/trunk/include/x86/dragonegg/Target.h Modified: dragonegg/trunk/include/x86/dragonegg/Target.h URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/include/x86/dragonegg/Target.h?rev=149246&r1=149245&r2=149246&view=diff ============================================================================== --- dragonegg/trunk/include/x86/dragonegg/Target.h (original) +++ dragonegg/trunk/include/x86/dragonegg/Target.h Mon Jan 30 07:32:45 2012 @@ -393,10 +393,6 @@ /* Propagate code model setting to backend */ #define LLVM_SET_CODE_MODEL(CMModel) \ switch (ix86_cmodel) { \ - default: \ - sorry ("code model %<%s%> not supported yet", \ - ix86_cmodel_string); \ - break; \ case CM_32: \ CMModel = CodeModel::Default; \ break; \ From samsonov at google.com Mon Jan 30 07:42:44 2012 From: samsonov at google.com (Alexey Samsonov) Date: Mon, 30 Jan 2012 13:42:44 -0000 Subject: [llvm-commits] [compiler-rt] r149247 - in /compiler-rt/trunk/lib/asan: asan_interceptors.cc asan_interceptors.h asan_malloc_linux.cc Message-ID: 
<20120130134244.8304C2A6C12C@llvm.org> Author: samsonov Date: Mon Jan 30 07:42:44 2012 New Revision: 149247 URL: http://llvm.org/viewvc/llvm-project?rev=149247&view=rev Log: AddressSanitizer: Enforce default visibility for all libc interceptors Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc compiler-rt/trunk/lib/asan/asan_interceptors.h compiler-rt/trunk/lib/asan/asan_malloc_linux.cc Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.cc?rev=149247&r1=149246&r2=149247&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interceptors.cc (original) +++ compiler-rt/trunk/lib/asan/asan_interceptors.cc Mon Jan 30 07:42:44 2012 @@ -45,6 +45,7 @@ // in __asan::real_f(). #if defined(__APPLE__) // Include the declarations of the original functions. +#include #include #include @@ -302,9 +303,7 @@ #ifndef _WIN32 extern "C" -#ifndef __APPLE__ -__attribute__((visibility("default"))) -#endif +INTERCEPTOR_ATTRIBUTE int WRAP(pthread_create)(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg) { GET_STACK_TRACE_HERE(kStackTraceMax); @@ -315,6 +314,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE void *WRAP(signal)(int signum, void *handler) { if (!AsanInterceptsSignal(signum)) { return real_signal(signum, handler); @@ -323,10 +323,9 @@ } extern "C" -int (sigaction)(int signum, const void *act, void *oldact); - -extern "C" -int WRAP(sigaction)(int signum, const void *act, void *oldact) { +INTERCEPTOR_ATTRIBUTE +int WRAP(sigaction)(int signum, const struct sigaction *act, + struct sigaction *oldact) { if (!AsanInterceptsSignal(signum)) { return real_sigaction(signum, act, oldact); } @@ -344,25 +343,35 @@ PoisonShadow(bottom, top - bottom, 0); } -extern "C" void WRAP(longjmp)(void *env, int val) { +extern "C" +INTERCEPTOR_ATTRIBUTE +void WRAP(longjmp)(void *env, int val) { 
UnpoisonStackFromHereToTop(); real_longjmp(env, val); } -extern "C" void WRAP(_longjmp)(void *env, int val) { +extern "C" +INTERCEPTOR_ATTRIBUTE +void WRAP(_longjmp)(void *env, int val) { UnpoisonStackFromHereToTop(); real__longjmp(env, val); } -extern "C" void WRAP(siglongjmp)(void *env, int val) { +extern "C" +INTERCEPTOR_ATTRIBUTE +void WRAP(siglongjmp)(void *env, int val) { UnpoisonStackFromHereToTop(); real_siglongjmp(env, val); } +#if ASAN_HAS_EXCEPTIONS == 1 +#ifdef __APPLE__ extern "C" void __cxa_throw(void *a, void *b, void *c); +#endif // __APPLE__ -#if ASAN_HAS_EXCEPTIONS == 1 -extern "C" void WRAP(__cxa_throw)(void *a, void *b, void *c) { +extern "C" +INTERCEPTOR_ATTRIBUTE +void WRAP(__cxa_throw)(void *a, void *b, void *c) { CHECK(&real___cxa_throw); UnpoisonStackFromHereToTop(); real___cxa_throw(a, b, c); @@ -379,18 +388,26 @@ printed = true; Printf("INFO: AddressSanitizer ignores mlock/mlockall/munlock/munlockall\n"); } + +INTERCEPTOR_ATTRIBUTE int mlock(const void *addr, size_t len) { MlockIsUnsupported(); return 0; } + +INTERCEPTOR_ATTRIBUTE int munlock(const void *addr, size_t len) { MlockIsUnsupported(); return 0; } + +INTERCEPTOR_ATTRIBUTE int mlockall(int flags) { MlockIsUnsupported(); return 0; } + +INTERCEPTOR_ATTRIBUTE int munlockall(void) { MlockIsUnsupported(); return 0; @@ -410,6 +427,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE int WRAP(memcmp)(const void *a1, const void *a2, size_t size) { ENSURE_ASAN_INITED(); unsigned char c1 = 0, c2 = 0; @@ -427,6 +445,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE void *WRAP(memcpy)(void *to, const void *from, size_t size) { // memcpy is called during __asan_init() from the internals // of printf(...). 
@@ -447,6 +466,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE void *WRAP(memmove)(void *to, const void *from, size_t size) { ENSURE_ASAN_INITED(); if (FLAG_replace_intrin) { @@ -457,6 +477,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE void *WRAP(memset)(void *block, int c, size_t size) { // memset is called inside INTERCEPT_FUNCTION on Mac. if (asan_init_is_running) { @@ -471,11 +492,13 @@ #ifndef __APPLE__ extern "C" +INTERCEPTOR_ATTRIBUTE char *WRAP(index)(const char *str, int c) __attribute__((alias(WRAPPER_NAME(strchr)))); #endif extern "C" +INTERCEPTOR_ATTRIBUTE char *WRAP(strchr)(const char *str, int c) { ENSURE_ASAN_INITED(); char *result = real_strchr(str, c); @@ -487,6 +510,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE int WRAP(strcasecmp)(const char *s1, const char *s2) { ENSURE_ASAN_INITED(); unsigned char c1, c2; @@ -502,6 +526,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE char *WRAP(strcat)(char *to, const char *from) { // NOLINT ENSURE_ASAN_INITED(); if (FLAG_replace_str) { @@ -518,6 +543,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE int WRAP(strcmp)(const char *s1, const char *s2) { if (!asan_inited) { return internal_strcmp(s1, s2); @@ -535,6 +561,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE char *WRAP(strcpy)(char *to, const char *from) { // NOLINT // strcpy is called from malloc_default_purgeable_zone() // in __asan::ReplaceSystemAlloc() on Mac. @@ -552,6 +579,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE char *WRAP(strdup)(const char *s) { ENSURE_ASAN_INITED(); if (FLAG_replace_str) { @@ -562,6 +590,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE size_t WRAP(strlen)(const char *s) { // strlen is called from malloc_default_purgeable_zone() // in __asan::ReplaceSystemAlloc() on Mac. 
@@ -577,6 +606,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE int WRAP(strncasecmp)(const char *s1, const char *s2, size_t size) { ENSURE_ASAN_INITED(); unsigned char c1 = 0, c2 = 0; @@ -592,6 +622,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE int WRAP(strncmp)(const char *s1, const char *s2, size_t size) { // strncmp is called from malloc_default_purgeable_zone() // in __asan::ReplaceSystemAlloc() on Mac. @@ -611,6 +642,7 @@ } extern "C" +INTERCEPTOR_ATTRIBUTE char *WRAP(strncpy)(char *to, const char *from, size_t size) { ENSURE_ASAN_INITED(); if (FLAG_replace_str) { @@ -624,6 +656,7 @@ #ifndef __APPLE__ extern "C" +INTERCEPTOR_ATTRIBUTE size_t WRAP(strnlen)(const char *s, size_t maxlen) { ENSURE_ASAN_INITED(); size_t length = real_strnlen(s, maxlen); Modified: compiler-rt/trunk/lib/asan/asan_interceptors.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.h?rev=149247&r1=149246&r2=149247&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interceptors.h (original) +++ compiler-rt/trunk/lib/asan/asan_interceptors.h Mon Jan 30 07:42:44 2012 @@ -18,13 +18,18 @@ #if defined(__APPLE__) # define WRAP(x) wrap_##x +# define INTERCEPTOR_ATTRIBUTE #elif defined(_WIN32) // TODO(timurrrr): we're likely to use something else later on Windows. 
# define WRAP(x) wrap_##x +# define INTERCEPTOR_ATTRIBUTE #else # define WRAP(x) x +# define INTERCEPTOR_ATTRIBUTE __attribute__((visibility("default"))) #endif +struct sigaction; + namespace __asan { typedef void* (*index_f)(const char *string, int c); @@ -44,7 +49,8 @@ typedef char* (*strncpy_f)(char *to, const char *from, size_t size); typedef size_t (*strnlen_f)(const char *s, size_t maxlen); typedef void *(*signal_f)(int signum, void *handler); -typedef int (*sigaction_f)(int signum, const void *act, void *oldact); +typedef int (*sigaction_f)(int signum, const struct sigaction *act, + struct sigaction *oldact); // __asan::real_X() holds pointer to library implementation of X(). extern index_f real_index; Modified: compiler-rt/trunk/lib/asan/asan_malloc_linux.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_malloc_linux.cc?rev=149247&r1=149246&r2=149247&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_malloc_linux.cc (original) +++ compiler-rt/trunk/lib/asan/asan_malloc_linux.cc Mon Jan 30 07:42:44 2012 @@ -22,8 +22,6 @@ #include -#define INTERCEPTOR_ATTRIBUTE __attribute__((visibility("default"))) - #ifdef ANDROID struct MallocDebug { void* (*malloc)(size_t bytes); From elena.demikhovsky at intel.com Mon Jan 30 08:11:06 2012 From: elena.demikhovsky at intel.com (Demikhovsky, Elena) Date: Mon, 30 Jan 2012 14:11:06 +0000 Subject: [llvm-commits] Optimization for TRUNCATE on AVX - please review Message-ID: Truncating from v4i64 to v4i32 and v8i32 to v8i16 may be done with set of shuffles on AVX. - Elena --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. 
-------------- next part -------------- A non-text attachment was scrubbed... Name: trunk.diff Type: application/octet-stream Size: 6179 bytes Desc: trunk.diff Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/d70df87a/attachment.obj From tobias at grosser.es Mon Jan 30 08:29:28 2012 From: tobias at grosser.es (Tobias Grosser) Date: Mon, 30 Jan 2012 15:29:28 +0100 Subject: [llvm-commits] Optimization for TRUNCATE on AVX - please review In-Reply-To: References: Message-ID: <4F26A948.7080903@grosser.es> On 01/30/2012 03:11 PM, Demikhovsky, Elena wrote: > Truncating from v4i64 to v4i32 and v8i32 to v8i16 may be done with set of shuffles on AVX. Hi Elena, no technical review, but some comments: > + if ((VT == MVT::v4i32)&& (OpVT == MVT::v4i64)) > + { We normally put the '{' in the same line than the 'if'. > Index: lib/Target/X86/X86ISelLowering.h > =================================================================== > --- lib/Target/X86/X86ISelLowering.h (revision 149245) > +++ lib/Target/X86/X86ISelLowering.h (working copy) > @@ -839,6 +839,8 @@ > SDValue LowerMEMBARRIER(SDValue Op, SelectionDAG&DAG) const; > SDValue LowerATOMIC_FENCE(SDValue Op, SelectionDAG&DAG) const; > SDValue LowerSIGN_EXTEND_INREG(SDValue Op, SelectionDAG&DAG) const; > + SDValue PerformBrcondCombine(SDNode* N, SelectionDAG&DAG, DAGCombinerInfo&DCI) const; Do you need to declare this function? Cheers Tobi From benny.kra at googlemail.com Mon Jan 30 09:16:22 2012 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Mon, 30 Jan 2012 15:16:22 -0000 Subject: [llvm-commits] [llvm] r149248 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <20120130151622.2E6222A6C12C@llvm.org> Author: d0k Date: Mon Jan 30 09:16:21 2012 New Revision: 149248 URL: http://llvm.org/viewvc/llvm-project?rev=149248&view=rev Log: X86: Simplify shuffle mask generation code. 
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149248&r1=149247&r2=149248&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Jan 30 09:16:21 2012 @@ -52,6 +52,7 @@ #include "llvm/Support/MathExtras.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Target/TargetOptions.h" +#include using namespace llvm; using namespace dwarf; @@ -5259,8 +5260,7 @@ } // If element VT is == 32 bits, turn it into a number of shuffles. - SmallVector V; - V.resize(NumElems); + SmallVector V(NumElems); if (NumElems == 4 && NumZero > 0) { for (unsigned i = 0; i < 4; ++i) { bool isZero = !(NonZeros & (1 << i)); @@ -5289,13 +5289,14 @@ } } - SmallVector MaskVec; - bool Reverse = (NonZeros & 0x3) == 2; - for (unsigned i = 0; i < 2; ++i) - MaskVec.push_back(Reverse ? 1-i : i); - Reverse = ((NonZeros & (0x3 << 2)) >> 2) == 2; - for (unsigned i = 0; i < 2; ++i) - MaskVec.push_back(Reverse ? 1-i+NumElems : i+NumElems); + bool Reverse1 = (NonZeros & 0x3) == 2; + bool Reverse2 = ((NonZeros & (0x3 << 2)) >> 2) == 2; + int MaskVec[] = { + Reverse1 ? 1 : 0, + Reverse1 ? 0 : 1, + Reverse2 ? 1-NumElems : NumElems, + Reverse2 ? NumElems : 1+NumElems + }; return DAG.getVectorShuffle(VT, dl, V[0], V[1], &MaskVec[0]); } @@ -5497,9 +5498,10 @@ // words from all 4 input quadwords. SDValue NewV; if (BestLoQuad >= 0 || BestHiQuad >= 0) { - SmallVector MaskV; - MaskV.push_back(BestLoQuad < 0 ? 0 : BestLoQuad); - MaskV.push_back(BestHiQuad < 0 ? 1 : BestHiQuad); + int MaskV[] = { + BestLoQuad < 0 ? 0 : BestLoQuad, + BestHiQuad < 0 ? 
1 : BestHiQuad + }; NewV = DAG.getVectorShuffle(MVT::v2i64, dl, DAG.getNode(ISD::BITCAST, dl, MVT::v2i64, V1), DAG.getNode(ISD::BITCAST, dl, MVT::v2i64, V2), &MaskV[0]); @@ -5602,23 +5604,18 @@ // If BestLoQuad >= 0, generate a pshuflw to put the low elements in order, // and update MaskVals with new element order. - BitVector InOrder(8); + std::bitset<8> InOrder; if (BestLoQuad >= 0) { - SmallVector MaskV; + int MaskV[] = { -1, -1, -1, -1, 4, 5, 6, 7 }; for (int i = 0; i != 4; ++i) { int idx = MaskVals[i]; if (idx < 0) { - MaskV.push_back(-1); InOrder.set(i); } else if ((idx / 4) == BestLoQuad) { - MaskV.push_back(idx & 3); + MaskV[i] = idx & 3; InOrder.set(i); - } else { - MaskV.push_back(-1); } } - for (unsigned i = 4; i != 8; ++i) - MaskV.push_back(i); NewV = DAG.getVectorShuffle(MVT::v8i16, dl, NewV, DAG.getUNDEF(MVT::v8i16), &MaskV[0]); @@ -5632,19 +5629,14 @@ // If BestHi >= 0, generate a pshufhw to put the high elements in order, // and update MaskVals with the new element order. if (BestHiQuad >= 0) { - SmallVector MaskV; - for (unsigned i = 0; i != 4; ++i) - MaskV.push_back(i); + int MaskV[] = { 0, 1, 2, 3, -1, -1, -1, -1 }; for (unsigned i = 4; i != 8; ++i) { int idx = MaskVals[i]; if (idx < 0) { - MaskV.push_back(-1); InOrder.set(i); } else if ((idx / 4) == BestHiQuad) { - MaskV.push_back((idx & 3) + 4); + MaskV[i] = (idx & 3) + 4; InOrder.set(i); - } else { - MaskV.push_back(-1); } } NewV = DAG.getVectorShuffle(MVT::v8i16, dl, NewV, DAG.getUNDEF(MVT::v8i16), @@ -6025,9 +6017,8 @@ assert(VT.getSizeInBits() == 128 && "Unsupported vector size"); - SmallVector, 8> Locs; - Locs.resize(4); - SmallVector Mask1(4U, -1); + std::pair Locs[4]; + int Mask1[] = { -1, -1, -1, -1 }; SmallVector PermMask(SVOp->getMask().begin(), SVOp->getMask().end()); unsigned NumHi = 0; @@ -6058,17 +6049,14 @@ // vector operands, put the elements into the right order. 
V1 = DAG.getVectorShuffle(VT, dl, V1, V2, &Mask1[0]); - SmallVector Mask2(4U, -1); + int Mask2[] = { -1, -1, -1, -1 }; - for (unsigned i = 0; i != 4; ++i) { - if (Locs[i].first == -1) - continue; - else { + for (unsigned i = 0; i != 4; ++i) + if (Locs[i].first != -1) { unsigned Idx = (i < 2) ? 0 : 4; Idx += Locs[i].first * 2 + Locs[i].second; Mask2[i] = Idx; } - } return DAG.getVectorShuffle(VT, dl, V1, V1, &Mask2[0]); } else if (NumLo == 3 || NumHi == 3) { @@ -6121,18 +6109,16 @@ } // Break it into (shuffle shuffle_hi, shuffle_lo). - Locs.clear(); - Locs.resize(4); - SmallVector LoMask(4U, -1); - SmallVector HiMask(4U, -1); + int LoMask[] = { -1, -1, -1, -1 }; + int HiMask[] = { -1, -1, -1, -1 }; - SmallVector *MaskPtr = &LoMask; + int *MaskPtr = LoMask; unsigned MaskIdx = 0; unsigned LoIdx = 0; unsigned HiIdx = 2; for (unsigned i = 0; i != 4; ++i) { if (i == 2) { - MaskPtr = &HiMask; + MaskPtr = HiMask; MaskIdx = 1; LoIdx = 0; HiIdx = 2; @@ -6142,26 +6128,21 @@ Locs[i] = std::make_pair(-1, -1); } else if (Idx < 4) { Locs[i] = std::make_pair(MaskIdx, LoIdx); - (*MaskPtr)[LoIdx] = Idx; + MaskPtr[LoIdx] = Idx; LoIdx++; } else { Locs[i] = std::make_pair(MaskIdx, HiIdx); - (*MaskPtr)[HiIdx] = Idx; + MaskPtr[HiIdx] = Idx; HiIdx++; } } SDValue LoShuffle = DAG.getVectorShuffle(VT, dl, V1, V2, &LoMask[0]); SDValue HiShuffle = DAG.getVectorShuffle(VT, dl, V1, V2, &HiMask[0]); - SmallVector MaskOps; - for (unsigned i = 0; i != 4; ++i) { - if (Locs[i].first == -1) { - MaskOps.push_back(-1); - } else { - unsigned Idx = Locs[i].first * 4 + Locs[i].second; - MaskOps.push_back(Idx); - } - } + int MaskOps[] = { -1, -1, -1, -1 }; + for (unsigned i = 0; i != 4; ++i) + if (Locs[i].first != -1) + MaskOps[i] = Locs[i].first * 4 + Locs[i].second; return DAG.getVectorShuffle(VT, dl, LoShuffle, HiShuffle, &MaskOps[0]); } From elena.demikhovsky at intel.com Mon Jan 30 09:46:34 2012 From: elena.demikhovsky at intel.com (Demikhovsky, Elena) Date: Mon, 30 Jan 2012 15:46:34 +0000 Subject: 
[llvm-commits] Optimization for TRUNCATE on AVX - please review In-Reply-To: <4F26A948.7080903@grosser.es> References: <4F26A948.7080903@grosser.es> Message-ID: I forgot about '{', I'll fix. The function may be static. In this case I need to pass one more parameter. - Elena -----Original Message----- From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Tobias Grosser Sent: Monday, January 30, 2012 16:29 To: llvm-commits at cs.uiuc.edu Subject: Re: [llvm-commits] Optimization for TRUNCATE on AVX - please review On 01/30/2012 03:11 PM, Demikhovsky, Elena wrote: > Truncating from v4i64 to v4i32 and v8i32 to v8i16 may be done with set of shuffles on AVX. Hi Elena, no technical review, but some comments: > + if ((VT == MVT::v4i32)&& (OpVT == MVT::v4i64)) > + { We normally put the '{' in the same line than the 'if'. > Index: lib/Target/X86/X86ISelLowering.h > =================================================================== > --- lib/Target/X86/X86ISelLowering.h (revision 149245) > +++ lib/Target/X86/X86ISelLowering.h (working copy) > @@ -839,6 +839,8 @@ > SDValue LowerMEMBARRIER(SDValue Op, SelectionDAG&DAG) const; > SDValue LowerATOMIC_FENCE(SDValue Op, SelectionDAG&DAG) const; > SDValue LowerSIGN_EXTEND_INREG(SDValue Op, SelectionDAG&DAG) const; > + SDValue PerformBrcondCombine(SDNode* N, SelectionDAG&DAG, DAGCombinerInfo&DCI) const; Do you need to declare this function? Cheers Tobi _______________________________________________ llvm-commits mailing list llvm-commits at cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. 
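Editorial sketch for the thread above: the claim that a v4i64 to v4i32 truncate "may be done with set of shuffles" can be modeled in plain C++ as a bitcast to eight 32-bit lanes followed by a shuffle that keeps the even lanes. The patch attachment was scrubbed from the archive, so this is not the patch's lowering code; the function name `truncateV4I64ViaShuffle` and the mask `{0, 2, 4, 6}` are illustrative assumptions, and the lane order assumes a little-endian host such as x86.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not LLVM code: models a v4i64 -> v4i32 truncate as
// "bitcast to v8i32, then shuffle with mask {0, 2, 4, 6}".
// Assumption: little-endian host, so the low 32 bits of each 64-bit lane
// occupy the even-numbered 32-bit lane.
std::array<std::uint32_t, 4>
truncateV4I64ViaShuffle(const std::array<std::uint64_t, 4> &in) {
  // Bitcast: reinterpret the four 64-bit lanes as eight 32-bit lanes.
  std::array<std::uint32_t, 8> lanes;
  std::memcpy(lanes.data(), in.data(), in.size() * sizeof(std::uint64_t));

  // Shuffle: keep only the even 32-bit lanes (the low halves).
  const int mask[4] = {0, 2, 4, 6};
  std::array<std::uint32_t, 4> out;
  for (std::size_t i = 0; i != out.size(); ++i)
    out[i] = lanes[mask[i]];
  return out;
}
```

On a little-endian machine each output lane equals the corresponding input lane cast to `uint32_t`, which is the definition of an integer truncate. The v8i32 to v8i16 case mentioned in the same message would follow the same pattern with sixteen 16-bit lanes.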
From tobias at grosser.es Mon Jan 30 10:01:36 2012 From: tobias at grosser.es (Tobias Grosser) Date: Mon, 30 Jan 2012 17:01:36 +0100 Subject: [llvm-commits] Optimization for TRUNCATE on AVX - please review In-Reply-To: References: <4F26A948.7080903@grosser.es> Message-ID: <4F26BEE0.4050306@grosser.es> On 01/30/2012 04:46 PM, Demikhovsky, Elena wrote: > I forgot about '{', I'll fix. > > The function may be static. In this case I need to pass one more parameter. Sorry I do not understand the second comment. You declare a function "PerformBrcondCombine" that is never defined. It seems to be a leftover from another patch. Tobi From dgregor at apple.com Mon Jan 30 10:57:18 2012 From: dgregor at apple.com (Douglas Gregor) Date: Mon, 30 Jan 2012 16:57:18 -0000 Subject: [llvm-commits] [llvm] r149254 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <20120130165718.D45882A6C12C@llvm.org> Author: dgregor Date: Mon Jan 30 10:57:18 2012 New Revision: 149254 URL: http://llvm.org/viewvc/llvm-project?rev=149254&view=rev Log: Eliminate narrowing conversion in initializer list, to make C++11 happy Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149254&r1=149253&r2=149254&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Jan 30 10:57:18 2012 @@ -5294,8 +5294,8 @@ int MaskVec[] = { Reverse1 ? 1 : 0, Reverse1 ? 0 : 1, - Reverse2 ? 1-NumElems : NumElems, - Reverse2 ? NumElems : 1+NumElems + static_cast<int>(Reverse2 ? 1-NumElems : NumElems), + static_cast<int>(Reverse2 ?
NumElems : 1+NumElems) }; return DAG.getVectorShuffle(VT, dl, V[0], V[1], &MaskVec[0]); } From baldrick at free.fr Mon Jan 30 11:01:29 2012 From: baldrick at free.fr (Duncan Sands) Date: Mon, 30 Jan 2012 18:01:29 +0100 Subject: [llvm-commits] [llvm] r148741 - in /llvm/trunk: include/llvm/Constants.h include/llvm/Value.h lib/VMCore/Constants.cpp lib/VMCore/LLVMContextImpl.cpp lib/VMCore/LLVMContextImpl.h In-Reply-To: <20120123225711.23E212A6C12C@llvm.org> References: <20120123225711.23E212A6C12C@llvm.org> Message-ID: <4F26CCE9.3000304@free.fr> Hi Chris, > --- llvm/trunk/include/llvm/Constants.h (original) > +++ llvm/trunk/include/llvm/Constants.h Mon Jan 23 16:57:10 2012 > @@ -535,6 +534,166 @@ > return V->getValueID() == ConstantPointerNullVal; > } > }; > + > +//===----------------------------------------------------------------------===// > +/// ConstantDataSequential - A vector or array of data that contains no > +/// relocations, and whose element type is a simple 1/2/4/8-byte integer or why talk about "relocations" here? What is a "relocation"? Don't you just mean that it is an array of numbers? Same goes for the other uses of "relocations" later in the file. > +public: > + > + virtual void destroyConstant(); > + > + /// getElementAsInteger - If this is a sequential container of integers (of > + /// any size), return the specified element in the low bits of a uint64_t. > + uint64_t getElementAsInteger(unsigned i) const; > + > + /// getElementAsAPFloat - If this is a sequential container of floating point > + /// type, return the specified element as an APFloat. > + APFloat getElementAsAPFloat(unsigned i) const; Wouldn't it be more symmetric to also have a method "getElementAsAPInt"? > + /// getElementAsDouble - If this is an sequential container of doubles, return > + /// the specified element as a float. 
as a float -> as a double > --- llvm/trunk/lib/VMCore/Constants.cpp (original) > +++ llvm/trunk/lib/VMCore/Constants.cpp Mon Jan 23 16:57:10 2012 > @@ -1913,6 +1913,154 @@ > OperandList[i+1] = IdxList[i]; > } > > +//===----------------------------------------------------------------------===// > +// ConstantData* implementations > + > +void ConstantDataArray::anchor() {} > +void ConstantDataVector::anchor() {} > + > +/// isAllZeros - return true if the array is empty or all zeros. > +static bool isAllZeros(StringRef Arr) { > + for (StringRef::iterator I = Arr.begin(), E = Arr.end(); I != E; ++I) > + if (*I != 0) > + return false; > + return true; > +} Missing blank line. > +/// getImpl - This is the underlying implementation of all of the > +/// ConstantDataSequential::get methods. They all thunk down to here, providing > +/// the correct element type. We take the bytes in as an StringRef because an StringRef -> a StringRef Ciao, Duncan. From kcc at google.com Mon Jan 30 11:45:21 2012 From: kcc at google.com (Kostya Serebryany) Date: Mon, 30 Jan 2012 09:45:21 -0800 Subject: [llvm-commits] ThreadSanitizer, first patch. Please review. In-Reply-To: References: Message-ID: Any feedback? --kcc On Wed, Jan 18, 2012 at 11:38 AM, Kostya Serebryany wrote: > Hello, > > The proposed patch is the first step towards race detection built into > LLVM/Clang. > The tool will be similar to AddressSanitizer in the way it works: > 1. A simple instrumentation module > in lib/Transforms/Instrumentation/ThreadSanitizer.cpp (this patch) > 2. -fthread-sanitizer flag in clang (next patch) > 3. A run-time library in projects/compiler-rt/lib/tsan (patches will > follow). > > The patch: http://codereview.appspot.com/5545054/ (also attached). 
> > Here are some links about the previous versions of ThreadSanitizer > - http://code.google.com/p/data-race-test/ -- main project page > - http://code.google.com/p/data-race-test/wiki/CompileTimeInstrumentation - description > of the LLVM-based prototype (we are not going to reuse that code, but will > reuse the ideas). > - http://code.google.com/p/data-race-test/wiki/GccInstrumentation - > description of the GCC-based prototype. > - > http://dev.chromium.org/developers/how-tos/using-valgrind/threadsanitizer - > ThreadSanitizer for Chromium > - http://data-race-test.googlecode.com/files/ThreadSanitizer.pdf -- paper > about Valgrind-based tool published at WBIA'09 > - http://data-race-test.googlecode.com/files/ThreadSanitizerLLVM.pdf -- > paper about LLVM-based tool published at RV'2011 > > Thanks, > > --kcc > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/e59e82d3/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: issue5545054_10002.diff Type: text/x-patch Size: 8697 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/e59e82d3/attachment.bin From chandlerc at gmail.com Mon Jan 30 11:57:14 2012 From: chandlerc at gmail.com (Chandler Carruth) Date: Mon, 30 Jan 2012 09:57:14 -0800 Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple In-Reply-To: References: <4f26853a.06a1ec0a.5cf8.ffff889dSMTPIN_ADDED@mx.google.com> Message-ID: Updated patch, now with 64bit, 32bit, and 16bit. Also with unit tests to ensure this works, and continues to work. I've added a tiny blurb about why I used 3 predicates rather than exposing the bitwidth as an integer. I want these to be used as coarse categories, not precise architectural details. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/5d6997f4/attachment.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: triple-predicates2.patch
Type: application/octet-stream
Size: 4448 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/5d6997f4/attachment.obj

From james.molloy at arm.com Mon Jan 30 12:08:49 2012
From: james.molloy at arm.com (James Molloy)
Date: Mon, 30 Jan 2012 18:08:49 -0000
Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple
In-Reply-To:
References: <4f26853a.06a1ec0a.5cf8.ffff889dSMTPIN_ADDED@mx.google.com>
Message-ID: <000501ccdf7a$37adc140$a70943c0$@molloy@arm.com>

Hi Chandler,

One point:

+ /// Note that this tests for 16-bit pointer width, and nothing else.

I'm not sure this comment is accurate. For example, real mode x86 would be 16-bit but has 24-bit pointers (seg:offset). Perhaps you should just explain that it is the obvious native width of the register file or something (or not be so explicit and let the reader rely on some common sense?)

Cheers,

James

From: Chandler Carruth [mailto:chandlerc at gmail.com]
Sent: 30 January 2012 17:57
To: James Molloy; Anton Korobeynikov
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple

Updated patch, now with 64bit, 32bit, and 16bit. Also with unit tests to ensure this works, and continues to work.

I've added a tiny blurb about why I used 3 predicates rather than exposing the bitwidth as an integer. I want these to be used as coarse categories, not precise architectural details.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/da7b2075/attachment.html From clattner at apple.com Mon Jan 30 12:22:59 2012 From: clattner at apple.com (Chris Lattner) Date: Mon, 30 Jan 2012 10:22:59 -0800 Subject: [llvm-commits] [llvm] r148741 - in /llvm/trunk: include/llvm/Constants.h include/llvm/Value.h lib/VMCore/Constants.cpp lib/VMCore/LLVMContextImpl.cpp lib/VMCore/LLVMContextImpl.h In-Reply-To: <4F26CCE9.3000304@free.fr> References: <20120123225711.23E212A6C12C@llvm.org> <4F26CCE9.3000304@free.fr> Message-ID: <677C5FB9-B6D3-470D-9989-3ADD05887607@apple.com> Hey Duncan, thanks for the review! On Jan 30, 2012, at 9:01 AM, Duncan Sands wrote: >> --- llvm/trunk/include/llvm/Constants.h (original) >> +++ llvm/trunk/include/llvm/Constants.h Mon Jan 23 16:57:10 2012 >> @@ -535,6 +534,166 @@ >> return V->getValueID() == ConstantPointerNullVal; >> } >> }; >> + >> +//===----------------------------------------------------------------------===// >> +/// ConstantDataSequential - A vector or array of data that contains no >> +/// relocations, and whose element type is a simple 1/2/4/8-byte integer or > > why talk about "relocations" here? What is a "relocation"? Don't you > just mean that it is an array of numbers? Same goes for the other uses > of "relocations" later in the file. Done. > >> +public: >> + >> + virtual void destroyConstant(); >> + >> + /// getElementAsInteger - If this is a sequential container of integers (of >> + /// any size), return the specified element in the low bits of a uint64_t. >> + uint64_t getElementAsInteger(unsigned i) const; >> + >> + /// getElementAsAPFloat - If this is a sequential container of floating point >> + /// type, return the specified element as an APFloat. >> + APFloat getElementAsAPFloat(unsigned i) const; > > Wouldn't it be more symmetric to also have a method "getElementAsAPInt"? Perhaps. 
In practice, this isn't actually useful to any clients, and is quite a bit more expensive than returning a uint64_t (because APInt is non-POD). The rest fixed, thanks again for the review! -Chris From sabre at nondot.org Mon Jan 30 12:19:30 2012 From: sabre at nondot.org (Chris Lattner) Date: Mon, 30 Jan 2012 18:19:30 -0000 Subject: [llvm-commits] [llvm] r149255 - in /llvm/trunk: include/llvm/Constants.h lib/VMCore/Constants.cpp Message-ID: <20120130181930.6F52F2A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 12:19:30 2012 New Revision: 149255 URL: http://llvm.org/viewvc/llvm-project?rev=149255&view=rev Log: Various improvements suggested by Duncan Modified: llvm/trunk/include/llvm/Constants.h llvm/trunk/lib/VMCore/Constants.cpp Modified: llvm/trunk/include/llvm/Constants.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Constants.h?rev=149255&r1=149254&r2=149255&view=diff ============================================================================== --- llvm/trunk/include/llvm/Constants.h (original) +++ llvm/trunk/include/llvm/Constants.h Mon Jan 30 12:19:30 2012 @@ -561,10 +561,13 @@ }; //===----------------------------------------------------------------------===// -/// ConstantDataSequential - A vector or array of data that contains no -/// relocations, and whose element type is a simple 1/2/4/8-byte integer or -/// float/double. This is the common base class of ConstantDataArray and -/// ConstantDataVector. +/// ConstantDataSequential - A vector or array constant whose element type is a +/// simple 1/2/4/8-byte integer or float/double, and whose elements are just +/// simple data values (i.e. ConstantInt/ConstantFP). This Constant node has no +/// operands because it stores all of the elements of the constant as densely +/// packed data, instead of as Value*'s. +/// +/// This is the common base class of ConstantDataArray and ConstantDataVector. 
/// class ConstantDataSequential : public Constant { friend class LLVMContextImpl; @@ -612,7 +615,7 @@ float getElementAsFloat(unsigned i) const; /// getElementAsDouble - If this is an sequential container of doubles, return - /// the specified element as a float. + /// the specified element as a double. double getElementAsDouble(unsigned i) const; /// getElementAsConstant - Return a Constant for a specified index's element. @@ -683,9 +686,11 @@ }; //===----------------------------------------------------------------------===// -/// ConstantDataArray - An array of data that contains no relocations, and whose -/// element type is a simple 1/2/4/8-byte integer or float/double. -/// +/// ConstantDataArray - An array constant whose element type is a simple +/// 1/2/4/8-byte integer or float/double, and whose elements are just simple +/// data values (i.e. ConstantInt/ConstantFP). This Constant node has no +/// operands because it stores all of the elements of the constant as densely +/// packed data, instead of as Value*'s. class ConstantDataArray : public ConstantDataSequential { void *operator new(size_t, unsigned); // DO NOT IMPLEMENT ConstantDataArray(const ConstantDataArray &); // DO NOT IMPLEMENT @@ -734,9 +739,11 @@ }; //===----------------------------------------------------------------------===// -/// ConstantDataVector - A vector of data that contains no relocations, and -/// whose element type is a simple 1/2/4/8-byte integer or float/double. -/// +/// ConstantDataVector - A vector constant whose element type is a simple +/// 1/2/4/8-byte integer or float/double, and whose elements are just simple +/// data values (i.e. ConstantInt/ConstantFP). This Constant node has no +/// operands because it stores all of the elements of the constant as densely +/// packed data, instead of as Value*'s. 
class ConstantDataVector : public ConstantDataSequential { void *operator new(size_t, unsigned); // DO NOT IMPLEMENT ConstantDataVector(const ConstantDataVector &); // DO NOT IMPLEMENT Modified: llvm/trunk/lib/VMCore/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Constants.cpp?rev=149255&r1=149254&r2=149255&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Constants.cpp (original) +++ llvm/trunk/lib/VMCore/Constants.cpp Mon Jan 30 12:19:30 2012 @@ -2109,7 +2109,7 @@ /// getImpl - This is the underlying implementation of all of the /// ConstantDataSequential::get methods. They all thunk down to here, providing -/// the correct element type. We take the bytes in as an StringRef because +/// the correct element type. We take the bytes in as a StringRef because /// we *want* an underlying "char*" to avoid TBAA type punning violations. Constant *ConstantDataSequential::getImpl(StringRef Elements, Type *Ty) { assert(isElementTypeCompatible(Ty->getSequentialElementType())); From lenny at Colorado.EDU Mon Jan 30 12:51:17 2012 From: lenny at Colorado.EDU (Lenny Maiorani) Date: Mon, 30 Jan 2012 11:51:17 -0700 Subject: [llvm-commits] [cfe-commits] [PATCH][Review Request] EarlyCSE stack overflow - bugzilla 11794 In-Reply-To: References: <051207DF-F3D0-4E8B-A756-E6B193750483@colorado.edu> <184C997C-6E18-4BE8-A3F8-092102981B11@colorado.edu> <6663C373-1004-4FB2-9E39-F784587B4CDD@2pi.dk> <4AA7AFB1-51F3-48B4-B543-00141B738258@colorado.edu> Message-ID: On Jan 23, 2012, at 4:03 PM, Jakob Stoklund Olesen wrote: > > On Jan 23, 2012, at 2:30 PM, Lenny Maiorani wrote: > >> Thank you for the information about MachineCSE and noticing the hash tables which are being generated at the top of EarlyCSE::processNode(). My latest implementation contains roughly the same implementation as MachineCSE. MachineCSE generates stack of nodes to process using SmallVector. 
Since this data structure is so large, I think using SmallVector is not correct. > > SmallVector is almost always a better choice than std::vector, which is essentially equivalent to SmallVector. std::vector only makes sense if you need to create many vector instances that will probably overflow the small size. > >> Also, I implemented the solution using vec.resize() and std::copy(), but this doesn't significantly improve the performance of the vector solution. My numbers are not at all surprising to me. Vectors do not do as well as deques with many pushes and pops because deques manage their memory in slabs to prevent needing to do large reallocs and hence large copies on occasion. > > Interesting. A vector only needs to copy its elements once (amortized). > >> Using the DepthFirstIterator (df_iterator) would be nice, but I do not see an easy way to maintain the CurrentGeneration variable which gets modified on a per-tree-depth. This means that there would still need to be some sort of stack kept separately from the iterator. This would be confusing and clutter the code. > > Yes, unfortunately df_iterator doesn't allow you to use its internal stack. That would have been convenient. > >> I looked a bit at the hash tables at the top of EarlyCSE::processNode() and I do not think they are nested. It looks like they are copy-constructed from some hash tables which are class members. These are then modified and provided the order of node traversal does not change, then the data in those member hash tables should be the same. > > Look again. The implementation is in ScopedHashTable.h. > > When processing a node, there must be a ScopedHashTableScope instantiated for every dominator tree level above the node. The nested scopes form a linked list. > > /jakob > Hi Jakob, I am getting back to this and it looks like I will need to store the 3 ScopedHashTable objects. 
I am considering doing this via an IntrusiveRefCntPtr since I will need to store them in the local stack along with the generation and node and each scope might have multiple instances, one instance per child node. Is it preferred to have ScopedHashTable directly inherit from RefCountedBase (making all instances of ScopedHashTable have RefCountedBase) or create a new class within EarlyCSE which inherits and use a naked ptr to the ScopedHashTable inside that class? Thanks, -Lenny From stoklund at 2pi.dk Mon Jan 30 13:05:55 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Mon, 30 Jan 2012 11:05:55 -0800 Subject: [llvm-commits] [cfe-commits] [PATCH][Review Request] EarlyCSE stack overflow - bugzilla 11794 In-Reply-To: References: <051207DF-F3D0-4E8B-A756-E6B193750483@colorado.edu> <184C997C-6E18-4BE8-A3F8-092102981B11@colorado.edu> <6663C373-1004-4FB2-9E39-F784587B4CDD@2pi.dk> <4AA7AFB1-51F3-48B4-B543-00141B738258@colorado.edu> Message-ID: <8D478BF2-5BEA-45AA-A525-44C2085C11CC@2pi.dk> On Jan 30, 2012, at 10:51 AM, Lenny Maiorani wrote: > I am getting back to this and it looks like I will need to store the 3 ScopedHashTable objects. I am considering doing this via an IntrusiveRefCntPtr since I will need to store them in the local stack along with the generation and node and each scope might have multiple instances, one instance per child node. Is it preferred to have ScopedHashTable directly inherit from RefCountedBase (making all instances of ScopedHashTable have RefCountedBase) or create a new class within EarlyCSE which inherits and use a naked ptr to the ScopedHashTable inside that class? Hi Lenny, I really don't think you need reference counting to do this. A simple tree traversal with a stack on the side will do. Make sure you know the fundamental difference between the tree traversal code you posted earlier and the algorithm used by DepthFirstIterator.h. You need the latter. 
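As a rough illustration of the kind of traversal referred to above (a sketch under stated assumptions, not code from EarlyCSE or DepthFirstIterator.h): an explicit stack holds one entry per node on the path from the root to the current node, and each entry remembers which child to visit next. `TreeNode`, `StackEntry`, and `walk` are hypothetical names invented for this example; the "enter" and "leave" events mark where per-node state, such as a hash table scope or a saved generation number, could be set up and torn down.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a dominator tree node; not an LLVM type.
struct TreeNode {
  std::string Name;
  std::vector<TreeNode *> Children;
};

// One entry per node on the path from the root to the current node.
struct StackEntry {
  TreeNode *Node;
  std::size_t NextChild; // index of the next child still to be visited
  unsigned Generation;   // example of per-path state kept next to the node
};

// Iterative depth-first walk that pushes only one child at a time, so the
// stack always mirrors the root-to-current-node path.
std::vector<std::string> walk(TreeNode *Root) {
  std::vector<std::string> Events;
  if (!Root)
    return Events;

  std::vector<StackEntry> Stack;
  Stack.push_back({Root, 0, 0});
  Events.push_back("enter " + Root->Name); // a scope would be opened here

  while (!Stack.empty()) {
    StackEntry &Top = Stack.back();
    if (Top.NextChild < Top.Node->Children.size()) {
      // Read everything needed from Top before push_back, which may
      // reallocate the vector and invalidate the reference.
      TreeNode *Child = Top.Node->Children[Top.NextChild++];
      unsigned ChildGeneration = Top.Generation + 1;
      Stack.push_back({Child, 0, ChildGeneration});
      Events.push_back("enter " + Child->Name);
    } else {
      // All children done: the scope for this node would be closed here.
      Events.push_back("leave " + Top.Node->Name);
      Stack.pop_back();
    }
  }
  return Events;
}
```

Because an entry is popped only after all of its children have been handled, anything stored in the entry lives exactly as long as the node is on the current path, which is why no reference counting is needed in this arrangement.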
/jakob From matthewbg at google.com Mon Jan 30 13:26:20 2012 From: matthewbg at google.com (Matt Beaumont-Gay) Date: Mon, 30 Jan 2012 19:26:20 -0000 Subject: [llvm-commits] [llvm] r149259 - /llvm/trunk/lib/CodeGen/LiveIntervalAnalysis.cpp Message-ID: <20120130192620.419772A6C12C@llvm.org> Author: matthewbg Date: Mon Jan 30 13:26:20 2012 New Revision: 149259 URL: http://llvm.org/viewvc/llvm-project?rev=149259&view=rev Log: Here's a new one: GCC was complaining about an only-used-in-asserts *function*. Wrap the function in #ifndef NDEBUG. Modified: llvm/trunk/lib/CodeGen/LiveIntervalAnalysis.cpp Modified: llvm/trunk/lib/CodeGen/LiveIntervalAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/LiveIntervalAnalysis.cpp?rev=149259&r1=149258&r2=149259&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/LiveIntervalAnalysis.cpp (original) +++ llvm/trunk/lib/CodeGen/LiveIntervalAnalysis.cpp Mon Jan 30 13:26:20 2012 @@ -797,7 +797,7 @@ } } - +#ifndef NDEBUG static bool intervalRangesSane(const LiveInterval& li) { if (li.empty()) { return true; @@ -814,6 +814,7 @@ return true; } +#endif template static void handleMoveDefs(LiveIntervals& lis, SlotIndex origIdx, @@ -1145,4 +1146,3 @@ return LR; } - From lenny at Colorado.EDU Mon Jan 30 13:37:08 2012 From: lenny at Colorado.EDU (Lenny Maiorani) Date: Mon, 30 Jan 2012 12:37:08 -0700 Subject: [llvm-commits] [cfe-commits] [PATCH][Review Request] EarlyCSE stack overflow - bugzilla 11794 In-Reply-To: <8D478BF2-5BEA-45AA-A525-44C2085C11CC@2pi.dk> References: <051207DF-F3D0-4E8B-A756-E6B193750483@colorado.edu> <184C997C-6E18-4BE8-A3F8-092102981B11@colorado.edu> <6663C373-1004-4FB2-9E39-F784587B4CDD@2pi.dk> <4AA7AFB1-51F3-48B4-B543-00141B738258@colorado.edu> <8D478BF2-5BEA-45AA-A525-44C2085C11CC@2pi.dk> Message-ID: <4A53CDC9-0BC2-48F4-BBF6-334D98F358DA@colorado.edu> On Jan 30, 2012, at 12:05 PM, Jakob Stoklund Olesen wrote: > > On Jan 30, 2012, at 
10:51 AM, Lenny Maiorani wrote: > >> I am getting back to this and it looks like I will need to store the 3 ScopedHashTable objects. I am considering doing this via an IntrusiveRefCntPtr since I will need to store them in the local stack along with the generation and node and each scope might have multiple instances, one instance per child node. Is it preferred to have ScopedHashTable directly inherit from RefCountedBase (making all instances of ScopedHashTable have RefCountedBase) or create a new class within EarlyCSE which inherits and use a naked ptr to the ScopedHashTable inside that class? > > Hi Lenny, > > I really don't think you need reference counting to do this. A simple tree traversal with a stack on the side will do. > > Make sure you know the fundamental difference between the tree traversal code you posted earlier and the algorithm used by DepthFirstIterator.h. You need the latter. > > /jakob > Hi Jakob, As I described in a previous message, the DepthFirstIterator does not do what I need. Since it abstracts the levels of the tree, I do not see how it can be used to know when to save CurrentGeneration and *Scope locals and when not to. Am I significantly misunderstanding what is going on here? -Lenny From grosser at fim.uni-passau.de Mon Jan 30 13:38:36 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 19:38:36 -0000 Subject: [llvm-commits] [polly] r149261 - /polly/trunk/lib/Analysis/Dependences.cpp Message-ID: <20120130193836.DB4B02A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 13:38:36 2012 New Revision: 149261 URL: http://llvm.org/viewvc/llvm-project?rev=149261&view=rev Log: Dependences: Coalesce the dependences before returning them.
Modified: polly/trunk/lib/Analysis/Dependences.cpp Modified: polly/trunk/lib/Analysis/Dependences.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/Analysis/Dependences.cpp?rev=149261&r1=149260&r2=149261&view=diff ============================================================================== --- polly/trunk/lib/Analysis/Dependences.cpp (original) +++ polly/trunk/lib/Analysis/Dependences.cpp Mon Jan 30 13:38:36 2012 @@ -418,7 +418,7 @@ dependences = isl_union_map_union(dependences, isl_union_map_copy(waw_dep)); - return dependences; + return isl_union_map_coalesce(dependences); } void Dependences::getAnalysisUsage(AnalysisUsage &AU) const { From grosser at fim.uni-passau.de Mon Jan 30 13:38:40 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 19:38:40 -0000 Subject: [llvm-commits] [polly] r149262 - /polly/trunk/utils/checkout_cloog.sh Message-ID: <20120130193840.7DF6C2A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 13:38:40 2012 New Revision: 149262 URL: http://llvm.org/viewvc/llvm-project?rev=149262&view=rev Log: Use isl version: 3c66541593a6bf3b5a3d35d31567abe6c9e5a04b This allows us to set the fusion strategy and to gist-simplify union_maps. 
Modified: polly/trunk/utils/checkout_cloog.sh Modified: polly/trunk/utils/checkout_cloog.sh URL: http://llvm.org/viewvc/llvm-project/polly/trunk/utils/checkout_cloog.sh?rev=149262&r1=149261&r2=149262&view=diff ============================================================================== --- polly/trunk/utils/checkout_cloog.sh (original) +++ polly/trunk/utils/checkout_cloog.sh Mon Jan 30 13:38:40 2012 @@ -1,7 +1,7 @@ #!/bin/sh CLOOG_HASH="57470e76bfd58a0c38c598e816411663193e0f45" -ISL_HASH="990ad28a4356f5743bed503fbdcb7e85747f727f" +ISL_HASH="3c66541593a6bf3b5a3d35d31567abe6c9e5a04b" PWD=`pwd` From grosser at fim.uni-passau.de Mon Jan 30 13:38:43 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 19:38:43 -0000 Subject: [llvm-commits] [polly] r149263 - /polly/trunk/lib/ScheduleOptimizer.cpp Message-ID: <20120130193843.EDE1B2A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 13:38:43 2012 New Revision: 149263 URL: http://llvm.org/viewvc/llvm-project?rev=149263&view=rev Log: Scheduler: Simplify dependences by default (only isl) This speeds up the scheduler by orders of magnitude and, in addition, often yields a better schedule. With this we can compile all polybench kernels with less than 5x compile time overhead. In general the overhead is even less than 2-3x. This is still with running a lot of redundant passes and no compile time tuning at all. There are several obvious areas where we can improve further. There are also two test cases where we cannot find a schedule any more (cholesky and another). I will look into them later on. With this we have a very solid baseline from which we can start to optimize further.
Modified: polly/trunk/lib/ScheduleOptimizer.cpp Modified: polly/trunk/lib/ScheduleOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/ScheduleOptimizer.cpp?rev=149263&r1=149262&r2=149263&view=diff ============================================================================== --- polly/trunk/lib/ScheduleOptimizer.cpp (original) +++ polly/trunk/lib/ScheduleOptimizer.cpp Mon Jan 30 13:38:43 2012 @@ -50,6 +50,11 @@ cl::desc("Disable tiling in the scheduler"), cl::Hidden, cl::location(polly::DisablePollyTiling), cl::init(false)); +static cl::opt +SimplifyDeps("polly-opt-simplify-deps", + cl::desc("Dependences should be simplified (yes/no)"), + cl::Hidden, cl::init("yes")); + namespace { class IslScheduleOptimizer : public ScopPass { @@ -410,8 +415,7 @@ | Dependences::TYPE_WAR | Dependences::TYPE_WAW; - isl_union_map *validity = D->getDependences(dependencyKinds); - isl_union_map *proximity = D->getDependences(dependencyKinds); + isl_union_map *dependences = D->getDependences(dependencyKinds); isl_union_set *domain = NULL; for (Scop::iterator SI = S.begin(), SE = S.end(); SI != SE; ++SI) @@ -426,6 +430,27 @@ if (!domain) return false; + // Simplify the dependences by removing the constraints introduced by the + // domains. This can speed up the scheduling time significantly, as large + // constant coefficients will be removed from the dependences. The + // introduction of some additional dependences reduces the possible + // transformations, but in most cases, such transformation do not seem to be + // interesting anyway. In some cases this option may stop the scheduler to + // find any schedule. + if (SimplifyDeps == "yes") { + dependences = isl_union_map_gist_domain(dependences, + isl_union_set_copy(domain)); + dependences = isl_union_map_gist_range(dependences, + isl_union_set_copy(domain)); + } else if (SimplifyDeps != "no") { + errs() << "warning: Option -polly-opt-simplify-deps should either be 'yes' " + "or 'no'. 
Falling back to default: 'yes'\n"; + } + + isl_schedule *schedule; + isl_union_map *proximity = isl_union_map_copy(dependences); + isl_union_map *validity = dependences; + DEBUG(dbgs() << "\n\nCompute schedule from: "); DEBUG(dbgs() << "Domain := "; isl_union_set_dump(domain); dbgs() << ";\n"); DEBUG(dbgs() << "Proximity := "; isl_union_map_dump(proximity); @@ -433,8 +458,6 @@ DEBUG(dbgs() << "Validity := "; isl_union_map_dump(validity); dbgs() << ";\n"); - isl_schedule *schedule; - isl_options_set_schedule_max_constant_term(S.getIslCtx(), CONSTANT_BOUND); isl_options_set_schedule_maximize_band_depth(S.getIslCtx(), 1); schedule = isl_union_set_compute_schedule(domain, validity, proximity); From grosser at fim.uni-passau.de Mon Jan 30 13:38:47 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 19:38:47 -0000 Subject: [llvm-commits] [polly] r149264 - /polly/trunk/lib/ScheduleOptimizer.cpp Message-ID: <20120130193847.6506C2A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 13:38:47 2012 New Revision: 149264 URL: http://llvm.org/viewvc/llvm-project?rev=149264&view=rev Log: Scheduling: Use original schedule if we cannot find a new one After this we can now compile all polybench 2.0 kernels without any compiler crash. 
Modified: polly/trunk/lib/ScheduleOptimizer.cpp Modified: polly/trunk/lib/ScheduleOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/ScheduleOptimizer.cpp?rev=149264&r1=149263&r2=149264&view=diff ============================================================================== --- polly/trunk/lib/ScheduleOptimizer.cpp (original) +++ polly/trunk/lib/ScheduleOptimizer.cpp Mon Jan 30 13:38:47 2012 @@ -32,6 +32,7 @@ #include "isl/constraint.h" #include "isl/schedule.h" #include "isl/band.h" +#include "isl/options.h" #define DEBUG_TYPE "polly-opt-isl" #include "llvm/Support/Debug.h" @@ -460,7 +461,15 @@ isl_options_set_schedule_max_constant_term(S.getIslCtx(), CONSTANT_BOUND); isl_options_set_schedule_maximize_band_depth(S.getIslCtx(), 1); + + isl_options_set_on_error(S.getIslCtx(), ISL_ON_ERROR_CONTINUE); schedule = isl_union_set_compute_schedule(domain, validity, proximity); + isl_options_set_on_error(S.getIslCtx(), ISL_ON_ERROR_ABORT); + + // In cases the scheduler is not able to optimize the code, we just do not + // touch the schedule. 
+ if (!schedule) + return false; DEBUG(dbgs() << "Computed schedule: "); DEBUG(dbgs() << stringFromIslObj(schedule)); From grosser at fim.uni-passau.de Mon Jan 30 13:38:50 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 19:38:50 -0000 Subject: [llvm-commits] [polly] r149265 - /polly/trunk/lib/ScheduleOptimizer.cpp Message-ID: <20120130193850.D08CF2A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 13:38:50 2012 New Revision: 149265 URL: http://llvm.org/viewvc/llvm-project?rev=149265&view=rev Log: Scheduler: Allow to select the fusion strategy Modified: polly/trunk/lib/ScheduleOptimizer.cpp Modified: polly/trunk/lib/ScheduleOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/ScheduleOptimizer.cpp?rev=149265&r1=149264&r2=149265&view=diff ============================================================================== --- polly/trunk/lib/ScheduleOptimizer.cpp (original) +++ polly/trunk/lib/ScheduleOptimizer.cpp Mon Jan 30 13:38:50 2012 @@ -56,6 +56,11 @@ cl::desc("Dependences should be simplified (yes/no)"), cl::Hidden, cl::init("yes")); +static cl::opt +FusionStrategy("polly-opt-fusion", + cl::desc("The fusion strategy to choose (min/max)"), + cl::Hidden, cl::init("max")); + namespace { class IslScheduleOptimizer : public ScopPass { @@ -459,6 +464,19 @@ DEBUG(dbgs() << "Validity := "; isl_union_map_dump(validity); dbgs() << ";\n"); + int IslFusionStrategy; + + if (FusionStrategy == "max") { + IslFusionStrategy = ISL_SCHEDULE_FUSE_MAX; + } else if (FusionStrategy == "min") { + IslFusionStrategy = ISL_SCHEDULE_FUSE_MIN; + } else { + errs() << "warning: Unknown fusion strategy. 
Falling back to maximal " + "fusion.\n"; + IslFusionStrategy = ISL_SCHEDULE_FUSE_MAX; + } + + isl_options_set_schedule_fuse(S.getIslCtx(), IslFusionStrategy); isl_options_set_schedule_max_constant_term(S.getIslCtx(), CONSTANT_BOUND); isl_options_set_schedule_maximize_band_depth(S.getIslCtx(), 1); From grosser at fim.uni-passau.de Mon Jan 30 13:38:54 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 19:38:54 -0000 Subject: [llvm-commits] [polly] r149266 - /polly/trunk/lib/ScheduleOptimizer.cpp Message-ID: <20120130193854.9B9262A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 13:38:54 2012 New Revision: 149266 URL: http://llvm.org/viewvc/llvm-project?rev=149266&view=rev Log: Scheduling: Add option to disable schedule_maximise_band_depth maximise_band_depth does not seem to have any effect for now, but it may help to increase the amount of tileable loops. We expose the flag to be able to analyze its effects when looking into individual benchmarks. Modified: polly/trunk/lib/ScheduleOptimizer.cpp Modified: polly/trunk/lib/ScheduleOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/ScheduleOptimizer.cpp?rev=149266&r1=149265&r2=149266&view=diff ============================================================================== --- polly/trunk/lib/ScheduleOptimizer.cpp (original) +++ polly/trunk/lib/ScheduleOptimizer.cpp Mon Jan 30 13:38:54 2012 @@ -61,6 +61,11 @@ cl::desc("The fusion strategy to choose (min/max)"), cl::Hidden, cl::init("max")); +static cl::opt +MaxizeBandDepth("polly-opt-maximize-bands", + cl::desc("Maxize the band depth (yes/no)"), + cl::Hidden, cl::init("yes")); + namespace { class IslScheduleOptimizer : public ScopPass { @@ -476,9 +481,21 @@ IslFusionStrategy = ISL_SCHEDULE_FUSE_MAX; } + int IslMaximizeBands; + + if (MaxizeBandDepth == "yes") { + IslMaximizeBands = 1; + } else if (MaxizeBandDepth == "no") { + IslMaximizeBands = 0; + } else { + errs() << "warning: Option -polly-opt-maximize-bands should 
either be 'yes'" + " or 'no'. Falling back to default: 'yes'\n"; + IslMaximizeBands = 1; + } + isl_options_set_schedule_fuse(S.getIslCtx(), IslFusionStrategy); isl_options_set_schedule_max_constant_term(S.getIslCtx(), CONSTANT_BOUND); - isl_options_set_schedule_maximize_band_depth(S.getIslCtx(), 1); + isl_options_set_schedule_maximize_band_depth(S.getIslCtx(), IslMaximizeBands); isl_options_set_on_error(S.getIslCtx(), ISL_ON_ERROR_CONTINUE); schedule = isl_union_set_compute_schedule(domain, validity, proximity); From grosser at fim.uni-passau.de Mon Jan 30 13:38:58 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 19:38:58 -0000 Subject: [llvm-commits] [polly] r149267 - /polly/trunk/lib/ScheduleOptimizer.cpp Message-ID: <20120130193858.627F32A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 13:38:58 2012 New Revision: 149267 URL: http://llvm.org/viewvc/llvm-project?rev=149267&view=rev Log: Scheduling: Set fusion strategy to minimal This has shown better results for 2mm, 3mm and a couple of other benchmarks. After this we show consistently better results than PoCC with maxfuse. We need to see if PoCC can also give better results with another fusion strategy.
Modified: polly/trunk/lib/ScheduleOptimizer.cpp Modified: polly/trunk/lib/ScheduleOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/ScheduleOptimizer.cpp?rev=149267&r1=149266&r2=149267&view=diff ============================================================================== --- polly/trunk/lib/ScheduleOptimizer.cpp (original) +++ polly/trunk/lib/ScheduleOptimizer.cpp Mon Jan 30 13:38:58 2012 @@ -59,7 +59,7 @@ static cl::opt FusionStrategy("polly-opt-fusion", cl::desc("The fusion strategy to choose (min/max)"), - cl::Hidden, cl::init("max")); + cl::Hidden, cl::init("min")); static cl::opt MaxizeBandDepth("polly-opt-maximize-bands", From stoklund at 2pi.dk Mon Jan 30 13:49:05 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Mon, 30 Jan 2012 11:49:05 -0800 Subject: [llvm-commits] [cfe-commits] [PATCH][Review Request] EarlyCSE stack overflow - bugzilla 11794 In-Reply-To: <4A53CDC9-0BC2-48F4-BBF6-334D98F358DA@colorado.edu> References: <051207DF-F3D0-4E8B-A756-E6B193750483@colorado.edu> <184C997C-6E18-4BE8-A3F8-092102981B11@colorado.edu> <6663C373-1004-4FB2-9E39-F784587B4CDD@2pi.dk> <4AA7AFB1-51F3-48B4-B543-00141B738258@colorado.edu> <8D478BF2-5BEA-45AA-A525-44C2085C11CC@2pi.dk> <4A53CDC9-0BC2-48F4-BBF6-334D98F358DA@colorado.edu> Message-ID: <36F5E7D6-DA08-4A70-BF2E-788642787972@2pi.dk> On Jan 30, 2012, at 11:37 AM, Lenny Maiorani wrote: > > On Jan 30, 2012, at 12:05 PM, Jakob Stoklund Olesen wrote: > >> >> On Jan 30, 2012, at 10:51 AM, Lenny Maiorani wrote: >> >>> I am getting back to this and it looks like I will need to store the 3 ScopedHashTable objects. I am considering doing this via an IntrusiveRefCntPtr since I will need to store them in the local stack along with the generation and node and each scope might have multiple instances, one instance per child node. 
Is it preferred to have ScopedHashTable directly inherit from RefCountedBase (making all instances of ScopedHashTable have RefCountedBase) or create a new class within EarlyCSE which inherits and use a naked ptr to the ScopedHashTable inside that class? >> >> Hi Lenny, >> >> I really don't think you need reference counting to do this. A simple tree traversal with a stack on the side will do. >> >> Make sure you know the fundamental difference between the tree traversal code you posted earlier and the algorithm used by DepthFirstIterator.h. You need the latter. >> >> /jakob >> > > Hi Jakob, > > As I described in a previous message, the DepthFirstIterator does not do what I need. Since it abstracts the levels of the tree, I do not see how it can be used to know when to save CurrentGeneration and *Scope locals and when not to. Am I significantly misunderstanding what is going on here? Yes. You can't use DepthFirstIterator directly because it doesn't let you share its stack. But you can use its algorithm for tree traversal. It is not the same as the algorithm you posted earlier, and it would not require you to use reference counting. In your algorithm, you push all children onto the stack immediately. I can see how you would need reference counting for that. The DFI algorithm keeps a (node, child-iterator) pair on the stack. It only pushes one child at a time, and the stack represents exactly the path from the root to the current node. That means you can simply put the hash table scopes on the stack as well. /jakob From zinob at codeaurora.org Mon Jan 30 14:01:32 2012 From: zinob at codeaurora.org (Zino Benaissa) Date: Mon, 30 Jan 2012 12:01:32 -0800 Subject: [llvm-commits] FW: Tuning LLVM Greedy Register Allocator to optimize for code size when targeting ARM Thumb 2 instruction set Message-ID: <000001ccdf89$f6aae410$e400ac30$@org> Resubmitting this patch without the CMN fix, which was submitted in a separate patch.
Thanks -Zino

From: Zino Benaissa [mailto:zinob at codeaurora.org] Sent: Monday, January 23, 2012 5:12 PM To: 'llvm-commits at cs.uiuc.edu' Cc: 'rajav at codeaurora.org' Subject: FW: Tuning LLVM Greedy Register Allocator to optimize for code size when targeting ARM Thumb 2 instruction set

Description: This contribution extends the LLVM greedy register allocator to optimize for code size when the compiler targets the ARM Thumb 2 instruction set. The heuristic favors assigning registers R0 through R7 to operands used in instructions that can be encoded in 16 bits (the 16-bit encoding is allowed only if R0-7 are used). Operands that appear most frequently in a function (and in instructions that qualify) get the R0-7 registers. The heuristic is turned on by default and affects generated code only if the -mthumb compiler switch is used. To turn it off, use the -disable-favor-r0-7 feature flag.

This patch modifies:
1) The LLVM greedy register allocator located in the LLVM/CodeGen directory: to add the new code size heuristic.
2) The ARM-specific files located in the LLVM/Target/ARM directory: to add the function that determines which instructions can be encoded in 16 bits, and a fix to enable the compiler to emit the CMN instruction in its 16-bit encoding.
3) The LLVM test suite: fix the test/CodeGen/Thumb2/thumb2-cmn.ll test.

Performance impact: I focused on the -Os and -mthumb flags, but observed similar improvement with -O3 and -mthumb. Runtime was measured on a Qualcomm 8660.

Code size:
- SPEC2000 benchmarks: between 0 and 0.6% code size reduction (with no noticeable regression).
- EEMBC benchmarks: between 0 and 6% reduction (no noticeable regression). Automotive and Networking average about 1% code size reduction and Consumer about 0.5%.

Runtime:
- SPEC2000: between -1% and 6% speedup (Spec2k/ammp 6%).
- EEMBC: overall between -1% and 5% faster.
Modified: test/CodeGen/Thumb2/thumb2-cmn.ll include/llvm/Target/TargetInstrInfo.h include/llvm/CodeGen/LiveInterval.h lib/Target/ARM/Thumb2SizeReduction.cpp lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h lib/CodeGen/RegAllocGreedy.cpp lib/CodeGen/CalcSpillWeights.cpp for details see RACodeSize.txt Testing: See ARMTestSuiteResult.txt and ARMSimple-Os-mthumb.txt Note -O3 is also completed on X86 and ARM CPUs -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/b7c03d11/attachment-0001.html -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ARMTestSuiteResult.txt Url: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/b7c03d11/attachment-0003.txt -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ARMsimple-Os-mthumb.txt Url: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/b7c03d11/attachment-0004.txt -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: RACodeSize.txt Url: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/b7c03d11/attachment-0005.txt From benny.kra at googlemail.com Mon Jan 30 14:01:35 2012 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Mon, 30 Jan 2012 20:01:35 -0000 Subject: [llvm-commits] [llvm] r149269 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <20120130200135.6EBFB2A6C12C@llvm.org> Author: d0k Date: Mon Jan 30 14:01:35 2012 New Revision: 149269 URL: http://llvm.org/viewvc/llvm-project?rev=149269&view=rev Log: Fix refacto. 
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149269&r1=149268&r2=149269&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Jan 30 14:01:35 2012 @@ -5294,8 +5294,8 @@ int MaskVec[] = { Reverse1 ? 1 : 0, Reverse1 ? 0 : 1, - static_cast(Reverse2 ? 1-NumElems : NumElems), - static_cast(Reverse2 ? NumElems : 1+NumElems) + static_cast(Reverse2 ? NumElems+1 : NumElems), + static_cast(Reverse2 ? NumElems : NumElems+1) }; return DAG.getVectorShuffle(VT, dl, V[0], V[1], &MaskVec[0]); } From bigcheesegs at gmail.com Mon Jan 30 14:08:27 2012 From: bigcheesegs at gmail.com (Michael Spencer) Date: Mon, 30 Jan 2012 12:08:27 -0800 Subject: [llvm-commits] [PATCH] YAML parser. Message-ID: Attached is the patch for the YAML parser I've been working on. YAML is a superset of JSON that adds many features that I want for writing tests for lld (the llvm linker) and other places where we use object files. The API is very similar to the existing JSON API, but is not exactly the same, as YAML has an extended data model. This parser is slower than the existing JSON parser. For files with large scalars, there is almost no difference. For medium scalars, there's a ~2x slowdown. And for small scalars, there's a ~6x slowdown. Here are some performance numbers for {yaml,json}-bench -memory-limit 100 (a 32bit build can't do much more than that :P). Note that for YAML, the Parsing time includes the Tokenizing time.
c:\Users\mspencer\Projects\llvm-project\llvm>yaml-bench -memory-limit 100
===-------------------------------------------------------------------------===
                            YAML parser benchmark
===-------------------------------------------------------------------------===
  Total Execution Time: 5.2104 seconds (5.2185 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   2.8392 ( 56.3%)   0.1716 (100.0%)   3.0108 ( 57.8%)   3.0118 ( 57.7%)  Small Values: Parsing
   2.1216 ( 42.1%)   0.0000 (  0.0%)   2.1216 ( 40.7%)   2.1257 ( 40.7%)  Small Values: Tokenizing
   0.0780 (  1.5%)   0.0000 (  0.0%)   0.0780 (  1.5%)   0.0810 (  1.6%)  Small Values: Loop
   5.0388 (100.0%)   0.1716 (100.0%)   5.2104 (100.0%)   5.2185 (100.0%)  Total

===-------------------------------------------------------------------------===
                            YAML parser benchmark
===-------------------------------------------------------------------------===
  Total Execution Time: 0.4836 seconds (0.4740 wall clock)

   ---User Time---   --User+System--   ---Wall Time---  --- Name ---
   0.2184 ( 45.2%)   0.2184 ( 45.2%)   0.2200 ( 46.4%)  Medium Values: Parsing
   0.1716 ( 35.5%)   0.1716 ( 35.5%)   0.1710 ( 36.1%)  Medium Values: Tokenizing
   0.0936 ( 19.4%)   0.0936 ( 19.4%)   0.0830 ( 17.5%)  Medium Values: Loop
   0.4836 (100.0%)   0.4836 (100.0%)   0.4740 (100.0%)  Total

===-------------------------------------------------------------------------===
                            YAML parser benchmark
===-------------------------------------------------------------------------===
  Total Execution Time: 0.2496 seconds (0.2480 wall clock)

   ---User Time---   --User+System--   ---Wall Time---  --- Name ---
   0.0780 ( 31.3%)   0.0780 ( 31.3%)   0.0830 ( 33.5%)  Large Values: Parsing
   0.0936 ( 37.5%)   0.0936 ( 37.5%)   0.0830 ( 33.5%)  Large Values: Tokenizing
   0.0780 ( 31.3%)   0.0780 ( 31.3%)   0.0820 ( 33.1%)  Large Values: Loop
   0.2496 (100.0%)   0.2496 (100.0%)   0.2480 (100.0%)  Total

c:\Users\mspencer\Projects\llvm-project\llvm>json-bench -memory-limit 100
===-------------------------------------------------------------------------===
                            JSON parser benchmark
===-------------------------------------------------------------------------===
  Total Execution Time: 0.6552 seconds (0.6531 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.5460 ( 87.5%)   0.0312 (100.0%)   0.5772 ( 88.1%)   0.5721 ( 87.6%)  Small Values: Parsing
   0.0780 ( 12.5%)   0.0000 (  0.0%)   0.0780 ( 11.9%)   0.0810 ( 12.4%)  Small Values: Loop
   0.6240 (100.0%)   0.0312 (100.0%)   0.6552 (100.0%)   0.6531 (100.0%)  Total

===-------------------------------------------------------------------------===
                            JSON parser benchmark
===-------------------------------------------------------------------------===
  Total Execution Time: 0.1872 seconds (0.1830 wall clock)

   ---User Time---   --User+System--   ---Wall Time---  --- Name ---
   0.1092 ( 58.3%)   0.1092 ( 58.3%)   0.1030 ( 56.3%)  Medium Values: Parsing
   0.0780 ( 41.7%)   0.0780 ( 41.7%)   0.0800 ( 43.7%)  Medium Values: Loop
   0.1872 (100.0%)   0.1872 (100.0%)   0.1830 (100.0%)  Total

===-------------------------------------------------------------------------===
                            JSON parser benchmark
===-------------------------------------------------------------------------===
  Total Execution Time: 0.1716 seconds (0.1620 wall clock)

   ---User Time---   --User+System--   ---Wall Time---  --- Name ---
   0.0780 ( 45.5%)   0.0780 ( 45.5%)   0.0810 ( 50.0%)  Large Values: Parsing
   0.0936 ( 54.5%)   0.0936 ( 54.5%)   0.0810 ( 50.0%)  Large Values: Loop
   0.1716 (100.0%)   0.1716 (100.0%)   0.1620 (100.0%)  Total

- Michael Spencer -------------- next part -------------- A non-text attachment was scrubbed...
Name: yaml-parser.patch Type: application/octet-stream Size: 150894 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/48c6e717/attachment-0001.obj From dpatel at apple.com Mon Jan 30 14:02:43 2012 From: dpatel at apple.com (Devang Patel) Date: Mon, 30 Jan 2012 20:02:43 -0000 Subject: [llvm-commits] [llvm] r149270 - in /llvm/trunk: lib/Target/X86/AsmParser/X86AsmParser.cpp test/MC/X86/intel-syntax-2.s Message-ID: <20120130200243.D8B6F2A6C12C@llvm.org> Author: dpatel Date: Mon Jan 30 14:02:42 2012 New Revision: 149270 URL: http://llvm.org/viewvc/llvm-project?rev=149270&view=rev Log: Intel syntax. Support .intel_syntax directive. Added: llvm/trunk/test/MC/X86/intel-syntax-2.s Modified: llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp Modified: llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp?rev=149270&r1=149269&r2=149270&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp (original) +++ llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp Mon Jan 30 14:02:42 2012 @@ -34,7 +34,7 @@ class X86AsmParser : public MCTargetAsmParser { MCSubtargetInfo &STI; MCAsmParser &Parser; - + bool IntelSyntax; private: MCAsmParser &getParser() const { return Parser; } @@ -94,7 +94,7 @@ public: X86AsmParser(MCSubtargetInfo &sti, MCAsmParser &parser) - : MCTargetAsmParser(), STI(sti), Parser(parser) { + : MCTargetAsmParser(), STI(sti), Parser(parser), IntelSyntax(false) { // Initialize the set of available features. 
setAvailableFeatures(ComputeAvailableFeatures(STI.getFeatureBits())); @@ -105,6 +105,10 @@ SmallVectorImpl &Operands); virtual bool ParseDirective(AsmToken DirectiveID); + + bool isParsingIntelSyntax() { + return IntelSyntax || getParser().getAssemblerDialect(); + } }; } // end anonymous namespace @@ -470,8 +474,7 @@ bool X86AsmParser::ParseRegister(unsigned &RegNo, SMLoc &StartLoc, SMLoc &EndLoc) { RegNo = 0; - bool IntelSyntax = getParser().getAssemblerDialect(); - if (!IntelSyntax) { + if (!isParsingIntelSyntax()) { const AsmToken &TokPercent = Parser.getTok(); assert(TokPercent.is(AsmToken::Percent) && "Invalid token kind!"); StartLoc = TokPercent.getLoc(); @@ -480,7 +483,7 @@ const AsmToken &Tok = Parser.getTok(); if (Tok.isNot(AsmToken::Identifier)) { - if (IntelSyntax) return true; + if (isParsingIntelSyntax()) return true; return Error(StartLoc, "invalid register name", SMRange(StartLoc, Tok.getEndLoc())); } @@ -564,7 +567,7 @@ } if (RegNo == 0) { - if (IntelSyntax) return true; + if (isParsingIntelSyntax()) return true; return Error(StartLoc, "invalid register name", SMRange(StartLoc, Tok.getEndLoc())); } @@ -575,7 +578,7 @@ } X86Operand *X86AsmParser::ParseOperand() { - if (getParser().getAssemblerDialect()) + if (isParsingIntelSyntax()) return ParseIntelOperand(); return ParseATTOperand(); } @@ -1170,7 +1173,7 @@ Name.startswith("rcl") || Name.startswith("rcr") || Name.startswith("rol") || Name.startswith("ror")) && Operands.size() == 3) { - if (getParser().getAssemblerDialect()) { + if (isParsingIntelSyntax()) { // Intel syntax X86Operand *Op1 = static_cast(Operands[2]); if (Op1->isImm() && isa(Op1->getImm()) && @@ -1485,8 +1488,8 @@ MCInst Inst; // First, try a direct match. 
- switch (MatchInstructionImpl(Operands, Inst, OrigErrorInfo, - getParser().getAssemblerDialect())) { + switch (MatchInstructionImpl(Operands, Inst, OrigErrorInfo, + isParsingIntelSyntax())) { default: break; case Match_Success: // Some instructions need post-processing to, for example, tweak which @@ -1640,6 +1643,17 @@ return ParseDirectiveWord(2, DirectiveID.getLoc()); else if (IDVal.startswith(".code")) return ParseDirectiveCode(IDVal, DirectiveID.getLoc()); + else if (IDVal.startswith(".intel_syntax")) { + IntelSyntax = true; + if (getLexer().isNot(AsmToken::EndOfStatement)) { + if(Parser.getTok().getString() == "noprefix") { + // FIXME : Handle noprefix + Parser.Lex(); + } else + return true; + } + return false; + } return true; } Added: llvm/trunk/test/MC/X86/intel-syntax-2.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/X86/intel-syntax-2.s?rev=149270&view=auto ============================================================================== --- llvm/trunk/test/MC/X86/intel-syntax-2.s (added) +++ llvm/trunk/test/MC/X86/intel-syntax-2.s Mon Jan 30 14:02:42 2012 @@ -0,0 +1,7 @@ +// RUN: llvm-mc -triple x86_64-unknown-unknown %s | FileCheck %s + + .intel_syntax +_test: +// CHECK: movl $257, -4(%rsp) + mov DWORD PTR [RSP - 4], 257 + From greened at obbligato.org Mon Jan 30 14:47:04 2012 From: greened at obbligato.org (David Greene) Date: Mon, 30 Jan 2012 20:47:04 -0000 Subject: [llvm-commits] [llvm] r149273 - /llvm/trunk/lib/TableGen/Record.cpp Message-ID: <20120130204704.B71532A6C12C@llvm.org> Author: greened Date: Mon Jan 30 14:47:04 2012 New Revision: 149273 URL: http://llvm.org/viewvc/llvm-project?rev=149273&view=rev Log: Implement String Cast from Integer Allow casts from integer to string. 
Modified: llvm/trunk/lib/TableGen/Record.cpp Modified: llvm/trunk/lib/TableGen/Record.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/TableGen/Record.cpp?rev=149273&r1=149272&r2=149273&view=diff ============================================================================== --- llvm/trunk/lib/TableGen/Record.cpp (original) +++ llvm/trunk/lib/TableGen/Record.cpp Mon Jan 30 14:47:04 2012 @@ -738,6 +738,11 @@ if (LHSd) { return StringInit::get(LHSd->getDef()->getName()); } + + IntInit *LHSi = dynamic_cast(LHS); + if (LHSi) { + return StringInit::get(LHSi->getAsString()); + } } else { StringInit *LHSs = dynamic_cast(LHS); if (LHSs) { From kcc at google.com Mon Jan 30 14:55:02 2012 From: kcc at google.com (Kostya Serebryany) Date: Mon, 30 Jan 2012 20:55:02 -0000 Subject: [llvm-commits] [compiler-rt] r149274 - in /compiler-rt/trunk/lib/asan: asan_allocator.cc asan_internal.h Message-ID: <20120130205502.B0B502A6C12C@llvm.org> Author: kcc Date: Mon Jan 30 14:55:02 2012 New Revision: 149274 URL: http://llvm.org/viewvc/llvm-project?rev=149274&view=rev Log: [asan] minor ifdef cleanup Modified: compiler-rt/trunk/lib/asan/asan_allocator.cc compiler-rt/trunk/lib/asan/asan_internal.h Modified: compiler-rt/trunk/lib/asan/asan_allocator.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_allocator.cc?rev=149274&r1=149273&r2=149274&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_allocator.cc (original) +++ compiler-rt/trunk/lib/asan/asan_allocator.cc Mon Jan 30 14:55:02 2012 @@ -35,6 +35,10 @@ #include "asan_thread.h" #include "asan_thread_registry.h" +#ifdef _WIN32 +#include +#endif + namespace __asan { #define REDZONE FLAG_redzone @@ -59,10 +63,6 @@ return (a & (alignment - 1)) == 0; } -#ifdef _WIN32 -#include -#endif - static inline size_t Log2(size_t x) { CHECK(IsPowerOfTwo(x)); #if defined(_WIN64) Modified: compiler-rt/trunk/lib/asan/asan_internal.h URL: 
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_internal.h?rev=149274&r1=149273&r2=149274&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_internal.h (original) +++ compiler-rt/trunk/lib/asan/asan_internal.h Mon Jan 30 14:55:02 2012 @@ -20,9 +20,7 @@ #include // for size_t, uintptr_t, etc. -#if !defined(_WIN32) -#include // for __WORDSIZE -#else +#if defined(_WIN32) // There's no in Visual Studio 9, so we have to define [u]int*_t. typedef unsigned __int8 uint8_t; typedef unsigned __int16 uint16_t; @@ -32,16 +30,8 @@ typedef __int16 int16_t; typedef __int32 int32_t; typedef __int64 int64_t; - -// Visual Studio does not define ssize_t. -#ifdef _WIN64 -typedef int64_t ssize_t; -#define __WORDSIZE 64 #else -typedef int32_t ssize_t; -#define __WORDSIZE 32 -#endif - +# include // for __WORDSIZE #endif // _WIN32 // If __WORDSIZE was undefined by the platform, define it in terms of the From nobled at dreamwidth.org Mon Jan 30 15:01:56 2012 From: nobled at dreamwidth.org (nobled) Date: Mon, 30 Jan 2012 16:01:56 -0500 Subject: [llvm-commits] [PATCH] autoconf: add private config.h to clang Message-ID: This already exists in the CMake build, which is part of what makes building clang separately from llvm via cmake possible. This cleans up that discrepancy between the build systems. I'll just add the minimal file include/clang/Config/config.h.in (with the same contents as its config.h.cmake counterpart), then add it to LLVM's configure script so it gets generated, then remove the #ifdef logic from clang that was conditionally including the cmake generated header. Okay to commit? -------------- next part -------------- A non-text attachment was scrubbed... 
Name: config.h.in Type: application/octet-stream Size: 523 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/3f6c914a/attachment.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: 0003-autoconf-add-clang-s-private-config-header.patch Type: text/x-patch Size: 1415 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/3f6c914a/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-include-clang-s-config.h-unconditionally.patch Type: text/x-patch Size: 3731 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/3f6c914a/attachment-0001.bin From nobled at dreamwidth.org Mon Jan 30 15:06:45 2012 From: nobled at dreamwidth.org (nobled) Date: Mon, 30 Jan 2012 16:06:45 -0500 Subject: [llvm-commits] [PATCH] cmake: make BUILD_SHARED_LIBS a visible option Message-ID: Before, this could only be specified on the commandline via -DBUILD_SHARED_LIBS=ON, and wouldn't show up as an option when invoked via `cmake -i` at all. It also wasn't type-checked as a boolean variable with only ON/OFF as valid values. 
--- CMakeLists.txt | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/CMakeLists.txt b/CMakeLists.txt index 13e358a..1e7a4a3 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -94,6 +94,9 @@ else( MSVC ) CACHE STRING "Semicolon-separated list of targets to build, or \"all\".") endif( MSVC ) +option(BUILD_SHARED_LIBS + "Build all libraries as shared libraries instead of static" OFF) + option(LLVM_ENABLE_CBE_PRINTF_A "Set to ON if CBE is enabled for printf %a output" ON) if(LLVM_ENABLE_CBE_PRINTF_A) set(ENABLE_CBE_PRINTF_A 1) -- 1.7.4.1 From lenny at Colorado.EDU Mon Jan 30 15:11:49 2012 From: lenny at Colorado.EDU (Lenny Maiorani) Date: Mon, 30 Jan 2012 14:11:49 -0700 Subject: [llvm-commits] [cfe-commits] [PATCH][Review Request] EarlyCSE stack overflow - bugzilla 11794 In-Reply-To: <36F5E7D6-DA08-4A70-BF2E-788642787972@2pi.dk> References: <051207DF-F3D0-4E8B-A756-E6B193750483@colorado.edu> <184C997C-6E18-4BE8-A3F8-092102981B11@colorado.edu> <6663C373-1004-4FB2-9E39-F784587B4CDD@2pi.dk> <4AA7AFB1-51F3-48B4-B543-00141B738258@colorado.edu> <8D478BF2-5BEA-45AA-A525-44C2085C11CC@2pi.dk> <4A53CDC9-0BC2-48F4-BBF6-334D98F358DA@colorado.edu> <36F5E7D6-DA08-4A70-BF2E-788642787972@2pi.dk> Message-ID: On Jan 30, 2012, at 12:49 PM, Jakob Stoklund Olesen wrote: > > On Jan 30, 2012, at 11:37 AM, Lenny Maiorani wrote: > >> >> On Jan 30, 2012, at 12:05 PM, Jakob Stoklund Olesen wrote: >> >>> >>> On Jan 30, 2012, at 10:51 AM, Lenny Maiorani wrote: >>> >>>> I am getting back to this and it looks like I will need to store the 3 ScopedHashTable objects. I am considering doing this via an IntrusiveRefCntPtr since I will need to store them in the local stack along with the generation and node and each scope might have multiple instances, one instance per child node. 
Is it preferred to have ScopedHashTable directly inherit from RefCountedBase (making all instances of ScopedHashTable have RefCountedBase) or create a new class within EarlyCSE which inherits and use a naked ptr to the ScopedHashTable inside that class? >>> >>> Hi Lenny, >>> >>> I really don't think you need reference counting to do this. A simple tree traversal with a stack on the side will do. >>> >>> Make sure you know the fundamental difference between the tree traversal code you posted earlier and the algorithm used by DepthFirstIterator.h. You need the latter. >>> >>> /jakob >>> >> >> Hi Jakob, >> >> As I described in a previous message, the DepthFirstIterator does not do what I need. Since it abstracts the levels of the tree I do not see how it can used to know when to save CurrentGeneration and *Scope locals and when not to. Am I significantly misunderstanding what is going on here? > > Yes. You can't use DepthFirstIterator directly because it doesn't let you share its stack. But you can use its algorithm for tree traversal. > > It is not the same as the algorithm you posted earlier, and it would not require you to use reference counting. > > In your algorithm, you push all children onto the stack immediately. I can see how you would need reference counting for that. The DFI algorithm keeps a (node, child-iterator) pair on the stack. It only pushes one child at a time, and the stack represents exactly the path from the root to the current node. > > That means you can simply put the hash table scopes on the stack as well. > > /jakob > Ok, I understand the algorithm difference, but I don't know what to do to store the hash table scopes on the stack. They are not copy-constructable or assignable so they don't work with STL containers. I could use a shared pointer, but that is just reference counting again. 
-Lenny From mcrosier at apple.com Mon Jan 30 15:13:22 2012 From: mcrosier at apple.com (Chad Rosier) Date: Mon, 30 Jan 2012 21:13:22 -0000 Subject: [llvm-commits] [llvm] r149275 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <20120130211322.74D702A6C12C@llvm.org> Author: mcrosier Date: Mon Jan 30 15:13:22 2012 New Revision: 149275 URL: http://llvm.org/viewvc/llvm-project?rev=149275&view=rev Log: Typo. Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=149275&r1=149274&r2=149275&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Mon Jan 30 15:13:22 2012 @@ -977,7 +977,7 @@ return CoerceAvailableValueToLoadType(SrcVal, LoadTy, InsertPt, TD); } -/// GetStoreValueForLoad - This function is called when we have a +/// GetLoadValueForLoad - This function is called when we have a /// memdep query of a load that ends up being a clobbering load. This means /// that the load *may* provide bits used by the load but we can't be sure /// because the pointers don't mustalias. 
Check this case to see if there is From spop at codeaurora.org Mon Jan 30 15:19:40 2012 From: spop at codeaurora.org (Sebastian Pop) Date: Mon, 30 Jan 2012 15:19:40 -0600 Subject: [llvm-commits] [polly] r149266 - /polly/trunk/lib/ScheduleOptimizer.cpp In-Reply-To: <20120130193854.9B9262A6C12C@llvm.org> References: <20120130193854.9B9262A6C12C@llvm.org> Message-ID: On Mon, Jan 30, 2012 at 1:38 PM, Tobias Grosser wrote: > Author: grosser > Date: Mon Jan 30 13:38:54 2012 > New Revision: 149266 > > URL: http://llvm.org/viewvc/llvm-project?rev=149266&view=rev > Log: > Scheduling: Add option to disable schedule_maximise_band_depth > > maximise_band_depth does not seem to have any effect for now, but it may help to > increase the amount of tileable loops. We expose the flag to be able to analyze > its effects when looking into individual benchmarks. > > Modified: > polly/trunk/lib/ScheduleOptimizer.cpp > > Modified: polly/trunk/lib/ScheduleOptimizer.cpp > URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/ScheduleOptimizer.cpp?rev=149266&r1=149265&r2=149266&view=diff > ============================================================================== > --- polly/trunk/lib/ScheduleOptimizer.cpp (original) > +++ polly/trunk/lib/ScheduleOptimizer.cpp Mon Jan 30 13:38:54 2012 > @@ -61,6 +61,11 @@ > cl::desc("The fusion strategy to choose (min/max)"), > cl::Hidden, cl::init("max")); > > +static cl::opt > +MaxizeBandDepth("polly-opt-maximize-bands", Please replace Maxize with Maximize here and below. Otherwise the patch looks good to me.
Thanks, Sebastian -- Qualcomm Innovation Center, Inc is a member of Code Aurora Forum From kcc at google.com Mon Jan 30 15:35:00 2012 From: kcc at google.com (Kostya Serebryany) Date: Mon, 30 Jan 2012 21:35:00 -0000 Subject: [llvm-commits] [compiler-rt] r149278 - /compiler-rt/trunk/lib/asan/tests/asan_test.cc Message-ID: <20120130213500.37A652A6C12C@llvm.org> Author: kcc Date: Mon Jan 30 15:34:59 2012 New Revision: 149278 URL: http://llvm.org/viewvc/llvm-project?rev=149278&view=rev Log: [asan] add a test for __attribute__ no_address_safety_analysis Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=149278&r1=149277&r2=149278&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original) +++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Mon Jan 30 15:34:59 2012 @@ -1668,6 +1668,17 @@ *Ident(&a) = *Ident(&a); } + __attribute__((no_address_safety_analysis)) +static void NoAddressSafety() { + char *foo = new char[10]; + Ident(foo)[10] = 0; + delete [] foo; +} + +TEST(AddressSanitizer, AttributeNoAddressSafetyTest) { + Ident(NoAddressSafety)(); +} + // ------------------ demo tests; run each one-by-one ------------- // e.g. 
--gtest_filter=*DemoOOBLeftHigh --gtest_also_run_disabled_tests TEST(AddressSanitizer, DISABLED_DemoThreadedTest) { From kcc at google.com Mon Jan 30 16:11:04 2012 From: kcc at google.com (Kostya Serebryany) Date: Mon, 30 Jan 2012 22:11:04 -0000 Subject: [llvm-commits] [compiler-rt] r149281 - in /compiler-rt/trunk/lib/asan: asan_mac.cc asan_procmaps.h Message-ID: <20120130221104.9038E2A6C12C@llvm.org> Author: kcc Date: Mon Jan 30 16:11:04 2012 New Revision: 149281 URL: http://llvm.org/viewvc/llvm-project?rev=149281&view=rev Log: [asan] ifdef/include cleanup Modified: compiler-rt/trunk/lib/asan/asan_mac.cc compiler-rt/trunk/lib/asan/asan_procmaps.h Modified: compiler-rt/trunk/lib/asan/asan_mac.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_mac.cc?rev=149281&r1=149280&r2=149281&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_mac.cc (original) +++ compiler-rt/trunk/lib/asan/asan_mac.cc Mon Jan 30 16:11:04 2012 @@ -24,6 +24,7 @@ #include // for _NSGetEnviron #include +#include #include #include #include Modified: compiler-rt/trunk/lib/asan/asan_procmaps.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_procmaps.h?rev=149281&r1=149280&r2=149281&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_procmaps.h (original) +++ compiler-rt/trunk/lib/asan/asan_procmaps.h Mon Jan 30 16:11:04 2012 @@ -15,9 +15,6 @@ #define ASAN_PROCMAPS_H #include "asan_internal.h" -#if defined __APPLE__ -#include -#endif namespace __asan { @@ -33,11 +30,6 @@ char filename[], size_t filename_size); ~AsanProcMaps(); private: -#if defined __APPLE__ - template - bool NextSegmentLoad(uintptr_t *start, uintptr_t *end, uintptr_t *offset, - char filename[], size_t filename_size); -#endif // Default implementation of GetObjectNameAndOffset. 
// Quite slow, because it iterates through the whole process map for each // lookup. @@ -63,6 +55,9 @@ size_t proc_self_maps_buff_len_; char *current_; #elif defined __APPLE__ + template + bool NextSegmentLoad(uintptr_t *start, uintptr_t *end, uintptr_t *offset, + char filename[], size_t filename_size); int current_image_; uint32_t current_magic_; int current_load_cmd_count_; From stoklund at 2pi.dk Mon Jan 30 16:31:14 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Mon, 30 Jan 2012 14:31:14 -0800 Subject: [llvm-commits] [cfe-commits] [PATCH][Review Request] EarlyCSE stack overflow - bugzilla 11794 In-Reply-To: References: <051207DF-F3D0-4E8B-A756-E6B193750483@colorado.edu> <184C997C-6E18-4BE8-A3F8-092102981B11@colorado.edu> <6663C373-1004-4FB2-9E39-F784587B4CDD@2pi.dk> <4AA7AFB1-51F3-48B4-B543-00141B738258@colorado.edu> <8D478BF2-5BEA-45AA-A525-44C2085C11CC@2pi.dk> <4A53CDC9-0BC2-48F4-BBF6-334D98F358DA@colorado.edu> <36F5E7D6-DA08-4A70-BF2E-788642787972@2pi.dk> Message-ID: On Jan 30, 2012, at 1:11 PM, Lenny Maiorani wrote: > > Ok, I understand the algorithm difference, but I don't know what to do to store the hash table scopes on the stack. They are not copy-constructable or assignable so they don't work with STL containers. I could use a shared pointer, but that is just reference counting again. Oh, how annoying. I think you should just do what MachineCSE does and store pointers: void MachineCSE::EnterScope(MachineBasicBlock *MBB) { DEBUG(dbgs() << "Entering: " << MBB->getName() << '\n'); ScopeType *Scope = new ScopeType(VNT); ScopeMap[MBB] = Scope; } void MachineCSE::ExitScope(MachineBasicBlock *MBB) { DEBUG(dbgs() << "Exiting: " << MBB->getName() << '\n'); DenseMap::iterator SI = ScopeMap.find(MBB); assert(SI != ScopeMap.end()); ScopeMap.erase(SI); delete SI->second; } (But please don't dereference iterators after erasing them). 
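
For illustration only, here is a minimal sketch of the traversal discussed in this thread: a stack of (node, next-child) frames where each frame owns a heap-allocated scope, so the stack always holds exactly the path from the root to the current node and no reference counting is needed. Node, Scope, Frame and walk are hypothetical stand-ins, not LLVM API (Scope only mimics a non-copyable ScopedHashTable scope), and std::unique_ptr assumes a C++11 compiler; with raw pointers the frame would delete its scope explicitly when popped.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a dominator tree node.
struct Node {
  std::string Name;
  std::vector<Node *> Children;
};

// Hypothetical stand-in for a non-copyable RAII scope (such as a
// ScopedHashTable scope). It only records when it is entered and exited.
class Scope {
public:
  Scope(const std::string &N, std::vector<std::string> &L) : Name(N), Log(L) {
    Log.push_back("enter " + Name);
  }
  ~Scope() { Log.push_back("exit " + Name); }
  Scope(const Scope &) = delete;
  Scope &operator=(const Scope &) = delete;

private:
  std::string Name;
  std::vector<std::string> &Log;
};

// One frame per node on the path from the root to the current node.
struct Frame {
  Node *N;
  std::size_t NextChild;    // index of the next child still to be visited
  std::unique_ptr<Scope> S; // owned scope, destroyed when the frame is popped
};

// Iterative depth-first walk that pushes one child at a time.
std::vector<std::string> walk(Node *Root) {
  std::vector<std::string> Log;
  std::vector<Frame> Stack;
  Stack.push_back(
      Frame{Root, 0, std::unique_ptr<Scope>(new Scope(Root->Name, Log))});
  while (!Stack.empty()) {
    Frame &Top = Stack.back();
    if (Top.NextChild < Top.N->Children.size()) {
      // Descend into the next unvisited child. Top is not used after the
      // push_back, which may reallocate the stack.
      Node *Child = Top.N->Children[Top.NextChild++];
      Stack.push_back(
          Frame{Child, 0, std::unique_ptr<Scope>(new Scope(Child->Name, Log))});
    } else {
      // All children are done: popping the frame exits the node's scope.
      Stack.pop_back();
    }
  }
  return Log;
}
```

Because the frames are only moved, never copied, the scope type does not have to be copy-constructible or assignable, and each scope is released exactly when its node leaves the stack.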
/jakob From grosser at fim.uni-passau.de Mon Jan 30 16:43:56 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 22:43:56 -0000 Subject: [llvm-commits] [polly] r149287 - /polly/trunk/lib/ScheduleOptimizer.cpp Message-ID: <20120130224357.04A072A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 16:43:56 2012 New Revision: 149287 URL: http://llvm.org/viewvc/llvm-project?rev=149287&view=rev Log: Typo: Maxize -> Mazimize Found by Sebastian Pop. Modified: polly/trunk/lib/ScheduleOptimizer.cpp Modified: polly/trunk/lib/ScheduleOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/ScheduleOptimizer.cpp?rev=149287&r1=149286&r2=149287&view=diff ============================================================================== --- polly/trunk/lib/ScheduleOptimizer.cpp (original) +++ polly/trunk/lib/ScheduleOptimizer.cpp Mon Jan 30 16:43:56 2012 @@ -62,8 +62,8 @@ cl::Hidden, cl::init("min")); static cl::opt -MaxizeBandDepth("polly-opt-maximize-bands", - cl::desc("Maxize the band depth (yes/no)"), +MaximizeBandDepth("polly-opt-maximize-bands", + cl::desc("Maximize the band depth (yes/no)"), cl::Hidden, cl::init("yes")); namespace { @@ -483,9 +483,9 @@ int IslMaximizeBands; - if (MaxizeBandDepth == "yes") { + if (MaximizeBandDepth == "yes") { IslMaximizeBands = 1; - } else if (MaxizeBandDepth == "no") { + } else if (MaximizeBandDepth == "no") { IslMaximizeBands = 0; } else { errs() << "warning: Option -polly-opt-maximize-bands should either be 'yes'" From grosser at fim.uni-passau.de Mon Jan 30 16:44:05 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 22:44:05 -0000 Subject: [llvm-commits] [polly] r149288 - /polly/trunk/lib/ScheduleOptimizer.cpp Message-ID: <20120130224405.CE0542A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 16:44:05 2012 New Revision: 149288 URL: http://llvm.org/viewvc/llvm-project?rev=149288&view=rev Log: Scheduling: Limiting the constant term is not necessary any more Due to our gist 
simplifications, limiting the constant term does not seem to be necessary any more. Pointed out by Sven Verdoolaege Modified: polly/trunk/lib/ScheduleOptimizer.cpp Modified: polly/trunk/lib/ScheduleOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/ScheduleOptimizer.cpp?rev=149288&r1=149287&r2=149288&view=diff ============================================================================== --- polly/trunk/lib/ScheduleOptimizer.cpp (original) +++ polly/trunk/lib/ScheduleOptimizer.cpp Mon Jan 30 16:44:05 2012 @@ -494,7 +494,6 @@ } isl_options_set_schedule_fuse(S.getIslCtx(), IslFusionStrategy); - isl_options_set_schedule_max_constant_term(S.getIslCtx(), CONSTANT_BOUND); isl_options_set_schedule_maximize_band_depth(S.getIslCtx(), IslMaximizeBands); isl_options_set_on_error(S.getIslCtx(), ISL_ON_ERROR_CONTINUE); From mcrosier at apple.com Mon Jan 30 16:44:13 2012 From: mcrosier at apple.com (Chad Rosier) Date: Mon, 30 Jan 2012 22:44:13 -0000 Subject: [llvm-commits] [llvm] r149289 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <20120130224413.480042A6C12C@llvm.org> Author: mcrosier Date: Mon Jan 30 16:44:13 2012 New Revision: 149289 URL: http://llvm.org/viewvc/llvm-project?rev=149289&view=rev Log: Typo. Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=149289&r1=149288&r2=149289&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Mon Jan 30 16:44:13 2012 @@ -780,7 +780,7 @@ Value *WritePtr, uint64_t WriteSizeInBits, const TargetData &TD) { - // If the loaded or stored value is an first class array or struct, don't try + // If the loaded or stored value is a first class array or struct, don't try // to transform them. We need to be able to bitcast to integer. 
if (LoadTy->isStructTy() || LoadTy->isArrayTy()) return -1; From grosser at fim.uni-passau.de Mon Jan 30 16:46:22 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 22:46:22 -0000 Subject: [llvm-commits] [polly] r149290 - /polly/trunk/lib/ScheduleOptimizer.cpp Message-ID: <20120130224622.C571C2A6C12C@llvm.org> Author: grosser Date: Mon Jan 30 16:46:22 2012 New Revision: 149290 URL: http://llvm.org/viewvc/llvm-project?rev=149290&view=rev Log: Remove leftover constant Modified: polly/trunk/lib/ScheduleOptimizer.cpp Modified: polly/trunk/lib/ScheduleOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/ScheduleOptimizer.cpp?rev=149290&r1=149289&r2=149290&view=diff ============================================================================== --- polly/trunk/lib/ScheduleOptimizer.cpp (original) +++ polly/trunk/lib/ScheduleOptimizer.cpp Mon Jan 30 16:46:22 2012 @@ -38,8 +38,6 @@ #include "llvm/Support/Debug.h" #include "llvm/Support/CommandLine.h" -static const int CONSTANT_BOUND = 20; - using namespace llvm; using namespace polly; From grosser at fim.uni-passau.de Mon Jan 30 16:50:32 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Jan 2012 23:50:32 +0100 Subject: [llvm-commits] [polly] r149266 - /polly/trunk/lib/ScheduleOptimizer.cpp In-Reply-To: References: <20120130193854.9B9262A6C12C@llvm.org> Message-ID: <4F271EB8.1010502@fim.uni-passau.de> On 01/30/2012 10:19 PM, Sebastian Pop wrote: > On Mon, Jan 30, 2012 at 1:38 PM, Tobias Grosser > wrote: >> Author: grosser >> Date: Mon Jan 30 13:38:54 2012 >> New Revision: 149266 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=149266&view=rev >> Log: >> Scheduling: Add option to disable schedule_maximise_band_depth >> >> maximise_band_depth does not seem to have any effect for now, but it may help to >> increase the amount of tileable loops. 
We expose the flag to be able to analyze >> its effects when looking into individual benchmarks Thanks for the review. Fixed in: 149287 Tobi From dpatel at apple.com Mon Jan 30 16:47:12 2012 From: dpatel at apple.com (Devang Patel) Date: Mon, 30 Jan 2012 22:47:12 -0000 Subject: [llvm-commits] [llvm] r149291 - in /llvm/trunk: lib/Target/X86/AsmParser/X86AsmParser.cpp test/MC/X86/intel-syntax-encoding.s Message-ID: <20120130224712.937C52A6C12C@llvm.org> Author: dpatel Date: Mon Jan 30 16:47:12 2012 New Revision: 149291 URL: http://llvm.org/viewvc/llvm-project?rev=149291&view=rev Log: Intel syntax. Adjust special code, used to recognize cmp{ss,sd,ps,pd}, for intel syntax. Modified: llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp llvm/trunk/test/MC/X86/intel-syntax-encoding.s Modified: llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp?rev=149291&r1=149290&r2=149291&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp (original) +++ llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp Mon Jan 30 16:47:12 2012 @@ -981,10 +981,9 @@ Operands.push_back(X86Operand::CreateToken(PatchedName, NameLoc)); - if (ExtraImmOp) + if (ExtraImmOp && !isParsingIntelSyntax()) Operands.push_back(X86Operand::CreateImm(ExtraImmOp, NameLoc, NameLoc)); - // Determine whether this is an instruction prefix. bool isPrefix = Name == "lock" || Name == "rep" || @@ -1038,6 +1037,9 @@ else if (isPrefix && getLexer().is(AsmToken::Slash)) Parser.Lex(); // Consume the prefix separator Slash + if (ExtraImmOp && isParsingIntelSyntax()) + Operands.push_back(X86Operand::CreateImm(ExtraImmOp, NameLoc, NameLoc)); + // This is a terrible hack to handle "out[bwl]? %al, (%dx)" -> // "outb %al, %dx". 
Out doesn't take a memory form, but this is a widely // documented form in various unofficial manuals, so a lot of code uses it. Modified: llvm/trunk/test/MC/X86/intel-syntax-encoding.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/X86/intel-syntax-encoding.s?rev=149291&r1=149290&r2=149291&view=diff ============================================================================== --- llvm/trunk/test/MC/X86/intel-syntax-encoding.s (original) +++ llvm/trunk/test/MC/X86/intel-syntax-encoding.s Mon Jan 30 16:47:12 2012 @@ -39,3 +39,6 @@ // CHECK: encoding: [0xd1,0xe7] shl EDI, 1 + +// CHECK: encoding: [0x0f,0xc2,0xd1,0x01] + cmpltps XMM2, XMM1 From preston.gurd at intel.com Mon Jan 30 16:55:19 2012 From: preston.gurd at intel.com (Gurd, Preston) Date: Mon, 30 Jan 2012 22:55:19 +0000 Subject: [llvm-commits] FW: [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom Message-ID: Ping... From: Gurd, Preston Sent: Monday, January 23, 2012 6:06 PM To: Evan Cheng Cc: llvm-commits at cs.uiuc.edu Subject: [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom Revision 2: Tests which were failing, when run on an Atom, due to the tests finding a schedule different from what was expected, have been changed to use "-mcpu=generic" in order to prevent the Atom scheduler from running, so that all "make check" tests pass. From: Gurd, Preston Sent: Tuesday, January 17, 2012 4:29 PM To: Evan Cheng Cc: llvm-commits at cs.uiuc.edu Subject: [llvm-commits] [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom The attached patch implements most of an instruction scheduler for the Intel Atom. It adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. It sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. 
The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. It adds a test to verify that the scheduler is working. I realize that this patch is kind of large, but please consider that the vast majority of the changes consist only of adding an instruction itinerary class name to an instruction. Revision: the patch also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP. Please commit the patch if it seems acceptable. Preston From: Evan Cheng [mailto:evan.cheng at apple.com] Sent: Monday, January 16, 2012 12:01 PM To: Gurd, Preston Cc: llvm-commits at cs.uiuc.edu Subject: Re: [llvm-commits] [llvm][PATCH][Review request] X86 Instruction scheduler for the Intel Atom Very nice. One question, I noticed you haven't changed the scheduling preference so x86_64 is still using ILP scheduler while i386 is using register pressure reduction scheduler. Have you tried changing the preference to latency scheduler for Atom? Evan On Jan 13, 2012, at 3:26 PM, Gurd, Preston wrote: The attached patch implements most of an instruction scheduler for the Intel Atom. It adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. It sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. It adds a test to verify that the scheduler is working. I realize that this patch is kind of large, but please consider that the vast majority of the changes consist only of adding an instruction itinerary class name to an instruction. -- Preston Gurd > Intel Waterloo -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/56265423/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... Name: llvm-x86-scheduler.diff Type: application/octet-stream Size: 186012 bytes Desc: llvm-x86-scheduler.diff Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/56265423/attachment-0001.obj From gohman at apple.com Mon Jan 30 17:05:42 2012 From: gohman at apple.com (Dan Gohman) Date: Mon, 30 Jan 2012 23:05:42 -0000 Subject: [llvm-commits] [llvm] r149293 - /llvm/trunk/docs/AliasAnalysis.html Message-ID: <20120130230542.278C62A6C12C@llvm.org> Author: djg Date: Mon Jan 30 17:05:41 2012 New Revision: 149293 URL: http://llvm.org/viewvc/llvm-project?rev=149293&view=rev Log: basic-aa does support AliasAnalysis chaining now. Modified: llvm/trunk/docs/AliasAnalysis.html Modified: llvm/trunk/docs/AliasAnalysis.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/AliasAnalysis.html?rev=149293&r1=149292&r2=149293&view=diff ============================================================================== --- llvm/trunk/docs/AliasAnalysis.html (original) +++ llvm/trunk/docs/AliasAnalysis.html Mon Jan 30 17:05:41 2012 @@ -418,9 +418,8 @@
-With only two special exceptions (the basicaa and no-aa
-passes) every alias analysis pass chains to another alias analysis
+With only one special exception (the no-aa
+pass) every alias analysis pass chains to another alias analysis
 implementation (for example, the user can specify "-basicaa -ds-aa -licm"
 to get the maximum benefit from both alias analyses). The alias analysis
 class automatically takes care of most of this

From evan.cheng at apple.com Mon Jan 30 17:10:33 2012
From: evan.cheng at apple.com (Evan Cheng)
Date: Mon, 30 Jan 2012 23:10:33 -0000
Subject: [llvm-commits] [llvm] r149294 - in /llvm/trunk/lib/Target/X86: MCTargetDesc/X86MCTargetDesc.cpp X86Subtarget.cpp
Message-ID: <20120130231033.30CC82A6C12C@llvm.org>

Author: evancheng
Date: Mon Jan 30 17:10:32 2012
New Revision: 149294

URL: http://llvm.org/viewvc/llvm-project?rev=149294&view=rev
Log:
PR11834: Use macros which are defined on Windows. Patch by Marina Yatsina.

Modified:
    llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
    llvm/trunk/lib/Target/X86/X86Subtarget.cpp

Modified: llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp?rev=149294&r1=149293&r2=149294&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp (original)
+++ llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp Mon Jan 30 17:10:32 2012
@@ -331,7 +331,8 @@
   std::string CPUName = CPU;
   if (CPUName.empty()) {
-#if defined (__x86_64__) || defined(__i386__)
+#if defined(i386) || defined(__i386__) || defined(__x86__) || defined(_M_IX86)\
+    || defined(__x86_64__) || defined(_M_AMD64) || defined (_M_X64)
     CPUName = sys::getHostCPUName();
 #else
     CPUName = "generic";

Modified: llvm/trunk/lib/Target/X86/X86Subtarget.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.cpp?rev=149294&r1=149293&r2=149294&view=diff
==============================================================================
---
llvm/trunk/lib/Target/X86/X86Subtarget.cpp (original) +++ llvm/trunk/lib/Target/X86/X86Subtarget.cpp Mon Jan 30 17:10:32 2012 @@ -342,7 +342,8 @@ if (!FS.empty() || !CPU.empty()) { std::string CPUName = CPU; if (CPUName.empty()) { -#if defined (__x86_64__) || defined(__i386__) +#if defined(i386) || defined(__i386__) || defined(__x86__) || defined(_M_IX86)\ + || defined(__x86_64__) || defined(_M_AMD64) || defined (_M_X64) CPUName = sys::getHostCPUName(); #else CPUName = "generic"; From evan.cheng at apple.com Mon Jan 30 17:15:11 2012 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 30 Jan 2012 15:15:11 -0800 Subject: [llvm-commits] [llvm] r134741 - in /llvm/trunk/lib/Target/X86: MCTargetDesc/X86MCTargetDesc.cpp X86Subtarget.cpp In-Reply-To: References: <20110708211415.2D93E2A6C12C@llvm.org> <7DE70FDACDE4CD4887C4278C12A2E30506DE39@HASMSX104.ger.corp.intel.com> Message-ID: <6CFE2B56-62D3-4F9F-AA5D-DE824AFB940A@apple.com> I've committed the patch. Please close the bug. Thanks. Evan On Jan 29, 2012, at 11:19 PM, Yatsina, Marina wrote: > Hi, > I did not get a confirmation that my previous mail got distributed to the llvm-commits ML. > I wanted to know the status of my patch. > > Thank you, > Marina. > > -----Original Message----- > From: Yatsina, Marina > Sent: Wednesday, January 25, 2012 13:57 > To: 'llvm-commits at cs.uiuc.edu' > Subject: RE: [llvm-commits] [llvm] r134741 - in /llvm/trunk/lib/Target/X86: MCTargetDesc/X86MCTargetDesc.cpp X86Subtarget.cpp > > Hi, > > I have found a bug introduced by commit 134741. > The commit added use of macros that are not defined on Windows and they are causing X86Subtarget to choose "generic" as the CPUName. > > I've opened Bug #11834 on the problem: > http://www.llvm.org/bugs/show_bug.cgi?id=11834 > > > I've also attached a fix to this mail and to the bug opened in bugzilla. > > Thank you, > Marina. 
> > > > > -----Original Message----- > From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Evan Cheng > Sent: Saturday, July 09, 2011 00:14 > To: llvm-commits at cs.uiuc.edu > Subject: [llvm-commits] [llvm] r134741 - in /llvm/trunk/lib/Target/X86: MCTargetDesc/X86MCTargetDesc.cpp X86Subtarget.cpp > > Author: evancheng > Date: Fri Jul 8 16:14:14 2011 > New Revision: 134741 > > URL: http://llvm.org/viewvc/llvm-project?rev=134741&view=rev > Log: > For non-x86 host, used generic as CPU name. > > Modified: > llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp > llvm/trunk/lib/Target/X86/X86Subtarget.cpp > > Modified: llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp?rev=134741&r1=134740&r2=134741&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp (original) > +++ llvm/trunk/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp Fri Jul 8 16:14:14 2011 > @@ -140,8 +140,13 @@ > } > > std::string CPUName = CPU; > - if (CPUName.empty()) > + if (CPUName.empty()) { > +#if defined (__x86_64__) || defined(__i386__) > CPUName = sys::getHostCPUName(); > +#else > + CPUName = "generic"; > +#endif > + } > > if (ArchFS.empty() && CPUName.empty() && hasX86_64()) > // Auto-detect if host is 64-bit capable, it's the default if true. 
> > Modified: llvm/trunk/lib/Target/X86/X86Subtarget.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.cpp?rev=134741&r1=134740&r2=134741&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86Subtarget.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86Subtarget.cpp Fri Jul 8 16:14:14 2011 > @@ -258,12 +258,17 @@ > ArchFS = FS; > } > > - std::string CPUName = CPU; > - if (CPUName.empty()) > - CPUName = sys::getHostCPUName(); > - > // Determine default and user specified characteristics > - if (!CPUName.empty() || !ArchFS.empty()) { > + if (!ArchFS.empty()) { > + std::string CPUName = CPU; > + if (CPUName.empty()) { > +#if defined (__x86_64__) || defined(__i386__) > + CPUName = sys::getHostCPUName(); > +#else > + CPUName = "generic"; > +#endif > + } > + > // If feature string is not empty, parse features string. > ParseSubtargetFeatures(CPUName, ArchFS); > // All X86-64 CPUs also have SSE2, however user might request no SSE via > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > --------------------------------------------------------------------- > Intel Israel (74) Limited > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies. 
> > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From kcc at google.com Mon Jan 30 17:23:26 2012 From: kcc at google.com (Kostya Serebryany) Date: Mon, 30 Jan 2012 23:23:26 -0000 Subject: [llvm-commits] [compiler-rt] r149296 - in /compiler-rt/trunk/lib/asan/tests: asan_mac_test.mm asan_test.cc Message-ID: <20120130232326.B4A2E2A6C12C@llvm.org> Author: kcc Date: Mon Jan 30 17:23:26 2012 New Revision: 149296 URL: http://llvm.org/viewvc/llvm-project?rev=149296&view=rev Log: [asan] fix issue 35: don't let the optimizer to optimize the test code away. Modified: compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm compiler-rt/trunk/lib/asan/tests/asan_test.cc Modified: compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm?rev=149296&r1=149295&r2=149296&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm (original) +++ compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm Mon Jan 30 17:23:26 2012 @@ -32,6 +32,10 @@ CFAllocatorDeallocate(kCFAllocatorMallocZone, mem); } +__attribute__((noinline)) +void access_memory(char *a) { + *a = 0; +} // Test the +load instrumentation. // Because the +load methods are invoked before anything else is initialized, @@ -51,7 +55,8 @@ +(void) load { for (int i = 0; i < strlen(kStartupStr); i++) { - volatile char ch = kStartupStr[i]; // make sure no optimizations occur. + // TODO: this is currently broken, see Issue 33. + // access_memory(&kStartupStr[i]); // make sure no optimizations occur. } // Don't print anything here not to interfere with the death tests. 
} @@ -66,7 +71,7 @@ void worker_do_crash(int size) { char * volatile mem = malloc(size); - mem[size] = 0; // BOOM + access_memory(&mem[size]); // BOOM free(mem); } @@ -162,7 +167,7 @@ dispatch_source_set_timer(timer, milestone, DISPATCH_TIME_FOREVER, 0); char * volatile mem = malloc(10); dispatch_source_set_event_handler(timer, ^{ - mem[10] = 1; + access_memory(&mem[10]); }); dispatch_resume(timer); sleep(2); @@ -186,7 +191,7 @@ dispatch_source_cancel(timer); }); dispatch_source_set_cancel_handler(timer, ^{ - mem[10] = 1; + access_memory(&mem[10]); }); dispatch_resume(timer); sleep(2); @@ -197,7 +202,7 @@ dispatch_group_t group = dispatch_group_create(); char * volatile mem = malloc(10); dispatch_group_async(group, queue, ^{ - mem[10] = 1; + access_memory(&mem[10]); }); dispatch_group_wait(group, DISPATCH_TIME_FOREVER); } Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=149296&r1=149295&r2=149296&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original) +++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Mon Jan 30 17:23:26 2012 @@ -1668,7 +1668,7 @@ *Ident(&a) = *Ident(&a); } - __attribute__((no_address_safety_analysis)) +__attribute__((no_address_safety_analysis)) static void NoAddressSafety() { char *foo = new char[10]; Ident(foo)[10] = 0; From stoklund at 2pi.dk Mon Jan 30 17:28:03 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Mon, 30 Jan 2012 15:28:03 -0800 Subject: [llvm-commits] Tuning LLVM Greedy Register Allocator to optimize for code size when targeting ARM Thumb 2 instruction set In-Reply-To: <000c01ccdbdf$45cacb90$d16062b0$@org> References: <000001ccda35$239f0e10$6add2a30$@org> <0E37D7B5-BBF8-4D29-9679-5C4D22B32AEB@2pi.dk> <000c01ccda5b$76b208c0$64161a40$@org> <38F583CE-3516-421A-84C2-46978621E648@apple.com> <901B7A01-6E81-4807-A78F-2922C100117D@2pi.dk> 
<001c01ccdacd$befdd9c0$3cf98d40$@org> <226102FF-900A-4896-870C-F46C00B82BD4@2pi.dk> <000c01ccdbdf$45cacb90$d16062b0$@org> Message-ID: <8837C2BB-DD5B-408A-9978-026181DB0E61@2pi.dk> On Jan 25, 2012, at 8:02 PM, Zino Benaissa wrote: >> Does this negative bias mean that VirtReg.bytes is 0 for most virtual > registers? How > many get VirtReg.bytes > 0? > > Of course it varies and depends on the compiled code. I have seen that > VirtReg.bytes =0 (or close) while this VirtReg occurs frequently in the > function (Which in this case their weight is high). At the same I have seen > Candidate that have a very high VirtReg.byte and (they were getting a > costPerUse register before this heuristic). Some EEMBC benchmarks shrank > with > 5% and are example of this. Alright, so I would expect that the majority of GPR VirtRegs have bytes > 0. >> As I am reading your changes to the eviction policy, you are completely >> replacing spill weights with a code size metric for live ranges with >> Virteg.bytes > 0. Is that the intention? > It depends why the eviction is invoked. Currently there are three reasons > for invoking eviction: enabling coalescing, preventing spill/split, > preventing a costPerUse register. Note all these evections where already put > in place before my heuristic. > 1) Both for coalescing or for preventing split/spill: VirtReg.bytes=0 and > the heuristic is ignored and only the pair is considered. > Whatever were put in place is still managing these type of evictions. > 2) This heuristic is ON only when a candidate gets a register that has a > CostPerUse. In this case, When the RA attempts to trade it for a register > with no cost, Now with this heuristic it has a metric to evaluate whether > there is a trade worth evicting for. Here is the problem: Whenever you do a 'luxury' eviction because you got a physreg with a CostPerUse, you could be evicting virtregs with very high spill weight. These are the 'used in a hot loop' virtregs you were talking about. 
Whenever VirtReg.bytes > 0, you are effectively replacing the spill weights with code size metrics. That is very heavily biased towards optimizing for code size, and I think it is too aggressive. Live range splitting is going to save you some of the time. It still uses speed metrics, but the overall behavior of the greedy algorithm becomes very erratic. Spill weights are used in two different ways when evicting: 1. The shouldEvict() policy function prevents a VirtReg from evicting something with a higher spill weight. (But you are overriding it!) 2. The tryEvict() function selects the eviction candidate that would cause the lowest maximum spill weight to be evicted. I don't think it is safe to override the shouldEvict() policy. You can get away with changing the candidate selection in 2., though. Here is what I suggest you do: - Don't override shouldEvict(). That policy should always stay in place. - Use code size metrics to select among multiple eviction candidates when evicting from 'cheap' physregs. - Don't evict from two physregs in selectOrSplit() and then only use one of them. You may be able to use code size metrics for selecting the best eviction candidate, but don't evict two different physregs needlessly. You should also make sure that the patch works for x86-64. There is a similar code size penalty to using r8-r15 and xmm8-15. /jakob From chandlerc at gmail.com Mon Jan 30 17:47:44 2012 From: chandlerc at gmail.com (Chandler Carruth) Date: Mon, 30 Jan 2012 23:47:44 -0000 Subject: [llvm-commits] [llvm] r149299 - in /llvm/trunk: lib/CodeGen/AsmPrinter/AsmPrinter.cpp test/CodeGen/X86/fold-pcmpeqd-2.ll Message-ID: <20120130234744.AE8EB2A6C12C@llvm.org> Author: chandlerc Date: Mon Jan 30 17:47:44 2012 New Revision: 149299 URL: http://llvm.org/viewvc/llvm-project?rev=149299&view=rev Log: Chris's constant data sequence refactoring actually enabled printing vectors of all one bits to be printed more cleverly in the AsmPrinter. 
Unfortunately, the byte value for all one bits is the same with -fsigned-char as the error return of '-1'. Force this to be the unsigned byte value when returning it to avoid this problem, and update the test case for the shiny new behavior. Yay for building LLVM and Clang with -funsigned-char. Chris, please review, and let me know if there is any reason to not desire this change. It seems good on the surface, and certainly intended based on the code written. Modified: llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-2.ll Modified: llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp?rev=149299&r1=149298&r2=149299&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Mon Jan 30 17:47:44 2012 @@ -1558,7 +1558,7 @@ char C = Data[0]; for (unsigned i = 1, e = Data.size(); i != e; ++i) if (Data[i] != C) return -1; - return C; + return static_cast(C); // Ensure 255 is not returned as -1. } Modified: llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-2.ll?rev=149299&r1=149298&r2=149299&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-2.ll (original) +++ llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-2.ll Mon Jan 30 17:47:44 2012 @@ -8,10 +8,7 @@ ; RAGreedy defeats the test by splitting live ranges. ; Constant pool all-ones vector: -; CHECK: .long 4294967295 -; CHECK-NEXT: .long 4294967295 -; CHECK-NEXT: .long 4294967295 -; CHECK-NEXT: .long 4294967295 +; CHECK: .space 16,255 ; No pcmpeqd instructions, everybody uses the constant pool. 
; CHECK: program_1: From kcc at google.com Mon Jan 30 17:50:10 2012 From: kcc at google.com (Kostya Serebryany) Date: Mon, 30 Jan 2012 23:50:10 -0000 Subject: [llvm-commits] [llvm] r149300 - /llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Message-ID: <20120130235010.9BEA62A6C12C@llvm.org> Author: kcc Date: Mon Jan 30 17:50:10 2012 New Revision: 149300 URL: http://llvm.org/viewvc/llvm-project?rev=149300&view=rev Log: [asan] fix the ObjC support (asan Issue #33) Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp?rev=149300&r1=149299&r2=149300&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Mon Jan 30 17:50:10 2012 @@ -155,6 +155,7 @@ Instruction *InsertBefore, bool IsWrite); Value *memToShadow(Value *Shadow, IRBuilder<> &IRB); bool handleFunction(Module &M, Function &F); + bool maybeInsertAsanInitAtFunctionEntry(Function &F); bool poisonStackInFunction(Module &M, Function &F); virtual bool runOnModule(Module &M); bool insertGlobalRedzones(Module &M); @@ -617,9 +618,29 @@ return Res; } +bool AddressSanitizer::maybeInsertAsanInitAtFunctionEntry(Function &F) { + // For each NSObject descendant having a +load method, this method is invoked + // by the ObjC runtime before any of the static constructors is called. + // Therefore we need to instrument such methods with a call to __asan_init + // at the beginning in order to initialize our runtime before any access to + // the shadow memory. + // We cannot just ignore these methods, because they may call other + // instrumented functions. 
+ if (F.getName().find(" load]") != std::string::npos) { + IRBuilder<> IRB(F.begin()->begin()); + IRB.CreateCall(AsanInitFunction); + return true; + } + return false; +} + bool AddressSanitizer::handleFunction(Module &M, Function &F) { if (BL->isIn(F)) return false; if (&F == AsanCtorFunction) return false; + + // If needed, insert __asan_init before checking for AddressSafety attr. + maybeInsertAsanInitAtFunctionEntry(F); + if (!F.hasFnAttr(Attribute::AddressSafety)) return false; if (!ClDebugFunc.empty() && ClDebugFunc != F.getName()) @@ -673,19 +694,6 @@ DEBUG(dbgs() << F); bool ChangedStack = poisonStackInFunction(M, F); - - // For each NSObject descendant having a +load method, this method is invoked - // by the ObjC runtime before any of the static constructors is called. - // Therefore we need to instrument such methods with a call to __asan_init - // at the beginning in order to initialize our runtime before any access to - // the shadow memory. - // We cannot just ignore these methods, because they may call other - // instrumented functions. - if (F.getName().find(" load]") != std::string::npos) { - IRBuilder<> IRB(F.begin()->begin()); - IRB.CreateCall(AsanInitFunction); - } - return NumInstrumented > 0 || ChangedStack; } From atrick at apple.com Mon Jan 30 16:56:55 2012 From: atrick at apple.com (Andrew Trick) Date: Mon, 30 Jan 2012 14:56:55 -0800 Subject: [llvm-commits] [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom In-Reply-To: References: Message-ID: <4E567858-9C06-4F34-83D4-D281335B35B9@apple.com> Thanks. I'm reviewing your patch today. -Andy On Jan 30, 2012, at 2:55 PM, "Gurd, Preston" wrote: > Ping? 
>
> From: Gurd, Preston
> Sent: Monday, January 23, 2012 6:06 PM
> To: Evan Cheng
> Cc: llvm-commits at cs.uiuc.edu
> Subject: [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom
>
> Revision 2: Tests which were failing, when run on an Atom, due to the tests finding a schedule different from what was expected, have been changed to use "-mcpu=generic" in order to prevent the Atom scheduler from running, so that all "make check" tests pass.
>
> From: Gurd, Preston
> Sent: Tuesday, January 17, 2012 4:29 PM
> To: Evan Cheng
> Cc: llvm-commits at cs.uiuc.edu
> Subject: [llvm-commits] [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom
>
> The attached patch implements most of an instruction scheduler for the Intel Atom.
>
> It adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
>
> It sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
>
> It adds a test to verify that the scheduler is working.
>
> I realize that this patch is kind of large, but please consider that the vast majority of the changes consist only of adding an instruction itinerary class name to an instruction.
>
> Revision: the patch also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
>
> Please commit the patch if it seems acceptable.
>
> Preston
>
>
> From: Evan Cheng [mailto:evan.cheng at apple.com]
> Sent: Monday, January 16, 2012 12:01 PM
> To: Gurd, Preston
> Cc: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] [llvm][PATCH][Review request] X86 Instruction scheduler for the Intel Atom
>
> Very nice.
One question, I noticed you haven't changed the scheduling preference so x86_64 is still using ILP scheduler while i386 is using register pressure reduction scheduler. Have you tried changing the preference to latency scheduler for Atom? > > Evan > > On Jan 13, 2012, at 3:26 PM, Gurd, Preston wrote: > > > The attached patch implements most of an instruction scheduler for the Intel Atom. > > It adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. > > It sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. > > It adds a test to verify that the scheduler is working. > > I realize that this patch is kind of large, but please consider that the vast majority of the changes consist only of adding an instruction itinerary class name to an instruction. > > -- > Preston Gurd > Intel Waterloo > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/0a0f8fa9/attachment.html From kcc at google.com Mon Jan 30 17:55:46 2012 From: kcc at google.com (Kostya Serebryany) Date: Mon, 30 Jan 2012 23:55:46 -0000 Subject: [llvm-commits] [compiler-rt] r149302 - /compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm Message-ID: <20120130235546.91D112A6C12C@llvm.org> Author: kcc Date: Mon Jan 30 17:55:46 2012 New Revision: 149302 URL: http://llvm.org/viewvc/llvm-project?rev=149302&view=rev Log: [asan] re-enable the test for ObjC initialization bug Modified: compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm Modified: compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm?rev=149302&r1=149301&r2=149302&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm (original) +++ compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm Mon Jan 30 17:55:46 2012 @@ -55,8 +55,7 @@ +(void) load { for (int i = 0; i < strlen(kStartupStr); i++) { - // TODO: this is currently broken, see Issue 33. - // access_memory(&kStartupStr[i]); // make sure no optimizations occur. + access_memory(&kStartupStr[i]); // make sure no optimizations occur. } // Don't print anything here not to interfere with the death tests. 
} From clattner at apple.com Mon Jan 30 18:06:44 2012 From: clattner at apple.com (Chris Lattner) Date: Mon, 30 Jan 2012 16:06:44 -0800 Subject: [llvm-commits] [llvm] r149299 - in /llvm/trunk: lib/CodeGen/AsmPrinter/AsmPrinter.cpp test/CodeGen/X86/fold-pcmpeqd-2.ll In-Reply-To: <20120130234744.AE8EB2A6C12C@llvm.org> References: <20120130234744.AE8EB2A6C12C@llvm.org> Message-ID: On Jan 30, 2012, at 3:47 PM, Chandler Carruth wrote: > Author: chandlerc > Date: Mon Jan 30 17:47:44 2012 > New Revision: 149299 > > URL: http://llvm.org/viewvc/llvm-project?rev=149299&view=rev > Log: > Chris's constant data sequence refactoring actually enabled printing > vectors of all one bits to be printed more cleverly in the AsmPrinter. > Unfortunately, the byte value for all one bits is the same with > -fsigned-char as the error return of '-1'. Force this to be the unsigned > byte value when returning it to avoid this problem, and update the test > case for the shiny new behavior. > > Yay for building LLVM and Clang with -funsigned-char. > > Chris, please review, and let me know if there is any reason to not > desire this change. It seems good on the surface, and certainly intended > based on the code written. Looks right to me, thanks Chandler! I thought the buildbot failure was due to one of Craig's recent patches, I appreciate you tracking it down. 
-Chris > > Modified: > llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp > llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-2.ll > > Modified: llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp?rev=149299&r1=149298&r2=149299&view=diff > ============================================================================== > --- llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (original) > +++ llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Mon Jan 30 17:47:44 2012 > @@ -1558,7 +1558,7 @@ > char C = Data[0]; > for (unsigned i = 1, e = Data.size(); i != e; ++i) > if (Data[i] != C) return -1; > - return C; > + return static_cast<uint8_t>(C); // Ensure 255 is not returned as -1. > } > > > > Modified: llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-2.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-2.ll?rev=149299&r1=149298&r2=149299&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-2.ll (original) > +++ llvm/trunk/test/CodeGen/X86/fold-pcmpeqd-2.ll Mon Jan 30 17:47:44 2012 > @@ -8,10 +8,7 @@ > ; RAGreedy defeats the test by splitting live ranges. > > ; Constant pool all-ones vector: > -; CHECK: .long 4294967295 > -; CHECK-NEXT: .long 4294967295 > -; CHECK-NEXT: .long 4294967295 > -; CHECK-NEXT: .long 4294967295 > +; CHECK: .space 16,255 > > ; No pcmpeqd instructions, everybody uses the constant pool.
> ; CHECK: program_1: > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From stoklund at 2pi.dk Mon Jan 30 18:17:55 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Mon, 30 Jan 2012 16:17:55 -0800 Subject: [llvm-commits] [llvm] r149259 - /llvm/trunk/lib/CodeGen/LiveIntervalAnalysis.cpp In-Reply-To: <20120130192620.419772A6C12C@llvm.org> References: <20120130192620.419772A6C12C@llvm.org> Message-ID: <47514F18-4F57-4428-BCC9-A3D5AB1E3D32@2pi.dk> On Jan 30, 2012, at 11:26 AM, Matt Beaumont-Gay wrote: > Author: matthewbg > Date: Mon Jan 30 13:26:20 2012 > New Revision: 149259 > > URL: http://llvm.org/viewvc/llvm-project?rev=149259&view=rev > Log: > Here's a new one: GCC was complaining about an only-used-in-asserts > *function*. Wrap the function in #ifndef NDEBUG. Clang was warning too. Thanks! /jakob From isanbard at gmail.com Mon Jan 30 18:26:24 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 00:26:24 -0000 Subject: [llvm-commits] [llvm] r149303 - in /llvm/trunk: include/llvm/BasicBlock.h lib/VMCore/BasicBlock.cpp Message-ID: <20120131002624.BD4B52A6C12C@llvm.org> Author: void Date: Mon Jan 30 18:26:24 2012 New Revision: 149303 URL: http://llvm.org/viewvc/llvm-project?rev=149303&view=rev Log: Add a constified getLandingPad() method. Modified: llvm/trunk/include/llvm/BasicBlock.h llvm/trunk/lib/VMCore/BasicBlock.cpp Modified: llvm/trunk/include/llvm/BasicBlock.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/BasicBlock.h?rev=149303&r1=149302&r2=149303&view=diff ============================================================================== --- llvm/trunk/include/llvm/BasicBlock.h (original) +++ llvm/trunk/include/llvm/BasicBlock.h Mon Jan 30 18:26:24 2012 @@ -268,6 +268,7 @@ /// getLandingPadInst() - Return the landingpad instruction associated with /// the landing pad. 
LandingPadInst *getLandingPadInst(); + const LandingPadInst *getLandingPadInst() const; private: /// AdjustBlockAddressRefCount - BasicBlock stores the number of BlockAddress Modified: llvm/trunk/lib/VMCore/BasicBlock.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/BasicBlock.cpp?rev=149303&r1=149302&r2=149303&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/BasicBlock.cpp (original) +++ llvm/trunk/lib/VMCore/BasicBlock.cpp Mon Jan 30 18:26:24 2012 @@ -366,3 +366,6 @@ LandingPadInst *BasicBlock::getLandingPadInst() { return dyn_cast<LandingPadInst>(getFirstNonPHI()); } +const LandingPadInst *BasicBlock::getLandingPadInst() const { + return dyn_cast<LandingPadInst>(getFirstNonPHI()); +} From kcc at google.com Mon Jan 30 18:52:18 2012 From: kcc at google.com (Kostya Serebryany) Date: Tue, 31 Jan 2012 00:52:18 -0000 Subject: [llvm-commits] [compiler-rt] r149306 - in /compiler-rt/trunk/lib/asan: asan_internal.h asan_posix.cc asan_rtl.cc tests/test_output.sh Message-ID: <20120131005218.8A3122A6C12C@llvm.org> Author: kcc Date: Mon Jan 30 18:52:18 2012 New Revision: 149306 URL: http://llvm.org/viewvc/llvm-project?rev=149306&view=rev Log: [asan] new run-time flag: sleep_before_dying (asan Issue #31) Modified: compiler-rt/trunk/lib/asan/asan_internal.h compiler-rt/trunk/lib/asan/asan_posix.cc compiler-rt/trunk/lib/asan/asan_rtl.cc compiler-rt/trunk/lib/asan/tests/test_output.sh Modified: compiler-rt/trunk/lib/asan/asan_internal.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_internal.h?rev=149306&r1=149305&r2=149306&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_internal.h (original) +++ compiler-rt/trunk/lib/asan/asan_internal.h Mon Jan 30 18:52:18 2012 @@ -166,6 +166,7 @@ extern size_t FLAG_max_malloc_fill_size; extern int FLAG_exitcode; extern bool FLAG_allow_user_poisoning; +extern int FLAG_sleep_before_dying; 
extern bool FLAG_handle_segv; extern int asan_inited; Modified: compiler-rt/trunk/lib/asan/asan_posix.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_posix.cc?rev=149306&r1=149305&r2=149306&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_posix.cc (original) +++ compiler-rt/trunk/lib/asan/asan_posix.cc Mon Jan 30 18:52:18 2012 @@ -70,6 +70,10 @@ } void AsanDie() { + if (FLAG_sleep_before_dying) { + Report("Sleeping for %d second(s)\n", FLAG_sleep_before_dying); + sleep(FLAG_sleep_before_dying); + } _exit(FLAG_exitcode); } Modified: compiler-rt/trunk/lib/asan/asan_rtl.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_rtl.cc?rev=149306&r1=149305&r2=149306&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_rtl.cc (original) +++ compiler-rt/trunk/lib/asan/asan_rtl.cc Mon Jan 30 18:52:18 2012 @@ -47,6 +47,7 @@ bool FLAG_use_fake_stack; int FLAG_exitcode = EXIT_FAILURE; bool FLAG_allow_user_poisoning; +int FLAG_sleep_before_dying; // -------------------------- Globals --------------------- {{{1 int asan_inited; @@ -411,6 +412,7 @@ FLAG_exitcode = IntFlagValue(options, "exitcode=", EXIT_FAILURE); FLAG_allow_user_poisoning = IntFlagValue(options, "allow_user_poisoning=", 1); + FLAG_sleep_before_dying = IntFlagValue(options, "sleep_before_dying=", 0); if (FLAG_atexit) { atexit(asan_atexit); Modified: compiler-rt/trunk/lib/asan/tests/test_output.sh URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/test_output.sh?rev=149306&r1=149305&r2=149306&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/test_output.sh (original) +++ compiler-rt/trunk/lib/asan/tests/test_output.sh Mon Jan 30 18:52:18 2012 @@ -19,6 +19,11 @@ ./a.out 2>&1 | grep "heap-use-after-free" > /dev/null rm ./a.out 
+echo "Testing sleep_before_dying" +$CC -g -faddress-sanitizer -O2 $C_TEST.c +ASAN_OPTIONS=sleep_before_dying=1 ./a.out 2>&1 | grep "Sleeping for 1 second" > /dev/null +rm a.out + for t in *.tmpl; do for b in 32 64; do for O in 0 1 2 3; do From isanbard at gmail.com Mon Jan 30 18:56:53 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 00:56:53 -0000 Subject: [llvm-commits] [llvm] r149307 - /llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Message-ID: <20120131005653.848EF2A6C12C@llvm.org> Author: void Date: Mon Jan 30 18:56:53 2012 New Revision: 149307 URL: http://llvm.org/viewvc/llvm-project?rev=149307&view=rev Log: Remove no-longer-useful dyn_casts and pals. Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp?rev=149307&r1=149306&r2=149307&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Mon Jan 30 18:56:53 2012 @@ -276,10 +276,7 @@ UnwindDestPHIValues.push_back(PHI->getIncomingValueForBlock(InvokeBB)); } - // FIXME: With the new EH, this if/dyn_cast should be a 'cast'. - if (LandingPadInst *LPI = dyn_cast<LandingPadInst>(I)) { - CallerLPad = LPI; - } + CallerLPad = cast<LandingPadInst>(I); } /// The outer unwind destination is the target of unwind edges @@ -507,13 +504,12 @@ for (BasicBlock::iterator BBI = BB->begin(), E = BB->end(); BBI != E; ) { Instruction *I = BBI++; - if (LPI) // FIXME: New EH - This won't be NULL in the new EH. 
- if (LandingPadInst *L = dyn_cast<LandingPadInst>(I)) { - unsigned NumClauses = LPI->getNumClauses(); - L->reserveClauses(NumClauses); - for (unsigned i = 0; i != NumClauses; ++i) - L->addClause(LPI->getClause(i)); - } + if (LandingPadInst *L = dyn_cast<LandingPadInst>(I)) { + unsigned NumClauses = LPI->getNumClauses(); + L->reserveClauses(NumClauses); + for (unsigned i = 0; i != NumClauses; ++i) + L->addClause(LPI->getClause(i)); + } // We only need to check for function calls: inlined invoke // instructions require no special handling. @@ -930,11 +926,8 @@ I != E; ++I) if (const InvokeInst *II = dyn_cast<InvokeInst>(I->getTerminator())) { const BasicBlock *BB = II->getUnwindDest(); - // FIXME: This 'if/dyn_cast' here should become a normal 'cast' once - // the new EH system is in place. - if (const LandingPadInst *LP = - dyn_cast<LandingPadInst>(BB->getFirstNonPHI())) - CalleePersonality = LP->getPersonalityFn(); + const LandingPadInst *LP = BB->getLandingPadInst(); + CalleePersonality = LP->getPersonalityFn(); break; } @@ -946,11 +939,7 @@ I != E; ++I) if (const InvokeInst *II = dyn_cast<InvokeInst>(I->getTerminator())) { const BasicBlock *BB = II->getUnwindDest(); - // FIXME: This 'isa' here should become go away once the new EH system - // is in place. - if (!isa<LandingPadInst>(BB->getFirstNonPHI())) - continue; - const LandingPadInst *LP = cast<LandingPadInst>(BB->getFirstNonPHI()); + const LandingPadInst *LP = BB->getLandingPadInst(); // If the personality functions match, then we can perform the // inlining. Otherwise, we can't inline. From kremenek at apple.com Mon Jan 30 18:57:05 2012 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 31 Jan 2012 00:57:05 -0000 Subject: [llvm-commits] [llvm] r149308 - /llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h Message-ID: <20120131005705.2D0732A6C12C@llvm.org> Author: kremenek Date: Mon Jan 30 18:57:04 2012 New Revision: 149308 URL: http://llvm.org/viewvc/llvm-project?rev=149308&view=rev Log: Use traits for IntrusiveRefCntPtr to determine how to increment/decrement a reference count. 
Modified: llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h Modified: llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h?rev=149308&r1=149307&r2=149308&view=diff ============================================================================== --- llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h (original) +++ llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h Mon Jan 30 18:57:04 2012 @@ -83,6 +83,12 @@ friend class IntrusiveRefCntPtr; }; + + template <typename T> struct IntrusiveRefCntPtrInfo { + static void retain(T *obj) { obj->Retain(); } + static void release(T *obj) { obj->Release(); } + }; + //===----------------------------------------------------------------------===// /// IntrusiveRefCntPtr - A template class that implements a "smart pointer" /// that assumes the wrapped object has a reference count associated @@ -168,8 +174,8 @@ } private: - void retain() { if (Obj) Obj->Retain(); } - void release() { if (Obj) Obj->Release(); } + void retain() { if (Obj) IntrusiveRefCntPtrInfo<T>::retain(Obj); } + void release() { if (Obj) IntrusiveRefCntPtrInfo<T>::release(Obj); } void replace(T* S) { this_type(S).swap(*this); From kremenek at apple.com Mon Jan 30 18:57:08 2012 From: kremenek at apple.com (Ted Kremenek) Date: Tue, 31 Jan 2012 00:57:08 -0000 Subject: [llvm-commits] [llvm] r149309 - /llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h Message-ID: <20120131005708.BA9A32A6C12C@llvm.org> Author: kremenek Date: Mon Jan 30 18:57:08 2012 New Revision: 149309 URL: http://llvm.org/viewvc/llvm-project?rev=149309&view=rev Log: Relax constructor for IntrusiveRefCntPtr to not be explicit. 
Modified: llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h Modified: llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h?rev=149309&r1=149308&r2=149309&view=diff ============================================================================== --- llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h (original) +++ llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h Mon Jan 30 18:57:08 2012 @@ -115,7 +115,7 @@ explicit IntrusiveRefCntPtr() : Obj(0) {} - explicit IntrusiveRefCntPtr(T* obj) : Obj(obj) { + IntrusiveRefCntPtr(T* obj) : Obj(obj) { retain(); } From isanbard at gmail.com Mon Jan 30 19:01:16 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 01:01:16 -0000 Subject: [llvm-commits] [llvm] r149312 - /llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Message-ID: <20120131010116.DD7AA2A6C12C@llvm.org> Author: void Date: Mon Jan 30 19:01:16 2012 New Revision: 149312 URL: http://llvm.org/viewvc/llvm-project?rev=149312&view=rev Log: Formatting cleanups. No functionality change. Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp?rev=149312&r1=149311&r2=149312&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Mon Jan 30 19:01:16 2012 @@ -570,10 +570,9 @@ // Otherwise, create the new invoke instruction. 
ImmutableCallSite CS(CI); SmallVector InvokeArgs(CS.arg_begin(), CS.arg_end()); - InvokeInst *II = - InvokeInst::Create(CI->getCalledValue(), Split, - Invoke.getOuterUnwindDest(), - InvokeArgs, CI->getName(), BB); + InvokeInst *II = InvokeInst::Create(CI->getCalledValue(), Split, + Invoke.getOuterUnwindDest(), + InvokeArgs, CI->getName(), BB); II->setCallingConv(CI->getCallingConv()); II->setAttributes(CI->getAttributes()); @@ -581,17 +580,17 @@ // updates the CallGraph if present, because it uses a WeakVH. CI->replaceAllUsesWith(II); - Split->getInstList().pop_front(); // Delete the original call + // Delete the original call + Split->getInstList().pop_front(); - // Update any PHI nodes in the exceptional block to indicate that - // there is now a new entry in them. + // Update any PHI nodes in the exceptional block to indicate that there is + // now a new entry in them. Invoke.addIncomingPHIValuesFor(BB); return false; } return false; } - /// HandleInlinedInvoke - If we inlined an invoke site, we need to convert calls /// in the body of the inlined function into invokes and turn unwind @@ -874,15 +873,15 @@ } } -// InlineFunction - This function inlines the called function into the basic -// block of the caller. This returns false if it is not possible to inline this -// call. The program is still in a well defined state if this occurs though. -// -// Note that this only does one level of inlining. For example, if the -// instruction 'call B' is inlined, and 'B' calls 'C', then the call to 'C' now -// exists in the instruction stream. Similarly this will inline a recursive -// function by one level. -// +/// InlineFunction - This function inlines the called function into the basic +/// block of the caller. This returns false if it is not possible to inline +/// this call. The program is still in a well defined state if this occurs +/// though. +/// +/// Note that this only does one level of inlining. 
For example, if the +/// instruction 'call B' is inlined, and 'B' calls 'C', then the call to 'C' now +/// exists in the instruction stream. Similarly this will inline a recursive +/// function by one level. bool llvm::InlineFunction(CallSite CS, InlineFunctionInfo &IFI) { Instruction *TheCall = CS.getInstruction(); LLVMContext &Context = TheCall->getContext(); @@ -934,7 +933,7 @@ // Find the personality function used by the landing pads of the caller. If it // exists, then check to see that it matches the personality function used in // the callee. - if (CalleePersonality) + if (CalleePersonality) { for (Function::const_iterator I = Caller->begin(), E = Caller->end(); I != E; ++I) if (const InvokeInst *II = dyn_cast(I->getTerminator())) { @@ -950,6 +949,7 @@ break; } + } // Get an iterator to the last basic block in the function, which will have // the new function inlined after it. @@ -1016,7 +1016,6 @@ // block for the callee, move them to the entry block of the caller. First // calculate which instruction they should be inserted before. We insert the // instructions at the end of the current alloca list. - // { BasicBlock::iterator InsertPoint = Caller->begin()->begin(); for (BasicBlock::iterator I = FirstNewBlock->begin(), @@ -1301,11 +1300,12 @@ // If we inserted a phi node, check to see if it has a single value (e.g. all // the entries are the same or undef). If so, remove the PHI so it doesn't // block other optimizations. 
- if (PHI) + if (PHI) { if (Value *V = SimplifyInstruction(PHI, IFI.TD)) { PHI->replaceAllUsesWith(V); PHI->eraseFromParent(); } + } return true; } From isanbard at gmail.com Mon Jan 30 19:05:20 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 01:05:20 -0000 Subject: [llvm-commits] [llvm] r149314 - /llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Message-ID: <20120131010520.718FC2A6C12C@llvm.org> Author: void Date: Mon Jan 30 19:05:20 2012 New Revision: 149314 URL: http://llvm.org/viewvc/llvm-project?rev=149314&view=rev Log: Get rid of references to dead intrinsics. The eh.selector and eh.resume intrinsics aren't used anymore. Get rid of some calls to them. Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp?rev=149314&r1=149313&r2=149314&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Mon Jan 30 19:05:20 2012 @@ -482,13 +482,6 @@ RI->eraseFromParent(); } -/// [LIBUNWIND] Check whether this selector is "only cleanups": -/// call i32 @llvm.eh.selector(blah, blah, i32 0) -static bool isCleanupOnlySelector(EHSelectorInst *selector) { - if (selector->getNumArgOperands() != 3) return false; - ConstantInt *val = dyn_cast(selector->getArgOperand(2)); - return (val && val->isZero()); -} /// HandleCallsInBlockInlinedThroughInvoke - When we inline a basic block into /// an invoke, we have to turn all of the calls that can throw into @@ -514,59 +507,18 @@ // We only need to check for function calls: inlined invoke // instructions require no special handling. CallInst *CI = dyn_cast(I); - if (CI == 0) continue; - // LIBUNWIND: merge selector instructions. 
- if (EHSelectorInst *Inner = dyn_cast(CI)) { - EHSelectorInst *Outer = Invoke.getOuterSelector(); - if (!Outer) continue; - - bool innerIsOnlyCleanup = isCleanupOnlySelector(Inner); - bool outerIsOnlyCleanup = isCleanupOnlySelector(Outer); - - // If both selectors contain only cleanups, we don't need to do - // anything. TODO: this is really just a very specific instance - // of a much more general optimization. - if (innerIsOnlyCleanup && outerIsOnlyCleanup) continue; - - // Otherwise, we just append the outer selector to the inner selector. - SmallVector NewSelector; - for (unsigned i = 0, e = Inner->getNumArgOperands(); i != e; ++i) - NewSelector.push_back(Inner->getArgOperand(i)); - for (unsigned i = 2, e = Outer->getNumArgOperands(); i != e; ++i) - NewSelector.push_back(Outer->getArgOperand(i)); - - CallInst *NewInner = - IRBuilder<>(Inner).CreateCall(Inner->getCalledValue(), NewSelector); - // No need to copy attributes, calling convention, etc. - NewInner->takeName(Inner); - Inner->replaceAllUsesWith(NewInner); - Inner->eraseFromParent(); - continue; - } - // If this call cannot unwind, don't convert it to an invoke. - if (CI->doesNotThrow()) + if (!CI || CI->doesNotThrow()) continue; - - // Convert this function call into an invoke instruction. - // First, split the basic block. + + // Convert this function call into an invoke instruction. First, split the + // basic block. BasicBlock *Split = BB->splitBasicBlock(CI, CI->getName()+".noexc"); // Delete the unconditional branch inserted by splitBasicBlock BB->getInstList().pop_back(); - // LIBUNWIND: If this is a call to @llvm.eh.resume, just branch - // directly to the new landing pad. - if (Invoke.forwardEHResume(CI, BB)) { - // TODO: 'Split' is now unreachable; clean it up. - - // We want to leave the original call intact so that the call - // graph and other structures won't get misled. We also have to - // avoid processing the next block, or we'll iterate here forever. 
- return true; - } - // Otherwise, create the new invoke instruction. ImmutableCallSite CS(CI); SmallVector InvokeArgs(CS.arg_begin(), CS.arg_end()); From eli.friedman at gmail.com Mon Jan 30 19:08:03 2012 From: eli.friedman at gmail.com (Eli Friedman) Date: Tue, 31 Jan 2012 01:08:03 -0000 Subject: [llvm-commits] [llvm] r149315 - /llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp Message-ID: <20120131010803.3ED7A2A6C12C@llvm.org> Author: efriedma Date: Mon Jan 30 19:08:03 2012 New Revision: 149315 URL: http://llvm.org/viewvc/llvm-project?rev=149315&view=rev Log: Use the correct ShiftAmtTy for creating shifts after legalization. PR11881. Not committing a testcase because I think it will be too fragile. Modified: llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp?rev=149315&r1=149314&r2=149315&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp Mon Jan 30 19:08:03 2012 @@ -1612,15 +1612,17 @@ APInt MsbMask = APInt::getHighBitsSet(BitWidth, 1); // If we only care about the highest bit, don't bother shifting right. - if (MsbMask == DemandedMask) { + if (MsbMask == DemandedMask) { unsigned ShAmt = ExVT.getScalarType().getSizeInBits(); SDValue InOp = Op.getOperand(0); - // In this code we may handle vector types. We can't use the - // getShiftAmountTy API because it only works on scalars. - // We use the shift value type because we know that its an integer - // with enough bits. - SDValue ShiftAmt = TLO.DAG.getConstant(BitWidth - ShAmt, - Op.getValueType()); + + // Compute the correct shift amount type, which must be getShiftAmountTy + // for scalar types after legalization. 
+ EVT ShiftAmtTy = Op.getValueType(); + if (TLO.LegalTypes() && !ShiftAmtTy.isVector()) + ShiftAmtTy = getShiftAmountTy(ShiftAmtTy); + + SDValue ShiftAmt = TLO.DAG.getConstant(BitWidth - ShAmt, ShiftAmtTy); return TLO.CombineTo(Op, TLO.DAG.getNode(ISD::SHL, dl, Op.getValueType(), InOp, ShiftAmt)); } From isanbard at gmail.com Mon Jan 30 19:14:49 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 01:14:49 -0000 Subject: [llvm-commits] [llvm] r149316 - /llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Message-ID: <20120131011449.D424A2A6C12C@llvm.org> Author: void Date: Mon Jan 30 19:14:49 2012 New Revision: 149316 URL: http://llvm.org/viewvc/llvm-project?rev=149316&view=rev Log: Remove some unused, old-EH methods. Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp?rev=149316&r1=149315&r2=149316&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Mon Jan 30 19:14:49 2012 @@ -291,15 +291,11 @@ return OuterSelector; } - BasicBlock *getInnerUnwindDest(); - // FIXME: New EH - Rename when new EH is turned on. BasicBlock *getInnerUnwindDestNewEH(); LandingPadInst *getLandingPadInst() const { return CallerLPad; } - bool forwardEHResume(CallInst *call, BasicBlock *src); - /// forwardResume - Forward the 'resume' instruction to the caller's landing /// pad block. When the landing pad block has only one predecessor, this is /// a simple branch. When there is more than one predecessor, we need to @@ -324,113 +320,8 @@ }; } -/// [LIBUNWIND] Get or create a target for the branch out of rewritten calls to -/// llvm.eh.resume. 
-BasicBlock *InvokeInliningInfo::getInnerUnwindDest() { - if (InnerUnwindDest) return InnerUnwindDest; - - // Find and hoist the llvm.eh.exception and llvm.eh.selector calls - // in the outer landing pad to immediately following the phis. - EHSelectorInst *selector = getOuterSelector(); - if (!selector) return 0; - - // The call to llvm.eh.exception *must* be in the landing pad. - Instruction *exn = cast(selector->getArgOperand(0)); - assert(exn->getParent() == OuterUnwindDest); - - // TODO: recognize when we've already done this, so that we don't - // get a linear number of these when inlining calls into lots of - // invokes with the same landing pad. - - // Do the hoisting. - Instruction *splitPoint = exn->getParent()->getFirstNonPHI(); - assert(splitPoint != selector && "selector-on-exception dominance broken!"); - if (splitPoint == exn) { - selector->removeFromParent(); - selector->insertAfter(exn); - splitPoint = selector->getNextNode(); - } else { - exn->moveBefore(splitPoint); - selector->moveBefore(splitPoint); - } - - // Split the landing pad. - InnerUnwindDest = OuterUnwindDest->splitBasicBlock(splitPoint, - OuterUnwindDest->getName() + ".body"); - - // The number of incoming edges we expect to the inner landing pad. - const unsigned phiCapacity = 2; - - // Create corresponding new phis for all the phis in the outer landing pad. - BasicBlock::iterator insertPoint = InnerUnwindDest->begin(); - BasicBlock::iterator I = OuterUnwindDest->begin(); - for (unsigned i = 0, e = UnwindDestPHIValues.size(); i != e; ++i, ++I) { - PHINode *outerPhi = cast(I); - PHINode *innerPhi = PHINode::Create(outerPhi->getType(), phiCapacity, - outerPhi->getName() + ".lpad-body", - insertPoint); - outerPhi->replaceAllUsesWith(innerPhi); - innerPhi->addIncoming(outerPhi, OuterUnwindDest); - } - - // Create a phi for the exception value... 
- InnerExceptionPHI = PHINode::Create(exn->getType(), phiCapacity, - "exn.lpad-body", insertPoint); - exn->replaceAllUsesWith(InnerExceptionPHI); - selector->setArgOperand(0, exn); // restore this use - InnerExceptionPHI->addIncoming(exn, OuterUnwindDest); - - // ...and the selector. - InnerSelectorPHI = PHINode::Create(selector->getType(), phiCapacity, - "selector.lpad-body", insertPoint); - selector->replaceAllUsesWith(InnerSelectorPHI); - InnerSelectorPHI->addIncoming(selector, OuterUnwindDest); - - // All done. - return InnerUnwindDest; -} - -/// [LIBUNWIND] Try to forward the given call, which logically occurs -/// at the end of the given block, as a branch to the inner unwind -/// block. Returns true if the call was forwarded. -bool InvokeInliningInfo::forwardEHResume(CallInst *call, BasicBlock *src) { - // First, check whether this is a call to the intrinsic. - Function *fn = dyn_cast(call->getCalledValue()); - if (!fn || fn->getName() != "llvm.eh.resume") - return false; - - // At this point, we need to return true on all paths, because - // otherwise we'll construct an invoke of the intrinsic, which is - // not well-formed. - - // Try to find or make an inner unwind dest, which will fail if we - // can't find a selector call for the outer unwind dest. - BasicBlock *dest = getInnerUnwindDest(); - bool hasSelector = (dest != 0); - - // If we failed, just use the outer unwind dest, dropping the - // exception and selector on the floor. - if (!hasSelector) - dest = OuterUnwindDest; - - // Make a branch. - BranchInst::Create(dest, src); - - // Update the phis in the destination. They were inserted in an - // order which makes this work. - addIncomingPHIValuesForInto(src, dest); - - if (hasSelector) { - InnerExceptionPHI->addIncoming(call->getArgOperand(0), src); - InnerSelectorPHI->addIncoming(call->getArgOperand(1), src); - } - - return true; -} - -/// Get or create a target for the branch from ResumeInsts. 
+/// getInnerUnwindDest - Get or create a target for the branch from ResumeInsts. BasicBlock *InvokeInliningInfo::getInnerUnwindDestNewEH() { - // FIXME: New EH - rename this function when new EH is turned on. if (InnerResumeDest) return InnerResumeDest; // Split the landing pad. @@ -482,7 +373,6 @@ RI->eraseFromParent(); } - /// HandleCallsInBlockInlinedThroughInvoke - When we inline a basic block into /// an invoke, we have to turn all of the calls that can throw into /// invokes. This function analyze BB to see if there are any calls, and if so, @@ -519,7 +409,7 @@ // Delete the unconditional branch inserted by splitBasicBlock BB->getInstList().pop_back(); - // Otherwise, create the new invoke instruction. + // Create the new invoke instruction. ImmutableCallSite CS(CI); SmallVector InvokeArgs(CS.arg_begin(), CS.arg_end()); InvokeInst *II = InvokeInst::Create(CI->getCalledValue(), Split, @@ -596,9 +486,8 @@ Invoke.addIncomingPHIValuesFor(BB); } - if (ResumeInst *RI = dyn_cast(BB->getTerminator())) { + if (ResumeInst *RI = dyn_cast(BB->getTerminator())) Invoke.forwardResume(RI); - } } // Now that everything is happy, we have one final detail. The PHI nodes in @@ -799,7 +688,6 @@ InlinedAtDL.getAsMDNode(Ctx)); } - /// fixupLineNumbers - Update inlined instructions' line numbers to /// to encode location where these instructions are inlined. static void fixupLineNumbers(Function *Fn, Function::iterator FI, @@ -905,7 +793,6 @@ // Get an iterator to the last basic block in the function, which will have // the new function inlined after it. 
- // Function::iterator LastBlock = &Caller->back(); // Make sure to capture all of the return instructions from the cloned From isanbard at gmail.com Mon Jan 30 19:15:59 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 01:15:59 -0000 Subject: [llvm-commits] [llvm] r149317 - /llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Message-ID: <20120131011559.A23B12A6C12C@llvm.org> Author: void Date: Mon Jan 30 19:15:59 2012 New Revision: 149317 URL: http://llvm.org/viewvc/llvm-project?rev=149317&view=rev Log: s/getInnerUnwindDestNewEH/getInnerUnwindDest/g Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp?rev=149317&r1=149316&r2=149317&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Mon Jan 30 19:15:59 2012 @@ -291,8 +291,7 @@ return OuterSelector; } - // FIXME: New EH - Rename when new EH is turned on. - BasicBlock *getInnerUnwindDestNewEH(); + BasicBlock *getInnerUnwindDest(); LandingPadInst *getLandingPadInst() const { return CallerLPad; } @@ -321,7 +320,7 @@ } /// getInnerUnwindDest - Get or create a target for the branch from ResumeInsts. -BasicBlock *InvokeInliningInfo::getInnerUnwindDestNewEH() { +BasicBlock *InvokeInliningInfo::getInnerUnwindDest() { if (InnerResumeDest) return InnerResumeDest; // Split the landing pad. @@ -360,7 +359,7 @@ /// branch. When there is more than one predecessor, we need to split the /// landing pad block after the landingpad instruction and jump to there. 
void InvokeInliningInfo::forwardResume(ResumeInst *RI) { - BasicBlock *Dest = getInnerUnwindDestNewEH(); + BasicBlock *Dest = getInnerUnwindDest(); BasicBlock *Src = RI->getParent(); BranchInst::Create(Dest, Src); From isanbard at gmail.com Mon Jan 30 19:18:21 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 01:18:21 -0000 Subject: [llvm-commits] [llvm] r149318 - /llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Message-ID: <20120131011821.91D2C2A6C12C@llvm.org> Author: void Date: Mon Jan 30 19:18:21 2012 New Revision: 149318 URL: http://llvm.org/viewvc/llvm-project?rev=149318&view=rev Log: Remove more dead functions. Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp?rev=149318&r1=149317&r2=149318&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Mon Jan 30 19:18:21 2012 @@ -10,13 +10,6 @@ // This file implements inlining of a function into a call site, resolving // parameters and the return value as appropriate. // -// The code in this file for handling inlines through invoke -// instructions preserves semantics only under some assumptions about -// the behavior of unwinders which correspond to gcc-style libUnwind -// exception personality functions. Eventually the IR will be -// improved to make this unnecessary, but until then, this code is -// marked [LIBUNWIND]. -// //===----------------------------------------------------------------------===// #include "llvm/Transforms/Utils/Cloning.h" @@ -45,204 +38,6 @@ return InlineFunction(CallSite(II), IFI); } -// FIXME: New EH - Remove the functions marked [LIBUNWIND] when new EH is -// turned on. - -/// [LIBUNWIND] Look for an llvm.eh.exception call in the given block. 
-static EHExceptionInst *findExceptionInBlock(BasicBlock *bb) { - for (BasicBlock::iterator i = bb->begin(), e = bb->end(); i != e; i++) { - EHExceptionInst *exn = dyn_cast(i); - if (exn) return exn; - } - - return 0; -} - -/// [LIBUNWIND] Look for the 'best' llvm.eh.selector instruction for -/// the given llvm.eh.exception call. -static EHSelectorInst *findSelectorForException(EHExceptionInst *exn) { - BasicBlock *exnBlock = exn->getParent(); - - EHSelectorInst *outOfBlockSelector = 0; - for (Instruction::use_iterator - ui = exn->use_begin(), ue = exn->use_end(); ui != ue; ++ui) { - EHSelectorInst *sel = dyn_cast(*ui); - if (!sel) continue; - - // Immediately accept an eh.selector in the same block as the - // excepton call. - if (sel->getParent() == exnBlock) return sel; - - // Otherwise, use the first selector we see. - if (!outOfBlockSelector) outOfBlockSelector = sel; - } - - return outOfBlockSelector; -} - -/// [LIBUNWIND] Find the (possibly absent) call to @llvm.eh.selector -/// in the given landing pad. In principle, llvm.eh.exception is -/// required to be in the landing pad; in practice, SplitCriticalEdge -/// can break that invariant, and then inlining can break it further. -/// There's a real need for a reliable solution here, but until that -/// happens, we have some fragile workarounds here. -static EHSelectorInst *findSelectorForLandingPad(BasicBlock *lpad) { - // Look for an exception call in the actual landing pad. - EHExceptionInst *exn = findExceptionInBlock(lpad); - if (exn) return findSelectorForException(exn); - - // Okay, if that failed, look for one in an obvious successor. If - // we find one, we'll fix the IR by moving things back to the - // landing pad. 
- - bool dominates = true; // does the lpad dominate the exn call - BasicBlock *nonDominated = 0; // if not, the first non-dominated block - BasicBlock *lastDominated = 0; // and the block which branched to it - - BasicBlock *exnBlock = lpad; - - // We need to protect against lpads that lead into infinite loops. - SmallPtrSet visited; - visited.insert(exnBlock); - - do { - // We're not going to apply this hack to anything more complicated - // than a series of unconditional branches, so if the block - // doesn't terminate in an unconditional branch, just fail. More - // complicated cases can arise when, say, sinking a call into a - // split unwind edge and then inlining it; but that can do almost - // *anything* to the CFG, including leaving the selector - // completely unreachable. The only way to fix that properly is - // to (1) prohibit transforms which move the exception or selector - // values away from the landing pad, e.g. by producing them with - // instructions that are pinned to an edge like a phi, or - // producing them with not-really-instructions, and (2) making - // transforms which split edges deal with that. - BranchInst *branch = dyn_cast(&exnBlock->back()); - if (!branch || branch->isConditional()) return 0; - - BasicBlock *successor = branch->getSuccessor(0); - - // Fail if we found an infinite loop. - if (!visited.insert(successor)) return 0; - - // If the successor isn't dominated by exnBlock: - if (!successor->getSinglePredecessor()) { - // We don't want to have to deal with threading the exception - // through multiple levels of phi, so give up if we've already - // followed a non-dominating edge. - if (!dominates) return 0; - - // Otherwise, remember this as a non-dominating edge. - dominates = false; - nonDominated = successor; - lastDominated = exnBlock; - } - - exnBlock = successor; - - // Can we stop here? - exn = findExceptionInBlock(exnBlock); - } while (!exn); - - // Look for a selector call for the exception we found. 
- EHSelectorInst *selector = findSelectorForException(exn); - if (!selector) return 0; - - // The easy case is when the landing pad still dominates the - // exception call, in which case we can just move both calls back to - // the landing pad. - if (dominates) { - selector->moveBefore(lpad->getFirstNonPHI()); - exn->moveBefore(selector); - return selector; - } - - // Otherwise, we have to split at the first non-dominating block. - // The CFG looks basically like this: - // lpad: - // phis_0 - // insnsAndBranches_1 - // br label %nonDominated - // nonDominated: - // phis_2 - // insns_3 - // %exn = call i8* @llvm.eh.exception() - // insnsAndBranches_4 - // %selector = call @llvm.eh.selector(i8* %exn, ... - // We need to turn this into: - // lpad: - // phis_0 - // %exn0 = call i8* @llvm.eh.exception() - // %selector0 = call @llvm.eh.selector(i8* %exn0, ... - // insnsAndBranches_1 - // br label %split // from lastDominated - // nonDominated: - // phis_2 (without edge from lastDominated) - // %exn1 = call i8* @llvm.eh.exception() - // %selector1 = call i8* @llvm.eh.selector(i8* %exn1, ... - // br label %split - // split: - // phis_2 (edge from lastDominated, edge from split) - // %exn = phi ... - // %selector = phi ... - // insns_3 - // insnsAndBranches_4 - - assert(nonDominated); - assert(lastDominated); - - // First, make clones of the intrinsics to go in lpad. - EHExceptionInst *lpadExn = cast(exn->clone()); - EHSelectorInst *lpadSelector = cast(selector->clone()); - lpadSelector->setArgOperand(0, lpadExn); - lpadSelector->insertBefore(lpad->getFirstNonPHI()); - lpadExn->insertBefore(lpadSelector); - - // Split the non-dominated block. - BasicBlock *split = - nonDominated->splitBasicBlock(nonDominated->getFirstNonPHI(), - nonDominated->getName() + ".lpad-fix"); - - // Redirect the last dominated branch there. - cast(lastDominated->back()).setSuccessor(0, split); - - // Move the existing intrinsics to the end of the old block. 
- selector->moveBefore(&nonDominated->back()); - exn->moveBefore(selector); - - Instruction *splitIP = &split->front(); - - // For all the phis in nonDominated, make a new phi in split to join - // that phi with the edge from lastDominated. - for (BasicBlock::iterator - i = nonDominated->begin(), e = nonDominated->end(); i != e; ++i) { - PHINode *phi = dyn_cast(i); - if (!phi) break; - - PHINode *splitPhi = PHINode::Create(phi->getType(), 2, phi->getName(), - splitIP); - phi->replaceAllUsesWith(splitPhi); - splitPhi->addIncoming(phi, nonDominated); - splitPhi->addIncoming(phi->removeIncomingValue(lastDominated), - lastDominated); - } - - // Make new phis for the exception and selector. - PHINode *exnPhi = PHINode::Create(exn->getType(), 2, "", splitIP); - exn->replaceAllUsesWith(exnPhi); - selector->setArgOperand(0, exn); // except for this use - exnPhi->addIncoming(exn, nonDominated); - exnPhi->addIncoming(lpadExn, lastDominated); - - PHINode *selectorPhi = PHINode::Create(selector->getType(), 2, "", splitIP); - selector->replaceAllUsesWith(selectorPhi); - selectorPhi->addIncoming(selector, nonDominated); - selectorPhi->addIncoming(lpadSelector, lastDominated); - - return lpadSelector; -} - namespace { /// A class for recording information about inlining through an invoke. 
   class InvokeInliningInfo {
@@ -285,12 +80,6 @@
       return OuterUnwindDest;
     }
 
-    EHSelectorInst *getOuterSelector() {
-      if (!OuterSelector)
-        OuterSelector = findSelectorForLandingPad(OuterUnwindDest);
-      return OuterSelector;
-    }
-
     BasicBlock *getInnerUnwindDest();
 
     LandingPadInst *getLandingPadInst() const { return CallerLPad; }

From isanbard at gmail.com Mon Jan 30 19:22:03 2012
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 31 Jan 2012 01:22:03 -0000
Subject: [llvm-commits] [llvm] r149322 - /llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp
Message-ID: <20120131012203.601222A6C12C@llvm.org>

Author: void
Date: Mon Jan 30 19:22:03 2012
New Revision: 149322

URL: http://llvm.org/viewvc/llvm-project?rev=149322&view=rev
Log:
Remove unused ivars and s/getOuterUnwindDest/getOuterResumeDest/g.

Modified:
    llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp

Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp?rev=149322&r1=149321&r2=149322&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp (original)
+++ llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Mon Jan 30 19:22:03 2012
@@ -42,22 +42,17 @@
   /// A class for recording information about inlining through an invoke.
   class InvokeInliningInfo {
     BasicBlock *OuterUnwindDest;
-    EHSelectorInst *OuterSelector;
-    BasicBlock *InnerUnwindDest;
-    PHINode *InnerExceptionPHI;
-    PHINode *InnerSelectorPHI;
-    SmallVector<Value*, 8> UnwindDestPHIValues;
 
     // FIXME: New EH - These will replace the analogous ones above.
     BasicBlock *OuterResumeDest; //< Destination of the invoke's unwind.
     BasicBlock *InnerResumeDest; //< Destination for the callee's resume.
     LandingPadInst *CallerLPad; //< LandingPadInst associated with the invoke.
     PHINode *InnerEHValuesPHI; //< PHI for EH values from landingpad insts.
+    SmallVector<Value*, 8> UnwindDestPHIValues;
 
   public:
     InvokeInliningInfo(InvokeInst *II)
-      : OuterUnwindDest(II->getUnwindDest()), OuterSelector(0),
-        InnerUnwindDest(0), InnerExceptionPHI(0), InnerSelectorPHI(0),
+      : OuterUnwindDest(II->getUnwindDest()),
         OuterResumeDest(II->getUnwindDest()), InnerResumeDest(0),
         CallerLPad(0), InnerEHValuesPHI(0) {
       // If there are PHI nodes in the unwind destination block, we need to keep
@@ -76,7 +71,7 @@
 
     /// The outer unwind destination is the target of unwind edges
     /// introduced for calls within the inlined function.
-    BasicBlock *getOuterUnwindDest() const {
+    BasicBlock *getOuterResumeDest() const {
       return OuterUnwindDest;
     }
 
@@ -201,7 +196,7 @@
       ImmutableCallSite CS(CI);
       SmallVector<Value*, 8> InvokeArgs(CS.arg_begin(), CS.arg_end());
       InvokeInst *II = InvokeInst::Create(CI->getCalledValue(), Split,
-                                          Invoke.getOuterUnwindDest(),
+                                          Invoke.getOuterResumeDest(),
                                           InvokeArgs, CI->getName(), BB);
       II->setCallingConv(CI->getCallingConv());
       II->setAttributes(CI->getAttributes());

From isanbard at gmail.com Mon Jan 30 19:25:55 2012
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 31 Jan 2012 01:25:55 -0000
Subject: [llvm-commits] [llvm] r149323 - /llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp
Message-ID: <20120131012555.2672D2A6C12C@llvm.org>

Author: void
Date: Mon Jan 30 19:25:54 2012
New Revision: 149323

URL: http://llvm.org/viewvc/llvm-project?rev=149323&view=rev
Log:
Remove ivar which is identical to another ivar.

Modified:
    llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp

Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp?rev=149323&r1=149322&r2=149323&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp (original)
+++ llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Mon Jan 30 19:25:54 2012
@@ -41,9 +41,6 @@
 namespace {
   /// A class for recording information about inlining through an invoke.
   class InvokeInliningInfo {
-    BasicBlock *OuterUnwindDest;
-
-    // FIXME: New EH - These will replace the analogous ones above.
     BasicBlock *OuterResumeDest; //< Destination of the invoke's unwind.
     BasicBlock *InnerResumeDest; //< Destination for the callee's resume.
     LandingPadInst *CallerLPad; //< LandingPadInst associated with the invoke.
@@ -52,14 +49,13 @@
 
   public:
     InvokeInliningInfo(InvokeInst *II)
-      : OuterUnwindDest(II->getUnwindDest()),
-        OuterResumeDest(II->getUnwindDest()), InnerResumeDest(0),
+      : OuterResumeDest(II->getUnwindDest()), InnerResumeDest(0),
         CallerLPad(0), InnerEHValuesPHI(0) {
       // If there are PHI nodes in the unwind destination block, we need to keep
       // track of which values came into them from the invoke before removing
       // the edge from this block.
       llvm::BasicBlock *InvokeBB = II->getParent();
-      BasicBlock::iterator I = OuterUnwindDest->begin();
+      BasicBlock::iterator I = OuterResumeDest->begin();
       for (; isa<PHINode>(I); ++I) {
         // Save the value to use for this edge.
         PHINode *PHI = cast<PHINode>(I);
@@ -69,10 +65,10 @@
       CallerLPad = cast<LandingPadInst>(I);
     }
 
-    /// The outer unwind destination is the target of unwind edges
-    /// introduced for calls within the inlined function.
+    /// getOuterResumeDest - The outer unwind destination is the target of
+    /// unwind edges introduced for calls within the inlined function.
     BasicBlock *getOuterResumeDest() const {
-      return OuterUnwindDest;
+      return OuterResumeDest;
     }
 
     BasicBlock *getInnerUnwindDest();
@@ -90,7 +86,7 @@
     /// destination block for the given basic block, using the values for the
     /// original invoke's source block.
     void addIncomingPHIValuesFor(BasicBlock *BB) const {
-      addIncomingPHIValuesForInto(BB, OuterUnwindDest);
+      addIncomingPHIValuesForInto(BB, OuterResumeDest);
     }
 
     void addIncomingPHIValuesForInto(BasicBlock *src, BasicBlock *dest) const {

From isanbard at gmail.com Mon Jan 30 19:46:13 2012
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 31 Jan 2012 01:46:13 -0000
Subject: [llvm-commits] [llvm] r149326 - in /llvm/trunk: include/llvm-c/Core.h include/llvm/CodeGen/FunctionLoweringInfo.h include/llvm/IntrinsicInst.h lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
Message-ID: <20120131014613.946FC2A6C12C@llvm.org>

Author: void
Date: Mon Jan 30 19:46:13 2012
New Revision: 149326

URL: http://llvm.org/viewvc/llvm-project?rev=149326&view=rev
Log:
Remove the eh.exception and eh.selector intrinsics. Also remove a hack to copy
over the catch information. The catch information is now tacked to the invoke
instruction.

Modified: llvm/trunk/include/llvm-c/Core.h llvm/trunk/include/llvm/CodeGen/FunctionLoweringInfo.h llvm/trunk/include/llvm/IntrinsicInst.h llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Modified: llvm/trunk/include/llvm-c/Core.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Core.h?rev=149326&r1=149325&r2=149326&view=diff ============================================================================== --- llvm/trunk/include/llvm-c/Core.h (original) +++ llvm/trunk/include/llvm-c/Core.h Mon Jan 30 19:46:13 2012 @@ -481,8 +481,6 @@ macro(IntrinsicInst) \ macro(DbgInfoIntrinsic) \ macro(DbgDeclareInst) \ - macro(EHExceptionInst) \ - macro(EHSelectorInst) \ macro(MemIntrinsic) \ macro(MemCpyInst) \ macro(MemMoveInst) \ Modified: llvm/trunk/include/llvm/CodeGen/FunctionLoweringInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/FunctionLoweringInfo.h?rev=149326&r1=149325&r2=149326&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/FunctionLoweringInfo.h (original) +++ llvm/trunk/include/llvm/CodeGen/FunctionLoweringInfo.h Mon Jan 30 19:46:13 2012 @@ -216,11 +216,6 @@ void AddCatchInfo(const CallInst &I, MachineModuleInfo *MMI, MachineBasicBlock *MBB); -/// CopyCatchInfo - Copy catch information from SuccBB (or one of its -/// successors) to LPad. -void CopyCatchInfo(const BasicBlock *SuccBB, const BasicBlock *LPad, - MachineModuleInfo *MMI, FunctionLoweringInfo &FLI); - /// AddLandingPadInfo - Extract the exception handling information from the /// landingpad instruction and add them to the specified machine module info. 
void AddLandingPadInfo(const LandingPadInst &I, MachineModuleInfo &MMI, Modified: llvm/trunk/include/llvm/IntrinsicInst.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IntrinsicInst.h?rev=149326&r1=149325&r2=149326&view=diff ============================================================================== --- llvm/trunk/include/llvm/IntrinsicInst.h (original) +++ llvm/trunk/include/llvm/IntrinsicInst.h Mon Jan 30 19:46:13 2012 @@ -277,34 +277,6 @@ } }; - /// EHExceptionInst - This represents the llvm.eh.exception instruction. - /// - class EHExceptionInst : public IntrinsicInst { - public: - // Methods for support type inquiry through isa, cast, and dyn_cast: - static inline bool classof(const EHExceptionInst *) { return true; } - static inline bool classof(const IntrinsicInst *I) { - return I->getIntrinsicID() == Intrinsic::eh_exception; - } - static inline bool classof(const Value *V) { - return isa(V) && classof(cast(V)); - } - }; - - /// EHSelectorInst - This represents the llvm.eh.selector instruction. 
- /// - class EHSelectorInst : public IntrinsicInst { - public: - // Methods for support type inquiry through isa, cast, and dyn_cast: - static inline bool classof(const EHSelectorInst *) { return true; } - static inline bool classof(const IntrinsicInst *I) { - return I->getIntrinsicID() == Intrinsic::eh_selector; - } - static inline bool classof(const Value *V) { - return isa(V) && classof(cast(V)); - } - }; - } #endif Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp?rev=149326&r1=149325&r2=149326&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp Mon Jan 30 19:46:13 2012 @@ -425,34 +425,6 @@ } } -void llvm::CopyCatchInfo(const BasicBlock *SuccBB, const BasicBlock *LPad, - MachineModuleInfo *MMI, FunctionLoweringInfo &FLI) { - SmallPtrSet Visited; - - // The 'eh.selector' call may not be in the direct successor of a basic block, - // but could be several successors deeper. If we don't find it, try going one - // level further. - while (Visited.insert(SuccBB)) { - for (BasicBlock::const_iterator I = SuccBB->begin(), E = --SuccBB->end(); - I != E; ++I) - if (const EHSelectorInst *EHSel = dyn_cast(I)) { - // Apply the catch info to LPad. - AddCatchInfo(*EHSel, MMI, FLI.MBBMap[LPad]); -#ifndef NDEBUG - if (!FLI.MBBMap[SuccBB]->isLandingPad()) - FLI.CatchInfoFound.insert(EHSel); -#endif - return; - } - - const BranchInst *Br = dyn_cast(SuccBB->getTerminator()); - if (Br && Br->isUnconditional()) - SuccBB = Br->getSuccessor(0); - else - break; - } -} - /// AddLandingPadInfo - Extract the exception handling information from the /// landingpad instruction and add them to the specified machine module info. 
void llvm::AddLandingPadInfo(const LandingPadInst &I, MachineModuleInfo &MMI, Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp?rev=149326&r1=149325&r2=149326&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Mon Jan 30 19:46:13 2012 @@ -785,31 +785,6 @@ // Mark exception selector register as live in. Reg = TLI.getExceptionSelectorRegister(); if (Reg) MBB->addLiveIn(Reg); - - // FIXME: Hack around an exception handling flaw (PR1508): the personality - // function and list of typeids logically belong to the invoke (or, if you - // like, the basic block containing the invoke), and need to be associated - // with it in the dwarf exception handling tables. Currently however the - // information is provided by an intrinsic (eh.selector) that can be moved - // to unexpected places by the optimizers: if the unwind edge is critical, - // then breaking it can result in the intrinsics being in the successor of - // the landing pad, not the landing pad itself. This results - // in exceptions not being caught because no typeids are associated with - // the invoke. This may not be the only way things can go wrong, but it - // is the only way we try to work around for the moment. - const BasicBlock *LLVMBB = MBB->getBasicBlock(); - const BranchInst *Br = dyn_cast(LLVMBB->getTerminator()); - - if (Br && Br->isUnconditional()) { // Critical edge? - BasicBlock::const_iterator I, E; - for (I = LLVMBB->begin(), E = --LLVMBB->end(); I != E; ++I) - if (isa(I)) - break; - - if (I == E) - // No catch info found - try to extract some from the successor. 
-      CopyCatchInfo(Br->getSuccessor(0), LLVMBB, &MF->getMMI(), *FuncInfo);
-  }
 }
 
 /// TryToFoldFastISelLoad - We're checking to see if we can fold the specified

From isanbard at gmail.com Mon Jan 30 19:48:40 2012
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 31 Jan 2012 01:48:40 -0000
Subject: [llvm-commits] [llvm] r149328 - /llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp
Message-ID: <20120131014840.8EC9F2A6C12C@llvm.org>

Author: void
Date: Mon Jan 30 19:48:40 2012
New Revision: 149328

URL: http://llvm.org/viewvc/llvm-project?rev=149328&view=rev
Log:
s/getInnerUnwindDest/getInnerResumeDest/g

Modified:
    llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp

Modified: llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp?rev=149328&r1=149327&r2=149328&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp (original)
+++ llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp Mon Jan 30 19:48:40 2012
@@ -71,7 +71,7 @@
       return OuterResumeDest;
     }
 
-    BasicBlock *getInnerUnwindDest();
+    BasicBlock *getInnerResumeDest();
 
     LandingPadInst *getLandingPadInst() const { return CallerLPad; }
 
@@ -99,8 +99,8 @@
   };
 }
 
-/// getInnerUnwindDest - Get or create a target for the branch from ResumeInsts.
-BasicBlock *InvokeInliningInfo::getInnerUnwindDest() {
+/// getInnerResumeDest - Get or create a target for the branch from ResumeInsts.
+BasicBlock *InvokeInliningInfo::getInnerResumeDest() {
   if (InnerResumeDest) return InnerResumeDest;
 
   // Split the landing pad.
@@ -139,7 +139,7 @@
 /// branch. When there is more than one predecessor, we need to split the
 /// landing pad block after the landingpad instruction and jump to there.
 void InvokeInliningInfo::forwardResume(ResumeInst *RI) {
-  BasicBlock *Dest = getInnerUnwindDest();
+  BasicBlock *Dest = getInnerResumeDest();
   BasicBlock *Src = RI->getParent();
 
   BranchInst::Create(Dest, Src);

From kcc at google.com Mon Jan 30 19:56:59 2012
From: kcc at google.com (Kostya Serebryany)
Date: Tue, 31 Jan 2012 01:56:59 -0000
Subject: [llvm-commits] [compiler-rt] r149330 - in /compiler-rt/trunk/lib/asan/tests: heap-overflow.tmpl.Darwin strncpy-overflow.tmpl.Darwin use-after-free.tmpl.Darwin
Message-ID: <20120131015659.2851C2A6C12C@llvm.org>

Author: kcc
Date: Mon Jan 30 19:56:58 2012
New Revision: 149330

URL: http://llvm.org/viewvc/llvm-project?rev=149330&view=rev
Log:
[asan] tests should not require the asan-rt to be built with debug info

Modified:
    compiler-rt/trunk/lib/asan/tests/heap-overflow.tmpl.Darwin
    compiler-rt/trunk/lib/asan/tests/strncpy-overflow.tmpl.Darwin
    compiler-rt/trunk/lib/asan/tests/use-after-free.tmpl.Darwin

Modified: compiler-rt/trunk/lib/asan/tests/heap-overflow.tmpl.Darwin
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/heap-overflow.tmpl.Darwin?rev=149330&r1=149329&r2=149330&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/tests/heap-overflow.tmpl.Darwin (original)
+++ compiler-rt/trunk/lib/asan/tests/heap-overflow.tmpl.Darwin Mon Jan 30 19:56:58 2012
@@ -2,7 +2,7 @@
     #0 0x.* in main .*heap-overflow.cc:6
 0x.* is located 0 bytes to the right of 10-byte region
 allocated by thread T0 here:
-    #0 0x.* in .*mz_malloc.* _asan_rtl_
+    #0 0x.* in .*mz_malloc.*
     #1 0x.* in malloc_zone_malloc.*
     #2 0x.* in malloc.*
     #3 0x.* in main heap-overflow.cc:[45]

Modified: compiler-rt/trunk/lib/asan/tests/strncpy-overflow.tmpl.Darwin
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/strncpy-overflow.tmpl.Darwin?rev=149330&r1=149329&r2=149330&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/tests/strncpy-overflow.tmpl.Darwin (original)
+++ compiler-rt/trunk/lib/asan/tests/strncpy-overflow.tmpl.Darwin Mon Jan 30 19:56:58 2012
@@ -3,7 +3,7 @@
     #1 0x.* in main .*strncpy-overflow.cc:[78]
 0x.* is located 0 bytes to the right of 9-byte region
 allocated by thread T0 here:
-    #0 0x.* in .*mz_malloc.* _asan_rtl_
+    #0 0x.* in .*mz_malloc.*
     #1 0x.* in malloc_zone_malloc.*
     #2 0x.* in malloc.*
     #3 0x.* in main .*strncpy-overflow.cc:6

Modified: compiler-rt/trunk/lib/asan/tests/use-after-free.tmpl.Darwin
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/use-after-free.tmpl.Darwin?rev=149330&r1=149329&r2=149330&view=diff
==============================================================================
--- compiler-rt/trunk/lib/asan/tests/use-after-free.tmpl.Darwin (original)
+++ compiler-rt/trunk/lib/asan/tests/use-after-free.tmpl.Darwin Mon Jan 30 19:56:58 2012
@@ -3,12 +3,12 @@
     #0 0x.* in main .*use-after-free.cc:5
 0x.* is located 5 bytes inside of 10-byte region .0x.*,0x.*
 freed by thread T0 here:
-    #0 0x.* in .*mz_free.* _asan_rtl_
+    #0 0x.* in .*mz_free.*
 # We override free() on Darwin, thus no malloc_zone_free
     #1 0x.* in free
     #2 0x.* in main .*use-after-free.cc:[45]
 previously allocated by thread T0 here:
-    #0 0x.* in .*mz_malloc.* _asan_rtl_
+    #0 0x.* in .*mz_malloc.*
     #1 0x.* in malloc_zone_malloc.*
     #2 0x.* in malloc.*
     #3 0x.* in main .*use-after-free.cc:3

From isanbard at gmail.com Mon Jan 30 19:58:48 2012
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 31 Jan 2012 01:58:48 -0000
Subject: [llvm-commits] [llvm] r149331 - in /llvm/trunk: include/llvm/Intrinsics.td lib/CodeGen/IntrinsicLowering.cpp lib/CodeGen/SelectionDAG/FastISel.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
Message-ID: <20120131015848.B9F562A6C12C@llvm.org>

Author: void
Date: Mon Jan 30 19:58:48 2012
New Revision: 149331

URL: http://llvm.org/viewvc/llvm-project?rev=149331&view=rev
Log:
Remove the now-dead llvm.eh.exception and
llvm.eh.selector intrinsics. Modified: llvm/trunk/include/llvm/Intrinsics.td llvm/trunk/lib/CodeGen/IntrinsicLowering.cpp llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp Modified: llvm/trunk/include/llvm/Intrinsics.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Intrinsics.td?rev=149331&r1=149330&r2=149331&view=diff ============================================================================== --- llvm/trunk/include/llvm/Intrinsics.td (original) +++ llvm/trunk/include/llvm/Intrinsics.td Mon Jan 30 19:58:48 2012 @@ -304,10 +304,6 @@ //===------------------ Exception Handling Intrinsics----------------------===// // -def int_eh_exception : Intrinsic<[llvm_ptr_ty], [], [IntrReadMem]>; -def int_eh_selector : Intrinsic<[llvm_i32_ty], - [llvm_ptr_ty, llvm_ptr_ty, llvm_vararg_ty]>; -def int_eh_resume : Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty], [Throws]>; // The result of eh.typeid.for depends on the enclosing function, but inside a // given function it is 'const' and may be CSE'd etc. Modified: llvm/trunk/lib/CodeGen/IntrinsicLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/IntrinsicLowering.cpp?rev=149331&r1=149330&r2=149331&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/IntrinsicLowering.cpp (original) +++ llvm/trunk/lib/CodeGen/IntrinsicLowering.cpp Mon Jan 30 19:58:48 2012 @@ -448,11 +448,6 @@ case Intrinsic::dbg_declare: break; // Simply strip out debugging intrinsics - case Intrinsic::eh_exception: - case Intrinsic::eh_selector: - CI->replaceAllUsesWith(Constant::getNullValue(CI->getType())); - break; - case Intrinsic::eh_typeid_for: // Return something different to eh_selector. 
CI->replaceAllUsesWith(ConstantInt::get(CI->getType(), 1)); Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp?rev=149331&r1=149330&r2=149331&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp Mon Jan 30 19:58:48 2012 @@ -630,60 +630,6 @@ } return true; } - case Intrinsic::eh_exception: { - EVT VT = TLI.getValueType(Call->getType()); - if (TLI.getOperationAction(ISD::EXCEPTIONADDR, VT)!=TargetLowering::Expand) - break; - - assert(FuncInfo.MBB->isLandingPad() && - "Call to eh.exception not in landing pad!"); - unsigned Reg = TLI.getExceptionAddressRegister(); - const TargetRegisterClass *RC = TLI.getRegClassFor(VT); - unsigned ResultReg = createResultReg(RC); - BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DL, TII.get(TargetOpcode::COPY), - ResultReg).addReg(Reg); - UpdateValueMap(Call, ResultReg); - return true; - } - case Intrinsic::eh_selector: { - EVT VT = TLI.getValueType(Call->getType()); - if (TLI.getOperationAction(ISD::EHSELECTION, VT) != TargetLowering::Expand) - break; - if (FuncInfo.MBB->isLandingPad()) - AddCatchInfo(*Call, &FuncInfo.MF->getMMI(), FuncInfo.MBB); - else { -#ifndef NDEBUG - FuncInfo.CatchInfoLost.insert(Call); -#endif - // FIXME: Mark exception selector register as live in. Hack for PR1508. - unsigned Reg = TLI.getExceptionSelectorRegister(); - if (Reg) FuncInfo.MBB->addLiveIn(Reg); - } - - unsigned Reg = TLI.getExceptionSelectorRegister(); - EVT SrcVT = TLI.getPointerTy(); - const TargetRegisterClass *RC = TLI.getRegClassFor(SrcVT); - unsigned ResultReg = createResultReg(RC); - BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DL, TII.get(TargetOpcode::COPY), - ResultReg).addReg(Reg); - - bool ResultRegIsKill = hasTrivialKill(Call); - - // Cast the register to the type of the selector. 
- if (SrcVT.bitsGT(MVT::i32)) - ResultReg = FastEmit_r(SrcVT.getSimpleVT(), MVT::i32, ISD::TRUNCATE, - ResultReg, ResultRegIsKill); - else if (SrcVT.bitsLT(MVT::i32)) - ResultReg = FastEmit_r(SrcVT.getSimpleVT(), MVT::i32, - ISD::SIGN_EXTEND, ResultReg, ResultRegIsKill); - if (ResultReg == 0) - // Unhandled operand. Halt "fast" selection and bail. - return false; - - UpdateValueMap(Call, ResultReg); - - return true; - } case Intrinsic::objectsize: { ConstantInt *CI = cast(Call->getArgOperand(1)); unsigned long long Res = CI->isZero() ? -1ULL : 0; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp?rev=149331&r1=149330&r2=149331&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp Mon Jan 30 19:58:48 2012 @@ -4686,43 +4686,6 @@ MMI.setVariableDbgInfo(Variable, FI, DI.getDebugLoc()); return 0; } - case Intrinsic::eh_exception: { - // Insert the EXCEPTIONADDR instruction. - assert(FuncInfo.MBB->isLandingPad() && - "Call to eh.exception not in landing pad!"); - SDVTList VTs = DAG.getVTList(TLI.getPointerTy(), MVT::Other); - SDValue Ops[1]; - Ops[0] = DAG.getRoot(); - SDValue Op = DAG.getNode(ISD::EXCEPTIONADDR, dl, VTs, Ops, 1); - setValue(&I, Op); - DAG.setRoot(Op.getValue(1)); - return 0; - } - - case Intrinsic::eh_selector: { - MachineBasicBlock *CallMBB = FuncInfo.MBB; - MachineModuleInfo &MMI = DAG.getMachineFunction().getMMI(); - if (CallMBB->isLandingPad()) - AddCatchInfo(I, &MMI, CallMBB); - else { -#ifndef NDEBUG - FuncInfo.CatchInfoLost.insert(&I); -#endif - // FIXME: Mark exception selector register as live in. Hack for PR1508. - unsigned Reg = TLI.getExceptionSelectorRegister(); - if (Reg) FuncInfo.MBB->addLiveIn(Reg); - } - - // Insert the EHSELECTION instruction. 
- SDVTList VTs = DAG.getVTList(TLI.getPointerTy(), MVT::Other); - SDValue Ops[2]; - Ops[0] = getValue(I.getArgOperand(0)); - Ops[1] = getRoot(); - SDValue Op = DAG.getNode(ISD::EHSELECTION, dl, VTs, Ops, 2); - DAG.setRoot(Op.getValue(1)); - setValue(&I, DAG.getSExtOrTrunc(Op, dl, MVT::i32)); - return 0; - } case Intrinsic::eh_typeid_for: { // Find the type id for the given typeinfo. From isanbard at gmail.com Mon Jan 30 20:04:20 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 02:04:20 -0000 Subject: [llvm-commits] [llvm] r149332 - /llvm/trunk/test/Other/2009-03-31-CallGraph.ll Message-ID: <20120131020420.72DC92A6C12C@llvm.org> Author: void Date: Mon Jan 30 20:04:20 2012 New Revision: 149332 URL: http://llvm.org/viewvc/llvm-project?rev=149332&view=rev Log: Update test to new EH model. Modified: llvm/trunk/test/Other/2009-03-31-CallGraph.ll Modified: llvm/trunk/test/Other/2009-03-31-CallGraph.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Other/2009-03-31-CallGraph.ll?rev=149332&r1=149331&r2=149332&view=diff ============================================================================== --- llvm/trunk/test/Other/2009-03-31-CallGraph.ll (original) +++ llvm/trunk/test/Other/2009-03-31-CallGraph.ll Mon Jan 30 20:04:20 2012 @@ -15,6 +15,8 @@ unreachable lpad2: + %exn = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0 + cleanup unreachable } @@ -29,3 +31,4 @@ declare void @f8() +declare i32 @__gxx_personality_v0(...) From isanbard at gmail.com Mon Jan 30 20:05:14 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 02:05:14 -0000 Subject: [llvm-commits] [llvm] r149333 - /llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll Message-ID: <20120131020514.5677E2A6C12C@llvm.org> Author: void Date: Mon Jan 30 20:05:13 2012 New Revision: 149333 URL: http://llvm.org/viewvc/llvm-project?rev=149333&view=rev Log: Update test to new EH model. 
Modified: llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll Modified: llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll?rev=149333&r1=149332&r2=149333&view=diff ============================================================================== --- llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll (original) +++ llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll Mon Jan 30 20:05:13 2012 @@ -23,8 +23,8 @@ ret i32 %retval lpad: - %eh_ptr = call i8* @llvm.eh.exception() - %eh_select = call i32 (i8*, i8*, ...)* @llvm.eh.selector(i8* %eh_ptr, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i8* null) + %exn = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0 + catch i8* null unreachable } From isanbard at gmail.com Mon Jan 30 20:09:07 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 02:09:07 -0000 Subject: [llvm-commits] [llvm] r149335 - in /llvm/trunk/test: CodeGen/Generic/2007-12-31-UnusedSelector.ll CodeGen/Generic/2009-11-16-BadKillsCrash.ll CodeGen/Mips/eh.ll CodeGen/X86/2008-05-28-LocalRegAllocBug.ll CodeGen/X86/negate-add-zero.ll Transforms/Inline/inline-invoke-tail.ll Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll Message-ID: <20120131020907.DD6682A6C12C@llvm.org> Author: void Date: Mon Jan 30 20:09:07 2012 New Revision: 149335 URL: http://llvm.org/viewvc/llvm-project?rev=149335&view=rev Log: Remove all references to the old EH. There was always the current EH. 
-- Ministry of Truth Modified: llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll llvm/trunk/test/CodeGen/Mips/eh.ll llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll llvm/trunk/test/CodeGen/X86/negate-add-zero.ll llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll Modified: llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll?rev=149335&r1=149334&r2=149335&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll (original) +++ llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll Mon Jan 30 20:09:07 2012 @@ -30,8 +30,6 @@ declare void @__cxa_throw(i8*, i8*, void (i8*)*) noreturn -declare i32 @llvm.eh.selector.i32(i8*, i8*, ...) - declare void @__cxa_end_catch() declare i32 @__gxx_personality_v0(...) 
Modified: llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll?rev=149335&r1=149334&r2=149335&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll (original) +++ llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll Mon Jan 30 20:09:07 2012 @@ -15,8 +15,6 @@ %"struct.std::locale::facet" = type { i32 (...)**, i32 } %union..0._15 = type { i32 } -declare i8* @llvm.eh.exception() nounwind readonly - declare i8* @__cxa_begin_catch(i8*) nounwind declare %"struct.std::ctype"* @_ZSt9use_facetISt5ctypeIcEERKT_RKSt6locale(%"struct.std::locale"*) Modified: llvm/trunk/test/CodeGen/Mips/eh.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/eh.ll?rev=149335&r1=149334&r2=149335&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Mips/eh.ll (original) +++ llvm/trunk/test/CodeGen/Mips/eh.ll Mon Jan 30 20:09:07 2012 @@ -54,16 +54,10 @@ declare i8* @__cxa_allocate_exception(i32) -declare i8* @llvm.eh.exception() nounwind readonly - declare i32 @__gxx_personality_v0(...) -declare i32 @llvm.eh.selector(i8*, i8*, ...) 
nounwind - declare i32 @llvm.eh.typeid.for(i8*) nounwind -declare void @llvm.eh.resume(i8*, i32) - declare void @__cxa_throw(i8*, i8*, i8*) declare i8* @__cxa_begin_catch(i8*) Modified: llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll?rev=149335&r1=149334&r2=149335&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll (original) +++ llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll Mon Jan 30 20:09:07 2012 @@ -2,8 +2,6 @@ @_ZTVN10Evaluation10GridOutputILi3EEE = external constant [5 x i32 (...)*] ; <[5 x i32 (...)*]*> [#uses=1] -declare i8* @llvm.eh.exception() nounwind - declare i8* @_Znwm(i32) declare i8* @__cxa_begin_catch(i8*) nounwind Modified: llvm/trunk/test/CodeGen/X86/negate-add-zero.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/negate-add-zero.ll?rev=149335&r1=149334&r2=149335&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/negate-add-zero.ll (original) +++ llvm/trunk/test/CodeGen/X86/negate-add-zero.ll Mon Jan 30 20:09:07 2012 @@ -486,10 +486,6 @@ declare i8* @_Znwm(i32) -declare i8* @llvm.eh.exception() nounwind - -declare i32 @llvm.eh.selector.i32(i8*, i8*, ...) 
nounwind - declare i32 @llvm.eh.typeid.for.i32(i8*) nounwind declare void @_ZdlPv(i8*) nounwind Modified: llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll?rev=149335&r1=149334&r2=149335&view=diff ============================================================================== --- llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll (original) +++ llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll Mon Jan 30 20:09:07 2012 @@ -28,10 +28,6 @@ unreachable } -declare i8* @llvm.eh.exception() nounwind readonly - -declare i32 @llvm.eh.selector(i8*, i8*, ...) nounwind - declare i32 @__gxx_personality_v0(...) declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind Modified: llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll?rev=149335&r1=149334&r2=149335&view=diff ============================================================================== --- llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll (original) +++ llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll Mon Jan 30 20:09:07 2012 @@ -21,10 +21,6 @@ declare i8* @__cxa_begin_catch(i8*) nounwind -declare i8* @llvm.eh.exception() nounwind - -declare i32 @llvm.eh.selector.i32(i8*, i8*, ...) nounwind - declare void @__cxa_end_catch() declare i32 @__gxx_personality_v0(...) 
From dblaikie at gmail.com Mon Jan 30 20:17:42 2012 From: dblaikie at gmail.com (David Blaikie) Date: Mon, 30 Jan 2012 18:17:42 -0800 Subject: [llvm-commits] [llvm] r149309 - /llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h In-Reply-To: <20120131005708.BA9A32A6C12C@llvm.org> References: <20120131005708.BA9A32A6C12C@llvm.org> Message-ID: On Mon, Jan 30, 2012 at 4:57 PM, Ted Kremenek wrote: > Author: kremenek > Date: Mon Jan 30 18:57:08 2012 > New Revision: 149309 > > URL: http://llvm.org/viewvc/llvm-project?rev=149309&view=rev > Log: > Relax constructor for IntrusiveRefCntPtr to not be explicit. > > Modified: > llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h > > Modified: llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h?rev=149309&r1=149308&r2=149309&view=diff > ============================================================================== > --- llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h (original) > +++ llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h Mon Jan 30 18:57:08 2012 > @@ -115,7 +115,7 @@ > > explicit IntrusiveRefCntPtr() : Obj(0) {} > > - explicit IntrusiveRefCntPtr(T* obj) : Obj(obj) { > + IntrusiveRefCntPtr(T* obj) : Obj(obj) { It seems to be the convention in some other parts of clang that deliberately implicit ctors are prefixed with /*implicit*/ like this: /*implicit*/ IntrusiveRefCntPtr(T* obj) : Obj(obj) { > retain(); >
} > > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From sabre at nondot.org Mon Jan 30 20:55:07 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 02:55:07 -0000 Subject: [llvm-commits] [llvm] r149340 - /llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp Message-ID: <20120131025507.407F62A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 20:55:06 2012 New Revision: 149340 URL: http://llvm.org/viewvc/llvm-project?rev=149340&view=rev Log: enhance logic to support ConstantDataArray. Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp?rev=149340&r1=149339&r2=149340&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp Mon Jan 30 20:55:06 2012 @@ -203,8 +203,12 @@ // We need TD information to know the pointer size unless this is inbounds. if (!GEP->isInBounds() && TD == 0) return 0; - ConstantArray *Init = dyn_cast(GV->getInitializer()); - if (Init == 0 || Init->getNumOperands() > 1024) return 0; + Constant *Init = GV->getInitializer(); + if (!isa(Init) && !isa(Init)) + return 0; + + uint64_t ArrayElementCount = Init->getType()->getArrayNumElements(); + if (ArrayElementCount > 1024) return 0; // Don't blow up on huge arrays. // There are many forms of this optimization we can handle, for now, just do // the simple index into a single-dimensional array. @@ -221,7 +225,7 @@ // structs. 
SmallVector LaterIndices; - Type *EltTy = cast(Init->getType())->getElementType(); + Type *EltTy = Init->getType()->getArrayElementType(); for (unsigned i = 3, e = GEP->getNumOperands(); i != e; ++i) { ConstantInt *Idx = dyn_cast(GEP->getOperand(i)); if (Idx == 0) return 0; // Variable index. @@ -272,8 +276,9 @@ // Scan the array and see if one of our patterns matches. Constant *CompareRHS = cast(ICI.getOperand(1)); - for (unsigned i = 0, e = Init->getNumOperands(); i != e; ++i) { - Constant *Elt = Init->getOperand(i); + for (unsigned i = 0, e = ArrayElementCount; i != e; ++i) { + Constant *Elt = Init->getAggregateElement(i); + if (Elt == 0) return 0; // If this is indexing an array of structures, get the structure element. if (!LaterIndices.empty()) @@ -440,10 +445,10 @@ // If a 32-bit or 64-bit magic bitvector captures the entire comparison state // of this load, replace it with computation that does: // ((magic_cst >> i) & 1) != 0 - if (Init->getNumOperands() <= 32 || - (TD && Init->getNumOperands() <= 64 && TD->isLegalInteger(64))) { + if (ArrayElementCount <= 32 || + (TD && ArrayElementCount <= 64 && TD->isLegalInteger(64))) { Type *Ty; - if (Init->getNumOperands() <= 32) + if (ArrayElementCount <= 32) Ty = Type::getInt32Ty(Init->getContext()); else Ty = Type::getInt64Ty(Init->getContext()); From sabre at nondot.org Mon Jan 30 21:15:40 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 03:15:40 -0000 Subject: [llvm-commits] [llvm] r149341 - /llvm/trunk/lib/VMCore/AsmWriter.cpp Message-ID: <20120131031540.43FEE2A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 21:15:40 2012 New Revision: 149341 URL: http://llvm.org/viewvc/llvm-project?rev=149341&view=rev Log: fix asmwriting of ConstantDataArray to use the right element count, simplify ConstantArray handling, since they can never be empty. 
Modified: llvm/trunk/lib/VMCore/AsmWriter.cpp Modified: llvm/trunk/lib/VMCore/AsmWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/AsmWriter.cpp?rev=149341&r1=149340&r2=149341&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/AsmWriter.cpp (original) +++ llvm/trunk/lib/VMCore/AsmWriter.cpp Mon Jan 30 21:15:40 2012 @@ -837,19 +837,17 @@ Out << '"'; } else { // Cannot output in string format... Out << '['; - if (CA->getNumOperands()) { + TypePrinter.print(ETy, Out); + Out << ' '; + WriteAsOperandInternal(Out, CA->getOperand(0), + &TypePrinter, Machine, + Context); + for (unsigned i = 1, e = CA->getNumOperands(); i != e; ++i) { + Out << ", "; TypePrinter.print(ETy, Out); Out << ' '; - WriteAsOperandInternal(Out, CA->getOperand(0), - &TypePrinter, Machine, + WriteAsOperandInternal(Out, CA->getOperand(i), &TypePrinter, Machine, Context); - for (unsigned i = 1, e = CA->getNumOperands(); i != e; ++i) { - Out << ", "; - TypePrinter.print(ETy, Out); - Out << ' '; - WriteAsOperandInternal(Out, CA->getOperand(i), &TypePrinter, Machine, - Context); - } } Out << ']'; } @@ -868,21 +866,19 @@ Type *ETy = CA->getType()->getElementType(); Out << '['; - if (CA->getNumOperands()) { + TypePrinter.print(ETy, Out); + Out << ' '; + WriteAsOperandInternal(Out, CA->getElementAsConstant(0), + &TypePrinter, Machine, + Context); + for (unsigned i = 1, e = CA->getNumElements(); i != e; ++i) { + Out << ", "; TypePrinter.print(ETy, Out); Out << ' '; - WriteAsOperandInternal(Out, CA->getElementAsConstant(0), - &TypePrinter, Machine, - Context); - for (unsigned i = 1, e = CA->getNumOperands(); i != e; ++i) { - Out << ", "; - TypePrinter.print(ETy, Out); - Out << ' '; - WriteAsOperandInternal(Out, CA->getElementAsConstant(i), &TypePrinter, - Machine, Context); - } - Out << ']'; + WriteAsOperandInternal(Out, CA->getElementAsConstant(i), &TypePrinter, + Machine, Context); } + Out << ']'; return; } From sabre at 
nondot.org Mon Jan 30 21:16:39 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 03:16:39 -0000 Subject: [llvm-commits] [llvm] r149342 - /llvm/trunk/lib/Target/CBackend/CBackend.cpp Message-ID: <20120131031639.B34062A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 21:16:39 2012 New Revision: 149342 URL: http://llvm.org/viewvc/llvm-project?rev=149342&view=rev Log: use the right accessor for ConstantDataArray. Modified: llvm/trunk/lib/Target/CBackend/CBackend.cpp Modified: llvm/trunk/lib/Target/CBackend/CBackend.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CBackend/CBackend.cpp?rev=149342&r1=149341&r2=149342&view=diff ============================================================================== --- llvm/trunk/lib/Target/CBackend/CBackend.cpp (original) +++ llvm/trunk/lib/Target/CBackend/CBackend.cpp Mon Jan 30 21:16:39 2012 @@ -680,7 +680,7 @@ } else { Out << "{ "; printConstant(CDS->getElementAsConstant(0), Static); - for (unsigned i = 1, e = CDS->getNumOperands(); i != e; ++i) { + for (unsigned i = 1, e = CDS->getNumElements(); i != e; ++i) { Out << ", "; printConstant(CDS->getElementAsConstant(i), Static); } From sabre at nondot.org Mon Jan 30 21:39:24 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 03:39:24 -0000 Subject: [llvm-commits] [llvm] r149343 - /llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Message-ID: <20120131033924.E4D6C2A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 21:39:24 2012 New Revision: 149343 URL: http://llvm.org/viewvc/llvm-project?rev=149343&view=rev Log: don't emit a 1-byte object as a .fill. 
This is silly and causes CodeGen/X86/global-sections.ll to fail with CDArray Modified: llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Modified: llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp?rev=149343&r1=149342&r2=149343&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Mon Jan 30 21:39:24 2012 @@ -1615,7 +1615,9 @@ int Value = isRepeatedByteSequence(CDS, AP.TM); if (Value != -1) { uint64_t Bytes = AP.TM.getTargetData()->getTypeAllocSize(CDS->getType()); - return AP.OutStreamer.EmitFill(Bytes, Value, AddrSpace); + // Don't emit a 1-byte object as a .fill. + if (Bytes > 1) + return AP.OutStreamer.EmitFill(Bytes, Value, AddrSpace); } // If this can be emitted with .ascii/.asciz, emit it as such. From chandlerc at gmail.com Mon Jan 30 21:49:14 2012 From: chandlerc at gmail.com (Chandler Carruth) Date: Mon, 30 Jan 2012 19:49:14 -0800 Subject: [llvm-commits] PATCH: Add several convenience predicates to llvm::Triple In-Reply-To: References: Message-ID: Ping. From sabre at nondot.org Mon Jan 30 22:39:22 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 04:39:22 -0000 Subject: [llvm-commits] [llvm] r149348 - /llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <20120131043922.528222A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 22:39:22 2012 New Revision: 149348 URL: http://llvm.org/viewvc/llvm-project?rev=149348&view=rev Log: rework this logic to not depend on the last argument to GetConstantStringInfo, which is going away.
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=149348&r1=149347&r2=149348&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Mon Jan 30 22:39:22 2012 @@ -3323,7 +3323,10 @@ if (TLI.isLittleEndian()) Offset = Offset + MSB - 1; for (unsigned i = 0; i != MSB; ++i) { - Val = (Val << 8) | (unsigned char)Str[Offset]; + Val = (Val << 8); + + if (Offset < Str.size()) + Val |= (unsigned char)Str[Offset]; Offset += TLI.isLittleEndian() ? -1 : 1; } return DAG.getConstant(Val, VT); @@ -3354,9 +3357,12 @@ if (!G) return false; - const GlobalVariable *GV = dyn_cast<GlobalVariable>(G->getGlobal()); - if (GV && GetConstantStringInfo(GV, Str, SrcDelta, false)) - return true; + if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(G->getGlobal())) + if (GetConstantStringInfo(GV, Str, SrcDelta)) { + // The nul can also be read. + Str.push_back(0); + return true; + } return false; } From sabre at nondot.org Mon Jan 30 22:42:22 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 04:42:22 -0000 Subject: [llvm-commits] [llvm] r149351 - in /llvm/trunk: include/llvm/Analysis/ValueTracking.h lib/Analysis/ConstantFolding.cpp lib/Analysis/ValueTracking.cpp lib/VMCore/Constants.cpp Message-ID: <20120131044222.887722A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 22:42:22 2012 New Revision: 149351 URL: http://llvm.org/viewvc/llvm-project?rev=149351&view=rev Log: Change ConstantArray::get to form a ConstantDataArray when possible, kicking in the big win of ConstantDataArray.
As part of this, change the implementation of GetConstantStringInfo in ValueTracking to work with ConstantDataArray (and not ConstantArray) making it dramatically, amazingly, more efficient in the process and renaming it to getConstantStringInfo. This keeps around a GetConstantStringInfo entrypoint that (grossly) forwards to getConstantStringInfo and constructs the std::string required, but existing clients should move over to getConstantStringInfo instead. Modified: llvm/trunk/include/llvm/Analysis/ValueTracking.h llvm/trunk/lib/Analysis/ConstantFolding.cpp llvm/trunk/lib/Analysis/ValueTracking.cpp llvm/trunk/lib/VMCore/Constants.cpp Modified: llvm/trunk/include/llvm/Analysis/ValueTracking.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/ValueTracking.h?rev=149351&r1=149350&r2=149351&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/ValueTracking.h (original) +++ llvm/trunk/include/llvm/Analysis/ValueTracking.h Mon Jan 30 22:42:22 2012 @@ -17,14 +17,13 @@ #include "llvm/ADT/ArrayRef.h" #include "llvm/Support/DataTypes.h" -#include namespace llvm { - template class SmallVectorImpl; class Value; class Instruction; class APInt; class TargetData; + class StringRef; /// ComputeMaskedBits - Determine which of the bits specified in Mask are /// known to be either zero or one and return them in the KnownZero/KnownOne @@ -125,16 +124,17 @@ return GetPointerBaseWithConstantOffset(const_cast(Ptr), Offset,TD); } - /// GetConstantStringInfo - This function computes the length of a + /// getConstantStringInfo - This function computes the length of a /// null-terminated C string pointed to by V. If successful, it returns true - /// and returns the string in Str. If unsuccessful, it returns false. If - /// StopAtNul is set to true (the default), the returned string is truncated - /// by a nul character in the global. 
If StopAtNul is false, the nul - /// character is included in the result string. + /// and returns the string in Str. If unsuccessful, it returns false. This + /// does not include the trailing nul character. + bool getConstantStringInfo(const Value *V, StringRef &Str, + uint64_t Offset = 0); + + // FIXME: Remove this. bool GetConstantStringInfo(const Value *V, std::string &Str, - uint64_t Offset = 0, - bool StopAtNul = true); - + uint64_t Offset = 0); + /// GetStringLength - If we can compute the length of the string pointed to by /// the specified pointer, return 'len+1'. If we can't, return 0. uint64_t GetStringLength(Value *V); Modified: llvm/trunk/lib/Analysis/ConstantFolding.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ConstantFolding.cpp?rev=149351&r1=149350&r2=149351&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/ConstantFolding.cpp (original) +++ llvm/trunk/lib/Analysis/ConstantFolding.cpp Mon Jan 30 22:42:22 2012 @@ -476,9 +476,9 @@ // Instead of loading constant c string, use corresponding integer value // directly if string length is small enough. 
- std::string Str; - if (TD && GetConstantStringInfo(CE, Str) && !Str.empty()) { - unsigned StrLen = Str.length(); + StringRef Str; + if (TD && getConstantStringInfo(CE, Str) && !Str.empty()) { + unsigned StrLen = Str.size(); Type *Ty = cast(CE->getType())->getElementType(); unsigned NumBits = Ty->getPrimitiveSizeInBits(); // Replace load with immediate integer if the result is an integer or fp Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=149351&r1=149350&r2=149351&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original) +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Mon Jan 30 22:42:22 2012 @@ -1369,25 +1369,21 @@ } } - // A ConstantArray is splatable if all its members are equal and also - // splatable. - if (ConstantArray *CA = dyn_cast(V)) { - if (CA->getNumOperands() == 0) - return 0; - - Value *Val = isBytewiseValue(CA->getOperand(0)); + // A ConstantDataArray/Vector is splatable if all its members are equal and + // also splatable. + if (ConstantDataSequential *CA = dyn_cast(V)) { + Value *Elt = CA->getElementAsConstant(0); + Value *Val = isBytewiseValue(Elt); if (!Val) return 0; - for (unsigned I = 1, E = CA->getNumOperands(); I != E; ++I) - if (CA->getOperand(I-1) != CA->getOperand(I)) + for (unsigned I = 1, E = CA->getNumElements(); I != E; ++I) + if (CA->getElementAsConstant(I) != Elt) return 0; return Val; } - // FIXME: Vector types (e.g., <4 x i32> ). - // Conceptually, we could handle things like: // %a = zext i8 %X to i16 // %b = shl i16 %a, 8 @@ -1607,33 +1603,29 @@ } -/// GetConstantStringInfo - This function computes the length of a +// FIXME: Remove this. 
+bool llvm::GetConstantStringInfo(const Value *V, std::string &Str, + uint64_t Offset) { + StringRef Tmp; + if (!getConstantStringInfo(V, Tmp, Offset)) + return false; + Str = Tmp.str(); + return true; +} + +/// getConstantStringInfo - This function computes the length of a /// null-terminated C string pointed to by V. If successful, it returns true /// and returns the string in Str. If unsuccessful, it returns false. -bool llvm::GetConstantStringInfo(const Value *V, std::string &Str, - uint64_t Offset, bool StopAtNul) { - // If V is NULL then return false; - if (V == NULL) return false; - - // Look through bitcast instructions. - if (const BitCastInst *BCI = dyn_cast(V)) - return GetConstantStringInfo(BCI->getOperand(0), Str, Offset, StopAtNul); - - // If the value is not a GEP instruction nor a constant expression with a - // GEP instruction, then return false because ConstantArray can't occur - // any other way. - const User *GEP = 0; - if (const GetElementPtrInst *GEPI = dyn_cast(V)) { - GEP = GEPI; - } else if (const ConstantExpr *CE = dyn_cast(V)) { - if (CE->getOpcode() == Instruction::BitCast) - return GetConstantStringInfo(CE->getOperand(0), Str, Offset, StopAtNul); - if (CE->getOpcode() != Instruction::GetElementPtr) - return false; - GEP = CE; - } - - if (GEP) { +bool llvm::getConstantStringInfo(const Value *V, StringRef &Str, + uint64_t Offset) { + assert(V); + + // Look through bitcast instructions and geps. + V = V->stripPointerCasts(); + + // If the value is a GEP instructionor constant expression, treat it as an + // offset. + if (const GEPOperator *GEP = dyn_cast(V)) { // Make sure the GEP has exactly three arguments. 
if (GEP->getNumOperands() != 3) return false; @@ -1658,51 +1650,45 @@ StartIdx = CI->getZExtValue(); else return false; - return GetConstantStringInfo(GEP->getOperand(0), Str, StartIdx+Offset, - StopAtNul); + return getConstantStringInfo(GEP->getOperand(0), Str, StartIdx+Offset); } // The GEP instruction, constant or instruction, must reference a global // variable that is a constant and is initialized. The referenced constant // initializer is the array that we'll use for optimization. - const GlobalVariable* GV = dyn_cast(V); + const GlobalVariable *GV = dyn_cast(V); if (!GV || !GV->isConstant() || !GV->hasDefinitiveInitializer()) return false; - const Constant *GlobalInit = GV->getInitializer(); - + // Handle the all-zeros case - if (GlobalInit->isNullValue()) { + if (GV->getInitializer()->isNullValue()) { // This is a degenerate case. The initializer is constant zero so the // length of the string must be zero. - Str.clear(); + Str = ""; return true; } // Must be a Constant Array - const ConstantArray *Array = dyn_cast(GlobalInit); - if (Array == 0 || !Array->getType()->getElementType()->isIntegerTy(8)) + const ConstantDataArray *Array = + dyn_cast(GV->getInitializer()); + if (Array == 0 || !Array->isString()) return false; // Get the number of elements in the array - uint64_t NumElts = Array->getType()->getNumElements(); - + uint64_t NumElts = Array->getType()->getArrayNumElements(); + + // Start out with the entire array in the StringRef. + Str = Array->getAsString(); + if (Offset > NumElts) return false; - // Traverse the constant array from 'Offset' which is the place the GEP refers - // to in the array. - Str.reserve(NumElts-Offset); - for (unsigned i = Offset; i != NumElts; ++i) { - const Constant *Elt = Array->getOperand(i); - const ConstantInt *CI = dyn_cast(Elt); - if (!CI) // This array isn't suitable, non-int initializer. - return false; - if (StopAtNul && CI->isZero()) - return true; // we found end of string, success! 
- Str += (char)CI->getZExtValue(); - } - - // The array isn't null terminated, but maybe this is a memcpy, not a strcpy. + // Skip over 'offset' bytes. + Str = Str.substr(Offset); + // Trim off the \0 and anything after it. If the array is not nul terminated, + // we just return the whole end of string. The client may know some other way + // that the string is length-bound. + Str = Str.substr(0, Str.find('\0')); return true; } @@ -1714,8 +1700,7 @@ /// the specified pointer, return 'len+1'. If we can't, return 0. static uint64_t GetStringLengthH(Value *V, SmallPtrSet &PHIs) { // Look through noop bitcast instructions. - if (BitCastInst *BCI = dyn_cast(V)) - return GetStringLengthH(BCI->getOperand(0), PHIs); + V = V->stripPointerCasts(); // If this is a PHI node, there are two cases: either we have already seen it // or we haven't. @@ -1751,83 +1736,13 @@ if (Len1 != Len2) return 0; return Len1; } - - // As a special-case, "@string = constant i8 0" is also a string with zero - // length, not wrapped in a bitcast or GEP. - if (GlobalVariable *GV = dyn_cast(V)) { - if (GV->isConstant() && GV->hasDefinitiveInitializer()) - if (GV->getInitializer()->isNullValue()) return 1; - return 0; - } - - // If the value is not a GEP instruction nor a constant expression with a - // GEP instruction, then return unknown. - User *GEP = 0; - if (GetElementPtrInst *GEPI = dyn_cast(V)) { - GEP = GEPI; - } else if (ConstantExpr *CE = dyn_cast(V)) { - if (CE->getOpcode() != Instruction::GetElementPtr) - return 0; - GEP = CE; - } else { - return 0; - } - - // Make sure the GEP has exactly three arguments. - if (GEP->getNumOperands() != 3) - return 0; - - // Check to make sure that the first operand of the GEP is an integer and - // has value 0 so that we are sure we're indexing into the initializer. 
- if (ConstantInt *Idx = dyn_cast(GEP->getOperand(1))) { - if (!Idx->isZero()) - return 0; - } else - return 0; - - // If the second index isn't a ConstantInt, then this is a variable index - // into the array. If this occurs, we can't say anything meaningful about - // the string. - uint64_t StartIdx = 0; - if (ConstantInt *CI = dyn_cast(GEP->getOperand(2))) - StartIdx = CI->getZExtValue(); - else - return 0; - - // The GEP instruction, constant or instruction, must reference a global - // variable that is a constant and is initialized. The referenced constant - // initializer is the array that we'll use for optimization. - GlobalVariable* GV = dyn_cast(GEP->getOperand(0)); - if (!GV || !GV->isConstant() || !GV->hasInitializer() || - GV->mayBeOverridden()) + + // Otherwise, see if we can read the string. + StringRef StrData; + if (!getConstantStringInfo(V, StrData)) return 0; - Constant *GlobalInit = GV->getInitializer(); - - // Handle the ConstantAggregateZero case, which is a degenerate case. The - // initializer is constant zero so the length of the string must be zero. - if (isa(GlobalInit)) - return 1; // Len = 0 offset by 1. - - // Must be a Constant Array - ConstantArray *Array = dyn_cast(GlobalInit); - if (!Array || !Array->getType()->getElementType()->isIntegerTy(8)) - return false; - - // Get the number of elements in the array - uint64_t NumElts = Array->getType()->getNumElements(); - - // Traverse the constant array from StartIdx (derived above) which is - // the place the GEP refers to in the array. - for (unsigned i = StartIdx; i != NumElts; ++i) { - Constant *Elt = Array->getOperand(i); - ConstantInt *CI = dyn_cast(Elt); - if (!CI) // This array isn't suitable, non-int initializer. - return 0; - if (CI->isZero()) - return i-StartIdx+1; // We found end of string, success! - } - return 0; // The array isn't null terminated, conservatively return 'unknown'. 
+ return StrData.size()+1; } /// GetStringLength - If we can compute the length of the string pointed to by Modified: llvm/trunk/lib/VMCore/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Constants.cpp?rev=149351&r1=149350&r2=149351&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Constants.cpp (original) +++ llvm/trunk/lib/VMCore/Constants.cpp Mon Jan 30 22:42:22 2012 @@ -666,6 +666,13 @@ // ConstantXXX Classes //===----------------------------------------------------------------------===// +template +static bool rangeOnlyContains(ItTy Start, ItTy End, EltTy Elt) { + for (; Start != End; ++Start) + if (*Start != Elt) + return false; + return true; +} ConstantArray::ConstantArray(ArrayType *T, ArrayRef V) : Constant(T, ConstantArrayVal, @@ -680,54 +687,103 @@ } Constant *ConstantArray::get(ArrayType *Ty, ArrayRef V) { + // Empty arrays are canonicalized to ConstantAggregateZero. + if (V.empty()) + return ConstantAggregateZero::get(Ty); + for (unsigned i = 0, e = V.size(); i != e; ++i) { assert(V[i]->getType() == Ty->getElementType() && "Wrong type in array element initializer"); } LLVMContextImpl *pImpl = Ty->getContext().pImpl; - // If this is an all-zero array, return a ConstantAggregateZero object - bool isAllZero = true; - bool isUndef = false; - if (!V.empty()) { - Constant *C = V[0]; - isAllZero = C->isNullValue(); - isUndef = isa(C); - - if (isAllZero || isUndef) - for (unsigned i = 1, e = V.size(); i != e; ++i) - if (V[i] != C) { - isAllZero = false; - isUndef = false; - break; - } - } + + // If this is an all-zero array, return a ConstantAggregateZero object. If + // all undef, return an UndefValue, if "all simple", then return a + // ConstantDataArray. 
+ Constant *C = V[0]; + if (isa(C) && rangeOnlyContains(V.begin(), V.end(), C)) + return UndefValue::get(Ty); - if (isAllZero) + if (C->isNullValue() && rangeOnlyContains(V.begin(), V.end(), C)) return ConstantAggregateZero::get(Ty); - if (isUndef) - return UndefValue::get(Ty); + + // Check to see if all of the elements are ConstantFP or ConstantInt and if + // the element type is compatible with ConstantDataVector. If so, use it. + if (ConstantDataSequential::isElementTypeCompatible(C->getType())) { + // We speculatively build the elements here even if it turns out that there + // is a constantexpr or something else weird in the array, since it is so + // uncommon for that to happen. + if (ConstantInt *CI = dyn_cast(C)) { + if (CI->getType()->isIntegerTy(8)) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantInt *CI = dyn_cast(V[i])) + Elts.push_back(CI->getZExtValue()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataArray::get(C->getContext(), Elts); + } else if (CI->getType()->isIntegerTy(16)) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantInt *CI = dyn_cast(V[i])) + Elts.push_back(CI->getZExtValue()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataArray::get(C->getContext(), Elts); + } else if (CI->getType()->isIntegerTy(32)) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantInt *CI = dyn_cast(V[i])) + Elts.push_back(CI->getZExtValue()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataArray::get(C->getContext(), Elts); + } else if (CI->getType()->isIntegerTy(64)) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantInt *CI = dyn_cast(V[i])) + Elts.push_back(CI->getZExtValue()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataArray::get(C->getContext(), Elts); + } + } + + if (ConstantFP *CFP = dyn_cast(C)) { + if 
(CFP->getType()->isFloatTy()) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantFP *CFP = dyn_cast(V[i])) + Elts.push_back(CFP->getValueAPF().convertToFloat()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataArray::get(C->getContext(), Elts); + } else if (CFP->getType()->isDoubleTy()) { + SmallVector Elts; + for (unsigned i = 0, e = V.size(); i != e; ++i) + if (ConstantFP *CFP = dyn_cast(V[i])) + Elts.push_back(CFP->getValueAPF().convertToDouble()); + else + break; + if (Elts.size() == V.size()) + return ConstantDataArray::get(C->getContext(), Elts); + } + } + } + + // Otherwise, we really do want to create a ConstantArray. return pImpl->ArrayConstants.getOrCreate(Ty, V); } -/// ConstantArray::get(const string&) - Return an array that is initialized to -/// contain the specified string. If length is zero then a null terminator is -/// added to the specified string so that it may be used in a natural way. -/// Otherwise, the length parameter specifies how much of the string to use -/// and it won't be null terminated. -/// +// FIXME: Remove this method. Constant *ConstantArray::get(LLVMContext &Context, StringRef Str, bool AddNull) { - SmallVector ElementVals; - ElementVals.reserve(Str.size() + size_t(AddNull)); - for (unsigned i = 0; i < Str.size(); ++i) - ElementVals.push_back(ConstantInt::get(Type::getInt8Ty(Context), Str[i])); - - // Add a null terminator to the string... - if (AddNull) - ElementVals.push_back(ConstantInt::get(Type::getInt8Ty(Context), 0)); - - ArrayType *ATy = ArrayType::get(Type::getInt8Ty(Context), ElementVals.size()); - return get(ATy, ElementVals); + return ConstantDataArray::getString(Context, Str, AddNull); } /// getTypeForElements - Return an anonymous struct type to use for a constant @@ -839,8 +895,7 @@ // Check to see if all of the elements are ConstantFP or ConstantInt and if // the element type is compatible with ConstantDataVector. If so, use it. 
- if (ConstantDataSequential::isElementTypeCompatible(C->getType()) && - (isa(C) || isa(C))) { + if (ConstantDataSequential::isElementTypeCompatible(C->getType())) { // We speculatively build the elements here even if it turns out that there // is a constantexpr or something else weird in the array, since it is so // uncommon for that to happen. From sabre at nondot.org Mon Jan 30 22:43:11 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 04:43:11 -0000 Subject: [llvm-commits] [llvm] r149352 - /llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Message-ID: <20120131044311.B3D492A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 22:43:11 2012 New Revision: 149352 URL: http://llvm.org/viewvc/llvm-project?rev=149352&view=rev Log: start moving SimplifyLibcalls over to getConstantStringInfo, which is dramatically more efficient than GetConstantStringInfo. Modified: llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Modified: llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp?rev=149352&r1=149351&r2=149352&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Mon Jan 30 22:43:11 2012 @@ -256,19 +256,18 @@ ConstantInt::get(TD->getIntPtrType(*Context), Len), B, TD); } - + // Otherwise, the character is a constant, see if the first argument is // a string literal. If so, we can constant fold. - std::string Str; - if (!GetConstantStringInfo(SrcStr, Str)) + StringRef Str; + if (!getConstantStringInfo(SrcStr, Str)) return 0; - // strchr can find the nul character. - Str += '\0'; - - // Compute the offset. - size_t I = Str.find(CharC->getSExtValue()); - if (I == std::string::npos) // Didn't find the char. strchr returns null. 
+ // Compute the offset, make sure to handle the case when we're searching for + // zero (a weird way to spell strlen). + size_t I = CharC->getSExtValue() == 0 ? + Str.size() : Str.find(CharC->getSExtValue()); + if (I == StringRef::npos) // Didn't find the char. strchr returns null. return Constant::getNullValue(CI->getType()); // strchr(s+n,c) -> gep(s+n+i,c) @@ -296,20 +295,18 @@ if (!CharC) return 0; - std::string Str; - if (!GetConstantStringInfo(SrcStr, Str)) { + StringRef Str; + if (!getConstantStringInfo(SrcStr, Str)) { // strrchr(s, 0) -> strchr(s, 0) if (TD && CharC->isZero()) return EmitStrChr(SrcStr, '\0', B, TD); return 0; } - // strrchr can find the nul character. - Str += '\0'; - // Compute the offset. - size_t I = Str.rfind(CharC->getSExtValue()); - if (I == std::string::npos) // Didn't find the char. Return null. + size_t I = CharC->getSExtValue() == 0 ? + Str.size() : Str.rfind(CharC->getSExtValue()); + if (I == StringRef::npos) // Didn't find the char. Return null. return Constant::getNullValue(CI->getType()); // strrchr(s+n,c) -> gep(s+n+i,c) @@ -334,14 +331,13 @@ if (Str1P == Str2P) // strcmp(x,x) -> 0 return ConstantInt::get(CI->getType(), 0); - std::string Str1, Str2; - bool HasStr1 = GetConstantStringInfo(Str1P, Str1); - bool HasStr2 = GetConstantStringInfo(Str2P, Str2); + StringRef Str1, Str2; + bool HasStr1 = getConstantStringInfo(Str1P, Str1); + bool HasStr2 = getConstantStringInfo(Str2P, Str2); // strcmp(x, y) -> cnst (if both x and y are constant strings) if (HasStr1 && HasStr2) - return ConstantInt::get(CI->getType(), - StringRef(Str1).compare(Str2)); + return ConstantInt::get(CI->getType(), Str1.compare(Str2)); if (HasStr1 && Str1.empty()) // strcmp("", x) -> -*x return B.CreateNeg(B.CreateZExt(B.CreateLoad(Str2P, "strcmpload"), @@ -397,14 +393,14 @@ if (TD && Length == 1) // strncmp(x,y,1) -> memcmp(x,y,1) return EmitMemCmp(Str1P, Str2P, CI->getArgOperand(2), B, TD); - std::string Str1, Str2; - bool HasStr1 = 
GetConstantStringInfo(Str1P, Str1); - bool HasStr2 = GetConstantStringInfo(Str2P, Str2); + StringRef Str1, Str2; + bool HasStr1 = getConstantStringInfo(Str1P, Str1); + bool HasStr2 = getConstantStringInfo(Str2P, Str2); // strncmp(x, y) -> cnst (if both x and y are constant strings) if (HasStr1 && HasStr2) { - StringRef SubStr1 = StringRef(Str1).substr(0, Length); - StringRef SubStr2 = StringRef(Str2).substr(0, Length); + StringRef SubStr1 = Str1.substr(0, Length); + StringRef SubStr2 = Str2.substr(0, Length); return ConstantInt::get(CI->getType(), SubStr1.compare(SubStr2)); } @@ -609,9 +605,9 @@ !FT->getReturnType()->isIntegerTy()) return 0; - std::string S1, S2; - bool HasS1 = GetConstantStringInfo(CI->getArgOperand(0), S1); - bool HasS2 = GetConstantStringInfo(CI->getArgOperand(1), S2); + StringRef S1, S2; + bool HasS1 = getConstantStringInfo(CI->getArgOperand(0), S1); + bool HasS2 = getConstantStringInfo(CI->getArgOperand(1), S2); // strspn(s, "") -> 0 // strspn("", s) -> 0 @@ -619,8 +615,11 @@ return Constant::getNullValue(CI->getType()); // Constant folding. - if (HasS1 && HasS2) - return ConstantInt::get(CI->getType(), strspn(S1.c_str(), S2.c_str())); + if (HasS1 && HasS2) { + size_t Pos = S1.find_first_not_of(S2); + if (Pos == StringRef::npos) Pos = S1.size(); + return ConstantInt::get(CI->getType(), Pos); + } return 0; } @@ -638,17 +637,20 @@ !FT->getReturnType()->isIntegerTy()) return 0; - std::string S1, S2; - bool HasS1 = GetConstantStringInfo(CI->getArgOperand(0), S1); - bool HasS2 = GetConstantStringInfo(CI->getArgOperand(1), S2); + StringRef S1, S2; + bool HasS1 = getConstantStringInfo(CI->getArgOperand(0), S1); + bool HasS2 = getConstantStringInfo(CI->getArgOperand(1), S2); // strcspn("", s) -> 0 if (HasS1 && S1.empty()) return Constant::getNullValue(CI->getType()); // Constant folding. 
- if (HasS1 && HasS2) - return ConstantInt::get(CI->getType(), strcspn(S1.c_str(), S2.c_str())); + if (HasS1 && HasS2) { + size_t Pos = S1.find_first_of(S2); + if (Pos == StringRef::npos) Pos = S1.size(); + return ConstantInt::get(CI->getType(), Pos); + } // strcspn(s, "") -> strlen(s) if (TD && HasS2 && S2.empty()) @@ -756,11 +758,11 @@ } // Constant folding: memcmp(x, y, l) -> cnst (all arguments are constant) - std::string LHSStr, RHSStr; - if (GetConstantStringInfo(LHS, LHSStr) && - GetConstantStringInfo(RHS, RHSStr)) { + StringRef LHSStr, RHSStr; + if (getConstantStringInfo(LHS, LHSStr) && + getConstantStringInfo(RHS, RHSStr)) { // Make sure we're not reading out-of-bounds memory. - if (Len > LHSStr.length() || Len > RHSStr.length()) + if (Len > LHSStr.size() || Len > RHSStr.size()) return 0; uint64_t Ret = memcmp(LHSStr.data(), RHSStr.data(), Len); return ConstantInt::get(CI->getType(), Ret); @@ -1116,8 +1118,8 @@ Value *OptimizeFixedFormatString(Function *Callee, CallInst *CI, IRBuilder<> &B) { // Check for a fixed format string. - std::string FormatStr; - if (!GetConstantStringInfo(CI->getArgOperand(0), FormatStr)) + StringRef FormatStr; + if (!getConstantStringInfo(CI->getArgOperand(0), FormatStr)) return 0; // Empty format string -> noop. @@ -1143,7 +1145,7 @@ FormatStr.find('%') == std::string::npos) { // no format characters. // Create a string literal with no \n on it. We expect the constant merge // pass to be run after this pass, to merge duplicate strings. - FormatStr.erase(FormatStr.end()-1); + FormatStr = FormatStr.drop_back(); Value *GV = B.CreateGlobalString(FormatStr, "str"); EmitPutS(GV, B, TD); return CI->use_empty() ? (Value*)CI : @@ -1203,8 +1205,8 @@ Value *OptimizeFixedFormatString(Function *Callee, CallInst *CI, IRBuilder<> &B) { // Check for a fixed format string. 
-    std::string FormatStr;
-    if (!GetConstantStringInfo(CI->getArgOperand(1), FormatStr))
+    StringRef FormatStr;
+    if (!getConstantStringInfo(CI->getArgOperand(1), FormatStr))
       return 0;
 
     // If we just have a format string (nothing else crazy) transform it.
@@ -1358,8 +1360,8 @@
   Value *OptimizeFixedFormatString(Function *Callee, CallInst *CI,
                                    IRBuilder<> &B) {
     // All the optimizations depend on the format string.
-    std::string FormatStr;
-    if (!GetConstantStringInfo(CI->getArgOperand(1), FormatStr))
+    StringRef FormatStr;
+    if (!getConstantStringInfo(CI->getArgOperand(1), FormatStr))
       return 0;
 
     // fprintf(F, "foo") --> fwrite("foo", 3, 1, F)
@@ -1442,8 +1444,8 @@
       return 0;
 
     // Check for a constant string.
-    std::string Str;
-    if (!GetConstantStringInfo(CI->getArgOperand(0), Str))
+    StringRef Str;
+    if (!getConstantStringInfo(CI->getArgOperand(0), Str))
       return 0;
 
     if (Str.empty() && CI->use_empty()) {
@@ -2413,6 +2415,8 @@
 //   * stpcpy(str, "literal") ->
 //           llvm.memcpy(str,"literal",strlen("literal")+1,1)
 //
+// strchr:
+//   * strchr(p, 0) -> strlen(p)
 // tan, tanf, tanl:
 //   * tan(atan(x)) -> x
 //


From chandlerc at gmail.com  Mon Jan 30 22:51:03 2012
From: chandlerc at gmail.com (Chandler Carruth)
Date: Mon, 30 Jan 2012 20:51:03 -0800
Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple
In-Reply-To: <4f26dcb9.daebd80a.6078.2dc0SMTPIN_ADDED@mx.google.com>
References: <4f26853a.06a1ec0a.5cf8.ffff889dSMTPIN_ADDED@mx.google.com>
 <4f26dcb9.daebd80a.6078.2dc0SMTPIN_ADDED@mx.google.com>
Message-ID:

On Mon, Jan 30, 2012 at 10:08 AM, James Molloy wrote:

> Hi Chandler,
>
> One point:
>
> +  /// Note that this tests for 16-bit pointer width, and nothing else.
>
> I'm not sure this comment is accurate. For example, real mode x86 would be
> 16-bit but has 24-bit pointers (seg:offset).
>

As far as I can tell, there is no support for x86-16 or real mode or any of the other segmented addressing modes on x86.
If such modes are added, I would be quite happy for them to return false on all three of these queries. ;] I still think "pointer size" is the most descriptive term, but I'm open to more suggestions.

FWIW, the reason I don't particularly like "native width of the register file" is that I find it much less clear and unambiguous given the diversity of register widths on even modern architectures. However, addresses generally have a fixed size on modern architectures, and so that seems a good classification scheme.

I'm planning to go ahead and submit this, we can tweak the wording in post-commit until the bike shed is just the right shade. ;]


From clattner at apple.com  Mon Jan 30 22:55:35 2012
From: clattner at apple.com (Chris Lattner)
Date: Mon, 30 Jan 2012 20:55:35 -0800
Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple
In-Reply-To:
References: <4f26853a.06a1ec0a.5cf8.ffff889dSMTPIN_ADDED@mx.google.com>
 <4f26dcb9.daebd80a.6078.2dc0SMTPIN_ADDED@mx.google.com>
Message-ID:

On Jan 30, 2012, at 8:51 PM, Chandler Carruth wrote:

> On Mon, Jan 30, 2012 at 10:08 AM, James Molloy wrote:
> Hi Chandler,
>
> One point:
>
> +  /// Note that this tests for 16-bit pointer width, and nothing else.
>
> I'm not sure this comment is accurate. For example, real mode x86 would be 16-bit but has 24-bit pointers (seg:offset).
>
> As far as I can tell, there is no support for x86-16 or real mode or any of the other segmented addressing modes on x86. If such modes are added, I would be quite happy for them to return false on all three of these queries. ;] I still think "pointer size" is the most descriptive term, but I'm open to more suggestions.
>
> FWIW, the reason I don't particularly like "native width of the register file" is that I find it much less clear and unambiguous given the diversity of register widths on even modern architectures. However, addresses generally have a fixed size on modern architectures, and so that seems a good classification scheme.

Just my contribution to the bikeshed: I don't think that it makes sense for llvm::Triple to know the "native width of the register file", since that is such an amorphous statement. The best that llvm::Triple can know is sizeof(void*) in the default address space.

-Chris


From chandlerc at gmail.com  Mon Jan 30 22:52:32 2012
From: chandlerc at gmail.com (Chandler Carruth)
Date: Tue, 31 Jan 2012 04:52:32 -0000
Subject: [llvm-commits] [llvm] r149353 - in /llvm/trunk: include/llvm/ADT/Triple.h lib/Support/Triple.cpp unittests/ADT/TripleTest.cpp
Message-ID: <20120131045232.8BA372A6C12C@llvm.org>

Author: chandlerc
Date: Mon Jan 30 22:52:32 2012
New Revision: 149353

URL: http://llvm.org/viewvc/llvm-project?rev=149353&view=rev
Log:
Add various coarse bit-width architecture predicates to llvm::Triple.
These are very useful for frontends and other utilities reasoning about
or selecting between triples.
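
[Editorial sketch, not part of the commit.] The predicates in this revision all reduce to one question: what is the pointer width of the architecture? The following is a minimal standalone sketch of that shape, assuming a cut-down enum. The names `Arch` and `pointerBitWidth` are hypothetical stand-ins for `llvm::Triple::ArchType` and the commit's static helper, and only a handful of architectures are listed.

```cpp
#include <cassert>

// Hypothetical, cut-down stand-in for llvm::Triple::ArchType.
enum class Arch { Unknown, msp430, arm, x86, ppc64, x86_64 };

// One switch maps each architecture to a coarse pointer width in bits.
// Unknown architectures map to 0, so every predicate below is false for them.
unsigned pointerBitWidth(Arch A) {
  switch (A) {
  case Arch::Unknown:
    return 0;
  case Arch::msp430:
    return 16;
  case Arch::arm:
  case Arch::x86:
    return 32;
  case Arch::ppc64:
  case Arch::x86_64:
    return 64;
  }
  return 0; // Not reached for valid enumerators.
}

// The three predicates are thin comparisons against the single mapping.
bool isArch64Bit(Arch A) { return pointerBitWidth(A) == 64; }
bool isArch32Bit(Arch A) { return pointerBitWidth(A) == 32; }
bool isArch16Bit(Arch A) { return pointerBitWidth(A) == 16; }
```

Keeping a single mapping function means a newly added architecture only has to be classified once, and a width outside 16/32/64 would simply make all three predicates return false, which matches the behavior Chandler says he would accept for segmented modes.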
Modified: llvm/trunk/include/llvm/ADT/Triple.h llvm/trunk/lib/Support/Triple.cpp llvm/trunk/unittests/ADT/TripleTest.cpp Modified: llvm/trunk/include/llvm/ADT/Triple.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/Triple.h?rev=149353&r1=149352&r2=149353&view=diff ============================================================================== --- llvm/trunk/include/llvm/ADT/Triple.h (original) +++ llvm/trunk/include/llvm/ADT/Triple.h Mon Jan 30 22:52:32 2012 @@ -241,6 +241,25 @@ /// @name Convenience Predicates /// @{ + /// \brief Test whether the architecture is 64-bit + /// + /// Note that this tests for 64-bit pointer width, and nothing else. Note + /// that we intentionally expose only three predicates, 64-bit, 32-bit, and + /// 16-bit. The inner details of pointer width for particular architectures + /// is not summed up in the triple, and so only a coarse grained predicate + /// system is provided. + bool isArch64Bit() const; + + /// \brief Test whether the architecture is 32-bit + /// + /// Note that this tests for 32-bit pointer width, and nothing else. + bool isArch32Bit() const; + + /// \brief Test whether the architecture is 16-bit + /// + /// Note that this tests for 16-bit pointer width, and nothing else. + bool isArch16Bit() const; + /// isOSVersionLT - Helper function for doing comparisons against version /// numbers included in the target triple. 
bool isOSVersionLT(unsigned Major, unsigned Minor = 0, Modified: llvm/trunk/lib/Support/Triple.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/Triple.cpp?rev=149353&r1=149352&r2=149353&view=diff ============================================================================== --- llvm/trunk/lib/Support/Triple.cpp (original) +++ llvm/trunk/lib/Support/Triple.cpp Mon Jan 30 22:52:32 2012 @@ -665,3 +665,52 @@ void Triple::setOSAndEnvironmentName(StringRef Str) { setTriple(getArchName() + "-" + getVendorName() + "-" + Str); } + +static unsigned getArchPointerBitWidth(llvm::Triple::ArchType Arch) { + switch (Arch) { + case llvm::Triple::UnknownArch: + case llvm::Triple::InvalidArch: + return 0; + + case llvm::Triple::msp430: + return 16; + + case llvm::Triple::amdil: + case llvm::Triple::arm: + case llvm::Triple::cellspu: + case llvm::Triple::hexagon: + case llvm::Triple::le32: + case llvm::Triple::mblaze: + case llvm::Triple::mips: + case llvm::Triple::mipsel: + case llvm::Triple::ppc: + case llvm::Triple::ptx32: + case llvm::Triple::sparc: + case llvm::Triple::tce: + case llvm::Triple::thumb: + case llvm::Triple::x86: + case llvm::Triple::xcore: + return 32; + + case llvm::Triple::mips64: + case llvm::Triple::mips64el: + case llvm::Triple::ppc64: + case llvm::Triple::ptx64: + case llvm::Triple::sparcv9: + case llvm::Triple::x86_64: + return 64; + } + llvm_unreachable("Invalid architecture value"); +} + +bool Triple::isArch64Bit() const { + return getArchPointerBitWidth(getArch()) == 64; +} + +bool Triple::isArch32Bit() const { + return getArchPointerBitWidth(getArch()) == 32; +} + +bool Triple::isArch16Bit() const { + return getArchPointerBitWidth(getArch()) == 16; +} Modified: llvm/trunk/unittests/ADT/TripleTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ADT/TripleTest.cpp?rev=149353&r1=149352&r2=149353&view=diff ============================================================================== --- 
llvm/trunk/unittests/ADT/TripleTest.cpp (original) +++ llvm/trunk/unittests/ADT/TripleTest.cpp Mon Jan 30 22:52:32 2012 @@ -267,4 +267,61 @@ } +TEST(TripleTest, BitWidthPredicates) { + Triple T; + EXPECT_FALSE(T.isArch16Bit()); + EXPECT_FALSE(T.isArch32Bit()); + EXPECT_FALSE(T.isArch64Bit()); + + T.setArch(Triple::InvalidArch); + EXPECT_FALSE(T.isArch16Bit()); + EXPECT_FALSE(T.isArch32Bit()); + EXPECT_FALSE(T.isArch64Bit()); + + T.setArch(Triple::arm); + EXPECT_FALSE(T.isArch16Bit()); + EXPECT_TRUE(T.isArch32Bit()); + EXPECT_FALSE(T.isArch64Bit()); + + T.setArch(Triple::hexagon); + EXPECT_FALSE(T.isArch16Bit()); + EXPECT_TRUE(T.isArch32Bit()); + EXPECT_FALSE(T.isArch64Bit()); + + T.setArch(Triple::mips); + EXPECT_FALSE(T.isArch16Bit()); + EXPECT_TRUE(T.isArch32Bit()); + EXPECT_FALSE(T.isArch64Bit()); + + T.setArch(Triple::mips64); + EXPECT_FALSE(T.isArch16Bit()); + EXPECT_FALSE(T.isArch32Bit()); + EXPECT_TRUE(T.isArch64Bit()); + + T.setArch(Triple::msp430); + EXPECT_TRUE(T.isArch16Bit()); + EXPECT_FALSE(T.isArch32Bit()); + EXPECT_FALSE(T.isArch64Bit()); + + T.setArch(Triple::ppc); + EXPECT_FALSE(T.isArch16Bit()); + EXPECT_TRUE(T.isArch32Bit()); + EXPECT_FALSE(T.isArch64Bit()); + + T.setArch(Triple::ppc64); + EXPECT_FALSE(T.isArch16Bit()); + EXPECT_FALSE(T.isArch32Bit()); + EXPECT_TRUE(T.isArch64Bit()); + + T.setArch(Triple::x86); + EXPECT_FALSE(T.isArch16Bit()); + EXPECT_TRUE(T.isArch32Bit()); + EXPECT_FALSE(T.isArch64Bit()); + + T.setArch(Triple::x86_64); + EXPECT_FALSE(T.isArch16Bit()); + EXPECT_FALSE(T.isArch32Bit()); + EXPECT_TRUE(T.isArch64Bit()); +} + } From sabre at nondot.org Mon Jan 30 22:54:27 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 04:54:27 -0000 Subject: [llvm-commits] [llvm] r149354 - /llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Message-ID: <20120131045427.AE64D2A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 22:54:27 2012 New Revision: 149354 URL: http://llvm.org/viewvc/llvm-project?rev=149354&view=rev Log: 
eliminate the last uses of GetConstantStringInfo from this file, I didn't realize I was that close... Modified: llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Modified: llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp?rev=149354&r1=149353&r2=149354&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Mon Jan 30 22:54:27 2012 @@ -545,9 +545,9 @@ FT->getReturnType() != FT->getParamType(0)) return 0; - std::string S1, S2; - bool HasS1 = GetConstantStringInfo(CI->getArgOperand(0), S1); - bool HasS2 = GetConstantStringInfo(CI->getArgOperand(1), S2); + StringRef S1, S2; + bool HasS1 = getConstantStringInfo(CI->getArgOperand(0), S1); + bool HasS2 = getConstantStringInfo(CI->getArgOperand(1), S2); // strpbrk(s, "") -> NULL // strpbrk("", s) -> NULL @@ -694,9 +694,9 @@ } // See if either input string is a constant string. - std::string SearchStr, ToFindStr; - bool HasStr1 = GetConstantStringInfo(CI->getArgOperand(0), SearchStr); - bool HasStr2 = GetConstantStringInfo(CI->getArgOperand(1), ToFindStr); + StringRef SearchStr, ToFindStr; + bool HasStr1 = getConstantStringInfo(CI->getArgOperand(0), SearchStr); + bool HasStr2 = getConstantStringInfo(CI->getArgOperand(1), ToFindStr); // fold strstr(x, "") -> x. 
if (HasStr2 && ToFindStr.empty()) @@ -706,7 +706,7 @@ if (HasStr1 && HasStr2) { std::string::size_type Offset = SearchStr.find(ToFindStr); - if (Offset == std::string::npos) // strstr("foo", "bar") -> null + if (Offset == StringRef::npos) // strstr("foo", "bar") -> null return Constant::getNullValue(CI->getType()); // strstr("abcd", "bc") -> gep((char*)"abcd", 1) From sabre at nondot.org Mon Jan 30 23:09:17 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 05:09:17 -0000 Subject: [llvm-commits] [llvm] r149356 - in /llvm/trunk: include/llvm/Analysis/ValueTracking.h lib/Analysis/ValueTracking.cpp lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <20120131050917.C16DD2A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 23:09:17 2012 New Revision: 149356 URL: http://llvm.org/viewvc/llvm-project?rev=149356&view=rev Log: remove the last vestiges of llvm::GetConstantStringInfo, in CodeGen. Modified: llvm/trunk/include/llvm/Analysis/ValueTracking.h llvm/trunk/lib/Analysis/ValueTracking.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Modified: llvm/trunk/include/llvm/Analysis/ValueTracking.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/ValueTracking.h?rev=149356&r1=149355&r2=149356&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/ValueTracking.h (original) +++ llvm/trunk/include/llvm/Analysis/ValueTracking.h Mon Jan 30 23:09:17 2012 @@ -132,8 +132,8 @@ uint64_t Offset = 0); // FIXME: Remove this. - bool GetConstantStringInfo(const Value *V, std::string &Str, - uint64_t Offset = 0); + // bool GetConstantStringInfo(const Value *V, std::string &Str, + // uint64_t Offset = 0); /// GetStringLength - If we can compute the length of the string pointed to by /// the specified pointer, return 'len+1'. If we can't, return 0. 
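
[Editorial sketch, not part of the commit.] The StringRef-based getConstantStringInfo returns a view into the constant's bytes rather than a copied std::string: it skips Offset bytes and then trims at the first nul, so the returned string no longer carries the terminator. The following is a minimal standalone sketch of that offset-then-trim step, assuming `std::string_view` as a stand-in for `llvm::StringRef`. The helper name `constantStringAt` is hypothetical and is not LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper: Bytes plays the role of the array initializer,
// Offset the constant GEP index into it.
std::optional<std::string_view> constantStringAt(std::string_view Bytes,
                                                 std::size_t Offset) {
  // An offset past the end of the array cannot name a string.
  if (Offset > Bytes.size())
    return std::nullopt;

  // Skip over Offset bytes.
  std::string_view Str = Bytes.substr(Offset);

  // Trim the nul and anything after it. If no nul is present, find()
  // returns npos and substr(0, npos) keeps the whole remainder.
  return Str.substr(0, Str.find('\0'));
}
```

Because the result excludes the terminator, a caller that wants to treat the nul as readable (as strchr(s, 0) does) has to handle that case itself, for example by using the view's size as the index of the terminator.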
Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=149356&r1=149355&r2=149356&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original) +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Mon Jan 30 23:09:17 2012 @@ -1603,16 +1603,6 @@ } -// FIXME: Remove this. -bool llvm::GetConstantStringInfo(const Value *V, std::string &Str, - uint64_t Offset) { - StringRef Tmp; - if (!getConstantStringInfo(V, Tmp, Offset)) - return false; - Str = Tmp.str(); - return true; -} - /// getConstantStringInfo - This function computes the length of a /// null-terminated C string pointed to by V. If successful, it returns true /// and returns the string in Str. If unsuccessful, it returns false. Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=149356&r1=149355&r2=149356&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Mon Jan 30 23:09:17 2012 @@ -3299,7 +3299,7 @@ /// string ptr. static SDValue getMemsetStringVal(EVT VT, DebugLoc dl, SelectionDAG &DAG, const TargetLowering &TLI, - std::string &Str, unsigned Offset) { + StringRef Str, unsigned Offset) { // Handle vector with all elements zero. if (Str.empty()) { if (VT.isInteger()) @@ -3343,7 +3343,7 @@ /// isMemSrcFromString - Returns true if memcpy source is a string constant. 
/// -static bool isMemSrcFromString(SDValue Src, std::string &Str) { +static bool isMemSrcFromString(SDValue Src, StringRef &Str) { unsigned SrcDelta = 0; GlobalAddressSDNode *G = NULL; if (Src.getOpcode() == ISD::GlobalAddress) @@ -3358,11 +3358,8 @@ return false; if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(G->getGlobal())) - if (GetConstantStringInfo(GV, Str, SrcDelta)) { - // The nul can also be read. - Str.push_back(0); + if (getConstantStringInfo(GV, Str, SrcDelta)) return true; - } return false; } @@ -3467,7 +3464,7 @@ unsigned SrcAlign = DAG.InferPtrAlignment(Src); if (Align > SrcAlign) SrcAlign = Align; - std::string Str; + StringRef Str; bool CopyFromStr = isMemSrcFromString(Src, Str); bool isZeroStr = CopyFromStr && Str.empty(); unsigned Limit = AlwaysInline ? ~0U : TLI.getMaxStoresPerMemcpy(OptSize); From sabre at nondot.org Mon Jan 30 23:18:57 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 05:18:57 -0000 Subject: [llvm-commits] [llvm] r149357 - /llvm/trunk/lib/VMCore/Constants.cpp Message-ID: <20120131051857.10CEF2A6C12C@llvm.org> Author: lattner Date: Mon Jan 30 23:18:56 2012 New Revision: 149357 URL: http://llvm.org/viewvc/llvm-project?rev=149357&view=rev Log: fix a small oversight that broke the fhourstones app. Modified: llvm/trunk/lib/VMCore/Constants.cpp Modified: llvm/trunk/lib/VMCore/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Constants.cpp?rev=149357&r1=149356&r2=149357&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Constants.cpp (original) +++ llvm/trunk/lib/VMCore/Constants.cpp Mon Jan 30 23:18:56 2012 @@ -176,7 +176,7 @@ return UV->getElementValue(Elt); if (const ConstantDataSequential *CDS =dyn_cast<ConstantDataSequential>(this)) - return CDS->getElementAsConstant(Elt); + return Elt < CDS->getNumElements() ? 
CDS->getElementAsConstant(Elt) : 0; return 0; } From atrick at apple.com Mon Jan 30 23:55:32 2012 From: atrick at apple.com (Andrew Trick) Date: Tue, 31 Jan 2012 05:55:32 -0000 Subject: [llvm-commits] [llvm] r149360 - /llvm/trunk/lib/CodeGen/RegAllocFast.cpp Message-ID: <20120131055532.950F62A6C12C@llvm.org> Author: atrick Date: Mon Jan 30 23:55:32 2012 New Revision: 149360 URL: http://llvm.org/viewvc/llvm-project?rev=149360&view=rev Log: RAFast: Generalize the logic for return operands. This removes implicit assumption about the form of MI coming into regalloc. In particular, it should be independent of ProcessImplicitDefs which will eventually become a standard part of coming out of SSA--unless we simply can eliminate IMPLICIT_DEF completely. Current unit tests expose this once I remove incidental pass ordering restrictions. This is not a final fix. Just a temporary workaround until I figure out the right way. Modified: llvm/trunk/lib/CodeGen/RegAllocFast.cpp Modified: llvm/trunk/lib/CodeGen/RegAllocFast.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/RegAllocFast.cpp?rev=149360&r1=149359&r2=149360&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/RegAllocFast.cpp (original) +++ llvm/trunk/lib/CodeGen/RegAllocFast.cpp Mon Jan 30 23:55:32 2012 @@ -167,6 +167,7 @@ unsigned VirtReg, unsigned Hint); void spillAll(MachineInstr *MI); bool setPhysReg(MachineInstr *MI, unsigned OpNum, unsigned PhysReg); + void addRetOperands(MachineBasicBlock *MBB); }; char RAFast::ID = 0; } @@ -739,29 +740,64 @@ UsedInInstr.set(PartialDefs[i]); } -void RAFast::AllocateBasicBlock() { - DEBUG(dbgs() << "\nAllocating " << *MBB); +/// addRetOperand - ensure that a return instruction has an operand for each +/// value live out of the function. +/// +/// Things marked both call and return are tail calls; do not do this for them. 
+/// The tail callee need not take the same registers as input that it produces +/// as output, and there are dependencies for its input registers elsewhere. +/// +/// FIXME: This should be done as part of instruction selection, and this helper +/// should be deleted. Until then, we use custom logic here to create the proper +/// operand under all circumstances. We can't use addRegisterKilled because that +/// doesn't make sense for undefined values. We can't simply avoid calling it +/// for undefined values, because we must ensure that the operand always exists. +void RAFast::addRetOperands(MachineBasicBlock *MBB) { + if (MBB->empty() || !MBB->back().isReturn() || MBB->back().isCall()) + return; - // FIXME: This should probably be added by instruction selection instead? - // If the last instruction in the block is a return, make sure to mark it as - // using all of the live-out values in the function. Things marked both call - // and return are tail calls; do not do this for them. The tail callee need - // not take the same registers as input that it produces as output, and there - // are dependencies for its input registers elsewhere. - if (!MBB->empty() && MBB->back().isReturn() && - !MBB->back().isCall()) { - MachineInstr *Ret = &MBB->back(); - - for (MachineRegisterInfo::liveout_iterator - I = MF->getRegInfo().liveout_begin(), - E = MF->getRegInfo().liveout_end(); I != E; ++I) { - assert(TargetRegisterInfo::isPhysicalRegister(*I) && - "Cannot have a live-out virtual register."); + MachineInstr *MI = &MBB->back(); + + for (MachineRegisterInfo::liveout_iterator + I = MBB->getParent()->getRegInfo().liveout_begin(), + E = MBB->getParent()->getRegInfo().liveout_end(); I != E; ++I) { + unsigned Reg = *I; + assert(TargetRegisterInfo::isPhysicalRegister(Reg) && + "Cannot have a live-out virtual register."); + + bool hasDef = PhysRegState[Reg] == regReserved; + + // Check if this register already has an operand. 
+ bool Found = false; + for (unsigned i = 0, e = MI->getNumOperands(); i != e; ++i) { + MachineOperand &MO = MI->getOperand(i); + if (!MO.isReg() || !MO.isUse()) + continue; - // Add live-out registers as implicit uses. - Ret->addRegisterKilled(*I, TRI, true); + unsigned OperReg = MO.getReg(); + for (const unsigned *AS = TRI->getOverlaps(Reg); *AS; ++AS) { + if (OperReg != *AS) + continue; + if (OperReg == Reg || TRI->isSuperRegister(OperReg, Reg)) { + // If the ret already has an operand for this physreg or a superset, + // don't duplicate it. Set the kill flag if the value is defined. + if (hasDef && !MO.isKill()) + MO.setIsKill(); + Found = true; + break; + } + } } + if (!Found) + MI->addOperand(MachineOperand::CreateReg(Reg, + false /*IsDef*/, + true /*IsImp*/, + hasDef/*IsKill*/)); } +} + +void RAFast::AllocateBasicBlock() { + DEBUG(dbgs() << "\nAllocating " << *MBB); PhysRegState.assign(TRI->getNumRegs(), regDisabled); assert(LiveVirtRegs.empty() && "Mapping not cleared form last block?"); @@ -1033,6 +1069,9 @@ MBB->erase(Coalesced[i]); NumCopies += Coalesced.size(); + // addRetOperands must run after we've seen all defs in this block. + addRetOperands(MBB); + DEBUG(MBB->dump()); } From sabre at nondot.org Tue Jan 31 00:03:46 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 06:03:46 -0000 Subject: [llvm-commits] [llvm] r149361 - /llvm/trunk/tools/lto/LTOModule.cpp Message-ID: <20120131060346.87FA72A6C12C@llvm.org> Author: lattner Date: Tue Jan 31 00:03:46 2012 New Revision: 149361 URL: http://llvm.org/viewvc/llvm-project?rev=149361&view=rev Log: update this to ConstantDataArray. There are no tests and this isn't using the preferred functionality for ripping apart strings, so I have no way to test this. 
Modified: llvm/trunk/tools/lto/LTOModule.cpp Modified: llvm/trunk/tools/lto/LTOModule.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOModule.cpp?rev=149361&r1=149360&r2=149361&view=diff ============================================================================== --- llvm/trunk/tools/lto/LTOModule.cpp (original) +++ llvm/trunk/tools/lto/LTOModule.cpp Tue Jan 31 00:03:46 2012 @@ -190,9 +190,9 @@ Constant *op = ce->getOperand(0); if (GlobalVariable *gvn = dyn_cast<GlobalVariable>(op)) { Constant *cn = gvn->getInitializer(); - if (ConstantArray *ca = dyn_cast<ConstantArray>(cn)) { + if (ConstantDataArray *ca = dyn_cast<ConstantDataArray>(cn)) { if (ca->isCString()) { - name = ".objc_class_name_" + ca->getAsCString(); + name = ".objc_class_name_" + ca->getAsCString().str(); return true; } } From sabre at nondot.org Tue Jan 31 00:05:00 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 06:05:00 -0000 Subject: [llvm-commits] [llvm] r149362 - in /llvm/trunk: include/llvm/Constants.h lib/Bitcode/Writer/BitcodeWriter.cpp lib/Bitcode/Writer/ValueEnumerator.cpp lib/CodeGen/AsmPrinter/AsmPrinter.cpp lib/Target/CBackend/CBackend.cpp lib/Target/CppBackend/CPPBackend.cpp lib/VMCore/AsmWriter.cpp lib/VMCore/Constants.cpp Message-ID: <20120131060500.7C76F2A6C12C@llvm.org> Author: lattner Date: Tue Jan 31 00:05:00 2012 New Revision: 149362 URL: http://llvm.org/viewvc/llvm-project?rev=149362&view=rev Log: with recent changes, ConstantArray is never a "string". Remove the associated methods and constant fold the clients to false. 
Modified: llvm/trunk/include/llvm/Constants.h llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp llvm/trunk/lib/Bitcode/Writer/ValueEnumerator.cpp llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/trunk/lib/Target/CBackend/CBackend.cpp llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp llvm/trunk/lib/VMCore/AsmWriter.cpp llvm/trunk/lib/VMCore/Constants.cpp Modified: llvm/trunk/include/llvm/Constants.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Constants.h?rev=149362&r1=149361&r2=149362&view=diff ============================================================================== --- llvm/trunk/include/llvm/Constants.h (original) +++ llvm/trunk/include/llvm/Constants.h Tue Jan 31 00:05:00 2012 @@ -373,31 +373,6 @@ return reinterpret_cast(Value::getType()); } - // FIXME: String methods will eventually be removed. - - - /// isString - This method returns true if the array is an array of i8 and - /// the elements of the array are all ConstantInt's. - bool isString() const; - - /// isCString - This method returns true if the array is a string (see - /// @verbatim - /// isString) and it ends in a null byte \0 and does not contains any other - /// @endverbatim - /// null bytes except its terminator. - bool isCString() const; - - /// getAsString - If this array is isString(), then this method converts the - /// array to an std::string and returns it. Otherwise, it asserts out. - /// - std::string getAsString() const; - - /// getAsCString - If this array is isCString(), then this method converts the - /// array (without the trailing null byte) to an std::string and returns it. - /// Otherwise, it asserts out. 
- /// - std::string getAsCString() const; - virtual void destroyConstant(); virtual void replaceUsesOfWithOnConstant(Value *From, Value *To, Use *U); Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=149362&r1=149361&r2=149362&view=diff ============================================================================== --- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp (original) +++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Tue Jan 31 00:05:00 2012 @@ -845,32 +845,6 @@ } else { assert (0 && "Unknown FP type!"); } - } else if (isa(C) && cast(C)->isString()) { - const ConstantArray *CA = cast(C); - // Emit constant strings specially. - unsigned NumOps = CA->getNumOperands(); - // If this is a null-terminated string, use the denser CSTRING encoding. - if (CA->getOperand(NumOps-1)->isNullValue()) { - Code = bitc::CST_CODE_CSTRING; - --NumOps; // Don't encode the null, which isn't allowed by char6. 
- } else { - Code = bitc::CST_CODE_STRING; - AbbrevToUse = String8Abbrev; - } - bool isCStr7 = Code == bitc::CST_CODE_CSTRING; - bool isCStrChar6 = Code == bitc::CST_CODE_CSTRING; - for (unsigned i = 0; i != NumOps; ++i) { - unsigned char V = cast(CA->getOperand(i))->getZExtValue(); - Record.push_back(V); - isCStr7 &= (V & 128) == 0; - if (isCStrChar6) - isCStrChar6 = BitCodeAbbrevOp::isChar6(V); - } - - if (isCStrChar6) - AbbrevToUse = CString6Abbrev; - else if (isCStr7) - AbbrevToUse = CString7Abbrev; } else if (isa(C) && cast(C)->isString()) { const ConstantDataSequential *Str = cast(C); Modified: llvm/trunk/lib/Bitcode/Writer/ValueEnumerator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/ValueEnumerator.cpp?rev=149362&r1=149361&r2=149362&view=diff ============================================================================== --- llvm/trunk/lib/Bitcode/Writer/ValueEnumerator.cpp (original) +++ llvm/trunk/lib/Bitcode/Writer/ValueEnumerator.cpp Tue Jan 31 00:05:00 2012 @@ -321,10 +321,6 @@ if (const Constant *C = dyn_cast(V)) { if (isa(C)) { // Initializers for globals are handled explicitly elsewhere. - } else if (isa(C) && cast(C)->isString()) { - // Do not enumerate the initializers for an array of simple characters. - // The initializers just pollute the value table, and we emit the strings - // specially. } else if (C->getNumOperands()) { // If a constant has operands, enumerate them. 
This makes sure that if a // constant has uses (for example an array of const ints), that they are Modified: llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp?rev=149362&r1=149361&r2=149362&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Tue Jan 31 00:05:00 2012 @@ -1675,31 +1675,18 @@ static void EmitGlobalConstantArray(const ConstantArray *CA, unsigned AddrSpace, AsmPrinter &AP) { - if (AddrSpace != 0 || !CA->isString()) { - // Not a string. Print the values in successive locations. - - // See if we can aggregate some values. Make sure it can be - // represented as a series of bytes of the constant value. - int Value = isRepeatedByteSequence(CA, AP.TM); - - if (Value != -1) { - uint64_t Bytes = AP.TM.getTargetData()->getTypeAllocSize(CA->getType()); - AP.OutStreamer.EmitFill(Bytes, Value, AddrSpace); - } - else { - for (unsigned i = 0, e = CA->getNumOperands(); i != e; ++i) - EmitGlobalConstantImpl(CA->getOperand(i), AddrSpace, AP); - } - return; + // See if we can aggregate some values. Make sure it can be + // represented as a series of bytes of the constant value. + int Value = isRepeatedByteSequence(CA, AP.TM); + + if (Value != -1) { + uint64_t Bytes = AP.TM.getTargetData()->getTypeAllocSize(CA->getType()); + AP.OutStreamer.EmitFill(Bytes, Value, AddrSpace); + } + else { + for (unsigned i = 0, e = CA->getNumOperands(); i != e; ++i) + EmitGlobalConstantImpl(CA->getOperand(i), AddrSpace, AP); } - - // Otherwise, it can be emitted as .ascii. 
- SmallVector TmpVec; - TmpVec.reserve(CA->getNumOperands()); - for (unsigned i = 0, e = CA->getNumOperands(); i != e; ++i) - TmpVec.push_back(cast(CA->getOperand(i))->getZExtValue()); - - AP.OutStreamer.EmitBytes(StringRef(TmpVec.data(), TmpVec.size()), AddrSpace); } static void EmitGlobalConstantVector(const ConstantVector *CV, Modified: llvm/trunk/lib/Target/CBackend/CBackend.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CBackend/CBackend.cpp?rev=149362&r1=149361&r2=149362&view=diff ============================================================================== --- llvm/trunk/lib/Target/CBackend/CBackend.cpp (original) +++ llvm/trunk/lib/Target/CBackend/CBackend.cpp Tue Jan 31 00:05:00 2012 @@ -558,73 +558,21 @@ } void CWriter::printConstantArray(ConstantArray *CPA, bool Static) { - // As a special case, print the array as a string if it is an array of - // ubytes or an array of sbytes with positive values. - // - if (CPA->isCString()) { - Out << '\"'; - // Keep track of whether the last number was a hexadecimal escape. - bool LastWasHex = false; - - // Do not include the last character, which we know is null - for (unsigned i = 0, e = CPA->getNumOperands()-1; i != e; ++i) { - unsigned char C = cast(CPA->getOperand(i))->getZExtValue(); - - // Print it out literally if it is a printable character. The only thing - // to be careful about is when the last letter output was a hex escape - // code, in which case we have to be careful not to print out hex digits - // explicitly (the C compiler thinks it is a continuation of the previous - // character, sheesh...) 
- // - if (isprint(C) && (!LastWasHex || !isxdigit(C))) { - LastWasHex = false; - if (C == '"' || C == '\\') - Out << "\\" << (char)C; - else - Out << (char)C; - } else { - LastWasHex = false; - switch (C) { - case '\n': Out << "\\n"; break; - case '\t': Out << "\\t"; break; - case '\r': Out << "\\r"; break; - case '\v': Out << "\\v"; break; - case '\a': Out << "\\a"; break; - case '\"': Out << "\\\""; break; - case '\'': Out << "\\\'"; break; - default: - Out << "\\x"; - Out << (char)(( C/16 < 10) ? ( C/16 +'0') : ( C/16 -10+'A')); - Out << (char)(((C&15) < 10) ? ((C&15)+'0') : ((C&15)-10+'A')); - LastWasHex = true; - break; - } - } - } - Out << '\"'; - } else { - Out << '{'; - if (CPA->getNumOperands()) { - Out << ' '; - printConstant(cast(CPA->getOperand(0)), Static); - for (unsigned i = 1, e = CPA->getNumOperands(); i != e; ++i) { - Out << ", "; - printConstant(cast(CPA->getOperand(i)), Static); - } - } - Out << " }"; + Out << "{ "; + printConstant(cast(CPA->getOperand(0)), Static); + for (unsigned i = 1, e = CPA->getNumOperands(); i != e; ++i) { + Out << ", "; + printConstant(cast(CPA->getOperand(i)), Static); } + Out << " }"; } void CWriter::printConstantVector(ConstantVector *CP, bool Static) { - Out << '{'; - if (CP->getNumOperands()) { - Out << ' '; - printConstant(cast(CP->getOperand(0)), Static); - for (unsigned i = 1, e = CP->getNumOperands(); i != e; ++i) { - Out << ", "; - printConstant(cast(CP->getOperand(i)), Static); - } + Out << "{ "; + printConstant(cast(CP->getOperand(0)), Static); + for (unsigned i = 1, e = CP->getNumOperands(); i != e; ++i) { + Out << ", "; + printConstant(cast(CP->getOperand(i)), Static); } Out << " }"; } Modified: llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp?rev=149362&r1=149361&r2=149362&view=diff ============================================================================== --- llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp 
(original) +++ llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp Tue Jan 31 00:05:00 2012 @@ -698,36 +698,17 @@ printCFP(CFP); Out << ";"; } else if (const ConstantArray *CA = dyn_cast(CV)) { - if (CA->isString()) { - Out << "Constant* " << constName << - " = ConstantArray::get(mod->getContext(), \""; - std::string tmp = CA->getAsString(); - bool nullTerminate = false; - if (tmp[tmp.length()-1] == 0) { - tmp.erase(tmp.length()-1); - nullTerminate = true; - } - printEscapedString(tmp); - // Determine if we want null termination or not. - if (nullTerminate) - Out << "\", true"; // Indicate that the null terminator should be - // added. - else - Out << "\", false";// No null terminator - Out << ");"; - } else { - Out << "std::vector " << constName << "_elems;"; + Out << "std::vector " << constName << "_elems;"; + nl(Out); + unsigned N = CA->getNumOperands(); + for (unsigned i = 0; i < N; ++i) { + printConstant(CA->getOperand(i)); // recurse to print operands + Out << constName << "_elems.push_back(" + << getCppName(CA->getOperand(i)) << ");"; nl(Out); - unsigned N = CA->getNumOperands(); - for (unsigned i = 0; i < N; ++i) { - printConstant(CA->getOperand(i)); // recurse to print operands - Out << constName << "_elems.push_back(" - << getCppName(CA->getOperand(i)) << ");"; - nl(Out); - } - Out << "Constant* " << constName << " = ConstantArray::get(" - << typeName << ", " << constName << "_elems);"; } + Out << "Constant* " << constName << " = ConstantArray::get(" + << typeName << ", " << constName << "_elems);"; } else if (const ConstantStruct *CS = dyn_cast(CV)) { Out << "std::vector " << constName << "_fields;"; nl(Out); @@ -740,14 +721,14 @@ } Out << "Constant* " << constName << " = ConstantStruct::get(" << typeName << ", " << constName << "_fields);"; - } else if (const ConstantVector *CP = dyn_cast(CV)) { + } else if (const ConstantVector *CV = dyn_cast(CV)) { Out << "std::vector " << constName << "_elems;"; nl(Out); - unsigned N = CP->getNumOperands(); + unsigned N 
= CV->getNumOperands(); for (unsigned i = 0; i < N; ++i) { - printConstant(CP->getOperand(i)); + printConstant(CV->getOperand(i)); Out << constName << "_elems.push_back(" - << getCppName(CP->getOperand(i)) << ");"; + << getCppName(CV->getOperand(i)) << ");"; nl(Out); } Out << "Constant* " << constName << " = ConstantVector::get(" @@ -760,7 +741,7 @@ if (CDS->isString()) { Out << "Constant *" << constName << " = ConstantDataArray::getString(mod->getContext(), \""; - StringRef Str = CA->getAsString(); + StringRef Str = CDS->getAsString(); bool nullTerminate = false; if (Str.back() == 0) { Str = Str.drop_back(); Modified: llvm/trunk/lib/VMCore/AsmWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/AsmWriter.cpp?rev=149362&r1=149361&r2=149362&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/AsmWriter.cpp (original) +++ llvm/trunk/lib/VMCore/AsmWriter.cpp Tue Jan 31 00:05:00 2012 @@ -827,30 +827,21 @@ } if (const ConstantArray *CA = dyn_cast<ConstantArray>(CV)) { - // As a special case, print the array as a string if it is an array of - // i8 with ConstantInt values. - // Type *ETy = CA->getType()->getElementType(); - if (CA->isString()) { - Out << "c\""; - PrintEscapedString(CA->getAsString(), Out); - Out << '"'; - } else { // Cannot output in string format... 
- Out << '['; + Out << '['; + TypePrinter.print(ETy, Out); + Out << ' '; + WriteAsOperandInternal(Out, CA->getOperand(0), + &TypePrinter, Machine, + Context); + for (unsigned i = 1, e = CA->getNumOperands(); i != e; ++i) { + Out << ", "; TypePrinter.print(ETy, Out); Out << ' '; - WriteAsOperandInternal(Out, CA->getOperand(0), - &TypePrinter, Machine, + WriteAsOperandInternal(Out, CA->getOperand(i), &TypePrinter, Machine, Context); - for (unsigned i = 1, e = CA->getNumOperands(); i != e; ++i) { - Out << ", "; - TypePrinter.print(ETy, Out); - Out << ' '; - WriteAsOperandInternal(Out, CA->getOperand(i), &TypePrinter, Machine, - Context); - } - Out << ']'; } + Out << ']'; return; } Modified: llvm/trunk/lib/VMCore/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Constants.cpp?rev=149362&r1=149361&r2=149362&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Constants.cpp (original) +++ llvm/trunk/lib/VMCore/Constants.cpp Tue Jan 31 00:05:00 2012 @@ -1201,69 +1201,6 @@ destroyConstantImpl(); } -/// isString - This method returns true if the array is an array of i8, and -/// if the elements of the array are all ConstantInt's. -bool ConstantArray::isString() const { - // Check the element type for i8... - if (!getType()->getElementType()->isIntegerTy(8)) - return false; - // Check the elements to make sure they are all integers, not constant - // expressions. - for (unsigned i = 0, e = getNumOperands(); i != e; ++i) - if (!isa(getOperand(i))) - return false; - return true; -} - -/// isCString - This method returns true if the array is a string (see -/// isString) and it ends in a null byte \\0 and does not contains any other -/// null bytes except its terminator. -bool ConstantArray::isCString() const { - // Check the element type for i8... - if (!getType()->getElementType()->isIntegerTy(8)) - return false; - - // Last element must be a null. 
- if (!getOperand(getNumOperands()-1)->isNullValue()) - return false; - // Other elements must be non-null integers. - for (unsigned i = 0, e = getNumOperands()-1; i != e; ++i) { - if (!isa(getOperand(i))) - return false; - if (getOperand(i)->isNullValue()) - return false; - } - return true; -} - - -/// convertToString - Helper function for getAsString() and getAsCString(). -static std::string convertToString(const User *U, unsigned len) { - std::string Result; - Result.reserve(len); - for (unsigned i = 0; i != len; ++i) - Result.push_back((char)cast(U->getOperand(i))->getZExtValue()); - return Result; -} - -/// getAsString - If this array is isString(), then this method converts the -/// array to an std::string and returns it. Otherwise, it asserts out. -/// -std::string ConstantArray::getAsString() const { - assert(isString() && "Not a string!"); - return convertToString(this, getNumOperands()); -} - - -/// getAsCString - If this array is isCString(), then this method converts the -/// array (without the trailing null byte) to an std::string and returns it. -/// Otherwise, it asserts out. -/// -std::string ConstantArray::getAsCString() const { - assert(isCString() && "Not a string!"); - return convertToString(this, getNumOperands() - 1); -} - //---- ConstantStruct::get() implementation... // From sabre at nondot.org Tue Jan 31 00:17:27 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 06:17:27 -0000 Subject: [llvm-commits] [llvm] r149364 - /llvm/trunk/include/llvm/Analysis/ValueTracking.h Message-ID: <20120131061727.17AA32A6C12C@llvm.org> Author: lattner Date: Tue Jan 31 00:17:26 2012 New Revision: 149364 URL: http://llvm.org/viewvc/llvm-project?rev=149364&view=rev Log: remove commented-out code. 
Modified: llvm/trunk/include/llvm/Analysis/ValueTracking.h Modified: llvm/trunk/include/llvm/Analysis/ValueTracking.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/ValueTracking.h?rev=149364&r1=149363&r2=149364&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/ValueTracking.h (original) +++ llvm/trunk/include/llvm/Analysis/ValueTracking.h Tue Jan 31 00:17:26 2012 @@ -131,10 +131,6 @@ bool getConstantStringInfo(const Value *V, StringRef &Str, uint64_t Offset = 0); - // FIXME: Remove this. - // bool GetConstantStringInfo(const Value *V, std::string &Str, - // uint64_t Offset = 0); - /// GetStringLength - If we can compute the length of the string pointed to by /// the specified pointer, return 'len+1'. If we can't, return 0. uint64_t GetStringLength(Value *V); From sabre at nondot.org Tue Jan 31 00:18:43 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 06:18:43 -0000 Subject: [llvm-commits] [llvm] r149365 - in /llvm/trunk: include/llvm/Constants.h lib/AsmParser/LLParser.cpp lib/Transforms/Instrumentation/AddressSanitizer.cpp lib/VMCore/Constants.cpp lib/VMCore/Core.cpp lib/VMCore/IRBuilder.cpp tools/bugpoint/Miscompilation.cpp Message-ID: <20120131061843.59AAC2A6C12C@llvm.org> Author: lattner Date: Tue Jan 31 00:18:43 2012 New Revision: 149365 URL: http://llvm.org/viewvc/llvm-project?rev=149365&view=rev Log: eliminate the "string" form of ConstantArray::get, using ConstantDataArray::getString instead. 
Modified: llvm/trunk/include/llvm/Constants.h llvm/trunk/lib/AsmParser/LLParser.cpp llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp llvm/trunk/lib/VMCore/Constants.cpp llvm/trunk/lib/VMCore/Core.cpp llvm/trunk/lib/VMCore/IRBuilder.cpp llvm/trunk/tools/bugpoint/Miscompilation.cpp Modified: llvm/trunk/include/llvm/Constants.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Constants.h?rev=149365&r1=149364&r2=149365&view=diff ============================================================================== --- llvm/trunk/include/llvm/Constants.h (original) +++ llvm/trunk/include/llvm/Constants.h Tue Jan 31 00:18:43 2012 @@ -352,17 +352,6 @@ // ConstantArray accessors static Constant *get(ArrayType *T, ArrayRef<Constant*> V); - /// This method constructs a ConstantArray and initializes it with a text - /// string. The default behavior (AddNull==true) causes a null terminator to - /// be placed at the end of the array. This effectively increases the length - /// of the array by one (you've been warned). However, in some situations - /// this is not desired so if AddNull==false then the string is copied without - /// null termination. - - // FIXME Remove this. - static Constant *get(LLVMContext &Context, StringRef Initializer, - bool AddNull = true); - /// Transparently provide more efficient getOperand methods. 
DECLARE_TRANSPARENT_OPERAND_ACCESSORS(Constant); Modified: llvm/trunk/lib/AsmParser/LLParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/AsmParser/LLParser.cpp?rev=149365&r1=149364&r2=149365&view=diff ============================================================================== --- llvm/trunk/lib/AsmParser/LLParser.cpp (original) +++ llvm/trunk/lib/AsmParser/LLParser.cpp Tue Jan 31 00:18:43 2012 @@ -2018,7 +2018,8 @@ } case lltok::kw_c: // c "foo" Lex.Lex(); - ID.ConstantVal = ConstantArray::get(Context, Lex.getStrVal(), false); + ID.ConstantVal = ConstantDataArray::getString(Context, Lex.getStrVal(), + false); if (ParseToken(lltok::StringConstant, "expected string")) return true; ID.Kind = ValID::t_Constant; return false; Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp?rev=149365&r1=149364&r2=149365&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp (original) +++ llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Tue Jan 31 00:18:43 2012 @@ -213,7 +213,7 @@ // Create a constant for Str so that we can pass it to the run-time lib. 
static GlobalVariable *createPrivateGlobalForString(Module &M, StringRef Str) { - Constant *StrConst = ConstantArray::get(M.getContext(), Str); + Constant *StrConst = ConstantDataArray::getString(M.getContext(), Str); return new GlobalVariable(M, StrConst->getType(), true, GlobalValue::PrivateLinkage, StrConst, ""); } Modified: llvm/trunk/lib/VMCore/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Constants.cpp?rev=149365&r1=149364&r2=149365&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Constants.cpp (original) +++ llvm/trunk/lib/VMCore/Constants.cpp Tue Jan 31 00:18:43 2012 @@ -780,12 +780,6 @@ return pImpl->ArrayConstants.getOrCreate(Ty, V); } -// FIXME: Remove this method. -Constant *ConstantArray::get(LLVMContext &Context, StringRef Str, - bool AddNull) { - return ConstantDataArray::getString(Context, Str, AddNull); -} - /// getTypeForElements - Return an anonymous struct type to use for a constant /// with the specified set of elements. The list must not be empty. StructType *ConstantStruct::getTypeForElements(LLVMContext &Context, Modified: llvm/trunk/lib/VMCore/Core.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Core.cpp?rev=149365&r1=149364&r2=149365&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Core.cpp (original) +++ llvm/trunk/lib/VMCore/Core.cpp Tue Jan 31 00:18:43 2012 @@ -634,8 +634,8 @@ LLVMBool DontNullTerminate) { /* Inverted the sense of AddNull because ', 0)' is a better mnemonic for null termination than ', 1)'. 
*/ - return wrap(ConstantArray::get(*unwrap(C), StringRef(Str, Length), - DontNullTerminate == 0)); + return wrap(ConstantDataArray::getString(*unwrap(C), StringRef(Str, Length), + DontNullTerminate == 0)); } LLVMValueRef LLVMConstStructInContext(LLVMContextRef C, LLVMValueRef *ConstantVals, Modified: llvm/trunk/lib/VMCore/IRBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/IRBuilder.cpp?rev=149365&r1=149364&r2=149365&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/IRBuilder.cpp (original) +++ llvm/trunk/lib/VMCore/IRBuilder.cpp Tue Jan 31 00:18:43 2012 @@ -24,7 +24,7 @@ /// specified. If Name is specified, it is the name of the global variable /// created. Value *IRBuilderBase::CreateGlobalString(StringRef Str, const Twine &Name) { - Constant *StrConstant = ConstantArray::get(Context, Str, true); + Constant *StrConstant = ConstantDataArray::getString(Context, Str); Module &M = *BB->getParent()->getParent(); GlobalVariable *GV = new GlobalVariable(M, StrConstant->getType(), true, GlobalValue::PrivateLinkage, Modified: llvm/trunk/tools/bugpoint/Miscompilation.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/bugpoint/Miscompilation.cpp?rev=149365&r1=149364&r2=149365&view=diff ============================================================================== --- llvm/trunk/tools/bugpoint/Miscompilation.cpp (original) +++ llvm/trunk/tools/bugpoint/Miscompilation.cpp Tue Jan 31 00:18:43 2012 @@ -820,7 +820,8 @@ // Don't forward functions which are external in the test module too. if (TestFn && !TestFn->isDeclaration()) { // 1. 
Add a string constant with its name to the global file - Constant *InitArray = ConstantArray::get(F->getContext(), F->getName()); + Constant *InitArray = + ConstantDataArray::getString(F->getContext(), F->getName()); GlobalVariable *funcName = new GlobalVariable(*Safe, InitArray->getType(), true /*isConstant*/, GlobalValue::InternalLinkage, InitArray, From chandlerc at gmail.com Tue Jan 31 00:25:31 2012 From: chandlerc at gmail.com (Chandler Carruth) Date: Mon, 30 Jan 2012 22:25:31 -0800 Subject: [llvm-commits] PATCH: Add support to llvm::Triple for computing 32-bit and 64-bit variant triples. Message-ID: This patch teaches the Triple class to compute 32-bit variants of 64-bit architectures and 64-bit variants of 32-bit architectures. These can be used when reasoning about what alternate triples may have semi-compatible toolchains such as multiarch and bi-arch toolchains. The goal in placing this logic here is to associate it closely with the triple and architecture definitions themselves so that as those change, this gets updated and maintained. Comments on the somewhat clunky API welcome. The reason I went with returning a full triple rather than operating exclusively on the Arch is to make code using the interface as concise as possible. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/0452cc4d/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: triple-predicates3.patch Type: text/x-patch Size: 6093 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/0452cc4d/attachment.bin From craig.topper at gmail.com Tue Jan 31 00:52:44 2012 From: craig.topper at gmail.com (Craig Topper) Date: Tue, 31 Jan 2012 06:52:44 -0000 Subject: [llvm-commits] [llvm] r149367 - in /llvm/trunk: include/llvm/IntrinsicsX86.td lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/2006-05-11-InstrSched.ll test/CodeGen/X86/avx-intrinsics-x86.ll test/CodeGen/X86/avx2-intrinsics-x86.ll Message-ID: <20120131065244.AB1222A6C12C@llvm.org> Author: ctopper Date: Tue Jan 31 00:52:44 2012 New Revision: 149367 URL: http://llvm.org/viewvc/llvm-project?rev=149367&view=rev Log: Remove pcmpgt/pcmpeq intrinsics as clang is not using them. Modified: llvm/trunk/include/llvm/IntrinsicsX86.td llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll Modified: llvm/trunk/include/llvm/IntrinsicsX86.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IntrinsicsX86.td?rev=149367&r1=149366&r2=149367&view=diff ============================================================================== --- llvm/trunk/include/llvm/IntrinsicsX86.td (original) +++ llvm/trunk/include/llvm/IntrinsicsX86.td Tue Jan 31 00:52:44 2012 @@ -452,28 +452,6 @@ llvm_i32_ty], [IntrNoMem]>; } -// Integer comparison ops -let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". 
- def int_x86_sse2_pcmpeq_b : GCCBuiltin<"__builtin_ia32_pcmpeqb128">, - Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, - llvm_v16i8_ty], [IntrNoMem, Commutative]>; - def int_x86_sse2_pcmpeq_w : GCCBuiltin<"__builtin_ia32_pcmpeqw128">, - Intrinsic<[llvm_v8i16_ty], [llvm_v8i16_ty, - llvm_v8i16_ty], [IntrNoMem, Commutative]>; - def int_x86_sse2_pcmpeq_d : GCCBuiltin<"__builtin_ia32_pcmpeqd128">, - Intrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, - llvm_v4i32_ty], [IntrNoMem, Commutative]>; - def int_x86_sse2_pcmpgt_b : GCCBuiltin<"__builtin_ia32_pcmpgtb128">, - Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, - llvm_v16i8_ty], [IntrNoMem]>; - def int_x86_sse2_pcmpgt_w : GCCBuiltin<"__builtin_ia32_pcmpgtw128">, - Intrinsic<[llvm_v8i16_ty], [llvm_v8i16_ty, - llvm_v8i16_ty], [IntrNoMem]>; - def int_x86_sse2_pcmpgt_d : GCCBuiltin<"__builtin_ia32_pcmpgtd128">, - Intrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, - llvm_v4i32_ty], [IntrNoMem]>; -} - // Conversion ops let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". def int_x86_sse2_cvtdq2pd : GCCBuiltin<"__builtin_ia32_cvtdq2pd">, @@ -792,12 +770,6 @@ // Vector compare, min, max let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". - def int_x86_sse41_pcmpeqq : GCCBuiltin<"__builtin_ia32_pcmpeqq">, - Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, llvm_v2i64_ty], - [IntrNoMem, Commutative]>; - def int_x86_sse42_pcmpgtq : GCCBuiltin<"__builtin_ia32_pcmpgtq">, - Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, llvm_v2i64_ty], - [IntrNoMem]>; def int_x86_sse41_pmaxsb : GCCBuiltin<"__builtin_ia32_pmaxsb128">, Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, llvm_v16i8_ty], [IntrNoMem, Commutative]>; @@ -1515,34 +1487,6 @@ llvm_i32_ty], [IntrNoMem]>; } -// Integer comparison ops -let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". 
- def int_x86_avx2_pcmpeq_b : GCCBuiltin<"__builtin_ia32_pcmpeqb256">, - Intrinsic<[llvm_v32i8_ty], [llvm_v32i8_ty, llvm_v32i8_ty], - [IntrNoMem, Commutative]>; - def int_x86_avx2_pcmpeq_w : GCCBuiltin<"__builtin_ia32_pcmpeqw256">, - Intrinsic<[llvm_v16i16_ty], [llvm_v16i16_ty, llvm_v16i16_ty], - [IntrNoMem, Commutative]>; - def int_x86_avx2_pcmpeq_d : GCCBuiltin<"__builtin_ia32_pcmpeqd256">, - Intrinsic<[llvm_v8i32_ty], [llvm_v8i32_ty, llvm_v8i32_ty], - [IntrNoMem, Commutative]>; - def int_x86_avx2_pcmpeq_q : GCCBuiltin<"__builtin_ia32_pcmpeqq256">, - Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v4i64_ty], - [IntrNoMem, Commutative]>; - def int_x86_avx2_pcmpgt_b : GCCBuiltin<"__builtin_ia32_pcmpgtb256">, - Intrinsic<[llvm_v32i8_ty], [llvm_v32i8_ty, llvm_v32i8_ty], - [IntrNoMem]>; - def int_x86_avx2_pcmpgt_w : GCCBuiltin<"__builtin_ia32_pcmpgtw256">, - Intrinsic<[llvm_v16i16_ty], [llvm_v16i16_ty, llvm_v16i16_ty], - [IntrNoMem]>; - def int_x86_avx2_pcmpgt_d : GCCBuiltin<"__builtin_ia32_pcmpgtd256">, - Intrinsic<[llvm_v8i32_ty], [llvm_v8i32_ty, llvm_v8i32_ty], - [IntrNoMem]>; - def int_x86_avx2_pcmpgt_q : GCCBuiltin<"__builtin_ia32_pcmpgtq256">, - Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v4i64_ty], - [IntrNoMem]>; -} - // Pack ops. let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". 
def int_x86_avx2_packsswb : GCCBuiltin<"__builtin_ia32_packsswb256">, Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149367&r1=149366&r2=149367&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Jan 31 00:52:44 2012 @@ -9492,26 +9492,6 @@ case Intrinsic::x86_avx2_psrav_d_256: return DAG.getNode(ISD::SRA, dl, Op.getValueType(), Op.getOperand(1), Op.getOperand(2)); - case Intrinsic::x86_sse2_pcmpeq_b: - case Intrinsic::x86_sse2_pcmpeq_w: - case Intrinsic::x86_sse2_pcmpeq_d: - case Intrinsic::x86_sse41_pcmpeqq: - case Intrinsic::x86_avx2_pcmpeq_b: - case Intrinsic::x86_avx2_pcmpeq_w: - case Intrinsic::x86_avx2_pcmpeq_d: - case Intrinsic::x86_avx2_pcmpeq_q: - return DAG.getNode(X86ISD::PCMPEQ, dl, Op.getValueType(), - Op.getOperand(1), Op.getOperand(2)); - case Intrinsic::x86_sse2_pcmpgt_b: - case Intrinsic::x86_sse2_pcmpgt_w: - case Intrinsic::x86_sse2_pcmpgt_d: - case Intrinsic::x86_sse42_pcmpgtq: - case Intrinsic::x86_avx2_pcmpgt_b: - case Intrinsic::x86_avx2_pcmpgt_w: - case Intrinsic::x86_avx2_pcmpgt_d: - case Intrinsic::x86_avx2_pcmpgt_q: - return DAG.getNode(X86ISD::PCMPGT, dl, Op.getValueType(), - Op.getOperand(1), Op.getOperand(2)); case Intrinsic::x86_ssse3_pshuf_b_128: case Intrinsic::x86_avx2_pshuf_b: return DAG.getNode(X86ISD::PSHUFB, dl, Op.getValueType(), Modified: llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll?rev=149367&r1=149366&r2=149367&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll (original) +++ llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll Tue Jan 31 00:52:44 2012 @@ -30,7 +30,7 @@ %tmp87 = 
bitcast <16 x i8> %tmp66 to <4 x i32> ; <<4 x i32>> [#uses=1] %tmp88 = add <4 x i32> %tmp87, %tmp77 ; <<4 x i32>> [#uses=2] %tmp88.upgrd.4 = bitcast <4 x i32> %tmp88 to <2 x i64> ; <<2 x i64>> [#uses=1] - %tmp99 = tail call <4 x i32> @llvm.x86.sse2.pcmpgt.d( <4 x i32> %tmp88, <4 x i32> %tmp55 ) ; <<4 x i32>> [#uses=1] + %tmp99 = tail call <4 x i32> @llvm.x86.sse2.psra.d( <4 x i32> %tmp88, <4 x i32> %tmp55 ) ; <<4 x i32>> [#uses=1] %tmp99.upgrd.5 = bitcast <4 x i32> %tmp99 to <2 x i64> ; <<2 x i64>> [#uses=2] %tmp110 = xor <2 x i64> %tmp99.upgrd.5, < i64 -1, i64 -1 > ; <<2 x i64>> [#uses=1] %tmp111 = and <2 x i64> %tmp110, %tmp55.upgrd.2 ; <<2 x i64>> [#uses=1] @@ -48,4 +48,4 @@ ret void } -declare <4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32>, <4 x i32>) +declare <4 x i32> @llvm.x86.sse2.psra.d(<4 x i32>, <4 x i32>) Modified: llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll?rev=149367&r1=149366&r2=149367&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll Tue Jan 31 00:52:44 2012 @@ -369,54 +369,6 @@ declare <8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16>, <8 x i16>) nounwind readnone -define <16 x i8> @test_x86_sse2_pcmpeq_b(<16 x i8> %a0, <16 x i8> %a1) { - ; CHECK: vpcmpeqb - %res = call <16 x i8> @llvm.x86.sse2.pcmpeq.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1] - ret <16 x i8> %res -} -declare <16 x i8> @llvm.x86.sse2.pcmpeq.b(<16 x i8>, <16 x i8>) nounwind readnone - - -define <4 x i32> @test_x86_sse2_pcmpeq_d(<4 x i32> %a0, <4 x i32> %a1) { - ; CHECK: vpcmpeqd - %res = call <4 x i32> @llvm.x86.sse2.pcmpeq.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1] - ret <4 x i32> %res -} -declare <4 x i32> @llvm.x86.sse2.pcmpeq.d(<4 x i32>, <4 x i32>) nounwind readnone - - -define <8 x i16> 
@test_x86_sse2_pcmpeq_w(<8 x i16> %a0, <8 x i16> %a1) { - ; CHECK: vpcmpeqw - %res = call <8 x i16> @llvm.x86.sse2.pcmpeq.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1] - ret <8 x i16> %res -} -declare <8 x i16> @llvm.x86.sse2.pcmpeq.w(<8 x i16>, <8 x i16>) nounwind readnone - - -define <16 x i8> @test_x86_sse2_pcmpgt_b(<16 x i8> %a0, <16 x i8> %a1) { - ; CHECK: vpcmpgtb - %res = call <16 x i8> @llvm.x86.sse2.pcmpgt.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1] - ret <16 x i8> %res -} -declare <16 x i8> @llvm.x86.sse2.pcmpgt.b(<16 x i8>, <16 x i8>) nounwind readnone - - -define <4 x i32> @test_x86_sse2_pcmpgt_d(<4 x i32> %a0, <4 x i32> %a1) { - ; CHECK: vpcmpgtd - %res = call <4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1] - ret <4 x i32> %res -} -declare <4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32>, <4 x i32>) nounwind readnone - - -define <8 x i16> @test_x86_sse2_pcmpgt_w(<8 x i16> %a0, <8 x i16> %a1) { - ; CHECK: vpcmpgtw - %res = call <8 x i16> @llvm.x86.sse2.pcmpgt.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1] - ret <8 x i16> %res -} -declare <8 x i16> @llvm.x86.sse2.pcmpgt.w(<8 x i16>, <8 x i16>) nounwind readnone - - define <4 x i32> @test_x86_sse2_pmadd_wd(<8 x i16> %a0, <8 x i16> %a1) { ; CHECK: vpmaddwd %res = call <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a0, <8 x i16> %a1) ; <<4 x i32>> [#uses=1] @@ -950,14 +902,6 @@ declare <8 x i16> @llvm.x86.sse41.pblendw(<8 x i16>, <8 x i16>, i32) nounwind readnone -define <2 x i64> @test_x86_sse41_pcmpeqq(<2 x i64> %a0, <2 x i64> %a1) { - ; CHECK: vpcmpeqq - %res = call <2 x i64> @llvm.x86.sse41.pcmpeqq(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1] - ret <2 x i64> %res -} -declare <2 x i64> @llvm.x86.sse41.pcmpeqq(<2 x i64>, <2 x i64>) nounwind readnone - - define <8 x i16> @test_x86_sse41_phminposuw(<8 x i16> %a0) { ; CHECK: vphminposuw %res = call <8 x i16> @llvm.x86.sse41.phminposuw(<8 x i16> %a0) ; <<8 x i16>> [#uses=1] @@ -1271,14 
+1215,6 @@ declare <16 x i8> @llvm.x86.sse42.pcmpestrm128(<16 x i8>, i32, <16 x i8>, i32, i8) nounwind readnone -define <2 x i64> @test_x86_sse42_pcmpgtq(<2 x i64> %a0, <2 x i64> %a1) { - ; CHECK: vpcmpgtq - %res = call <2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1] - ret <2 x i64> %res -} -declare <2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64>, <2 x i64>) nounwind readnone - - define i32 @test_x86_sse42_pcmpistri128(<16 x i8> %a0, <16 x i8> %a1) { ; CHECK: vpcmpistri ; CHECK: movl Modified: llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll?rev=149367&r1=149366&r2=149367&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll Tue Jan 31 00:52:44 2012 @@ -72,54 +72,6 @@ declare <16 x i16> @llvm.x86.avx2.pavg.w(<16 x i16>, <16 x i16>) nounwind readnone -define <32 x i8> @test_x86_avx2_pcmpeq_b(<32 x i8> %a0, <32 x i8> %a1) { - ; CHECK: vpcmpeqb - %res = call <32 x i8> @llvm.x86.avx2.pcmpeq.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1] - ret <32 x i8> %res -} -declare <32 x i8> @llvm.x86.avx2.pcmpeq.b(<32 x i8>, <32 x i8>) nounwind readnone - - -define <8 x i32> @test_x86_avx2_pcmpeq_d(<8 x i32> %a0, <8 x i32> %a1) { - ; CHECK: vpcmpeqd - %res = call <8 x i32> @llvm.x86.avx2.pcmpeq.d(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1] - ret <8 x i32> %res -} -declare <8 x i32> @llvm.x86.avx2.pcmpeq.d(<8 x i32>, <8 x i32>) nounwind readnone - - -define <16 x i16> @test_x86_avx2_pcmpeq_w(<16 x i16> %a0, <16 x i16> %a1) { - ; CHECK: vpcmpeqw - %res = call <16 x i16> @llvm.x86.avx2.pcmpeq.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1] - ret <16 x i16> %res -} -declare <16 x i16> @llvm.x86.avx2.pcmpeq.w(<16 x i16>, <16 x i16>) nounwind readnone - - -define <32 x i8> 
@test_x86_avx2_pcmpgt_b(<32 x i8> %a0, <32 x i8> %a1) { - ; CHECK: vpcmpgtb - %res = call <32 x i8> @llvm.x86.avx2.pcmpgt.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1] - ret <32 x i8> %res -} -declare <32 x i8> @llvm.x86.avx2.pcmpgt.b(<32 x i8>, <32 x i8>) nounwind readnone - - -define <8 x i32> @test_x86_avx2_pcmpgt_d(<8 x i32> %a0, <8 x i32> %a1) { - ; CHECK: vpcmpgtd - %res = call <8 x i32> @llvm.x86.avx2.pcmpgt.d(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1] - ret <8 x i32> %res -} -declare <8 x i32> @llvm.x86.avx2.pcmpgt.d(<8 x i32>, <8 x i32>) nounwind readnone - - -define <16 x i16> @test_x86_avx2_pcmpgt_w(<16 x i16> %a0, <16 x i16> %a1) { - ; CHECK: vpcmpgtw - %res = call <16 x i16> @llvm.x86.avx2.pcmpgt.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1] - ret <16 x i16> %res -} -declare <16 x i16> @llvm.x86.avx2.pcmpgt.w(<16 x i16>, <16 x i16>) nounwind readnone - - define <8 x i32> @test_x86_avx2_pmadd_wd(<16 x i16> %a0, <16 x i16> %a1) { ; CHECK: vpmaddwd %res = call <8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16> %a0, <16 x i16> %a1) ; <<8 x i32>> [#uses=1] @@ -553,14 +505,6 @@ declare <16 x i16> @llvm.x86.avx2.pblendw(<16 x i16>, <16 x i16>, i32) nounwind readnone -define <4 x i64> @test_x86_avx2_pcmpeqq(<4 x i64> %a0, <4 x i64> %a1) { - ; CHECK: vpcmpeqq - %res = call <4 x i64> @llvm.x86.avx2.pcmpeq.q(<4 x i64> %a0, <4 x i64> %a1) ; <<4 x i64>> [#uses=1] - ret <4 x i64> %res -} -declare <4 x i64> @llvm.x86.avx2.pcmpeq.q(<4 x i64>, <4 x i64>) nounwind readnone - - define <32 x i8> @test_x86_avx2_pmaxsb(<32 x i8> %a0, <32 x i8> %a1) { ; CHECK: vpmaxsb %res = call <32 x i8> @llvm.x86.avx2.pmaxs.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1] @@ -729,14 +673,6 @@ declare <4 x i64> @llvm.x86.avx2.pmul.dq(<8 x i32>, <8 x i32>) nounwind readnone -define <4 x i64> @test_x86_avx2_pcmpgtq(<4 x i64> %a0, <4 x i64> %a1) { - ; CHECK: vpcmpgtq - %res = call <4 x i64> @llvm.x86.avx2.pcmpgt.q(<4 x i64> %a0, <4 x i64> %a1) ; <<4 x i64>> 
[#uses=1] - ret <4 x i64> %res -} -declare <4 x i64> @llvm.x86.avx2.pcmpgt.q(<4 x i64>, <4 x i64>) nounwind readnone - - define <4 x i64> @test_x86_avx2_vbroadcasti128(i8* %a0) { ; CHECK: vbroadcasti128 %res = call <4 x i64> @llvm.x86.avx2.vbroadcasti128(i8* %a0) ; <<4 x i64>> [#uses=1] From isanbard at gmail.com Tue Jan 31 00:57:53 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 06:57:53 -0000 Subject: [llvm-commits] [llvm] r149368 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <20120131065753.8F1F32A6C12C@llvm.org> Author: void Date: Tue Jan 31 00:57:53 2012 New Revision: 149368 URL: http://llvm.org/viewvc/llvm-project?rev=149368&view=rev Log: Cache the size of the vector instead of calling .size() all over the place. Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=149368&r1=149367&r2=149368&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Tue Jan 31 00:57:53 2012 @@ -1278,14 +1278,14 @@ // If we had to process more than one hundred blocks to find the // dependencies, this load isn't worth worrying about. Optimizing // it will be too expensive. - if (Deps.size() > 100) + unsigned NumDeps = Deps.size(); + if (NumDeps > 100) return false; // If we had a phi translation failure, we'll have a single entry which is a // clobber in the current block. Reject this early. 
- if (Deps.size() == 1 - && !Deps[0].getResult().isDef() && !Deps[0].getResult().isClobber()) - { + if (NumDeps == 1 && + !Deps[0].getResult().isDef() && !Deps[0].getResult().isClobber()) { DEBUG( dbgs() << "GVN: non-local load "; WriteAsOperand(dbgs(), LI); @@ -1301,7 +1301,7 @@ SmallVector ValuesPerBlock; SmallVector UnavailableBlocks; - for (unsigned i = 0, e = Deps.size(); i != e; ++i) { + for (unsigned i = 0, e = NumDeps; i != e; ++i) { BasicBlock *DepBB = Deps[i].getBB(); MemDepResult DepInfo = Deps[i].getResult(); From isanbard at gmail.com Tue Jan 31 01:04:52 2012 From: isanbard at gmail.com (Bill Wendling) Date: Tue, 31 Jan 2012 07:04:52 -0000 Subject: [llvm-commits] [llvm] r149369 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <20120131070452.4D54D2A6C12C@llvm.org> Author: void Date: Tue Jan 31 01:04:52 2012 New Revision: 149369 URL: http://llvm.org/viewvc/llvm-project?rev=149369&view=rev Log: Increase the initial vector size to be equivalent to the size of the Deps vector. This potentially saves a resizing. Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=149369&r1=149368&r2=149369&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Tue Jan 31 01:04:52 2012 @@ -1298,8 +1298,8 @@ // where we have a value available in repl, also keep track of whether we see // dependencies that produce an unknown value for the load (such as a call // that could potentially clobber the load). 
- SmallVector ValuesPerBlock; - SmallVector UnavailableBlocks; + SmallVector ValuesPerBlock; + SmallVector UnavailableBlocks; for (unsigned i = 0, e = NumDeps; i != e; ++i) { BasicBlock *DepBB = Deps[i].getBB(); From atrick at apple.com Tue Jan 31 01:22:17 2012 From: atrick at apple.com (Andrew Trick) Date: Mon, 30 Jan 2012 23:22:17 -0800 Subject: [llvm-commits] [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom In-Reply-To: References: Message-ID: <3893BF02-E382-470D-AFB4-9F30AE9ECE34@apple.com> On Jan 23, 2012, at 3:05 PM, "Gurd, Preston" wrote: > Revision 2: Tests which were failing, when run on an Atom, due to the tests finding a schedule different from what was expected, have been changed to use "-mcpu=generic" in order to prevent the Atom scheduler from running, so that all "make check" tests pass. > > From: Gurd, Preston > Sent: Tuesday, January 17, 2012 4:29 PM > To: Evan Cheng > Cc: llvm-commits at cs.uiuc.edu > Subject: [llvm-commits] [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom > > The attached patch implements most of an instruction scheduler for the Intel Atom. > > It adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. > > It sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. > > It adds a test to verify that the scheduler is working. > > I realize that this patch is kind of large, but please consider that the vast majority of the changes consist only of adding an instruction itinerary class name to an instruction. > > Revision: the patch also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP. > > Please commit the patch if it seems acceptable.
> > Preston > > > From: Evan Cheng [mailto:evan.cheng at apple.com] > Sent: Monday, January 16, 2012 12:01 PM > To: Gurd, Preston > Cc: llvm-commits at cs.uiuc.edu > Subject: Re: [llvm-commits] [llvm][PATCH][Review request] X86 Instruction scheduler for the Intel Atom > > Very nice. One question, I noticed you haven't changed the scheduling preference so x86_64 is still using ILP scheduler while i386 is using register pressure reduction scheduler. Have you tried changing the preference to latency scheduler for Atom? > > Evan > > On Jan 13, 2012, at 3:26 PM, Gurd, Preston wrote: > > > The attached patch implements most of an instruction scheduler for the Intel Atom. > > It adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. > > It sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. > > It adds a test to verify that the scheduler is working. > > I realize that this patch is kind of large, but please consider that the vast majority of the changes consist only of adding an instruction itinerary class name to an instruction. Hi Preston, I just have a couple minor questions I'd like you to address before I commit this: +def : AtomProc<"atom", [ProcIntelAtom, FeatureSSE3, FeatureCMPXCHG16B, + FeatureMOVBE, FeatureSlowBTMem]>; These features are already included in the ProcIntelAtom family. Why do you need to list them again? Please verify, but subtarget features should be transitively implied. + //CriticalPathRCs.push_back(&X86::GPRRegClass); If you want to leave this disabled, please add comments. Thanks, -Andy -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120130/48120b1f/attachment.html From glider at google.com Tue Jan 31 02:27:06 2012 From: glider at google.com (Alexander Potapenko) Date: Tue, 31 Jan 2012 12:27:06 +0400 Subject: [llvm-commits] [llvm] r149365 - in /llvm/trunk: include/llvm/Constants.h lib/AsmParser/LLParser.cpp lib/Transforms/Instrumentation/AddressSanitizer.cpp lib/VMCore/Constants.cpp lib/VMCore/Core.cpp lib/VMCore/IRBuilder.cpp tools/bugpoint/Miscompil Message-ID: On Tue, Jan 31, 2012 at 10:18 AM, Chris Lattner wrote: > Author: lattner > Date: Tue Jan 31 00:18:43 2012 > New Revision: 149365 > > URL: http://llvm.org/viewvc/llvm-project?rev=149365&view=rev > Log: > eliminate the "string" form of ConstantArray::get, using > ConstantDataArray::getString instead. Chris, our internal (for the moment) ASan buildbot is reporting test failures starting at the range of r149363 to r149365 on Linux x64 and Mac x64 Namely they are: ******************** TEST 'Clang :: CodeGenObjC/arc-ivar-layout.m' FAILED ******************** ******************** TEST 'Clang :: CodeGenObjC/arc-block-ivar-layout.m' FAILED ******************** ******************** TEST 'Clang :: CodeGenObjC/block-var-layout.m' FAILED ******************** ******************** TEST 'Clang :: CodeGenObjC/ivar-layout-array0-struct.m' FAILED ******************** ******************** TEST 'Clang :: CodeGenObjC/ivar-layout-64.m' FAILED ******************** ******************** TEST 'Clang :: CodeGenObjC/ivar-layout-no-optimize.m' FAILED ******************** ******************** TEST 'Clang :: CodeGenObjCXX/block-var-layout.mm' FAILED ******************** HTH, Alex From james.molloy at arm.com Tue Jan 31 02:30:44 2012 From: james.molloy at arm.com (James Molloy) Date: Tue, 31 Jan 2012 08:30:44 -0000 Subject: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple In-Reply-To: References: <4f26853a.06a1ec0a.5cf8.ffff889dSMTPIN_ADDED@mx.google.com> 
<4f26dcb9.daebd80a.6078.2dc0SMTPIN_ADDED@mx.google.com> Message-ID: <000a01ccdff2$a07da410$e178ec30$@molloy@arm.com> Hi Chandler, That's fine, it was a pedantic nitpick anyway :) Cheers, James From: Chris Lattner [mailto:clattner at apple.com] Sent: 31 January 2012 04:56 To: Chandler Carruth Cc: James Molloy; llvm-commits at cs.uiuc.edu; Anton Korobeynikov Subject: Re: [llvm-commits] PATCH: Add 64-bit architecture predicate to llvm::Triple On Jan 30, 2012, at 8:51 PM, Chandler Carruth wrote: On Mon, Jan 30, 2012 at 10:08 AM, James Molloy wrote: Hi Chandler, One point: + /// Note that this tests for 16-bit pointer width, and nothing else. I'm not sure this comment is accurate. For example, real mode x86 would be 16-bit but has 24-bit pointers (seg:offset). As far as I can tell, there is no support for x86-16 or real mode or any of the other segmented addressing modes on x86. If such modes are added, I would be quite happy for them to return false on all three of these queries. ;] I still think "pointer size" is the most descriptive term, but I'm open to more suggestions. FWIW, the reason I don't particularly like "native width of the register file" is that I find it much less clear and unambiguous given the diversity of register widths on even modern architectures. However, addresses generally have a fixed size on modern architectures, and so that seems a good classification scheme. Just my contribution to the bikeshed: I don't think that it makes sense for llvm::Triple to know the "native width of the register file", since that is such an amorphous statement. The best that llvm::Triple can know is sizeof(void*) in the default address space. -Chris -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/27cbb7cb/attachment.html From grosser at fim.uni-passau.de Tue Jan 31 02:50:12 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Tue, 31 Jan 2012 08:50:12 -0000 Subject: [llvm-commits] [polly] r149370 - /polly/trunk/www/polly.sh Message-ID: <20120131085012.D0E1F2A6C12C@llvm.org> Author: grosser Date: Tue Jan 31 02:50:12 2012 New Revision: 149370 URL: http://llvm.org/viewvc/llvm-project?rev=149370&view=rev Log: polly.sh: Do not build PoCC automatically As we now have a scheduler that works, I do not believe a lot of people need PoCC right ahead. People who want to do an in depth investigation of the different schedulers can install it as well manually. Modified: polly/trunk/www/polly.sh Modified: polly/trunk/www/polly.sh URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/polly.sh?rev=149370&r1=149369&r2=149370&view=diff ============================================================================== --- polly/trunk/www/polly.sh (original) +++ polly/trunk/www/polly.sh Tue Jan 31 02:50:12 2012 @@ -8,7 +8,6 @@ export CLOOG_INSTALL=${BASE}/cloog_install export LLVM_BUILD=${BASE}/llvm_build export SCOPLIB_DIR=${BASE}/scoplib-0.2.0 -export POCC_DIR=${BASE}/pocc-1.0-rc3.1 if [ -e /proc/cpuinfo ]; then procs=`cat /proc/cpuinfo | grep processor | wc -l` @@ -38,15 +37,6 @@ make install cd ${BASE} -if ! test -d ${POCC_DIR}; then - wget http://www.cse.ohio-state.edu/~pouchet/software/pocc/download/pocc-1.0-rc3.1-full.tar.gz - tar xzf pocc-1.0-rc3.1-full.tar.gz - cd ${POCC_DIR} - ./install.sh - cd ${BASE} -fi -export PATH=${POCC_DIR}/bin:$PATH - if ! 
test -d ${SCOPLIB_DIR}; then wget http://www.cse.ohio-state.edu/~pouchet/software/pocc/download/modules/scoplib-0.2.0.tar.gz tar xzf scoplib-0.2.0.tar.gz From grosser at fim.uni-passau.de Tue Jan 31 02:50:16 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Tue, 31 Jan 2012 08:50:16 -0000 Subject: [llvm-commits] [polly] r149371 - /polly/trunk/www/get_started.html Message-ID: <20120131085016.8CD072A6C12C@llvm.org> Author: grosser Date: Tue Jan 31 02:50:16 2012 New Revision: 149371 URL: http://llvm.org/viewvc/llvm-project?rev=149371&view=rev Log: www: Move PoCC to the end of the installation section Modified: polly/trunk/www/get_started.html Modified: polly/trunk/www/get_started.html URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/get_started.html?rev=149371&r1=149370&r2=149371&view=diff ============================================================================== --- polly/trunk/www/get_started.html (original) +++ polly/trunk/www/get_started.html Tue Jan 31 02:50:16 2012 @@ -107,40 +107,6 @@ cd ${BASE} -

Install Pocc (Optional)

- -

Polly can use -PoCC as an external optimizer. PoCC is a research project that provides -an integrated version of Pluto, an -advanced data-locality and tileability optimizer. Similar functionality was -recently integrated in Polly (through isl), however the optimizations are not as -mature as the ones in Pluto/PoCC. Hence, if you want to use Pluto to optimize -your code or you want to compare the optimizer integrated in Polly to Pluto you -may want to use PoCC.

- -Install PoCC 1.0-rc3.1 (the one with Polly support) and add it to your PATH. - -
-wget http://www.cse.ohio-state.edu/~pouchet/software/pocc/download/pocc-1.0-rc3.1-full.tar.gz
-tar xzf pocc-1.0-rc3.1-full.tar.gz
-cd pocc-1.0-rc3.1
-./install.sh
-export PATH=`pwd`/bin
-
- -Install scoplib-0.2.0 - -
-wget http://www.cse.ohio-state.edu/~pouchet/software/pocc/download/modules/scoplib-0.2.0.tar.gz
-tar xzf  scoplib-0.2.0.tar.gz
-cd scoplib-0.2.0
-./configure --enable-mp-version --prefix=/path/to/scoplib/installation
-make && make install
-
-

Build Polly

To build Polly you can either use the autoconf or the cmake build system. At the @@ -173,6 +139,43 @@ To check if Polly works correctly you can run make polly-test for the cmake build or make polly-test -C tools/polly/test/ for the autoconf build. + +

Optional Features

+ +

Pocc

+ +

Polly can use +PoCC as an external optimizer. PoCC is a research project that provides +an integrated version of Pluto, an +advanced data-locality and tileability optimizer. Polly includes internally +already a similar optimizer, such that in general PoCC is not needed. It is +only recommended for people who want to compare against a different +optimizer. +
+To use it install PoCC 1.0-rc3.1 (the one with Polly support) and add it to your PATH. + +

+wget http://www.cse.ohio-state.edu/~pouchet/software/pocc/download/pocc-1.0-rc3.1-full.tar.gz
+tar xzf pocc-1.0-rc3.1-full.tar.gz
+cd pocc-1.0-rc3.1
+./install.sh
+export PATH=`pwd`/bin
+
+ +You also need to install scoplib-0.2.0 and provide its location to +Polly's cmake or configure call. + +
+wget http://www.cse.ohio-state.edu/~pouchet/software/pocc/download/modules/scoplib-0.2.0.tar.gz
+tar xzf  scoplib-0.2.0.tar.gz
+cd scoplib-0.2.0
+./configure --enable-mp-version --prefix=/path/to/scoplib/installation
+make && make install
+
+

From grosser at fim.uni-passau.de Tue Jan 31 02:50:20 2012
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Tue, 31 Jan 2012 08:50:20 -0000
Subject: [llvm-commits] [polly] r149372 - /polly/trunk/www/polly.sh
Message-ID: <20120131085020.25CE52A6C12C@llvm.org>

Author: grosser
Date: Tue Jan 31 02:50:19 2012
New Revision: 149372

URL: http://llvm.org/viewvc/llvm-project?rev=149372&view=rev
Log:
polly.sh: Do not automatically install scoplib either.

It is only needed for PoCC. We may update our openscop support which is
expected to be wider used. If this is the case we could automatically build
openscop.

Modified:
    polly/trunk/www/polly.sh

Modified: polly/trunk/www/polly.sh
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/polly.sh?rev=149372&r1=149371&r2=149372&view=diff
==============================================================================
--- polly/trunk/www/polly.sh (original)
+++ polly/trunk/www/polly.sh Tue Jan 31 02:50:19 2012
@@ -7,7 +7,6 @@
 export CLOOG_SRC=${BASE}/cloog_src
 export CLOOG_INSTALL=${BASE}/cloog_install
 export LLVM_BUILD=${BASE}/llvm_build
-export SCOPLIB_DIR=${BASE}/scoplib-0.2.0
 
 if [ -e /proc/cpuinfo ]; then
     procs=`cat /proc/cpuinfo | grep processor | wc -l`
@@ -37,14 +36,6 @@
 make install
 cd ${BASE}
 
-if ! test -d ${SCOPLIB_DIR}; then
-    wget http://www.cse.ohio-state.edu/~pouchet/software/pocc/download/modules/scoplib-0.2.0.tar.gz
-    tar xzf scoplib-0.2.0.tar.gz
-    cd ${SCOPLIB_DIR}
-    ./configure --enable-mp-version --prefix=${SCOPLIB_DIR}/usr
-    make -j${procs} -l${procs} && make install
-fi
-
 mkdir -p ${LLVM_BUILD}
 cd ${LLVM_BUILD}
 

From grosser at fim.uni-passau.de Tue Jan 31 02:50:23 2012
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Tue, 31 Jan 2012 08:50:23 -0000
Subject: [llvm-commits] [polly] r149373 - /polly/trunk/www/get_started.html
Message-ID: <20120131085023.8C19A2A6C12C@llvm.org>

Author: grosser
Date: Tue Jan 31 02:50:23 2012
New Revision: 149373

URL: http://llvm.org/viewvc/llvm-project?rev=149373&view=rev
Log:
www: Remove PoCC from the prerequisites

Modified:
    polly/trunk/www/get_started.html

Modified: polly/trunk/www/get_started.html
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/get_started.html?rev=149373&r1=149372&r2=149373&view=diff
==============================================================================
--- polly/trunk/www/get_started.html (original)
+++ polly/trunk/www/get_started.html Tue Jan 31 02:50:23 2012
@@ -66,7 +66,6 @@
  • libgmp
  • CLooG/isl
  • -
  • PoCC (optional)

libgmp

From glider at google.com Tue Jan 31 02:55:25 2012
From: glider at google.com (Alexander Potapenko)
Date: Tue, 31 Jan 2012 12:55:25 +0400
Subject: [llvm-commits] [compiler-rt] r149296 - in /compiler-rt/trunk/lib/asan/tests: asan_mac_test.mm asan_test.cc
In-Reply-To: <20120130232326.B4A2E2A6C12C@llvm.org>
References: <20120130232326.B4A2E2A6C12C@llvm.org>
Message-ID:

I dislike the idea of using noinline functions instead of actual memory
accesses. This may mask problems with uninstrumented accesses, see
http://code.google.com/p/address-sanitizer/issues/detail?id=33#c9

On Tue, Jan 31, 2012 at 3:23 AM, Kostya Serebryany wrote:
> Author: kcc
> Date: Mon Jan 30 17:23:26 2012
> New Revision: 149296
>
> URL: http://llvm.org/viewvc/llvm-project?rev=149296&view=rev
> Log:
> [asan] fix issue 35: don't let the optimizer to optimize the test code away.
>
> Modified:
>     compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm
>     compiler-rt/trunk/lib/asan/tests/asan_test.cc
>
> Modified: compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm
> URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm?rev=149296&r1=149295&r2=149296&view=diff
> ==============================================================================
> --- compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm (original)
> +++ compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm Mon Jan 30 17:23:26 2012
> @@ -32,6 +32,10 @@
>    CFAllocatorDeallocate(kCFAllocatorMallocZone, mem);
>  }
>
> +__attribute__((noinline))
> +void access_memory(char *a) {
> +  *a = 0;
> +}
>
>  // Test the +load instrumentation.
>  // Because the +load methods are invoked before anything else is initialized,
> @@ -51,7 +55,8 @@
>
>  +(void) load {
>    for (int i = 0; i < strlen(kStartupStr); i++) {
> -    volatile char ch = kStartupStr[i];  // make sure no optimizations occur.
> +    // TODO: this is currently broken, see Issue 33.
> +    // access_memory(&kStartupStr[i]);  // make sure no optimizations occur.
>    }
>    // Don't print anything here not to interfere with the death tests.
>  }
> @@ -66,7 +71,7 @@
>
>  void worker_do_crash(int size) {
>    char * volatile mem = malloc(size);
> -  mem[size] = 0;  // BOOM
> +  access_memory(&mem[size]);  // BOOM
>    free(mem);
>  }
>
> @@ -162,7 +167,7 @@
>    dispatch_source_set_timer(timer, milestone, DISPATCH_TIME_FOREVER, 0);
>    char * volatile mem = malloc(10);
>    dispatch_source_set_event_handler(timer, ^{
> -    mem[10] = 1;
> +    access_memory(&mem[10]);
>    });
>    dispatch_resume(timer);
>    sleep(2);
> @@ -186,7 +191,7 @@
>      dispatch_source_cancel(timer);
>    });
>    dispatch_source_set_cancel_handler(timer, ^{
> -    mem[10] = 1;
> +    access_memory(&mem[10]);
>    });
>    dispatch_resume(timer);
>    sleep(2);
> @@ -197,7 +202,7 @@
>    dispatch_group_t group = dispatch_group_create();
>    char * volatile mem = malloc(10);
>    dispatch_group_async(group, queue, ^{
> -    mem[10] = 1;
> +    access_memory(&mem[10]);
>    });
>    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
>  }
>
> Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc
> URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=149296&r1=149295&r2=149296&view=diff
> ==============================================================================
> --- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original)
> +++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Mon Jan 30 17:23:26 2012
> @@ -1668,7 +1668,7 @@
>    *Ident(&a) = *Ident(&a);
>  }
>
> - __attribute__((no_address_safety_analysis))
> +__attribute__((no_address_safety_analysis))
>  static void NoAddressSafety() {
>    char *foo = new char[10];
>    Ident(foo)[10] = 0;
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

--
Alexander Potapenko
Software Engineer
Google Moscow

From grosser at fim.uni-passau.de Tue Jan 31 03:13:12 2012
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Tue, 31 Jan 2012 09:13:12 -0000
Subject: [llvm-commits] [polly] r149374 - /polly/trunk/www/index.html
Message-ID: <20120131091312.BF16B2A6C131@llvm.org>

Author: grosser
Date: Tue Jan 31 03:13:12 2012
New Revision: 149374

URL: http://llvm.org/viewvc/llvm-project?rev=149374&view=rev
Log:
www: Add news about the improved isl scheduling support

Modified:
    polly/trunk/www/index.html

Modified: polly/trunk/www/index.html
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/index.html?rev=149374&r1=149373&r2=149374&view=diff
==============================================================================
--- polly/trunk/www/index.html (original)
+++ polly/trunk/www/index.html Tue Jan 31 03:13:12 2012
@@ -69,7 +69,17 @@
- + + + + + +

From klimek at google.com Tue Jan 31 13:58:34 2012
From: klimek at google.com (Manuel Klimek)
Date: Tue, 31 Jan 2012 19:58:34 -0000
Subject: [llvm-commits] [llvm] r149411 - in /llvm/trunk: include/llvm/ADT/IntrusiveRefCntPtr.h unittests/ADT/IntrusiveRefCntPtrTest.cpp unittests/CMakeLists.txt
Message-ID: <20120131195834.94D892A6C12C@llvm.org>

Author: klimek
Date: Tue Jan 31 13:58:34 2012
New Revision: 149411

URL: http://llvm.org/viewvc/llvm-project?rev=149411&view=rev
Log:
RefCountedBaseVPTR needs the IntrusiveRefCntPtrInfo as friend,
now that this handles the release / retain calls.

Adds a regression test for that bug (which is a compile-time
regression) and for the last two changes to the IntrusiveRefCntPtr,
especially tests for the memory leak due to copy construction of the
ref-counted object and ensuring that the traits are used for release /
retain calls.

Added:
    llvm/trunk/unittests/ADT/IntrusiveRefCntPtrTest.cpp (with props)
Modified:
    llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h
    llvm/trunk/unittests/CMakeLists.txt

Modified: llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h?rev=149411&r1=149410&r2=149411&view=diff
==============================================================================
--- llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h (original)
+++ llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h Tue Jan 31 13:58:34 2012
@@ -80,7 +80,7 @@
     }
 
     template <typename T>
-    friend class IntrusiveRefCntPtr;
+    friend struct IntrusiveRefCntPtrInfo;
   };
 

Added: llvm/trunk/unittests/ADT/IntrusiveRefCntPtrTest.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ADT/IntrusiveRefCntPtrTest.cpp?rev=149411&view=auto
==============================================================================
--- llvm/trunk/unittests/ADT/IntrusiveRefCntPtrTest.cpp (added)
+++ llvm/trunk/unittests/ADT/IntrusiveRefCntPtrTest.cpp Tue Jan 31 13:58:34 2012
@@ -0,0 +1,64 @@
+//===- unittest/ADT/IntrusiveRefCntPtrTest.cpp ----------------------------===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "gtest/gtest.h"
+
+namespace llvm {
+
+struct VirtualRefCounted : public RefCountedBaseVPTR {
+  virtual void f() {}
+};
+
+// Run this test with valgrind to detect memory leaks.
+TEST(IntrusiveRefCntPtr, RefCountedBaseVPTRCopyDoesNotLeak) {
+  VirtualRefCounted *V1 = new VirtualRefCounted;
+  IntrusiveRefCntPtr<VirtualRefCounted> R1 = V1;
+  VirtualRefCounted *V2 = new VirtualRefCounted(*V1);
+  IntrusiveRefCntPtr<VirtualRefCounted> R2 = V2;
+}
+
+struct SimpleRefCounted : public RefCountedBase<SimpleRefCounted> {};
+
+// Run this test with valgrind to detect memory leaks.
+TEST(IntrusiveRefCntPtr, RefCountedBaseCopyDoesNotLeak) {
+  SimpleRefCounted *S1 = new SimpleRefCounted;
+  IntrusiveRefCntPtr<SimpleRefCounted> R1 = S1;
+  SimpleRefCounted *S2 = new SimpleRefCounted(*S1);
+  IntrusiveRefCntPtr<SimpleRefCounted> R2 = S2;
+}
+
+struct InterceptRefCounted : public RefCountedBase<InterceptRefCounted> {
+  InterceptRefCounted(bool *Released, bool *Retained)
+    : Released(Released), Retained(Retained) {}
+  bool * const Released;
+  bool * const Retained;
+};
+template <> struct IntrusiveRefCntPtrInfo<InterceptRefCounted> {
+  static void retain(InterceptRefCounted *I) {
+    *I->Retained = true;
+    I->Retain();
+  }
+  static void release(InterceptRefCounted *I) {
+    *I->Released = true;
+    I->Release();
+  }
+};
+TEST(IntrusiveRefCntPtr, UsesTraitsToRetainAndRelease) {
+  bool Released = false;
+  bool Retained = false;
+  {
+    InterceptRefCounted *I = new InterceptRefCounted(&Released, &Retained);
+    IntrusiveRefCntPtr<InterceptRefCounted> R = I;
+  }
+  EXPECT_TRUE(Released);
+  EXPECT_TRUE(Retained);
+}
+
+} // end namespace llvm

Propchange: llvm/trunk/unittests/ADT/IntrusiveRefCntPtrTest.cpp
------------------------------------------------------------------------------
    svn:eol-style = LF

Modified: llvm/trunk/unittests/CMakeLists.txt
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/CMakeLists.txt?rev=149411&r1=149410&r2=149411&view=diff
==============================================================================
--- llvm/trunk/unittests/CMakeLists.txt (original)
+++ llvm/trunk/unittests/CMakeLists.txt Tue Jan 31 13:58:34 2012
@@ -64,6 +64,7 @@
   ADT/ImmutableSetTest.cpp
   ADT/IntEqClassesTest.cpp
   ADT/IntervalMapTest.cpp
+  ADT/IntrusiveRefCntPtrTest.cpp
   ADT/PackedVectorTest.cpp
   ADT/SmallBitVectorTest.cpp
   ADT/SmallStringTest.cpp

From grosser at fim.uni-passau.de Tue Jan 31 14:24:21 2012
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Tue, 31 Jan 2012 20:24:21 -0000
Subject: [llvm-commits] [polly] r149415 - /polly/trunk/www/example_load_Polly_into_clang.html
Message-ID: <20120131202421.A20A82A6C12C@llvm.org>

Author: grosser
Date: Tue Jan 31 14:24:21 2012
New Revision: 149415

URL: http://llvm.org/viewvc/llvm-project?rev=149415&view=rev
Log:
www: Add missing -mllvm to documentation

Found by Ryan Taylor

Modified:
    polly/trunk/www/example_load_Polly_into_clang.html

Modified: polly/trunk/www/example_load_Polly_into_clang.html
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/example_load_Polly_into_clang.html?rev=149415&r1=149414&r2=149415&view=diff
==============================================================================
--- polly/trunk/www/example_load_Polly_into_clang.html (original)
+++ polly/trunk/www/example_load_Polly_into_clang.html Tue Jan 31 14:24:21 2012
@@ -43,7 +43,7 @@
 Optimizing with Polly is as easy as adding -O3 -polly to your compiler
 flags (Polly is only available at -O3).
 
-
pollycc -O3 -polly file.c
+
pollycc -O3 -mllvm -polly file.c

Automatic OpenMP code generation

From grosbach at apple.com Tue Jan 31 14:34:53 2012
From: grosbach at apple.com (Jim Grosbach)
Date: Tue, 31 Jan 2012 20:34:53 -0000
Subject: [llvm-commits] [llvm] r149416 - /llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp
Message-ID: <20120131203453.C428C2A6C12C@llvm.org>

Author: grosbach
Date: Tue Jan 31 14:34:53 2012
New Revision: 149416

URL: http://llvm.org/viewvc/llvm-project?rev=149416&view=rev
Log:
Add explanatory comment.

Modified:
    llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp

Modified: llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp?rev=149416&r1=149415&r2=149416&view=diff
==============================================================================
--- llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp (original)
+++ llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp Tue Jan 31 14:34:53 2012
@@ -834,6 +834,7 @@
     if (&*I == MI) return Offset;
     Offset += TII->GetInstSizeInBytes(I);
   }
+  // Unreachable.
 }
 
 /// CompareMBBNumbers - Little predicate function to sort the WaterList by MBB

From sabre at nondot.org Tue Jan 31 14:43:31 2012
From: sabre at nondot.org (Chris Lattner)
Date: Tue, 31 Jan 2012 12:43:31 -0800
Subject: [llvm-commits] [llvm] r149365 - in /llvm/trunk: include/llvm/Constants.h lib/AsmParser/LLParser.cpp lib/Transforms/Instrumentation/AddressSanitizer.cpp lib/VMCore/Constants.cpp lib/VMCore/Core.cpp lib/VMCore/IRBuilder.cpp tools/bugpoint/Miscompil
In-Reply-To:
References:
Message-ID: <076AA06D-AC93-41B4-B830-464D2A76E710@nondot.org>

On Jan 31, 2012, at 4:22 AM, Alexander Potapenko wrote:

> On Tue, Jan 31, 2012 at 3:56 PM, Chris Lattner wrote:
>>
>> On Jan 31, 2012, at 12:27 AM, Alexander Potapenko wrote:
>>
>>> On Tue, Jan 31, 2012 at 10:18 AM, Chris Lattner wrote:
>>>> Author: lattner
>>>> Date: Tue Jan 31 00:18:43 2012
>>>> New Revision: 149365
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=149365&view=rev
>>>> Log:
>>>> eliminate the "string" form of ConstantArray::get, using
>>>> ConstantDataArray::getString instead.
>>> Chris,
>>>
>>> our internal (for the moment) ASan buildbot is reporting test failures
>>> starting at the range of r149363 to r149365 on Linux x64 and Mac x64
>>> Namely they are:
>>
>> That sounds bad, and I'd definitely like to fix these. Can you give me more information about how they are failing? What command line is failing, and with what stack trace?
>>
>> -Chris
> Attached is the log of `make check-all` on my Snow Leopard machine.
> Feel free to ask for more information.
>

Thanks Alexander. I think that Benjamin fixed this for me (thanks!), can you verify that it is resolved for you?

-Chris

From sabre at nondot.org Tue Jan 31 14:44:56 2012
From: sabre at nondot.org (Chris Lattner)
Date: Tue, 31 Jan 2012 12:44:56 -0800
Subject: [llvm-commits] [llvm] r149348 - /llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
In-Reply-To:
References: <20120131043922.528222A6C12C@llvm.org>
Message-ID: <38C376D1-635F-4573-99BD-AB155C9E9A2A@nondot.org>

On Jan 31, 2012, at 9:32 AM, NAKAMURA Takumi wrote:

> 2012/1/31 Chris Lattner :
>> Author: lattner
>> Date: Mon Jan 30 22:39:22 2012
>> New Revision: 149348
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=149348&view=rev
>> Log:
>> rework this logic to not depend on the last argument to GetConstantStringInfo,
>> which is going away.
>
> Chris, it seems it might trigger a failure on
> stage2(stage1-built-clang) build on my two builders, x86_64-linux and
> i686-cygwin. I have not investigated why yet.

Hi Takumi,

Is this still failing for you? I cannot reproduce this, and it is very strange. I didn't change anything around metadata, so I'm not sure how I could have caused this.

-Chris

>
> ...Takumi
>
> ******************** TEST 'LLVM-Unit ::
> VMCore/Release/VMCoreTests/MDStringTest.PrintingComplex' FAILED
> ********************Note: Google Test filter =
> MDStringTest.PrintingComplex
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from MDStringTest
> [ RUN ] MDStringTest.PrintingComplex
> /home/bb/buildslave/clang-3stage-x86_64-linux/llvm-project/llvm/unittests/VMCore/MetadataTest.cpp:71:
> Failure
> Value of: oss.str().c_str()
> Actual: "metadata !"\00\00\00\00\00""
> Expected: "metadata !\"\\00\\0A\\22\\5C\\FF\""
> Which is: "metadata !"\00\0A\22\5C\FF""
> [ FAILED ] MDStringTest.PrintingComplex (0 ms)
> [----------] 1 test from MDStringTest (0 ms total)
>
> [----------] Global test environment tear-down
> [==========] 1 test from 1 test case ran. (0 ms total)
> [ PASSED ] 0 tests.
> [ FAILED ] 1 test, listed below:
> [ FAILED ] MDStringTest.PrintingComplex
>
> 1 FAILED TEST
>
> ********************

From benny.kra at googlemail.com Tue Jan 31 14:52:41 2012
From: benny.kra at googlemail.com (Benjamin Kramer)
Date: Tue, 31 Jan 2012 21:52:41 +0100
Subject: [llvm-commits] [llvm] r149416 - /llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp
In-Reply-To: <20120131203453.C428C2A6C12C@llvm.org>
References: <20120131203453.C428C2A6C12C@llvm.org>
Message-ID: <8E2B1DBC-63F7-44A6-8AC4-70BF0CF9401D@googlemail.com>

On 31.01.2012, at 21:34, Jim Grosbach wrote:

> Author: grosbach
> Date: Tue Jan 31 14:34:53 2012
> New Revision: 149416
>
> URL: http://llvm.org/viewvc/llvm-project?rev=149416&view=rev
> Log:
> Add explanatory comment.
>
> Modified:
>     llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp
>
> Modified: llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp?rev=149416&r1=149415&r2=149416&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp (original)
> +++ llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp Tue Jan 31 14:34:53 2012
> @@ -834,6 +834,7 @@
>      if (&*I == MI) return Offset;
>      Offset += TII->GetInstSizeInBytes(I);
>    }
> +  // Unreachable.
>  }

Why not rewrite this loop as

  for (MachineBasicBlock::iterator I = MBB->begin(); &*I != MI; ++I) {
    assert(I != MBB->end() && "Didn't find MI in its own basic block?");
    Offset += TII->GetInstSizeInBytes(I);
  }
  return Offset;

and avoid the reachability confusion?

- Ben

>
>  /// CompareMBBNumbers - Little predicate function to sort the WaterList by MBB
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From grosbach at apple.com Tue Jan 31 14:56:55 2012
From: grosbach at apple.com (Jim Grosbach)
Date: Tue, 31 Jan 2012 20:56:55 -0000
Subject: [llvm-commits] [llvm] r149417 - /llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp
Message-ID: <20120131205655.811242A6C12C@llvm.org>

Author: grosbach
Date: Tue Jan 31 14:56:55 2012
New Revision: 149417

URL: http://llvm.org/viewvc/llvm-project?rev=149417&view=rev
Log:
Refactor loop for better readability.

Excellent suggestion from Ben Kramer.

Modified:
    llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp

Modified: llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp?rev=149417&r1=149416&r2=149417&view=diff
==============================================================================
--- llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp (original)
+++ llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp Tue Jan 31 14:56:55 2012
@@ -829,12 +829,11 @@
   unsigned Offset = BBInfo[MBB->getNumber()].Offset;
 
   // Sum instructions before MI in MBB.
-  for (MachineBasicBlock::iterator I = MBB->begin(); ; ++I) {
+  for (MachineBasicBlock::iterator I = MBB->begin(); &*I != MI; ++I) {
     assert(I != MBB->end() && "Didn't find MI in its own basic block?");
-    if (&*I == MI) return Offset;
     Offset += TII->GetInstSizeInBytes(I);
   }
-  // Unreachable.
+  return Offset;
 }
 
 /// CompareMBBNumbers - Little predicate function to sort the WaterList by MBB

From grosbach at apple.com Tue Jan 31 15:01:21 2012
From: grosbach at apple.com (Jim Grosbach)
Date: Tue, 31 Jan 2012 13:01:21 -0800
Subject: [llvm-commits] [llvm] r149416 - /llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp
In-Reply-To: <8E2B1DBC-63F7-44A6-8AC4-70BF0CF9401D@googlemail.com>
References: <20120131203453.C428C2A6C12C@llvm.org> <8E2B1DBC-63F7-44A6-8AC4-70BF0CF9401D@googlemail.com>
Message-ID:

On Jan 31, 2012, at 12:52 PM, Benjamin Kramer wrote:

>
> On 31.01.2012, at 21:34, Jim Grosbach wrote:
>
>> Author: grosbach
>> Date: Tue Jan 31 14:34:53 2012
>> New Revision: 149416
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=149416&view=rev
>> Log:
>> Add explanatory comment.
>>
>> Modified:
>>     llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp
>>
>> Modified: llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp?rev=149416&r1=149415&r2=149416&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp (original)
>> +++ llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp Tue Jan 31 14:34:53 2012
>> @@ -834,6 +834,7 @@
>>      if (&*I == MI) return Offset;
>>      Offset += TII->GetInstSizeInBytes(I);
>>    }
>> +  // Unreachable.
>>  }
>
> Why not rewrite this loop as
>
>   for (MachineBasicBlock::iterator I = MBB->begin(); &*I != MI; ++I) {
>     assert(I != MBB->end() && "Didn't find MI in its own basic block?");
>     Offset += TII->GetInstSizeInBytes(I);
>   }
>   return Offset;
>
> and avoid the reachability confusion?

An excellent idea. r149417.
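
For readers without an LLVM checkout, the loop change between r149416 and r149417 can be sketched in standalone C++. This is only an illustration under stated assumptions, not the code from ARMConstantIslandPass.cpp: `Inst`, `offsetBeforeOriginal` and `offsetBeforeRefactored` are hypothetical names, a `std::vector` stands in for the MachineBasicBlock, and a plain field stands in for `TII->GetInstSizeInBytes`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine instruction; only its size matters here.
struct Inst {
  unsigned SizeInBytes;
};

// Shape of the loop before the refactoring: the only exit is a `return`
// inside an endless loop, so the end of the function body is unreachable.
unsigned offsetBeforeOriginal(const std::vector<Inst> &Block, const Inst *MI,
                              unsigned BlockOffset) {
  unsigned Offset = BlockOffset;
  const Inst *End = Block.data() + Block.size();
  for (const Inst *I = Block.data();; ++I) {
    assert(I != End && "Didn't find MI in its own basic block?");
    if (I == MI)
      return Offset;
    Offset += I->SizeInBytes;
  }
}

// Shape of the loop after the refactoring: the exit test is the loop
// condition, and the function ends with an ordinary `return`.
unsigned offsetBeforeRefactored(const std::vector<Inst> &Block, const Inst *MI,
                                unsigned BlockOffset) {
  unsigned Offset = BlockOffset;
  const Inst *End = Block.data() + Block.size();
  for (const Inst *I = Block.data(); I != MI; ++I) {
    assert(I != End && "Didn't find MI in its own basic block?");
    Offset += I->SizeInBytes;
  }
  return Offset;
}
```

Both functions sum the sizes of the instructions that precede `MI` and should agree whenever `MI` really is an element of `Block`. In either shape the assert is the only guard against an instruction that is not in the block, which matches the diff, where the assert is kept unchanged.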
> - Ben > >> >> /// CompareMBBNumbers - Little predicate function to sort the WaterList by MBB >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From stoklund at 2pi.dk Tue Jan 31 14:57:56 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Tue, 31 Jan 2012 20:57:56 -0000 Subject: [llvm-commits] [llvm] r149418 - in /llvm/trunk/utils/TableGen: CodeGenRegisters.cpp CodeGenRegisters.h RegisterInfoEmitter.cpp Message-ID: <20120131205756.691902A6C12C@llvm.org> Author: stoklund Date: Tue Jan 31 14:57:55 2012 New Revision: 149418 URL: http://llvm.org/viewvc/llvm-project?rev=149418&view=rev Log: Add a TableGen CodeGenSubRegIndex class. This class is used to represent SubRegIndex instances instead of the raw Record pointers that were used before. No functional change intended. Modified: llvm/trunk/utils/TableGen/CodeGenRegisters.cpp llvm/trunk/utils/TableGen/CodeGenRegisters.h llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp Modified: llvm/trunk/utils/TableGen/CodeGenRegisters.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenRegisters.cpp?rev=149418&r1=149417&r2=149418&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CodeGenRegisters.cpp (original) +++ llvm/trunk/utils/TableGen/CodeGenRegisters.cpp Tue Jan 31 14:57:55 2012 @@ -22,6 +22,34 @@ using namespace llvm; //===----------------------------------------------------------------------===// +// CodeGenSubRegIndex +//===----------------------------------------------------------------------===// + +CodeGenSubRegIndex::CodeGenSubRegIndex(Record *R, unsigned Enum) + : TheDef(R), + EnumValue(Enum) +{} + +std::string CodeGenSubRegIndex::getNamespace() const { + if (TheDef->getValue("Namespace")) + return TheDef->getValueAsString("Namespace"); + else + return ""; +} + +const std::string 
&CodeGenSubRegIndex::getName() const { + return TheDef->getName(); +} + +std::string CodeGenSubRegIndex::getQualifiedName() const { + std::string N = getNamespace(); + if (!N.empty()) + N += "::"; + N += getName(); + return N; +} + +//===----------------------------------------------------------------------===// // CodeGenRegister //===----------------------------------------------------------------------===// @@ -40,8 +68,8 @@ namespace { struct Orphan { CodeGenRegister *SubReg; - Record *First, *Second; - Orphan(CodeGenRegister *r, Record *a, Record *b) + CodeGenSubRegIndex *First, *Second; + Orphan(CodeGenRegister *r, CodeGenSubRegIndex *a, CodeGenSubRegIndex *b) : SubReg(r), First(a), Second(b) {} }; } @@ -62,8 +90,9 @@ // First insert the direct subregs and make sure they are fully indexed. for (unsigned i = 0, e = SubList.size(); i != e; ++i) { CodeGenRegister *SR = RegBank.getReg(SubList[i]); - if (!SubRegs.insert(std::make_pair(Indices[i], SR)).second) - throw TGError(TheDef->getLoc(), "SubRegIndex " + Indices[i]->getName() + + CodeGenSubRegIndex *Idx = RegBank.getSubRegIdx(Indices[i]); + if (!SubRegs.insert(std::make_pair(Idx, SR)).second) + throw TGError(TheDef->getLoc(), "SubRegIndex " + Idx->getName() + " appears twice in Register " + getName()); } @@ -74,6 +103,7 @@ // Here the order is important - earlier subregs take precedence. for (unsigned i = 0, e = SubList.size(); i != e; ++i) { CodeGenRegister *SR = RegBank.getReg(SubList[i]); + CodeGenSubRegIndex *Idx = RegBank.getSubRegIdx(Indices[i]); const SubRegMap &Map = SR->getSubRegs(RegBank); // Add this as a super-register of SR now all sub-registers are in the list. @@ -84,7 +114,7 @@ for (SubRegMap::const_iterator SI = Map.begin(), SE = Map.end(); SI != SE; ++SI) { if (!SubRegs.insert(*SI).second) - Orphans.push_back(Orphan(SI->second, Indices[i], SI->first)); + Orphans.push_back(Orphan(SI->second, Idx, SI->first)); // Noop sub-register indexes are possible, so avoid duplicates. 
if (SI->second != SR) @@ -104,6 +134,7 @@ if (!BaseIdxInit || !BaseIdxInit->getDef()->isSubClassOf("SubRegIndex")) throw TGError(TheDef->getLoc(), "Invalid SubClassIndex in " + Pat->getAsString()); + CodeGenSubRegIndex *BaseIdx = RegBank.getSubRegIdx(BaseIdxInit->getDef()); // Resolve list of subreg indices into R2. CodeGenRegister *R2 = this; @@ -113,8 +144,9 @@ if (!IdxInit || !IdxInit->getDef()->isSubClassOf("SubRegIndex")) throw TGError(TheDef->getLoc(), "Invalid SubClassIndex in " + Pat->getAsString()); + CodeGenSubRegIndex *Idx = RegBank.getSubRegIdx(IdxInit->getDef()); const SubRegMap &R2Subs = R2->getSubRegs(RegBank); - SubRegMap::const_iterator ni = R2Subs.find(IdxInit->getDef()); + SubRegMap::const_iterator ni = R2Subs.find(Idx); if (ni == R2Subs.end()) throw TGError(TheDef->getLoc(), "Composite " + Pat->getAsString() + " refers to bad index in " + R2->getName()); @@ -122,7 +154,7 @@ } // Insert composite index. Allow overriding inherited indices etc. - SubRegs[BaseIdxInit->getDef()] = R2; + SubRegs[BaseIdx] = R2; // R2 is no longer an orphan. 
for (unsigned j = 0, je = Orphans.size(); j != je; ++j) @@ -143,13 +175,15 @@ } void -CodeGenRegister::addSubRegsPreOrder(SetVector &OSet) const { +CodeGenRegister::addSubRegsPreOrder(SetVector &OSet, + CodeGenRegBank &RegBank) const { assert(SubRegsComplete && "Must precompute sub-registers"); std::vector Indices = TheDef->getValueAsListOfDefs("SubRegIndices"); for (unsigned i = 0, e = Indices.size(); i != e; ++i) { - CodeGenRegister *SR = SubRegs.find(Indices[i])->second; + CodeGenSubRegIndex *Idx = RegBank.getSubRegIdx(Indices[i]); + CodeGenRegister *SR = SubRegs.find(Idx)->second; if (OSet.insert(SR)) - SR->addSubRegsPreOrder(OSet); + SR->addSubRegsPreOrder(OSet, RegBank); } } @@ -516,8 +550,10 @@ } void -CodeGenRegisterClass::getSuperRegClasses(Record *SubIdx, BitVector &Out) const { - DenseMap >::const_iterator +CodeGenRegisterClass::getSuperRegClasses(CodeGenSubRegIndex *SubIdx, + BitVector &Out) const { + DenseMap >::const_iterator FindI = SuperRegClasses.find(SubIdx); if (FindI == SuperRegClasses.end()) return; @@ -539,9 +575,11 @@ // Read in the user-defined (named) sub-register indices. // More indices will be synthesized later. - SubRegIndices = Records.getAllDerivedDefinitions("SubRegIndex"); - std::sort(SubRegIndices.begin(), SubRegIndices.end(), LessRecord()); - NumNamedIndices = SubRegIndices.size(); + std::vector SRIs = Records.getAllDerivedDefinitions("SubRegIndex"); + std::sort(SRIs.begin(), SRIs.end(), LessRecord()); + NumNamedIndices = SRIs.size(); + for (unsigned i = 0, e = SRIs.size(); i != e; ++i) + getSubRegIdx(SRIs[i]); // Read in the register definitions. 
std::vector Regs = Records.getAllDerivedDefinitions("Register"); @@ -585,6 +623,15 @@ CodeGenRegisterClass::computeSubClasses(*this); } +CodeGenSubRegIndex *CodeGenRegBank::getSubRegIdx(Record *Def) { + CodeGenSubRegIndex *&Idx = Def2SubRegIdx[Def]; + if (Idx) + return Idx; + Idx = new CodeGenSubRegIndex(Def, SubRegIndices.size() + 1); + SubRegIndices.push_back(Idx); + return Idx; +} + CodeGenRegister *CodeGenRegBank::getReg(Record *Def) { CodeGenRegister *&Reg = Def2Reg[Def]; if (Reg) @@ -630,34 +677,28 @@ throw TGError(Def->getLoc(), "Not a known RegisterClass!"); } -Record *CodeGenRegBank::getCompositeSubRegIndex(Record *A, Record *B, - bool create) { +CodeGenSubRegIndex* +CodeGenRegBank::getCompositeSubRegIndex(CodeGenSubRegIndex *A, + CodeGenSubRegIndex *B, + bool create) { // Look for an existing entry. - Record *&Comp = Composite[std::make_pair(A, B)]; + CodeGenSubRegIndex *&Comp = Composite[std::make_pair(A, B)]; if (Comp || !create) return Comp; // None exists, synthesize one. std::string Name = A->getName() + "_then_" + B->getName(); - Comp = new Record(Name, SMLoc(), Records); - SubRegIndices.push_back(Comp); + Comp = getSubRegIdx(new Record(Name, SMLoc(), Records)); return Comp; } -unsigned CodeGenRegBank::getSubRegIndexNo(Record *idx) { - std::vector::const_iterator i = - std::find(SubRegIndices.begin(), SubRegIndices.end(), idx); - assert(i != SubRegIndices.end() && "Not a SubRegIndex"); - return (i - SubRegIndices.begin()) + 1; -} - void CodeGenRegBank::computeComposites() { for (unsigned i = 0, e = Registers.size(); i != e; ++i) { CodeGenRegister *Reg1 = Registers[i]; const CodeGenRegister::SubRegMap &SRM1 = Reg1->getSubRegs(); for (CodeGenRegister::SubRegMap::const_iterator i1 = SRM1.begin(), e1 = SRM1.end(); i1 != e1; ++i1) { - Record *Idx1 = i1->first; + CodeGenSubRegIndex *Idx1 = i1->first; CodeGenRegister *Reg2 = i1->second; // Ignore identity compositions. if (Reg1 == Reg2) @@ -666,7 +707,8 @@ // Try composing Idx1 with another SubRegIndex. 
for (CodeGenRegister::SubRegMap::const_iterator i2 = SRM2.begin(), e2 = SRM2.end(); i2 != e2; ++i2) { - std::pair IdxPair(Idx1, i2->first); + std::pair + IdxPair(Idx1, i2->first); CodeGenRegister *Reg3 = i2->second; // Ignore identity compositions. if (Reg2 == Reg3) @@ -679,11 +721,11 @@ Composite.insert(std::make_pair(IdxPair, i1d->first)); // Conflicting composition? Emit a warning but allow it. if (!Ins.second && Ins.first->second != i1d->first) { - errs() << "Warning: SubRegIndex " << getQualifiedName(Idx1) - << " and " << getQualifiedName(IdxPair.second) + errs() << "Warning: SubRegIndex " << Idx1->getQualifiedName() + << " and " << IdxPair.second->getQualifiedName() << " compose ambiguously as " - << getQualifiedName(Ins.first->second) << " or " - << getQualifiedName(i1d->first) << "\n"; + << Ins.first->second->getQualifiedName() << " or " + << i1d->first->getQualifiedName() << "\n"; } } } @@ -826,7 +868,8 @@ // void CodeGenRegBank::inferSubClassWithSubReg(CodeGenRegisterClass *RC) { // Map SubRegIndex to set of registers in RC supporting that SubRegIndex. - typedef std::map SubReg2SetMap; + typedef std::map SubReg2SetMap; // Compute the set of registers supporting each SubRegIndex. SubReg2SetMap SRSets; @@ -841,7 +884,7 @@ // Find matching classes for all SRSets entries. Iterate in SubRegIndex // numerical order to visit synthetic indices last. for (unsigned sri = 0, sre = SubRegIndices.size(); sri != sre; ++sri) { - Record *SubIdx = SubRegIndices[sri]; + CodeGenSubRegIndex *SubIdx = SubRegIndices[sri]; SubReg2SetMap::const_iterator I = SRSets.find(SubIdx); // Unsupported SubRegIndex. Skip it. if (I == SRSets.end()) @@ -873,7 +916,7 @@ // Iterate in SubRegIndex numerical order to visit synthetic indices last. for (unsigned sri = 0, sre = SubRegIndices.size(); sri != sre; ++sri) { - Record *SubIdx = SubRegIndices[sri]; + CodeGenSubRegIndex *SubIdx = SubRegIndices[sri]; // Skip indexes that aren't fully supported by RC's registers. 
This was // computed by inferSubClassWithSubReg() above which should have been // called first. @@ -1010,7 +1053,7 @@ if (Set.insert(Reg)) // Reg is new, add all sub-registers. // The pre-ordering is not important here. - Reg->addSubRegsPreOrder(Set); + Reg->addSubRegsPreOrder(Set, *this); } // Second, find all super-registers that are completely covered by the set. Modified: llvm/trunk/utils/TableGen/CodeGenRegisters.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenRegisters.h?rev=149418&r1=149417&r2=149418&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CodeGenRegisters.h (original) +++ llvm/trunk/utils/TableGen/CodeGenRegisters.h Tue Jan 31 14:57:55 2012 @@ -31,6 +31,28 @@ namespace llvm { class CodeGenRegBank; + /// CodeGenSubRegIndex - Represents a sub-register index. + class CodeGenSubRegIndex { + Record *const TheDef; + const unsigned EnumValue; + + public: + CodeGenSubRegIndex(Record *R, unsigned Enum); + + const std::string &getName() const; + std::string getNamespace() const; + std::string getQualifiedName() const; + + // Order CodeGenSubRegIndex pointers by EnumValue. + struct Less { + bool operator()(const CodeGenSubRegIndex *A, + const CodeGenSubRegIndex *B) const { + assert(A && B); + return A->EnumValue < B->EnumValue; + } + }; + }; + /// CodeGenRegister - Represents a register definition. struct CodeGenRegister { Record *TheDef; @@ -39,7 +61,8 @@ bool CoveredBySubRegs; // Map SubRegIndex -> Register. - typedef std::map SubRegMap; + typedef std::map SubRegMap; CodeGenRegister(Record *R, unsigned Enum); @@ -55,7 +78,8 @@ } // Add sub-registers to OSet following a pre-order defined by the .td file. - void addSubRegsPreOrder(SetVector &OSet) const; + void addSubRegsPreOrder(SetVector &OSet, + CodeGenRegBank&) const; // List of super-registers in topological order, small to large. 
typedef std::vector SuperRegList; @@ -104,14 +128,15 @@ // Map SubRegIndex -> sub-class. This is the largest sub-class where all // registers have a SubRegIndex sub-register. - DenseMap SubClassWithSubReg; + DenseMap SubClassWithSubReg; // Map SubRegIndex -> set of super-reg classes. This is all register // classes SuperRC such that: // // R:SubRegIndex in this RC for all R in SuperRC. // - DenseMap > SuperRegClasses; + DenseMap > SuperRegClasses; public: unsigned EnumValue; std::string Namespace; @@ -158,20 +183,23 @@ // getSubClassWithSubReg - Returns the largest sub-class where all // registers have a SubIdx sub-register. - CodeGenRegisterClass *getSubClassWithSubReg(Record *SubIdx) const { + CodeGenRegisterClass* + getSubClassWithSubReg(CodeGenSubRegIndex *SubIdx) const { return SubClassWithSubReg.lookup(SubIdx); } - void setSubClassWithSubReg(Record *SubIdx, CodeGenRegisterClass *SubRC) { + void setSubClassWithSubReg(CodeGenSubRegIndex *SubIdx, + CodeGenRegisterClass *SubRC) { SubClassWithSubReg[SubIdx] = SubRC; } // getSuperRegClasses - Returns a bit vector of all register classes // containing only SubIdx super-registers of this class. - void getSuperRegClasses(Record *SubIdx, BitVector &Out) const; + void getSuperRegClasses(CodeGenSubRegIndex *SubIdx, BitVector &Out) const; // addSuperRegClass - Add a class containing only SudIdx super-registers. - void addSuperRegClass(Record *SubIdx, CodeGenRegisterClass *SuperRC) { + void addSuperRegClass(CodeGenSubRegIndex *SubIdx, + CodeGenRegisterClass *SuperRC) { SuperRegClasses[SubIdx].insert(SuperRC); } @@ -240,8 +268,12 @@ RecordKeeper &Records; SetTheory Sets; - std::vector SubRegIndices; + // SubRegIndices. + std::vector SubRegIndices; + DenseMap Def2SubRegIdx; unsigned NumNamedIndices; + + // Registers. std::vector Registers; DenseMap Def2Reg; @@ -268,7 +300,8 @@ // Composite SubRegIndex instances. // Map (SubRegIndex, SubRegIndex) -> SubRegIndex. 
- typedef DenseMap, Record*> CompositeMap; + typedef DenseMap, + CodeGenSubRegIndex*> CompositeMap; CompositeMap Composite; // Populate the Composite map from sub-register relationships. @@ -282,14 +315,16 @@ // Sub-register indices. The first NumNamedIndices are defined by the user // in the .td files. The rest are synthesized such that all sub-registers // have a unique name. - const std::vector &getSubRegIndices() { return SubRegIndices; } + ArrayRef getSubRegIndices() { return SubRegIndices; } unsigned getNumNamedIndices() { return NumNamedIndices; } - // Map a SubRegIndex Record to its enum value. - unsigned getSubRegIndexNo(Record *idx); + // Find a SubRegIndex form its Record def. + CodeGenSubRegIndex *getSubRegIdx(Record*); // Find or create a sub-register index representing the A+B composition. - Record *getCompositeSubRegIndex(Record *A, Record *B, bool create = false); + CodeGenSubRegIndex *getCompositeSubRegIndex(CodeGenSubRegIndex *A, + CodeGenSubRegIndex *B, + bool create = false); const std::vector &getRegisters() { return Registers; } Modified: llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp?rev=149418&r1=149417&r2=149418&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp (original) +++ llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp Tue Jan 31 14:57:55 2012 @@ -282,7 +282,7 @@ continue; // getSubRegs() orders by SubRegIndex. We want a topological order. 
SetVector SR; - Reg.addSubRegsPreOrder(SR); + Reg.addSubRegsPreOrder(SR, RegBank); OS << " const unsigned " << Reg.getName() << "_SubRegsSet[] = { "; for (unsigned j = 0, je = SR.size(); j != je; ++j) OS << getQualifiedName(SR[j]->TheDef) << ", "; @@ -431,10 +431,11 @@ "unsigned) const;\n" << "};\n\n"; - const std::vector &SubRegIndices = RegBank.getSubRegIndices(); + ArrayRef SubRegIndices = RegBank.getSubRegIndices(); if (!SubRegIndices.empty()) { OS << "\n// Subregister indices\n"; - std::string Namespace = SubRegIndices[0]->getValueAsString("Namespace"); + std::string Namespace = + SubRegIndices[0]->getNamespace(); if (!Namespace.empty()) OS << "namespace " << Namespace << " {\n"; OS << "enum {\n NoSubRegister,\n"; @@ -690,7 +691,7 @@ unsigned NamedIndices = RegBank.getNumNamedIndices(); // Emit SubRegIndex names, skipping 0 - const std::vector &SubRegIndices = RegBank.getSubRegIndices(); + ArrayRef SubRegIndices = RegBank.getSubRegIndices(); OS << "\n static const char *const " << TargetName << "SubRegIndexTable[] = { \""; for (unsigned i = 0, e = SubRegIndices.size(); i != e; ++i) { @@ -729,7 +730,7 @@ OS << " default: return 0;\n"; for (CodeGenRegister::SubRegMap::const_iterator ii = SRM.begin(), ie = SRM.end(); ii != ie; ++ii) - OS << " case " << getQualifiedName(ii->first) + OS << " case " << ii->first->getQualifiedName() << ": return " << getQualifiedName(ii->second->TheDef) << ";\n"; OS << " };\n" << " break;\n"; } @@ -749,7 +750,7 @@ for (CodeGenRegister::SubRegMap::const_iterator ii = SRM.begin(), ie = SRM.end(); ii != ie; ++ii) OS << " if (SubRegNo == " << getQualifiedName(ii->second->TheDef) - << ") return " << getQualifiedName(ii->first) << ";\n"; + << ") return " << ii->first->getQualifiedName() << ";\n"; OS << " return 0;\n"; } OS << " };\n"; @@ -764,15 +765,16 @@ for (unsigned i = 0, e = SubRegIndices.size(); i != e; ++i) { bool Open = false; for (unsigned j = 0; j != e; ++j) { - if (Record *Comp = 
RegBank.getCompositeSubRegIndex(SubRegIndices[i], - SubRegIndices[j])) { + if (CodeGenSubRegIndex *Comp = + RegBank.getCompositeSubRegIndex(SubRegIndices[i], + SubRegIndices[j])) { if (!Open) { - OS << " case " << getQualifiedName(SubRegIndices[i]) + OS << " case " << SubRegIndices[i]->getQualifiedName() << ": switch(IdxB) {\n default: return IdxB;\n"; Open = true; } - OS << " case " << getQualifiedName(SubRegIndices[j]) - << ": return " << getQualifiedName(Comp) << ";\n"; + OS << " case " << SubRegIndices[j]->getQualifiedName() + << ": return " << Comp->getQualifiedName() << ";\n"; } } if (Open) @@ -801,7 +803,7 @@ const CodeGenRegisterClass &RC = *RegisterClasses[rci]; OS << " {\t// " << RC.getName() << "\n"; for (unsigned sri = 0, sre = SubRegIndices.size(); sri != sre; ++sri) { - Record *Idx = SubRegIndices[sri]; + CodeGenSubRegIndex *Idx = SubRegIndices[sri]; if (CodeGenRegisterClass *SRC = RC.getSubClassWithSubReg(Idx)) OS << " " << SRC->EnumValue + 1 << ",\t// " << Idx->getName() << " -> " << SRC->getName() << "\n"; @@ -842,7 +844,7 @@ const CodeGenRegisterClass &RC = *RegisterClasses[rci]; OS << " {\t// " << RC.getName() << "\n"; for (unsigned sri = 0, sre = SubRegIndices.size(); sri != sre; ++sri) { - Record *Idx = SubRegIndices[sri]; + CodeGenSubRegIndex *Idx = SubRegIndices[sri]; BV.reset(); RC.getSuperRegClasses(Idx, BV); OS << " { "; From kcc at google.com Tue Jan 31 15:24:19 2012 From: kcc at google.com (Kostya Serebryany) Date: Tue, 31 Jan 2012 13:24:19 -0800 Subject: [llvm-commits] [llvm] r149365 - in /llvm/trunk: include/llvm/Constants.h lib/AsmParser/LLParser.cpp lib/Transforms/Instrumentation/AddressSanitizer.cpp lib/VMCore/Constants.cpp lib/VMCore/Core.cpp lib/VMCore/IRBuilder.cpp tools/bugpoint/Miscompil In-Reply-To: <076AA06D-AC93-41B4-B830-464D2A76E710@nondot.org> References: <076AA06D-AC93-41B4-B830-464D2A76E710@nondot.org> Message-ID: On Tue, Jan 31, 2012 at 12:43 PM, Chris Lattner wrote: > > On Jan 31, 2012, at 4:22 AM, Alexander Potapenko 
wrote: > > > On Tue, Jan 31, 2012 at 3:56 PM, Chris Lattner wrote: > >> > >> On Jan 31, 2012, at 12:27 AM, Alexander Potapenko wrote: > >> > >>> On Tue, Jan 31, 2012 at 10:18 AM, Chris Lattner > wrote: > >>>> Author: lattner > >>>> Date: Tue Jan 31 00:18:43 2012 > >>>> New Revision: 149365 > >>>> > >>>> URL: http://llvm.org/viewvc/llvm-project?rev=149365&view=rev > >>>> Log: > >>>> eliminate the "string" form of ConstantArray::get, using > >>>> ConstantDataArray::getString instead. > >>> Chris, > >>> > >>> our internal (for the moment) ASan buildbot is reporting test failures > >>> starting at the range of r149363 to r149365 on Linux x64 and Mac x64 > >>> Namely they are: > >> > >> That sounds bad, and I'd definitely like to fix these. Can you give me > more information about how they are failing? What command line is failing, > and with what stack trace? > >> > >> -Chris > > Attached is the log of `make check-all` on my Snow Leopard machine. > > Feel free to ask for more information. > > > > Thanks Alexander. I think that Benjamin fixed this for me (thanks!), can > you verify that it is resolved for you? > Our bot is nicely green. (Really hope to make it public this quarter) --kcc > > -Chris > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/5d7ac821/attachment.html From chandlerc at google.com Tue Jan 31 15:29:07 2012 From: chandlerc at google.com (Chandler Carruth) Date: Tue, 31 Jan 2012 13:29:07 -0800 Subject: [llvm-commits] [llvm] r149365 - in /llvm/trunk: include/llvm/Constants.h lib/AsmParser/LLParser.cpp lib/Transforms/Instrumentation/AddressSanitizer.cpp lib/VMCore/Constants.cpp lib/VMCore/Core.cpp lib/VMCore/IRBuilder.cpp tools/bugpoint/Miscompil In-Reply-To: References: <076AA06D-AC93-41B4-B830-464D2A76E710@nondot.org> Message-ID: On Tue, Jan 31, 2012 at 1:24 PM, Kostya Serebryany wrote: > Our bot is nicely green. > (Really hope to make it public this quarter) > :: hangs head in shame :: I promise I'm working on this; it's just taking forever... From echristo at apple.com Tue Jan 31 15:34:05 2012 From: echristo at apple.com (Eric Christopher) Date: Tue, 31 Jan 2012 13:34:05 -0800 Subject: [llvm-commits] [cfe-commits] [PATCH] autoconf: add private config.h to clang In-Reply-To: References: Message-ID: <79775E3E-32A4-4393-90F3-5DAA8E418F92@apple.com> On Jan 30, 2012, at 1:01 PM, nobled wrote: > This already exists in the CMake build, which is part of what makes > building clang separately from llvm via cmake possible. This cleans up > that discrepancy between the build systems. > > I'll just add the minimal file include/clang/Config/config.h.in (with > the same contents as its config.h.cmake counterpart), then add it to > LLVM's configure script so it gets generated, then remove the #ifdef > logic from clang that was conditionally including the cmake generated > header. > > Okay to commit? Seems reasonable. Didn't think we'd need one, but it's fine with me if we want one.
-eric From mcrosier at apple.com Tue Jan 31 15:44:33 2012 From: mcrosier at apple.com (Chad Rosier) Date: Tue, 31 Jan 2012 13:44:33 -0800 Subject: [llvm-commits] [polly] r149415 - /polly/trunk/www/example_load_Polly_into_clang.html In-Reply-To: <20120131202421.A20A82A6C12C@llvm.org> References: <20120131202421.A20A82A6C12C@llvm.org> Message-ID: <3DA1131F-A696-49E4-B464-C7EA781152C5@apple.com> Hi Tobias, On Jan 31, 2012, at 12:24 PM, Tobias Grosser wrote: > Author: grosser > Date: Tue Jan 31 14:24:21 2012 > New Revision: 149415 > > URL: http://llvm.org/viewvc/llvm-project?rev=149415&view=rev > Log: > www: Add missing -mllvm to documentation > > Found by Ryan Taylor > > Modified: > polly/trunk/www/example_load_Polly_into_clang.html > > Modified: polly/trunk/www/example_load_Polly_into_clang.html > URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/example_load_Polly_into_clang.html?rev=149415&r1=149414&r2=149415&view=diff > ============================================================================== > --- polly/trunk/www/example_load_Polly_into_clang.html (original) > +++ polly/trunk/www/example_load_Polly_into_clang.html Tue Jan 31 14:24:21 2012 > @@ -43,7 +43,7 @@ > Optimizing with Polly is as easy as ading -O3 -polly to your compiler Does a similar change need to be applied above? Chad > flags (Polly is only available at -O3). > > -
pollycc -O3 -polly file.c
> +pollycc -O3 -mllvm -polly file.c
>
> Automatic OpenMP code generation
> > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From preston.gurd at intel.com Tue Jan 31 15:42:27 2012 From: preston.gurd at intel.com (Gurd, Preston) Date: Tue, 31 Jan 2012 21:42:27 +0000 Subject: [llvm-commits] [llvm][PATCH - REVISED][Commit request] X86 Instruction scheduler for the Intel Atom In-Reply-To: <3893BF02-E382-470D-AFB4-9F30AE9ECE34@apple.com> References: <3893BF02-E382-470D-AFB4-9F30AE9ECE34@apple.com> Message-ID: Hello Andy, Thank you for your comments. I have revised the patch as you suggested. I have also added "-mcpu=generic" to two additional tests (2010-02-19-TailCallRetAddrBug.ll and peep-test-3.ll) which were failing when run on Atom, after I applied a Evan's suggestion to change the scheduling preference to "Hybrid". Unless you have any other comments, please commit the attached patch. Thanks, Preston From: Andrew Trick [mailto:atrick at apple.com] Sent: Tuesday, January 31, 2012 2:22 AM To: Gurd, Preston Cc: Evan Cheng; llvm-commits at cs.uiuc.edu Subject: Re: [llvm-commits] [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom On Jan 23, 2012, at 3:05 PM, "Gurd, Preston" > wrote: Revision 2: Tests which were failing, when run on an Atom, due to the tests finding a schedule different from what was expected, have been changed to use "-mcpu=generic" in order to prevent the Atom scheduler from running, so that all "make check" tests pass. From: Gurd, Preston Sent: Tuesday, January 17, 2012 4:29 PM To: Evan Cheng Cc: llvm-commits at cs.uiuc.edu Subject: [llvm-commits] [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom The attached patch implements most of an instruction scheduler for the Intel Atom. It adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. 
It sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. It adds a test to verify that the scheduler is working. I realize that this patch is kind of large, but please consider that the vast majority of the changes consist only of adding an instruction itinerary class name to an instruction. Revision: the patch also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP. Please commit the patch if it seems acceptable. Preston From: Evan Cheng [mailto:evan.cheng at apple.com] Sent: Monday, January 16, 2012 12:01 PM To: Gurd, Preston Cc: llvm-commits at cs.uiuc.edu Subject: Re: [llvm-commits] [llvm][PATCH][Review request] X86 Instruction scheduler for the Intel Atom Very nice. One question, I noticed you haven't changed the scheduling preference so x86_64 is still using ILP scheduler while i386 is using register pressure reduction scheduler. Have you tried changing the preference to latency scheduler for Atom? Evan On Jan 13, 2012, at 3:26 PM, Gurd, Preston wrote: The attached patch implements most of an instruction scheduler for the Intel Atom. It adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. It sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. It adds a test to verify that the scheduler is working. I realize that this patch is kind of large, but please consider that the vast majority of the changes consist only of adding an instruction itinerary class name to an instruction. 
Hi Preston, I just have a couple minor questions I'd like you to address before I commit this: +def : AtomProc<"atom", [ProcIntelAtom, FeatureSSE3, FeatureCMPXCHG16B, + FeatureMOVBE, FeatureSlowBTMem]>; These features are already included in the ProcIntelAtom family. Why do you need to list them again? Please verify, but subtarget features should be transitively implied. + //CriticalPathRCs.push_back(&X86::GPRRegClass); If you want to leave this disabled, please add comments. Thanks, -Andy -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/640fa4fc/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... Name: llvm-x86-scheduler.diff Type: application/octet-stream Size: 186924 bytes Desc: llvm-x86-scheduler.diff Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/640fa4fc/attachment-0001.obj From mcrosier at apple.com Tue Jan 31 15:46:39 2012 From: mcrosier at apple.com (Chad Rosier) Date: Tue, 31 Jan 2012 13:46:39 -0800 Subject: [llvm-commits] [polly] r149415 - /polly/trunk/www/example_load_Polly_into_clang.html In-Reply-To: <3DA1131F-A696-49E4-B464-C7EA781152C5@apple.com> References: <20120131202421.A20A82A6C12C@llvm.org> <3DA1131F-A696-49E4-B464-C7EA781152C5@apple.com> Message-ID: On Jan 31, 2012, at 1:44 PM, Chad Rosier wrote: > Hi Tobias, > > On Jan 31, 2012, at 12:24 PM, Tobias Grosser wrote: > >> Author: grosser >> Date: Tue Jan 31 14:24:21 2012 >> New Revision: 149415 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=149415&view=rev >> Log: >> www: Add missing -mllvm to documentation >> >> Found by Ryan Taylor >> >> Modified: >> polly/trunk/www/example_load_Polly_into_clang.html >> >> Modified: polly/trunk/www/example_load_Polly_into_clang.html >> URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/example_load_Polly_into_clang.html?rev=149415&r1=149414&r2=149415&view=diff >> 
============================================================================== >> --- polly/trunk/www/example_load_Polly_into_clang.html (original) >> +++ polly/trunk/www/example_load_Polly_into_clang.html Tue Jan 31 14:24:21 2012 >> @@ -43,7 +43,7 @@ >> Optimizing with Polly is as easy as ading -O3 -polly to your compiler Also, ading -> adding. > Does a similar change need to be applied above? > > Chad > >> flags (Polly is only available at -O3). >> >> -
pollycc -O3 -polly file.c
>> +pollycc -O3 -mllvm -polly file.c
>>
>> Automatic OpenMP code generation
>> >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From stoklund at 2pi.dk Tue Jan 31 15:44:11 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Tue, 31 Jan 2012 21:44:11 -0000 Subject: [llvm-commits] [llvm] r149423 - in /llvm/trunk/utils/TableGen: CodeGenRegisters.cpp CodeGenRegisters.h RegisterInfoEmitter.cpp Message-ID: <20120131214411.928782A6C12C@llvm.org> Author: stoklund Date: Tue Jan 31 15:44:11 2012 New Revision: 149423 URL: http://llvm.org/viewvc/llvm-project?rev=149423&view=rev Log: Move the composite map into CodeGenSubRegIndex. Each SubRegIndex keeps track of how it composes. Modified: llvm/trunk/utils/TableGen/CodeGenRegisters.cpp llvm/trunk/utils/TableGen/CodeGenRegisters.h llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp Modified: llvm/trunk/utils/TableGen/CodeGenRegisters.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenRegisters.cpp?rev=149423&r1=149422&r2=149423&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CodeGenRegisters.cpp (original) +++ llvm/trunk/utils/TableGen/CodeGenRegisters.cpp Tue Jan 31 15:44:11 2012 @@ -49,6 +49,16 @@ return N; } +void CodeGenSubRegIndex::cleanComposites() { + // Clean out redundant mappings of the form this+X -> X. 
+ for (CompMap::iterator i = Composed.begin(), e = Composed.end(); i != e;) { + CompMap::iterator j = i; + ++i; + if (j->first == j->second) + Composed.erase(j); + } +} + //===----------------------------------------------------------------------===// // CodeGenRegister //===----------------------------------------------------------------------===// @@ -168,7 +178,7 @@ Orphan &O = Orphans[i]; if (!O.SubReg) continue; - SubRegs[RegBank.getCompositeSubRegIndex(O.First, O.Second, true)] = + SubRegs[RegBank.getCompositeSubRegIndex(O.First, O.Second)] = O.SubReg; } return SubRegs; @@ -679,16 +689,16 @@ CodeGenSubRegIndex* CodeGenRegBank::getCompositeSubRegIndex(CodeGenSubRegIndex *A, - CodeGenSubRegIndex *B, - bool create) { + CodeGenSubRegIndex *B) { // Look for an existing entry. - CodeGenSubRegIndex *&Comp = Composite[std::make_pair(A, B)]; - if (Comp || !create) + CodeGenSubRegIndex *Comp = A->compose(B); + if (Comp) return Comp; // None exists, synthesize one. std::string Name = A->getName() + "_then_" + B->getName(); Comp = getSubRegIdx(new Record(Name, SMLoc(), Records)); + A->addComposite(B, Comp); return Comp; } @@ -707,8 +717,7 @@ // Try composing Idx1 with another SubRegIndex. for (CodeGenRegister::SubRegMap::const_iterator i2 = SRM2.begin(), e2 = SRM2.end(); i2 != e2; ++i2) { - std::pair - IdxPair(Idx1, i2->first); + CodeGenSubRegIndex *Idx2 = i2->first; CodeGenRegister *Reg3 = i2->second; // Ignore identity compositions. if (Reg2 == Reg3) @@ -717,16 +726,13 @@ for (CodeGenRegister::SubRegMap::const_iterator i1d = SRM1.begin(), e1d = SRM1.end(); i1d != e1d; ++i1d) { if (i1d->second == Reg3) { - std::pair Ins = - Composite.insert(std::make_pair(IdxPair, i1d->first)); // Conflicting composition? Emit a warning but allow it. 
- if (!Ins.second && Ins.first->second != i1d->first) { + if (CodeGenSubRegIndex *Prev = Idx1->addComposite(Idx2, i1d->first)) errs() << "Warning: SubRegIndex " << Idx1->getQualifiedName() - << " and " << IdxPair.second->getQualifiedName() + << " and " << Idx2->getQualifiedName() << " compose ambiguously as " - << Ins.first->second->getQualifiedName() << " or " + << Prev->getQualifiedName() << " or " << i1d->first->getQualifiedName() << "\n"; - } } } } @@ -735,13 +741,8 @@ // We don't care about the difference between (Idx1, Idx2) -> Idx2 and invalid // compositions, so remove any mappings of that form. - for (CompositeMap::iterator i = Composite.begin(), e = Composite.end(); - i != e;) { - CompositeMap::iterator j = i; - ++i; - if (j->first.second == j->second) - Composite.erase(j); - } + for (unsigned i = 0, e = SubRegIndices.size(); i != e; ++i) + SubRegIndices[i]->cleanComposites(); } // Compute sets of overlapping registers. Modified: llvm/trunk/utils/TableGen/CodeGenRegisters.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenRegisters.h?rev=149423&r1=149422&r2=149423&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CodeGenRegisters.h (original) +++ llvm/trunk/utils/TableGen/CodeGenRegisters.h Tue Jan 31 15:44:11 2012 @@ -51,6 +51,34 @@ return A->EnumValue < B->EnumValue; } }; + + // Map of composite subreg indices. + typedef std::map CompMap; + + // Returns the subreg index that results from composing this with Idx. + // Returns NULL if this and Idx don't compose. + CodeGenSubRegIndex *compose(CodeGenSubRegIndex *Idx) const { + CompMap::const_iterator I = Composed.find(Idx); + return I == Composed.end() ? 0 : I->second; + } + + // Add a composite subreg index: this+A = B. 
+ // Return a conflicting composite, or NULL + CodeGenSubRegIndex *addComposite(CodeGenSubRegIndex *A, + CodeGenSubRegIndex *B) { + std::pair Ins = + Composed.insert(std::make_pair(A, B)); + return (Ins.second || Ins.first->second == B) ? 0 : Ins.first->second; + } + + // Clean out redundant composite mappings. + void cleanComposites(); + + // Return the map of composites. + const CompMap &getComposites() const { return Composed; } + + private: + CompMap Composed; }; /// CodeGenRegister - Represents a register definition. @@ -298,12 +326,6 @@ void inferMatchingSuperRegClass(CodeGenRegisterClass *RC, unsigned FirstSubRegRC = 0); - // Composite SubRegIndex instances. - // Map (SubRegIndex, SubRegIndex) -> SubRegIndex. - typedef DenseMap, - CodeGenSubRegIndex*> CompositeMap; - CompositeMap Composite; - // Populate the Composite map from sub-register relationships. void computeComposites(); @@ -323,8 +345,7 @@ // Find or create a sub-register index representing the A+B composition. CodeGenSubRegIndex *getCompositeSubRegIndex(CodeGenSubRegIndex *A, - CodeGenSubRegIndex *B, - bool create = false); + CodeGenSubRegIndex *B); const std::vector &getRegisters() { return Registers; } Modified: llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp?rev=149423&r1=149422&r2=149423&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp (original) +++ llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp Tue Jan 31 15:44:11 2012 @@ -766,8 +766,7 @@ bool Open = false; for (unsigned j = 0; j != e; ++j) { if (CodeGenSubRegIndex *Comp = - RegBank.getCompositeSubRegIndex(SubRegIndices[i], - SubRegIndices[j])) { + SubRegIndices[i]->compose(SubRegIndices[j])) { if (!Open) { OS << " case " << SubRegIndices[i]->getQualifiedName() << ": switch(IdxB) {\n default: return IdxB;\n"; From bigcheesegs at gmail.com Tue Jan 31 
15:45:27 2012 From: bigcheesegs at gmail.com (Michael J. Spencer) Date: Tue, 31 Jan 2012 21:45:27 -0000 Subject: [llvm-commits] [lld] r149425 - /lld/trunk/include/lld/Core/Atom.h Message-ID: <20120131214527.5A19C2A6C12C@llvm.org> Author: mspencer Date: Tue Jan 31 15:45:26 2012 New Revision: 149425 URL: http://llvm.org/viewvc/llvm-project?rev=149425&view=rev Log: NULL requires including cstddef. Just use 0. Modified: lld/trunk/include/lld/Core/Atom.h Modified: lld/trunk/include/lld/Core/Atom.h URL: http://llvm.org/viewvc/llvm-project/lld/trunk/include/lld/Core/Atom.h?rev=149425&r1=149424&r2=149425&view=diff ============================================================================== --- lld/trunk/include/lld/Core/Atom.h (original) +++ lld/trunk/include/lld/Core/Atom.h Tue Jan 31 15:45:26 2012 @@ -50,9 +50,9 @@ virtual Definition definition() const = 0; /// definedAtom - like dynamic_cast, if atom is definitionRegular - /// returns atom cast to DefinedAtom*, else returns NULL; - virtual const DefinedAtom* definedAtom() const { return NULL; } - + /// returns atom cast to DefinedAtom*, else returns nullptr; + virtual const DefinedAtom* definedAtom() const { return 0; } + protected: /// Atom is an abstract base class. Only subclasses can access constructor. Atom() {} From bigcheesegs at gmail.com Tue Jan 31 15:45:53 2012 From: bigcheesegs at gmail.com (Michael J. Spencer) Date: Tue, 31 Jan 2012 21:45:53 -0000 Subject: [llvm-commits] [lld] r149426 - in /lld/trunk/lib/Core: NativeFileFormat.h NativeReader.cpp NativeWriter.cpp Message-ID: <20120131214553.F1C792A6C12C@llvm.org> Author: mspencer Date: Tue Jan 31 15:45:53 2012 New Revision: 149426 URL: http://llvm.org/viewvc/llvm-project?rev=149426&view=rev Log: Flexible array members are not in C++03, and MSVC doesn't support them. 
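A minimal sketch of the idea behind this change, under simplified assumptions: instead of declaring a trailing flexible array member in the header struct, the records are located by offset from the start of the buffer. The names FileHeader, Chunk, chunkOffset, readChunk and buildBuffer are hypothetical stand-ins, not lld's actual definitions, and this sketch copies records out with memcpy, where the commit itself uses reinterpret_cast.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified record that follows the header in the buffer.
struct Chunk {
  uint32_t signature;
  uint32_t elementCount;
};

// Hypothetical fixed-size header. There is deliberately no
// 'Chunk chunks[];' member: flexible array members are a C99 feature
// and are not part of standard C++.
struct FileHeader {
  uint32_t fileSize;
  uint32_t chunkCount;
};

// Byte offset of the i-th chunk: the chunks start right after the header.
inline std::size_t chunkOffset(uint32_t i) {
  return sizeof(FileHeader) + i * sizeof(Chunk);
}

// Copy the i-th chunk out of the buffer. Using memcpy avoids any
// alignment or aliasing concerns in this sketch.
inline Chunk readChunk(const std::vector<uint8_t> &buf, uint32_t i) {
  assert(chunkOffset(i) + sizeof(Chunk) <= buf.size());
  Chunk c;
  std::memcpy(&c, buf.data() + chunkOffset(i), sizeof(Chunk));
  return c;
}

// Build a buffer holding a header followed by the given chunks.
inline std::vector<uint8_t> buildBuffer(const std::vector<Chunk> &chunks) {
  std::vector<uint8_t> buf(sizeof(FileHeader) + chunks.size() * sizeof(Chunk));
  FileHeader h;
  h.fileSize = static_cast<uint32_t>(buf.size());
  h.chunkCount = static_cast<uint32_t>(chunks.size());
  std::memcpy(buf.data(), &h, sizeof h);
  if (!chunks.empty())
    std::memcpy(buf.data() + sizeof h, chunks.data(),
                chunks.size() * sizeof(Chunk));
  return buf;
}
```

The header's size stays fixed and known at compile time, so the same offset arithmetic works in both the reader and the writer, which is the shape of the change in the diff that follows.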
Modified: lld/trunk/lib/Core/NativeFileFormat.h lld/trunk/lib/Core/NativeReader.cpp lld/trunk/lib/Core/NativeWriter.cpp Modified: lld/trunk/lib/Core/NativeFileFormat.h URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/NativeFileFormat.h?rev=149426&r1=149425&r2=149426&view=diff ============================================================================== --- lld/trunk/lib/Core/NativeFileFormat.h (original) +++ lld/trunk/lib/Core/NativeFileFormat.h Tue Jan 31 15:45:53 2012 @@ -80,7 +80,6 @@ uint32_t architecture; uint32_t fileSize; uint32_t chunkCount; - NativeChunk chunks[]; }; // Modified: lld/trunk/lib/Core/NativeReader.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/NativeReader.cpp?rev=149426&r1=149425&r2=149426&view=diff ============================================================================== --- lld/trunk/lib/Core/NativeReader.cpp (original) +++ lld/trunk/lib/Core/NativeReader.cpp Tue Jan 31 15:45:53 2012 @@ -175,6 +175,8 @@ reinterpret_cast(mb->getBufferStart()); const NativeFileHeader* const header = reinterpret_cast(base); + const NativeChunk *const chunks = + reinterpret_cast(base + sizeof(NativeFileHeader)); // make sure magic matches if ( memcmp(header->magic, NATIVE_FILE_HEADER_MAGIC, 16) != 0 ) return make_error_code(unknown_file_format); @@ -190,7 +192,7 @@ // process each chunk for(uint32_t i=0; i < header->chunkCount; ++i) { llvm::error_code ec; - const NativeChunk* chunk = &header->chunks[i]; + const NativeChunk* chunk = &chunks[i]; // sanity check chunk is within file if ( chunk->fileOffset > fileSize ) return make_error_code(file_malformed); Modified: lld/trunk/lib/Core/NativeWriter.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/NativeWriter.cpp?rev=149426&r1=149425&r2=149426&view=diff ============================================================================== --- lld/trunk/lib/Core/NativeWriter.cpp (original) +++ lld/trunk/lib/Core/NativeWriter.cpp Tue Jan 31 15:45:53 2012 @@ -71,6 +71,9 @@ 
_headerBufferSize = sizeof(NativeFileHeader) + 4*sizeof(NativeChunk); _headerBuffer = reinterpret_cast (operator new(_headerBufferSize, std::nothrow)); + NativeChunk *chunks = + reinterpret_cast(reinterpret_cast(_headerBuffer) + + sizeof(NativeFileHeader)); memcpy(_headerBuffer->magic, NATIVE_FILE_HEADER_MAGIC, 16); _headerBuffer->endian = NFH_LittleEndian; _headerBuffer->architecture = 0; @@ -78,25 +81,25 @@ _headerBuffer->chunkCount = 4; // create chunk for atom ivar array - NativeChunk& ch0 = _headerBuffer->chunks[0]; + NativeChunk& ch0 = chunks[0]; ch0.signature = NCS_DefinedAtomsV1; ch0.fileOffset = _headerBufferSize; ch0.fileSize = _definedAtomIvars.size()*sizeof(NativeDefinedAtomIvarsV1); ch0.elementCount = _definedAtomIvars.size(); - // create chunk for attributes - NativeChunk& ch1 = _headerBuffer->chunks[1]; + // create chunk for attributes + NativeChunk& ch1 = chunks[1]; ch1.signature = NCS_AttributesArrayV1; ch1.fileOffset = ch0.fileOffset + ch0.fileSize; ch1.fileSize = _attributes.size()*sizeof(NativeAtomAttributesV1); ch1.elementCount = _attributes.size(); - // create chunk for content - NativeChunk& ch2 = _headerBuffer->chunks[2]; + // create chunk for content + NativeChunk& ch2 = chunks[2]; ch2.signature = NCS_Content; ch2.fileOffset = ch1.fileOffset + ch1.fileSize; ch2.fileSize = _contentPool.size(); ch2.elementCount = _contentPool.size(); // create chunk for symbol strings - NativeChunk& ch3 = _headerBuffer->chunks[3]; + NativeChunk& ch3 = chunks[3]; ch3.signature = NCS_Strings; ch3.fileOffset = ch2.fileOffset + ch2.fileSize; ch3.fileSize = _stringPool.size(); From bigcheesegs at gmail.com Tue Jan 31 15:46:06 2012 From: bigcheesegs at gmail.com (Michael J. 
Spencer)
Date: Tue, 31 Jan 2012 21:46:06 -0000
Subject: [llvm-commits] [lld] r149427 - /lld/trunk/lib/Core/NativeWriter.cpp
Message-ID: <20120131214606.129CE2A6C12C@llvm.org>

Author: mspencer
Date: Tue Jan 31 15:46:05 2012
New Revision: 149427

URL: http://llvm.org/viewvc/llvm-project?rev=149427&view=rev
Log:
If cont.size() is 0, the expression &_contentPool[result] has undefined behavior because it indexes past the end of _contentPool.

Modified:
    lld/trunk/lib/Core/NativeWriter.cpp

Modified: lld/trunk/lib/Core/NativeWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/NativeWriter.cpp?rev=149427&r1=149426&r2=149427&view=diff
==============================================================================
--- lld/trunk/lib/Core/NativeWriter.cpp (original)
+++ lld/trunk/lib/Core/NativeWriter.cpp Tue Jan 31 15:46:05 2012
@@ -128,8 +128,7 @@
       return 0;
     uint32_t result = _contentPool.size();
     llvm::ArrayRef cont = atom.rawContent();
-    _contentPool.insert(_contentPool.end(), cont.size(), 0);
-    memcpy(&_contentPool[result], cont.data(), cont.size());
+    _contentPool.insert(_contentPool.end(), cont.begin(), cont.end());
     return result;
   }

From bigcheesegs at gmail.com Tue Jan 31 15:46:17 2012
From: bigcheesegs at gmail.com (Michael J. Spencer)
Date: Tue, 31 Jan 2012 21:46:17 -0000
Subject: [llvm-commits] [lld] r149428 - /lld/trunk/tools/lld-core/lld-core.cpp
Message-ID: <20120131214617.BD5692A6C12C@llvm.org>

Author: mspencer
Date: Tue Jan 31 15:46:17 2012
New Revision: 149428

URL: http://llvm.org/viewvc/llvm-project?rev=149428&view=rev
Log:
Add pretty stack tracing and llvm_shutdown.
Modified: lld/trunk/tools/lld-core/lld-core.cpp Modified: lld/trunk/tools/lld-core/lld-core.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/tools/lld-core/lld-core.cpp?rev=149428&r1=149427&r2=149428&view=diff ============================================================================== --- lld/trunk/tools/lld-core/lld-core.cpp (original) +++ lld/trunk/tools/lld-core/lld-core.cpp Tue Jan 31 15:46:17 2012 @@ -23,8 +23,11 @@ #include "llvm/ADT/SmallString.h" #include "llvm/ADT/Twine.h" #include "llvm/Support/DataTypes.h" +#include "llvm/Support/ManagedStatic.h" #include "llvm/Support/MemoryBuffer.h" +#include "llvm/Support/PrettyStackTrace.h" #include "llvm/Support/raw_ostream.h" +#include "llvm/Support/Signals.h" #include "llvm/Support/system_error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/FileSystem.h" @@ -198,6 +201,11 @@ } int main(int argc, const char *argv[]) { + // Print a stack trace if we signal out. + llvm::sys::PrintStackTraceOnErrorSignal(); + llvm::PrettyStackTraceProgram X(argc, argv); + llvm::llvm_shutdown_obj Y; // Call llvm_shutdown() on exit. + // read input YAML doc into object file(s) std::vector files; if (error(yaml::parseObjectTextFileOrSTDIN(llvm::StringRef(argv[1]), files))) From bigcheesegs at gmail.com Tue Jan 31 15:46:29 2012 From: bigcheesegs at gmail.com (Michael J. Spencer) Date: Tue, 31 Jan 2012 21:46:29 -0000 Subject: [llvm-commits] [lld] r149429 - /lld/trunk/lib/Core/NativeReader.cpp Message-ID: <20120131214629.D31722A6C12D@llvm.org> Author: mspencer Date: Tue Jan 31 15:46:29 2012 New Revision: 149429 URL: http://llvm.org/viewvc/llvm-project?rev=149429&view=rev Log: Fix use after free. 
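The log is terse, so here is a minimal sketch of the failure mode it seems to describe. It assumes the usual shape of the bug: a smart pointer keeps ownership of a buffer whose raw pointer the callee also stores. Buffer, Parsed and parse are hypothetical stand-ins rather than lld classes, and std::unique_ptr::release() is used as an assumed analogue of the take() call in this commit's diff.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a memory buffer; not an lld or LLVM class.
struct Buffer {
  std::string data;
  explicit Buffer(const std::string &d) : data(d) {}
};

// Hypothetical consumer that takes ownership of the raw pointer it is given
// and frees it in its destructor.
class Parsed {
public:
  explicit Parsed(Buffer *b) : buf(b) {}
  ~Parsed() { delete buf; }
  Parsed(const Parsed &) = delete;
  Parsed &operator=(const Parsed &) = delete;

  const std::string &text() const {
    assert(buf && "Parsed must own a buffer");
    return buf->data;
  }

private:
  Buffer *buf;
};

Parsed *parse(const std::string &contents) {
  std::unique_ptr<Buffer> mb(new Buffer(contents));
  // Passing mb.get() here would leave `mb` as the owner: the buffer would be
  // deleted when parse() returns and Parsed would keep a dangling pointer.
  // release() hands ownership over, so only Parsed frees the buffer.
  return new Parsed(mb.release());
}
```

With get() in place of release(), the first call to text() after parse() returns would read freed memory, and the buffer would later be freed twice.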
Modified: lld/trunk/lib/Core/NativeReader.cpp Modified: lld/trunk/lib/Core/NativeReader.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/NativeReader.cpp?rev=149429&r1=149428&r2=149429&view=diff ============================================================================== --- lld/trunk/lib/Core/NativeReader.cpp (original) +++ lld/trunk/lib/Core/NativeReader.cpp Tue Jan 31 15:46:29 2012 @@ -409,7 +409,7 @@ if ( ec ) return ec; - return parseNativeObjectFile(mb.get(), path, result); + return parseNativeObjectFile(mb.take(), path, result); } From bigcheesegs at gmail.com Tue Jan 31 15:46:41 2012 From: bigcheesegs at gmail.com (Michael J. Spencer) Date: Tue, 31 Jan 2012 21:46:41 -0000 Subject: [llvm-commits] [lld] r149430 - /lld/trunk/lib/Core/NativeWriter.cpp Message-ID: <20120131214641.629582A6C12C@llvm.org> Author: mspencer Date: Tue Jan 31 15:46:41 2012 New Revision: 149430 URL: http://llvm.org/viewvc/llvm-project?rev=149430&view=rev Log: &vectorval[0] is UB when vectorval.size() == 0. 
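As a small illustration of the rule stated in this log, the sketch below guards the &v[0] expression with an empty() check before taking the address. appendBytes is a hypothetical helper written for this note, not lld code, and it writes into a std::string instead of a raw_ostream to stay self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not lld code: append the raw bytes of `v` to `out`.
template <typename T>
void appendBytes(std::string &out, const std::vector<T> &v) {
  // Guard first: evaluating &v[0] on an empty vector is undefined behavior,
  // even if the resulting pointer is never read through.
  if (v.empty())
    return;
  const std::size_t before = out.size();
  (void)before; // only used by the assert below
  out.append(reinterpret_cast<const char *>(&v[0]), v.size() * sizeof(T));
  assert(out.size() == before + v.size() * sizeof(T));
}
```

Calling appendBytes with an empty vector leaves the output untouched, which is the behavior the unguarded code presumably intended all along.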
Modified: lld/trunk/lib/Core/NativeWriter.cpp Modified: lld/trunk/lib/Core/NativeWriter.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/NativeWriter.cpp?rev=149430&r1=149429&r2=149430&view=diff ============================================================================== --- lld/trunk/lib/Core/NativeWriter.cpp (original) +++ lld/trunk/lib/Core/NativeWriter.cpp Tue Jan 31 15:46:41 2012 @@ -38,12 +38,16 @@ // write the lld::File in native format to the specified stream void write(llvm::raw_ostream& out) { out.write((char*)_headerBuffer, _headerBufferSize); - out.write((char*)&_definedAtomIvars[0], - _definedAtomIvars.size()*sizeof(NativeDefinedAtomIvarsV1)); - out.write((char*)&_attributes[0], - _attributes.size()*sizeof(NativeAtomAttributesV1)); - out.write((char*)&_contentPool[0], _contentPool.size()); - out.write(&_stringPool[0], _stringPool.size()); + if (!_definedAtomIvars.empty()) + out.write((char*)&_definedAtomIvars[0], + _definedAtomIvars.size()*sizeof(NativeDefinedAtomIvarsV1)); + if (!_attributes.empty()) + out.write((char*)&_attributes[0], + _attributes.size()*sizeof(NativeAtomAttributesV1)); + if (!_contentPool.empty()) + out.write((char*)&_contentPool[0], _contentPool.size()); + if (!_stringPool.empty()) + out.write(&_stringPool[0], _stringPool.size()); } private: From bigcheesegs at gmail.com Tue Jan 31 15:46:52 2012 From: bigcheesegs at gmail.com (Michael J. Spencer) Date: Tue, 31 Jan 2012 21:46:52 -0000 Subject: [llvm-commits] [lld] r149431 - /lld/trunk/tools/lld-core/lld-core.cpp Message-ID: <20120131214652.E45282A6C12C@llvm.org> Author: mspencer Date: Tue Jan 31 15:46:52 2012 New Revision: 149431 URL: http://llvm.org/viewvc/llvm-project?rev=149431&view=rev Log: Don't delete the temp file until after we finish reading from it. 
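One way to make this ordering hard to get wrong is to tie the removal to scope exit. The sketch below is an assumption-laden illustration, not what the commit does (the commit simply moves the remove call below the last read): ScopedFileRemover and writeThenReadBack are hypothetical names, and std::remove stands in for llvm::sys::fs::remove.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical RAII helper: removes the named file when it goes out of scope.
class ScopedFileRemover {
public:
  explicit ScopedFileRemover(const std::string &path) : path_(path) {}
  ~ScopedFileRemover() { std::remove(path_.c_str()); }
  ScopedFileRemover(const ScopedFileRemover &) = delete;
  ScopedFileRemover &operator=(const ScopedFileRemover &) = delete;

private:
  std::string path_;
};

// Writes `text` to a temporary file, reads the first line back, and removes
// the file only after the reader has been destroyed.
std::string writeThenReadBack(const std::string &path,
                              const std::string &text) {
  {
    std::ofstream out(path.c_str());
    out << text;
  } // `out` is flushed and closed here

  ScopedFileRemover cleanup(path); // declared before the reader on purpose
  std::ifstream in(path.c_str());
  assert(in.is_open() && "temp file must still exist while we read it");
  std::string line;
  std::getline(in, line);
  // Locals are destroyed in reverse order: `in` closes first, then `cleanup`
  // removes the file.
  return line;
}
```

Because the remover is declared before the stream, every read necessarily happens while the file still exists.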
Modified: lld/trunk/tools/lld-core/lld-core.cpp Modified: lld/trunk/tools/lld-core/lld-core.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/tools/lld-core/lld-core.cpp?rev=149431&r1=149430&r2=149431&view=diff ============================================================================== --- lld/trunk/tools/lld-core/lld-core.cpp (original) +++ lld/trunk/tools/lld-core/lld-core.cpp Tue Jan 31 15:46:52 2012 @@ -235,13 +235,13 @@ // read native file lld::File* natFile; parseNativeObjectFileOrSTDIN(tempPath, natFile); - - // delete temp .o file - bool existed; - llvm::sys::fs::remove(tempPath.str(), existed); // write new atom graph out as YAML doc yaml::writeObjectText(*natFile, out); + // delete temp .o file + bool existed; + llvm::sys::fs::remove(tempPath.str(), existed); + return 0; } From bigcheesegs at gmail.com Tue Jan 31 15:47:14 2012 From: bigcheesegs at gmail.com (Michael J. Spencer) Date: Tue, 31 Jan 2012 21:47:14 -0000 Subject: [llvm-commits] [lld] r149432 - in /lld/trunk: include/lld/Core/Error.h lib/Core/CMakeLists.txt lib/Core/Error.cpp lib/Core/NativeReader.cpp lib/Core/YamlReader.cpp Message-ID: <20120131214714.1B20D2A6C12C@llvm.org> Author: mspencer Date: Tue Jan 31 15:47:13 2012 New Revision: 149432 URL: http://llvm.org/viewvc/llvm-project?rev=149432&view=rev Log: Cleanup system_error extensions. 
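For readers unfamiliar with the pattern being cleaned up, here is a rough sketch of a custom error category. It uses the standard <system_error> header on the assumption that llvm/Support/system_error.h mirrors it closely; the demo namespace, reader_error and reader_category are hypothetical names, and a scoped enum replaces the struct-wrapped enum used in the commit.

```cpp
#include <cassert>
#include <string>
#include <system_error>

namespace demo {

// Hypothetical error enumeration for a file reader.
enum class reader_error { success = 0, unknown_file_format, file_too_short };

// The category maps raw integer values back to names and messages.
class reader_category_impl : public std::error_category {
public:
  const char *name() const noexcept override { return "demo.reader"; }

  std::string message(int ev) const override {
    switch (static_cast<reader_error>(ev)) {
    case reader_error::success:
      return "Success";
    case reader_error::unknown_file_format:
      return "Unknown file format";
    case reader_error::file_too_short:
      return "file truncated";
    }
    assert(false && "reader_error enumerator without a message");
    return "unknown error";
  }
};

// A function-local static gives one category object with safe initialization.
inline const std::error_category &reader_category() {
  static reader_category_impl instance;
  return instance;
}

// Found by argument-dependent lookup when an error_code is built from the enum.
inline std::error_code make_error_code(reader_error e) {
  return std::error_code(static_cast<int>(e), reader_category());
}

} // namespace demo

namespace std {
// Opt the enum in to implicit conversion to std::error_code.
template <> struct is_error_code_enum<demo::reader_error> : true_type {};
} // namespace std
```

With the trait specialized, a function returning std::error_code can return demo::reader_error::file_too_short directly, and callers can test the result in a boolean context.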
Added: lld/trunk/include/lld/Core/Error.h lld/trunk/lib/Core/Error.cpp Modified: lld/trunk/lib/Core/CMakeLists.txt lld/trunk/lib/Core/NativeReader.cpp lld/trunk/lib/Core/YamlReader.cpp Added: lld/trunk/include/lld/Core/Error.h URL: http://llvm.org/viewvc/llvm-project/lld/trunk/include/lld/Core/Error.h?rev=149432&view=auto ============================================================================== --- lld/trunk/include/lld/Core/Error.h (added) +++ lld/trunk/include/lld/Core/Error.h Tue Jan 31 15:47:13 2012 @@ -0,0 +1,76 @@ +//===- Error.h - system_error extensions for lld ----------------*- C++ -*-===// +// +// The LLVM Linker +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This declares a new error_category for the lld library. +// +//===----------------------------------------------------------------------===// + +#ifndef LLD_CORE_ERROR_H +#define LLD_CORE_ERROR_H + +#include "llvm/Support/system_error.h" + +namespace lld { + +const llvm::error_category &native_reader_category(); + +struct native_reader_error { + enum _ { + success = 0, + unknown_file_format, + file_too_short, + file_malformed, + unknown_chunk_type, + memory_error, + }; + _ v_; + + native_reader_error(_ v) : v_(v) {} + explicit native_reader_error(int v) : v_(_(v)) {} + operator int() const {return v_;} +}; + +inline llvm::error_code make_error_code(native_reader_error e) { + return llvm::error_code(static_cast(e), native_reader_category()); +} + +const llvm::error_category &yaml_reader_category(); + +struct yaml_reader_error { + enum _ { + success = 0, + unknown_keyword, + illegal_value + }; + _ v_; + + yaml_reader_error(_ v) : v_(v) {} + explicit yaml_reader_error(int v) : v_(_(v)) {} + operator int() const {return v_;} +}; + +inline llvm::error_code make_error_code(yaml_reader_error e) { + return llvm::error_code(static_cast(e), 
yaml_reader_category()); +} + +} // end namespace lld + +namespace llvm { + +template <> struct is_error_code_enum : true_type { }; +template <> +struct is_error_code_enum : true_type { }; + +template <> struct is_error_code_enum : true_type { }; +template <> +struct is_error_code_enum : true_type { }; + +} // end namespace llvm + +#endif Modified: lld/trunk/lib/Core/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/CMakeLists.txt?rev=149432&r1=149431&r2=149432&view=diff ============================================================================== --- lld/trunk/lib/Core/CMakeLists.txt (original) +++ lld/trunk/lib/Core/CMakeLists.txt Tue Jan 31 15:47:13 2012 @@ -1,4 +1,5 @@ add_lld_library(lldCore + Error.cpp File.cpp NativeFileFormat.h NativeReader.cpp Added: lld/trunk/lib/Core/Error.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/Error.cpp?rev=149432&view=auto ============================================================================== --- lld/trunk/lib/Core/Error.cpp (added) +++ lld/trunk/lib/Core/Error.cpp Tue Jan 31 15:47:13 2012 @@ -0,0 +1,92 @@ +//===- Error.cpp - system_error extensions for lld --------------*- C++ -*-===// +// +// The LLVM Linker +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. 
+// +//===----------------------------------------------------------------------===// + +#include "lld/Core/Error.h" + +#include "llvm/Support/ErrorHandling.h" + +using namespace lld; + +class _native_reader_error_category : public llvm::_do_message { +public: + virtual const char* name() const { + return "lld.native.reader"; + } + + virtual std::string message(int ev) const { + switch (ev) { + case native_reader_error::success: + return "Success"; + case native_reader_error::unknown_file_format: + return "Unknown file foramt"; + case native_reader_error::file_too_short: + return "file truncated"; + case native_reader_error::file_malformed: + return "file malformed"; + case native_reader_error::memory_error: + return "out of memory"; + case native_reader_error::unknown_chunk_type: + return "unknown chunk type"; + default: + llvm_unreachable("An enumerator of native_reader_error does not have a " + "message defined."); + } + } + + virtual llvm::error_condition default_error_condition(int ev) const { + if (ev == native_reader_error::success) + return llvm::errc::success; + return llvm::errc::invalid_argument; + } +}; + +const llvm::error_category &lld::native_reader_category() { + static _native_reader_error_category o; + return o; +} + +inline llvm::error_code make_error_code(native_reader_error e) { + return llvm::error_code(static_cast(e), native_reader_category()); +} + +class _yaml_reader_error_category : public llvm::_do_message { +public: + virtual const char* name() const { + return "lld.yaml.reader"; + } + + virtual std::string message(int ev) const { + switch (ev) { + case yaml_reader_error::success: + return "Success"; + case yaml_reader_error::unknown_keyword: + return "Unknown keyword found in yaml file"; + case yaml_reader_error::illegal_value: + return "Bad value found in yaml file"; + default: + llvm_unreachable("An enumerator of yaml_reader_error does not have a " + "message defined."); + } + } + + virtual llvm::error_condition 
default_error_condition(int ev) const { + if (ev == yaml_reader_error::success) + return llvm::errc::success; + return llvm::errc::invalid_argument; + } +}; + +const llvm::error_category &lld::yaml_reader_category() { + static _yaml_reader_error_category o; + return o; +} + +inline llvm::error_code make_error_code(yaml_reader_error e) { + return llvm::error_code(static_cast(e), yaml_reader_category()); +} Modified: lld/trunk/lib/Core/NativeReader.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/NativeReader.cpp?rev=149432&r1=149431&r2=149432&view=diff ============================================================================== --- lld/trunk/lib/Core/NativeReader.cpp (original) +++ lld/trunk/lib/Core/NativeReader.cpp Tue Jan 31 15:47:13 2012 @@ -16,8 +16,8 @@ #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/MemoryBuffer.h" -#include "llvm/Support/system_error.h" +#include "lld/Core/Error.h" #include "lld/Core/File.h" #include "lld/Core/Atom.h" @@ -28,53 +28,6 @@ // forward reference class NativeFile; - -enum native_reader_errors { - success = 0, - unknown_file_format, - file_too_short, - file_malformed, - unknown_chunk_type, - memory_error, -}; - -class reader_error_category : public llvm::_do_message { -public: - virtual const char* name() const { - return "lld.native.reader"; - } - virtual std::string message(int ev) const; -}; - -const reader_error_category reader_error_category_singleton; - -std::string reader_error_category::message(int ev) const { - switch (ev) { - case success: - return "Success"; - case unknown_file_format: - return "Unknown file foramt"; - case file_too_short: - return "file truncated"; - case file_malformed: - return "file malformed"; - case memory_error: - return "out of memory"; - case unknown_chunk_type: - return "unknown chunk type"; - default: - llvm_unreachable("An enumerator of native_reader_errors does not have a " - "message defined."); - } -} - -inline 
llvm::error_code make_error_code(native_reader_errors e) { - return llvm::error_code(static_cast(e), reader_error_category_singleton); -} - - - - // // An object of this class is instantied for each NativeDefinedAtomIvarsV1 // struct in the NCS_DefinedAtomsV1 chunk. @@ -179,13 +132,13 @@ reinterpret_cast(base + sizeof(NativeFileHeader)); // make sure magic matches if ( memcmp(header->magic, NATIVE_FILE_HEADER_MAGIC, 16) != 0 ) - return make_error_code(unknown_file_format); - + return make_error_code(native_reader_error::unknown_file_format); + // make sure mapped file contains all needed data const size_t fileSize = mb->getBufferSize(); if ( header->fileSize > fileSize ) - return make_error_code(file_too_short); - + return make_error_code(native_reader_error::file_too_short); + // instantiate NativeFile object and add values to it as found NativeFile* file = new NativeFile(mb, path); @@ -194,10 +147,10 @@ llvm::error_code ec; const NativeChunk* chunk = &chunks[i]; // sanity check chunk is within file - if ( chunk->fileOffset > fileSize ) - return make_error_code(file_malformed); - if ( (chunk->fileOffset + chunk->fileSize) > fileSize) - return make_error_code(file_malformed); + if ( chunk->fileOffset > fileSize ) + return make_error_code(native_reader_error::file_malformed); + if ( (chunk->fileOffset + chunk->fileSize) > fileSize) + return make_error_code(native_reader_error::file_malformed); // process chunk, based on signature switch ( chunk->signature ) { case NCS_DefinedAtomsV1: @@ -213,7 +166,7 @@ ec = file->processStrings(base, chunk); break; default: - return make_error_code(unknown_chunk_type); + return make_error_code(native_reader_error::unknown_chunk_type); } if ( ec ) { delete file; @@ -224,9 +177,9 @@ result = file; } - - return make_error_code(success); + + return make_error_code(native_reader_error::success); } virtual ~NativeFile() { @@ -266,11 +219,11 @@ uint8_t* atomsStart = reinterpret_cast (operator new(atomsArraySize, std::nothrow)); if 
(atomsStart == NULL ) - return make_error_code(memory_error); - const size_t ivarElementSize = chunk->fileSize + return make_error_code(native_reader_error::memory_error); + const size_t ivarElementSize = chunk->fileSize / chunk->elementCount; if ( ivarElementSize != sizeof(NativeDefinedAtomIvarsV1) ) - return make_error_code(file_malformed); + return make_error_code(native_reader_error::file_malformed); uint8_t* atomsEnd = atomsStart + atomsArraySize; const NativeDefinedAtomIvarsV1* ivarData = reinterpret_cast @@ -284,14 +237,14 @@ this->_definedAtoms.arrayStart = atomsStart; this->_definedAtoms.arrayEnd = atomsEnd; this->_definedAtoms.elementSize = atomSize; - return make_error_code(success); + return make_error_code(native_reader_error::success); } // set up pointers to attributes array llvm::error_code processAttributesV1(const uint8_t* base, const NativeChunk* chunk) { this->_attributes = base + chunk->fileOffset; this->_attributesMaxOffset = chunk->fileSize; - return make_error_code(success); + return make_error_code(native_reader_error::success); } // set up pointers to string pool in file @@ -299,7 +252,7 @@ const NativeChunk* chunk) { this->_strings = reinterpret_cast(base + chunk->fileOffset); this->_stringsMaxOffset = chunk->fileSize; - return make_error_code(success); + return make_error_code(native_reader_error::success); } // set up pointers to content area in file @@ -307,7 +260,7 @@ const NativeChunk* chunk) { this->_contentStart = base + chunk->fileOffset; this->_contentEnd = base + chunk->fileOffset + chunk->fileSize; - return make_error_code(success); + return make_error_code(native_reader_error::success); } llvm::StringRef string(uint32_t offset) const { Modified: lld/trunk/lib/Core/YamlReader.cpp URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/YamlReader.cpp?rev=149432&r1=149431&r2=149432&view=diff ============================================================================== --- lld/trunk/lib/Core/YamlReader.cpp (original) +++ 
lld/trunk/lib/Core/YamlReader.cpp Tue Jan 31 15:47:13 2012 @@ -11,6 +11,7 @@ #include "lld/Core/YamlReader.h" #include "lld/Core/Atom.h" +#include "lld/Core/Error.h" #include "lld/Core/File.h" #include "lld/Core/Reference.h" @@ -29,41 +30,6 @@ namespace lld { namespace yaml { -enum yaml_reader_errors { - success = 0, - unknown_keyword, - illegal_value -}; - -class reader_error_category : public llvm::_do_message { -public: - virtual const char* name() const { - return "lld.yaml.reader"; - } - virtual std::string message(int ev) const; -}; - -const reader_error_category reader_error_category_singleton; - -std::string reader_error_category::message(int ev) const { - switch (ev) { - case success: - return "Success"; - case unknown_keyword: - return "Unknown keyword found in yaml file"; - case illegal_value: - return "Bad value found in yaml file"; - default: - llvm_unreachable("An enumerator of yaml_reader_errors does not have a " - "message defined."); - } -} - -inline llvm::error_code make_error_code(yaml_reader_errors e) { - return llvm::error_code(static_cast(e), reader_error_category_singleton); -} - - class YAML { public: struct Entry { @@ -704,8 +670,8 @@ } else if (strcmp(entry->key, KeyValues::sizeKeyword) == 0) { llvm::StringRef val = entry->value; - if ( val.getAsInteger(0, atomState._size) ) - return make_error_code(illegal_value); + if ( val.getAsInteger(0, atomState._size) ) + return make_error_code(yaml_reader_error::illegal_value); haveAtom = true; } else if (strcmp(entry->key, KeyValues::contentKeyword) == 0) { @@ -720,7 +686,7 @@ inFixups = true; } else { - return make_error_code(unknown_keyword); + return make_error_code(yaml_reader_error::unknown_keyword); } } else if (depthForFixups == entry->depth) { @@ -749,7 +715,7 @@ } result.push_back(file); - return make_error_code(success); + return make_error_code(yaml_reader_error::success); } // From ashok.thirumurthi at intel.com Tue Jan 31 15:52:32 2012 From: ashok.thirumurthi at intel.com 
(Thirumurthi, Ashok)
Date: Tue, 31 Jan 2012 21:52:32 +0000
Subject: [llvm-commits] [PATCH] Basic MCJIT for ELF with ExecutionEngine tests
In-Reply-To: <9BBE4537D1BAAB479E9E8F9D4234619D326470@HASMSX103.ger.corp.intel.com>
References: <9BBE4537D1BAAB479E9E8F9D4234619D326470@HASMSX103.ger.corp.intel.com>
Message-ID:

Hello,

Following the email that Eli Bendersky sent to LLVMdev (http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-January/046671.html), attached is the second patch in the MCJIT/ELF series, which Eli and I refined based on Andy Kaylor's work.

This patch modifies ExecutionEngine/MCJIT to allocate executable memory for the result of MC code emission. In turn, RuntimeDyldELF uses DyldELFObject (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120116/135091.html) to rebase section addresses and then performs x86 relocations to create a live memory-mapped object file. The result should be a 100% pass rate on ExecutionEngine tests on 32/64-bit Linux and should set the stage for GDB-JIT integration for debugging. In addition, RuntimeDyldELF was modified to extensively check return codes in the debug build.

As discussed with Jim Grosbach (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120116/135177.html), this patch also backs out the section-based dy-load (which might well trade functionality for performance), but please stay tuned for its reincarnation. The patch also removes behavior related to the function-based RuntimeDyldImpl that was recently deprecated on MachO (thanks Jim!).

Note that this patch enables mcjit testing for ExecutionEngine tests using a second RUN line that uses mcjit only on Linux. mcjit does not yet work on Windows (blocked on review: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120123/135703.html), and is in mid-development on the MachO side.
Thanks in advance for your review, - Ashok Thirumurthi Intel of Canada -------------- next part -------------- A non-text attachment was scrubbed... Name: elf-mcjit.diff Type: application/octet-stream Size: 56395 bytes Desc: elf-mcjit.diff Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/432cc473/attachment-0001.obj From lenny at Colorado.EDU Tue Jan 31 15:55:34 2012 From: lenny at Colorado.EDU (Lenny Maiorani) Date: Tue, 31 Jan 2012 14:55:34 -0700 Subject: [llvm-commits] [cfe-commits] [PATCH][Review Request] EarlyCSE stack overflow - bugzilla 11794 In-Reply-To: References: <051207DF-F3D0-4E8B-A756-E6B193750483@colorado.edu> <184C997C-6E18-4BE8-A3F8-092102981B11@colorado.edu> <6663C373-1004-4FB2-9E39-F784587B4CDD@2pi.dk> <4AA7AFB1-51F3-48B4-B543-00141B738258@colorado.edu> <8D478BF2-5BEA-45AA-A525-44C2085C11CC@2pi.dk> <4A53CDC9-0BC2-48F4-BBF6-334D98F358DA@colorado.edu> <36F5E7D6-DA08-4A70-BF2E-788642787972@2pi.dk> Message-ID: On Jan 30, 2012, at 3:31 PM, Jakob Stoklund Olesen wrote: > On Jan 30, 2012, at 1:11 PM, Lenny Maiorani wrote: >> >> Ok, I understand the algorithm difference, but I don't know what to do to store the hash table scopes on the stack. They are not copy-constructable or assignable so they don't work with STL containers. I could use a shared pointer, but that is just reference counting again. > > Oh, how annoying. I think you should just do what MachineCSE does and store pointers: > > void MachineCSE::EnterScope(MachineBasicBlock *MBB) { > DEBUG(dbgs() << "Entering: " << MBB->getName() << '\n'); > ScopeType *Scope = new ScopeType(VNT); > ScopeMap[MBB] = Scope; > } > > void MachineCSE::ExitScope(MachineBasicBlock *MBB) { > DEBUG(dbgs() << "Exiting: " << MBB->getName() << '\n'); > DenseMap::iterator SI = ScopeMap.find(MBB); > assert(SI != ScopeMap.end()); > ScopeMap.erase(SI); > delete SI->second; > } > > (But please don't dereference iterators after erasing them). > > /jakob Jakob, Please review this new patch. 
It uses the same method as the DepthFirstIterator to keep only the path back to the root node on the stack, pushes and pops scopes as appropriate by keeping pointers to them in the StackNode, and properly pushes and pops the CurrentGeneration. I have tested this with my little C++ file generated by the Python script and the LLVM/Clang unit test suites. It all passes and the binaries output match. This patch will be added to the bug #11794. Provided this passes review I will commit this patch. If not, I have run out of time to continue working on this and will have to move on. Thanks for the help along the way, -Lenny From stoklund at 2pi.dk Tue Jan 31 15:51:53 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Tue, 31 Jan 2012 21:51:53 -0000 Subject: [llvm-commits] [llvm] r149433 - /llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp Message-ID: <20120131215153.D001D2A6C12C@llvm.org> Author: stoklund Date: Tue Jan 31 15:51:53 2012 New Revision: 149433 URL: http://llvm.org/viewvc/llvm-project?rev=149433&view=rev Log: Don't assign a value to NUM_TARGET_NAMED_SUBREGS. It was wrong and completely unused. 
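The diff in this commit relies on a plain language rule: an enumerator without an initializer is one more than the previous enumerator, so a trailing name doubles as the element count. A minimal sketch, with hypothetical sub-register names (sub_lo, sub_hi) and a hypothetical helper that are not taken from any real target:

```cpp
#include <cassert>

// Shape of the enum the emitter would print; the names are made up.
enum SubRegIndex {
  NoSubRegister,            // 0
  sub_lo,                   // 1
  sub_hi,                   // 2
  NUM_TARGET_NAMED_SUBREGS  // 3, implicitly the previous value plus one
};

// Hypothetical helper: true if `idx` names a real sub-register index.
inline bool isNamedSubReg(unsigned idx) {
  assert(idx < NUM_TARGET_NAMED_SUBREGS && "index out of range");
  return idx != NoSubRegister;
}
```

Since the count follows from position, it cannot drift from the number of names actually printed, which an explicitly computed initializer could.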
Modified:
    llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp

Modified: llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp?rev=149433&r1=149432&r2=149433&view=diff
==============================================================================
--- llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp (original)
+++ llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp Tue Jan 31 15:51:53 2012
@@ -441,8 +441,7 @@
     OS << "enum {\n NoSubRegister,\n";
     for (unsigned i = 0, e = RegBank.getNumNamedIndices(); i != e; ++i)
       OS << " " << SubRegIndices[i]->getName() << ",\t// " << i+1 << "\n";
-    OS << " NUM_TARGET_NAMED_SUBREGS = " << SubRegIndices.size()+1 << "\n";
-    OS << "};\n";
+    OS << " NUM_TARGET_NAMED_SUBREGS\n};\n";
     if (!Namespace.empty())
       OS << "}\n";
   }

From eli.friedman at gmail.com Tue Jan 31 16:04:17 2012
From: eli.friedman at gmail.com (Eli Friedman)
Date: Tue, 31 Jan 2012 14:04:17 -0800
Subject: [llvm-commits] [lld] r149426 - in /lld/trunk/lib/Core: NativeFileFormat.h NativeReader.cpp NativeWriter.cpp
In-Reply-To: <20120131214553.F1C792A6C12C@llvm.org>
References: <20120131214553.F1C792A6C12C@llvm.org>
Message-ID:

On Tue, Jan 31, 2012 at 1:45 PM, Michael J. Spencer wrote:
> Author: mspencer
> Date: Tue Jan 31 15:45:53 2012
> New Revision: 149426
>
> URL: http://llvm.org/viewvc/llvm-project?rev=149426&view=rev
> Log:
> Flexible array members are not in C++03, and MSVC doesn't support them.
>
> Modified:
>    lld/trunk/lib/Core/NativeFileFormat.h
>    lld/trunk/lib/Core/NativeReader.cpp
>    lld/trunk/lib/Core/NativeWriter.cpp
>
> Modified: lld/trunk/lib/Core/NativeFileFormat.h
> URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/NativeFileFormat.h?rev=149426&r1=149425&r2=149426&view=diff
> ==============================================================================
> --- lld/trunk/lib/Core/NativeFileFormat.h (original)
> +++ lld/trunk/lib/Core/NativeFileFormat.h Tue Jan 31 15:45:53 2012
> @@ -80,7 +80,6 @@
>   uint32_t    architecture;
>   uint32_t    fileSize;
>   uint32_t    chunkCount;
> -  NativeChunk chunks[];
>  };

This isn't obviously safe: the presence of the flexible array might change the tail padding.

-Eli

>  //
>
> Modified: lld/trunk/lib/Core/NativeReader.cpp
> URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/NativeReader.cpp?rev=149426&r1=149425&r2=149426&view=diff
> ==============================================================================
> --- lld/trunk/lib/Core/NativeReader.cpp (original)
> +++ lld/trunk/lib/Core/NativeReader.cpp Tue Jan 31 15:45:53 2012
> @@ -175,6 +175,8 @@
>                        reinterpret_cast(mb->getBufferStart());
>     const NativeFileHeader* const header =
>                        reinterpret_cast(base);
> +    const NativeChunk *const chunks =
> +      reinterpret_cast(base + sizeof(NativeFileHeader));
>     // make sure magic matches
>     if ( memcmp(header->magic, NATIVE_FILE_HEADER_MAGIC, 16) != 0 )
>       return make_error_code(unknown_file_format);
> @@ -190,7 +192,7 @@
>     // process each chunk
>     for(uint32_t i=0; i < header->chunkCount; ++i) {
>       llvm::error_code ec;
> -      const NativeChunk* chunk = &header->chunks[i];
> +      const NativeChunk* chunk = &chunks[i];
>       // sanity check chunk is within file
>       if ( chunk->fileOffset > fileSize )
>         return make_error_code(file_malformed);
>
> Modified: lld/trunk/lib/Core/NativeWriter.cpp
> URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Core/NativeWriter.cpp?rev=149426&r1=149425&r2=149426&view=diff
> ==============================================================================
> --- lld/trunk/lib/Core/NativeWriter.cpp (original)
> +++ lld/trunk/lib/Core/NativeWriter.cpp Tue Jan 31 15:45:53 2012
> @@ -71,6 +71,9 @@
>     _headerBufferSize = sizeof(NativeFileHeader) + 4*sizeof(NativeChunk);
>     _headerBuffer = reinterpret_cast
>                                (operator new(_headerBufferSize, std::nothrow));
> +    NativeChunk *chunks =
> +      reinterpret_cast(reinterpret_cast(_headerBuffer)
> +                                         + sizeof(NativeFileHeader));
>     memcpy(_headerBuffer->magic, NATIVE_FILE_HEADER_MAGIC, 16);
>     _headerBuffer->endian = NFH_LittleEndian;
>     _headerBuffer->architecture = 0;
> @@ -78,25 +81,25 @@
>     _headerBuffer->chunkCount = 4;
>
>     // create chunk for atom ivar array
> -    NativeChunk& ch0 = _headerBuffer->chunks[0];
> +    NativeChunk& ch0 = chunks[0];
>     ch0.signature = NCS_DefinedAtomsV1;
>     ch0.fileOffset = _headerBufferSize;
>     ch0.fileSize = _definedAtomIvars.size()*sizeof(NativeDefinedAtomIvarsV1);
>     ch0.elementCount = _definedAtomIvars.size();
> -    // create chunk for attributes
> -    NativeChunk& ch1 = _headerBuffer->chunks[1];
> +    // create chunk for attributes
> +    NativeChunk& ch1 = chunks[1];
>     ch1.signature = NCS_AttributesArrayV1;
>     ch1.fileOffset = ch0.fileOffset + ch0.fileSize;
>     ch1.fileSize = _attributes.size()*sizeof(NativeAtomAttributesV1);
>     ch1.elementCount = _attributes.size();
> -    // create chunk for content
> -    NativeChunk& ch2 = _headerBuffer->chunks[2];
> +    // create chunk for content
> +    NativeChunk& ch2 = chunks[2];
>     ch2.signature = NCS_Content;
>     ch2.fileOffset = ch1.fileOffset + ch1.fileSize;
>     ch2.fileSize = _contentPool.size();
>     ch2.elementCount = _contentPool.size();
>     // create chunk for symbol strings
> -    NativeChunk& ch3 = _headerBuffer->chunks[3];
> +    NativeChunk& ch3 = chunks[3];
>     ch3.signature = NCS_Strings;
>     ch3.fileOffset = ch2.fileOffset + ch2.fileSize;
>     ch3.fileSize = _stringPool.size();
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From stoklund at 2pi.dk Tue Jan 31 16:16:33 2012
From: stoklund at 2pi.dk (Jakob Stoklund Olesen)
Date: Tue, 31 Jan 2012 14:16:33 -0800
Subject: [llvm-commits] [cfe-commits] [PATCH][Review Request] EarlyCSE stack overflow - bugzilla 11794
In-Reply-To:
References: <051207DF-F3D0-4E8B-A756-E6B193750483@colorado.edu> <184C997C-6E18-4BE8-A3F8-092102981B11@colorado.edu> <6663C373-1004-4FB2-9E39-F784587B4CDD@2pi.dk> <4AA7AFB1-51F3-48B4-B543-00141B738258@colorado.edu> <8D478BF2-5BEA-45AA-A525-44C2085C11CC@2pi.dk> <4A53CDC9-0BC2-48F4-BBF6-334D98F358DA@colorado.edu> <36F5E7D6-DA08-4A70-BF2E-788642787972@2pi.dk>
Message-ID:

On Jan 31, 2012, at 1:55 PM, Lenny Maiorani wrote:
> Please review this new patch. It uses the same method as the DepthFirstIterator to keep only the path back to the root node on the stack, pushes and pops scopes as appropriate by keeping pointers to them in the StackNode, and properly pushes and pops the CurrentGeneration.
>
> I have tested this with my little C++ file generated by the Python script and the LLVM/Clang unit test suites. It all passes and the binaries output match.
>
> This patch will be added to the bug #11794. Provided this passes review I will commit this patch. If not, I have run out of time to continue working on this and will have to move on.

Hi Lenny,

This new patch is looking good. Just one thing: You don't need the 'processedNodes' set. The DomTree is a tree, so you will only encounter each node once.
Just keep a 'Processed' bit in StackNode instead. /jakob From bob.wilson at apple.com Tue Jan 31 16:32:31 2012 From: bob.wilson at apple.com (Bob Wilson) Date: Tue, 31 Jan 2012 22:32:31 -0000 Subject: [llvm-commits] [llvm] r149438 - in /llvm/trunk: include/llvm/ADT/Triple.h lib/Support/Triple.cpp Message-ID: <20120131223232.588B02A6C12C@llvm.org> Author: bwilson Date: Tue Jan 31 16:32:29 2012 New Revision: 149438 URL: http://llvm.org/viewvc/llvm-project?rev=149438&view=rev Log: Add Triple::getMacOSXVersion to replace crufty code in the clang driver. This new function provides a way to get the Mac OS X version number from either generic "darwin" triples or macosx triples. Modified: llvm/trunk/include/llvm/ADT/Triple.h llvm/trunk/lib/Support/Triple.cpp Modified: llvm/trunk/include/llvm/ADT/Triple.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/Triple.h?rev=149438&r1=149437&r2=149438&view=diff ============================================================================== --- llvm/trunk/include/llvm/ADT/Triple.h (original) +++ llvm/trunk/include/llvm/ADT/Triple.h Tue Jan 31 16:32:29 2012 @@ -209,6 +209,13 @@ return Maj; } + /// getMacOSXVersion - Parse the version number as with getOSVersion and then + /// translate generic "darwin" versions to the corresponding OS X versions. + /// This may also be called with IOS triples but the OS X version number is + /// just set to a constant 10.4.0 in that case. Returns true if successful.
+ bool getMacOSXVersion(unsigned &Major, unsigned &Minor, + unsigned &Micro) const; + /// @} /// @name Direct Component Access /// @{ Modified: llvm/trunk/lib/Support/Triple.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/Triple.cpp?rev=149438&r1=149437&r2=149438&view=diff ============================================================================== --- llvm/trunk/lib/Support/Triple.cpp (original) +++ llvm/trunk/lib/Support/Triple.cpp Tue Jan 31 16:32:29 2012 @@ -613,6 +613,45 @@ } } +bool Triple::getMacOSXVersion(unsigned &Major, unsigned &Minor, + unsigned &Micro) const { + getOSVersion(Major, Minor, Micro); + + switch (getOS()) { + default: assert(0 && "unexpected OS for Darwin triple"); + case Darwin: + // Default to darwin8, i.e., MacOSX 10.4. + if (Major == 0) + Major = 8; + // Darwin version numbers are skewed from OS X versions. + if (Major < 4) + return false; + Micro = 0; + Minor = Major - 4; + Major = 10; + break; + case MacOSX: + // Default to 10.4. + if (Major == 0) { + Major = 10; + Minor = 4; + } + if (Major != 10) + return false; + break; + case IOS: + // Ignore the version from the triple. This is only handled because the + // the clang driver combines OS X and IOS support into a common Darwin + // toolchain that wants to know the OS X version number even when targeting + // IOS. 
+ Major = 10; + Minor = 4; + Micro = 0; + break; + } + return true; +} + void Triple::setTriple(const Twine &Str) { Data = Str.str(); Arch = InvalidArch; From zinob at codeaurora.org Tue Jan 31 16:39:56 2012 From: zinob at codeaurora.org (Zino Benaissa) Date: Tue, 31 Jan 2012 14:39:56 -0800 Subject: [llvm-commits] Tuning LLVM Greedy Register Allocator to optimize for code size when targeting ARM Thumb 2 instruction set In-Reply-To: <8837C2BB-DD5B-408A-9978-026181DB0E61@2pi.dk> References: <000001ccda35$239f0e10$6add2a30$@org> <0E37D7B5-BBF8-4D29-9679-5C4D22B32AEB@2pi.dk> <000c01ccda5b$76b208c0$64161a40$@org> <38F583CE-3516-421A-84C2-46978621E648@apple.com> <901B7A01-6E81-4807-A78F-2922C100117D@2pi.dk> <001c01ccdacd$befdd9c0$3cf98d40$@org> <226102FF-900A-4896-870C-F46C00B82BD4@2pi.dk> <000c01ccdbdf$45cacb90$d16062b0$@org> <8837C2BB-DD5B-408A-9978-026181DB0E61@2pi.dk> Message-ID: <000601cce069$41d2e570$c578b050$@org> >>> As I am reading your changes to the eviction policy, you are completely >>> replacing spill weights with a code size metric for live ranges with >>> VirtReg.bytes > 0. Is that the intention? >> It depends on why the eviction is invoked. Currently there are three reasons >> for invoking eviction: enabling coalescing, preventing spill/split, >> preventing a costPerUse register. Note all these evictions were already put >> in place before my heuristic. >> 1) Both for coalescing or for preventing split/spill: VirtReg.bytes=0 and >> the heuristic is ignored and only the pair is considered. >> Whatever was put in place is still managing these types of evictions. >> 2) This heuristic is ON only when a candidate gets a register that has a >> CostPerUse. In this case, when the RA attempts to trade it for a register >> with no cost, it now has, with this heuristic, a metric to evaluate whether >> there is a trade worth evicting for.
> Here is the problem: Whenever you do a 'luxury' eviction because you got a physreg with a CostPerUse, you could be evicting virtregs with very high spill weight. These are the 'used in a hot loop' virtregs you were talking about. Whenever VirtReg.bytes > 0, you are effectively replacing the spill weights with code size metrics. That is very heavily biased towards optimizing for code size, and I think it is too aggressive. > Live range splitting is going to save you some of the time. It still uses speed metrics, but the overall behavior of the greedy algorithm becomes very erratic. > Spill weights are used in two different ways when evicting: > 1. The shouldEvict() policy function prevents a VirtReg from evicting something with a higher spill weight. (But you are overriding it!) > 2. The tryEvict() function selects the eviction candidate that would cause the lowest maximum spill weight to be evicted. > I don't think it is safe to override the shouldEvict() policy. You can get away with changing the candidate selection in 2., though. The policy you described was designed for register coalescing. Initially, priority is given to the hint over spill weight, but then the eviction policy ensures that hotter candidates get a register. I am aware that my "luxury" eviction (interesting nomenclature :-)) overrides the eviction policy. The reason is that it was designed differently: the way to look at it is as a register trade (win-win) instead of an eviction (win-lose). It works in two steps. First, the candidate gets a register (note that here my heuristic is silenced and the policy is enforced). Second, if it gets a register and it happens to be a CostPerUse register (R8-R15), then try to trade this register with some other candidate's register (in this case, does it matter if this candidate has a higher weight? The answer is no!)
The register trade implementation leverages the eviction and RA register assignment functions and also happens in two steps. First, try to evict a candidate (one that would be cheaper according to my heuristic). Second, if the eviction occurs, then because the evictee has the highest spill weight, it is first on the list and it will get the register that the evictor is giving up. I have carefully followed this in the debugger and verified these steps. Furthermore, I looked at hundreds of diffs and have not seen any evidence of an increase in stack size or spilling activity (including some pretty large, complex functions). If we decide to enforce the eviction policy on this type of eviction, then we are simply walking away from performance and better usage of these registers. I have tested enforcing the policy on EEMBC and SPEC and have seen a consistent loss in code size and runtime performance. > Here is what I suggest you do: > - Don't override shouldEvict(). That policy should always stay in place. See above. > - Use code size metrics to select among multiple eviction candidates when evicting from 'cheap' physregs. Can you be more precise? > - Don't evict from two physregs in selectOrSplit() and then only use one of them. You may be able to use code size metrics for selecting the best eviction candidate, but don't evict two different physregs needlessly. The same applies here: it is a register trade (not two evictions, even if it looks like two evictions), and it is safe. The way I arrived at this was by looking at some assembly in EEMBC and SPEC: some candidates occurred frequently in a function but still failed to get R0-7. Once I added this call it looked nice... At a high level, this framework+heuristic achieves the following. At some point during RA, we get this allocation: t1(R0) = ... t2(R8) = ... = OP1 t1 (R0) = OP2 t2(R8) = OP3 t2(R8) = OP4 t2(R8) The heuristic allows trading R8 for R0 (for better usage of R0) and we get: t1(R8) = ... t2(R0) = ...
= OP1 t1 (R8) = OP2 t2(R0) = OP3 t2(R0) = OP4 t2(R0) I understand that from reading the code, it may look like it is a needless eviction. I have added a comment to explain this. Please let me know if there is a phrasing or better way to help readability? > You should also make sure that the patch works for x86-64. There is a similar code size penalty to using r8-r15 and xmm8-15. Before submitting, I have done the testing due diligence for x86-64 with the patch I am submitting (FYI, attached the test result). Yes, x86-64 could benefit from this framework to minimize REX prefix and optimize usage of old x86 registers. Unfortunately, I don't have the bandwidth to do the implementation. Of course, I will be happy to hear about and/or review the enabling work for X86-64. Thanks again for the great code review, -Zino -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: RACodeSize.txt Url: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/91451aa6/attachment.txt -------------- next part -------------- A non-text attachment was scrubbed... Name: X86LLVMTestSuite.report Type: application/octet-stream Size: 1159 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/91451aa6/attachment.obj From dag at cray.com Tue Jan 31 16:49:32 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:32 -0600 Subject: [llvm-commits] (no subject) Message-ID: <1328050185-1503-1-git-send-email-dag@cray.com> Here is a set of patches to implement the TableGen foreach feature discussed a couple of months ago. Please review. Thanks! From dag at cray.com Tue Jan 31 16:49:33 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:33 -0600 Subject: [llvm-commits] [PATCH 01/13] Add For Loop Structures Message-ID: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Add some data structures to represent for loops. 
These will be referenced during object processing to do any needed iteration and instantiation. --- include/llvm/TableGen/Record.h | 48 ++++++++++++++++++++++++++++++++++++++++ 1 files changed, 48 insertions(+), 0 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-Add-For-Loop-Structures.patch Type: text/x-patch Size: 1885 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/64bb03e9/attachment.bin From dag at cray.com Tue Jan 31 16:49:34 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:34 -0600 Subject: [llvm-commits] [PATCH 02/13] Foreach Lexer Support In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: <8c34fbbb12c82c8ed1dd845586402d39aba8f4c1.1328050160.git.dag@cray.com> Add foreach keyword support to the lexer. --- lib/TableGen/TGLexer.cpp | 1 + lib/TableGen/TGLexer.h | 2 +- lib/TableGen/TGParser.cpp | 1 + 3 files changed, 3 insertions(+), 1 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-Foreach-Lexer-Support.patch Type: text/x-patch Size: 1312 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/69585465/attachment.bin From dag at cray.com Tue Jan 31 16:49:35 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:35 -0600 Subject: [llvm-commits] [PATCH 03/13] Add a Foreach Parse Mode In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: <3c2ea4fa5af833cf65fdcecc13cd1d8fd798ca9b.1328050160.git.dag@cray.com> Add a mode to indicate that we're parsing a foreach loop. This allows the value parser to early-out when processing the foreach value list. 
--- lib/TableGen/TGParser.cpp | 2 +- lib/TableGen/TGParser.h | 6 ++++-- 2 files changed, 5 insertions(+), 3 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... Name: 0003-Add-a-Foreach-Parse-Mode.patch Type: text/x-patch Size: 1260 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/ed2ef35d/attachment.bin From dag at cray.com Tue Jan 31 16:49:36 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:36 -0600 Subject: [llvm-commits] [PATCH 04/13] Add Foreach Declaration Parser In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: <75400c51ce2803ea8efeb0235b9425dc8d817850.1328050160.git.dag@cray.com> Add a routine to parse foreach iteration declarations. This is separate from ParseDeclaration because the type of the named value (the iterator) doesn't match the type of the initializer value (the value list). It also needs to add two values to the foreach record: the iterator and the value list. --- lib/TableGen/TGParser.cpp | 68 +++++++++++++++++++++++++++++++++++++++++++++ lib/TableGen/TGParser.h | 1 + 2 files changed, 69 insertions(+), 0 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... Name: 0004-Add-Foreach-Declaration-Parser.patch Type: text/x-patch Size: 3241 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/e154b1a7/attachment.bin From dag at cray.com Tue Jan 31 16:49:37 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:37 -0600 Subject: [llvm-commits] [PATCH 05/13] Add Foreach Parser In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: Add parsing support for foreach. 
--- lib/TableGen/TGParser.cpp | 39 +++++++++++++++++++++++++++++++++++++++ lib/TableGen/TGParser.h | 1 + 2 files changed, 40 insertions(+), 0 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... Name: 0005-Add-Foreach-Parser.patch Type: text/x-patch Size: 2437 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/1fb03bf4/attachment.bin From dag at cray.com Tue Jan 31 16:49:38 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:38 -0600 Subject: [llvm-commits] [PATCH 06/13] Add Foreach Processing Logic In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: <978f4269a01e615f779353ba090d2cbb1f94f58c.1328050160.git.dag@cray.com> Add the code to process foreach loops and create defs based on iterator values. This is not active yet. --- lib/TableGen/TGParser.cpp | 102 +++++++++++++++++++++++++++++++++++++++++++++ lib/TableGen/TGParser.h | 17 +++++++ 2 files changed, 119 insertions(+), 0 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... Name: 0006-Add-Foreach-Processing-Logic.patch Type: text/x-patch Size: 5127 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/bdcd1172/attachment.bin From dag at cray.com Tue Jan 31 16:49:39 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:39 -0600 Subject: [llvm-commits] [PATCH 07/13] Make Foreach a Top-Level Object In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: Allow foreach loops to be matched at the top level. 
--- lib/TableGen/TGParser.cpp | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... Name: 0007-Make-Foreach-a-Top-Level-Object.patch Type: text/x-patch Size: 663 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/fd5ee92a/attachment.bin From dag at cray.com Tue Jan 31 16:49:40 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:40 -0600 Subject: [llvm-commits] [PATCH 08/13] Check Loop Iterators for ID In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: When parsing an IDValue check if it is a foreach loop iterator for one of the active loops. If so, return a VarInit for it. --- lib/TableGen/TGParser.cpp | 12 ++++++++++++ 1 files changed, 12 insertions(+), 0 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... Name: 0008-Check-Loop-Iterators-for-ID.patch Type: text/x-patch Size: 758 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/2e9e24d5/attachment.bin From dag at cray.com Tue Jan 31 16:49:41 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:41 -0600 Subject: [llvm-commits] [PATCH 09/13] Activate Foreach Processing In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: Integrate foreach processing into the parser. Foreach loops are now live. --- lib/TableGen/TGParser.cpp | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0009-Activate-Foreach-Processing.patch Type: text/x-patch Size: 458 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/b1e3aa0b/attachment.bin From dag at cray.com Tue Jan 31 16:49:42 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:42 -0600 Subject: [llvm-commits] [PATCH 10/13] Emacs Foreach Support In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: <4c75cb35391146a9ea8b8c7699ef8ffbe7fc9b9c.1328050160.git.dag@cray.com> Add keyword support for foreach. --- utils/emacs/tablegen-mode.el | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... Name: 0010-Emacs-Foreach-Support.patch Type: text/x-patch Size: 558 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/3b5bcc2c/attachment-0001.bin From dag at cray.com Tue Jan 31 16:49:43 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:43 -0600 Subject: [llvm-commits] [PATCH 11/13] Add VIM Foreach Support In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: Add keyword support for foreach. --- utils/vim/tablegen.vim | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0011-Add-VIM-Foreach-Support.patch Type: text/x-patch Size: 453 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/ab0cdc8c/attachment.bin From dag at cray.com Tue Jan 31 16:49:44 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:44 -0600 Subject: [llvm-commits] [PATCH 12/13] Add Foreach Tests In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: <11a3926000dde6bb89b0cf7874f6b44439b3a382.1328050160.git.dag@cray.com> Add tests to check foreach operation. --- test/TableGen/ForeachLoop.td | 43 +++++++++++++++++++++++ test/TableGen/NestedForeach.td | 74 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 117 insertions(+), 0 deletions(-) create mode 100644 test/TableGen/ForeachLoop.td create mode 100644 test/TableGen/NestedForeach.td -------------- next part -------------- A non-text attachment was scrubbed... Name: 0012-Add-Foreach-Tests.patch Type: text/x-patch Size: 2754 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/2f250712/attachment.bin From dag at cray.com Tue Jan 31 16:49:45 2012 From: dag at cray.com (David Greene) Date: Tue, 31 Jan 2012 16:49:45 -0600 Subject: [llvm-commits] [PATCH 13/13] Document Foreach In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: <43c7e6a3bfda361572dddb440fd8dba1f72bea10.1328050160.git.dag@cray.com> Add TableGen documentation for foreach. --- docs/TableGenFundamentals.html | 28 ++++++++++++++++++++++++++++ 1 files changed, 28 insertions(+), 0 deletions(-) -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0013-Document-Foreach.patch Type: text/x-patch Size: 1999 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/51d101a7/attachment.bin From dag at cray.com Tue Jan 31 16:53:08 2012 From: dag at cray.com (David A. Greene) Date: Tue, 31 Jan 2012 16:53:08 -0600 Subject: [llvm-commits] [PATCH 01/13] Add For Loop Structures In-Reply-To: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> (David Greene's message of "Tue, 31 Jan 2012 16:49:33 -0600") References: <4ebd9343c592547977b041f93d7ee7160da178e7.1328050160.git.dag@cray.com> Message-ID: David Greene writes: > Add some data structures to represent for loops. These will be > referenced during object processing to do any needed iteration and > instantiation. Ah crud. Sorry these got attached instead of inlined. I thought I fixed that in my git config. If you want them re-sent I can do that. Don't want to waste the bandwidth unless necessary, though. :) -Dave From enderby at apple.com Tue Jan 31 17:02:57 2012 From: enderby at apple.com (Kevin Enderby) Date: Tue, 31 Jan 2012 23:02:57 -0000 Subject: [llvm-commits] [llvm] r149442 - in /llvm/trunk: lib/MC/MCObjectWriter.cpp test/MC/MachO/darwin-x86_64-diff-reloc-assign.s Message-ID: <20120131230257.D35E92A6C12C@llvm.org> Author: enderby Date: Tue Jan 31 17:02:57 2012 New Revision: 149442 URL: http://llvm.org/viewvc/llvm-project?rev=149442&view=rev Log: Fixed a crash in llvm-mc for Mach-O when a symbol difference expression uses a symbol from an assignment. In this case the symbol did not have a fragment so MCObjectWriter::IsSymbolRefDifferenceFullyResolved() should not have been calling IsSymbolRefDifferenceFullyResolvedImpl() with a NULL fragment and should just have returned false in that case. 
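For readers skimming the log message above, the shape of the fix can be sketched as follows. This is only an illustration under stated assumptions: `Fragment`, `SymbolData`, `fullyResolvedImpl` and `isDifferenceFullyResolved` are hypothetical stand-ins, not the real MC classes, and the "Impl" step here merely compares section ids, which is far simpler than the actual check.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not the real MC classes.
struct Fragment {
  int sectionId;
};

struct SymbolData {
  const Fragment *fragment;  // null when the symbol comes from an assignment
  const Fragment *getFragment() const { return fragment; }
};

// Stand-in for the "Impl" step, which requires a valid fragment reference.
// The real check is more involved; here we only compare section ids.
bool fullyResolvedImpl(const SymbolData &a, const Fragment &fb) {
  assert(a.getFragment() && "caller must rule out null fragments");
  return a.getFragment()->sectionId == fb.sectionId;
}

// Wrapper that returns false instead of dereferencing a null fragment.
bool isDifferenceFullyResolved(const SymbolData &a, const SymbolData &b) {
  if (!a.getFragment() || !b.getFragment())
    return false;
  return fullyResolvedImpl(a, *b.getFragment());
}
```

The point of the guard is that a symbol without a fragment is reported as "not fully resolved" rather than being passed on to code that assumes a fragment exists.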
Added: llvm/trunk/test/MC/MachO/darwin-x86_64-diff-reloc-assign.s Modified: llvm/trunk/lib/MC/MCObjectWriter.cpp Modified: llvm/trunk/lib/MC/MCObjectWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCObjectWriter.cpp?rev=149442&r1=149441&r2=149442&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCObjectWriter.cpp (original) +++ llvm/trunk/lib/MC/MCObjectWriter.cpp Tue Jan 31 17:02:57 2012 @@ -68,6 +68,8 @@ const MCSymbolData &DataA = Asm.getSymbolData(SA); const MCSymbolData &DataB = Asm.getSymbolData(SB); + if(!DataA.getFragment() || !DataB.getFragment()) + return false; return IsSymbolRefDifferenceFullyResolvedImpl(Asm, DataA, *DataB.getFragment(), Added: llvm/trunk/test/MC/MachO/darwin-x86_64-diff-reloc-assign.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/MachO/darwin-x86_64-diff-reloc-assign.s?rev=149442&view=auto ============================================================================== --- llvm/trunk/test/MC/MachO/darwin-x86_64-diff-reloc-assign.s (added) +++ llvm/trunk/test/MC/MachO/darwin-x86_64-diff-reloc-assign.s Tue Jan 31 17:02:57 2012 @@ -0,0 +1,27 @@ +// RUN: llvm-mc -triple x86_64-apple-darwin9 %s -filetype=obj -o - | macho-dump --dump-section-data | FileCheck %s + +// Test case for rdar://10743265 + +// This tests that this expression does not cause a crash and produces two +// relocation entries: +// Relocation information (__TEXT,__text) 2 entries +// address pcrel length extern type scattered symbolnum/value +// 00000000 False long True SUB False _base +// 00000000 False long True UNSIGND False _start_ap_2 + +_base = . 
+ +.long (0x2000) + _start_ap_2 - _base +.word 0 + +_start_ap_2: + cli + +// CHECK: ('_relocations', [ +// CHECK: # Relocation 0 +// CHECK: (('word-0', 0x0), +// CHECK: ('word-1', 0x5c000000)), +// CHECK: # Relocation 1 +// CHECK: (('word-0', 0x0), +// CHECK: ('word-1', 0xc000001)), +// CHECK: ]) From bcahoon at codeaurora.org Tue Jan 31 17:13:42 2012 From: bcahoon at codeaurora.org (Brendon Cahoon) Date: Tue, 31 Jan 2012 23:13:42 -0000 Subject: [llvm-commits] [llvm] r149444 - /llvm/trunk/README.txt Message-ID: <20120131231342.A33E02A6C12D@llvm.org> Author: bcahoon Date: Tue Jan 31 17:13:42 2012 New Revision: 149444 URL: http://llvm.org/viewvc/llvm-project?rev=149444&view=rev Log: test commit, adding a blank space Modified: llvm/trunk/README.txt Modified: llvm/trunk/README.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/README.txt?rev=149444&r1=149443&r2=149444&view=diff ============================================================================== --- llvm/trunk/README.txt (original) +++ llvm/trunk/README.txt Tue Jan 31 17:13:42 2012 @@ -1,4 +1,4 @@ - + Low Level Virtual Machine (LLVM) ================================ From lenny at colorado.edu Tue Jan 31 17:14:41 2012 From: lenny at colorado.edu (Lenny Maiorani) Date: Tue, 31 Jan 2012 23:14:41 -0000 Subject: [llvm-commits] [llvm] r149445 - /llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp Message-ID: <20120131231441.D32382A6C12D@llvm.org> Author: lenny Date: Tue Jan 31 17:14:41 2012 New Revision: 149445 URL: http://llvm.org/viewvc/llvm-project?rev=149445&view=rev Log: bz11794 : EarlyCSE stack overflow on long functions. Make the EarlyCSE optimizer not use recursion to do a depth first iteration. 
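As a rough illustration of the technique named in the log above (replacing a recursive depth-first walk with an explicit stack), here is a minimal standalone sketch. It assumes a plain tree and uses hypothetical names (`TreeNode`, `Frame`, `preorderIterative`); it does not use any LLVM types and leaves out the scoped hash tables and generation tracking that the actual patch carries in each stack entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dominator-tree node; not an LLVM type.
struct TreeNode {
  int id;
  std::vector<TreeNode *> children;
};

// One explicit stack frame: the node, the index of the next child to visit,
// and a flag recording whether the node itself has been processed yet.
struct Frame {
  TreeNode *node;
  std::size_t nextChild;
  bool processed;
};

// Visits nodes in pre-order without recursion, so the depth of the tree is
// limited by heap memory rather than by the call stack.
std::vector<int> preorderIterative(TreeNode *root) {
  std::vector<int> order;
  if (!root)
    return order;
  std::vector<Frame> stack;
  stack.push_back(Frame{root, 0, false});
  while (!stack.empty()) {
    Frame &top = stack.back();
    assert(top.node && "frames never hold null nodes");
    if (!top.processed) {
      // First visit: process the node itself.
      order.push_back(top.node->id);
      top.processed = true;
    } else if (top.nextChild < top.node->children.size()) {
      // Later visits: descend into the next unvisited child.
      TreeNode *child = top.node->children[top.nextChild++];
      stack.push_back(Frame{child, 0, false});  // `top` is stale after this
    } else {
      // Node and all of its children are done.
      stack.pop_back();
    }
  }
  return order;
}
```

Only the path from the root to the current node is ever on the stack, and because the input is a tree each node is reached exactly once, so a per-frame flag is enough and no separate "visited" set is needed.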
Modified: llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp Modified: llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp?rev=149445&r1=149444&r2=149445&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/EarlyCSE.cpp Tue Jan 31 17:14:41 2012 @@ -25,6 +25,7 @@ #include "llvm/Support/RecyclingAllocator.h" #include "llvm/ADT/ScopedHashTable.h" #include "llvm/ADT/Statistic.h" +#include using namespace llvm; STATISTIC(NumSimplify, "Number of instructions simplified or DCE'd"); @@ -259,7 +260,71 @@ bool runOnFunction(Function &F); private: - + + // NodeScope - almost a POD, but needs to call the constructors for the + // scoped hash tables so that a new scope gets pushed on. These are RAII so + // that the scope gets popped when the NodeScope is destroyed. + class NodeScope { + public: + NodeScope(ScopedHTType *availableValues, + LoadHTType *availableLoads, + CallHTType *availableCalls) : + Scope(*availableValues), + LoadScope(*availableLoads), + CallScope(*availableCalls) {} + + private: + NodeScope(const NodeScope&); // DO NOT IMPLEMENT + + ScopedHTType::ScopeTy Scope; + LoadHTType::ScopeTy LoadScope; + CallHTType::ScopeTy CallScope; + }; + + // StackNode - contains all the needed information to create a stack for + // doing a depth first tranversal of the tree. This includes scopes for + // values, loads, and calls as well as the generation. There is a child + // iterator so that the children do not need to be store spearately. 
+ class StackNode { + public: + StackNode(ScopedHTType *availableValues, + LoadHTType *availableLoads, + CallHTType *availableCalls, + unsigned cg, DomTreeNode *n, + DomTreeNode::iterator child, DomTreeNode::iterator end) : + CurrentGeneration(cg), ChildGeneration(cg), Node(n), + ChildIter(child), EndIter(end), + Scopes(availableValues, availableLoads, availableCalls), + Processed(false) {} + + // Accessors. + unsigned currentGeneration() { return CurrentGeneration; } + unsigned childGeneration() { return ChildGeneration; } + void childGeneration(unsigned generation) { ChildGeneration = generation; } + DomTreeNode *node() { return Node; } + DomTreeNode::iterator childIter() { return ChildIter; } + DomTreeNode *nextChild() { + DomTreeNode *child = *ChildIter; + ++ChildIter; + return child; + } + DomTreeNode::iterator end() { return EndIter; } + bool isProcessed() { return Processed; } + void process() { Processed = true; } + + private: + StackNode(const StackNode&); // DO NOT IMPLEMENT + + // Members. + unsigned CurrentGeneration; + unsigned ChildGeneration; + DomTreeNode *Node; + DomTreeNode::iterator ChildIter; + DomTreeNode::iterator EndIter; + NodeScope Scopes; + bool Processed; + }; + bool processNode(DomTreeNode *Node); // This transformation requires dominator postdominator info @@ -284,19 +349,6 @@ INITIALIZE_PASS_END(EarlyCSE, "early-cse", "Early CSE", false, false) bool EarlyCSE::processNode(DomTreeNode *Node) { - // Define a scope in the scoped hash table. When we are done processing this - // domtree node and recurse back up to our parent domtree node, this will pop - // off all the values we install. - ScopedHTType::ScopeTy Scope(*AvailableValues); - - // Define a scope for the load values so that anything we add will get - // popped when we recurse back up to our parent domtree node. 
- LoadHTType::ScopeTy LoadScope(*AvailableLoads); - - // Define a scope for the call values so that anything we add will get - // popped when we recurse back up to our parent domtree node. - CallHTType::ScopeTy CallScope(*AvailableCalls); - BasicBlock *BB = Node->getBlock(); // If this block has a single predecessor, then the predecessor is the parent @@ -446,18 +498,14 @@ } } } - - unsigned LiveOutGeneration = CurrentGeneration; - for (DomTreeNode::iterator I = Node->begin(), E = Node->end(); I != E; ++I) { - Changed |= processNode(*I); - // Pop any generation changes off the stack from the recursive walk. - CurrentGeneration = LiveOutGeneration; - } + return Changed; } bool EarlyCSE::runOnFunction(Function &F) { + std::deque nodesToProcess; + TD = getAnalysisIfAvailable(); TLI = &getAnalysis(); DT = &getAnalysis(); @@ -471,5 +519,52 @@ AvailableCalls = &CallTable; CurrentGeneration = 0; - return processNode(DT->getRootNode()); + bool Changed = false; + + // Process the root node. + nodesToProcess.push_front( + new StackNode(AvailableValues, AvailableLoads, AvailableCalls, + CurrentGeneration, DT->getRootNode(), + DT->getRootNode()->begin(), + DT->getRootNode()->end())); + + // Save the current generation. + unsigned LiveOutGeneration = CurrentGeneration; + + // Process the stack. + while (!nodesToProcess.empty()) { + // Grab the first item off the stack. Set the current generation, remove + // the node from the stack, and process it. + StackNode *NodeToProcess = nodesToProcess.front(); + + // Initialize class members. + CurrentGeneration = NodeToProcess->currentGeneration(); + + // Check if the node needs to be processed. + if (!NodeToProcess->isProcessed()) { + // Process the node. + Changed |= processNode(NodeToProcess->node()); + NodeToProcess->childGeneration(CurrentGeneration); + NodeToProcess->process(); + } else if (NodeToProcess->childIter() != NodeToProcess->end()) { + // Push the next child onto the stack. 
+ DomTreeNode *child = NodeToProcess->nextChild(); + nodesToProcess.push_front( + new StackNode(AvailableValues, + AvailableLoads, + AvailableCalls, + NodeToProcess->childGeneration(), child, + child->begin(), child->end())); + } else { + // It has been processed, and there are no more children to process, + // so delete it and pop it off the stack. + delete NodeToProcess; + nodesToProcess.pop_front(); + } + } // while (!nodes...) + + // Reset the current generation. + CurrentGeneration = LiveOutGeneration; + + return Changed; } From bcahoon at codeaurora.org Tue Jan 31 17:18:34 2012 From: bcahoon at codeaurora.org (Brendon Cahoon) Date: Tue, 31 Jan 2012 23:18:34 -0000 Subject: [llvm-commits] [llvm] r149446 - /llvm/trunk/README.txt Message-ID: <20120131231834.0FF872A6C12D@llvm.org> Author: bcahoon Date: Tue Jan 31 17:18:33 2012 New Revision: 149446 URL: http://llvm.org/viewvc/llvm-project?rev=149446&view=rev Log: Revert test commit Modified: llvm/trunk/README.txt Modified: llvm/trunk/README.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/README.txt?rev=149446&r1=149445&r2=149446&view=diff ============================================================================== --- llvm/trunk/README.txt (original) +++ llvm/trunk/README.txt Tue Jan 31 17:18:33 2012 @@ -1,4 +1,4 @@ - + Low Level Virtual Machine (LLVM) ================================ From grosbach at apple.com Tue Jan 31 17:51:09 2012 From: grosbach at apple.com (Jim Grosbach) Date: Tue, 31 Jan 2012 23:51:09 -0000 Subject: [llvm-commits] [llvm] r149452 - /llvm/trunk/lib/Target/ARM/AsmParser/ARMAsmParser.cpp Message-ID: <20120131235110.00FD22A6C12C@llvm.org> Author: grosbach Date: Tue Jan 31 17:51:09 2012 New Revision: 149452 URL: http://llvm.org/viewvc/llvm-project?rev=149452&view=rev Log: Tidy up. One more return type mismatch fix. 
Modified: llvm/trunk/lib/Target/ARM/AsmParser/ARMAsmParser.cpp Modified: llvm/trunk/lib/Target/ARM/AsmParser/ARMAsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/AsmParser/ARMAsmParser.cpp?rev=149452&r1=149451&r2=149452&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/AsmParser/ARMAsmParser.cpp (original) +++ llvm/trunk/lib/Target/ARM/AsmParser/ARMAsmParser.cpp Tue Jan 31 17:51:09 2012 @@ -2550,7 +2550,7 @@ const MCExpr *ImmVal; if (getParser().ParseExpression(ImmVal)) - return MatchOperand_ParseFail; + return true; const MCConstantExpr *MCE = dyn_cast<MCConstantExpr>(ImmVal); if (!MCE) return TokError("immediate value expected for vector index"); From grosser at fim.uni-passau.de Tue Jan 31 18:08:10 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Wed, 01 Feb 2012 00:08:10 -0000 Subject: [llvm-commits] [polly] r149456 - /polly/trunk/www/example_load_Polly_into_clang.html Message-ID: <20120201000810.773F32A6C12C@llvm.org> Author: grosser Date: Tue Jan 31 18:08:10 2012 New Revision: 149456 URL: http://llvm.org/viewvc/llvm-project?rev=149456&view=rev Log: www: More typos Pointed out by Chad Rosier Modified: polly/trunk/www/example_load_Polly_into_clang.html Modified: polly/trunk/www/example_load_Polly_into_clang.html URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/example_load_Polly_into_clang.html?rev=149456&r1=149455&r2=149456&view=diff ============================================================================== --- polly/trunk/www/example_load_Polly_into_clang.html (original) +++ polly/trunk/www/example_load_Polly_into_clang.html Tue Jan 31 18:08:10 2012 @@ -40,8 +40,8 @@

Optimizing with Polly

-Optimizing with Polly is as easy as ading -O3 -polly to your compiler -flags (Polly is only available at -O3). +Optimizing with Polly is as easy as adding -O3 -mllvm -polly to your +compiler flags (Polly is only available at -O3).
pollycc -O3 -mllvm -polly file.c
From grosbach at apple.com Tue Jan 31 18:08:17 2012 From: grosbach at apple.com (Jim Grosbach) Date: Wed, 01 Feb 2012 00:08:17 -0000 Subject: [llvm-commits] [llvm] r149457 - in /llvm/trunk: lib/Transforms/InstCombine/InstCombineCalls.cpp test/Transforms/InstCombine/2008-01-06-BitCastAttributes.ll test/Transforms/InstCombine/call.ll Message-ID: <20120201000817.8B26F2A6C12C@llvm.org> Author: grosbach Date: Tue Jan 31 18:08:17 2012 New Revision: 149457 URL: http://llvm.org/viewvc/llvm-project?rev=149457&view=rev Log: Disable InstCombine unsafe folding bitcasts of calls w/ varargs. Changing arguments from being passed as fixed to varargs is unsafe, as the ABI may require they be handled differently (stack vs. register, for example). Remove two tests which rely on the bitcast being folded into the direct call, which is exactly the transformation that's unsafe. Removed: llvm/trunk/test/Transforms/InstCombine/2008-01-06-BitCastAttributes.ll Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp llvm/trunk/test/Transforms/InstCombine/call.ll Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp?rev=149457&r1=149456&r2=149457&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp Tue Jan 31 18:08:17 2012 @@ -1105,21 +1105,12 @@ if (FT->isVarArg()!=cast(APTy->getElementType())->isVarArg()) return false; } - - if (FT->getNumParams() < NumActualArgs && FT->isVarArg() && - !CallerPAL.isEmpty()) - // In this case we have more arguments than the new function type, but we - // won't be dropping them. Check that these extra arguments have attributes - // that are compatible with being a vararg call argument. 
- for (unsigned i = CallerPAL.getNumSlots(); i; --i) { - if (CallerPAL.getSlot(i - 1).Index <= FT->getNumParams()) - break; - Attributes PAttrs = CallerPAL.getSlot(i - 1).Attrs; - if (PAttrs & Attribute::VarArgsIncompatible) - return false; - } - + // If we're casting varargs to non-varargs, that may not be allowable + // under the ABI, so conservatively don't do anything. + if (FT->getNumParams() < NumActualArgs && FT->isVarArg()) + return false; + // Okay, we decided that this is a safe thing to do: go ahead and start // inserting cast instructions as necessary. std::vector Args; Removed: llvm/trunk/test/Transforms/InstCombine/2008-01-06-BitCastAttributes.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/2008-01-06-BitCastAttributes.ll?rev=149456&view=auto ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/2008-01-06-BitCastAttributes.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/2008-01-06-BitCastAttributes.ll (removed) @@ -1,23 +0,0 @@ -; Ignore stderr, we expect warnings there -; RUN: opt < %s -instcombine 2> /dev/null -S | not grep bitcast - -define void @a() { - ret void -} - -define signext i32 @b(i32* inreg %x) { - ret i32 0 -} - -define void @c(...) 
{ - ret void -} - -define void @g(i32* %y) { - call void bitcast (void ()* @a to void (i32*)*)( i32* noalias %y ) - call <2 x i32> bitcast (i32 (i32*)* @b to <2 x i32> (i32*)*)( i32* inreg null ) ; <<2 x i32>>:1 [#uses=0] - %x = call i64 bitcast (i32 (i32*)* @b to i64 (i32)*)( i32 0 ) ; [#uses=0] - call void bitcast (void (...)* @c to void (i32)*)( i32 0 ) - call void bitcast (void (...)* @c to void (i32)*)( i32 zeroext 0 ) - ret void -} Modified: llvm/trunk/test/Transforms/InstCombine/call.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/call.ll?rev=149457&r1=149456&r2=149457&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/call.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/call.ll Tue Jan 31 18:08:17 2012 @@ -30,20 +30,6 @@ } -; Resolving this should insert a cast from sbyte to int, following the C -; promotion rules. -define void @test3a(i8, ...) {unreachable } - -define void @test3(i8 %A, i8 %B) { - call void bitcast (void (i8, ...)* @test3a to void (i8, i8)*)( i8 %A, i8 %B -) - ret void -; CHECK: %1 = zext i8 %B to i32 -; CHECK: call void (i8, ...)* @test3a(i8 %A, i32 %1) -; CHECK: ret void -} - - ; test conversion of return value... 
define i8 @test4a() { ret i8 0 From grosser at fim.uni-passau.de Tue Jan 31 18:13:03 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Wed, 01 Feb 2012 01:13:03 +0100 Subject: [llvm-commits] [polly] r149415 - /polly/trunk/www/example_load_Polly_into_clang.html In-Reply-To: References: <20120131202421.A20A82A6C12C@llvm.org> <3DA1131F-A696-49E4-B464-C7EA781152C5@apple.com> Message-ID: <4F28838F.5090202@fim.uni-passau.de> On 01/31/2012 10:46 PM, Chad Rosier wrote: > > On Jan 31, 2012, at 1:44 PM, Chad Rosier wrote: > >> Hi Tobias, >> >> On Jan 31, 2012, at 12:24 PM, Tobias Grosser wrote: >> >>> Author: grosser >>> Date: Tue Jan 31 14:24:21 2012 >>> New Revision: 149415 >>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=149415&view=rev >>> Log: >>> www: Add missing -mllvm to documentation >>> >>> Found by Ryan Taylor >>> >>> Modified: >>> polly/trunk/www/example_load_Polly_into_clang.html >>> >>> Modified: polly/trunk/www/example_load_Polly_into_clang.html >>> URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/example_load_Polly_into_clang.html?rev=149415&r1=149414&r2=149415&view=diff >>> ============================================================================== >>> --- polly/trunk/www/example_load_Polly_into_clang.html (original) >>> +++ polly/trunk/www/example_load_Polly_into_clang.html Tue Jan 31 14:24:21 2012 >>> @@ -43,7 +43,7 @@ >>> Optimizing with Polly is as easy as ading-O3 -polly to your compiler > > Also, ading -> adding. > >> Does a similar change need to be applied above? Hi Chad, thanks for keeping the eyes open. 
I committed the fix in r149456 Tobi From eli.friedman at gmail.com Tue Jan 31 18:32:28 2012 From: eli.friedman at gmail.com (Eli Friedman) Date: Tue, 31 Jan 2012 16:32:28 -0800 Subject: [llvm-commits] [llvm] r149457 - in /llvm/trunk: lib/Transforms/InstCombine/InstCombineCalls.cpp test/Transforms/InstCombine/2008-01-06-BitCastAttributes.ll test/Transforms/InstCombine/call.ll In-Reply-To: <20120201000817.8B26F2A6C12C@llvm.org> References: <20120201000817.8B26F2A6C12C@llvm.org> Message-ID: On Tue, Jan 31, 2012 at 4:08 PM, Jim Grosbach wrote: > Author: grosbach > Date: Tue Jan 31 18:08:17 2012 > New Revision: 149457 > > URL: http://llvm.org/viewvc/llvm-project?rev=149457&view=rev > Log: > Disable InstCombine unsafe folding bitcasts of calls w/ varargs. > > Changing arguments from being passed as fixed to varargs is unsafe, as > the ABI may require they be handled differently (stack vs. register, for > example). > > Remove two tests which rely on the bitcast being folded into the direct > call, which is exactly the transformation that's unsafe. > > Removed: > llvm/trunk/test/Transforms/InstCombine/2008-01-06-BitCastAttributes.ll > Modified: > llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp > llvm/trunk/test/Transforms/InstCombine/call.ll > > Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp?rev=149457&r1=149456&r2=149457&view=diff > ============================================================================== > --- llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp (original) > +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp Tue Jan 31 18:08:17 2012 > @@ -1105,21 +1105,12 @@ > if (FT->isVarArg()!=cast(APTy->getElementType())->isVarArg()) > return false; > } > - > - if (FT->getNumParams() < NumActualArgs && FT->isVarArg() && > - !CallerPAL.isEmpty())
> - // In this case we have more arguments than the new function type, but we > - // won't be dropping them. Check that these extra arguments have attributes > - // that are compatible with being a vararg call argument. > - for (unsigned i = CallerPAL.getNumSlots(); i; --i) { > - if (CallerPAL.getSlot(i - 1).Index <= FT->getNumParams()) > - break; > - Attributes PAttrs = CallerPAL.getSlot(i - 1).Attrs; > - if (PAttrs & Attribute::VarArgsIncompatible) > - return false; > - } > > - > + // If we're casting varargs to non-varargs, that may not be allowable > + // under the ABI, so conservatively don't do anything. > + if (FT->getNumParams() < NumActualArgs && FT->isVarArg()) > + return false; This comment doesn't make sense given the code above this: if we can't see the definition, we won't reach this check, and if we can see the definition we're casting we're casting varargs to non-varargs, we can't possibly be introducing undefined behavior. -Eli From evan.cheng at apple.com Tue Jan 31 19:06:24 2012 From: evan.cheng at apple.com (Evan Cheng) Date: Tue, 31 Jan 2012 17:06:24 -0800 Subject: [llvm-commits] [llvm] r149457 - in /llvm/trunk: lib/Transforms/InstCombine/InstCombineCalls.cpp test/Transforms/InstCombine/2008-01-06-BitCastAttributes.ll test/Transforms/InstCombine/call.ll In-Reply-To: References: <20120201000817.8B26F2A6C12C@llvm.org> Message-ID: Hi Jim, I chatted with Eli about this. I share his concern that your fix is not quite right. Please chat with Eli to get this sorted out. Thanks, Evan On Jan 31, 2012, at 4:32 PM, Eli Friedman wrote: > On Tue, Jan 31, 2012 at 4:08 PM, Jim Grosbach wrote: >> Author: grosbach >> Date: Tue Jan 31 18:08:17 2012 >> New Revision: 149457 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=149457&view=rev >> Log: >> Disable InstCombine unsafe folding bitcasts of calls w/ varargs.
>> >> Changing arguments from being passed as fixed to varargs is unsafe, as >> the ABI may require they be handled differently (stack vs. register, for >> example). >> >> Remove two tests which rely on the bitcast being folded into the direct >> call, which is exactly the transformation that's unsafe. >> >> Removed: >> llvm/trunk/test/Transforms/InstCombine/2008-01-06-BitCastAttributes.ll >> Modified: >> llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp >> llvm/trunk/test/Transforms/InstCombine/call.ll >> >> Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp?rev=149457&r1=149456&r2=149457&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp (original) >> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp Tue Jan 31 18:08:17 2012 >> @@ -1105,21 +1105,12 @@ >> if (FT->isVarArg()!=cast(APTy->getElementType())->isVarArg()) >> return false; >> } >> - >> - if (FT->getNumParams() < NumActualArgs && FT->isVarArg() && >> - !CallerPAL.isEmpty()) >> - // In this case we have more arguments than the new function type, but we >> - // won't be dropping them. Check that these extra arguments have attributes >> - // that are compatible with being a vararg call argument. >> - for (unsigned i = CallerPAL.getNumSlots(); i; --i) { >> - if (CallerPAL.getSlot(i - 1).Index <= FT->getNumParams()) >> - break; >> - Attributes PAttrs = CallerPAL.getSlot(i - 1).Attrs; >> - if (PAttrs & Attribute::VarArgsIncompatible) >> - return false; >> - } >> >> - >> + // If we're casting varargs to non-varargs, that may not be allowable >> + // under the ABI, so conservatively don't do anything. 
>> + if (FT->getNumParams() < NumActualArgs && FT->isVarArg()) >> + return false; > > This comment doesn't make sense given the code above this: if we can't > see the definition, we won't reach this check, and if we can see the > definition we're casting we're casting varargs to non-varargs, we > can't possibly be introducing undefined behavior. > > -Eli From hfinkel at anl.gov Tue Jan 31 21:51:44 2012 From: hfinkel at anl.gov (Hal Finkel) Date: Wed, 01 Feb 2012 03:51:44 -0000 Subject: [llvm-commits] [llvm] r149468 - in /llvm/trunk: docs/ include/llvm-c/ include/llvm-c/Transforms/ include/llvm/ include/llvm/Transforms/ include/llvm/Transforms/IPO/ lib/Transforms/ lib/Transforms/IPO/ lib/Transforms/Vectorize/ test/Transforms/BBVectorize/ tools/bugpoint/ tools/llvm-ld/ tools/lto/ tools/opt/ Message-ID: <20120201035145.411492A6C12C@llvm.org> Author: hfinkel Date: Tue Jan 31 21:51:43 2012 New Revision: 149468 URL: http://llvm.org/viewvc/llvm-project?rev=149468&view=rev Log: Add a basic-block autovectorization pass. This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure. Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).
Added: llvm/trunk/include/llvm-c/Transforms/Vectorize.h - copied, changed from r149457, llvm/trunk/include/llvm-c/Initialization.h llvm/trunk/include/llvm/Transforms/Vectorize.h llvm/trunk/lib/Transforms/Vectorize/ llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp llvm/trunk/lib/Transforms/Vectorize/CMakeLists.txt llvm/trunk/lib/Transforms/Vectorize/LLVMBuild.txt - copied, changed from r149457, llvm/trunk/lib/Transforms/IPO/LLVMBuild.txt llvm/trunk/lib/Transforms/Vectorize/Makefile - copied, changed from r149457, llvm/trunk/tools/llvm-ld/Makefile llvm/trunk/lib/Transforms/Vectorize/Vectorize.cpp llvm/trunk/test/Transforms/BBVectorize/ llvm/trunk/test/Transforms/BBVectorize/cycle.ll llvm/trunk/test/Transforms/BBVectorize/dg.exp llvm/trunk/test/Transforms/BBVectorize/ld1.ll llvm/trunk/test/Transforms/BBVectorize/loop1.ll llvm/trunk/test/Transforms/BBVectorize/req-depth.ll llvm/trunk/test/Transforms/BBVectorize/search-limit.ll llvm/trunk/test/Transforms/BBVectorize/simple-int.ll llvm/trunk/test/Transforms/BBVectorize/simple-ldstr.ll llvm/trunk/test/Transforms/BBVectorize/simple.ll Modified: llvm/trunk/docs/Passes.html llvm/trunk/include/llvm-c/Initialization.h llvm/trunk/include/llvm/InitializePasses.h llvm/trunk/include/llvm/LinkAllPasses.h llvm/trunk/include/llvm/Transforms/IPO/PassManagerBuilder.h llvm/trunk/lib/Transforms/CMakeLists.txt llvm/trunk/lib/Transforms/IPO/LLVMBuild.txt llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp llvm/trunk/lib/Transforms/LLVMBuild.txt llvm/trunk/lib/Transforms/Makefile llvm/trunk/tools/bugpoint/CMakeLists.txt llvm/trunk/tools/bugpoint/Makefile llvm/trunk/tools/llvm-ld/CMakeLists.txt llvm/trunk/tools/llvm-ld/Makefile llvm/trunk/tools/lto/CMakeLists.txt llvm/trunk/tools/lto/Makefile llvm/trunk/tools/opt/CMakeLists.txt llvm/trunk/tools/opt/Makefile llvm/trunk/tools/opt/opt.cpp Modified: llvm/trunk/docs/Passes.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/Passes.html?rev=149468&r1=149467&r2=149468&view=diff 
============================================================================== --- llvm/trunk/docs/Passes.html (original) +++ llvm/trunk/docs/Passes.html Tue Jan 31 21:51:43 2012 @@ -126,6 +126,7 @@
+ @@ -817,6 +818,26 @@

+ -bb-vectorize: Basic-Block Vectorization +

+
+

This pass combines instructions inside basic blocks to form vector + instructions. It iterates over each basic block, attempting to pair + compatible instructions, repeating this process until no additional + pairs are selected for vectorization. When the outputs of some pair + of compatible instructions are used as inputs by some other pair of + compatible instructions, those pairs are part of a potential + vectorization chain. Instruction pairs are only fused into vector + instructions when they are part of a chain longer than some + threshold length. Moreover, the pass attempts to find the best + possible chain for each pair of compatible instructions. These + heuristics are intended to prevent vectorization in cases where + it would not yield a performance increase of the resulting code. +

+
+ + +

-block-placement: Profile Guided Basic Block Placement
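The chain-length rule in the -bb-vectorize description above can be illustrated with a small, self-contained sketch. This is only a simplified model under stated assumptions, not the code in BBVectorize.cpp: the PairAdjacency type and the longestPathFrom and selectChainedPairs helpers are hypothetical names invented for illustration, candidate pairs are reduced to integer ids, the "uses the outputs of" relation between pairs is assumed to be acyclic, and a pair is kept when the longest chain through it has at least ReqChainDepth pairs (the description says "longer than some threshold", so the exact comparison is an assumption).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical adjacency list over candidate instruction pairs (not an LLVM
// type): Adj[i] lists the pairs reachable from pair i in one step.
typedef std::vector<std::vector<std::size_t> > PairAdjacency;

// Number of pairs on the longest path starting at P, memoized in Depth
// (0 means "not computed yet"). Assumes the graph is acyclic.
static unsigned longestPathFrom(const PairAdjacency &Adj, std::size_t P,
                                std::vector<unsigned> &Depth) {
  if (Depth[P] != 0)
    return Depth[P];
  unsigned Best = 0;
  for (std::size_t I = 0; I < Adj[P].size(); ++I)
    Best = std::max(Best, longestPathFrom(Adj, Adj[P][I], Depth));
  Depth[P] = Best + 1;
  return Depth[P];
}

// Users[i] lists the pairs that consume the outputs of pair i. Returns, for
// each pair, whether the longest chain passing through it contains at least
// ReqChainDepth pairs (assumption: "at least", not strictly "longer than").
std::vector<bool> selectChainedPairs(const PairAdjacency &Users,
                                     unsigned ReqChainDepth) {
  const std::size_t N = Users.size();

  // Build the reverse relation: Producers[j] lists the pairs feeding pair j.
  PairAdjacency Producers(N);
  for (std::size_t I = 0; I < N; ++I)
    for (std::size_t J = 0; J < Users[I].size(); ++J)
      Producers[Users[I][J]].push_back(I);

  std::vector<unsigned> Down(N, 0), Up(N, 0);
  std::vector<bool> Keep(N, false);
  for (std::size_t P = 0; P < N; ++P) {
    // Pair P is counted once in each direction, hence the "- 1".
    unsigned Through = longestPathFrom(Users, P, Down) +
                       longestPathFrom(Producers, P, Up) - 1;
    Keep[P] = Through >= ReqChainDepth;
  }
  return Keep;
}
```

With pairs 0 -> 1 -> 2 forming a chain and pair 3 isolated, a required depth of 3 keeps the three chained pairs and rejects the isolated one, which is the effect the heuristic is after: isolated pairings rarely pay for the packing and unpacking they introduce.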

Modified: llvm/trunk/include/llvm-c/Initialization.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Initialization.h?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/include/llvm-c/Initialization.h (original) +++ llvm/trunk/include/llvm-c/Initialization.h Tue Jan 31 21:51:43 2012 @@ -25,6 +25,7 @@ void LLVMInitializeCore(LLVMPassRegistryRef R); void LLVMInitializeTransformUtils(LLVMPassRegistryRef R); void LLVMInitializeScalarOpts(LLVMPassRegistryRef R); +void LLVMInitializeVectorization(LLVMPassRegistryRef R); void LLVMInitializeInstCombine(LLVMPassRegistryRef R); void LLVMInitializeIPO(LLVMPassRegistryRef R); void LLVMInitializeInstrumentation(LLVMPassRegistryRef R); Copied: llvm/trunk/include/llvm-c/Transforms/Vectorize.h (from r149457, llvm/trunk/include/llvm-c/Initialization.h) URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Transforms/Vectorize.h?p2=llvm/trunk/include/llvm-c/Transforms/Vectorize.h&p1=llvm/trunk/include/llvm-c/Initialization.h&r1=149457&r2=149468&rev=149468&view=diff ============================================================================== --- llvm/trunk/include/llvm-c/Initialization.h (original) +++ llvm/trunk/include/llvm-c/Transforms/Vectorize.h Tue Jan 31 21:51:43 2012 @@ -1,4 +1,5 @@ -/*===-- llvm-c/Initialization.h - Initialization C Interface ------*- C -*-===*\ +/*===---------------------------Vectorize.h ------------------- -*- C++ -*-===*\ +|*===----------- Vectorization Transformation Library C Interface ---------===*| |* *| |* The LLVM Compiler Infrastructure *| |* *| @@ -7,14 +8,17 @@ |* *| |*===----------------------------------------------------------------------===*| |* *| -|* This header declares the C interface to LLVM initialization routines, *| -|* which must be called before you can use the functionality provided by *| -|* the corresponding LLVM library. 
*| +|* This header declares the C interface to libLLVMVectorize.a, which *| +|* implements various vectorization transformations of the LLVM IR. *| +|* *| +|* Many exotic languages can interoperate with C code but have a harder time *| +|* with C++ due to name mangling. So in addition to C, this interface enables *| +|* tools written in such languages. *| |* *| \*===----------------------------------------------------------------------===*/ -#ifndef LLVM_C_INITIALIZEPASSES_H -#define LLVM_C_INITIALIZEPASSES_H +#ifndef LLVM_C_TRANSFORMS_VECTORIZE_H +#define LLVM_C_TRANSFORMS_VECTORIZE_H #include "llvm-c/Core.h" @@ -22,19 +26,12 @@ extern "C" { #endif -void LLVMInitializeCore(LLVMPassRegistryRef R); -void LLVMInitializeTransformUtils(LLVMPassRegistryRef R); -void LLVMInitializeScalarOpts(LLVMPassRegistryRef R); -void LLVMInitializeInstCombine(LLVMPassRegistryRef R); -void LLVMInitializeIPO(LLVMPassRegistryRef R); -void LLVMInitializeInstrumentation(LLVMPassRegistryRef R); -void LLVMInitializeAnalysis(LLVMPassRegistryRef R); -void LLVMInitializeIPA(LLVMPassRegistryRef R); -void LLVMInitializeCodeGen(LLVMPassRegistryRef R); -void LLVMInitializeTarget(LLVMPassRegistryRef R); +/** See llvm::createBBVectorizePass function. */ +void LLVMAddBBVectorizePass(LLVMPassManagerRef PM); #ifdef __cplusplus } -#endif +#endif /* defined(__cplusplus) */ #endif + Modified: llvm/trunk/include/llvm/InitializePasses.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/InitializePasses.h?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/include/llvm/InitializePasses.h (original) +++ llvm/trunk/include/llvm/InitializePasses.h Tue Jan 31 21:51:43 2012 @@ -31,6 +31,10 @@ /// ScalarOpts library. void initializeScalarOpts(PassRegistry&); +/// initializeVectorization - Initialize all passes linked into the +/// Vectorize library. 
+void initializeVectorization(PassRegistry&); + /// initializeInstCombine - Initialize all passes linked into the /// ScalarOpts library. void initializeInstCombine(PassRegistry&); @@ -236,7 +240,7 @@ void initializeInstSimplifierPass(PassRegistry&); void initializeUnpackMachineBundlesPass(PassRegistry&); void initializeFinalizeMachineBundlesPass(PassRegistry&); - +void initializeBBVectorizePass(PassRegistry&); } #endif Modified: llvm/trunk/include/llvm/LinkAllPasses.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/LinkAllPasses.h?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/include/llvm/LinkAllPasses.h (original) +++ llvm/trunk/include/llvm/LinkAllPasses.h Tue Jan 31 21:51:43 2012 @@ -31,6 +31,7 @@ #include "llvm/Transforms/Instrumentation.h" #include "llvm/Transforms/IPO.h" #include "llvm/Transforms/Scalar.h" +#include "llvm/Transforms/Vectorize.h" #include "llvm/Transforms/Utils/UnifyFunctionExitNodes.h" #include @@ -151,6 +152,7 @@ (void) llvm::createCorrelatedValuePropagationPass(); (void) llvm::createMemDepPrinter(); (void) llvm::createInstructionSimplifierPass(); + (void) llvm::createBBVectorizePass(); (void)new llvm::IntervalPartition(); (void)new llvm::FindUsedTypes(); Modified: llvm/trunk/include/llvm/Transforms/IPO/PassManagerBuilder.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/IPO/PassManagerBuilder.h?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/include/llvm/Transforms/IPO/PassManagerBuilder.h (original) +++ llvm/trunk/include/llvm/Transforms/IPO/PassManagerBuilder.h Tue Jan 31 21:51:43 2012 @@ -99,6 +99,7 @@ bool DisableSimplifyLibCalls; bool DisableUnitAtATime; bool DisableUnrollLoops; + bool Vectorize; private: /// ExtensionList - This is list of all of the extensions that are registered. 
Added: llvm/trunk/include/llvm/Transforms/Vectorize.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Vectorize.h?rev=149468&view=auto ============================================================================== --- llvm/trunk/include/llvm/Transforms/Vectorize.h (added) +++ llvm/trunk/include/llvm/Transforms/Vectorize.h Tue Jan 31 21:51:43 2012 @@ -0,0 +1,30 @@ +//===-- Vectorize.h - Vectorization Transformations -------------*- C++ -*-===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This header file defines prototypes for accessor functions that expose passes +// in the Vectorize transformations library. +// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_TRANSFORMS_VECTORIZE_H +#define LLVM_TRANSFORMS_VECTORIZE_H + +namespace llvm { + +class BasicBlockPass; + +//===----------------------------------------------------------------------===// +// +// BBVectorize - A basic-block vectorization pass. 
+// +BasicBlockPass *createBBVectorizePass(); + +} // End llvm namespace + +#endif Modified: llvm/trunk/lib/Transforms/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/CMakeLists.txt?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/CMakeLists.txt (original) +++ llvm/trunk/lib/Transforms/CMakeLists.txt Tue Jan 31 21:51:43 2012 @@ -3,4 +3,5 @@ add_subdirectory(InstCombine) add_subdirectory(Scalar) add_subdirectory(IPO) +add_subdirectory(Vectorize) add_subdirectory(Hello) Modified: llvm/trunk/lib/Transforms/IPO/LLVMBuild.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/LLVMBuild.txt?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/LLVMBuild.txt (original) +++ llvm/trunk/lib/Transforms/IPO/LLVMBuild.txt Tue Jan 31 21:51:43 2012 @@ -20,4 +20,4 @@ name = IPO parent = Transforms library_name = ipo -required_libraries = Analysis Core IPA InstCombine Scalar Support Target TransformUtils +required_libraries = Analysis Core IPA InstCombine Scalar Vectorize Support Target TransformUtils Modified: llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp Tue Jan 31 21:51:43 2012 @@ -21,14 +21,20 @@ #include "llvm/DefaultPasses.h" #include "llvm/PassManager.h" #include "llvm/Analysis/Passes.h" +#include "llvm/Analysis/Verifier.h" +#include "llvm/Support/CommandLine.h" #include "llvm/Target/TargetLibraryInfo.h" #include "llvm/Transforms/Scalar.h" +#include "llvm/Transforms/Vectorize.h" 
#include "llvm/Transforms/IPO.h" #include "llvm/ADT/SmallVector.h" #include "llvm/Support/ManagedStatic.h" using namespace llvm; +static cl::opt<bool> +RunVectorization("vectorize", cl::desc("Run vectorization passes")); + PassManagerBuilder::PassManagerBuilder() { OptLevel = 2; SizeLevel = 0; @@ -37,6 +43,7 @@ DisableSimplifyLibCalls = false; DisableUnitAtATime = false; DisableUnrollLoops = false; + Vectorize = RunVectorization; } PassManagerBuilder::~PassManagerBuilder() { @@ -172,6 +179,13 @@ addExtensionsToPM(EP_ScalarOptimizerLate, MPM); + if (Vectorize) { + MPM.add(createBBVectorizePass()); + MPM.add(createInstructionCombiningPass()); + if (OptLevel > 1) + MPM.add(createGVNPass()); // Remove redundancies + } + MPM.add(createAggressiveDCEPass()); // Delete dead instructions MPM.add(createCFGSimplificationPass()); // Merge & remove BBs MPM.add(createInstructionCombiningPass()); // Clean up after everything. Modified: llvm/trunk/lib/Transforms/LLVMBuild.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/LLVMBuild.txt?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/LLVMBuild.txt (original) +++ llvm/trunk/lib/Transforms/LLVMBuild.txt Tue Jan 31 21:51:43 2012 @@ -16,7 +16,7 @@ ;===------------------------------------------------------------------------===; [common] -subdirectories = IPO InstCombine Instrumentation Scalar Utils +subdirectories = IPO InstCombine Instrumentation Scalar Utils Vectorize [component_0] type = Group Modified: llvm/trunk/lib/Transforms/Makefile URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Makefile?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Makefile (original) +++ llvm/trunk/lib/Transforms/Makefile Tue Jan 31 21:51:43 2012 @@ -8,7 +8,7 @@
##===----------------------------------------------------------------------===## LEVEL = ../.. -PARALLEL_DIRS = Utils Instrumentation Scalar InstCombine IPO Hello +PARALLEL_DIRS = Utils Instrumentation Scalar InstCombine IPO Vectorize Hello include $(LEVEL)/Makefile.config Added: llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp?rev=149468&view=auto ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp (added) +++ llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp Tue Jan 31 21:51:43 2012 @@ -0,0 +1,1796 @@ +//===- BBVectorize.cpp - A Basic-Block Vectorizer -------------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This file implements a basic-block vectorization pass. The algorithm was +// inspired by that used by the Vienna MAP Vectorizor by Franchetti and Kral, +// et al. It works by looking for chains of pairable operations and then +// pairing them. 
+// +//===----------------------------------------------------------------------===// + +#define BBV_NAME "bb-vectorize" +#define DEBUG_TYPE BBV_NAME +#include "llvm/Constants.h" +#include "llvm/DerivedTypes.h" +#include "llvm/Function.h" +#include "llvm/Instructions.h" +#include "llvm/IntrinsicInst.h" +#include "llvm/Intrinsics.h" +#include "llvm/LLVMContext.h" +#include "llvm/Pass.h" +#include "llvm/Type.h" +#include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/DenseSet.h" +#include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/Statistic.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/Analysis/AliasAnalysis.h" +#include "llvm/Analysis/AliasSetTracker.h" +#include "llvm/Analysis/ScalarEvolution.h" +#include "llvm/Analysis/ScalarEvolutionExpressions.h" +#include "llvm/Analysis/ValueTracking.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/Debug.h" +#include "llvm/Support/raw_ostream.h" +#include "llvm/Support/ValueHandle.h" +#include "llvm/Target/TargetData.h" +#include "llvm/Transforms/Vectorize.h" +#include +#include +using namespace llvm; + +static cl::opt +ReqChainDepth("bb-vectorize-req-chain-depth", cl::init(6), cl::Hidden, + cl::desc("The required chain depth for vectorization")); + +static cl::opt +SearchLimit("bb-vectorize-search-limit", cl::init(400), cl::Hidden, + cl::desc("The maximum search distance for instruction pairs")); + +static cl::opt +SplatBreaksChain("bb-vectorize-splat-breaks-chain", cl::init(false), cl::Hidden, + cl::desc("Replicating one element to a pair breaks the chain")); + +static cl::opt +VectorBits("bb-vectorize-vector-bits", cl::init(128), cl::Hidden, + cl::desc("The size of the native vector registers")); + +static cl::opt +MaxIter("bb-vectorize-max-iter", cl::init(0), cl::Hidden, + cl::desc("The maximum number of pairing iterations")); + +static cl::opt +MaxCandPairsForCycleCheck("bb-vectorize-max-cycle-check-pairs", cl::init(200), + cl::Hidden, cl::desc("The maximum number of 
candidate pairs with which to use" + " a full cycle check")); + +static cl::opt +NoInts("bb-vectorize-no-ints", cl::init(false), cl::Hidden, + cl::desc("Don't try to vectorize integer values")); + +static cl::opt +NoFloats("bb-vectorize-no-floats", cl::init(false), cl::Hidden, + cl::desc("Don't try to vectorize floating-point values")); + +static cl::opt +NoCasts("bb-vectorize-no-casts", cl::init(false), cl::Hidden, + cl::desc("Don't try to vectorize casting (conversion) operations")); + +static cl::opt +NoMath("bb-vectorize-no-math", cl::init(false), cl::Hidden, + cl::desc("Don't try to vectorize floating-point math intrinsics")); + +static cl::opt +NoFMA("bb-vectorize-no-fma", cl::init(false), cl::Hidden, + cl::desc("Don't try to vectorize the fused-multiply-add intrinsic")); + +static cl::opt +NoMemOps("bb-vectorize-no-mem-ops", cl::init(false), cl::Hidden, + cl::desc("Don't try to vectorize loads and stores")); + +static cl::opt +AlignedOnly("bb-vectorize-aligned-only", cl::init(false), cl::Hidden, + cl::desc("Only generate aligned loads and stores")); + +static cl::opt +FastDep("bb-vectorize-fast-dep", cl::init(false), cl::Hidden, + cl::desc("Use a fast instruction dependency analysis")); + +#ifndef NDEBUG +static cl::opt +DebugInstructionExamination("bb-vectorize-debug-instruction-examination", + cl::init(false), cl::Hidden, + cl::desc("When debugging is enabled, output information on the" + " instruction-examination process")); +static cl::opt +DebugCandidateSelection("bb-vectorize-debug-candidate-selection", + cl::init(false), cl::Hidden, + cl::desc("When debugging is enabled, output information on the" + " candidate-selection process")); +static cl::opt +DebugPairSelection("bb-vectorize-debug-pair-selection", + cl::init(false), cl::Hidden, + cl::desc("When debugging is enabled, output information on the" + " pair-selection process")); +static cl::opt +DebugCycleCheck("bb-vectorize-debug-cycle-check", + cl::init(false), cl::Hidden, + cl::desc("When 
debugging is enabled, output information on the" + " cycle-checking process")); +#endif + +STATISTIC(NumFusedOps, "Number of operations fused by bb-vectorize"); + +namespace { + struct BBVectorize : public BasicBlockPass { + static char ID; // Pass identification, replacement for typeid + BBVectorize() : BasicBlockPass(ID) { + initializeBBVectorizePass(*PassRegistry::getPassRegistry()); + } + + typedef std::pair ValuePair; + typedef std::pair ValuePairWithDepth; + typedef std::pair VPPair; // A ValuePair pair + typedef std::pair::iterator, + std::multimap::iterator> VPIteratorPair; + typedef std::pair::iterator, + std::multimap::iterator> + VPPIteratorPair; + + AliasAnalysis *AA; + ScalarEvolution *SE; + TargetData *TD; + + // FIXME: const correct? + + bool vectorizePairs(BasicBlock &BB); + + void getCandidatePairs(BasicBlock &BB, + std::multimap &CandidatePairs, + std::vector &PairableInsts); + + void computeConnectedPairs(std::multimap &CandidatePairs, + std::vector &PairableInsts, + std::multimap &ConnectedPairs); + + void buildDepMap(BasicBlock &BB, + std::multimap &CandidatePairs, + std::vector &PairableInsts, + DenseSet &PairableInstUsers); + + void choosePairs(std::multimap &CandidatePairs, + std::vector &PairableInsts, + std::multimap &ConnectedPairs, + DenseSet &PairableInstUsers, + DenseMap& ChosenPairs); + + void fuseChosenPairs(BasicBlock &BB, + std::vector &PairableInsts, + DenseMap& ChosenPairs); + + bool isInstVectorizable(Instruction *I, bool &IsSimpleLoadStore); + + bool areInstsCompatible(Instruction *I, Instruction *J, + bool IsSimpleLoadStore); + + bool trackUsesOfI(DenseSet &Users, + AliasSetTracker &WriteSet, Instruction *I, + Instruction *J, bool UpdateUsers = true, + std::multimap *LoadMoveSet = 0); + + void computePairsConnectedTo( + std::multimap &CandidatePairs, + std::vector &PairableInsts, + std::multimap &ConnectedPairs, + ValuePair P); + + bool pairsConflict(ValuePair P, ValuePair Q, + DenseSet &PairableInstUsers, + std::multimap 
*PairableInstUserMap = 0); + + bool pairWillFormCycle(ValuePair P, + std::multimap &PairableInstUsers, + DenseSet &CurrentPairs); + + void pruneTreeFor( + std::multimap &CandidatePairs, + std::vector &PairableInsts, + std::multimap &ConnectedPairs, + DenseSet &PairableInstUsers, + std::multimap &PairableInstUserMap, + DenseMap &ChosenPairs, + DenseMap &Tree, + DenseSet &PrunedTree, ValuePair J, + bool UseCycleCheck); + + void buildInitialTreeFor( + std::multimap &CandidatePairs, + std::vector &PairableInsts, + std::multimap &ConnectedPairs, + DenseSet &PairableInstUsers, + DenseMap &ChosenPairs, + DenseMap &Tree, ValuePair J); + + void findBestTreeFor( + std::multimap &CandidatePairs, + std::vector &PairableInsts, + std::multimap &ConnectedPairs, + DenseSet &PairableInstUsers, + std::multimap &PairableInstUserMap, + DenseMap &ChosenPairs, + DenseSet &BestTree, size_t &BestMaxDepth, + size_t &BestEffSize, VPIteratorPair ChoiceRange, + bool UseCycleCheck); + + Value *getReplacementPointerInput(LLVMContext& Context, Instruction *I, + Instruction *J, unsigned o, bool &FlipMemInputs); + + void fillNewShuffleMask(LLVMContext& Context, Instruction *J, + unsigned NumElem, unsigned MaskOffset, unsigned NumInElem, + unsigned IdxOffset, std::vector &Mask); + + Value *getReplacementShuffleMask(LLVMContext& Context, Instruction *I, + Instruction *J); + + Value *getReplacementInput(LLVMContext& Context, Instruction *I, + Instruction *J, unsigned o, bool FlipMemInputs); + + void getReplacementInputsForPair(LLVMContext& Context, Instruction *I, + Instruction *J, SmallVector &ReplacedOperands, + bool &FlipMemInputs); + + void replaceOutputsOfPair(LLVMContext& Context, Instruction *I, + Instruction *J, Instruction *K, + Instruction *&InsertionPt, Instruction *&K1, + Instruction *&K2, bool &FlipMemInputs); + + void collectPairLoadMoveSet(BasicBlock &BB, + DenseMap &ChosenPairs, + std::multimap &LoadMoveSet, + Instruction *I); + + void collectLoadMoveSet(BasicBlock &BB, + std::vector 
&PairableInsts, + DenseMap &ChosenPairs, + std::multimap &LoadMoveSet); + + bool canMoveUsesOfIAfterJ(BasicBlock &BB, + std::multimap &LoadMoveSet, + Instruction *I, Instruction *J); + + void moveUsesOfIAfterJ(BasicBlock &BB, + std::multimap &LoadMoveSet, + Instruction *&InsertionPt, + Instruction *I, Instruction *J); + + virtual bool runOnBasicBlock(BasicBlock &BB) { + AA = &getAnalysis(); + SE = &getAnalysis(); + TD = getAnalysisIfAvailable(); + + bool changed = false; + // Iterate a sufficient number of times to merge types of size 1 bit, + // then 2 bits, then 4, etc. up to half of the target vector width of the + // target vector register. + for (unsigned v = 2, n = 1; v <= VectorBits && (!MaxIter || n <= MaxIter); + v *= 2, ++n) { + DEBUG(dbgs() << "BBV: fusing loop #" << n << + " for " << BB.getName() << " in " << + BB.getParent()->getName() << "...\n"); + if (vectorizePairs(BB)) + changed = true; + else + break; + } + + DEBUG(dbgs() << "BBV: done!\n"); + return changed; + } + + virtual void getAnalysisUsage(AnalysisUsage &AU) const { + BasicBlockPass::getAnalysisUsage(AU); + AU.addRequired(); + AU.addRequired(); + AU.addPreserved(); + AU.addPreserved(); + } + + // This returns the vector type that holds a pair of the provided type. + // If the provided type is already a vector, then its length is doubled. + static inline VectorType *getVecTypeForPair(Type *ElemTy) { + if (VectorType *VTy = dyn_cast(ElemTy)) { + unsigned numElem = VTy->getNumElements(); + return VectorType::get(ElemTy->getScalarType(), numElem*2); + } else { + return VectorType::get(ElemTy, 2); + } + } + + // Returns the weight associated with the provided value. A chain of + // candidate pairs has a length given by the sum of the weights of its + // members (one weight per pair; the weight of each member of the pair + // is assumed to be the same). This length is then compared to the + // chain-length threshold to determine if a given chain is significant + // enough to be vectorized. 
The length is also used in comparing + // candidate chains where longer chains are considered to be better. + // Note: when this function returns 0, the resulting instructions are + // not actually fused. + static inline size_t getDepthFactor(Value *V) { + // InsertElement and ExtractElement have a depth factor of zero. This is + // for two reasons: First, they cannot be usefully fused. Second, because + // the pass generates a lot of these, they can confuse the simple metric + // used to compare the trees in the next iteration. Thus, giving them a + // weight of zero allows the pass to essentially ignore them in + // subsequent iterations when looking for vectorization opportunities + // while still tracking dependency chains that flow through those + // instructions. + if (isa(V) || isa(V)) + return 0; + + return 1; + } + + // This determines the relative offset of two loads or stores, returning + // true if the offset could be determined to be some constant value. + // For example, if OffsetInElmts == 1, then J accesses the memory directly + // after I; if OffsetInElmts == -1 then I accesses the memory + // directly after J. This function assumes that both instructions + // have the same type. + bool getPairPtrInfo(Instruction *I, Instruction *J, + Value *&IPtr, Value *&JPtr, unsigned &IAlignment, unsigned &JAlignment, + int64_t &OffsetInElmts) { + OffsetInElmts = 0; + if (isa(I)) { + IPtr = cast(I)->getPointerOperand(); + JPtr = cast(J)->getPointerOperand(); + IAlignment = cast(I)->getAlignment(); + JAlignment = cast(J)->getAlignment(); + } else { + IPtr = cast(I)->getPointerOperand(); + JPtr = cast(J)->getPointerOperand(); + IAlignment = cast(I)->getAlignment(); + JAlignment = cast(J)->getAlignment(); + } + + const SCEV *IPtrSCEV = SE->getSCEV(IPtr); + const SCEV *JPtrSCEV = SE->getSCEV(JPtr); + + // If this is a trivial offset, then we'll get something like + // 1*sizeof(type). 
With target data, which we need anyway, this will get + // constant folded into a number. + const SCEV *OffsetSCEV = SE->getMinusSCEV(JPtrSCEV, IPtrSCEV); + if (const SCEVConstant *ConstOffSCEV = + dyn_cast(OffsetSCEV)) { + ConstantInt *IntOff = ConstOffSCEV->getValue(); + int64_t Offset = IntOff->getSExtValue(); + + Type *VTy = cast(IPtr->getType())->getElementType(); + int64_t VTyTSS = (int64_t) TD->getTypeStoreSize(VTy); + + assert(VTy == cast(JPtr->getType())->getElementType()); + + OffsetInElmts = Offset/VTyTSS; + return (abs64(Offset) % VTyTSS) == 0; + } + + return false; + } + + // Returns true if the provided CallInst represents an intrinsic that can + // be vectorized. + bool isVectorizableIntrinsic(CallInst* I) { + Function *F = I->getCalledFunction(); + if (!F) return false; + + unsigned IID = F->getIntrinsicID(); + if (!IID) return false; + + switch(IID) { + default: + return false; + case Intrinsic::sqrt: + case Intrinsic::powi: + case Intrinsic::sin: + case Intrinsic::cos: + case Intrinsic::log: + case Intrinsic::log2: + case Intrinsic::log10: + case Intrinsic::exp: + case Intrinsic::exp2: + case Intrinsic::pow: + return !NoMath; + case Intrinsic::fma: + return !NoFMA; + } + } + + // Returns true if J is the second element in some pair referenced by + // some multimap pair iterator pair. + template + bool isSecondInIteratorPair(V J, std::pair< + typename std::multimap::iterator, + typename std::multimap::iterator> PairRange) { + for (typename std::multimap::iterator K = PairRange.first; + K != PairRange.second; ++K) + if (K->second == J) return true; + + return false; + } + }; + + // This function implements one vectorization iteration on the provided + // basic block. It returns true if the block is changed. 
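A note for readers of the archive: the template arguments of isSecondInIteratorPair above appear to have been lost in this copy of the message. The following is only a minimal standalone sketch of the same lookup idea, assuming plain int keys in place of Value pointers. The names isSecondInRange and isCandidatePair are illustrative and are not part of the patch.

```cpp
#include <map>
#include <utility>

// Hypothetical stand-in for the patch's helper: returns true if J appears as
// the mapped value of any entry in the given equal_range() result.
template <typename K, typename V>
bool isSecondInRange(
    const V &J,
    std::pair<typename std::multimap<K, V>::const_iterator,
              typename std::multimap<K, V>::const_iterator> Range) {
  for (typename std::multimap<K, V>::const_iterator It = Range.first;
       It != Range.second; ++It)
    if (It->second == J) return true;
  return false;
}

// Illustrative wrapper: is (I, J) recorded as a candidate pair?
// K cannot be deduced from the nested iterator type, so both template
// arguments are spelled out explicitly.
inline bool isCandidatePair(const std::multimap<int, int> &CandidatePairs,
                            int I, int J) {
  return isSecondInRange<int, int>(J, CandidatePairs.equal_range(I));
}
```

The multimap is keyed on the first member of each pair, so equal_range(I) yields every candidate partner of I and the helper only has to scan that short range.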
+ bool BBVectorize::vectorizePairs(BasicBlock &BB) { + std::vector PairableInsts; + std::multimap CandidatePairs; + getCandidatePairs(BB, CandidatePairs, PairableInsts); + if (PairableInsts.size() == 0) return false; + + // Now we have a map of all of the pairable instructions and we need to + // select the best possible pairing. A good pairing is one such that the + // users of the pair are also paired. This defines a (directed) forest + // over the pairs such that two pairs are connected iff the second pair + // uses the first. + + // Note that it only matters that both members of the second pair use some + // element of the first pair (to allow for splatting). + + std::multimap ConnectedPairs; + computeConnectedPairs(CandidatePairs, PairableInsts, ConnectedPairs); + if (ConnectedPairs.size() == 0) return false; + + // Build the pairable-instruction dependency map + DenseSet PairableInstUsers; + buildDepMap(BB, CandidatePairs, PairableInsts, PairableInstUsers); + + // There is now a graph of the connected pairs. For each variable, pick the + // pairing with the largest tree meeting the depth requirement on at least + // one branch. Then select all pairings that are part of that tree and + // remove them from the list of available pairings and pairable variables. + + DenseMap ChosenPairs; + choosePairs(CandidatePairs, PairableInsts, ConnectedPairs, + PairableInstUsers, ChosenPairs); + + if (ChosenPairs.size() == 0) return false; + NumFusedOps += ChosenPairs.size(); + + // A set of pairs has now been selected. It is now necessary to replace the + // paired instructions with vector instructions. For this procedure each + // operand must be replaced with a vector operand. This vector is formed + // by using build_vector on the old operands. The replaced values are then + // replaced with a vector_extract on the result. Subsequent optimization + // passes should coalesce the build/extract combinations. 
+ + fuseChosenPairs(BB, PairableInsts, ChosenPairs); + + return true; + } + + // This function returns true if the provided instruction is capable of being + // fused into a vector instruction. This determination is based only on the + // type and other attributes of the instruction. + bool BBVectorize::isInstVectorizable(Instruction *I, + bool &IsSimpleLoadStore) { + IsSimpleLoadStore = false; + + if (CallInst *C = dyn_cast(I)) { + if (!isVectorizableIntrinsic(C)) + return false; + } else if (LoadInst *L = dyn_cast(I)) { + // Vectorize simple loads if possible: + IsSimpleLoadStore = L->isSimple(); + if (!IsSimpleLoadStore || NoMemOps) + return false; + } else if (StoreInst *S = dyn_cast(I)) { + // Vectorize simple stores if possible: + IsSimpleLoadStore = S->isSimple(); + if (!IsSimpleLoadStore || NoMemOps) + return false; + } else if (CastInst *C = dyn_cast(I)) { + // We can vectorize casts, but not casts of pointer types, etc. + if (NoCasts) + return false; + + Type *SrcTy = C->getSrcTy(); + if (!SrcTy->isSingleValueType() || SrcTy->isPointerTy()) + return false; + + Type *DestTy = C->getDestTy(); + if (!DestTy->isSingleValueType() || DestTy->isPointerTy()) + return false; + } else if (!(I->isBinaryOp() || isa(I) || + isa(I) || isa(I))) { + return false; + } + + // We can't vectorize memory operations without target data + if (TD == 0 && IsSimpleLoadStore) + return false; + + Type *T1, *T2; + if (isa(I)) { + // For stores, it is the value type, not the pointer type that matters + // because the value is what will come from a vector register. + + Value *IVal = cast(I)->getValueOperand(); + T1 = IVal->getType(); + } else { + T1 = I->getType(); + } + + if (I->isCast()) + T2 = cast(I)->getSrcTy(); + else + T2 = T1; + + // Not every type can be vectorized... 
+ if (!(VectorType::isValidElementType(T1) || T1->isVectorTy()) || + !(VectorType::isValidElementType(T2) || T2->isVectorTy())) + return false; + + if (NoInts && (T1->isIntOrIntVectorTy() || T2->isIntOrIntVectorTy())) + return false; + + if (NoFloats && (T1->isFPOrFPVectorTy() || T2->isFPOrFPVectorTy())) + return false; + + if (T1->getPrimitiveSizeInBits() > VectorBits/2 || + T2->getPrimitiveSizeInBits() > VectorBits/2) + return false; + + return true; + } + + // This function returns true if the two provided instructions are compatible + // (meaning that they can be fused into a vector instruction). This assumes + // that I has already been determined to be vectorizable and that J is not + // in the use tree of I. + bool BBVectorize::areInstsCompatible(Instruction *I, Instruction *J, + bool IsSimpleLoadStore) { + DEBUG(if (DebugInstructionExamination) dbgs() << "BBV: looking at " << *I << + " <-> " << *J << "\n"); + + // Loads and stores can be merged if they have different alignments, + // but are otherwise the same. + LoadInst *LI, *LJ; + StoreInst *SI, *SJ; + if ((LI = dyn_cast(I)) && (LJ = dyn_cast(J))) { + if (I->getType() != J->getType()) + return false; + + if (LI->getPointerOperand()->getType() != + LJ->getPointerOperand()->getType() || + LI->isVolatile() != LJ->isVolatile() || + LI->getOrdering() != LJ->getOrdering() || + LI->getSynchScope() != LJ->getSynchScope()) + return false; + } else if ((SI = dyn_cast(I)) && (SJ = dyn_cast(J))) { + if (SI->getValueOperand()->getType() != + SJ->getValueOperand()->getType() || + SI->getPointerOperand()->getType() != + SJ->getPointerOperand()->getType() || + SI->isVolatile() != SJ->isVolatile() || + SI->getOrdering() != SJ->getOrdering() || + SI->getSynchScope() != SJ->getSynchScope()) + return false; + } else if (!J->isSameOperationAs(I)) { + return false; + } + // FIXME: handle addsub-type operations! 
+ + if (IsSimpleLoadStore) { + Value *IPtr, *JPtr; + unsigned IAlignment, JAlignment; + int64_t OffsetInElmts = 0; + if (getPairPtrInfo(I, J, IPtr, JPtr, IAlignment, JAlignment, + OffsetInElmts) && abs64(OffsetInElmts) == 1) { + if (AlignedOnly) { + Type *aType = isa(I) ? + cast(I)->getValueOperand()->getType() : I->getType(); + // An aligned load or store is possible only if the instruction + // with the lower offset has an alignment suitable for the + // vector type. + + unsigned BottomAlignment = IAlignment; + if (OffsetInElmts < 0) BottomAlignment = JAlignment; + + Type *VType = getVecTypeForPair(aType); + unsigned VecAlignment = TD->getPrefTypeAlignment(VType); + if (BottomAlignment < VecAlignment) + return false; + } + } else { + return false; + } + } else if (isa(I)) { + // Only merge two shuffles if they're both constant + return isa(I->getOperand(2)) && + isa(J->getOperand(2)); + // FIXME: We may want to vectorize non-constant shuffles also. + } + + return true; + } + + // Figure out whether or not J uses I and update the users and write-set + // structures associated with I. Specifically, Users represents the set of + // instructions that depend on I. WriteSet represents the set + // of memory locations that are dependent on I. If UpdateUsers is true, + // and J uses I, then Users is updated to contain J and WriteSet is updated + // to contain any memory locations to which J writes. The function returns + // true if J uses I. By default, alias analysis is used to determine + // whether J reads from memory that overlaps with a location in WriteSet. + // If LoadMoveSet is not null, then it is a previously-computed multimap + // where the key is the memory-based user instruction and the value is + // the instruction to be compared with I. So, if LoadMoveSet is provided, + // then the alias analysis is not used. 
This is necessary because this + // function is called during the process of moving instructions during + // vectorization and the results of the alias analysis are not stable during + // that process. + bool BBVectorize::trackUsesOfI(DenseSet &Users, + AliasSetTracker &WriteSet, Instruction *I, + Instruction *J, bool UpdateUsers, + std::multimap *LoadMoveSet) { + bool UsesI = false; + + // This instruction may already be marked as a user due, for example, to + // being a member of a selected pair. + if (Users.count(J)) + UsesI = true; + + if (!UsesI) + for (User::op_iterator JU = J->op_begin(), e = J->op_end(); + JU != e; ++JU) { + Value *V = *JU; + if (I == V || Users.count(V)) { + UsesI = true; + break; + } + } + if (!UsesI && J->mayReadFromMemory()) { + if (LoadMoveSet) { + VPIteratorPair JPairRange = LoadMoveSet->equal_range(J); + UsesI = isSecondInIteratorPair(I, JPairRange); + } else { + for (AliasSetTracker::iterator W = WriteSet.begin(), + WE = WriteSet.end(); W != WE; ++W) { + for (AliasSet::iterator A = W->begin(), AE = W->end(); + A != AE; ++A) { + AliasAnalysis::Location ptrLoc(A->getValue(), A->getSize(), + A->getTBAAInfo()); + if (AA->getModRefInfo(J, ptrLoc) != AliasAnalysis::NoModRef) { + UsesI = true; + break; + } + } + if (UsesI) break; + } + } + } + + if (UsesI && UpdateUsers) { + if (J->mayWriteToMemory()) WriteSet.add(J); + Users.insert(J); + } + + return UsesI; + } + + // This function iterates over all instruction pairs in the provided + // basic block and collects all candidate pairs for vectorization. + void BBVectorize::getCandidatePairs(BasicBlock &BB, + std::multimap &CandidatePairs, + std::vector &PairableInsts) { + BasicBlock::iterator E = BB.end(); + for (BasicBlock::iterator I = BB.getFirstInsertionPt(); I != E; ++I) { + bool IsSimpleLoadStore; + if (!isInstVectorizable(I, IsSimpleLoadStore)) continue; + + // Look for an instruction with which to pair instruction *I... 
+ DenseSet Users; + AliasSetTracker WriteSet(*AA); + BasicBlock::iterator J = I; ++J; + for (unsigned ss = 0; J != E && ss <= SearchLimit; ++J, ++ss) { + // Determine if J uses I, if so, exit the loop. + bool UsesI = trackUsesOfI(Users, WriteSet, I, J, !FastDep); + if (FastDep) { + // Note: For this heuristic to be effective, independent operations + // must tend to be intermixed. This is likely to be true from some + // kinds of grouped loop unrolling (but not the generic LLVM pass), + // but otherwise may require some kind of reordering pass. + + // When using fast dependency analysis, + // stop searching after first use: + if (UsesI) break; + } else { + if (UsesI) continue; + } + + // J does not use I, and comes before the first use of I, so it can be + // merged with I if the instructions are compatible. + if (!areInstsCompatible(I, J, IsSimpleLoadStore)) continue; + + // J is a candidate for merging with I. + if (!PairableInsts.size() || + PairableInsts[PairableInsts.size()-1] != I) { + PairableInsts.push_back(I); + } + CandidatePairs.insert(ValuePair(I, J)); + DEBUG(if (DebugCandidateSelection) dbgs() << "BBV: candidate pair " + << *I << " <-> " << *J << "\n"); + } + } + + DEBUG(dbgs() << "BBV: found " << PairableInsts.size() + << " instructions with candidate pairs\n"); + } + + // Finds candidate pairs connected to the pair P = <PI, PJ>. This means that + // it looks for pairs such that both members have an input which is an + // output of PI or PJ. + void BBVectorize::computePairsConnectedTo( + std::multimap &CandidatePairs, + std::vector &PairableInsts, + std::multimap &ConnectedPairs, + ValuePair P) { + // For each possible pairing for this variable, look at the uses of + // the first value... + for (Value::use_iterator I = P.first->use_begin(), + E = P.first->use_end(); I != E; ++I) { + VPIteratorPair IPairRange = CandidatePairs.equal_range(*I); + + // For each use of the first variable, look for uses of the second + // variable... 
+ for (Value::use_iterator J = P.second->use_begin(), + E2 = P.second->use_end(); J != E2; ++J) { + VPIteratorPair JPairRange = CandidatePairs.equal_range(*J); + + // Look for <I, J>: + if (isSecondInIteratorPair(*J, IPairRange)) + ConnectedPairs.insert(VPPair(P, ValuePair(*I, *J))); + + // Look for <J, I>: + if (isSecondInIteratorPair(*I, JPairRange)) + ConnectedPairs.insert(VPPair(P, ValuePair(*J, *I))); + } + + if (SplatBreaksChain) continue; + // Look for cases where just the first value in the pair is used by + // both members of another pair (splatting). + for (Value::use_iterator J = P.first->use_begin(); J != E; ++J) { + if (isSecondInIteratorPair(*J, IPairRange)) + ConnectedPairs.insert(VPPair(P, ValuePair(*I, *J))); + } + } + + if (SplatBreaksChain) return; + // Look for cases where just the second value in the pair is used by + // both members of another pair (splatting). + for (Value::use_iterator I = P.second->use_begin(), + E = P.second->use_end(); I != E; ++I) { + VPIteratorPair IPairRange = CandidatePairs.equal_range(*I); + + for (Value::use_iterator J = P.second->use_begin(); J != E; ++J) { + if (isSecondInIteratorPair(*J, IPairRange)) + ConnectedPairs.insert(VPPair(P, ValuePair(*I, *J))); + } + } + } + + // This function figures out which pairs are connected. Two pairs are + // connected if some output of the first pair forms an input to both members + // of the second pair. 
+ void BBVectorize::computeConnectedPairs( + std::multimap &CandidatePairs, + std::vector &PairableInsts, + std::multimap &ConnectedPairs) { + + for (std::vector::iterator PI = PairableInsts.begin(), + PE = PairableInsts.end(); PI != PE; ++PI) { + VPIteratorPair choiceRange = CandidatePairs.equal_range(*PI); + + for (std::multimap::iterator P = choiceRange.first; + P != choiceRange.second; ++P) + computePairsConnectedTo(CandidatePairs, PairableInsts, + ConnectedPairs, *P); + } + + DEBUG(dbgs() << "BBV: found " << ConnectedPairs.size() + << " pair connections.\n"); + } + + // This function builds a set of use tuples such that <A, B> is in the set + // if B is in the use tree of A. If B is in the use tree of A, then B + // depends on the output of A. + void BBVectorize::buildDepMap( + BasicBlock &BB, + std::multimap &CandidatePairs, + std::vector &PairableInsts, + DenseSet &PairableInstUsers) { + DenseSet IsInPair; + for (std::multimap::iterator C = CandidatePairs.begin(), + E = CandidatePairs.end(); C != E; ++C) { + IsInPair.insert(C->first); + IsInPair.insert(C->second); + } + + // Iterate through the basic block, recording all Users of each + // pairable instruction. + + BasicBlock::iterator E = BB.end(); + for (BasicBlock::iterator I = BB.getFirstInsertionPt(); I != E; ++I) { + if (IsInPair.find(I) == IsInPair.end()) continue; + + DenseSet Users; + AliasSetTracker WriteSet(*AA); + for (BasicBlock::iterator J = llvm::next(I); J != E; ++J) + (void) trackUsesOfI(Users, WriteSet, I, J); + + for (DenseSet::iterator U = Users.begin(), E = Users.end(); + U != E; ++U) + PairableInstUsers.insert(ValuePair(I, *U)); + } + } + + // Returns true if an input to pair P is an output of pair Q and also an + // input of pair Q is an output of pair P. If this is the case, then these + // two pairs cannot be simultaneously fused. 
+ bool BBVectorize::pairsConflict(ValuePair P, ValuePair Q, + DenseSet &PairableInstUsers, + std::multimap *PairableInstUserMap) { + // Two pairs are in conflict if they are mutual Users of each other. + bool QUsesP = PairableInstUsers.count(ValuePair(P.first, Q.first)) || + PairableInstUsers.count(ValuePair(P.first, Q.second)) || + PairableInstUsers.count(ValuePair(P.second, Q.first)) || + PairableInstUsers.count(ValuePair(P.second, Q.second)); + bool PUsesQ = PairableInstUsers.count(ValuePair(Q.first, P.first)) || + PairableInstUsers.count(ValuePair(Q.first, P.second)) || + PairableInstUsers.count(ValuePair(Q.second, P.first)) || + PairableInstUsers.count(ValuePair(Q.second, P.second)); + if (PairableInstUserMap) { + // FIXME: The expensive part of the cycle check is not so much the cycle + // check itself but this edge insertion procedure. This needs some + // profiling and probably a different data structure (same is true of + // most uses of std::multimap). + if (PUsesQ) { + VPPIteratorPair QPairRange = PairableInstUserMap->equal_range(Q); + if (!isSecondInIteratorPair(P, QPairRange)) + PairableInstUserMap->insert(VPPair(Q, P)); + } + if (QUsesP) { + VPPIteratorPair PPairRange = PairableInstUserMap->equal_range(P); + if (!isSecondInIteratorPair(Q, PPairRange)) + PairableInstUserMap->insert(VPPair(P, Q)); + } + } + + return (QUsesP && PUsesQ); + } + + // This function walks the use graph of current pairs to see if, starting + // from P, the walk returns to P. + bool BBVectorize::pairWillFormCycle(ValuePair P, + std::multimap &PairableInstUserMap, + DenseSet &CurrentPairs) { + DEBUG(if (DebugCycleCheck) + dbgs() << "BBV: starting cycle check for : " << *P.first << " <-> " + << *P.second << "\n"); + // A lookup table of visited pairs is kept because the PairableInstUserMap + // contains non-direct associations. 
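To make the notion of "connected" in the comment just above concrete, here is a rough standalone sketch on a toy model. It assumes instructions are plain ints and that a hypothetical UseMap lists the users of each one. The name pairsConnectedTo and the container choices are illustrative only and do not come from the patch, and the splat cases are deliberately left out.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

typedef std::pair<int, int> Pair;                  // toy stand-in for ValuePair
typedef std::map<int, std::vector<int> > UseMap;   // value -> its users

// Returns the candidate pairs whose two members consume an output of
// P.first and an output of P.second respectively (in either order).
// Splatting (both members using the same element of P) is not modelled.
std::set<Pair> pairsConnectedTo(const std::set<Pair> &CandidatePairs,
                                const UseMap &Users, Pair P) {
  std::set<Pair> Connected;
  UseMap::const_iterator UI = Users.find(P.first);
  UseMap::const_iterator UJ = Users.find(P.second);
  if (UI == Users.end() || UJ == Users.end()) return Connected;

  for (std::size_t a = 0; a < UI->second.size(); ++a) {
    for (std::size_t b = 0; b < UJ->second.size(); ++b) {
      int I = UI->second[a], J = UJ->second[b];
      if (CandidatePairs.count(Pair(I, J))) Connected.insert(Pair(I, J));
      if (CandidatePairs.count(Pair(J, I))) Connected.insert(Pair(J, I));
    }
  }
  return Connected;
}
```

For instance, if 3 uses 1 and 4 uses 2, and both (1,2) and (3,4) are candidate pairs, then (3,4) is connected to (1,2). This is the relationship the pass follows when it builds chains of pairable operations.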
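As an illustration of the conflict rule described in the comment just above, here is a small sketch on a toy model. It assumes int instruction ids and a std::set named Deps that plays the role of PairableInstUsers, holding (A, B) whenever B is in the use tree of A. The names usesAny and pairsConflictSketch are hypothetical, and the optional user-map bookkeeping is not modelled.

```cpp
#include <set>
#include <utility>

typedef std::pair<int, int> Pair;  // toy stand-in for ValuePair

// True if any member of User depends on any member of Def.
bool usesAny(const std::set<Pair> &Deps, Pair Def, Pair User) {
  return Deps.count(Pair(Def.first, User.first)) ||
         Deps.count(Pair(Def.first, User.second)) ||
         Deps.count(Pair(Def.second, User.first)) ||
         Deps.count(Pair(Def.second, User.second));
}

// Two pairs conflict when each one depends on the other, so that no
// ordering of the two fused instructions can satisfy both dependencies.
bool pairsConflictSketch(const std::set<Pair> &Deps, Pair P, Pair Q) {
  return usesAny(Deps, P, Q) && usesAny(Deps, Q, P);
}
```

Take instructions a, x, y, b numbered 1 to 4, where x uses a and b uses y, so Deps holds (1,2) and (3,4). Then pairs (1,4) and (2,3) conflict, while with only (1,2) in Deps they do not.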
+ DenseSet Visited; + std::vector Q; + // General depth-first post-order traversal: + Q.push_back(P); + while (!Q.empty()) { + ValuePair QTop = Q.back(); + + Visited.insert(QTop); + Q.pop_back(); + + DEBUG(if (DebugCycleCheck) + dbgs() << "BBV: cycle check visiting: " << *QTop.first << " <-> " + << *QTop.second << "\n"); + VPPIteratorPair QPairRange = PairableInstUserMap.equal_range(QTop); + for (std::multimap::iterator C = QPairRange.first; + C != QPairRange.second; ++C) { + if (C->second == P) { + DEBUG(dbgs() + << "BBV: rejected to prevent non-trivial cycle formation: " + << *C->first.first << " <-> " << *C->first.second << "\n"); + return true; + } + + if (CurrentPairs.count(C->second) > 0 && + Visited.count(C->second) == 0) + Q.push_back(C->second); + } + } + + return false; + } + + // This function builds the initial tree of connected pairs with the + // pair J at the root. + void BBVectorize::buildInitialTreeFor( + std::multimap &CandidatePairs, + std::vector &PairableInsts, + std::multimap &ConnectedPairs, + DenseSet &PairableInstUsers, + DenseMap &ChosenPairs, + DenseMap &Tree, ValuePair J) { + // Each of these pairs is viewed as the root node of a Tree. The Tree + // is then walked (depth-first). As this happens, we keep track of + // the pairs that compose the Tree and the maximum depth of the Tree. 
+ std::vector Q; + // General depth-first post-order traversal: + Q.push_back(ValuePairWithDepth(J, getDepthFactor(J.first))); + while (!Q.empty()) { + ValuePairWithDepth QTop = Q.back(); + + // Push each child onto the queue: + bool MoreChildren = false; + size_t MaxChildDepth = QTop.second; + VPPIteratorPair qtRange = ConnectedPairs.equal_range(QTop.first); + for (std::map::iterator k = qtRange.first; + k != qtRange.second; ++k) { + // Make sure that this child pair is still a candidate: + bool IsStillCand = false; + VPIteratorPair checkRange = + CandidatePairs.equal_range(k->second.first); + for (std::multimap::iterator m = checkRange.first; + m != checkRange.second; ++m) { + if (m->second == k->second.second) { + IsStillCand = true; + break; + } + } + + if (IsStillCand) { + DenseMap::iterator C = Tree.find(k->second); + if (C == Tree.end()) { + size_t d = getDepthFactor(k->second.first); + Q.push_back(ValuePairWithDepth(k->second, QTop.second+d)); + MoreChildren = true; + } else { + MaxChildDepth = std::max(MaxChildDepth, C->second); + } + } + } + + if (!MoreChildren) { + // Record the current pair as part of the Tree: + Tree.insert(ValuePairWithDepth(QTop.first, MaxChildDepth)); + Q.pop_back(); + } + } + } + + // Given some initial tree, prune it by removing conflicting pairs (pairs + // that cannot be simultaneously chosen for vectorization). + void BBVectorize::pruneTreeFor( + std::multimap &CandidatePairs, + std::vector &PairableInsts, + std::multimap &ConnectedPairs, + DenseSet &PairableInstUsers, + std::multimap &PairableInstUserMap, + DenseMap &ChosenPairs, + DenseMap &Tree, + DenseSet &PrunedTree, ValuePair J, + bool UseCycleCheck) { + std::vector Q; + // General depth-first post-order traversal: + Q.push_back(ValuePairWithDepth(J, getDepthFactor(J.first))); + while (!Q.empty()) { + ValuePairWithDepth QTop = Q.back(); + PrunedTree.insert(QTop.first); + Q.pop_back(); + + // Visit each child, pruning as necessary... 
+ DenseMap<ValuePair, size_t> BestChilden; + VPPIteratorPair QTopRange = ConnectedPairs.equal_range(QTop.first); + for (std::map<ValuePair, ValuePair>::iterator K = QTopRange.first; + K != QTopRange.second; ++K) { + DenseMap<ValuePair, size_t>::iterator C = Tree.find(K->second); + if (C == Tree.end()) continue; + + // This child is in the Tree, now we need to make sure it is the + // best of any conflicting children. There could be multiple + // conflicting children, so first, determine if we're keeping + // this child, then delete conflicting children as necessary. + + // It is also necessary to guard against pairing-induced + // dependencies. Consider instructions a .. x .. y .. b + // such that (a,b) are to be fused and (x,y) are to be fused + // but a is an input to x and b is an output from y. This + // means that y cannot be moved after b but x must be moved + // after b for (a,b) to be fused. In other words, after + // fusing (a,b) we have y .. a/b .. x where y is an input + // to a/b and x is an output to a/b: x and y can no longer + // be legally fused. To prevent this condition, we must + // make sure that a child pair added to the Tree is not + // both an input and output of an already-selected pair. + + // Pairing-induced dependencies can also form from more complicated + // cycles. The pair vs. pair conflicts are easy to check, and so + // that is done explicitly for "fast rejection", and because for + // child vs. child conflicts, we may prefer to keep the current + // pair in preference to the already-selected child. + DenseSet<ValuePair> CurrentPairs; + + bool CanAdd = true; + for (DenseMap<ValuePair, size_t>::iterator C2 + = BestChilden.begin(), E2 = BestChilden.end(); + C2 != E2; ++C2) { + if (C2->first.first == C->first.first || + C2->first.first == C->first.second || + C2->first.second == C->first.first || + C2->first.second == C->first.second || + pairsConflict(C2->first, C->first, PairableInstUsers, + UseCycleCheck ?
&PairableInstUserMap : 0)) { + if (C2->second >= C->second) { + CanAdd = false; + break; + } + + CurrentPairs.insert(C2->first); + } + } + if (!CanAdd) continue; + + // Even worse, this child could conflict with another node already + // selected for the Tree. If that is the case, ignore this child. + for (DenseSet<ValuePair>::iterator T = PrunedTree.begin(), + E2 = PrunedTree.end(); T != E2; ++T) { + if (T->first == C->first.first || + T->first == C->first.second || + T->second == C->first.first || + T->second == C->first.second || + pairsConflict(*T, C->first, PairableInstUsers, + UseCycleCheck ? &PairableInstUserMap : 0)) { + CanAdd = false; + break; + } + + CurrentPairs.insert(*T); + } + if (!CanAdd) continue; + + // And check the queue too... + for (std::vector<ValuePairWithDepth>::iterator C2 = Q.begin(), + E2 = Q.end(); C2 != E2; ++C2) { + if (C2->first.first == C->first.first || + C2->first.first == C->first.second || + C2->first.second == C->first.first || + C2->first.second == C->first.second || + pairsConflict(C2->first, C->first, PairableInstUsers, + UseCycleCheck ? &PairableInstUserMap : 0)) { + CanAdd = false; + break; + } + + CurrentPairs.insert(C2->first); + } + if (!CanAdd) continue; + + // Last but not least, check for a conflict with any of the + // already-chosen pairs. + for (DenseMap<Value *, Value *>::iterator C2 = + ChosenPairs.begin(), E2 = ChosenPairs.end(); + C2 != E2; ++C2) { + if (pairsConflict(*C2, C->first, PairableInstUsers, + UseCycleCheck ? &PairableInstUserMap : 0)) { + CanAdd = false; + break; + } + + CurrentPairs.insert(*C2); + } + if (!CanAdd) continue; + + // To check for non-trivial cycles formed by the addition of the + // current pair we've formed a list of all relevant pairs, now use a + // graph walk to check for a cycle. We start from the current pair and + // walk the use tree to see if we again reach the current pair. If we + // do, then the current pair is rejected.
+ + // FIXME: It may be more efficient to use a topological-ordering + // algorithm to improve the cycle check. This should be investigated. + if (UseCycleCheck && + pairWillFormCycle(C->first, PairableInstUserMap, CurrentPairs)) + continue; + + // This child can be added, but we may have chosen it in preference + // to an already-selected child. Check for this here, and if a + // conflict is found, then remove the previously-selected child + // before adding this one in its place. + for (DenseMap<ValuePair, size_t>::iterator C2 + = BestChilden.begin(); C2 != BestChilden.end();) { + if (C2->first.first == C->first.first || + C2->first.first == C->first.second || + C2->first.second == C->first.first || + C2->first.second == C->first.second || + pairsConflict(C2->first, C->first, PairableInstUsers)) + BestChilden.erase(C2++); + else + ++C2; + } + + BestChilden.insert(ValuePairWithDepth(C->first, C->second)); + } + + for (DenseMap<ValuePair, size_t>::iterator C + = BestChilden.begin(), E2 = BestChilden.end(); + C != E2; ++C) { + size_t DepthF = getDepthFactor(C->first.first); + Q.push_back(ValuePairWithDepth(C->first, QTop.second+DepthF)); + } + } + } + + // This function finds the best tree of mututally-compatible connected + // pairs, given the choice of root pairs as an iterator range. + void BBVectorize::findBestTreeFor( + std::multimap<Value *, Value *> &CandidatePairs, + std::vector<Value *> &PairableInsts, + std::multimap<ValuePair, ValuePair> &ConnectedPairs, + DenseSet<ValuePair> &PairableInstUsers, + std::multimap<ValuePair, ValuePair> &PairableInstUserMap, + DenseMap<Value *, Value *> &ChosenPairs, + DenseSet<ValuePair> &BestTree, size_t &BestMaxDepth, + size_t &BestEffSize, VPIteratorPair ChoiceRange, + bool UseCycleCheck) { + for (std::multimap<Value *, Value *>::iterator J = ChoiceRange.first; + J != ChoiceRange.second; ++J) { + + // Before going any further, make sure that this pair does not + // conflict with any already-selected pairs (see comment below + // near the Tree pruning for more details).
+ DenseSet<ValuePair> ChosenPairSet; + bool DoesConflict = false; + for (DenseMap<Value *, Value *>::iterator C = ChosenPairs.begin(), + E = ChosenPairs.end(); C != E; ++C) { + if (pairsConflict(*C, *J, PairableInstUsers, + UseCycleCheck ? &PairableInstUserMap : 0)) { + DoesConflict = true; + break; + } + + ChosenPairSet.insert(*C); + } + if (DoesConflict) continue; + + if (UseCycleCheck && + pairWillFormCycle(*J, PairableInstUserMap, ChosenPairSet)) + continue; + + DenseMap<ValuePair, size_t> Tree; + buildInitialTreeFor(CandidatePairs, PairableInsts, ConnectedPairs, + PairableInstUsers, ChosenPairs, Tree, *J); + + // Because we'll keep the child with the largest depth, the largest + // depth is still the same in the unpruned Tree. + size_t MaxDepth = Tree.lookup(*J); + + DEBUG(if (DebugPairSelection) dbgs() << "BBV: found Tree for pair {" + << *J->first << " <-> " << *J->second << "} of depth " << + MaxDepth << " and size " << Tree.size() << "\n"); + + // At this point the Tree has been constructed, but, may contain + // contradictory children (meaning that different children of + // some tree node may be attempting to fuse the same instruction). + // So now we walk the tree again, in the case of a conflict, + // keep only the child with the largest depth. To break a tie, + // favor the first child.
+ + DenseSet<ValuePair> PrunedTree; + pruneTreeFor(CandidatePairs, PairableInsts, ConnectedPairs, + PairableInstUsers, PairableInstUserMap, ChosenPairs, Tree, + PrunedTree, *J, UseCycleCheck); + + size_t EffSize = 0; + for (DenseSet<ValuePair>::iterator S = PrunedTree.begin(), + E = PrunedTree.end(); S != E; ++S) + EffSize += getDepthFactor(S->first); + + DEBUG(if (DebugPairSelection) + dbgs() << "BBV: found pruned Tree for pair {" + << *J->first << " <-> " << *J->second << "} of depth " << + MaxDepth << " and size " << PrunedTree.size() << + " (effective size: " << EffSize << ")\n"); + if (MaxDepth >= ReqChainDepth && EffSize > BestEffSize) { + BestMaxDepth = MaxDepth; + BestEffSize = EffSize; + BestTree = PrunedTree; + } + } + } + + // Given the list of candidate pairs, this function selects those + // that will be fused into vector instructions. + void BBVectorize::choosePairs( + std::multimap<Value *, Value *> &CandidatePairs, + std::vector<Value *> &PairableInsts, + std::multimap<ValuePair, ValuePair> &ConnectedPairs, + DenseSet<ValuePair> &PairableInstUsers, + DenseMap<Value *, Value *>& ChosenPairs) { + bool UseCycleCheck = CandidatePairs.size() <= MaxCandPairsForCycleCheck; + std::multimap<ValuePair, ValuePair> PairableInstUserMap; + for (std::vector<Value *>::iterator I = PairableInsts.begin(), + E = PairableInsts.end(); I != E; ++I) { + // The number of possible pairings for this variable: + size_t NumChoices = CandidatePairs.count(*I); + if (!NumChoices) continue; + + VPIteratorPair ChoiceRange = CandidatePairs.equal_range(*I); + + // The best pair to choose and its tree: + size_t BestMaxDepth = 0, BestEffSize = 0; + DenseSet<ValuePair> BestTree; + findBestTreeFor(CandidatePairs, PairableInsts, ConnectedPairs, + PairableInstUsers, PairableInstUserMap, ChosenPairs, + BestTree, BestMaxDepth, BestEffSize, ChoiceRange, + UseCycleCheck); + + // A tree has been chosen (or not) at this point. If no tree was + // chosen, then this instruction, I, cannot be paired (and is no longer + // considered).
+ + DEBUG(if (BestTree.size() > 0) + dbgs() << "BBV: selected pairs in the best tree for: " + << *cast<Instruction>(*I) << "\n"); + + for (DenseSet<ValuePair>::iterator S = BestTree.begin(), + SE2 = BestTree.end(); S != SE2; ++S) { + // Insert the members of this tree into the list of chosen pairs. + ChosenPairs.insert(ValuePair(S->first, S->second)); + DEBUG(dbgs() << "BBV: selected pair: " << *S->first << " <-> " << + *S->second << "\n"); + + // Remove all candidate pairs that have values in the chosen tree. + for (std::multimap<Value *, Value *>::iterator K = + CandidatePairs.begin(); K != CandidatePairs.end();) { + if (K->first == S->first || K->second == S->first || + K->second == S->second || K->first == S->second) { + // Don't remove the actual pair chosen so that it can be used + // in subsequent tree selections. + if (!(K->first == S->first && K->second == S->second)) + CandidatePairs.erase(K++); + else + ++K; + } else { + ++K; + } + } + } + } + + DEBUG(dbgs() << "BBV: selected " << ChosenPairs.size() << " pairs.\n"); + } + + std::string getReplacementName(Instruction *I, bool IsInput, unsigned o, + unsigned n = 0) { + if (!I->hasName()) + return ""; + + return (I->getName() + (IsInput ? ".v.i" : ".v.r") + utostr(o) + + (n > 0 ? "." + utostr(n) : "")).str(); + } + + // Returns the value that is to be used as the pointer input to the vector + // instruction that fuses I with J. + Value *BBVectorize::getReplacementPointerInput(LLVMContext& Context, + Instruction *I, Instruction *J, unsigned o, + bool &FlipMemInputs) { + Value *IPtr, *JPtr; + unsigned IAlignment, JAlignment; + int64_t OffsetInElmts; + (void) getPairPtrInfo(I, J, IPtr, JPtr, IAlignment, JAlignment, + OffsetInElmts); + + // The pointer value is taken to be the one with the lowest offset.
+ Value *VPtr; + if (OffsetInElmts > 0) { + VPtr = IPtr; + } else { + FlipMemInputs = true; + VPtr = JPtr; + } + + Type *ArgType = cast<PointerType>(IPtr->getType())->getElementType(); + Type *VArgType = getVecTypeForPair(ArgType); + Type *VArgPtrType = PointerType::get(VArgType, + cast<PointerType>(IPtr->getType())->getAddressSpace()); + return new BitCastInst(VPtr, VArgPtrType, getReplacementName(I, true, o), + /* insert before */ FlipMemInputs ? J : I); + } + + void BBVectorize::fillNewShuffleMask(LLVMContext& Context, Instruction *J, + unsigned NumElem, unsigned MaskOffset, unsigned NumInElem, + unsigned IdxOffset, std::vector<Constant*> &Mask) { + for (unsigned v = 0; v < NumElem/2; ++v) { + int m = cast<ShuffleVectorInst>(J)->getMaskValue(v); + if (m < 0) { + Mask[v+MaskOffset] = UndefValue::get(Type::getInt32Ty(Context)); + } else { + unsigned mm = m + (int) IdxOffset; + if (m >= (int) NumInElem) + mm += (int) NumInElem; + + Mask[v+MaskOffset] = + ConstantInt::get(Type::getInt32Ty(Context), mm); + } + } + } + + // Returns the value that is to be used as the vector-shuffle mask to the + // vector instruction that fuses I with J. + Value *BBVectorize::getReplacementShuffleMask(LLVMContext& Context, + Instruction *I, Instruction *J) { + // This is the shuffle mask. We need to append the second + // mask to the first, and the numbers need to be adjusted. + + Type *ArgType = I->getType(); + Type *VArgType = getVecTypeForPair(ArgType); + + // Get the total number of elements in the fused vector type. + // By definition, this must equal the number of elements in + // the final mask. + unsigned NumElem = cast<VectorType>(VArgType)->getNumElements(); + std::vector<Constant*> Mask(NumElem); + + Type *OpType = I->getOperand(0)->getType(); + unsigned NumInElem = cast<VectorType>(OpType)->getNumElements(); + + // For the mask from the first pair... + fillNewShuffleMask(Context, I, NumElem, 0, NumInElem, 0, Mask); + + // For the mask from the second pair...
+ fillNewShuffleMask(Context, J, NumElem, NumElem/2, NumInElem, NumInElem, + Mask); + + return ConstantVector::get(Mask); + } + + // Returns the value to be used as the specified operand of the vector + // instruction that fuses I with J. + Value *BBVectorize::getReplacementInput(LLVMContext& Context, Instruction *I, + Instruction *J, unsigned o, bool FlipMemInputs) { + Value *CV0 = ConstantInt::get(Type::getInt32Ty(Context), 0); + Value *CV1 = ConstantInt::get(Type::getInt32Ty(Context), 1); + + // Compute the fused vector type for this operand + Type *ArgType = I->getOperand(o)->getType(); + VectorType *VArgType = getVecTypeForPair(ArgType); + + Instruction *L = I, *H = J; + if (FlipMemInputs) { + L = J; + H = I; + } + + if (ArgType->isVectorTy()) { + unsigned numElem = cast<VectorType>(VArgType)->getNumElements(); + std::vector<Constant*> Mask(numElem); + for (unsigned v = 0; v < numElem; ++v) + Mask[v] = ConstantInt::get(Type::getInt32Ty(Context), v); + + Instruction *BV = new ShuffleVectorInst(L->getOperand(o), + H->getOperand(o), + ConstantVector::get(Mask), + getReplacementName(I, true, o)); + BV->insertBefore(J); + return BV; + } + + // If these two inputs are the output of another vector instruction, + // then we should use that output directly. It might be necessary to + // permute it first. [When pairings are fused recursively, you can + // end up with cases where a large vector is decomposed into scalars + // using extractelement instructions, then built into size-2 + // vectors using insertelement and the into larger vectors using + // shuffles. InstCombine does not simplify all of these cases well, + // and so we make sure that shuffles are generated here when possible.
+ ExtractElementInst *LEE + = dyn_cast<ExtractElementInst>(L->getOperand(o)); + ExtractElementInst *HEE + = dyn_cast<ExtractElementInst>(H->getOperand(o)); + + if (LEE && HEE && + LEE->getOperand(0)->getType() == HEE->getOperand(0)->getType()) { + VectorType *EEType = cast<VectorType>(LEE->getOperand(0)->getType()); + unsigned LowIndx = cast<ConstantInt>(LEE->getOperand(1))->getZExtValue(); + unsigned HighIndx = cast<ConstantInt>(HEE->getOperand(1))->getZExtValue(); + if (LEE->getOperand(0) == HEE->getOperand(0)) { + if (LowIndx == 0 && HighIndx == 1) + return LEE->getOperand(0); + + std::vector<Constant*> Mask(2); + Mask[0] = ConstantInt::get(Type::getInt32Ty(Context), LowIndx); + Mask[1] = ConstantInt::get(Type::getInt32Ty(Context), HighIndx); + + Instruction *BV = new ShuffleVectorInst(LEE->getOperand(0), + UndefValue::get(EEType), + ConstantVector::get(Mask), + getReplacementName(I, true, o)); + BV->insertBefore(J); + return BV; + } + + std::vector<Constant*> Mask(2); + HighIndx += EEType->getNumElements(); + Mask[0] = ConstantInt::get(Type::getInt32Ty(Context), LowIndx); + Mask[1] = ConstantInt::get(Type::getInt32Ty(Context), HighIndx); + + Instruction *BV = new ShuffleVectorInst(LEE->getOperand(0), + HEE->getOperand(0), + ConstantVector::get(Mask), + getReplacementName(I, true, o)); + BV->insertBefore(J); + return BV; + } + + Instruction *BV1 = InsertElementInst::Create( + UndefValue::get(VArgType), + L->getOperand(o), CV0, + getReplacementName(I, true, o, 1)); + BV1->insertBefore(I); + Instruction *BV2 = InsertElementInst::Create(BV1, H->getOperand(o), + CV1, + getReplacementName(I, true, o, 2)); + BV2->insertBefore(J); + return BV2; + } + + // This function creates an array of values that will be used as the inputs + // to the vector instruction that fuses I with J.
+ void BBVectorize::getReplacementInputsForPair(LLVMContext& Context, + Instruction *I, Instruction *J, + SmallVector<Value *, 3> &ReplacedOperands, + bool &FlipMemInputs) { + FlipMemInputs = false; + unsigned NumOperands = I->getNumOperands(); + + for (unsigned p = 0, o = NumOperands-1; p < NumOperands; ++p, --o) { + // Iterate backward so that we look at the store pointer + // first and know whether or not we need to flip the inputs. + + if (isa<LoadInst>(I) || (o == 1 && isa<StoreInst>(I))) { + // This is the pointer for a load/store instruction. + ReplacedOperands[o] = getReplacementPointerInput(Context, I, J, o, + FlipMemInputs); + continue; + } else if (isa<CallInst>(I) && o == NumOperands-1) { + Function *F = cast<CallInst>(I)->getCalledFunction(); + unsigned IID = F->getIntrinsicID(); + BasicBlock &BB = *I->getParent(); + + Module *M = BB.getParent()->getParent(); + Type *ArgType = I->getType(); + Type *VArgType = getVecTypeForPair(ArgType); + + // FIXME: is it safe to do this here? + ReplacedOperands[o] = Intrinsic::getDeclaration(M, + (Intrinsic::ID) IID, VArgType); + continue; + } else if (isa<ShuffleVectorInst>(I) && o == NumOperands-1) { + ReplacedOperands[o] = getReplacementShuffleMask(Context, I, J); + continue; + } + + ReplacedOperands[o] = + getReplacementInput(Context, I, J, o, FlipMemInputs); + } + } + + // This function creates two values that represent the outputs of the + // original I and J instructions. These are generally vector shuffles + // or extracts. In many cases, these will end up being unused and, thus, + // eliminated by later passes.
+ void BBVectorize::replaceOutputsOfPair(LLVMContext& Context, Instruction *I, + Instruction *J, Instruction *K, + Instruction *&InsertionPt, + Instruction *&K1, Instruction *&K2, + bool &FlipMemInputs) { + Value *CV0 = ConstantInt::get(Type::getInt32Ty(Context), 0); + Value *CV1 = ConstantInt::get(Type::getInt32Ty(Context), 1); + + if (isa<StoreInst>(I)) { + AA->replaceWithNewValue(I, K); + AA->replaceWithNewValue(J, K); + } else { + Type *IType = I->getType(); + Type *VType = getVecTypeForPair(IType); + + if (IType->isVectorTy()) { + unsigned numElem = cast<VectorType>(IType)->getNumElements(); + std::vector<Constant*> Mask1(numElem), Mask2(numElem); + for (unsigned v = 0; v < numElem; ++v) { + Mask1[v] = ConstantInt::get(Type::getInt32Ty(Context), v); + Mask2[v] = ConstantInt::get(Type::getInt32Ty(Context), numElem+v); + } + + K1 = new ShuffleVectorInst(K, UndefValue::get(VType), + ConstantVector::get( + FlipMemInputs ? Mask2 : Mask1), + getReplacementName(K, false, 1)); + K2 = new ShuffleVectorInst(K, UndefValue::get(VType), + ConstantVector::get( + FlipMemInputs ? Mask1 : Mask2), + getReplacementName(K, false, 2)); + } else { + K1 = ExtractElementInst::Create(K, FlipMemInputs ? CV1 : CV0, + getReplacementName(K, false, 1)); + K2 = ExtractElementInst::Create(K, FlipMemInputs ? CV0 : CV1, + getReplacementName(K, false, 2)); + } + + K1->insertAfter(K); + K2->insertAfter(K1); + InsertionPt = K2; + } + } + + // Move all uses of the function I (including pairing-induced uses) after J. + bool BBVectorize::canMoveUsesOfIAfterJ(BasicBlock &BB, + std::multimap<Value *, Value *> &LoadMoveSet, + Instruction *I, Instruction *J) { + // Skip to the first instruction past I.
+ BasicBlock::iterator L = BB.begin(); + for (; cast<Instruction>(L) != I; ++L); + ++L; + + DenseSet<Value *> Users; + AliasSetTracker WriteSet(*AA); + for (; cast<Instruction>(L) != J; ++L) + (void) trackUsesOfI(Users, WriteSet, I, L, true, &LoadMoveSet); + + assert(cast<Instruction>(L) == J && + "Tracking has not proceeded far enough to check for dependencies"); + // If J is now in the use set of I, then trackUsesOfI will return true + // and we have a dependency cycle (and the fusing operation must abort). + return !trackUsesOfI(Users, WriteSet, I, J, true, &LoadMoveSet); + } + + // Move all uses of the function I (including pairing-induced uses) after J. + void BBVectorize::moveUsesOfIAfterJ(BasicBlock &BB, + std::multimap<Value *, Value *> &LoadMoveSet, + Instruction *&InsertionPt, + Instruction *I, Instruction *J) { + // Skip to the first instruction past I. + BasicBlock::iterator L = BB.begin(); + for (; cast<Instruction>(L) != I; ++L); + ++L; + + DenseSet<Value *> Users; + AliasSetTracker WriteSet(*AA); + for (; cast<Instruction>(L) != J;) { + if (trackUsesOfI(Users, WriteSet, I, L, true, &LoadMoveSet)) { + // Move this instruction + Instruction *InstToMove = L; ++L; + + DEBUG(dbgs() << "BBV: moving: " << *InstToMove << + " to after " << *InsertionPt << "\n"); + InstToMove->removeFromParent(); + InstToMove->insertAfter(InsertionPt); + InsertionPt = InstToMove; + } else { + ++L; + } + } + } + + // Collect all load instruction that are in the move set of a given first + // pair member. These loads depend on the first instruction, I, and so need + // to be moved after J (the second instruction) when the pair is fused. + void BBVectorize::collectPairLoadMoveSet(BasicBlock &BB, + DenseMap<Value *, Value *> &ChosenPairs, + std::multimap<Value *, Value *> &LoadMoveSet, + Instruction *I) { + // Skip to the first instruction past I. + BasicBlock::iterator L = BB.begin(); + for (; cast<Instruction>(L) != I; ++L); + ++L; + + DenseSet<Value *> Users; + AliasSetTracker WriteSet(*AA); + + // Note: We cannot end the loop when we reach J because J could be moved + // farther down the use chain by another instruction pairing.
Also, J + // could be before I if this is an inverted input. + for (BasicBlock::iterator E = BB.end(); cast<Instruction>(L) != E; ++L) { + if (trackUsesOfI(Users, WriteSet, I, L)) { + if (L->mayReadFromMemory()) + LoadMoveSet.insert(ValuePair(L, I)); + } + } + } + + // In cases where both load/stores and the computation of their pointers + // are chosen for vectorization, we can end up in a situation where the + // aliasing analysis starts returning different query results as the + // process of fusing instruction pairs continues. Because the algorithm + // relies on finding the same use trees here as were found earlier, we'll + // need to precompute the necessary aliasing information here and then + // manually update it during the fusion process. + void BBVectorize::collectLoadMoveSet(BasicBlock &BB, + std::vector<Value *> &PairableInsts, + DenseMap<Value *, Value *> &ChosenPairs, + std::multimap<Value *, Value *> &LoadMoveSet) { + for (std::vector<Value *>::iterator PI = PairableInsts.begin(), + PIE = PairableInsts.end(); PI != PIE; ++PI) { + DenseMap<Value *, Value *>::iterator P = ChosenPairs.find(*PI); + if (P == ChosenPairs.end()) continue; + + Instruction *I = cast<Instruction>(P->first); + collectPairLoadMoveSet(BB, ChosenPairs, LoadMoveSet, I); + } + } + + // This function fuses the chosen instruction pairs into vector instructions, + // taking care preserve any needed scalar outputs and, then, it reorders the + // remaining instructions as needed (users of the first member of the pair + // need to be moved to after the location of the second member of the pair + // because the vector instruction is inserted in the location of the pair's + // second member). + void BBVectorize::fuseChosenPairs(BasicBlock &BB, + std::vector<Value *> &PairableInsts, + DenseMap<Value *, Value *> &ChosenPairs) { + LLVMContext& Context = BB.getContext(); + + // During the vectorization process, the order of the pairs to be fused + // could be flipped. So we'll add each pair, flipped, into the ChosenPairs + // list. After a pair is fused, the flipped pair is removed from the list.
+ std::vector<ValuePair> FlippedPairs; + FlippedPairs.reserve(ChosenPairs.size()); + for (DenseMap<Value *, Value *>::iterator P = ChosenPairs.begin(), + E = ChosenPairs.end(); P != E; ++P) + FlippedPairs.push_back(ValuePair(P->second, P->first)); + for (std::vector<ValuePair>::iterator P = FlippedPairs.begin(), + E = FlippedPairs.end(); P != E; ++P) + ChosenPairs.insert(*P); + + std::multimap<Value *, Value *> LoadMoveSet; + collectLoadMoveSet(BB, PairableInsts, ChosenPairs, LoadMoveSet); + + DEBUG(dbgs() << "BBV: initial: \n" << BB << "\n"); + + for (BasicBlock::iterator PI = BB.getFirstInsertionPt(); PI != BB.end();) { + DenseMap<Value *, Value *>::iterator P = ChosenPairs.find(PI); + if (P == ChosenPairs.end()) { + ++PI; + continue; + } + + if (getDepthFactor(P->first) == 0) { + // These instructions are not really fused, but are tracked as though + // they are. Any case in which it would be interesting to fuse them + // will be taken care of by InstCombine. + --NumFusedOps; + ++PI; + continue; + } + + Instruction *I = cast<Instruction>(P->first), + *J = cast<Instruction>(P->second); + + DEBUG(dbgs() << "BBV: fusing: " << *I << + " <-> " << *J << "\n"); + + // Remove the pair and flipped pair from the list. + DenseMap<Value *, Value *>::iterator FP = ChosenPairs.find(P->second); + assert(FP != ChosenPairs.end() && "Flipped pair not found in list"); + ChosenPairs.erase(FP); + ChosenPairs.erase(P); + + if (!canMoveUsesOfIAfterJ(BB, LoadMoveSet, I, J)) { + DEBUG(dbgs() << "BBV: fusion of: " << *I << + " <-> " << *J << + " aborted because of non-trivial dependency cycle\n"); + --NumFusedOps; + ++PI; + continue; + } + + bool FlipMemInputs; + unsigned NumOperands = I->getNumOperands(); + SmallVector<Value *, 3> ReplacedOperands(NumOperands); + getReplacementInputsForPair(Context, I, J, ReplacedOperands, + FlipMemInputs); + + // Make a copy of the original operation, change its type to the vector + // type and replace its operands with the vector operands.
+ Instruction *K = I->clone(); + if (I->hasName()) K->takeName(I); + + if (!isa<StoreInst>(K)) + K->mutateType(getVecTypeForPair(I->getType())); + + for (unsigned o = 0; o < NumOperands; ++o) + K->setOperand(o, ReplacedOperands[o]); + + // If we've flipped the memory inputs, make sure that we take the correct + // alignment. + if (FlipMemInputs) { + if (isa<StoreInst>(K)) + cast<StoreInst>(K)->setAlignment(cast<StoreInst>(J)->getAlignment()); + else + cast<LoadInst>(K)->setAlignment(cast<LoadInst>(J)->getAlignment()); + } + + K->insertAfter(J); + + // Instruction insertion point: + Instruction *InsertionPt = K; + Instruction *K1 = 0, *K2 = 0; + replaceOutputsOfPair(Context, I, J, K, InsertionPt, K1, K2, + FlipMemInputs); + + // The use tree of the first original instruction must be moved to after + // the location of the second instruction. The entire use tree of the + // first instruction is disjoint from the input tree of the second + // (by definition), and so commutes with it. + + moveUsesOfIAfterJ(BB, LoadMoveSet, InsertionPt, I, J); + + if (!isa<StoreInst>(I)) { + I->replaceAllUsesWith(K1); + J->replaceAllUsesWith(K2); + AA->replaceWithNewValue(I, K1); + AA->replaceWithNewValue(J, K2); + } + + // Instructions that may read from memory may be in the load move set. + // Once an instruction is fused, we no longer need its move set, and so + // the values of the map never need to be updated. However, when a load + // is fused, we need to merge the entries from both instructions in the + // pair in case those instructions were in the move set of some other + // yet-to-be-fused pair. The loads in question are the keys of the map.
+ if (I->mayReadFromMemory()) { + std::vector<ValuePair> NewSetMembers; + VPIteratorPair IPairRange = LoadMoveSet.equal_range(I); + VPIteratorPair JPairRange = LoadMoveSet.equal_range(J); + for (std::multimap<Value *, Value *>::iterator N = IPairRange.first; + N != IPairRange.second; ++N) + NewSetMembers.push_back(ValuePair(K, N->second)); + for (std::multimap<Value *, Value *>::iterator N = JPairRange.first; + N != JPairRange.second; ++N) + NewSetMembers.push_back(ValuePair(K, N->second)); + for (std::vector<ValuePair>::iterator A = NewSetMembers.begin(), + AE = NewSetMembers.end(); A != AE; ++A) + LoadMoveSet.insert(*A); + } + + // Before removing I, set the iterator to the next instruction. + PI = llvm::next(BasicBlock::iterator(I)); + if (cast<Instruction>(PI) == J) + ++PI; + + SE->forgetValue(I); + SE->forgetValue(J); + I->eraseFromParent(); + J->eraseFromParent(); + } + + DEBUG(dbgs() << "BBV: final: \n" << BB << "\n"); + } +} + +char BBVectorize::ID = 0; +static const char bb_vectorize_name[] = "Basic-Block Vectorization"; +INITIALIZE_PASS_BEGIN(BBVectorize, BBV_NAME, bb_vectorize_name, false, false) +INITIALIZE_AG_DEPENDENCY(AliasAnalysis) +INITIALIZE_PASS_DEPENDENCY(ScalarEvolution) +INITIALIZE_PASS_END(BBVectorize, BBV_NAME, bb_vectorize_name, false, false) + +BasicBlockPass *llvm::createBBVectorizePass() { + return new BBVectorize(); +} + Added: llvm/trunk/lib/Transforms/Vectorize/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/CMakeLists.txt?rev=149468&view=auto ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/CMakeLists.txt (added) +++ llvm/trunk/lib/Transforms/Vectorize/CMakeLists.txt Tue Jan 31 21:51:43 2012 @@ -0,0 +1,4 @@ +add_llvm_library(LLVMVectorize + BBVectorize.cpp + Vectorize.cpp + ) Copied: llvm/trunk/lib/Transforms/Vectorize/LLVMBuild.txt (from r149457, llvm/trunk/lib/Transforms/IPO/LLVMBuild.txt) URL:
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LLVMBuild.txt?p2=llvm/trunk/lib/Transforms/Vectorize/LLVMBuild.txt&p1=llvm/trunk/lib/Transforms/IPO/LLVMBuild.txt&r1=149457&r2=149468&rev=149468&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/LLVMBuild.txt (original) +++ llvm/trunk/lib/Transforms/Vectorize/LLVMBuild.txt Tue Jan 31 21:51:43 2012 @@ -1,4 +1,4 @@ -;===- ./lib/Transforms/IPO/LLVMBuild.txt -----------------------*- Conf -*--===; +;===- ./lib/Transforms/Scalar/LLVMBuild.txt --------------------*- Conf -*--===; ; ; The LLVM Compiler Infrastructure ; @@ -17,7 +17,8 @@ [component_0] type = Library -name = IPO +name = Vectorize parent = Transforms -library_name = ipo -required_libraries = Analysis Core IPA InstCombine Scalar Support Target TransformUtils +library_name = Vectorize +required_libraries = Analysis Core InstCombine Support Target TransformUtils + Copied: llvm/trunk/lib/Transforms/Vectorize/Makefile (from r149457, llvm/trunk/tools/llvm-ld/Makefile) URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/Makefile?p2=llvm/trunk/lib/Transforms/Vectorize/Makefile&p1=llvm/trunk/tools/llvm-ld/Makefile&r1=149457&r2=149468&rev=149468&view=diff ============================================================================== --- llvm/trunk/tools/llvm-ld/Makefile (original) +++ llvm/trunk/lib/Transforms/Vectorize/Makefile Tue Jan 31 21:51:43 2012 @@ -1,14 +1,15 @@ -##===- tools/llvm-ld/Makefile ------------------------------*- Makefile -*-===## -# +##===- lib/Transforms/Vectorize/Makefile -----------------*- Makefile -*-===## +# # The LLVM Compiler Infrastructure # # This file is distributed under the University of Illinois Open Source # License. See LICENSE.TXT for details. -# +# ##===----------------------------------------------------------------------===## -LEVEL := ../.. 
-TOOLNAME := llvm-ld -LINK_COMPONENTS := ipo scalaropts linker archive bitwriter +LEVEL = ../../.. +LIBRARYNAME = LLVMVectorize +BUILD_ARCHIVE = 1 include $(LEVEL)/Makefile.common + Added: llvm/trunk/lib/Transforms/Vectorize/Vectorize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/Vectorize.cpp?rev=149468&view=auto ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/Vectorize.cpp (added) +++ llvm/trunk/lib/Transforms/Vectorize/Vectorize.cpp Tue Jan 31 21:51:43 2012 @@ -0,0 +1,39 @@ +//===-- Vectorize.cpp -----------------------------------------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This file implements common infrastructure for libLLVMVectorizeOpts.a, which +// implements several vectorization transformations over the LLVM intermediate +// representation, including the C bindings for that library. +// +//===----------------------------------------------------------------------===// + +#include "llvm-c/Transforms/Vectorize.h" +#include "llvm-c/Initialization.h" +#include "llvm/InitializePasses.h" +#include "llvm/PassManager.h" +#include "llvm/Analysis/Passes.h" +#include "llvm/Analysis/Verifier.h" +#include "llvm/Transforms/Vectorize.h" + +using namespace llvm; + +/// initializeVectorizationPasses - Initialize all passes linked into the +/// Vectorization library. 
+void llvm::initializeVectorization(PassRegistry &Registry) { + initializeBBVectorizePass(Registry); +} + +void LLVMInitializeVectorization(LLVMPassRegistryRef R) { + initializeVectorization(*unwrap(R)); +} + +void LLVMAddBBVectorizePass(LLVMPassManagerRef PM) { + unwrap(PM)->add(createBBVectorizePass()); +} + Added: llvm/trunk/test/Transforms/BBVectorize/cycle.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/BBVectorize/cycle.ll?rev=149468&view=auto ============================================================================== --- llvm/trunk/test/Transforms/BBVectorize/cycle.ll (added) +++ llvm/trunk/test/Transforms/BBVectorize/cycle.ll Tue Jan 31 21:51:43 2012 @@ -0,0 +1,112 @@ +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128" +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth=3 -instcombine -gvn -S | FileCheck %s + +; This test checks the non-trivial pairing-induced cycle avoidance. Without this cycle avoidance, the algorithm would otherwise +; want to select the pairs: +; %div77 = fdiv double %sub74, %mul76.v.r1 <-> %div125 = fdiv double %mul121, %mul76.v.r2 (div125 depends on mul117) +; %add84 = fadd double %sub83, 2.000000e+00 <-> %add127 = fadd double %mul126, 1.000000e+00 (add127 depends on div77) +; %mul95 = fmul double %sub45.v.r1, %sub36.v.r1 <-> %mul88 = fmul double %sub36.v.r1, %sub87 (mul88 depends on add84) +; %mul117 = fmul double %sub39.v.r1, %sub116 <-> %mul97 = fmul double %mul96, %sub39.v.r1 (mul97 depends on mul95) +; and so a dependency cycle would be created. 
+ +declare double @fabs(double) nounwind readnone +define void @test1(double %a, double %b, double %c, double %add80, double %mul1, double %mul2.v.r1, double %mul73, double %sub, double %sub65, double %F.0, i32 %n.0, double %Bnm3.0, double %Bnm2.0, double %Bnm1.0, double %Anm3.0, double %Anm2.0, double %Anm1.0) { +entry: + br label %go +go: + %conv = sitofp i32 %n.0 to double + %add35 = fadd double %conv, %a + %sub36 = fadd double %add35, -1.000000e+00 + %add38 = fadd double %conv, %b + %sub39 = fadd double %add38, -1.000000e+00 + %add41 = fadd double %conv, %c + %sub42 = fadd double %add41, -1.000000e+00 + %sub45 = fadd double %add35, -2.000000e+00 + %sub48 = fadd double %add38, -2.000000e+00 + %sub51 = fadd double %add41, -2.000000e+00 + %mul52 = shl nsw i32 %n.0, 1 + %sub53 = add nsw i32 %mul52, -1 + %conv54 = sitofp i32 %sub53 to double + %sub56 = add nsw i32 %mul52, -3 + %conv57 = sitofp i32 %sub56 to double + %sub59 = add nsw i32 %mul52, -5 + %conv60 = sitofp i32 %sub59 to double + %mul61 = mul nsw i32 %n.0, %n.0 + %conv62 = sitofp i32 %mul61 to double + %mul63 = fmul double %conv62, 3.000000e+00 + %mul67 = fmul double %sub65, %conv + %add68 = fadd double %mul63, %mul67 + %add69 = fadd double %add68, 2.000000e+00 + %sub71 = fsub double %add69, %mul2.v.r1 + %sub74 = fsub double %sub71, %mul73 + %mul75 = fmul double %conv57, 2.000000e+00 + %mul76 = fmul double %mul75, %sub42 + %div77 = fdiv double %sub74, %mul76 + %mul82 = fmul double %add80, %conv + %sub83 = fsub double %mul63, %mul82 + %add84 = fadd double %sub83, 2.000000e+00 + %sub86 = fsub double %add84, %mul2.v.r1 + %sub87 = fsub double -0.000000e+00, %sub86 + %mul88 = fmul double %sub36, %sub87 + %mul89 = fmul double %mul88, %sub39 + %mul90 = fmul double %conv54, 4.000000e+00 + %mul91 = fmul double %mul90, %conv57 + %mul92 = fmul double %mul91, %sub51 + %mul93 = fmul double %mul92, %sub42 + %div94 = fdiv double %mul89, %mul93 + %mul95 = fmul double %sub45, %sub36 + %mul96 = fmul double %mul95, %sub48 + 
%mul97 = fmul double %mul96, %sub39 + %sub99 = fsub double %conv, %a + %sub100 = fadd double %sub99, -2.000000e+00 + %mul101 = fmul double %mul97, %sub100 + %sub103 = fsub double %conv, %b + %sub104 = fadd double %sub103, -2.000000e+00 + %mul105 = fmul double %mul101, %sub104 + %mul106 = fmul double %conv57, 8.000000e+00 + %mul107 = fmul double %mul106, %conv57 + %mul108 = fmul double %mul107, %conv60 + %sub111 = fadd double %add41, -3.000000e+00 + %mul112 = fmul double %mul108, %sub111 + %mul113 = fmul double %mul112, %sub51 + %mul114 = fmul double %mul113, %sub42 + %div115 = fdiv double %mul105, %mul114 + %sub116 = fsub double -0.000000e+00, %sub36 + %mul117 = fmul double %sub39, %sub116 + %sub119 = fsub double %conv, %c + %sub120 = fadd double %sub119, -1.000000e+00 + %mul121 = fmul double %mul117, %sub120 + %mul123 = fmul double %mul75, %sub51 + %mul124 = fmul double %mul123, %sub42 + %div125 = fdiv double %mul121, %mul124 + %mul126 = fmul double %div77, %sub + %add127 = fadd double %mul126, 1.000000e+00 + %mul128 = fmul double %add127, %Anm1.0 + %mul129 = fmul double %div94, %sub + %add130 = fadd double %div125, %mul129 + %mul131 = fmul double %add130, %sub + %mul132 = fmul double %mul131, %Anm2.0 + %add133 = fadd double %mul128, %mul132 + %mul134 = fmul double %div115, %mul1 + %mul135 = fmul double %mul134, %Anm3.0 + %add136 = fadd double %add133, %mul135 + %mul139 = fmul double %add127, %Bnm1.0 + %mul143 = fmul double %mul131, %Bnm2.0 + %add144 = fadd double %mul139, %mul143 + %mul146 = fmul double %mul134, %Bnm3.0 + %add147 = fadd double %add144, %mul146 + %div148 = fdiv double %add136, %add147 + %sub149 = fsub double %F.0, %div148 + %div150 = fdiv double %sub149, %F.0 + %call = tail call double @fabs(double %div150) nounwind readnone + %cmp = fcmp olt double %call, 0x3CB0000000000000 + %cmp152 = icmp sgt i32 %n.0, 20000 + %or.cond = or i1 %cmp, %cmp152 + br i1 %or.cond, label %done, label %go +done: + ret void +; CHECK: @test1 +; CHECK: go: +; CHECK-NEXT: 
%conv.v.i0.1 = insertelement <2 x i32> undef, i32 %n.0, i32 0 +; FIXME: When tree pruning is deterministic, include the entire output. +} Added: llvm/trunk/test/Transforms/BBVectorize/dg.exp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/BBVectorize/dg.exp?rev=149468&view=auto ============================================================================== --- llvm/trunk/test/Transforms/BBVectorize/dg.exp (added) +++ llvm/trunk/test/Transforms/BBVectorize/dg.exp Tue Jan 31 21:51:43 2012 @@ -0,0 +1,3 @@ +load_lib llvm.exp + +RunLLVMTests [lsort [glob -nocomplain $srcdir/$subdir/*.{ll,c,cpp}]] Added: llvm/trunk/test/Transforms/BBVectorize/ld1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/BBVectorize/ld1.ll?rev=149468&view=auto ============================================================================== --- llvm/trunk/test/Transforms/BBVectorize/ld1.ll (added) +++ llvm/trunk/test/Transforms/BBVectorize/ld1.ll Tue Jan 31 21:51:43 2012 @@ -0,0 +1,41 @@ +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth=3 -instcombine -gvn -S | FileCheck %s + +define double @test1(double* %a, double* %b, double* %c) nounwind uwtable readonly { +entry: + %i0 = load double* %a, align 8 + %i1 = load double* %b, align 8 + %mul = fmul double %i0, %i1 + %i2 = load double* %c, align 8 + %add = fadd double %mul, %i2 + %arrayidx3 = getelementptr inbounds double* %a, i64 1 + %i3 = load double* %arrayidx3, align 8 + %arrayidx4 = getelementptr inbounds double* %b, i64 1 + %i4 = load double* %arrayidx4, align 8 + %mul5 = fmul double %i3, %i4 + %arrayidx6 = getelementptr inbounds double* %c, i64 1 + %i5 = load double* %arrayidx6, align 8 + %add7 = fadd double %mul5, %i5 + %mul9 = fmul double %add, %i1 + %add11 = fadd double %mul9, %i2 + %mul13 = fmul double %add7, %i4 + %add15 = 
fadd double %mul13, %i5 + %mul16 = fmul double %add11, %add15 + ret double %mul16 +; CHECK: @test1 +; CHECK: %i0.v.i0 = bitcast double* %a to <2 x double>* +; CHECK: %i1.v.i0 = bitcast double* %b to <2 x double>* +; CHECK: %i2.v.i0 = bitcast double* %c to <2 x double>* +; CHECK: %i0 = load <2 x double>* %i0.v.i0, align 8 +; CHECK: %i1 = load <2 x double>* %i1.v.i0, align 8 +; CHECK: %mul = fmul <2 x double> %i0, %i1 +; CHECK: %i2 = load <2 x double>* %i2.v.i0, align 8 +; CHECK: %add = fadd <2 x double> %mul, %i2 +; CHECK: %mul9 = fmul <2 x double> %add, %i1 +; CHECK: %add11 = fadd <2 x double> %mul9, %i2 +; CHECK: %add11.v.r1 = extractelement <2 x double> %add11, i32 0 +; CHECK: %add11.v.r2 = extractelement <2 x double> %add11, i32 1 +; CHECK: %mul16 = fmul double %add11.v.r1, %add11.v.r2 +; CHECK: ret double %mul16 +} + Added: llvm/trunk/test/Transforms/BBVectorize/loop1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/BBVectorize/loop1.ll?rev=149468&view=auto ============================================================================== --- llvm/trunk/test/Transforms/BBVectorize/loop1.ll (added) +++ llvm/trunk/test/Transforms/BBVectorize/loop1.ll Tue Jan 31 21:51:43 2012 @@ -0,0 +1,93 @@ +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth=3 -instcombine -gvn -S | FileCheck %s +; RUN: opt < %s -basicaa -loop-unroll -unroll-threshold=45 -unroll-allow-partial -bb-vectorize -bb-vectorize-req-chain-depth=3 -instcombine -gvn -S | FileCheck %s -check-prefix=CHECK-UNRL +; The second check covers the use of alias analysis (with loop unrolling). 
+ +define void @test1(double* noalias %out, double* noalias %in1, double* noalias %in2) nounwind uwtable { +entry: + br label %for.body +; CHECK: @test1 +; CHECK-UNRL: @test1 + +for.body: ; preds = %for.body, %entry + %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] + %arrayidx = getelementptr inbounds double* %in1, i64 %indvars.iv + %0 = load double* %arrayidx, align 8 + %arrayidx2 = getelementptr inbounds double* %in2, i64 %indvars.iv + %1 = load double* %arrayidx2, align 8 + %mul = fmul double %0, %0 + %mul3 = fmul double %0, %1 + %add = fadd double %mul, %mul3 + %add4 = fadd double %1, %1 + %add5 = fadd double %add4, %0 + %mul6 = fmul double %0, %add5 + %add7 = fadd double %add, %mul6 + %mul8 = fmul double %1, %1 + %add9 = fadd double %0, %0 + %add10 = fadd double %add9, %0 + %mul11 = fmul double %mul8, %add10 + %add12 = fadd double %add7, %mul11 + %arrayidx14 = getelementptr inbounds double* %out, i64 %indvars.iv + store double %add12, double* %arrayidx14, align 8 + %indvars.iv.next = add i64 %indvars.iv, 1 + %lftr.wideiv = trunc i64 %indvars.iv.next to i32 + %exitcond = icmp eq i32 %lftr.wideiv, 10 + br i1 %exitcond, label %for.end, label %for.body +; CHECK: %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] +; CHECK: %arrayidx = getelementptr inbounds double* %in1, i64 %indvars.iv +; CHECK: %0 = load double* %arrayidx, align 8 +; CHECK: %arrayidx2 = getelementptr inbounds double* %in2, i64 %indvars.iv +; CHECK: %1 = load double* %arrayidx2, align 8 +; CHECK: %mul = fmul double %0, %0 +; CHECK: %mul3 = fmul double %0, %1 +; CHECK: %add = fadd double %mul, %mul3 +; CHECK: %add4.v.i1.1 = insertelement <2 x double> undef, double %1, i32 0 +; CHECK: %mul8 = fmul double %1, %1 +; CHECK: %add4.v.i1.2 = insertelement <2 x double> %add4.v.i1.1, double %0, i32 1 +; CHECK: %add4 = fadd <2 x double> %add4.v.i1.2, %add4.v.i1.2 +; CHECK: %add5.v.i1.1 = insertelement <2 x double> undef, double %0, i32 0 +; CHECK: %add5.v.i1.2 = 
insertelement <2 x double> %add5.v.i1.1, double %0, i32 1 +; CHECK: %add5 = fadd <2 x double> %add4, %add5.v.i1.2 +; CHECK: %mul6.v.i0.2 = insertelement <2 x double> %add5.v.i1.1, double %mul8, i32 1 +; CHECK: %mul6 = fmul <2 x double> %mul6.v.i0.2, %add5 +; CHECK: %mul6.v.r1 = extractelement <2 x double> %mul6, i32 0 +; CHECK: %mul6.v.r2 = extractelement <2 x double> %mul6, i32 1 +; CHECK: %add7 = fadd double %add, %mul6.v.r1 +; CHECK: %add12 = fadd double %add7, %mul6.v.r2 +; CHECK: %arrayidx14 = getelementptr inbounds double* %out, i64 %indvars.iv +; CHECK: store double %add12, double* %arrayidx14, align 8 +; CHECK: %indvars.iv.next = add i64 %indvars.iv, 1 +; CHECK: %lftr.wideiv = trunc i64 %indvars.iv.next to i32 +; CHECK: %exitcond = icmp eq i32 %lftr.wideiv, 10 +; CHECK: br i1 %exitcond, label %for.end, label %for.body +; CHECK-UNRL: %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next.1, %for.body ] +; CHECK-UNRL: %arrayidx = getelementptr inbounds double* %in1, i64 %indvars.iv +; CHECK-UNRL: %0 = bitcast double* %arrayidx to <2 x double>* +; CHECK-UNRL: %arrayidx2 = getelementptr inbounds double* %in2, i64 %indvars.iv +; CHECK-UNRL: %1 = bitcast double* %arrayidx2 to <2 x double>* +; CHECK-UNRL: %arrayidx14 = getelementptr inbounds double* %out, i64 %indvars.iv +; CHECK-UNRL: %2 = load <2 x double>* %0, align 8 +; CHECK-UNRL: %3 = load <2 x double>* %1, align 8 +; CHECK-UNRL: %mul = fmul <2 x double> %2, %2 +; CHECK-UNRL: %mul3 = fmul <2 x double> %2, %3 +; CHECK-UNRL: %add = fadd <2 x double> %mul, %mul3 +; CHECK-UNRL: %add4 = fadd <2 x double> %3, %3 +; CHECK-UNRL: %add5 = fadd <2 x double> %add4, %2 +; CHECK-UNRL: %mul6 = fmul <2 x double> %2, %add5 +; CHECK-UNRL: %add7 = fadd <2 x double> %add, %mul6 +; CHECK-UNRL: %mul8 = fmul <2 x double> %3, %3 +; CHECK-UNRL: %add9 = fadd <2 x double> %2, %2 +; CHECK-UNRL: %add10 = fadd <2 x double> %add9, %2 +; CHECK-UNRL: %mul11 = fmul <2 x double> %mul8, %add10 +; CHECK-UNRL: %add12 = fadd <2 x double> %add7, 
%mul11 +; CHECK-UNRL: %4 = bitcast double* %arrayidx14 to <2 x double>* +; CHECK-UNRL: store <2 x double> %add12, <2 x double>* %4, align 8 +; CHECK-UNRL: %indvars.iv.next.1 = add i64 %indvars.iv, 2 +; CHECK-UNRL: %lftr.wideiv.1 = trunc i64 %indvars.iv.next.1 to i32 +; CHECK-UNRL: %exitcond.1 = icmp eq i32 %lftr.wideiv.1, 10 +; CHECK-UNRL: br i1 %exitcond.1, label %for.end, label %for.body + +for.end: ; preds = %for.body + ret void +} Added: llvm/trunk/test/Transforms/BBVectorize/req-depth.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/BBVectorize/req-depth.ll?rev=149468&view=auto ============================================================================== --- llvm/trunk/test/Transforms/BBVectorize/req-depth.ll (added) +++ llvm/trunk/test/Transforms/BBVectorize/req-depth.ll Tue Jan 31 21:51:43 2012 @@ -0,0 +1,17 @@ +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128" +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth 3 -S | FileCheck %s -check-prefix=CHECK-RD3 +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth 2 -S | FileCheck %s -check-prefix=CHECK-RD2 + +define double @test1(double %A1, double %A2, double %B1, double %B2) { + %X1 = fsub double %A1, %B1 + %X2 = fsub double %A2, %B2 + %Y1 = fmul double %X1, %A1 + %Y2 = fmul double %X2, %A2 + %R = fmul double %Y1, %Y2 + ret double %R +; CHECK-RD3: @test1 +; CHECK-RD2: @test1 +; CHECK-RD3-NOT: <2 x double> +; CHECK-RD2: <2 x double> +} + Added: llvm/trunk/test/Transforms/BBVectorize/search-limit.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/BBVectorize/search-limit.ll?rev=149468&view=auto ============================================================================== --- llvm/trunk/test/Transforms/BBVectorize/search-limit.ll (added) +++ llvm/trunk/test/Transforms/BBVectorize/search-limit.ll Tue Jan 31 21:51:43 2012 @@ -0,0 +1,46 @@ +target datalayout 
= "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128" +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth=3 -instcombine -gvn -S | FileCheck %s +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth=3 -bb-vectorize-search-limit=4 -instcombine -gvn -S | FileCheck %s -check-prefix=CHECK-SL4 + +define double @test1(double %A1, double %A2, double %B1, double %B2) { +; CHECK: @test1 +; CHECK-SL4: @test1 +; CHECK-SL4-NOT: <2 x double> +; CHECK: %X1.v.i1.1 = insertelement <2 x double> undef, double %B1, i32 0 +; CHECK: %X1.v.i0.1 = insertelement <2 x double> undef, double %A1, i32 0 +; CHECK: %X1.v.i1.2 = insertelement <2 x double> %X1.v.i1.1, double %B2, i32 1 +; CHECK: %X1.v.i0.2 = insertelement <2 x double> %X1.v.i0.1, double %A2, i32 1 + %X1 = fsub double %A1, %B1 + %X2 = fsub double %A2, %B2 +; CHECK: %X1 = fsub <2 x double> %X1.v.i0.2, %X1.v.i1.2 + %Y1 = fmul double %X1, %A1 + %Y2 = fmul double %X2, %A2 +; CHECK: %Y1 = fmul <2 x double> %X1, %X1.v.i0.2 + %Z1 = fadd double %Y1, %B1 + ; Here we have a dependency chain: the short search limit will not + ; see past this chain and so will not see the second part of the + ; pair to vectorize. + %mul41 = fmul double %Z1, %Y2 + %sub48 = fsub double %Z1, %mul41 + %mul62 = fmul double %Z1, %sub48 + %sub69 = fsub double %Z1, %mul62 + %mul83 = fmul double %Z1, %sub69 + %sub90 = fsub double %Z1, %mul83 + %mul104 = fmul double %Z1, %sub90 + %sub111 = fsub double %Z1, %mul104 + %mul125 = fmul double %Z1, %sub111 + %sub132 = fsub double %Z1, %mul125 + %mul146 = fmul double %Z1, %sub132 + %sub153 = fsub double %Z1, %mul146 + ; end of chain. 
+ %Z2 = fadd double %Y2, %B2 +; CHECK: %Z1 = fadd <2 x double> %Y1, %X1.v.i1.2 + %R1 = fdiv double %Z1, %Z2 + %R = fmul double %R1, %sub153 +; CHECK: %Z1.v.r1 = extractelement <2 x double> %Z1, i32 0 +; CHECK: %Z1.v.r2 = extractelement <2 x double> %Z1, i32 1 +; CHECK: %R1 = fdiv double %Z1.v.r1, %Z1.v.r2 + ret double %R +; CHECK: ret double %R +} + Added: llvm/trunk/test/Transforms/BBVectorize/simple-int.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/BBVectorize/simple-int.ll?rev=149468&view=auto ============================================================================== --- llvm/trunk/test/Transforms/BBVectorize/simple-int.ll (added) +++ llvm/trunk/test/Transforms/BBVectorize/simple-int.ll Tue Jan 31 21:51:43 2012 @@ -0,0 +1,59 @@ +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128" +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth=3 -instcombine -gvn -S | FileCheck %s + +declare double @llvm.fma.f64(double, double, double) +declare double @llvm.cos.f64(double) + +; Basic depth-3 chain with fma +define double @test1(double %A1, double %A2, double %B1, double %B2, double %C1, double %C2) { + %X1 = fsub double %A1, %B1 + %X2 = fsub double %A2, %B2 + %Y1 = call double @llvm.fma.f64(double %X1, double %A1, double %C1) + %Y2 = call double @llvm.fma.f64(double %X2, double %A2, double %C2) + %Z1 = fadd double %Y1, %B1 + %Z2 = fadd double %Y2, %B2 + %R = fmul double %Z1, %Z2 + ret double %R +; CHECK: @test1 +; CHECK: %X1.v.i1.1 = insertelement <2 x double> undef, double %B1, i32 0 +; CHECK: %X1.v.i0.1 = insertelement <2 x double> undef, double %A1, i32 0 +; CHECK: %X1.v.i1.2 = insertelement <2 x double> %X1.v.i1.1, double %B2, i32 1 +; CHECK: %X1.v.i0.2 = insertelement <2 x double> %X1.v.i0.1, double %A2, i32 1 +; CHECK: %X1 = fsub <2 x double> %X1.v.i0.2, %X1.v.i1.2 +; CHECK: %Y1.v.i2.1 = insertelement <2 x double> undef, double %C1, 
i32 0 +; CHECK: %Y1.v.i2.2 = insertelement <2 x double> %Y1.v.i2.1, double %C2, i32 1 +; CHECK: %Y1 = call <2 x double> @llvm.fma.v2f64(<2 x double> %X1, <2 x double> %X1.v.i0.2, <2 x double> %Y1.v.i2.2) +; CHECK: %Z1 = fadd <2 x double> %Y1, %X1.v.i1.2 +; CHECK: %Z1.v.r1 = extractelement <2 x double> %Z1, i32 0 +; CHECK: %Z1.v.r2 = extractelement <2 x double> %Z1, i32 1 +; CHECK: %R = fmul double %Z1.v.r1, %Z1.v.r2 +; CHECK: ret double %R +} + +; Basic depth-3 chain with cos +define double @test2(double %A1, double %A2, double %B1, double %B2) { + %X1 = fsub double %A1, %B1 + %X2 = fsub double %A2, %B2 + %Y1 = call double @llvm.cos.f64(double %X1) + %Y2 = call double @llvm.cos.f64(double %X2) + %Z1 = fadd double %Y1, %B1 + %Z2 = fadd double %Y2, %B2 + %R = fmul double %Z1, %Z2 + ret double %R +; CHECK: @test2 +; CHECK: %X1.v.i1.1 = insertelement <2 x double> undef, double %B1, i32 0 +; CHECK: %X1.v.i0.1 = insertelement <2 x double> undef, double %A1, i32 0 +; CHECK: %X1.v.i1.2 = insertelement <2 x double> %X1.v.i1.1, double %B2, i32 1 +; CHECK: %X1.v.i0.2 = insertelement <2 x double> %X1.v.i0.1, double %A2, i32 1 +; CHECK: %X1 = fsub <2 x double> %X1.v.i0.2, %X1.v.i1.2 +; CHECK: %Y1 = call <2 x double> @llvm.cos.v2f64(<2 x double> %X1) +; CHECK: %Z1 = fadd <2 x double> %Y1, %X1.v.i1.2 +; CHECK: %Z1.v.r1 = extractelement <2 x double> %Z1, i32 0 +; CHECK: %Z1.v.r2 = extractelement <2 x double> %Z1, i32 1 +; CHECK: %R = fmul double %Z1.v.r1, %Z1.v.r2 +; CHECK: ret double %R +} + +; CHECK: declare <2 x double> @llvm.fma.v2f64(<2 x double>, <2 x double>, <2 x double>) nounwind readnone +; CHECK: declare <2 x double> @llvm.cos.v2f64(<2 x double>) nounwind readonly + Added: llvm/trunk/test/Transforms/BBVectorize/simple-ldstr.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/BBVectorize/simple-ldstr.ll?rev=149468&view=auto ============================================================================== --- 
llvm/trunk/test/Transforms/BBVectorize/simple-ldstr.ll (added) +++ llvm/trunk/test/Transforms/BBVectorize/simple-ldstr.ll Tue Jan 31 21:51:43 2012 @@ -0,0 +1,110 @@ +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth=3 -instcombine -gvn -S | FileCheck %s +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth=3 -bb-vectorize-aligned-only -instcombine -gvn -S | FileCheck %s -check-prefix=CHECK-AO + +; Simple 3-pair chain with loads and stores +define void @test1(double* %a, double* %b, double* %c) nounwind uwtable readonly { +entry: + %i0 = load double* %a, align 8 + %i1 = load double* %b, align 8 + %mul = fmul double %i0, %i1 + %arrayidx3 = getelementptr inbounds double* %a, i64 1 + %i3 = load double* %arrayidx3, align 8 + %arrayidx4 = getelementptr inbounds double* %b, i64 1 + %i4 = load double* %arrayidx4, align 8 + %mul5 = fmul double %i3, %i4 + store double %mul, double* %c, align 8 + %arrayidx5 = getelementptr inbounds double* %c, i64 1 + store double %mul5, double* %arrayidx5, align 8 + ret void +; CHECK: @test1 +; CHECK: %i0.v.i0 = bitcast double* %a to <2 x double>* +; CHECK: %i1.v.i0 = bitcast double* %b to <2 x double>* +; CHECK: %i0 = load <2 x double>* %i0.v.i0, align 8 +; CHECK: %i1 = load <2 x double>* %i1.v.i0, align 8 +; CHECK: %mul = fmul <2 x double> %i0, %i1 +; CHECK: %0 = bitcast double* %c to <2 x double>* +; CHECK: store <2 x double> %mul, <2 x double>* %0, align 8 +; CHECK: ret void +; CHECK-AO: @test1 +; CHECK-AO-NOT: <2 x double> +} + +; Simple chain with extending loads and stores +define void @test2(float* %a, float* %b, double* %c) nounwind uwtable readonly { +entry: + %i0f = load float* %a, align 4 + %i0 = fpext float %i0f to double + %i1f = load float* %b, align 4 + %i1 = fpext float %i1f to double + %mul = fmul double %i0, %i1 + %arrayidx3 = getelementptr 
inbounds float* %a, i64 1 + %i3f = load float* %arrayidx3, align 4 + %i3 = fpext float %i3f to double + %arrayidx4 = getelementptr inbounds float* %b, i64 1 + %i4f = load float* %arrayidx4, align 4 + %i4 = fpext float %i4f to double + %mul5 = fmul double %i3, %i4 + store double %mul, double* %c, align 8 + %arrayidx5 = getelementptr inbounds double* %c, i64 1 + store double %mul5, double* %arrayidx5, align 8 + ret void +; CHECK: @test2 +; CHECK: %i0f.v.i0 = bitcast float* %a to <2 x float>* +; CHECK: %i1f.v.i0 = bitcast float* %b to <2 x float>* +; CHECK: %i0f = load <2 x float>* %i0f.v.i0, align 4 +; CHECK: %i0 = fpext <2 x float> %i0f to <2 x double> +; CHECK: %i1f = load <2 x float>* %i1f.v.i0, align 4 +; CHECK: %i1 = fpext <2 x float> %i1f to <2 x double> +; CHECK: %mul = fmul <2 x double> %i0, %i1 +; CHECK: %0 = bitcast double* %c to <2 x double>* +; CHECK: store <2 x double> %mul, <2 x double>* %0, align 8 +; CHECK: ret void +; CHECK-AO: @test2 +; CHECK-AO-NOT: <2 x double> +} + +; Simple chain with loads and truncating stores +define void @test3(double* %a, double* %b, float* %c) nounwind uwtable readonly { +entry: + %i0 = load double* %a, align 8 + %i1 = load double* %b, align 8 + %mul = fmul double %i0, %i1 + %mulf = fptrunc double %mul to float + %arrayidx3 = getelementptr inbounds double* %a, i64 1 + %i3 = load double* %arrayidx3, align 8 + %arrayidx4 = getelementptr inbounds double* %b, i64 1 + %i4 = load double* %arrayidx4, align 8 + %mul5 = fmul double %i3, %i4 + %mul5f = fptrunc double %mul5 to float + store float %mulf, float* %c, align 8 + %arrayidx5 = getelementptr inbounds float* %c, i64 1 + store float %mul5f, float* %arrayidx5, align 4 + ret void +; CHECK: @test3 +; CHECK: %i0.v.i0 = bitcast double* %a to <2 x double>* +; CHECK: %i1.v.i0 = bitcast double* %b to <2 x double>* +; CHECK: %i0 = load <2 x double>* %i0.v.i0, align 8 +; CHECK: %i1 = load <2 x double>* %i1.v.i0, align 8 +; CHECK: %mul = fmul <2 x double> %i0, %i1 +; CHECK: %mulf = 
fptrunc <2 x double> %mul to <2 x float> +; CHECK: %0 = bitcast float* %c to <2 x float>* +; CHECK: store <2 x float> %mulf, <2 x float>* %0, align 8 +; CHECK: ret void +; CHECK-AO: @test3 +; CHECK-AO: %i0 = load double* %a, align 8 +; CHECK-AO: %i1 = load double* %b, align 8 +; CHECK-AO: %mul.v.i1.1 = insertelement <2 x double> undef, double %i1, i32 0 +; CHECK-AO: %mul.v.i0.1 = insertelement <2 x double> undef, double %i0, i32 0 +; CHECK-AO: %arrayidx3 = getelementptr inbounds double* %a, i64 1 +; CHECK-AO: %i3 = load double* %arrayidx3, align 8 +; CHECK-AO: %arrayidx4 = getelementptr inbounds double* %b, i64 1 +; CHECK-AO: %i4 = load double* %arrayidx4, align 8 +; CHECK-AO: %mul.v.i1.2 = insertelement <2 x double> %mul.v.i1.1, double %i4, i32 1 +; CHECK-AO: %mul.v.i0.2 = insertelement <2 x double> %mul.v.i0.1, double %i3, i32 1 +; CHECK-AO: %mul = fmul <2 x double> %mul.v.i0.2, %mul.v.i1.2 +; CHECK-AO: %mulf = fptrunc <2 x double> %mul to <2 x float> +; CHECK-AO: %0 = bitcast float* %c to <2 x float>* +; CHECK-AO: store <2 x float> %mulf, <2 x float>* %0, align 8 +; CHECK-AO: ret void +} Added: llvm/trunk/test/Transforms/BBVectorize/simple.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/BBVectorize/simple.ll?rev=149468&view=auto ============================================================================== --- llvm/trunk/test/Transforms/BBVectorize/simple.ll (added) +++ llvm/trunk/test/Transforms/BBVectorize/simple.ll Tue Jan 31 21:51:43 2012 @@ -0,0 +1,152 @@ +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128" +; RUN: opt < %s -bb-vectorize -bb-vectorize-req-chain-depth=3 -instcombine -gvn -S | FileCheck %s + +; Basic depth-3 chain +define double @test1(double %A1, double %A2, double %B1, double %B2) { +; CHECK: @test1 +; CHECK: %X1.v.i1.1 = insertelement <2 x double> undef, double %B1, i32 0 +; CHECK: %X1.v.i0.1 = insertelement <2 x 
double> undef, double %A1, i32 0 +; CHECK: %X1.v.i1.2 = insertelement <2 x double> %X1.v.i1.1, double %B2, i32 1 +; CHECK: %X1.v.i0.2 = insertelement <2 x double> %X1.v.i0.1, double %A2, i32 1 + %X1 = fsub double %A1, %B1 + %X2 = fsub double %A2, %B2 +; CHECK: %X1 = fsub <2 x double> %X1.v.i0.2, %X1.v.i1.2 + %Y1 = fmul double %X1, %A1 + %Y2 = fmul double %X2, %A2 +; CHECK: %Y1 = fmul <2 x double> %X1, %X1.v.i0.2 + %Z1 = fadd double %Y1, %B1 + %Z2 = fadd double %Y2, %B2 +; CHECK: %Z1 = fadd <2 x double> %Y1, %X1.v.i1.2 + %R = fmul double %Z1, %Z2 +; CHECK: %Z1.v.r1 = extractelement <2 x double> %Z1, i32 0 +; CHECK: %Z1.v.r2 = extractelement <2 x double> %Z1, i32 1 +; CHECK: %R = fmul double %Z1.v.r1, %Z1.v.r2 + ret double %R +; CHECK: ret double %R +} + +; Basic depth-3 chain (last pair permuted) +define double @test2(double %A1, double %A2, double %B1, double %B2) { +; CHECK: @test2 +; CHECK: %X1.v.i1.1 = insertelement <2 x double> undef, double %B1, i32 0 +; CHECK: %X1.v.i0.1 = insertelement <2 x double> undef, double %A1, i32 0 +; CHECK: %X1.v.i1.2 = insertelement <2 x double> %X1.v.i1.1, double %B2, i32 1 +; CHECK: %X1.v.i0.2 = insertelement <2 x double> %X1.v.i0.1, double %A2, i32 1 + %X1 = fsub double %A1, %B1 + %X2 = fsub double %A2, %B2 +; CHECK: %X1 = fsub <2 x double> %X1.v.i0.2, %X1.v.i1.2 + %Y1 = fmul double %X1, %A1 + %Y2 = fmul double %X2, %A2 +; CHECK: %Y1 = fmul <2 x double> %X1, %X1.v.i0.2 + %Z1 = fadd double %Y2, %B1 + %Z2 = fadd double %Y1, %B2 +; CHECK: %Z1.v.i0 = shufflevector <2 x double> %Y1, <2 x double> undef, <2 x i32> +; CHECK: %Z1 = fadd <2 x double> %Z1.v.i0, %X1.v.i1.2 + %R = fmul double %Z1, %Z2 +; CHECK: %Z1.v.r1 = extractelement <2 x double> %Z1, i32 0 +; CHECK: %Z1.v.r2 = extractelement <2 x double> %Z1, i32 1 +; CHECK: %R = fmul double %Z1.v.r1, %Z1.v.r2 + ret double %R +; CHECK: ret double %R +} + +; Basic depth-3 chain (last pair first splat) +define double @test3(double %A1, double %A2, double %B1, double %B2) { +; CHECK: @test3 
+; CHECK: %X1.v.i1.1 = insertelement <2 x double> undef, double %B1, i32 0 +; CHECK: %X1.v.i0.1 = insertelement <2 x double> undef, double %A1, i32 0 +; CHECK: %X1.v.i1.2 = insertelement <2 x double> %X1.v.i1.1, double %B2, i32 1 +; CHECK: %X1.v.i0.2 = insertelement <2 x double> %X1.v.i0.1, double %A2, i32 1 + %X1 = fsub double %A1, %B1 + %X2 = fsub double %A2, %B2 +; CHECK: %X1 = fsub <2 x double> %X1.v.i0.2, %X1.v.i1.2 + %Y1 = fmul double %X1, %A1 + %Y2 = fmul double %X2, %A2 +; CHECK: %Y1 = fmul <2 x double> %X1, %X1.v.i0.2 + %Z1 = fadd double %Y2, %B1 + %Z2 = fadd double %Y2, %B2 +; CHECK: %Z1.v.i0 = shufflevector <2 x double> %Y1, <2 x double> undef, <2 x i32> +; CHECK: %Z1 = fadd <2 x double> %Z1.v.i0, %X1.v.i1.2 + %R = fmul double %Z1, %Z2 +; CHECK: %Z1.v.r1 = extractelement <2 x double> %Z1, i32 0 +; CHECK: %Z1.v.r2 = extractelement <2 x double> %Z1, i32 1 +; CHECK: %R = fmul double %Z1.v.r1, %Z1.v.r2 + ret double %R +; CHECK: ret double %R +} + +; Basic depth-3 chain (last pair second splat) +define double @test4(double %A1, double %A2, double %B1, double %B2) { +; CHECK: @test4 +; CHECK: %X1.v.i1.1 = insertelement <2 x double> undef, double %B1, i32 0 +; CHECK: %X1.v.i0.1 = insertelement <2 x double> undef, double %A1, i32 0 +; CHECK: %X1.v.i1.2 = insertelement <2 x double> %X1.v.i1.1, double %B2, i32 1 +; CHECK: %X1.v.i0.2 = insertelement <2 x double> %X1.v.i0.1, double %A2, i32 1 + %X1 = fsub double %A1, %B1 + %X2 = fsub double %A2, %B2 +; CHECK: %X1 = fsub <2 x double> %X1.v.i0.2, %X1.v.i1.2 + %Y1 = fmul double %X1, %A1 + %Y2 = fmul double %X2, %A2 +; CHECK: %Y1 = fmul <2 x double> %X1, %X1.v.i0.2 + %Z1 = fadd double %Y1, %B1 + %Z2 = fadd double %Y1, %B2 +; CHECK: %Z1.v.i0 = shufflevector <2 x double> %Y1, <2 x double> undef, <2 x i32> zeroinitializer +; CHECK: %Z1 = fadd <2 x double> %Z1.v.i0, %X1.v.i1.2 + %R = fmul double %Z1, %Z2 +; CHECK: %Z1.v.r1 = extractelement <2 x double> %Z1, i32 0 +; CHECK: %Z1.v.r2 = extractelement <2 x double> %Z1, i32 1 
+; CHECK: %R = fmul double %Z1.v.r1, %Z1.v.r2 + ret double %R +; CHECK: ret double %R +} + +; Basic depth-3 chain +define <2 x float> @test5(<2 x float> %A1, <2 x float> %A2, <2 x float> %B1, <2 x float> %B2) { +; CHECK: @test5 +; CHECK: %X1.v.i1 = shufflevector <2 x float> %B1, <2 x float> %B2, <4 x i32> +; CHECK: %X1.v.i0 = shufflevector <2 x float> %A1, <2 x float> %A2, <4 x i32> + %X1 = fsub <2 x float> %A1, %B1 + %X2 = fsub <2 x float> %A2, %B2 +; CHECK: %X1 = fsub <4 x float> %X1.v.i0, %X1.v.i1 + %Y1 = fmul <2 x float> %X1, %A1 + %Y2 = fmul <2 x float> %X2, %A2 +; CHECK: %Y1 = fmul <4 x float> %X1, %X1.v.i0 + %Z1 = fadd <2 x float> %Y1, %B1 + %Z2 = fadd <2 x float> %Y2, %B2 +; CHECK: %Z1 = fadd <4 x float> %Y1, %X1.v.i1 + %R = fmul <2 x float> %Z1, %Z2 +; CHECK: %Z1.v.r1 = shufflevector <4 x float> %Z1, <4 x float> undef, <2 x i32> +; CHECK: %Z1.v.r2 = shufflevector <4 x float> %Z1, <4 x float> undef, <2 x i32> +; CHECK: %R = fmul <2 x float> %Z1.v.r1, %Z1.v.r2 + ret <2 x float> %R +; CHECK: ret <2 x float> %R +} + +; Basic chain with shuffles +define <8 x i8> @test6(<8 x i8> %A1, <8 x i8> %A2, <8 x i8> %B1, <8 x i8> %B2) { +; CHECK: @test6 +; CHECK: %X1.v.i1 = shufflevector <8 x i8> %B1, <8 x i8> %B2, <16 x i32> +; CHECK: %X1.v.i0 = shufflevector <8 x i8> %A1, <8 x i8> %A2, <16 x i32> + %X1 = sub <8 x i8> %A1, %B1 + %X2 = sub <8 x i8> %A2, %B2 +; CHECK: %X1 = sub <16 x i8> %X1.v.i0, %X1.v.i1 + %Y1 = mul <8 x i8> %X1, %A1 + %Y2 = mul <8 x i8> %X2, %A2 +; CHECK: %Y1 = mul <16 x i8> %X1, %X1.v.i0 + %Z1 = add <8 x i8> %Y1, %B1 + %Z2 = add <8 x i8> %Y2, %B2 +; CHECK: %Z1 = add <16 x i8> %Y1, %X1.v.i1 + %Q1 = shufflevector <8 x i8> %Z1, <8 x i8> %Z2, <8 x i32> + %Q2 = shufflevector <8 x i8> %Z2, <8 x i8> %Z2, <8 x i32> +; CHECK: %Z1.v.r2 = shufflevector <16 x i8> %Z1, <16 x i8> undef, <8 x i32> +; CHECK: %Q1.v.i1 = shufflevector <8 x i8> %Z1.v.r2, <8 x i8> undef, <16 x i32> +; CHECK: %Q1 = shufflevector <16 x i8> %Z1, <16 x i8> %Q1.v.i1, <16 x i32> + %R = mul <8 x 
i8> %Q1, %Q2 +; CHECK: %Q1.v.r1 = shufflevector <16 x i8> %Q1, <16 x i8> undef, <8 x i32> +; CHECK: %Q1.v.r2 = shufflevector <16 x i8> %Q1, <16 x i8> undef, <8 x i32> +; CHECK: %R = mul <8 x i8> %Q1.v.r1, %Q1.v.r2 + ret <8 x i8> %R +; CHECK: ret <8 x i8> %R +} + + Modified: llvm/trunk/tools/bugpoint/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/bugpoint/CMakeLists.txt?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/tools/bugpoint/CMakeLists.txt (original) +++ llvm/trunk/tools/bugpoint/CMakeLists.txt Tue Jan 31 21:51:43 2012 @@ -1,5 +1,5 @@ set(LLVM_LINK_COMPONENTS asmparser instrumentation scalaropts ipo - linker bitreader bitwriter) + linker bitreader bitwriter vectorize) add_llvm_tool(bugpoint BugDriver.cpp Modified: llvm/trunk/tools/bugpoint/Makefile URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/bugpoint/Makefile?rev=149468&r1=149467&r2=149468&view=diff ============================================================================== --- llvm/trunk/tools/bugpoint/Makefile (original) +++ llvm/trunk/tools/bugpoint/Makefile Tue Jan 31 21:51:43 2012 @@ -10,6 +10,6 @@ LEVEL := ../.. 
 TOOLNAME := bugpoint
 LINK_COMPONENTS := asmparser instrumentation scalaropts ipo linker bitreader \
-                   bitwriter
+                   bitwriter vectorize
 include $(LEVEL)/Makefile.common

Modified: llvm/trunk/tools/llvm-ld/CMakeLists.txt
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-ld/CMakeLists.txt?rev=149468&r1=149467&r2=149468&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-ld/CMakeLists.txt (original)
+++ llvm/trunk/tools/llvm-ld/CMakeLists.txt Tue Jan 31 21:51:43 2012
@@ -1,4 +1,4 @@
-set(LLVM_LINK_COMPONENTS ipo scalaropts linker archive bitwriter)
+set(LLVM_LINK_COMPONENTS ipo scalaropts linker archive bitwriter vectorize)
 add_llvm_tool(llvm-ld
   Optimize.cpp

Modified: llvm/trunk/tools/llvm-ld/Makefile
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-ld/Makefile?rev=149468&r1=149467&r2=149468&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-ld/Makefile (original)
+++ llvm/trunk/tools/llvm-ld/Makefile Tue Jan 31 21:51:43 2012
@@ -9,6 +9,6 @@
 LEVEL := ../..
 TOOLNAME := llvm-ld
-LINK_COMPONENTS := ipo scalaropts linker archive bitwriter
+LINK_COMPONENTS := ipo scalaropts linker archive bitwriter vectorize
 include $(LEVEL)/Makefile.common

Modified: llvm/trunk/tools/lto/CMakeLists.txt
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/CMakeLists.txt?rev=149468&r1=149467&r2=149468&view=diff
==============================================================================
--- llvm/trunk/tools/lto/CMakeLists.txt (original)
+++ llvm/trunk/tools/lto/CMakeLists.txt Tue Jan 31 21:51:43 2012
@@ -1,6 +1,6 @@
 set(LLVM_LINK_COMPONENTS ${LLVM_TARGETS_TO_BUILD}
-  ipo scalaropts linker bitreader bitwriter mcdisassembler)
+  ipo scalaropts linker bitreader bitwriter mcdisassembler vectorize)
 add_definitions( -DLLVM_VERSION_INFO=\"${PACKAGE_VERSION}\" )

Modified: llvm/trunk/tools/lto/Makefile
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/Makefile?rev=149468&r1=149467&r2=149468&view=diff
==============================================================================
--- llvm/trunk/tools/lto/Makefile (original)
+++ llvm/trunk/tools/lto/Makefile Tue Jan 31 21:51:43 2012
@@ -10,7 +10,7 @@
 LEVEL := ../..
 LIBRARYNAME := LTO
 LINK_COMPONENTS := all-targets ipo scalaropts linker bitreader bitwriter \
-                   mcdisassembler
+                   mcdisassembler vectorize
 LINK_LIBS_IN_SHARED := 1
 SHARED_LIBRARY := 1

Modified: llvm/trunk/tools/opt/CMakeLists.txt
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/opt/CMakeLists.txt?rev=149468&r1=149467&r2=149468&view=diff
==============================================================================
--- llvm/trunk/tools/opt/CMakeLists.txt (original)
+++ llvm/trunk/tools/opt/CMakeLists.txt Tue Jan 31 21:51:43 2012
@@ -1,4 +1,4 @@
-set(LLVM_LINK_COMPONENTS bitreader asmparser bitwriter instrumentation scalaropts ipo)
+set(LLVM_LINK_COMPONENTS bitreader asmparser bitwriter instrumentation scalaropts ipo vectorize)
 add_llvm_tool(opt
   AnalysisWrappers.cpp

Modified: llvm/trunk/tools/opt/Makefile
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/opt/Makefile?rev=149468&r1=149467&r2=149468&view=diff
==============================================================================
--- llvm/trunk/tools/opt/Makefile (original)
+++ llvm/trunk/tools/opt/Makefile Tue Jan 31 21:51:43 2012
@@ -9,6 +9,6 @@
 LEVEL := ../..
 TOOLNAME := opt
-LINK_COMPONENTS := bitreader bitwriter asmparser instrumentation scalaropts ipo
+LINK_COMPONENTS := bitreader bitwriter asmparser instrumentation scalaropts ipo vectorize
 include $(LEVEL)/Makefile.common

Modified: llvm/trunk/tools/opt/opt.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/opt/opt.cpp?rev=149468&r1=149467&r2=149468&view=diff
==============================================================================
--- llvm/trunk/tools/opt/opt.cpp (original)
+++ llvm/trunk/tools/opt/opt.cpp Tue Jan 31 21:51:43 2012
@@ -480,6 +480,7 @@
   PassRegistry &Registry = *PassRegistry::getPassRegistry();
   initializeCore(Registry);
   initializeScalarOpts(Registry);
+  initializeVectorization(Registry);
   initializeIPO(Registry);
   initializeAnalysis(Registry);
   initializeIPA(Registry);

From akyrtzi at gmail.com Tue Jan 31 22:51:17 2012
From: akyrtzi at gmail.com (Argyrios Kyrtzidis)
Date: Wed, 01 Feb 2012 04:51:17 -0000
Subject: [llvm-commits] [llvm] r149470 - in /llvm/trunk: include/llvm/ include/llvm/Analysis/ lib/Analysis/ lib/AsmParser/ lib/Bitcode/Writer/ lib/CodeGen/AsmPrinter/ lib/CodeGen/SelectionDAG/ lib/Target/CBackend/ lib/Target/CppBackend/ lib/Transforms/Instrumentation/ lib/Transforms/Scalar/ lib/VMCore/ tools/bugpoint/ tools/lto/
Message-ID: <20120201045117.EC9392A6C12C@llvm.org>

Author: akirtzidis
Date: Tue Jan 31 22:51:17 2012
New Revision: 149470

URL: http://llvm.org/viewvc/llvm-project?rev=149470&view=rev
Log:
Revert Chris' commits up to r149348 that started causing VMCoreTests unit test to fail.
These are:
r149348
r149351
r149352
r149354
r149356
r149357
r149361
r149362
r149364
r149365

Modified:
    llvm/trunk/include/llvm/Analysis/ValueTracking.h
    llvm/trunk/include/llvm/Constants.h
    llvm/trunk/lib/Analysis/ConstantFolding.cpp
    llvm/trunk/lib/Analysis/ValueTracking.cpp
    llvm/trunk/lib/AsmParser/LLParser.cpp
    llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
    llvm/trunk/lib/Bitcode/Writer/ValueEnumerator.cpp
    llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
    llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
    llvm/trunk/lib/Target/CBackend/CBackend.cpp
    llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp
    llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
    llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp
    llvm/trunk/lib/VMCore/AsmWriter.cpp
    llvm/trunk/lib/VMCore/Constants.cpp
    llvm/trunk/lib/VMCore/Core.cpp
    llvm/trunk/lib/VMCore/IRBuilder.cpp
    llvm/trunk/tools/bugpoint/Miscompilation.cpp
    llvm/trunk/tools/lto/LTOModule.cpp

Modified: llvm/trunk/include/llvm/Analysis/ValueTracking.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/ValueTracking.h?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/include/llvm/Analysis/ValueTracking.h (original)
+++ llvm/trunk/include/llvm/Analysis/ValueTracking.h Tue Jan 31 22:51:17 2012
@@ -17,13 +17,14 @@
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/Support/DataTypes.h"
+#include
 namespace llvm {
+  template class SmallVectorImpl;
   class Value;
   class Instruction;
   class APInt;
   class TargetData;
-  class StringRef;
   /// ComputeMaskedBits - Determine which of the bits specified in Mask are
   /// known to be either zero or one and return them in the KnownZero/KnownOne
@@ -124,13 +125,16 @@
     return GetPointerBaseWithConstantOffset(const_cast(Ptr), Offset,TD);
   }
-  /// getConstantStringInfo - This function computes the length of a
+  /// GetConstantStringInfo - This function computes the length of a
   /// null-terminated C string pointed to by V. If successful, it returns true
-  /// and returns the string in Str. If unsuccessful, it returns false. This
-  /// does not include the trailing nul character.
-  bool getConstantStringInfo(const Value *V, StringRef &Str,
-                             uint64_t Offset = 0);
-
+  /// and returns the string in Str. If unsuccessful, it returns false. If
+  /// StopAtNul is set to true (the default), the returned string is truncated
+  /// by a nul character in the global. If StopAtNul is false, the nul
+  /// character is included in the result string.
+  bool GetConstantStringInfo(const Value *V, std::string &Str,
+                             uint64_t Offset = 0,
+                             bool StopAtNul = true);
+
   /// GetStringLength - If we can compute the length of the string pointed to by
   /// the specified pointer, return 'len+1'. If we can't, return 0.
   uint64_t GetStringLength(Value *V);

Modified: llvm/trunk/include/llvm/Constants.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Constants.h?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/include/llvm/Constants.h (original)
+++ llvm/trunk/include/llvm/Constants.h Tue Jan 31 22:51:17 2012
@@ -352,6 +352,17 @@
   // ConstantArray accessors
   static Constant *get(ArrayType *T, ArrayRef V);
+  /// This method constructs a ConstantArray and initializes it with a text
+  /// string. The default behavior (AddNull==true) causes a null terminator to
+  /// be placed at the end of the array. This effectively increases the length
+  /// of the array by one (you've been warned). However, in some situations
+  /// this is not desired so if AddNull==false then the string is copied without
+  /// null termination.
+
+  // FIXME Remove this.
+  static Constant *get(LLVMContext &Context, StringRef Initializer,
+                       bool AddNull = true);
+
   /// Transparently provide more efficient getOperand methods.
   DECLARE_TRANSPARENT_OPERAND_ACCESSORS(Constant);
@@ -362,6 +373,31 @@
     return reinterpret_cast(Value::getType());
   }
+  // FIXME: String methods will eventually be removed.
+
+
+  /// isString - This method returns true if the array is an array of i8 and
+  /// the elements of the array are all ConstantInt's.
+  bool isString() const;
+
+  /// isCString - This method returns true if the array is a string (see
+  /// @verbatim
+  /// isString) and it ends in a null byte \0 and does not contains any other
+  /// @endverbatim
+  /// null bytes except its terminator.
+  bool isCString() const;
+
+  /// getAsString - If this array is isString(), then this method converts the
+  /// array to an std::string and returns it. Otherwise, it asserts out.
+  ///
+  std::string getAsString() const;
+
+  /// getAsCString - If this array is isCString(), then this method converts the
+  /// array (without the trailing null byte) to an std::string and returns it.
+  /// Otherwise, it asserts out.
+  ///
+  std::string getAsCString() const;
+
   virtual void destroyConstant();
   virtual void replaceUsesOfWithOnConstant(Value *From, Value *To, Use *U);

Modified: llvm/trunk/lib/Analysis/ConstantFolding.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ConstantFolding.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/Analysis/ConstantFolding.cpp (original)
+++ llvm/trunk/lib/Analysis/ConstantFolding.cpp Tue Jan 31 22:51:17 2012
@@ -476,9 +476,9 @@
   // Instead of loading constant c string, use corresponding integer value
   // directly if string length is small enough.
-  StringRef Str;
-  if (TD && getConstantStringInfo(CE, Str) && !Str.empty()) {
-    unsigned StrLen = Str.size();
+  std::string Str;
+  if (TD && GetConstantStringInfo(CE, Str) && !Str.empty()) {
+    unsigned StrLen = Str.length();
     Type *Ty = cast(CE->getType())->getElementType();
     unsigned NumBits = Ty->getPrimitiveSizeInBits();
     // Replace load with immediate integer if the result is an integer or fp

Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/Analysis/ValueTracking.cpp (original)
+++ llvm/trunk/lib/Analysis/ValueTracking.cpp Tue Jan 31 22:51:17 2012
@@ -1369,21 +1369,25 @@
     }
   }
-  // A ConstantDataArray/Vector is splatable if all its members are equal and
-  // also splatable.
-  if (ConstantDataSequential *CA = dyn_cast(V)) {
-    Value *Elt = CA->getElementAsConstant(0);
-    Value *Val = isBytewiseValue(Elt);
+  // A ConstantArray is splatable if all its members are equal and also
+  // splatable.
+  if (ConstantArray *CA = dyn_cast(V)) {
+    if (CA->getNumOperands() == 0)
+      return 0;
+
+    Value *Val = isBytewiseValue(CA->getOperand(0));
     if (!Val)
       return 0;
-    for (unsigned I = 1, E = CA->getNumElements(); I != E; ++I)
-      if (CA->getElementAsConstant(I) != Elt)
+    for (unsigned I = 1, E = CA->getNumOperands(); I != E; ++I)
+      if (CA->getOperand(I-1) != CA->getOperand(I))
         return 0;
     return Val;
   }
+  // FIXME: Vector types (e.g., <4 x i32> ).
+
   // Conceptually, we could handle things like:
   // %a = zext i8 %X to i16
   // %b = shl i16 %a, 8
@@ -1603,19 +1607,33 @@
 }
-/// getConstantStringInfo - This function computes the length of a
+/// GetConstantStringInfo - This function computes the length of a
 /// null-terminated C string pointed to by V. If successful, it returns true
 /// and returns the string in Str. If unsuccessful, it returns false.
-bool llvm::getConstantStringInfo(const Value *V, StringRef &Str,
-                                 uint64_t Offset) {
-  assert(V);
-
-  // Look through bitcast instructions and geps.
-  V = V->stripPointerCasts();
-
-  // If the value is a GEP instructionor constant expression, treat it as an
-  // offset.
-  if (const GEPOperator *GEP = dyn_cast(V)) {
+bool llvm::GetConstantStringInfo(const Value *V, std::string &Str,
+                                 uint64_t Offset, bool StopAtNul) {
+  // If V is NULL then return false;
+  if (V == NULL) return false;
+
+  // Look through bitcast instructions.
+  if (const BitCastInst *BCI = dyn_cast(V))
+    return GetConstantStringInfo(BCI->getOperand(0), Str, Offset, StopAtNul);
+
+  // If the value is not a GEP instruction nor a constant expression with a
+  // GEP instruction, then return false because ConstantArray can't occur
+  // any other way.
+  const User *GEP = 0;
+  if (const GetElementPtrInst *GEPI = dyn_cast(V)) {
+    GEP = GEPI;
+  } else if (const ConstantExpr *CE = dyn_cast(V)) {
+    if (CE->getOpcode() == Instruction::BitCast)
+      return GetConstantStringInfo(CE->getOperand(0), Str, Offset, StopAtNul);
+    if (CE->getOpcode() != Instruction::GetElementPtr)
+      return false;
+    GEP = CE;
+  }
+
+  if (GEP) {
     // Make sure the GEP has exactly three arguments.
     if (GEP->getNumOperands() != 3)
       return false;
@@ -1640,45 +1658,51 @@
       StartIdx = CI->getZExtValue();
     else
       return false;
-    return getConstantStringInfo(GEP->getOperand(0), Str, StartIdx+Offset);
+    return GetConstantStringInfo(GEP->getOperand(0), Str, StartIdx+Offset,
+                                 StopAtNul);
   }
   // The GEP instruction, constant or instruction, must reference a global
   // variable that is a constant and is initialized. The referenced constant
   // initializer is the array that we'll use for optimization.
-  const GlobalVariable *GV = dyn_cast(V);
+  const GlobalVariable* GV = dyn_cast(V);
   if (!GV || !GV->isConstant() || !GV->hasDefinitiveInitializer())
     return false;
-
+  const Constant *GlobalInit = GV->getInitializer();
+
   // Handle the all-zeros case
-  if (GV->getInitializer()->isNullValue()) {
+  if (GlobalInit->isNullValue()) {
     // This is a degenerate case. The initializer is constant zero so the
     // length of the string must be zero.
-    Str = "";
+    Str.clear();
     return true;
   }
   // Must be a Constant Array
-  const ConstantDataArray *Array =
-    dyn_cast(GV->getInitializer());
-  if (Array == 0 || !Array->isString())
+  const ConstantArray *Array = dyn_cast(GlobalInit);
+  if (Array == 0 || !Array->getType()->getElementType()->isIntegerTy(8))
     return false;
   // Get the number of elements in the array
-  uint64_t NumElts = Array->getType()->getArrayNumElements();
-
-  // Start out with the entire array in the StringRef.
-  Str = Array->getAsString();
-
+  uint64_t NumElts = Array->getType()->getNumElements();
+
   if (Offset > NumElts)
     return false;
-  // Skip over 'offset' bytes.
-  Str = Str.substr(Offset);
-  // Trim off the \0 and anything after it. If the array is not nul terminated,
-  // we just return the whole end of string. The client may know some other way
-  // that the string is length-bound.
-  Str = Str.substr(0, Str.find('\0'));
+  // Traverse the constant array from 'Offset' which is the place the GEP refers
+  // to in the array.
+  Str.reserve(NumElts-Offset);
+  for (unsigned i = Offset; i != NumElts; ++i) {
+    const Constant *Elt = Array->getOperand(i);
+    const ConstantInt *CI = dyn_cast(Elt);
+    if (!CI) // This array isn't suitable, non-int initializer.
+      return false;
+    if (StopAtNul && CI->isZero())
+      return true; // we found end of string, success!
+    Str += (char)CI->getZExtValue();
+  }
+
+  // The array isn't null terminated, but maybe this is a memcpy, not a strcpy.
   return true;
 }
@@ -1690,7 +1714,8 @@
 /// the specified pointer, return 'len+1'. If we can't, return 0.
 static uint64_t GetStringLengthH(Value *V, SmallPtrSet &PHIs) {
   // Look through noop bitcast instructions.
-  V = V->stripPointerCasts();
+  if (BitCastInst *BCI = dyn_cast(V))
+    return GetStringLengthH(BCI->getOperand(0), PHIs);
   // If this is a PHI node, there are two cases: either we have already seen it
   // or we haven't.
@@ -1726,13 +1751,83 @@
     if (Len1 != Len2) return 0;
     return Len1;
   }
-
-  // Otherwise, see if we can read the string.
-  StringRef StrData;
-  if (!getConstantStringInfo(V, StrData))
+
+  // As a special-case, "@string = constant i8 0" is also a string with zero
+  // length, not wrapped in a bitcast or GEP.
+  if (GlobalVariable *GV = dyn_cast(V)) {
+    if (GV->isConstant() && GV->hasDefinitiveInitializer())
+      if (GV->getInitializer()->isNullValue()) return 1;
+    return 0;
+  }
+
+  // If the value is not a GEP instruction nor a constant expression with a
+  // GEP instruction, then return unknown.
+  User *GEP = 0;
+  if (GetElementPtrInst *GEPI = dyn_cast(V)) {
+    GEP = GEPI;
+  } else if (ConstantExpr *CE = dyn_cast(V)) {
+    if (CE->getOpcode() != Instruction::GetElementPtr)
+      return 0;
+    GEP = CE;
+  } else {
+    return 0;
+  }
+
+  // Make sure the GEP has exactly three arguments.
+  if (GEP->getNumOperands() != 3)
+    return 0;
+
+  // Check to make sure that the first operand of the GEP is an integer and
+  // has value 0 so that we are sure we're indexing into the initializer.
+  if (ConstantInt *Idx = dyn_cast(GEP->getOperand(1))) {
+    if (!Idx->isZero())
+      return 0;
+  } else
     return 0;
-  return StrData.size()+1;
+  // If the second index isn't a ConstantInt, then this is a variable index
+  // into the array. If this occurs, we can't say anything meaningful about
+  // the string.
+  uint64_t StartIdx = 0;
+  if (ConstantInt *CI = dyn_cast(GEP->getOperand(2)))
+    StartIdx = CI->getZExtValue();
+  else
+    return 0;
+
+  // The GEP instruction, constant or instruction, must reference a global
+  // variable that is a constant and is initialized. The referenced constant
+  // initializer is the array that we'll use for optimization.
+  GlobalVariable* GV = dyn_cast(GEP->getOperand(0));
+  if (!GV || !GV->isConstant() || !GV->hasInitializer() ||
+      GV->mayBeOverridden())
+    return 0;
+  Constant *GlobalInit = GV->getInitializer();
+
+  // Handle the ConstantAggregateZero case, which is a degenerate case. The
+  // initializer is constant zero so the length of the string must be zero.
+  if (isa(GlobalInit))
+    return 1; // Len = 0 offset by 1.
+
+  // Must be a Constant Array
+  ConstantArray *Array = dyn_cast(GlobalInit);
+  if (!Array || !Array->getType()->getElementType()->isIntegerTy(8))
+    return false;
+
+  // Get the number of elements in the array
+  uint64_t NumElts = Array->getType()->getNumElements();
+
+  // Traverse the constant array from StartIdx (derived above) which is
+  // the place the GEP refers to in the array.
+  for (unsigned i = StartIdx; i != NumElts; ++i) {
+    Constant *Elt = Array->getOperand(i);
+    ConstantInt *CI = dyn_cast(Elt);
+    if (!CI) // This array isn't suitable, non-int initializer.
+      return 0;
+    if (CI->isZero())
+      return i-StartIdx+1; // We found end of string, success!
+  }
+
+  return 0; // The array isn't null terminated, conservatively return 'unknown'.
 }
 /// GetStringLength - If we can compute the length of the string pointed to by

Modified: llvm/trunk/lib/AsmParser/LLParser.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/AsmParser/LLParser.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/AsmParser/LLParser.cpp (original)
+++ llvm/trunk/lib/AsmParser/LLParser.cpp Tue Jan 31 22:51:17 2012
@@ -2018,8 +2018,7 @@
   }
   case lltok::kw_c: // c "foo"
     Lex.Lex();
-    ID.ConstantVal = ConstantDataArray::getString(Context, Lex.getStrVal(),
-                                                  false);
+    ID.ConstantVal = ConstantArray::get(Context, Lex.getStrVal(), false);
     if (ParseToken(lltok::StringConstant, "expected string")) return true;
     ID.Kind = ValID::t_Constant;
     return false;

Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp (original)
+++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Tue Jan 31 22:51:17 2012
@@ -845,6 +845,32 @@
       } else {
         assert (0 && "Unknown FP type!");
       }
+    } else if (isa(C) && cast(C)->isString()) {
+      const ConstantArray *CA = cast(C);
+      // Emit constant strings specially.
+      unsigned NumOps = CA->getNumOperands();
+      // If this is a null-terminated string, use the denser CSTRING encoding.
+      if (CA->getOperand(NumOps-1)->isNullValue()) {
+        Code = bitc::CST_CODE_CSTRING;
+        --NumOps; // Don't encode the null, which isn't allowed by char6.
+      } else {
+        Code = bitc::CST_CODE_STRING;
+        AbbrevToUse = String8Abbrev;
+      }
+      bool isCStr7 = Code == bitc::CST_CODE_CSTRING;
+      bool isCStrChar6 = Code == bitc::CST_CODE_CSTRING;
+      for (unsigned i = 0; i != NumOps; ++i) {
+        unsigned char V = cast(CA->getOperand(i))->getZExtValue();
+        Record.push_back(V);
+        isCStr7 &= (V & 128) == 0;
+        if (isCStrChar6)
+          isCStrChar6 = BitCodeAbbrevOp::isChar6(V);
+      }
+
+      if (isCStrChar6)
+        AbbrevToUse = CString6Abbrev;
+      else if (isCStr7)
+        AbbrevToUse = CString7Abbrev;
     } else if (isa(C) && cast(C)->isString()) {
       const ConstantDataSequential *Str = cast(C);

Modified: llvm/trunk/lib/Bitcode/Writer/ValueEnumerator.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/ValueEnumerator.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/Bitcode/Writer/ValueEnumerator.cpp (original)
+++ llvm/trunk/lib/Bitcode/Writer/ValueEnumerator.cpp Tue Jan 31 22:51:17 2012
@@ -321,6 +321,10 @@
   if (const Constant *C = dyn_cast(V)) {
     if (isa(C)) {
       // Initializers for globals are handled explicitly elsewhere.
+    } else if (isa(C) && cast(C)->isString()) {
+      // Do not enumerate the initializers for an array of simple characters.
+      // The initializers just pollute the value table, and we emit the strings
+      // specially.
     } else if (C->getNumOperands()) {
       // If a constant has operands, enumerate them. This makes sure that if a
       // constant has uses (for example an array of const ints), that they are

Modified: llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (original)
+++ llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Tue Jan 31 22:51:17 2012
@@ -1675,18 +1675,31 @@
 static void EmitGlobalConstantArray(const ConstantArray *CA, unsigned AddrSpace,
                                     AsmPrinter &AP) {
-  // See if we can aggregate some values. Make sure it can be
-  // represented as a series of bytes of the constant value.
-  int Value = isRepeatedByteSequence(CA, AP.TM);
-
-  if (Value != -1) {
-    uint64_t Bytes = AP.TM.getTargetData()->getTypeAllocSize(CA->getType());
-    AP.OutStreamer.EmitFill(Bytes, Value, AddrSpace);
-  }
-  else {
-    for (unsigned i = 0, e = CA->getNumOperands(); i != e; ++i)
-      EmitGlobalConstantImpl(CA->getOperand(i), AddrSpace, AP);
+  if (AddrSpace != 0 || !CA->isString()) {
+    // Not a string. Print the values in successive locations.
+
+    // See if we can aggregate some values. Make sure it can be
+    // represented as a series of bytes of the constant value.
+    int Value = isRepeatedByteSequence(CA, AP.TM);
+
+    if (Value != -1) {
+      uint64_t Bytes = AP.TM.getTargetData()->getTypeAllocSize(CA->getType());
+      AP.OutStreamer.EmitFill(Bytes, Value, AddrSpace);
+    }
+    else {
+      for (unsigned i = 0, e = CA->getNumOperands(); i != e; ++i)
+        EmitGlobalConstantImpl(CA->getOperand(i), AddrSpace, AP);
+    }
+    return;
   }
+
+  // Otherwise, it can be emitted as .ascii.
+  SmallVector TmpVec;
+  TmpVec.reserve(CA->getNumOperands());
+  for (unsigned i = 0, e = CA->getNumOperands(); i != e; ++i)
+    TmpVec.push_back(cast(CA->getOperand(i))->getZExtValue());
+
+  AP.OutStreamer.EmitBytes(StringRef(TmpVec.data(), TmpVec.size()), AddrSpace);
 }
 static void EmitGlobalConstantVector(const ConstantVector *CV,

Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Tue Jan 31 22:51:17 2012
@@ -3299,7 +3299,7 @@
 /// string ptr.
 static SDValue getMemsetStringVal(EVT VT, DebugLoc dl, SelectionDAG &DAG,
                                   const TargetLowering &TLI,
-                                  StringRef Str, unsigned Offset) {
+                                  std::string &Str, unsigned Offset) {
   // Handle vector with all elements zero.
   if (Str.empty()) {
     if (VT.isInteger())
@@ -3323,10 +3323,7 @@
   if (TLI.isLittleEndian())
     Offset = Offset + MSB - 1;
   for (unsigned i = 0; i != MSB; ++i) {
-    Val = (Val << 8);
-
-    if (Offset < Str.size())
-      Val |= (unsigned char)Str[Offset];
+    Val = (Val << 8) | (unsigned char)Str[Offset];
     Offset += TLI.isLittleEndian() ? -1 : 1;
   }
   return DAG.getConstant(Val, VT);
@@ -3343,7 +3340,7 @@
 /// isMemSrcFromString - Returns true if memcpy source is a string constant.
 ///
-static bool isMemSrcFromString(SDValue Src, StringRef &Str) {
+static bool isMemSrcFromString(SDValue Src, std::string &Str) {
   unsigned SrcDelta = 0;
   GlobalAddressSDNode *G = NULL;
   if (Src.getOpcode() == ISD::GlobalAddress)
@@ -3357,9 +3354,9 @@
   if (!G)
     return false;
-  if (const GlobalVariable *GV = dyn_cast(G->getGlobal()))
-    if (getConstantStringInfo(GV, Str, SrcDelta))
-      return true;
+  const GlobalVariable *GV = dyn_cast(G->getGlobal());
+  if (GV && GetConstantStringInfo(GV, Str, SrcDelta, false))
+    return true;
   return false;
 }
@@ -3464,7 +3461,7 @@
   unsigned SrcAlign = DAG.InferPtrAlignment(Src);
   if (Align > SrcAlign)
     SrcAlign = Align;
-  StringRef Str;
+  std::string Str;
   bool CopyFromStr = isMemSrcFromString(Src, Str);
   bool isZeroStr = CopyFromStr && Str.empty();
   unsigned Limit = AlwaysInline ? ~0U : TLI.getMaxStoresPerMemcpy(OptSize);

Modified: llvm/trunk/lib/Target/CBackend/CBackend.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CBackend/CBackend.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/Target/CBackend/CBackend.cpp (original)
+++ llvm/trunk/lib/Target/CBackend/CBackend.cpp Tue Jan 31 22:51:17 2012
@@ -558,21 +558,73 @@
 }
 void CWriter::printConstantArray(ConstantArray *CPA, bool Static) {
-  Out << "{ ";
-  printConstant(cast(CPA->getOperand(0)), Static);
-  for (unsigned i = 1, e = CPA->getNumOperands(); i != e; ++i) {
-    Out << ", ";
-    printConstant(cast(CPA->getOperand(i)), Static);
+  // As a special case, print the array as a string if it is an array of
+  // ubytes or an array of sbytes with positive values.
+  //
+  if (CPA->isCString()) {
+    Out << '\"';
+    // Keep track of whether the last number was a hexadecimal escape.
+    bool LastWasHex = false;
+
+    // Do not include the last character, which we know is null
+    for (unsigned i = 0, e = CPA->getNumOperands()-1; i != e; ++i) {
+      unsigned char C = cast(CPA->getOperand(i))->getZExtValue();
+
+      // Print it out literally if it is a printable character. The only thing
+      // to be careful about is when the last letter output was a hex escape
+      // code, in which case we have to be careful not to print out hex digits
+      // explicitly (the C compiler thinks it is a continuation of the previous
+      // character, sheesh...)
+      //
+      if (isprint(C) && (!LastWasHex || !isxdigit(C))) {
+        LastWasHex = false;
+        if (C == '"' || C == '\\')
+          Out << "\\" << (char)C;
+        else
+          Out << (char)C;
+      } else {
+        LastWasHex = false;
+        switch (C) {
+        case '\n': Out << "\\n"; break;
+        case '\t': Out << "\\t"; break;
+        case '\r': Out << "\\r"; break;
+        case '\v': Out << "\\v"; break;
+        case '\a': Out << "\\a"; break;
+        case '\"': Out << "\\\""; break;
+        case '\'': Out << "\\\'"; break;
+        default:
+          Out << "\\x";
+          Out << (char)(( C/16 < 10) ? ( C/16 +'0') : ( C/16 -10+'A'));
+          Out << (char)(((C&15) < 10) ? ((C&15)+'0') : ((C&15)-10+'A'));
+          LastWasHex = true;
+          break;
+        }
+      }
+    }
+    Out << '\"';
+  } else {
+    Out << '{';
+    if (CPA->getNumOperands()) {
+      Out << ' ';
+      printConstant(cast(CPA->getOperand(0)), Static);
+      for (unsigned i = 1, e = CPA->getNumOperands(); i != e; ++i) {
+        Out << ", ";
+        printConstant(cast(CPA->getOperand(i)), Static);
+      }
+    }
+    Out << " }";
   }
-  Out << " }";
 }
 void CWriter::printConstantVector(ConstantVector *CP, bool Static) {
-  Out << "{ ";
-  printConstant(cast(CP->getOperand(0)), Static);
-  for (unsigned i = 1, e = CP->getNumOperands(); i != e; ++i) {
-    Out << ", ";
-    printConstant(cast(CP->getOperand(i)), Static);
+  Out << '{';
+  if (CP->getNumOperands()) {
+    Out << ' ';
+    printConstant(cast(CP->getOperand(0)), Static);
+    for (unsigned i = 1, e = CP->getNumOperands(); i != e; ++i) {
+      Out << ", ";
+      printConstant(cast(CP->getOperand(i)), Static);
+    }
   }
   Out << " }";
 }

Modified: llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp (original)
+++ llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp Tue Jan 31 22:51:17 2012
@@ -698,17 +698,36 @@
     printCFP(CFP);
     Out << ";";
   } else if (const ConstantArray *CA = dyn_cast(CV)) {
-    Out << "std::vector " << constName << "_elems;";
-    nl(Out);
-    unsigned N = CA->getNumOperands();
-    for (unsigned i = 0; i < N; ++i) {
-      printConstant(CA->getOperand(i)); // recurse to print operands
-      Out << constName << "_elems.push_back("
-          << getCppName(CA->getOperand(i)) << ");";
+    if (CA->isString()) {
+      Out << "Constant* " << constName <<
+             " = ConstantArray::get(mod->getContext(), \"";
+      std::string tmp = CA->getAsString();
+      bool nullTerminate = false;
+      if (tmp[tmp.length()-1] == 0) {
+        tmp.erase(tmp.length()-1);
+        nullTerminate = true;
+      }
+      printEscapedString(tmp);
+      // Determine if we want null termination or not.
+      if (nullTerminate)
+        Out << "\", true"; // Indicate that the null terminator should be
+                           // added.
+      else
+        Out << "\", false";// No null terminator
+      Out << ");";
+    } else {
+      Out << "std::vector " << constName << "_elems;";
       nl(Out);
+      unsigned N = CA->getNumOperands();
+      for (unsigned i = 0; i < N; ++i) {
+        printConstant(CA->getOperand(i)); // recurse to print operands
+        Out << constName << "_elems.push_back("
+            << getCppName(CA->getOperand(i)) << ");";
+        nl(Out);
+      }
+      Out << "Constant* " << constName << " = ConstantArray::get("
+          << typeName << ", " << constName << "_elems);";
     }
-    Out << "Constant* " << constName << " = ConstantArray::get("
-        << typeName << ", " << constName << "_elems);";
   } else if (const ConstantStruct *CS = dyn_cast(CV)) {
     Out << "std::vector " << constName << "_fields;";
     nl(Out);
@@ -721,14 +740,14 @@
     }
     Out << "Constant* " << constName << " = ConstantStruct::get("
         << typeName << ", " << constName << "_fields);";
-  } else if (const ConstantVector *CV = dyn_cast(CV)) {
+  } else if (const ConstantVector *CP = dyn_cast(CV)) {
     Out << "std::vector " << constName << "_elems;";
     nl(Out);
-    unsigned N = CV->getNumOperands();
+    unsigned N = CP->getNumOperands();
     for (unsigned i = 0; i < N; ++i) {
-      printConstant(CV->getOperand(i));
+      printConstant(CP->getOperand(i));
       Out << constName << "_elems.push_back("
-          << getCppName(CV->getOperand(i)) << ");";
+          << getCppName(CP->getOperand(i)) << ");";
       nl(Out);
     }
     Out << "Constant* " << constName << " = ConstantVector::get("
@@ -741,7 +760,7 @@
     if (CDS->isString()) {
       Out << "Constant *" << constName <<
              " = ConstantDataArray::getString(mod->getContext(), \"";
-      StringRef Str = CDS->getAsString();
+      StringRef Str = CA->getAsString();
       bool nullTerminate = false;
       if (Str.back() == 0) {
         Str = Str.drop_back();

Modified: llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp (original)
+++ llvm/trunk/lib/Transforms/Instrumentation/AddressSanitizer.cpp Tue Jan 31 22:51:17 2012
@@ -213,7 +213,7 @@
 // Create a constant for Str so that we can pass it to the run-time lib.
 static GlobalVariable *createPrivateGlobalForString(Module &M, StringRef Str) {
-  Constant *StrConst = ConstantDataArray::getString(M.getContext(), Str);
+  Constant *StrConst = ConstantArray::get(M.getContext(), Str);
   return new GlobalVariable(M, StrConst->getType(), true,
                             GlobalValue::PrivateLinkage, StrConst, "");
 }

Modified: llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp?rev=149470&r1=149469&r2=149470&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Tue Jan 31 22:51:17 2012
@@ -256,18 +256,19 @@
                          ConstantInt::get(TD->getIntPtrType(*Context), Len),
                          B, TD);
     }
-
+
     // Otherwise, the character is a constant, see if the first argument is
     // a string literal. If so, we can constant fold.
-    StringRef Str;
-    if (!getConstantStringInfo(SrcStr, Str))
+    std::string Str;
+    if (!GetConstantStringInfo(SrcStr, Str))
       return 0;
-    // Compute the offset, make sure to handle the case when we're searching for
-    // zero (a weird way to spell strlen).
-    size_t I = CharC->getSExtValue() == 0 ?
-        Str.size() : Str.find(CharC->getSExtValue());
-    if (I == StringRef::npos) // Didn't find the char. strchr returns null.
+    // strchr can find the nul character.
+    Str += '\0';
+
+    // Compute the offset.
+ size_t I = Str.find(CharC->getSExtValue()); + if (I == std::string::npos) // Didn't find the char. strchr returns null. return Constant::getNullValue(CI->getType()); // strchr(s+n,c) -> gep(s+n+i,c) @@ -295,18 +296,20 @@ if (!CharC) return 0; - StringRef Str; - if (!getConstantStringInfo(SrcStr, Str)) { + std::string Str; + if (!GetConstantStringInfo(SrcStr, Str)) { // strrchr(s, 0) -> strchr(s, 0) if (TD && CharC->isZero()) return EmitStrChr(SrcStr, '\0', B, TD); return 0; } + // strrchr can find the nul character. + Str += '\0'; + // Compute the offset. - size_t I = CharC->getSExtValue() == 0 ? - Str.size() : Str.rfind(CharC->getSExtValue()); - if (I == StringRef::npos) // Didn't find the char. Return null. + size_t I = Str.rfind(CharC->getSExtValue()); + if (I == std::string::npos) // Didn't find the char. Return null. return Constant::getNullValue(CI->getType()); // strrchr(s+n,c) -> gep(s+n+i,c) @@ -331,13 +334,14 @@ if (Str1P == Str2P) // strcmp(x,x) -> 0 return ConstantInt::get(CI->getType(), 0); - StringRef Str1, Str2; - bool HasStr1 = getConstantStringInfo(Str1P, Str1); - bool HasStr2 = getConstantStringInfo(Str2P, Str2); + std::string Str1, Str2; + bool HasStr1 = GetConstantStringInfo(Str1P, Str1); + bool HasStr2 = GetConstantStringInfo(Str2P, Str2); // strcmp(x, y) -> cnst (if both x and y are constant strings) if (HasStr1 && HasStr2) - return ConstantInt::get(CI->getType(), Str1.compare(Str2)); + return ConstantInt::get(CI->getType(), + StringRef(Str1).compare(Str2)); if (HasStr1 && Str1.empty()) // strcmp("", x) -> -*x return B.CreateNeg(B.CreateZExt(B.CreateLoad(Str2P, "strcmpload"), @@ -393,14 +397,14 @@ if (TD && Length == 1) // strncmp(x,y,1) -> memcmp(x,y,1) return EmitMemCmp(Str1P, Str2P, CI->getArgOperand(2), B, TD); - StringRef Str1, Str2; - bool HasStr1 = getConstantStringInfo(Str1P, Str1); - bool HasStr2 = getConstantStringInfo(Str2P, Str2); + std::string Str1, Str2; + bool HasStr1 = GetConstantStringInfo(Str1P, Str1); + bool HasStr2 = 
GetConstantStringInfo(Str2P, Str2); // strncmp(x, y) -> cnst (if both x and y are constant strings) if (HasStr1 && HasStr2) { - StringRef SubStr1 = Str1.substr(0, Length); - StringRef SubStr2 = Str2.substr(0, Length); + StringRef SubStr1 = StringRef(Str1).substr(0, Length); + StringRef SubStr2 = StringRef(Str2).substr(0, Length); return ConstantInt::get(CI->getType(), SubStr1.compare(SubStr2)); } @@ -545,9 +549,9 @@ FT->getReturnType() != FT->getParamType(0)) return 0; - StringRef S1, S2; - bool HasS1 = getConstantStringInfo(CI->getArgOperand(0), S1); - bool HasS2 = getConstantStringInfo(CI->getArgOperand(1), S2); + std::string S1, S2; + bool HasS1 = GetConstantStringInfo(CI->getArgOperand(0), S1); + bool HasS2 = GetConstantStringInfo(CI->getArgOperand(1), S2); // strpbrk(s, "") -> NULL // strpbrk("", s) -> NULL @@ -605,9 +609,9 @@ !FT->getReturnType()->isIntegerTy()) return 0; - StringRef S1, S2; - bool HasS1 = getConstantStringInfo(CI->getArgOperand(0), S1); - bool HasS2 = getConstantStringInfo(CI->getArgOperand(1), S2); + std::string S1, S2; + bool HasS1 = GetConstantStringInfo(CI->getArgOperand(0), S1); + bool HasS2 = GetConstantStringInfo(CI->getArgOperand(1), S2); // strspn(s, "") -> 0 // strspn("", s) -> 0 @@ -615,11 +619,8 @@ return Constant::getNullValue(CI->getType()); // Constant folding. 
- if (HasS1 && HasS2) { - size_t Pos = S1.find_first_not_of(S2); - if (Pos == StringRef::npos) Pos = S1.size(); - return ConstantInt::get(CI->getType(), Pos); - } + if (HasS1 && HasS2) + return ConstantInt::get(CI->getType(), strspn(S1.c_str(), S2.c_str())); return 0; } @@ -637,20 +638,17 @@ !FT->getReturnType()->isIntegerTy()) return 0; - StringRef S1, S2; - bool HasS1 = getConstantStringInfo(CI->getArgOperand(0), S1); - bool HasS2 = getConstantStringInfo(CI->getArgOperand(1), S2); + std::string S1, S2; + bool HasS1 = GetConstantStringInfo(CI->getArgOperand(0), S1); + bool HasS2 = GetConstantStringInfo(CI->getArgOperand(1), S2); // strcspn("", s) -> 0 if (HasS1 && S1.empty()) return Constant::getNullValue(CI->getType()); // Constant folding. - if (HasS1 && HasS2) { - size_t Pos = S1.find_first_of(S2); - if (Pos == StringRef::npos) Pos = S1.size(); - return ConstantInt::get(CI->getType(), Pos); - } + if (HasS1 && HasS2) + return ConstantInt::get(CI->getType(), strcspn(S1.c_str(), S2.c_str())); // strcspn(s, "") -> strlen(s) if (TD && HasS2 && S2.empty()) @@ -694,9 +692,9 @@ } // See if either input string is a constant string. - StringRef SearchStr, ToFindStr; - bool HasStr1 = getConstantStringInfo(CI->getArgOperand(0), SearchStr); - bool HasStr2 = getConstantStringInfo(CI->getArgOperand(1), ToFindStr); + std::string SearchStr, ToFindStr; + bool HasStr1 = GetConstantStringInfo(CI->getArgOperand(0), SearchStr); + bool HasStr2 = GetConstantStringInfo(CI->getArgOperand(1), ToFindStr); // fold strstr(x, "") -> x. 
if (HasStr2 && ToFindStr.empty()) @@ -706,7 +704,7 @@ if (HasStr1 && HasStr2) { std::string::size_type Offset = SearchStr.find(ToFindStr); - if (Offset == StringRef::npos) // strstr("foo", "bar") -> null + if (Offset == std::string::npos) // strstr("foo", "bar") -> null return Constant::getNullValue(CI->getType()); // strstr("abcd", "bc") -> gep((char*)"abcd", 1) @@ -758,11 +756,11 @@ } // Constant folding: memcmp(x, y, l) -> cnst (all arguments are constant) - StringRef LHSStr, RHSStr; - if (getConstantStringInfo(LHS, LHSStr) && - getConstantStringInfo(RHS, RHSStr)) { + std::string LHSStr, RHSStr; + if (GetConstantStringInfo(LHS, LHSStr) && + GetConstantStringInfo(RHS, RHSStr)) { // Make sure we're not reading out-of-bounds memory. - if (Len > LHSStr.size() || Len > RHSStr.size()) + if (Len > LHSStr.length() || Len > RHSStr.length()) return 0; uint64_t Ret = memcmp(LHSStr.data(), RHSStr.data(), Len); return ConstantInt::get(CI->getType(), Ret); @@ -1118,8 +1116,8 @@ Value *OptimizeFixedFormatString(Function *Callee, CallInst *CI, IRBuilder<> &B) { // Check for a fixed format string. - StringRef FormatStr; - if (!getConstantStringInfo(CI->getArgOperand(0), FormatStr)) + std::string FormatStr; + if (!GetConstantStringInfo(CI->getArgOperand(0), FormatStr)) return 0; // Empty format string -> noop. @@ -1145,7 +1143,7 @@ FormatStr.find('%') == std::string::npos) { // no format characters. // Create a string literal with no \n on it. We expect the constant merge // pass to be run after this pass, to merge duplicate strings. - FormatStr = FormatStr.drop_back(); + FormatStr.erase(FormatStr.end()-1); Value *GV = B.CreateGlobalString(FormatStr, "str"); EmitPutS(GV, B, TD); return CI->use_empty() ? (Value*)CI : @@ -1205,8 +1203,8 @@ Value *OptimizeFixedFormatString(Function *Callee, CallInst *CI, IRBuilder<> &B) { // Check for a fixed format string. 
- StringRef FormatStr; - if (!getConstantStringInfo(CI->getArgOperand(1), FormatStr)) + std::string FormatStr; + if (!GetConstantStringInfo(CI->getArgOperand(1), FormatStr)) return 0; // If we just have a format string (nothing else crazy) transform it. @@ -1360,8 +1358,8 @@ Value *OptimizeFixedFormatString(Function *Callee, CallInst *CI, IRBuilder<> &B) { // All the optimizations depend on the format string. - StringRef FormatStr; - if (!getConstantStringInfo(CI->getArgOperand(1), FormatStr)) + std::string FormatStr; + if (!GetConstantStringInfo(CI->getArgOperand(1), FormatStr)) return 0; // fprintf(F, "foo") --> fwrite("foo", 3, 1, F) @@ -1444,8 +1442,8 @@ return 0; // Check for a constant string. - StringRef Str; - if (!getConstantStringInfo(CI->getArgOperand(0), Str)) + std::string Str; + if (!GetConstantStringInfo(CI->getArgOperand(0), Str)) return 0; if (Str.empty() && CI->use_empty()) { @@ -2415,8 +2413,6 @@ // * stpcpy(str, "literal") -> // llvm.memcpy(str,"literal",strlen("literal")+1,1) // -// strchr: -// * strchr(p, 0) -> strlen(p) // tan, tanf, tanl: // * tan(atan(x)) -> x // Modified: llvm/trunk/lib/VMCore/AsmWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/AsmWriter.cpp?rev=149470&r1=149469&r2=149470&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/AsmWriter.cpp (original) +++ llvm/trunk/lib/VMCore/AsmWriter.cpp Tue Jan 31 22:51:17 2012 @@ -827,21 +827,30 @@ } if (const ConstantArray *CA = dyn_cast(CV)) { + // As a special case, print the array as a string if it is an array of + // i8 with ConstantInt values. 
+ // Type *ETy = CA->getType()->getElementType(); - Out << '['; - TypePrinter.print(ETy, Out); - Out << ' '; - WriteAsOperandInternal(Out, CA->getOperand(0), - &TypePrinter, Machine, - Context); - for (unsigned i = 1, e = CA->getNumOperands(); i != e; ++i) { - Out << ", "; + if (CA->isString()) { + Out << "c\""; + PrintEscapedString(CA->getAsString(), Out); + Out << '"'; + } else { // Cannot output in string format... + Out << '['; TypePrinter.print(ETy, Out); Out << ' '; - WriteAsOperandInternal(Out, CA->getOperand(i), &TypePrinter, Machine, + WriteAsOperandInternal(Out, CA->getOperand(0), + &TypePrinter, Machine, Context); + for (unsigned i = 1, e = CA->getNumOperands(); i != e; ++i) { + Out << ", "; + TypePrinter.print(ETy, Out); + Out << ' '; + WriteAsOperandInternal(Out, CA->getOperand(i), &TypePrinter, Machine, + Context); + } + Out << ']'; } - Out << ']'; return; } Modified: llvm/trunk/lib/VMCore/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Constants.cpp?rev=149470&r1=149469&r2=149470&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Constants.cpp (original) +++ llvm/trunk/lib/VMCore/Constants.cpp Tue Jan 31 22:51:17 2012 @@ -176,7 +176,7 @@ return UV->getElementValue(Elt); if (const ConstantDataSequential *CDS =dyn_cast(this)) - return Elt < CDS->getNumElements() ? CDS->getElementAsConstant(Elt) : 0; + return CDS->getElementAsConstant(Elt); return 0; } @@ -666,13 +666,6 @@ // ConstantXXX Classes //===----------------------------------------------------------------------===// -template -static bool rangeOnlyContains(ItTy Start, ItTy End, EltTy Elt) { - for (; Start != End; ++Start) - if (*Start != Elt) - return false; - return true; -} ConstantArray::ConstantArray(ArrayType *T, ArrayRef V) : Constant(T, ConstantArrayVal, @@ -687,97 +680,54 @@ } Constant *ConstantArray::get(ArrayType *Ty, ArrayRef V) { - // Empty arrays are canonicalized to ConstantAggregateZero. 
- if (V.empty()) - return ConstantAggregateZero::get(Ty); - for (unsigned i = 0, e = V.size(); i != e; ++i) { assert(V[i]->getType() == Ty->getElementType() && "Wrong type in array element initializer"); } LLVMContextImpl *pImpl = Ty->getContext().pImpl; - - // If this is an all-zero array, return a ConstantAggregateZero object. If - // all undef, return an UndefValue, if "all simple", then return a - // ConstantDataArray. - Constant *C = V[0]; - if (isa(C) && rangeOnlyContains(V.begin(), V.end(), C)) - return UndefValue::get(Ty); + // If this is an all-zero array, return a ConstantAggregateZero object + bool isAllZero = true; + bool isUndef = false; + if (!V.empty()) { + Constant *C = V[0]; + isAllZero = C->isNullValue(); + isUndef = isa(C); + + if (isAllZero || isUndef) + for (unsigned i = 1, e = V.size(); i != e; ++i) + if (V[i] != C) { + isAllZero = false; + isUndef = false; + break; + } + } - if (C->isNullValue() && rangeOnlyContains(V.begin(), V.end(), C)) + if (isAllZero) return ConstantAggregateZero::get(Ty); + if (isUndef) + return UndefValue::get(Ty); + return pImpl->ArrayConstants.getOrCreate(Ty, V); +} - // Check to see if all of the elements are ConstantFP or ConstantInt and if - // the element type is compatible with ConstantDataVector. If so, use it. - if (ConstantDataSequential::isElementTypeCompatible(C->getType())) { - // We speculatively build the elements here even if it turns out that there - // is a constantexpr or something else weird in the array, since it is so - // uncommon for that to happen. 
- if (ConstantInt *CI = dyn_cast(C)) { - if (CI->getType()->isIntegerTy(8)) { - SmallVector Elts; - for (unsigned i = 0, e = V.size(); i != e; ++i) - if (ConstantInt *CI = dyn_cast(V[i])) - Elts.push_back(CI->getZExtValue()); - else - break; - if (Elts.size() == V.size()) - return ConstantDataArray::get(C->getContext(), Elts); - } else if (CI->getType()->isIntegerTy(16)) { - SmallVector Elts; - for (unsigned i = 0, e = V.size(); i != e; ++i) - if (ConstantInt *CI = dyn_cast(V[i])) - Elts.push_back(CI->getZExtValue()); - else - break; - if (Elts.size() == V.size()) - return ConstantDataArray::get(C->getContext(), Elts); - } else if (CI->getType()->isIntegerTy(32)) { - SmallVector Elts; - for (unsigned i = 0, e = V.size(); i != e; ++i) - if (ConstantInt *CI = dyn_cast(V[i])) - Elts.push_back(CI->getZExtValue()); - else - break; - if (Elts.size() == V.size()) - return ConstantDataArray::get(C->getContext(), Elts); - } else if (CI->getType()->isIntegerTy(64)) { - SmallVector Elts; - for (unsigned i = 0, e = V.size(); i != e; ++i) - if (ConstantInt *CI = dyn_cast(V[i])) - Elts.push_back(CI->getZExtValue()); - else - break; - if (Elts.size() == V.size()) - return ConstantDataArray::get(C->getContext(), Elts); - } - } - - if (ConstantFP *CFP = dyn_cast(C)) { - if (CFP->getType()->isFloatTy()) { - SmallVector Elts; - for (unsigned i = 0, e = V.size(); i != e; ++i) - if (ConstantFP *CFP = dyn_cast(V[i])) - Elts.push_back(CFP->getValueAPF().convertToFloat()); - else - break; - if (Elts.size() == V.size()) - return ConstantDataArray::get(C->getContext(), Elts); - } else if (CFP->getType()->isDoubleTy()) { - SmallVector Elts; - for (unsigned i = 0, e = V.size(); i != e; ++i) - if (ConstantFP *CFP = dyn_cast(V[i])) - Elts.push_back(CFP->getValueAPF().convertToDouble()); - else - break; - if (Elts.size() == V.size()) - return ConstantDataArray::get(C->getContext(), Elts); - } - } - } +/// ConstantArray::get(const string&) - Return an array that is initialized to +/// contain the 
specified string. If length is zero then a null terminator is +/// added to the specified string so that it may be used in a natural way. +/// Otherwise, the length parameter specifies how much of the string to use +/// and it won't be null terminated. +/// +Constant *ConstantArray::get(LLVMContext &Context, StringRef Str, + bool AddNull) { + SmallVector ElementVals; + ElementVals.reserve(Str.size() + size_t(AddNull)); + for (unsigned i = 0; i < Str.size(); ++i) + ElementVals.push_back(ConstantInt::get(Type::getInt8Ty(Context), Str[i])); + + // Add a null terminator to the string... + if (AddNull) + ElementVals.push_back(ConstantInt::get(Type::getInt8Ty(Context), 0)); - // Otherwise, we really do want to create a ConstantArray. - return pImpl->ArrayConstants.getOrCreate(Ty, V); + ArrayType *ATy = ArrayType::get(Type::getInt8Ty(Context), ElementVals.size()); + return get(ATy, ElementVals); } /// getTypeForElements - Return an anonymous struct type to use for a constant @@ -889,7 +839,8 @@ // Check to see if all of the elements are ConstantFP or ConstantInt and if // the element type is compatible with ConstantDataVector. If so, use it. - if (ConstantDataSequential::isElementTypeCompatible(C->getType())) { + if (ConstantDataSequential::isElementTypeCompatible(C->getType()) && + (isa(C) || isa(C))) { // We speculatively build the elements here even if it turns out that there // is a constantexpr or something else weird in the array, since it is so // uncommon for that to happen. @@ -1195,6 +1146,69 @@ destroyConstantImpl(); } +/// isString - This method returns true if the array is an array of i8, and +/// if the elements of the array are all ConstantInt's. +bool ConstantArray::isString() const { + // Check the element type for i8... + if (!getType()->getElementType()->isIntegerTy(8)) + return false; + // Check the elements to make sure they are all integers, not constant + // expressions. 
+ for (unsigned i = 0, e = getNumOperands(); i != e; ++i) + if (!isa(getOperand(i))) + return false; + return true; +} + +/// isCString - This method returns true if the array is a string (see +/// isString) and it ends in a null byte \\0 and does not contains any other +/// null bytes except its terminator. +bool ConstantArray::isCString() const { + // Check the element type for i8... + if (!getType()->getElementType()->isIntegerTy(8)) + return false; + + // Last element must be a null. + if (!getOperand(getNumOperands()-1)->isNullValue()) + return false; + // Other elements must be non-null integers. + for (unsigned i = 0, e = getNumOperands()-1; i != e; ++i) { + if (!isa(getOperand(i))) + return false; + if (getOperand(i)->isNullValue()) + return false; + } + return true; +} + + +/// convertToString - Helper function for getAsString() and getAsCString(). +static std::string convertToString(const User *U, unsigned len) { + std::string Result; + Result.reserve(len); + for (unsigned i = 0; i != len; ++i) + Result.push_back((char)cast(U->getOperand(i))->getZExtValue()); + return Result; +} + +/// getAsString - If this array is isString(), then this method converts the +/// array to an std::string and returns it. Otherwise, it asserts out. +/// +std::string ConstantArray::getAsString() const { + assert(isString() && "Not a string!"); + return convertToString(this, getNumOperands()); +} + + +/// getAsCString - If this array is isCString(), then this method converts the +/// array (without the trailing null byte) to an std::string and returns it. +/// Otherwise, it asserts out. +/// +std::string ConstantArray::getAsCString() const { + assert(isCString() && "Not a string!"); + return convertToString(this, getNumOperands() - 1); +} + //---- ConstantStruct::get() implementation... 
// Modified: llvm/trunk/lib/VMCore/Core.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Core.cpp?rev=149470&r1=149469&r2=149470&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Core.cpp (original) +++ llvm/trunk/lib/VMCore/Core.cpp Tue Jan 31 22:51:17 2012 @@ -634,8 +634,8 @@ LLVMBool DontNullTerminate) { /* Inverted the sense of AddNull because ', 0)' is a better mnemonic for null termination than ', 1)'. */ - return wrap(ConstantDataArray::getString(*unwrap(C), StringRef(Str, Length), - DontNullTerminate == 0)); + return wrap(ConstantArray::get(*unwrap(C), StringRef(Str, Length), + DontNullTerminate == 0)); } LLVMValueRef LLVMConstStructInContext(LLVMContextRef C, LLVMValueRef *ConstantVals, Modified: llvm/trunk/lib/VMCore/IRBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/IRBuilder.cpp?rev=149470&r1=149469&r2=149470&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/IRBuilder.cpp (original) +++ llvm/trunk/lib/VMCore/IRBuilder.cpp Tue Jan 31 22:51:17 2012 @@ -24,7 +24,7 @@ /// specified. If Name is specified, it is the name of the global variable /// created. 
Value *IRBuilderBase::CreateGlobalString(StringRef Str, const Twine &Name) { - Constant *StrConstant = ConstantDataArray::getString(Context, Str); + Constant *StrConstant = ConstantArray::get(Context, Str, true); Module &M = *BB->getParent()->getParent(); GlobalVariable *GV = new GlobalVariable(M, StrConstant->getType(), true, GlobalValue::PrivateLinkage, Modified: llvm/trunk/tools/bugpoint/Miscompilation.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/bugpoint/Miscompilation.cpp?rev=149470&r1=149469&r2=149470&view=diff ============================================================================== --- llvm/trunk/tools/bugpoint/Miscompilation.cpp (original) +++ llvm/trunk/tools/bugpoint/Miscompilation.cpp Tue Jan 31 22:51:17 2012 @@ -820,8 +820,7 @@ // Don't forward functions which are external in the test module too. if (TestFn && !TestFn->isDeclaration()) { // 1. Add a string constant with its name to the global file - Constant *InitArray = - ConstantDataArray::getString(F->getContext(), F->getName()); + Constant *InitArray = ConstantArray::get(F->getContext(), F->getName()); GlobalVariable *funcName = new GlobalVariable(*Safe, InitArray->getType(), true /*isConstant*/, GlobalValue::InternalLinkage, InitArray, Modified: llvm/trunk/tools/lto/LTOModule.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOModule.cpp?rev=149470&r1=149469&r2=149470&view=diff ============================================================================== --- llvm/trunk/tools/lto/LTOModule.cpp (original) +++ llvm/trunk/tools/lto/LTOModule.cpp Tue Jan 31 22:51:17 2012 @@ -190,9 +190,9 @@ Constant *op = ce->getOperand(0); if (GlobalVariable *gvn = dyn_cast(op)) { Constant *cn = gvn->getInitializer(); - if (ConstantDataArray *ca = dyn_cast(cn)) { + if (ConstantArray *ca = dyn_cast(cn)) { if (ca->isCString()) { - name = ".objc_class_name_" + ca->getAsCString().str(); + name = ".objc_class_name_" + ca->getAsCString(); return true; } } From kyrtzidis at apple.com 
Tue Jan 31 22:56:13 2012
From: kyrtzidis at apple.com (Argyrios Kyrtzidis)
Date: Tue, 31 Jan 2012 20:56:13 -0800
Subject: [llvm-commits] [llvm] r149348 - /llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
In-Reply-To: <38C376D1-635F-4573-99BD-AB155C9E9A2A@nondot.org>
References: <20120131043922.528222A6C12C@llvm.org> <38C376D1-635F-4573-99BD-AB155C9E9A2A@nondot.org>
Message-ID: <6BA7FA2A-C5A4-41A4-8E9C-E67F4048FC06@apple.com>

On Jan 31, 2012, at 12:44 PM, Chris Lattner wrote:

>
> On Jan 31, 2012, at 9:32 AM, NAKAMURA Takumi wrote:
>
>> 2012/1/31 Chris Lattner :
>>> Author: lattner
>>> Date: Mon Jan 30 22:39:22 2012
>>> New Revision: 149348
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=149348&view=rev
>>> Log:
>>> rework this logic to not depend on the last argument to GetConstantStringInfo,
>>> which is going away.
>>
>> Chris, it seems it might trigger a failure on
>> stage2(stage1-built-clang) build on my two builders, x86_64-linux and
>> i686-cygwin. I have not investigated why yet.
>
> Hi Takumi,
>
> Is this still failing for you? I cannot reproduce this, and it is very strange. I didn't change anything around metadata, so I'm not sure how I could have caused this.

Here's how to reproduce:

- Build llvm/clang
- Build Release+Asserts llvm/clang again with the newly built clang
- VMCoreTests fails

Apologies, I wasn't successful in untangling your changes without build or test failures, I ended up reverting all your commits up to the offending r149348:

r149348
r149351
r149352
r149354
r149356
r149357
r149361
r149362
r149364
r149365

Reverted in r149470.

-Argyrios

>
> -Chris
>
>>
>> ...Takumi
>>
>> ******************** TEST 'LLVM-Unit ::
>> VMCore/Release/VMCoreTests/MDStringTest.PrintingComplex' FAILED
>> ********************
>> Note: Google Test filter =
>> MDStringTest.PrintingComplex
>> [==========] Running 1 test from 1 test case.
>> [----------] Global test environment set-up.
>> [----------] 1 test from MDStringTest
>> [ RUN ] MDStringTest.PrintingComplex
>> /home/bb/buildslave/clang-3stage-x86_64-linux/llvm-project/llvm/unittests/VMCore/MetadataTest.cpp:71:
>> Failure
>> Value of: oss.str().c_str()
>> Actual: "metadata !"\00\00\00\00\00""
>> Expected: "metadata !\"\\00\\0A\\22\\5C\\FF\""
>> Which is: "metadata !"\00\0A\22\5C\FF""
>> [ FAILED ] MDStringTest.PrintingComplex (0 ms)
>> [----------] 1 test from MDStringTest (0 ms total)
>>
>> [----------] Global test environment tear-down
>> [==========] 1 test from 1 test case ran. (0 ms total)
>> [ PASSED ] 0 tests.
>> [ FAILED ] 1 test, listed below:
>> [ FAILED ] MDStringTest.PrintingComplex
>>
>> 1 FAILED TEST
>>
>> ********************
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From nicholas at mxc.ca Tue Jan 31 22:59:29 2012
From: nicholas at mxc.ca (Nick Lewycky)
Date: Tue, 31 Jan 2012 20:59:29 -0800
Subject: [llvm-commits] [llvm] r149468 - in /llvm/trunk: docs/ include/llvm-c/ include/llvm-c/Transforms/ include/llvm/ include/llvm/Transforms/ include/llvm/Transforms/IPO/ lib/Transforms/ lib/Transforms/IPO/ lib/Transforms/Vectorize/ test/Transforms/BBVectorize/ tools/bugpoint/ tools/llvm-ld/ tools/lto/ tools/opt/
In-Reply-To: <20120201035145.411492A6C12C@llvm.org>
References: <20120201035145.411492A6C12C@llvm.org>
Message-ID: <4F28C6B1.4060409@mxc.ca>

Hal Finkel wrote:
> Author: hfinkel
> Date: Tue Jan 31 21:51:43 2012
> New Revision: 149468
>
> URL: http://llvm.org/viewvc/llvm-project?rev=149468&view=rev
> Log:
> Add a basic-block autovectorization pass.
>
> This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
> Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).

Great stuff!

> Copied: llvm/trunk/include/llvm-c/Transforms/Vectorize.h (from r149457, llvm/trunk/include/llvm-c/Initialization.h)
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Transforms/Vectorize.h?p2=llvm/trunk/include/llvm-c/Transforms/Vectorize.h&p1=llvm/trunk/include/llvm-c/Initialization.h&r1=149457&r2=149468&rev=149468&view=diff
> ==============================================================================
> --- llvm/trunk/include/llvm-c/Initialization.h (original)
> +++ llvm/trunk/include/llvm-c/Transforms/Vectorize.h Tue Jan 31 21:51:43 2012
> @@ -1,4 +1,5 @@
> -/*===-- llvm-c/Initialization.h - Initialization C Interface ------*- C -*-===*\
> +/*===---------------------------Vectorize.h ------------------- -*- C++ -*-===*\

-*- C -*- not C++.

> Modified: llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp?rev=149468&r1=149467&r2=149468&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp (original)
> +++ llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp Tue Jan 31 21:51:43 2012
> @@ -21,14 +21,20 @@
> #include "llvm/DefaultPasses.h"
> #include "llvm/PassManager.h"
> #include "llvm/Analysis/Passes.h"
> +#include "llvm/Analysis/Verifier.h"
> +#include "llvm/Support/CommandLine.h"
> #include "llvm/Target/TargetLibraryInfo.h"
> #include "llvm/Transforms/Scalar.h"
> +#include "llvm/Transforms/Vectorize.h"
> #include "llvm/Transforms/IPO.h"
> #include "llvm/ADT/SmallVector.h"
> #include "llvm/Support/ManagedStatic.h"
>
> using namespace llvm;
>
> +static cl::opt
> +RunVectorization("vectorize", cl::desc("Run vectorization passes"));
> +
> PassManagerBuilder::PassManagerBuilder() {
> OptLevel = 2;
> SizeLevel = 0;
> @@ -37,6 +43,7 @@
> DisableSimplifyLibCalls = false;
> DisableUnitAtATime = false;
> DisableUnrollLoops = false;
> + Vectorize =
RunVectorization;
> }
>
> PassManagerBuilder::~PassManagerBuilder() {
> @@ -172,6 +179,13 @@
>
> addExtensionsToPM(EP_ScalarOptimizerLate, MPM);
>
> + if (Vectorize) {
> + MPM.add(createBBVectorizePass());
> + MPM.add(createInstructionCombiningPass());
> + if (OptLevel> 1)
> + MPM.add(createGVNPass()); // Remove redundancies

Whooooaa... GVN is *really* expensive, I find it hard to believe that you want to run it twice even with vectorization on. Are you sure? What is this doing that instcombine isn't?

> Added: llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp?rev=149468&view=auto
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp (added)
> +++ llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp Tue Jan 31 21:51:43 2012
> @@ -0,0 +1,1796 @@
> +//===- BBVectorize.cpp - A Basic-Block Vectorizer -------------------------===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// This file implements a basic-block vectorization pass. The algorithm was
> +// inspired by that used by the Vienna MAP Vectorizor by Franchetti and Kral,
> +// et al. It works by looking for chains of pairable operations and then
> +// pairing them.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#define BBV_NAME "bb-vectorize"

I think it's safe to constant propagate this away.
:) > +#define DEBUG_TYPE BBV_NAME > +#include "llvm/Constants.h" > +#include "llvm/DerivedTypes.h" > +#include "llvm/Function.h" > +#include "llvm/Instructions.h" > +#include "llvm/IntrinsicInst.h" > +#include "llvm/Intrinsics.h" > +#include "llvm/LLVMContext.h" > +#include "llvm/Pass.h" > +#include "llvm/Type.h" > +#include "llvm/ADT/DenseMap.h" > +#include "llvm/ADT/DenseSet.h" > +#include "llvm/ADT/SmallVector.h" > +#include "llvm/ADT/Statistic.h" > +#include "llvm/ADT/STLExtras.h" > +#include "llvm/ADT/StringExtras.h" > +#include "llvm/Analysis/AliasAnalysis.h" > +#include "llvm/Analysis/AliasSetTracker.h" > +#include "llvm/Analysis/ScalarEvolution.h" > +#include "llvm/Analysis/ScalarEvolutionExpressions.h" > +#include "llvm/Analysis/ValueTracking.h" > +#include "llvm/Support/CommandLine.h" > +#include "llvm/Support/Debug.h" > +#include "llvm/Support/raw_ostream.h" > +#include "llvm/Support/ValueHandle.h" > +#include "llvm/Target/TargetData.h" > +#include "llvm/Transforms/Vectorize.h" > +#include > +#include > +using namespace llvm; > + > +static cl::opt > +ReqChainDepth("bb-vectorize-req-chain-depth", cl::init(6), cl::Hidden, > + cl::desc("The required chain depth for vectorization")); > + > +static cl::opt > +SearchLimit("bb-vectorize-search-limit", cl::init(400), cl::Hidden, > + cl::desc("The maximum search distance for instruction pairs")); > + > +static cl::opt > +SplatBreaksChain("bb-vectorize-splat-breaks-chain", cl::init(false), cl::Hidden, > + cl::desc("Replicating one element to a pair breaks the chain")); > + > +static cl::opt > +VectorBits("bb-vectorize-vector-bits", cl::init(128), cl::Hidden, > + cl::desc("The size of the native vector registers")); > + > +static cl::opt > +MaxIter("bb-vectorize-max-iter", cl::init(0), cl::Hidden, > + cl::desc("The maximum number of pairing iterations")); > + > +static cl::opt > +MaxCandPairsForCycleCheck("bb-vectorize-max-cycle-check-pairs", cl::init(200), > + cl::Hidden, cl::desc("The maximum number of candidate 
pairs with which to use" > + " a full cycle check")); > + > +static cl::opt > +NoInts("bb-vectorize-no-ints", cl::init(false), cl::Hidden, > + cl::desc("Don't try to vectorize integer values")); > + > +static cl::opt > +NoFloats("bb-vectorize-no-floats", cl::init(false), cl::Hidden, > + cl::desc("Don't try to vectorize floating-point values")); > + > +static cl::opt > +NoCasts("bb-vectorize-no-casts", cl::init(false), cl::Hidden, > + cl::desc("Don't try to vectorize casting (conversion) operations")); > + > +static cl::opt > +NoMath("bb-vectorize-no-math", cl::init(false), cl::Hidden, > + cl::desc("Don't try to vectorize floating-point math intrinsics")); > + > +static cl::opt > +NoFMA("bb-vectorize-no-fma", cl::init(false), cl::Hidden, > + cl::desc("Don't try to vectorize the fused-multiply-add intrinsic")); > + > +static cl::opt > +NoMemOps("bb-vectorize-no-mem-ops", cl::init(false), cl::Hidden, > + cl::desc("Don't try to vectorize loads and stores")); > + > +static cl::opt > +AlignedOnly("bb-vectorize-aligned-only", cl::init(false), cl::Hidden, > + cl::desc("Only generate aligned loads and stores")); > + > +static cl::opt > +FastDep("bb-vectorize-fast-dep", cl::init(false), cl::Hidden, > + cl::desc("Use a fast instruction dependency analysis")); > + > +#ifndef NDEBUG > +static cl::opt > +DebugInstructionExamination("bb-vectorize-debug-instruction-examination", > + cl::init(false), cl::Hidden, > + cl::desc("When debugging is enabled, output information on the" > + " instruction-examination process")); > +static cl::opt > +DebugCandidateSelection("bb-vectorize-debug-candidate-selection", > + cl::init(false), cl::Hidden, > + cl::desc("When debugging is enabled, output information on the" > + " candidate-selection process")); > +static cl::opt > +DebugPairSelection("bb-vectorize-debug-pair-selection", > + cl::init(false), cl::Hidden, > + cl::desc("When debugging is enabled, output information on the" > + " pair-selection process")); > +static cl::opt > 
+DebugCycleCheck("bb-vectorize-debug-cycle-check", > + cl::init(false), cl::Hidden, > + cl::desc("When debugging is enabled, output information on the" > + " cycle-checking process")); > +#endif > + > +STATISTIC(NumFusedOps, "Number of operations fused by bb-vectorize"); > + > +namespace { > + struct BBVectorize : public BasicBlockPass { > + static char ID; // Pass identification, replacement for typeid > + BBVectorize() : BasicBlockPass(ID) { > + initializeBBVectorizePass(*PassRegistry::getPassRegistry()); > + } > + > + typedef std::pair ValuePair; > + typedef std::pair ValuePairWithDepth; > + typedef std::pair VPPair; // A ValuePair pair > + typedef std::pair::iterator, > + std::multimap::iterator> VPIteratorPair; > + typedef std::pair::iterator, > + std::multimap::iterator> > + VPPIteratorPair; > + > + AliasAnalysis *AA; > + ScalarEvolution *SE; > + TargetData *TD; > + > + // FIXME: const correct? > + > + bool vectorizePairs(BasicBlock&BB); > + > + void getCandidatePairs(BasicBlock&BB, > + std::multimap &CandidatePairs, > + std::vector &PairableInsts); > + > + void computeConnectedPairs(std::multimap &CandidatePairs, > + std::vector &PairableInsts, > + std::multimap &ConnectedPairs); > + > + void buildDepMap(BasicBlock&BB, > + std::multimap &CandidatePairs, > + std::vector &PairableInsts, > + DenseSet &PairableInstUsers); > + > + void choosePairs(std::multimap &CandidatePairs, > + std::vector &PairableInsts, > + std::multimap &ConnectedPairs, > + DenseSet &PairableInstUsers, > + DenseMap& ChosenPairs); > + > + void fuseChosenPairs(BasicBlock&BB, > + std::vector &PairableInsts, > + DenseMap& ChosenPairs); > + > + bool isInstVectorizable(Instruction *I, bool&IsSimpleLoadStore); > + > + bool areInstsCompatible(Instruction *I, Instruction *J, > + bool IsSimpleLoadStore); > + > + bool trackUsesOfI(DenseSet &Users, > + AliasSetTracker&WriteSet, Instruction *I, > + Instruction *J, bool UpdateUsers = true, > + std::multimap *LoadMoveSet = 0); > + > + void 
computePairsConnectedTo( > + std::multimap &CandidatePairs, > + std::vector &PairableInsts, > + std::multimap &ConnectedPairs, > + ValuePair P); > + > + bool pairsConflict(ValuePair P, ValuePair Q, > + DenseSet &PairableInstUsers, > + std::multimap *PairableInstUserMap = 0); > + > + bool pairWillFormCycle(ValuePair P, > + std::multimap &PairableInstUsers, > + DenseSet &CurrentPairs); > + > + void pruneTreeFor( > + std::multimap &CandidatePairs, > + std::vector &PairableInsts, > + std::multimap &ConnectedPairs, > + DenseSet &PairableInstUsers, > + std::multimap &PairableInstUserMap, > + DenseMap &ChosenPairs, > + DenseMap &Tree, > + DenseSet &PrunedTree, ValuePair J, > + bool UseCycleCheck); > + > + void buildInitialTreeFor( > + std::multimap &CandidatePairs, > + std::vector &PairableInsts, > + std::multimap &ConnectedPairs, > + DenseSet &PairableInstUsers, > + DenseMap &ChosenPairs, > + DenseMap &Tree, ValuePair J); > + > + void findBestTreeFor( > + std::multimap &CandidatePairs, > + std::vector &PairableInsts, > + std::multimap &ConnectedPairs, > + DenseSet &PairableInstUsers, > + std::multimap &PairableInstUserMap, > + DenseMap &ChosenPairs, > + DenseSet &BestTree, size_t&BestMaxDepth, > + size_t&BestEffSize, VPIteratorPair ChoiceRange, > + bool UseCycleCheck); > + > + Value *getReplacementPointerInput(LLVMContext& Context, Instruction *I, > + Instruction *J, unsigned o, bool&FlipMemInputs); > + > + void fillNewShuffleMask(LLVMContext& Context, Instruction *J, > + unsigned NumElem, unsigned MaskOffset, unsigned NumInElem, > + unsigned IdxOffset, std::vector &Mask); > + > + Value *getReplacementShuffleMask(LLVMContext& Context, Instruction *I, > + Instruction *J); > + > + Value *getReplacementInput(LLVMContext& Context, Instruction *I, > + Instruction *J, unsigned o, bool FlipMemInputs); > + > + void getReplacementInputsForPair(LLVMContext& Context, Instruction *I, > + Instruction *J, SmallVector &ReplacedOperands, > + bool&FlipMemInputs); > + > + void 
replaceOutputsOfPair(LLVMContext& Context, Instruction *I, > + Instruction *J, Instruction *K, > + Instruction *&InsertionPt, Instruction *&K1, > + Instruction *&K2, bool&FlipMemInputs); > + > + void collectPairLoadMoveSet(BasicBlock&BB, > + DenseMap &ChosenPairs, > + std::multimap &LoadMoveSet, > + Instruction *I); > + > + void collectLoadMoveSet(BasicBlock&BB, > + std::vector &PairableInsts, > + DenseMap &ChosenPairs, > + std::multimap &LoadMoveSet); > + > + bool canMoveUsesOfIAfterJ(BasicBlock&BB, > + std::multimap &LoadMoveSet, > + Instruction *I, Instruction *J); > + > + void moveUsesOfIAfterJ(BasicBlock&BB, > + std::multimap &LoadMoveSet, > + Instruction *&InsertionPt, > + Instruction *I, Instruction *J); > + > + virtual bool runOnBasicBlock(BasicBlock&BB) { > + AA =&getAnalysis(); > + SE =&getAnalysis(); > + TD = getAnalysisIfAvailable(); > + > + bool changed = false; > + // Iterate a sufficient number of times to merge types of size 1 bit, > + // then 2 bits, then 4, etc. up to half of the target vector width of the > + // target vector register. > + for (unsigned v = 2, n = 1; v<= VectorBits&& (!MaxIter || n<= MaxIter); > + v *= 2, ++n) { > + DEBUG(dbgs()<< "BBV: fusing loop #"<< n<< > + " for "<< BB.getName()<< " in "<< > + BB.getParent()->getName()<< "...\n"); > + if (vectorizePairs(BB)) > + changed = true; > + else > + break; > + } > + > + DEBUG(dbgs()<< "BBV: done!\n"); > + return changed; > + } > + > + virtual void getAnalysisUsage(AnalysisUsage&AU) const { Does this pass mutate the CFG (ie., modify terminator instructions)? I don't see where it does, so AU.setPreservesCFG() should be here? > + BasicBlockPass::getAnalysisUsage(AU); > + AU.addRequired(); > + AU.addRequired(); > + AU.addPreserved(); > + AU.addPreserved(); > + } > + > + // This returns the vector type that holds a pair of the provided type. > + // If the provided type is already a vector, then its length is doubled. 
> +    static inline VectorType *getVecTypeForPair(Type *ElemTy) {
> +      if (VectorType *VTy = dyn_cast(ElemTy)) {
> +        unsigned numElem = VTy->getNumElements();
> +        return VectorType::get(ElemTy->getScalarType(), numElem*2);
> +      } else {

No else-after-return.
http://llvm.org/docs/CodingStandards.html#hl_else_after_return

> +        return VectorType::get(ElemTy, 2);
> +      }
> +    }
> +
> +    // Returns the weight associated with the provided value. A chain of
> +    // candidate pairs has a length given by the sum of the weights of its
> +    // members (one weight per pair; the weight of each member of the pair
> +    // is assumed to be the same). This length is then compared to the
> +    // chain-length threshold to determine if a given chain is significant
> +    // enough to be vectorized. The length is also used in comparing
> +    // candidate chains where longer chains are considered to be better.
> +    // Note: when this function returns 0, the resulting instructions are
> +    // not actually fused.
> +    static inline size_t getDepthFactor(Value *V) {
> +      // InsertElement and ExtractElement have a depth factor of zero. This is
> +      // for two reasons: First, they cannot be usefully fused. Second, because
> +      // the pass generates a lot of these, they can confuse the simple metric
> +      // used to compare the trees in the next iteration. Thus, giving them a
> +      // weight of zero allows the pass to essentially ignore them in
> +      // subsequent iterations when looking for vectorization opportunities
> +      // while still tracking dependency chains that flow through those
> +      // instructions.
> +      if (isa(V) || isa(V))
> +        return 0;
> +
> +      return 1;
> +    }
> +
> +    // This determines the relative offset of two loads or stores, returning
> +    // true if the offset could be determined to be some constant value.
> +    // For example, if OffsetInElmts == 1, then J accesses the memory directly
> +    // after I; if OffsetInElmts == -1 then I accesses the memory
> +    // directly after J.
This function assumes that both instructions > + // have the same type. > + bool getPairPtrInfo(Instruction *I, Instruction *J, > + Value *&IPtr, Value *&JPtr, unsigned&IAlignment, unsigned&JAlignment, > + int64_t&OffsetInElmts) { > + OffsetInElmts = 0; > + if (isa(I)) { > + IPtr = cast(I)->getPointerOperand(); > + JPtr = cast(J)->getPointerOperand(); > + IAlignment = cast(I)->getAlignment(); > + JAlignment = cast(J)->getAlignment(); > + } else { > + IPtr = cast(I)->getPointerOperand(); > + JPtr = cast(J)->getPointerOperand(); > + IAlignment = cast(I)->getAlignment(); > + JAlignment = cast(J)->getAlignment(); > + } > + > + const SCEV *IPtrSCEV = SE->getSCEV(IPtr); > + const SCEV *JPtrSCEV = SE->getSCEV(JPtr); > + > + // If this is a trivial offset, then we'll get something like > + // 1*sizeof(type). With target data, which we need anyway, this will get > + // constant folded into a number. > + const SCEV *OffsetSCEV = SE->getMinusSCEV(JPtrSCEV, IPtrSCEV); > + if (const SCEVConstant *ConstOffSCEV = > + dyn_cast(OffsetSCEV)) { > + ConstantInt *IntOff = ConstOffSCEV->getValue(); > + int64_t Offset = IntOff->getSExtValue(); > + > + Type *VTy = cast(IPtr->getType())->getElementType(); > + int64_t VTyTSS = (int64_t) TD->getTypeStoreSize(VTy); > + > + assert(VTy == cast(JPtr->getType())->getElementType()); > + > + OffsetInElmts = Offset/VTyTSS; > + return (abs64(Offset) % VTyTSS) == 0; > + } > + > + return false; > + } > + > + // Returns true if the provided CallInst represents an intrinsic that can > + // be vectorized. 
> + bool isVectorizableIntrinsic(CallInst* I) { > + Function *F = I->getCalledFunction(); > + if (!F) return false; > + > + unsigned IID = F->getIntrinsicID(); > + if (!IID) return false; > + > + switch(IID) { > + default: > + return false; > + case Intrinsic::sqrt: > + case Intrinsic::powi: > + case Intrinsic::sin: > + case Intrinsic::cos: > + case Intrinsic::log: > + case Intrinsic::log2: > + case Intrinsic::log10: > + case Intrinsic::exp: > + case Intrinsic::exp2: > + case Intrinsic::pow: > + return !NoMath; > + case Intrinsic::fma: > + return !NoFMA; > + } > + } > + > + // Returns true if J is the second element in some pair referenced by > + // some multimap pair iterator pair. > + template > + bool isSecondInIteratorPair(V J, std::pair< > + typename std::multimap::iterator, > + typename std::multimap::iterator> PairRange) { > + for (typename std::multimap::iterator K = PairRange.first; > + K != PairRange.second; ++K) > + if (K->second == J) return true; > + > + return false; > + } > + }; > + > + // This function implements one vectorization iteration on the provided > + // basic block. It returns true if the block is changed. > + bool BBVectorize::vectorizePairs(BasicBlock&BB) { > + std::vector PairableInsts; > + std::multimap CandidatePairs; > + getCandidatePairs(BB, CandidatePairs, PairableInsts); > + if (PairableInsts.size() == 0) return false; > + > + // Now we have a map of all of the pairable instructions and we need to > + // select the best possible pairing. A good pairing is one such that the > + // users of the pair are also paired. This defines a (directed) forest > + // over the pairs such that two pairs are connected iff the second pair > + // uses the first. > + > + // Note that it only matters that both members of the second pair use some > + // element of the first pair (to allow for splatting). 
> +
> +    std::multimap ConnectedPairs;
> +    computeConnectedPairs(CandidatePairs, PairableInsts, ConnectedPairs);
> +    if (ConnectedPairs.size() == 0) return false;

ConnectedPairs.empty()

> +
> +    // Build the pairable-instruction dependency map
> +    DenseSet PairableInstUsers;
> +    buildDepMap(BB, CandidatePairs, PairableInsts, PairableInstUsers);
> +
> +    // There is now a graph of the connected pairs. For each variable, pick the
> +    // pairing with the largest tree meeting the depth requirement on at least
> +    // one branch. Then select all pairings that are part of that tree and
> +    // remove them from the list of available pairings and pairable variables.
> +
> +    DenseMap ChosenPairs;
> +    choosePairs(CandidatePairs, PairableInsts, ConnectedPairs,
> +      PairableInstUsers, ChosenPairs);
> +
> +    if (ChosenPairs.size() == 0) return false;

ChosenPairs.empty()

> +    NumFusedOps += ChosenPairs.size();
> +
> +    // A set of pairs has now been selected. It is now necessary to replace the
> +    // paired instructions with vector instructions. For this procedure each
> +    // operand must be replaced with a vector operand. This vector is formed
> +    // by using build_vector on the old operands. The replaced values are then
> +    // replaced with a vector_extract on the result. Subsequent optimization
> +    // passes should coalesce the build/extract combinations.
> +
> +    fuseChosenPairs(BB, PairableInsts, ChosenPairs);
> +
> +    return true;
> +  }
> +
> +  // This function returns true if the provided instruction is capable of being
> +  // fused into a vector instruction. This determination is based only on the
> +  // type and other attributes of the instruction.
> +  bool BBVectorize::isInstVectorizable(Instruction *I,
> +                                         bool &IsSimpleLoadStore) {
> +    IsSimpleLoadStore = false;
> +
> +    if (CallInst *C = dyn_cast(I)) {
> +      if (!isVectorizableIntrinsic(C))
> +        return false;
> +    } else if (LoadInst *L = dyn_cast(I)) {
> +      // Vectorize simple loads if possible:
> +      IsSimpleLoadStore = L->isSimple();
> +      if (!IsSimpleLoadStore || NoMemOps)
> +        return false;
> +    } else if (StoreInst *S = dyn_cast(I)) {
> +      // Vectorize simple stores if possible:
> +      IsSimpleLoadStore = S->isSimple();
> +      if (!IsSimpleLoadStore || NoMemOps)
> +        return false;
> +    } else if (CastInst *C = dyn_cast(I)) {
> +      // We can vectorize casts, but not casts of pointer types, etc.
> +      if (NoCasts)
> +        return false;
> +
> +      Type *SrcTy = C->getSrcTy();
> +      if (!SrcTy->isSingleValueType() || SrcTy->isPointerTy())
> +        return false;
> +
> +      Type *DestTy = C->getDestTy();
> +      if (!DestTy->isSingleValueType() || DestTy->isPointerTy())
> +        return false;
> +    } else if (!(I->isBinaryOp() || isa(I) ||
> +        isa(I) || isa(I))) {
> +      return false;
> +    }
> +
> +    // We can't vectorize memory operations without target data
> +    if (TD == 0 && IsSimpleLoadStore)
> +      return false;
> +
> +    Type *T1, *T2;
> +    if (isa(I)) {
> +      // For stores, it is the value type, not the pointer type that matters
> +      // because the value is what will come from a vector register.
> +
> +      Value *IVal = cast(I)->getValueOperand();
> +      T1 = IVal->getType();
> +    } else {
> +      T1 = I->getType();
> +    }
> +
> +    if (I->isCast())
> +      T2 = cast(I)->getSrcTy();
> +    else
> +      T2 = T1;
> +
> +    // Not every type can be vectorized...
> +    if (!(VectorType::isValidElementType(T1) || T1->isVectorTy()) ||
> +        !(VectorType::isValidElementType(T2) || T2->isVectorTy()))
> +      return false;
> +
> +    if (NoInts && (T1->isIntOrIntVectorTy() || T2->isIntOrIntVectorTy()))
> +      return false;
> +
> +    if (NoFloats && (T1->isFPOrFPVectorTy() || T2->isFPOrFPVectorTy()))
> +      return false;
> +
> +    if (T1->getPrimitiveSizeInBits() > VectorBits/2 ||
> +        T2->getPrimitiveSizeInBits() > VectorBits/2)
> +      return false;
> +
> +    return true;
> +  }
> +
> +  // This function returns true if the two provided instructions are compatible
> +  // (meaning that they can be fused into a vector instruction). This assumes
> +  // that I has already been determined to be vectorizable and that J is not
> +  // in the use tree of I.
> +  bool BBVectorize::areInstsCompatible(Instruction *I, Instruction *J,
> +                       bool IsSimpleLoadStore) {
> +    DEBUG(if (DebugInstructionExamination) dbgs() << "BBV: looking at " << *I <<
> +                     "<-> " << *J << "\n");
> +
> +    // Loads and stores can be merged if they have different alignments,
> +    // but are otherwise the same.
> +    LoadInst *LI, *LJ;
> +    StoreInst *SI, *SJ;
> +    if ((LI = dyn_cast(I)) && (LJ = dyn_cast(J))) {
> +      if (I->getType() != J->getType())
> +        return false;
> +
> +      if (LI->getPointerOperand()->getType() !=
> +            LJ->getPointerOperand()->getType() ||
> +          LI->isVolatile() != LJ->isVolatile() ||

You don't combine two separate volatile loads, do you? That sounds bad.
I'm also not sure about merging two atomic load/stores...
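
To make the concern concrete, here is a minimal standalone sketch. It assumes the intent is that only accesses that are neither volatile nor atomic are ever paired, which is what the isSimple() test in the quoted isInstVectorizable() appears to enforce before this function is reached. MemAccess, Ordering, flagsMatch and canPair are hypothetical names for illustration only, not LLVM API:

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; none of this is LLVM API.
enum class Ordering { NotAtomic, Monotonic, SequentiallyConsistent };

struct MemAccess {
  bool IsVolatile;
  Ordering Order;

  // Assumed meaning of "simple": neither volatile nor atomic.
  bool isSimple() const {
    return !IsVolatile && Order == Ordering::NotAtomic;
  }
};

// Comparing flags alone: two volatile accesses "match" each other.
bool flagsMatch(const MemAccess &A, const MemAccess &B) {
  return A.IsVolatile == B.IsVolatile && A.Order == B.Order;
}

// In this sketch, pairing additionally requires both accesses to be simple,
// so matching volatile or atomic accesses are still rejected.
bool canPair(const MemAccess &A, const MemAccess &B) {
  return flagsMatch(A, B) && A.isSimple() && B.isSimple();
}
```

If the flag comparison were the only guard, two volatile loads would pass it; the separate simplicity test is what keeps them from being merged in this model.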
> + LI->getOrdering() != LJ->getOrdering() || > + LI->getSynchScope() != LJ->getSynchScope()) > + return false; > + } else if ((SI = dyn_cast(I))&& (SJ = dyn_cast(J))) { > + if (SI->getValueOperand()->getType() != > + SJ->getValueOperand()->getType() || > + SI->getPointerOperand()->getType() != > + SJ->getPointerOperand()->getType() || > + SI->isVolatile() != SJ->isVolatile() || > + SI->getOrdering() != SJ->getOrdering() || > + SI->getSynchScope() != SJ->getSynchScope()) > + return false; > + } else if (!J->isSameOperationAs(I)) { > + return false; > + } > + // FIXME: handle addsub-type operations! > + > + if (IsSimpleLoadStore) { > + Value *IPtr, *JPtr; > + unsigned IAlignment, JAlignment; > + int64_t OffsetInElmts = 0; > + if (getPairPtrInfo(I, J, IPtr, JPtr, IAlignment, JAlignment, > + OffsetInElmts)&& abs64(OffsetInElmts) == 1) { > + if (AlignedOnly) { > + Type *aType = isa(I) ? > + cast(I)->getValueOperand()->getType() : I->getType(); > + // An aligned load or store is possible only if the instruction > + // with the lower offset has an alignment suitable for the > + // vector type. > + > + unsigned BottomAlignment = IAlignment; > + if (OffsetInElmts< 0) BottomAlignment = JAlignment; > + > + Type *VType = getVecTypeForPair(aType); > + unsigned VecAlignment = TD->getPrefTypeAlignment(VType); > + if (BottomAlignment< VecAlignment) > + return false; > + } > + } else { > + return false; > + } > + } else if (isa(I)) { > + // Only merge two shuffles if they're both constant > + return isa(I->getOperand(2))&& > + isa(J->getOperand(2)); > + // FIXME: We may want to vectorize non-constant shuffles also. > + } > + > + return true; > + } > + > + // Figure out whether or not J uses I and update the users and write-set > + // structures associated with I. Specifically, Users represents the set of > + // instructions that depend on I. WriteSet represents the set > + // of memory locations that are dependent on I. 
If UpdateUsers is true,
> +  // and J uses I, then Users is updated to contain J and WriteSet is updated
> +  // to contain any memory locations to which J writes. The function returns
> +  // true if J uses I. By default, alias analysis is used to determine
> +  // whether J reads from memory that overlaps with a location in WriteSet.
> +  // If LoadMoveSet is not null, then it is a previously-computed multimap
> +  // where the key is the memory-based user instruction and the value is
> +  // the instruction to be compared with I. So, if LoadMoveSet is provided,
> +  // then the alias analysis is not used. This is necessary because this
> +  // function is called during the process of moving instructions during
> +  // vectorization and the results of the alias analysis are not stable during
> +  // that process.
> +  bool BBVectorize::trackUsesOfI(DenseSet &Users,
> +                       AliasSetTracker &WriteSet, Instruction *I,
> +                       Instruction *J, bool UpdateUsers,
> +                       std::multimap *LoadMoveSet) {
> +    bool UsesI = false;
> +
> +    // This instruction may already be marked as a user due, for example, to
> +    // being a member of a selected pair.
> +    if (Users.count(J))
> +      UsesI = true;
> +
> +    if (!UsesI)
> +      for (User::op_iterator JU = J->op_begin(), e = J->op_end();
> +           JU != e; ++JU) {

This is correct, but it's common to say "JU = ..., JE = ..." for consistency.
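
As a small sketch of that naming convention (usesValue and its int operands are hypothetical stand-ins, not part of the patch), the end iterator is evaluated once and named to match the induction variable:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper for illustration only: returns true if Needle appears
// among Operands. The end iterator is computed once in the loop header and
// named OE to pair with the induction variable OI.
bool usesValue(const std::vector<int> &Operands, int Needle) {
  for (std::vector<int>::const_iterator OI = Operands.begin(),
                                        OE = Operands.end();
       OI != OE; ++OI)
    if (*OI == Needle)
      return true;
  return false;
}
```

The behaviour is the same as with a lowercase 'e'; only the pairing of names changes.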
> + Value *V = *JU; > + if (I == V || Users.count(V)) { > + UsesI = true; > + break; > + } > + } > + if (!UsesI&& J->mayReadFromMemory()) { > + if (LoadMoveSet) { > + VPIteratorPair JPairRange = LoadMoveSet->equal_range(J); > + UsesI = isSecondInIteratorPair(I, JPairRange); > + } else { > + for (AliasSetTracker::iterator W = WriteSet.begin(), > + WE = WriteSet.end(); W != WE; ++W) { > + for (AliasSet::iterator A = W->begin(), AE = W->end(); > + A != AE; ++A) { > + AliasAnalysis::Location ptrLoc(A->getValue(), A->getSize(), > + A->getTBAAInfo()); > + if (AA->getModRefInfo(J, ptrLoc) != AliasAnalysis::NoModRef) { > + UsesI = true; > + break; > + } > + } > + if (UsesI) break; > + } > + } > + } > + > + if (UsesI&& UpdateUsers) { > + if (J->mayWriteToMemory()) WriteSet.add(J); > + Users.insert(J); > + } > + > + return UsesI; > + } > + > + // This function iterates over all instruction pairs in the provided > + // basic block and collects all candidate pairs for vectorization. > + void BBVectorize::getCandidatePairs(BasicBlock&BB, > + std::multimap &CandidatePairs, > + std::vector &PairableInsts) { > + BasicBlock::iterator E = BB.end(); > + for (BasicBlock::iterator I = BB.getFirstInsertionPt(); I != E; ++I) { > + bool IsSimpleLoadStore; > + if (!isInstVectorizable(I, IsSimpleLoadStore)) continue; > + > + // Look for an instruction with which to pair instruction *I... > + DenseSet Users; > + AliasSetTracker WriteSet(*AA); > + BasicBlock::iterator J = I; ++J; > + for (unsigned ss = 0; J != E&& ss<= SearchLimit; ++J, ++ss) { > + // Determine if J uses I, if so, exit the loop. > + bool UsesI = trackUsesOfI(Users, WriteSet, I, J, !FastDep); > + if (FastDep) { > + // Note: For this heuristic to be effective, independent operations > + // must tend to be intermixed. This is likely to be true from some > + // kinds of grouped loop unrolling (but not the generic LLVM pass), > + // but otherwise may require some kind of reordering pass. 
> + > + // When using fast dependency analysis, > + // stop searching after first use: > + if (UsesI) break; > + } else { > + if (UsesI) continue; > + } > + > + // J does not use I, and comes before the first use of I, so it can be > + // merged with I if the instructions are compatible. > + if (!areInstsCompatible(I, J, IsSimpleLoadStore)) continue; > + > + // J is a candidate for merging with I. > + if (!PairableInsts.size() || > + PairableInsts[PairableInsts.size()-1] != I) { > + PairableInsts.push_back(I); > + } > + CandidatePairs.insert(ValuePair(I, J)); > + DEBUG(if (DebugCandidateSelection) dbgs()<< "BBV: candidate pair" > +<< *I<< "<-> "<< *J<< "\n"); > + } > + } > + > + DEBUG(dbgs()<< "BBV: found "<< PairableInsts.size() > +<< " instructions with candidate pairs\n"); > + } > + > + // Finds candidate pairs connected to the pair P =. This means that > + // it looks for pairs such that both members have an input which is an > + // output of PI or PJ. > + void BBVectorize::computePairsConnectedTo( > + std::multimap &CandidatePairs, > + std::vector &PairableInsts, > + std::multimap &ConnectedPairs, > + ValuePair P) { > + // For each possible pairing for this variable, look at the uses of > + // the first value... > + for (Value::use_iterator I = P.first->use_begin(), > + E = P.first->use_end(); I != E; ++I) { > + VPIteratorPair IPairRange = CandidatePairs.equal_range(*I); > + > + // For each use of the first variable, look for uses of the second > + // variable... 
> + for (Value::use_iterator J = P.second->use_begin(), > + E2 = P.second->use_end(); J != E2; ++J) { > + VPIteratorPair JPairRange = CandidatePairs.equal_range(*J); > + > + // Look for: > + if (isSecondInIteratorPair(*J, IPairRange)) > + ConnectedPairs.insert(VPPair(P, ValuePair(*I, *J))); > + > + // Look for: > + if (isSecondInIteratorPair(*I, JPairRange)) > + ConnectedPairs.insert(VPPair(P, ValuePair(*J, *I))); > + } > + > + if (SplatBreaksChain) continue; > + // Look for cases where just the first value in the pair is used by > + // both members of another pair (splatting). > + for (Value::use_iterator J = P.first->use_begin(); J != E; ++J) { > + if (isSecondInIteratorPair(*J, IPairRange)) > + ConnectedPairs.insert(VPPair(P, ValuePair(*I, *J))); > + } > + } > + > + if (SplatBreaksChain) return; > + // Look for cases where just the second value in the pair is used by > + // both members of another pair (splatting). > + for (Value::use_iterator I = P.second->use_begin(), > + E = P.second->use_end(); I != E; ++I) { > + VPIteratorPair IPairRange = CandidatePairs.equal_range(*I); > + > + for (Value::use_iterator J = P.second->use_begin(); J != E; ++J) { > + if (isSecondInIteratorPair(*J, IPairRange)) > + ConnectedPairs.insert(VPPair(P, ValuePair(*I, *J))); > + } > + } > + } > + > + // This function figures out which pairs are connected. Two pairs are > + // connected if some output of the first pair forms an input to both members > + // of the second pair. 
> +  void BBVectorize::computeConnectedPairs(
> +                      std::multimap &CandidatePairs,
> +                      std::vector &PairableInsts,
> +                      std::multimap &ConnectedPairs) {
> +
> +    for (std::vector::iterator PI = PairableInsts.begin(),
> +         PE = PairableInsts.end(); PI != PE; ++PI) {
> +      VPIteratorPair choiceRange = CandidatePairs.equal_range(*PI);
> +
> +      for (std::multimap::iterator P = choiceRange.first;
> +           P != choiceRange.second; ++P)
> +        computePairsConnectedTo(CandidatePairs, PairableInsts,
> +                                ConnectedPairs, *P);
> +    }
> +
> +    DEBUG(dbgs() << "BBV: found " << ConnectedPairs.size()
> +                 << " pair connections.\n");
> +  }
> +
> +  // This function builds a set of use tuples such that <A, B> is in the set
> +  // if B is in the use tree of A. If B is in the use tree of A, then B
> +  // depends on the output of A.
> +  void BBVectorize::buildDepMap(
> +                      BasicBlock &BB,
> +                      std::multimap &CandidatePairs,
> +                      std::vector &PairableInsts,
> +                      DenseSet &PairableInstUsers) {
> +    DenseSet IsInPair;
> +    for (std::multimap::iterator C = CandidatePairs.begin(),
> +         E = CandidatePairs.end(); C != E; ++C) {
> +      IsInPair.insert(C->first);
> +      IsInPair.insert(C->second);
> +    }
> +
> +    // Iterate through the basic block, recording all Users of each
> +    // pairable instruction.
> +
> +    BasicBlock::iterator E = BB.end();
> +    for (BasicBlock::iterator I = BB.getFirstInsertionPt(); I != E; ++I) {

"for (...; !isa(I); ++I) {" should also work, and avoid the need to
declare 'E' above.

> +      if (IsInPair.find(I) == IsInPair.end()) continue;
> +
> +      DenseSet Users;
> +      AliasSetTracker WriteSet(*AA);
> +      for (BasicBlock::iterator J = llvm::next(I); J != E; ++J)
> +        (void) trackUsesOfI(Users, WriteSet, I, J);
> +
> +      for (DenseSet::iterator U = Users.begin(), E = Users.end();
> +           U != E; ++U)
> +        PairableInstUsers.insert(ValuePair(I, *U));
> +    }
> +  }
> +
> +  // Returns true if an input to pair P is an output of pair Q and also an
> +  // input of pair Q is an output of pair P.
If this is the case, then these
> +  // two pairs cannot be simultaneously fused.
> +  bool BBVectorize::pairsConflict(ValuePair P, ValuePair Q,
> +                       DenseSet &PairableInstUsers,
> +                       std::multimap *PairableInstUserMap) {
> +    // Two pairs are in conflict if they are mutual Users of each other.
> +    bool QUsesP = PairableInstUsers.count(ValuePair(P.first, Q.first)) ||
> +                  PairableInstUsers.count(ValuePair(P.first, Q.second)) ||
> +                  PairableInstUsers.count(ValuePair(P.second, Q.first)) ||
> +                  PairableInstUsers.count(ValuePair(P.second, Q.second));
> +    bool PUsesQ = PairableInstUsers.count(ValuePair(Q.first, P.first)) ||
> +                  PairableInstUsers.count(ValuePair(Q.first, P.second)) ||
> +                  PairableInstUsers.count(ValuePair(Q.second, P.first)) ||
> +                  PairableInstUsers.count(ValuePair(Q.second, P.second));
> +    if (PairableInstUserMap) {
> +      // FIXME: The expensive part of the cycle check is not so much the cycle
> +      // check itself but this edge insertion procedure. This needs some
> +      // profiling and probably a different data structure (same is true of
> +      // most uses of std::multimap).
> +      if (PUsesQ) {
> +        VPPIteratorPair QPairRange = PairableInstUserMap->equal_range(Q);
> +        if (!isSecondInIteratorPair(P, QPairRange))
> +          PairableInstUserMap->insert(VPPair(Q, P));
> +      }
> +      if (QUsesP) {
> +        VPPIteratorPair PPairRange = PairableInstUserMap->equal_range(P);
> +        if (!isSecondInIteratorPair(Q, PPairRange))
> +          PairableInstUserMap->insert(VPPair(P, Q));
> +      }
> +    }
> +
> +    return (QUsesP && PUsesQ);
> +  }
> +
> +  // This function walks the use graph of current pairs to see if, starting
> +  // from P, the walk returns to P.
> +  bool BBVectorize::pairWillFormCycle(ValuePair P,
> +                       std::multimap &PairableInstUserMap,
> +                       DenseSet &CurrentPairs) {
> +    DEBUG(if (DebugCycleCheck)
> +            dbgs() << "BBV: starting cycle check for : " << *P.first << "<-> "
> +                   << *P.second << "\n");
> +    // A lookup table of visited pairs is kept because the PairableInstUserMap
> +    // contains non-direct associations.
> +    DenseSet Visited;
> +    std::vector Q;
> +    // General depth-first post-order traversal:
> +    Q.push_back(P);
> +    while (!Q.empty()) {

This is always true on the first iteration. Please make this a:

   SmallVector Q;
   Q.push_back(P);
   do {
     ValuePair QTop = Q.pop_back_val();
     Visited.insert(QTop);
     // ...
   } while(!Q.empty());

loop.

> +      ValuePair QTop = Q.back();
> +
> +      Visited.insert(QTop);
> +      Q.pop_back();
> +
> +      DEBUG(if (DebugCycleCheck)
> +              dbgs() << "BBV: cycle check visiting: " << *QTop.first << "<-> "
> +                     << *QTop.second << "\n");
> +      VPPIteratorPair QPairRange = PairableInstUserMap.equal_range(QTop);
> +      for (std::multimap::iterator C = QPairRange.first;
> +           C != QPairRange.second; ++C) {
> +        if (C->second == P) {
> +          DEBUG(dbgs()
> +                 << "BBV: rejected to prevent non-trivial cycle formation:"
> +                 << *C->first.first << "<-> " << *C->first.second << "\n");
> +          return true;
> +        }
> +
> +        if (CurrentPairs.count(C->second) > 0 &&
> +            Visited.count(C->second) == 0)
> +          Q.push_back(C->second);
> +      }
> +    }
> +
> +    return false;
> +  }
> +
> +  // This function builds the initial tree of connected pairs with the
> +  // pair J at the root.
> +  void BBVectorize::buildInitialTreeFor(
> +                      std::multimap &CandidatePairs,
> +                      std::vector &PairableInsts,
> +                      std::multimap &ConnectedPairs,
> +                      DenseSet &PairableInstUsers,
> +                      DenseMap &ChosenPairs,
> +                      DenseMap &Tree, ValuePair J) {
> +    // Each of these pairs is viewed as the root node of a Tree. The Tree
> +    // is then walked (depth-first). As this happens, we keep track of
> +    // the pairs that compose the Tree and the maximum depth of the Tree.
> +    std::vector Q;
> +    // General depth-first post-order traversal:
> +    Q.push_back(ValuePairWithDepth(J, getDepthFactor(J.first)));
> +    while (!Q.empty()) {
> +      ValuePairWithDepth QTop = Q.back();

This loop can be rotated too, though you may not want to switch to using
pop_back_val() here (I see that you do additional pushes and optional
pops in the loop).

> +
> +      // Push each child onto the queue:
> +      bool MoreChildren = false;
> +      size_t MaxChildDepth = QTop.second;
> +      VPPIteratorPair qtRange = ConnectedPairs.equal_range(QTop.first);
> +      for (std::map::iterator k = qtRange.first;
> +           k != qtRange.second; ++k) {
> +        // Make sure that this child pair is still a candidate:
> +        bool IsStillCand = false;
> +        VPIteratorPair checkRange =
> +          CandidatePairs.equal_range(k->second.first);
> +        for (std::multimap::iterator m = checkRange.first;
> +             m != checkRange.second; ++m) {
> +          if (m->second == k->second.second) {
> +            IsStillCand = true;
> +            break;
> +          }
> +        }
> +
> +        if (IsStillCand) {
> +          DenseMap::iterator C = Tree.find(k->second);
> +          if (C == Tree.end()) {
> +            size_t d = getDepthFactor(k->second.first);
> +            Q.push_back(ValuePairWithDepth(k->second, QTop.second+d));
> +            MoreChildren = true;
> +          } else {
> +            MaxChildDepth = std::max(MaxChildDepth, C->second);
> +          }
> +        }
> +      }
> +
> +      if (!MoreChildren) {
> +        // Record the current pair as part of the Tree:
> +        Tree.insert(ValuePairWithDepth(QTop.first, MaxChildDepth));
> +        Q.pop_back();
> +      }
> +    }
> +  }
> +
> +  // Given some initial tree, prune it by removing conflicting pairs (pairs
> +  // that cannot be simultaneously chosen for vectorization).
> + void BBVectorize::pruneTreeFor( > + std::multimap &CandidatePairs, > + std::vector &PairableInsts, > + std::multimap &ConnectedPairs, > + DenseSet &PairableInstUsers, > + std::multimap &PairableInstUserMap, > + DenseMap &ChosenPairs, > + DenseMap &Tree, > + DenseSet &PrunedTree, ValuePair J, > + bool UseCycleCheck) { > + std::vector Q; > + // General depth-first post-order traversal: > + Q.push_back(ValuePairWithDepth(J, getDepthFactor(J.first))); > + while (!Q.empty()) { > + ValuePairWithDepth QTop = Q.back(); > + PrunedTree.insert(QTop.first); > + Q.pop_back(); Another loop to restructure. (Stopped reviewing at this point.) Nick From hfinkel at anl.gov Tue Jan 31 23:36:53 2012 From: hfinkel at anl.gov (Hal Finkel) Date: Tue, 31 Jan 2012 23:36:53 -0600 Subject: [llvm-commits] [llvm] r149468 - in /llvm/trunk: docs/ include/llvm-c/ include/llvm-c/Transforms/ include/llvm/ include/llvm/Transforms/ include/llvm/Transforms/IPO/ lib/Transforms/ lib/Transforms/IPO/ lib/Transforms/Vectorize/ test/Transforms/BBVectorize/ tools/bugpoint/ tools/llvm-ld/ tools/lto/ tools/opt/ In-Reply-To: <4F28C6B1.4060409@mxc.ca> References: <20120201035145.411492A6C12C@llvm.org> <4F28C6B1.4060409@mxc.ca> Message-ID: <1328074613.2489.1155.camel@sapling> On Tue, 2012-01-31 at 20:59 -0800, Nick Lewycky wrote: > Hal Finkel wrote: > > Author: hfinkel > > Date: Tue Jan 31 21:51:43 2012 > > New Revision: 149468 > > > > URL: http://llvm.org/viewvc/llvm-project?rev=149468&view=rev > > Log: > > Add a basic-block autovectorization pass. > > > > This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure. > > Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser). > > Great stuff! 
> > > Copied: llvm/trunk/include/llvm-c/Transforms/Vectorize.h (from r149457, llvm/trunk/include/llvm-c/Initialization.h) > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Transforms/Vectorize.h?p2=llvm/trunk/include/llvm-c/Transforms/Vectorize.h&p1=llvm/trunk/include/llvm-c/Initialization.h&r1=149457&r2=149468&rev=149468&view=diff > > ============================================================================== > > --- llvm/trunk/include/llvm-c/Initialization.h (original) > > +++ llvm/trunk/include/llvm-c/Transforms/Vectorize.h Tue Jan 31 21:51:43 2012 > > @@ -1,4 +1,5 @@ > > -/*===-- llvm-c/Initialization.h - Initialization C Interface ------*- C -*-===*\ > > +/*===---------------------------Vectorize.h ------------------- -*- C++ -*-===*\ > > -*- C -*- not C++. Oops ;) > > > Modified: llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp?rev=149468&r1=149467&r2=149468&view=diff > > ============================================================================== > > --- llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp (original) > > +++ llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp Tue Jan 31 21:51:43 2012 > > @@ -21,14 +21,20 @@ > > #include "llvm/DefaultPasses.h" > > #include "llvm/PassManager.h" > > #include "llvm/Analysis/Passes.h" > > +#include "llvm/Analysis/Verifier.h" > > +#include "llvm/Support/CommandLine.h" > > #include "llvm/Target/TargetLibraryInfo.h" > > #include "llvm/Transforms/Scalar.h" > > +#include "llvm/Transforms/Vectorize.h" > > #include "llvm/Transforms/IPO.h" > > #include "llvm/ADT/SmallVector.h" > > #include "llvm/Support/ManagedStatic.h" > > > > using namespace llvm; > > > > +static cl::opt > > +RunVectorization("vectorize", cl::desc("Run vectorization passes")); > > + > > PassManagerBuilder::PassManagerBuilder() { > > OptLevel = 2; > > SizeLevel = 0; > > @@ -37,6 +43,7 @@ > > DisableSimplifyLibCalls = false; > 
> DisableUnitAtATime = false; > > DisableUnrollLoops = false; > > + Vectorize = RunVectorization; > > } > > > > PassManagerBuilder::~PassManagerBuilder() { > > @@ -172,6 +179,13 @@ > > > > addExtensionsToPM(EP_ScalarOptimizerLate, MPM); > > > > + if (Vectorize) { > > + MPM.add(createBBVectorizePass()); > > + MPM.add(createInstructionCombiningPass()); > > + if (OptLevel> 1) > > + MPM.add(createGVNPass()); // Remove redundancies > > Whooooaa... GVN is *really* expensive, I find it hard to believe that > you want to run it twice even with vectorization on. Are you sure? What > is this doing that instcombine isn't? As I recall, this actually makes a big difference in the resulting code quality. I'll revisit this and make some more specific comments. > > > Added: llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp?rev=149468&view=auto > > ============================================================================== > > --- llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp (added) > > +++ llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp Tue Jan 31 21:51:43 2012 > > @@ -0,0 +1,1796 @@ > > +//===- BBVectorize.cpp - A Basic-Block Vectorizer -------------------------===// > > +// > > +// The LLVM Compiler Infrastructure > > +// > > +// This file is distributed under the University of Illinois Open Source > > +// License. See LICENSE.TXT for details. > > +// > > +//===----------------------------------------------------------------------===// > > +// > > +// This file implements a basic-block vectorization pass. The algorithm was > > +// inspired by that used by the Vienna MAP Vectorizor by Franchetti and Kral, > > +// et al. It works by looking for chains of pairable operations and then > > +// pairing them. 
> > +// > > +//===----------------------------------------------------------------------===// > > + > > +#define BBV_NAME "bb-vectorize" > > I think it's safe to constant propagate this away. :) I use the string again in INITIALIZE_PASS_BEGIN,END. > > > +#define DEBUG_TYPE BBV_NAME > > +#include "llvm/Constants.h" > > +#include "llvm/DerivedTypes.h" > > +#include "llvm/Function.h" > > +#include "llvm/Instructions.h" > > +#include "llvm/IntrinsicInst.h" > > +#include "llvm/Intrinsics.h" > > +#include "llvm/LLVMContext.h" > > +#include "llvm/Pass.h" > > +#include "llvm/Type.h" > > +#include "llvm/ADT/DenseMap.h" > > +#include "llvm/ADT/DenseSet.h" > > +#include "llvm/ADT/SmallVector.h" > > +#include "llvm/ADT/Statistic.h" > > +#include "llvm/ADT/STLExtras.h" > > +#include "llvm/ADT/StringExtras.h" > > +#include "llvm/Analysis/AliasAnalysis.h" > > +#include "llvm/Analysis/AliasSetTracker.h" > > +#include "llvm/Analysis/ScalarEvolution.h" > > +#include "llvm/Analysis/ScalarEvolutionExpressions.h" > > +#include "llvm/Analysis/ValueTracking.h" > > +#include "llvm/Support/CommandLine.h" > > +#include "llvm/Support/Debug.h" > > +#include "llvm/Support/raw_ostream.h" > > +#include "llvm/Support/ValueHandle.h" > > +#include "llvm/Target/TargetData.h" > > +#include "llvm/Transforms/Vectorize.h" > > +#include > > +#include > > +using namespace llvm; > > + > > +static cl::opt > > +ReqChainDepth("bb-vectorize-req-chain-depth", cl::init(6), cl::Hidden, > > + cl::desc("The required chain depth for vectorization")); > > + > > +static cl::opt > > +SearchLimit("bb-vectorize-search-limit", cl::init(400), cl::Hidden, > > + cl::desc("The maximum search distance for instruction pairs")); > > + > > +static cl::opt > > +SplatBreaksChain("bb-vectorize-splat-breaks-chain", cl::init(false), cl::Hidden, > > + cl::desc("Replicating one element to a pair breaks the chain")); > > + > > +static cl::opt > > +VectorBits("bb-vectorize-vector-bits", cl::init(128), cl::Hidden, > > + cl::desc("The size 
of the native vector registers")); > > + > > +static cl::opt > > +MaxIter("bb-vectorize-max-iter", cl::init(0), cl::Hidden, > > + cl::desc("The maximum number of pairing iterations")); > > + > > +static cl::opt > > +MaxCandPairsForCycleCheck("bb-vectorize-max-cycle-check-pairs", cl::init(200), > > + cl::Hidden, cl::desc("The maximum number of candidate pairs with which to use" > > + " a full cycle check")); > > + > > +static cl::opt > > +NoInts("bb-vectorize-no-ints", cl::init(false), cl::Hidden, > > + cl::desc("Don't try to vectorize integer values")); > > + > > +static cl::opt > > +NoFloats("bb-vectorize-no-floats", cl::init(false), cl::Hidden, > > + cl::desc("Don't try to vectorize floating-point values")); > > + > > +static cl::opt > > +NoCasts("bb-vectorize-no-casts", cl::init(false), cl::Hidden, > > + cl::desc("Don't try to vectorize casting (conversion) operations")); > > + > > +static cl::opt > > +NoMath("bb-vectorize-no-math", cl::init(false), cl::Hidden, > > + cl::desc("Don't try to vectorize floating-point math intrinsics")); > > + > > +static cl::opt > > +NoFMA("bb-vectorize-no-fma", cl::init(false), cl::Hidden, > > + cl::desc("Don't try to vectorize the fused-multiply-add intrinsic")); > > + > > +static cl::opt > > +NoMemOps("bb-vectorize-no-mem-ops", cl::init(false), cl::Hidden, > > + cl::desc("Don't try to vectorize loads and stores")); > > + > > +static cl::opt > > +AlignedOnly("bb-vectorize-aligned-only", cl::init(false), cl::Hidden, > > + cl::desc("Only generate aligned loads and stores")); > > + > > +static cl::opt > > +FastDep("bb-vectorize-fast-dep", cl::init(false), cl::Hidden, > > + cl::desc("Use a fast instruction dependency analysis")); > > + > > +#ifndef NDEBUG > > +static cl::opt > > +DebugInstructionExamination("bb-vectorize-debug-instruction-examination", > > + cl::init(false), cl::Hidden, > > + cl::desc("When debugging is enabled, output information on the" > > + " instruction-examination process")); > > +static cl::opt > > 
+DebugCandidateSelection("bb-vectorize-debug-candidate-selection", > > + cl::init(false), cl::Hidden, > > + cl::desc("When debugging is enabled, output information on the" > > + " candidate-selection process")); > > +static cl::opt > > +DebugPairSelection("bb-vectorize-debug-pair-selection", > > + cl::init(false), cl::Hidden, > > + cl::desc("When debugging is enabled, output information on the" > > + " pair-selection process")); > > +static cl::opt > > +DebugCycleCheck("bb-vectorize-debug-cycle-check", > > + cl::init(false), cl::Hidden, > > + cl::desc("When debugging is enabled, output information on the" > > + " cycle-checking process")); > > +#endif > > + > > +STATISTIC(NumFusedOps, "Number of operations fused by bb-vectorize"); > > + > > +namespace { > > + struct BBVectorize : public BasicBlockPass { > > + static char ID; // Pass identification, replacement for typeid > > + BBVectorize() : BasicBlockPass(ID) { > > + initializeBBVectorizePass(*PassRegistry::getPassRegistry()); > > + } > > + > > + typedef std::pair ValuePair; > > + typedef std::pair ValuePairWithDepth; > > + typedef std::pair VPPair; // A ValuePair pair > > + typedef std::pair::iterator, > > + std::multimap::iterator> VPIteratorPair; > > + typedef std::pair::iterator, > > + std::multimap::iterator> > > + VPPIteratorPair; > > + > > + AliasAnalysis *AA; > > + ScalarEvolution *SE; > > + TargetData *TD; > > + > > + // FIXME: const correct? 
> > + > > + bool vectorizePairs(BasicBlock&BB); > > + > > + void getCandidatePairs(BasicBlock&BB, > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts); > > + > > + void computeConnectedPairs(std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + std::multimap &ConnectedPairs); > > + > > + void buildDepMap(BasicBlock&BB, > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + DenseSet &PairableInstUsers); > > + > > + void choosePairs(std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + std::multimap &ConnectedPairs, > > + DenseSet &PairableInstUsers, > > + DenseMap& ChosenPairs); > > + > > + void fuseChosenPairs(BasicBlock&BB, > > + std::vector &PairableInsts, > > + DenseMap& ChosenPairs); > > + > > + bool isInstVectorizable(Instruction *I, bool&IsSimpleLoadStore); > > + > > + bool areInstsCompatible(Instruction *I, Instruction *J, > > + bool IsSimpleLoadStore); > > + > > + bool trackUsesOfI(DenseSet &Users, > > + AliasSetTracker&WriteSet, Instruction *I, > > + Instruction *J, bool UpdateUsers = true, > > + std::multimap *LoadMoveSet = 0); > > + > > + void computePairsConnectedTo( > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + std::multimap &ConnectedPairs, > > + ValuePair P); > > + > > + bool pairsConflict(ValuePair P, ValuePair Q, > > + DenseSet &PairableInstUsers, > > + std::multimap *PairableInstUserMap = 0); > > + > > + bool pairWillFormCycle(ValuePair P, > > + std::multimap &PairableInstUsers, > > + DenseSet &CurrentPairs); > > + > > + void pruneTreeFor( > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + std::multimap &ConnectedPairs, > > + DenseSet &PairableInstUsers, > > + std::multimap &PairableInstUserMap, > > + DenseMap &ChosenPairs, > > + DenseMap &Tree, > > + DenseSet &PrunedTree, ValuePair J, > > + bool UseCycleCheck); > > + > > + void buildInitialTreeFor( > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts, 
> > + std::multimap &ConnectedPairs, > > + DenseSet &PairableInstUsers, > > + DenseMap &ChosenPairs, > > + DenseMap &Tree, ValuePair J); > > + > > + void findBestTreeFor( > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + std::multimap &ConnectedPairs, > > + DenseSet &PairableInstUsers, > > + std::multimap &PairableInstUserMap, > > + DenseMap &ChosenPairs, > > + DenseSet &BestTree, size_t&BestMaxDepth, > > + size_t&BestEffSize, VPIteratorPair ChoiceRange, > > + bool UseCycleCheck); > > + > > + Value *getReplacementPointerInput(LLVMContext& Context, Instruction *I, > > + Instruction *J, unsigned o, bool&FlipMemInputs); > > + > > + void fillNewShuffleMask(LLVMContext& Context, Instruction *J, > > + unsigned NumElem, unsigned MaskOffset, unsigned NumInElem, > > + unsigned IdxOffset, std::vector &Mask); > > + > > + Value *getReplacementShuffleMask(LLVMContext& Context, Instruction *I, > > + Instruction *J); > > + > > + Value *getReplacementInput(LLVMContext& Context, Instruction *I, > > + Instruction *J, unsigned o, bool FlipMemInputs); > > + > > + void getReplacementInputsForPair(LLVMContext& Context, Instruction *I, > > + Instruction *J, SmallVector &ReplacedOperands, > > + bool&FlipMemInputs); > > + > > + void replaceOutputsOfPair(LLVMContext& Context, Instruction *I, > > + Instruction *J, Instruction *K, > > + Instruction *&InsertionPt, Instruction *&K1, > > + Instruction *&K2, bool&FlipMemInputs); > > + > > + void collectPairLoadMoveSet(BasicBlock&BB, > > + DenseMap &ChosenPairs, > > + std::multimap &LoadMoveSet, > > + Instruction *I); > > + > > + void collectLoadMoveSet(BasicBlock&BB, > > + std::vector &PairableInsts, > > + DenseMap &ChosenPairs, > > + std::multimap &LoadMoveSet); > > + > > + bool canMoveUsesOfIAfterJ(BasicBlock&BB, > > + std::multimap &LoadMoveSet, > > + Instruction *I, Instruction *J); > > + > > + void moveUsesOfIAfterJ(BasicBlock&BB, > > + std::multimap &LoadMoveSet, > > + Instruction *&InsertionPt, > > + Instruction 
*I, Instruction *J); > > + > > + virtual bool runOnBasicBlock(BasicBlock&BB) { > > + AA =&getAnalysis(); > > + SE =&getAnalysis(); > > + TD = getAnalysisIfAvailable(); > > + > > + bool changed = false; > > + // Iterate a sufficient number of times to merge types of size 1 bit, > > + // then 2 bits, then 4, etc. up to half of the target vector width of the > > + // target vector register. > > + for (unsigned v = 2, n = 1; v<= VectorBits&& (!MaxIter || n<= MaxIter); > > + v *= 2, ++n) { > > + DEBUG(dbgs()<< "BBV: fusing loop #"<< n<< > > + " for "<< BB.getName()<< " in "<< > > + BB.getParent()->getName()<< "...\n"); > > + if (vectorizePairs(BB)) > > + changed = true; > > + else > > + break; > > + } > > + > > + DEBUG(dbgs()<< "BBV: done!\n"); > > + return changed; > > + } > > + > > + virtual void getAnalysisUsage(AnalysisUsage&AU) const { > > Does this pass mutate the CFG (ie., modify terminator instructions)? I > don't see where it does, so AU.setPreservesCFG() should be here? Good point. > > > + BasicBlockPass::getAnalysisUsage(AU); > > + AU.addRequired(); > > + AU.addRequired(); > > + AU.addPreserved(); > > + AU.addPreserved(); > > + } > > + > > + // This returns the vector type that holds a pair of the provided type. > > + // If the provided type is already a vector, then its length is doubled. > > + static inline VectorType *getVecTypeForPair(Type *ElemTy) { > > + if (VectorType *VTy = dyn_cast(ElemTy)) { > > + unsigned numElem = VTy->getNumElements(); > > + return VectorType::get(ElemTy->getScalarType(), numElem*2); > > + } else { > > No else-after-return. > http://llvm.org/docs/CodingStandards.html#hl_else_after_return > > > + return VectorType::get(ElemTy, 2); > > + } > > + } > > + > > + // Returns the weight associated with the provided value. A chain of > > + // candidate pairs has a length given by the sum of the weights of its > > + // members (one weight per pair; the weight of each member of the pair > > + // is assumed to be the same). 
This length is then compared to the > > + // chain-length threshold to determine if a given chain is significant > > + // enough to be vectorized. The length is also used in comparing > > + // candidate chains where longer chains are considered to be better. > > + // Note: when this function returns 0, the resulting instructions are > > + // not actually fused. > > + static inline size_t getDepthFactor(Value *V) { > > + // InsertElement and ExtractElement have a depth factor of zero. This is > > + // for two reasons: First, they cannot be usefully fused. Second, because > > + // the pass generates a lot of these, they can confuse the simple metric > > + // used to compare the trees in the next iteration. Thus, giving them a > > + // weight of zero allows the pass to essentially ignore them in > > + // subsequent iterations when looking for vectorization opportunities > > + // while still tracking dependency chains that flow through those > > + // instructions. > > + if (isa(V) || isa(V)) > > + return 0; > > + > > + return 1; > > + } > > + > > + // This determines the relative offset of two loads or stores, returning > > + // true if the offset could be determined to be some constant value. > > + // For example, if OffsetInElmts == 1, then J accesses the memory directly > > + // after I; if OffsetInElmts == -1 then I accesses the memory > > + // directly after J. This function assumes that both instructions > > + // have the same type. 
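For reference, the offset convention described in that comment can be sketched on plain integers. This is only an illustration: the function name and its parameters are made up here, and the byte offset stands in for the constant difference between the two pointers that the patch obtains from ScalarEvolution.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the patch. ByteOffset stands in for
// (address accessed by J) - (address accessed by I); ElemStoreSize stands in
// for the store size of the accessed type and is assumed to be positive.
// Returns true when the byte offset is a whole number of elements.
bool offsetInElements(int64_t ByteOffset, int64_t ElemStoreSize,
                      int64_t &OffsetInElmts) {
  OffsetInElmts = ByteOffset / ElemStoreSize;
  int64_t AbsOffset = ByteOffset < 0 ? -ByteOffset : ByteOffset;
  return (AbsOffset % ElemStoreSize) == 0;
}
```

With 4-byte elements, a byte offset of 4 yields OffsetInElmts == 1 (J accesses the memory directly after I), -4 yields -1 (I directly after J), and 6 is rejected because it is not a multiple of the element size.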
> > + bool getPairPtrInfo(Instruction *I, Instruction *J, > > + Value *&IPtr, Value *&JPtr, unsigned&IAlignment, unsigned&JAlignment, > > + int64_t&OffsetInElmts) { > > + OffsetInElmts = 0; > > + if (isa(I)) { > > + IPtr = cast(I)->getPointerOperand(); > > + JPtr = cast(J)->getPointerOperand(); > > + IAlignment = cast(I)->getAlignment(); > > + JAlignment = cast(J)->getAlignment(); > > + } else { > > + IPtr = cast(I)->getPointerOperand(); > > + JPtr = cast(J)->getPointerOperand(); > > + IAlignment = cast(I)->getAlignment(); > > + JAlignment = cast(J)->getAlignment(); > > + } > > + > > + const SCEV *IPtrSCEV = SE->getSCEV(IPtr); > > + const SCEV *JPtrSCEV = SE->getSCEV(JPtr); > > + > > + // If this is a trivial offset, then we'll get something like > > + // 1*sizeof(type). With target data, which we need anyway, this will get > > + // constant folded into a number. > > + const SCEV *OffsetSCEV = SE->getMinusSCEV(JPtrSCEV, IPtrSCEV); > > + if (const SCEVConstant *ConstOffSCEV = > > + dyn_cast(OffsetSCEV)) { > > + ConstantInt *IntOff = ConstOffSCEV->getValue(); > > + int64_t Offset = IntOff->getSExtValue(); > > + > > + Type *VTy = cast(IPtr->getType())->getElementType(); > > + int64_t VTyTSS = (int64_t) TD->getTypeStoreSize(VTy); > > + > > + assert(VTy == cast(JPtr->getType())->getElementType()); > > + > > + OffsetInElmts = Offset/VTyTSS; > > + return (abs64(Offset) % VTyTSS) == 0; > > + } > > + > > + return false; > > + } > > + > > + // Returns true if the provided CallInst represents an intrinsic that can > > + // be vectorized. 
> > + bool isVectorizableIntrinsic(CallInst* I) { > > + Function *F = I->getCalledFunction(); > > + if (!F) return false; > > + > > + unsigned IID = F->getIntrinsicID(); > > + if (!IID) return false; > > + > > + switch(IID) { > > + default: > > + return false; > > + case Intrinsic::sqrt: > > + case Intrinsic::powi: > > + case Intrinsic::sin: > > + case Intrinsic::cos: > > + case Intrinsic::log: > > + case Intrinsic::log2: > > + case Intrinsic::log10: > > + case Intrinsic::exp: > > + case Intrinsic::exp2: > > + case Intrinsic::pow: > > + return !NoMath; > > + case Intrinsic::fma: > > + return !NoFMA; > > + } > > + } > > + > > + // Returns true if J is the second element in some pair referenced by > > + // some multimap pair iterator pair. > > + template > > + bool isSecondInIteratorPair(V J, std::pair< > > + typename std::multimap::iterator, > > + typename std::multimap::iterator> PairRange) { > > + for (typename std::multimap::iterator K = PairRange.first; > > + K != PairRange.second; ++K) > > + if (K->second == J) return true; > > + > > + return false; > > + } > > + }; > > + > > + // This function implements one vectorization iteration on the provided > > + // basic block. It returns true if the block is changed. > > + bool BBVectorize::vectorizePairs(BasicBlock&BB) { > > + std::vector PairableInsts; > > + std::multimap CandidatePairs; > > + getCandidatePairs(BB, CandidatePairs, PairableInsts); > > + if (PairableInsts.size() == 0) return false; > > + > > + // Now we have a map of all of the pairable instructions and we need to > > + // select the best possible pairing. A good pairing is one such that the > > + // users of the pair are also paired. This defines a (directed) forest > > + // over the pairs such that two pairs are connected iff the second pair > > + // uses the first. > > + > > + // Note that it only matters that both members of the second pair use some > > + // element of the first pair (to allow for splatting). 
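As a rough illustration of the connectivity rule in that comment (both members of the second pair use some element of the first pair), here is a small sketch on integer ids. Everything in it is a stand-in: the Inst, InstPair and UseMap types and both function names are invented for the example, and the quadratic scan is only for clarity. The patch itself walks use lists instead.

```cpp
#include <initializer_list>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-ins: an instruction is just an integer id here.
typedef int Inst;
typedef std::pair<Inst, Inst> InstPair;
// Users[X] lists the instructions that take X as an operand.
typedef std::map<Inst, std::vector<Inst> > UseMap;

// True if U takes at least one member of P as an operand.
static bool usesSomeMemberOf(const UseMap &Users, const InstPair &P, Inst U) {
  for (Inst Def : {P.first, P.second}) {
    UseMap::const_iterator It = Users.find(Def);
    if (It == Users.end())
      continue;
    for (Inst X : It->second)
      if (X == U)
        return true;
  }
  return false;
}

// Returns the edges (P, Q) such that both members of Q use some member of P.
std::set<std::pair<InstPair, InstPair> >
connectedPairs(const std::set<InstPair> &Candidates, const UseMap &Users) {
  std::set<std::pair<InstPair, InstPair> > Edges;
  for (const InstPair &P : Candidates)
    for (const InstPair &Q : Candidates)
      if (P != Q && usesSomeMemberOf(Users, P, Q.first) &&
          usesSomeMemberOf(Users, P, Q.second))
        Edges.insert(std::make_pair(P, Q));
  return Edges;
}
```

Because the check only asks for some member of the first pair, a pair whose two members both use the same value (the splat case) still counts as connected.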
> > + > > + std::multimap ConnectedPairs; > > + computeConnectedPairs(CandidatePairs, PairableInsts, ConnectedPairs); > > + if (ConnectedPairs.size() == 0) return false; > > ConnectedPairs.empty() > > > + > > + // Build the pairable-instruction dependency map > > + DenseSet PairableInstUsers; > > + buildDepMap(BB, CandidatePairs, PairableInsts, PairableInstUsers); > > + > > + // There is now a graph of the connected pairs. For each variable, pick the > > + // pairing with the largest tree meeting the depth requirement on at least > > + // one branch. Then select all pairings that are part of that tree and > > + // remove them from the list of available pairings and pairable variables. > > + > > + DenseMap ChosenPairs; > > + choosePairs(CandidatePairs, PairableInsts, ConnectedPairs, > > + PairableInstUsers, ChosenPairs); > > + > > + if (ChosenPairs.size() == 0) return false; > > ChosenPairs.empty() > > > + NumFusedOps += ChosenPairs.size(); > > + > > + // A set of pairs has now been selected. It is now necessary to replace the > > + // paired instructions with vector instructions. For this procedure each > > + // operand must be replaced with a vector operand. This vector is formed > > + // by using build_vector on the old operands. The replaced values are then > > + // replaced with a vector_extract on the result. Subsequent optimization > > + // passes should coalesce the build/extract combinations. > > + > > + fuseChosenPairs(BB, PairableInsts, ChosenPairs); > > + > > + return true; > > + } > > + > > + // This function returns true if the provided instruction is capable of being > > + // fused into a vector instruction. This determination is based only on the > > + // type and other attributes of the instruction. 
> > + bool BBVectorize::isInstVectorizable(Instruction *I, > > + bool&IsSimpleLoadStore) { > > + IsSimpleLoadStore = false; > > + > > + if (CallInst *C = dyn_cast(I)) { > > + if (!isVectorizableIntrinsic(C)) > > + return false; > > + } else if (LoadInst *L = dyn_cast(I)) { > > + // Vectorize simple loads if possible: > > + IsSimpleLoadStore = L->isSimple(); > > + if (!IsSimpleLoadStore || NoMemOps) > > + return false; > > + } else if (StoreInst *S = dyn_cast(I)) { > > + // Vectorize simple stores if possible: > > + IsSimpleLoadStore = S->isSimple(); > > + if (!IsSimpleLoadStore || NoMemOps) > > + return false; > > + } else if (CastInst *C = dyn_cast(I)) { > > + // We can vectorize casts, but not casts of pointer types, etc. > > + if (NoCasts) > > + return false; > > + > > + Type *SrcTy = C->getSrcTy(); > > + if (!SrcTy->isSingleValueType() || SrcTy->isPointerTy()) > > + return false; > > + > > + Type *DestTy = C->getDestTy(); > > + if (!DestTy->isSingleValueType() || DestTy->isPointerTy()) > > + return false; > > + } else if (!(I->isBinaryOp() || isa(I) || > > + isa(I) || isa(I))) { > > + return false; > > + } > > + > > + // We can't vectorize memory operations without target data > > + if (TD == 0&& IsSimpleLoadStore) > > + return false; > > + > > + Type *T1, *T2; > > + if (isa(I)) { > > + // For stores, it is the value type, not the pointer type that matters > > + // because the value is what will come from a vector register. > > + > > + Value *IVal = cast(I)->getValueOperand(); > > + T1 = IVal->getType(); > > + } else { > > + T1 = I->getType(); > > + } > > + > > + if (I->isCast()) > > + T2 = cast(I)->getSrcTy(); > > + else > > + T2 = T1; > > + > > + // Not every type can be vectorized... 
> > + if (!(VectorType::isValidElementType(T1) || T1->isVectorTy()) || > > + !(VectorType::isValidElementType(T2) || T2->isVectorTy())) > > + return false; > > + > > + if (NoInts&& (T1->isIntOrIntVectorTy() || T2->isIntOrIntVectorTy())) > > + return false; > > + > > + if (NoFloats&& (T1->isFPOrFPVectorTy() || T2->isFPOrFPVectorTy())) > > + return false; > > + > > + if (T1->getPrimitiveSizeInBits()> VectorBits/2 || > > + T2->getPrimitiveSizeInBits()> VectorBits/2) > > + return false; > > + > > + return true; > > + } > > + > > + // This function returns true if the two provided instructions are compatible > > + // (meaning that they can be fused into a vector instruction). This assumes > > + // that I has already been determined to be vectorizable and that J is not > > + // in the use tree of I. > > + bool BBVectorize::areInstsCompatible(Instruction *I, Instruction *J, > > + bool IsSimpleLoadStore) { > > + DEBUG(if (DebugInstructionExamination) dbgs()<< "BBV: looking at "<< *I<< > > + "<-> "<< *J<< "\n"); > > + > > + // Loads and stores can be merged if they have different alignments, > > + // but are otherwise the same. > > + LoadInst *LI, *LJ; > > + StoreInst *SI, *SJ; > > + if ((LI = dyn_cast(I))&& (LJ = dyn_cast(J))) { > > + if (I->getType() != J->getType()) > > + return false; > > + > > + if (LI->getPointerOperand()->getType() != > > + LJ->getPointerOperand()->getType() || > > + LI->isVolatile() != LJ->isVolatile() || > > You don't combine two separate volatile loads, do you? That sounds bad. > > I'm also not sure about merging two atomic load/stores... No, it will combine them only if isSimple() is true (this is checked in isInstVectorizable). 
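To spell out why the equality checks are enough, here is a minimal sketch with made-up stand-in types (MemAccess, Ordering and the three functions below are not the LLVM classes or the patch's code). The first instruction must be simple, meaning neither volatile nor atomic, and the second must match its attributes, so the second is necessarily simple too.

```cpp
// Hypothetical stand-ins for the load/store properties discussed above;
// these are not the real LLVM classes.
enum class Ordering { NotAtomic, Monotonic, SequentiallyConsistent };

struct MemAccess {
  bool Volatile;
  Ordering Order;
  // Mirrors the idea behind isSimple(): neither volatile nor atomic.
  bool isSimple() const { return !Volatile && Order == Ordering::NotAtomic; }
};

// First filter, applied to I alone (the role isInstVectorizable plays).
bool isCandidate(const MemAccess &I) { return I.isSimple(); }

// Second filter, applied to the pair (the role areInstsCompatible plays).
bool attributesMatch(const MemAccess &I, const MemAccess &J) {
  return I.Volatile == J.Volatile && I.Order == J.Order;
}

// A pair is formed only when both filters pass.
bool mayPair(const MemAccess &I, const MemAccess &J) {
  return isCandidate(I) && attributesMatch(I, J);
}
```

Under this model two volatile loads are rejected by the first filter, and a simple load next to a volatile or atomic one is rejected by the second.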
> > > + LI->getOrdering() != LJ->getOrdering() || > > + LI->getSynchScope() != LJ->getSynchScope()) > > + return false; > > + } else if ((SI = dyn_cast(I))&& (SJ = dyn_cast(J))) { > > + if (SI->getValueOperand()->getType() != > > + SJ->getValueOperand()->getType() || > > + SI->getPointerOperand()->getType() != > > + SJ->getPointerOperand()->getType() || > > + SI->isVolatile() != SJ->isVolatile() || > > + SI->getOrdering() != SJ->getOrdering() || > > + SI->getSynchScope() != SJ->getSynchScope()) > > + return false; > > + } else if (!J->isSameOperationAs(I)) { > > + return false; > > + } > > + // FIXME: handle addsub-type operations! > > + > > + if (IsSimpleLoadStore) { > > + Value *IPtr, *JPtr; > > + unsigned IAlignment, JAlignment; > > + int64_t OffsetInElmts = 0; > > + if (getPairPtrInfo(I, J, IPtr, JPtr, IAlignment, JAlignment, > > + OffsetInElmts)&& abs64(OffsetInElmts) == 1) { > > + if (AlignedOnly) { > > + Type *aType = isa(I) ? > > + cast(I)->getValueOperand()->getType() : I->getType(); > > + // An aligned load or store is possible only if the instruction > > + // with the lower offset has an alignment suitable for the > > + // vector type. > > + > > + unsigned BottomAlignment = IAlignment; > > + if (OffsetInElmts< 0) BottomAlignment = JAlignment; > > + > > + Type *VType = getVecTypeForPair(aType); > > + unsigned VecAlignment = TD->getPrefTypeAlignment(VType); > > + if (BottomAlignment< VecAlignment) > > + return false; > > + } > > + } else { > > + return false; > > + } > > + } else if (isa(I)) { > > + // Only merge two shuffles if they're both constant > > + return isa(I->getOperand(2))&& > > + isa(J->getOperand(2)); > > + // FIXME: We may want to vectorize non-constant shuffles also. > > + } > > + > > + return true; > > + } > > + > > + // Figure out whether or not J uses I and update the users and write-set > > + // structures associated with I. Specifically, Users represents the set of > > + // instructions that depend on I. 
WriteSet represents the set > > + // of memory locations that are dependent on I. If UpdateUsers is true, > > + // and J uses I, then Users is updated to contain J and WriteSet is updated > > + // to contain any memory locations to which J writes. The function returns > > + // true if J uses I. By default, alias analysis is used to determine > > + // whether J reads from memory that overlaps with a location in WriteSet. > > + // If LoadMoveSet is not null, then it is a previously-computed multimap > > + // where the key is the memory-based user instruction and the value is > > + // the instruction to be compared with I. So, if LoadMoveSet is provided, > > + // then the alias analysis is not used. This is necessary because this > > + // function is called during the process of moving instructions during > > + // vectorization and the results of the alias analysis are not stable during > > + // that process. > > + bool BBVectorize::trackUsesOfI(DenseSet &Users, > > + AliasSetTracker&WriteSet, Instruction *I, > > + Instruction *J, bool UpdateUsers, > > + std::multimap *LoadMoveSet) { > > + bool UsesI = false; > > + > > + // This instruction may already be marked as a user due, for example, to > > + // being a member of a selected pair. > > + if (Users.count(J)) > > + UsesI = true; > > + > > + if (!UsesI) > > + for (User::op_iterator JU = J->op_begin(), e = J->op_end(); > > + JU != e; ++JU) { > > This is correct, but it's common to say "JU = ..., JE = ..." for > consistency. 
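For illustration only, the naming convention being suggested looks like this on a plain standard container. The helper and its element type are made up for the example and are not taken from the patch.

```cpp
#include <vector>

// Hypothetical helper, not part of the patch: returns true if Needle appears
// among Operands. The end iterator (JE) is named to match the loop iterator
// (JU) and is evaluated once, before the loop starts.
bool hasOperand(const std::vector<int> &Operands, int Needle) {
  for (std::vector<int>::const_iterator JU = Operands.begin(),
                                        JE = Operands.end();
       JU != JE; ++JU)
    if (*JU == Needle)
      return true;
  return false;
}
```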
> > > + Value *V = *JU; > > + if (I == V || Users.count(V)) { > > + UsesI = true; > > + break; > > + } > > + } > > + if (!UsesI&& J->mayReadFromMemory()) { > > + if (LoadMoveSet) { > > + VPIteratorPair JPairRange = LoadMoveSet->equal_range(J); > > + UsesI = isSecondInIteratorPair(I, JPairRange); > > + } else { > > + for (AliasSetTracker::iterator W = WriteSet.begin(), > > + WE = WriteSet.end(); W != WE; ++W) { > > + for (AliasSet::iterator A = W->begin(), AE = W->end(); > > + A != AE; ++A) { > > + AliasAnalysis::Location ptrLoc(A->getValue(), A->getSize(), > > + A->getTBAAInfo()); > > + if (AA->getModRefInfo(J, ptrLoc) != AliasAnalysis::NoModRef) { > > + UsesI = true; > > + break; > > + } > > + } > > + if (UsesI) break; > > + } > > + } > > + } > > + > > + if (UsesI&& UpdateUsers) { > > + if (J->mayWriteToMemory()) WriteSet.add(J); > > + Users.insert(J); > > + } > > + > > + return UsesI; > > + } > > + > > + // This function iterates over all instruction pairs in the provided > > + // basic block and collects all candidate pairs for vectorization. > > + void BBVectorize::getCandidatePairs(BasicBlock&BB, > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts) { > > + BasicBlock::iterator E = BB.end(); > > + for (BasicBlock::iterator I = BB.getFirstInsertionPt(); I != E; ++I) { > > + bool IsSimpleLoadStore; > > + if (!isInstVectorizable(I, IsSimpleLoadStore)) continue; > > + > > + // Look for an instruction with which to pair instruction *I... > > + DenseSet Users; > > + AliasSetTracker WriteSet(*AA); > > + BasicBlock::iterator J = I; ++J; > > + for (unsigned ss = 0; J != E&& ss<= SearchLimit; ++J, ++ss) { > > + // Determine if J uses I, if so, exit the loop. > > + bool UsesI = trackUsesOfI(Users, WriteSet, I, J, !FastDep); > > + if (FastDep) { > > + // Note: For this heuristic to be effective, independent operations > > + // must tend to be intermixed. 
This is likely to be true from some > > + // kinds of grouped loop unrolling (but not the generic LLVM pass), > > + // but otherwise may require some kind of reordering pass. > > + > > + // When using fast dependency analysis, > > + // stop searching after first use: > > + if (UsesI) break; > > + } else { > > + if (UsesI) continue; > > + } > > + > > + // J does not use I, and comes before the first use of I, so it can be > > + // merged with I if the instructions are compatible. > > + if (!areInstsCompatible(I, J, IsSimpleLoadStore)) continue; > > + > > + // J is a candidate for merging with I. > > + if (!PairableInsts.size() || > > + PairableInsts[PairableInsts.size()-1] != I) { > > + PairableInsts.push_back(I); > > + } > > + CandidatePairs.insert(ValuePair(I, J)); > > + DEBUG(if (DebugCandidateSelection) dbgs()<< "BBV: candidate pair" > > +<< *I<< "<-> "<< *J<< "\n"); > > + } > > + } > > + > > + DEBUG(dbgs()<< "BBV: found "<< PairableInsts.size() > > +<< " instructions with candidate pairs\n"); > > + } > > + > > + // Finds candidate pairs connected to the pair P =. This means that > > + // it looks for pairs such that both members have an input which is an > > + // output of PI or PJ. > > + void BBVectorize::computePairsConnectedTo( > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + std::multimap &ConnectedPairs, > > + ValuePair P) { > > + // For each possible pairing for this variable, look at the uses of > > + // the first value... > > + for (Value::use_iterator I = P.first->use_begin(), > > + E = P.first->use_end(); I != E; ++I) { > > + VPIteratorPair IPairRange = CandidatePairs.equal_range(*I); > > + > > + // For each use of the first variable, look for uses of the second > > + // variable... 
> > + for (Value::use_iterator J = P.second->use_begin(), > > + E2 = P.second->use_end(); J != E2; ++J) { > > + VPIteratorPair JPairRange = CandidatePairs.equal_range(*J); > > + > > + // Look for: > > + if (isSecondInIteratorPair(*J, IPairRange)) > > + ConnectedPairs.insert(VPPair(P, ValuePair(*I, *J))); > > + > > + // Look for: > > + if (isSecondInIteratorPair(*I, JPairRange)) > > + ConnectedPairs.insert(VPPair(P, ValuePair(*J, *I))); > > + } > > + > > + if (SplatBreaksChain) continue; > > + // Look for cases where just the first value in the pair is used by > > + // both members of another pair (splatting). > > + for (Value::use_iterator J = P.first->use_begin(); J != E; ++J) { > > + if (isSecondInIteratorPair(*J, IPairRange)) > > + ConnectedPairs.insert(VPPair(P, ValuePair(*I, *J))); > > + } > > + } > > + > > + if (SplatBreaksChain) return; > > + // Look for cases where just the second value in the pair is used by > > + // both members of another pair (splatting). > > + for (Value::use_iterator I = P.second->use_begin(), > > + E = P.second->use_end(); I != E; ++I) { > > + VPIteratorPair IPairRange = CandidatePairs.equal_range(*I); > > + > > + for (Value::use_iterator J = P.second->use_begin(); J != E; ++J) { > > + if (isSecondInIteratorPair(*J, IPairRange)) > > + ConnectedPairs.insert(VPPair(P, ValuePair(*I, *J))); > > + } > > + } > > + } > > + > > + // This function figures out which pairs are connected. Two pairs are > > + // connected if some output of the first pair forms an input to both members > > + // of the second pair. 
> > + void BBVectorize::computeConnectedPairs( > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + std::multimap &ConnectedPairs) { > > + > > + for (std::vector::iterator PI = PairableInsts.begin(), > > + PE = PairableInsts.end(); PI != PE; ++PI) { > > + VPIteratorPair choiceRange = CandidatePairs.equal_range(*PI); > > + > > + for (std::multimap::iterator P = choiceRange.first; > > + P != choiceRange.second; ++P) > > + computePairsConnectedTo(CandidatePairs, PairableInsts, > > + ConnectedPairs, *P); > > + } > > + > > + DEBUG(dbgs()<< "BBV: found "<< ConnectedPairs.size() > > +<< " pair connections.\n"); > > + } > > + > > + // This function builds a set of use tuples such that is in the set > > + // if B is in the use tree of A. If B is in the use tree of A, then B > > + // depends on the output of A. > > + void BBVectorize::buildDepMap( > > + BasicBlock&BB, > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + DenseSet &PairableInstUsers) { > > + DenseSet IsInPair; > > + for (std::multimap::iterator C = CandidatePairs.begin(), > > + E = CandidatePairs.end(); C != E; ++C) { > > + IsInPair.insert(C->first); > > + IsInPair.insert(C->second); > > + } > > + > > + // Iterate through the basic block, recording all Users of each > > + // pairable instruction. > > + > > + BasicBlock::iterator E = BB.end(); > > + for (BasicBlock::iterator I = BB.getFirstInsertionPt(); I != E; ++I) { > > "for (...; !isa(I); ++I) {" should also work, and avoid > the need to declare 'E' above. 
> > > + if (IsInPair.find(I) == IsInPair.end()) continue; > > + > > + DenseSet Users; > > + AliasSetTracker WriteSet(*AA); > > + for (BasicBlock::iterator J = llvm::next(I); J != E; ++J) > > + (void) trackUsesOfI(Users, WriteSet, I, J); > > + > > + for (DenseSet::iterator U = Users.begin(), E = Users.end(); > > + U != E; ++U) > > + PairableInstUsers.insert(ValuePair(I, *U)); > > + } > > + } > > + > > + // Returns true if an input to pair P is an output of pair Q and also an > > + // input of pair Q is an output of pair P. If this is the case, then these > > + // two pairs cannot be simultaneously fused. > > + bool BBVectorize::pairsConflict(ValuePair P, ValuePair Q, > > + DenseSet &PairableInstUsers, > > + std::multimap *PairableInstUserMap) { > > + // Two pairs are in conflict if they are mutual Users of eachother. > > + bool QUsesP = PairableInstUsers.count(ValuePair(P.first, Q.first)) || > > + PairableInstUsers.count(ValuePair(P.first, Q.second)) || > > + PairableInstUsers.count(ValuePair(P.second, Q.first)) || > > + PairableInstUsers.count(ValuePair(P.second, Q.second)); > > + bool PUsesQ = PairableInstUsers.count(ValuePair(Q.first, P.first)) || > > + PairableInstUsers.count(ValuePair(Q.first, P.second)) || > > + PairableInstUsers.count(ValuePair(Q.second, P.first)) || > > + PairableInstUsers.count(ValuePair(Q.second, P.second)); > > + if (PairableInstUserMap) { > > + // FIXME: The expensive part of the cycle check is not so much the cycle > > + // check itself but this edge insertion procedure. This needs some > > + // profiling and probably a different data structure (same is true of > > + // most uses of std::multimap). 
> > + if (PUsesQ) { > > + VPPIteratorPair QPairRange = PairableInstUserMap->equal_range(Q); > > + if (!isSecondInIteratorPair(P, QPairRange)) > > + PairableInstUserMap->insert(VPPair(Q, P)); > > + } > > + if (QUsesP) { > > + VPPIteratorPair PPairRange = PairableInstUserMap->equal_range(P); > > + if (!isSecondInIteratorPair(Q, PPairRange)) > > + PairableInstUserMap->insert(VPPair(P, Q)); > > + } > > + } > > + > > + return (QUsesP&& PUsesQ); > > + } > > + > > + // This function walks the use graph of current pairs to see if, starting > > + // from P, the walk returns to P. > > + bool BBVectorize::pairWillFormCycle(ValuePair P, > > + std::multimap &PairableInstUserMap, > > + DenseSet &CurrentPairs) { > > + DEBUG(if (DebugCycleCheck) > > + dbgs()<< "BBV: starting cycle check for : "<< *P.first<< "<-> " > > +<< *P.second<< "\n"); > > + // A lookup table of visisted pairs is kept because the PairableInstUserMap > > + // contains non-direct associations. > > + DenseSet Visited; > > + std::vector Q; > > + // General depth-first post-order traversal: > > + Q.push_back(P); > > + while (!Q.empty()) { > > This is always true on the first iteration. Please make this a: > > SmallVector Q; > Q.push_back(P); > do { > ValuePair QTop = Q.pop_back_val(); > Visited.insert(QTop); > // ... > } while(!Q.empty()); > > loop. 
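For readers following the loop-rotation suggestion above, here is a minimal standalone sketch of the same worklist shape. It is not the BBVectorize code: plain `int` nodes and a `std::multimap<int, int>` named `Edges` stand in for `ValuePair` and `PairableInstUserMap`, the `CurrentPairs` filter is left out, and the name `reachesItself` is made up for illustration. The point is only that the worklist is known to be non-empty on entry, so the emptiness test can move to the bottom of the loop.

```cpp
#include <map>
#include <set>
#include <utility>
#include <vector>

// Returns true if a walk over Edges that starts at Start can get back to
// Start. Edges is a hypothetical stand-in for PairableInstUserMap.
bool reachesItself(const std::multimap<int, int> &Edges, int Start) {
  typedef std::multimap<int, int>::const_iterator EdgeIter;
  std::set<int> Visited;
  std::vector<int> Worklist;
  Worklist.push_back(Start);
  // The worklist holds Start here, so the first emptiness check would
  // always pass; a do/while tests it only after each iteration.
  do {
    int Top = Worklist.back();
    Worklist.pop_back();
    Visited.insert(Top);
    std::pair<EdgeIter, EdgeIter> Range = Edges.equal_range(Top);
    for (EdgeIter E = Range.first; E != Range.second; ++E) {
      if (E->second == Start)
        return true; // walked back to the starting node
      if (Visited.count(E->second) == 0)
        Worklist.push_back(E->second);
    }
  } while (!Worklist.empty());
  return false;
}
```

A cycle that does not pass through `Start` is not reported, which matches the "starting from P, the walk returns to P" wording in the patch comment.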
> > > + ValuePair QTop = Q.back(); > > + > > + Visited.insert(QTop); > > + Q.pop_back(); > > + > > + DEBUG(if (DebugCycleCheck) > > + dbgs()<< "BBV: cycle check visiting: "<< *QTop.first<< "<-> " > > +<< *QTop.second<< "\n"); > > + VPPIteratorPair QPairRange = PairableInstUserMap.equal_range(QTop); > > + for (std::multimap::iterator C = QPairRange.first; > > + C != QPairRange.second; ++C) { > > + if (C->second == P) { > > + DEBUG(dbgs() > > +<< "BBV: rejected to prevent non-trivial cycle formation:" > > +<< *C->first.first<< "<-> "<< *C->first.second<< "\n"); > > + return true; > > + } > > + > > + if (CurrentPairs.count(C->second)> 0&& > > + Visited.count(C->second) == 0) > > + Q.push_back(C->second); > > + } > > + } > > + > > + return false; > > + } > > + > > + // This function builds the initial tree of connected pairs with the > > + // pair J at the root. > > + void BBVectorize::buildInitialTreeFor( > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + std::multimap &ConnectedPairs, > > + DenseSet &PairableInstUsers, > > + DenseMap &ChosenPairs, > > + DenseMap &Tree, ValuePair J) { > > + // Each of these pairs is viewed as the root node of a Tree. The Tree > > + // is then walked (depth-first). As this happens, we keep track of > > + // the pairs that compose the Tree and the maximum depth of the Tree. > > + std::vector Q; > > + // General depth-first post-order traversal: > > + Q.push_back(ValuePairWithDepth(J, getDepthFactor(J.first))); > > + while (!Q.empty()) { > > + ValuePairWithDepth QTop = Q.back(); > > This loop can be rotated too, though you may not want to switch to using > pop_back_val() here (I see that you do additional pushes and optional > pops in the loop). 
> > > + > > + // Push each child onto the queue: > > + bool MoreChildren = false; > > + size_t MaxChildDepth = QTop.second; > > + VPPIteratorPair qtRange = ConnectedPairs.equal_range(QTop.first); > > + for (std::map::iterator k = qtRange.first; > > + k != qtRange.second; ++k) { > > + // Make sure that this child pair is still a candidate: > > + bool IsStillCand = false; > > + VPIteratorPair checkRange = > > + CandidatePairs.equal_range(k->second.first); > > + for (std::multimap::iterator m = checkRange.first; > > + m != checkRange.second; ++m) { > > + if (m->second == k->second.second) { > > + IsStillCand = true; > > + break; > > + } > > + } > > + > > + if (IsStillCand) { > > + DenseMap::iterator C = Tree.find(k->second); > > + if (C == Tree.end()) { > > + size_t d = getDepthFactor(k->second.first); > > + Q.push_back(ValuePairWithDepth(k->second, QTop.second+d)); > > + MoreChildren = true; > > + } else { > > + MaxChildDepth = std::max(MaxChildDepth, C->second); > > + } > > + } > > + } > > + > > + if (!MoreChildren) { > > + // Record the current pair as part of the Tree: > > + Tree.insert(ValuePairWithDepth(QTop.first, MaxChildDepth)); > > + Q.pop_back(); > > + } > > + } > > + } > > + > > + // Given some initial tree, prune it by removing conflicting pairs (pairs > > + // that cannot be simultaneously chosen for vectorization). > > + void BBVectorize::pruneTreeFor( > > + std::multimap &CandidatePairs, > > + std::vector &PairableInsts, > > + std::multimap &ConnectedPairs, > > + DenseSet &PairableInstUsers, > > + std::multimap &PairableInstUserMap, > > + DenseMap &ChosenPairs, > > + DenseMap &Tree, > > + DenseSet &PrunedTree, ValuePair J, > > + bool UseCycleCheck) { > > + std::vector Q; > > + // General depth-first post-order traversal: > > + Q.push_back(ValuePairWithDepth(J, getDepthFactor(J.first))); > > + while (!Q.empty()) { > > + ValuePairWithDepth QTop = Q.back(); > > + PrunedTree.insert(QTop.first); > > + Q.pop_back(); > > Another loop to restructure. 
> > (Stopped reviewing at this point.) Thanks for looking at this! I'll fix some of the quick ones now, and I'll do the rest tomorrow. -Hal > > Nick -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory From eli.bendersky at intel.com Tue Jan 31 23:38:02 2012 From: eli.bendersky at intel.com (Bendersky, Eli) Date: Wed, 1 Feb 2012 05:38:02 +0000 Subject: [llvm-commits] [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: <9BBE4537D1BAAB479E9E8F9D4234619D3241B7@HASMSX103.ger.corp.intel.com> References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D3241B7@HASMSX103.ger.corp.intel.com> Message-ID: <9BBE4537D1BAAB479E9E8F9D4234619D326AD3@HASMSX103.ger.corp.intel.com> Ping > -----Original Message----- > From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits- > bounces at cs.uiuc.edu] On Behalf Of Bendersky, Eli > Sent: Friday, January 27, 2012 07:35 > To: llvm-commits at cs.uiuc.edu > Subject: Re: [llvm-commits] [PATCH] enabling generation of ELF objects on > Windows with the help of the triple > > Ping. > > Any objections, or OK to commit? > > Eli > > > > -----Original Message----- > > From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits- > > bounces at cs.uiuc.edu] On Behalf Of Bendersky, Eli > > Sent: Tuesday, January 24, 2012 15:03 > > To: llvm-commits at cs.uiuc.edu > > Subject: [llvm-commits] [PATCH] enabling generation of ELF objects on > > Windows with the help of the triple > > > > Hello, > > > > Earlier this month I initiated a llvmdev discussion on the possibility > > to make MC generate code into an ELF container on Windows > > (http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-January/046583.html). > > Currently in several places in the code the decision is made based on > > the Triple's OS component. 
When it's Windows, a decision is made > > automatically to generate COFF, so a way is needed to let MC know that > > we still want ELF, even if we're on Windows. > > > > There are several approaches to this: > > > > 1. Add this information somewhere which isn't the Triple 2. Add this > > information into the Triple, making it a 5-tuple instead of 4-tuple - > > the 5th component being "container" or something like that 3. Add this > > information into the Triple, overlaying the "environment" component > > > > The attached patch takes approach (3) since this appears to make the > > minimal overall impact on the code. It adds an "ELF" option to the > > EnvironmentType enum. Since we're interested in ELF on Windows on x86, > > this environment option doesn't conflict with the others. In other > > words, it enables us to generate and run MCJIT-ted code on Windows, > > without interfering with other code in LLVM. > > > > Although approach (1) would perhaps be cleaner, it is not easy to see > > how to go about it, since in many places where the modification is > > required the triple is the only accessible piece of information about > > the compiler target. The decision to generate COFF on Windows is based > > on the Triple, not on something else. > > > > I'll be happy to hear about other options, or to get this patch > > reviewed so I can commit it. > > > > Thanks in advance, > > Eli > > > > --------------------------------------------------------------------- > > Intel Israel (74) Limited > > > > This e-mail and any attachments may contain confidential material for > > the sole use of the intended recipient(s). Any review or distribution > > by others is strictly prohibited. If you are not the intended > > recipient, please contact the sender and delete all copies. 
From hfinkel at anl.gov Tue Jan 31 23:51:45 2012 From: hfinkel at anl.gov (Hal Finkel) Date: Wed, 01 Feb 2012 05:51:45 -0000 Subject: [llvm-commits] [llvm] r149472 - in /llvm/trunk: include/llvm-c/Transforms/Vectorize.h lib/Transforms/Vectorize/BBVectorize.cpp Message-ID: <20120201055145.9A18F2A6C12C@llvm.org> Author: hfinkel Date: Tue Jan 31 23:51:45 2012 New Revision: 149472 URL: http://llvm.org/viewvc/llvm-project?rev=149472&view=rev Log: A few of the changes suggested in code review (by Nick Lewycky) Modified: llvm/trunk/include/llvm-c/Transforms/Vectorize.h llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp Modified: llvm/trunk/include/llvm-c/Transforms/Vectorize.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Transforms/Vectorize.h?rev=149472&r1=149471&r2=149472&view=diff ============================================================================== --- llvm/trunk/include/llvm-c/Transforms/Vectorize.h (original) +++ llvm/trunk/include/llvm-c/Transforms/Vectorize.h Tue Jan 31 23:51:45 2012 @@ -1,4 +1,4 @@ -/*===---------------------------Vectorize.h
------------------- -*- C++ -*-===*\ +/*===---------------------------Vectorize.h --------------------- -*- C -*-===*\ |*===----------- Vectorization Transformation Library C Interface ---------===*| |* *| |* The LLVM Compiler Infrastructure *| Modified: llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp?rev=149472&r1=149471&r2=149472&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp Tue Jan 31 23:51:45 2012 @@ -300,6 +300,7 @@ AU.addRequired(); AU.addPreserved(); AU.addPreserved(); + AU.setPreservesCFG(); } // This returns the vector type that holds a pair of the provided type. @@ -308,9 +309,9 @@ if (VectorType *VTy = dyn_cast(ElemTy)) { unsigned numElem = VTy->getNumElements(); return VectorType::get(ElemTy->getScalarType(), numElem*2); - } else { - return VectorType::get(ElemTy, 2); } + + return VectorType::get(ElemTy, 2); } // Returns the weight associated with the provided value. A chain of @@ -431,7 +432,7 @@ std::vector PairableInsts; std::multimap CandidatePairs; getCandidatePairs(BB, CandidatePairs, PairableInsts); - if (PairableInsts.size() == 0) return false; + if (PairableInsts.empty()) return false; // Now we have a map of all of the pairable instructions and we need to // select the best possible pairing. 
A good pairing is one such that the @@ -444,7 +445,7 @@ std::multimap ConnectedPairs; computeConnectedPairs(CandidatePairs, PairableInsts, ConnectedPairs); - if (ConnectedPairs.size() == 0) return false; + if (ConnectedPairs.empty()) return false; // Build the pairable-instruction dependency map DenseSet PairableInstUsers; @@ -459,7 +460,7 @@ choosePairs(CandidatePairs, PairableInsts, ConnectedPairs, PairableInstUsers, ChosenPairs); - if (ChosenPairs.size() == 0) return false; + if (ChosenPairs.empty()) return false; NumFusedOps += ChosenPairs.size(); // A set of pairs has now been selected. It is now necessary to replace the @@ -647,8 +648,8 @@ UsesI = true; if (!UsesI) - for (User::op_iterator JU = J->op_begin(), e = J->op_end(); - JU != e; ++JU) { + for (User::op_iterator JU = J->op_begin(), JE = J->op_end(); + JU != JE; ++JU) { Value *V = *JU; if (I == V || Users.count(V)) { UsesI = true; From geek4civic at gmail.com Wed Feb 1 00:11:58 2012 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Wed, 01 Feb 2012 06:11:58 -0000 Subject: [llvm-commits] [llvm] r149475 - /llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp Message-ID: <20120201061158.669B82A6C12C@llvm.org> Author: chapuni Date: Wed Feb 1 00:11:58 2012 New Revision: 149475 URL: http://llvm.org/viewvc/llvm-project?rev=149475&view=rev Log: BBVectorize.cpp: Try to fix MSVC build. map::iterator and multimap::iterator are incompatible. 
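The build problem named in this log can be illustrated outside LLVM. The following is a minimal sketch, assuming plain `int` keys and values instead of the `ValuePair` types in BBVectorize.cpp, with a made-up helper name `valuesFor`. `std::multimap::equal_range` returns a pair of `std::multimap` iterators, and the C++ standard does not promise that these are the same type as, or convertible to, `std::map` iterators. Some standard library implementations appear to share one iterator type between the two containers, which would explain why the original spelling compiled on other hosts.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the ConnectedPairs lookup: collect every value
// stored under Key in a multimap.
std::vector<int> valuesFor(const std::multimap<int, int> &M, int Key) {
  // Name the iterator type of the container actually being walked;
  // std::map<int, int>::const_iterator is not guaranteed to be compatible.
  typedef std::multimap<int, int>::const_iterator Iter;
  std::pair<Iter, Iter> Range = M.equal_range(Key);
  std::vector<int> Out;
  for (Iter K = Range.first; K != Range.second; ++K)
    Out.push_back(K->second);
  return Out;
}
```

Spelling the iterator through a typedef of the container type, as above, keeps the loop header and the container declaration from drifting apart.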
Modified: llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp Modified: llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp?rev=149475&r1=149474&r2=149475&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp (original) +++ llvm/trunk/lib/Transforms/Vectorize/BBVectorize.cpp Wed Feb 1 00:11:58 2012 @@ -940,7 +940,7 @@ bool MoreChildren = false; size_t MaxChildDepth = QTop.second; VPPIteratorPair qtRange = ConnectedPairs.equal_range(QTop.first); - for (std::map::iterator k = qtRange.first; + for (std::multimap::iterator k = qtRange.first; k != qtRange.second; ++k) { // Make sure that this child pair is still a candidate: bool IsStillCand = false; @@ -997,7 +997,7 @@ // Visit each child, pruning as necessary... DenseMap BestChilden; VPPIteratorPair QTopRange = ConnectedPairs.equal_range(QTop.first); - for (std::map::iterator K = QTopRange.first; + for (std::multimap::iterator K = QTopRange.first; K != QTopRange.second; ++K) { DenseMap::iterator C = Tree.find(K->second); if (C == Tree.end()) continue; From craig.topper at gmail.com Wed Feb 1 00:51:58 2012 From: craig.topper at gmail.com (Craig Topper) Date: Wed, 01 Feb 2012 06:51:58 -0000 Subject: [llvm-commits] [llvm] r149478 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <20120201065158.A33B82A6C12C@llvm.org> Author: ctopper Date: Wed Feb 1 00:51:58 2012 New Revision: 149478 URL: http://llvm.org/viewvc/llvm-project?rev=149478&view=rev Log: Don't create VBROADCAST nodes if any nodes use the chain result from the load. Fixes PR11900. 
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149478&r1=149477&r2=149478&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Feb 1 00:51:58 2012 @@ -4997,6 +4997,10 @@ if (!ISD::isNormalLoad(Ld.getNode())) return SDValue(); + // Reject loads that have uses of the chain result + if (Ld->hasAnyUseOfValue(1)) + return SDValue(); + bool Is256 = VT.getSizeInBits() == 256; bool Is128 = VT.getSizeInBits() == 128; unsigned ScalarSize = Ld.getValueType().getSizeInBits(); From atrick at apple.com Wed Feb 1 01:16:17 2012 From: atrick at apple.com (Andrew Trick) Date: Wed, 01 Feb 2012 07:16:17 -0000 Subject: [llvm-commits] [llvm] r149479 - in /llvm/trunk/include/llvm: PassManager.h PassManagers.h Message-ID: <20120201071617.E80DA2A6C12C@llvm.org> Author: atrick Date: Wed Feb 1 01:16:17 2012 New Revision: 149479 URL: http://llvm.org/viewvc/llvm-project?rev=149479&view=rev Log: whitespace Modified: llvm/trunk/include/llvm/PassManager.h llvm/trunk/include/llvm/PassManagers.h Modified: llvm/trunk/include/llvm/PassManager.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/PassManager.h?rev=149479&r1=149478&r2=149479&view=diff ============================================================================== --- llvm/trunk/include/llvm/PassManager.h (original) +++ llvm/trunk/include/llvm/PassManager.h Wed Feb 1 01:16:17 2012 @@ -53,7 +53,7 @@ /// will be destroyed as well, so there is no need to delete the pass. This /// implies that all passes MUST be allocated with 'new'. void add(Pass *P); - + /// run - Execute all of the passes scheduled for execution. Keep track of /// whether any of the passes modifies the module, and if so, return true. 
bool run(Module &M); @@ -63,7 +63,7 @@ /// checking whether to add a printer pass. void addImpl(Pass *P); - /// PassManagerImpl_New is the actual class. PassManager is just the + /// PassManagerImpl_New is the actual class. PassManager is just the /// wraper to publish simple pass manager interface PassManagerImpl *PM; }; @@ -75,7 +75,7 @@ /// but does not take ownership of, the specified Module. explicit FunctionPassManager(Module *M); ~FunctionPassManager(); - + /// add - Add a pass to the queue of passes to run. This passes /// ownership of the Pass to the PassManager. When the /// PassManager_X is destroyed, the pass will be destroyed as well, so @@ -88,15 +88,15 @@ /// so, return true. /// bool run(Function &F); - + /// doInitialization - Run all of the initializers for the function passes. /// bool doInitialization(); - + /// doFinalization - Run all of the finalizers for the function passes. /// bool doFinalization(); - + private: /// addImpl - Add a pass to the queue of passes to run, without /// checking whether to add a printer pass. Modified: llvm/trunk/include/llvm/PassManagers.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/PassManagers.h?rev=149479&r1=149478&r2=149479&view=diff ============================================================================== --- llvm/trunk/include/llvm/PassManagers.h (original) +++ llvm/trunk/include/llvm/PassManagers.h Wed Feb 1 01:16:17 2012 @@ -7,7 +7,7 @@ // //===----------------------------------------------------------------------===// // -// This file declares the LLVM Pass Manager infrastructure. +// This file declares the LLVM Pass Manager infrastructure. // //===----------------------------------------------------------------------===// @@ -24,11 +24,11 @@ //===----------------------------------------------------------------------===// // Overview: // The Pass Manager Infrastructure manages passes. 
It's responsibilities are: -// +// // o Manage optimization pass execution order // o Make required Analysis information available before pass P is run // o Release memory occupied by dead passes -// o If Analysis information is dirtied by a pass then regenerate Analysis +// o If Analysis information is dirtied by a pass then regenerate Analysis // information before it is consumed by another pass. // // Pass Manager Infrastructure uses multiple pass managers. They are @@ -43,13 +43,13 @@ // // [o] class PMTopLevelManager; // -// Two top level managers, PassManager and FunctionPassManager, derive from -// PMTopLevelManager. PMTopLevelManager manages information used by top level +// Two top level managers, PassManager and FunctionPassManager, derive from +// PMTopLevelManager. PMTopLevelManager manages information used by top level // managers such as last user info. // // [o] class PMDataManager; // -// PMDataManager manages information, e.g. list of available analysis info, +// PMDataManager manages information, e.g. list of available analysis info, // used by a pass manager to manage execution order of passes. It also provides // a place to implement common pass manager APIs. All pass managers derive from // PMDataManager. @@ -109,7 +109,7 @@ ON_REGION_MSG, // " 'on Region ...\n'" ON_LOOP_MSG, // " 'on Loop ...\n'" ON_CG_MSG // "' on Call Graph ...\n'" -}; +}; /// PassManagerPrettyStackEntry - This is used to print informative information /// about what pass is running when/if a stack trace is generated. @@ -124,19 +124,19 @@ : P(p), V(&v), M(0) {} // When P is run on V PassManagerPrettyStackEntry(Pass *p, Module &m) : P(p), V(0), M(&m) {} // When P is run on M - + /// print - Emit information about this stack frame to OS. virtual void print(raw_ostream &OS) const; }; - - + + //===----------------------------------------------------------------------===// // PMStack // /// PMStack - This class implements a stack data structure of PMDataManager /// pointers. 
/// -/// Top level pass managers (see PassManager.cpp) maintain active Pass Managers +/// Top level pass managers (see PassManager.cpp) maintain active Pass Managers /// using PMStack. Each Pass implements assignPassManager() to connect itself /// with appropriate manager. assignPassManager() walks PMStack to find /// suitable manager. @@ -174,7 +174,7 @@ void initializeAllAnalysisInfo(); private: - /// This is implemented by top level pass manager and used by + /// This is implemented by top level pass manager and used by /// schedulePass() to add analysis info passes that are not available. virtual void addTopLevelPass(Pass *P) = 0; @@ -198,7 +198,7 @@ /// Find analysis usage information for the pass P. AnalysisUsage *findAnalysisUsage(Pass *P); - virtual ~PMTopLevelManager(); + virtual ~PMTopLevelManager(); /// Add immutable pass and initialize it. inline void addImmutablePass(ImmutablePass *P) { @@ -228,7 +228,7 @@ PMStack activeStack; protected: - + /// Collection of pass managers SmallVector PassManagers; @@ -254,7 +254,7 @@ }; - + //===----------------------------------------------------------------------===// // PMDataManager @@ -268,7 +268,7 @@ } virtual ~PMDataManager(); - + virtual Pass *getAsPass() = 0; /// Augment AvailableAnalysis by adding analysis made available by pass P. @@ -279,16 +279,16 @@ /// Remove Analysis that is not preserved by the pass void removeNotPreservedAnalysis(Pass *P); - + /// Remove dead passes used by P. - void removeDeadPasses(Pass *P, StringRef Msg, + void removeDeadPasses(Pass *P, StringRef Msg, enum PassDebuggingString); /// Remove P. - void freePass(Pass *P, StringRef Msg, + void freePass(Pass *P, StringRef Msg, enum PassDebuggingString); - /// Add pass P into the PassVector. Update + /// Add pass P into the PassVector. Update /// AvailableAnalysis appropriately if ProcessAnalysis is true. 
void add(Pass *P, bool ProcessAnalysis = true); @@ -300,7 +300,7 @@ virtual Pass *getOnTheFlyPass(Pass *P, AnalysisID PI, Function &F); /// Initialize available analysis information. - void initializeAnalysisInfo() { + void initializeAnalysisInfo() { AvailableAnalysis.clear(); for (unsigned i = 0; i < PMT_Last; ++i) InheritedAnalysis[i] = NULL; @@ -347,9 +347,9 @@ return (unsigned)PassVector.size(); } - virtual PassManagerType getPassManagerType() const { + virtual PassManagerType getPassManagerType() const { assert ( 0 && "Invalid use of getPassManagerType"); - return PMT_Unknown; + return PMT_Unknown; } std::map *getAvailableAnalysis() { @@ -377,17 +377,17 @@ // then PMT_Last active pass mangers. std::map *InheritedAnalysis[PMT_Last]; - + /// isPassDebuggingExecutionsOrMore - Return true if -debug-pass=Executions /// or higher is specified. bool isPassDebuggingExecutionsOrMore() const; - + private: void dumpAnalysisUsage(StringRef Msg, const Pass *P, const AnalysisUsage::VectorType &Set) const; - // Set of available Analysis. This information is used while scheduling - // pass. If a pass requires an analysis which is not available then + // Set of available Analysis. This information is used while scheduling + // pass. If a pass requires an analysis which is not available then // the required analysis pass is scheduled to run before the pass itself is // scheduled to run. std::map AvailableAnalysis; @@ -403,27 +403,27 @@ // FPPassManager // /// FPPassManager manages BBPassManagers and FunctionPasses. -/// It batches all function passes and basic block pass managers together and -/// sequence them to process one function at a time before processing next +/// It batches all function passes and basic block pass managers together and +/// sequence them to process one function at a time before processing next /// function. 
class FPPassManager : public ModulePass, public PMDataManager { public: static char ID; - explicit FPPassManager() + explicit FPPassManager() : ModulePass(ID), PMDataManager() { } - + /// run - Execute all of the passes scheduled for execution. Keep track of /// whether any of the passes modifies the module, and if so, return true. bool runOnFunction(Function &F); bool runOnModule(Module &M); - + /// cleanup - After running all passes, clean up pass manager cache. void cleanup(); /// doInitialization - Run all of the initializers for the function passes. /// bool doInitialization(Module &M); - + /// doFinalization - Run all of the finalizers for the function passes. /// bool doFinalization(Module &M); @@ -449,8 +449,8 @@ return FP; } - virtual PassManagerType getPassManagerType() const { - return PMT_FunctionPassManager; + virtual PassManagerType getPassManagerType() const { + return PMT_FunctionPassManager; } };

From atrick at apple.com Wed Feb 1 01:16:21 2012
From: atrick at apple.com (Andrew Trick)
Date: Wed, 01 Feb 2012 07:16:21 -0000
Subject: [llvm-commits] [llvm] r149480 - in /llvm/trunk: include/llvm/PassManager.h include/llvm/PassManagers.h lib/VMCore/PassManager.cpp
Message-ID: <20120201071621.336E32A6C12C@llvm.org>

Author: atrick
Date: Wed Feb 1 01:16:20 2012
New Revision: 149480

URL: http://llvm.org/viewvc/llvm-project?rev=149480&view=rev
Log:
Add pass printer passes in the right place.

The pass pointer should never be referenced after sending it to schedulePass(), which may delete the pass. To fix this bug I had to clean up the design, leading to more goodness.

You may notice now that any non-analysis pass is printed. So things like loop-simplify and lcssa show up, while target lib, target data, and alias analysis do not show up.

Normally, analyses don't mutate the IR, but you can now check this by using both -print-after and -print-before. The effects of analyses will now show up in between the two.

The llc path is still in bad shape. But I'll be improving it in my next checkin. Meanwhile, print-machineinstrs still works the same way. With print-before/after, many llc passes that were not printed before now are; some of these should be converted to analyses. A few very important passes, isel and scheduler, are not properly initialized, so they are not printed.

Modified:
    llvm/trunk/include/llvm/PassManager.h
    llvm/trunk/include/llvm/PassManagers.h
    llvm/trunk/lib/VMCore/PassManager.cpp

Modified: llvm/trunk/include/llvm/PassManager.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/PassManager.h?rev=149480&r1=149479&r2=149480&view=diff
==============================================================================
--- llvm/trunk/include/llvm/PassManager.h (original)
+++ llvm/trunk/include/llvm/PassManager.h Wed Feb 1 01:16:20 2012
@@ -59,10 +59,6 @@ bool run(Module &M); private: - /// addImpl - Add a pass to the queue of passes to run, without - /// checking whether to add a printer pass. - void addImpl(Pass *P); - /// PassManagerImpl_New is the actual class. PassManager is just the /// wraper to publish simple pass manager interface PassManagerImpl *PM; @@ -79,7 +75,7 @@ /// add - Add a pass to the queue of passes to run. This passes /// ownership of the Pass to the PassManager. When the /// PassManager_X is destroyed, the pass will be destroyed as well, so - /// there is no need to delete the pass. (TODO delete passes.) + /// there is no need to delete the pass. /// This implies that all passes MUST be allocated with 'new'. void add(Pass *P); @@ -98,10 +94,6 @@ bool doFinalization(); private: - /// addImpl - Add a pass to the queue of passes to run, without - /// checking whether to add a printer pass.
- void addImpl(Pass *P); - FunctionPassManagerImpl *FPM; Module *M; }; Modified: llvm/trunk/include/llvm/PassManagers.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/PassManagers.h?rev=149480&r1=149479&r2=149480&view=diff ============================================================================== --- llvm/trunk/include/llvm/PassManagers.h (original) +++ llvm/trunk/include/llvm/PassManagers.h Wed Feb 1 01:16:20 2012 @@ -82,7 +82,7 @@ // relies on PassManagerImpl to do all the tasks. // // [o] class PassManagerImpl : public Pass, public PMDataManager, -// public PMDTopLevelManager +// public PMTopLevelManager // // PassManagerImpl is a top level pass manager responsible for managing // MPPassManagers. @@ -174,9 +174,8 @@ void initializeAllAnalysisInfo(); private: - /// This is implemented by top level pass manager and used by - /// schedulePass() to add analysis info passes that are not available. - virtual void addTopLevelPass(Pass *P) = 0; + virtual PMDataManager *getAsPMDataManager() = 0; + virtual PassManagerType getTopLevelPassManagerType() = 0; public: /// Schedule pass P for execution. Make sure that passes required by Modified: llvm/trunk/lib/VMCore/PassManager.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/PassManager.cpp?rev=149480&r1=149479&r2=149480&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/PassManager.cpp (original) +++ llvm/trunk/lib/VMCore/PassManager.cpp Wed Feb 1 01:16:20 2012 @@ -84,32 +84,28 @@ /// This is a helper to determine whether to print IR before or /// after a pass. 
-static bool ShouldPrintBeforeOrAfterPass(const void *PassID, +static bool ShouldPrintBeforeOrAfterPass(const PassInfo *PI, PassOptionList &PassesToPrint) { - if (const llvm::PassInfo *PI = - PassRegistry::getPassRegistry()->getPassInfo(PassID)) { - for (unsigned i = 0, ie = PassesToPrint.size(); i < ie; ++i) { - const llvm::PassInfo *PassInf = PassesToPrint[i]; - if (PassInf) - if (PassInf->getPassArgument() == PI->getPassArgument()) { - return true; - } - } + for (unsigned i = 0, ie = PassesToPrint.size(); i < ie; ++i) { + const llvm::PassInfo *PassInf = PassesToPrint[i]; + if (PassInf) + if (PassInf->getPassArgument() == PI->getPassArgument()) { + return true; + } } return false; } - /// This is a utility to check whether a pass should have IR dumped /// before it. -static bool ShouldPrintBeforePass(const void *PassID) { - return PrintBeforeAll || ShouldPrintBeforeOrAfterPass(PassID, PrintBefore); +static bool ShouldPrintBeforePass(const PassInfo *PI) { + return PrintBeforeAll || ShouldPrintBeforeOrAfterPass(PI, PrintBefore); } /// This is a utility to check whether a pass should have IR dumped /// after it. -static bool ShouldPrintAfterPass(const void *PassID) { - return PrintAfterAll || ShouldPrintBeforeOrAfterPass(PassID, PrintAfter); +static bool ShouldPrintAfterPass(const PassInfo *PI) { + return PrintAfterAll || ShouldPrintBeforeOrAfterPass(PI, PrintAfter); } } // End of llvm namespace @@ -264,27 +260,15 @@ virtual PMDataManager *getAsPMDataManager() { return this; } virtual Pass *getAsPass() { return this; } + virtual PassManagerType getTopLevelPassManagerType() { + return PMT_FunctionPassManager; + } /// Pass Manager itself does not invalidate any analysis info. void getAnalysisUsage(AnalysisUsage &Info) const { Info.setPreservesAll(); } - void addTopLevelPass(Pass *P) { - if (ImmutablePass *IP = P->getAsImmutablePass()) { - // P is a immutable pass and it will be managed by this - // top level manager. Set up analysis resolver to connect them. 
- AnalysisResolver *AR = new AnalysisResolver(*this); - P->setResolver(AR); - initializeAnalysisImpl(P); - addImmutablePass(IP); - recordAvailableAnalysis(IP); - } else { - P->assignPassManager(activeStack, PMT_FunctionPassManager); - } - - } - FPPassManager *getContainedManager(unsigned N) { assert(N < PassManagers.size() && "Pass number out of range!"); FPPassManager *FP = static_cast(PassManagers[N]); @@ -417,22 +401,11 @@ Info.setPreservesAll(); } - void addTopLevelPass(Pass *P) { - if (ImmutablePass *IP = P->getAsImmutablePass()) { - // P is a immutable pass and it will be managed by this - // top level manager. Set up analysis resolver to connect them. - AnalysisResolver *AR = new AnalysisResolver(*this); - P->setResolver(AR); - initializeAnalysisImpl(P); - addImmutablePass(IP); - recordAvailableAnalysis(IP); - } else { - P->assignPassManager(activeStack, PMT_ModulePassManager); - } - } - virtual PMDataManager *getAsPMDataManager() { return this; } virtual Pass *getAsPass() { return this; } + virtual PassManagerType getTopLevelPassManagerType() { + return PMT_ModulePassManager; + } MPPassManager *getContainedManager(unsigned N) { assert(N < PassManagers.size() && "Pass number out of range!"); @@ -660,7 +633,32 @@ } // Now all required passes are available. - addTopLevelPass(P); + if (ImmutablePass *IP = P->getAsImmutablePass()) { + // P is a immutable pass and it will be managed by this + // top level manager. Set up analysis resolver to connect them. 
+ PMDataManager *DM = getAsPMDataManager(); + AnalysisResolver *AR = new AnalysisResolver(*DM); + P->setResolver(AR); + DM->initializeAnalysisImpl(P); + addImmutablePass(IP); + DM->recordAvailableAnalysis(IP); + return; + } + + if (PI && !PI->isAnalysis() && ShouldPrintBeforePass(PI)) { + Pass *PP = P->createPrinterPass( + dbgs(), std::string("*** IR Dump Before ") + P->getPassName() + " ***"); + PP->assignPassManager(activeStack, getTopLevelPassManagerType()); + } + + // Add the requested pass to the best available pass manager. + P->assignPassManager(activeStack, getTopLevelPassManagerType()); + + if (PI && !PI->isAnalysis() && ShouldPrintAfterPass(PI)) { + Pass *PP = P->createPrinterPass( + dbgs(), std::string("*** IR Dump After ") + P->getPassName() + " ***"); + PP->assignPassManager(activeStack, getTopLevelPassManagerType()); + } } /// Find the pass that implements Analysis AID. Search immutable @@ -1357,31 +1355,13 @@ delete FPM; } -/// addImpl - Add a pass to the queue of passes to run, without -/// checking whether to add a printer pass. -void FunctionPassManager::addImpl(Pass *P) { - FPM->add(P); -} - /// add - Add a pass to the queue of passes to run. This passes /// ownership of the Pass to the PassManager. When the /// PassManager_X is destroyed, the pass will be destroyed as well, so /// there is no need to delete the pass. (TODO delete passes.) /// This implies that all passes MUST be allocated with 'new'. void FunctionPassManager::add(Pass *P) { - // If this is a not a function pass, don't add a printer for it. 
- const void *PassID = P->getPassID(); - if (P->getPassKind() == PT_Function) - if (ShouldPrintBeforePass(PassID)) - addImpl(P->createPrinterPass(dbgs(), std::string("*** IR Dump Before ") - + P->getPassName() + " ***")); - - addImpl(P); - - if (P->getPassKind() == PT_Function) - if (ShouldPrintAfterPass(PassID)) - addImpl(P->createPrinterPass(dbgs(), std::string("*** IR Dump After ") - + P->getPassName() + " ***")); + FPM->add(P); } /// run - Execute all of the passes scheduled for execution. Keep @@ -1693,27 +1673,12 @@ delete PM; } -/// addImpl - Add a pass to the queue of passes to run, without -/// checking whether to add a printer pass. -void PassManager::addImpl(Pass *P) { - PM->add(P); -} - /// add - Add a pass to the queue of passes to run. This passes ownership of /// the Pass to the PassManager. When the PassManager is destroyed, the pass /// will be destroyed as well, so there is no need to delete the pass. This /// implies that all passes MUST be allocated with 'new'. void PassManager::add(Pass *P) { - const void* PassID = P->getPassID(); - if (ShouldPrintBeforePass(PassID)) - addImpl(P->createPrinterPass(dbgs(), std::string("*** IR Dump Before ") - + P->getPassName() + " ***")); - - addImpl(P); - - if (ShouldPrintAfterPass(PassID)) - addImpl(P->createPrinterPass(dbgs(), std::string("*** IR Dump After ") - + P->getPassName() + " ***")); + PM->add(P); } /// run - Execute all of the passes scheduled for execution. 
Keep track of @@ -1823,7 +1788,7 @@ void FunctionPass::assignPassManager(PMStack &PMS, PassManagerType PreferredType) { - // Find Module Pass Manager + // Find Function Pass Manager while (!PMS.empty()) { if (PMS.top()->getPassManagerType() > PMT_FunctionPassManager) PMS.pop();

From atrick at apple.com Wed Feb 1 01:42:39 2012
From: atrick at apple.com (Andrew Trick)
Date: Tue, 31 Jan 2012 23:42:39 -0800
Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review
In-Reply-To: <07f401cce02b$87f18630$97d49290$@org>
References: <07f401cce02b$87f18630$97d49290$@org>
Message-ID:

On Jan 31, 2012, at 7:18 AM, Sergei Larin wrote:

>
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Sergei Larin
> Sent: Friday, January 27, 2012 10:47 AM
> To: llvm-commits at cs.uiuc.edu
> Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review
>
>
> Hello everybody,
>
> Attached is an initial patch for a VLIW-specific scheduler framework that uses a deterministic finite automaton (DFA).
>
> Several key points:
> - The scheduler is largely based on the existing framework, but introduces several VLIW-specific concepts. It could be classified as a top-down list scheduler, critical path first, with a DFA used for modeling parallel resources. It also models and tracks register pressure in a way similar to the current RegPressure scheduler. It employs a slightly different way to compute the "cost" function for all SUs in AQ, which allows for somewhat easier balancing of multiple heuristic inputs. The current version does _not_ generate bundles/packets (but models them internally). It could be easily modified to do so, and it is our plan to make it a part of bundle generation in the near future.
> - The scheduler is enabled for the Hexagon backend. Compared to any existing scheduler, for this VLIW target this code produces between a 1.9% slowdown and an 11% speedup on our internal test suite. This test set comprises a variety of real-world applications ranging from DSP-specific applications to SPEC. Some DSP kernels (when taken out of context) enjoy up to a 20% speedup when compared to the "default" scheduling mechanism (RegPressure pre-RA + post-RA). The main reason for this kind of corner-case behavior is long chains of independent memory accesses that are conservatively serialized by the default scheduler (and there is no HW scheduler to sort it out at run time).
> - This patch is an initial submission with a bare minimum of features, and more heuristics will be added to it later. We prefer to submit it in stages to simplify the review process and improve SW management.
> - The patch also contains minor updates to two Hexagon-specific tests in order to compensate for the new order of instructions generated by the Hexagon backend __with scheduler disabled__.
> - SVN revision 149130. An LLVM verification test run for the x86 platform detects no additional failures.
>
> Comments and reviews are eagerly anticipated :)

I'm in the process of reviewing this and also reworking the codegen pass configuration to make it easier for targets to plug in scheduling/bundling and other passes. Hopefully you'll see the results of both tomorrow.

This is probably fine to check in in the short term, but you could instead move directly to scheduling machineinstrs after coalescing. Then you can actually work on using MachineBundles. Will it work for you to use the SourceListDAGScheduler and run your scheduler/bundler in the MachineScheduler pass? I think this migration will have to come either now or later for you.

-Andy

From stpworld at narod.ru Wed Feb 1 01:49:52 2012
From: stpworld at narod.ru (Stepan Dyatkovskiy)
Date: Wed, 01 Feb 2012 07:49:52 -0000
Subject: [llvm-commits] [llvm] r149481 - in /llvm/trunk: include/llvm/ include/llvm/Analysis/ lib/Analysis/ lib/Bitcode/Writer/ lib/CodeGen/SelectionDAG/ lib/ExecutionEngine/Interpreter/ lib/Target/CBackend/ lib/Target/CppBackend/ lib/Transforms/IPO/ lib/Transforms/InstCombine/ lib/Transforms/Scalar/ lib/Transforms/Utils/ lib/VMCore/ tools/llvm-diff/
Message-ID: <20120201074953.13FFD2A6C12C@llvm.org>

Author: dyatkovskiy
Date: Wed Feb 1 01:49:51 2012
New Revision: 149481

URL: http://llvm.org/viewvc/llvm-project?rev=149481&view=rev
Log:
SwitchInst refactoring.

The purpose of this refactoring is to hide operand roles from the SwitchInst user (programmer). If you want to work with operands directly, you will probably need lower-level methods than the SwitchInst ones (TerminatorInst or maybe User). After this patch we can reorganize SwitchInst operands and successors as we want.

What was done:

1. Changed the semantics of the index inside the getCaseValue method: getCaseValue(0) means "get first case", not the condition. Use getCondition() if you want to resolve the condition. I propose not mixing SwitchInst case indexing with low-level indexing (TI successor indexing, User operand indexing), since it may be dangerous.
2. For the same reason, findCaseValue(ConstantInt*) returns the actual case number of the value. 0 means the first case, not the default. If there is no case with the given value, ErrorIndex will be returned.
3. Added the getCaseSuccessor method. I propose avoiding TerminatorInst::getSuccessor if you want to resolve a case successor BB. Use getCaseSuccessor instead, since the internal SwitchInst organization of operands/successors is hidden and may change at any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex.
The main purpose of these methods is to show how case successors are really mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" is for when you need to step down from SwitchInst to TerminatorInst. It returns TerminatorInst's successor index for the given case successor.
4.2 "resolveCaseIndex" converts a low-level successor index to the case index that corresponds to the given successor.

Note: There are also related compatibility fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.

Modified:
    llvm/trunk/include/llvm/Analysis/CFGPrinter.h
    llvm/trunk/include/llvm/Instructions.h
    llvm/trunk/lib/Analysis/LazyValueInfo.cpp
    llvm/trunk/lib/Analysis/SparsePropagation.cpp
    llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
    llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
    llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
    llvm/trunk/lib/ExecutionEngine/Interpreter/Execution.cpp
    llvm/trunk/lib/Target/CBackend/CBackend.cpp
    llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp
    llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp
    llvm/trunk/lib/Transforms/InstCombine/InstructionCombining.cpp
    llvm/trunk/lib/Transforms/Scalar/GVN.cpp
    llvm/trunk/lib/Transforms/Scalar/JumpThreading.cpp
    llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp
    llvm/trunk/lib/Transforms/Scalar/SCCP.cpp
    llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp
    llvm/trunk/lib/Transforms/Utils/CodeExtractor.cpp
    llvm/trunk/lib/Transforms/Utils/Local.cpp
    llvm/trunk/lib/Transforms/Utils/LowerExpectIntrinsic.cpp
    llvm/trunk/lib/Transforms/Utils/LowerSwitch.cpp
    llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp
    llvm/trunk/lib/VMCore/AsmWriter.cpp
    llvm/trunk/lib/VMCore/Instructions.cpp
    llvm/trunk/lib/VMCore/Verifier.cpp
    llvm/trunk/tools/llvm-diff/DifferenceEngine.cpp

Modified: llvm/trunk/include/llvm/Analysis/CFGPrinter.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/CFGPrinter.h?rev=149481&r1=149480&r2=149481&view=diff
============================================================================== --- llvm/trunk/include/llvm/Analysis/CFGPrinter.h (original) +++ llvm/trunk/include/llvm/Analysis/CFGPrinter.h Wed Feb 1 01:49:51 2012 @@ -95,7 +95,8 @@ std::string Str; raw_string_ostream OS(Str); - OS << SI->getCaseValue(SuccNo)->getValue(); + unsigned Case = SI->resolveCaseIndex(SuccNo); + OS << SI->getCaseValue(Case)->getValue(); return OS.str(); } return ""; Modified: llvm/trunk/include/llvm/Instructions.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Instructions.h?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/include/llvm/Instructions.h (original) +++ llvm/trunk/include/llvm/Instructions.h Wed Feb 1 01:49:51 2012 @@ -24,6 +24,7 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/Support/ErrorHandling.h" #include +#include namespace llvm { @@ -2467,6 +2468,9 @@ protected: virtual SwitchInst *clone_impl() const; public: + + enum { ErrorIndex = UINT_MAX }; + static SwitchInst *Create(Value *Value, BasicBlock *Default, unsigned NumCases, Instruction *InsertBefore = 0) { return new SwitchInst(Value, Default, NumCases, InsertBefore); @@ -2488,34 +2492,62 @@ return cast(getOperand(1)); } - /// getNumCases - return the number of 'cases' in this switch instruction. - /// Note that case #0 is always the default case. + void setDefaultDest(BasicBlock *DefaultCase) { + setOperand(1, reinterpret_cast(DefaultCase)); + } + + /// getNumCases - return the number of 'cases' in this switch instruction, + /// except the default case unsigned getNumCases() const { - return getNumOperands()/2; + return getNumOperands()/2 - 1; } - /// getCaseValue - Return the specified case value. Note that case #0, the - /// default destination, does not have a case value. + /// getCaseValue - Return the specified case value. Note that case #0, means + /// first case, not a default case. 
ConstantInt *getCaseValue(unsigned i) { - assert(i && i < getNumCases() && "Illegal case value to get!"); - return getSuccessorValue(i); + assert(i < getNumCases() && "Illegal case value to get!"); + return reinterpret_cast(getOperand(2 + i*2)); } - /// getCaseValue - Return the specified case value. Note that case #0, the - /// default destination, does not have a case value. + /// getCaseValue - Return the specified case value. Note that case #0, means + /// first case, not a default case. const ConstantInt *getCaseValue(unsigned i) const { - assert(i && i < getNumCases() && "Illegal case value to get!"); - return getSuccessorValue(i); + assert(i < getNumCases() && "Illegal case value to get!"); + return reinterpret_cast(getOperand(2 + i*2)); + } + + // setSuccessorValue - Updates the value associated with the specified + // case. + void setCaseValue(unsigned i, ConstantInt *CaseValue) { + assert(i < getNumCases() && "Case index # out of range!"); + setOperand(2 + i*2, reinterpret_cast(CaseValue)); } /// findCaseValue - Search all of the case values for the specified constant. /// If it is explicitly handled, return the case number of it, otherwise - /// return 0 to indicate that it is handled by the default handler. + /// return ErrorIndex to indicate that it is handled by the default handler. unsigned findCaseValue(const ConstantInt *C) const { - for (unsigned i = 1, e = getNumCases(); i != e; ++i) + for (unsigned i = 0, e = getNumCases(); i != e; ++i) if (getCaseValue(i) == C) return i; - return 0; + return ErrorIndex; + } + + /// resolveSuccessorIndex - Converts case index to index of its successor + /// index in TerminatorInst successors collection. + /// If CaseIndex == ErrorIndex, "default" successor will returned then. + unsigned resolveSuccessorIndex(unsigned CaseIndex) const { + assert((CaseIndex == ErrorIndex || CaseIndex < getNumCases()) && + "Case index # out of range!"); + return CaseIndex != ErrorIndex ? 
CaseIndex + 1 : 0; + } + + /// resolveCaseIndex - Converts index of successor in TerminatorInst + /// collection to index of case that corresponds to this successor. + unsigned resolveCaseIndex(unsigned SuccessorIndex) const { + assert(SuccessorIndex < getNumSuccessors() && + "Successor index # out of range!"); + return SuccessorIndex != 0 ? SuccessorIndex - 1 : ErrorIndex; } /// findCaseDest - Finds the unique case value for a given successor. Returns @@ -2524,8 +2556,8 @@ if (BB == getDefaultDest()) return NULL; ConstantInt *CI = NULL; - for (unsigned i = 1, e = getNumCases(); i != e; ++i) { - if (getSuccessor(i) == BB) { + for (unsigned i = 0, e = getNumCases(); i != e; ++i) { + if (getSuccessor(i + 1) == BB) { if (CI) return NULL; // Multiple cases lead to BB. else CI = getCaseValue(i); } @@ -2537,9 +2569,8 @@ /// void addCase(ConstantInt *OnVal, BasicBlock *Dest); - /// removeCase - This method removes the specified successor from the switch - /// instruction. Note that this cannot be used to remove the default - /// destination (successor #0). Also note that this operation may reorder the + /// removeCase - This method removes the specified case and its successor + /// from the switch instruction. Note that this operation may reorder the /// remaining cases at index idx and above. /// void removeCase(unsigned idx); @@ -2554,6 +2585,22 @@ setOperand(idx*2+1, (Value*)NewSucc); } + /// Resolves successor for idx-th case. + /// Use getCaseSuccessor instead of TerminatorInst::getSuccessor, + /// since internal SwitchInst organization of operands/successors is + /// hidden and may be changed in any moment. + BasicBlock *getCaseSuccessor(unsigned idx) const { + return getSuccessor(resolveSuccessorIndex(idx)); + } + + /// Set new successor for idx-th case. + /// Use setCaseSuccessor instead of TerminatorInst::setSuccessor, + /// since internal SwitchInst organization of operands/successors is + /// hidden and may be changed in any moment. 
+ void setCaseSuccessor(unsigned idx, BasicBlock *NewSucc) { + setSuccessor(resolveSuccessorIndex(idx), NewSucc); + } + // getSuccessorValue - Return the value associated with the specified // successor. ConstantInt *getSuccessorValue(unsigned idx) const { Modified: llvm/trunk/lib/Analysis/LazyValueInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/LazyValueInfo.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/LazyValueInfo.cpp (original) +++ llvm/trunk/lib/Analysis/LazyValueInfo.cpp Wed Feb 1 01:49:51 2012 @@ -854,8 +854,8 @@ // BBFrom to BBTo. unsigned NumEdges = 0; ConstantInt *EdgeVal = 0; - for (unsigned i = 1, e = SI->getNumSuccessors(); i != e; ++i) { - if (SI->getSuccessor(i) != BBTo) continue; + for (unsigned i = 0, e = SI->getNumCases(); i != e; ++i) { + if (SI->getCaseSuccessor(i) != BBTo) continue; if (NumEdges++) break; EdgeVal = SI->getCaseValue(i); } Modified: llvm/trunk/lib/Analysis/SparsePropagation.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/SparsePropagation.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/SparsePropagation.cpp (original) +++ llvm/trunk/lib/Analysis/SparsePropagation.cpp Wed Feb 1 01:49:51 2012 @@ -195,7 +195,8 @@ return; } - Succs[SI.findCaseValue(cast(C))] = true; + unsigned CCase = SI.findCaseValue(cast(C)); + Succs[SI.resolveSuccessorIndex(CCase)] = true; } Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp (original) +++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Wed Feb 1 01:49:51 2012 @@ -1162,10 +1162,17 
@@ } break; case Instruction::Switch: - Code = bitc::FUNC_CODE_INST_SWITCH; - Vals.push_back(VE.getTypeID(I.getOperand(0)->getType())); - for (unsigned i = 0, e = I.getNumOperands(); i != e; ++i) - Vals.push_back(VE.getValueID(I.getOperand(i))); + { + Code = bitc::FUNC_CODE_INST_SWITCH; + SwitchInst &SI = cast(I); + Vals.push_back(VE.getTypeID(SI.getCondition()->getType())); + Vals.push_back(VE.getValueID(SI.getCondition())); + Vals.push_back(VE.getValueID(SI.getDefaultDest())); + for (unsigned i = 0, e = SI.getNumCases(); i != e; ++i) { + Vals.push_back(VE.getValueID(SI.getCaseValue(i))); + Vals.push_back(VE.getValueID(SI.getCaseSuccessor(i))); + } + } break; case Instruction::IndirectBr: Code = bitc::FUNC_CODE_INST_INDIRECTBR; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp Wed Feb 1 01:49:51 2012 @@ -2209,7 +2209,7 @@ CaseRange LHSR(CR.Range.first, Pivot); CaseRange RHSR(Pivot, CR.Range.second); - Constant *C = Pivot->Low; + const Constant *C = Pivot->Low; MachineBasicBlock *FalseBB = 0, *TrueBB = 0; // We know that we branch to the LHS if the Value being switched on is @@ -2402,14 +2402,14 @@ BranchProbabilityInfo *BPI = FuncInfo.BPI; // Start with "simple" cases - for (size_t i = 1; i < SI.getNumSuccessors(); ++i) { - BasicBlock *SuccBB = SI.getSuccessor(i); + for (size_t i = 0; i < SI.getNumCases(); ++i) { + BasicBlock *SuccBB = SI.getCaseSuccessor(i); MachineBasicBlock *SMBB = FuncInfo.MBBMap[SuccBB]; uint32_t ExtraWeight = BPI ? 
BPI->getEdgeWeight(SI.getParent(), SuccBB) : 0; - Cases.push_back(Case(SI.getSuccessorValue(i), - SI.getSuccessorValue(i), + Cases.push_back(Case(SI.getCaseValue(i), + SI.getCaseValue(i), SMBB, ExtraWeight)); } std::sort(Cases.begin(), Cases.end(), CaseCmp()); @@ -2476,7 +2476,7 @@ // If there is only the default destination, branch to it if it is not the // next basic block. Otherwise, just fall through. - if (SI.getNumCases() == 1) { + if (!SI.getNumCases()) { // Update machine-CFG edges. // If this is not a fall-through branch, emit the branch. Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h Wed Feb 1 01:49:51 2012 @@ -130,13 +130,13 @@ /// Case - A struct to record the Value for a switch case, and the /// case's target basic block. 
struct Case { - Constant* Low; - Constant* High; + const Constant *Low; + const Constant *High; MachineBasicBlock* BB; uint32_t ExtraWeight; Case() : Low(0), High(0), BB(0), ExtraWeight(0) { } - Case(Constant* low, Constant* high, MachineBasicBlock* bb, + Case(const Constant *low, const Constant *high, MachineBasicBlock *bb, uint32_t extraweight) : Low(low), High(high), BB(bb), ExtraWeight(extraweight) { } Modified: llvm/trunk/lib/ExecutionEngine/Interpreter/Execution.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/Interpreter/Execution.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/ExecutionEngine/Interpreter/Execution.cpp (original) +++ llvm/trunk/lib/ExecutionEngine/Interpreter/Execution.cpp Wed Feb 1 01:49:51 2012 @@ -670,10 +670,10 @@ BasicBlock *Dest = 0; unsigned NumCases = I.getNumCases(); // Skip the first item since that's the default case. - for (unsigned i = 1; i < NumCases; ++i) { + for (unsigned i = 0; i < NumCases; ++i) { GenericValue CaseVal = getOperandValue(I.getCaseValue(i), SF); if (executeICMP_EQ(CondVal, CaseVal, ElTy).IntVal != 0) { - Dest = cast(I.getSuccessor(i)); + Dest = cast(I.getCaseSuccessor(i)); break; } } Modified: llvm/trunk/lib/Target/CBackend/CBackend.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CBackend/CBackend.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Target/CBackend/CBackend.cpp (original) +++ llvm/trunk/lib/Target/CBackend/CBackend.cpp Wed Feb 1 01:49:51 2012 @@ -2449,9 +2449,9 @@ unsigned NumCases = SI.getNumCases(); // Skip the first item since that's the default case. 
- for (unsigned i = 1; i < NumCases; ++i) { + for (unsigned i = 0; i < NumCases; ++i) { ConstantInt* CaseVal = SI.getCaseValue(i); - BasicBlock* Succ = SI.getSuccessor(i); + BasicBlock* Succ = SI.getCaseSuccessor(i); Out << " case "; writeOperand(CaseVal); Out << ":\n"; Modified: llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp (original) +++ llvm/trunk/lib/Target/CppBackend/CPPBackend.cpp Wed Feb 1 01:49:51 2012 @@ -1115,9 +1115,9 @@ << SI->getNumCases() << ", " << bbname << ");"; nl(Out); unsigned NumCases = SI->getNumCases(); - for (unsigned i = 1; i < NumCases; ++i) { + for (unsigned i = 0; i < NumCases; ++i) { const ConstantInt* CaseVal = SI->getCaseValue(i); - const BasicBlock* BB = SI->getSuccessor(i); + const BasicBlock *BB = SI->getCaseSuccessor(i); Out << iName << "->addCase(" << getOpName(CaseVal) << ", " << getOpName(BB) << ");"; Modified: llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp Wed Feb 1 01:49:51 2012 @@ -2455,7 +2455,8 @@ ConstantInt *Val = dyn_cast(getVal(Values, SI->getCondition())); if (!Val) return false; // Cannot determine. 
- NewBB = SI->getSuccessor(SI->findCaseValue(Val)); + unsigned ValTISucc = SI->resolveSuccessorIndex(SI->findCaseValue(Val)); + NewBB = SI->getSuccessor(ValTISucc); } else if (IndirectBrInst *IBI = dyn_cast(CurInst)) { Value *Val = getVal(Values, IBI->getAddress())->stripPointerCasts(); if (BlockAddress *BA = dyn_cast(Val)) Modified: llvm/trunk/lib/Transforms/InstCombine/InstructionCombining.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstructionCombining.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/InstCombine/InstructionCombining.cpp (original) +++ llvm/trunk/lib/Transforms/InstCombine/InstructionCombining.cpp Wed Feb 1 01:49:51 2012 @@ -1251,13 +1251,13 @@ // change 'switch (X+4) case 1:' into 'switch (X) case -3' unsigned NumCases = SI.getNumCases(); // Skip the first item since that's the default case. - for (unsigned i = 1; i < NumCases; ++i) { + for (unsigned i = 0; i < NumCases; ++i) { ConstantInt* CaseVal = SI.getCaseValue(i); Constant* NewCaseVal = ConstantExpr::getSub(cast(CaseVal), AddRHS); assert(isa(NewCaseVal) && "Result of expression should be constant"); - SI.setSuccessorValue(i, cast(NewCaseVal)); + SI.setCaseValue(i, cast(NewCaseVal)); } SI.setCondition(I->getOperand(0)); Worklist.Add(I); @@ -1877,15 +1877,15 @@ } else if (SwitchInst *SI = dyn_cast(TI)) { if (ConstantInt *Cond = dyn_cast(SI->getCondition())) { // See if this is an explicit destination. - for (unsigned i = 1, e = SI->getNumSuccessors(); i != e; ++i) + for (unsigned i = 0, e = SI->getNumCases(); i != e; ++i) if (SI->getCaseValue(i) == Cond) { - BasicBlock *ReachableBB = SI->getSuccessor(i); + BasicBlock *ReachableBB = SI->getCaseSuccessor(i); Worklist.push_back(ReachableBB); continue; } // Otherwise it is the default destination. 
- Worklist.push_back(SI->getSuccessor(0)); + Worklist.push_back(SI->getDefaultDest()); continue; } } Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Wed Feb 1 01:49:51 2012 @@ -2085,8 +2085,8 @@ Value *SwitchCond = SI->getCondition(); BasicBlock *Parent = SI->getParent(); bool Changed = false; - for (unsigned i = 1, e = SI->getNumCases(); i != e; ++i) { - BasicBlock *Dst = SI->getSuccessor(i); + for (unsigned i = 0, e = SI->getNumCases(); i != e; ++i) { + BasicBlock *Dst = SI->getCaseSuccessor(i); if (isOnlyReachableViaThisEdge(Parent, Dst, DT)) Changed |= propagateEquality(SwitchCond, SI->getCaseValue(i), Dst); } Modified: llvm/trunk/lib/Transforms/Scalar/JumpThreading.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/JumpThreading.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/JumpThreading.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/JumpThreading.cpp Wed Feb 1 01:49:51 2012 @@ -1086,9 +1086,10 @@ DestBB = 0; else if (BranchInst *BI = dyn_cast(BB->getTerminator())) DestBB = BI->getSuccessor(cast(Val)->isZero()); - else if (SwitchInst *SI = dyn_cast(BB->getTerminator())) - DestBB = SI->getSuccessor(SI->findCaseValue(cast(Val))); - else { + else if (SwitchInst *SI = dyn_cast(BB->getTerminator())) { + unsigned ValCase = SI->findCaseValue(cast(Val)); + DestBB = SI->getSuccessor(SI->resolveSuccessorIndex(ValCase)); + } else { assert(isa(BB->getTerminator()) && "Unexpected terminator"); DestBB = cast(Val)->getBasicBlock(); Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp Wed Feb 1 01:49:51 2012 @@ -436,7 +436,7 @@ Value *LoopCond = FindLIVLoopCondition(SI->getCondition(), currentLoop, Changed); unsigned NumCases = SI->getNumCases(); - if (LoopCond && NumCases > 1) { + if (LoopCond && NumCases) { // Find a value to unswitch on: // FIXME: this should chose the most expensive case! // FIXME: scan for a case with a non-critical edge? @@ -445,7 +445,7 @@ // Do not process same value again and again. // At this point we have some cases already unswitched and // some not yet unswitched. Let's find the first not yet unswitched one. - for (unsigned i = 1; i < NumCases; ++i) { + for (unsigned i = 0; i < NumCases; ++i) { Constant* UnswitchValCandidate = SI->getCaseValue(i); if (!BranchesInfo.isUnswitched(SI, UnswitchValCandidate)) { UnswitchVal = UnswitchValCandidate; @@ -574,10 +574,10 @@ // this. // Note that we can't trivially unswitch on the default case or // on already unswitched cases. - for (unsigned i = 1, e = SI->getNumSuccessors(); i != e; ++i) { + for (unsigned i = 0, e = SI->getNumCases(); i != e; ++i) { BasicBlock* LoopExitCandidate; if ((LoopExitCandidate = isTrivialLoopExitBlock(currentLoop, - SI->getSuccessor(i)))) { + SI->getCaseSuccessor(i)))) { // Okay, we found a trivial case, remember the value that is trivial. ConstantInt* CaseVal = SI->getCaseValue(i); @@ -1118,14 +1118,15 @@ if (SI == 0 || !isa(Val)) continue; unsigned DeadCase = SI->findCaseValue(cast(Val)); - if (DeadCase == 0) continue; // Default case is live for multiple values. + // Default case is live for multiple values. + if (DeadCase == SwitchInst::ErrorIndex) continue; // Found a dead case value. 
Don't remove PHI nodes in the // successor if they become single-entry, those PHI nodes may // be in the Users list. BasicBlock *Switch = SI->getParent(); - BasicBlock *SISucc = SI->getSuccessor(DeadCase); + BasicBlock *SISucc = SI->getCaseSuccessor(DeadCase); BasicBlock *Latch = L->getLoopLatch(); BranchesInfo.setUnswitched(SI, Val); @@ -1145,7 +1146,7 @@ // Compute the successors instead of relying on the return value // of SplitEdge, since it may have split the switch successor // after PHI nodes. - BasicBlock *NewSISucc = SI->getSuccessor(DeadCase); + BasicBlock *NewSISucc = SI->getCaseSuccessor(DeadCase); BasicBlock *OldSISucc = *succ_begin(NewSISucc); // Create an "unreachable" destination. BasicBlock *Abort = BasicBlock::Create(Context, "us-unreachable", Modified: llvm/trunk/lib/Transforms/Scalar/SCCP.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SCCP.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/SCCP.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/SCCP.cpp Wed Feb 1 01:49:51 2012 @@ -550,7 +550,7 @@ } if (SwitchInst *SI = dyn_cast(&TI)) { - if (TI.getNumSuccessors() < 2) { + if (!SI->getNumCases()) { Succs[0] = true; return; } @@ -564,7 +564,7 @@ return; } - Succs[SI->findCaseValue(CI)] = true; + Succs[SI->resolveSuccessorIndex(SI->findCaseValue(CI))] = true; return; } @@ -614,7 +614,7 @@ return true; if (SwitchInst *SI = dyn_cast(TI)) { - if (SI->getNumSuccessors() < 2) + if (SI->getNumCases() < 1) return true; LatticeVal SCValue = getValueState(SI->getCondition()); @@ -624,9 +624,9 @@ return !SCValue.isUndefined(); // Make sure to skip the "default value" which isn't a value - for (unsigned i = 1, E = SI->getNumSuccessors(); i != E; ++i) - if (SI->getSuccessorValue(i) == CI) // Found the taken branch. 
- return SI->getSuccessor(i) == To; + for (unsigned i = 0, E = SI->getNumCases(); i != E; ++i) + if (SI->getCaseValue(i) == CI) // Found the taken branch. + return SI->getCaseSuccessor(i) == To; // If the constant value is not equal to any of the branches, we must // execute default branch. @@ -1487,7 +1487,7 @@ } if (SwitchInst *SI = dyn_cast(TI)) { - if (SI->getNumSuccessors() < 2) // no cases + if (!SI->getNumCases()) continue; if (!getValueState(SI->getCondition()).isUndefined()) continue; @@ -1495,12 +1495,12 @@ // If the input to SCCP is actually switch on undef, fix the undef to // the first constant. if (isa(SI->getCondition())) { - SI->setCondition(SI->getCaseValue(1)); - markEdgeExecutable(BB, TI->getSuccessor(1)); + SI->setCondition(SI->getCaseValue(0)); + markEdgeExecutable(BB, SI->getCaseSuccessor(0)); return true; } - markForcedConstant(SI->getCondition(), SI->getCaseValue(1)); + markForcedConstant(SI->getCondition(), SI->getCaseValue(0)); return true; } } Modified: llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp Wed Feb 1 01:49:51 2012 @@ -314,7 +314,8 @@ Cond = dyn_cast_or_null(V); } if (Cond) { // Constant fold to uncond branch! 
- BasicBlock *Dest = SI->getSuccessor(SI->findCaseValue(Cond)); + unsigned CaseIndex = SI->findCaseValue(Cond); + BasicBlock *Dest = SI->getSuccessor(SI->resolveSuccessorIndex(CaseIndex)); VMap[OldTI] = BranchInst::Create(Dest, NewBB); ToClone.push_back(Dest); TerminatorDone = true; Modified: llvm/trunk/lib/Transforms/Utils/CodeExtractor.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/CodeExtractor.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/CodeExtractor.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/CodeExtractor.cpp Wed Feb 1 01:49:51 2012 @@ -615,9 +615,9 @@ default: // Otherwise, make the default destination of the switch instruction be one // of the other successors. - TheSwitch->setOperand(0, call); - TheSwitch->setSuccessor(0, TheSwitch->getSuccessor(NumExitBlocks)); - TheSwitch->removeCase(NumExitBlocks); // Remove redundant case + TheSwitch->setCondition(call); + TheSwitch->setDefaultDest(TheSwitch->getSuccessor(NumExitBlocks)); + TheSwitch->removeCase(NumExitBlocks-1); // Remove redundant case break; } } Modified: llvm/trunk/lib/Transforms/Utils/Local.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/Local.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/Local.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/Local.cpp Wed Feb 1 01:49:51 2012 @@ -106,22 +106,20 @@ // If we are switching on a constant, we can convert the switch into a // single branch instruction! 
ConstantInt *CI = dyn_cast(SI->getCondition()); - BasicBlock *TheOnlyDest = SI->getSuccessor(0); // The default dest + BasicBlock *TheOnlyDest = SI->getDefaultDest(); // The default dest BasicBlock *DefaultDest = TheOnlyDest; - assert(TheOnlyDest == SI->getDefaultDest() && - "Default destination is not successor #0?"); // Figure out which case it goes to. - for (unsigned i = 1, e = SI->getNumSuccessors(); i != e; ++i) { + for (unsigned i = 0, e = SI->getNumCases(); i != e; ++i) { // Found case matching a constant operand? - if (SI->getSuccessorValue(i) == CI) { - TheOnlyDest = SI->getSuccessor(i); + if (SI->getCaseValue(i) == CI) { + TheOnlyDest = SI->getCaseSuccessor(i); break; } // Check to see if this branch is going to the same place as the default // dest. If so, eliminate it as an explicit compare. - if (SI->getSuccessor(i) == DefaultDest) { + if (SI->getCaseSuccessor(i) == DefaultDest) { // Remove this entry. DefaultDest->removePredecessor(SI->getParent()); SI->removeCase(i); @@ -132,7 +130,7 @@ // Otherwise, check to see if the switch only branches to one destination. // We do this by reseting "TheOnlyDest" to null when we find two non-equal // destinations. - if (SI->getSuccessor(i) != TheOnlyDest) TheOnlyDest = 0; + if (SI->getCaseSuccessor(i) != TheOnlyDest) TheOnlyDest = 0; } if (CI && !TheOnlyDest) { @@ -166,14 +164,14 @@ return true; } - if (SI->getNumSuccessors() == 2) { + if (SI->getNumCases() == 1) { // Otherwise, we can fold this switch into a conditional branch // instruction if it has only one non-default destination. Value *Cond = Builder.CreateICmpEQ(SI->getCondition(), - SI->getSuccessorValue(1), "cond"); + SI->getCaseValue(0), "cond"); // Insert the new branch. - Builder.CreateCondBr(Cond, SI->getSuccessor(1), SI->getSuccessor(0)); + Builder.CreateCondBr(Cond, SI->getCaseSuccessor(0), SI->getDefaultDest()); // Delete the old switch. 
SI->eraseFromParent(); Modified: llvm/trunk/lib/Transforms/Utils/LowerExpectIntrinsic.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/LowerExpectIntrinsic.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/LowerExpectIntrinsic.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/LowerExpectIntrinsic.cpp Wed Feb 1 01:49:51 2012 @@ -76,11 +76,14 @@ unsigned caseNo = SI->findCaseValue(ExpectedValue); std::vector Vec; unsigned n = SI->getNumCases(); - Vec.resize(n + 1); // +1 for MDString + Vec.resize(n + 1 + 1); // +1 for MDString and +1 for default case Vec[0] = MDString::get(Context, "branch_weights"); + Vec[1] = ConstantInt::get(Int32Ty, SwitchInst::ErrorIndex == caseNo ? + LikelyBranchWeight : UnlikelyBranchWeight); for (unsigned i = 0; i < n; ++i) { - Vec[i + 1] = ConstantInt::get(Int32Ty, i == caseNo ? LikelyBranchWeight : UnlikelyBranchWeight); + Vec[i + 1 + 1] = ConstantInt::get(Int32Ty, i == caseNo ? 
+ LikelyBranchWeight : UnlikelyBranchWeight); } MDNode *WeightsNode = llvm::MDNode::get(Context, Vec); Modified: llvm/trunk/lib/Transforms/Utils/LowerSwitch.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/LowerSwitch.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/LowerSwitch.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/LowerSwitch.cpp Wed Feb 1 01:49:51 2012 @@ -237,10 +237,10 @@ unsigned numCmps = 0; // Start with "simple" cases - for (unsigned i = 1; i < SI->getNumSuccessors(); ++i) - Cases.push_back(CaseRange(SI->getSuccessorValue(i), - SI->getSuccessorValue(i), - SI->getSuccessor(i))); + for (unsigned i = 0; i < SI->getNumCases(); ++i) + Cases.push_back(CaseRange(SI->getCaseValue(i), + SI->getCaseValue(i), + SI->getCaseSuccessor(i))); std::sort(Cases.begin(), Cases.end(), CaseCmp()); // Merge case into clusters @@ -281,7 +281,7 @@ BasicBlock* Default = SI->getDefaultDest(); // If there is only the default destination, don't bother with the code below. 
- if (SI->getNumCases() == 1) { + if (!SI->getNumCases()) { BranchInst::Create(SI->getDefaultDest(), CurBlock); CurBlock->getInstList().erase(SI); return; Modified: llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp (original) +++ llvm/trunk/lib/Transforms/Utils/SimplifyCFG.cpp Wed Feb 1 01:49:51 2012 @@ -481,8 +481,9 @@ BasicBlock*> > &Cases) { if (SwitchInst *SI = dyn_cast(TI)) { Cases.reserve(SI->getNumCases()); - for (unsigned i = 1, e = SI->getNumCases(); i != e; ++i) - Cases.push_back(std::make_pair(SI->getCaseValue(i), SI->getSuccessor(i))); + for (unsigned i = 0, e = SI->getNumCases(); i != e; ++i) + Cases.push_back(std::make_pair(SI->getCaseValue(i), + SI->getCaseSuccessor(i))); return SI->getDefaultDest(); } @@ -605,11 +606,13 @@ DEBUG(dbgs() << "Threading pred instr: " << *Pred->getTerminator() << "Through successor TI: " << *TI); - for (unsigned i = SI->getNumCases()-1; i != 0; --i) + for (unsigned i = SI->getNumCases(); i != 0;) { + --i; if (DeadCases.count(SI->getCaseValue(i))) { - SI->getSuccessor(i)->removePredecessor(TI->getParent()); + SI->getCaseSuccessor(i)->removePredecessor(TI->getParent()); SI->removeCase(i); } + } DEBUG(dbgs() << "Leaving: " << *TI << "\n"); return true; @@ -2007,8 +2010,10 @@ // Find the relevant condition and destinations. 
Value *Condition = Select->getCondition(); - BasicBlock *TrueBB = SI->getSuccessor(SI->findCaseValue(TrueVal)); - BasicBlock *FalseBB = SI->getSuccessor(SI->findCaseValue(FalseVal)); + unsigned TrueCase = SI->findCaseValue(TrueVal); + unsigned FalseCase = SI->findCaseValue(FalseVal); + BasicBlock *TrueBB = SI->getSuccessor(SI->resolveSuccessorIndex(TrueCase)); + BasicBlock *FalseBB = SI->getSuccessor(SI->resolveSuccessorIndex(FalseCase)); // Perform the actual simplification. return SimplifyTerminatorOnSelect(SI, Condition, TrueBB, FalseBB); @@ -2092,7 +2097,7 @@ // Ok, the block is reachable from the default dest. If the constant we're // comparing exists in one of the other edges, then we can constant fold ICI // and zap it. - if (SI->findCaseValue(Cst) != 0) { + if (SI->findCaseValue(Cst) != SwitchInst::ErrorIndex) { Value *V; if (ICI->getPredicate() == ICmpInst::ICMP_EQ) V = ConstantInt::getFalse(BB->getContext()); @@ -2465,8 +2470,8 @@ } } } else if (SwitchInst *SI = dyn_cast(TI)) { - for (unsigned i = 1, e = SI->getNumCases(); i != e; ++i) - if (SI->getSuccessor(i) == BB) { + for (unsigned i = 0, e = SI->getNumCases(); i != e; ++i) + if (SI->getCaseSuccessor(i) == BB) { BB->removePredecessor(SI->getParent()); SI->removeCase(i); --i; --e; @@ -2474,11 +2479,11 @@ } // If the default value is unreachable, figure out the most popular // destination and make it the default. - if (SI->getSuccessor(0) == BB) { + if (SI->getDefaultDest() == BB) { std::map > Popularity; - for (unsigned i = 1, e = SI->getNumCases(); i != e; ++i) { + for (unsigned i = 0, e = SI->getNumCases(); i != e; ++i) { std::pair &entry = - Popularity[SI->getSuccessor(i)]; + Popularity[SI->getCaseSuccessor(i)]; if (entry.first == 0) { entry.first = 1; entry.second = i; @@ -2503,7 +2508,7 @@ if (MaxBlock) { // Make this the new default, allowing us to delete any explicit // edges to it. 
- SI->setSuccessor(0, MaxBlock); + SI->setDefaultDest(MaxBlock); Changed = true; // If MaxBlock has phinodes in it, remove MaxPop-1 entries from @@ -2512,8 +2517,8 @@ for (unsigned i = 0; i != MaxPop-1; ++i) MaxBlock->removePredecessor(SI->getParent()); - for (unsigned i = 1, e = SI->getNumCases(); i != e; ++i) - if (SI->getSuccessor(i) == MaxBlock) { + for (unsigned i = 0, e = SI->getNumCases(); i != e; ++i) + if (SI->getCaseSuccessor(i) == MaxBlock) { SI->removeCase(i); --i; --e; } @@ -2555,17 +2560,17 @@ /// TurnSwitchRangeIntoICmp - Turns a switch with that contains only a /// integer range comparison into a sub, an icmp and a branch. static bool TurnSwitchRangeIntoICmp(SwitchInst *SI, IRBuilder<> &Builder) { - assert(SI->getNumCases() > 2 && "Degenerate switch?"); + assert(SI->getNumCases() > 1 && "Degenerate switch?"); // Make sure all cases point to the same destination and gather the values. SmallVector Cases; - Cases.push_back(SI->getCaseValue(1)); - for (unsigned I = 2, E = SI->getNumCases(); I != E; ++I) { - if (SI->getSuccessor(I-1) != SI->getSuccessor(I)) + Cases.push_back(SI->getCaseValue(0)); + for (unsigned I = 1, E = SI->getNumCases(); I != E; ++I) { + if (SI->getCaseSuccessor(I-1) != SI->getCaseSuccessor(I)) return false; Cases.push_back(SI->getCaseValue(I)); } - assert(Cases.size() == SI->getNumCases()-1 && "Not all cases gathered"); + assert(Cases.size() == SI->getNumCases() && "Not all cases gathered"); // Sort the case values, then check if they form a range we can transform. 
array_pod_sort(Cases.begin(), Cases.end(), ConstantIntSortPredicate); @@ -2575,18 +2580,18 @@ } Constant *Offset = ConstantExpr::getNeg(Cases.back()); - Constant *NumCases = ConstantInt::get(Offset->getType(), SI->getNumCases()-1); + Constant *NumCases = ConstantInt::get(Offset->getType(), SI->getNumCases()); Value *Sub = SI->getCondition(); if (!Offset->isNullValue()) Sub = Builder.CreateAdd(Sub, Offset, Sub->getName()+".off"); Value *Cmp = Builder.CreateICmpULT(Sub, NumCases, "switch"); - Builder.CreateCondBr(Cmp, SI->getSuccessor(1), SI->getDefaultDest()); + Builder.CreateCondBr(Cmp, SI->getCaseSuccessor(0), SI->getDefaultDest()); // Prune obsolete incoming values off the successor's PHI nodes. - for (BasicBlock::iterator BBI = SI->getSuccessor(1)->begin(); + for (BasicBlock::iterator BBI = SI->getCaseSuccessor(0)->begin(); isa(BBI); ++BBI) { - for (unsigned I = 0, E = SI->getNumCases()-2; I != E; ++I) + for (unsigned I = 0, E = SI->getNumCases()-1; I != E; ++I) cast(BBI)->removeIncomingValue(SI->getParent()); } SI->eraseFromParent(); @@ -2604,7 +2609,7 @@ // Gather dead cases. SmallVector DeadCases; - for (unsigned I = 1, E = SI->getNumCases(); I != E; ++I) { + for (unsigned I = 0, E = SI->getNumCases(); I != E; ++I) { if ((SI->getCaseValue(I)->getValue() & KnownZero) != 0 || (SI->getCaseValue(I)->getValue() & KnownOne) != KnownOne) { DeadCases.push_back(SI->getCaseValue(I)); @@ -2616,8 +2621,10 @@ // Remove dead cases from the switch. for (unsigned I = 0, E = DeadCases.size(); I != E; ++I) { unsigned Case = SI->findCaseValue(DeadCases[I]); + assert(Case != SwitchInst::ErrorIndex && + "Case was not found. Probably mistake in DeadCases forming."); // Prune unused values from PHI nodes. 
- SI->getSuccessor(Case)->removePredecessor(SI->getParent()); + SI->getCaseSuccessor(Case)->removePredecessor(SI->getParent()); SI->removeCase(Case); } @@ -2666,9 +2673,9 @@ typedef DenseMap > ForwardingNodesMap; ForwardingNodesMap ForwardingNodes; - for (unsigned I = 1; I < SI->getNumCases(); ++I) { // 0 is the default case. + for (unsigned I = 0; I < SI->getNumCases(); ++I) { // 0 is the default case. ConstantInt *CaseValue = SI->getCaseValue(I); - BasicBlock *CaseDest = SI->getSuccessor(I); + BasicBlock *CaseDest = SI->getCaseSuccessor(I); int PhiIndex; PHINode *PHI = FindPHIForConditionForwarding(CaseValue, CaseDest, Modified: llvm/trunk/lib/VMCore/AsmWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/AsmWriter.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/AsmWriter.cpp (original) +++ llvm/trunk/lib/VMCore/AsmWriter.cpp Wed Feb 1 01:49:51 2012 @@ -1738,13 +1738,12 @@ Out << ", "; writeOperand(SI.getDefaultDest(), true); Out << " ["; - // Skip the first item since that's the default case. unsigned NumCases = SI.getNumCases(); - for (unsigned i = 1; i < NumCases; ++i) { + for (unsigned i = 0; i < NumCases; ++i) { Out << "\n "; writeOperand(SI.getCaseValue(i), true); Out << ", "; - writeOperand(SI.getSuccessor(i), true); + writeOperand(SI.getCaseSuccessor(i), true); } Out << "\n ]"; } else if (isa(I)) { Modified: llvm/trunk/lib/VMCore/Instructions.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Instructions.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Instructions.cpp (original) +++ llvm/trunk/lib/VMCore/Instructions.cpp Wed Feb 1 01:49:51 2012 @@ -3195,31 +3195,29 @@ /// addCase - Add an entry to the switch instruction... 
/// void SwitchInst::addCase(ConstantInt *OnVal, BasicBlock *Dest) { + unsigned NewCaseIdx = getNumCases(); unsigned OpNo = NumOperands; if (OpNo+2 > ReservedSpace) growOperands(); // Get more space! // Initialize some new operands. assert(OpNo+1 < ReservedSpace && "Growing didn't work!"); NumOperands = OpNo+2; - OperandList[OpNo] = OnVal; - OperandList[OpNo+1] = Dest; + setCaseValue(NewCaseIdx, OnVal); + setCaseSuccessor(NewCaseIdx, Dest); } -/// removeCase - This method removes the specified successor from the switch -/// instruction. Note that this cannot be used to remove the default -/// destination (successor #0). -/// +/// removeCase - This method removes the specified case and its successor +/// from the switch instruction. void SwitchInst::removeCase(unsigned idx) { - assert(idx != 0 && "Cannot remove the default case!"); - assert(idx*2 < getNumOperands() && "Successor index out of range!!!"); + assert(2 + idx*2 < getNumOperands() && "Case index out of range!!!"); unsigned NumOps = getNumOperands(); Use *OL = OperandList; // Overwrite this case with the end of the list. - if ((idx + 1) * 2 != NumOps) { - OL[idx * 2] = OL[NumOps - 2]; - OL[idx * 2 + 1] = OL[NumOps - 1]; + if (2 + (idx + 1) * 2 != NumOps) { + OL[2 + idx * 2] = OL[NumOps - 2]; + OL[2 + idx * 2 + 1] = OL[NumOps - 1]; } // Nuke the last value. Modified: llvm/trunk/lib/VMCore/Verifier.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Verifier.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Verifier.cpp (original) +++ llvm/trunk/lib/VMCore/Verifier.cpp Wed Feb 1 01:49:51 2012 @@ -812,7 +812,7 @@ // have the same type as the switched-on value. 
Type *SwitchTy = SI.getCondition()->getType(); SmallPtrSet Constants; - for (unsigned i = 1, e = SI.getNumCases(); i != e; ++i) { + for (unsigned i = 0, e = SI.getNumCases(); i != e; ++i) { Assert1(SI.getCaseValue(i)->getType() == SwitchTy, "Switch constants must all be same type as switch value!", &SI); Assert2(Constants.insert(SI.getCaseValue(i)), Modified: llvm/trunk/tools/llvm-diff/DifferenceEngine.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-diff/DifferenceEngine.cpp?rev=149481&r1=149480&r2=149481&view=diff ============================================================================== --- llvm/trunk/tools/llvm-diff/DifferenceEngine.cpp (original) +++ llvm/trunk/tools/llvm-diff/DifferenceEngine.cpp Wed Feb 1 01:49:51 2012 @@ -319,13 +319,13 @@ bool Difference = false; DenseMap LCases; - for (unsigned I = 1, E = LI->getNumCases(); I != E; ++I) - LCases[LI->getCaseValue(I)] = LI->getSuccessor(I); - for (unsigned I = 1, E = RI->getNumCases(); I != E; ++I) { + for (unsigned I = 0, E = LI->getNumCases(); I != E; ++I) + LCases[LI->getCaseValue(I)] = LI->getCaseSuccessor(I); + for (unsigned I = 0, E = RI->getNumCases(); I != E; ++I) { ConstantInt *CaseValue = RI->getCaseValue(I); BasicBlock *LCase = LCases[CaseValue]; if (LCase) { - if (TryUnify) tryUnify(LCase, RI->getSuccessor(I)); + if (TryUnify) tryUnify(LCase, RI->getCaseSuccessor(I)); LCases.erase(CaseValue); } else if (Complain || !Difference) { if (Complain) From stpworld at narod.ru Wed Feb 1 01:54:22 2012 From: stpworld at narod.ru (Stepan Dyatkovskiy) Date: Wed, 01 Feb 2012 07:54:22 -0000 Subject: [llvm-commits] [dragonegg] r149483 - /dragonegg/trunk/src/Convert.cpp Message-ID: <20120201075422.836702A6C12C@llvm.org> Author: dyatkovskiy Date: Wed Feb 1 01:54:22 2012 New Revision: 149483 URL: http://llvm.org/viewvc/llvm-project?rev=149483&view=rev Log: Compatibility fix for SwitchInst refactoring. The purpose of the refactoring is to hide operand roles from the SwitchInst user (the programmer).
If you want to manipulate operands directly, you will probably need lower-level methods than those of SwitchInst (TerminatorInst or maybe User). After this patch we can reorganize SwitchInst operands and successors as we want.

What was done:

1. Changed the semantics of the index in the getCaseValue method: getCaseValue(0) means "get the first case", not the condition. Use getCondition() if you want the condition. I propose not mixing SwitchInst case indexing with low-level indexing (TI successor indexing, User operand indexing), since it may be dangerous.
2. For the same reason, findCaseValue(ConstantInt*) returns the actual case number: 0 means the first case, not the default. If there is no case with the given value, ErrorIndex is returned.
3. Added the getCaseSuccessor method. I propose avoiding TerminatorInst::getSuccessor if you want the successor BB of a case. Use getCaseSuccessor instead, since the internal organization of SwitchInst operands/successors is hidden and may change at any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to expose how case successors are actually mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" is for when you need to step down from SwitchInst to TerminatorInst. It returns the TerminatorInst successor index for the given case successor.
4.2 "resolveCaseIndex" converts a low-level successor index to the case index that corresponds to the given successor.

Note: There are also related compatibility fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.
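
As a rough illustration of the indexing rules described in this log, here is a small self-contained sketch. ToySwitch is a hypothetical stand-in written for this example, not llvm::SwitchInst. The layout it uses (default destination is successor 0, case i is successor i + 1) and the rule that resolveSuccessorIndex(ErrorIndex) yields the default successor are assumptions inferred from how the r149481 diff rewrites its call sites. The sentinel value picked for ErrorIndex and the helper destinationFor are also made up here.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model for illustration only; NOT llvm::SwitchInst.
// Destinations are plain strings instead of BasicBlock pointers.
struct ToySwitch {
  // Arbitrary sentinel chosen for this sketch (assumption).
  static const unsigned ErrorIndex = ~0U;

  std::string DefaultDest;
  std::vector<std::pair<int, std::string> > Cases; // (case value, destination)

  unsigned getNumCases() const { return static_cast<unsigned>(Cases.size()); }
  unsigned getNumSuccessors() const { return getNumCases() + 1; }

  // Case-level accessors: index 0 is the first real case, never the default.
  int getCaseValue(unsigned CaseIdx) const {
    assert(CaseIdx < getNumCases() && "Case index out of range");
    return Cases[CaseIdx].first;
  }
  const std::string &getCaseSuccessor(unsigned CaseIdx) const {
    assert(CaseIdx < getNumCases() && "Case index out of range");
    return Cases[CaseIdx].second;
  }
  const std::string &getDefaultDest() const { return DefaultDest; }

  // Returns the case index holding Value, or ErrorIndex if no case matches.
  unsigned findCaseValue(int Value) const {
    for (unsigned i = 0, e = getNumCases(); i != e; ++i)
      if (Cases[i].first == Value)
        return i;
    return ErrorIndex;
  }

  // Case index -> low-level successor index (assumed layout, see lead-in).
  unsigned resolveSuccessorIndex(unsigned CaseIdx) const {
    if (CaseIdx == ErrorIndex)
      return 0; // "no such case" falls through to the default successor
    return CaseIdx + 1;
  }

  // Low-level successor index -> case index (successor 0 has no case).
  unsigned resolveCaseIndex(unsigned SuccIdx) const {
    if (SuccIdx == 0)
      return ErrorIndex;
    return SuccIdx - 1;
  }

  // Low-level, TerminatorInst-style successor access.
  const std::string &getSuccessor(unsigned SuccIdx) const {
    assert(SuccIdx < getNumSuccessors() && "Successor index out of range");
    if (SuccIdx == 0)
      return DefaultDest;
    return Cases[SuccIdx - 1].second;
  }
};
const unsigned ToySwitch::ErrorIndex;

// Mirrors the call-site pattern in the diff:
//   SI->getSuccessor(SI->resolveSuccessorIndex(SI->findCaseValue(Val)))
std::string destinationFor(const ToySwitch &SI, int Value) {
  return SI.getSuccessor(SI.resolveSuccessorIndex(SI.findCaseValue(Value)));
}
```

With cases 10 and 20 added in that order, findCaseValue(20) gives case index 1, and a value with no matching case resolves to the default destination through the low-level successor index.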
Modified: dragonegg/trunk/src/Convert.cpp Modified: dragonegg/trunk/src/Convert.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/src/Convert.cpp?rev=149483&r1=149482&r2=149483&view=diff ============================================================================== --- dragonegg/trunk/src/Convert.cpp (original) +++ dragonegg/trunk/src/Convert.cpp Wed Feb 1 01:54:22 2012 @@ -7868,7 +7868,7 @@ if (IfBlock) { Builder.CreateBr(SI->getDefaultDest()); - SI->setSuccessor(0, IfBlock); + SI->setDefaultDest(IfBlock); } } From elena.demikhovsky at intel.com Wed Feb 1 01:56:44 2012 From: elena.demikhovsky at intel.com (Elena Demikhovsky) Date: Wed, 01 Feb 2012 07:56:44 -0000 Subject: [llvm-commits] [llvm] r149485 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp lib/Target/X86/X86ISelLowering.h test/CodeGen/X86/avx-trunc.ll Message-ID: <20120201075644.BE35F2A6C12C@llvm.org> Author: delena Date: Wed Feb 1 01:56:44 2012 New Revision: 149485 URL: http://llvm.org/viewvc/llvm-project?rev=149485&view=rev Log: Optimization for "truncate" operation on AVX. Truncating v4i64 -> v4i32 and v8i32 -> v8i16 may be done with set of shuffles. 
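
To make the "set of shuffles" idea concrete, here is a minimal scalar sketch of the v4i64 -> v4i32 case. It is not the SelectionDAG code from this commit. It only simulates, on plain arrays, the two shuffle masks that appear in PerformTruncateCombine in the diff: {0, 2, 0, 0} on each 128-bit half, then {0, 1, 4, 5} to merge the halves. The helper names (shuffle, bitcastV2i64, truncV4i64ToV4i32) are made up for this example, and little-endian lane order is assumed for the bitcast.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

typedef std::array<std::uint32_t, 4> V4i32;
typedef std::array<std::uint64_t, 4> V4i64;

// Two-input shuffle on 4 x i32 lanes: mask values 0-3 select from A,
// values 4-7 select from B (the convention of a vector shuffle mask).
V4i32 shuffle(const V4i32 &A, const V4i32 &B, const int (&Mask)[4]) {
  V4i32 R;
  for (unsigned i = 0; i != 4; ++i)
    R[i] = Mask[i] < 4 ? A[Mask[i]] : B[Mask[i] - 4];
  return R;
}

// Reinterpret a <2 x i64> half as <4 x i32>, assuming little-endian lanes:
// each 64-bit element becomes (low 32 bits, high 32 bits).
V4i32 bitcastV2i64(std::uint64_t E0, std::uint64_t E1) {
  V4i32 R = {{static_cast<std::uint32_t>(E0),
              static_cast<std::uint32_t>(E0 >> 32),
              static_cast<std::uint32_t>(E1),
              static_cast<std::uint32_t>(E1 >> 32)}};
  return R;
}

V4i32 truncV4i64ToV4i32(const V4i64 &In) {
  // Split the 256-bit input into its low and high 128-bit halves.
  V4i32 Lo = bitcastV2i64(In[0], In[1]);
  V4i32 Hi = bitcastV2i64(In[2], In[3]);

  // PSHUFD-style step: move the low 32 bits of both 64-bit elements into
  // lanes 0 and 1. The second shuffle input is undef in the original code;
  // the mask never reads it, so the same vector is passed twice here.
  const int PickLowHalves[4] = {0, 2, 0, 0};
  Lo = shuffle(Lo, Lo, PickLowHalves);
  Hi = shuffle(Hi, Hi, PickLowHalves);

  // MOVLHPS-style step: low two lanes of Lo followed by low two lanes of Hi.
  const int MergeLowPairs[4] = {0, 1, 4, 5};
  return shuffle(Lo, Hi, MergeLowPairs);
}
```

Each output lane ends up holding the low 32 bits of the matching 64-bit input element, which is what a truncate has to produce. Lanes 2 and 3 after the first shuffle are don't-care values, since the merge mask never selects them.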
Added: llvm/trunk/test/CodeGen/X86/avx-trunc.ll (with props) Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/lib/Target/X86/X86ISelLowering.h Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149485&r1=149484&r2=149485&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Feb 1 01:56:44 2012 @@ -1218,6 +1218,7 @@ setTargetDAGCombine(ISD::LOAD); setTargetDAGCombine(ISD::STORE); setTargetDAGCombine(ISD::ZERO_EXTEND); + setTargetDAGCombine(ISD::TRUNCATE); setTargetDAGCombine(ISD::SINT_TO_FP); if (Subtarget->is64Bit()) setTargetDAGCombine(ISD::MUL); @@ -12911,6 +12912,104 @@ return EltsFromConsecutiveLoads(VT, Elts, dl, DAG); } + +/// PerformTruncateCombine - Converts truncate operation to +/// a sequence of vector shuffle operations. 
+/// It is possible when we truncate 256-bit vector to 128-bit vector + +SDValue X86TargetLowering::PerformTruncateCombine(SDNode *N, SelectionDAG &DAG, + DAGCombinerInfo &DCI) const { + if (!DCI.isBeforeLegalizeOps()) + return SDValue(); + + if (!Subtarget->hasAVX()) return SDValue(); + + EVT VT = N->getValueType(0); + SDValue Op = N->getOperand(0); + EVT OpVT = Op.getValueType(); + DebugLoc dl = N->getDebugLoc(); + + if ((VT == MVT::v4i32) && (OpVT == MVT::v4i64)) { + + SDValue OpLo = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, MVT::v2i64, Op, + DAG.getIntPtrConstant(0)); + + SDValue OpHi = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, MVT::v2i64, Op, + DAG.getIntPtrConstant(2)); + + OpLo = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpLo); + OpHi = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpHi); + + // PSHUFD + SmallVector ShufMask1; + ShufMask1.push_back(0); + ShufMask1.push_back(2); + ShufMask1.push_back(0); + ShufMask1.push_back(0); + + OpLo = DAG.getVectorShuffle(VT, dl, OpLo, DAG.getUNDEF(VT), + ShufMask1.data()); + OpHi = DAG.getVectorShuffle(VT, dl, OpHi, DAG.getUNDEF(VT), + ShufMask1.data()); + + // MOVLHPS + SmallVector ShufMask2; + ShufMask2.push_back(0); + ShufMask2.push_back(1); + ShufMask2.push_back(4); + ShufMask2.push_back(5); + + return DAG.getVectorShuffle(VT, dl, OpLo, OpHi, ShufMask2.data()); + } + if ((VT == MVT::v8i16) && (OpVT == MVT::v8i32)) { + + SDValue OpLo = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, MVT::v4i32, Op, + DAG.getIntPtrConstant(0)); + + SDValue OpHi = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, MVT::v4i32, Op, + DAG.getIntPtrConstant(4)); + + OpLo = DAG.getNode(ISD::BITCAST, dl, MVT::v16i8, OpLo); + OpHi = DAG.getNode(ISD::BITCAST, dl, MVT::v16i8, OpHi); + + // PSHUFB + SmallVector ShufMask1; + ShufMask1.push_back(0x0); + ShufMask1.push_back(0x1); + ShufMask1.push_back(0x4); + ShufMask1.push_back(0x5); + ShufMask1.push_back(0x8); + ShufMask1.push_back(0x9); + ShufMask1.push_back(0xc); + ShufMask1.push_back(0xd); + for (unsigned i=0; i<8; 
++i) + ShufMask1.push_back(-1); + + OpLo = DAG.getVectorShuffle(MVT::v16i8, dl, OpLo, + DAG.getUNDEF(MVT::v16i8), + ShufMask1.data()); + OpHi = DAG.getVectorShuffle(MVT::v16i8, dl, OpHi, + DAG.getUNDEF(MVT::v16i8), + ShufMask1.data()); + + OpLo = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpLo); + OpHi = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpHi); + + // MOVLHPS + SmallVector ShufMask2; + ShufMask2.push_back(0); + ShufMask2.push_back(1); + ShufMask2.push_back(4); + ShufMask2.push_back(5); + + SDValue res = DAG.getVectorShuffle(MVT::v4i32, dl, OpLo, OpHi, ShufMask2.data()); + return DAG.getNode(ISD::BITCAST, dl, MVT::v8i16, res); + + } + + return SDValue(); +} + /// PerformEXTRACT_VECTOR_ELTCombine - Detect vector gather/scatter index /// generation and convert it from being a bunch of shuffles and extracts /// to a simple store and scalar loads to extract the elements. @@ -14771,6 +14870,7 @@ case X86ISD::BT: return PerformBTCombine(N, DAG, DCI); case X86ISD::VZEXT_MOVL: return PerformVZEXT_MOVLCombine(N, DAG); case ISD::ZERO_EXTEND: return PerformZExtCombine(N, DAG, Subtarget); + case ISD::TRUNCATE: return PerformTruncateCombine(N, DAG, DCI); case X86ISD::SETCC: return PerformSETCCCombine(N, DAG); case X86ISD::SHUFP: // Handle all target specific shuffles case X86ISD::PALIGN: Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.h?rev=149485&r1=149484&r2=149485&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.h (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.h Wed Feb 1 01:56:44 2012 @@ -839,6 +839,7 @@ SDValue LowerMEMBARRIER(SDValue Op, SelectionDAG &DAG) const; SDValue LowerATOMIC_FENCE(SDValue Op, SelectionDAG &DAG) const; SDValue LowerSIGN_EXTEND_INREG(SDValue Op, SelectionDAG &DAG) const; + SDValue PerformTruncateCombine(SDNode* N, SelectionDAG &DAG, DAGCombinerInfo &DCI) const; 
// Utility functions to help LowerVECTOR_SHUFFLE SDValue LowerVECTOR_SHUFFLEv8i16(SDValue Op, SelectionDAG &DAG) const; Added: llvm/trunk/test/CodeGen/X86/avx-trunc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-trunc.ll?rev=149485&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-trunc.ll (added) +++ llvm/trunk/test/CodeGen/X86/avx-trunc.ll Wed Feb 1 01:56:44 2012 @@ -0,0 +1,15 @@ +; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=corei7-avx -mattr=+avx | FileCheck %s + +define <4 x i32> @trunc_64_32(<4 x i64> %A) nounwind uwtable readnone ssp{ +; CHECK: trunc_64_32 +; CHECK: pshufd + %B = trunc <4 x i64> %A to <4 x i32> + ret <4 x i32>%B +} +define <8 x i16> @trunc_32_16(<8 x i32> %A) nounwind uwtable readnone ssp{ +; CHECK: trunc_32_16 +; CHECK: pshufb + %B = trunc <8 x i32> %A to <8 x i16> + ret <8 x i16>%B +} + Propchange: llvm/trunk/test/CodeGen/X86/avx-trunc.ll ------------------------------------------------------------------------------ svn:executable = * From tobias at grosser.es Wed Feb 1 02:00:51 2012 From: tobias at grosser.es (Tobias Grosser) Date: Wed, 01 Feb 2012 09:00:51 +0100 Subject: [llvm-commits] [llvm] r149468 - in /llvm/trunk: docs/ include/llvm-c/ include/llvm-c/Transforms/ include/llvm/ include/llvm/Transforms/ include/llvm/Transforms/IPO/ lib/Transforms/ lib/Transforms/IPO/ lib/Transforms/Vectorize/ test/Transforms/BBVectorize/ tools/bugpoint/ tools/llvm-ld/ tools/lto/ tools/opt/ In-Reply-To: <20120201035145.411492A6C12C@llvm.org> References: <20120201035145.411492A6C12C@llvm.org> Message-ID: <4F28F133.8090101@grosser.es> On 02/01/2012 04:51 AM, Hal Finkel wrote: > Author: hfinkel > Date: Tue Jan 31 21:51:43 2012 > New Revision: 149468 > > URL: http://llvm.org/viewvc/llvm-project?rev=149468&view=rev > Log: > Add a basic-block autovectorization pass. 
>
> This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
> Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).

Hey Hal,

thanks for working on this and for committing it into svn. This is a great start and it is good to have it finally in svn. This will make further improvements a lot easier to track.

I would like to encourage people to spend some time to give further post-commit reviews. Even though this patch is a very solid start, additional review will definitely be beneficial. Thank you Nick for going ahead here.

Cheers
Tobi

From stpworld at narod.ru Wed Feb 1 02:00:23 2012
From: stpworld at narod.ru (Stepan Dyatkovskiy)
Date: Wed, 01 Feb 2012 08:00:23 -0000
Subject: [llvm-commits] [llvm-gcc-4.0] r149486 - /llvm-gcc-4.0/trunk/gcc/llvm-convert.cpp
Message-ID: <20120201080023.3A8D12A6C12C@llvm.org>

Author: dyatkovskiy
Date: Wed Feb 1 02:00:22 2012
New Revision: 149486

URL: http://llvm.org/viewvc/llvm-project?rev=149486&view=rev
Log:
Compatibility fix for SwitchInst refactoring.

The purpose of the refactoring is to hide operand roles from the SwitchInst user (programmer). If you want to play with operands directly, you will probably need lower-level methods than the SwitchInst ones (TerminatorInst or maybe User). After this patch we can reorganize SwitchInst operands and successors as we want.

What was done:

1. Changed the semantics of the index inside the getCaseValue method:
getCaseValue(0) means "get first case", not the condition. Use getCondition() if you want to resolve the condition. I propose not to mix SwitchInst case indexing with low-level indexing (TI successor indexing, User operand indexing), since it may be dangerous.
2. For the same reason, findCaseValue(ConstantInt*) returns the actual number of the case value. 0 means first case, not default. If there is no case with the given value, ErrorIndex will be returned.
3. Added the getCaseSuccessor method. I propose to avoid using TerminatorInst::getSuccessor if you want to resolve a case successor BB. Use getCaseSuccessor instead, since the internal SwitchInst organization of operands/successors is hidden and may change at any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to see how case successors are really mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" is for when you need to go down a level from SwitchInst to TerminatorInst. It returns the TerminatorInst successor index for a given case successor.
4.2 "resolveCaseIndex" converts a low-level successor index to the case index that corresponds to the given successor.

Note: There are also related compatibility fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.

Modified:
    llvm-gcc-4.0/trunk/gcc/llvm-convert.cpp

Modified: llvm-gcc-4.0/trunk/gcc/llvm-convert.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.0/trunk/gcc/llvm-convert.cpp?rev=149486&r1=149485&r2=149486&view=diff
==============================================================================
--- llvm-gcc-4.0/trunk/gcc/llvm-convert.cpp (original)
+++ llvm-gcc-4.0/trunk/gcc/llvm-convert.cpp Wed Feb 1 02:00:22 2012
@@ -776,8 +776,8 @@
     // Change the default destination to go to one of the other destinations, if
     // there is any other dest.
     SwitchInst *SI = cast<SwitchInst>(IndirectGotoBlock->getTerminator());
-    if (SI->getNumSuccessors() > 1)
-      SI->setSuccessor(0, SI->getSuccessor(1));
+    if (SI->getNumCases() > 0)
+      SI->setDefaultDest(SI->getCaseSuccessor(0));
   }
 
   // Remove any cached LLVM values that are local to this function. Such values
@@ -1981,7 +1981,7 @@
                                          TREE_VEC_LENGTH(Cases));
   EmitBlock(new BasicBlock(""));
   // Default location starts out as fall-through
-  SI->setSuccessor(0, Builder.GetInsertBlock());
+  SI->setDefaultDest(Builder.GetInsertBlock());
 
   assert(!SWITCH_BODY(exp) && "not a gimple switch?");
 
@@ -2035,8 +2035,8 @@
   }
 
   if (DefaultDest)
-    if (SI->getSuccessor(0) == Builder.GetInsertBlock())
-      SI->setSuccessor(0, DefaultDest);
+    if (SI->getDefaultDest() == Builder.GetInsertBlock())
+      SI->setDefaultDest(DefaultDest);
     else {
       Builder.CreateBr(DefaultDest);
       // Emit a "fallthrough" block, which is almost certainly dead.

From stpworld at narod.ru Wed Feb 1 02:01:29 2012
From: stpworld at narod.ru (Stepan Dyatkovskiy)
Date: Wed, 01 Feb 2012 08:01:29 -0000
Subject: [llvm-commits] [llvm-gcc-4.2] r149487 - /llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp
Message-ID: <20120201080129.810572A6C12C@llvm.org>

Author: dyatkovskiy
Date: Wed Feb 1 02:01:29 2012
New Revision: 149487

URL: http://llvm.org/viewvc/llvm-project?rev=149487&view=rev
Log:
Compatibility fix for SwitchInst refactoring.

The purpose of the refactoring is to hide operand roles from the SwitchInst user (programmer). If you want to play with operands directly, you will probably need lower-level methods than the SwitchInst ones (TerminatorInst or maybe User). After this patch we can reorganize SwitchInst operands and successors as we want.

What was done:

1. Changed the semantics of the index inside the getCaseValue method:
getCaseValue(0) means "get first case", not the condition. Use getCondition() if you want to resolve the condition. I propose not to mix SwitchInst case indexing with low-level indexing (TI successor indexing, User operand indexing), since it may be dangerous.
2. For the same reason, findCaseValue(ConstantInt*) returns the actual number of the case value. 0 means first case, not default. If there is no case with the given value, ErrorIndex will be returned.
3. Added the getCaseSuccessor method. I propose to avoid using TerminatorInst::getSuccessor if you want to resolve a case successor BB. Use getCaseSuccessor instead, since the internal SwitchInst organization of operands/successors is hidden and may change at any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to see how case successors are really mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" is for when you need to go down a level from SwitchInst to TerminatorInst. It returns the TerminatorInst successor index for a given case successor.
4.2 "resolveCaseIndex" converts a low-level successor index to the case index that corresponds to the given successor.

Note: There are also related compatibility fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.

Modified:
    llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp

Modified: llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp?rev=149487&r1=149486&r2=149487&view=diff
==============================================================================
--- llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp (original)
+++ llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp Wed Feb 1 02:01:29 2012
@@ -2076,7 +2076,7 @@
                                          TREE_VEC_LENGTH(Cases));
   EmitBlock(BasicBlock::Create(Context, ""));
   // Default location starts out as fall-through
-  SI->setSuccessor(0, Builder.GetInsertBlock());
+  SI->setDefaultDest(Builder.GetInsertBlock());
 
   assert(!SWITCH_BODY(exp) && "not a gimple switch?");
 
@@ -2131,8 +2131,8 @@
   }
 
   if (DefaultDest) {
-    if (SI->getSuccessor(0) == Builder.GetInsertBlock())
-      SI->setSuccessor(0, DefaultDest);
+    if (SI->getDefaultDest() == Builder.GetInsertBlock())
+      SI->setDefaultDest(DefaultDest);
     else {
       Builder.CreateBr(DefaultDest);
       // Emit a "fallthrough" block, which is almost certainly dead.
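[Editorial sketch, not part of the original thread.] The difference between case indexing and successor indexing described in the two commit messages above can be illustrated with a small stand-alone model. ToySwitch and pointDefaultAtFirstCase are hypothetical names invented for this sketch; this is not the real llvm::SwitchInst. The sketch assumes the layout implied by the removed lines in the diffs (successor 0 is the default destination, case i lives at successor i + 1), which is exactly the detail the refactoring is meant to hide. Method names follow the commit message, but the bodies are illustrative only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a switch terminator; NOT the real llvm::SwitchInst.
// Assumed layout: Successors[0] is the default destination and case i is
// stored at Successors[i + 1]. Blocks are represented by their names.
class ToySwitch {
  std::vector<std::string> Successors; // default destination first
  std::vector<int> CaseValues;         // CaseValues[i] belongs to case i

public:
  // Returned by findCaseValue when no case has the requested value.
  static constexpr unsigned ErrorIndex = ~0U;

  explicit ToySwitch(const std::string &DefaultDest)
      : Successors(1, DefaultDest) {}

  void addCase(int Value, const std::string &Dest) {
    CaseValues.push_back(Value);
    Successors.push_back(Dest);
  }

  // ---- Case-level view: index 0 is the first case, never the default ----
  unsigned getNumCases() const {
    return static_cast<unsigned>(CaseValues.size());
  }

  int getCaseValue(unsigned CaseIdx) const {
    assert(CaseIdx < getNumCases() && "case index out of range");
    return CaseValues[CaseIdx];
  }

  // Linear search; returns the case index, or ErrorIndex if absent.
  unsigned findCaseValue(int Value) const {
    for (unsigned I = 0, E = getNumCases(); I != E; ++I)
      if (CaseValues[I] == Value)
        return I;
    return ErrorIndex;
  }

  const std::string &getCaseSuccessor(unsigned CaseIdx) const {
    assert(CaseIdx < getNumCases() && "case index out of range");
    return Successors[resolveSuccessorIndex(CaseIdx)];
  }

  const std::string &getDefaultDest() const { return Successors[0]; }
  void setDefaultDest(const std::string &Dest) { Successors[0] = Dest; }

  // ---- Low-level view: raw successor slots, default included ----
  unsigned getNumSuccessors() const {
    return static_cast<unsigned>(Successors.size());
  }

  const std::string &getSuccessor(unsigned SuccIdx) const {
    assert(SuccIdx < getNumSuccessors() && "successor index out of range");
    return Successors[SuccIdx];
  }

  // Mapping between the two views under the assumed layout.
  unsigned resolveSuccessorIndex(unsigned CaseIdx) const { return CaseIdx + 1; }

  unsigned resolveCaseIndex(unsigned SuccIdx) const {
    assert(SuccIdx > 0 && "successor 0 is the default, not a case");
    return SuccIdx - 1;
  }
};

// Mirrors the first llvm-gcc-4.0 hunk: if there is at least one case,
// redirect the default destination to the first case's successor.
void pointDefaultAtFirstCase(ToySwitch &SI) {
  if (SI.getNumCases() > 0)
    SI.setDefaultDest(SI.getCaseSuccessor(0));
}
```

In this model a caller written against the case-level view (getNumCases, getCaseSuccessor, setDefaultDest) never has to know the "+ 1" offset; only resolveSuccessorIndex and resolveCaseIndex expose it. That is the same shift the diffs make when they replace getSuccessor(0) / setSuccessor(0, ...) with getDefaultDest() / setDefaultDest(...).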
From echristo at apple.com Wed Feb 1 02:26:44 2012 From: echristo at apple.com (Eric Christopher) Date: Wed, 01 Feb 2012 00:26:44 -0800 Subject: [llvm-commits] [llvm] r149481 - in /llvm/trunk: include/llvm/ include/llvm/Analysis/ lib/Analysis/ lib/Bitcode/Writer/ lib/CodeGen/SelectionDAG/ lib/ExecutionEngine/Interpreter/ lib/Target/CBackend/ lib/Target/CppBackend/ lib/Transforms/IPO/ lib/Transforms/InstCombine/ lib/Transforms/Scalar/ lib/Transforms/Utils/ lib/VMCore/ tools/llvm-diff/ In-Reply-To: <20120201074953.13FFD2A6C12C@llvm.org> References: <20120201074953.13FFD2A6C12C@llvm.org> Message-ID: On Jan 31, 2012, at 11:49 PM, Stepan Dyatkovskiy wrote: > SwitchInst refactoring. > The purpose of refactoring is to hide operand roles from SwitchInst user (programmer). If you want to play with operands directly, probably you will need lower level methods than SwitchInst ones (TerminatorInst or may be User). After this patch we can reorganize SwitchInst operands and successors as we want. Wait a minute. Why are you doing this? What do you hope to gain? -eric -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/9a11625e/attachment.html From baldrick at free.fr Wed Feb 1 02:35:52 2012 From: baldrick at free.fr (Duncan Sands) Date: Wed, 01 Feb 2012 09:35:52 +0100 Subject: [llvm-commits] [llvm] r149468 - in /llvm/trunk: docs/ include/llvm-c/ include/llvm-c/Transforms/ include/llvm/ include/llvm/Transforms/ include/llvm/Transforms/IPO/ lib/Transforms/ lib/Transforms/IPO/ lib/Transforms/Vectorize/ test/Transforms/BBVectorize/ tools/bugpoint/ tools/llvm-ld/ tools/lto/ tools/opt/ In-Reply-To: <1328074613.2489.1155.camel@sapling> References: <20120201035145.411492A6C12C@llvm.org> <4F28C6B1.4060409@mxc.ca> <1328074613.2489.1155.camel@sapling> Message-ID: <4F28F968.3040800@free.fr> Hi Hal, >>> + if (Vectorize) { >>> + MPM.add(createBBVectorizePass()); >>> + MPM.add(createInstructionCombiningPass()); >>> + if (OptLevel> 1) >>> + MPM.add(createGVNPass()); // Remove redundancies >> >> Whooooaa... GVN is *really* expensive, I find it hard to believe that >> you want to run it twice even with vectorization on. Are you sure? What >> is this doing that instcombine isn't? > > As I recall, this actually makes a big difference in the resulting code > quality. I'll revisit this and make some more specific comments. there is also the EarlyCSE pass, though of course it does much less than GVN. Ciao, Duncan. From baldrick at free.fr Wed Feb 1 02:44:10 2012 From: baldrick at free.fr (Duncan Sands) Date: Wed, 01 Feb 2012 09:44:10 +0100 Subject: [llvm-commits] [llvm-gcc-4.0] r149486 - /llvm-gcc-4.0/trunk/gcc/llvm-convert.cpp In-Reply-To: <20120201080023.3A8D12A6C12C@llvm.org> References: <20120201080023.3A8D12A6C12C@llvm.org> Message-ID: <4F28FB5A.4090509@free.fr> Hi Stepan, > Compatability fix for SwitchInst refactoring. thanks for working on this. You don't need to update llvm-gcc-4.0. 
In theory you don't need to update llvm-gcc-4.2 either, but there are still some llvm-gcc buildbots running (I hope to get rid of the last two at lab.llvm.org in the next few weeks). Ciao, Duncan. From echristo at apple.com Wed Feb 1 02:51:06 2012 From: echristo at apple.com (Eric Christopher) Date: Wed, 01 Feb 2012 00:51:06 -0800 Subject: [llvm-commits] PATCH: Add support to llvm::Triple for computing 32-bit and 64-bit variant triples. In-Reply-To: References: Message-ID: <37B81E9A-E6FF-47DC-A820-F3FCB86A9C01@apple.com> On Jan 30, 2012, at 10:25 PM, Chandler Carruth wrote: > This patch teaches the Triple class to compute 32-bit variants of 64-bit architectures and 64-bit variants of 32-bit architectures. These can be used when reasoning about what alternate triples may have semi-compatible toolchains such as multiarch and bi-arch toolchains. The goal in placing this logic here is to associate it closely with the triple and architecture definitions themselves so that as those change, this gets updated and maintained. > > Comments on the somewhat clunky API welcome. The reason I went with returning a full triple rather than operating exclusively on the Arch is to make code using the interface as concise as possible. Any reason why the 32-bit version of an already 32-bit arch returns unknown rather than the existing arch itself? -eric From chandlerc at gmail.com Wed Feb 1 02:54:14 2012 From: chandlerc at gmail.com (Chandler Carruth) Date: Wed, 1 Feb 2012 00:54:14 -0800 Subject: [llvm-commits] PATCH: Add several convenience predicates to llvm::Triple In-Reply-To: References: Message-ID: Sending with the patch re-attached for the folks that actually delete their unread email... ;] -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/1b022429/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: triple-predicates1.patch Type: application/octet-stream Size: 4784 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/1b022429/attachment-0001.obj From echristo at apple.com Wed Feb 1 02:56:47 2012 From: echristo at apple.com (Eric Christopher) Date: Wed, 01 Feb 2012 00:56:47 -0800 Subject: [llvm-commits] PATCH: Add several convenience predicates to llvm::Triple In-Reply-To: References: Message-ID: <78B0B74B-D015-44C6-9F06-8D8FC70FDB09@apple.com> On Feb 1, 2012, at 12:54 AM, Chandler Carruth wrote: > Sending with the patch re-attached for the folks that actually delete their unread email... ;] _______________________________________________ Oh I read it, I just deleted it after ;) Anyhow, looks good though you may have to rework it a bit after Bob's change of earlier today :) -eric From eli.bendersky at intel.com Wed Feb 1 02:57:47 2012 From: eli.bendersky at intel.com (Bendersky, Eli) Date: Wed, 1 Feb 2012 08:57:47 +0000 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> Message-ID: <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> Re-sending the patch itself (by request on IRC) -----Original Message----- From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Bendersky, Eli Sent: Tuesday, January 24, 2012 15:03 To: llvm-commits at cs.uiuc.edu Subject: [llvm-commits] [PATCH] enabling generation of ELF objects on Windows with the help of the triple Hello, Earlier this month I initiated a llvmdev discussion on the possibility to make MC generate code into an ELF container on Windows (http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-January/046583.html). 
Currently in several places in the code the decision is made based on the Triple's OS component. When it's Windows, a decision is made automatically to generate COFF, so a way is needed to let MC know that we still want ELF, even if we're on Windows.

There are several approaches to this:

1. Add this information somewhere which isn't the Triple
2. Add this information into the Triple, making it a 5-tuple instead of 4-tuple - the 5th component being "container" or something like that
3. Add this information into the Triple, overlaying the "environment" component

The attached patch takes approach (3) since this appears to make the minimal overall impact on the code. It adds an "ELF" option to the EnvironmentType enum. Since we're interested in ELF on Windows on x86, this environment option doesn't conflict with the others. In other words, it enables us to generate and run MCJIT-ted code on Windows, without interfering with other code in LLVM.

Although approach (1) would perhaps be cleaner, it is not easy to see how to go about it, since in many places where the modification is required the triple is the only accessible piece of information about the compiler target. The decision to generate COFF on Windows is based on the Triple, not on something else.

I'll be happy to hear about other options, or to get this patch reviewed so I can commit it.

Thanks in advance,
Eli

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: windows_elf_triple.2.patch
Type: application/octet-stream
Size: 4996 bytes
Desc: windows_elf_triple.2.patch
Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/fb4dc98f/attachment.obj
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ATT00001.txt
Url: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/fb4dc98f/attachment.txt

From chandlerc at gmail.com Wed Feb 1 02:59:11 2012
From: chandlerc at gmail.com (Chandler Carruth)
Date: Wed, 1 Feb 2012 00:59:11 -0800
Subject: [llvm-commits] PATCH: Add support to llvm::Triple for computing 32-bit and 64-bit variant triples.
In-Reply-To: <37B81E9A-E6FF-47DC-A820-F3FCB86A9C01@apple.com>
References: <37B81E9A-E6FF-47DC-A820-F3FCB86A9C01@apple.com>
Message-ID:

On Wed, Feb 1, 2012 at 12:51 AM, Eric Christopher wrote:

>
> On Jan 30, 2012, at 10:25 PM, Chandler Carruth wrote:
>
> > This patch teaches the Triple class to compute 32-bit variants of 64-bit architectures and 64-bit variants of 32-bit architectures. These can be used when reasoning about what alternate triples may have semi-compatible toolchains such as multiarch and bi-arch toolchains. The goal in placing this logic here is to associate it closely with the triple and architecture definitions themselves so that as those change, this gets updated and maintained.
> >
> > Comments on the somewhat clunky API welcome. The reason I went with returning a full triple rather than operating exclusively on the Arch is to make code using the interface as concise as possible.
>
> Any reason why the 32-bit version of an already 32-bit arch returns unknown rather than the existing arch itself?

Because it's not much of a variant?
I thought of 3 possible behaviors: assert, return self, return unknown I'm happy with any of the three. I picked the return unknown as it technically simplifies what i expect to be the common client: Triple AltTriple = MyTriple.getArch32BitVariant(); if (AltTriple.getArch() != Triple::UnknownArch) { // Try alt } But I could be persuaded to any of the other behaviors. =] -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/b296811b/attachment.html From echristo at apple.com Wed Feb 1 03:01:05 2012 From: echristo at apple.com (Eric Christopher) Date: Wed, 01 Feb 2012 01:01:05 -0800 Subject: [llvm-commits] PATCH: Add support to llvm::Triple for computing 32-bit and 64-bit variant triples. In-Reply-To: References: <37B81E9A-E6FF-47DC-A820-F3FCB86A9C01@apple.com> Message-ID: <7BC058FB-2D92-49D8-928E-EAD978B42A3D@apple.com> On Feb 1, 2012, at 12:59 AM, Chandler Carruth wrote: > On Wed, Feb 1, 2012 at 12:51 AM, Eric Christopher wrote: > > On Jan 30, 2012, at 10:25 PM, Chandler Carruth wrote: > > > This patch teaches the Triple class to compute 32-bit variants of 64-bit architectures and 64-bit variants of 32-bit architectures. These can be used when reasoning about what alternate triples may have semi-compatible toolchains such as multiarch and bi-arch toolchains. The goal in placing this logic here is to associate it closely with the triple and architecture definitions themselves so that as those change, this gets updated and maintained. > > > > Comments on the somewhat clunky API welcome. The reason I went with returning a full triple rather than operating exclusively on the Arch is to make code using the interface as concise as possible. > > Any reason why the 32-bit version of an already 32-bit arch returns unknown rather than the existing arch itself? > > Because it's not much of a variant? 
> > I thought of 3 possible behaviors: assert, return self, return unknown > > I'm happy with any of the three. I picked the return unknown as it technically simplifies what i expect to be the common client: > > Triple AltTriple = MyTriple.getArch32BitVariant(); > if (AltTriple.getArch() != Triple::UnknownArch) { > // Try alt > } > > But I could be persuaded to any of the other behaviors. =] :) I'd have thought returning identity would be easier to handle in the "gimme the 32-bit variant" "you're on it already, done." "*uses 32-bit variant*" and error/return unknown if one really doesn't exist. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/0038357c/attachment.html From echristo at apple.com Wed Feb 1 03:09:43 2012 From: echristo at apple.com (Eric Christopher) Date: Wed, 01 Feb 2012 01:09:43 -0800 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> Message-ID: Thanks for the patch. Couple questions/concerns: a) Have you seen the win32 in macho work? Thoughts on how that applies here? b) How about a way of initializing the JIT that takes the triple of the target you wish to generate code/information for rather than adding to the triple? I think for the JIT it makes more sense for the interface to require a "container triple" which defaults to the current host. Jim: Objections? Eli: Thoughts? 
:) -eric On Feb 1, 2012, at 12:57 AM, Bendersky, Eli wrote: > Re-sending the patch itself (by request on IRC) > > -----Original Message----- > From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Bendersky, Eli > Sent: Tuesday, January 24, 2012 15:03 > To: llvm-commits at cs.uiuc.edu > Subject: [llvm-commits] [PATCH] enabling generation of ELF objects on Windows with the help of the triple > > Hello, > > Earlier this month I initiated a llvmdev discussion on the possibility to make MC generate code into an ELF container on Windows (http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-January/046583.html). Currently in several places in the code the decision is made based on the Triple's OS component. When it's Windows, a decision is made automatically to generate COFF, so a way is needed to let MC know that we still want ELF, even if we're on Windows. > > There are several approaches to this: > > 1. Add this information somewhere which isn't the Triple 2. Add this information into the Triple, making it a 5-tuple instead of 4-tuple - the 5th component being "container" or something like that 3. Add this information into the Triple, overlaying the "environment" component > > The attached patch takes approach (3) since this appears to make the minimal overall impact on the code. It adds an "ELF" option to the EnvironmentType enum. Since we're interested in ELF on Windows on x86, this environment option doesn't conflict with the others. In other words, it enables us to generate and run MCJIT-ted code on Windows, without interfering with other code in LLVM. > > Although approach (1) would perhaps be cleaner, it is not easy to see how to go about it, since in many places where the modification is required the triple is the only accessible piece of information about the compiler target. The decision to generate COFF on Windows is based on the Triple, not on something else. 
> > I'll be happy to hear about other options, or to get this patch reviewed so I can commit it. > > Thanks in advance, > Eli > > --------------------------------------------------------------------- > Intel Israel (74) Limited > > This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. > --------------------------------------------------------------------- > Intel Israel (74) Limited > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies. > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From anton at korobeynikov.info Wed Feb 1 03:13:27 2012 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Wed, 1 Feb 2012 13:13:27 +0400 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> Message-ID: Hi Eli, > Re-sending the patch itself (by request on IRC) I thought about this patch a little bit more. And I think we should have something "symmetric" wrt other targets/environments. 
How do you feel about the following: we should have "sane" set of defaults (macho on darwin, coff on win, elf everywhere else). If one will explicitly ask for other format, it should be tolerated, regardless whether elf was asked on windows or macho. What do you think? I believe this will make the patch cleaner... I don't like special case of ELF everywhere, such cases tend to be forgotten in many places. Minor nitpicks: - return !isTargetDarwin() && !isTargetWindows() && !isTargetCygMing(); + return TargetTriple.getEnvironment() == Triple::ELF || ( + !isTargetDarwin() && !isTargetWindows() && !isTargetCygMing()); Put ( at the new line :) -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From mcrosier at apple.com Wed Feb 1 03:26:55 2012 From: mcrosier at apple.com (Chad Rosier) Date: Wed, 01 Feb 2012 01:26:55 -0800 Subject: [llvm-commits] [llvm] r149485 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp lib/Target/X86/X86ISelLowering.h test/CodeGen/X86/avx-trunc.ll In-Reply-To: <20120201075644.BE35F2A6C12C@llvm.org> References: <20120201075644.BE35F2A6C12C@llvm.org> Message-ID: <70671B54-9364-4639-91D2-33DA0589BF90@apple.com> Hi Elena, Minor nit-picks below. On Jan 31, 2012, at 11:56 PM, Elena Demikhovsky wrote: > Author: delena > Date: Wed Feb 1 01:56:44 2012 > New Revision: 149485 > > URL: http://llvm.org/viewvc/llvm-project?rev=149485&view=rev > Log: > Optimization for "truncate" operation on AVX. > Truncating v4i64 -> v4i32 and v8i32 -> v8i16 may be done with set of shuffles. 
> > Added: > llvm/trunk/test/CodeGen/X86/avx-trunc.ll (with props) > Modified: > llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > llvm/trunk/lib/Target/X86/X86ISelLowering.h > > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149485&r1=149484&r2=149485&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Feb 1 01:56:44 2012 > @@ -1218,6 +1218,7 @@ > setTargetDAGCombine(ISD::LOAD); > setTargetDAGCombine(ISD::STORE); > setTargetDAGCombine(ISD::ZERO_EXTEND); > + setTargetDAGCombine(ISD::TRUNCATE); > setTargetDAGCombine(ISD::SINT_TO_FP); > if (Subtarget->is64Bit()) > setTargetDAGCombine(ISD::MUL); > @@ -12911,6 +12912,104 @@ > return EltsFromConsecutiveLoads(VT, Elts, dl, DAG); > } > > + > +/// PerformTruncateCombine - Converts truncate operation to > +/// a sequence of vector shuffle operations. 
> +/// It is possible when we truncate 256-bit vector to 128-bit vector > + > +SDValue X86TargetLowering::PerformTruncateCombine(SDNode *N, SelectionDAG &DAG, > + DAGCombinerInfo &DCI) const { > + if (!DCI.isBeforeLegalizeOps()) > + return SDValue(); > + > + if (!Subtarget->hasAVX()) return SDValue(); > + > + EVT VT = N->getValueType(0); > + SDValue Op = N->getOperand(0); > + EVT OpVT = Op.getValueType(); > + DebugLoc dl = N->getDebugLoc(); > + > + if ((VT == MVT::v4i32) && (OpVT == MVT::v4i64)) { > + > + SDValue OpLo = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, MVT::v2i64, Op, > + DAG.getIntPtrConstant(0)); > + > + SDValue OpHi = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, MVT::v2i64, Op, > + DAG.getIntPtrConstant(2)); > + > + OpLo = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpLo); > + OpHi = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpHi); > + > + // PSHUFD > + SmallVector ShufMask1; > + ShufMask1.push_back(0); > + ShufMask1.push_back(2); > + ShufMask1.push_back(0); > + ShufMask1.push_back(0); > + > + OpLo = DAG.getVectorShuffle(VT, dl, OpLo, DAG.getUNDEF(VT), > + ShufMask1.data()); > + OpHi = DAG.getVectorShuffle(VT, dl, OpHi, DAG.getUNDEF(VT), > + ShufMask1.data()); > + > + // MOVLHPS > + SmallVector ShufMask2; > + ShufMask2.push_back(0); > + ShufMask2.push_back(1); > + ShufMask2.push_back(4); > + ShufMask2.push_back(5); > + > + return DAG.getVectorShuffle(VT, dl, OpLo, OpHi, ShufMask2.data()); > + } > + if ((VT == MVT::v8i16) && (OpVT == MVT::v8i32)) { > + > + SDValue OpLo = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, MVT::v4i32, Op, > + DAG.getIntPtrConstant(0)); > + > + SDValue OpHi = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, MVT::v4i32, Op, > + DAG.getIntPtrConstant(4)); > + > + OpLo = DAG.getNode(ISD::BITCAST, dl, MVT::v16i8, OpLo); > + OpHi = DAG.getNode(ISD::BITCAST, dl, MVT::v16i8, OpHi); > + > + // PSHUFB > + SmallVector ShufMask1; > + ShufMask1.push_back(0x0); > + ShufMask1.push_back(0x1); > + ShufMask1.push_back(0x4); > + ShufMask1.push_back(0x5); > + 
ShufMask1.push_back(0x8); > + ShufMask1.push_back(0x9); > + ShufMask1.push_back(0xc); > + ShufMask1.push_back(0xd); > + for (unsigned i=0; i<8; ++i) It's much preferred to format for loops like this: for (unsigned i = 0; i < 8; ++i) Specifically, I'm referring to the whitespace, or rather the lack thereof. > + ShufMask1.push_back(-1); > + > + OpLo = DAG.getVectorShuffle(MVT::v16i8, dl, OpLo, > + DAG.getUNDEF(MVT::v16i8), > + ShufMask1.data()); > + OpHi = DAG.getVectorShuffle(MVT::v16i8, dl, OpHi, > + DAG.getUNDEF(MVT::v16i8), > + ShufMask1.data()); > + > + OpLo = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpLo); > + OpHi = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpHi); > + > + // MOVLHPS > + SmallVector ShufMask2; > + ShufMask2.push_back(0); > + ShufMask2.push_back(1); > + ShufMask2.push_back(4); > + ShufMask2.push_back(5); > + > + SDValue res = DAG.getVectorShuffle(MVT::v4i32, dl, OpLo, OpHi, ShufMask2.data()); > + return DAG.getNode(ISD::BITCAST, dl, MVT::v8i16, res); > + Extra newline. > + } > + > + return SDValue(); > +} > + > /// PerformEXTRACT_VECTOR_ELTCombine - Detect vector gather/scatter index > /// generation and convert it from being a bunch of shuffles and extracts > /// to a simple store and scalar loads to extract the elements.
> @@ -14771,6 +14870,7 @@ > case X86ISD::BT: return PerformBTCombine(N, DAG, DCI); > case X86ISD::VZEXT_MOVL: return PerformVZEXT_MOVLCombine(N, DAG); > case ISD::ZERO_EXTEND: return PerformZExtCombine(N, DAG, Subtarget); > + case ISD::TRUNCATE: return PerformTruncateCombine(N, DAG, DCI); > case X86ISD::SETCC: return PerformSETCCCombine(N, DAG); > case X86ISD::SHUFP: // Handle all target specific shuffles > case X86ISD::PALIGN: > > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.h?rev=149485&r1=149484&r2=149485&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86ISelLowering.h (original) > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.h Wed Feb 1 01:56:44 2012 > @@ -839,6 +839,7 @@ > SDValue LowerMEMBARRIER(SDValue Op, SelectionDAG &DAG) const; > SDValue LowerATOMIC_FENCE(SDValue Op, SelectionDAG &DAG) const; > SDValue LowerSIGN_EXTEND_INREG(SDValue Op, SelectionDAG &DAG) const; > + SDValue PerformTruncateCombine(SDNode* N, SelectionDAG &DAG, DAGCombinerInfo &DCI) const; 80-column violation? 
> > // Utility functions to help LowerVECTOR_SHUFFLE > SDValue LowerVECTOR_SHUFFLEv8i16(SDValue Op, SelectionDAG &DAG) const; > > Added: llvm/trunk/test/CodeGen/X86/avx-trunc.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-trunc.ll?rev=149485&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/avx-trunc.ll (added) > +++ llvm/trunk/test/CodeGen/X86/avx-trunc.ll Wed Feb 1 01:56:44 2012 > @@ -0,0 +1,15 @@ > +; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=corei7-avx -mattr=+avx | FileCheck %s > + > +define <4 x i32> @trunc_64_32(<4 x i64> %A) nounwind uwtable readnone ssp{ > +; CHECK: trunc_64_32 > +; CHECK: pshufd > + %B = trunc <4 x i64> %A to <4 x i32> > + ret <4 x i32>%B > +} > +define <8 x i16> @trunc_32_16(<8 x i32> %A) nounwind uwtable readnone ssp{ > +; CHECK: trunc_32_16 > +; CHECK: pshufb > + %B = trunc <8 x i32> %A to <8 x i16> > + ret <8 x i16>%B > +} > + > > Propchange: llvm/trunk/test/CodeGen/X86/avx-trunc.ll > ------------------------------------------------------------------------------ > svn:executable = * > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From glider at google.com Wed Feb 1 03:47:40 2012 From: glider at google.com (Alexander Potapenko) Date: Wed, 01 Feb 2012 09:47:40 -0000 Subject: [llvm-commits] [compiler-rt] r149491 - /compiler-rt/trunk/lib/asan/tests/asan_test.cc Message-ID: <20120201094740.D884F2A6C12C@llvm.org> Author: glider Date: Wed Feb 1 03:47:40 2012 New Revision: 149491 URL: http://llvm.org/viewvc/llvm-project?rev=149491&view=rev Log: Disables testing memcpy() on Mac OS 10.7, where memcpy() in fact aliases memmove() and thus calling it with overlapping parameters is not an error. 
Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=149491&r1=149490&r2=149491&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original) +++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Wed Feb 1 03:47:40 2012 @@ -30,6 +30,7 @@ #ifndef __APPLE__ #include #else +#include // For MAC_OS_X_VERSION_* #include #endif // __APPLE__ @@ -1305,12 +1306,17 @@ size_t size = Ident(100); char *str = Ident((char*)malloc(size)); +// Do not check memcpy() on OS X 10.7 and later, where it actually aliases +// memmove(). +#if !defined(__APPLE__) || !defined(MAC_OS_X_VERSION_10_7) || \ + (MAC_OS_X_VERSION_MAX_ALLOWED < MAC_OS_X_VERSION_10_7) // Check "memcpy". Use Ident() to avoid inlining. memset(str, 'z', size); Ident(memcpy)(str + 1, str + 11, 10); Ident(memcpy)(str, str, 0); EXPECT_DEATH(Ident(memcpy)(str, str + 14, 15), OverlapErrorMessage("memcpy")); EXPECT_DEATH(Ident(memcpy)(str + 14, str, 15), OverlapErrorMessage("memcpy")); +#endif // We do not treat memcpy with to==from as a bug. // See http://llvm.org/bugs/show_bug.cgi?id=11763. From baldrick at free.fr Wed Feb 1 04:02:39 2012 From: baldrick at free.fr (Duncan Sands) Date: Wed, 01 Feb 2012 11:02:39 +0100 Subject: [llvm-commits] [llvm] r149485 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp lib/Target/X86/X86ISelLowering.h test/CodeGen/X86/avx-trunc.ll In-Reply-To: <70671B54-9364-4639-91D2-33DA0589BF90@apple.com> References: <20120201075644.BE35F2A6C12C@llvm.org> <70671B54-9364-4639-91D2-33DA0589BF90@apple.com> Message-ID: <4F290DBF.5060703@free.fr> Hi Elena, >> + // PSHUFD >> + SmallVector ShufMask1; >> + ShufMask1.push_back(0); >> + ShufMask1.push_back(2); >> + ShufMask1.push_back(0); >> + ShufMask1.push_back(0); why not just int ShufMask1[] = {0, 2, 0, 0}; ? 
Likewise for all the other instances of this odd idiom. Ciao, Duncan. From elena.demikhovsky at intel.com Wed Feb 1 04:02:03 2012 From: elena.demikhovsky at intel.com (Demikhovsky, Elena) Date: Wed, 1 Feb 2012 10:02:03 +0000 Subject: [llvm-commits] Fixed a bug in Win64 CC - please review Message-ID: Passing AVX 256-bit structures in Win64 was wrong. - Elena --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. -------------- next part -------------- A non-text attachment was scrubbed... Name: win64_cc.diff Type: application/octet-stream Size: 1888 bytes Desc: win64_cc.diff Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/0ada9403/attachment.obj From anton at korobeynikov.info Wed Feb 1 04:10:06 2012 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Wed, 1 Feb 2012 14:10:06 +0400 Subject: [llvm-commits] Fixed a bug in Win64 CC - please review In-Reply-To: References: Message-ID: > Passing AVX 256-bit structures in Win64 was wrong. LGTM -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From glider at google.com Wed Feb 1 04:07:52 2012 From: glider at google.com (Alexander Potapenko) Date: Wed, 01 Feb 2012 10:07:52 -0000 Subject: [llvm-commits] [compiler-rt] r149492 - /compiler-rt/trunk/lib/asan/asan_interceptors.cc Message-ID: <20120201100753.0FF462A6C12D@llvm.org> Author: glider Date: Wed Feb 1 04:07:52 2012 New Revision: 149492 URL: http://llvm.org/viewvc/llvm-project?rev=149492&view=rev Log: Disable wrapping memcpy() on Mac OS Lion, where it actually falls back to memmove. 
In this case we still need to initialize real_memcpy, so we set it to real_memmove We check for MACOS_VERSION_SNOW_LEOPARD, because currently only Snow Leopard and Lion are supported. Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc Modified: compiler-rt/trunk/lib/asan/asan_interceptors.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interceptors.cc?rev=149492&r1=149491&r2=149492&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interceptors.cc (original) +++ compiler-rt/trunk/lib/asan/asan_interceptors.cc Wed Feb 1 04:07:52 2012 @@ -676,8 +676,22 @@ OVERRIDE_FUNCTION(index, WRAP(strchr)); #endif INTERCEPT_FUNCTION(memcmp); - INTERCEPT_FUNCTION(memcpy); INTERCEPT_FUNCTION(memmove); +#ifdef __APPLE__ + // Wrap memcpy() on OS X 10.6 only, because on 10.7 memcpy() and memmove() + // are resolved into memmove$VARIANT$sse42. + // See also http://code.google.com/p/address-sanitizer/issues/detail?id=34. + // TODO(glider): need to check dynamically that memcpy() and memmove() are + // actually the same function. + if (GetMacosVersion() == MACOS_VERSION_SNOW_LEOPARD) { + INTERCEPT_FUNCTION(memcpy); + } else { + real_memcpy = real_memmove; + } +#else + // Always wrap memcpy() on non-Darwin platforms. 
+ INTERCEPT_FUNCTION(memcpy); +#endif INTERCEPT_FUNCTION(memset); INTERCEPT_FUNCTION(strcasecmp); INTERCEPT_FUNCTION(strcat); // NOLINT From elena.demikhovsky at intel.com Wed Feb 1 04:33:05 2012 From: elena.demikhovsky at intel.com (Elena Demikhovsky) Date: Wed, 01 Feb 2012 10:33:05 -0000 Subject: [llvm-commits] [llvm] r149493 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <20120201103305.C599F2A6C12C@llvm.org> Author: delena Date: Wed Feb 1 04:33:05 2012 New Revision: 149493 URL: http://llvm.org/viewvc/llvm-project?rev=149493&view=rev Log: Shortened code in shuffle masks Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149493&r1=149492&r2=149493&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Feb 1 04:33:05 2012 @@ -12941,25 +12941,17 @@ OpHi = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpHi); // PSHUFD - SmallVector ShufMask1; - ShufMask1.push_back(0); - ShufMask1.push_back(2); - ShufMask1.push_back(0); - ShufMask1.push_back(0); + int ShufMask1[] = {0, 2, 0, 0}; OpLo = DAG.getVectorShuffle(VT, dl, OpLo, DAG.getUNDEF(VT), - ShufMask1.data()); + ShufMask1); OpHi = DAG.getVectorShuffle(VT, dl, OpHi, DAG.getUNDEF(VT), - ShufMask1.data()); + ShufMask1); // MOVLHPS - SmallVector ShufMask2; - ShufMask2.push_back(0); - ShufMask2.push_back(1); - ShufMask2.push_back(4); - ShufMask2.push_back(5); + int ShufMask2[] = {0, 1, 4, 5}; - return DAG.getVectorShuffle(VT, dl, OpLo, OpHi, ShufMask2.data()); + return DAG.getVectorShuffle(VT, dl, OpLo, OpHi, ShufMask2); } if ((VT == MVT::v8i16) && (OpVT == MVT::v8i32)) { @@ -12973,38 +12965,24 @@ OpHi = DAG.getNode(ISD::BITCAST, dl, MVT::v16i8, OpHi); // PSHUFB - SmallVector ShufMask1; - ShufMask1.push_back(0x0); - 
ShufMask1.push_back(0x1); - ShufMask1.push_back(0x4); - ShufMask1.push_back(0x5); - ShufMask1.push_back(0x8); - ShufMask1.push_back(0x9); - ShufMask1.push_back(0xc); - ShufMask1.push_back(0xd); - for (unsigned i=0; i<8; ++i) - ShufMask1.push_back(-1); + int ShufMask1[] = {0, 1, 4, 5, 8, 9, 12, 13, + -1, -1, -1, -1, -1, -1, -1, -1}; OpLo = DAG.getVectorShuffle(MVT::v16i8, dl, OpLo, DAG.getUNDEF(MVT::v16i8), - ShufMask1.data()); + ShufMask1); OpHi = DAG.getVectorShuffle(MVT::v16i8, dl, OpHi, DAG.getUNDEF(MVT::v16i8), - ShufMask1.data()); + ShufMask1); OpLo = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpLo); OpHi = DAG.getNode(ISD::BITCAST, dl, MVT::v4i32, OpHi); // MOVLHPS - SmallVector ShufMask2; - ShufMask2.push_back(0); - ShufMask2.push_back(1); - ShufMask2.push_back(4); - ShufMask2.push_back(5); + int ShufMask2[] = {0, 1, 4, 5}; - SDValue res = DAG.getVectorShuffle(MVT::v4i32, dl, OpLo, OpHi, ShufMask2.data()); + SDValue res = DAG.getVectorShuffle(MVT::v4i32, dl, OpLo, OpHi, ShufMask2); return DAG.getNode(ISD::BITCAST, dl, MVT::v8i16, res); - } return SDValue(); From bigcheesegs at gmail.com Wed Feb 1 04:43:06 2012 From: bigcheesegs at gmail.com (Michael Spencer) Date: Wed, 1 Feb 2012 02:43:06 -0800 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> Message-ID: On Wed, Feb 1, 2012 at 1:13 AM, Anton Korobeynikov wrote: > Hi Eli, > >> Re-sending the patch itself (by request on IRC) > I thought about this patch a little bit more. And I think we should > have something "symmetric" wrt other targets/environments. > > How do you feel about the following: we should have "sane" set of > defaults (macho on darwin, coff on win, elf everywhere else). 
If one > will explicitly ask for other format, it should be tolerated, > regardless whether elf was asked on windows or macho. What do you > think? I believe this will make the patch cleaner... I don't like > special case of ELF everywhere, such cases tend to be forgotten in > many places. > > Minor nitpicks: > > - return !isTargetDarwin() && !isTargetWindows() && !isTargetCygMing(); > + return TargetTriple.getEnvironment() == Triple::ELF || ( > + !isTargetDarwin() && !isTargetWindows() && !isTargetCygMing()); > Put ( at the new line :) > > -- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University I agree with Anton here. This should be consistent and not just special cased for ELF/Windows. The only thing I'm not sure about is how much effort will be required to properly respect the requested format. As it stands, there is quite a bit of code that assumes Windows => COFF and Darwin => MachO. There are also features, such as TLS, which depend on the format, but are all currently based on the OS. - Michael Spencer From elena.demikhovsky at intel.com Wed Feb 1 04:46:14 2012 From: elena.demikhovsky at intel.com (Elena Demikhovsky) Date: Wed, 01 Feb 2012 10:46:14 -0000 Subject: [llvm-commits] [llvm] r149494 - in /llvm/trunk: lib/Target/X86/X86CallingConv.td test/CodeGen/X86/avx-win64-args.ll test/CodeGen/X86/avx-win64.ll Message-ID: <20120201104614.8B9152A6C12E@llvm.org> Author: delena Date: Wed Feb 1 04:46:14 2012 New Revision: 149494 URL: http://llvm.org/viewvc/llvm-project?rev=149494&view=rev Log: Passing AVX 256-bit structures in Win64 was wrong. Fixed Win64 calling conventions.
Added: llvm/trunk/test/CodeGen/X86/avx-win64-args.ll (with props) Modified: llvm/trunk/lib/Target/X86/X86CallingConv.td llvm/trunk/test/CodeGen/X86/avx-win64.ll Modified: llvm/trunk/lib/Target/X86/X86CallingConv.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86CallingConv.td?rev=149494&r1=149493&r2=149494&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86CallingConv.td (original) +++ llvm/trunk/lib/Target/X86/X86CallingConv.td Wed Feb 1 04:46:14 2012 @@ -198,6 +198,10 @@ // 128 bit vectors are passed by pointer CCIfType<[v16i8, v8i16, v4i32, v2i64, v4f32, v2f64], CCPassIndirect>, + + // 256 bit vectors are passed by pointer + CCIfType<[v32i8, v16i16, v8i32, v4i64, v8f32, v4f64], CCPassIndirect>, + // The first 4 MMX vector arguments are passed in GPRs. CCIfType<[x86mmx], CCBitConvertToType>, Added: llvm/trunk/test/CodeGen/X86/avx-win64-args.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-win64-args.ll?rev=149494&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-win64-args.ll (added) +++ llvm/trunk/test/CodeGen/X86/avx-win64-args.ll Wed Feb 1 04:46:14 2012 @@ -0,0 +1,18 @@ +; RUN: llc < %s -mcpu=corei7-avx -mattr=+avx | FileCheck %s +target triple = "x86_64-pc-win32" + +declare <8 x float> @foo(<8 x float>, i32) + +define <8 x float> @test1(<8 x float> %x, <8 x float> %y) nounwind uwtable readnone ssp { +entry: +; CHECK: test1 +; CHECK: leaq {{.*}}, %rcx +; CHECK: movl {{.*}}, %edx +; CHECK: call +; CHECK: ret + %x1 = fadd <8 x float> %x, %y + %call = call <8 x float> @foo(<8 x float> %x1, i32 1) nounwind + %y1 = fsub <8 x float> %call, %y + ret <8 x float> %y1 +} + Propchange: llvm/trunk/test/CodeGen/X86/avx-win64-args.ll ------------------------------------------------------------------------------ svn:executable = * Modified: llvm/trunk/test/CodeGen/X86/avx-win64.ll 
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-win64.ll?rev=149494&r1=149493&r2=149494&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-win64.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-win64.ll Wed Feb 1 04:46:14 2012 @@ -9,7 +9,6 @@ ; CHECK: f___vyf ; CHECK: pushq %rbp -; CHECK-NOT: vmovaps{{.*}}(%r ; CHECK: vmovmsk ; CHECK: vmovaps %ymm{{.*}}(%r ; CHECK: vmovaps %ymm{{.*}}(%r From anton at korobeynikov.info Wed Feb 1 04:56:41 2012 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Wed, 1 Feb 2012 14:56:41 +0400 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> Message-ID: > The only thing I'm not sure about is how much effort will be required > to properly respect the requested format. As it stands, there is quite > a bit of code that assumes Windows => COFF and Darwin => MachO. There > are also features, such as TLS, which depend on the format, but are > all currently based on the OS. I don't think that such stuff should be resolved all-at-once. 
Such limitations can be addressed as soon as someone will try to actually use them :) -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From eli.bendersky at intel.com Wed Feb 1 05:12:02 2012 From: eli.bendersky at intel.com (Bendersky, Eli) Date: Wed, 1 Feb 2012 11:12:02 +0000 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> Message-ID: <9BBE4537D1BAAB479E9E8F9D4234619D326D58@HASMSX103.ger.corp.intel.com> > Thanks for the patch. Couple questions/concerns: Eric, thanks for the review. My answers below: > > a) Have you seen the win32 in macho work? Thoughts on how that applies > here? As we discussed on IRC, I'm not really sure what this refers to. Conceptually, the new ELF environment type was added modeled on the existing MachO type. If you look around for "Triple.getEnvironment() == Triple::MachO" you'll actually see that a similar "container override" was already implemented for MachO (hmm, is this what you're referring to as the win32 macho work?) > b) How about a way of initializing the JIT that takes the triple of the target > you wish to generate code/information for rather than adding to the triple? > > I think for the JIT it makes more sense for the interface to require a > "container triple" which defaults to the current host. > ExecutionEngine uses the triple defined in the module. This triple can be overridden, by setting the module triple. This is actually already done in lli.cpp - it accepts a "-mtriple" argument that can override the module's triple:

  // If we are supposed to override the target triple, do so now.
  if (!TargetTriple.empty())
    Mod->setTargetTriple(Triple::normalize(TargetTriple));

The real problem is elsewhere.
To put it simply, MCJIT uses LLVMTargetMachine::addPassesToEmitMC to eventually generate an object file from MC. When MC decides which object file to create, it's currently just looking at the OS component of the target triple, and decides COFF if it sees Windows, etc. (with the exception of checking for MachO, as mentioned above). Actually, the only way to currently affect the object file format is to set the triple.

So to get an ELF object file on Windows we have to change some things. As I previously mentioned, there are several approaches to this:

1. Add this information somewhere which isn't the Triple
2. Add this information into the Triple, making it a 5-tuple instead of 4-tuple - the 5th component being "container" or something like that
3. Add this information into the Triple, overlaying the "environment" component

The patch takes approach (3). Approach (2) would also be relatively simple (and perhaps cleaner, except making Triple a 5-tuple instead of 4-tuple). Approach (1) seems (to my neophyte self) much more complex, as MC's and Target's reliance on the triple to decide on the object file format goes deep. A look at the code of the patch shows which parts of the code have to be touched to make the container override work.

While we would be happy to have this patch in, we fully understand the need to come up with a solution people feel comfortable with. Discussion on this topic would be very welcome.
Eli > On Feb 1, 2012, at 12:57 AM, Bendersky, Eli wrote: > > > Re-sending the patch itself (by request on IRC) > > > > -----Original Message----- > > From: llvm-commits-bounces at cs.uiuc.edu > > [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Bendersky, Eli > > Sent: Tuesday, January 24, 2012 15:03 > > To: llvm-commits at cs.uiuc.edu > > Subject: [llvm-commits] [PATCH] enabling generation of ELF objects on > > Windows with the help of the triple > > > > Hello, > > > > Earlier this month I initiated a llvmdev discussion on the possibility to make > MC generate code into an ELF container on Windows > (http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-January/046583.html). > Currently in several places in the code the decision is made based on the > Triple's OS component. When it's Windows, a decision is made automatically > to generate COFF, so a way is needed to let MC know that we still want ELF, > even if we're on Windows. > > > > There are several approaches to this: > > > > 1. Add this information somewhere which isn't the Triple 2. Add this > > information into the Triple, making it a 5-tuple instead of 4-tuple - > > the 5th component being "container" or something like that 3. Add this > > information into the Triple, overlaying the "environment" component > > > > The attached patch takes approach (3) since this appears to make the > minimal overall impact on the code. It adds an "ELF" option to the > EnvironmentType enum. Since we're interested in ELF on Windows on x86, > this environment option doesn't conflict with the others. In other words, it > enables us to generate and run MCJIT-ted code on Windows, without > interfering with other code in LLVM. > > > > Although approach (1) would perhaps be cleaner, it is not easy to see how > to go about it, since in many places where the modification is required the > triple is the only accessible piece of information about the compiler target. 
> The decision to generate COFF on Windows is based on the Triple, not on > something else. > > > > I'll be happy to hear about other options, or to get this patch reviewed so I > can commit it. > > > > Thanks in advance, > > Eli
From eli.bendersky at intel.com Wed Feb 1 05:18:10 2012 From: eli.bendersky at intel.com (Bendersky, Eli) Date: Wed, 1 Feb 2012 11:18:10 +0000 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> Message-ID: <9BBE4537D1BAAB479E9E8F9D4234619D326D6E@HASMSX103.ger.corp.intel.com> Thanks for the review -- my answers below: > > Re-sending the patch itself (by request on IRC) > I thought about this patch a little bit more. And I think we should have > something "symmetric" wrt other targets/environments. > > How do you feel about the following: we should have "sane" set of defaults > (macho on darwin, coff on win, elf everywhere else). If one will explicitly ask > for other format, it should be tolerated, regardless whether elf was asked on > windows or macho. What do you think? I believe this will make the patch > cleaner... I don't like special case of ELF everywhere, such cases tend to be > forgotten in many places. > <> >I agree with Anton here. This should be consistent and not just special cased for ELF/Windows. > >The only thing I'm not sure about is how much effort will be required to properly respect the requested format. As it stands, there is quite a bit of >code that assumes Windows => COFF and Darwin => MachO. There are also features, such as TLS, which depend on the format, but are all >currently based on the OS. Let me see if I grok your proposal. You want the "container override" to be implemented in a way that is generic and not only active for ELF. Is that right? In this case, I think we're already almost there. If you look at the usage of Triple.getEnvironment(), you'll notice that the "MachO override" was already implemented. 
What my patch adds is an override for ELF - which is, IIUC, exactly what you're asking for - perhaps not coded in a sufficiently clear way and not adding the option of asking for ELF on Mac? So what's left is "COFF override"? Do you propose to add an environment named "COFF" and decide upon that, too? Would this make sense even if no one is currently interested in this? > Minor nitpicks: > > - return !isTargetDarwin() && !isTargetWindows() && !isTargetCygMing(); > + return TargetTriple.getEnvironment() == Triple::ELF || ( > + !isTargetDarwin() && !isTargetWindows() && > + !isTargetCygMing()); > Put ( at the new line :) > Thank you, these will be fixed before committing. Eli --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. From bigcheesegs at gmail.com Wed Feb 1 05:20:13 2012 From: bigcheesegs at gmail.com (Michael Spencer) Date: Wed, 1 Feb 2012 03:20:13 -0800 Subject: [llvm-commits] [PATCH] cmake: make BUILD_SHARED_LIBS a visible option In-Reply-To: References: Message-ID: On Mon, Jan 30, 2012 at 1:06 PM, nobled wrote: > Before, this could only be specified on the commandline via > -DBUILD_SHARED_LIBS=ON, > and wouldn't show up as an option when invoked via `cmake -i` at all. > It also wasn't type-checked > as a boolean variable with only ON/OFF as valid values. > > --- > CMakeLists.txt | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/CMakeLists.txt b/CMakeLists.txt > index 13e358a..1e7a4a3 100644 > --- a/CMakeLists.txt > +++ b/CMakeLists.txt > @@ -94,6 +94,9 @@ else( MSVC ) >
CACHE STRING "Semicolon-separated list of targets to build, or \"all\".") > endif( MSVC ) > > +option(BUILD_SHARED_LIBS > + "Build all libraries as shared libraries instead of static" OFF) > + > option(LLVM_ENABLE_CBE_PRINTF_A "Set to ON if CBE is enabled for > printf %a output" ON) > if(LLVM_ENABLE_CBE_PRINTF_A) > set(ENABLE_CBE_PRINTF_A 1) > -- > 1.7.4.1 LGTM. Do you have commit access? Or should I commit it? From anton at korobeynikov.info Wed Feb 1 05:33:02 2012 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Wed, 1 Feb 2012 15:33:02 +0400 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: <9BBE4537D1BAAB479E9E8F9D4234619D326D6E@HASMSX103.ger.corp.intel.com> References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326D6E@HASMSX103.ger.corp.intel.com> Message-ID: Hi Eli > Let me see if I grok your proposal. You want the "container override" to be implemented in a way that is generic and not only active for ELF. Is that right? In this case, I think we're already almost there. If you look at the usage of Triple.getEnvironment(), you'll notice that the "MachO override" was already implemented. Well... parts at least in lib/Target/X86/X86Subtarget.h and lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp look really weird, because they make ELF as a special case. Look for MachO usage nearby. In particular, I think that all stuff like getEnvironment() != ELF should be removed / refactored.
-- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From eli.bendersky at intel.com Wed Feb 1 06:03:47 2012 From: eli.bendersky at intel.com (Bendersky, Eli) Date: Wed, 1 Feb 2012 12:03:47 +0000 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326D6E@HASMSX103.ger.corp.intel.com> Message-ID: <9BBE4537D1BAAB479E9E8F9D4234619D326DA5@HASMSX103.ger.corp.intel.com> > > Let me see if I grok your proposal. You want the "container override" to be > implemented in a way that is generic and not only active for ELF. Is that right? > In this case, I think we're already almost there. If you look at the usage of > Triple.getEnvironment(), you'll notice that the "MachO override" was already > implemented. > Well... parts at least in lib/Target/X86/X86Subtarget.h and > lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp look really weird, > because they make ELF as a special case. Look for MachO usage nearby. > > In particular, I think that all stuff like getEnvironment() != ELF should be > removed / refactored. > Anton, I think I see what you mean. The parallel thread with Eric discusses larger, conceptual differences which may end in all this code being changed to something completely different. However, if this approach is eventually accepted, I will try to streamline the special-casing for ELF to be more logical and similar to the special-casing for MachO, as you say. Thanks for your input, Eli --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. 
If you are not the intended recipient, please contact the sender and delete all copies. From anton at korobeynikov.info Wed Feb 1 06:17:28 2012 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Wed, 1 Feb 2012 16:17:28 +0400 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: <9BBE4537D1BAAB479E9E8F9D4234619D326DA5@HASMSX103.ger.corp.intel.com> References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326D6E@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326DA5@HASMSX103.ger.corp.intel.com> Message-ID: > Anton, I think I see what you mean. The parallel thread with Eric discusses larger, conceptual differences which may end in all this code being changed to something completely different. However, if this approach is eventually accepted, I will try to streamline the special-casing for ELF to be more logical and similar to the special-casing for MachO, as you say. Well, I think the root issue is precisely the same as Eric says. And eventually you will end up with some sort of refactoring which will streamline the stuff.
Surely adding "yet another special case" is a wrong conceptual solution :) -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From eli.bendersky at intel.com Wed Feb 1 06:22:26 2012 From: eli.bendersky at intel.com (Bendersky, Eli) Date: Wed, 1 Feb 2012 12:22:26 +0000 Subject: [llvm-commits] FW: [PATCH] enabling generation of ELF objects on Windows with the help of the triple In-Reply-To: References: <9BBE4537D1BAAB479E9E8F9D4234619D32305D@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326C45@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326D6E@HASMSX103.ger.corp.intel.com> <9BBE4537D1BAAB479E9E8F9D4234619D326DA5@HASMSX103.ger.corp.intel.com> Message-ID: <9BBE4537D1BAAB479E9E8F9D4234619D326DBA@HASMSX103.ger.corp.intel.com> > > Anton, I think I see what you mean. The parallel thread with Eric discusses > larger, conceptual differences which may end in all this code being changed > to something completely different. However, if this approach is eventually > accepted, I will try to streamline the special-casing for ELF to be more logical > and similar to the special-casing for MachO, as you say. > Well, I think the root issue is precisely the same as Eric says. And eventually > you will end up with some sort of refactoring which will streamline the stuff. Surely > adding "yet another special case" is a wrong conceptual solution :) > If ELF stays alongside the existing MachO in the Environment enum (the part which I call "conceptual"), the "streamlining" I'm referring to is to restructure the code that figures out which container to generate, so that it looks like this (pseudo-code):
    if environment is MachO: create MachO
    elif environment is ELF: create ELF
    else: decide based on the OS part of the Triple (Windows -> COFF, Mac -> MachO), etc.
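The pseudo-code above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the `Environment`, `OS` and `Container` enums and the `selectContainer` function are hypothetical stand-ins, not the actual llvm::Triple API, and the final fallback to ELF for all other operating systems is an assumption that the thread does not spell out (it only says "etc.").

```cpp
// Hypothetical stand-ins for the parts of the triple discussed in the thread.
// These are illustrative only and are NOT LLVM's real declarations.
enum class Environment { Unknown, MachO, ELF };
enum class OS { Windows, Darwin, Linux };
enum class Container { COFF, MachO, ELF };

// An explicit environment component wins; otherwise the OS part decides.
constexpr Container selectContainer(Environment Env, OS Os) {
  // Explicit overrides, handled symmetrically for MachO and ELF.
  if (Env == Environment::MachO)
    return Container::MachO;
  if (Env == Environment::ELF)
    return Container::ELF;

  // No override: fall back to the OS part of the triple.
  if (Os == OS::Windows)
    return Container::COFF;
  if (Os == OS::Darwin)
    return Container::MachO;

  // Assumption: every remaining OS defaults to ELF.
  return Container::ELF;
}
```

With this shape, neither MachO nor ELF is a special case of the other: both overrides sit in the same place, and the OS-based default is reached only when no environment is given. For example, a Windows triple with an ELF environment would select ELF, while a plain Windows triple would still select COFF.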
Eli From rafael.espindola at gmail.com Wed Feb 1 07:49:58 2012 From: rafael.espindola at gmail.com (Rafael Espíndola) Date: Wed, 1 Feb 2012 14:49:58 +0100 Subject: [llvm-commits] [llvm] r149163 - /llvm/trunk/lib/CodeGen/RegisterCoalescer.cpp In-Reply-To: <20120128011701.3A52A2A6C12C@llvm.org> References: <20120128011701.3A52A2A6C12C@llvm.org> Message-ID: > Remove code that adds live ranges for dead defs. It seems to be breaking things. In case anyone is interested, I have attached a testcase. Before this patch, running $ llc -mtriple i386-unknown-linux-gnu -o test.s test.ll would crash with ... Assertion failed: (B->end <= Start && "Cannot overlap two LiveRanges with differing ValID's" " (did you def the same reg twice in a MachineInstr?)"), function addRangeFrom, file /Users/espindola/llvm/llvm/lib/CodeGen/LiveInterval.cpp, line 246. Cheers, Rafael -------------- next part -------------- A non-text attachment was scrubbed... Name: test.ll Type: application/octet-stream Size: 3626 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/6dfc8730/attachment.obj From elena.demikhovsky at intel.com Wed Feb 1 08:09:30 2012 From: elena.demikhovsky at intel.com (Demikhovsky, Elena) Date: Wed, 1 Feb 2012 14:09:30 +0000 Subject: [llvm-commits] Optimization for SIGN_EXTEND on AVX - please review Message-ID: Optimization for sign extensions v4i32 -> v4i64 and v8i16 -> v8i32.
- Elena -------------- next part -------------- A non-text attachment was scrubbed... Name: sext.diff Type: application/octet-stream Size: 6050 bytes Desc: sext.diff Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/c026aa05/attachment.obj From nobled at dreamwidth.org Wed Feb 1 08:06:22 2012 From: nobled at dreamwidth.org (Dylan Noblesmith) Date: Wed, 01 Feb 2012 14:06:22 -0000 Subject: [llvm-commits] [llvm] r149498 - in /llvm/trunk: autoconf/configure.ac configure Message-ID: <20120201140622.2E4BD2A6C12C@llvm.org> Author: nobled Date: Wed Feb 1 08:06:21 2012 New Revision: 149498 URL: http://llvm.org/viewvc/llvm-project?rev=149498&view=rev Log: autoconf: generate clang's private config.h header The CMake build already generated one. Follows clang r149497. This brings us one step closer to compiling and configuring clang separately from LLVM using the autoconf build, too. (I lack the right version of autoconf et al. to regen, but it was a simple change, so I just updated configure manually.)
Modified: llvm/trunk/autoconf/configure.ac llvm/trunk/configure Modified: llvm/trunk/autoconf/configure.ac URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/autoconf/configure.ac?rev=149498&r1=149497&r2=149498&view=diff ============================================================================== --- llvm/trunk/autoconf/configure.ac (original) +++ llvm/trunk/autoconf/configure.ac Wed Feb 1 08:06:21 2012 @@ -1586,7 +1586,10 @@ dnl Configure doxygen's configuration file AC_CONFIG_FILES([docs/doxygen.cfg]) + +dnl Configure clang, if present if test -f ${srcdir}/tools/clang/README.txt; then + AC_CONFIG_HEADERS([tools/clang/include/clang/Config/config.h]) AC_CONFIG_FILES([tools/clang/docs/doxygen.cfg]) fi Modified: llvm/trunk/configure URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/configure?rev=149498&r1=149497&r2=149498&view=diff ============================================================================== --- llvm/trunk/configure (original) +++ llvm/trunk/configure Wed Feb 1 08:06:21 2012 @@ -21095,6 +21095,7 @@ ac_config_files="$ac_config_files docs/doxygen.cfg" if test -f ${srcdir}/tools/clang/README.txt; then + ac_config_headers="$ac_config_headers tools/clang/include/clang/Config/config.h" ac_config_files="$ac_config_files tools/clang/docs/doxygen.cfg" fi From grosser at fim.uni-passau.de Wed Feb 1 08:23:29 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Wed, 01 Feb 2012 14:23:29 -0000 Subject: [llvm-commits] [polly] r149501 - /polly/trunk/lib/CodeGeneration.cpp Message-ID: <20120201142329.7022C2A6C12C@llvm.org> Author: grosser Date: Wed Feb 1 08:23:29 2012 New Revision: 149501 URL: http://llvm.org/viewvc/llvm-project?rev=149501&view=rev Log: CodeGeneration: Order includes alphabetically Modified: polly/trunk/lib/CodeGeneration.cpp Modified: polly/trunk/lib/CodeGeneration.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/CodeGeneration.cpp?rev=149501&r1=149500&r2=149501&view=diff 
============================================================================== --- polly/trunk/lib/CodeGeneration.cpp (original) +++ polly/trunk/lib/CodeGeneration.cpp Wed Feb 1 08:23:29 2012 @@ -22,23 +22,23 @@ #define DEBUG_TYPE "polly-codegen" -#include "polly/LinkAllPasses.h" -#include "polly/Support/GICHelper.h" -#include "polly/Support/ScopHelper.h" #include "polly/Cloog.h" #include "polly/CodeGeneration.h" #include "polly/Dependences.h" +#include "polly/LinkAllPasses.h" #include "polly/ScopInfo.h" #include "polly/TempScopInfo.h" +#include "polly/Support/GICHelper.h" + +#include "llvm/Module.h" +#include "llvm/ADT/SetVector.h" +#include "llvm/Analysis/LoopInfo.h" +#include "llvm/Analysis/ScalarEvolutionExpander.h" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" #include "llvm/Support/IRBuilder.h" -#include "llvm/Analysis/LoopInfo.h" -#include "llvm/Analysis/ScalarEvolutionExpander.h" -#include "llvm/Transforms/Utils/BasicBlockUtils.h" #include "llvm/Target/TargetData.h" -#include "llvm/Module.h" -#include "llvm/ADT/SetVector.h" +#include "llvm/Transforms/Utils/BasicBlockUtils.h" #define CLOOG_INT_GMP 1 #include "cloog/cloog.h" From grosser at fim.uni-passau.de Wed Feb 1 08:23:37 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Wed, 01 Feb 2012 14:23:37 -0000 Subject: [llvm-commits] [polly] r149503 - /polly/trunk/lib/Analysis/ScopInfo.cpp Message-ID: <20120201142337.327F02A6C12D@llvm.org> Author: grosser Date: Wed Feb 1 08:23:36 2012 New Revision: 149503 URL: http://llvm.org/viewvc/llvm-project?rev=149503&view=rev Log: ScopInfo: Simplify some isl code Modified: polly/trunk/lib/Analysis/ScopInfo.cpp Modified: polly/trunk/lib/Analysis/ScopInfo.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/Analysis/ScopInfo.cpp?rev=149503&r1=149502&r2=149503&view=diff ============================================================================== --- polly/trunk/lib/Analysis/ScopInfo.cpp (original) +++ 
polly/trunk/lib/Analysis/ScopInfo.cpp Wed Feb 1 08:23:36 2012 @@ -383,35 +383,23 @@ // : i0 = o0, i1 = o1, ..., i(X-1) = o(X-1), iX < oX // static isl_map *getEqualAndLarger(isl_space *setDomain) { - isl_space *mapDomain = isl_space_map_from_set(setDomain); - isl_basic_map *bmap = isl_basic_map_universe(isl_space_copy(mapDomain)); - isl_local_space *MapLocalSpace = isl_local_space_from_space(mapDomain); + isl_space *Space = isl_space_map_from_set(setDomain); + isl_map *Map = isl_map_universe(isl_space_copy(Space)); + isl_local_space *MapLocalSpace = isl_local_space_from_space(Space); // Set all but the last dimension to be equal for the input and output // // input[i0, i1, ..., iX] -> output[o0, o1, ..., oX] // : i0 = o0, i1 = o1, ..., i(X-1) = o(X-1) - for (unsigned i = 0; i < isl_basic_map_n_in(bmap) - 1; ++i) { - isl_int v; - isl_int_init(v); - isl_constraint *c = isl_equality_alloc(isl_local_space_copy(MapLocalSpace)); - - isl_int_set_si(v, 1); - isl_constraint_set_coefficient(c, isl_dim_in, i, v); - isl_int_set_si(v, -1); - isl_constraint_set_coefficient(c, isl_dim_out, i, v); - - bmap = isl_basic_map_add_constraint(bmap, c); - - isl_int_clear(v); - } + for (unsigned i = 0; i < isl_map_dim(Map, isl_dim_in) - 1; ++i) + Map = isl_map_equate(Map, isl_dim_in, i, isl_dim_out, i); // Set the last dimension of the input to be strict smaller than the // last dimension of the output. 
// // input[?,?,?,...,iX] -> output[?,?,?,...,oX] : iX < oX // - unsigned lastDimension = isl_basic_map_n_in(bmap) - 1; + unsigned lastDimension = isl_map_dim(Map, isl_dim_in) - 1; isl_int v; isl_int_init(v); isl_constraint *c = isl_inequality_alloc(isl_local_space_copy(MapLocalSpace)); @@ -423,10 +411,10 @@ isl_constraint_set_constant(c, v); isl_int_clear(v); - bmap = isl_basic_map_add_constraint(bmap, c); + Map = isl_map_add_constraint(Map, c); isl_local_space_free(MapLocalSpace); - return isl_map_from_basic_map(bmap); + return Map; } isl_set *MemoryAccess::getStride(__isl_take const isl_set *domainSubset) const { From grosser at fim.uni-passau.de Wed Feb 1 08:23:33 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Wed, 01 Feb 2012 14:23:33 -0000 Subject: [llvm-commits] [polly] r149502 - /polly/trunk/lib/CodeGeneration.cpp Message-ID: <20120201142333.7FF6E2A6C12C@llvm.org> Author: grosser Date: Wed Feb 1 08:23:33 2012 New Revision: 149502 URL: http://llvm.org/viewvc/llvm-project?rev=149502&view=rev Log: CodeGeneration: Rephrase comment slightly Modified: polly/trunk/lib/CodeGeneration.cpp Modified: polly/trunk/lib/CodeGeneration.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/CodeGeneration.cpp?rev=149502&r1=149501&r2=149502&view=diff ============================================================================== --- polly/trunk/lib/CodeGeneration.cpp (original) +++ polly/trunk/lib/CodeGeneration.cpp Wed Feb 1 08:23:33 2012 @@ -1699,14 +1699,14 @@ assert(region->isSimple() && "Only simple regions are supported"); - // In the CFG and we generate next to original code of the Scop the - // optimized version. Both the new and the original version of the code - // remain in the CFG. A branch statement decides which version is executed. - // At the moment, we always execute the newly generated version (the old one - // is dead code eliminated by the cleanup passes). 
Later we may decide to - // execute the new version only under certain conditions. This will be the - // case if we support constructs for which we cannot prove all assumptions - // at compile time. + // In the CFG the optimized code of the SCoP is generated next to the + // original code. Both the new and the original version of the code remain + // in the CFG. A branch statement decides which version is executed. + // For now, we always execute the new version (the old one is dead code + // eliminated by the cleanup passes). In the future we may decide to execute + // the new version only if certain run time checks succeed. This will be + // useful to support constructs for which we cannot prove all assumptions at + // compile time. // // Before transformation: // From geek4civic at gmail.com Wed Feb 1 08:35:30 2012 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Wed, 01 Feb 2012 14:35:30 -0000 Subject: [llvm-commits] [llvm] r149505 - /llvm/trunk/test/CodeGen/X86/avx-minmax.ll Message-ID: <20120201143530.219A92A6C12C@llvm.org> Author: chapuni Date: Wed Feb 1 08:35:29 2012 New Revision: 149505 URL: http://llvm.org/viewvc/llvm-project?rev=149505&view=rev Log: test/CodeGen/X86/avx-minmax.ll: Relax expressions for Win32 targets. YMM arguments are passed as indirect on Win32 x64. 
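To illustrate why the relaxed check is needed, here is a minimal sketch that models the old and new `UNSAFE` check lines with plain C++. It assumes that FileCheck's `{{.+}}` behaves like the regular expression `.+` and that a check line matches anywhere within an output line; the two sample instruction strings in the note below are hypothetical, not captured llc output, and the helper names are made up for this example.

```cpp
#include <regex>
#include <string>

// Old check, modelled as a plain substring search: "vmaxpd %ymm".
// It requires the first operand to be a YMM register.
inline bool matchesOldCheck(const std::string &Line) {
  return Line.find("vmaxpd %ymm") != std::string::npos;
}

// New check, "vmaxpd {{.+}}, %ymm", with {{.+}} modelled as the regex ".+".
// It accepts any first operand as long as a YMM register follows.
inline bool matchesNewCheck(const std::string &Line) {
  static const std::regex Pattern("vmaxpd .+, %ymm");
  return std::regex_search(Line, Pattern);
}
```

Under these assumptions, a register form such as `vmaxpd %ymm1, %ymm0, %ymm0` satisfies both checks, while a memory-operand form such as `vmaxpd (%rdx), %ymm0, %ymm0`, which is the kind of output expected when a YMM argument is passed indirectly, satisfies only the relaxed one.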
Modified: llvm/trunk/test/CodeGen/X86/avx-minmax.ll Modified: llvm/trunk/test/CodeGen/X86/avx-minmax.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-minmax.ll?rev=149505&r1=149504&r2=149505&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/avx-minmax.ll (original) +++ llvm/trunk/test/CodeGen/X86/avx-minmax.ll Wed Feb 1 08:35:29 2012 @@ -33,7 +33,7 @@ } ; UNSAFE: vmaxpd: -; UNSAFE: vmaxpd %ymm +; UNSAFE: vmaxpd {{.+}}, %ymm define <4 x double> @vmaxpd(<4 x double> %x, <4 x double> %y) { %max_is_x = fcmp oge <4 x double> %x, %y %max = select <4 x i1> %max_is_x, <4 x double> %x, <4 x double> %y @@ -41,7 +41,7 @@ } ; UNSAFE: vminpd: -; UNSAFE: vminpd %ymm +; UNSAFE: vminpd {{.+}}, %ymm define <4 x double> @vminpd(<4 x double> %x, <4 x double> %y) { %min_is_x = fcmp ole <4 x double> %x, %y %min = select <4 x i1> %min_is_x, <4 x double> %x, <4 x double> %y @@ -49,7 +49,7 @@ } ; UNSAFE: vmaxps: -; UNSAFE: vmaxps %ymm +; UNSAFE: vmaxps {{.+}}, %ymm define <8 x float> @vmaxps(<8 x float> %x, <8 x float> %y) { %max_is_x = fcmp oge <8 x float> %x, %y %max = select <8 x i1> %max_is_x, <8 x float> %x, <8 x float> %y @@ -57,7 +57,7 @@ } ; UNSAFE: vminps: -; UNSAFE: vminps %ymm +; UNSAFE: vminps {{.+}}, %ymm define <8 x float> @vminps(<8 x float> %x, <8 x float> %y) { %min_is_x = fcmp ole <8 x float> %x, %y %min = select <8 x i1> %min_is_x, <8 x float> %x, <8 x float> %y From nobled at dreamwidth.org Wed Feb 1 08:49:40 2012 From: nobled at dreamwidth.org (Dylan Noblesmith) Date: Wed, 01 Feb 2012 14:49:40 -0000 Subject: [llvm-commits] [llvm] r149506 - /llvm/trunk/CMakeLists.txt Message-ID: <20120201144940.1A6282A6C12C@llvm.org> Author: nobled Date: Wed Feb 1 08:49:39 2012 New Revision: 149506 URL: http://llvm.org/viewvc/llvm-project?rev=149506&view=rev Log: cmake: make BUILD_SHARED_LIBS a visible option It could only be specified on the commandline, and wouldn't show up as an option 
in the GUI or when invoked via `cmake -i` at all. This also tells CMake that it's a BOOL, rather than "UNINITIALIZED". Modified: llvm/trunk/CMakeLists.txt Modified: llvm/trunk/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/CMakeLists.txt?rev=149506&r1=149505&r2=149506&view=diff ============================================================================== --- llvm/trunk/CMakeLists.txt (original) +++ llvm/trunk/CMakeLists.txt Wed Feb 1 08:49:39 2012 @@ -94,6 +94,9 @@ CACHE STRING "Semicolon-separated list of targets to build, or \"all\".") endif( MSVC ) +option(BUILD_SHARED_LIBS + "Build all libraries as shared libraries instead of static" OFF) + option(LLVM_ENABLE_CBE_PRINTF_A "Set to ON if CBE is enabled for printf %a output" ON) if(LLVM_ENABLE_CBE_PRINTF_A) set(ENABLE_CBE_PRINTF_A 1) From baldrick at free.fr Wed Feb 1 08:49:44 2012 From: baldrick at free.fr (Duncan Sands) Date: Wed, 01 Feb 2012 14:49:44 -0000 Subject: [llvm-commits] [dragonegg] r149507 - in /dragonegg/trunk/src: Convert.cpp DefaultABI.cpp Types.cpp Message-ID: <20120201144944.F36D82A6C12D@llvm.org> Author: baldrick Date: Wed Feb 1 08:49:44 2012 New Revision: 149507 URL: http://llvm.org/viewvc/llvm-project?rev=149507&view=rev Log: Avoid compiler warnings about unused parameters when compiling for ARM. Modified: dragonegg/trunk/src/Convert.cpp dragonegg/trunk/src/DefaultABI.cpp dragonegg/trunk/src/Types.cpp Modified: dragonegg/trunk/src/Convert.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/src/Convert.cpp?rev=149507&r1=149506&r2=149507&view=diff ============================================================================== --- dragonegg/trunk/src/Convert.cpp (original) +++ dragonegg/trunk/src/Convert.cpp Wed Feb 1 08:49:44 2012 @@ -468,7 +468,9 @@ // passed in memory byval. 
static bool isPassedByVal(tree type, Type *Ty, std::vector &ScalarArgs, - bool isShadowRet, CallingConv::ID &/*CC*/) { + bool isShadowRet, CallingConv::ID CC) { + (void)type; (void)Ty; (void)ScalarArgs; (void)isShadowRet; + (void)CC; // Not used by all ABI macros. if (LLVM_SHOULD_PASS_AGGREGATE_USING_BYVAL_ATTR(type, Ty)) return true; @@ -3572,6 +3574,9 @@ return LLVM_TARGET_INTRINSIC_LOWER(stmt, fndecl, DestLoc, Result, ResultType, Operands); +#else + // Avoid compiler warnings about unused parameters. + (void)stmt; (void)fndecl; (void)DestLoc; (void)Result; #endif return false; } Modified: dragonegg/trunk/src/DefaultABI.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/src/DefaultABI.cpp?rev=149507&r1=149506&r2=149507&view=diff ============================================================================== --- dragonegg/trunk/src/DefaultABI.cpp (original) +++ dragonegg/trunk/src/DefaultABI.cpp Wed Feb 1 08:49:44 2012 @@ -39,7 +39,8 @@ // doNotUseShadowReturn - Return true if the specified GCC type // should not be returned using a pointer to struct parameter. -bool doNotUseShadowReturn(tree type, tree fndecl, CallingConv::ID /*CC*/) { +bool doNotUseShadowReturn(tree type, tree fndecl, CallingConv::ID CC) { + (void)CC; // Not used by all ABI macros. if (!TYPE_SIZE(type)) return false; if (TREE_CODE(TYPE_SIZE(type)) != INTEGER_CST) @@ -128,6 +129,7 @@ /// on the client that indicate how its pieces should be handled. This /// handles things like returning structures via hidden parameters. void DefaultABI::HandleReturnType(tree type, tree fn, bool isBuiltin) { + (void)isBuiltin; // Not used by all ABI macros. 
unsigned Offset = 0; Type *Ty = ConvertType(type); if (Ty->isVectorTy()) { @@ -252,7 +254,6 @@ for (tree Field = TYPE_FIELDS(type); Field; Field = TREE_CHAIN(Field)) if (TREE_CODE(Field) == FIELD_DECL) { const tree Ftype = TREE_TYPE(Field); - Type *FTy = ConvertType(Ftype); unsigned FNo = GetFieldIndex(Field, Ty); assert(FNo < INT_MAX && "Case not handled yet!"); @@ -262,6 +263,8 @@ // (We know there currently are no other such cases active because // they would hit the assert in FunctionPrologArgumentConversion:: // HandleByValArgument.) + Type *FTy = ConvertType(Ftype); + (void)FTy; // Not used by all ABI macros. if (!LLVM_SHOULD_PASS_AGGREGATE_USING_BYVAL_ATTR(Ftype, FTy)) { C.EnterField(FNo, Ty); HandleArgument(TREE_TYPE(Field), ScalarElts); Modified: dragonegg/trunk/src/Types.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/src/Types.cpp?rev=149507&r1=149506&r2=149507&view=diff ============================================================================== --- dragonegg/trunk/src/Types.cpp (original) +++ dragonegg/trunk/src/Types.cpp Wed Feb 1 08:49:44 2012 @@ -799,11 +799,11 @@ Attribute::Nest)); } +#ifdef LLVM_TARGET_ENABLE_REGPARM // If the target has regparam parameters, allow it to inspect the function // type. int local_regparam = 0; int local_fp_regparam = 0; -#ifdef LLVM_TARGET_ENABLE_REGPARM LLVM_TARGET_INIT_REGPARM(local_regparam, local_fp_regparam, type); #endif // LLVM_TARGET_ENABLE_REGPARM From baldrick at free.fr Wed Feb 1 08:50:58 2012 From: baldrick at free.fr (Duncan Sands) Date: Wed, 01 Feb 2012 14:50:58 -0000 Subject: [llvm-commits] [dragonegg] r149508 - in /dragonegg/trunk: include/arm/ include/arm/dragonegg/ include/arm/dragonegg/Target.h src/arm/ src/arm/Target.cpp Message-ID: <20120201145058.B5E752A6C12C@llvm.org> Author: baldrick Date: Wed Feb 1 08:50:58 2012 New Revision: 149508 URL: http://llvm.org/viewvc/llvm-project?rev=149508&view=rev Log: Support for ARM. 
Based on a patch by Jin Gu Kang that ports the logic from llvm-gcc-4.2. Added: dragonegg/trunk/include/arm/ dragonegg/trunk/include/arm/dragonegg/ dragonegg/trunk/include/arm/dragonegg/Target.h dragonegg/trunk/src/arm/ dragonegg/trunk/src/arm/Target.cpp Added: dragonegg/trunk/include/arm/dragonegg/Target.h URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/include/arm/dragonegg/Target.h?rev=149508&view=auto ============================================================================== --- dragonegg/trunk/include/arm/dragonegg/Target.h (added) +++ dragonegg/trunk/include/arm/dragonegg/Target.h Wed Feb 1 08:50:58 2012 @@ -0,0 +1,281 @@ +//==----- Target.h - Target hooks for GCC to LLVM conversion -----*- C++ -*-==// +// +// Copyright (C) 2007 to 2012 Jin Gu Kang, Anton Korobeynikov, Duncan Sands +// et al. +// +// This file is part of DragonEgg. +// +// DragonEgg is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free Software +// Foundation; either version 2, or (at your option) any later version. +// +// DragonEgg is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR +// A PARTICULAR PURPOSE. See the GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License along with +// DragonEgg; see the file COPYING. If not, write to the Free Software +// Foundation, 51 Franklin Street, Suite 500, Boston, MA 02110-1335, USA. +// +//===----------------------------------------------------------------------===// +// This file declares some target-specific hooks for GCC to LLVM conversion. +// It was derived from llvm-arm-target.h and arm.h on llvm-gcc.4.2. 
+//===----------------------------------------------------------------------===// + +#ifndef DRAGONEGG_TARGET_H +#define DRAGONEGG_TARGET_H + +/* LLVM specific code to select the calling conventions. The AAPCS + specification says that varargs functions must use the base standard + instead of the VFP hard float variant. We check for that with + (isVoid || hasArgList). */ + +/* from TARGET_AAPCS_BASED */ +#define DEFAULT_TARGET_AAPCS_BASED \ + (ARM_DEFAULT_ABI != ARM_ABI_APCS && ARM_DEFAULT_ABI != ARM_ABI_ATPCS) + +#define TARGET_ADJUST_LLVM_CC(CC, type) \ + { \ + if (TARGET_AAPCS_BASED) { \ + if (TARGET_VFP && TARGET_HARD_FLOAT_ABI && \ + ((TYPE_ARG_TYPES(type) == 0) || \ + (TREE_VALUE(tree_last(TYPE_ARG_TYPES(type))) == \ + void_type_node))) \ + CC = CallingConv::ARM_AAPCS_VFP; \ + else if (!DEFAULT_TARGET_AAPCS_BASED) \ + CC = CallingConv::ARM_AAPCS; \ + } else if (DEFAULT_TARGET_AAPCS_BASED) { \ + CC = CallingConv::ARM_APCS; \ + } \ + } + +#ifdef DRAGONEGG_ABI_H + +extern bool +llvm_arm_should_pass_aggregate_in_mixed_regs(tree_node *, Type *Ty, + CallingConv::ID, + std::vector&); + +#define LLVM_SHOULD_PASS_AGGREGATE_IN_MIXED_REGS(T, TY, CC, E) \ + llvm_arm_should_pass_aggregate_in_mixed_regs((T), (TY), (CC), (E)) + +struct DefaultABIClient; +extern bool +llvm_arm_try_pass_aggregate_custom(tree_node *, std::vector&, + CallingConv::ID CC, + struct DefaultABIClient*); + +#define LLVM_TRY_PASS_AGGREGATE_CUSTOM(T, E, CC, C) \ + llvm_arm_try_pass_aggregate_custom((T), (E), (CC), (C)) + +extern +bool llvm_arm_aggregate_partially_passed_in_regs(std::vector&, + std::vector&, + CallingConv::ID CC); + +#define LLVM_AGGREGATE_PARTIALLY_PASSED_IN_REGS(E, SE, ISR, CC) \ + llvm_arm_aggregate_partially_passed_in_regs((E), (SE), (CC)) + +extern Type *llvm_arm_aggr_type_for_struct_return(tree_node *type, + CallingConv::ID CC); + +/* LLVM_AGGR_TYPE_FOR_STRUCT_RETURN - Return LLVM Type if X can be + returned as an aggregate, otherwise return NULL. 
*/ +#define LLVM_AGGR_TYPE_FOR_STRUCT_RETURN(X, CC) \ + llvm_arm_aggr_type_for_struct_return((X), (CC)) + +extern void llvm_arm_extract_multiple_return_value(Value *Src, Value *Dest, + bool isVolatile, + LLVMBuilder &B); + +/* LLVM_EXTRACT_MULTIPLE_RETURN_VALUE - Extract multiple return value from + SRC and assign it to DEST. */ +#define LLVM_EXTRACT_MULTIPLE_RETURN_VALUE(Src,Dest,V,B) \ + llvm_arm_extract_multiple_return_value((Src),(Dest),(V),(B)) + +extern +bool llvm_arm_should_pass_or_return_aggregate_in_regs(tree_node *TreeType, + CallingConv::ID CC); + +/* LLVM_SHOULD_NOT_USE_SHADOW_RETURN = Return true is the given type should + not be returned via a shadow parameter with the given calling conventions. */ +#define LLVM_SHOULD_NOT_USE_SHADOW_RETURN(X, CC) \ + llvm_arm_should_pass_or_return_aggregate_in_regs((X), (CC)) + +/* Vectors bigger than 128 are returned using sret. */ +#define LLVM_SHOULD_RETURN_VECTOR_AS_SHADOW(X, isBuiltin) \ + (TREE_INT_CST_LOW(TYPE_SIZE(X)) > 128) + +#endif /* DRAGONEGG_ABI_H */ + +#define LLVM_TARGET_INTRINSIC_PREFIX "arm" + +/* LLVM_TARGET_NAME - This specifies the name of the target, which correlates to + * the llvm::InitializeXXXTarget() function. + */ +#define LLVM_TARGET_NAME ARM + + +/* Turn -march=xx into a CPU type. 
+ */ +#define LLVM_SET_SUBTARGET_FEATURES(C, F) \ + { switch (arm_tune) { \ + case arm8: C = ("arm8"); break;\ + case arm810: C = ("arm810"); break;\ + case strongarm: C = ("strongarm"); break;\ + case strongarm110: C = ("strongarm110"); break;\ + case strongarm1100: C = ("strongarm1100"); break;\ + case strongarm1110: C = ("strongarm1110"); break;\ + case arm7tdmi: C = ("arm7tdmi"); break;\ + case arm7tdmis: C = ("arm7tdmi-s"); break;\ + case arm710t: C = ("arm710t"); break;\ + case arm720t: C = ("arm720t"); break;\ + case arm740t: C = ("arm740t"); break;\ + case arm9: C = ("arm9"); break;\ + case arm9tdmi: C = ("arm9tdmi"); break;\ + case arm920: C = ("arm920"); break;\ + case arm920t: C = ("arm920t"); break;\ + case arm922t: C = ("arm922t"); break;\ + case arm940t: C = ("arm940t"); break;\ + case ep9312: C = ("ep9312"); break;\ + case arm10tdmi: C = ("arm10tdmi"); break;\ + case arm1020t: C = ("arm1020t"); break;\ + case arm9e: C = ("arm9e"); break;\ + case arm946es: C = ("arm946e-s"); break;\ + case arm966es: C = ("arm966e-s"); break;\ + case arm968es: C = ("arm968e-s"); break;\ + case arm10e: C = ("arm10e"); break;\ + case arm1020e: C = ("arm1020e"); break;\ + case arm1022e: C = ("arm1022e"); break;\ + case xscale: C = ("xscale"); break;\ + case iwmmxt: C = ("iwmmxt"); break;\ + case arm926ejs: C = ("arm926ej-s"); break;\ + case arm1026ejs: C = ("arm1026ej-s"); break;\ + case arm1136js: C = ("arm1136j-s"); break;\ + case arm1136jfs: C = ("arm1136jf-s"); break;\ + case arm1176jzs: C = ("arm1176jz-s"); break;\ + case arm1176jzfs: C = ("arm1176jzf-s"); break;\ + case mpcorenovfp: C = ("mpcorenovfp"); break;\ + case mpcore: C = ("mpcore"); break;\ + case arm1156t2s: C = ("arm1156t2-s"); break; \ + case arm1156t2fs: C = ("arm1156t2f-s"); break; \ + case cortexa8: C = ("cortex-a8"); break; \ + case cortexa9: C = ("cortex-a9"); break; \ + case cortexr4: C = ("cortex-r4"); break; \ + case cortexm3: C = ("cortex-m3"); break; \ + case cortexm4: C = ("cortex-m4"); break; 
\ + case cortexm0: C = ("cortex-m0"); break; \ + default: \ + C = ("arm7tdmi"); \ + break; \ + } \ + if (TARGET_VFP3) \ + F.AddFeature("vfp3"); \ + else { \ + F.AddFeature("vfp3", false); \ + if (TARGET_VFP && TARGET_HARD_FLOAT) \ + F.AddFeature("vfp2"); \ + else \ + F.AddFeature("vfp2", false); \ + } \ + if (TARGET_NEON) \ + F.AddFeature("neon"); \ + else \ + F.AddFeature("neon", false); \ + if (TARGET_FP16) \ + F.AddFeature("fp16"); \ + else \ + F.AddFeature("fp16", false); \ + } + +/* Encode arm / thumb modes and arm subversion number in the triplet. e.g. + * armv6-apple-darwin, thumbv5-apple-darwin. FIXME: Replace thumb triplets + * with function notes. + */ +#define LLVM_OVERRIDE_TARGET_ARCH() \ + (TARGET_THUMB \ + ? (arm_arch7 \ + ? "thumbv7" \ + : (arm_arch_thumb2 \ + ? "thumbv6t2" \ + : (arm_tune == cortexm0 \ + ? "thumbv6m" \ + : (arm_arch6 \ + ? "thumbv6" \ + : (arm_arch5e \ + ? "thumbv5e" \ + : (arm_arch5 \ + ? "thumbv5" \ + : (arm_arch4t \ + ? "thumbv4t" : ""))))))) \ + : (arm_arch7 \ + ? "armv7" \ + : (arm_arch_thumb2 \ + ? "armv6t2" \ + : (arm_arch6 \ + ? "armv6" \ + : (arm_arch5e \ + ? "armv5e" \ + : (arm_arch5 \ + ? "armv5" \ + : (arm_arch4t \ + ? "armv4t" \ + : (arm_arch4 \ + ? "armv4" : "")))))))) + +#if 0 +// Dragonegg should make flag_mkernel and flag_apple_kext option later on. +// We didn't decide place to make these flags. +#define LLVM_SET_MACHINE_OPTIONS(argvec) \ + if (flag_mkernel || flag_apple_kext) { \ + argvec.push_back("-arm-long-calls"); \ + argvec.push_back("-arm-strict-align"); \ + } +#endif + +#define LLVM_SET_TARGET_MACHINE_OPTIONS(options) \ + options.UseSoftFloat = TARGET_SOFT_FLOAT; \ + if (TARGET_HARD_FLOAT_ABI) \ + options.FloatABIType = llvm::FloatABI::Hard; + + +/* Doing struct copy by partial-word loads and stores is not a good idea on ARM. 
*/ +#define TARGET_LLVM_MIN_BYTES_COPY_BY_MEMCPY 4 + +/* These are a couple of extensions to the asm formats + %@ prints out ASM_COMMENT_START + TODO: %r prints out REGISTER_PREFIX reg_names[arg] */ +#define LLVM_ASM_EXTENSIONS(ESCAPED_CHAR, ASM, RESULT) \ + else if ((ESCAPED_CHAR) == '@') { \ + (RESULT) += ASM_COMMENT_START; \ + } + +/* LLVM_TARGET_INTRINSIC_LOWER - To handle builtins, we want to expand the + invocation into normal LLVM code. If the target can handle the builtin, this + macro should call the target TreeToLLVM::TargetIntrinsicLower method and + return true. This macro is invoked from a method in the TreeToLLVM class. */ +#if 0 +// Because of data dependency, we will implement later on. +#define LLVM_TARGET_INTRINSIC_LOWER(EXP, BUILTIN_CODE, DESTLOC, RESULT, \ + DESTTY, OPS) \ + TargetIntrinsicLower(EXP, BUILTIN_CODE, DESTLOC, RESULT, DESTTY, OPS); +#endif + +/* LLVM_GET_REG_NAME - The registers known to llvm as "r10", "r11", and "r12" + may have different names in GCC. Register "r12" is called "ip", and on + non-Darwin OSs, "r10" is "sl" and "r11" is "fp". Translate those names. + For VFP registers, GCC doesn't distinguish between the q and d registers + so use the incoming register name if it exists. Otherwise, use the default + register names to match the backend. */ +#define LLVM_GET_REG_NAME(REG_NAME, REG_NUM) \ + ((REG_NUM) == 10 ? "r10" \ + : (REG_NUM) == 11 ? "r11" \ + : (REG_NUM) == 12 ? "r12" \ + : (REG_NUM) >= FIRST_VFP_REGNUM && REG_NAME != 0 ? REG_NAME \ + : reg_names[REG_NUM]) + +/* Define a static enumeration of the NEON builtins to be used when + converting to LLVM intrinsics. These names are derived from the + neon_builtin_data table in arm.c and should be kept in sync with that. 
*/ + +#endif /* DRAGONEGG_TARGET_H */ Added: dragonegg/trunk/src/arm/Target.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/src/arm/Target.cpp?rev=149508&view=auto ============================================================================== --- dragonegg/trunk/src/arm/Target.cpp (added) +++ dragonegg/trunk/src/arm/Target.cpp Wed Feb 1 08:50:58 2012 @@ -0,0 +1,623 @@ +//===---------------- Target.cpp - Implements the ARM ABI. ----------------===// +// +// Copyright (C) 2005 to 2012 Evan Cheng, Jin Gu Kang, Duncan Sands et al. +// +// This file is part of DragonEgg. +// +// DragonEgg is free software; you can redistribute it and/or modify it under +// the terms of the GNU General Public License as published by the Free Software +// Foundation; either version 2, or (at your option) any later version. +// +// DragonEgg is distributed in the hope that it will be useful, but WITHOUT ANY +// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR +// A PARTICULAR PURPOSE. See the GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License along with +// DragonEgg; see the file COPYING. If not, write to the Free Software +// Foundation, 51 Franklin Street, Suite 500, Boston, MA 02110-1335, USA. +// +//===----------------------------------------------------------------------===// +// This file implements specific LLVM ARM ABI. +// It was derived from llvm-arm.cpp on llvm-gcc.4.2. +//===----------------------------------------------------------------------===// + +// Plugin headers +#include "dragonegg/ABI.h" +#include "dragonegg/Target.h" + +// LLVM headers +#include "llvm/Module.h" + +// System headers +#include + +// GCC headers +extern "C" { +#include "config.h" +// Stop GCC declaring 'getopt' as it can clash with the system's declaration. 
+#undef HAVE_DECL_GETOPT +#include "system.h" +#include "coretypes.h" +#include "target.h" +#include "tree.h" + +#include "diagnostic.h" +#include "gimple.h" +#include "toplev.h" +} + +static LLVMContext &Context = getGlobalContext(); + +// "Fundamental Data Types" according to the AAPCS spec. These are used +// to check that a given aggregate meets the criteria for a "homogeneous +// aggregate." +enum arm_fdts { + ARM_FDT_INVALID, + + ARM_FDT_HALF_FLOAT, + ARM_FDT_FLOAT, + ARM_FDT_DOUBLE, + + ARM_FDT_VECTOR_64, + ARM_FDT_VECTOR_128, + + ARM_FDT_MAX +}; + +// Classify type according to the number of fundamental data types contained +// among its members. Returns true if type is a homogeneous aggregate. +static bool +vfp_arg_homogeneous_aggregate_p(enum machine_mode mode, tree type, + int *fdt_counts) +{ + bool result = false; + HOST_WIDE_INT bytes = + (mode == BLKmode) ? int_size_in_bytes (type) : (int) GET_MODE_SIZE (mode); + + if (type && AGGREGATE_TYPE_P (type)) + { + int i; + int cnt = 0; + tree field; + + // Zero sized arrays or structures are not homogeneous aggregates. + if (!bytes) + return 0; + + // Classify each field of records. + switch (TREE_CODE (type)) + { + case RECORD_TYPE: + // For classes first merge in the field of the subclasses. + if (TYPE_BINFO (type)) { + tree binfo, base_binfo; + int basenum; + + for (binfo = TYPE_BINFO (type), basenum = 0; + BINFO_BASE_ITERATE (binfo, basenum, base_binfo); basenum++) { + tree type = BINFO_TYPE (base_binfo); + + result = vfp_arg_homogeneous_aggregate_p(TYPE_MODE (type), type, + fdt_counts); + if (!result) + return false; + } + } + // And now merge the fields of structure. 
+ for (field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field)) { + if (TREE_CODE (field) == FIELD_DECL) { + if (TREE_TYPE (field) == error_mark_node) + continue; + + result = vfp_arg_homogeneous_aggregate_p(TYPE_MODE(TREE_TYPE(field)), + TREE_TYPE(field), + fdt_counts); + if (!result) + return false; + } + } + break; + + case ARRAY_TYPE: + // Arrays are handled as small records. + { + int array_fdt_counts[ARM_FDT_MAX] = { 0 }; + + result = vfp_arg_homogeneous_aggregate_p(TYPE_MODE(TREE_TYPE(type)), + TREE_TYPE(type), + array_fdt_counts); + + cnt = bytes / int_size_in_bytes(TREE_TYPE(type)); + for (i = 0; i < ARM_FDT_MAX; ++i) + fdt_counts[i] += array_fdt_counts[i] * cnt; + + if (!result) + return false; + } + break; + + case UNION_TYPE: + case QUAL_UNION_TYPE: + { + // Unions are similar to RECORD_TYPE. + int union_fdt_counts[ARM_FDT_MAX] = { 0 }; + + // Unions are not derived. + gcc_assert (!TYPE_BINFO (type) + || !BINFO_N_BASE_BINFOS (TYPE_BINFO (type))); + for (field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field)) { + int union_field_fdt_counts[ARM_FDT_MAX] = { 0 }; + + if (TREE_CODE (field) == FIELD_DECL) { + if (TREE_TYPE (field) == error_mark_node) + continue; + + result = vfp_arg_homogeneous_aggregate_p( + TYPE_MODE(TREE_TYPE(field)), + TREE_TYPE(field), + union_field_fdt_counts); + if (!result) + return false; + + // track largest union field + for (i = 0; i < ARM_FDT_MAX; ++i) { + if (union_field_fdt_counts[i] > 4) // bail early if we can + return false; + + union_fdt_counts[i] = MAX(union_fdt_counts[i], + union_field_fdt_counts[i]); + union_field_fdt_counts[i] = 0; // clear it out for next iter + } + } + } + + // check for only one type across all union fields + cnt = 0; + for (i = 0; i < ARM_FDT_MAX; ++i) { + if (union_fdt_counts[i]) + ++cnt; + + if (cnt > 1) + return false; + + fdt_counts[i] += union_fdt_counts[i]; + } + } + break; + + default: + assert(0 && "What type is this?"); + } + + // Walk through fdt_counts. 
This is a homogeneous aggregate if + // only one FDT is used. + cnt = 0; + for (i = 0; i < ARM_FDT_MAX; ++i) { + if (fdt_counts[i]) { + // Make sure that one FDT is 4 or less elements in size. + if (fdt_counts[i] > 4) + return false; + ++cnt; + } + + if (cnt > 1) + return false; + } + + if (cnt == 0) + return false; + + return true; + } + + if (type) + { + int idx = 0; + int cnt = 0; + + switch (TREE_CODE(type)) + { + case REAL_TYPE: + idx = (TYPE_PRECISION(type) == 32) ? + ARM_FDT_FLOAT : + ((TYPE_PRECISION(type) == 64) ? + ARM_FDT_DOUBLE : + ARM_FDT_INVALID); + cnt = 1; + break; + + case COMPLEX_TYPE: + { + tree subtype = TREE_TYPE(type); + idx = (TYPE_PRECISION(subtype) == 32) ? + ARM_FDT_FLOAT : + ((TYPE_PRECISION(subtype) == 64) ? + ARM_FDT_DOUBLE : + ARM_FDT_INVALID); + cnt = 2; + } + break; + + case VECTOR_TYPE: + idx = (bytes == 8) ? + ARM_FDT_VECTOR_64 : + (bytes == 16) ? + ARM_FDT_VECTOR_128 : + ARM_FDT_INVALID; + cnt = 1; + break; + + case INTEGER_TYPE: + case POINTER_TYPE: + case ENUMERAL_TYPE: + case BOOLEAN_TYPE: + case REFERENCE_TYPE: + case FUNCTION_TYPE: + case METHOD_TYPE: + default: + return false; // All of these disqualify. + } + + fdt_counts[idx] += cnt; + return true; + } + else + assert(0 && "what type was this?"); + + return false; +} + +// Walk over an LLVM Type that we know is a homogeneous aggregate and +// push the proper LLVM Types that represent the register types to pass +// that struct member in. 
+static void push_elts(Type *Ty, std::vector<Type*> &Elts) +{ + for (Type::subtype_iterator I = Ty->subtype_begin(), E = Ty->subtype_end(); + I != E; ++I) { + Type *STy = *I; + if (const VectorType *VTy = dyn_cast<VectorType>(STy)) { + switch (VTy->getBitWidth()) + { + case 64: // v2f32 + Elts.push_back(VectorType::get(Type::getFloatTy(Context), 2)); + break; + case 128: // v2f64 + Elts.push_back(VectorType::get(Type::getDoubleTy(Context), 2)); + break; + default: + assert (0 && "invalid vector type"); + } + } else if (ArrayType *ATy = dyn_cast<ArrayType>(STy)) { + Type *ETy = ATy->getElementType(); + + for (uint64_t i = ATy->getNumElements(); i > 0; --i) + Elts.push_back(ETy); + } else if (STy->getNumContainedTypes()) + push_elts(STy, Elts); + else + Elts.push_back(STy); + } +} + +static unsigned count_num_words(std::vector<Type*> &ScalarElts) { + unsigned NumWords = 0; + for (unsigned i = 0, e = ScalarElts.size(); i != e; ++i) { + Type *Ty = ScalarElts[i]; + if (Ty->isPointerTy()) { + NumWords++; + } else if (Ty->isIntegerTy()) { + const unsigned TypeSize = Ty->getPrimitiveSizeInBits(); + const unsigned NumWordsForType = (TypeSize + 31) / 32; + + NumWords += NumWordsForType; + } else { + assert (0 && "Unexpected type."); + } + } + return NumWords; +} + +// This function is used only on AAPCS. The difference from the generic +// handling of arguments is that arguments larger than 32 bits are split +// and padding arguments are added as necessary for alignment. This makes +// the IL a bit more explicit about how arguments are handled.
+extern bool +llvm_arm_try_pass_aggregate_custom(tree type, + std::vector<Type*>& ScalarElts, + CallingConv::ID CC, + struct DefaultABIClient* C) { + if (CC != CallingConv::ARM_AAPCS && CC != CallingConv::C) + return false; + + if (CC == CallingConv::C && !TARGET_AAPCS_BASED) + return false; + + if (TARGET_HARD_FLOAT_ABI) + return false; + Type *Ty = ConvertType(type); + if (Ty->isPointerTy()) + return false; + + const unsigned Size = TREE_INT_CST_LOW(TYPE_SIZE(type))/8; + const unsigned Alignment = TYPE_ALIGN(type)/8; + const unsigned NumWords = count_num_words(ScalarElts); + const bool AddPad = Alignment >= 8 && (NumWords % 2); + + // First, build a type that will be bitcast to the original one and + // from where elements will be extracted. + std::vector<Type*> Elts; + Type* Int32Ty = Type::getInt32Ty(getGlobalContext()); + const unsigned NumRegularArgs = Size / 4; + for (unsigned i = 0; i < NumRegularArgs; ++i) { + Elts.push_back(Int32Ty); + } + const unsigned RestSize = Size % 4; + llvm::Type *RestType = NULL; + if (RestSize> 2) { + RestType = Type::getInt32Ty(getGlobalContext()); + } else if (RestSize > 1) { + RestType = Type::getInt16Ty(getGlobalContext()); + } else if (RestSize > 0) { + RestType = Type::getInt8Ty(getGlobalContext()); + } + if (RestType) + Elts.push_back(RestType); + StructType *STy = StructType::get(getGlobalContext(), Elts, false); + + if (AddPad) { + ScalarElts.push_back(Int32Ty); + C->HandlePad(Int32Ty); + } + + for (unsigned i = 0; i < NumRegularArgs; ++i) { + C->EnterField(i, STy); + C->HandleScalarArgument(Int32Ty, 0); + ScalarElts.push_back(Int32Ty); + C->ExitField(); + } + if (RestType) { + C->EnterField(NumRegularArgs, STy); + C->HandleScalarArgument(RestType, 0, RestSize); + ScalarElts.push_back(RestType); + C->ExitField(); + } + return true; +} + +// Target hook for llvm-abi.h. It returns true if an aggregate of the +// specified type should be passed in a number of registers of mixed types.
+// It also returns a vector of types that correspond to the registers used +// for parameter passing. This only applies to AAPCS-VFP "homogeneous +// aggregates" as specified in 4.3.5 of the AAPCS spec. +bool +llvm_arm_should_pass_aggregate_in_mixed_regs(tree TreeType, Type *Ty, + CallingConv::ID CC, + std::vector<Type*> &Elts) { + if (!llvm_arm_should_pass_or_return_aggregate_in_regs(TreeType, CC)) + return false; + + // Walk Ty and push LLVM types corresponding to register types onto + // Elts. + push_elts(Ty, Elts); + + return true; +} + +static bool alloc_next_spr(bool *SPRs) +{ + for (int i = 0; i < 16; ++i) + if (!SPRs[i]) { + SPRs[i] = true; + return true; + } + return false; +} + +static bool alloc_next_dpr(bool *SPRs) +{ + for (int i = 0; i < 16; i += 2) + if (!SPRs[i]) { + SPRs[i] = SPRs[i+1] = true; + return true; + } + return false; +} + +static bool alloc_next_qpr(bool *SPRs) { + for (int i = 0; i < 16; i += 4) + if (!SPRs[i]) { + SPRs[i] = SPRs[i+1] = SPRs[i+2] = SPRs[i+3] = true; + return true; + } + return false; +} + +// count_num_registers_uses - Simulate argument passing reg allocation in SPRs. +// Caller is expected to zero out SPRs. Returns true if all of ScalarElts fit +// in registers. +static bool count_num_registers_uses(std::vector<Type*> &ScalarElts, + bool *SPRs) { + for (unsigned i = 0, e = ScalarElts.size(); i != e; ++i) { + Type *Ty = ScalarElts[i]; + if (const VectorType *VTy = dyn_cast<VectorType>(Ty)) { + switch (VTy->getBitWidth()) + { + case 64: + if (!alloc_next_dpr(SPRs)) + return false; + break; + case 128: + if (!alloc_next_qpr(SPRs)) + return false; + break; + default: + assert(0); + } + } else if (Ty->isIntegerTy() || Ty->isPointerTy() || + Ty==Type::getVoidTy(Context)) { + ; + } else { + // Floating point scalar argument.
+ assert(Ty->isFloatingPointTy() && Ty->isPrimitiveType() && + "Expecting a floating point primitive type!"); + switch (Ty->getTypeID()) + { + case Type::FloatTyID: + if (!alloc_next_spr(SPRs)) + return false; + break; + case Type::DoubleTyID: + if (!alloc_next_spr(SPRs)) + return false; + break; + default: + assert(0); + } + } + } + return true; +} + +// Target hook for llvm-abi.h. This is called when an aggregate is being passed +// in registers. If there are only enough available parameter registers to pass +// part of the aggregate, return true. That means the aggregate should instead +// be passed in memory. +bool +llvm_arm_aggregate_partially_passed_in_regs(std::vector<Type*> &Elts, + std::vector<Type*> &ScalarElts, + CallingConv::ID CC) { + // Homogeneous aggregates are an AAPCS-VFP feature. + if ((CC != CallingConv::ARM_AAPCS_VFP) || + !(TARGET_AAPCS_BASED && TARGET_VFP && TARGET_HARD_FLOAT_ABI)) + return true; + + bool SPRs[16] = { 0 }; // represents S0-S16 + + // Figure out which SPRs are available. + if (!count_num_registers_uses(ScalarElts, SPRs)) + return true; + + if (!count_num_registers_uses(Elts, SPRs)) + return true; + + return false; // it all fit in registers! +} + +// Return LLVM Type if TYPE can be returned as an aggregate, +// otherwise return NULL. +Type *llvm_arm_aggr_type_for_struct_return(tree TreeType, + CallingConv::ID CC) { + if (!llvm_arm_should_pass_or_return_aggregate_in_regs(TreeType, CC)) + return NULL; + + // Walk Ty and push LLVM types corresponding to register types onto + // Elts. + std::vector<Type*> Elts; + Type *Ty = ConvertType(TreeType); + push_elts(Ty, Elts); + + return StructType::get(Context, Elts, false); +} + +// llvm_arm_extract_mrv_array_element - Helper function that helps extract +// an array element from multiple return value. +// +// Here, SRC is returning multiple values. DEST's DESTFIELDNO field is an array. +// Extract SRCFIELDNO's ELEMENO value and store it in DEST's FIELDNO field's +// ELEMENTNO.
+// +static void llvm_arm_extract_mrv_array_element(Value *Src, Value *Dest, + unsigned SrcFieldNo, + unsigned SrcElemNo, + unsigned DestFieldNo, + unsigned DestElemNo, + LLVMBuilder &Builder, + bool isVolatile) { + Value *EVI = Builder.CreateExtractValue(Src, SrcFieldNo, "mrv_gr"); + const StructType *STy = cast<StructType>(Src->getType()); + llvm::Value *Idxs[3]; + Idxs[0] = ConstantInt::get(llvm::Type::getInt32Ty(Context), 0); + Idxs[1] = ConstantInt::get(llvm::Type::getInt32Ty(Context), DestFieldNo); + Idxs[2] = ConstantInt::get(llvm::Type::getInt32Ty(Context), DestElemNo); + Value *GEP = Builder.CreateGEP(Dest, Idxs, "mrv_gep"); + if (STy->getElementType(SrcFieldNo)->isVectorTy()) { + Value *ElemIndex = ConstantInt::get(Type::getInt32Ty(Context), SrcElemNo); + Value *EVIElem = Builder.CreateExtractElement(EVI, ElemIndex, "mrv"); + Builder.CreateStore(EVIElem, GEP, isVolatile); + } else { + Builder.CreateStore(EVI, GEP, isVolatile); + } +} + +// llvm_arm_extract_multiple_return_value - Extract multiple values returned +// by SRC and store them in DEST. It is expected that SRC and +// DEST types are StructType, but they may not match. +void llvm_arm_extract_multiple_return_value(Value *Src, Value *Dest, + bool isVolatile, + LLVMBuilder &Builder) { + const StructType *STy = cast<StructType>(Src->getType()); + unsigned NumElements = STy->getNumElements(); + + const PointerType *PTy = cast<PointerType>(Dest->getType()); + const StructType *DestTy = cast<StructType>(PTy->getElementType()); + + unsigned SNO = 0; + unsigned DNO = 0; + + while (SNO < NumElements) { + + Type *DestElemType = DestTy->getElementType(DNO); + + // Directly access first class values. + if (DestElemType->isSingleValueType()) { + Value *GEP = Builder.CreateStructGEP(Dest, DNO, "mrv_gep"); + Value *EVI = Builder.CreateExtractValue(Src, SNO, "mrv_gr"); + Builder.CreateStore(EVI, GEP, isVolatile); + ++DNO; ++SNO; + continue; + } + + // Access array elements individually. Note, Src and Dest type may + // not match.
For example { <2 x float>, float } and { float[3]; } + const ArrayType *ATy = cast<ArrayType>(DestElemType); + unsigned ArraySize = ATy->getNumElements(); + unsigned DElemNo = 0; // DestTy's DNO field's element number + while (DElemNo < ArraySize) { + unsigned i = 0; + unsigned Size = 1; + + if (const VectorType *SElemTy = + dyn_cast<VectorType>(STy->getElementType(SNO))) { + Size = SElemTy->getNumElements(); + } + while (i < Size) { + llvm_arm_extract_mrv_array_element(Src, Dest, SNO, i++, + DNO, DElemNo++, + Builder, isVolatile); + } + // Consumed this src field. Try next one. + ++SNO; + } + // Finished building current dest field. + ++DNO; + } +} + +// Target hook for llvm-abi.h for LLVM_SHOULD_NOT_USE_SHADOW_RETURN and is +// also a utility function used for other target hooks in this file. Returns +// true if the aggregate should be passed or returned in registers. +bool llvm_arm_should_pass_or_return_aggregate_in_regs(tree TreeType, + CallingConv::ID CC) { + // Homogeneous aggregates are an AAPCS-VFP feature. + if ((CC != CallingConv::ARM_AAPCS_VFP) || + !(TARGET_AAPCS_BASED && TARGET_VFP && TARGET_HARD_FLOAT_ABI)) + return false; + + // Alas, we can't use LLVM Types to figure this out because we need to + // examine unions closely. We'll have to walk the GCC TreeType. + int fdt_counts[ARM_FDT_MAX] = { 0 }; + bool result = false; + result = vfp_arg_homogeneous_aggregate_p(TYPE_MODE(TreeType), TreeType, + fdt_counts); + return result && !TREE_ADDRESSABLE(TreeType); +} From baldrick at free.fr Wed Feb 1 09:10:30 2012 From: baldrick at free.fr (Duncan Sands) Date: Wed, 01 Feb 2012 15:10:30 -0000 Subject: [llvm-commits] [dragonegg] r149509 - /dragonegg/trunk/include/arm/dragonegg/Target.h Message-ID: <20120201151030.83A9F2A6C12C@llvm.org> Author: baldrick Date: Wed Feb 1 09:10:30 2012 New Revision: 149509 URL: http://llvm.org/viewvc/llvm-project?rev=149509&view=rev Log: Small ARM cleanups.
Modified: dragonegg/trunk/include/arm/dragonegg/Target.h Modified: dragonegg/trunk/include/arm/dragonegg/Target.h URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/include/arm/dragonegg/Target.h?rev=149509&r1=149508&r2=149509&view=diff ============================================================================== --- dragonegg/trunk/include/arm/dragonegg/Target.h (original) +++ dragonegg/trunk/include/arm/dragonegg/Target.h Wed Feb 1 09:10:30 2012 @@ -233,15 +233,14 @@ } #endif -#define LLVM_SET_TARGET_MACHINE_OPTIONS(options) \ - options.UseSoftFloat = TARGET_SOFT_FLOAT; \ - if (TARGET_HARD_FLOAT_ABI) \ - options.FloatABIType = llvm::FloatABI::Hard; +#define LLVM_SET_TARGET_MACHINE_OPTIONS(options) \ + do { \ + options.UseSoftFloat = TARGET_SOFT_FLOAT; \ + if (TARGET_HARD_FLOAT_ABI) \ + options.FloatABIType = llvm::FloatABI::Hard; \ + } while (0) -/* Doing struct copy by partial-word loads and stores is not a good idea on ARM. */ -#define TARGET_LLVM_MIN_BYTES_COPY_BY_MEMCPY 4 - /* These are a couple of extensions to the asm formats %@ prints out ASM_COMMENT_START TODO: %r prints out REGISTER_PREFIX reg_names[arg] */ From baldrick at free.fr Wed Feb 1 09:20:49 2012 From: baldrick at free.fr (Duncan Sands) Date: Wed, 01 Feb 2012 15:20:49 -0000 Subject: [llvm-commits] [dragonegg] r149511 - /dragonegg/trunk/www/index.html Message-ID: <20120201152049.4C68E2A6C12C@llvm.org> Author: baldrick Date: Wed Feb 1 09:20:49 2012 New Revision: 149511 URL: http://llvm.org/viewvc/llvm-project?rev=149511&view=rev Log: Note that dragonegg can be used to produce code for ARM. 
Modified: dragonegg/trunk/www/index.html Modified: dragonegg/trunk/www/index.html URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/www/index.html?rev=149511&r1=149510&r2=149511&view=diff ============================================================================== --- dragonegg/trunk/www/index.html (original) +++ dragonegg/trunk/www/index.html Wed Feb 1 09:20:49 2012 @@ -19,10 +19,10 @@ LLVM project. It works with gcc-4.5 or gcc-4.6, - targets the x86-32 and x86-64 processor families, and has been successfully - used on the Darwin, FreeBSD, KFreeBSD, Linux and OpenBSD platforms. It fully - supports Ada, C, C++ and Fortran. It has partial support for Go, Java, Obj-C - and Obj-C++.

+ can target the x86-32/x86-64 and ARM processor families, and has been + successfully used on the Darwin, FreeBSD, KFreeBSD, Linux and OpenBSD + platforms. It fully supports Ada, C, C++ and Fortran. It has partial support + for Go, Java, Obj-C and Obj-C++.

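The homogeneous-aggregate rule that vfp_arg_homogeneous_aggregate_p applies in the ARM Target.cpp added in r149508 (count the fundamental data types among the members, then require exactly one kind with at most four elements) can be illustrated in isolation. The following is a minimal sketch under stated assumptions, not the DragonEgg code: TypeDesc, Fdt, leaf, integer, record, countFdts and isHomogeneousAggregate are hypothetical names, the type description is a toy stand-in for a GCC tree, and arrays, unions and base classes are left out.

```cpp
#include <array>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the fundamental data types the patch tracks.
enum Fdt { FdtFloat, FdtDouble, FdtVector64, FdtVector128, FdtCount };

// Hypothetical, simplified type description (not a GCC tree).
struct TypeDesc {
  enum Kind { Leaf, Disqualifying, Record } kind;
  Fdt fdt;                       // used by Leaf only
  std::vector<TypeDesc> members; // used by Record only
};

// Small constructors so that examples stay readable.
static TypeDesc leaf(Fdt f) { return TypeDesc{TypeDesc::Leaf, f, {}}; }
static TypeDesc integer() {
  return TypeDesc{TypeDesc::Disqualifying, FdtFloat, {}};
}
static TypeDesc record(std::vector<TypeDesc> m) {
  return TypeDesc{TypeDesc::Record, FdtFloat, std::move(m)};
}

// Accumulate one count per fundamental data type. Returns false as soon as
// a disqualifying member (an integer, a pointer, ...) is found.
static bool countFdts(const TypeDesc &t, std::array<int, FdtCount> &counts) {
  switch (t.kind) {
  case TypeDesc::Leaf:
    ++counts[t.fdt];
    return true;
  case TypeDesc::Disqualifying:
    return false;
  case TypeDesc::Record:
    for (const TypeDesc &m : t.members)
      if (!countFdts(m, counts))
        return false;
    return true;
  }
  return false;
}

// A record qualifies when exactly one fundamental data type is used and it
// occurs between one and four times.
static bool isHomogeneousAggregate(const TypeDesc &t) {
  if (t.kind != TypeDesc::Record)
    return false;
  std::array<int, FdtCount> counts = {};
  if (!countFdts(t, counts))
    return false;
  int kindsUsed = 0;
  for (int c : counts) {
    if (c > 4)
      return false; // more than four elements of one type
    if (c > 0)
      ++kindsUsed;
  }
  return kindsUsed == 1; // an empty record (kindsUsed == 0) does not qualify
}
```

Under these assumptions a record of three floats qualifies, while a record mixing float and double, a record containing an integer, or a record of five doubles does not.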

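The r149509 cleanup earlier in this digest wraps the body of LLVM_SET_TARGET_MACHINE_OPTIONS in do { ... } while (0). A likely motivation (the log only says "Small ARM cleanups") is that a multi-statement macro behaves like a single statement only when wrapped this way. The sketch below shows the difference; Options, SET_OPTIONS_UNSAFE, SET_OPTIONS_SAFE, applyUnsafe and applySafe are hypothetical names invented for the illustration and are not part of DragonEgg or LLVM.

```cpp
#include <cassert>

// Hypothetical stand-in for a target options structure.
struct Options {
  bool UseSoftFloat = false;
  int FloatABIType = 0; // 0 = default, 1 = hard (illustrative values)
};

// Unwrapped form: expands to two separate statements.
#define SET_OPTIONS_UNSAFE(options, soft, hard) \
  options.UseSoftFloat = (soft);                \
  if (hard)                                     \
    options.FloatABIType = 1;

// Wrapped form: expands to exactly one statement.
#define SET_OPTIONS_SAFE(options, soft, hard) \
  do {                                        \
    options.UseSoftFloat = (soft);            \
    if (hard)                                 \
      options.FloatABIType = 1;               \
  } while (0)

// Only the first statement of the unwrapped macro is guarded by the
// enclosing 'if'; the second one always runs.
static Options applyUnsafe(bool enabled) {
  Options o;
  if (enabled)
    SET_OPTIONS_UNSAFE(o, true, true);
  return o;
}

// The wrapped macro is guarded as a whole.
static Options applySafe(bool enabled) {
  Options o;
  if (enabled)
    SET_OPTIONS_SAFE(o, true, true);
  return o;
}
```

With enabled set to false, applyUnsafe still sets FloatABIType to 1, whereas applySafe leaves both fields untouched. The unwrapped form also fails to compile when it is followed by an else branch.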
From ojab at ojab.ru Tue Jan 31 23:48:19 2012 From: ojab at ojab.ru (ojab) Date: Wed, 1 Feb 2012 08:48:19 +0300 Subject: [llvm-commits] Add InitializeNativeTargetDisassembler function Message-ID: Hello, It would be useful to have an InitializeNativeTargetDisassembler() function to initialize the native target disassembler, similar to InitializeNativeTargetAsmPrinter/InitializeNativeTargetAsmParser/etc. The patch is in the attached file. //wbr ojab -------------- next part -------------- A non-text attachment was scrubbed... Name: InitializeNativeTargetDisassembler.patch Type: application/octet-stream Size: 8431 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/ff3c6f66/attachment.obj

From slarin at codeaurora.org Wed Feb 1 10:35:24 2012 From: slarin at codeaurora.org (Sergei Larin) Date: Wed, 1 Feb 2012 10:35:24 -0600 Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review In-Reply-To: References: <07f401cce02b$87f18630$97d49290$@org> Message-ID: <08c101cce0ff$7fa53ec0$7eefbc40$@org> Andrew, I would much rather check this version in now, and later work on the migration. We have numerous outstanding patches we are trying to upstream to Hexagon, and this one is blocking, since it touches both the target-independent and Hexagon sides. I'll happily transform the scheduler later, but I must unblock everyone else on the team now. Thank you for understanding. Sergei -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.
From: Andrew Trick [mailto:atrick at apple.com] Sent: Wednesday, February 01, 2012 1:43 AM To: Sergei Larin Cc: llvm-commits at cs.uiuc.edu Subject: Re: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review

On Jan 31, 2012, at 7:18 AM, Sergei Larin wrote:

From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Sergei Larin Sent: Friday, January 27, 2012 10:47 AM To: llvm-commits at cs.uiuc.edu Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review

Hello everybody,

Attached is an initial patch for a VLIW-specific scheduler framework that utilizes a deterministic finite automaton (DFA). Several key points:

- The scheduler is largely based on the existing framework, but introduces several VLIW-specific concepts. It could be classified as a top-down list scheduler, critical path first, with a DFA used for modeling parallel resources. It also models and tracks register pressure in a way similar to the current RegPressure scheduler. It employs a slightly different way to compute the "cost" function for all SUs in AQ, which allows for somewhat easier balancing of multiple heuristic inputs. The current version does _not_ generate bundles/packets (but models them internally). It could be easily modified to do so, and it is our plan to make it a part of bundle generation in the near future.
- The scheduler is enabled for the Hexagon backend. Compared to any existing scheduler, for this VLIW target this code produces between a 1.9% slowdown and an 11% speedup on our internal test suite. This test set comprises a variety of real-world applications ranging from DSP-specific applications to SPEC. Some DSP kernels (when taken out of context) enjoy up to a 20% speedup when compared to the "default" scheduling mechanism (RegPressure pre-RA + post RA).
The main reason for this kind of corner-case behavior is long chains of independent memory accesses that are conservatively serialized by the default scheduler (and there is no HW scheduler to sort it out at run time).
- This patch is an initial submission with a bare minimum of features, and more heuristics will be added to it later. We prefer to submit it in stages to simplify the review process and improve SW management.
- The patch also contains minor updates to two Hexagon-specific tests in order to compensate for the new order of instructions generated by the Hexagon backend __with scheduler disabled__.
- SVN revision 149130. The LLVM verification test run for the x86 platform detects no additional failures.

Comments and reviews are eagerly anticipated :)

I'm in the process of reviewing this and also reworking the codegen pass configuration to make it easier for targets to plug in scheduling/bundling and other passes. Hopefully you'll see the results of both tomorrow. This is probably fine to check in in the short term, but you could instead move directly to scheduling machineinstrs after coalescing. Then you can actually work on using MachineBundles. Will it work for you to use the SourceListDAGScheduler and run your scheduler/bundler in the MachineScheduler pass? I think this migration will have to come either now or later for you.

-Andy
-------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/8c63d751/attachment.html From evan.cheng at apple.com Wed Feb 1 12:29:46 2012 From: evan.cheng at apple.com (Evan Cheng) Date: Wed, 01 Feb 2012 10:29:46 -0800 Subject: [llvm-commits] [llvm] r149367 - in /llvm/trunk: include/llvm/IntrinsicsX86.td lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/2006-05-11-InstrSched.ll test/CodeGen/X86/avx-intrinsics-x86.ll test/CodeGen/X86/avx2-intrinsics-x86.ll In-Reply-To: References: <20120131065244.AB1222A6C12C@llvm.org> <4F27C3F1.9060505@free.fr> Message-ID: <81F5B899-05F9-4C9D-A461-81777B249C77@apple.com> Can you add logic to bitcode upgrader to handle them? Thanks, Evan On Jan 31, 2012, at 8:42 AM, Craig Topper wrote: > Here's what clang has in its emmintrin.h file > > static __inline__ __m128i __attribute__((__always_inline__, __nodebug__)) > _mm_cmpeq_epi8(__m128i a, __m128i b) > { > return (__m128i)((__v16qi)a == (__v16qi)b); > } > > static __inline__ __m128i __attribute__((__always_inline__, __nodebug__)) > _mm_cmpeq_epi16(__m128i a, __m128i b) > { > return (__m128i)((__v8hi)a == (__v8hi)b); > } > > static __inline__ __m128i __attribute__((__always_inline__, __nodebug__)) > _mm_cmpeq_epi32(__m128i a, __m128i b) > { > return (__m128i)((__v4si)a == (__v4si)b); > } > static __inline__ __m128i __attribute__((__always_inline__, __nodebug__)) > _mm_cmpgt_epi8(__m128i a, __m128i b) > { > return (__m128i)((__v16qi)a > (__v16qi)b); > } > > static __inline__ __m128i __attribute__((__always_inline__, __nodebug__)) > _mm_cmpgt_epi16(__m128i a, __m128i b) > { > return (__m128i)((__v8hi)a > (__v8hi)b); > } > > static __inline__ __m128i __attribute__((__always_inline__, __nodebug__)) > _mm_cmpgt_epi32(__m128i a, __m128i b) > { > return (__m128i)((__v4si)a > (__v4si)b); > } > > On Tue, Jan 31, 2012 at 2:35 AM, Duncan Sands wrote: > Hi Craig, > > > Remove pcmpgt/pcmpeq intrinsics as clang is not using them. > > dragonegg is using them. 
Can the same effect be obtained some other way? > > Ciao, Duncan. > > > > > Modified: > > llvm/trunk/include/llvm/IntrinsicsX86.td > > llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > > llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll > > llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll > > llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll > > > > Modified: llvm/trunk/include/llvm/IntrinsicsX86.td > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IntrinsicsX86.td?rev=149367&r1=149366&r2=149367&view=diff > > ============================================================================== > > --- llvm/trunk/include/llvm/IntrinsicsX86.td (original) > > +++ llvm/trunk/include/llvm/IntrinsicsX86.td Tue Jan 31 00:52:44 2012 > > @@ -452,28 +452,6 @@ > > llvm_i32_ty], [IntrNoMem]>; > > } > > > > -// Integer comparison ops > > -let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". > > - def int_x86_sse2_pcmpeq_b : GCCBuiltin<"__builtin_ia32_pcmpeqb128">, > > - Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, > > - llvm_v16i8_ty], [IntrNoMem, Commutative]>; > > - def int_x86_sse2_pcmpeq_w : GCCBuiltin<"__builtin_ia32_pcmpeqw128">, > > - Intrinsic<[llvm_v8i16_ty], [llvm_v8i16_ty, > > - llvm_v8i16_ty], [IntrNoMem, Commutative]>; > > - def int_x86_sse2_pcmpeq_d : GCCBuiltin<"__builtin_ia32_pcmpeqd128">, > > - Intrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, > > - llvm_v4i32_ty], [IntrNoMem, Commutative]>; > > - def int_x86_sse2_pcmpgt_b : GCCBuiltin<"__builtin_ia32_pcmpgtb128">, > > - Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, > > - llvm_v16i8_ty], [IntrNoMem]>; > > - def int_x86_sse2_pcmpgt_w : GCCBuiltin<"__builtin_ia32_pcmpgtw128">, > > - Intrinsic<[llvm_v8i16_ty], [llvm_v8i16_ty, > > - llvm_v8i16_ty], [IntrNoMem]>; > > - def int_x86_sse2_pcmpgt_d : GCCBuiltin<"__builtin_ia32_pcmpgtd128">, > > - Intrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, > > - llvm_v4i32_ty], [IntrNoMem]>; > > -} > > - > > // Conversion ops > > let TargetPrefix = "x86" in { // All 
intrinsics start with "llvm.x86.". > > def int_x86_sse2_cvtdq2pd : GCCBuiltin<"__builtin_ia32_cvtdq2pd">, > > @@ -792,12 +770,6 @@ > > > > // Vector compare, min, max > > let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". > > - def int_x86_sse41_pcmpeqq : GCCBuiltin<"__builtin_ia32_pcmpeqq">, > > - Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, llvm_v2i64_ty], > > - [IntrNoMem, Commutative]>; > > - def int_x86_sse42_pcmpgtq : GCCBuiltin<"__builtin_ia32_pcmpgtq">, > > - Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, llvm_v2i64_ty], > > - [IntrNoMem]>; > > def int_x86_sse41_pmaxsb : GCCBuiltin<"__builtin_ia32_pmaxsb128">, > > Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, llvm_v16i8_ty], > > [IntrNoMem, Commutative]>; > > @@ -1515,34 +1487,6 @@ > > llvm_i32_ty], [IntrNoMem]>; > > } > > > > -// Integer comparison ops > > -let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". > > - def int_x86_avx2_pcmpeq_b : GCCBuiltin<"__builtin_ia32_pcmpeqb256">, > > - Intrinsic<[llvm_v32i8_ty], [llvm_v32i8_ty, llvm_v32i8_ty], > > - [IntrNoMem, Commutative]>; > > - def int_x86_avx2_pcmpeq_w : GCCBuiltin<"__builtin_ia32_pcmpeqw256">, > > - Intrinsic<[llvm_v16i16_ty], [llvm_v16i16_ty, llvm_v16i16_ty], > > - [IntrNoMem, Commutative]>; > > - def int_x86_avx2_pcmpeq_d : GCCBuiltin<"__builtin_ia32_pcmpeqd256">, > > - Intrinsic<[llvm_v8i32_ty], [llvm_v8i32_ty, llvm_v8i32_ty], > > - [IntrNoMem, Commutative]>; > > - def int_x86_avx2_pcmpeq_q : GCCBuiltin<"__builtin_ia32_pcmpeqq256">, > > - Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v4i64_ty], > > - [IntrNoMem, Commutative]>; > > - def int_x86_avx2_pcmpgt_b : GCCBuiltin<"__builtin_ia32_pcmpgtb256">, > > - Intrinsic<[llvm_v32i8_ty], [llvm_v32i8_ty, llvm_v32i8_ty], > > - [IntrNoMem]>; > > - def int_x86_avx2_pcmpgt_w : GCCBuiltin<"__builtin_ia32_pcmpgtw256">, > > - Intrinsic<[llvm_v16i16_ty], [llvm_v16i16_ty, llvm_v16i16_ty], > > - [IntrNoMem]>; > > - def int_x86_avx2_pcmpgt_d : 
GCCBuiltin<"__builtin_ia32_pcmpgtd256">, > > - Intrinsic<[llvm_v8i32_ty], [llvm_v8i32_ty, llvm_v8i32_ty], > > - [IntrNoMem]>; > > - def int_x86_avx2_pcmpgt_q : GCCBuiltin<"__builtin_ia32_pcmpgtq256">, > > - Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v4i64_ty], > > - [IntrNoMem]>; > > -} > > - > > // Pack ops. > > let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". > > def int_x86_avx2_packsswb : GCCBuiltin<"__builtin_ia32_packsswb256">, > > > > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149367&r1=149366&r2=149367&view=diff > > ============================================================================== > > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) > > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Jan 31 00:52:44 2012 > > @@ -9492,26 +9492,6 @@ > > case Intrinsic::x86_avx2_psrav_d_256: > > return DAG.getNode(ISD::SRA, dl, Op.getValueType(), > > Op.getOperand(1), Op.getOperand(2)); > > - case Intrinsic::x86_sse2_pcmpeq_b: > > - case Intrinsic::x86_sse2_pcmpeq_w: > > - case Intrinsic::x86_sse2_pcmpeq_d: > > - case Intrinsic::x86_sse41_pcmpeqq: > > - case Intrinsic::x86_avx2_pcmpeq_b: > > - case Intrinsic::x86_avx2_pcmpeq_w: > > - case Intrinsic::x86_avx2_pcmpeq_d: > > - case Intrinsic::x86_avx2_pcmpeq_q: > > - return DAG.getNode(X86ISD::PCMPEQ, dl, Op.getValueType(), > > - Op.getOperand(1), Op.getOperand(2)); > > - case Intrinsic::x86_sse2_pcmpgt_b: > > - case Intrinsic::x86_sse2_pcmpgt_w: > > - case Intrinsic::x86_sse2_pcmpgt_d: > > - case Intrinsic::x86_sse42_pcmpgtq: > > - case Intrinsic::x86_avx2_pcmpgt_b: > > - case Intrinsic::x86_avx2_pcmpgt_w: > > - case Intrinsic::x86_avx2_pcmpgt_d: > > - case Intrinsic::x86_avx2_pcmpgt_q: > > - return DAG.getNode(X86ISD::PCMPGT, dl, Op.getValueType(), > > - Op.getOperand(1), Op.getOperand(2)); > > case Intrinsic::x86_ssse3_pshuf_b_128: > > case 
Intrinsic::x86_avx2_pshuf_b: > > return DAG.getNode(X86ISD::PSHUFB, dl, Op.getValueType(), > > > > Modified: llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll?rev=149367&r1=149366&r2=149367&view=diff > > ============================================================================== > > --- llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll (original) > > +++ llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll Tue Jan 31 00:52:44 2012 > > @@ -30,7 +30,7 @@ > > %tmp87 = bitcast<16 x i8> %tmp66 to<4 x i32> ;<<4 x i32>> [#uses=1] > > %tmp88 = add<4 x i32> %tmp87, %tmp77 ;<<4 x i32>> [#uses=2] > > %tmp88.upgrd.4 = bitcast<4 x i32> %tmp88 to<2 x i64> ;<<2 x i64>> [#uses=1] > > - %tmp99 = tail call<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32> %tmp88,<4 x i32> %tmp55 ) ;<<4 x i32>> [#uses=1] > > + %tmp99 = tail call<4 x i32> @llvm.x86.sse2.psra.d(<4 x i32> %tmp88,<4 x i32> %tmp55 ) ;<<4 x i32>> [#uses=1] > > %tmp99.upgrd.5 = bitcast<4 x i32> %tmp99 to<2 x i64> ;<<2 x i64>> [#uses=2] > > %tmp110 = xor<2 x i64> %tmp99.upgrd.5,< i64 -1, i64 -1> ;<<2 x i64>> [#uses=1] > > %tmp111 = and<2 x i64> %tmp110, %tmp55.upgrd.2 ;<<2 x i64>> [#uses=1] > > @@ -48,4 +48,4 @@ > > ret void > > } > > > > -declare<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32>,<4 x i32>) > > +declare<4 x i32> @llvm.x86.sse2.psra.d(<4 x i32>,<4 x i32>) > > > > Modified: llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll?rev=149367&r1=149366&r2=149367&view=diff > > ============================================================================== > > --- llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll (original) > > +++ llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll Tue Jan 31 00:52:44 2012 > > @@ -369,54 +369,6 @@ > > declare<8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16>,<8 x i16>) nounwind readnone > > > > > > -define<16 x 
i8> @test_x86_sse2_pcmpeq_b(<16 x i8> %a0,<16 x i8> %a1) { > > - ; CHECK: vpcmpeqb > > - %res = call<16 x i8> @llvm.x86.sse2.pcmpeq.b(<16 x i8> %a0,<16 x i8> %a1) ;<<16 x i8>> [#uses=1] > > - ret<16 x i8> %res > > -} > > -declare<16 x i8> @llvm.x86.sse2.pcmpeq.b(<16 x i8>,<16 x i8>) nounwind readnone > > - > > - > > -define<4 x i32> @test_x86_sse2_pcmpeq_d(<4 x i32> %a0,<4 x i32> %a1) { > > - ; CHECK: vpcmpeqd > > - %res = call<4 x i32> @llvm.x86.sse2.pcmpeq.d(<4 x i32> %a0,<4 x i32> %a1) ;<<4 x i32>> [#uses=1] > > - ret<4 x i32> %res > > -} > > -declare<4 x i32> @llvm.x86.sse2.pcmpeq.d(<4 x i32>,<4 x i32>) nounwind readnone > > - > > - > > -define<8 x i16> @test_x86_sse2_pcmpeq_w(<8 x i16> %a0,<8 x i16> %a1) { > > - ; CHECK: vpcmpeqw > > - %res = call<8 x i16> @llvm.x86.sse2.pcmpeq.w(<8 x i16> %a0,<8 x i16> %a1) ;<<8 x i16>> [#uses=1] > > - ret<8 x i16> %res > > -} > > -declare<8 x i16> @llvm.x86.sse2.pcmpeq.w(<8 x i16>,<8 x i16>) nounwind readnone > > - > > - > > -define<16 x i8> @test_x86_sse2_pcmpgt_b(<16 x i8> %a0,<16 x i8> %a1) { > > - ; CHECK: vpcmpgtb > > - %res = call<16 x i8> @llvm.x86.sse2.pcmpgt.b(<16 x i8> %a0,<16 x i8> %a1) ;<<16 x i8>> [#uses=1] > > - ret<16 x i8> %res > > -} > > -declare<16 x i8> @llvm.x86.sse2.pcmpgt.b(<16 x i8>,<16 x i8>) nounwind readnone > > - > > - > > -define<4 x i32> @test_x86_sse2_pcmpgt_d(<4 x i32> %a0,<4 x i32> %a1) { > > - ; CHECK: vpcmpgtd > > - %res = call<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32> %a0,<4 x i32> %a1) ;<<4 x i32>> [#uses=1] > > - ret<4 x i32> %res > > -} > > -declare<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32>,<4 x i32>) nounwind readnone > > - > > - > > -define<8 x i16> @test_x86_sse2_pcmpgt_w(<8 x i16> %a0,<8 x i16> %a1) { > > - ; CHECK: vpcmpgtw > > - %res = call<8 x i16> @llvm.x86.sse2.pcmpgt.w(<8 x i16> %a0,<8 x i16> %a1) ;<<8 x i16>> [#uses=1] > > - ret<8 x i16> %res > > -} > > -declare<8 x i16> @llvm.x86.sse2.pcmpgt.w(<8 x i16>,<8 x i16>) nounwind readnone > > - > > - > > define<4 x i32> 
@test_x86_sse2_pmadd_wd(<8 x i16> %a0,<8 x i16> %a1) { > > ; CHECK: vpmaddwd > > %res = call<4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a0,<8 x i16> %a1) ;<<4 x i32>> [#uses=1] > > @@ -950,14 +902,6 @@ > > declare<8 x i16> @llvm.x86.sse41.pblendw(<8 x i16>,<8 x i16>, i32) nounwind readnone > > > > > > -define<2 x i64> @test_x86_sse41_pcmpeqq(<2 x i64> %a0,<2 x i64> %a1) { > > - ; CHECK: vpcmpeqq > > - %res = call<2 x i64> @llvm.x86.sse41.pcmpeqq(<2 x i64> %a0,<2 x i64> %a1) ;<<2 x i64>> [#uses=1] > > - ret<2 x i64> %res > > -} > > -declare<2 x i64> @llvm.x86.sse41.pcmpeqq(<2 x i64>,<2 x i64>) nounwind readnone > > - > > - > > define<8 x i16> @test_x86_sse41_phminposuw(<8 x i16> %a0) { > > ; CHECK: vphminposuw > > %res = call<8 x i16> @llvm.x86.sse41.phminposuw(<8 x i16> %a0) ;<<8 x i16>> [#uses=1] > > @@ -1271,14 +1215,6 @@ > > declare<16 x i8> @llvm.x86.sse42.pcmpestrm128(<16 x i8>, i32,<16 x i8>, i32, i8) nounwind readnone > > > > > > -define<2 x i64> @test_x86_sse42_pcmpgtq(<2 x i64> %a0,<2 x i64> %a1) { > > - ; CHECK: vpcmpgtq > > - %res = call<2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64> %a0,<2 x i64> %a1) ;<<2 x i64>> [#uses=1] > > - ret<2 x i64> %res > > -} > > -declare<2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64>,<2 x i64>) nounwind readnone > > - > > - > > define i32 @test_x86_sse42_pcmpistri128(<16 x i8> %a0,<16 x i8> %a1) { > > ; CHECK: vpcmpistri > > ; CHECK: movl > > > > Modified: llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll > > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll?rev=149367&r1=149366&r2=149367&view=diff > > ============================================================================== > > --- llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll (original) > > +++ llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll Tue Jan 31 00:52:44 2012 > > @@ -72,54 +72,6 @@ > > declare<16 x i16> @llvm.x86.avx2.pavg.w(<16 x i16>,<16 x i16>) nounwind readnone > > > > > > -define<32 x i8> 
@test_x86_avx2_pcmpeq_b(<32 x i8> %a0,<32 x i8> %a1) { > > - ; CHECK: vpcmpeqb > > - %res = call<32 x i8> @llvm.x86.avx2.pcmpeq.b(<32 x i8> %a0,<32 x i8> %a1) ;<<32 x i8>> [#uses=1] > > - ret<32 x i8> %res > > -} > > -declare<32 x i8> @llvm.x86.avx2.pcmpeq.b(<32 x i8>,<32 x i8>) nounwind readnone > > - > > - > > -define<8 x i32> @test_x86_avx2_pcmpeq_d(<8 x i32> %a0,<8 x i32> %a1) { > > - ; CHECK: vpcmpeqd > > - %res = call<8 x i32> @llvm.x86.avx2.pcmpeq.d(<8 x i32> %a0,<8 x i32> %a1) ;<<8 x i32>> [#uses=1] > > - ret<8 x i32> %res > > -} > > -declare<8 x i32> @llvm.x86.avx2.pcmpeq.d(<8 x i32>,<8 x i32>) nounwind readnone > > - > > - > > -define<16 x i16> @test_x86_avx2_pcmpeq_w(<16 x i16> %a0,<16 x i16> %a1) { > > - ; CHECK: vpcmpeqw > > - %res = call<16 x i16> @llvm.x86.avx2.pcmpeq.w(<16 x i16> %a0,<16 x i16> %a1) ;<<16 x i16>> [#uses=1] > > - ret<16 x i16> %res > > -} > > -declare<16 x i16> @llvm.x86.avx2.pcmpeq.w(<16 x i16>,<16 x i16>) nounwind readnone > > - > > - > > -define<32 x i8> @test_x86_avx2_pcmpgt_b(<32 x i8> %a0,<32 x i8> %a1) { > > - ; CHECK: vpcmpgtb > > - %res = call<32 x i8> @llvm.x86.avx2.pcmpgt.b(<32 x i8> %a0,<32 x i8> %a1) ;<<32 x i8>> [#uses=1] > > - ret<32 x i8> %res > > -} > > -declare<32 x i8> @llvm.x86.avx2.pcmpgt.b(<32 x i8>,<32 x i8>) nounwind readnone > > - > > - > > -define<8 x i32> @test_x86_avx2_pcmpgt_d(<8 x i32> %a0,<8 x i32> %a1) { > > - ; CHECK: vpcmpgtd > > - %res = call<8 x i32> @llvm.x86.avx2.pcmpgt.d(<8 x i32> %a0,<8 x i32> %a1) ;<<8 x i32>> [#uses=1] > > - ret<8 x i32> %res > > -} > > -declare<8 x i32> @llvm.x86.avx2.pcmpgt.d(<8 x i32>,<8 x i32>) nounwind readnone > > - > > - > > -define<16 x i16> @test_x86_avx2_pcmpgt_w(<16 x i16> %a0,<16 x i16> %a1) { > > - ; CHECK: vpcmpgtw > > - %res = call<16 x i16> @llvm.x86.avx2.pcmpgt.w(<16 x i16> %a0,<16 x i16> %a1) ;<<16 x i16>> [#uses=1] > > - ret<16 x i16> %res > > -} > > -declare<16 x i16> @llvm.x86.avx2.pcmpgt.w(<16 x i16>,<16 x i16>) nounwind readnone > > - > > - > > define<8 
x i32> @test_x86_avx2_pmadd_wd(<16 x i16> %a0,<16 x i16> %a1) { > > ; CHECK: vpmaddwd > > %res = call<8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16> %a0,<16 x i16> %a1) ;<<8 x i32>> [#uses=1] > > @@ -553,14 +505,6 @@ > > declare<16 x i16> @llvm.x86.avx2.pblendw(<16 x i16>,<16 x i16>, i32) nounwind readnone > > > > > > -define<4 x i64> @test_x86_avx2_pcmpeqq(<4 x i64> %a0,<4 x i64> %a1) { > > - ; CHECK: vpcmpeqq > > - %res = call<4 x i64> @llvm.x86.avx2.pcmpeq.q(<4 x i64> %a0,<4 x i64> %a1) ;<<4 x i64>> [#uses=1] > > - ret<4 x i64> %res > > -} > > -declare<4 x i64> @llvm.x86.avx2.pcmpeq.q(<4 x i64>,<4 x i64>) nounwind readnone > > - > > - > > define<32 x i8> @test_x86_avx2_pmaxsb(<32 x i8> %a0,<32 x i8> %a1) { > > ; CHECK: vpmaxsb > > %res = call<32 x i8> @llvm.x86.avx2.pmaxs.b(<32 x i8> %a0,<32 x i8> %a1) ;<<32 x i8>> [#uses=1] > > @@ -729,14 +673,6 @@ > > declare<4 x i64> @llvm.x86.avx2.pmul.dq(<8 x i32>,<8 x i32>) nounwind readnone > > > > > > -define<4 x i64> @test_x86_avx2_pcmpgtq(<4 x i64> %a0,<4 x i64> %a1) { > > - ; CHECK: vpcmpgtq > > - %res = call<4 x i64> @llvm.x86.avx2.pcmpgt.q(<4 x i64> %a0,<4 x i64> %a1) ;<<4 x i64>> [#uses=1] > > - ret<4 x i64> %res > > -} > > -declare<4 x i64> @llvm.x86.avx2.pcmpgt.q(<4 x i64>,<4 x i64>) nounwind readnone > > - > > - > > define<4 x i64> @test_x86_avx2_vbroadcasti128(i8* %a0) { > > ; CHECK: vbroadcasti128 > > %res = call<4 x i64> @llvm.x86.avx2.vbroadcasti128(i8* %a0) ;<<4 x i64>> [#uses=1] > > > > > > _______________________________________________ > > llvm-commits mailing list > > llvm-commits at cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > > > -- > ~Craig > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits 
From mcrosier at apple.com Wed Feb 1 12:45:51 2012 From: mcrosier at apple.com (Chad Rosier) Date: Wed, 01 Feb 2012 18:45:51 -0000 Subject: [llvm-commits] [llvm] r149521 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <20120201184551.7C7012A6C12C@llvm.org> Author: mcrosier Date: Wed Feb 1 12:45:51 2012 New Revision: 149521 URL: http://llvm.org/viewvc/llvm-project?rev=149521&view=rev Log: Tidy up. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149521&r1=149520&r2=149521&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Feb 1 12:45:51 2012 @@ -14306,7 +14306,7 @@ Ld = cast<LoadSDNode>(St->getChain()); else if (St->getValue().hasOneUse() && ChainVal->getOpcode() == ISD::TokenFactor) { - for (unsigned i=0, e = ChainVal->getNumOperands(); i != e; ++i) { + for (unsigned i = 0, e = ChainVal->getNumOperands(); i != e; ++i) { if (ChainVal->getOperand(i).getNode() == LdVal) { TokenFactorIndex = i; Ld = cast<LoadSDNode>(St->getValue()); From stpworld at narod.ru Wed Feb 1 13:50:08 2012 From: stpworld at narod.ru (Stepan Dyatkovskiy) Date: Wed, 01 Feb 2012 23:50:08 +0400 Subject: [llvm-commits] [llvm] r149481 - in /llvm/trunk: include/llvm/ include/llvm/Analysis/ lib/Analysis/ lib/Bitcode/Writer/ lib/CodeGen/SelectionDAG/ lib/ExecutionEngine/Interpreter/ lib/Target/CBackend/ lib/Target/CppBackend/ lib/Transforms/IPO/ lib/Transforms/InstCombine/ lib/Transforms/Scalar/ lib/Transforms/Utils/ lib/VMCore/ tools/llvm-diff/ In-Reply-To: References: <20120201074953.13FFD2A6C12C@llvm.org> Message-ID:
<4F299770.6070401@narod.ru> Hi Eric. I hope it lets us change SwitchInst internals as we want without having to fix up any code that uses it (the LLVM sources themselves and LLVM clients). I made it as a pre-patch for PR1255: "Should enhance LLVM switch instruction to take case ranges". Sorry for the long delay in replying. I carefully checked it again and again. Still, if you found something in the changes I made, I'm ready to discuss it. -Stepan. Eric Christopher wrote: > > On Jan 31, 2012, at 11:49 PM, Stepan Dyatkovskiy wrote: > >> SwitchInst refactoring. >> The purpose of refactoring is to hide operand roles from SwitchInst >> user (programmer). If you want to play with operands directly, >> probably you will need lower level methods than SwitchInst ones >> (TerminatorInst or maybe User). After this patch we can reorganize >> SwitchInst operands and successors as we want. > > Wait a minute. > > Why are you doing this? What do you hope to gain? > > -eric From atrick at apple.com Wed Feb 1 14:35:20 2012 From: atrick at apple.com (Andrew Trick) Date: Wed, 01 Feb 2012 12:35:20 -0800 Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review In-Reply-To: References: <07f401cce02b$87f18630$97d49290$@org> Message-ID: <42CFE65B-4116-41F3-BC56-50FB139847C9@apple.com> On Jan 31, 2012, at 11:42 PM, Andrew Trick wrote: > On Jan 31, 2012, at 7:18 AM, Sergei Larin wrote: >> >> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Sergei Larin >> Sent: Friday, January 27, 2012 10:47 AM >> To: llvm-commits at cs.uiuc.edu >> Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review >> >> >> Hello everybody, >> >> Attached is an initial patch for a VLIW specific scheduler framework that utilizes a deterministic finite automaton (DFA). >> >> Several key points: >> - The scheduler is largely based on the existing framework, but introduces several VLIW specific concepts.
It could be classified as a top down list scheduler, critical path first, with DFA used for parallel resources modeling. It also models and tracks register pressure in a way similar to the current RegPressure scheduler. It employs a slightly different way to compute the "cost" function for all SUs in AQ which allows for somewhat easier balancing of multiple heuristic inputs. The current version does _not_ generate bundles/packets (but models them internally). It could be easily modified to do so, and it is our plan to make it a part of bundle generation in the near future. >> - The scheduler is enabled for the Hexagon backend. Compared to any existing scheduler, for this VLIW target this code produces between 1.9% slowdown and 11% speedup on our internal test suite. This test set consists of a variety of real world applications ranging from DSP specific applications to SPEC. Some DSP kernels (when taken out of context) enjoy up to 20% speedup when compared to the "default" scheduling mechanism (RegPressure pre-RA + post RA). The main reason for this kind of corner case behavior is long chains of independent memory accesses that are conservatively serialized by the default scheduler (and there is no HW scheduler to sort it out at run time). >> - This patch is an initial submission with a bare minimum of features, and more heuristics will be added to it later. We prefer to submit it in stages to simplify the review process and improve SW management. >> - The patch also contains minor updates to two Hexagon specific tests in order to compensate for the new order of instructions generated by the Hexagon backend __with scheduler disabled__. >> - SVN revision 149130. An LLVM verification test run for the x86 platform detects no additional failures. >> >> Comments and reviews are eagerly anticipated :) Sergei, Let me know if these superficial changes are ok with you, and I'll commit (or if you can commit, go ahead): For the record, I don't understand how your register pressure tracking works.
I don't think it matters, because these are really target specific heuristics masquerading as machine independent code--following the same style as the rest of the scheduler. So if it works for you I'm ok with it. -Andy -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-ScheduleVLIW-only-works-on-SelectionDAG-rename-it-ac.patch Type: application/octet-stream Size: 64168 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/be81438b/attachment-0003.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-Update-CMakeLists.txt-with-new-files.patch Type: application/octet-stream Size: 984 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/be81438b/attachment-0004.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: 0003-Trashold-typo.patch Type: application/octet-stream Size: 1430 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/be81438b/attachment-0005.obj
From slarin at codeaurora.org Wed Feb 1 14:54:37 2012 From: slarin at codeaurora.org (Sergei Larin) Date: Wed, 1 Feb 2012 14:54:37 -0600 Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review In-Reply-To: <42CFE65B-4116-41F3-BC56-50FB139847C9@apple.com> References: <07f401cce02b$87f18630$97d49290$@org> <42CFE65B-4116-41F3-BC56-50FB139847C9@apple.com> Message-ID: <0b6f01cce123$b5ff8ab0$21fea010$@org> This looks great to me. Please commit it. Thanks a lot. -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. From: Andrew Trick [mailto:atrick at apple.com] Sent: Wednesday, February 01, 2012 2:35 PM To: Sergei Larin Cc: llvm-commits at cs.uiuc.edu Subject: Re: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review On Jan 31, 2012, at 11:42 PM, Andrew Trick wrote: On Jan 31, 2012, at 7:18 AM, Sergei Larin wrote: From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Sergei Larin Sent: Friday, January 27, 2012 10:47 AM To: llvm-commits at cs.uiuc.edu Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review Hello everybody, Attached is an initial patch for a VLIW specific scheduler framework that utilizes a deterministic finite automaton (DFA). Several key points: - The scheduler is largely based on the existing framework, but introduces several VLIW specific concepts. It could be classified as a top down list scheduler, critical path first, with DFA used for parallel resources modeling. It also models and tracks register pressure in a way similar to the current RegPressure scheduler. It employs a slightly different way to compute the "cost" function for all SUs in AQ which allows for somewhat easier balancing of multiple heuristic inputs. The current version does _not_ generate bundles/packets (but models them internally).
It could be easily modified to do so, and it is our plan to make it a part of bundle generation in the near future. - The scheduler is enabled for the Hexagon backend. Compared to any existing scheduler, for this VLIW target this code produces between 1.9% slowdown and 11% speedup on our internal test suite. This test set consists of a variety of real world applications ranging from DSP specific applications to SPEC. Some DSP kernels (when taken out of context) enjoy up to 20% speedup when compared to the "default" scheduling mechanism (RegPressure pre-RA + post RA). The main reason for this kind of corner case behavior is long chains of independent memory accesses that are conservatively serialized by the default scheduler (and there is no HW scheduler to sort it out at run time). - This patch is an initial submission with a bare minimum of features, and more heuristics will be added to it later. We prefer to submit it in stages to simplify the review process and improve SW management. - The patch also contains minor updates to two Hexagon specific tests in order to compensate for the new order of instructions generated by the Hexagon backend __with scheduler disabled__. - SVN revision 149130. An LLVM verification test run for the x86 platform detects no additional failures. Comments and reviews are eagerly anticipated :) Sergei, Let me know if these superficial changes are ok with you, and I'll commit (or if you can commit, go ahead):
From echristo at apple.com Wed Feb 1 15:06:00 2012 From: echristo at apple.com (Eric Christopher) Date: Wed, 01 Feb 2012 13:06:00 -0800 Subject: [llvm-commits] [llvm] r149498 - in /llvm/trunk: autoconf/configure.ac configure In-Reply-To: <20120201140622.2E4BD2A6C12C@llvm.org> References: <20120201140622.2E4BD2A6C12C@llvm.org> Message-ID: On Feb 1, 2012, at 6:06 AM, Dylan Noblesmith wrote: > (I lack the right version of autoconf et al. to regen, but it > was a simple change, so I just updated configure manually.) You can download and build the sources quite easily. -eric From echristo at apple.com Wed Feb 1 15:36:10 2012 From: echristo at apple.com (Eric Christopher) Date: Wed, 01 Feb 2012 13:36:10 -0800 Subject: [llvm-commits] [llvm] r149481 - in /llvm/trunk: include/llvm/ include/llvm/Analysis/ lib/Analysis/ lib/Bitcode/Writer/ lib/CodeGen/SelectionDAG/ lib/ExecutionEngine/Interpreter/ lib/Target/CBackend/ lib/Target/CppBackend/ lib/Transforms/IPO/ lib/Transforms/InstCombine/ lib/Transforms/Scalar/ lib/Transforms/Utils/ lib/VMCore/ tools/llvm-diff/ In-Reply-To: <4F299770.6070401@narod.ru> References: <20120201074953.13FFD2A6C12C@llvm.org> <4F299770.6070401@narod.ru> Message-ID: <3825DF6E-225A-471E-9947-4E1D7BF27B67@apple.com> On Feb 1, 2012, at 11:50 AM, Stepan Dyatkovskiy wrote: > Hi Eric. I hope it lets us change SwitchInst internals as we want without having to fix up any code that uses it (the LLVM sources themselves and LLVM clients). I made it as a pre-patch for PR1255: "Should enhance LLVM switch instruction to take case ranges". Sorry for the long delay in replying. > I carefully checked it again and again. > Still, if you found something in the changes I made, I'm ready to discuss it. Makes total sense. Probably should have referenced the PR in the commit message. Anton explained a bit last night as well. Thanks! 
-eric From stoklund at 2pi.dk Wed Feb 1 16:12:51 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Wed, 01 Feb 2012 22:12:51 -0000 Subject: [llvm-commits] [llvm] r149546 - /llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp Message-ID: <20120201221251.EC5A82A6C12E@llvm.org> Author: stoklund Date: Wed Feb 1 16:12:51 2012 New Revision: 149546 URL: http://llvm.org/viewvc/llvm-project?rev=149546&view=rev Log: Avoid emitting empty arrays, they're not standard C++. It's only by luck that we haven't produced any yet, and clang refuses to compile them. Modified: llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp Modified: llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp?rev=149546&r1=149545&r2=149546&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp (original) +++ llvm/trunk/utils/TableGen/RegisterInfoEmitter.cpp Wed Feb 1 16:12:51 2012 @@ -640,17 +640,22 @@ << "getRawAllocationOrder(const MachineFunction &MF) const {\n"; for (unsigned oi = 1 , oe = RC.getNumOrders(); oi != oe; ++oi) { ArrayRef Elems = RC.getOrder(oi); - OS << " static const unsigned AltOrder" << oi << "[] = {"; - for (unsigned elem = 0; elem != Elems.size(); ++elem) - OS << (elem ? ", " : " ") << getQualifiedName(Elems[elem]); - OS << " };\n"; + if (!Elems.empty()) { + OS << " static const unsigned AltOrder" << oi << "[] = {"; + for (unsigned elem = 0; elem != Elems.size(); ++elem) + OS << (elem ? 
", " : " ") << getQualifiedName(Elems[elem]); + OS << " };\n"; + } } OS << " const MCRegisterClass &MCR = " << Target.getName() - << "MCRegisterClasses[" << RC.getQualifiedName() + "RegClassID];" + << "MCRegisterClasses[" << RC.getQualifiedName() + "RegClassID];\n" << " static const ArrayRef Order[] = {\n" << " makeArrayRef(MCR.begin(), MCR.getNumRegs()"; for (unsigned oi = 1, oe = RC.getNumOrders(); oi != oe; ++oi) - OS << "),\n makeArrayRef(AltOrder" << oi; + if (RC.getOrder(oi).empty()) + OS << "),\n ArrayRef("; + else + OS << "),\n makeArrayRef(AltOrder" << oi; OS << ")\n };\n const unsigned Select = " << RC.getName() << "AltOrderSelect(MF);\n assert(Select < " << RC.getNumOrders() << ");\n return Order[Select];\n}\n"; From atrick at apple.com Wed Feb 1 16:13:58 2012 From: atrick at apple.com (Andrew Trick) Date: Wed, 01 Feb 2012 22:13:58 -0000 Subject: [llvm-commits] [llvm] r149547 - in /llvm/trunk: include/llvm/CodeGen/ include/llvm/Target/ lib/CodeGen/SelectionDAG/ lib/Target/Hexagon/ test/CodeGen/Hexagon/ Message-ID: <20120201221359.097912A6C12E@llvm.org> Author: atrick Date: Wed Feb 1 16:13:57 2012 New Revision: 149547 URL: http://llvm.org/viewvc/llvm-project?rev=149547&view=rev Log: VLIW specific scheduler framework that utilizes deterministic finite automaton (DFA). This new scheduler plugs into the existing selection DAG scheduling framework. It is a top-down critical path scheduler that tracks register pressure and uses a DFA for pipeline modeling. Patch by Sergei Larin! 
Added: llvm/trunk/include/llvm/CodeGen/ResourcePriorityQueue.h llvm/trunk/lib/CodeGen/SelectionDAG/ResourcePriorityQueue.cpp llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGVLIW.cpp Modified: llvm/trunk/include/llvm/CodeGen/LinkAllCodegenComponents.h llvm/trunk/include/llvm/CodeGen/SchedulerRegistry.h llvm/trunk/include/llvm/Target/TargetInstrInfo.h llvm/trunk/include/llvm/Target/TargetLowering.h llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.cpp llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.cpp llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.h llvm/trunk/lib/Target/Hexagon/HexagonSubtarget.cpp llvm/trunk/lib/Target/Hexagon/Makefile llvm/trunk/test/CodeGen/Hexagon/args.ll llvm/trunk/test/CodeGen/Hexagon/static.ll Modified: llvm/trunk/include/llvm/CodeGen/LinkAllCodegenComponents.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/LinkAllCodegenComponents.h?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/LinkAllCodegenComponents.h (original) +++ llvm/trunk/include/llvm/CodeGen/LinkAllCodegenComponents.h Wed Feb 1 16:13:57 2012 @@ -40,12 +40,13 @@ llvm::linkOcamlGC(); llvm::linkShadowStackGC(); - + (void) llvm::createBURRListDAGScheduler(NULL, llvm::CodeGenOpt::Default); (void) llvm::createSourceListDAGScheduler(NULL,llvm::CodeGenOpt::Default); (void) llvm::createHybridListDAGScheduler(NULL,llvm::CodeGenOpt::Default); (void) llvm::createFastDAGScheduler(NULL, llvm::CodeGenOpt::Default); (void) llvm::createDefaultScheduler(NULL, llvm::CodeGenOpt::Default); + (void) llvm::createVLIWDAGScheduler(NULL, llvm::CodeGenOpt::Default); } } ForceCodegenLinking; // Force link by creating a global definition. 
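Aside for readers of the r149547 diff: the log message describes a top-down, critical-path-first list scheduler that checks a resource model before issuing each node. The sketch below is only a toy illustration of that general shape, not the code in this commit. ToyNode, toySchedule and SlotsPerCycle are made-up names; plain per-unit slot counters stand in for the DFA resource model; register pressure is ignored; every node is assumed to have unit latency. It needs C++11.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical node type for illustration only; this is not LLVM's SUnit.
struct ToyNode {
  unsigned Height;             // longest path from this node to the block end
  unsigned Unit;               // index of the functional unit the node needs
  std::vector<unsigned> Succs; // indices of nodes that depend on this one
  unsigned PredsLeft;          // number of predecessors not yet issued
};

// Top-down list scheduling: each cycle, issue ready nodes in order of
// decreasing Height (critical path first) while the resource model has room.
// SlotsPerCycle[u] says how many nodes using unit u may issue in one cycle;
// it is a simple stand-in for a DFA-based packet model.
// Returns the issue cycle of every node.
std::vector<unsigned> toySchedule(std::vector<ToyNode> Nodes,
                                  const std::vector<unsigned> &SlotsPerCycle) {
  std::vector<unsigned> IssueCycle(Nodes.size(), 0);
  std::vector<unsigned> Ready;
  for (unsigned I = 0; I != Nodes.size(); ++I)
    if (Nodes[I].PredsLeft == 0)
      Ready.push_back(I);

  unsigned Cycle = 0;
  while (!Ready.empty()) {
    // Highest node first; ties are broken by index to stay deterministic.
    std::sort(Ready.begin(), Ready.end(), [&Nodes](unsigned A, unsigned B) {
      if (Nodes[A].Height != Nodes[B].Height)
        return Nodes[A].Height > Nodes[B].Height;
      return A < B;
    });

    std::vector<unsigned> Free = SlotsPerCycle; // fresh packet for this cycle
    std::vector<unsigned> Issued, Deferred;
    for (unsigned N : Ready) {
      assert(Nodes[N].Unit < Free.size() && "unknown functional unit");
      if (Free[Nodes[N].Unit] != 0) {
        --Free[Nodes[N].Unit]; // reserve the resource
        IssueCycle[N] = Cycle;
        Issued.push_back(N);
      } else {
        Deferred.push_back(N); // does not fit in this packet; retry later
      }
    }
    if (Issued.empty())
      break; // a needed unit has zero slots; stop instead of spinning

    // Nodes whose last predecessor just issued become ready next cycle.
    Ready = Deferred;
    for (unsigned N : Issued)
      for (unsigned S : Nodes[N].Succs)
        if (--Nodes[S].PredsLeft == 0)
          Ready.push_back(S);
    ++Cycle;
  }
  return IssueCycle;
}
```

With one slot per cycle on each of two units, a chain 0 -> 2 -> 3 plus an independent node 1 that competes with node 0 for unit 0 issues node 0 first because it is higher, and node 1 shares the next cycle with node 2 because they use different units.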
Added: llvm/trunk/include/llvm/CodeGen/ResourcePriorityQueue.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ResourcePriorityQueue.h?rev=149547&view=auto ============================================================================== --- llvm/trunk/include/llvm/CodeGen/ResourcePriorityQueue.h (added) +++ llvm/trunk/include/llvm/CodeGen/ResourcePriorityQueue.h Wed Feb 1 16:13:57 2012 @@ -0,0 +1,142 @@ +//===----- ResourcePriorityQueue.h - A DFA-oriented priority queue -------===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This file implements the ResourcePriorityQueue class, which is a +// SchedulingPriorityQueue that schedules using DFA state to +// reduce the length of the critical path through the basic block +// on VLIW platforms. +// +//===----------------------------------------------------------------------===// + +#ifndef RESOURCE_PRIORITY_QUEUE_H +#define RESOURCE_PRIORITY_QUEUE_H + +#include "llvm/CodeGen/DFAPacketizer.h" +#include "llvm/CodeGen/SelectionDAGISel.h" +#include "llvm/CodeGen/ScheduleDAG.h" +#include "llvm/MC/MCInstrItineraries.h" +#include "llvm/Target/TargetInstrInfo.h" +#include "llvm/Target/TargetRegisterInfo.h" + +namespace llvm { + class ResourcePriorityQueue; + + /// Sorting functions for the Available queue. + struct resource_sort : public std::binary_function { + ResourcePriorityQueue *PQ; + explicit resource_sort(ResourcePriorityQueue *pq) : PQ(pq) {} + + bool operator()(const SUnit* left, const SUnit* right) const; + }; + + class ResourcePriorityQueue : public SchedulingPriorityQueue { + /// SUnits - The SUnits for the current graph. 
+ std::vector *SUnits; + + /// NumNodesSolelyBlocking - This vector contains, for every node in the + /// Queue, the number of nodes that the node is the sole unscheduled + /// predecessor for. This is used as a tie-breaker heuristic for better + /// mobility. + std::vector NumNodesSolelyBlocking; + + /// Queue - The queue. + std::vector Queue; + + /// RegPressure - Tracking current reg pressure per register class. + /// + std::vector RegPressure; + + /// RegLimit - Tracking the number of allocatable registers per register + /// class. + std::vector RegLimit; + + resource_sort Picker; + const TargetRegisterInfo *TRI; + const TargetLowering *TLI; + const TargetInstrInfo *TII; + const InstrItineraryData* InstrItins; + /// ResourcesModel - Represents VLIW state. + /// Not limited to VLIW targets per say, but assumes + /// definition of DFA by a target. + DFAPacketizer *ResourcesModel; + + /// Resource model - packet/bundle model. Purely + /// internal at the time. + std::vector Packet; + + /// Heuristics for estimating register pressure. + unsigned ParallelLiveRanges; + signed HorizontalVerticalBalance; + + public: + ResourcePriorityQueue(SelectionDAGISel *IS); + + ~ResourcePriorityQueue() { + delete ResourcesModel; + } + + bool isBottomUp() const { return false; } + + void initNodes(std::vector &sunits); + + void addNode(const SUnit *SU) { + NumNodesSolelyBlocking.resize(SUnits->size(), 0); + } + + void updateNode(const SUnit *SU) {} + + void releaseState() { + SUnits = 0; + } + + unsigned getLatency(unsigned NodeNum) const { + assert(NodeNum < (*SUnits).size()); + return (*SUnits)[NodeNum].getHeight(); + } + + unsigned getNumSolelyBlockNodes(unsigned NodeNum) const { + assert(NodeNum < NumNodesSolelyBlocking.size()); + return NumNodesSolelyBlocking[NodeNum]; + } + + /// Single cost function reflecting benefit of scheduling SU + /// in the current cycle. + signed SUSchedulingCost (SUnit *SU); + + /// InitNumRegDefsLeft - Determine the # of regs defined by this node. 
+ /// + void initNumRegDefsLeft(SUnit *SU); + void updateNumRegDefsLeft(SUnit *SU); + signed regPressureDelta(SUnit *SU, bool RawPressure = false); + signed rawRegPressureDelta (SUnit *SU, unsigned RCId); + + bool empty() const { return Queue.empty(); } + + virtual void push(SUnit *U); + + virtual SUnit *pop(); + + virtual void remove(SUnit *SU); + + virtual void dump(ScheduleDAG* DAG) const; + + /// ScheduledNode - Main resource tracking point. + void ScheduledNode(SUnit *Node); + bool isResourceAvailable(SUnit *SU); + void reserveResources(SUnit *SU); + +private: + void adjustPriorityOfUnscheduledPreds(SUnit *SU); + SUnit *getSingleUnscheduledPred(SUnit *SU); + unsigned numberRCValPredInSU (SUnit *SU, unsigned RCId); + unsigned numberRCValSuccInSU (SUnit *SU, unsigned RCId); + }; +} + +#endif Modified: llvm/trunk/include/llvm/CodeGen/SchedulerRegistry.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SchedulerRegistry.h?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/SchedulerRegistry.h (original) +++ llvm/trunk/include/llvm/CodeGen/SchedulerRegistry.h Wed Feb 1 16:13:57 2012 @@ -42,7 +42,7 @@ : MachinePassRegistryNode(N, D, (MachinePassCtor)C) { Registry.Add(this); } ~RegisterScheduler() { Registry.Remove(this); } - + // Accessors. // @@ -92,6 +92,11 @@ ScheduleDAGSDNodes *createFastDAGScheduler(SelectionDAGISel *IS, CodeGenOpt::Level OptLevel); +/// createVLIWDAGScheduler - Scheduler for VLIW targets. This creates top down +/// DFA driven list scheduler with clustering heuristic to control +/// register pressure. +ScheduleDAGSDNodes *createVLIWDAGScheduler(SelectionDAGISel *IS, + CodeGenOpt::Level OptLevel); /// createDefaultScheduler - This creates an instruction scheduler appropriate /// for the target. 
ScheduleDAGSDNodes *createDefaultScheduler(SelectionDAGISel *IS, Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original) +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Wed Feb 1 16:13:57 2012 @@ -15,6 +15,7 @@ #define LLVM_TARGET_TARGETINSTRINFO_H #include "llvm/MC/MCInstrInfo.h" +#include "llvm/CodeGen/DFAPacketizer.h" #include "llvm/CodeGen/MachineFunction.h" namespace llvm { @@ -811,6 +812,12 @@ breakPartialRegDependency(MachineBasicBlock::iterator MI, unsigned OpNum, const TargetRegisterInfo *TRI) const {} + /// Create machine specific model for scheduling. + virtual DFAPacketizer* + CreateTargetScheduleState(const TargetMachine*, const ScheduleDAG*) const { + return NULL; + } + private: int CallFrameSetupOpcode, CallFrameDestroyOpcode; }; Modified: llvm/trunk/include/llvm/Target/TargetLowering.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetLowering.h?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/TargetLowering.h (original) +++ llvm/trunk/include/llvm/Target/TargetLowering.h Wed Feb 1 16:13:57 2012 @@ -59,7 +59,8 @@ Source, // Follow source order. RegPressure, // Scheduling for lowest register pressure. Hybrid, // Scheduling for both latency and register pressure. - ILP // Scheduling for ILP in low register pressure mode. + ILP, // Scheduling for ILP in low register pressure mode. + VLIW // Scheduling for VLIW targets. 
}; } Modified: llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt Wed Feb 1 16:13:57 2012 @@ -10,13 +10,15 @@ LegalizeTypesGeneric.cpp LegalizeVectorOps.cpp LegalizeVectorTypes.cpp + ResourcePriorityQueue.cpp ScheduleDAGFast.cpp - ScheduleDAGRRList.cpp + ScheduleDAGRRList.cpp ScheduleDAGSDNodes.cpp SelectionDAG.cpp SelectionDAGBuilder.cpp SelectionDAGISel.cpp SelectionDAGPrinter.cpp + SelectionDAGVLIW.cpp TargetLowering.cpp TargetSelectionDAGInfo.cpp ) Added: llvm/trunk/lib/CodeGen/SelectionDAG/ResourcePriorityQueue.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ResourcePriorityQueue.cpp?rev=149547&view=auto ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/ResourcePriorityQueue.cpp (added) +++ llvm/trunk/lib/CodeGen/SelectionDAG/ResourcePriorityQueue.cpp Wed Feb 1 16:13:57 2012 @@ -0,0 +1,657 @@ +//===- ResourcePriorityQueue.cpp - A DFA-oriented priority queue -*- C++ -*-==// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This file implements the ResourcePriorityQueue class, which is a +// SchedulingPriorityQueue that prioritizes instructions using DFA state to +// reduce the length of the critical path through the basic block +// on VLIW platforms. +// The scheduler is basically a top-down adaptable list scheduler with DFA +// resource tracking added to the cost function. 
+// DFA is queried as a state machine to model "packets/bundles" during +// schedule. Currently packets/bundles are discarded at the end of +// scheduling, affecting only order of instructions. +// +//===----------------------------------------------------------------------===// + +#define DEBUG_TYPE "scheduler" +#include "llvm/CodeGen/ResourcePriorityQueue.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/Debug.h" +#include "llvm/Support/raw_ostream.h" +#include "llvm/CodeGen/MachineInstr.h" +#include "llvm/CodeGen/SelectionDAGNodes.h" +#include "llvm/Target/TargetMachine.h" +#include "llvm/Target/TargetLowering.h" + +using namespace llvm; + +static cl::opt<bool> DisableDFASched("disable-dfa-sched", cl::Hidden, + cl::ZeroOrMore, cl::init(false), + cl::desc("Disable use of DFA during scheduling")); + +static cl::opt<signed> RegPressureThreshold( + "dfa-sched-reg-pressure-threshold", cl::Hidden, cl::ZeroOrMore, cl::init(5), + cl::desc("Track reg pressure and switch priority to in-depth")); + + +ResourcePriorityQueue::ResourcePriorityQueue(SelectionDAGISel *IS) : + Picker(this), + InstrItins(IS->getTargetLowering().getTargetMachine().getInstrItineraryData()) +{ + TII = IS->getTargetLowering().getTargetMachine().getInstrInfo(); + TRI = IS->getTargetLowering().getTargetMachine().getRegisterInfo(); + TLI = &IS->getTargetLowering(); + + const TargetMachine &tm = (*IS->MF).getTarget(); + ResourcesModel = tm.getInstrInfo()->CreateTargetScheduleState(&tm,NULL); + // This hard requirement could be relaxed, but for now + // do not let it proceed.
+ assert (ResourcesModel && "Unimplemented CreateTargetScheduleState."); + + unsigned NumRC = TRI->getNumRegClasses(); + RegLimit.resize(NumRC); + RegPressure.resize(NumRC); + std::fill(RegLimit.begin(), RegLimit.end(), 0); + std::fill(RegPressure.begin(), RegPressure.end(), 0); + for (TargetRegisterInfo::regclass_iterator I = TRI->regclass_begin(), + E = TRI->regclass_end(); I != E; ++I) + RegLimit[(*I)->getID()] = TRI->getRegPressureLimit(*I, *IS->MF); + + ParallelLiveRanges = 0; + HorizontalVerticalBalance = 0; +} + +unsigned +ResourcePriorityQueue::numberRCValPredInSU(SUnit *SU, unsigned RCId) { + unsigned NumberDeps = 0; + for (SUnit::pred_iterator I = SU->Preds.begin(), E = SU->Preds.end(); + I != E; ++I) { + if (I->isCtrl()) + continue; + + SUnit *PredSU = I->getSUnit(); + const SDNode *ScegN = PredSU->getNode(); + + if (!ScegN) + continue; + + // If value is passed to CopyToReg, it is probably + // live outside BB. + switch (ScegN->getOpcode()) { + default: break; + case ISD::TokenFactor: break; + case ISD::CopyFromReg: NumberDeps++; break; + case ISD::CopyToReg: break; + case ISD::INLINEASM: break; + } + if (!ScegN->isMachineOpcode()) + continue; + + for (unsigned i = 0, e = ScegN->getNumValues(); i != e; ++i) { + EVT VT = ScegN->getValueType(i); + if (TLI->isTypeLegal(VT) + && (TLI->getRegClassFor(VT)->getID() == RCId)) { + NumberDeps++; + break; + } + } + } + return NumberDeps; +} + +unsigned ResourcePriorityQueue::numberRCValSuccInSU(SUnit *SU, + unsigned RCId) { + unsigned NumberDeps = 0; + for (SUnit::const_succ_iterator I = SU->Succs.begin(), E = SU->Succs.end(); + I != E; ++I) { + if (I->isCtrl()) + continue; + + SUnit *SuccSU = I->getSUnit(); + const SDNode *ScegN = SuccSU->getNode(); + if (!ScegN) + continue; + + // If value is passed to CopyToReg, it is probably + // live outside BB. 
+ switch (ScegN->getOpcode()) { + default: break; + case ISD::TokenFactor: break; + case ISD::CopyFromReg: break; + case ISD::CopyToReg: NumberDeps++; break; + case ISD::INLINEASM: break; + } + if (!ScegN->isMachineOpcode()) + continue; + + for (unsigned i = 0, e = ScegN->getNumOperands(); i != e; ++i) { + const SDValue &Op = ScegN->getOperand(i); + EVT VT = Op.getNode()->getValueType(Op.getResNo()); + if (TLI->isTypeLegal(VT) + && (TLI->getRegClassFor(VT)->getID() == RCId)) { + NumberDeps++; + break; + } + } + } + return NumberDeps; +} + +static unsigned numberCtrlDepsInSU(SUnit *SU) { + unsigned NumberDeps = 0; + for (SUnit::const_succ_iterator I = SU->Succs.begin(), E = SU->Succs.end(); + I != E; ++I) + if (I->isCtrl()) + NumberDeps++; + + return NumberDeps; +} + +static unsigned numberCtrlPredInSU(SUnit *SU) { + unsigned NumberDeps = 0; + for (SUnit::pred_iterator I = SU->Preds.begin(), E = SU->Preds.end(); + I != E; ++I) + if (I->isCtrl()) + NumberDeps++; + + return NumberDeps; +} + +/// +/// Initialize nodes. +/// +void ResourcePriorityQueue::initNodes(std::vector<SUnit> &sunits) { + SUnits = &sunits; + NumNodesSolelyBlocking.resize(SUnits->size(), 0); + + for (unsigned i = 0, e = SUnits->size(); i != e; ++i) { + SUnit *SU = &(*SUnits)[i]; + initNumRegDefsLeft(SU); + SU->NodeQueueId = 0; + } +} + +/// This heuristic is used if DFA scheduling is not desired +/// for some VLIW platform. +bool resource_sort::operator()(const SUnit *LHS, const SUnit *RHS) const { + // The isScheduleHigh flag allows nodes with wraparound dependencies that + // cannot easily be modeled as edges with latencies to be scheduled as + // soon as possible in a top-down schedule. + if (LHS->isScheduleHigh && !RHS->isScheduleHigh) + return false; + + if (!LHS->isScheduleHigh && RHS->isScheduleHigh) + return true; + + unsigned LHSNum = LHS->NodeNum; + unsigned RHSNum = RHS->NodeNum; + + // The most important heuristic is scheduling the critical path.
+ unsigned LHSLatency = PQ->getLatency(LHSNum); + unsigned RHSLatency = PQ->getLatency(RHSNum); + if (LHSLatency < RHSLatency) return true; + if (LHSLatency > RHSLatency) return false; + + // After that, if two nodes have identical latencies, look to see if one will + // unblock more other nodes than the other. + unsigned LHSBlocked = PQ->getNumSolelyBlockNodes(LHSNum); + unsigned RHSBlocked = PQ->getNumSolelyBlockNodes(RHSNum); + if (LHSBlocked < RHSBlocked) return true; + if (LHSBlocked > RHSBlocked) return false; + + // Finally, just to provide a stable ordering, use the node number as a + // deciding factor. + return LHSNum < RHSNum; +} + + +/// getSingleUnscheduledPred - If there is exactly one unscheduled predecessor +/// of SU, return it, otherwise return null. +SUnit *ResourcePriorityQueue::getSingleUnscheduledPred(SUnit *SU) { + SUnit *OnlyAvailablePred = 0; + for (SUnit::const_pred_iterator I = SU->Preds.begin(), E = SU->Preds.end(); + I != E; ++I) { + SUnit &Pred = *I->getSUnit(); + if (!Pred.isScheduled) { + // We found an available, but not scheduled, predecessor. If it's the + // only one we have found, keep track of it... otherwise give up. + if (OnlyAvailablePred && OnlyAvailablePred != &Pred) + return 0; + OnlyAvailablePred = &Pred; + } + } + return OnlyAvailablePred; +} + +void ResourcePriorityQueue::push(SUnit *SU) { + // Look at all of the successors of this node. Count the number of nodes that + // this node is the sole unscheduled node for. + unsigned NumNodesBlocking = 0; + for (SUnit::const_succ_iterator I = SU->Succs.begin(), E = SU->Succs.end(); + I != E; ++I) + if (getSingleUnscheduledPred(I->getSUnit()) == SU) + ++NumNodesBlocking; + + NumNodesSolelyBlocking[SU->NodeNum] = NumNodesBlocking; + Queue.push_back(SU); +} + +/// Check if scheduling of this SU is possible +/// in the current packet. 
+bool ResourcePriorityQueue::isResourceAvailable(SUnit *SU) { + if (!SU || !SU->getNode()) + return false; + + // If this is a compound instruction, + // it is likely to be a call. Do not delay it. + if (SU->getNode()->getGluedNode()) + return true; + + // First see if the pipeline could receive this instruction + // in the current cycle. + if (SU->getNode()->isMachineOpcode()) + switch (SU->getNode()->getMachineOpcode()) { + default: + if (!ResourcesModel->canReserveResources(&TII->get( + SU->getNode()->getMachineOpcode()))) + return false; + case TargetOpcode::EXTRACT_SUBREG: + case TargetOpcode::INSERT_SUBREG: + case TargetOpcode::SUBREG_TO_REG: + case TargetOpcode::REG_SEQUENCE: + case TargetOpcode::IMPLICIT_DEF: + break; + } + + // Now see if there are no other dependencies + // to instructions already in the packet. + for (unsigned i = 0, e = Packet.size(); i != e; ++i) + for (SUnit::const_succ_iterator I = Packet[i]->Succs.begin(), + E = Packet[i]->Succs.end(); I != E; ++I) { + // Since we do not add pseudos to packets, might as well + // ignore order deps. + if (I->isCtrl()) + continue; + + if (I->getSUnit() == SU) + return false; + } + + return true; +} + +/// Keep track of available resources. +void ResourcePriorityQueue::reserveResources(SUnit *SU) { + // If this SU does not fit in the packet + // start a new one. + if (!isResourceAvailable(SU) || SU->getNode()->getGluedNode()) { + ResourcesModel->clearResources(); + Packet.clear(); + } + + if (SU->getNode() && SU->getNode()->isMachineOpcode()) { + switch (SU->getNode()->getMachineOpcode()) { + default: + ResourcesModel->reserveResources(&TII->get( + SU->getNode()->getMachineOpcode())); + break; + case TargetOpcode::EXTRACT_SUBREG: + case TargetOpcode::INSERT_SUBREG: + case TargetOpcode::SUBREG_TO_REG: + case TargetOpcode::REG_SEQUENCE: + case TargetOpcode::IMPLICIT_DEF: + break; + } + Packet.push_back(SU); + } + // Forcefully end packet for PseudoOps.
+ else { + ResourcesModel->clearResources(); + Packet.clear(); + } + + // If packet is now full, reset the state so in the next cycle + // we start fresh. + if (Packet.size() >= InstrItins->IssueWidth) { + ResourcesModel->clearResources(); + Packet.clear(); + } +} + +signed ResourcePriorityQueue::rawRegPressureDelta(SUnit *SU, unsigned RCId) { + signed RegBalance = 0; + + if (!SU || !SU->getNode() || !SU->getNode()->isMachineOpcode()) + return RegBalance; + + // Gen estimate. + for (unsigned i = 0, e = SU->getNode()->getNumValues(); i != e; ++i) { + EVT VT = SU->getNode()->getValueType(i); + if (TLI->isTypeLegal(VT) + && TLI->getRegClassFor(VT) + && TLI->getRegClassFor(VT)->getID() == RCId) + RegBalance += numberRCValSuccInSU(SU, RCId); + } + // Kill estimate. + for (unsigned i = 0, e = SU->getNode()->getNumOperands(); i != e; ++i) { + const SDValue &Op = SU->getNode()->getOperand(i); + EVT VT = Op.getNode()->getValueType(Op.getResNo()); + if (isa<RegisterSDNode>(Op.getNode())) + continue; + + if (TLI->isTypeLegal(VT) && TLI->getRegClassFor(VT) + && TLI->getRegClassFor(VT)->getID() == RCId) + RegBalance -= numberRCValPredInSU(SU, RCId); + } + return RegBalance; +} + +/// Estimates change in reg pressure from this SU. +/// It is achieved by trivial tracking of defined +/// and used vregs in dependent instructions. +/// The RawPressure flag makes this function ignore +/// existing reg file sizes, and report raw def/use +/// balance.
+signed ResourcePriorityQueue::regPressureDelta(SUnit *SU, bool RawPressure) { + signed RegBalance = 0; + + if (!SU || !SU->getNode() || !SU->getNode()->isMachineOpcode()) + return RegBalance; + + if (RawPressure) { + for (TargetRegisterInfo::regclass_iterator I = TRI->regclass_begin(), + E = TRI->regclass_end(); I != E; ++I) { + const TargetRegisterClass *RC = *I; + RegBalance += rawRegPressureDelta(SU, RC->getID()); + } + } + else { + for (TargetRegisterInfo::regclass_iterator I = TRI->regclass_begin(), + E = TRI->regclass_end(); I != E; ++I) { + const TargetRegisterClass *RC = *I; + if ((RegPressure[RC->getID()] + + rawRegPressureDelta(SU, RC->getID()) > 0) && + (RegPressure[RC->getID()] + + rawRegPressureDelta(SU, RC->getID()) >= RegLimit[RC->getID()])) + RegBalance += rawRegPressureDelta(SU, RC->getID()); + } + } + + return RegBalance; +} + +// Constants used to denote relative importance of +// heuristic components for cost computation. +static const unsigned PriorityOne = 200; +static const unsigned PriorityTwo = 100; +static const unsigned PriorityThree = 50; +static const unsigned PriorityFour = 15; +static const unsigned PriorityFive = 5; +static const unsigned ScaleOne = 20; +static const unsigned ScaleTwo = 10; +static const unsigned ScaleThree = 5; +static const unsigned FactorOne = 2; + +/// Returns single number reflecting benefit of scheduling SU +/// in the current cycle. +signed ResourcePriorityQueue::SUSchedulingCost(SUnit *SU) { + // Initial trivial priority. + signed ResCount = 1; + + // Do not waste time on a node that is already scheduled. + if (SU->isScheduled) + return ResCount; + + // Forced priority is high. + if (SU->isScheduleHigh) + ResCount += PriorityOne; + + // Adaptable scheduling + // A small, but very parallel + // region, where reg pressure is an issue. 
+ if (HorizontalVerticalBalance > RegPressureThreshold) { + // Critical path first + ResCount += (SU->getHeight() * ScaleTwo); + // If resources are available for it, multiply the + // chance of scheduling. + if (isResourceAvailable(SU)) + ResCount <<= FactorOne; + + // Consider change to reg pressure from scheduling + // this SU. + ResCount -= (regPressureDelta(SU,true) * ScaleOne); + } + // Default heuristic, greedy and + // critical path driven. + else { + // Critical path first. + ResCount += (SU->getHeight() * ScaleTwo); + // Now see how many instructions are blocked by this SU. + ResCount += (NumNodesSolelyBlocking[SU->NodeNum] * ScaleTwo); + // If resources are available for it, multiply the + // chance of scheduling. + if (isResourceAvailable(SU)) + ResCount <<= FactorOne; + + ResCount -= (regPressureDelta(SU) * ScaleTwo); + } + + // These are platform specific things. + // Will need to go into the back end + // and accessed from here via a hook. + for (SDNode *N = SU->getNode(); N; N = N->getGluedNode()) { + if (N->isMachineOpcode()) { + const MCInstrDesc &TID = TII->get(N->getMachineOpcode()); + if (TID.isCall()) + ResCount += (PriorityThree + (ScaleThree*N->getNumValues())); + } + else + switch (N->getOpcode()) { + default: break; + case ISD::TokenFactor: + case ISD::CopyFromReg: + case ISD::CopyToReg: + ResCount += PriorityFive; + break; + + case ISD::INLINEASM: + ResCount += PriorityFour; + break; + } + } + return ResCount; +} + + +/// Main resource tracking point. +void ResourcePriorityQueue::ScheduledNode(SUnit *SU) { + // Use NULL entry as an event marker to reset + // the DFA state. + if (!SU) { + ResourcesModel->clearResources(); + Packet.clear(); + return; + } + + const SDNode *ScegN = SU->getNode(); + // Update reg pressure tracking. + // First update current node. + if (ScegN->isMachineOpcode()) { + // Estimate generated regs.
+ for (unsigned i = 0, e = ScegN->getNumValues(); i != e; ++i) { + EVT VT = ScegN->getValueType(i); + + if (TLI->isTypeLegal(VT)) { + const TargetRegisterClass *RC = TLI->getRegClassFor(VT); + if (RC) + RegPressure[RC->getID()] += numberRCValSuccInSU(SU, RC->getID()); + } + } + // Estimate killed regs. + for (unsigned i = 0, e = ScegN->getNumOperands(); i != e; ++i) { + const SDValue &Op = ScegN->getOperand(i); + EVT VT = Op.getNode()->getValueType(Op.getResNo()); + + if (TLI->isTypeLegal(VT)) { + const TargetRegisterClass *RC = TLI->getRegClassFor(VT); + if (RC) { + if (RegPressure[RC->getID()] > + (numberRCValPredInSU(SU, RC->getID()))) + RegPressure[RC->getID()] -= numberRCValPredInSU(SU, RC->getID()); + else RegPressure[RC->getID()] = 0; + } + } + } + for (SUnit::pred_iterator I = SU->Preds.begin(), E = SU->Preds.end(); + I != E; ++I) { + if (I->isCtrl() || (I->getSUnit()->NumRegDefsLeft == 0)) + continue; + --I->getSUnit()->NumRegDefsLeft; + } + } + + // Reserve resources for this SU. + reserveResources(SU); + + // Adjust number of parallel live ranges. + // Heuristic is simple - node with no data successors reduces + // number of live ranges. All others, increase it. + unsigned NumberNonControlDeps = 0; + + for (SUnit::const_succ_iterator I = SU->Succs.begin(), E = SU->Succs.end(); + I != E; ++I) { + adjustPriorityOfUnscheduledPreds(I->getSUnit()); + if (!I->isCtrl()) + NumberNonControlDeps++; + } + + if (!NumberNonControlDeps) { + if (ParallelLiveRanges >= SU->NumPreds) + ParallelLiveRanges -= SU->NumPreds; + else + ParallelLiveRanges = 0; + + } + else + ParallelLiveRanges += SU->NumRegDefsLeft; + + // Track parallel live chains. 
+ HorizontalVerticalBalance += (SU->Succs.size() - numberCtrlDepsInSU(SU)); + HorizontalVerticalBalance -= (SU->Preds.size() - numberCtrlPredInSU(SU)); +} + +void ResourcePriorityQueue::initNumRegDefsLeft(SUnit *SU) { + unsigned NodeNumDefs = 0; + for (SDNode *N = SU->getNode(); N; N = N->getGluedNode()) + if (N->isMachineOpcode()) { + const MCInstrDesc &TID = TII->get(N->getMachineOpcode()); + // No register need be allocated for this. + if (N->getMachineOpcode() == TargetOpcode::IMPLICIT_DEF) { + NodeNumDefs = 0; + break; + } + NodeNumDefs = std::min(N->getNumValues(), TID.getNumDefs()); + } + else + switch(N->getOpcode()) { + default: break; + case ISD::CopyFromReg: + NodeNumDefs++; + break; + case ISD::INLINEASM: + NodeNumDefs++; + break; + } + + SU->NumRegDefsLeft = NodeNumDefs; +} + +/// adjustPriorityOfUnscheduledPreds - One of the predecessors of SU was just +/// scheduled. If SU is not itself available, then there is at least one +/// predecessor node that has not been scheduled yet. If SU has exactly ONE +/// unscheduled predecessor, we want to increase its priority: it getting +/// scheduled will make this node available, so it is better than some other +/// node of the same priority that will not make a node available. +void ResourcePriorityQueue::adjustPriorityOfUnscheduledPreds(SUnit *SU) { + if (SU->isAvailable) return; // All preds scheduled. + + SUnit *OnlyAvailablePred = getSingleUnscheduledPred(SU); + if (OnlyAvailablePred == 0 || !OnlyAvailablePred->isAvailable) + return; + + // Okay, we found a single predecessor that is available, but not scheduled. + // Since it is available, it must be in the priority queue. First remove it. + remove(OnlyAvailablePred); + + // Reinsert the node into the priority queue, which recomputes its + // NumNodesSolelyBlocking value. + push(OnlyAvailablePred); +} + + +/// Main access point - returns next instructions +/// to be placed in scheduling sequence. 
+SUnit *ResourcePriorityQueue::pop() { + if (empty()) + return 0; + + std::vector<SUnit*>::iterator Best = Queue.begin(); + if (!DisableDFASched) { + signed BestCost = SUSchedulingCost(*Best); + for (std::vector<SUnit*>::iterator I = Queue.begin(), + E = Queue.end(); I != E; ++I) { + if (*I == *Best) + continue; + + if (SUSchedulingCost(*I) > BestCost) { + BestCost = SUSchedulingCost(*I); + Best = I; + } + } + } + // Use default TD scheduling mechanism. + else { + for (std::vector<SUnit*>::iterator I = llvm::next(Queue.begin()), + E = Queue.end(); I != E; ++I) + if (Picker(*Best, *I)) + Best = I; + } + + SUnit *V = *Best; + if (Best != prior(Queue.end())) + std::swap(*Best, Queue.back()); + + Queue.pop_back(); + + return V; +} + + +void ResourcePriorityQueue::remove(SUnit *SU) { + assert(!Queue.empty() && "Queue is empty!"); + std::vector<SUnit*>::iterator I = std::find(Queue.begin(), Queue.end(), SU); + if (I != prior(Queue.end())) + std::swap(*I, Queue.back()); + + Queue.pop_back(); +} + + +#ifdef NDEBUG +void ResourcePriorityQueue::dump(ScheduleDAG *DAG) const {} +#else +void ResourcePriorityQueue::dump(ScheduleDAG *DAG) const { + ResourcePriorityQueue q = *this; + while (!q.empty()) { + SUnit *su = q.pop(); + dbgs() << "Height " << su->getHeight() << ": "; + su->dump(DAG); + } +} +#endif Added: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGVLIW.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGVLIW.cpp?rev=149547&view=auto ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGVLIW.cpp (added) +++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGVLIW.cpp Wed Feb 1 16:13:57 2012 @@ -0,0 +1,276 @@ +//===- ScheduleDAGVLIW.cpp - SelectionDAG list scheduler for VLIW -*- C++ -*-=// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details.
+// +//===----------------------------------------------------------------------===// +// +// This implements a top-down list scheduler, using standard algorithms. +// The basic approach uses a priority queue of available nodes to schedule. +// One at a time, nodes are taken from the priority queue (thus in priority +// order), checked for legality to schedule, and emitted if legal. +// +// Nodes may not be legal to schedule either due to structural hazards (e.g. +// pipeline or resource constraints) or because an input to the instruction has +// not completed execution. +// +//===----------------------------------------------------------------------===// + +#define DEBUG_TYPE "pre-RA-sched" +#include "ScheduleDAGSDNodes.h" +#include "llvm/CodeGen/LatencyPriorityQueue.h" +#include "llvm/CodeGen/ScheduleHazardRecognizer.h" +#include "llvm/CodeGen/SchedulerRegistry.h" +#include "llvm/CodeGen/SelectionDAGISel.h" +#include "llvm/Target/TargetRegisterInfo.h" +#include "llvm/Target/TargetData.h" +#include "llvm/Target/TargetInstrInfo.h" +#include "llvm/Support/Debug.h" +#include "llvm/Support/ErrorHandling.h" +#include "llvm/Support/raw_ostream.h" +#include "llvm/ADT/Statistic.h" +#include "llvm/CodeGen/ResourcePriorityQueue.h" +#include <climits> +using namespace llvm; + +STATISTIC(NumNoops , "Number of noops inserted"); +STATISTIC(NumStalls, "Number of pipeline stalls"); + +static RegisterScheduler + VLIWScheduler("vliw-td", "VLIW scheduler", + createVLIWDAGScheduler); + +namespace { +//===----------------------------------------------------------------------===// +/// ScheduleDAGVLIW - The actual DFA list scheduler implementation. This +/// supports top-down scheduling. +/// +class ScheduleDAGVLIW : public ScheduleDAGSDNodes { +private: + /// AvailableQueue - The priority queue to use for the available SUnits.
+ /// + SchedulingPriorityQueue *AvailableQueue; + + /// PendingQueue - This contains all of the instructions whose operands have + /// been issued, but their results are not ready yet (due to the latency of + /// the operation). Once the operands become available, the instruction is + /// added to the AvailableQueue. + std::vector<SUnit*> PendingQueue; + + /// HazardRec - The hazard recognizer to use. + ScheduleHazardRecognizer *HazardRec; + + /// AA - AliasAnalysis for making memory reference queries. + AliasAnalysis *AA; + +public: + ScheduleDAGVLIW(MachineFunction &mf, + AliasAnalysis *aa, + SchedulingPriorityQueue *availqueue) + : ScheduleDAGSDNodes(mf), AvailableQueue(availqueue), AA(aa) { + + const TargetMachine &tm = mf.getTarget(); + HazardRec = tm.getInstrInfo()->CreateTargetHazardRecognizer(&tm, this); + } + + ~ScheduleDAGVLIW() { + delete HazardRec; + delete AvailableQueue; + } + + void Schedule(); + +private: + void releaseSucc(SUnit *SU, const SDep &D); + void releaseSuccessors(SUnit *SU); + void scheduleNodeTopDown(SUnit *SU, unsigned CurCycle); + void listScheduleTopDown(); +}; +} // end anonymous namespace + +/// Schedule - Schedule the DAG using list scheduling. +void ScheduleDAGVLIW::Schedule() { + DEBUG(dbgs() + << "********** List Scheduling BB#" << BB->getNumber() + << " '" << BB->getName() << "' **********\n"); + + // Build the scheduling graph. + BuildSchedGraph(AA); + + AvailableQueue->initNodes(SUnits); + + listScheduleTopDown(); + + AvailableQueue->releaseState(); +} + +//===----------------------------------------------------------------------===// +// Top-Down Scheduling +//===----------------------------------------------------------------------===// + +/// releaseSucc - Decrement the NumPredsLeft count of a successor. Add it to +/// the PendingQueue if the count reaches zero. Also update its cycle bound.
+void ScheduleDAGVLIW::releaseSucc(SUnit *SU, const SDep &D) { + SUnit *SuccSU = D.getSUnit(); + +#ifndef NDEBUG + if (SuccSU->NumPredsLeft == 0) { + dbgs() << "*** Scheduling failed! ***\n"; + SuccSU->dump(this); + dbgs() << " has been released too many times!\n"; + llvm_unreachable(0); + } +#endif + --SuccSU->NumPredsLeft; + + SuccSU->setDepthToAtLeast(SU->getDepth() + D.getLatency()); + + // If all the node's predecessors are scheduled, this node is ready + // to be scheduled. Ignore the special ExitSU node. + if (SuccSU->NumPredsLeft == 0 && SuccSU != &ExitSU) { + PendingQueue.push_back(SuccSU); + } +} + +void ScheduleDAGVLIW::releaseSuccessors(SUnit *SU) { + // Top down: release successors. + for (SUnit::succ_iterator I = SU->Succs.begin(), E = SU->Succs.end(); + I != E; ++I) { + assert(!I->isAssignedRegDep() && + "The list-td scheduler doesn't yet support physreg dependencies!"); + + releaseSucc(SU, *I); + } +} + +/// scheduleNodeTopDown - Add the node to the schedule. Decrement the pending +/// count of its successors. If a successor pending count is zero, add it to +/// the Available queue. +void ScheduleDAGVLIW::scheduleNodeTopDown(SUnit *SU, unsigned CurCycle) { + DEBUG(dbgs() << "*** Scheduling [" << CurCycle << "]: "); + DEBUG(SU->dump(this)); + + Sequence.push_back(SU); + assert(CurCycle >= SU->getDepth() && "Node scheduled above its depth!"); + SU->setDepthToAtLeast(CurCycle); + + releaseSuccessors(SU); + SU->isScheduled = true; + AvailableQueue->ScheduledNode(SU); +} + +/// listScheduleTopDown - The main loop of list scheduling for top-down +/// schedulers. +void ScheduleDAGVLIW::listScheduleTopDown() { + unsigned CurCycle = 0; + + // Release any successors of the special Entry node. + releaseSuccessors(&EntrySU); + + // All leaves to AvailableQueue. + for (unsigned i = 0, e = SUnits.size(); i != e; ++i) { + // It is available if it has no predecessors. 
+ if (SUnits[i].Preds.empty()) { + AvailableQueue->push(&SUnits[i]); + SUnits[i].isAvailable = true; + } + } + + // While AvailableQueue is not empty, grab the node with the highest + // priority. If it is not ready put it back. Schedule the node. + std::vector<SUnit*> NotReady; + Sequence.reserve(SUnits.size()); + while (!AvailableQueue->empty() || !PendingQueue.empty()) { + // Check to see if any of the pending instructions are ready to issue. If + // so, add them to the available queue. + for (unsigned i = 0, e = PendingQueue.size(); i != e; ++i) { + if (PendingQueue[i]->getDepth() == CurCycle) { + AvailableQueue->push(PendingQueue[i]); + PendingQueue[i]->isAvailable = true; + PendingQueue[i] = PendingQueue.back(); + PendingQueue.pop_back(); + --i; --e; + } + else { + assert(PendingQueue[i]->getDepth() > CurCycle && "Negative latency?"); + } + } + + // If there are no instructions available, don't try to issue anything, and + // don't advance the hazard recognizer. + if (AvailableQueue->empty()) { + // Reset DFA state. + AvailableQueue->ScheduledNode(0); + ++CurCycle; + continue; + } + + SUnit *FoundSUnit = 0; + + bool HasNoopHazards = false; + while (!AvailableQueue->empty()) { + SUnit *CurSUnit = AvailableQueue->pop(); + + ScheduleHazardRecognizer::HazardType HT = + HazardRec->getHazardType(CurSUnit, 0/*no stalls*/); + if (HT == ScheduleHazardRecognizer::NoHazard) { + FoundSUnit = CurSUnit; + break; + } + + // Remember if this is a noop hazard. + HasNoopHazards |= HT == ScheduleHazardRecognizer::NoopHazard; + + NotReady.push_back(CurSUnit); + } + + // Add the nodes that aren't ready back onto the available list. + if (!NotReady.empty()) { + AvailableQueue->push_all(NotReady); + NotReady.clear(); + } + + // If we found a node to schedule, do it now. + if (FoundSUnit) { + scheduleNodeTopDown(FoundSUnit, CurCycle); + HazardRec->EmitInstruction(FoundSUnit); + + // If this is a pseudo-op node, we don't want to increment the current + // cycle.
+ if (FoundSUnit->Latency) // Don't increment CurCycle for pseudo-ops! + ++CurCycle; + } else if (!HasNoopHazards) { + // Otherwise, we have a pipeline stall, but no other problem, just advance + // the current cycle and try again. + DEBUG(dbgs() << "*** Advancing cycle, no work to do\n"); + HazardRec->AdvanceCycle(); + ++NumStalls; + ++CurCycle; + } else { + // Otherwise, we have no instructions to issue and we have instructions + // that will fault if we don't do this right. This is the case for + // processors without pipeline interlocks and other cases. + DEBUG(dbgs() << "*** Emitting noop\n"); + HazardRec->EmitNoop(); + Sequence.push_back(0); // NULL here means noop + ++NumNoops; + ++CurCycle; + } + } + +#ifndef NDEBUG + VerifySchedule(/*isBottomUp=*/false); +#endif +} + +//===----------------------------------------------------------------------===// +// Public Constructor Functions +//===----------------------------------------------------------------------===// + +/// createVLIWDAGScheduler - This creates a top-down list scheduler. 
+ScheduleDAGSDNodes * +llvm::createVLIWDAGScheduler(SelectionDAGISel *IS, CodeGenOpt::Level) { + return new ScheduleDAGVLIW(*IS->MF, IS->AA, new ResourcePriorityQueue(IS)); +} Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Wed Feb 1 16:13:57 2012 @@ -225,6 +225,8 @@ return createBURRListDAGScheduler(IS, OptLevel); if (TLI.getSchedulingPreference() == Sched::Hybrid) return createHybridListDAGScheduler(IS, OptLevel); + if (TLI.getSchedulingPreference() == Sched::VLIW) + return createVLIWDAGScheduler(IS, OptLevel); assert(TLI.getSchedulingPreference() == Sched::ILP && "Unknown sched type!"); return createILPListDAGScheduler(IS, OptLevel); Modified: llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.cpp?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.cpp (original) +++ llvm/trunk/lib/Target/Hexagon/HexagonISelLowering.cpp Wed Feb 1 16:13:57 2012 @@ -1298,6 +1298,7 @@ // Needed for DYNAMIC_STACKALLOC expansion. 
unsigned StackRegister = TM.getRegisterInfo()->getStackRegister(); setStackPointerRegisterToSaveRestore(StackRegister); + setSchedulingPreference(Sched::VLIW); } Modified: llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.cpp?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.cpp (original) +++ llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.cpp Wed Feb 1 16:13:57 2012 @@ -24,7 +24,9 @@ #include "llvm/CodeGen/MachineMemOperand.h" #include "llvm/CodeGen/PseudoSourceValue.h" #define GET_INSTRINFO_CTOR +#include "llvm/CodeGen/DFAPacketizer.h" #include "HexagonGenInstrInfo.inc" +#include "HexagonGenDFAPacketizer.inc" #include @@ -469,6 +471,7 @@ } + bool HexagonInstrInfo::isPredicable(MachineInstr *MI) const { bool isPred = MI->getDesc().isPredicable(); @@ -559,6 +562,7 @@ } + int HexagonInstrInfo:: getMatchingCondBranchOpcode(int Opc, bool invertPredicate) const { switch(Opc) { @@ -1450,3 +1454,29 @@ return false; } } + +DFAPacketizer *HexagonInstrInfo:: +CreateTargetScheduleState(const TargetMachine *TM, + const ScheduleDAG *DAG) const { + const InstrItineraryData *II = TM->getInstrItineraryData(); + return TM->getSubtarget().createDFAPacketizer(II); +} + +bool HexagonInstrInfo::isSchedulingBoundary(const MachineInstr *MI, + const MachineBasicBlock *MBB, + const MachineFunction &MF) const { + // Debug info is never a scheduling boundary. It's necessary to be explicit + // due to the special treatment of IT instructions below, otherwise a + // dbg_value followed by an IT will result in the IT instruction being + // considered a scheduling hazard, which is wrong. It should be the actual + // instruction preceding the dbg_value instruction(s), just like it is + // when debug info is not present. 
+ if (MI->isDebugValue()) + return false; + + // Terminators and labels can't be scheduled around. + if (MI->getDesc().isTerminator() || MI->isLabel() || MI->isInlineAsm()) + return true; + + return false; +} Modified: llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.h?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.h (original) +++ llvm/trunk/lib/Target/Hexagon/HexagonInstrInfo.h Wed Feb 1 16:13:57 2012 @@ -135,6 +135,13 @@ isProfitableToDupForIfCvt(MachineBasicBlock &MBB,unsigned NumCycles, const BranchProbability &Probability) const; + virtual DFAPacketizer* + CreateTargetScheduleState(const TargetMachine *TM, + const ScheduleDAG *DAG) const; + + virtual bool isSchedulingBoundary(const MachineInstr *MI, + const MachineBasicBlock *MBB, + const MachineFunction &MF) const; bool isValidOffset(const int Opcode, const int Offset) const; bool isValidAutoIncImm(const EVT VT, const int Offset) const; bool isMemOp(const MachineInstr *MI) const; Modified: llvm/trunk/lib/Target/Hexagon/HexagonSubtarget.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/HexagonSubtarget.cpp?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/lib/Target/Hexagon/HexagonSubtarget.cpp (original) +++ llvm/trunk/lib/Target/Hexagon/HexagonSubtarget.cpp Wed Feb 1 16:13:57 2012 @@ -52,6 +52,9 @@ // Initialize scheduling itinerary for the specified CPU. InstrItins = getInstrItineraryForCPU(CPUString); + // Max issue per cycle == bundle width. 
+ InstrItins.IssueWidth = 4; + if (EnableMemOps) UseMemOps = true; else Modified: llvm/trunk/lib/Target/Hexagon/Makefile URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/Makefile?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/lib/Target/Hexagon/Makefile (original) +++ llvm/trunk/lib/Target/Hexagon/Makefile Wed Feb 1 16:13:57 2012 @@ -16,6 +16,7 @@ HexagonGenAsmWriter.inc \ HexagonGenDAGISel.inc HexagonGenSubtargetInfo.inc \ HexagonGenCallingConv.inc \ + HexagonGenDFAPacketizer.inc \ HexagonAsmPrinter.cpp DIRS = TargetInfo MCTargetDesc Modified: llvm/trunk/test/CodeGen/Hexagon/args.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Hexagon/args.ll?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Hexagon/args.ll (original) +++ llvm/trunk/test/CodeGen/Hexagon/args.ll Wed Feb 1 16:13:57 2012 @@ -1,4 +1,4 @@ -; RUN: llc -march=hexagon -mcpu=hexagonv4 < %s | FileCheck %s +; RUN: llc -march=hexagon -mcpu=hexagonv4 -disable-dfa-sched < %s | FileCheck %s ; CHECK: r[[T0:[0-9]+]] = #7 ; CHECK: memw(r29 + #0) = r[[T0]] ; CHECK: r0 = #1 Modified: llvm/trunk/test/CodeGen/Hexagon/static.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Hexagon/static.ll?rev=149547&r1=149546&r2=149547&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Hexagon/static.ll (original) +++ llvm/trunk/test/CodeGen/Hexagon/static.ll Wed Feb 1 16:13:57 2012 @@ -1,12 +1,12 @@ -; RUN: llc -march=hexagon -mcpu=hexagonv4 < %s | FileCheck %s +; RUN: llc -march=hexagon -mcpu=hexagonv4 -disable-dfa-sched < %s | FileCheck %s @num = external global i32 @acc = external global i32 @val = external global i32 +; CHECK: CONST32(#num) ; CHECK: CONST32(#acc) ; CHECK: CONST32(#val) -; CHECK: CONST32(#num) define void 
@foo() nounwind { entry: From wangmp at apple.com Wed Feb 1 16:15:21 2012 From: wangmp at apple.com (Mon P Wang) Date: Wed, 01 Feb 2012 22:15:21 -0000 Subject: [llvm-commits] [llvm] r149548 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/shl-i64.ll Message-ID: <20120201221521.215BE2A6C12C@llvm.org> Author: wangmp Date: Wed Feb 1 16:15:20 2012 New Revision: 149548 URL: http://llvm.org/viewvc/llvm-project?rev=149548&view=rev Log: Avoid creating an extract element to an illegal type after LegalizeTypes has run. Added: llvm/trunk/test/CodeGen/X86/shl-i64.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149548&r1=149547&r2=149548&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Feb 1 16:15:20 2012 @@ -13604,6 +13604,7 @@ /// PerformShiftCombine - Transforms vector shift nodes to use vector shifts /// when possible. static SDValue PerformShiftCombine(SDNode* N, SelectionDAG &DAG, + TargetLowering::DAGCombinerInfo &DCI, const X86Subtarget *Subtarget) { EVT VT = N->getValueType(0); if (N->getOpcode() == ISD::SHL) { @@ -13667,9 +13668,16 @@ BaseShAmt = InVec.getOperand(1); } } - if (BaseShAmt.getNode() == 0) + if (BaseShAmt.getNode() == 0) { + // Don't create instructions with illegal types after legalize + // types has run. 
+ if (!DAG.getTargetLoweringInfo().isTypeLegal(EltVT) && + !DCI.isBeforeLegalize()) + return SDValue(); + BaseShAmt = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, EltVT, ShAmtOp, DAG.getIntPtrConstant(0)); + } } else return SDValue(); @@ -14833,7 +14841,7 @@ case ISD::MUL: return PerformMulCombine(N, DAG, DCI); case ISD::SHL: case ISD::SRA: - case ISD::SRL: return PerformShiftCombine(N, DAG, Subtarget); + case ISD::SRL: return PerformShiftCombine(N, DAG, DCI, Subtarget); case ISD::AND: return PerformAndCombine(N, DAG, DCI, Subtarget); case ISD::OR: return PerformOrCombine(N, DAG, DCI, Subtarget); case ISD::XOR: return PerformXorCombine(N, DAG, DCI, Subtarget); Added: llvm/trunk/test/CodeGen/X86/shl-i64.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/shl-i64.ll?rev=149548&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/shl-i64.ll (added) +++ llvm/trunk/test/CodeGen/X86/shl-i64.ll Wed Feb 1 16:15:20 2012 @@ -0,0 +1,20 @@ +; RUN: llc -march=x86 < %s | FileCheck %s + +; Make sure that we don't generate an illegal i64 extract after LegalizeType. 
+; CHECK: shll + + +define void @test_cl(<4 x i64>* %dst, <4 x i64>* %src, i32 %idx) { +entry: + %arrayidx = getelementptr inbounds <4 x i64> * %src, i32 %idx + %0 = load <4 x i64> * %arrayidx, align 32 + %arrayidx1 = getelementptr inbounds <4 x i64> * %dst, i32 %idx + %1 = load <4 x i64> * %arrayidx1, align 32 + %2 = extractelement <4 x i64> %1, i32 0 + %and = and i64 %2, 63 + %3 = insertelement <4 x i64> undef, i64 %and, i32 0 + %splat = shufflevector <4 x i64> %3, <4 x i64> undef, <4 x i32> zeroinitializer + %shl = shl <4 x i64> %0, %splat + store <4 x i64> %shl, <4 x i64> * %arrayidx1, align 32 + ret void +} From atrick at apple.com Wed Feb 1 16:23:31 2012 From: atrick at apple.com (Andrew Trick) Date: Wed, 01 Feb 2012 14:23:31 -0800 Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review In-Reply-To: <0b6f01cce123$b5ff8ab0$21fea010$@org> References: <07f401cce02b$87f18630$97d49290$@org> <42CFE65B-4116-41F3-BC56-50FB139847C9@apple.com> <0b6f01cce123$b5ff8ab0$21fea010$@org> Message-ID: <50829CFC-8852-4E25-B545-CCC87136D183@apple.com> On Feb 1, 2012, at 12:54 PM, Sergei Larin wrote: > This looks great to me. Please commit it. Thanks a lot. Committed r149547. -Andy > > -- > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. 
> > From: Andrew Trick [mailto:atrick at apple.com] > Sent: Wednesday, February 01, 2012 2:35 PM > To: Sergei Larin > Cc: llvm-commits at cs.uiuc.edu > Subject: Re: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review > > > On Jan 31, 2012, at 11:42 PM, Andrew Trick wrote: > > > On Jan 31, 2012, at 7:18 AM, Sergei Larin wrote: > > From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Sergei Larin > Sent: Friday, January 27, 2012 10:47 AM > To: llvm-commits at cs.uiuc.edu > Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review > > > Hello everybody, > > Attached is an initial patch for a VLIW specific scheduler framework that utilizes a deterministic finite automaton (DFA). > > Several key points: > - The scheduler is largely based on the existing framework, but introduces several VLIW specific concepts. It could be classified as a top down list scheduler, critical path first, with DFA used for parallel resources modeling. It also models and tracks register pressure in a way similar to the current RegPressure scheduler. It employs a slightly different way to compute the "cost" function for all SUs in AQ which allows for somewhat easier balancing of multiple heuristic inputs. Current version does _not_ generate bundles/packets (but models them internally). It could be easily modified to do so, and it is our plan to make it a part of bundle generation in the near future. > - The scheduler is enabled for the Hexagon backend. Compared to any existing scheduler, for this VLIW target this code produces between 1.9% slowdown and 11% speedup on our internal test suite. This test set is comprised of a variety of real world applications ranging from DSP specific applications to SPEC. Some DSP kernels (when taken out of context) enjoy up to 20% speedup when compared to the "default" scheduling mechanism (RegPressure pre-RA + post RA). 
Main reason for this kind of corner case behavior is long chains of independent memory accesses that are conservatively serialized by the default scheduler (and there is no HW scheduler to sort it out at the run time). > - This patch is an initial submission with a bare minimum of features, and more heuristics will be added to it later. We prefer to submit it in stages to simplify review process and improve SW management. > - Patch also contains minor updates to two Hexagon specific tests in order to compensate for new order of instructions generated by the Hexagon backend __with scheduler disabled__. > - SVN revision 149130. LLVM verification test run for x86 platform detects no additional failures. > > Comments and reviews are eagerly anticipated :) > > Sergei, > > Let me know if these superficial changes are ok with you, and I'll commit (or if you can commit, go ahead): > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/f0c505fc/attachment-0001.html From stoklund at 2pi.dk Wed Feb 1 16:19:26 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Wed, 01 Feb 2012 22:19:26 -0000 Subject: [llvm-commits] [llvm] r149549 - /llvm/trunk/utils/TableGen/CodeGenRegisters.cpp Message-ID: <20120201221926.4427D2A6C12C@llvm.org> Author: stoklund Date: Wed Feb 1 16:19:26 2012 New Revision: 149549 URL: http://llvm.org/viewvc/llvm-project?rev=149549&view=rev Log: Fix a bug in the TopoOrderRC comparison function. The final tie breaker comparison also needs to return +/-1, or 0. This is not a less() function. This could cause otherwise identical super-classes to be ordered unstably, depending on what the system qsort routine does with a bad compare function. 
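[Illustrative note, not part of the original commit.] The log above turns on the difference between a less()-style predicate and a qsort-style three-way comparator. A minimal sketch of that difference follows, assuming a hypothetical Rec record with a size and a name; Rec, compareBuggy, compareFixed and sortRecs are made-up names for illustration and are not the TableGen code. The buggy tie breaker can only return 0 or 1, so it reports "A sorts after B" exactly when A's name is smaller and "equal" otherwise, which is the kind of inconsistent answer qsort may turn into an unstable order.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical record type for illustration only; not a TableGen class.
struct Rec {
  int Size;
  const char *Name;
};

// Buggy tie breaker: a bool-valued "less" result is returned where a
// three-way result is expected, so the function yields only 0 or 1.
int compareBuggy(const void *PA, const void *PB) {
  const Rec *A = static_cast<const Rec *>(PA);
  const Rec *B = static_cast<const Rec *>(PB);
  if (A->Size != B->Size)
    return A->Size < B->Size ? -1 : 1;
  return std::strcmp(A->Name, B->Name) < 0;
}

// Fixed tie breaker: negative, zero, or positive, as qsort requires.
int compareFixed(const void *PA, const void *PB) {
  const Rec *A = static_cast<const Rec *>(PA);
  const Rec *B = static_cast<const Rec *>(PB);
  if (A->Size != B->Size)
    return A->Size < B->Size ? -1 : 1;
  return std::strcmp(A->Name, B->Name);
}

// Sort an array of records by size, then by name.
void sortRecs(Rec *Recs, std::size_t N) {
  std::qsort(Recs, N, sizeof(Rec), compareFixed);
}
```

With the fixed comparator, swapping the arguments flips the sign of the result, which is the property the sort routine relies on. The diff that follows gets the same effect by calling StringRef::compare instead of operator<.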
Modified: llvm/trunk/utils/TableGen/CodeGenRegisters.cpp Modified: llvm/trunk/utils/TableGen/CodeGenRegisters.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenRegisters.cpp?rev=149549&r1=149548&r2=149549&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CodeGenRegisters.cpp (original) +++ llvm/trunk/utils/TableGen/CodeGenRegisters.cpp Wed Feb 1 16:19:26 2012 @@ -503,7 +503,7 @@ return 1; // Finally order by name as a tie breaker. - return A->getName() < B->getName(); + return StringRef(A->getName()).compare(B->getName()); } std::string CodeGenRegisterClass::getQualifiedName() const { From atrick at apple.com Wed Feb 1 16:28:29 2012 From: atrick at apple.com (Andrew Trick) Date: Wed, 01 Feb 2012 22:28:29 -0000 Subject: [llvm-commits] [llvm] r149553 - /llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt Message-ID: <20120201222829.C23212A6C12C@llvm.org> Author: atrick Date: Wed Feb 1 16:28:29 2012 New Revision: 149553 URL: http://llvm.org/viewvc/llvm-project?rev=149553&view=rev Log: fix cmake Modified: llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt Modified: llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt?rev=149553&r1=149552&r2=149553&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt Wed Feb 1 16:28:29 2012 @@ -18,7 +18,7 @@ SelectionDAGBuilder.cpp SelectionDAGISel.cpp SelectionDAGPrinter.cpp - SelectionDAGVLIW.cpp + ScheduleDAGVLIW.cpp TargetLowering.cpp TargetSelectionDAGInfo.cpp ) From eli.friedman at gmail.com Wed Feb 1 16:47:50 2012 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 1 Feb 2012 14:47:50 -0800 Subject: [llvm-commits] cxa_guard elimination In-Reply-To: <4F24FA31.50601@mxc.ca> References: 
<4F0903F9.405@mxc.ca> <4F0BF82B.7030206@mxc.ca> <4F24F0E4.4090701@mxc.ca> <4F24FA31.50601@mxc.ca> Message-ID: On Sat, Jan 28, 2012 at 11:50 PM, Nick Lewycky wrote: > Nick Lewycky wrote: >> >> Nick Lewycky wrote: >>> >>> Eli Friedman wrote: >>>> >>>> On Sat, Jan 7, 2012 at 6:48 PM, Nick Lewycky wrote: >>>>> >>>>> This patch implements some really basic elimination of calls to >>>>> __cxa_guard. >>>>> More advanced cases can be added on request. Patch attached, please >>>>> review! >>>>> >>>>> There's one thing I need to draw to a reviewer's attention. I make the >>>>> assumption that the code in the 'not-yet-initialized' path is indeed >>>>> initializing. That is to say that I assume that this pseudo-code: >>>>> >>>>> *global = x; >>>>> if (expr) { >>>>> if (cxa_guard_acquire(global_guard)) { >>>>> *global = 0; >>>>> cxa_guard_release(global_guard); >>>>> } >>>>> } >>>>> use(global); >>>>> >>>>> represents undefined behavior because global is initialized before >>>>> the call >>>>> to __cxa_guard_acquire, and the guard is protecting a normal store, >>>>> not the >>>>> first store. >>>>> >>>>> The Itanium C++ ABI never states that guard_acquire/release must be >>>>> used for >>>>> *initializing*, so this patch arguably can miscompile. However, it's >>>>> generally understood that this is for initialization (indeed, the >>>>> section >>>>> heading is "Once-time initialization API") so I'm hoping we can >>>>> agree that >>>>> this assumption is valid and ask users doing other things with >>>>> cxa_guard_acquire/release to pass -fno-builtins. >>>> >>>> >>>> I think I agree with your assumption for the global associated with >>>> the guard. You can't make that assumption for all globals, though. >>>> Consider something like the following: >>>> >>>> static int x; >>>> struct C { C() { x = 10; } }; >>>> int f() { x = 20; static C dummy; return x; } >>>> >>>> As far as I can tell, with a bit of inlining etc. 
we end up with code >>>> which looks exactly like your bad pattern. >>> >>> >>> Clever! Thanks for the testcase. >>> >>> That's unfortunate, but repairable. Given that we already check that all >>> accesses come from within the same function, we'll need to show that no >>> access is made to the global before the acquire is reached (and don't >>> forget recursive calls). >>> >>> Regrettably, it means that we'll probably need a domtree (or else do >>> potentially-expensive CFG analysis ourselves). I'll see whether it still >>> fits in -simplify-libcalls when I'm done. >> >> >> ... done! >> >> The good news is that we already have a valid DomTree by the time >> simplify-libcalls runs, so we're good to request one here. >> >> The updated patch is barely more complicated, there's just an additional >> check to see whether there's a single predecessor block that's testing >> the guard variable, and an additional domtree BB test (so, no linear >> time) on each StoreInst. No sweat! >> >> Please review! I'd like to land this patch and then start working on >> some of its deficiencies in follow-up patches. > > > Uh... I mailed out the entirely wrong patch. > > Correct patch attached! > > Nick > >> >> It doesn't transform a lot of things due to pass ordering problems, and >> even when it does fire, we don't get optimal code because of more pass >> ordering problems. One thing to note is that we put it in >> simplify-libcalls because it's folding away known library calls, but it >> might be better off in globalopt because what it's really doing is >> converting dynamic initialization to static initialization. I haven't >> really decided what the right move is yet, but I'm seeing things like: >> >> static int x; >> struct C { C() { x = 10; } }; >> int f() { static C dummy; return x; } >> >> fail to optimize because 'x' is written to in the two constructor >> functions the ABI requires, and those functions get deleted at the end >> of the optz'n pass anyways. 
The good news is that we get near-optimal >> code on that testcase if you run opt -O2 twice, or if you wrap struct C >> in an anonymous namespace. Progress! + if (C) { + if (isa(C)) + return true; // eg., used in initializer of another global. + if (BlockAddress *BA = dyn_cast(C)) + return BA->getFunction() == F; Why are you handling BlockAddress specially here? And what is this handling supposed to even do? (I'm having a bit of trouble reasoning about it because I don't think it's possible to hit this case with your patch.) GlobalOpt has some similar simulation code; it would be nice if you could somehow leverage that (but it's not a hard requirement). Otherwise, this is looking good. -Eli From rafael.espindola at gmail.com Wed Feb 1 16:51:22 2012 From: rafael.espindola at gmail.com (=?UTF-8?Q?Rafael_Esp=C3=ADndola?=) Date: Wed, 1 Feb 2012 23:51:22 +0100 Subject: [llvm-commits] [llvm] r149498 - in /llvm/trunk: autoconf/configure.ac configure In-Reply-To: <20120201140622.2E4BD2A6C12C@llvm.org> References: <20120201140622.2E4BD2A6C12C@llvm.org> Message-ID: > (I lack the right version of autoconf et al. to regen, but it > was a simple change, so I just updated configure manually.) This tends to introduce dummy changes when someone does run the autotools :-( Cheers, Rafael From stoklund at 2pi.dk Wed Feb 1 17:16:41 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Wed, 01 Feb 2012 23:16:41 -0000 Subject: [llvm-commits] [llvm] r149556 - in /llvm/trunk: include/llvm/Target/Target.td utils/TableGen/CodeGenRegisters.cpp utils/TableGen/CodeGenRegisters.h Message-ID: <20120201231641.620562A6C12C@llvm.org> Author: stoklund Date: Wed Feb 1 17:16:41 2012 New Revision: 149556 URL: http://llvm.org/viewvc/llvm-project?rev=149556&view=rev Log: Specify SubRegIndex components on the index itself. 
It is simpler to define a composite index directly: def ssub_2 : SubRegIndex<[dsub_1, ssub_0]>; def ssub_3 : SubRegIndex<[dsub_1, ssub_1]>; Than specifying the composite indices on each register: CompositeIndices = [(ssub_2 dsub_1, ssub_0), (ssub_3 dsub_1, ssub_1)] in ... This also makes it clear that SubRegIndex composition is supposed to be unique. Modified: llvm/trunk/include/llvm/Target/Target.td llvm/trunk/utils/TableGen/CodeGenRegisters.cpp llvm/trunk/utils/TableGen/CodeGenRegisters.h Modified: llvm/trunk/include/llvm/Target/Target.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/Target.td?rev=149556&r1=149555&r2=149556&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/Target.td (original) +++ llvm/trunk/include/llvm/Target/Target.td Wed Feb 1 17:16:41 2012 @@ -22,8 +22,12 @@ class RegisterClass; // Forward def // SubRegIndex - Use instances of SubRegIndex to identify subregisters. -class SubRegIndex { +class SubRegIndex<list<SubRegIndex> comps = []> { string Namespace = ""; + + // ComposedOf - A list of two SubRegIndex instances, [A, B]. + // This indicates that this SubRegIndex is the result of composing A and B. 
+ list ComposedOf = comps; } // RegAltNameIndex - The alternate name set to use for register operands of Modified: llvm/trunk/utils/TableGen/CodeGenRegisters.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenRegisters.cpp?rev=149556&r1=149555&r2=149556&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CodeGenRegisters.cpp (original) +++ llvm/trunk/utils/TableGen/CodeGenRegisters.cpp Wed Feb 1 17:16:41 2012 @@ -49,6 +49,19 @@ return N; } +void CodeGenSubRegIndex::updateComponents(CodeGenRegBank &RegBank) { + std::vector Comps = TheDef->getValueAsListOfDefs("ComposedOf"); + if (Comps.empty()) + return; + if (Comps.size() != 2) + throw TGError(TheDef->getLoc(), "ComposedOf must have exactly two entries"); + CodeGenSubRegIndex *A = RegBank.getSubRegIdx(Comps[0]); + CodeGenSubRegIndex *B = RegBank.getSubRegIdx(Comps[1]); + CodeGenSubRegIndex *X = A->addComposite(B, this); + if (X) + throw TGError(TheDef->getLoc(), "Ambiguous ComposedOf entries"); +} + void CodeGenSubRegIndex::cleanComposites() { // Clean out redundant mappings of the form this+X -> X. for (CompMap::iterator i = Composed.begin(), e = Composed.end(); i != e;) { @@ -75,15 +88,6 @@ return TheDef->getName(); } -namespace { - struct Orphan { - CodeGenRegister *SubReg; - CodeGenSubRegIndex *First, *Second; - Orphan(CodeGenRegister *r, CodeGenSubRegIndex *a, CodeGenSubRegIndex *b) - : SubReg(r), First(a), Second(b) {} - }; -} - const CodeGenRegister::SubRegMap & CodeGenRegister::getSubRegs(CodeGenRegBank &RegBank) { // Only compute this map once. 
@@ -92,28 +96,29 @@ SubRegsComplete = true; std::vector SubList = TheDef->getValueAsListOfDefs("SubRegs"); - std::vector Indices = TheDef->getValueAsListOfDefs("SubRegIndices"); - if (SubList.size() != Indices.size()) + std::vector IdxList = TheDef->getValueAsListOfDefs("SubRegIndices"); + if (SubList.size() != IdxList.size()) throw TGError(TheDef->getLoc(), "Register " + getName() + " SubRegIndices doesn't match SubRegs"); // First insert the direct subregs and make sure they are fully indexed. + SmallVector Indices; for (unsigned i = 0, e = SubList.size(); i != e; ++i) { CodeGenRegister *SR = RegBank.getReg(SubList[i]); - CodeGenSubRegIndex *Idx = RegBank.getSubRegIdx(Indices[i]); + CodeGenSubRegIndex *Idx = RegBank.getSubRegIdx(IdxList[i]); + Indices.push_back(Idx); if (!SubRegs.insert(std::make_pair(Idx, SR)).second) throw TGError(TheDef->getLoc(), "SubRegIndex " + Idx->getName() + " appears twice in Register " + getName()); } // Keep track of inherited subregs and how they can be reached. - SmallVector Orphans; + SmallPtrSet Orphans; - // Clone inherited subregs and place duplicate entries on Orphans. + // Clone inherited subregs and place duplicate entries in Orphans. // Here the order is important - earlier subregs take precedence. for (unsigned i = 0, e = SubList.size(); i != e; ++i) { CodeGenRegister *SR = RegBank.getReg(SubList[i]); - CodeGenSubRegIndex *Idx = RegBank.getSubRegIdx(Indices[i]); const SubRegMap &Map = SR->getSubRegs(RegBank); // Add this as a super-register of SR now all sub-registers are in the list. @@ -124,7 +129,7 @@ for (SubRegMap::const_iterator SI = Map.begin(), SE = Map.end(); SI != SE; ++SI) { if (!SubRegs.insert(*SI).second) - Orphans.push_back(Orphan(SI->second, Idx, SI->first)); + Orphans.insert(SI->second); // Noop sub-register indexes are possible, so avoid duplicates. if (SI->second != SR) @@ -132,6 +137,33 @@ } } + // Expand any composed subreg indices. 
+ // If dsub_2 has ComposedOf = [qsub_1, dsub_0], and this register has a + // qsub_1 subreg, add a dsub_2 subreg. Keep growing Indices and process + // expanded subreg indices recursively. + for (unsigned i = 0; i != Indices.size(); ++i) { + CodeGenSubRegIndex *Idx = Indices[i]; + const CodeGenSubRegIndex::CompMap &Comps = Idx->getComposites(); + CodeGenRegister *SR = SubRegs[Idx]; + const SubRegMap &Map = SR->getSubRegs(RegBank); + + // Look at the possible compositions of Idx. + // They may not all be supported by SR. + for (CodeGenSubRegIndex::CompMap::const_iterator I = Comps.begin(), + E = Comps.end(); I != E; ++I) { + SubRegMap::const_iterator SRI = Map.find(I->first); + if (SRI == Map.end()) + continue; // Idx + I->first doesn't exist in SR. + // Add I->second as a name for the subreg SRI->second, assuming it is + // orphaned, and the name isn't already used for something else. + if (SubRegs.count(I->second) || !Orphans.erase(SRI->second)) + continue; + // We found a new name for the orphaned sub-register. + SubRegs.insert(std::make_pair(I->second, SRI->second)); + Indices.push_back(I->second); + } + } + // Process the composites. ListInit *Comps = TheDef->getValueAsListInit("CompositeIndices"); for (unsigned i = 0, e = Comps->size(); i != e; ++i) { @@ -167,19 +199,33 @@ SubRegs[BaseIdx] = R2; // R2 is no longer an orphan. - for (unsigned j = 0, je = Orphans.size(); j != je; ++j) - if (Orphans[j].SubReg == R2) - Orphans[j].SubReg = 0; + Orphans.erase(R2); } // Now Orphans contains the inherited subregisters without a direct index. // Create inferred indexes for all missing entries. - for (unsigned i = 0, e = Orphans.size(); i != e; ++i) { - Orphan &O = Orphans[i]; - if (!O.SubReg) - continue; - SubRegs[RegBank.getCompositeSubRegIndex(O.First, O.Second)] = - O.SubReg; + // Work backwards in the Indices vector in order to compose subregs bottom-up. 
+ // Consider this subreg sequence: + // + // qsub_1 -> dsub_0 -> ssub_0 + // + // The qsub_1 -> dsub_0 composition becomes dsub_2, so the ssub_0 register + // can be reached in two different ways: + // + // qsub_1 -> ssub_0 + // dsub_2 -> ssub_0 + // + // We pick the latter composition because another register may have [dsub_0, + // dsub_1, dsub_2] subregs without neccessarily having a qsub_1 subreg. The + // dsub_2 -> ssub_0 composition can be shared. + while (!Indices.empty() && !Orphans.empty()) { + CodeGenSubRegIndex *Idx = Indices.pop_back_val(); + CodeGenRegister *SR = SubRegs[Idx]; + const SubRegMap &Map = SR->getSubRegs(RegBank); + for (SubRegMap::const_iterator SI = Map.begin(), SE = Map.end(); SI != SE; + ++SI) + if (Orphans.erase(SI->second)) + SubRegs[RegBank.getCompositeSubRegIndex(Idx, SI->first)] = SI->second; } return SubRegs; } @@ -590,6 +636,9 @@ NumNamedIndices = SRIs.size(); for (unsigned i = 0, e = SRIs.size(); i != e; ++i) getSubRegIdx(SRIs[i]); + // Build composite maps from ComposedOf fields. + for (unsigned i = 0, e = SubRegIndices.size(); i != e; ++i) + SubRegIndices[i]->updateComponents(*this); // Read in the register definitions. std::vector Regs = Records.getAllDerivedDefinitions("Register"); Modified: llvm/trunk/utils/TableGen/CodeGenRegisters.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenRegisters.h?rev=149556&r1=149555&r2=149556&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CodeGenRegisters.h (original) +++ llvm/trunk/utils/TableGen/CodeGenRegisters.h Wed Feb 1 17:16:41 2012 @@ -71,6 +71,9 @@ return (Ins.second || Ins.first->second == B) ? 0 : Ins.first->second; } + // Update the composite maps of components specified in 'ComposedOf'. + void updateComponents(CodeGenRegBank&); + // Clean out redundant composite mappings. 
   void cleanComposites();


From stoklund at 2pi.dk Wed Feb 1 17:16:44 2012
From: stoklund at 2pi.dk (Jakob Stoklund Olesen)
Date: Wed, 01 Feb 2012 23:16:44 -0000
Subject: [llvm-commits] [llvm] r149557 - /llvm/trunk/lib/Target/ARM/ARMRegisterInfo.td
Message-ID: <20120201231644.2F7AE2A6C12C@llvm.org>

Author: stoklund
Date: Wed Feb 1 17:16:43 2012
New Revision: 149557

URL: http://llvm.org/viewvc/llvm-project?rev=149557&view=rev
Log:
Move ARM subreg index compositions to the SubRegIndex itself.

Modified:
    llvm/trunk/lib/Target/ARM/ARMRegisterInfo.td

Modified: llvm/trunk/lib/Target/ARM/ARMRegisterInfo.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMRegisterInfo.td?rev=149557&r1=149556&r2=149557&view=diff
==============================================================================
--- llvm/trunk/lib/Target/ARM/ARMRegisterInfo.td (original)
+++ llvm/trunk/lib/Target/ARM/ARMRegisterInfo.td Wed Feb 1 17:16:43 2012
@@ -27,28 +27,30 @@
 
 // Subregister indices.
 let Namespace = "ARM" in {
+def qqsub_0 : SubRegIndex;
+def qqsub_1 : SubRegIndex;
+
 // Note: Code depends on these having consecutive numbers.
-def ssub_0 : SubRegIndex;
-def ssub_1 : SubRegIndex;
-def ssub_2 : SubRegIndex; // In a Q reg.
-def ssub_3 : SubRegIndex;
+def qsub_0 : SubRegIndex;
+def qsub_1 : SubRegIndex;
+def qsub_2 : SubRegIndex<[qqsub_1, qsub_0]>;
+def qsub_3 : SubRegIndex<[qqsub_1, qsub_1]>;
 
 def dsub_0 : SubRegIndex;
 def dsub_1 : SubRegIndex;
-def dsub_2 : SubRegIndex;
-def dsub_3 : SubRegIndex;
-def dsub_4 : SubRegIndex;
-def dsub_5 : SubRegIndex;
-def dsub_6 : SubRegIndex;
-def dsub_7 : SubRegIndex;
+def dsub_2 : SubRegIndex<[qsub_1, dsub_0]>;
+def dsub_3 : SubRegIndex<[qsub_1, dsub_1]>;
+def dsub_4 : SubRegIndex<[qsub_2, dsub_0]>;
+def dsub_5 : SubRegIndex<[qsub_2, dsub_1]>;
+def dsub_6 : SubRegIndex<[qsub_3, dsub_0]>;
+def dsub_7 : SubRegIndex<[qsub_3, dsub_1]>;
 
-def qsub_0 : SubRegIndex;
-def qsub_1 : SubRegIndex;
-def qsub_2 : SubRegIndex;
-def qsub_3 : SubRegIndex;
-
-def qqsub_0 : SubRegIndex;
-def qqsub_1 : SubRegIndex;
+def ssub_0 : SubRegIndex;
+def ssub_1 : SubRegIndex;
+def ssub_2 : SubRegIndex<[dsub_1, ssub_0]>;
+def ssub_3 : SubRegIndex<[dsub_1, ssub_1]>;
+// Let TableGen synthesize the remaining 12 ssub_* indices.
+// We don't need to name them.
 }
 
 // Integer registers
@@ -129,9 +131,7 @@
 def D31 : ARMFReg<31, "d31">, DwarfRegNum<[287]>;
 
 // Advanced SIMD (NEON) defines 16 quad-word aliases
-let SubRegIndices = [dsub_0, dsub_1],
- CompositeIndices = [(ssub_2 dsub_1, ssub_0),
- (ssub_3 dsub_1, ssub_1)] in {
+let SubRegIndices = [dsub_0, dsub_1] in {
 def Q0 : ARMReg< 0, "q0", [D0, D1]>;
 def Q1 : ARMReg< 1, "q1", [D2, D3]>;
 def Q2 : ARMReg< 2, "q2", [D4, D5]>;
@@ -297,9 +297,7 @@
 // stuff very messy.
 def Tuples2Q : RegisterTuples<[qsub_0, qsub_1],
                               [(decimate QPR, 2),
- (decimate (shl QPR, 1), 2)]> {
- let CompositeIndices = [(dsub_2 qsub_1, dsub_0), (dsub_3 qsub_1, dsub_1)];
-}
+ (decimate (shl QPR, 1), 2)]>;
 
 // Pseudo 256-bit vector register class to model pairs of Q registers
 // (4 consecutive D registers).
@@ -314,11 +312,7 @@
 // Pseudo 512-bit registers to represent four consecutive Q registers.
 def Tuples2QQ : RegisterTuples<[qqsub_0, qqsub_1],
                                [(decimate QQPR, 2),
- (decimate (shl QQPR, 1), 2)]> {
- let CompositeIndices = [(qsub_2 qqsub_1, qsub_0), (qsub_3 qqsub_1, qsub_1),
- (dsub_4 qqsub_1, dsub_0), (dsub_5 qqsub_1, dsub_1),
- (dsub_6 qqsub_1, dsub_2), (dsub_7 qqsub_1, dsub_3)];
-}
+ (decimate (shl QQPR, 1), 2)]>;
 
 // Pseudo 512-bit vector register class to model 4 consecutive Q registers
 // (8 consecutive D registers).


From atrick at apple.com Wed Feb 1 17:20:52 2012
From: atrick at apple.com (Andrew Trick)
Date: Wed, 01 Feb 2012 23:20:52 -0000
Subject: [llvm-commits] [llvm] r149558 - in /llvm/trunk: lib/Target/X86/ test/CodeGen/X86/
Message-ID: <20120201232053.E046B2A6C12C@llvm.org>

Author: atrick
Date: Wed Feb 1 17:20:51 2012
New Revision: 149558

URL: http://llvm.org/viewvc/llvm-project?rev=149558&view=rev
Log:
Instruction scheduling itinerary for Intel Atom.

Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.

Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.

Adds a test to verify that the scheduler is working.

Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.

Patch by Preston Gurd!
Added: llvm/trunk/lib/Target/X86/X86Schedule.td llvm/trunk/lib/Target/X86/X86ScheduleAtom.td llvm/trunk/test/CodeGen/X86/atom-sched.ll Modified: llvm/trunk/lib/Target/X86/X86.td llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/lib/Target/X86/X86InstrArithmetic.td llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td llvm/trunk/lib/Target/X86/X86InstrControl.td llvm/trunk/lib/Target/X86/X86InstrFormats.td llvm/trunk/lib/Target/X86/X86InstrMMX.td llvm/trunk/lib/Target/X86/X86InstrSSE.td llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td llvm/trunk/lib/Target/X86/X86Subtarget.cpp llvm/trunk/lib/Target/X86/X86Subtarget.h llvm/trunk/lib/Target/X86/X86TargetMachine.cpp llvm/trunk/lib/Target/X86/X86TargetMachine.h llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll llvm/trunk/test/CodeGen/X86/abi-isel.ll llvm/trunk/test/CodeGen/X86/add.ll llvm/trunk/test/CodeGen/X86/byval6.ll llvm/trunk/test/CodeGen/X86/divide-by-constant.ll llvm/trunk/test/CodeGen/X86/epilogue.ll llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll llvm/trunk/test/CodeGen/X86/fold-load.ll llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll llvm/trunk/test/CodeGen/X86/optimize-max-3.ll llvm/trunk/test/CodeGen/X86/peep-test-3.ll llvm/trunk/test/CodeGen/X86/pic.ll llvm/trunk/test/CodeGen/X86/red-zone.ll llvm/trunk/test/CodeGen/X86/red-zone2.ll llvm/trunk/test/CodeGen/X86/reghinting.ll llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll llvm/trunk/test/CodeGen/X86/segmented-stacks.ll llvm/trunk/test/CodeGen/X86/stack-align2.ll llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll 
llvm/trunk/test/CodeGen/X86/tailcallstack64.ll llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll llvm/trunk/test/CodeGen/X86/v-binop-widen.ll llvm/trunk/test/CodeGen/X86/vec_call.ll llvm/trunk/test/CodeGen/X86/widen_arith-1.ll llvm/trunk/test/CodeGen/X86/widen_arith-3.ll llvm/trunk/test/CodeGen/X86/widen_load-2.ll llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll llvm/trunk/test/CodeGen/X86/win64_vararg.ll llvm/trunk/test/CodeGen/X86/zext-fold.ll Modified: llvm/trunk/lib/Target/X86/X86.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86.td (original) +++ llvm/trunk/lib/Target/X86/X86.td Wed Feb 1 17:20:51 2012 @@ -120,8 +120,16 @@ // X86 processors supported. //===----------------------------------------------------------------------===// +include "X86Schedule.td" + +def ProcIntelAtom : SubtargetFeature<"atom", "X86ProcFamily", "IntelAtom", + "Intel Atom processors">; + class Proc Features> - : Processor; + : Processor; + +class AtomProc Features> + : Processor; def : Proc<"generic", []>; def : Proc<"i386", []>; @@ -146,8 +154,8 @@ FeatureSlowBTMem]>; def : Proc<"penryn", [FeatureSSE41, FeatureCMPXCHG16B, FeatureSlowBTMem]>; -def : Proc<"atom", [FeatureSSE3, FeatureCMPXCHG16B, FeatureMOVBE, - FeatureSlowBTMem]>; +def : AtomProc<"atom", [ProcIntelAtom, FeatureSSE3, FeatureCMPXCHG16B, + FeatureMOVBE, FeatureSlowBTMem]>; // "Arrandale" along with corei3 and corei5 def : Proc<"corei7", [FeatureSSE42, FeatureCMPXCHG16B, FeatureSlowBTMem, FeatureFastUAMem, Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ 
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Feb 1 17:20:51 2012 @@ -179,8 +179,11 @@ // For 64-bit since we have so many registers use the ILP scheduler, for // 32-bit code use the register pressure specific scheduling. + // For 32 bit Atom, use Hybrid (register pressure + latency) scheduling. if (Subtarget->is64Bit()) setSchedulingPreference(Sched::ILP); + else if (Subtarget->isAtom()) + setSchedulingPreference(Sched::Hybrid); else setSchedulingPreference(Sched::RegPressure); setStackPointerRegisterToSaveRestore(X86StackPtr); Modified: llvm/trunk/lib/Target/X86/X86InstrArithmetic.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrArithmetic.td?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrArithmetic.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrArithmetic.td Wed Feb 1 17:20:51 2012 @@ -18,22 +18,24 @@ let neverHasSideEffects = 1 in def LEA16r : I<0x8D, MRMSrcMem, (outs GR16:$dst), (ins i32mem:$src), - "lea{w}\t{$src|$dst}, {$dst|$src}", []>, OpSize; + "lea{w}\t{$src|$dst}, {$dst|$src}", [], IIC_LEA_16>, OpSize; let isReMaterializable = 1 in def LEA32r : I<0x8D, MRMSrcMem, (outs GR32:$dst), (ins i32mem:$src), "lea{l}\t{$src|$dst}, {$dst|$src}", - [(set GR32:$dst, lea32addr:$src)]>, Requires<[In32BitMode]>; + [(set GR32:$dst, lea32addr:$src)], IIC_LEA>, + Requires<[In32BitMode]>; def LEA64_32r : I<0x8D, MRMSrcMem, (outs GR32:$dst), (ins lea64_32mem:$src), "lea{l}\t{$src|$dst}, {$dst|$src}", - [(set GR32:$dst, lea32addr:$src)]>, Requires<[In64BitMode]>; + [(set GR32:$dst, lea32addr:$src)], IIC_LEA>, + Requires<[In64BitMode]>; let isReMaterializable = 1 in def LEA64r : RI<0x8D, MRMSrcMem, (outs GR64:$dst), (ins i64mem:$src), "lea{q}\t{$src|$dst}, {$dst|$src}", - [(set GR64:$dst, lea64addr:$src)]>; + [(set GR64:$dst, lea64addr:$src)], IIC_LEA>; @@ -56,16 +58,18 @@ let Defs = [AX,DX,EFLAGS], Uses = [AX], 
neverHasSideEffects = 1 in def MUL16r : I<0xF7, MRM4r, (outs), (ins GR16:$src), "mul{w}\t$src", - []>, OpSize; // AX,DX = AX*GR16 + [], IIC_MUL16_REG>, OpSize; // AX,DX = AX*GR16 let Defs = [EAX,EDX,EFLAGS], Uses = [EAX], neverHasSideEffects = 1 in def MUL32r : I<0xF7, MRM4r, (outs), (ins GR32:$src), "mul{l}\t$src", // EAX,EDX = EAX*GR32 - [/*(set EAX, EDX, EFLAGS, (X86umul_flag EAX, GR32:$src))*/]>; + [/*(set EAX, EDX, EFLAGS, (X86umul_flag EAX, GR32:$src))*/], + IIC_MUL32_REG>; let Defs = [RAX,RDX,EFLAGS], Uses = [RAX], neverHasSideEffects = 1 in def MUL64r : RI<0xF7, MRM4r, (outs), (ins GR64:$src), "mul{q}\t$src", // RAX,RDX = RAX*GR64 - [/*(set RAX, RDX, EFLAGS, (X86umul_flag RAX, GR64:$src))*/]>; + [/*(set RAX, RDX, EFLAGS, (X86umul_flag RAX, GR64:$src))*/], + IIC_MUL64>; let Defs = [AL,EFLAGS,AX], Uses = [AL] in def MUL8m : I<0xF6, MRM4m, (outs), (ins i8mem :$src), @@ -74,21 +78,21 @@ // This probably ought to be moved to a def : Pat<> if the // syntax can be accepted. [(set AL, (mul AL, (loadi8 addr:$src))), - (implicit EFLAGS)]>; // AL,AH = AL*[mem8] + (implicit EFLAGS)], IIC_MUL8>; // AL,AH = AL*[mem8] let mayLoad = 1, neverHasSideEffects = 1 in { let Defs = [AX,DX,EFLAGS], Uses = [AX] in def MUL16m : I<0xF7, MRM4m, (outs), (ins i16mem:$src), "mul{w}\t$src", - []>, OpSize; // AX,DX = AX*[mem16] + [], IIC_MUL16_MEM>, OpSize; // AX,DX = AX*[mem16] let Defs = [EAX,EDX,EFLAGS], Uses = [EAX] in def MUL32m : I<0xF7, MRM4m, (outs), (ins i32mem:$src), "mul{l}\t$src", - []>; // EAX,EDX = EAX*[mem32] + [], IIC_MUL32_MEM>; // EAX,EDX = EAX*[mem32] let Defs = [RAX,RDX,EFLAGS], Uses = [RAX] in def MUL64m : RI<0xF7, MRM4m, (outs), (ins i64mem:$src), - "mul{q}\t$src", []>; // RAX,RDX = RAX*[mem64] + "mul{q}\t$src", [], IIC_MUL64>; // RAX,RDX = RAX*[mem64] } let neverHasSideEffects = 1 in { @@ -130,16 +134,19 @@ def IMUL16rr : I<0xAF, MRMSrcReg, (outs GR16:$dst), (ins GR16:$src1,GR16:$src2), "imul{w}\t{$src2, $dst|$dst, $src2}", [(set GR16:$dst, EFLAGS, - (X86smul_flag 
GR16:$src1, GR16:$src2))]>, TB, OpSize; + (X86smul_flag GR16:$src1, GR16:$src2))], IIC_IMUL16_RR>, + TB, OpSize; def IMUL32rr : I<0xAF, MRMSrcReg, (outs GR32:$dst), (ins GR32:$src1,GR32:$src2), "imul{l}\t{$src2, $dst|$dst, $src2}", [(set GR32:$dst, EFLAGS, - (X86smul_flag GR32:$src1, GR32:$src2))]>, TB; + (X86smul_flag GR32:$src1, GR32:$src2))], IIC_IMUL32_RR>, + TB; def IMUL64rr : RI<0xAF, MRMSrcReg, (outs GR64:$dst), (ins GR64:$src1, GR64:$src2), "imul{q}\t{$src2, $dst|$dst, $src2}", [(set GR64:$dst, EFLAGS, - (X86smul_flag GR64:$src1, GR64:$src2))]>, TB; + (X86smul_flag GR64:$src1, GR64:$src2))], IIC_IMUL64_RR>, + TB; } // Register-Memory Signed Integer Multiply @@ -147,18 +154,23 @@ (ins GR16:$src1, i16mem:$src2), "imul{w}\t{$src2, $dst|$dst, $src2}", [(set GR16:$dst, EFLAGS, - (X86smul_flag GR16:$src1, (load addr:$src2)))]>, + (X86smul_flag GR16:$src1, (load addr:$src2)))], + IIC_IMUL16_RM>, TB, OpSize; def IMUL32rm : I<0xAF, MRMSrcMem, (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2), "imul{l}\t{$src2, $dst|$dst, $src2}", [(set GR32:$dst, EFLAGS, - (X86smul_flag GR32:$src1, (load addr:$src2)))]>, TB; + (X86smul_flag GR32:$src1, (load addr:$src2)))], + IIC_IMUL32_RM>, + TB; def IMUL64rm : RI<0xAF, MRMSrcMem, (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2), "imul{q}\t{$src2, $dst|$dst, $src2}", [(set GR64:$dst, EFLAGS, - (X86smul_flag GR64:$src1, (load addr:$src2)))]>, TB; + (X86smul_flag GR64:$src1, (load addr:$src2)))], + IIC_IMUL64_RM>, + TB; } // Constraints = "$src1 = $dst" } // Defs = [EFLAGS] @@ -170,33 +182,39 @@ (outs GR16:$dst), (ins GR16:$src1, i16imm:$src2), "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR16:$dst, EFLAGS, - (X86smul_flag GR16:$src1, imm:$src2))]>, OpSize; + (X86smul_flag GR16:$src1, imm:$src2))], + IIC_IMUL16_RRI>, OpSize; def IMUL16rri8 : Ii8<0x6B, MRMSrcReg, // GR16 = GR16*I8 (outs GR16:$dst), (ins GR16:$src1, i16i8imm:$src2), "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR16:$dst, EFLAGS, - (X86smul_flag 
GR16:$src1, i16immSExt8:$src2))]>, + (X86smul_flag GR16:$src1, i16immSExt8:$src2))], + IIC_IMUL16_RRI>, OpSize; def IMUL32rri : Ii32<0x69, MRMSrcReg, // GR32 = GR32*I32 (outs GR32:$dst), (ins GR32:$src1, i32imm:$src2), "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR32:$dst, EFLAGS, - (X86smul_flag GR32:$src1, imm:$src2))]>; + (X86smul_flag GR32:$src1, imm:$src2))], + IIC_IMUL32_RRI>; def IMUL32rri8 : Ii8<0x6B, MRMSrcReg, // GR32 = GR32*I8 (outs GR32:$dst), (ins GR32:$src1, i32i8imm:$src2), "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR32:$dst, EFLAGS, - (X86smul_flag GR32:$src1, i32immSExt8:$src2))]>; + (X86smul_flag GR32:$src1, i32immSExt8:$src2))], + IIC_IMUL32_RRI>; def IMUL64rri32 : RIi32<0x69, MRMSrcReg, // GR64 = GR64*I32 (outs GR64:$dst), (ins GR64:$src1, i64i32imm:$src2), "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR64:$dst, EFLAGS, - (X86smul_flag GR64:$src1, i64immSExt32:$src2))]>; + (X86smul_flag GR64:$src1, i64immSExt32:$src2))], + IIC_IMUL64_RRI>; def IMUL64rri8 : RIi8<0x6B, MRMSrcReg, // GR64 = GR64*I8 (outs GR64:$dst), (ins GR64:$src1, i64i8imm:$src2), "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR64:$dst, EFLAGS, - (X86smul_flag GR64:$src1, i64immSExt8:$src2))]>; + (X86smul_flag GR64:$src1, i64immSExt8:$src2))], + IIC_IMUL64_RRI>; // Memory-Integer Signed Integer Multiply @@ -204,37 +222,43 @@ (outs GR16:$dst), (ins i16mem:$src1, i16imm:$src2), "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR16:$dst, EFLAGS, - (X86smul_flag (load addr:$src1), imm:$src2))]>, + (X86smul_flag (load addr:$src1), imm:$src2))], + IIC_IMUL16_RMI>, OpSize; def IMUL16rmi8 : Ii8<0x6B, MRMSrcMem, // GR16 = [mem16]*I8 (outs GR16:$dst), (ins i16mem:$src1, i16i8imm :$src2), "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR16:$dst, EFLAGS, (X86smul_flag (load addr:$src1), - i16immSExt8:$src2))]>, OpSize; + i16immSExt8:$src2))], IIC_IMUL16_RMI>, + OpSize; def IMUL32rmi : Ii32<0x69, MRMSrcMem, // GR32 = 
[mem32]*I32 (outs GR32:$dst), (ins i32mem:$src1, i32imm:$src2), "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR32:$dst, EFLAGS, - (X86smul_flag (load addr:$src1), imm:$src2))]>; + (X86smul_flag (load addr:$src1), imm:$src2))], + IIC_IMUL32_RMI>; def IMUL32rmi8 : Ii8<0x6B, MRMSrcMem, // GR32 = [mem32]*I8 (outs GR32:$dst), (ins i32mem:$src1, i32i8imm: $src2), "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR32:$dst, EFLAGS, (X86smul_flag (load addr:$src1), - i32immSExt8:$src2))]>; + i32immSExt8:$src2))], + IIC_IMUL32_RMI>; def IMUL64rmi32 : RIi32<0x69, MRMSrcMem, // GR64 = [mem64]*I32 (outs GR64:$dst), (ins i64mem:$src1, i64i32imm:$src2), "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR64:$dst, EFLAGS, (X86smul_flag (load addr:$src1), - i64immSExt32:$src2))]>; + i64immSExt32:$src2))], + IIC_IMUL64_RMI>; def IMUL64rmi8 : RIi8<0x6B, MRMSrcMem, // GR64 = [mem64]*I8 (outs GR64:$dst), (ins i64mem:$src1, i64i8imm: $src2), "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set GR64:$dst, EFLAGS, (X86smul_flag (load addr:$src1), - i64immSExt8:$src2))]>; + i64immSExt8:$src2))], + IIC_IMUL64_RMI>; } // Defs = [EFLAGS] @@ -243,62 +267,62 @@ // unsigned division/remainder let Defs = [AL,EFLAGS,AX], Uses = [AX] in def DIV8r : I<0xF6, MRM6r, (outs), (ins GR8:$src), // AX/r8 = AL,AH - "div{b}\t$src", []>; + "div{b}\t$src", [], IIC_DIV8_REG>; let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in def DIV16r : I<0xF7, MRM6r, (outs), (ins GR16:$src), // DX:AX/r16 = AX,DX - "div{w}\t$src", []>, OpSize; + "div{w}\t$src", [], IIC_DIV16>, OpSize; let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in def DIV32r : I<0xF7, MRM6r, (outs), (ins GR32:$src), // EDX:EAX/r32 = EAX,EDX - "div{l}\t$src", []>; + "div{l}\t$src", [], IIC_DIV32>; // RDX:RAX/r64 = RAX,RDX let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in def DIV64r : RI<0xF7, MRM6r, (outs), (ins GR64:$src), - "div{q}\t$src", []>; + "div{q}\t$src", [], IIC_DIV64>; let mayLoad = 1 in { let Defs = [AL,EFLAGS,AX], Uses 
= [AX] in def DIV8m : I<0xF6, MRM6m, (outs), (ins i8mem:$src), // AX/[mem8] = AL,AH - "div{b}\t$src", []>; + "div{b}\t$src", [], IIC_DIV8_MEM>; let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in def DIV16m : I<0xF7, MRM6m, (outs), (ins i16mem:$src), // DX:AX/[mem16] = AX,DX - "div{w}\t$src", []>, OpSize; + "div{w}\t$src", [], IIC_DIV16>, OpSize; let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in // EDX:EAX/[mem32] = EAX,EDX def DIV32m : I<0xF7, MRM6m, (outs), (ins i32mem:$src), - "div{l}\t$src", []>; + "div{l}\t$src", [], IIC_DIV32>; // RDX:RAX/[mem64] = RAX,RDX let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in def DIV64m : RI<0xF7, MRM6m, (outs), (ins i64mem:$src), - "div{q}\t$src", []>; + "div{q}\t$src", [], IIC_DIV64>; } // Signed division/remainder. let Defs = [AL,EFLAGS,AX], Uses = [AX] in def IDIV8r : I<0xF6, MRM7r, (outs), (ins GR8:$src), // AX/r8 = AL,AH - "idiv{b}\t$src", []>; + "idiv{b}\t$src", [], IIC_IDIV8>; let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in def IDIV16r: I<0xF7, MRM7r, (outs), (ins GR16:$src), // DX:AX/r16 = AX,DX - "idiv{w}\t$src", []>, OpSize; + "idiv{w}\t$src", [], IIC_IDIV16>, OpSize; let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in def IDIV32r: I<0xF7, MRM7r, (outs), (ins GR32:$src), // EDX:EAX/r32 = EAX,EDX - "idiv{l}\t$src", []>; + "idiv{l}\t$src", [], IIC_IDIV32>; // RDX:RAX/r64 = RAX,RDX let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in def IDIV64r: RI<0xF7, MRM7r, (outs), (ins GR64:$src), - "idiv{q}\t$src", []>; + "idiv{q}\t$src", [], IIC_IDIV64>; let mayLoad = 1 in { let Defs = [AL,EFLAGS,AX], Uses = [AX] in def IDIV8m : I<0xF6, MRM7m, (outs), (ins i8mem:$src), // AX/[mem8] = AL,AH - "idiv{b}\t$src", []>; + "idiv{b}\t$src", [], IIC_IDIV8>; let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in def IDIV16m: I<0xF7, MRM7m, (outs), (ins i16mem:$src), // DX:AX/[mem16] = AX,DX - "idiv{w}\t$src", []>, OpSize; + "idiv{w}\t$src", [], IIC_IDIV16>, OpSize; let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in // EDX:EAX/[mem32] = EAX,EDX def IDIV32m: I<0xF7, MRM7m, 
(outs), (ins i32mem:$src), - "idiv{l}\t$src", []>; + "idiv{l}\t$src", [], IIC_IDIV32>; let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in // RDX:RAX/[mem64] = RAX,RDX def IDIV64m: RI<0xF7, MRM7m, (outs), (ins i64mem:$src), - "idiv{q}\t$src", []>; + "idiv{q}\t$src", [], IIC_IDIV64>; } //===----------------------------------------------------------------------===// @@ -312,35 +336,35 @@ def NEG8r : I<0xF6, MRM3r, (outs GR8 :$dst), (ins GR8 :$src1), "neg{b}\t$dst", [(set GR8:$dst, (ineg GR8:$src1)), - (implicit EFLAGS)]>; + (implicit EFLAGS)], IIC_UNARY_REG>; def NEG16r : I<0xF7, MRM3r, (outs GR16:$dst), (ins GR16:$src1), "neg{w}\t$dst", [(set GR16:$dst, (ineg GR16:$src1)), - (implicit EFLAGS)]>, OpSize; + (implicit EFLAGS)], IIC_UNARY_REG>, OpSize; def NEG32r : I<0xF7, MRM3r, (outs GR32:$dst), (ins GR32:$src1), "neg{l}\t$dst", [(set GR32:$dst, (ineg GR32:$src1)), - (implicit EFLAGS)]>; + (implicit EFLAGS)], IIC_UNARY_REG>; def NEG64r : RI<0xF7, MRM3r, (outs GR64:$dst), (ins GR64:$src1), "neg{q}\t$dst", [(set GR64:$dst, (ineg GR64:$src1)), - (implicit EFLAGS)]>; + (implicit EFLAGS)], IIC_UNARY_REG>; } // Constraints = "$src1 = $dst" def NEG8m : I<0xF6, MRM3m, (outs), (ins i8mem :$dst), "neg{b}\t$dst", [(store (ineg (loadi8 addr:$dst)), addr:$dst), - (implicit EFLAGS)]>; + (implicit EFLAGS)], IIC_UNARY_MEM>; def NEG16m : I<0xF7, MRM3m, (outs), (ins i16mem:$dst), "neg{w}\t$dst", [(store (ineg (loadi16 addr:$dst)), addr:$dst), - (implicit EFLAGS)]>, OpSize; + (implicit EFLAGS)], IIC_UNARY_MEM>, OpSize; def NEG32m : I<0xF7, MRM3m, (outs), (ins i32mem:$dst), "neg{l}\t$dst", [(store (ineg (loadi32 addr:$dst)), addr:$dst), - (implicit EFLAGS)]>; + (implicit EFLAGS)], IIC_UNARY_MEM>; def NEG64m : RI<0xF7, MRM3m, (outs), (ins i64mem:$dst), "neg{q}\t$dst", [(store (ineg (loadi64 addr:$dst)), addr:$dst), - (implicit EFLAGS)]>; + (implicit EFLAGS)], IIC_UNARY_MEM>; } // Defs = [EFLAGS] @@ -351,29 +375,30 @@ let AddedComplexity = 15 in { def NOT8r : I<0xF6, MRM2r, (outs GR8 :$dst), 
(ins GR8 :$src1), "not{b}\t$dst", - [(set GR8:$dst, (not GR8:$src1))]>; + [(set GR8:$dst, (not GR8:$src1))], IIC_UNARY_REG>; def NOT16r : I<0xF7, MRM2r, (outs GR16:$dst), (ins GR16:$src1), "not{w}\t$dst", - [(set GR16:$dst, (not GR16:$src1))]>, OpSize; + [(set GR16:$dst, (not GR16:$src1))], IIC_UNARY_REG>, OpSize; def NOT32r : I<0xF7, MRM2r, (outs GR32:$dst), (ins GR32:$src1), "not{l}\t$dst", - [(set GR32:$dst, (not GR32:$src1))]>; + [(set GR32:$dst, (not GR32:$src1))], IIC_UNARY_REG>; def NOT64r : RI<0xF7, MRM2r, (outs GR64:$dst), (ins GR64:$src1), "not{q}\t$dst", - [(set GR64:$dst, (not GR64:$src1))]>; + [(set GR64:$dst, (not GR64:$src1))], IIC_UNARY_REG>; } } // Constraints = "$src1 = $dst" def NOT8m : I<0xF6, MRM2m, (outs), (ins i8mem :$dst), "not{b}\t$dst", - [(store (not (loadi8 addr:$dst)), addr:$dst)]>; + [(store (not (loadi8 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>; def NOT16m : I<0xF7, MRM2m, (outs), (ins i16mem:$dst), "not{w}\t$dst", - [(store (not (loadi16 addr:$dst)), addr:$dst)]>, OpSize; + [(store (not (loadi16 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>, + OpSize; def NOT32m : I<0xF7, MRM2m, (outs), (ins i32mem:$dst), "not{l}\t$dst", - [(store (not (loadi32 addr:$dst)), addr:$dst)]>; + [(store (not (loadi32 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>; def NOT64m : RI<0xF7, MRM2m, (outs), (ins i64mem:$dst), "not{q}\t$dst", - [(store (not (loadi64 addr:$dst)), addr:$dst)]>; + [(store (not (loadi64 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>; } // CodeSize // TODO: inc/dec is slow for P4, but fast for Pentium-M. @@ -382,19 +407,22 @@ let CodeSize = 2 in def INC8r : I<0xFE, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1), "inc{b}\t$dst", - [(set GR8:$dst, EFLAGS, (X86inc_flag GR8:$src1))]>; + [(set GR8:$dst, EFLAGS, (X86inc_flag GR8:$src1))], + IIC_UNARY_REG>; let isConvertibleToThreeAddress = 1, CodeSize = 1 in { // Can xform into LEA. 
def INC16r : I<0x40, AddRegFrm, (outs GR16:$dst), (ins GR16:$src1), "inc{w}\t$dst", - [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))]>, + [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))], IIC_UNARY_REG>, OpSize, Requires<[In32BitMode]>; def INC32r : I<0x40, AddRegFrm, (outs GR32:$dst), (ins GR32:$src1), "inc{l}\t$dst", - [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))]>, + [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))], + IIC_UNARY_REG>, Requires<[In32BitMode]>; def INC64r : RI<0xFF, MRM0r, (outs GR64:$dst), (ins GR64:$src1), "inc{q}\t$dst", - [(set GR64:$dst, EFLAGS, (X86inc_flag GR64:$src1))]>; + [(set GR64:$dst, EFLAGS, (X86inc_flag GR64:$src1))], + IIC_UNARY_REG>; } // isConvertibleToThreeAddress = 1, CodeSize = 1 @@ -403,19 +431,23 @@ // Can transform into LEA. def INC64_16r : I<0xFF, MRM0r, (outs GR16:$dst), (ins GR16:$src1), "inc{w}\t$dst", - [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))]>, + [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))], + IIC_UNARY_REG>, OpSize, Requires<[In64BitMode]>; def INC64_32r : I<0xFF, MRM0r, (outs GR32:$dst), (ins GR32:$src1), "inc{l}\t$dst", - [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))]>, + [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))], + IIC_UNARY_REG>, Requires<[In64BitMode]>; def DEC64_16r : I<0xFF, MRM1r, (outs GR16:$dst), (ins GR16:$src1), "dec{w}\t$dst", - [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))]>, + [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))], + IIC_UNARY_REG>, OpSize, Requires<[In64BitMode]>; def DEC64_32r : I<0xFF, MRM1r, (outs GR32:$dst), (ins GR32:$src1), "dec{l}\t$dst", - [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))]>, + [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))], + IIC_UNARY_REG>, Requires<[In64BitMode]>; } // isConvertibleToThreeAddress = 1, CodeSize = 2 @@ -424,37 +456,37 @@ let CodeSize = 2 in { def INC8m : I<0xFE, MRM0m, (outs), (ins i8mem :$dst), "inc{b}\t$dst", [(store (add (loadi8 addr:$dst), 1), addr:$dst), - (implicit EFLAGS)]>; + 
(implicit EFLAGS)], IIC_UNARY_MEM>; def INC16m : I<0xFF, MRM0m, (outs), (ins i16mem:$dst), "inc{w}\t$dst", [(store (add (loadi16 addr:$dst), 1), addr:$dst), - (implicit EFLAGS)]>, + (implicit EFLAGS)], IIC_UNARY_MEM>, OpSize, Requires<[In32BitMode]>; def INC32m : I<0xFF, MRM0m, (outs), (ins i32mem:$dst), "inc{l}\t$dst", [(store (add (loadi32 addr:$dst), 1), addr:$dst), - (implicit EFLAGS)]>, + (implicit EFLAGS)], IIC_UNARY_MEM>, Requires<[In32BitMode]>; def INC64m : RI<0xFF, MRM0m, (outs), (ins i64mem:$dst), "inc{q}\t$dst", [(store (add (loadi64 addr:$dst), 1), addr:$dst), - (implicit EFLAGS)]>; + (implicit EFLAGS)], IIC_UNARY_MEM>; // These are duplicates of their 32-bit counterparts. Only needed so X86 knows // how to unfold them. // FIXME: What is this for?? def INC64_16m : I<0xFF, MRM0m, (outs), (ins i16mem:$dst), "inc{w}\t$dst", [(store (add (loadi16 addr:$dst), 1), addr:$dst), - (implicit EFLAGS)]>, + (implicit EFLAGS)], IIC_UNARY_MEM>, OpSize, Requires<[In64BitMode]>; def INC64_32m : I<0xFF, MRM0m, (outs), (ins i32mem:$dst), "inc{l}\t$dst", [(store (add (loadi32 addr:$dst), 1), addr:$dst), - (implicit EFLAGS)]>, + (implicit EFLAGS)], IIC_UNARY_MEM>, Requires<[In64BitMode]>; def DEC64_16m : I<0xFF, MRM1m, (outs), (ins i16mem:$dst), "dec{w}\t$dst", [(store (add (loadi16 addr:$dst), -1), addr:$dst), - (implicit EFLAGS)]>, + (implicit EFLAGS)], IIC_UNARY_MEM>, OpSize, Requires<[In64BitMode]>; def DEC64_32m : I<0xFF, MRM1m, (outs), (ins i32mem:$dst), "dec{l}\t$dst", [(store (add (loadi32 addr:$dst), -1), addr:$dst), - (implicit EFLAGS)]>, + (implicit EFLAGS)], IIC_UNARY_MEM>, Requires<[In64BitMode]>; } // CodeSize = 2 @@ -462,18 +494,22 @@ let CodeSize = 2 in def DEC8r : I<0xFE, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1), "dec{b}\t$dst", - [(set GR8:$dst, EFLAGS, (X86dec_flag GR8:$src1))]>; + [(set GR8:$dst, EFLAGS, (X86dec_flag GR8:$src1))], + IIC_UNARY_REG>; let isConvertibleToThreeAddress = 1, CodeSize = 1 in { // Can xform into LEA. 
def DEC16r : I<0x48, AddRegFrm, (outs GR16:$dst), (ins GR16:$src1), "dec{w}\t$dst", - [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))]>, + [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))], + IIC_UNARY_REG>, OpSize, Requires<[In32BitMode]>; def DEC32r : I<0x48, AddRegFrm, (outs GR32:$dst), (ins GR32:$src1), "dec{l}\t$dst", - [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))]>, + [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))], + IIC_UNARY_REG>, Requires<[In32BitMode]>; def DEC64r : RI<0xFF, MRM1r, (outs GR64:$dst), (ins GR64:$src1), "dec{q}\t$dst", - [(set GR64:$dst, EFLAGS, (X86dec_flag GR64:$src1))]>; + [(set GR64:$dst, EFLAGS, (X86dec_flag GR64:$src1))], + IIC_UNARY_REG>; } // CodeSize = 2 } // Constraints = "$src1 = $dst" @@ -481,18 +517,18 @@ let CodeSize = 2 in { def DEC8m : I<0xFE, MRM1m, (outs), (ins i8mem :$dst), "dec{b}\t$dst", [(store (add (loadi8 addr:$dst), -1), addr:$dst), - (implicit EFLAGS)]>; + (implicit EFLAGS)], IIC_UNARY_MEM>; def DEC16m : I<0xFF, MRM1m, (outs), (ins i16mem:$dst), "dec{w}\t$dst", [(store (add (loadi16 addr:$dst), -1), addr:$dst), - (implicit EFLAGS)]>, + (implicit EFLAGS)], IIC_UNARY_MEM>, OpSize, Requires<[In32BitMode]>; def DEC32m : I<0xFF, MRM1m, (outs), (ins i32mem:$dst), "dec{l}\t$dst", [(store (add (loadi32 addr:$dst), -1), addr:$dst), - (implicit EFLAGS)]>, + (implicit EFLAGS)], IIC_UNARY_MEM>, Requires<[In32BitMode]>; def DEC64m : RI<0xFF, MRM1m, (outs), (ins i64mem:$dst), "dec{q}\t$dst", [(store (add (loadi64 addr:$dst), -1), addr:$dst), - (implicit EFLAGS)]>; + (implicit EFLAGS)], IIC_UNARY_MEM>; } // CodeSize = 2 } // Defs = [EFLAGS] @@ -588,11 +624,13 @@ /// 4. Infers whether the low bit of the opcode should be 0 (for i8 operations) /// or 1 (for i16,i32,i64 operations). 
class ITy opcode, Format f, X86TypeInfo typeinfo, dag outs, dag ins, - string mnemonic, string args, list pattern> + string mnemonic, string args, list pattern, + InstrItinClass itin = IIC_BIN_NONMEM> : I<{opcode{7}, opcode{6}, opcode{5}, opcode{4}, opcode{3}, opcode{2}, opcode{1}, typeinfo.HasOddOpcode }, f, outs, ins, - !strconcat(mnemonic, "{", typeinfo.InstrSuffix, "}\t", args), pattern> { + !strconcat(mnemonic, "{", typeinfo.InstrSuffix, "}\t", args), pattern, + itin> { // Infer instruction prefixes from type info. let hasOpSizePrefix = typeinfo.HasOpSizePrefix; @@ -664,7 +702,7 @@ dag outlist, list pattern> : ITy; + mnemonic, "{$src2, $src1|$src1, $src2}", pattern, IIC_BIN_MEM>; // BinOpRM_R - Instructions like "add reg, reg, [mem]". class BinOpRM_R opcode, string mnemonic, X86TypeInfo typeinfo, @@ -776,7 +814,7 @@ list pattern> : ITy; + mnemonic, "{$src, $dst|$dst, $src}", pattern, IIC_BIN_MEM>; // BinOpMR_RMW - Instructions like "add [mem], reg". class BinOpMR_RMW opcode, string mnemonic, X86TypeInfo typeinfo, @@ -804,7 +842,7 @@ Format f, list pattern, bits<8> opcode = 0x80> : ITy { + mnemonic, "{$src, $dst|$dst, $src}", pattern, IIC_BIN_MEM> { let ImmT = typeinfo.ImmEncoding; } @@ -837,7 +875,7 @@ Format f, list pattern> : ITy<0x82, f, typeinfo, (outs), (ins typeinfo.MemOperand:$dst, typeinfo.Imm8Operand:$src), - mnemonic, "{$src, $dst|$dst, $src}", pattern> { + mnemonic, "{$src, $dst|$dst, $src}", pattern, IIC_BIN_MEM> { let ImmT = Imm8; // Always 8-bit immediate. } @@ -1150,7 +1188,7 @@ // register class is constrained to GR8_NOREX. 
let isPseudo = 1 in def TEST8ri_NOREX : I<0, Pseudo, (outs), (ins GR8_NOREX:$src, i8imm:$mask), - "", []>; + "", [], IIC_BIN_NONMEM>; } //===----------------------------------------------------------------------===// @@ -1160,11 +1198,12 @@ PatFrag ld_frag> { def rr : I<0xF2, MRMSrcReg, (outs RC:$dst), (ins RC:$src1, RC:$src2), !strconcat(mnemonic, "\t{$src2, $src1, $dst|$dst, $src1, $src2}"), - [(set RC:$dst, EFLAGS, (X86andn_flag RC:$src1, RC:$src2))]>; + [(set RC:$dst, EFLAGS, (X86andn_flag RC:$src1, RC:$src2))], + IIC_BIN_NONMEM>; def rm : I<0xF2, MRMSrcMem, (outs RC:$dst), (ins RC:$src1, x86memop:$src2), !strconcat(mnemonic, "\t{$src2, $src1, $dst|$dst, $src1, $src2}"), [(set RC:$dst, EFLAGS, - (X86andn_flag RC:$src1, (ld_frag addr:$src2)))]>; + (X86andn_flag RC:$src1, (ld_frag addr:$src2)))], IIC_BIN_MEM>; } let Predicates = [HasBMI], Defs = [EFLAGS] in { Modified: llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td Wed Feb 1 17:20:51 2012 @@ -21,17 +21,20 @@ : I,TB,OpSize; + (X86cmov GR16:$src1, GR16:$src2, CondNode, EFLAGS))], + IIC_CMOV16_RR>,TB,OpSize; def #NAME#32rr : I, TB; + (X86cmov GR32:$src1, GR32:$src2, CondNode, EFLAGS))], + IIC_CMOV32_RR>, TB; def #NAME#64rr :RI, TB; + (X86cmov GR64:$src1, GR64:$src2, CondNode, EFLAGS))], + IIC_CMOV32_RR>, TB; } let Uses = [EFLAGS], Predicates = [HasCMov], Constraints = "$src1 = $dst" in { @@ -39,17 +42,18 @@ : I, TB, OpSize; + CondNode, EFLAGS))], IIC_CMOV16_RM>, + TB, OpSize; def #NAME#32rm : I, TB; + CondNode, EFLAGS))], IIC_CMOV32_RM>, TB; def #NAME#64rm :RI, TB; + CondNode, EFLAGS))], IIC_CMOV32_RM>, TB; } // Uses = [EFLAGS], Predicates = [HasCMov], Constraints = "$src1 = $dst" } // end multiclass 
@@ -78,10 +82,12 @@ let Uses = [EFLAGS] in { def r : I, TB; + [(set GR8:$dst, (X86setcc OpNode, EFLAGS))], + IIC_SET_R>, TB; def m : I, TB; + [(store (X86setcc OpNode, EFLAGS), addr:$dst)], + IIC_SET_M>, TB; } // Uses = [EFLAGS] } Modified: llvm/trunk/lib/Target/X86/X86InstrControl.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrControl.td?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrControl.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrControl.td Wed Feb 1 17:20:51 2012 @@ -20,41 +20,42 @@ hasCtrlDep = 1, FPForm = SpecialFP in { def RET : I <0xC3, RawFrm, (outs), (ins variable_ops), "ret", - [(X86retflag 0)]>; + [(X86retflag 0)], IIC_RET>; def RETI : Ii16<0xC2, RawFrm, (outs), (ins i16imm:$amt, variable_ops), "ret\t$amt", - [(X86retflag timm:$amt)]>; + [(X86retflag timm:$amt)], IIC_RET_IMM>; def RETIW : Ii16<0xC2, RawFrm, (outs), (ins i16imm:$amt, variable_ops), "retw\t$amt", - []>, OpSize; + [], IIC_RET_IMM>, OpSize; def LRETL : I <0xCB, RawFrm, (outs), (ins), - "lretl", []>; + "lretl", [], IIC_RET>; def LRETQ : RI <0xCB, RawFrm, (outs), (ins), - "lretq", []>; + "lretq", [], IIC_RET>; def LRETI : Ii16<0xCA, RawFrm, (outs), (ins i16imm:$amt), - "lret\t$amt", []>; + "lret\t$amt", [], IIC_RET>; def LRETIW : Ii16<0xCA, RawFrm, (outs), (ins i16imm:$amt), - "lretw\t$amt", []>, OpSize; + "lretw\t$amt", [], IIC_RET>, OpSize; } // Unconditional branches. let isBarrier = 1, isBranch = 1, isTerminator = 1 in { def JMP_4 : Ii32PCRel<0xE9, RawFrm, (outs), (ins brtarget:$dst), - "jmp\t$dst", [(br bb:$dst)]>; + "jmp\t$dst", [(br bb:$dst)], IIC_JMP_REL>; def JMP_1 : Ii8PCRel<0xEB, RawFrm, (outs), (ins brtarget8:$dst), - "jmp\t$dst", []>; + "jmp\t$dst", [], IIC_JMP_REL>; // FIXME : Intel syntax for JMP64pcrel32 such that it is not ambiguious // with JMP_1. 
def JMP64pcrel32 : I<0xE9, RawFrm, (outs), (ins brtarget:$dst), - "jmpq\t$dst", []>; + "jmpq\t$dst", [], IIC_JMP_REL>; } // Conditional Branches. let isBranch = 1, isTerminator = 1, Uses = [EFLAGS] in { multiclass ICBr opc1, bits<8> opc4, string asm, PatFrag Cond> { - def _1 : Ii8PCRel ; + def _1 : Ii8PCRel ; def _4 : Ii32PCRel, TB; + [(X86brcond bb:$dst, Cond, EFLAGS)], IIC_Jcc>, TB; } } @@ -82,55 +83,55 @@ // jecxz. let Uses = [CX] in def JCXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), - "jcxz\t$dst", []>, AdSize, Requires<[In32BitMode]>; + "jcxz\t$dst", [], IIC_JCXZ>, AdSize, Requires<[In32BitMode]>; let Uses = [ECX] in def JECXZ_32 : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), - "jecxz\t$dst", []>, Requires<[In32BitMode]>; + "jecxz\t$dst", [], IIC_JCXZ>, Requires<[In32BitMode]>; // J*CXZ instruction: 64-bit versions of this instruction for the asmparser. // In 64-bit mode, the address size prefix is jecxz and the unprefixed version // is jrcxz. let Uses = [ECX] in def JECXZ_64 : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), - "jecxz\t$dst", []>, AdSize, Requires<[In64BitMode]>; + "jecxz\t$dst", [], IIC_JCXZ>, AdSize, Requires<[In64BitMode]>; let Uses = [RCX] in def JRCXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), - "jrcxz\t$dst", []>, Requires<[In64BitMode]>; + "jrcxz\t$dst", [], IIC_JCXZ>, Requires<[In64BitMode]>; } // Indirect branches let isBranch = 1, isTerminator = 1, isBarrier = 1, isIndirectBranch = 1 in { def JMP32r : I<0xFF, MRM4r, (outs), (ins GR32:$dst), "jmp{l}\t{*}$dst", - [(brind GR32:$dst)]>, Requires<[In32BitMode]>; + [(brind GR32:$dst)], IIC_JMP_REG>, Requires<[In32BitMode]>; def JMP32m : I<0xFF, MRM4m, (outs), (ins i32mem:$dst), "jmp{l}\t{*}$dst", - [(brind (loadi32 addr:$dst))]>, Requires<[In32BitMode]>; + [(brind (loadi32 addr:$dst))], IIC_JMP_MEM>, Requires<[In32BitMode]>; def JMP64r : I<0xFF, MRM4r, (outs), (ins GR64:$dst), "jmp{q}\t{*}$dst", - [(brind GR64:$dst)]>, Requires<[In64BitMode]>; + [(brind 
GR64:$dst)], IIC_JMP_REG>, Requires<[In64BitMode]>; def JMP64m : I<0xFF, MRM4m, (outs), (ins i64mem:$dst), "jmp{q}\t{*}$dst", - [(brind (loadi64 addr:$dst))]>, Requires<[In64BitMode]>; + [(brind (loadi64 addr:$dst))], IIC_JMP_MEM>, Requires<[In64BitMode]>; def FARJMP16i : Iseg16<0xEA, RawFrmImm16, (outs), (ins i16imm:$off, i16imm:$seg), - "ljmp{w}\t{$seg, $off|$off, $seg}", []>, OpSize; + "ljmp{w}\t{$seg, $off|$off, $seg}", [], IIC_JMP_FAR_PTR>, OpSize; def FARJMP32i : Iseg32<0xEA, RawFrmImm16, (outs), (ins i32imm:$off, i16imm:$seg), - "ljmp{l}\t{$seg, $off|$off, $seg}", []>; + "ljmp{l}\t{$seg, $off|$off, $seg}", [], IIC_JMP_FAR_PTR>; def FARJMP64 : RI<0xFF, MRM5m, (outs), (ins opaque80mem:$dst), - "ljmp{q}\t{*}$dst", []>; + "ljmp{q}\t{*}$dst", [], IIC_JMP_FAR_MEM>; def FARJMP16m : I<0xFF, MRM5m, (outs), (ins opaque32mem:$dst), - "ljmp{w}\t{*}$dst", []>, OpSize; + "ljmp{w}\t{*}$dst", [], IIC_JMP_FAR_MEM>, OpSize; def FARJMP32m : I<0xFF, MRM5m, (outs), (ins opaque48mem:$dst), - "ljmp{l}\t{*}$dst", []>; + "ljmp{l}\t{*}$dst", [], IIC_JMP_FAR_MEM>; } // Loop instructions -def LOOP : Ii8PCRel<0xE2, RawFrm, (outs), (ins brtarget8:$dst), "loop\t$dst", []>; -def LOOPE : Ii8PCRel<0xE1, RawFrm, (outs), (ins brtarget8:$dst), "loope\t$dst", []>; -def LOOPNE : Ii8PCRel<0xE0, RawFrm, (outs), (ins brtarget8:$dst), "loopne\t$dst", []>; +def LOOP : Ii8PCRel<0xE2, RawFrm, (outs), (ins brtarget8:$dst), "loop\t$dst", [], IIC_LOOP>; +def LOOPE : Ii8PCRel<0xE1, RawFrm, (outs), (ins brtarget8:$dst), "loope\t$dst", [], IIC_LOOPE>; +def LOOPNE : Ii8PCRel<0xE0, RawFrm, (outs), (ins brtarget8:$dst), "loopne\t$dst", [], IIC_LOOPNE>; //===----------------------------------------------------------------------===// // Call Instructions... 
@@ -147,25 +148,27 @@ Uses = [ESP] in { def CALLpcrel32 : Ii32PCRel<0xE8, RawFrm, (outs), (ins i32imm_pcrel:$dst,variable_ops), - "call{l}\t$dst", []>, Requires<[In32BitMode]>; + "call{l}\t$dst", [], IIC_CALL_RI>, Requires<[In32BitMode]>; def CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops), - "call{l}\t{*}$dst", [(X86call GR32:$dst)]>, + "call{l}\t{*}$dst", [(X86call GR32:$dst)], IIC_CALL_RI>, Requires<[In32BitMode]>; def CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops), - "call{l}\t{*}$dst", [(X86call (loadi32 addr:$dst))]>, + "call{l}\t{*}$dst", [(X86call (loadi32 addr:$dst))], IIC_CALL_MEM>, Requires<[In32BitMode]>; def FARCALL16i : Iseg16<0x9A, RawFrmImm16, (outs), (ins i16imm:$off, i16imm:$seg), - "lcall{w}\t{$seg, $off|$off, $seg}", []>, OpSize; + "lcall{w}\t{$seg, $off|$off, $seg}", [], + IIC_CALL_FAR_PTR>, OpSize; def FARCALL32i : Iseg32<0x9A, RawFrmImm16, (outs), (ins i32imm:$off, i16imm:$seg), - "lcall{l}\t{$seg, $off|$off, $seg}", []>; + "lcall{l}\t{$seg, $off|$off, $seg}", [], + IIC_CALL_FAR_PTR>; def FARCALL16m : I<0xFF, MRM3m, (outs), (ins opaque32mem:$dst), - "lcall{w}\t{*}$dst", []>, OpSize; + "lcall{w}\t{*}$dst", [], IIC_CALL_FAR_MEM>, OpSize; def FARCALL32m : I<0xFF, MRM3m, (outs), (ins opaque48mem:$dst), - "lcall{l}\t{*}$dst", []>; + "lcall{l}\t{*}$dst", [], IIC_CALL_FAR_MEM>; // callw for 16 bit code for the assembler. let isAsmParserOnly = 1 in @@ -196,13 +199,13 @@ // mcinst. def TAILJMPd : Ii32PCRel<0xE9, RawFrm, (outs), (ins i32imm_pcrel:$dst, variable_ops), - "jmp\t$dst # TAILCALL", - []>; + "jmp\t$dst # TAILCALL", + [], IIC_JMP_REL>; def TAILJMPr : I<0xFF, MRM4r, (outs), (ins GR32_TC:$dst, variable_ops), - "", []>; // FIXME: Remove encoding when JIT is dead. + "", [], IIC_JMP_REG>; // FIXME: Remove encoding when JIT is dead. 
let mayLoad = 1 in def TAILJMPm : I<0xFF, MRM4m, (outs), (ins i32mem_TC:$dst, variable_ops), - "jmp{l}\t{*}$dst # TAILCALL", []>; + "jmp{l}\t{*}$dst # TAILCALL", [], IIC_JMP_MEM>; } @@ -226,17 +229,19 @@ // the 32-bit pcrel field that we have. def CALL64pcrel32 : Ii32PCRel<0xE8, RawFrm, (outs), (ins i64i32imm_pcrel:$dst, variable_ops), - "call{q}\t$dst", []>, + "call{q}\t$dst", [], IIC_CALL_RI>, Requires<[In64BitMode, NotWin64]>; def CALL64r : I<0xFF, MRM2r, (outs), (ins GR64:$dst, variable_ops), - "call{q}\t{*}$dst", [(X86call GR64:$dst)]>, + "call{q}\t{*}$dst", [(X86call GR64:$dst)], + IIC_CALL_RI>, Requires<[In64BitMode, NotWin64]>; def CALL64m : I<0xFF, MRM2m, (outs), (ins i64mem:$dst, variable_ops), - "call{q}\t{*}$dst", [(X86call (loadi64 addr:$dst))]>, + "call{q}\t{*}$dst", [(X86call (loadi64 addr:$dst))], + IIC_CALL_MEM>, Requires<[In64BitMode, NotWin64]>; def FARCALL64 : RI<0xFF, MRM3m, (outs), (ins opaque80mem:$dst), - "lcall{q}\t{*}$dst", []>; + "lcall{q}\t{*}$dst", [], IIC_CALL_FAR_MEM>; } // FIXME: We need to teach codegen about single list of call-clobbered @@ -253,15 +258,16 @@ Uses = [RSP] in { def WINCALL64pcrel32 : Ii32PCRel<0xE8, RawFrm, (outs), (ins i64i32imm_pcrel:$dst, variable_ops), - "call{q}\t$dst", []>, + "call{q}\t$dst", [], IIC_CALL_RI>, Requires<[IsWin64]>; def WINCALL64r : I<0xFF, MRM2r, (outs), (ins GR64:$dst, variable_ops), "call{q}\t{*}$dst", - [(X86call GR64:$dst)]>, Requires<[IsWin64]>; + [(X86call GR64:$dst)], IIC_CALL_RI>, + Requires<[IsWin64]>; def WINCALL64m : I<0xFF, MRM2m, (outs), (ins i64mem:$dst,variable_ops), "call{q}\t{*}$dst", - [(X86call (loadi64 addr:$dst))]>, + [(X86call (loadi64 addr:$dst))], IIC_CALL_MEM>, Requires<[IsWin64]>; } @@ -272,7 +278,7 @@ Uses = [RSP] in { def W64ALLOCA : Ii32PCRel<0xE8, RawFrm, (outs), (ins i64i32imm_pcrel:$dst, variable_ops), - "call{q}\t$dst", []>, + "call{q}\t$dst", [], IIC_CALL_RI>, Requires<[IsWin64]>; } @@ -296,11 +302,11 @@ def TAILJMPd64 : Ii32PCRel<0xE9, RawFrm, (outs), (ins 
i64i32imm_pcrel:$dst, variable_ops), - "jmp\t$dst # TAILCALL", []>; + "jmp\t$dst # TAILCALL", [], IIC_JMP_REL>; def TAILJMPr64 : I<0xFF, MRM4r, (outs), (ins ptr_rc_tailcall:$dst, variable_ops), - "jmp{q}\t{*}$dst # TAILCALL", []>; + "jmp{q}\t{*}$dst # TAILCALL", [], IIC_JMP_MEM>; let mayLoad = 1 in def TAILJMPm64 : I<0xFF, MRM4m, (outs), (ins i64mem_TC:$dst, variable_ops), - "jmp{q}\t{*}$dst # TAILCALL", []>; + "jmp{q}\t{*}$dst # TAILCALL", [], IIC_JMP_MEM>; } Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFormats.td?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrFormats.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrFormats.td Wed Feb 1 17:20:51 2012 @@ -123,7 +123,9 @@ class MemOp4 { bit hasMemOp4Prefix = 1; } class XOP { bit hasXOP_Prefix = 1; } class X86Inst opcod, Format f, ImmType i, dag outs, dag ins, - string AsmStr, Domain d = GenericDomain> + string AsmStr, + InstrItinClass itin, + Domain d = GenericDomain> : Instruction { let Namespace = "X86"; @@ -139,6 +141,8 @@ // If this is a pseudo instruction, mark it isCodeGenOnly. let isCodeGenOnly = !eq(!cast(f), "Pseudo"); + let Itinerary = itin; + // // Attributes specific to X86 instructions... 
// @@ -189,51 +193,53 @@ } class PseudoI pattern> - : X86Inst<0, Pseudo, NoImm, oops, iops, ""> { + : X86Inst<0, Pseudo, NoImm, oops, iops, "", NoItinerary> { let Pattern = pattern; } class I o, Format f, dag outs, dag ins, string asm, - list pattern, Domain d = GenericDomain> - : X86Inst { + list pattern, InstrItinClass itin = IIC_DEFAULT, + Domain d = GenericDomain> + : X86Inst { let Pattern = pattern; let CodeSize = 3; } class Ii8 o, Format f, dag outs, dag ins, string asm, - list pattern, Domain d = GenericDomain> - : X86Inst { + list pattern, InstrItinClass itin = IIC_DEFAULT, + Domain d = GenericDomain> + : X86Inst { let Pattern = pattern; let CodeSize = 3; } class Ii8PCRel o, Format f, dag outs, dag ins, string asm, - list pattern> - : X86Inst { + list pattern, InstrItinClass itin = IIC_DEFAULT> + : X86Inst { let Pattern = pattern; let CodeSize = 3; } class Ii16 o, Format f, dag outs, dag ins, string asm, - list pattern> - : X86Inst { + list pattern, InstrItinClass itin = IIC_DEFAULT> + : X86Inst { let Pattern = pattern; let CodeSize = 3; } class Ii32 o, Format f, dag outs, dag ins, string asm, - list pattern> - : X86Inst { + list pattern, InstrItinClass itin = IIC_DEFAULT> + : X86Inst { let Pattern = pattern; let CodeSize = 3; } class Ii16PCRel o, Format f, dag outs, dag ins, string asm, - list pattern> - : X86Inst { + list pattern, InstrItinClass itin = IIC_DEFAULT> + : X86Inst { let Pattern = pattern; let CodeSize = 3; } class Ii32PCRel o, Format f, dag outs, dag ins, string asm, - list pattern> - : X86Inst { + list pattern, InstrItinClass itin = IIC_DEFAULT> + : X86Inst { let Pattern = pattern; let CodeSize = 3; } @@ -244,8 +250,9 @@ : I {} // FpI_ - Floating Point Pseudo Instruction template. Not Predicated. 
-class FpI_ pattern> - : X86Inst<0, Pseudo, NoImm, outs, ins, ""> { +class FpI_ pattern, + InstrItinClass itin = IIC_DEFAULT> + : X86Inst<0, Pseudo, NoImm, outs, ins, "", itin> { let FPForm = fp; let Pattern = pattern; } @@ -257,20 +264,23 @@ // Iseg32 - 16-bit segment selector, 32-bit offset class Iseg16 o, Format f, dag outs, dag ins, string asm, - list pattern> : X86Inst { + list pattern, InstrItinClass itin = IIC_DEFAULT> + : X86Inst { let Pattern = pattern; let CodeSize = 3; } class Iseg32 o, Format f, dag outs, dag ins, string asm, - list pattern> : X86Inst { + list pattern, InstrItinClass itin = IIC_DEFAULT> + : X86Inst { let Pattern = pattern; let CodeSize = 3; } // SI - SSE 1 & 2 scalar instructions -class SI o, Format F, dag outs, dag ins, string asm, list pattern> - : I { +class SI o, Format F, dag outs, dag ins, string asm, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I { let Predicates = !if(hasVEXPrefix /* VEX */, [HasAVX], !if(!eq(Prefix, 12 /* XS */), [HasSSE1], [HasSSE2])); @@ -280,8 +290,8 @@ // SIi8 - SSE 1 & 2 scalar instructions class SIi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8 { + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8 { let Predicates = !if(hasVEXPrefix /* VEX */, [HasAVX], !if(!eq(Prefix, 12 /* XS */), [HasSSE1], [HasSSE2])); @@ -291,8 +301,8 @@ // PI - SSE 1 & 2 packed instructions class PI o, Format F, dag outs, dag ins, string asm, list pattern, - Domain d> - : I { + InstrItinClass itin, Domain d> + : I { let Predicates = !if(hasVEXPrefix /* VEX */, [HasAVX], !if(hasOpSizePrefix /* OpSize */, [HasSSE2], [HasSSE1])); @@ -302,8 +312,8 @@ // PIi8 - SSE 1 & 2 packed instructions with immediate class PIi8 o, Format F, dag outs, dag ins, string asm, - list pattern, Domain d> - : Ii8 { + list pattern, InstrItinClass itin, Domain d> + : Ii8 { let Predicates = !if(hasVEX_4VPrefix /* VEX */, [HasAVX], !if(hasOpSizePrefix /* OpSize */, [HasSSE2], [HasSSE1])); @@ -319,25 +329,27 @@ // VSSI - 
SSE1 instructions with XS prefix in AVX form. // VPSI - SSE1 instructions with TB prefix in AVX form. -class SSI o, Format F, dag outs, dag ins, string asm, list pattern> - : I, XS, Requires<[HasSSE1]>; +class SSI o, Format F, dag outs, dag ins, string asm, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, XS, Requires<[HasSSE1]>; class SSIi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, XS, Requires<[HasSSE1]>; -class PSI o, Format F, dag outs, dag ins, string asm, list pattern> - : I, TB, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, XS, Requires<[HasSSE1]>; +class PSI o, Format F, dag outs, dag ins, string asm, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, TB, Requires<[HasSSE1]>; class PSIi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, TB, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TB, Requires<[HasSSE1]>; class VSSI o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, XS, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, XS, Requires<[HasAVX]>; class VPSI o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, TB, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, TB, Requires<[HasAVX]>; // SSE2 Instruction Templates: @@ -350,28 +362,30 @@ // VSDI - SSE2 instructions with XD prefix in AVX form. // VPDI - SSE2 instructions with TB and OpSize prefixes in AVX form. 
-class SDI o, Format F, dag outs, dag ins, string asm, list pattern> - : I, XD, Requires<[HasSSE2]>; +class SDI o, Format F, dag outs, dag ins, string asm, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, XD, Requires<[HasSSE2]>; class SDIi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, XD, Requires<[HasSSE2]>; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, XD, Requires<[HasSSE2]>; class SSDIi8 o, Format F, dag outs, dag ins, string asm, list pattern> : Ii8, XS, Requires<[HasSSE2]>; -class PDI o, Format F, dag outs, dag ins, string asm, list pattern> - : I, TB, OpSize, +class PDI o, Format F, dag outs, dag ins, string asm, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, TB, OpSize, Requires<[HasSSE2]>; class PDIi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, TB, OpSize, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TB, OpSize, Requires<[HasSSE2]>; class VSDI o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, XD, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, XD, Requires<[HasAVX]>; class VPDI o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, TB, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, TB, OpSize, Requires<[HasAVX]>; // SSE3 Instruction Templates: @@ -381,15 +395,16 @@ // S3DI - SSE3 instructions with XD prefix. class S3SI o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, XS, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, XS, Requires<[HasSSE3]>; class S3DI o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, XD, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, XD, Requires<[HasSSE3]>; -class S3I o, Format F, dag outs, dag ins, string asm, list pattern> - : I, TB, OpSize, +class S3I o, Format F, dag outs, dag ins, string asm, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, TB, OpSize, Requires<[HasSSE3]>; @@ -403,12 +418,12 @@ // classes. 
They need to be enabled even if AVX is enabled. class SS38I o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, T8, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, T8, Requires<[HasSSSE3]>; class SS3AI o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, TA, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TA, Requires<[HasSSSE3]>; // SSE4.1 Instruction Templates: @@ -417,31 +432,31 @@ // SS41AIi8 - SSE 4.1 instructions with TA prefix and ImmT == Imm8. // class SS48I o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, T8, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, T8, Requires<[HasSSE41]>; class SS4AIi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, TA, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TA, Requires<[HasSSE41]>; // SSE4.2 Instruction Templates: // // SS428I - SSE 4.2 instructions with T8 prefix. class SS428I o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, T8, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, T8, Requires<[HasSSE42]>; // SS42FI - SSE 4.2 instructions with T8XD prefix. class SS42FI o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, T8XD, Requires<[HasSSE42]>; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, T8XD, Requires<[HasSSE42]>; // SS42AI = SSE 4.2 instructions with TA prefix class SS42AI o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, TA, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TA, Requires<[HasSSE42]>; // AVX Instruction Templates: @@ -450,12 +465,12 @@ // AVX8I - AVX instructions with T8 and OpSize prefix. // AVXAIi8 - AVX instructions with TA, OpSize prefix and ImmT = Imm8. 
class AVX8I o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, T8, OpSize, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, T8, OpSize, Requires<[HasAVX]>; class AVXAIi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, TA, OpSize, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TA, OpSize, Requires<[HasAVX]>; // AVX2 Instruction Templates: @@ -464,12 +479,12 @@ // AVX28I - AVX2 instructions with T8 and OpSize prefix. // AVX2AIi8 - AVX2 instructions with TA, OpSize prefix and ImmT = Imm8. class AVX28I o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, T8, OpSize, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, T8, OpSize, Requires<[HasAVX2]>; class AVX2AIi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, TA, OpSize, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TA, OpSize, Requires<[HasAVX2]>; // AES Instruction Templates: @@ -477,87 +492,88 @@ // AES8I // These use the same encoding as the SSE4.2 T8 and TA encodings. 
class AES8I o, Format F, dag outs, dag ins, string asm, - listpattern> - : I, T8, + listpattern, InstrItinClass itin = IIC_DEFAULT> + : I, T8, Requires<[HasSSE2, HasAES]>; class AESAI o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, TA, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TA, Requires<[HasSSE2, HasAES]>; // CLMUL Instruction Templates class CLMULIi8 o, Format F, dag outs, dag ins, string asm, - listpattern> - : Ii8, TA, + listpattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TA, OpSize, Requires<[HasSSE2, HasCLMUL]>; class AVXCLMULIi8 o, Format F, dag outs, dag ins, string asm, - listpattern> - : Ii8, TA, + listpattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TA, OpSize, VEX_4V, Requires<[HasAVX, HasCLMUL]>; // FMA3 Instruction Templates class FMA3 o, Format F, dag outs, dag ins, string asm, - listpattern> - : I, T8, + listpattern, InstrItinClass itin = IIC_DEFAULT> + : I, T8, OpSize, VEX_4V, Requires<[HasFMA3]>; // FMA4 Instruction Templates class FMA4 o, Format F, dag outs, dag ins, string asm, - listpattern> - : Ii8, TA, + listpattern, InstrItinClass itin = IIC_DEFAULT> + : I, TA, OpSize, VEX_4V, VEX_I8IMM, Requires<[HasFMA4]>; // XOP 2, 3 and 4 Operand Instruction Template class IXOP o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, XOP, XOP9, Requires<[HasXOP]>; // XOP 2, 3 and 4 Operand Instruction Templates with imm byte class IXOPi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, XOP, XOP8, Requires<[HasXOP]>; // XOP 5 operand instruction (VEX encoding!) class IXOP5 o, Format F, dag outs, dag ins, string asm, - listpattern> - : Ii8, TA, + listpattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TA, OpSize, VEX_4V, VEX_I8IMM, Requires<[HasXOP]>; // X86-64 Instruction templates... 
// -class RI o, Format F, dag outs, dag ins, string asm, list pattern> - : I, REX_W; +class RI o, Format F, dag outs, dag ins, string asm, + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, REX_W; class RIi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, REX_W; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, REX_W; class RIi32 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii32, REX_W; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii32, REX_W; class RIi64 o, Format f, dag outs, dag ins, string asm, - list pattern> - : X86Inst, REX_W { + list pattern, InstrItinClass itin = IIC_DEFAULT> + : X86Inst, REX_W { let Pattern = pattern; let CodeSize = 3; } class RSSI o, Format F, dag outs, dag ins, string asm, - list pattern> - : SSI, REX_W; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : SSI, REX_W; class RSDI o, Format F, dag outs, dag ins, string asm, - list pattern> - : SDI, REX_W; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : SDI, REX_W; class RPDI o, Format F, dag outs, dag ins, string asm, - list pattern> - : PDI, REX_W; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : PDI, REX_W; class VRPDI o, Format F, dag outs, dag ins, string asm, - list pattern> - : VPDI, VEX_W; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : VPDI, VEX_W; // MMX Instruction templates // @@ -570,23 +586,23 @@ // MMXID - MMX instructions with XD prefix. // MMXIS - MMX instructions with XS prefix. 
class MMXI o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, TB, Requires<[HasMMX]>; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, TB, Requires<[HasMMX]>; class MMXI64 o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, TB, Requires<[HasMMX,In64BitMode]>; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, TB, Requires<[HasMMX,In64BitMode]>; class MMXRI o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, TB, REX_W, Requires<[HasMMX]>; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, TB, REX_W, Requires<[HasMMX]>; class MMX2I o, Format F, dag outs, dag ins, string asm, - list pattern> - : I, TB, OpSize, Requires<[HasMMX]>; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : I, TB, OpSize, Requires<[HasMMX]>; class MMXIi8 o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, TB, Requires<[HasMMX]>; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, TB, Requires<[HasMMX]>; class MMXID o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, XD, Requires<[HasMMX]>; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, XD, Requires<[HasMMX]>; class MMXIS o, Format F, dag outs, dag ins, string asm, - list pattern> - : Ii8, XS, Requires<[HasMMX]>; + list pattern, InstrItinClass itin = IIC_DEFAULT> + : Ii8, XS, Requires<[HasMMX]>; Modified: llvm/trunk/lib/Target/X86/X86InstrMMX.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrMMX.td?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrMMX.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrMMX.td Wed Feb 1 17:20:51 2012 @@ -105,19 +105,23 @@ Intrinsic Int, X86MemOperand x86memop, PatFrag ld_frag, string asm, Domain d> { def irr : PI; + [(set DstRC:$dst, (Int SrcRC:$src))], + IIC_DEFAULT, d>; def irm : PI; + [(set DstRC:$dst, (Int (ld_frag addr:$src)))], + IIC_DEFAULT, d>; } 
multiclass sse12_cvt_pint_3addr opc, RegisterClass SrcRC, RegisterClass DstRC, Intrinsic Int, X86MemOperand x86memop, PatFrag ld_frag, string asm, Domain d> { def irr : PI; + asm, [(set DstRC:$dst, (Int DstRC:$src1, SrcRC:$src2))], + IIC_DEFAULT, d>; def irm : PI; + [(set DstRC:$dst, (Int DstRC:$src1, (ld_frag addr:$src2)))], + IIC_DEFAULT, d>; } //===----------------------------------------------------------------------===// Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Wed Feb 1 17:20:51 2012 @@ -67,13 +67,14 @@ !if(Is2Addr, !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, $src2}"), !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), - [(set RC:$dst, (vt (OpNode RC:$src1, RC:$src2)))], d>; + [(set RC:$dst, (vt (OpNode RC:$src1, RC:$src2)))], IIC_DEFAULT, d>; let mayLoad = 1 in def rm : PI; + [(set RC:$dst, (OpNode RC:$src1, (mem_frag addr:$src2)))], + IIC_DEFAULT, d>; } /// sse12_fp_packed_logical_rm - SSE 1 & 2 packed instructions class @@ -87,12 +88,12 @@ !if(Is2Addr, !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, $src2}"), !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), - pat_rr, d>; + pat_rr, IIC_DEFAULT, d>; def rm : PI; + pat_rm, IIC_DEFAULT, d>; } /// sse12_fp_packed_int - SSE 1 & 2 packed instructions intrinsics class @@ -106,14 +107,14 @@ !strconcat(asm, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), [(set RC:$dst, (!cast( !strconcat("int_x86_", SSEVer, "_", OpcodeStr, FPSizeStr)) - RC:$src1, RC:$src2))], d>; + RC:$src1, RC:$src2))], IIC_DEFAULT, d>; def rm_Int : PI( !strconcat("int_x86_", SSEVer, "_", OpcodeStr, FPSizeStr)) - RC:$src1, (mem_frag addr:$src2)))], d>; + RC:$src1, (mem_frag addr:$src2)))], IIC_DEFAULT, d>; } 
//===----------------------------------------------------------------------===// @@ -737,11 +738,11 @@ bit IsReMaterializable = 1> { let neverHasSideEffects = 1 in def rr : PI; + !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], IIC_DEFAULT, d>; let canFoldAsLoad = 1, isReMaterializable = IsReMaterializable in def rm : PI; + [(set RC:$dst, (ld_frag addr:$src))], IIC_DEFAULT, d>; } defm VMOVAPS : sse12_mov_packed<0x28, VR128, f128mem, alignedloadv4f32, @@ -1003,14 +1004,14 @@ [(set RC:$dst, (mov_frag RC:$src1, (bc_v4f32 (v2f64 (scalar_to_vector (loadf64 addr:$src2))))))], - SSEPackedSingle>, TB; + IIC_DEFAULT, SSEPackedSingle>, TB; def PDrm : PI, TB, OpSize; + IIC_DEFAULT, SSEPackedDouble>, TB, OpSize; } let AddedComplexity = 20 in { @@ -1413,9 +1414,11 @@ SDNode OpNode, X86MemOperand x86memop, PatFrag ld_frag, string asm, Domain d> { def rr : PI; + [(set DstRC:$dst, (OpNode SrcRC:$src))], + IIC_DEFAULT, d>; def rm : PI; + [(set DstRC:$dst, (OpNode (ld_frag addr:$src)))], + IIC_DEFAULT, d>; } multiclass sse12_vcvt_avx opc, RegisterClass SrcRC, RegisterClass DstRC, @@ -2124,11 +2127,13 @@ PatFrag ld_frag, string OpcodeStr, Domain d> { def rr: PI; + [(set EFLAGS, (OpNode (vt RC:$src1), RC:$src2))], + IIC_DEFAULT, d>; def rm: PI; + (ld_frag addr:$src2)))], + IIC_DEFAULT, d>; } let Defs = [EFLAGS] in { @@ -2185,19 +2190,21 @@ let isAsmParserOnly = 1 in { def rri : PIi8<0xC2, MRMSrcReg, (outs RC:$dst), (ins RC:$src1, RC:$src2, SSECC:$cc), asm, - [(set RC:$dst, (Int RC:$src1, RC:$src2, imm:$cc))], d>; + [(set RC:$dst, (Int RC:$src1, RC:$src2, imm:$cc))], + IIC_DEFAULT, d>; def rmi : PIi8<0xC2, MRMSrcMem, (outs RC:$dst), (ins RC:$src1, x86memop:$src2, SSECC:$cc), asm, - [(set RC:$dst, (Int RC:$src1, (memop addr:$src2), imm:$cc))], d>; + [(set RC:$dst, (Int RC:$src1, (memop addr:$src2), imm:$cc))], + IIC_DEFAULT, d>; } // Accept explicit immediate argument form instead of comparison code. 
def rri_alt : PIi8<0xC2, MRMSrcReg, (outs RC:$dst), (ins RC:$src1, RC:$src2, i8imm:$cc), - asm_alt, [], d>; + asm_alt, [], IIC_DEFAULT, d>; def rmi_alt : PIi8<0xC2, MRMSrcMem, (outs RC:$dst), (ins RC:$src1, x86memop:$src2, i8imm:$cc), - asm_alt, [], d>; + asm_alt, [], IIC_DEFAULT, d>; } defm VCMPPS : sse12_cmp_packed; + RC:$src1, (mem_frag addr:$src2))))], + IIC_DEFAULT, d>; let isConvertibleToThreeAddress = IsConvertibleToThreeAddress in def rri : PIi8<0xC6, MRMSrcReg, (outs RC:$dst), (ins RC:$src1, RC:$src2, i8imm:$src3), asm, [(set RC:$dst, - (vt (shufp:$src3 RC:$src1, RC:$src2)))], d>; + (vt (shufp:$src3 RC:$src1, RC:$src2)))], + IIC_DEFAULT, d>; } defm VSHUFPS : sse12_shuffle; + (vt (OpNode RC:$src1, RC:$src2)))], + IIC_DEFAULT, d>; def rm : PI; + (mem_frag addr:$src2))))], + IIC_DEFAULT, d>; } let AddedComplexity = 10 in { @@ -2589,9 +2600,10 @@ Domain d> { def rr32 : PI<0x50, MRMSrcReg, (outs GR32:$dst), (ins RC:$src), !strconcat(asm, "\t{$src, $dst|$dst, $src}"), - [(set GR32:$dst, (Int RC:$src))], d>; + [(set GR32:$dst, (Int RC:$src))], IIC_DEFAULT, d>; def rr64 : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins RC:$src), - !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], d>, REX_W; + !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], + IIC_DEFAULT, d>, REX_W; } let Predicates = [HasAVX] in { @@ -2621,14 +2633,18 @@ // Assembler Only def VMOVMSKPSr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR128:$src), - "movmskps\t{$src, $dst|$dst, $src}", [], SSEPackedSingle>, TB, VEX; + "movmskps\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT, + SSEPackedSingle>, TB, VEX; def VMOVMSKPDr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR128:$src), - "movmskpd\t{$src, $dst|$dst, $src}", [], SSEPackedDouble>, TB, + "movmskpd\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT, + SSEPackedDouble>, TB, OpSize, VEX; def VMOVMSKPSYr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR256:$src), - "movmskps\t{$src, $dst|$dst, $src}", [], SSEPackedSingle>, TB, VEX; + "movmskps\t{$src, 
$dst|$dst, $src}", [], IIC_DEFAULT, + SSEPackedSingle>, TB, VEX; def VMOVMSKPDYr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR256:$src), - "movmskpd\t{$src, $dst|$dst, $src}", [], SSEPackedDouble>, TB, + "movmskpd\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT, + SSEPackedDouble>, TB, OpSize, VEX; } @@ -6395,7 +6411,7 @@ !strconcat(OpcodeStr, "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), [(set RC:$dst, (IntId RC:$src1, RC:$src2, RC:$src3))], - SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM; + IIC_DEFAULT, SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM; def rm : Ii8, OpSize, TA, VEX_4V, VEX_I8IMM; + IIC_DEFAULT, SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM; } let Predicates = [HasAVX] in { Modified: llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td Wed Feb 1 17:20:51 2012 @@ -19,44 +19,46 @@ let Uses = [CL] in { def SHL8rCL : I<0xD2, MRM4r, (outs GR8 :$dst), (ins GR8 :$src1), "shl{b}\t{%cl, $dst|$dst, CL}", - [(set GR8:$dst, (shl GR8:$src1, CL))]>; + [(set GR8:$dst, (shl GR8:$src1, CL))], IIC_SR>; def SHL16rCL : I<0xD3, MRM4r, (outs GR16:$dst), (ins GR16:$src1), "shl{w}\t{%cl, $dst|$dst, CL}", - [(set GR16:$dst, (shl GR16:$src1, CL))]>, OpSize; + [(set GR16:$dst, (shl GR16:$src1, CL))], IIC_SR>, OpSize; def SHL32rCL : I<0xD3, MRM4r, (outs GR32:$dst), (ins GR32:$src1), "shl{l}\t{%cl, $dst|$dst, CL}", - [(set GR32:$dst, (shl GR32:$src1, CL))]>; + [(set GR32:$dst, (shl GR32:$src1, CL))], IIC_SR>; def SHL64rCL : RI<0xD3, MRM4r, (outs GR64:$dst), (ins GR64:$src1), "shl{q}\t{%cl, $dst|$dst, CL}", - [(set GR64:$dst, (shl GR64:$src1, CL))]>; + [(set GR64:$dst, (shl GR64:$src1, CL))], IIC_SR>; } // Uses = [CL] def SHL8ri : Ii8<0xC0, MRM4r, 
(outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2), "shl{b}\t{$src2, $dst|$dst, $src2}", - [(set GR8:$dst, (shl GR8:$src1, (i8 imm:$src2)))]>; + [(set GR8:$dst, (shl GR8:$src1, (i8 imm:$src2)))], IIC_SR>; let isConvertibleToThreeAddress = 1 in { // Can transform into LEA. def SHL16ri : Ii8<0xC1, MRM4r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2), "shl{w}\t{$src2, $dst|$dst, $src2}", - [(set GR16:$dst, (shl GR16:$src1, (i8 imm:$src2)))]>, OpSize; + [(set GR16:$dst, (shl GR16:$src1, (i8 imm:$src2)))], IIC_SR>, + OpSize; def SHL32ri : Ii8<0xC1, MRM4r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2), "shl{l}\t{$src2, $dst|$dst, $src2}", - [(set GR32:$dst, (shl GR32:$src1, (i8 imm:$src2)))]>; + [(set GR32:$dst, (shl GR32:$src1, (i8 imm:$src2)))], IIC_SR>; def SHL64ri : RIi8<0xC1, MRM4r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$src2), "shl{q}\t{$src2, $dst|$dst, $src2}", - [(set GR64:$dst, (shl GR64:$src1, (i8 imm:$src2)))]>; + [(set GR64:$dst, (shl GR64:$src1, (i8 imm:$src2)))], + IIC_SR>; // NOTE: We don't include patterns for shifts of a register by one, because // 'add reg,reg' is cheaper (and we have a Pat pattern for shift-by-one). 
 def SHL8r1 : I<0xD0, MRM4r, (outs GR8:$dst), (ins GR8:$src1),
-                "shl{b}\t$dst", []>;
+                "shl{b}\t$dst", [], IIC_SR>;
 def SHL16r1 : I<0xD1, MRM4r, (outs GR16:$dst), (ins GR16:$src1),
-                "shl{w}\t$dst", []>, OpSize;
+                "shl{w}\t$dst", [], IIC_SR>, OpSize;
 def SHL32r1 : I<0xD1, MRM4r, (outs GR32:$dst), (ins GR32:$src1),
-                "shl{l}\t$dst", []>;
+                "shl{l}\t$dst", [], IIC_SR>;
 def SHL64r1 : RI<0xD1, MRM4r, (outs GR64:$dst), (ins GR64:$src1),
-                "shl{q}\t$dst", []>;
+                "shl{q}\t$dst", [], IIC_SR>;
 } // isConvertibleToThreeAddress = 1
 } // Constraints = "$src = $dst"
@@ -66,223 +68,266 @@
 let Uses = [CL] in {
 def SHL8mCL : I<0xD2, MRM4m, (outs), (ins i8mem :$dst),
                 "shl{b}\t{%cl, $dst|$dst, CL}",
-                [(store (shl (loadi8 addr:$dst), CL), addr:$dst)]>;
+                [(store (shl (loadi8 addr:$dst), CL), addr:$dst)], IIC_SR>;
 def SHL16mCL : I<0xD3, MRM4m, (outs), (ins i16mem:$dst),
                 "shl{w}\t{%cl, $dst|$dst, CL}",
-                [(store (shl (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize;
+                [(store (shl (loadi16 addr:$dst), CL), addr:$dst)], IIC_SR>,
+                OpSize;
 def SHL32mCL : I<0xD3, MRM4m, (outs), (ins i32mem:$dst),
                 "shl{l}\t{%cl, $dst|$dst, CL}",
-                [(store (shl (loadi32 addr:$dst), CL), addr:$dst)]>;
+                [(store (shl (loadi32 addr:$dst), CL), addr:$dst)], IIC_SR>;
 def SHL64mCL : RI<0xD3, MRM4m, (outs), (ins i64mem:$dst),
                 "shl{q}\t{%cl, $dst|$dst, CL}",
-                [(store (shl (loadi64 addr:$dst), CL), addr:$dst)]>;
+                [(store (shl (loadi64 addr:$dst), CL), addr:$dst)], IIC_SR>;
 }
 def SHL8mi : Ii8<0xC0, MRM4m, (outs), (ins i8mem :$dst, i8imm:$src),
                 "shl{b}\t{$src, $dst|$dst, $src}",
-                [(store (shl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
+                [(store (shl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)],
+                IIC_SR>;
 def SHL16mi : Ii8<0xC1, MRM4m, (outs), (ins i16mem:$dst, i8imm:$src),
                 "shl{w}\t{$src, $dst|$dst, $src}",
-                [(store (shl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>,
+                [(store (shl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)],
+                IIC_SR>,
                 OpSize;
 def SHL32mi : Ii8<0xC1, MRM4m, (outs), (ins
i32mem:$dst, i8imm:$src), "shl{l}\t{$src, $dst|$dst, $src}", - [(store (shl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (shl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; def SHL64mi : RIi8<0xC1, MRM4m, (outs), (ins i64mem:$dst, i8imm:$src), "shl{q}\t{$src, $dst|$dst, $src}", - [(store (shl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (shl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; // Shift by 1 def SHL8m1 : I<0xD0, MRM4m, (outs), (ins i8mem :$dst), "shl{b}\t$dst", - [(store (shl (loadi8 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (shl (loadi8 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; def SHL16m1 : I<0xD1, MRM4m, (outs), (ins i16mem:$dst), "shl{w}\t$dst", - [(store (shl (loadi16 addr:$dst), (i8 1)), addr:$dst)]>, + [(store (shl (loadi16 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>, OpSize; def SHL32m1 : I<0xD1, MRM4m, (outs), (ins i32mem:$dst), "shl{l}\t$dst", - [(store (shl (loadi32 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (shl (loadi32 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; def SHL64m1 : RI<0xD1, MRM4m, (outs), (ins i64mem:$dst), "shl{q}\t$dst", - [(store (shl (loadi64 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (shl (loadi64 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; let Constraints = "$src1 = $dst" in { let Uses = [CL] in { def SHR8rCL : I<0xD2, MRM5r, (outs GR8 :$dst), (ins GR8 :$src1), "shr{b}\t{%cl, $dst|$dst, CL}", - [(set GR8:$dst, (srl GR8:$src1, CL))]>; + [(set GR8:$dst, (srl GR8:$src1, CL))], IIC_SR>; def SHR16rCL : I<0xD3, MRM5r, (outs GR16:$dst), (ins GR16:$src1), "shr{w}\t{%cl, $dst|$dst, CL}", - [(set GR16:$dst, (srl GR16:$src1, CL))]>, OpSize; + [(set GR16:$dst, (srl GR16:$src1, CL))], IIC_SR>, OpSize; def SHR32rCL : I<0xD3, MRM5r, (outs GR32:$dst), (ins GR32:$src1), "shr{l}\t{%cl, $dst|$dst, CL}", - [(set GR32:$dst, (srl GR32:$src1, CL))]>; + [(set GR32:$dst, (srl GR32:$src1, CL))], IIC_SR>; def SHR64rCL : RI<0xD3, MRM5r, (outs GR64:$dst), (ins GR64:$src1), 
"shr{q}\t{%cl, $dst|$dst, CL}", - [(set GR64:$dst, (srl GR64:$src1, CL))]>; + [(set GR64:$dst, (srl GR64:$src1, CL))], IIC_SR>; } def SHR8ri : Ii8<0xC0, MRM5r, (outs GR8:$dst), (ins GR8:$src1, i8imm:$src2), "shr{b}\t{$src2, $dst|$dst, $src2}", - [(set GR8:$dst, (srl GR8:$src1, (i8 imm:$src2)))]>; + [(set GR8:$dst, (srl GR8:$src1, (i8 imm:$src2)))], IIC_SR>; def SHR16ri : Ii8<0xC1, MRM5r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2), "shr{w}\t{$src2, $dst|$dst, $src2}", - [(set GR16:$dst, (srl GR16:$src1, (i8 imm:$src2)))]>, OpSize; + [(set GR16:$dst, (srl GR16:$src1, (i8 imm:$src2)))], + IIC_SR>, OpSize; def SHR32ri : Ii8<0xC1, MRM5r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2), "shr{l}\t{$src2, $dst|$dst, $src2}", - [(set GR32:$dst, (srl GR32:$src1, (i8 imm:$src2)))]>; + [(set GR32:$dst, (srl GR32:$src1, (i8 imm:$src2)))], + IIC_SR>; def SHR64ri : RIi8<0xC1, MRM5r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$src2), "shr{q}\t{$src2, $dst|$dst, $src2}", - [(set GR64:$dst, (srl GR64:$src1, (i8 imm:$src2)))]>; + [(set GR64:$dst, (srl GR64:$src1, (i8 imm:$src2)))], IIC_SR>; // Shift right by 1 def SHR8r1 : I<0xD0, MRM5r, (outs GR8:$dst), (ins GR8:$src1), "shr{b}\t$dst", - [(set GR8:$dst, (srl GR8:$src1, (i8 1)))]>; + [(set GR8:$dst, (srl GR8:$src1, (i8 1)))], IIC_SR>; def SHR16r1 : I<0xD1, MRM5r, (outs GR16:$dst), (ins GR16:$src1), "shr{w}\t$dst", - [(set GR16:$dst, (srl GR16:$src1, (i8 1)))]>, OpSize; + [(set GR16:$dst, (srl GR16:$src1, (i8 1)))], IIC_SR>, OpSize; def SHR32r1 : I<0xD1, MRM5r, (outs GR32:$dst), (ins GR32:$src1), "shr{l}\t$dst", - [(set GR32:$dst, (srl GR32:$src1, (i8 1)))]>; + [(set GR32:$dst, (srl GR32:$src1, (i8 1)))], IIC_SR>; def SHR64r1 : RI<0xD1, MRM5r, (outs GR64:$dst), (ins GR64:$src1), "shr{q}\t$dst", - [(set GR64:$dst, (srl GR64:$src1, (i8 1)))]>; + [(set GR64:$dst, (srl GR64:$src1, (i8 1)))], IIC_SR>; } // Constraints = "$src = $dst" let Uses = [CL] in { def SHR8mCL : I<0xD2, MRM5m, (outs), (ins i8mem :$dst), "shr{b}\t{%cl, $dst|$dst, 
CL}", - [(store (srl (loadi8 addr:$dst), CL), addr:$dst)]>; + [(store (srl (loadi8 addr:$dst), CL), addr:$dst)], IIC_SR>; def SHR16mCL : I<0xD3, MRM5m, (outs), (ins i16mem:$dst), "shr{w}\t{%cl, $dst|$dst, CL}", - [(store (srl (loadi16 addr:$dst), CL), addr:$dst)]>, + [(store (srl (loadi16 addr:$dst), CL), addr:$dst)], IIC_SR>, OpSize; def SHR32mCL : I<0xD3, MRM5m, (outs), (ins i32mem:$dst), "shr{l}\t{%cl, $dst|$dst, CL}", - [(store (srl (loadi32 addr:$dst), CL), addr:$dst)]>; + [(store (srl (loadi32 addr:$dst), CL), addr:$dst)], IIC_SR>; def SHR64mCL : RI<0xD3, MRM5m, (outs), (ins i64mem:$dst), "shr{q}\t{%cl, $dst|$dst, CL}", - [(store (srl (loadi64 addr:$dst), CL), addr:$dst)]>; + [(store (srl (loadi64 addr:$dst), CL), addr:$dst)], IIC_SR>; } def SHR8mi : Ii8<0xC0, MRM5m, (outs), (ins i8mem :$dst, i8imm:$src), "shr{b}\t{$src, $dst|$dst, $src}", - [(store (srl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (srl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; def SHR16mi : Ii8<0xC1, MRM5m, (outs), (ins i16mem:$dst, i8imm:$src), "shr{w}\t{$src, $dst|$dst, $src}", - [(store (srl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>, + [(store (srl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>, OpSize; def SHR32mi : Ii8<0xC1, MRM5m, (outs), (ins i32mem:$dst, i8imm:$src), "shr{l}\t{$src, $dst|$dst, $src}", - [(store (srl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (srl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; def SHR64mi : RIi8<0xC1, MRM5m, (outs), (ins i64mem:$dst, i8imm:$src), "shr{q}\t{$src, $dst|$dst, $src}", - [(store (srl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (srl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; // Shift by 1 def SHR8m1 : I<0xD0, MRM5m, (outs), (ins i8mem :$dst), "shr{b}\t$dst", - [(store (srl (loadi8 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (srl (loadi8 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; def SHR16m1 : I<0xD1, MRM5m, (outs), 
(ins i16mem:$dst), "shr{w}\t$dst", - [(store (srl (loadi16 addr:$dst), (i8 1)), addr:$dst)]>,OpSize; + [(store (srl (loadi16 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>,OpSize; def SHR32m1 : I<0xD1, MRM5m, (outs), (ins i32mem:$dst), "shr{l}\t$dst", - [(store (srl (loadi32 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (srl (loadi32 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; def SHR64m1 : RI<0xD1, MRM5m, (outs), (ins i64mem:$dst), "shr{q}\t$dst", - [(store (srl (loadi64 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (srl (loadi64 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; let Constraints = "$src1 = $dst" in { let Uses = [CL] in { def SAR8rCL : I<0xD2, MRM7r, (outs GR8 :$dst), (ins GR8 :$src1), "sar{b}\t{%cl, $dst|$dst, CL}", - [(set GR8:$dst, (sra GR8:$src1, CL))]>; + [(set GR8:$dst, (sra GR8:$src1, CL))], + IIC_SR>; def SAR16rCL : I<0xD3, MRM7r, (outs GR16:$dst), (ins GR16:$src1), "sar{w}\t{%cl, $dst|$dst, CL}", - [(set GR16:$dst, (sra GR16:$src1, CL))]>, OpSize; + [(set GR16:$dst, (sra GR16:$src1, CL))], + IIC_SR>, OpSize; def SAR32rCL : I<0xD3, MRM7r, (outs GR32:$dst), (ins GR32:$src1), "sar{l}\t{%cl, $dst|$dst, CL}", - [(set GR32:$dst, (sra GR32:$src1, CL))]>; + [(set GR32:$dst, (sra GR32:$src1, CL))], + IIC_SR>; def SAR64rCL : RI<0xD3, MRM7r, (outs GR64:$dst), (ins GR64:$src1), "sar{q}\t{%cl, $dst|$dst, CL}", - [(set GR64:$dst, (sra GR64:$src1, CL))]>; + [(set GR64:$dst, (sra GR64:$src1, CL))], + IIC_SR>; } def SAR8ri : Ii8<0xC0, MRM7r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2), "sar{b}\t{$src2, $dst|$dst, $src2}", - [(set GR8:$dst, (sra GR8:$src1, (i8 imm:$src2)))]>; + [(set GR8:$dst, (sra GR8:$src1, (i8 imm:$src2)))], + IIC_SR>; def SAR16ri : Ii8<0xC1, MRM7r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2), "sar{w}\t{$src2, $dst|$dst, $src2}", - [(set GR16:$dst, (sra GR16:$src1, (i8 imm:$src2)))]>, + [(set GR16:$dst, (sra GR16:$src1, (i8 imm:$src2)))], + IIC_SR>, OpSize; def SAR32ri : Ii8<0xC1, MRM7r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2), 
"sar{l}\t{$src2, $dst|$dst, $src2}", - [(set GR32:$dst, (sra GR32:$src1, (i8 imm:$src2)))]>; + [(set GR32:$dst, (sra GR32:$src1, (i8 imm:$src2)))], + IIC_SR>; def SAR64ri : RIi8<0xC1, MRM7r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$src2), "sar{q}\t{$src2, $dst|$dst, $src2}", - [(set GR64:$dst, (sra GR64:$src1, (i8 imm:$src2)))]>; + [(set GR64:$dst, (sra GR64:$src1, (i8 imm:$src2)))], + IIC_SR>; // Shift by 1 def SAR8r1 : I<0xD0, MRM7r, (outs GR8 :$dst), (ins GR8 :$src1), "sar{b}\t$dst", - [(set GR8:$dst, (sra GR8:$src1, (i8 1)))]>; + [(set GR8:$dst, (sra GR8:$src1, (i8 1)))], + IIC_SR>; def SAR16r1 : I<0xD1, MRM7r, (outs GR16:$dst), (ins GR16:$src1), "sar{w}\t$dst", - [(set GR16:$dst, (sra GR16:$src1, (i8 1)))]>, OpSize; + [(set GR16:$dst, (sra GR16:$src1, (i8 1)))], + IIC_SR>, OpSize; def SAR32r1 : I<0xD1, MRM7r, (outs GR32:$dst), (ins GR32:$src1), "sar{l}\t$dst", - [(set GR32:$dst, (sra GR32:$src1, (i8 1)))]>; + [(set GR32:$dst, (sra GR32:$src1, (i8 1)))], + IIC_SR>; def SAR64r1 : RI<0xD1, MRM7r, (outs GR64:$dst), (ins GR64:$src1), "sar{q}\t$dst", - [(set GR64:$dst, (sra GR64:$src1, (i8 1)))]>; + [(set GR64:$dst, (sra GR64:$src1, (i8 1)))], + IIC_SR>; } // Constraints = "$src = $dst" let Uses = [CL] in { def SAR8mCL : I<0xD2, MRM7m, (outs), (ins i8mem :$dst), "sar{b}\t{%cl, $dst|$dst, CL}", - [(store (sra (loadi8 addr:$dst), CL), addr:$dst)]>; + [(store (sra (loadi8 addr:$dst), CL), addr:$dst)], + IIC_SR>; def SAR16mCL : I<0xD3, MRM7m, (outs), (ins i16mem:$dst), "sar{w}\t{%cl, $dst|$dst, CL}", - [(store (sra (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize; + [(store (sra (loadi16 addr:$dst), CL), addr:$dst)], + IIC_SR>, OpSize; def SAR32mCL : I<0xD3, MRM7m, (outs), (ins i32mem:$dst), "sar{l}\t{%cl, $dst|$dst, CL}", - [(store (sra (loadi32 addr:$dst), CL), addr:$dst)]>; + [(store (sra (loadi32 addr:$dst), CL), addr:$dst)], + IIC_SR>; def SAR64mCL : RI<0xD3, MRM7m, (outs), (ins i64mem:$dst), "sar{q}\t{%cl, $dst|$dst, CL}", - [(store (sra (loadi64 addr:$dst), CL), 
addr:$dst)]>; + [(store (sra (loadi64 addr:$dst), CL), addr:$dst)], + IIC_SR>; } def SAR8mi : Ii8<0xC0, MRM7m, (outs), (ins i8mem :$dst, i8imm:$src), "sar{b}\t{$src, $dst|$dst, $src}", - [(store (sra (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (sra (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; def SAR16mi : Ii8<0xC1, MRM7m, (outs), (ins i16mem:$dst, i8imm:$src), "sar{w}\t{$src, $dst|$dst, $src}", - [(store (sra (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>, + [(store (sra (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>, OpSize; def SAR32mi : Ii8<0xC1, MRM7m, (outs), (ins i32mem:$dst, i8imm:$src), "sar{l}\t{$src, $dst|$dst, $src}", - [(store (sra (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (sra (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; def SAR64mi : RIi8<0xC1, MRM7m, (outs), (ins i64mem:$dst, i8imm:$src), "sar{q}\t{$src, $dst|$dst, $src}", - [(store (sra (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (sra (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; // Shift by 1 def SAR8m1 : I<0xD0, MRM7m, (outs), (ins i8mem :$dst), "sar{b}\t$dst", - [(store (sra (loadi8 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (sra (loadi8 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; def SAR16m1 : I<0xD1, MRM7m, (outs), (ins i16mem:$dst), "sar{w}\t$dst", - [(store (sra (loadi16 addr:$dst), (i8 1)), addr:$dst)]>, + [(store (sra (loadi16 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>, OpSize; def SAR32m1 : I<0xD1, MRM7m, (outs), (ins i32mem:$dst), "sar{l}\t$dst", - [(store (sra (loadi32 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (sra (loadi32 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; def SAR64m1 : RI<0xD1, MRM7m, (outs), (ins i64mem:$dst), "sar{q}\t$dst", - [(store (sra (loadi64 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (sra (loadi64 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; //===----------------------------------------------------------------------===// // Rotate instructions 
@@ -290,125 +335,125 @@
 let Constraints = "$src1 = $dst" in {
 def RCL8r1 : I<0xD0, MRM2r, (outs GR8:$dst), (ins GR8:$src1),
-                "rcl{b}\t$dst", []>;
+                "rcl{b}\t$dst", [], IIC_SR>;
 def RCL8ri : Ii8<0xC0, MRM2r, (outs GR8:$dst), (ins GR8:$src1, i8imm:$cnt),
-                "rcl{b}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcl{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 let Uses = [CL] in
 def RCL8rCL : I<0xD2, MRM2r, (outs GR8:$dst), (ins GR8:$src1),
-                "rcl{b}\t{%cl, $dst|$dst, CL}", []>;
+                "rcl{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 def RCL16r1 : I<0xD1, MRM2r, (outs GR16:$dst), (ins GR16:$src1),
-                "rcl{w}\t$dst", []>, OpSize;
+                "rcl{w}\t$dst", [], IIC_SR>, OpSize;
 def RCL16ri : Ii8<0xC1, MRM2r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$cnt),
-                "rcl{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
+                "rcl{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
 let Uses = [CL] in
 def RCL16rCL : I<0xD3, MRM2r, (outs GR16:$dst), (ins GR16:$src1),
-                "rcl{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
+                "rcl{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
 def RCL32r1 : I<0xD1, MRM2r, (outs GR32:$dst), (ins GR32:$src1),
-                "rcl{l}\t$dst", []>;
+                "rcl{l}\t$dst", [], IIC_SR>;
 def RCL32ri : Ii8<0xC1, MRM2r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$cnt),
-                "rcl{l}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcl{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 let Uses = [CL] in
 def RCL32rCL : I<0xD3, MRM2r, (outs GR32:$dst), (ins GR32:$src1),
-                "rcl{l}\t{%cl, $dst|$dst, CL}", []>;
+                "rcl{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 def RCL64r1 : RI<0xD1, MRM2r, (outs GR64:$dst), (ins GR64:$src1),
-                "rcl{q}\t$dst", []>;
+                "rcl{q}\t$dst", [], IIC_SR>;
 def RCL64ri : RIi8<0xC1, MRM2r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$cnt),
-                "rcl{q}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcl{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 let Uses = [CL] in
 def RCL64rCL : RI<0xD3, MRM2r, (outs GR64:$dst), (ins GR64:$src1),
-                "rcl{q}\t{%cl, $dst|$dst, CL}", []>;
+                "rcl{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 def RCR8r1 : I<0xD0, MRM3r, (outs GR8:$dst), (ins GR8:$src1),
-                "rcr{b}\t$dst", []>;
+                "rcr{b}\t$dst", [], IIC_SR>;
 def RCR8ri : Ii8<0xC0, MRM3r, (outs GR8:$dst), (ins GR8:$src1, i8imm:$cnt),
-                "rcr{b}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcr{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 let Uses = [CL] in
 def RCR8rCL : I<0xD2, MRM3r, (outs GR8:$dst), (ins GR8:$src1),
-                "rcr{b}\t{%cl, $dst|$dst, CL}", []>;
+                "rcr{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 def RCR16r1 : I<0xD1, MRM3r, (outs GR16:$dst), (ins GR16:$src1),
-                "rcr{w}\t$dst", []>, OpSize;
+                "rcr{w}\t$dst", [], IIC_SR>, OpSize;
 def RCR16ri : Ii8<0xC1, MRM3r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$cnt),
-                "rcr{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
+                "rcr{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
 let Uses = [CL] in
 def RCR16rCL : I<0xD3, MRM3r, (outs GR16:$dst), (ins GR16:$src1),
-                "rcr{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
+                "rcr{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
 def RCR32r1 : I<0xD1, MRM3r, (outs GR32:$dst), (ins GR32:$src1),
-                "rcr{l}\t$dst", []>;
+                "rcr{l}\t$dst", [], IIC_SR>;
 def RCR32ri : Ii8<0xC1, MRM3r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$cnt),
-                "rcr{l}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcr{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 let Uses = [CL] in
 def RCR32rCL : I<0xD3, MRM3r, (outs GR32:$dst), (ins GR32:$src1),
-                "rcr{l}\t{%cl, $dst|$dst, CL}", []>;
+                "rcr{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 def RCR64r1 : RI<0xD1, MRM3r, (outs GR64:$dst), (ins GR64:$src1),
-                "rcr{q}\t$dst", []>;
+                "rcr{q}\t$dst", [], IIC_SR>;
 def RCR64ri : RIi8<0xC1, MRM3r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$cnt),
-                "rcr{q}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcr{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 let Uses = [CL] in
 def RCR64rCL : RI<0xD3, MRM3r, (outs GR64:$dst), (ins GR64:$src1),
-                "rcr{q}\t{%cl, $dst|$dst, CL}", []>;
+                "rcr{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 } // Constraints = "$src = $dst"
 def RCL8m1 : I<0xD0, MRM2m, (outs), (ins i8mem:$dst),
-                "rcl{b}\t$dst", []>;
+                "rcl{b}\t$dst", [], IIC_SR>;
 def RCL8mi : Ii8<0xC0, MRM2m, (outs), (ins i8mem:$dst, i8imm:$cnt),
-                "rcl{b}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcl{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 def RCL16m1 : I<0xD1, MRM2m, (outs), (ins i16mem:$dst),
-                "rcl{w}\t$dst", []>, OpSize;
+                "rcl{w}\t$dst", [], IIC_SR>, OpSize;
 def RCL16mi : Ii8<0xC1, MRM2m, (outs), (ins i16mem:$dst, i8imm:$cnt),
-                "rcl{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
+                "rcl{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
 def RCL32m1 : I<0xD1, MRM2m, (outs), (ins i32mem:$dst),
-                "rcl{l}\t$dst", []>;
+                "rcl{l}\t$dst", [], IIC_SR>;
 def RCL32mi : Ii8<0xC1, MRM2m, (outs), (ins i32mem:$dst, i8imm:$cnt),
-                "rcl{l}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcl{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 def RCL64m1 : RI<0xD1, MRM2m, (outs), (ins i64mem:$dst),
-                "rcl{q}\t$dst", []>;
+                "rcl{q}\t$dst", [], IIC_SR>;
 def RCL64mi : RIi8<0xC1, MRM2m, (outs), (ins i64mem:$dst, i8imm:$cnt),
-                "rcl{q}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcl{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 def RCR8m1 : I<0xD0, MRM3m, (outs), (ins i8mem:$dst),
-                "rcr{b}\t$dst", []>;
+                "rcr{b}\t$dst", [], IIC_SR>;
 def RCR8mi : Ii8<0xC0, MRM3m, (outs), (ins i8mem:$dst, i8imm:$cnt),
-                "rcr{b}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcr{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 def RCR16m1 : I<0xD1, MRM3m, (outs), (ins i16mem:$dst),
-                "rcr{w}\t$dst", []>, OpSize;
+                "rcr{w}\t$dst", [], IIC_SR>, OpSize;
 def RCR16mi : Ii8<0xC1, MRM3m, (outs), (ins i16mem:$dst, i8imm:$cnt),
-                "rcr{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
+                "rcr{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
 def RCR32m1 : I<0xD1, MRM3m, (outs), (ins i32mem:$dst),
-                "rcr{l}\t$dst", []>;
+                "rcr{l}\t$dst", [], IIC_SR>;
 def RCR32mi : Ii8<0xC1, MRM3m, (outs), (ins i32mem:$dst, i8imm:$cnt),
-                "rcr{l}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcr{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 def RCR64m1 : RI<0xD1, MRM3m, (outs), (ins i64mem:$dst),
-                "rcr{q}\t$dst", []>;
+                "rcr{q}\t$dst", [], IIC_SR>;
 def RCR64mi : RIi8<0xC1, MRM3m, (outs), (ins i64mem:$dst, i8imm:$cnt),
-                "rcr{q}\t{$cnt, $dst|$dst, $cnt}", []>;
+                "rcr{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
 let Uses = [CL] in {
 def RCL8mCL : I<0xD2, MRM2m, (outs), (ins i8mem:$dst),
-                "rcl{b}\t{%cl, $dst|$dst, CL}", []>;
+                "rcl{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 def RCL16mCL : I<0xD3, MRM2m, (outs), (ins i16mem:$dst),
-                "rcl{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
+                "rcl{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
 def RCL32mCL : I<0xD3, MRM2m, (outs), (ins i32mem:$dst),
-                "rcl{l}\t{%cl, $dst|$dst, CL}", []>;
+                "rcl{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 def RCL64mCL : RI<0xD3, MRM2m, (outs), (ins i64mem:$dst),
-                "rcl{q}\t{%cl, $dst|$dst, CL}", []>;
+                "rcl{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 def RCR8mCL : I<0xD2, MRM3m, (outs), (ins i8mem:$dst),
-                "rcr{b}\t{%cl, $dst|$dst, CL}", []>;
+                "rcr{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 def RCR16mCL : I<0xD3, MRM3m, (outs), (ins i16mem:$dst),
-                "rcr{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
+                "rcr{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
 def RCR32mCL : I<0xD3, MRM3m, (outs), (ins i32mem:$dst),
-                "rcr{l}\t{%cl, $dst|$dst, CL}", []>;
+                "rcr{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 def RCR64mCL : RI<0xD3, MRM3m, (outs), (ins i64mem:$dst),
-                "rcr{q}\t{%cl, $dst|$dst, CL}", []>;
+                "rcr{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
 }
 let Constraints = "$src1 = $dst" in {
@@ -416,179 +461,217 @@
 let Uses = [CL] in {
 def ROL8rCL : I<0xD2, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1),
                 "rol{b}\t{%cl, $dst|$dst, CL}",
-                [(set GR8:$dst, (rotl GR8:$src1, CL))]>;
+                [(set GR8:$dst, (rotl GR8:$src1, CL))], IIC_SR>;
 def ROL16rCL : I<0xD3, MRM0r, (outs GR16:$dst), (ins GR16:$src1),
                 "rol{w}\t{%cl, $dst|$dst, CL}",
-                [(set GR16:$dst, (rotl GR16:$src1, CL))]>, OpSize;
+                [(set GR16:$dst, (rotl GR16:$src1, CL))], IIC_SR>, OpSize;
 def ROL32rCL : I<0xD3, MRM0r, (outs GR32:$dst), (ins GR32:$src1),
                 "rol{l}\t{%cl, $dst|$dst, CL}",
-                [(set GR32:$dst, (rotl GR32:$src1, CL))]>;
+                [(set GR32:$dst, (rotl GR32:$src1, CL))], IIC_SR>;
 def ROL64rCL :
RI<0xD3, MRM0r, (outs GR64:$dst), (ins GR64:$src1), "rol{q}\t{%cl, $dst|$dst, CL}", - [(set GR64:$dst, (rotl GR64:$src1, CL))]>; + [(set GR64:$dst, (rotl GR64:$src1, CL))], IIC_SR>; } def ROL8ri : Ii8<0xC0, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2), "rol{b}\t{$src2, $dst|$dst, $src2}", - [(set GR8:$dst, (rotl GR8:$src1, (i8 imm:$src2)))]>; + [(set GR8:$dst, (rotl GR8:$src1, (i8 imm:$src2)))], IIC_SR>; def ROL16ri : Ii8<0xC1, MRM0r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2), "rol{w}\t{$src2, $dst|$dst, $src2}", - [(set GR16:$dst, (rotl GR16:$src1, (i8 imm:$src2)))]>, + [(set GR16:$dst, (rotl GR16:$src1, (i8 imm:$src2)))], + IIC_SR>, OpSize; def ROL32ri : Ii8<0xC1, MRM0r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2), "rol{l}\t{$src2, $dst|$dst, $src2}", - [(set GR32:$dst, (rotl GR32:$src1, (i8 imm:$src2)))]>; + [(set GR32:$dst, (rotl GR32:$src1, (i8 imm:$src2)))], + IIC_SR>; def ROL64ri : RIi8<0xC1, MRM0r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$src2), "rol{q}\t{$src2, $dst|$dst, $src2}", - [(set GR64:$dst, (rotl GR64:$src1, (i8 imm:$src2)))]>; + [(set GR64:$dst, (rotl GR64:$src1, (i8 imm:$src2)))], + IIC_SR>; // Rotate by 1 def ROL8r1 : I<0xD0, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1), "rol{b}\t$dst", - [(set GR8:$dst, (rotl GR8:$src1, (i8 1)))]>; + [(set GR8:$dst, (rotl GR8:$src1, (i8 1)))], + IIC_SR>; def ROL16r1 : I<0xD1, MRM0r, (outs GR16:$dst), (ins GR16:$src1), "rol{w}\t$dst", - [(set GR16:$dst, (rotl GR16:$src1, (i8 1)))]>, OpSize; + [(set GR16:$dst, (rotl GR16:$src1, (i8 1)))], + IIC_SR>, OpSize; def ROL32r1 : I<0xD1, MRM0r, (outs GR32:$dst), (ins GR32:$src1), "rol{l}\t$dst", - [(set GR32:$dst, (rotl GR32:$src1, (i8 1)))]>; + [(set GR32:$dst, (rotl GR32:$src1, (i8 1)))], + IIC_SR>; def ROL64r1 : RI<0xD1, MRM0r, (outs GR64:$dst), (ins GR64:$src1), "rol{q}\t$dst", - [(set GR64:$dst, (rotl GR64:$src1, (i8 1)))]>; + [(set GR64:$dst, (rotl GR64:$src1, (i8 1)))], + IIC_SR>; } // Constraints = "$src = $dst" let Uses = [CL] in { def ROL8mCL 
: I<0xD2, MRM0m, (outs), (ins i8mem :$dst), "rol{b}\t{%cl, $dst|$dst, CL}", - [(store (rotl (loadi8 addr:$dst), CL), addr:$dst)]>; + [(store (rotl (loadi8 addr:$dst), CL), addr:$dst)], + IIC_SR>; def ROL16mCL : I<0xD3, MRM0m, (outs), (ins i16mem:$dst), "rol{w}\t{%cl, $dst|$dst, CL}", - [(store (rotl (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize; + [(store (rotl (loadi16 addr:$dst), CL), addr:$dst)], + IIC_SR>, OpSize; def ROL32mCL : I<0xD3, MRM0m, (outs), (ins i32mem:$dst), "rol{l}\t{%cl, $dst|$dst, CL}", - [(store (rotl (loadi32 addr:$dst), CL), addr:$dst)]>; + [(store (rotl (loadi32 addr:$dst), CL), addr:$dst)], + IIC_SR>; def ROL64mCL : RI<0xD3, MRM0m, (outs), (ins i64mem:$dst), "rol{q}\t{%cl, $dst|$dst, %cl}", - [(store (rotl (loadi64 addr:$dst), CL), addr:$dst)]>; + [(store (rotl (loadi64 addr:$dst), CL), addr:$dst)], + IIC_SR>; } def ROL8mi : Ii8<0xC0, MRM0m, (outs), (ins i8mem :$dst, i8imm:$src1), "rol{b}\t{$src1, $dst|$dst, $src1}", - [(store (rotl (loadi8 addr:$dst), (i8 imm:$src1)), addr:$dst)]>; + [(store (rotl (loadi8 addr:$dst), (i8 imm:$src1)), addr:$dst)], + IIC_SR>; def ROL16mi : Ii8<0xC1, MRM0m, (outs), (ins i16mem:$dst, i8imm:$src1), "rol{w}\t{$src1, $dst|$dst, $src1}", - [(store (rotl (loadi16 addr:$dst), (i8 imm:$src1)), addr:$dst)]>, + [(store (rotl (loadi16 addr:$dst), (i8 imm:$src1)), addr:$dst)], + IIC_SR>, OpSize; def ROL32mi : Ii8<0xC1, MRM0m, (outs), (ins i32mem:$dst, i8imm:$src1), "rol{l}\t{$src1, $dst|$dst, $src1}", - [(store (rotl (loadi32 addr:$dst), (i8 imm:$src1)), addr:$dst)]>; + [(store (rotl (loadi32 addr:$dst), (i8 imm:$src1)), addr:$dst)], + IIC_SR>; def ROL64mi : RIi8<0xC1, MRM0m, (outs), (ins i64mem:$dst, i8imm:$src1), "rol{q}\t{$src1, $dst|$dst, $src1}", - [(store (rotl (loadi64 addr:$dst), (i8 imm:$src1)), addr:$dst)]>; + [(store (rotl (loadi64 addr:$dst), (i8 imm:$src1)), addr:$dst)], + IIC_SR>; // Rotate by 1 def ROL8m1 : I<0xD0, MRM0m, (outs), (ins i8mem :$dst), "rol{b}\t$dst", - [(store (rotl (loadi8 addr:$dst), (i8 
1)), addr:$dst)]>; + [(store (rotl (loadi8 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; def ROL16m1 : I<0xD1, MRM0m, (outs), (ins i16mem:$dst), "rol{w}\t$dst", - [(store (rotl (loadi16 addr:$dst), (i8 1)), addr:$dst)]>, + [(store (rotl (loadi16 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>, OpSize; def ROL32m1 : I<0xD1, MRM0m, (outs), (ins i32mem:$dst), "rol{l}\t$dst", - [(store (rotl (loadi32 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (rotl (loadi32 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; def ROL64m1 : RI<0xD1, MRM0m, (outs), (ins i64mem:$dst), "rol{q}\t$dst", - [(store (rotl (loadi64 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (rotl (loadi64 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; let Constraints = "$src1 = $dst" in { let Uses = [CL] in { def ROR8rCL : I<0xD2, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1), "ror{b}\t{%cl, $dst|$dst, CL}", - [(set GR8:$dst, (rotr GR8:$src1, CL))]>; + [(set GR8:$dst, (rotr GR8:$src1, CL))], IIC_SR>; def ROR16rCL : I<0xD3, MRM1r, (outs GR16:$dst), (ins GR16:$src1), "ror{w}\t{%cl, $dst|$dst, CL}", - [(set GR16:$dst, (rotr GR16:$src1, CL))]>, OpSize; + [(set GR16:$dst, (rotr GR16:$src1, CL))], IIC_SR>, OpSize; def ROR32rCL : I<0xD3, MRM1r, (outs GR32:$dst), (ins GR32:$src1), "ror{l}\t{%cl, $dst|$dst, CL}", - [(set GR32:$dst, (rotr GR32:$src1, CL))]>; + [(set GR32:$dst, (rotr GR32:$src1, CL))], IIC_SR>; def ROR64rCL : RI<0xD3, MRM1r, (outs GR64:$dst), (ins GR64:$src1), "ror{q}\t{%cl, $dst|$dst, CL}", - [(set GR64:$dst, (rotr GR64:$src1, CL))]>; + [(set GR64:$dst, (rotr GR64:$src1, CL))], IIC_SR>; } def ROR8ri : Ii8<0xC0, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2), "ror{b}\t{$src2, $dst|$dst, $src2}", - [(set GR8:$dst, (rotr GR8:$src1, (i8 imm:$src2)))]>; + [(set GR8:$dst, (rotr GR8:$src1, (i8 imm:$src2)))], IIC_SR>; def ROR16ri : Ii8<0xC1, MRM1r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2), "ror{w}\t{$src2, $dst|$dst, $src2}", - [(set GR16:$dst, (rotr GR16:$src1, (i8 imm:$src2)))]>, + [(set GR16:$dst, (rotr 
GR16:$src1, (i8 imm:$src2)))], + IIC_SR>, OpSize; def ROR32ri : Ii8<0xC1, MRM1r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2), "ror{l}\t{$src2, $dst|$dst, $src2}", - [(set GR32:$dst, (rotr GR32:$src1, (i8 imm:$src2)))]>; + [(set GR32:$dst, (rotr GR32:$src1, (i8 imm:$src2)))], + IIC_SR>; def ROR64ri : RIi8<0xC1, MRM1r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$src2), "ror{q}\t{$src2, $dst|$dst, $src2}", - [(set GR64:$dst, (rotr GR64:$src1, (i8 imm:$src2)))]>; + [(set GR64:$dst, (rotr GR64:$src1, (i8 imm:$src2)))], + IIC_SR>; // Rotate by 1 def ROR8r1 : I<0xD0, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1), "ror{b}\t$dst", - [(set GR8:$dst, (rotr GR8:$src1, (i8 1)))]>; + [(set GR8:$dst, (rotr GR8:$src1, (i8 1)))], + IIC_SR>; def ROR16r1 : I<0xD1, MRM1r, (outs GR16:$dst), (ins GR16:$src1), "ror{w}\t$dst", - [(set GR16:$dst, (rotr GR16:$src1, (i8 1)))]>, OpSize; + [(set GR16:$dst, (rotr GR16:$src1, (i8 1)))], + IIC_SR>, OpSize; def ROR32r1 : I<0xD1, MRM1r, (outs GR32:$dst), (ins GR32:$src1), "ror{l}\t$dst", - [(set GR32:$dst, (rotr GR32:$src1, (i8 1)))]>; + [(set GR32:$dst, (rotr GR32:$src1, (i8 1)))], + IIC_SR>; def ROR64r1 : RI<0xD1, MRM1r, (outs GR64:$dst), (ins GR64:$src1), "ror{q}\t$dst", - [(set GR64:$dst, (rotr GR64:$src1, (i8 1)))]>; + [(set GR64:$dst, (rotr GR64:$src1, (i8 1)))], + IIC_SR>; } // Constraints = "$src = $dst" let Uses = [CL] in { def ROR8mCL : I<0xD2, MRM1m, (outs), (ins i8mem :$dst), "ror{b}\t{%cl, $dst|$dst, CL}", - [(store (rotr (loadi8 addr:$dst), CL), addr:$dst)]>; + [(store (rotr (loadi8 addr:$dst), CL), addr:$dst)], + IIC_SR>; def ROR16mCL : I<0xD3, MRM1m, (outs), (ins i16mem:$dst), "ror{w}\t{%cl, $dst|$dst, CL}", - [(store (rotr (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize; + [(store (rotr (loadi16 addr:$dst), CL), addr:$dst)], + IIC_SR>, OpSize; def ROR32mCL : I<0xD3, MRM1m, (outs), (ins i32mem:$dst), "ror{l}\t{%cl, $dst|$dst, CL}", - [(store (rotr (loadi32 addr:$dst), CL), addr:$dst)]>; + [(store (rotr (loadi32 addr:$dst), CL), 
addr:$dst)], + IIC_SR>; def ROR64mCL : RI<0xD3, MRM1m, (outs), (ins i64mem:$dst), "ror{q}\t{%cl, $dst|$dst, CL}", - [(store (rotr (loadi64 addr:$dst), CL), addr:$dst)]>; + [(store (rotr (loadi64 addr:$dst), CL), addr:$dst)], + IIC_SR>; } def ROR8mi : Ii8<0xC0, MRM1m, (outs), (ins i8mem :$dst, i8imm:$src), "ror{b}\t{$src, $dst|$dst, $src}", - [(store (rotr (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (rotr (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; def ROR16mi : Ii8<0xC1, MRM1m, (outs), (ins i16mem:$dst, i8imm:$src), "ror{w}\t{$src, $dst|$dst, $src}", - [(store (rotr (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>, + [(store (rotr (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>, OpSize; def ROR32mi : Ii8<0xC1, MRM1m, (outs), (ins i32mem:$dst, i8imm:$src), "ror{l}\t{$src, $dst|$dst, $src}", - [(store (rotr (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (rotr (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; def ROR64mi : RIi8<0xC1, MRM1m, (outs), (ins i64mem:$dst, i8imm:$src), "ror{q}\t{$src, $dst|$dst, $src}", - [(store (rotr (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>; + [(store (rotr (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)], + IIC_SR>; // Rotate by 1 def ROR8m1 : I<0xD0, MRM1m, (outs), (ins i8mem :$dst), "ror{b}\t$dst", - [(store (rotr (loadi8 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (rotr (loadi8 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; def ROR16m1 : I<0xD1, MRM1m, (outs), (ins i16mem:$dst), "ror{w}\t$dst", - [(store (rotr (loadi16 addr:$dst), (i8 1)), addr:$dst)]>, + [(store (rotr (loadi16 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>, OpSize; def ROR32m1 : I<0xD1, MRM1m, (outs), (ins i32mem:$dst), "ror{l}\t$dst", - [(store (rotr (loadi32 addr:$dst), (i8 1)), addr:$dst)]>; + [(store (rotr (loadi32 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; def ROR64m1 : RI<0xD1, MRM1m, (outs), (ins i64mem:$dst), "ror{q}\t$dst", - [(store (rotr (loadi64 addr:$dst), (i8 1)), addr:$dst)]>; 
+ [(store (rotr (loadi64 addr:$dst), (i8 1)), addr:$dst)], + IIC_SR>; //===----------------------------------------------------------------------===// @@ -601,30 +684,36 @@ def SHLD16rrCL : I<0xA5, MRMDestReg, (outs GR16:$dst), (ins GR16:$src1, GR16:$src2), "shld{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", - [(set GR16:$dst, (X86shld GR16:$src1, GR16:$src2, CL))]>, + [(set GR16:$dst, (X86shld GR16:$src1, GR16:$src2, CL))], + IIC_SHD16_REG_CL>, TB, OpSize; def SHRD16rrCL : I<0xAD, MRMDestReg, (outs GR16:$dst), (ins GR16:$src1, GR16:$src2), "shrd{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", - [(set GR16:$dst, (X86shrd GR16:$src1, GR16:$src2, CL))]>, + [(set GR16:$dst, (X86shrd GR16:$src1, GR16:$src2, CL))], + IIC_SHD16_REG_CL>, TB, OpSize; def SHLD32rrCL : I<0xA5, MRMDestReg, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2), "shld{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", - [(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2, CL))]>, TB; + [(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2, CL))], + IIC_SHD32_REG_CL>, TB; def SHRD32rrCL : I<0xAD, MRMDestReg, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2), "shrd{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", - [(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2, CL))]>, TB; + [(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2, CL))], + IIC_SHD32_REG_CL>, TB; def SHLD64rrCL : RI<0xA5, MRMDestReg, (outs GR64:$dst), (ins GR64:$src1, GR64:$src2), "shld{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", - [(set GR64:$dst, (X86shld GR64:$src1, GR64:$src2, CL))]>, + [(set GR64:$dst, (X86shld GR64:$src1, GR64:$src2, CL))], + IIC_SHD64_REG_CL>, TB; def SHRD64rrCL : RI<0xAD, MRMDestReg, (outs GR64:$dst), (ins GR64:$src1, GR64:$src2), "shrd{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", - [(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2, CL))]>, + [(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2, CL))], + IIC_SHD64_REG_CL>, TB; } @@ -634,42 +723,42 @@ (ins GR16:$src1, GR16:$src2, i8imm:$src3), "shld{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(set GR16:$dst, (X86shld 
GR16:$src1, GR16:$src2, - (i8 imm:$src3)))]>, + (i8 imm:$src3)))], IIC_SHD16_REG_IM>, TB, OpSize; def SHRD16rri8 : Ii8<0xAC, MRMDestReg, (outs GR16:$dst), (ins GR16:$src1, GR16:$src2, i8imm:$src3), "shrd{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(set GR16:$dst, (X86shrd GR16:$src1, GR16:$src2, - (i8 imm:$src3)))]>, + (i8 imm:$src3)))], IIC_SHD16_REG_IM>, TB, OpSize; def SHLD32rri8 : Ii8<0xA4, MRMDestReg, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2, i8imm:$src3), "shld{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2, - (i8 imm:$src3)))]>, + (i8 imm:$src3)))], IIC_SHD32_REG_IM>, TB; def SHRD32rri8 : Ii8<0xAC, MRMDestReg, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2, i8imm:$src3), "shrd{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2, - (i8 imm:$src3)))]>, + (i8 imm:$src3)))], IIC_SHD32_REG_IM>, TB; def SHLD64rri8 : RIi8<0xA4, MRMDestReg, (outs GR64:$dst), (ins GR64:$src1, GR64:$src2, i8imm:$src3), "shld{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(set GR64:$dst, (X86shld GR64:$src1, GR64:$src2, - (i8 imm:$src3)))]>, + (i8 imm:$src3)))], IIC_SHD64_REG_IM>, TB; def SHRD64rri8 : RIi8<0xAC, MRMDestReg, (outs GR64:$dst), (ins GR64:$src1, GR64:$src2, i8imm:$src3), "shrd{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2, - (i8 imm:$src3)))]>, + (i8 imm:$src3)))], IIC_SHD64_REG_IM>, TB; } } // Constraints = "$src = $dst" @@ -678,68 +767,74 @@ def SHLD16mrCL : I<0xA5, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2), "shld{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", [(store (X86shld (loadi16 addr:$dst), GR16:$src2, CL), - addr:$dst)]>, TB, OpSize; + addr:$dst)], IIC_SHD16_MEM_CL>, TB, OpSize; def SHRD16mrCL : I<0xAD, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2), "shrd{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", [(store (X86shrd (loadi16 addr:$dst), GR16:$src2, CL), - addr:$dst)]>, TB, OpSize; + addr:$dst)], IIC_SHD16_MEM_CL>, 
TB, OpSize; def SHLD32mrCL : I<0xA5, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2), "shld{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", [(store (X86shld (loadi32 addr:$dst), GR32:$src2, CL), - addr:$dst)]>, TB; + addr:$dst)], IIC_SHD32_MEM_CL>, TB; def SHRD32mrCL : I<0xAD, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2), "shrd{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", [(store (X86shrd (loadi32 addr:$dst), GR32:$src2, CL), - addr:$dst)]>, TB; + addr:$dst)], IIC_SHD32_MEM_CL>, TB; def SHLD64mrCL : RI<0xA5, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2), "shld{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", [(store (X86shld (loadi64 addr:$dst), GR64:$src2, CL), - addr:$dst)]>, TB; + addr:$dst)], IIC_SHD64_MEM_CL>, TB; def SHRD64mrCL : RI<0xAD, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2), "shrd{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", [(store (X86shrd (loadi64 addr:$dst), GR64:$src2, CL), - addr:$dst)]>, TB; + addr:$dst)], IIC_SHD64_MEM_CL>, TB; } def SHLD16mri8 : Ii8<0xA4, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2, i8imm:$src3), "shld{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(store (X86shld (loadi16 addr:$dst), GR16:$src2, - (i8 imm:$src3)), addr:$dst)]>, + (i8 imm:$src3)), addr:$dst)], + IIC_SHD16_MEM_IM>, TB, OpSize; def SHRD16mri8 : Ii8<0xAC, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2, i8imm:$src3), "shrd{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(store (X86shrd (loadi16 addr:$dst), GR16:$src2, - (i8 imm:$src3)), addr:$dst)]>, + (i8 imm:$src3)), addr:$dst)], + IIC_SHD16_MEM_IM>, TB, OpSize; def SHLD32mri8 : Ii8<0xA4, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2, i8imm:$src3), "shld{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(store (X86shld (loadi32 addr:$dst), GR32:$src2, - (i8 imm:$src3)), addr:$dst)]>, + (i8 imm:$src3)), addr:$dst)], + IIC_SHD32_MEM_IM>, TB; def SHRD32mri8 : Ii8<0xAC, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2, i8imm:$src3), "shrd{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(store (X86shrd (loadi32 
addr:$dst), GR32:$src2, - (i8 imm:$src3)), addr:$dst)]>, + (i8 imm:$src3)), addr:$dst)], + IIC_SHD32_MEM_IM>, TB; def SHLD64mri8 : RIi8<0xA4, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2, i8imm:$src3), "shld{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(store (X86shld (loadi64 addr:$dst), GR64:$src2, - (i8 imm:$src3)), addr:$dst)]>, + (i8 imm:$src3)), addr:$dst)], + IIC_SHD64_MEM_IM>, TB; def SHRD64mri8 : RIi8<0xAC, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2, i8imm:$src3), "shrd{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", [(store (X86shrd (loadi64 addr:$dst), GR64:$src2, - (i8 imm:$src3)), addr:$dst)]>, + (i8 imm:$src3)), addr:$dst)], + IIC_SHD64_MEM_IM>, TB; } // Defs = [EFLAGS] Added: llvm/trunk/lib/Target/X86/X86Schedule.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Schedule.td?rev=149558&view=auto ============================================================================== --- llvm/trunk/lib/Target/X86/X86Schedule.td (added) +++ llvm/trunk/lib/Target/X86/X86Schedule.td Wed Feb 1 17:20:51 2012 @@ -0,0 +1,115 @@ +//===- X86Schedule.td - X86 Scheduling Definitions ---------*- tablegen -*-===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. 
+// +//===----------------------------------------------------------------------===// + +//===----------------------------------------------------------------------===// +// Instruction Itinerary classes used for X86 +def IIC_DEFAULT : InstrItinClass; +def IIC_ALU_MEM : InstrItinClass; +def IIC_ALU_NONMEM : InstrItinClass; +def IIC_LEA : InstrItinClass; +def IIC_LEA_16 : InstrItinClass; +def IIC_MUL8 : InstrItinClass; +def IIC_MUL16_MEM : InstrItinClass; +def IIC_MUL16_REG : InstrItinClass; +def IIC_MUL32_MEM : InstrItinClass; +def IIC_MUL32_REG : InstrItinClass; +def IIC_MUL64 : InstrItinClass; +// imul by al, ax, eax, rax +def IIC_IMUL8 : InstrItinClass; +def IIC_IMUL16_MEM : InstrItinClass; +def IIC_IMUL16_REG : InstrItinClass; +def IIC_IMUL32_MEM : InstrItinClass; +def IIC_IMUL32_REG : InstrItinClass; +def IIC_IMUL64 : InstrItinClass; +// imul reg by reg|mem +def IIC_IMUL16_RM : InstrItinClass; +def IIC_IMUL16_RR : InstrItinClass; +def IIC_IMUL32_RM : InstrItinClass; +def IIC_IMUL32_RR : InstrItinClass; +def IIC_IMUL64_RM : InstrItinClass; +def IIC_IMUL64_RR : InstrItinClass; +// imul reg = reg/mem * imm +def IIC_IMUL16_RMI : InstrItinClass; +def IIC_IMUL16_RRI : InstrItinClass; +def IIC_IMUL32_RMI : InstrItinClass; +def IIC_IMUL32_RRI : InstrItinClass; +def IIC_IMUL64_RMI : InstrItinClass; +def IIC_IMUL64_RRI : InstrItinClass; +// div +def IIC_DIV8_MEM : InstrItinClass; +def IIC_DIV8_REG : InstrItinClass; +def IIC_DIV16 : InstrItinClass; +def IIC_DIV32 : InstrItinClass; +def IIC_DIV64 : InstrItinClass; +// idiv +def IIC_IDIV8 : InstrItinClass; +def IIC_IDIV16 : InstrItinClass; +def IIC_IDIV32 : InstrItinClass; +def IIC_IDIV64 : InstrItinClass; +// neg/not/inc/dec +def IIC_UNARY_REG : InstrItinClass; +def IIC_UNARY_MEM : InstrItinClass; +// add/sub/and/or/xor/adc/sbc/cmp/test +def IIC_BIN_MEM : InstrItinClass; +def IIC_BIN_NONMEM : InstrItinClass; +// shift/rotate +def IIC_SR : InstrItinClass; +// shift double +def IIC_SHD16_REG_IM : InstrItinClass; +def
IIC_SHD16_REG_CL : InstrItinClass; +def IIC_SHD16_MEM_IM : InstrItinClass; +def IIC_SHD16_MEM_CL : InstrItinClass; +def IIC_SHD32_REG_IM : InstrItinClass; +def IIC_SHD32_REG_CL : InstrItinClass; +def IIC_SHD32_MEM_IM : InstrItinClass; +def IIC_SHD32_MEM_CL : InstrItinClass; +def IIC_SHD64_REG_IM : InstrItinClass; +def IIC_SHD64_REG_CL : InstrItinClass; +def IIC_SHD64_MEM_IM : InstrItinClass; +def IIC_SHD64_MEM_CL : InstrItinClass; +// cmov +def IIC_CMOV16_RM : InstrItinClass; +def IIC_CMOV16_RR : InstrItinClass; +def IIC_CMOV32_RM : InstrItinClass; +def IIC_CMOV32_RR : InstrItinClass; +def IIC_CMOV64_RM : InstrItinClass; +def IIC_CMOV64_RR : InstrItinClass; +// set +def IIC_SET_R : InstrItinClass; +def IIC_SET_M : InstrItinClass; +// jmp/jcc/jcxz +def IIC_Jcc : InstrItinClass; +def IIC_JCXZ : InstrItinClass; +def IIC_JMP_REL : InstrItinClass; +def IIC_JMP_REG : InstrItinClass; +def IIC_JMP_MEM : InstrItinClass; +def IIC_JMP_FAR_MEM : InstrItinClass; +def IIC_JMP_FAR_PTR : InstrItinClass; +// loop +def IIC_LOOP : InstrItinClass; +def IIC_LOOPE : InstrItinClass; +def IIC_LOOPNE : InstrItinClass; +// call +def IIC_CALL_RI : InstrItinClass; +def IIC_CALL_MEM : InstrItinClass; +def IIC_CALL_FAR_MEM : InstrItinClass; +def IIC_CALL_FAR_PTR : InstrItinClass; +// ret +def IIC_RET : InstrItinClass; +def IIC_RET_IMM : InstrItinClass; + +//===----------------------------------------------------------------------===// +// Processor instruction itineraries. 
+ +def GenericItineraries : ProcessorItineraries<[], [], []>; + +include "X86ScheduleAtom.td" + + + Added: llvm/trunk/lib/Target/X86/X86ScheduleAtom.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ScheduleAtom.td?rev=149558&view=auto ============================================================================== --- llvm/trunk/lib/Target/X86/X86ScheduleAtom.td (added) +++ llvm/trunk/lib/Target/X86/X86ScheduleAtom.td Wed Feb 1 17:20:51 2012 @@ -0,0 +1,136 @@ +//=- X86ScheduleAtom.td - X86 Atom Scheduling Definitions -*- tablegen -*-=// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This file defines the itinerary class data for the Intel Atom (Bonnell) +// processors. +// +//===----------------------------------------------------------------------===// + +// +// Scheduling information derived from the "Intel 64 and IA32 Architectures +// Optimization Reference Manual", Chapter 13, Section 4. 
+// Functional Units +// Port 0 +def Port0 : FuncUnit; // ALU: ALU0, shift/rotate, load/store + // SIMD/FP: SIMD ALU, Shuffle,SIMD/FP multiply, divide +def Port1 : FuncUnit; // ALU: ALU1, bit processing, jump, and LEA + // SIMD/FP: SIMD ALU, FP Adder + +def AtomItineraries : ProcessorItineraries< + [ Port0, Port1 ], + [], [ + // P0 only + // InstrItinData] >, + // P0 or P1 + // InstrItinData] >, + // P0 and P1 + // InstrItinData, InstrStage] >, + // + // Default is 1 cycle, port0 or port1 + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + // mul + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + // imul by al, ax, eax, rax + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + // imul reg by reg|mem + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + // imul reg = reg/mem * imm + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + // idiv + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + // div + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + // neg/not/inc/dec + InstrItinData] >, + InstrItinData] >, + // add/sub/and/or/xor/adc/sbc/cmp/test + InstrItinData] >, + InstrItinData] >, + // shift/rotate + InstrItinData] >, + // shift double + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + // cmov + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + // set + InstrItinData] >, + InstrItinData] >, + // jcc + 
InstrItinData] >, + // jcxz/jecxz/jrcxz + InstrItinData] >, + // jmp rel + InstrItinData] >, + // jmp indirect + InstrItinData] >, + InstrItinData] >, + // jmp far + InstrItinData] >, + InstrItinData] >, + // loop/loope/loopne + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + // call - all but reg/imm + InstrItinData, InstrStage<1, [Port1]>] >, + InstrItinData] >, + InstrItinData] >, + InstrItinData] >, + //ret + InstrItinData] >, + InstrItinData, InstrStage<1, [Port1]>] > +]>; + Modified: llvm/trunk/lib/Target/X86/X86Subtarget.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.cpp?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86Subtarget.cpp (original) +++ llvm/trunk/lib/Target/X86/X86Subtarget.cpp Wed Feb 1 17:20:51 2012 @@ -246,6 +246,7 @@ IsBTMemSlow = true; ToggleFeature(X86::FeatureSlowBTMem); } + // If it's Nehalem, unaligned memory access is fast. // FIXME: Nehalem is family 6. Also include Westmere and later processors? if (Family == 15 && Model == 26) { @@ -253,6 +254,11 @@ ToggleFeature(X86::FeatureFastUAMem); } + // Set processor type. Currently only Atom is detected. + if (Family == 6 && Model == 28) { + X86ProcFamily = IntelAtom; + } + unsigned MaxExtLevel; X86_MC::GetCpuIDAndInfo(0x80000000, &MaxExtLevel, &EBX, &ECX, &EDX); @@ -310,6 +316,7 @@ const std::string &FS, unsigned StackAlignOverride, bool is64Bit) : X86GenSubtargetInfo(TT, CPU, FS) + , X86ProcFamily(Others) , PICStyle(PICStyles::None) , X86SSELevel(NoMMXSSE) , X863DNowLevel(NoThreeDNow) @@ -333,14 +340,15 @@ , IsUAMemFast(false) , HasVectorUAMem(false) , HasCmpxchg16b(false) + , PostRAScheduler(false) , stackAlignment(4) // FIXME: this is a known good value for Yonah. How about others? 
, MaxInlineSizeThreshold(128) , TargetTriple(TT) , In64BitMode(is64Bit) { // Determine default and user specified characteristics + std::string CPUName = CPU; if (!FS.empty() || !CPU.empty()) { - std::string CPUName = CPU; if (CPUName.empty()) { #if defined(i386) || defined(__i386__) || defined(__x86__) || defined(_M_IX86)\ || defined(__x86_64__) || defined(_M_AMD64) || defined (_M_X64) @@ -363,6 +371,13 @@ // If feature string is not empty, parse features string. ParseSubtargetFeatures(CPUName, FullFS); } else { + if (CPUName.empty()) { +#if defined (__x86_64__) || defined(__i386__) + CPUName = sys::getHostCPUName(); +#else + CPUName = "generic"; +#endif + } // Otherwise, use CPUID to auto-detect feature set. AutoDetectSubtargetFeatures(); @@ -379,6 +394,11 @@ } } + if (X86ProcFamily == IntelAtom) { + PostRAScheduler = true; + InstrItins = getInstrItineraryForCPU(CPUName); + } + // It's important to keep the MCSubtargetInfo feature bits in sync with // target data structure which is shared with MC code emitter, etc. 
if (In64BitMode) @@ -398,3 +418,12 @@ isTargetSolaris() || In64BitMode) stackAlignment = 16; } + +bool X86Subtarget::enablePostRAScheduler( + CodeGenOpt::Level OptLevel, + TargetSubtargetInfo::AntiDepBreakMode& Mode, + RegClassVector& CriticalPathRCs) const { + Mode = TargetSubtargetInfo::ANTIDEP_CRITICAL; + CriticalPathRCs.clear(); + return PostRAScheduler && OptLevel >= CodeGenOpt::Default; +} Modified: llvm/trunk/lib/Target/X86/X86Subtarget.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.h?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86Subtarget.h (original) +++ llvm/trunk/lib/Target/X86/X86Subtarget.h Wed Feb 1 17:20:51 2012 @@ -49,6 +49,13 @@ NoThreeDNow, ThreeDNow, ThreeDNowA }; + enum X86ProcFamilyEnum { + Others, IntelAtom + }; + + /// X86ProcFamily - X86 processor family: Intel Atom, and others + X86ProcFamilyEnum X86ProcFamily; + /// PICStyle - Which PIC style to use /// PICStyles::Style PICStyle; @@ -125,6 +132,9 @@ /// this is true for most x86-64 chips, but not the first AMD chips. bool HasCmpxchg16b; + /// PostRAScheduler - True if using post-register-allocation scheduler. + bool PostRAScheduler; + /// stackAlignment - The minimum alignment known to hold of the stack frame on /// entry to the function and which must be maintained by every function. unsigned stackAlignment; @@ -135,6 +145,9 @@ /// TargetTriple - What processor and OS we're targeting. Triple TargetTriple; + + /// Instruction itineraries for scheduling + InstrItineraryData InstrItins; private: /// In64BitMode - True if compiling for 64-bit, false for 32-bit. 
@@ -202,6 +215,8 @@ bool hasVectorUAMem() const { return HasVectorUAMem; } bool hasCmpxchg16b() const { return HasCmpxchg16b; } + bool isAtom() const { return X86ProcFamily == IntelAtom; } + const Triple &getTargetTriple() const { return TargetTriple; } bool isTargetDarwin() const { return TargetTriple.isOSDarwin(); } @@ -291,6 +306,15 @@ /// indicating the number of scheduling cycles of backscheduling that /// should be attempted. unsigned getSpecialAddressLatency() const; + + /// enablePostRAScheduler - run for Atom optimization. + bool enablePostRAScheduler(CodeGenOpt::Level OptLevel, + TargetSubtargetInfo::AntiDepBreakMode& Mode, + RegClassVector& CriticalPathRCs) const; + + /// getInstrItins = Return the instruction itineraries based on the + /// subtarget selection. + const InstrItineraryData &getInstrItineraryData() const { return InstrItins; } }; } // End llvm namespace Modified: llvm/trunk/lib/Target/X86/X86TargetMachine.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetMachine.cpp?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86TargetMachine.cpp (original) +++ llvm/trunk/lib/Target/X86/X86TargetMachine.cpp Wed Feb 1 17:20:51 2012 @@ -78,7 +78,8 @@ : LLVMTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL), Subtarget(TT, CPU, FS, Options.StackAlignmentOverride, is64Bit), FrameLowering(*this, Subtarget), - ELFWriterInfo(is64Bit, true) { + ELFWriterInfo(is64Bit, true), + InstrItins(Subtarget.getInstrItineraryData()){ // Determine the PICStyle based on the target selected. if (getRelocationModel() == Reloc::Static) { // Unless we're in PIC or DynamicNoPIC mode, set the PIC style to None. 
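For readers following the Atom gating across the X86Subtarget.cpp and X86Subtarget.h hunks above, the decision can be condensed into a small standalone sketch. This is not the LLVM code: ProcFamily, OptLevel, classifyCPU and SubtargetSketch are hypothetical stand-ins invented for illustration, and only the CPUID-based detection path shown in the patch (family 6, model 28) is modeled.

```cpp
#include <cassert>  // used by the checks in the usage note below
#include <string>

// Hypothetical stand-ins for the LLVM types touched by the patch; these are
// illustration-only names, not the real LLVM declarations.
enum class ProcFamily { Others, IntelAtom };
enum class OptLevel { None = 0, Less = 1, Default = 2, Aggressive = 3 };

// Mirrors the CPUID test the patch adds to AutoDetectSubtargetFeatures():
// family 6, model 28 is classified as Atom, everything else as Others.
inline ProcFamily classifyCPU(unsigned Family, unsigned Model) {
  return (Family == 6 && Model == 28) ? ProcFamily::IntelAtom
                                      : ProcFamily::Others;
}

// Simplified model of the subtarget state the patch introduces.
struct SubtargetSketch {
  ProcFamily Family;
  bool PostRAScheduler = false;  // defaults to off, as in the patch

  explicit SubtargetSketch(ProcFamily F) : Family(F) {
    // The patch turns the post-RA scheduler on only for Atom.
    if (Family == ProcFamily::IntelAtom)
      PostRAScheduler = true;
  }

  // Same shape as the return expression in enablePostRAScheduler():
  // the scheduler runs only when enabled and at default optimization or above.
  bool enablePostRAScheduler(OptLevel L) const {
    return PostRAScheduler && L >= OptLevel::Default;
  }
};
```

Under these assumptions, SubtargetSketch(classifyCPU(6, 28)).enablePostRAScheduler(OptLevel::Default) is true, while the same query at OptLevel::Less, or for any non-Atom family at any level, is false. That is presumably why the existing tests in this commit pin -mcpu=generic: it keeps their expected output independent of the host CPU.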
Modified: llvm/trunk/lib/Target/X86/X86TargetMachine.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetMachine.h?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86TargetMachine.h (original) +++ llvm/trunk/lib/Target/X86/X86TargetMachine.h Wed Feb 1 17:20:51 2012 @@ -32,9 +32,10 @@ class StringRef; class X86TargetMachine : public LLVMTargetMachine { - X86Subtarget Subtarget; - X86FrameLowering FrameLowering; - X86ELFWriterInfo ELFWriterInfo; + X86Subtarget Subtarget; + X86FrameLowering FrameLowering; + X86ELFWriterInfo ELFWriterInfo; + InstrItineraryData InstrItins; public: X86TargetMachine(const Target &T, StringRef TT, @@ -65,6 +66,9 @@ virtual const X86ELFWriterInfo *getELFWriterInfo() const { return Subtarget.isTargetELF() ? &ELFWriterInfo : 0; } + virtual const InstrItineraryData *getInstrItineraryData() const { + return &InstrItins; + } // Set up the pass pipeline. 
virtual bool addInstSelector(PassManagerBase &PM); Modified: llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll (original) +++ llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll Wed Feb 1 17:20:51 2012 @@ -1,5 +1,5 @@ ; PR1075 -; RUN: llc < %s -mtriple=x86_64-apple-darwin -O3 | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-darwin -O3 | FileCheck %s define float @foo(float %x) nounwind { %tmp1 = fmul float %x, 3.000000e+00 Modified: llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll (original) +++ llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86 -mattr=+sse2 | not grep lea +; RUN: llc < %s -march=x86 -mcpu=generic -mattr=+sse2 | not grep lea define float @foo(i32* %x, float* %y, i32 %c) nounwind { entry: Modified: llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll (original) +++ llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86 | grep {(%esp)} | count 2 +; RUN: llc < %s -march=x86 -mcpu=generic | grep {(%esp)} | count 2 ; PR1872 %struct.c34007g__designated___XUB = type { i32, i32, i32, i32 } Modified: 
llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll (original) +++ llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -mtriple=i386-apple-darwin -asm-verbose=0 | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=i386-apple-darwin -asm-verbose=0 | FileCheck %s ; PR3149 ; Make sure the copy after inline asm is not coalesced away. Modified: llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll (original) +++ llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc -mtriple=x86_64-mingw32 < %s | FileCheck %s +; RUN: llc -mcpu=generic -mtriple=x86_64-mingw32 < %s | FileCheck %s ; CHECK: subq $40, %rsp ; CHECK: movaps %xmm8, (%rsp) ; CHECK: movaps %xmm7, 16(%rsp) Modified: llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll (original) +++ llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc -mtriple=i386-apple-darwin -tailcallopt < %s | FileCheck %s +; RUN: llc -mcpu=generic -mtriple=i386-apple-darwin -tailcallopt < %s | FileCheck %s ; Check that 
lowered argumens do not overwrite the return address before it is moved. ; Bug 6225 ; Modified: llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll (original) +++ llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s | FileCheck %s +; RUN: llc < %s -mcpu=generic | FileCheck %s ; PR6941 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" target triple = "x86_64-apple-darwin10.0.0" Modified: llvm/trunk/test/CodeGen/X86/abi-isel.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/abi-isel.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/abi-isel.ll (original) +++ llvm/trunk/test/CodeGen/X86/abi-isel.ll Wed Feb 1 17:20:51 2012 @@ -1,16 +1,16 @@ -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-STATIC -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-PIC +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-STATIC +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-PIC -; RUN: llc < %s -asm-verbose=0 
-mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-64-STATIC -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=LINUX-64-PIC +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-64-STATIC +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=LINUX-64-PIC -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-apple-darwin -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-32-STATIC -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-apple-darwin -march=x86 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-DYNAMIC -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-apple-darwin -march=x86 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-PIC - -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-64-STATIC -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-DYNAMIC -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-PIC +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-apple-darwin -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-32-STATIC +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-apple-darwin -march=x86 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-DYNAMIC +; RUN: llc 
< %s -asm-verbose=0 -mcpu=generic -mtriple=i686-apple-darwin -march=x86 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-PIC + +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-64-STATIC +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-DYNAMIC +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-PIC @src = external global [131072 x i32] @dst = external global [131072 x i32] Modified: llvm/trunk/test/CodeGen/X86/add.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/add.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/add.ll (original) +++ llvm/trunk/test/CodeGen/X86/add.ll Wed Feb 1 17:20:51 2012 @@ -1,6 +1,6 @@ -; RUN: llc < %s -march=x86 | FileCheck %s -check-prefix=X32 -; RUN: llc < %s -mtriple=x86_64-linux -join-physregs | FileCheck %s -check-prefix=X64 -; RUN: llc < %s -mtriple=x86_64-win32 -join-physregs | FileCheck %s -check-prefix=X64 +; RUN: llc < %s -mcpu=generic -march=x86 | FileCheck %s -check-prefix=X32 +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -join-physregs | FileCheck %s -check-prefix=X64 +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-win32 -join-physregs | FileCheck %s -check-prefix=X64 ; Some of these tests depend on -join-physregs to commute instructions. 
Added: llvm/trunk/test/CodeGen/X86/atom-sched.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/atom-sched.ll?rev=149558&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/atom-sched.ll (added) +++ llvm/trunk/test/CodeGen/X86/atom-sched.ll Wed Feb 1 17:20:51 2012 @@ -0,0 +1,28 @@ +; RUN: llc <%s -O2 -mcpu=atom -march=x86 -relocation-model=static | FileCheck -check-prefix=atom %s +; RUN: llc <%s -O2 -mcpu=core2 -march=x86 -relocation-model=static | FileCheck %s + +@a = common global i32 0, align 4 +@b = common global i32 0, align 4 +@c = common global i32 0, align 4 +@d = common global i32 0, align 4 +@e = common global i32 0, align 4 +@f = common global i32 0, align 4 + +define void @func() nounwind uwtable { +; atom: imull +; atom-NOT: movl +; atom: imull +; CHECK: imull +; CHECK: movl +; CHECK: imull +entry: + %0 = load i32* @b, align 4 + %1 = load i32* @c, align 4 + %mul = mul nsw i32 %0, %1 + store i32 %mul, i32* @a, align 4 + %2 = load i32* @e, align 4 + %3 = load i32* @f, align 4 + %mul1 = mul nsw i32 %2, %3 + store i32 %mul1, i32* @d, align 4 + ret void +} Modified: llvm/trunk/test/CodeGen/X86/byval6.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/byval6.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/byval6.ll (original) +++ llvm/trunk/test/CodeGen/X86/byval6.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86 | grep add | not grep 16 +; RUN: llc < %s -mcpu=generic -march=x86 | grep add | not grep 16 %struct.W = type { x86_fp80, x86_fp80 } @B = global %struct.W { x86_fp80 0xK4001A000000000000000, x86_fp80 0xK4001C000000000000000 }, align 32 Modified: llvm/trunk/test/CodeGen/X86/divide-by-constant.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/divide-by-constant.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/divide-by-constant.ll (original) +++ llvm/trunk/test/CodeGen/X86/divide-by-constant.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -mtriple=i686-pc-linux-gnu -asm-verbose=0 | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=i686-pc-linux-gnu -asm-verbose=0 | FileCheck %s target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32" target triple = "i686-pc-linux-gnu" Modified: llvm/trunk/test/CodeGen/X86/epilogue.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/epilogue.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/epilogue.ll (original) +++ llvm/trunk/test/CodeGen/X86/epilogue.ll Wed Feb 1 17:20:51 2012 @@ -1,5 +1,5 @@ -; RUN: llc < %s -march=x86 | not grep lea -; RUN: llc < %s -march=x86 | grep {movl %ebp} +; RUN: llc < %s -mcpu=generic -march=x86 | not grep lea +; RUN: llc < %s -mcpu=generic -march=x86 | grep {movl %ebp} declare void @bar(<2 x i64>* %n) Modified: llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll (original) +++ llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86 -x86-asm-syntax=intel | \ +; RUN: llc < %s -mcpu=generic -march=x86 -x86-asm-syntax=intel | \ ; RUN: grep {add ESP, 8} target triple = "i686-pc-linux-gnu" Modified: 
llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll (original) +++ llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc -fast-isel -O0 -mtriple=i386-apple-darwin10 -relocation-model=pic < %s | FileCheck %s +; RUN: llc -fast-isel -O0 -mcpu=generic -mtriple=i386-apple-darwin10 -relocation-model=pic < %s | FileCheck %s ; This should use flds to set the return value. ; CHECK: test0: Modified: llvm/trunk/test/CodeGen/X86/fold-load.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fold-load.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/fold-load.ll (original) +++ llvm/trunk/test/CodeGen/X86/fold-load.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86 | FileCheck %s +; RUN: llc < %s -mcpu=generic -march=x86 | FileCheck %s %struct._obstack_chunk = type { i8*, %struct._obstack_chunk*, [4 x i8] } %struct.obstack = type { i32, %struct._obstack_chunk*, i8*, i8*, i8*, i32, i32, %struct._obstack_chunk* (...)*, void (...)*, i8*, i8 } @stmt_obstack = external global %struct.obstack ; <%struct.obstack*> [#uses=1] Modified: llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll (original) +++ llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -mtriple=i386-apple-darwin | FileCheck %s +; RUN: llc < %s -mcpu=generic 
-mtriple=i386-apple-darwin | FileCheck %s ; There should be no stack manipulations between the inline asm and ret. ; CHECK: test1 Modified: llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll (original) +++ llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86-64 > %t +; RUN: llc < %s -mcpu=generic -march=x86-64 > %t ; RUN: not grep and %t ; RUN: not grep movz %t ; RUN: not grep sar %t Modified: llvm/trunk/test/CodeGen/X86/optimize-max-3.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/optimize-max-3.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/optimize-max-3.ll (original) +++ llvm/trunk/test/CodeGen/X86/optimize-max-3.ll Wed Feb 1 17:20:51 2012 @@ -1,5 +1,5 @@ -; RUN: llc < %s -mtriple=x86_64-linux -asm-verbose=false | FileCheck %s -; RUN: llc < %s -mtriple=x86_64-win32 -asm-verbose=false | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -asm-verbose=false | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-win32 -asm-verbose=false | FileCheck %s ; LSR's OptimizeMax should eliminate the select (max). 
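Note on the atom-sched.ll test added above: under the atom prefix it requires that no movl appears between the two imull instructions, while the default CHECK prefix expects a movl there. The following is only a rough sketch of that matching idea in C++, using plain substring search on whole lines. It is not how FileCheck is implemented, and the function name is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the sequence
//   CHECK: first / CHECK-NOT: forbidden / CHECK: second
// Real FileCheck matches patterns and regexes; this sketch only does
// substring search line by line. The name is hypothetical.
bool matchesWithNothingBetween(const std::vector<std::string> &lines,
                               const std::string &first,
                               const std::string &forbidden,
                               const std::string &second) {
  std::size_t i = 0;
  // Locate the first line that contains `first`.
  while (i < lines.size() && lines[i].find(first) == std::string::npos)
    ++i;
  if (i == lines.size())
    return false;
  // Walk forward to the next line containing `second`; if a line with
  // `forbidden` is seen before that, the check fails.
  for (std::size_t j = i + 1; j < lines.size(); ++j) {
    if (lines[j].find(second) != std::string::npos)
      return true;
    if (lines[j].find(forbidden) != std::string::npos)
      return false;
  }
  return false;
}
```

With a hypothetical three-line output of imull, imull, movl the function returns true; with imull, movl, imull it returns false. That is the distinction the atom and CHECK prefixes in the test are meant to capture.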
Modified: llvm/trunk/test/CodeGen/X86/peep-test-3.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/peep-test-3.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/peep-test-3.ll (original) +++ llvm/trunk/test/CodeGen/X86/peep-test-3.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86 -post-RA-scheduler=false | FileCheck %s +; RUN: llc < %s -mcpu=generic -march=x86 -post-RA-scheduler=false | FileCheck %s ; rdar://7226797 ; LLVM should omit the testl and use the flags result from the orl. Modified: llvm/trunk/test/CodeGen/X86/pic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/pic.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/pic.ll (original) +++ llvm/trunk/test/CodeGen/X86/pic.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -mtriple=i686-pc-linux-gnu -relocation-model=pic -asm-verbose=false -post-RA-scheduler=false | FileCheck %s -check-prefix=LINUX +; RUN: llc < %s -mcpu=generic -mtriple=i686-pc-linux-gnu -relocation-model=pic -asm-verbose=false -post-RA-scheduler=false | FileCheck %s -check-prefix=LINUX @ptr = external global i32* @dst = external global i32 Modified: llvm/trunk/test/CodeGen/X86/red-zone.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/red-zone.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/red-zone.ll (original) +++ llvm/trunk/test/CodeGen/X86/red-zone.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux | FileCheck %s ; First without noredzone. 
; CHECK: f0: Modified: llvm/trunk/test/CodeGen/X86/red-zone2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/red-zone2.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/red-zone2.ll (original) +++ llvm/trunk/test/CodeGen/X86/red-zone2.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86-64 > %t +; RUN: llc < %s -mcpu=generic -march=x86-64 > %t ; RUN: grep subq %t | count 1 ; RUN: grep addq %t | count 1 Modified: llvm/trunk/test/CodeGen/X86/reghinting.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/reghinting.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/reghinting.ll (original) +++ llvm/trunk/test/CodeGen/X86/reghinting.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -mtriple=x86_64-apple-macosx | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-macosx | FileCheck %s ; PR10221 ;; The registers %x and %y must both spill across the finit call. 
Modified: llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll (original) +++ llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll Wed Feb 1 17:20:51 2012 @@ -1,7 +1,7 @@ -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32 -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64 -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -filetype=obj -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -filetype=obj +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32 +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64 +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -filetype=obj +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -filetype=obj ; Just to prevent the alloca from being optimized away declare void @dummy_use(i32*, i32) Modified: llvm/trunk/test/CodeGen/X86/segmented-stacks.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/segmented-stacks.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/segmented-stacks.ll (original) +++ llvm/trunk/test/CodeGen/X86/segmented-stacks.ll Wed Feb 1 17:20:51 2012 @@ -1,23 +1,23 @@ -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Linux -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Linux -; RUN: llc < %s 
-mtriple=i686-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Darwin -; RUN: llc < %s -mtriple=x86_64-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Darwin -; RUN: llc < %s -mtriple=i686-mingw32 -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-MinGW -; RUN: llc < %s -mtriple=x86_64-freebsd -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-FreeBSD +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Linux +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Linux +; RUN: llc < %s -mcpu=generic -mtriple=i686-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Darwin +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Darwin +; RUN: llc < %s -mcpu=generic -mtriple=i686-mingw32 -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-MinGW +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-freebsd -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-FreeBSD ; We used to crash with filetype=obj -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -filetype=obj -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -filetype=obj -; RUN: llc < %s -mtriple=i686-darwin -segmented-stacks -filetype=obj -; RUN: llc < %s -mtriple=x86_64-darwin -segmented-stacks -filetype=obj -; RUN: llc < %s -mtriple=i686-mingw32 -segmented-stacks -filetype=obj -; RUN: llc < %s -mtriple=x86_64-freebsd -segmented-stacks -filetype=obj +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -filetype=obj +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -filetype=obj +; RUN: llc < %s -mcpu=generic -mtriple=i686-darwin -segmented-stacks -filetype=obj +; RUN: llc < %s 
-mcpu=generic -mtriple=x86_64-darwin -segmented-stacks -filetype=obj +; RUN: llc < %s -mcpu=generic -mtriple=i686-mingw32 -segmented-stacks -filetype=obj +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-freebsd -segmented-stacks -filetype=obj -; RUN: not llc < %s -mtriple=x86_64-solaris -segmented-stacks 2> %t.log +; RUN: not llc < %s -mcpu=generic -mtriple=x86_64-solaris -segmented-stacks 2> %t.log ; RUN: FileCheck %s -input-file=%t.log -check-prefix=X64-Solaris -; RUN: not llc < %s -mtriple=x86_64-mingw32 -segmented-stacks 2> %t.log +; RUN: not llc < %s -mcpu=generic -mtriple=x86_64-mingw32 -segmented-stacks 2> %t.log ; RUN: FileCheck %s -input-file=%t.log -check-prefix=X64-MinGW -; RUN: not llc < %s -mtriple=i686-freebsd -segmented-stacks 2> %t.log +; RUN: not llc < %s -mcpu=generic -mtriple=i686-freebsd -segmented-stacks 2> %t.log ; RUN: FileCheck %s -input-file=%t.log -check-prefix=X32-FreeBSD ; X64-Solaris: Segmented stacks not supported on this platform Modified: llvm/trunk/test/CodeGen/X86/stack-align2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/stack-align2.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/stack-align2.ll (original) +++ llvm/trunk/test/CodeGen/X86/stack-align2.ll Wed Feb 1 17:20:51 2012 @@ -1,9 +1,9 @@ -; RUN: llc < %s -mtriple=i386-linux | FileCheck %s -check-prefix=LINUX-I386 -; RUN: llc < %s -mtriple=i386-netbsd | FileCheck %s -check-prefix=NETBSD-I386 -; RUN: llc < %s -mtriple=i686-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-I386 -; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s -check-prefix=LINUX-X86_64 -; RUN: llc < %s -mtriple=x86_64-netbsd | FileCheck %s -check-prefix=NETBSD-X86_64 -; RUN: llc < %s -mtriple=x86_64-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-X86_64 +; RUN: llc < %s -mcpu=generic -mtriple=i386-linux | FileCheck %s -check-prefix=LINUX-I386 +; RUN: llc < %s -mcpu=generic 
-mtriple=i386-netbsd | FileCheck %s -check-prefix=NETBSD-I386 +; RUN: llc < %s -mcpu=generic -mtriple=i686-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-I386 +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux | FileCheck %s -check-prefix=LINUX-X86_64 +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-netbsd | FileCheck %s -check-prefix=NETBSD-X86_64 +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-X86_64 define i32 @test() nounwind { entry: Modified: llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll (original) +++ llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -mtriple=x86_64-linux -tailcallopt | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -tailcallopt | FileCheck %s ; FIXME: Win64 does not support byval. Modified: llvm/trunk/test/CodeGen/X86/tailcallstack64.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcallstack64.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/tailcallstack64.ll (original) +++ llvm/trunk/test/CodeGen/X86/tailcallstack64.ll Wed Feb 1 17:20:51 2012 @@ -1,5 +1,5 @@ -; RUN: llc < %s -tailcallopt -mtriple=x86_64-linux -post-RA-scheduler=true | FileCheck %s -; RUN: llc < %s -tailcallopt -mtriple=x86_64-win32 -post-RA-scheduler=true | FileCheck %s +; RUN: llc < %s -tailcallopt -mcpu=generic -mtriple=x86_64-linux -post-RA-scheduler=true | FileCheck %s +; RUN: llc < %s -tailcallopt -mcpu=generic -mtriple=x86_64-win32 -post-RA-scheduler=true | FileCheck %s ; FIXME: Redundant unused stack allocation could be eliminated. 
; CHECK: subq ${{24|72|80}}, %rsp Modified: llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll (original) +++ llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll Wed Feb 1 17:20:51 2012 @@ -5,7 +5,7 @@ ;; allocator turns the shift into an LEA. This also occurs for ADD. ; Check that the shift gets turned into an LEA. -; RUN: llc < %s -mtriple=x86_64-apple-darwin | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-darwin | FileCheck %s @G = external global i32 Modified: llvm/trunk/test/CodeGen/X86/v-binop-widen.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/v-binop-widen.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/v-binop-widen.ll (original) +++ llvm/trunk/test/CodeGen/X86/v-binop-widen.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc -march=x86 -mattr=+sse < %s | FileCheck %s +; RUN: llc -mcpu=generic -march=x86 -mattr=+sse < %s | FileCheck %s ; CHECK: divss ; CHECK: divps ; CHECK: divps Modified: llvm/trunk/test/CodeGen/X86/vec_call.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vec_call.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/vec_call.ll (original) +++ llvm/trunk/test/CodeGen/X86/vec_call.ll Wed Feb 1 17:20:51 2012 @@ -1,6 +1,6 @@ -; RUN: llc < %s -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ ; RUN: grep {subl.*60} -; RUN: llc < %s -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse2 
-mtriple=i686-apple-darwin8 | \ ; RUN: grep {movaps.*32} Modified: llvm/trunk/test/CodeGen/X86/widen_arith-1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_arith-1.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/widen_arith-1.ll (original) +++ llvm/trunk/test/CodeGen/X86/widen_arith-1.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86 -mattr=+sse42 | FileCheck %s +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse42 | FileCheck %s define void @update(<3 x i8>* %dst, <3 x i8>* %src, i32 %n) nounwind { entry: Modified: llvm/trunk/test/CodeGen/X86/widen_arith-3.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_arith-3.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/widen_arith-3.ll (original) +++ llvm/trunk/test/CodeGen/X86/widen_arith-3.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86 -mattr=+sse42 -post-RA-scheduler=true | FileCheck %s +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse42 -post-RA-scheduler=true | FileCheck %s ; CHECK: incl ; CHECK: incl ; CHECK: incl Modified: llvm/trunk/test/CodeGen/X86/widen_load-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_load-2.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/widen_load-2.ll (original) +++ llvm/trunk/test/CodeGen/X86/widen_load-2.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -o - -march=x86-64 -mattr=+sse42 | FileCheck %s +; RUN: llc < %s -o - -mcpu=generic -march=x86-64 -mattr=+sse42 | FileCheck %s ; Test based on pr5626 to load/store ; Modified: llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll (original) +++ llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll Wed Feb 1 17:20:51 2012 @@ -1,6 +1,6 @@ -; RUN: llc < %s -join-physregs -mtriple=x86_64-mingw32 | FileCheck %s -check-prefix=M64 -; RUN: llc < %s -join-physregs -mtriple=x86_64-win32 | FileCheck %s -check-prefix=W64 -; RUN: llc < %s -join-physregs -mtriple=x86_64-win32-macho | FileCheck %s -check-prefix=EFI +; RUN: llc < %s -join-physregs -mcpu=generic -mtriple=x86_64-mingw32 | FileCheck %s -check-prefix=M64 +; RUN: llc < %s -join-physregs -mcpu=generic -mtriple=x86_64-win32 | FileCheck %s -check-prefix=W64 +; RUN: llc < %s -join-physregs -mcpu=generic -mtriple=x86_64-win32-macho | FileCheck %s -check-prefix=EFI ; PR8777 ; PR8778 Modified: llvm/trunk/test/CodeGen/X86/win64_vararg.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/win64_vararg.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/win64_vararg.ll (original) +++ llvm/trunk/test/CodeGen/X86/win64_vararg.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -mtriple=x86_64-pc-win32 | FileCheck %s +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-pc-win32 | FileCheck %s ; Verify that the var arg parameters which are passed in registers are stored ; in home stack slots allocated by the caller and that AP is correctly Modified: llvm/trunk/test/CodeGen/X86/zext-fold.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/zext-fold.ll?rev=149558&r1=149557&r2=149558&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/zext-fold.ll (original) +++ 
llvm/trunk/test/CodeGen/X86/zext-fold.ll Wed Feb 1 17:20:51 2012 @@ -1,4 +1,4 @@ -; RUN: llc < %s -march=x86 | FileCheck %s +; RUN: llc < %s -mcpu=generic -march=x86 | FileCheck %s ;; Simple case define i32 @test1(i8 %x) nounwind readnone { From atrick at apple.com Wed Feb 1 17:29:47 2012 From: atrick at apple.com (Andrew Trick) Date: Wed, 01 Feb 2012 15:29:47 -0800 Subject: [llvm-commits] [llvm][PATCH - REVISED][Commit request] X86 Instruction scheduler for the Intel Atom In-Reply-To: References: <3893BF02-E382-470D-AFB4-9F30AE9ECE34@apple.com> Message-ID: Committed r149558. And added -relocation-model=static to atom-sched.ll. -Andy On Jan 31, 2012, at 1:42 PM, "Gurd, Preston" wrote: > Hello Andy, > > Thank you for your comments. I have revised the patch as you suggested. > > I have also added "-mcpu=generic" to two additional tests (2010-02-19-TailCallRetAddrBug.ll and peep-test-3.ll) which were failing when run on Atom, after I applied Evan's suggestion to change the scheduling preference to "Hybrid". > > Unless you have any other comments, please commit the attached patch. > > Thanks, > > Preston > > > From: Andrew Trick [mailto:atrick at apple.com] > Sent: Tuesday, January 31, 2012 2:22 AM > To: Gurd, Preston > Cc: Evan Cheng; llvm-commits at cs.uiuc.edu > Subject: Re: [llvm-commits] [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom > > On Jan 23, 2012, at 3:05 PM, "Gurd, Preston" wrote: > > > Revision 2: Tests which were failing, when run on an Atom, due to the tests finding a schedule different from what was expected, have been changed to use "-mcpu=generic" in order to prevent the Atom scheduler from running, so that all "make check" tests pass. 
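As a reading aid for the scheduling-preference change discussed in this thread (ILP for x86_64, Hybrid for 32-bit Atom, register-pressure scheduling for other 32-bit targets), the selection can be pictured as below. This is a standalone sketch under those stated assumptions; the enum and function are hypothetical stand-ins and not the names used in X86ISelLowering.cpp.

```cpp
// Hypothetical, simplified model of the scheduler choice described in the
// thread. None of these names are LLVM's real API.
enum class SchedPref { ILP, Hybrid, RegPressure };

// is64Bit and isAtom stand in for queries the backend makes on the subtarget.
constexpr SchedPref pickSchedPref(bool is64Bit, bool isAtom) {
  return is64Bit ? SchedPref::ILP           // x86_64: many registers, use ILP
         : isAtom ? SchedPref::Hybrid       // 32-bit Atom: reg pressure + latency
                  : SchedPref::RegPressure; // other 32-bit targets
}
```

Because a test run without -mcpu may pick up the host CPU, the same 32-bit test could see a different preference on an Atom machine than elsewhere, which is consistent with the reason given above for pinning the affected tests to -mcpu=generic.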
> > From: Gurd, Preston > Sent: Tuesday, January 17, 2012 4:29 PM > To: Evan Cheng > Cc: llvm-commits at cs.uiuc.edu > Subject: [llvm-commits] [llvm][PATCH - REVISED][Review request] X86 Instruction scheduler for the Intel Atom > > The attached patch implements most of an instruction scheduler for the Intel Atom. > > It adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. > > It sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. > > It adds a test to verify that the scheduler is working. > > I realize that this patch is kind of large, but please consider that the vast majority of the changes consist only of adding an instruction itinerary class name to an instruction. > > Revision: the patch also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP. > > Please commit the patch if it seems acceptable. > > Preston > > > From: Evan Cheng [mailto:evan.cheng at apple.com] > Sent: Monday, January 16, 2012 12:01 PM > To: Gurd, Preston > Cc: llvm-commits at cs.uiuc.edu > Subject: Re: [llvm-commits] [llvm][PATCH][Review request] X86 Instruction scheduler for the Intel Atom > > Very nice. One question, I noticed you haven't changed the scheduling preference so x86_64 is still using ILP scheduler while i386 is using register pressure reduction scheduler. Have you tried changing the preference to latency scheduler for Atom? > > Evan > > On Jan 13, 2012, at 3:26 PM, Gurd, Preston wrote: > > > The attached patch implements most of an instruction scheduler for the Intel Atom. > > It adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. 
> > It sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. > > It adds a test to verify that the scheduler is working. > > I realize that this patch is kind of large, but please consider that the vast majority of the changes consist only of adding an instruction itinerary class name to an instruction. > > Hi Preston, > > I just have a couple minor questions I'd like you to address before I commit this: > > +def : AtomProc<"atom", [ProcIntelAtom, FeatureSSE3, FeatureCMPXCHG16B, > + FeatureMOVBE, FeatureSlowBTMem]>; > > These features are already included in the ProcIntelAtom family. Why do you need to list them again? Please verify, but subtarget features should be transitively implied. > > + //CriticalPathRCs.push_back(&X86::GPRRegClass); > > If you want to leave this disabled, please add comments. > > Thanks, > -Andy > 
From rafael.espindola at gmail.com Wed Feb 1 17:40:51 2012 From: rafael.espindola at gmail.com (Rafael Espindola) Date: Wed, 01 Feb 2012 23:40:51 -0000 Subject: [llvm-commits] [llvm] r149561 - /llvm/trunk/lib/Target/Hexagon/CMakeLists.txt Message-ID: <20120201234052.06FEA2A6C12C@llvm.org> Author: rafael Date: Wed Feb 1 17:40:51 2012 New Revision: 149561 URL: http://llvm.org/viewvc/llvm-project?rev=149561&view=rev Log: Fix the cmake build Modified: llvm/trunk/lib/Target/Hexagon/CMakeLists.txt Modified: llvm/trunk/lib/Target/Hexagon/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Hexagon/CMakeLists.txt?rev=149561&r1=149560&r2=149561&view=diff ============================================================================== --- llvm/trunk/lib/Target/Hexagon/CMakeLists.txt (original) +++ llvm/trunk/lib/Target/Hexagon/CMakeLists.txt Wed Feb 1 17:40:51 2012 @@ -7,6 +7,7 @@ tablegen(LLVM HexagonGenCallingConv.inc -gen-callingconv) tablegen(LLVM HexagonGenSubtargetInfo.inc -gen-subtarget) tablegen(LLVM HexagonGenIntrinsics.inc -gen-tgt-intrinsic) +tablegen(LLVM HexagonGenDFAPacketizer.inc -gen-dfa-packetizer) add_public_tablegen_target(HexagonCommonTableGen) add_llvm_target(HexagonCodeGen From mcrosier at apple.com Wed Feb 1 17:47:30 2012 From: mcrosier at apple.com (Chad Rosier) Date: Wed, 01 Feb 2012 15:47:30 -0800 Subject: [llvm-commits] [llvm] r149558 - in /llvm/trunk: lib/Target/X86/ test/CodeGen/X86/ In-Reply-To: <20120201232053.E046B2A6C12C@llvm.org> References: <20120201232053.E046B2A6C12C@llvm.org> Message-ID: Andy, On Feb 1, 2012, at 3:20 PM, Andrew Trick wrote: > Author: atrick > Date: Wed Feb 1 17:20:51 2012 > New Revision: 149558 > > URL: http://llvm.org/viewvc/llvm-project?rev=149558&view=rev > Log: > Instruction scheduling itinerary for Intel Atom. 
> > Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. > > Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. > > Adds a test to verify that the scheduler is working. > > Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP. > > Patch by Preston Gurd! > > Added: > llvm/trunk/lib/Target/X86/X86Schedule.td > llvm/trunk/lib/Target/X86/X86ScheduleAtom.td > llvm/trunk/test/CodeGen/X86/atom-sched.ll Was CMakeLists.txt updated for these additions? Chad > Modified: > llvm/trunk/lib/Target/X86/X86.td > llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > llvm/trunk/lib/Target/X86/X86InstrArithmetic.td > llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td > llvm/trunk/lib/Target/X86/X86InstrControl.td > llvm/trunk/lib/Target/X86/X86InstrFormats.td > llvm/trunk/lib/Target/X86/X86InstrMMX.td > llvm/trunk/lib/Target/X86/X86InstrSSE.td > llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td > llvm/trunk/lib/Target/X86/X86Subtarget.cpp > llvm/trunk/lib/Target/X86/X86Subtarget.h > llvm/trunk/lib/Target/X86/X86TargetMachine.cpp > llvm/trunk/lib/Target/X86/X86TargetMachine.h > llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll > llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll > llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll > llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll > llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll > llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll > llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll > llvm/trunk/test/CodeGen/X86/abi-isel.ll > llvm/trunk/test/CodeGen/X86/add.ll > llvm/trunk/test/CodeGen/X86/byval6.ll > llvm/trunk/test/CodeGen/X86/divide-by-constant.ll > 
llvm/trunk/test/CodeGen/X86/epilogue.ll > llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll > llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll > llvm/trunk/test/CodeGen/X86/fold-load.ll > llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll > llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll > llvm/trunk/test/CodeGen/X86/optimize-max-3.ll > llvm/trunk/test/CodeGen/X86/peep-test-3.ll > llvm/trunk/test/CodeGen/X86/pic.ll > llvm/trunk/test/CodeGen/X86/red-zone.ll > llvm/trunk/test/CodeGen/X86/red-zone2.ll > llvm/trunk/test/CodeGen/X86/reghinting.ll > llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll > llvm/trunk/test/CodeGen/X86/segmented-stacks.ll > llvm/trunk/test/CodeGen/X86/stack-align2.ll > llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll > llvm/trunk/test/CodeGen/X86/tailcallstack64.ll > llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll > llvm/trunk/test/CodeGen/X86/v-binop-widen.ll > llvm/trunk/test/CodeGen/X86/vec_call.ll > llvm/trunk/test/CodeGen/X86/widen_arith-1.ll > llvm/trunk/test/CodeGen/X86/widen_arith-3.ll > llvm/trunk/test/CodeGen/X86/widen_load-2.ll > llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll > llvm/trunk/test/CodeGen/X86/win64_vararg.ll > llvm/trunk/test/CodeGen/X86/zext-fold.ll > > Modified: llvm/trunk/lib/Target/X86/X86.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86.td (original) > +++ llvm/trunk/lib/Target/X86/X86.td Wed Feb 1 17:20:51 2012 > @@ -120,8 +120,16 @@ > // X86 processors supported. 
> //===----------------------------------------------------------------------===// > > +include "X86Schedule.td" > + > +def ProcIntelAtom : SubtargetFeature<"atom", "X86ProcFamily", "IntelAtom", > + "Intel Atom processors">; > + > class Proc Features> > - : Processor; > + : Processor; > + > +class AtomProc Features> > + : Processor; > > def : Proc<"generic", []>; > def : Proc<"i386", []>; > @@ -146,8 +154,8 @@ > FeatureSlowBTMem]>; > def : Proc<"penryn", [FeatureSSE41, FeatureCMPXCHG16B, > FeatureSlowBTMem]>; > -def : Proc<"atom", [FeatureSSE3, FeatureCMPXCHG16B, FeatureMOVBE, > - FeatureSlowBTMem]>; > +def : AtomProc<"atom", [ProcIntelAtom, FeatureSSE3, FeatureCMPXCHG16B, > + FeatureMOVBE, FeatureSlowBTMem]>; > // "Arrandale" along with corei3 and corei5 > def : Proc<"corei7", [FeatureSSE42, FeatureCMPXCHG16B, > FeatureSlowBTMem, FeatureFastUAMem, > > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Feb 1 17:20:51 2012 > @@ -179,8 +179,11 @@ > > // For 64-bit since we have so many registers use the ILP scheduler, for > // 32-bit code use the register pressure specific scheduling. > + // For 32 bit Atom, use Hybrid (register pressure + latency) scheduling. 
> if (Subtarget->is64Bit()) > setSchedulingPreference(Sched::ILP); > + else if (Subtarget->isAtom()) > + setSchedulingPreference(Sched::Hybrid); > else > setSchedulingPreference(Sched::RegPressure); > setStackPointerRegisterToSaveRestore(X86StackPtr); > > Modified: llvm/trunk/lib/Target/X86/X86InstrArithmetic.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrArithmetic.td?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86InstrArithmetic.td (original) > +++ llvm/trunk/lib/Target/X86/X86InstrArithmetic.td Wed Feb 1 17:20:51 2012 > @@ -18,22 +18,24 @@ > let neverHasSideEffects = 1 in > def LEA16r : I<0x8D, MRMSrcMem, > (outs GR16:$dst), (ins i32mem:$src), > - "lea{w}\t{$src|$dst}, {$dst|$src}", []>, OpSize; > + "lea{w}\t{$src|$dst}, {$dst|$src}", [], IIC_LEA_16>, OpSize; > let isReMaterializable = 1 in > def LEA32r : I<0x8D, MRMSrcMem, > (outs GR32:$dst), (ins i32mem:$src), > "lea{l}\t{$src|$dst}, {$dst|$src}", > - [(set GR32:$dst, lea32addr:$src)]>, Requires<[In32BitMode]>; > + [(set GR32:$dst, lea32addr:$src)], IIC_LEA>, > + Requires<[In32BitMode]>; > > def LEA64_32r : I<0x8D, MRMSrcMem, > (outs GR32:$dst), (ins lea64_32mem:$src), > "lea{l}\t{$src|$dst}, {$dst|$src}", > - [(set GR32:$dst, lea32addr:$src)]>, Requires<[In64BitMode]>; > + [(set GR32:$dst, lea32addr:$src)], IIC_LEA>, > + Requires<[In64BitMode]>; > > let isReMaterializable = 1 in > def LEA64r : RI<0x8D, MRMSrcMem, (outs GR64:$dst), (ins i64mem:$src), > "lea{q}\t{$src|$dst}, {$dst|$src}", > - [(set GR64:$dst, lea64addr:$src)]>; > + [(set GR64:$dst, lea64addr:$src)], IIC_LEA>; > > > > @@ -56,16 +58,18 @@ > let Defs = [AX,DX,EFLAGS], Uses = [AX], neverHasSideEffects = 1 in > def MUL16r : I<0xF7, MRM4r, (outs), (ins GR16:$src), > "mul{w}\t$src", > - []>, OpSize; // AX,DX = AX*GR16 > + [], IIC_MUL16_REG>, OpSize; // AX,DX = AX*GR16 > > let Defs = [EAX,EDX,EFLAGS], Uses = 
[EAX], neverHasSideEffects = 1 in > def MUL32r : I<0xF7, MRM4r, (outs), (ins GR32:$src), > "mul{l}\t$src", // EAX,EDX = EAX*GR32 > - [/*(set EAX, EDX, EFLAGS, (X86umul_flag EAX, GR32:$src))*/]>; > + [/*(set EAX, EDX, EFLAGS, (X86umul_flag EAX, GR32:$src))*/], > + IIC_MUL32_REG>; > let Defs = [RAX,RDX,EFLAGS], Uses = [RAX], neverHasSideEffects = 1 in > def MUL64r : RI<0xF7, MRM4r, (outs), (ins GR64:$src), > "mul{q}\t$src", // RAX,RDX = RAX*GR64 > - [/*(set RAX, RDX, EFLAGS, (X86umul_flag RAX, GR64:$src))*/]>; > + [/*(set RAX, RDX, EFLAGS, (X86umul_flag RAX, GR64:$src))*/], > + IIC_MUL64>; > > let Defs = [AL,EFLAGS,AX], Uses = [AL] in > def MUL8m : I<0xF6, MRM4m, (outs), (ins i8mem :$src), > @@ -74,21 +78,21 @@ > // This probably ought to be moved to a def : Pat<> if the > // syntax can be accepted. > [(set AL, (mul AL, (loadi8 addr:$src))), > - (implicit EFLAGS)]>; // AL,AH = AL*[mem8] > + (implicit EFLAGS)], IIC_MUL8>; // AL,AH = AL*[mem8] > > let mayLoad = 1, neverHasSideEffects = 1 in { > let Defs = [AX,DX,EFLAGS], Uses = [AX] in > def MUL16m : I<0xF7, MRM4m, (outs), (ins i16mem:$src), > "mul{w}\t$src", > - []>, OpSize; // AX,DX = AX*[mem16] > + [], IIC_MUL16_MEM>, OpSize; // AX,DX = AX*[mem16] > > let Defs = [EAX,EDX,EFLAGS], Uses = [EAX] in > def MUL32m : I<0xF7, MRM4m, (outs), (ins i32mem:$src), > "mul{l}\t$src", > - []>; // EAX,EDX = EAX*[mem32] > + [], IIC_MUL32_MEM>; // EAX,EDX = EAX*[mem32] > let Defs = [RAX,RDX,EFLAGS], Uses = [RAX] in > def MUL64m : RI<0xF7, MRM4m, (outs), (ins i64mem:$src), > - "mul{q}\t$src", []>; // RAX,RDX = RAX*[mem64] > + "mul{q}\t$src", [], IIC_MUL64>; // RAX,RDX = RAX*[mem64] > } > > let neverHasSideEffects = 1 in { > @@ -130,16 +134,19 @@ > def IMUL16rr : I<0xAF, MRMSrcReg, (outs GR16:$dst), (ins GR16:$src1,GR16:$src2), > "imul{w}\t{$src2, $dst|$dst, $src2}", > [(set GR16:$dst, EFLAGS, > - (X86smul_flag GR16:$src1, GR16:$src2))]>, TB, OpSize; > + (X86smul_flag GR16:$src1, GR16:$src2))], IIC_IMUL16_RR>, > + TB, OpSize; > def 
IMUL32rr : I<0xAF, MRMSrcReg, (outs GR32:$dst), (ins GR32:$src1,GR32:$src2), > "imul{l}\t{$src2, $dst|$dst, $src2}", > [(set GR32:$dst, EFLAGS, > - (X86smul_flag GR32:$src1, GR32:$src2))]>, TB; > + (X86smul_flag GR32:$src1, GR32:$src2))], IIC_IMUL32_RR>, > + TB; > def IMUL64rr : RI<0xAF, MRMSrcReg, (outs GR64:$dst), > (ins GR64:$src1, GR64:$src2), > "imul{q}\t{$src2, $dst|$dst, $src2}", > [(set GR64:$dst, EFLAGS, > - (X86smul_flag GR64:$src1, GR64:$src2))]>, TB; > + (X86smul_flag GR64:$src1, GR64:$src2))], IIC_IMUL64_RR>, > + TB; > } > > // Register-Memory Signed Integer Multiply > @@ -147,18 +154,23 @@ > (ins GR16:$src1, i16mem:$src2), > "imul{w}\t{$src2, $dst|$dst, $src2}", > [(set GR16:$dst, EFLAGS, > - (X86smul_flag GR16:$src1, (load addr:$src2)))]>, > + (X86smul_flag GR16:$src1, (load addr:$src2)))], > + IIC_IMUL16_RM>, > TB, OpSize; > def IMUL32rm : I<0xAF, MRMSrcMem, (outs GR32:$dst), > (ins GR32:$src1, i32mem:$src2), > "imul{l}\t{$src2, $dst|$dst, $src2}", > [(set GR32:$dst, EFLAGS, > - (X86smul_flag GR32:$src1, (load addr:$src2)))]>, TB; > + (X86smul_flag GR32:$src1, (load addr:$src2)))], > + IIC_IMUL32_RM>, > + TB; > def IMUL64rm : RI<0xAF, MRMSrcMem, (outs GR64:$dst), > (ins GR64:$src1, i64mem:$src2), > "imul{q}\t{$src2, $dst|$dst, $src2}", > [(set GR64:$dst, EFLAGS, > - (X86smul_flag GR64:$src1, (load addr:$src2)))]>, TB; > + (X86smul_flag GR64:$src1, (load addr:$src2)))], > + IIC_IMUL64_RM>, > + TB; > } // Constraints = "$src1 = $dst" > > } // Defs = [EFLAGS] > @@ -170,33 +182,39 @@ > (outs GR16:$dst), (ins GR16:$src1, i16imm:$src2), > "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR16:$dst, EFLAGS, > - (X86smul_flag GR16:$src1, imm:$src2))]>, OpSize; > + (X86smul_flag GR16:$src1, imm:$src2))], > + IIC_IMUL16_RRI>, OpSize; > def IMUL16rri8 : Ii8<0x6B, MRMSrcReg, // GR16 = GR16*I8 > (outs GR16:$dst), (ins GR16:$src1, i16i8imm:$src2), > "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR16:$dst, EFLAGS, > - (X86smul_flag 
GR16:$src1, i16immSExt8:$src2))]>, > + (X86smul_flag GR16:$src1, i16immSExt8:$src2))], > + IIC_IMUL16_RRI>, > OpSize; > def IMUL32rri : Ii32<0x69, MRMSrcReg, // GR32 = GR32*I32 > (outs GR32:$dst), (ins GR32:$src1, i32imm:$src2), > "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR32:$dst, EFLAGS, > - (X86smul_flag GR32:$src1, imm:$src2))]>; > + (X86smul_flag GR32:$src1, imm:$src2))], > + IIC_IMUL32_RRI>; > def IMUL32rri8 : Ii8<0x6B, MRMSrcReg, // GR32 = GR32*I8 > (outs GR32:$dst), (ins GR32:$src1, i32i8imm:$src2), > "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR32:$dst, EFLAGS, > - (X86smul_flag GR32:$src1, i32immSExt8:$src2))]>; > + (X86smul_flag GR32:$src1, i32immSExt8:$src2))], > + IIC_IMUL32_RRI>; > def IMUL64rri32 : RIi32<0x69, MRMSrcReg, // GR64 = GR64*I32 > (outs GR64:$dst), (ins GR64:$src1, i64i32imm:$src2), > "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR64:$dst, EFLAGS, > - (X86smul_flag GR64:$src1, i64immSExt32:$src2))]>; > + (X86smul_flag GR64:$src1, i64immSExt32:$src2))], > + IIC_IMUL64_RRI>; > def IMUL64rri8 : RIi8<0x6B, MRMSrcReg, // GR64 = GR64*I8 > (outs GR64:$dst), (ins GR64:$src1, i64i8imm:$src2), > "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR64:$dst, EFLAGS, > - (X86smul_flag GR64:$src1, i64immSExt8:$src2))]>; > + (X86smul_flag GR64:$src1, i64immSExt8:$src2))], > + IIC_IMUL64_RRI>; > > > // Memory-Integer Signed Integer Multiply > @@ -204,37 +222,43 @@ > (outs GR16:$dst), (ins i16mem:$src1, i16imm:$src2), > "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR16:$dst, EFLAGS, > - (X86smul_flag (load addr:$src1), imm:$src2))]>, > + (X86smul_flag (load addr:$src1), imm:$src2))], > + IIC_IMUL16_RMI>, > OpSize; > def IMUL16rmi8 : Ii8<0x6B, MRMSrcMem, // GR16 = [mem16]*I8 > (outs GR16:$dst), (ins i16mem:$src1, i16i8imm :$src2), > "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR16:$dst, EFLAGS, > (X86smul_flag (load addr:$src1), > - i16immSExt8:$src2))]>, OpSize; > + 
i16immSExt8:$src2))], IIC_IMUL16_RMI>, > + OpSize; > def IMUL32rmi : Ii32<0x69, MRMSrcMem, // GR32 = [mem32]*I32 > (outs GR32:$dst), (ins i32mem:$src1, i32imm:$src2), > "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR32:$dst, EFLAGS, > - (X86smul_flag (load addr:$src1), imm:$src2))]>; > + (X86smul_flag (load addr:$src1), imm:$src2))], > + IIC_IMUL32_RMI>; > def IMUL32rmi8 : Ii8<0x6B, MRMSrcMem, // GR32 = [mem32]*I8 > (outs GR32:$dst), (ins i32mem:$src1, i32i8imm: $src2), > "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR32:$dst, EFLAGS, > (X86smul_flag (load addr:$src1), > - i32immSExt8:$src2))]>; > + i32immSExt8:$src2))], > + IIC_IMUL32_RMI>; > def IMUL64rmi32 : RIi32<0x69, MRMSrcMem, // GR64 = [mem64]*I32 > (outs GR64:$dst), (ins i64mem:$src1, i64i32imm:$src2), > "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR64:$dst, EFLAGS, > (X86smul_flag (load addr:$src1), > - i64immSExt32:$src2))]>; > + i64immSExt32:$src2))], > + IIC_IMUL64_RMI>; > def IMUL64rmi8 : RIi8<0x6B, MRMSrcMem, // GR64 = [mem64]*I8 > (outs GR64:$dst), (ins i64mem:$src1, i64i8imm: $src2), > "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", > [(set GR64:$dst, EFLAGS, > (X86smul_flag (load addr:$src1), > - i64immSExt8:$src2))]>; > + i64immSExt8:$src2))], > + IIC_IMUL64_RMI>; > } // Defs = [EFLAGS] > > > @@ -243,62 +267,62 @@ > // unsigned division/remainder > let Defs = [AL,EFLAGS,AX], Uses = [AX] in > def DIV8r : I<0xF6, MRM6r, (outs), (ins GR8:$src), // AX/r8 = AL,AH > - "div{b}\t$src", []>; > + "div{b}\t$src", [], IIC_DIV8_REG>; > let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in > def DIV16r : I<0xF7, MRM6r, (outs), (ins GR16:$src), // DX:AX/r16 = AX,DX > - "div{w}\t$src", []>, OpSize; > + "div{w}\t$src", [], IIC_DIV16>, OpSize; > let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in > def DIV32r : I<0xF7, MRM6r, (outs), (ins GR32:$src), // EDX:EAX/r32 = EAX,EDX > - "div{l}\t$src", []>; > + "div{l}\t$src", [], IIC_DIV32>; > // RDX:RAX/r64 = RAX,RDX > let Defs = 
[RAX,RDX,EFLAGS], Uses = [RAX,RDX] in > def DIV64r : RI<0xF7, MRM6r, (outs), (ins GR64:$src), > - "div{q}\t$src", []>; > + "div{q}\t$src", [], IIC_DIV64>; > > let mayLoad = 1 in { > let Defs = [AL,EFLAGS,AX], Uses = [AX] in > def DIV8m : I<0xF6, MRM6m, (outs), (ins i8mem:$src), // AX/[mem8] = AL,AH > - "div{b}\t$src", []>; > + "div{b}\t$src", [], IIC_DIV8_MEM>; > let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in > def DIV16m : I<0xF7, MRM6m, (outs), (ins i16mem:$src), // DX:AX/[mem16] = AX,DX > - "div{w}\t$src", []>, OpSize; > + "div{w}\t$src", [], IIC_DIV16>, OpSize; > let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in // EDX:EAX/[mem32] = EAX,EDX > def DIV32m : I<0xF7, MRM6m, (outs), (ins i32mem:$src), > - "div{l}\t$src", []>; > + "div{l}\t$src", [], IIC_DIV32>; > // RDX:RAX/[mem64] = RAX,RDX > let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in > def DIV64m : RI<0xF7, MRM6m, (outs), (ins i64mem:$src), > - "div{q}\t$src", []>; > + "div{q}\t$src", [], IIC_DIV64>; > } > > // Signed division/remainder. 
> let Defs = [AL,EFLAGS,AX], Uses = [AX] in > def IDIV8r : I<0xF6, MRM7r, (outs), (ins GR8:$src), // AX/r8 = AL,AH > - "idiv{b}\t$src", []>; > + "idiv{b}\t$src", [], IIC_IDIV8>; > let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in > def IDIV16r: I<0xF7, MRM7r, (outs), (ins GR16:$src), // DX:AX/r16 = AX,DX > - "idiv{w}\t$src", []>, OpSize; > + "idiv{w}\t$src", [], IIC_IDIV16>, OpSize; > let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in > def IDIV32r: I<0xF7, MRM7r, (outs), (ins GR32:$src), // EDX:EAX/r32 = EAX,EDX > - "idiv{l}\t$src", []>; > + "idiv{l}\t$src", [], IIC_IDIV32>; > // RDX:RAX/r64 = RAX,RDX > let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in > def IDIV64r: RI<0xF7, MRM7r, (outs), (ins GR64:$src), > - "idiv{q}\t$src", []>; > + "idiv{q}\t$src", [], IIC_IDIV64>; > > let mayLoad = 1 in { > let Defs = [AL,EFLAGS,AX], Uses = [AX] in > def IDIV8m : I<0xF6, MRM7m, (outs), (ins i8mem:$src), // AX/[mem8] = AL,AH > - "idiv{b}\t$src", []>; > + "idiv{b}\t$src", [], IIC_IDIV8>; > let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in > def IDIV16m: I<0xF7, MRM7m, (outs), (ins i16mem:$src), // DX:AX/[mem16] = AX,DX > - "idiv{w}\t$src", []>, OpSize; > + "idiv{w}\t$src", [], IIC_IDIV16>, OpSize; > let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in // EDX:EAX/[mem32] = EAX,EDX > def IDIV32m: I<0xF7, MRM7m, (outs), (ins i32mem:$src), > - "idiv{l}\t$src", []>; > + "idiv{l}\t$src", [], IIC_IDIV32>; > let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in // RDX:RAX/[mem64] = RAX,RDX > def IDIV64m: RI<0xF7, MRM7m, (outs), (ins i64mem:$src), > - "idiv{q}\t$src", []>; > + "idiv{q}\t$src", [], IIC_IDIV64>; > } > > //===----------------------------------------------------------------------===// > @@ -312,35 +336,35 @@ > def NEG8r : I<0xF6, MRM3r, (outs GR8 :$dst), (ins GR8 :$src1), > "neg{b}\t$dst", > [(set GR8:$dst, (ineg GR8:$src1)), > - (implicit EFLAGS)]>; > + (implicit EFLAGS)], IIC_UNARY_REG>; > def NEG16r : I<0xF7, MRM3r, (outs GR16:$dst), (ins GR16:$src1), > "neg{w}\t$dst", > [(set GR16:$dst, 
(ineg GR16:$src1)), > - (implicit EFLAGS)]>, OpSize; > + (implicit EFLAGS)], IIC_UNARY_REG>, OpSize; > def NEG32r : I<0xF7, MRM3r, (outs GR32:$dst), (ins GR32:$src1), > "neg{l}\t$dst", > [(set GR32:$dst, (ineg GR32:$src1)), > - (implicit EFLAGS)]>; > + (implicit EFLAGS)], IIC_UNARY_REG>; > def NEG64r : RI<0xF7, MRM3r, (outs GR64:$dst), (ins GR64:$src1), "neg{q}\t$dst", > [(set GR64:$dst, (ineg GR64:$src1)), > - (implicit EFLAGS)]>; > + (implicit EFLAGS)], IIC_UNARY_REG>; > } // Constraints = "$src1 = $dst" > > def NEG8m : I<0xF6, MRM3m, (outs), (ins i8mem :$dst), > "neg{b}\t$dst", > [(store (ineg (loadi8 addr:$dst)), addr:$dst), > - (implicit EFLAGS)]>; > + (implicit EFLAGS)], IIC_UNARY_MEM>; > def NEG16m : I<0xF7, MRM3m, (outs), (ins i16mem:$dst), > "neg{w}\t$dst", > [(store (ineg (loadi16 addr:$dst)), addr:$dst), > - (implicit EFLAGS)]>, OpSize; > + (implicit EFLAGS)], IIC_UNARY_MEM>, OpSize; > def NEG32m : I<0xF7, MRM3m, (outs), (ins i32mem:$dst), > "neg{l}\t$dst", > [(store (ineg (loadi32 addr:$dst)), addr:$dst), > - (implicit EFLAGS)]>; > + (implicit EFLAGS)], IIC_UNARY_MEM>; > def NEG64m : RI<0xF7, MRM3m, (outs), (ins i64mem:$dst), "neg{q}\t$dst", > [(store (ineg (loadi64 addr:$dst)), addr:$dst), > - (implicit EFLAGS)]>; > + (implicit EFLAGS)], IIC_UNARY_MEM>; > } // Defs = [EFLAGS] > > > @@ -351,29 +375,30 @@ > let AddedComplexity = 15 in { > def NOT8r : I<0xF6, MRM2r, (outs GR8 :$dst), (ins GR8 :$src1), > "not{b}\t$dst", > - [(set GR8:$dst, (not GR8:$src1))]>; > + [(set GR8:$dst, (not GR8:$src1))], IIC_UNARY_REG>; > def NOT16r : I<0xF7, MRM2r, (outs GR16:$dst), (ins GR16:$src1), > "not{w}\t$dst", > - [(set GR16:$dst, (not GR16:$src1))]>, OpSize; > + [(set GR16:$dst, (not GR16:$src1))], IIC_UNARY_REG>, OpSize; > def NOT32r : I<0xF7, MRM2r, (outs GR32:$dst), (ins GR32:$src1), > "not{l}\t$dst", > - [(set GR32:$dst, (not GR32:$src1))]>; > + [(set GR32:$dst, (not GR32:$src1))], IIC_UNARY_REG>; > def NOT64r : RI<0xF7, MRM2r, (outs GR64:$dst), (ins GR64:$src1), 
"not{q}\t$dst", > - [(set GR64:$dst, (not GR64:$src1))]>; > + [(set GR64:$dst, (not GR64:$src1))], IIC_UNARY_REG>; > } > } // Constraints = "$src1 = $dst" > > def NOT8m : I<0xF6, MRM2m, (outs), (ins i8mem :$dst), > "not{b}\t$dst", > - [(store (not (loadi8 addr:$dst)), addr:$dst)]>; > + [(store (not (loadi8 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>; > def NOT16m : I<0xF7, MRM2m, (outs), (ins i16mem:$dst), > "not{w}\t$dst", > - [(store (not (loadi16 addr:$dst)), addr:$dst)]>, OpSize; > + [(store (not (loadi16 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>, > + OpSize; > def NOT32m : I<0xF7, MRM2m, (outs), (ins i32mem:$dst), > "not{l}\t$dst", > - [(store (not (loadi32 addr:$dst)), addr:$dst)]>; > + [(store (not (loadi32 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>; > def NOT64m : RI<0xF7, MRM2m, (outs), (ins i64mem:$dst), "not{q}\t$dst", > - [(store (not (loadi64 addr:$dst)), addr:$dst)]>; > + [(store (not (loadi64 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>; > } // CodeSize > > // TODO: inc/dec is slow for P4, but fast for Pentium-M. > @@ -382,19 +407,22 @@ > let CodeSize = 2 in > def INC8r : I<0xFE, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1), > "inc{b}\t$dst", > - [(set GR8:$dst, EFLAGS, (X86inc_flag GR8:$src1))]>; > + [(set GR8:$dst, EFLAGS, (X86inc_flag GR8:$src1))], > + IIC_UNARY_REG>; > > let isConvertibleToThreeAddress = 1, CodeSize = 1 in { // Can xform into LEA. 
> def INC16r : I<0x40, AddRegFrm, (outs GR16:$dst), (ins GR16:$src1), > "inc{w}\t$dst", > - [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))]>, > + [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))], IIC_UNARY_REG>, > OpSize, Requires<[In32BitMode]>; > def INC32r : I<0x40, AddRegFrm, (outs GR32:$dst), (ins GR32:$src1), > "inc{l}\t$dst", > - [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))]>, > + [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))], > + IIC_UNARY_REG>, > Requires<[In32BitMode]>; > def INC64r : RI<0xFF, MRM0r, (outs GR64:$dst), (ins GR64:$src1), "inc{q}\t$dst", > - [(set GR64:$dst, EFLAGS, (X86inc_flag GR64:$src1))]>; > + [(set GR64:$dst, EFLAGS, (X86inc_flag GR64:$src1))], > + IIC_UNARY_REG>; > } // isConvertibleToThreeAddress = 1, CodeSize = 1 > > > @@ -403,19 +431,23 @@ > // Can transform into LEA. > def INC64_16r : I<0xFF, MRM0r, (outs GR16:$dst), (ins GR16:$src1), > "inc{w}\t$dst", > - [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))]>, > + [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))], > + IIC_UNARY_REG>, > OpSize, Requires<[In64BitMode]>; > def INC64_32r : I<0xFF, MRM0r, (outs GR32:$dst), (ins GR32:$src1), > "inc{l}\t$dst", > - [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))]>, > + [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))], > + IIC_UNARY_REG>, > Requires<[In64BitMode]>; > def DEC64_16r : I<0xFF, MRM1r, (outs GR16:$dst), (ins GR16:$src1), > "dec{w}\t$dst", > - [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))]>, > + [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))], > + IIC_UNARY_REG>, > OpSize, Requires<[In64BitMode]>; > def DEC64_32r : I<0xFF, MRM1r, (outs GR32:$dst), (ins GR32:$src1), > "dec{l}\t$dst", > - [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))]>, > + [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))], > + IIC_UNARY_REG>, > Requires<[In64BitMode]>; > } // isConvertibleToThreeAddress = 1, CodeSize = 2 > > @@ -424,37 +456,37 @@ > let CodeSize = 2 in { > def INC8m : I<0xFE, MRM0m, (outs), (ins i8mem 
:$dst), "inc{b}\t$dst", > [(store (add (loadi8 addr:$dst), 1), addr:$dst), > - (implicit EFLAGS)]>; > + (implicit EFLAGS)], IIC_UNARY_MEM>; > def INC16m : I<0xFF, MRM0m, (outs), (ins i16mem:$dst), "inc{w}\t$dst", > [(store (add (loadi16 addr:$dst), 1), addr:$dst), > - (implicit EFLAGS)]>, > + (implicit EFLAGS)], IIC_UNARY_MEM>, > OpSize, Requires<[In32BitMode]>; > def INC32m : I<0xFF, MRM0m, (outs), (ins i32mem:$dst), "inc{l}\t$dst", > [(store (add (loadi32 addr:$dst), 1), addr:$dst), > - (implicit EFLAGS)]>, > + (implicit EFLAGS)], IIC_UNARY_MEM>, > Requires<[In32BitMode]>; > def INC64m : RI<0xFF, MRM0m, (outs), (ins i64mem:$dst), "inc{q}\t$dst", > [(store (add (loadi64 addr:$dst), 1), addr:$dst), > - (implicit EFLAGS)]>; > + (implicit EFLAGS)], IIC_UNARY_MEM>; > > // These are duplicates of their 32-bit counterparts. Only needed so X86 knows > // how to unfold them. > // FIXME: What is this for?? > def INC64_16m : I<0xFF, MRM0m, (outs), (ins i16mem:$dst), "inc{w}\t$dst", > [(store (add (loadi16 addr:$dst), 1), addr:$dst), > - (implicit EFLAGS)]>, > + (implicit EFLAGS)], IIC_UNARY_MEM>, > OpSize, Requires<[In64BitMode]>; > def INC64_32m : I<0xFF, MRM0m, (outs), (ins i32mem:$dst), "inc{l}\t$dst", > [(store (add (loadi32 addr:$dst), 1), addr:$dst), > - (implicit EFLAGS)]>, > + (implicit EFLAGS)], IIC_UNARY_MEM>, > Requires<[In64BitMode]>; > def DEC64_16m : I<0xFF, MRM1m, (outs), (ins i16mem:$dst), "dec{w}\t$dst", > [(store (add (loadi16 addr:$dst), -1), addr:$dst), > - (implicit EFLAGS)]>, > + (implicit EFLAGS)], IIC_UNARY_MEM>, > OpSize, Requires<[In64BitMode]>; > def DEC64_32m : I<0xFF, MRM1m, (outs), (ins i32mem:$dst), "dec{l}\t$dst", > [(store (add (loadi32 addr:$dst), -1), addr:$dst), > - (implicit EFLAGS)]>, > + (implicit EFLAGS)], IIC_UNARY_MEM>, > Requires<[In64BitMode]>; > } // CodeSize = 2 > > @@ -462,18 +494,22 @@ > let CodeSize = 2 in > def DEC8r : I<0xFE, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1), > "dec{b}\t$dst", > - [(set GR8:$dst, EFLAGS, 
(X86dec_flag GR8:$src1))]>; > + [(set GR8:$dst, EFLAGS, (X86dec_flag GR8:$src1))], > + IIC_UNARY_REG>; > let isConvertibleToThreeAddress = 1, CodeSize = 1 in { // Can xform into LEA. > def DEC16r : I<0x48, AddRegFrm, (outs GR16:$dst), (ins GR16:$src1), > "dec{w}\t$dst", > - [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))]>, > + [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))], > + IIC_UNARY_REG>, > OpSize, Requires<[In32BitMode]>; > def DEC32r : I<0x48, AddRegFrm, (outs GR32:$dst), (ins GR32:$src1), > "dec{l}\t$dst", > - [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))]>, > + [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))], > + IIC_UNARY_REG>, > Requires<[In32BitMode]>; > def DEC64r : RI<0xFF, MRM1r, (outs GR64:$dst), (ins GR64:$src1), "dec{q}\t$dst", > - [(set GR64:$dst, EFLAGS, (X86dec_flag GR64:$src1))]>; > + [(set GR64:$dst, EFLAGS, (X86dec_flag GR64:$src1))], > + IIC_UNARY_REG>; > } // CodeSize = 2 > } // Constraints = "$src1 = $dst" > > @@ -481,18 +517,18 @@ > let CodeSize = 2 in { > def DEC8m : I<0xFE, MRM1m, (outs), (ins i8mem :$dst), "dec{b}\t$dst", > [(store (add (loadi8 addr:$dst), -1), addr:$dst), > - (implicit EFLAGS)]>; > + (implicit EFLAGS)], IIC_UNARY_MEM>; > def DEC16m : I<0xFF, MRM1m, (outs), (ins i16mem:$dst), "dec{w}\t$dst", > [(store (add (loadi16 addr:$dst), -1), addr:$dst), > - (implicit EFLAGS)]>, > + (implicit EFLAGS)], IIC_UNARY_MEM>, > OpSize, Requires<[In32BitMode]>; > def DEC32m : I<0xFF, MRM1m, (outs), (ins i32mem:$dst), "dec{l}\t$dst", > [(store (add (loadi32 addr:$dst), -1), addr:$dst), > - (implicit EFLAGS)]>, > + (implicit EFLAGS)], IIC_UNARY_MEM>, > Requires<[In32BitMode]>; > def DEC64m : RI<0xFF, MRM1m, (outs), (ins i64mem:$dst), "dec{q}\t$dst", > [(store (add (loadi64 addr:$dst), -1), addr:$dst), > - (implicit EFLAGS)]>; > + (implicit EFLAGS)], IIC_UNARY_MEM>; > } // CodeSize = 2 > } // Defs = [EFLAGS] > > @@ -588,11 +624,13 @@ > /// 4. 
Infers whether the low bit of the opcode should be 0 (for i8 operations) > /// or 1 (for i16,i32,i64 operations). > class ITy opcode, Format f, X86TypeInfo typeinfo, dag outs, dag ins, > - string mnemonic, string args, list pattern> > + string mnemonic, string args, list pattern, > + InstrItinClass itin = IIC_BIN_NONMEM> > : I<{opcode{7}, opcode{6}, opcode{5}, opcode{4}, > opcode{3}, opcode{2}, opcode{1}, typeinfo.HasOddOpcode }, > f, outs, ins, > - !strconcat(mnemonic, "{", typeinfo.InstrSuffix, "}\t", args), pattern> { > + !strconcat(mnemonic, "{", typeinfo.InstrSuffix, "}\t", args), pattern, > + itin> { > > // Infer instruction prefixes from type info. > let hasOpSizePrefix = typeinfo.HasOpSizePrefix; > @@ -664,7 +702,7 @@ > dag outlist, list pattern> > : ITy (ins typeinfo.RegClass:$src1, typeinfo.MemOperand:$src2), > - mnemonic, "{$src2, $src1|$src1, $src2}", pattern>; > + mnemonic, "{$src2, $src1|$src1, $src2}", pattern, IIC_BIN_MEM>; > > // BinOpRM_R - Instructions like "add reg, reg, [mem]". > class BinOpRM_R opcode, string mnemonic, X86TypeInfo typeinfo, > @@ -776,7 +814,7 @@ > list pattern> > : ITy (outs), (ins typeinfo.MemOperand:$dst, typeinfo.RegClass:$src), > - mnemonic, "{$src, $dst|$dst, $src}", pattern>; > + mnemonic, "{$src, $dst|$dst, $src}", pattern, IIC_BIN_MEM>; > > // BinOpMR_RMW - Instructions like "add [mem], reg". 
> class BinOpMR_RMW opcode, string mnemonic, X86TypeInfo typeinfo, > @@ -804,7 +842,7 @@ > Format f, list pattern, bits<8> opcode = 0x80> > : ITy (outs), (ins typeinfo.MemOperand:$dst, typeinfo.ImmOperand:$src), > - mnemonic, "{$src, $dst|$dst, $src}", pattern> { > + mnemonic, "{$src, $dst|$dst, $src}", pattern, IIC_BIN_MEM> { > let ImmT = typeinfo.ImmEncoding; > } > > @@ -837,7 +875,7 @@ > Format f, list pattern> > : ITy<0x82, f, typeinfo, > (outs), (ins typeinfo.MemOperand:$dst, typeinfo.Imm8Operand:$src), > - mnemonic, "{$src, $dst|$dst, $src}", pattern> { > + mnemonic, "{$src, $dst|$dst, $src}", pattern, IIC_BIN_MEM> { > let ImmT = Imm8; // Always 8-bit immediate. > } > > @@ -1150,7 +1188,7 @@ > // register class is constrained to GR8_NOREX. > let isPseudo = 1 in > def TEST8ri_NOREX : I<0, Pseudo, (outs), (ins GR8_NOREX:$src, i8imm:$mask), > - "", []>; > + "", [], IIC_BIN_NONMEM>; > } > > //===----------------------------------------------------------------------===// > @@ -1160,11 +1198,12 @@ > PatFrag ld_frag> { > def rr : I<0xF2, MRMSrcReg, (outs RC:$dst), (ins RC:$src1, RC:$src2), > !strconcat(mnemonic, "\t{$src2, $src1, $dst|$dst, $src1, $src2}"), > - [(set RC:$dst, EFLAGS, (X86andn_flag RC:$src1, RC:$src2))]>; > + [(set RC:$dst, EFLAGS, (X86andn_flag RC:$src1, RC:$src2))], > + IIC_BIN_NONMEM>; > def rm : I<0xF2, MRMSrcMem, (outs RC:$dst), (ins RC:$src1, x86memop:$src2), > !strconcat(mnemonic, "\t{$src2, $src1, $dst|$dst, $src1, $src2}"), > [(set RC:$dst, EFLAGS, > - (X86andn_flag RC:$src1, (ld_frag addr:$src2)))]>; > + (X86andn_flag RC:$src1, (ld_frag addr:$src2)))], IIC_BIN_MEM>; > } > > let Predicates = [HasBMI], Defs = [EFLAGS] in { > > Modified: llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td 
(original) > +++ llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td Wed Feb 1 17:20:51 2012 > @@ -21,17 +21,20 @@ > : I !strconcat(Mnemonic, "{w}\t{$src2, $dst|$dst, $src2}"), > [(set GR16:$dst, > - (X86cmov GR16:$src1, GR16:$src2, CondNode, EFLAGS))]>,TB,OpSize; > + (X86cmov GR16:$src1, GR16:$src2, CondNode, EFLAGS))], > + IIC_CMOV16_RR>,TB,OpSize; > def #NAME#32rr > : I !strconcat(Mnemonic, "{l}\t{$src2, $dst|$dst, $src2}"), > [(set GR32:$dst, > - (X86cmov GR32:$src1, GR32:$src2, CondNode, EFLAGS))]>, TB; > + (X86cmov GR32:$src1, GR32:$src2, CondNode, EFLAGS))], > + IIC_CMOV32_RR>, TB; > def #NAME#64rr > :RI !strconcat(Mnemonic, "{q}\t{$src2, $dst|$dst, $src2}"), > [(set GR64:$dst, > - (X86cmov GR64:$src1, GR64:$src2, CondNode, EFLAGS))]>, TB; > + (X86cmov GR64:$src1, GR64:$src2, CondNode, EFLAGS))], > + IIC_CMOV32_RR>, TB; > } > > let Uses = [EFLAGS], Predicates = [HasCMov], Constraints = "$src1 = $dst" in { > @@ -39,17 +42,18 @@ > : I !strconcat(Mnemonic, "{w}\t{$src2, $dst|$dst, $src2}"), > [(set GR16:$dst, (X86cmov GR16:$src1, (loadi16 addr:$src2), > - CondNode, EFLAGS))]>, TB, OpSize; > + CondNode, EFLAGS))], IIC_CMOV16_RM>, > + TB, OpSize; > def #NAME#32rm > : I !strconcat(Mnemonic, "{l}\t{$src2, $dst|$dst, $src2}"), > [(set GR32:$dst, (X86cmov GR32:$src1, (loadi32 addr:$src2), > - CondNode, EFLAGS))]>, TB; > + CondNode, EFLAGS))], IIC_CMOV32_RM>, TB; > def #NAME#64rm > :RI !strconcat(Mnemonic, "{q}\t{$src2, $dst|$dst, $src2}"), > [(set GR64:$dst, (X86cmov GR64:$src1, (loadi64 addr:$src2), > - CondNode, EFLAGS))]>, TB; > + CondNode, EFLAGS))], IIC_CMOV32_RM>, TB; > } // Uses = [EFLAGS], Predicates = [HasCMov], Constraints = "$src1 = $dst" > } // end multiclass > > @@ -78,10 +82,12 @@ > let Uses = [EFLAGS] in { > def r : I !strconcat(Mnemonic, "\t$dst"), > - [(set GR8:$dst, (X86setcc OpNode, EFLAGS))]>, TB; > + [(set GR8:$dst, (X86setcc OpNode, EFLAGS))], > + IIC_SET_R>, TB; > def m : I !strconcat(Mnemonic, "\t$dst"), > - [(store (X86setcc OpNode, EFLAGS), 
addr:$dst)]>, TB; > + [(store (X86setcc OpNode, EFLAGS), addr:$dst)], > + IIC_SET_M>, TB; > } // Uses = [EFLAGS] > } > > > Modified: llvm/trunk/lib/Target/X86/X86InstrControl.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrControl.td?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86InstrControl.td (original) > +++ llvm/trunk/lib/Target/X86/X86InstrControl.td Wed Feb 1 17:20:51 2012 > @@ -20,41 +20,42 @@ > hasCtrlDep = 1, FPForm = SpecialFP in { > def RET : I <0xC3, RawFrm, (outs), (ins variable_ops), > "ret", > - [(X86retflag 0)]>; > + [(X86retflag 0)], IIC_RET>; > def RETI : Ii16<0xC2, RawFrm, (outs), (ins i16imm:$amt, variable_ops), > "ret\t$amt", > - [(X86retflag timm:$amt)]>; > + [(X86retflag timm:$amt)], IIC_RET_IMM>; > def RETIW : Ii16<0xC2, RawFrm, (outs), (ins i16imm:$amt, variable_ops), > "retw\t$amt", > - []>, OpSize; > + [], IIC_RET_IMM>, OpSize; > def LRETL : I <0xCB, RawFrm, (outs), (ins), > - "lretl", []>; > + "lretl", [], IIC_RET>; > def LRETQ : RI <0xCB, RawFrm, (outs), (ins), > - "lretq", []>; > + "lretq", [], IIC_RET>; > def LRETI : Ii16<0xCA, RawFrm, (outs), (ins i16imm:$amt), > - "lret\t$amt", []>; > + "lret\t$amt", [], IIC_RET>; > def LRETIW : Ii16<0xCA, RawFrm, (outs), (ins i16imm:$amt), > - "lretw\t$amt", []>, OpSize; > + "lretw\t$amt", [], IIC_RET>, OpSize; > } > > // Unconditional branches. > let isBarrier = 1, isBranch = 1, isTerminator = 1 in { > def JMP_4 : Ii32PCRel<0xE9, RawFrm, (outs), (ins brtarget:$dst), > - "jmp\t$dst", [(br bb:$dst)]>; > + "jmp\t$dst", [(br bb:$dst)], IIC_JMP_REL>; > def JMP_1 : Ii8PCRel<0xEB, RawFrm, (outs), (ins brtarget8:$dst), > - "jmp\t$dst", []>; > + "jmp\t$dst", [], IIC_JMP_REL>; > // FIXME : Intel syntax for JMP64pcrel32 such that it is not ambiguious > // with JMP_1. 
> def JMP64pcrel32 : I<0xE9, RawFrm, (outs), (ins brtarget:$dst), > - "jmpq\t$dst", []>; > + "jmpq\t$dst", [], IIC_JMP_REL>; > } > > // Conditional Branches. > let isBranch = 1, isTerminator = 1, Uses = [EFLAGS] in { > multiclass ICBr opc1, bits<8> opc4, string asm, PatFrag Cond> { > - def _1 : Ii8PCRel ; > + def _1 : Ii8PCRel + IIC_Jcc>; > def _4 : Ii32PCRel - [(X86brcond bb:$dst, Cond, EFLAGS)]>, TB; > + [(X86brcond bb:$dst, Cond, EFLAGS)], IIC_Jcc>, TB; > } > } > > @@ -82,55 +83,55 @@ > // jecxz. > let Uses = [CX] in > def JCXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), > - "jcxz\t$dst", []>, AdSize, Requires<[In32BitMode]>; > + "jcxz\t$dst", [], IIC_JCXZ>, AdSize, Requires<[In32BitMode]>; > let Uses = [ECX] in > def JECXZ_32 : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), > - "jecxz\t$dst", []>, Requires<[In32BitMode]>; > + "jecxz\t$dst", [], IIC_JCXZ>, Requires<[In32BitMode]>; > > // J*CXZ instruction: 64-bit versions of this instruction for the asmparser. > // In 64-bit mode, the address size prefix is jecxz and the unprefixed version > // is jrcxz. 
> let Uses = [ECX] in > def JECXZ_64 : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), > - "jecxz\t$dst", []>, AdSize, Requires<[In64BitMode]>; > + "jecxz\t$dst", [], IIC_JCXZ>, AdSize, Requires<[In64BitMode]>; > let Uses = [RCX] in > def JRCXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), > - "jrcxz\t$dst", []>, Requires<[In64BitMode]>; > + "jrcxz\t$dst", [], IIC_JCXZ>, Requires<[In64BitMode]>; > } > > // Indirect branches > let isBranch = 1, isTerminator = 1, isBarrier = 1, isIndirectBranch = 1 in { > def JMP32r : I<0xFF, MRM4r, (outs), (ins GR32:$dst), "jmp{l}\t{*}$dst", > - [(brind GR32:$dst)]>, Requires<[In32BitMode]>; > + [(brind GR32:$dst)], IIC_JMP_REG>, Requires<[In32BitMode]>; > def JMP32m : I<0xFF, MRM4m, (outs), (ins i32mem:$dst), "jmp{l}\t{*}$dst", > - [(brind (loadi32 addr:$dst))]>, Requires<[In32BitMode]>; > + [(brind (loadi32 addr:$dst))], IIC_JMP_MEM>, Requires<[In32BitMode]>; > > def JMP64r : I<0xFF, MRM4r, (outs), (ins GR64:$dst), "jmp{q}\t{*}$dst", > - [(brind GR64:$dst)]>, Requires<[In64BitMode]>; > + [(brind GR64:$dst)], IIC_JMP_REG>, Requires<[In64BitMode]>; > def JMP64m : I<0xFF, MRM4m, (outs), (ins i64mem:$dst), "jmp{q}\t{*}$dst", > - [(brind (loadi64 addr:$dst))]>, Requires<[In64BitMode]>; > + [(brind (loadi64 addr:$dst))], IIC_JMP_MEM>, Requires<[In64BitMode]>; > > def FARJMP16i : Iseg16<0xEA, RawFrmImm16, (outs), > (ins i16imm:$off, i16imm:$seg), > - "ljmp{w}\t{$seg, $off|$off, $seg}", []>, OpSize; > + "ljmp{w}\t{$seg, $off|$off, $seg}", [], IIC_JMP_FAR_PTR>, OpSize; > def FARJMP32i : Iseg32<0xEA, RawFrmImm16, (outs), > (ins i32imm:$off, i16imm:$seg), > - "ljmp{l}\t{$seg, $off|$off, $seg}", []>; > + "ljmp{l}\t{$seg, $off|$off, $seg}", [], IIC_JMP_FAR_PTR>; > def FARJMP64 : RI<0xFF, MRM5m, (outs), (ins opaque80mem:$dst), > - "ljmp{q}\t{*}$dst", []>; > + "ljmp{q}\t{*}$dst", [], IIC_JMP_FAR_MEM>; > > def FARJMP16m : I<0xFF, MRM5m, (outs), (ins opaque32mem:$dst), > - "ljmp{w}\t{*}$dst", []>, OpSize; > + "ljmp{w}\t{*}$dst", [], 
IIC_JMP_FAR_MEM>, OpSize; > def FARJMP32m : I<0xFF, MRM5m, (outs), (ins opaque48mem:$dst), > - "ljmp{l}\t{*}$dst", []>; > + "ljmp{l}\t{*}$dst", [], IIC_JMP_FAR_MEM>; > } > > > // Loop instructions > > -def LOOP : Ii8PCRel<0xE2, RawFrm, (outs), (ins brtarget8:$dst), "loop\t$dst", []>; > -def LOOPE : Ii8PCRel<0xE1, RawFrm, (outs), (ins brtarget8:$dst), "loope\t$dst", []>; > -def LOOPNE : Ii8PCRel<0xE0, RawFrm, (outs), (ins brtarget8:$dst), "loopne\t$dst", []>; > +def LOOP : Ii8PCRel<0xE2, RawFrm, (outs), (ins brtarget8:$dst), "loop\t$dst", [], IIC_LOOP>; > +def LOOPE : Ii8PCRel<0xE1, RawFrm, (outs), (ins brtarget8:$dst), "loope\t$dst", [], IIC_LOOPE>; > +def LOOPNE : Ii8PCRel<0xE0, RawFrm, (outs), (ins brtarget8:$dst), "loopne\t$dst", [], IIC_LOOPNE>; > > //===----------------------------------------------------------------------===// > // Call Instructions... > @@ -147,25 +148,27 @@ > Uses = [ESP] in { > def CALLpcrel32 : Ii32PCRel<0xE8, RawFrm, > (outs), (ins i32imm_pcrel:$dst,variable_ops), > - "call{l}\t$dst", []>, Requires<[In32BitMode]>; > + "call{l}\t$dst", [], IIC_CALL_RI>, Requires<[In32BitMode]>; > def CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops), > - "call{l}\t{*}$dst", [(X86call GR32:$dst)]>, > + "call{l}\t{*}$dst", [(X86call GR32:$dst)], IIC_CALL_RI>, > Requires<[In32BitMode]>; > def CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops), > - "call{l}\t{*}$dst", [(X86call (loadi32 addr:$dst))]>, > + "call{l}\t{*}$dst", [(X86call (loadi32 addr:$dst))], IIC_CALL_MEM>, > Requires<[In32BitMode]>; > > def FARCALL16i : Iseg16<0x9A, RawFrmImm16, (outs), > (ins i16imm:$off, i16imm:$seg), > - "lcall{w}\t{$seg, $off|$off, $seg}", []>, OpSize; > + "lcall{w}\t{$seg, $off|$off, $seg}", [], > + IIC_CALL_FAR_PTR>, OpSize; > def FARCALL32i : Iseg32<0x9A, RawFrmImm16, (outs), > (ins i32imm:$off, i16imm:$seg), > - "lcall{l}\t{$seg, $off|$off, $seg}", []>; > + "lcall{l}\t{$seg, $off|$off, $seg}", [], > + IIC_CALL_FAR_PTR>; > > def FARCALL16m : 
I<0xFF, MRM3m, (outs), (ins opaque32mem:$dst), > - "lcall{w}\t{*}$dst", []>, OpSize; > + "lcall{w}\t{*}$dst", [], IIC_CALL_FAR_MEM>, OpSize; > def FARCALL32m : I<0xFF, MRM3m, (outs), (ins opaque48mem:$dst), > - "lcall{l}\t{*}$dst", []>; > + "lcall{l}\t{*}$dst", [], IIC_CALL_FAR_MEM>; > > // callw for 16 bit code for the assembler. > let isAsmParserOnly = 1 in > @@ -196,13 +199,13 @@ > // mcinst. > def TAILJMPd : Ii32PCRel<0xE9, RawFrm, (outs), > (ins i32imm_pcrel:$dst, variable_ops), > - "jmp\t$dst # TAILCALL", > - []>; > + "jmp\t$dst # TAILCALL", > + [], IIC_JMP_REL>; > def TAILJMPr : I<0xFF, MRM4r, (outs), (ins GR32_TC:$dst, variable_ops), > - "", []>; // FIXME: Remove encoding when JIT is dead. > + "", [], IIC_JMP_REG>; // FIXME: Remove encoding when JIT is dead. > let mayLoad = 1 in > def TAILJMPm : I<0xFF, MRM4m, (outs), (ins i32mem_TC:$dst, variable_ops), > - "jmp{l}\t{*}$dst # TAILCALL", []>; > + "jmp{l}\t{*}$dst # TAILCALL", [], IIC_JMP_MEM>; > } > > > @@ -226,17 +229,19 @@ > // the 32-bit pcrel field that we have. 
> def CALL64pcrel32 : Ii32PCRel<0xE8, RawFrm, > (outs), (ins i64i32imm_pcrel:$dst, variable_ops), > - "call{q}\t$dst", []>, > + "call{q}\t$dst", [], IIC_CALL_RI>, > Requires<[In64BitMode, NotWin64]>; > def CALL64r : I<0xFF, MRM2r, (outs), (ins GR64:$dst, variable_ops), > - "call{q}\t{*}$dst", [(X86call GR64:$dst)]>, > + "call{q}\t{*}$dst", [(X86call GR64:$dst)], > + IIC_CALL_RI>, > Requires<[In64BitMode, NotWin64]>; > def CALL64m : I<0xFF, MRM2m, (outs), (ins i64mem:$dst, variable_ops), > - "call{q}\t{*}$dst", [(X86call (loadi64 addr:$dst))]>, > + "call{q}\t{*}$dst", [(X86call (loadi64 addr:$dst))], > + IIC_CALL_MEM>, > Requires<[In64BitMode, NotWin64]>; > > def FARCALL64 : RI<0xFF, MRM3m, (outs), (ins opaque80mem:$dst), > - "lcall{q}\t{*}$dst", []>; > + "lcall{q}\t{*}$dst", [], IIC_CALL_FAR_MEM>; > } > > // FIXME: We need to teach codegen about single list of call-clobbered > @@ -253,15 +258,16 @@ > Uses = [RSP] in { > def WINCALL64pcrel32 : Ii32PCRel<0xE8, RawFrm, > (outs), (ins i64i32imm_pcrel:$dst, variable_ops), > - "call{q}\t$dst", []>, > + "call{q}\t$dst", [], IIC_CALL_RI>, > Requires<[IsWin64]>; > def WINCALL64r : I<0xFF, MRM2r, (outs), (ins GR64:$dst, variable_ops), > "call{q}\t{*}$dst", > - [(X86call GR64:$dst)]>, Requires<[IsWin64]>; > + [(X86call GR64:$dst)], IIC_CALL_RI>, > + Requires<[IsWin64]>; > def WINCALL64m : I<0xFF, MRM2m, (outs), > (ins i64mem:$dst,variable_ops), > "call{q}\t{*}$dst", > - [(X86call (loadi64 addr:$dst))]>, > + [(X86call (loadi64 addr:$dst))], IIC_CALL_MEM>, > Requires<[IsWin64]>; > } > > @@ -272,7 +278,7 @@ > Uses = [RSP] in { > def W64ALLOCA : Ii32PCRel<0xE8, RawFrm, > (outs), (ins i64i32imm_pcrel:$dst, variable_ops), > - "call{q}\t$dst", []>, > + "call{q}\t$dst", [], IIC_CALL_RI>, > Requires<[IsWin64]>; > } > > @@ -296,11 +302,11 @@ > > def TAILJMPd64 : Ii32PCRel<0xE9, RawFrm, (outs), > (ins i64i32imm_pcrel:$dst, variable_ops), > - "jmp\t$dst # TAILCALL", []>; > + "jmp\t$dst # TAILCALL", [], IIC_JMP_REL>; > def TAILJMPr64 : 
I<0xFF, MRM4r, (outs), (ins ptr_rc_tailcall:$dst, variable_ops), > - "jmp{q}\t{*}$dst # TAILCALL", []>; > + "jmp{q}\t{*}$dst # TAILCALL", [], IIC_JMP_MEM>; > > let mayLoad = 1 in > def TAILJMPm64 : I<0xFF, MRM4m, (outs), (ins i64mem_TC:$dst, variable_ops), > - "jmp{q}\t{*}$dst # TAILCALL", []>; > + "jmp{q}\t{*}$dst # TAILCALL", [], IIC_JMP_MEM>; > } > > Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFormats.td?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86InstrFormats.td (original) > +++ llvm/trunk/lib/Target/X86/X86InstrFormats.td Wed Feb 1 17:20:51 2012 > @@ -123,7 +123,9 @@ > class MemOp4 { bit hasMemOp4Prefix = 1; } > class XOP { bit hasXOP_Prefix = 1; } > class X86Inst opcod, Format f, ImmType i, dag outs, dag ins, > - string AsmStr, Domain d = GenericDomain> > + string AsmStr, > + InstrItinClass itin, > + Domain d = GenericDomain> > : Instruction { > let Namespace = "X86"; > > @@ -139,6 +141,8 @@ > // If this is a pseudo instruction, mark it isCodeGenOnly. > let isCodeGenOnly = !eq(!cast(f), "Pseudo"); > > + let Itinerary = itin; > + > // > // Attributes specific to X86 instructions... 
> // > @@ -189,51 +193,53 @@ > } > > class PseudoI pattern> > - : X86Inst<0, Pseudo, NoImm, oops, iops, ""> { > + : X86Inst<0, Pseudo, NoImm, oops, iops, "", NoItinerary> { > let Pattern = pattern; > } > > class I o, Format f, dag outs, dag ins, string asm, > - list pattern, Domain d = GenericDomain> > - : X86Inst { > + list pattern, InstrItinClass itin = IIC_DEFAULT, > + Domain d = GenericDomain> > + : X86Inst { > let Pattern = pattern; > let CodeSize = 3; > } > class Ii8 o, Format f, dag outs, dag ins, string asm, > - list pattern, Domain d = GenericDomain> > - : X86Inst { > + list pattern, InstrItinClass itin = IIC_DEFAULT, > + Domain d = GenericDomain> > + : X86Inst { > let Pattern = pattern; > let CodeSize = 3; > } > class Ii8PCRel o, Format f, dag outs, dag ins, string asm, > - list pattern> > - : X86Inst { > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : X86Inst { > let Pattern = pattern; > let CodeSize = 3; > } > class Ii16 o, Format f, dag outs, dag ins, string asm, > - list pattern> > - : X86Inst { > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : X86Inst { > let Pattern = pattern; > let CodeSize = 3; > } > class Ii32 o, Format f, dag outs, dag ins, string asm, > - list pattern> > - : X86Inst { > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : X86Inst { > let Pattern = pattern; > let CodeSize = 3; > } > > class Ii16PCRel o, Format f, dag outs, dag ins, string asm, > - list pattern> > - : X86Inst { > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : X86Inst { > let Pattern = pattern; > let CodeSize = 3; > } > > class Ii32PCRel o, Format f, dag outs, dag ins, string asm, > - list pattern> > - : X86Inst { > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : X86Inst { > let Pattern = pattern; > let CodeSize = 3; > } > @@ -244,8 +250,9 @@ > : I {} > > // FpI_ - Floating Point Pseudo Instruction template. Not Predicated. 
> -class FpI_ pattern> > - : X86Inst<0, Pseudo, NoImm, outs, ins, ""> { > +class FpI_ pattern, > + InstrItinClass itin = IIC_DEFAULT> > + : X86Inst<0, Pseudo, NoImm, outs, ins, "", itin> { > let FPForm = fp; > let Pattern = pattern; > } > @@ -257,20 +264,23 @@ > // Iseg32 - 16-bit segment selector, 32-bit offset > > class Iseg16 o, Format f, dag outs, dag ins, string asm, > - list pattern> : X86Inst { > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : X86Inst { > let Pattern = pattern; > let CodeSize = 3; > } > > class Iseg32 o, Format f, dag outs, dag ins, string asm, > - list pattern> : X86Inst { > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : X86Inst { > let Pattern = pattern; > let CodeSize = 3; > } > > // SI - SSE 1 & 2 scalar instructions > -class SI o, Format F, dag outs, dag ins, string asm, list pattern> > - : I { > +class SI o, Format F, dag outs, dag ins, string asm, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I { > let Predicates = !if(hasVEXPrefix /* VEX */, [HasAVX], > !if(!eq(Prefix, 12 /* XS */), [HasSSE1], [HasSSE2])); > > @@ -280,8 +290,8 @@ > > // SIi8 - SSE 1 & 2 scalar instructions > class SIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8 { > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8 { > let Predicates = !if(hasVEXPrefix /* VEX */, [HasAVX], > !if(!eq(Prefix, 12 /* XS */), [HasSSE1], [HasSSE2])); > > @@ -291,8 +301,8 @@ > > // PI - SSE 1 & 2 packed instructions > class PI o, Format F, dag outs, dag ins, string asm, list pattern, > - Domain d> > - : I { > + InstrItinClass itin, Domain d> > + : I { > let Predicates = !if(hasVEXPrefix /* VEX */, [HasAVX], > !if(hasOpSizePrefix /* OpSize */, [HasSSE2], [HasSSE1])); > > @@ -302,8 +312,8 @@ > > // PIi8 - SSE 1 & 2 packed instructions with immediate > class PIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern, Domain d> > - : Ii8 { > + list pattern, InstrItinClass itin, Domain d> > + : Ii8 { > let Predicates = 
!if(hasVEX_4VPrefix /* VEX */, [HasAVX], > !if(hasOpSizePrefix /* OpSize */, [HasSSE2], [HasSSE1])); > > @@ -319,25 +329,27 @@ > // VSSI - SSE1 instructions with XS prefix in AVX form. > // VPSI - SSE1 instructions with TB prefix in AVX form. > > -class SSI o, Format F, dag outs, dag ins, string asm, list pattern> > - : I, XS, Requires<[HasSSE1]>; > +class SSI o, Format F, dag outs, dag ins, string asm, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, XS, Requires<[HasSSE1]>; > class SSIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, XS, Requires<[HasSSE1]>; > -class PSI o, Format F, dag outs, dag ins, string asm, list pattern> > - : I, TB, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, XS, Requires<[HasSSE1]>; > +class PSI o, Format F, dag outs, dag ins, string asm, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, TB, > Requires<[HasSSE1]>; > class PSIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, TB, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TB, > Requires<[HasSSE1]>; > class VSSI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, XS, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, XS, > Requires<[HasAVX]>; > class VPSI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, TB, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, TB, > Requires<[HasAVX]>; > > // SSE2 Instruction Templates: > @@ -350,28 +362,30 @@ > // VSDI - SSE2 instructions with XD prefix in AVX form. > // VPDI - SSE2 instructions with TB and OpSize prefixes in AVX form. 
> > -class SDI o, Format F, dag outs, dag ins, string asm, list pattern> > - : I, XD, Requires<[HasSSE2]>; > +class SDI o, Format F, dag outs, dag ins, string asm, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, XD, Requires<[HasSSE2]>; > class SDIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, XD, Requires<[HasSSE2]>; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, XD, Requires<[HasSSE2]>; > class SSDIi8 o, Format F, dag outs, dag ins, string asm, > list pattern> > : Ii8, XS, Requires<[HasSSE2]>; > -class PDI o, Format F, dag outs, dag ins, string asm, list pattern> > - : I, TB, OpSize, > +class PDI o, Format F, dag outs, dag ins, string asm, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, TB, OpSize, > Requires<[HasSSE2]>; > class PDIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, TB, OpSize, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TB, OpSize, > Requires<[HasSSE2]>; > class VSDI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, XD, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, XD, > Requires<[HasAVX]>; > class VPDI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, TB, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, TB, > OpSize, Requires<[HasAVX]>; > > // SSE3 Instruction Templates: > @@ -381,15 +395,16 @@ > // S3DI - SSE3 instructions with XD prefix. 
> > class S3SI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, XS, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, XS, > Requires<[HasSSE3]>; > class S3DI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, XD, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, XD, > Requires<[HasSSE3]>; > -class S3I o, Format F, dag outs, dag ins, string asm, list pattern> > - : I, TB, OpSize, > +class S3I o, Format F, dag outs, dag ins, string asm, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, TB, OpSize, > Requires<[HasSSE3]>; > > > @@ -403,12 +418,12 @@ > // classes. They need to be enabled even if AVX is enabled. > > class SS38I o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, T8, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, T8, > Requires<[HasSSSE3]>; > class SS3AI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, TA, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TA, > Requires<[HasSSSE3]>; > > // SSE4.1 Instruction Templates: > @@ -417,31 +432,31 @@ > // SS41AIi8 - SSE 4.1 instructions with TA prefix and ImmT == Imm8. > // > class SS48I o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, T8, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, T8, > Requires<[HasSSE41]>; > class SS4AIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, TA, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TA, > Requires<[HasSSE41]>; > > // SSE4.2 Instruction Templates: > // > // SS428I - SSE 4.2 instructions with T8 prefix. > class SS428I o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, T8, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, T8, > Requires<[HasSSE42]>; > > // SS42FI - SSE 4.2 instructions with T8XD prefix. 
> class SS42FI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, T8XD, Requires<[HasSSE42]>; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, T8XD, Requires<[HasSSE42]>; > > // SS42AI = SSE 4.2 instructions with TA prefix > class SS42AI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, TA, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TA, > Requires<[HasSSE42]>; > > // AVX Instruction Templates: > @@ -450,12 +465,12 @@ > // AVX8I - AVX instructions with T8 and OpSize prefix. > // AVXAIi8 - AVX instructions with TA, OpSize prefix and ImmT = Imm8. > class AVX8I o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, T8, OpSize, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, T8, OpSize, > Requires<[HasAVX]>; > class AVXAIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, TA, OpSize, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TA, OpSize, > Requires<[HasAVX]>; > > // AVX2 Instruction Templates: > @@ -464,12 +479,12 @@ > // AVX28I - AVX2 instructions with T8 and OpSize prefix. > // AVX2AIi8 - AVX2 instructions with TA, OpSize prefix and ImmT = Imm8. > class AVX28I o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, T8, OpSize, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, T8, OpSize, > Requires<[HasAVX2]>; > class AVX2AIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, TA, OpSize, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TA, OpSize, > Requires<[HasAVX2]>; > > // AES Instruction Templates: > @@ -477,87 +492,88 @@ > // AES8I > // These use the same encoding as the SSE4.2 T8 and TA encodings. 
> class AES8I o, Format F, dag outs, dag ins, string asm, > - listpattern> > - : I, T8, > + listpattern, InstrItinClass itin = IIC_DEFAULT> > + : I, T8, > Requires<[HasSSE2, HasAES]>; > > class AESAI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, TA, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TA, > Requires<[HasSSE2, HasAES]>; > > // CLMUL Instruction Templates > class CLMULIi8 o, Format F, dag outs, dag ins, string asm, > - listpattern> > - : Ii8, TA, > + listpattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TA, > OpSize, Requires<[HasSSE2, HasCLMUL]>; > > class AVXCLMULIi8 o, Format F, dag outs, dag ins, string asm, > - listpattern> > - : Ii8, TA, > + listpattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TA, > OpSize, VEX_4V, Requires<[HasAVX, HasCLMUL]>; > > // FMA3 Instruction Templates > class FMA3 o, Format F, dag outs, dag ins, string asm, > - listpattern> > - : I, T8, > + listpattern, InstrItinClass itin = IIC_DEFAULT> > + : I, T8, > OpSize, VEX_4V, Requires<[HasFMA3]>; > > // FMA4 Instruction Templates > class FMA4 o, Format F, dag outs, dag ins, string asm, > - listpattern> > - : Ii8, TA, > + listpattern, InstrItinClass itin = IIC_DEFAULT> > + : I, TA, > OpSize, VEX_4V, VEX_I8IMM, Requires<[HasFMA4]>; > > // XOP 2, 3 and 4 Operand Instruction Template > class IXOP o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, > XOP, XOP9, Requires<[HasXOP]>; > > // XOP 2, 3 and 4 Operand Instruction Templates with imm byte > class IXOPi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, > XOP, XOP8, Requires<[HasXOP]>; > > // XOP 5 operand instruction (VEX encoding!) 
> class IXOP5 o, Format F, dag outs, dag ins, string asm, > - listpattern> > - : Ii8, TA, > + listpattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TA, > OpSize, VEX_4V, VEX_I8IMM, Requires<[HasXOP]>; > > // X86-64 Instruction templates... > // > > -class RI o, Format F, dag outs, dag ins, string asm, list pattern> > - : I, REX_W; > +class RI o, Format F, dag outs, dag ins, string asm, > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, REX_W; > class RIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, REX_W; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, REX_W; > class RIi32 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii32, REX_W; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii32, REX_W; > > class RIi64 o, Format f, dag outs, dag ins, string asm, > - list pattern> > - : X86Inst, REX_W { > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : X86Inst, REX_W { > let Pattern = pattern; > let CodeSize = 3; > } > > class RSSI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : SSI, REX_W; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : SSI, REX_W; > class RSDI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : SDI, REX_W; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : SDI, REX_W; > class RPDI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : PDI, REX_W; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : PDI, REX_W; > class VRPDI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : VPDI, VEX_W; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : VPDI, VEX_W; > > // MMX Instruction templates > // > @@ -570,23 +586,23 @@ > // MMXID - MMX instructions with XD prefix. > // MMXIS - MMX instructions with XS prefix. 
> class MMXI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, TB, Requires<[HasMMX]>; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, TB, Requires<[HasMMX]>; > class MMXI64 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, TB, Requires<[HasMMX,In64BitMode]>; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, TB, Requires<[HasMMX,In64BitMode]>; > class MMXRI o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, TB, REX_W, Requires<[HasMMX]>; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, TB, REX_W, Requires<[HasMMX]>; > class MMX2I o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : I, TB, OpSize, Requires<[HasMMX]>; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : I, TB, OpSize, Requires<[HasMMX]>; > class MMXIi8 o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, TB, Requires<[HasMMX]>; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, TB, Requires<[HasMMX]>; > class MMXID o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, XD, Requires<[HasMMX]>; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, XD, Requires<[HasMMX]>; > class MMXIS o, Format F, dag outs, dag ins, string asm, > - list pattern> > - : Ii8, XS, Requires<[HasMMX]>; > + list pattern, InstrItinClass itin = IIC_DEFAULT> > + : Ii8, XS, Requires<[HasMMX]>; > > Modified: llvm/trunk/lib/Target/X86/X86InstrMMX.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrMMX.td?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86InstrMMX.td (original) > +++ llvm/trunk/lib/Target/X86/X86InstrMMX.td Wed Feb 1 17:20:51 2012 > @@ -105,19 +105,23 @@ > Intrinsic Int, X86MemOperand x86memop, PatFrag ld_frag, > string asm, Domain d> { > def irr : PI - [(set DstRC:$dst, (Int SrcRC:$src))], d>; > + [(set 
DstRC:$dst, (Int SrcRC:$src))], > + IIC_DEFAULT, d>; > def irm : PI - [(set DstRC:$dst, (Int (ld_frag addr:$src)))], d>; > + [(set DstRC:$dst, (Int (ld_frag addr:$src)))], > + IIC_DEFAULT, d>; > } > > multiclass sse12_cvt_pint_3addr opc, RegisterClass SrcRC, > RegisterClass DstRC, Intrinsic Int, X86MemOperand x86memop, > PatFrag ld_frag, string asm, Domain d> { > def irr : PI - asm, [(set DstRC:$dst, (Int DstRC:$src1, SrcRC:$src2))], d>; > + asm, [(set DstRC:$dst, (Int DstRC:$src1, SrcRC:$src2))], > + IIC_DEFAULT, d>; > def irm : PI (ins DstRC:$src1, x86memop:$src2), asm, > - [(set DstRC:$dst, (Int DstRC:$src1, (ld_frag addr:$src2)))], d>; > + [(set DstRC:$dst, (Int DstRC:$src1, (ld_frag addr:$src2)))], > + IIC_DEFAULT, d>; > } > > //===----------------------------------------------------------------------===// > > Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original) > +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Wed Feb 1 17:20:51 2012 > @@ -67,13 +67,14 @@ > !if(Is2Addr, > !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, $src2}"), > !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), > - [(set RC:$dst, (vt (OpNode RC:$src1, RC:$src2)))], d>; > + [(set RC:$dst, (vt (OpNode RC:$src1, RC:$src2)))], IIC_DEFAULT, d>; > let mayLoad = 1 in > def rm : PI !if(Is2Addr, > !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, $src2}"), > !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), > - [(set RC:$dst, (OpNode RC:$src1, (mem_frag addr:$src2)))], d>; > + [(set RC:$dst, (OpNode RC:$src1, (mem_frag addr:$src2)))], > + IIC_DEFAULT, d>; > } > > /// sse12_fp_packed_logical_rm - SSE 1 & 2 packed instructions class > @@ -87,12 +88,12 @@ > !if(Is2Addr, > !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, 
$src2}"), > !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), > - pat_rr, d>; > + pat_rr, IIC_DEFAULT, d>; > def rm : PI !if(Is2Addr, > !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, $src2}"), > !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), > - pat_rm, d>; > + pat_rm, IIC_DEFAULT, d>; > } > > /// sse12_fp_packed_int - SSE 1 & 2 packed instructions intrinsics class > @@ -106,14 +107,14 @@ > !strconcat(asm, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), > [(set RC:$dst, (!cast( > !strconcat("int_x86_", SSEVer, "_", OpcodeStr, FPSizeStr)) > - RC:$src1, RC:$src2))], d>; > + RC:$src1, RC:$src2))], IIC_DEFAULT, d>; > def rm_Int : PI !if(Is2Addr, > !strconcat(asm, "\t{$src2, $dst|$dst, $src2}"), > !strconcat(asm, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), > [(set RC:$dst, (!cast( > !strconcat("int_x86_", SSEVer, "_", OpcodeStr, FPSizeStr)) > - RC:$src1, (mem_frag addr:$src2)))], d>; > + RC:$src1, (mem_frag addr:$src2)))], IIC_DEFAULT, d>; > } > > //===----------------------------------------------------------------------===// > @@ -737,11 +738,11 @@ > bit IsReMaterializable = 1> { > let neverHasSideEffects = 1 in > def rr : PI - !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], d>; > + !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], IIC_DEFAULT, d>; > let canFoldAsLoad = 1, isReMaterializable = IsReMaterializable in > def rm : PI !strconcat(asm, "\t{$src, $dst|$dst, $src}"), > - [(set RC:$dst, (ld_frag addr:$src))], d>; > + [(set RC:$dst, (ld_frag addr:$src))], IIC_DEFAULT, d>; > } > > defm VMOVAPS : sse12_mov_packed<0x28, VR128, f128mem, alignedloadv4f32, > @@ -1003,14 +1004,14 @@ > [(set RC:$dst, > (mov_frag RC:$src1, > (bc_v4f32 (v2f64 (scalar_to_vector (loadf64 addr:$src2))))))], > - SSEPackedSingle>, TB; > + IIC_DEFAULT, SSEPackedSingle>, TB; > > def PDrm : PI (outs RC:$dst), (ins RC:$src1, f64mem:$src2), > !strconcat(base_opc, "d", asm_opr), > [(set RC:$dst, (v2f64 (mov_frag RC:$src1, > (scalar_to_vector (loadf64 
addr:$src2)))))], > - SSEPackedDouble>, TB, OpSize; > + IIC_DEFAULT, SSEPackedDouble>, TB, OpSize; > } > > let AddedComplexity = 20 in { > @@ -1413,9 +1414,11 @@ > SDNode OpNode, X86MemOperand x86memop, PatFrag ld_frag, > string asm, Domain d> { > def rr : PI - [(set DstRC:$dst, (OpNode SrcRC:$src))], d>; > + [(set DstRC:$dst, (OpNode SrcRC:$src))], > + IIC_DEFAULT, d>; > def rm : PI - [(set DstRC:$dst, (OpNode (ld_frag addr:$src)))], d>; > + [(set DstRC:$dst, (OpNode (ld_frag addr:$src)))], > + IIC_DEFAULT, d>; > } > > multiclass sse12_vcvt_avx opc, RegisterClass SrcRC, RegisterClass DstRC, > @@ -2124,11 +2127,13 @@ > PatFrag ld_frag, string OpcodeStr, Domain d> { > def rr: PI !strconcat(OpcodeStr, "\t{$src2, $src1|$src1, $src2}"), > - [(set EFLAGS, (OpNode (vt RC:$src1), RC:$src2))], d>; > + [(set EFLAGS, (OpNode (vt RC:$src1), RC:$src2))], > + IIC_DEFAULT, d>; > def rm: PI !strconcat(OpcodeStr, "\t{$src2, $src1|$src1, $src2}"), > [(set EFLAGS, (OpNode (vt RC:$src1), > - (ld_frag addr:$src2)))], d>; > + (ld_frag addr:$src2)))], > + IIC_DEFAULT, d>; > } > > let Defs = [EFLAGS] in { > @@ -2185,19 +2190,21 @@ > let isAsmParserOnly = 1 in { > def rri : PIi8<0xC2, MRMSrcReg, > (outs RC:$dst), (ins RC:$src1, RC:$src2, SSECC:$cc), asm, > - [(set RC:$dst, (Int RC:$src1, RC:$src2, imm:$cc))], d>; > + [(set RC:$dst, (Int RC:$src1, RC:$src2, imm:$cc))], > + IIC_DEFAULT, d>; > def rmi : PIi8<0xC2, MRMSrcMem, > (outs RC:$dst), (ins RC:$src1, x86memop:$src2, SSECC:$cc), asm, > - [(set RC:$dst, (Int RC:$src1, (memop addr:$src2), imm:$cc))], d>; > + [(set RC:$dst, (Int RC:$src1, (memop addr:$src2), imm:$cc))], > + IIC_DEFAULT, d>; > } > > // Accept explicit immediate argument form instead of comparison code. 
> def rri_alt : PIi8<0xC2, MRMSrcReg, > (outs RC:$dst), (ins RC:$src1, RC:$src2, i8imm:$cc), > - asm_alt, [], d>; > + asm_alt, [], IIC_DEFAULT, d>; > def rmi_alt : PIi8<0xC2, MRMSrcMem, > (outs RC:$dst), (ins RC:$src1, x86memop:$src2, i8imm:$cc), > - asm_alt, [], d>; > + asm_alt, [], IIC_DEFAULT, d>; > } > > defm VCMPPS : sse12_cmp_packed @@ -2272,12 +2279,14 @@ > def rmi : PIi8<0xC6, MRMSrcMem, (outs RC:$dst), > (ins RC:$src1, x86memop:$src2, i8imm:$src3), asm, > [(set RC:$dst, (vt (shufp:$src3 > - RC:$src1, (mem_frag addr:$src2))))], d>; > + RC:$src1, (mem_frag addr:$src2))))], > + IIC_DEFAULT, d>; > let isConvertibleToThreeAddress = IsConvertibleToThreeAddress in > def rri : PIi8<0xC6, MRMSrcReg, (outs RC:$dst), > (ins RC:$src1, RC:$src2, i8imm:$src3), asm, > [(set RC:$dst, > - (vt (shufp:$src3 RC:$src1, RC:$src2)))], d>; > + (vt (shufp:$src3 RC:$src1, RC:$src2)))], > + IIC_DEFAULT, d>; > } > > defm VSHUFPS : sse12_shuffle @@ -2448,12 +2457,14 @@ > def rr : PI (outs RC:$dst), (ins RC:$src1, RC:$src2), > asm, [(set RC:$dst, > - (vt (OpNode RC:$src1, RC:$src2)))], d>; > + (vt (OpNode RC:$src1, RC:$src2)))], > + IIC_DEFAULT, d>; > def rm : PI (outs RC:$dst), (ins RC:$src1, x86memop:$src2), > asm, [(set RC:$dst, > (vt (OpNode RC:$src1, > - (mem_frag addr:$src2))))], d>; > + (mem_frag addr:$src2))))], > + IIC_DEFAULT, d>; > } > > let AddedComplexity = 10 in { > @@ -2589,9 +2600,10 @@ > Domain d> { > def rr32 : PI<0x50, MRMSrcReg, (outs GR32:$dst), (ins RC:$src), > !strconcat(asm, "\t{$src, $dst|$dst, $src}"), > - [(set GR32:$dst, (Int RC:$src))], d>; > + [(set GR32:$dst, (Int RC:$src))], IIC_DEFAULT, d>; > def rr64 : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins RC:$src), > - !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], d>, REX_W; > + !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], > + IIC_DEFAULT, d>, REX_W; > } > > let Predicates = [HasAVX] in { > @@ -2621,14 +2633,18 @@ > > // Assembler Only > def VMOVMSKPSr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins 
VR128:$src),
> - "movmskps\t{$src, $dst|$dst, $src}", [], SSEPackedSingle>, TB, VEX;
> + "movmskps\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT,
> + SSEPackedSingle>, TB, VEX;
> def VMOVMSKPDr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR128:$src),
> - "movmskpd\t{$src, $dst|$dst, $src}", [], SSEPackedDouble>, TB,
> + "movmskpd\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT,
> + SSEPackedDouble>, TB,
> OpSize, VEX;
> def VMOVMSKPSYr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR256:$src),
> - "movmskps\t{$src, $dst|$dst, $src}", [], SSEPackedSingle>, TB, VEX;
> + "movmskps\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT,
> + SSEPackedSingle>, TB, VEX;
> def VMOVMSKPDYr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR256:$src),
> - "movmskpd\t{$src, $dst|$dst, $src}", [], SSEPackedDouble>, TB,
> + "movmskpd\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT,
> + SSEPackedDouble>, TB,
> OpSize, VEX;
> }
>
> @@ -6395,7 +6411,7 @@
> !strconcat(OpcodeStr,
> "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"),
> [(set RC:$dst, (IntId RC:$src1, RC:$src2, RC:$src3))],
> - SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM;
> + IIC_DEFAULT, SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM;
>
> def rm : Ii8
> (ins RC:$src1, x86memop:$src2, RC:$src3),
> @@ -6404,7 +6420,7 @@
> [(set RC:$dst,
> (IntId RC:$src1, (bitconvert (mem_frag addr:$src2)),
> RC:$src3))],
> - SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM;
> + IIC_DEFAULT, SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM;
> }
>
> let Predicates = [HasAVX] in {
>
> Modified: llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td?rev=149558&r1=149557&r2=149558&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td (original)
> +++ llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td Wed Feb 1 17:20:51 2012
> @@ -19,44 +19,46 @@
> let Uses = [CL] in {
> def SHL8rCL :
I<0xD2, MRM4r, (outs GR8 :$dst), (ins GR8 :$src1),
> "shl{b}\t{%cl, $dst|$dst, CL}",
> - [(set GR8:$dst, (shl GR8:$src1, CL))]>;
> + [(set GR8:$dst, (shl GR8:$src1, CL))], IIC_SR>;
> def SHL16rCL : I<0xD3, MRM4r, (outs GR16:$dst), (ins GR16:$src1),
> "shl{w}\t{%cl, $dst|$dst, CL}",
> - [(set GR16:$dst, (shl GR16:$src1, CL))]>, OpSize;
> + [(set GR16:$dst, (shl GR16:$src1, CL))], IIC_SR>, OpSize;
> def SHL32rCL : I<0xD3, MRM4r, (outs GR32:$dst), (ins GR32:$src1),
> "shl{l}\t{%cl, $dst|$dst, CL}",
> - [(set GR32:$dst, (shl GR32:$src1, CL))]>;
> + [(set GR32:$dst, (shl GR32:$src1, CL))], IIC_SR>;
> def SHL64rCL : RI<0xD3, MRM4r, (outs GR64:$dst), (ins GR64:$src1),
> "shl{q}\t{%cl, $dst|$dst, CL}",
> - [(set GR64:$dst, (shl GR64:$src1, CL))]>;
> + [(set GR64:$dst, (shl GR64:$src1, CL))], IIC_SR>;
> } // Uses = [CL]
>
> def SHL8ri : Ii8<0xC0, MRM4r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2),
> "shl{b}\t{$src2, $dst|$dst, $src2}",
> - [(set GR8:$dst, (shl GR8:$src1, (i8 imm:$src2)))]>;
> + [(set GR8:$dst, (shl GR8:$src1, (i8 imm:$src2)))], IIC_SR>;
>
> let isConvertibleToThreeAddress = 1 in { // Can transform into LEA.
> def SHL16ri : Ii8<0xC1, MRM4r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2),
> "shl{w}\t{$src2, $dst|$dst, $src2}",
> - [(set GR16:$dst, (shl GR16:$src1, (i8 imm:$src2)))]>, OpSize;
> + [(set GR16:$dst, (shl GR16:$src1, (i8 imm:$src2)))], IIC_SR>,
> + OpSize;
> def SHL32ri : Ii8<0xC1, MRM4r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2),
> "shl{l}\t{$src2, $dst|$dst, $src2}",
> - [(set GR32:$dst, (shl GR32:$src1, (i8 imm:$src2)))]>;
> + [(set GR32:$dst, (shl GR32:$src1, (i8 imm:$src2)))], IIC_SR>;
> def SHL64ri : RIi8<0xC1, MRM4r, (outs GR64:$dst),
> (ins GR64:$src1, i8imm:$src2),
> "shl{q}\t{$src2, $dst|$dst, $src2}",
> - [(set GR64:$dst, (shl GR64:$src1, (i8 imm:$src2)))]>;
> + [(set GR64:$dst, (shl GR64:$src1, (i8 imm:$src2)))],
> + IIC_SR>;
>
> // NOTE: We don't include patterns for shifts of a register by one, because
> // 'add reg,reg' is cheaper (and we have a Pat pattern for shift-by-one).
> def SHL8r1 : I<0xD0, MRM4r, (outs GR8:$dst), (ins GR8:$src1),
> - "shl{b}\t$dst", []>;
> + "shl{b}\t$dst", [], IIC_SR>;
> def SHL16r1 : I<0xD1, MRM4r, (outs GR16:$dst), (ins GR16:$src1),
> - "shl{w}\t$dst", []>, OpSize;
> + "shl{w}\t$dst", [], IIC_SR>, OpSize;
> def SHL32r1 : I<0xD1, MRM4r, (outs GR32:$dst), (ins GR32:$src1),
> - "shl{l}\t$dst", []>;
> + "shl{l}\t$dst", [], IIC_SR>;
> def SHL64r1 : RI<0xD1, MRM4r, (outs GR64:$dst), (ins GR64:$src1),
> - "shl{q}\t$dst", []>;
> + "shl{q}\t$dst", [], IIC_SR>;
> } // isConvertibleToThreeAddress = 1
> } // Constraints = "$src = $dst"
>
> @@ -66,223 +68,266 @@
> let Uses = [CL] in {
> def SHL8mCL : I<0xD2, MRM4m, (outs), (ins i8mem :$dst),
> "shl{b}\t{%cl, $dst|$dst, CL}",
> - [(store (shl (loadi8 addr:$dst), CL), addr:$dst)]>;
> + [(store (shl (loadi8 addr:$dst), CL), addr:$dst)], IIC_SR>;
> def SHL16mCL : I<0xD3, MRM4m, (outs), (ins i16mem:$dst),
> "shl{w}\t{%cl, $dst|$dst, CL}",
> - [(store (shl (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize;
> + [(store (shl (loadi16 addr:$dst), CL), addr:$dst)], IIC_SR>,
> +
OpSize;
> def SHL32mCL : I<0xD3, MRM4m, (outs), (ins i32mem:$dst),
> "shl{l}\t{%cl, $dst|$dst, CL}",
> - [(store (shl (loadi32 addr:$dst), CL), addr:$dst)]>;
> + [(store (shl (loadi32 addr:$dst), CL), addr:$dst)], IIC_SR>;
> def SHL64mCL : RI<0xD3, MRM4m, (outs), (ins i64mem:$dst),
> "shl{q}\t{%cl, $dst|$dst, CL}",
> - [(store (shl (loadi64 addr:$dst), CL), addr:$dst)]>;
> + [(store (shl (loadi64 addr:$dst), CL), addr:$dst)], IIC_SR>;
> }
> def SHL8mi : Ii8<0xC0, MRM4m, (outs), (ins i8mem :$dst, i8imm:$src),
> "shl{b}\t{$src, $dst|$dst, $src}",
> - [(store (shl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
> + [(store (shl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>;
> def SHL16mi : Ii8<0xC1, MRM4m, (outs), (ins i16mem:$dst, i8imm:$src),
> "shl{w}\t{$src, $dst|$dst, $src}",
> - [(store (shl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>,
> + [(store (shl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>,
> OpSize;
> def SHL32mi : Ii8<0xC1, MRM4m, (outs), (ins i32mem:$dst, i8imm:$src),
> "shl{l}\t{$src, $dst|$dst, $src}",
> - [(store (shl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
> + [(store (shl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>;
> def SHL64mi : RIi8<0xC1, MRM4m, (outs), (ins i64mem:$dst, i8imm:$src),
> "shl{q}\t{$src, $dst|$dst, $src}",
> - [(store (shl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
> + [(store (shl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>;
>
> // Shift by 1
> def SHL8m1 : I<0xD0, MRM4m, (outs), (ins i8mem :$dst),
> "shl{b}\t$dst",
> - [(store (shl (loadi8 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (shl (loadi8 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
> def SHL16m1 : I<0xD1, MRM4m, (outs), (ins i16mem:$dst),
> "shl{w}\t$dst",
> - [(store (shl (loadi16 addr:$dst), (i8 1)), addr:$dst)]>,
> + [(store (shl (loadi16 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>,
> OpSize;
> def SHL32m1 : I<0xD1, MRM4m, (outs), (ins i32mem:$dst),
> "shl{l}\t$dst",
> - [(store (shl (loadi32 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (shl (loadi32 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
> def SHL64m1 : RI<0xD1, MRM4m, (outs), (ins i64mem:$dst),
> "shl{q}\t$dst",
> - [(store (shl (loadi64 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (shl (loadi64 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
>
> let Constraints = "$src1 = $dst" in {
> let Uses = [CL] in {
> def SHR8rCL : I<0xD2, MRM5r, (outs GR8 :$dst), (ins GR8 :$src1),
> "shr{b}\t{%cl, $dst|$dst, CL}",
> - [(set GR8:$dst, (srl GR8:$src1, CL))]>;
> + [(set GR8:$dst, (srl GR8:$src1, CL))], IIC_SR>;
> def SHR16rCL : I<0xD3, MRM5r, (outs GR16:$dst), (ins GR16:$src1),
> "shr{w}\t{%cl, $dst|$dst, CL}",
> - [(set GR16:$dst, (srl GR16:$src1, CL))]>, OpSize;
> + [(set GR16:$dst, (srl GR16:$src1, CL))], IIC_SR>, OpSize;
> def SHR32rCL : I<0xD3, MRM5r, (outs GR32:$dst), (ins GR32:$src1),
> "shr{l}\t{%cl, $dst|$dst, CL}",
> - [(set GR32:$dst, (srl GR32:$src1, CL))]>;
> + [(set GR32:$dst, (srl GR32:$src1, CL))], IIC_SR>;
> def SHR64rCL : RI<0xD3, MRM5r, (outs GR64:$dst), (ins GR64:$src1),
> "shr{q}\t{%cl, $dst|$dst, CL}",
> - [(set GR64:$dst, (srl GR64:$src1, CL))]>;
> + [(set GR64:$dst, (srl GR64:$src1, CL))], IIC_SR>;
> }
>
> def SHR8ri : Ii8<0xC0, MRM5r, (outs GR8:$dst), (ins GR8:$src1, i8imm:$src2),
> "shr{b}\t{$src2, $dst|$dst, $src2}",
> - [(set GR8:$dst, (srl GR8:$src1, (i8 imm:$src2)))]>;
> + [(set GR8:$dst, (srl GR8:$src1, (i8 imm:$src2)))], IIC_SR>;
> def SHR16ri : Ii8<0xC1, MRM5r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2),
> "shr{w}\t{$src2, $dst|$dst, $src2}",
> - [(set GR16:$dst, (srl GR16:$src1, (i8 imm:$src2)))]>, OpSize;
> + [(set GR16:$dst, (srl GR16:$src1, (i8 imm:$src2)))],
> + IIC_SR>, OpSize;
> def SHR32ri : Ii8<0xC1, MRM5r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2),
> "shr{l}\t{$src2, $dst|$dst, $src2}",
> - [(set GR32:$dst, (srl GR32:$src1, (i8 imm:$src2)))]>;
> + [(set GR32:$dst, (srl GR32:$src1, (i8 imm:$src2)))],
> + IIC_SR>;
> def
SHR64ri : RIi8<0xC1, MRM5r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$src2),
> "shr{q}\t{$src2, $dst|$dst, $src2}",
> - [(set GR64:$dst, (srl GR64:$src1, (i8 imm:$src2)))]>;
> + [(set GR64:$dst, (srl GR64:$src1, (i8 imm:$src2)))], IIC_SR>;
>
> // Shift right by 1
> def SHR8r1 : I<0xD0, MRM5r, (outs GR8:$dst), (ins GR8:$src1),
> "shr{b}\t$dst",
> - [(set GR8:$dst, (srl GR8:$src1, (i8 1)))]>;
> + [(set GR8:$dst, (srl GR8:$src1, (i8 1)))], IIC_SR>;
> def SHR16r1 : I<0xD1, MRM5r, (outs GR16:$dst), (ins GR16:$src1),
> "shr{w}\t$dst",
> - [(set GR16:$dst, (srl GR16:$src1, (i8 1)))]>, OpSize;
> + [(set GR16:$dst, (srl GR16:$src1, (i8 1)))], IIC_SR>, OpSize;
> def SHR32r1 : I<0xD1, MRM5r, (outs GR32:$dst), (ins GR32:$src1),
> "shr{l}\t$dst",
> - [(set GR32:$dst, (srl GR32:$src1, (i8 1)))]>;
> + [(set GR32:$dst, (srl GR32:$src1, (i8 1)))], IIC_SR>;
> def SHR64r1 : RI<0xD1, MRM5r, (outs GR64:$dst), (ins GR64:$src1),
> "shr{q}\t$dst",
> - [(set GR64:$dst, (srl GR64:$src1, (i8 1)))]>;
> + [(set GR64:$dst, (srl GR64:$src1, (i8 1)))], IIC_SR>;
> } // Constraints = "$src = $dst"
>
>
> let Uses = [CL] in {
> def SHR8mCL : I<0xD2, MRM5m, (outs), (ins i8mem :$dst),
> "shr{b}\t{%cl, $dst|$dst, CL}",
> - [(store (srl (loadi8 addr:$dst), CL), addr:$dst)]>;
> + [(store (srl (loadi8 addr:$dst), CL), addr:$dst)], IIC_SR>;
> def SHR16mCL : I<0xD3, MRM5m, (outs), (ins i16mem:$dst),
> "shr{w}\t{%cl, $dst|$dst, CL}",
> - [(store (srl (loadi16 addr:$dst), CL), addr:$dst)]>,
> + [(store (srl (loadi16 addr:$dst), CL), addr:$dst)], IIC_SR>,
> OpSize;
> def SHR32mCL : I<0xD3, MRM5m, (outs), (ins i32mem:$dst),
> "shr{l}\t{%cl, $dst|$dst, CL}",
> - [(store (srl (loadi32 addr:$dst), CL), addr:$dst)]>;
> + [(store (srl (loadi32 addr:$dst), CL), addr:$dst)], IIC_SR>;
> def SHR64mCL : RI<0xD3, MRM5m, (outs), (ins i64mem:$dst),
> "shr{q}\t{%cl, $dst|$dst, CL}",
> - [(store (srl (loadi64 addr:$dst), CL), addr:$dst)]>;
> + [(store (srl (loadi64 addr:$dst), CL), addr:$dst)], IIC_SR>;
> }
> def SHR8mi :
Ii8<0xC0, MRM5m, (outs), (ins i8mem :$dst, i8imm:$src),
> "shr{b}\t{$src, $dst|$dst, $src}",
> - [(store (srl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
> + [(store (srl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>;
> def SHR16mi : Ii8<0xC1, MRM5m, (outs), (ins i16mem:$dst, i8imm:$src),
> "shr{w}\t{$src, $dst|$dst, $src}",
> - [(store (srl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>,
> + [(store (srl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>,
> OpSize;
> def SHR32mi : Ii8<0xC1, MRM5m, (outs), (ins i32mem:$dst, i8imm:$src),
> "shr{l}\t{$src, $dst|$dst, $src}",
> - [(store (srl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
> + [(store (srl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>;
> def SHR64mi : RIi8<0xC1, MRM5m, (outs), (ins i64mem:$dst, i8imm:$src),
> "shr{q}\t{$src, $dst|$dst, $src}",
> - [(store (srl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
> + [(store (srl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>;
>
> // Shift by 1
> def SHR8m1 : I<0xD0, MRM5m, (outs), (ins i8mem :$dst),
> "shr{b}\t$dst",
> - [(store (srl (loadi8 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (srl (loadi8 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
> def SHR16m1 : I<0xD1, MRM5m, (outs), (ins i16mem:$dst),
> "shr{w}\t$dst",
> - [(store (srl (loadi16 addr:$dst), (i8 1)), addr:$dst)]>,OpSize;
> + [(store (srl (loadi16 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>,OpSize;
> def SHR32m1 : I<0xD1, MRM5m, (outs), (ins i32mem:$dst),
> "shr{l}\t$dst",
> - [(store (srl (loadi32 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (srl (loadi32 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
> def SHR64m1 : RI<0xD1, MRM5m, (outs), (ins i64mem:$dst),
> "shr{q}\t$dst",
> - [(store (srl (loadi64 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (srl (loadi64 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
>
> let Constraints = "$src1 = $dst" in {
> let Uses = [CL] in {
> def SAR8rCL : I<0xD2, MRM7r, (outs GR8
:$dst), (ins GR8 :$src1),
> "sar{b}\t{%cl, $dst|$dst, CL}",
> - [(set GR8:$dst, (sra GR8:$src1, CL))]>;
> + [(set GR8:$dst, (sra GR8:$src1, CL))],
> + IIC_SR>;
> def SAR16rCL : I<0xD3, MRM7r, (outs GR16:$dst), (ins GR16:$src1),
> "sar{w}\t{%cl, $dst|$dst, CL}",
> - [(set GR16:$dst, (sra GR16:$src1, CL))]>, OpSize;
> + [(set GR16:$dst, (sra GR16:$src1, CL))],
> + IIC_SR>, OpSize;
> def SAR32rCL : I<0xD3, MRM7r, (outs GR32:$dst), (ins GR32:$src1),
> "sar{l}\t{%cl, $dst|$dst, CL}",
> - [(set GR32:$dst, (sra GR32:$src1, CL))]>;
> + [(set GR32:$dst, (sra GR32:$src1, CL))],
> + IIC_SR>;
> def SAR64rCL : RI<0xD3, MRM7r, (outs GR64:$dst), (ins GR64:$src1),
> "sar{q}\t{%cl, $dst|$dst, CL}",
> - [(set GR64:$dst, (sra GR64:$src1, CL))]>;
> + [(set GR64:$dst, (sra GR64:$src1, CL))],
> + IIC_SR>;
> }
>
> def SAR8ri : Ii8<0xC0, MRM7r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2),
> "sar{b}\t{$src2, $dst|$dst, $src2}",
> - [(set GR8:$dst, (sra GR8:$src1, (i8 imm:$src2)))]>;
> + [(set GR8:$dst, (sra GR8:$src1, (i8 imm:$src2)))],
> + IIC_SR>;
> def SAR16ri : Ii8<0xC1, MRM7r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2),
> "sar{w}\t{$src2, $dst|$dst, $src2}",
> - [(set GR16:$dst, (sra GR16:$src1, (i8 imm:$src2)))]>,
> + [(set GR16:$dst, (sra GR16:$src1, (i8 imm:$src2)))],
> + IIC_SR>,
> OpSize;
> def SAR32ri : Ii8<0xC1, MRM7r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2),
> "sar{l}\t{$src2, $dst|$dst, $src2}",
> - [(set GR32:$dst, (sra GR32:$src1, (i8 imm:$src2)))]>;
> + [(set GR32:$dst, (sra GR32:$src1, (i8 imm:$src2)))],
> + IIC_SR>;
> def SAR64ri : RIi8<0xC1, MRM7r, (outs GR64:$dst),
> (ins GR64:$src1, i8imm:$src2),
> "sar{q}\t{$src2, $dst|$dst, $src2}",
> - [(set GR64:$dst, (sra GR64:$src1, (i8 imm:$src2)))]>;
> + [(set GR64:$dst, (sra GR64:$src1, (i8 imm:$src2)))],
> + IIC_SR>;
>
> // Shift by 1
> def SAR8r1 : I<0xD0, MRM7r, (outs GR8 :$dst), (ins GR8 :$src1),
> "sar{b}\t$dst",
> - [(set GR8:$dst, (sra GR8:$src1, (i8 1)))]>;
> + [(set GR8:$dst, (sra GR8:$src1, (i8
1)))],
> + IIC_SR>;
> def SAR16r1 : I<0xD1, MRM7r, (outs GR16:$dst), (ins GR16:$src1),
> "sar{w}\t$dst",
> - [(set GR16:$dst, (sra GR16:$src1, (i8 1)))]>, OpSize;
> + [(set GR16:$dst, (sra GR16:$src1, (i8 1)))],
> + IIC_SR>, OpSize;
> def SAR32r1 : I<0xD1, MRM7r, (outs GR32:$dst), (ins GR32:$src1),
> "sar{l}\t$dst",
> - [(set GR32:$dst, (sra GR32:$src1, (i8 1)))]>;
> + [(set GR32:$dst, (sra GR32:$src1, (i8 1)))],
> + IIC_SR>;
> def SAR64r1 : RI<0xD1, MRM7r, (outs GR64:$dst), (ins GR64:$src1),
> "sar{q}\t$dst",
> - [(set GR64:$dst, (sra GR64:$src1, (i8 1)))]>;
> + [(set GR64:$dst, (sra GR64:$src1, (i8 1)))],
> + IIC_SR>;
> } // Constraints = "$src = $dst"
>
>
> let Uses = [CL] in {
> def SAR8mCL : I<0xD2, MRM7m, (outs), (ins i8mem :$dst),
> "sar{b}\t{%cl, $dst|$dst, CL}",
> - [(store (sra (loadi8 addr:$dst), CL), addr:$dst)]>;
> + [(store (sra (loadi8 addr:$dst), CL), addr:$dst)],
> + IIC_SR>;
> def SAR16mCL : I<0xD3, MRM7m, (outs), (ins i16mem:$dst),
> "sar{w}\t{%cl, $dst|$dst, CL}",
> - [(store (sra (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize;
> + [(store (sra (loadi16 addr:$dst), CL), addr:$dst)],
> + IIC_SR>, OpSize;
> def SAR32mCL : I<0xD3, MRM7m, (outs), (ins i32mem:$dst),
> "sar{l}\t{%cl, $dst|$dst, CL}",
> - [(store (sra (loadi32 addr:$dst), CL), addr:$dst)]>;
> + [(store (sra (loadi32 addr:$dst), CL), addr:$dst)],
> + IIC_SR>;
> def SAR64mCL : RI<0xD3, MRM7m, (outs), (ins i64mem:$dst),
> "sar{q}\t{%cl, $dst|$dst, CL}",
> - [(store (sra (loadi64 addr:$dst), CL), addr:$dst)]>;
> + [(store (sra (loadi64 addr:$dst), CL), addr:$dst)],
> + IIC_SR>;
> }
> def SAR8mi : Ii8<0xC0, MRM7m, (outs), (ins i8mem :$dst, i8imm:$src),
> "sar{b}\t{$src, $dst|$dst, $src}",
> - [(store (sra (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
> + [(store (sra (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>;
> def SAR16mi : Ii8<0xC1, MRM7m, (outs), (ins i16mem:$dst, i8imm:$src),
> "sar{w}\t{$src, $dst|$dst, $src}",
> - [(store (sra (loadi16 addr:$dst), (i8
imm:$src)), addr:$dst)]>,
> + [(store (sra (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>,
> OpSize;
> def SAR32mi : Ii8<0xC1, MRM7m, (outs), (ins i32mem:$dst, i8imm:$src),
> "sar{l}\t{$src, $dst|$dst, $src}",
> - [(store (sra (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
> + [(store (sra (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>;
> def SAR64mi : RIi8<0xC1, MRM7m, (outs), (ins i64mem:$dst, i8imm:$src),
> "sar{q}\t{$src, $dst|$dst, $src}",
> - [(store (sra (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
> + [(store (sra (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)],
> + IIC_SR>;
>
> // Shift by 1
> def SAR8m1 : I<0xD0, MRM7m, (outs), (ins i8mem :$dst),
> "sar{b}\t$dst",
> - [(store (sra (loadi8 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (sra (loadi8 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
> def SAR16m1 : I<0xD1, MRM7m, (outs), (ins i16mem:$dst),
> "sar{w}\t$dst",
> - [(store (sra (loadi16 addr:$dst), (i8 1)), addr:$dst)]>,
> + [(store (sra (loadi16 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>,
> OpSize;
> def SAR32m1 : I<0xD1, MRM7m, (outs), (ins i32mem:$dst),
> "sar{l}\t$dst",
> - [(store (sra (loadi32 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (sra (loadi32 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
> def SAR64m1 : RI<0xD1, MRM7m, (outs), (ins i64mem:$dst),
> "sar{q}\t$dst",
> - [(store (sra (loadi64 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (sra (loadi64 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
>
> //===----------------------------------------------------------------------===//
> // Rotate instructions
> @@ -290,125 +335,125 @@
>
> let Constraints = "$src1 = $dst" in {
> def RCL8r1 : I<0xD0, MRM2r, (outs GR8:$dst), (ins GR8:$src1),
> - "rcl{b}\t$dst", []>;
> + "rcl{b}\t$dst", [], IIC_SR>;
> def RCL8ri : Ii8<0xC0, MRM2r, (outs GR8:$dst), (ins GR8:$src1, i8imm:$cnt),
> - "rcl{b}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcl{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
> let Uses = [CL] in
> def
RCL8rCL : I<0xD2, MRM2r, (outs GR8:$dst), (ins GR8:$src1),
> - "rcl{b}\t{%cl, $dst|$dst, CL}", []>;
> + "rcl{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>
> def RCL16r1 : I<0xD1, MRM2r, (outs GR16:$dst), (ins GR16:$src1),
> - "rcl{w}\t$dst", []>, OpSize;
> + "rcl{w}\t$dst", [], IIC_SR>, OpSize;
> def RCL16ri : Ii8<0xC1, MRM2r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$cnt),
> - "rcl{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
> + "rcl{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
> let Uses = [CL] in
> def RCL16rCL : I<0xD3, MRM2r, (outs GR16:$dst), (ins GR16:$src1),
> - "rcl{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
> + "rcl{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
>
> def RCL32r1 : I<0xD1, MRM2r, (outs GR32:$dst), (ins GR32:$src1),
> - "rcl{l}\t$dst", []>;
> + "rcl{l}\t$dst", [], IIC_SR>;
> def RCL32ri : Ii8<0xC1, MRM2r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$cnt),
> - "rcl{l}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcl{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
> let Uses = [CL] in
> def RCL32rCL : I<0xD3, MRM2r, (outs GR32:$dst), (ins GR32:$src1),
> - "rcl{l}\t{%cl, $dst|$dst, CL}", []>;
> + "rcl{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>
>
> def RCL64r1 : RI<0xD1, MRM2r, (outs GR64:$dst), (ins GR64:$src1),
> - "rcl{q}\t$dst", []>;
> + "rcl{q}\t$dst", [], IIC_SR>;
> def RCL64ri : RIi8<0xC1, MRM2r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$cnt),
> - "rcl{q}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcl{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
> let Uses = [CL] in
> def RCL64rCL : RI<0xD3, MRM2r, (outs GR64:$dst), (ins GR64:$src1),
> - "rcl{q}\t{%cl, $dst|$dst, CL}", []>;
> + "rcl{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>
>
> def RCR8r1 : I<0xD0, MRM3r, (outs GR8:$dst), (ins GR8:$src1),
> - "rcr{b}\t$dst", []>;
> + "rcr{b}\t$dst", [], IIC_SR>;
> def RCR8ri : Ii8<0xC0, MRM3r, (outs GR8:$dst), (ins GR8:$src1, i8imm:$cnt),
> - "rcr{b}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcr{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
> let Uses = [CL] in
> def RCR8rCL : I<0xD2,
MRM3r, (outs GR8:$dst), (ins GR8:$src1),
> - "rcr{b}\t{%cl, $dst|$dst, CL}", []>;
> + "rcr{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>
> def RCR16r1 : I<0xD1, MRM3r, (outs GR16:$dst), (ins GR16:$src1),
> - "rcr{w}\t$dst", []>, OpSize;
> + "rcr{w}\t$dst", [], IIC_SR>, OpSize;
> def RCR16ri : Ii8<0xC1, MRM3r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$cnt),
> - "rcr{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
> + "rcr{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
> let Uses = [CL] in
> def RCR16rCL : I<0xD3, MRM3r, (outs GR16:$dst), (ins GR16:$src1),
> - "rcr{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
> + "rcr{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
>
> def RCR32r1 : I<0xD1, MRM3r, (outs GR32:$dst), (ins GR32:$src1),
> - "rcr{l}\t$dst", []>;
> + "rcr{l}\t$dst", [], IIC_SR>;
> def RCR32ri : Ii8<0xC1, MRM3r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$cnt),
> - "rcr{l}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcr{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
> let Uses = [CL] in
> def RCR32rCL : I<0xD3, MRM3r, (outs GR32:$dst), (ins GR32:$src1),
> - "rcr{l}\t{%cl, $dst|$dst, CL}", []>;
> + "rcr{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>
> def RCR64r1 : RI<0xD1, MRM3r, (outs GR64:$dst), (ins GR64:$src1),
> - "rcr{q}\t$dst", []>;
> + "rcr{q}\t$dst", [], IIC_SR>;
> def RCR64ri : RIi8<0xC1, MRM3r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$cnt),
> - "rcr{q}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcr{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
> let Uses = [CL] in
> def RCR64rCL : RI<0xD3, MRM3r, (outs GR64:$dst), (ins GR64:$src1),
> - "rcr{q}\t{%cl, $dst|$dst, CL}", []>;
> + "rcr{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>
> } // Constraints = "$src = $dst"
>
> def RCL8m1 : I<0xD0, MRM2m, (outs), (ins i8mem:$dst),
> - "rcl{b}\t$dst", []>;
> + "rcl{b}\t$dst", [], IIC_SR>;
> def RCL8mi : Ii8<0xC0, MRM2m, (outs), (ins i8mem:$dst, i8imm:$cnt),
> - "rcl{b}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcl{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
> def RCL16m1 : I<0xD1, MRM2m, (outs), (ins
i16mem:$dst),
> - "rcl{w}\t$dst", []>, OpSize;
> + "rcl{w}\t$dst", [], IIC_SR>, OpSize;
> def RCL16mi : Ii8<0xC1, MRM2m, (outs), (ins i16mem:$dst, i8imm:$cnt),
> - "rcl{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
> + "rcl{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
> def RCL32m1 : I<0xD1, MRM2m, (outs), (ins i32mem:$dst),
> - "rcl{l}\t$dst", []>;
> + "rcl{l}\t$dst", [], IIC_SR>;
> def RCL32mi : Ii8<0xC1, MRM2m, (outs), (ins i32mem:$dst, i8imm:$cnt),
> - "rcl{l}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcl{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
> def RCL64m1 : RI<0xD1, MRM2m, (outs), (ins i64mem:$dst),
> - "rcl{q}\t$dst", []>;
> + "rcl{q}\t$dst", [], IIC_SR>;
> def RCL64mi : RIi8<0xC1, MRM2m, (outs), (ins i64mem:$dst, i8imm:$cnt),
> - "rcl{q}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcl{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>
> def RCR8m1 : I<0xD0, MRM3m, (outs), (ins i8mem:$dst),
> - "rcr{b}\t$dst", []>;
> + "rcr{b}\t$dst", [], IIC_SR>;
> def RCR8mi : Ii8<0xC0, MRM3m, (outs), (ins i8mem:$dst, i8imm:$cnt),
> - "rcr{b}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcr{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
> def RCR16m1 : I<0xD1, MRM3m, (outs), (ins i16mem:$dst),
> - "rcr{w}\t$dst", []>, OpSize;
> + "rcr{w}\t$dst", [], IIC_SR>, OpSize;
> def RCR16mi : Ii8<0xC1, MRM3m, (outs), (ins i16mem:$dst, i8imm:$cnt),
> - "rcr{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
> + "rcr{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
> def RCR32m1 : I<0xD1, MRM3m, (outs), (ins i32mem:$dst),
> - "rcr{l}\t$dst", []>;
> + "rcr{l}\t$dst", [], IIC_SR>;
> def RCR32mi : Ii8<0xC1, MRM3m, (outs), (ins i32mem:$dst, i8imm:$cnt),
> - "rcr{l}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcr{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
> def RCR64m1 : RI<0xD1, MRM3m, (outs), (ins i64mem:$dst),
> - "rcr{q}\t$dst", []>;
> + "rcr{q}\t$dst", [], IIC_SR>;
> def RCR64mi : RIi8<0xC1, MRM3m, (outs), (ins i64mem:$dst, i8imm:$cnt),
> - "rcr{q}\t{$cnt, $dst|$dst, $cnt}", []>;
> + "rcr{q}\t{$cnt, $dst|$dst, $cnt}",
[], IIC_SR>;
>
> let Uses = [CL] in {
> def RCL8mCL : I<0xD2, MRM2m, (outs), (ins i8mem:$dst),
> - "rcl{b}\t{%cl, $dst|$dst, CL}", []>;
> + "rcl{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
> def RCL16mCL : I<0xD3, MRM2m, (outs), (ins i16mem:$dst),
> - "rcl{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
> + "rcl{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
> def RCL32mCL : I<0xD3, MRM2m, (outs), (ins i32mem:$dst),
> - "rcl{l}\t{%cl, $dst|$dst, CL}", []>;
> + "rcl{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
> def RCL64mCL : RI<0xD3, MRM2m, (outs), (ins i64mem:$dst),
> - "rcl{q}\t{%cl, $dst|$dst, CL}", []>;
> + "rcl{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>
> def RCR8mCL : I<0xD2, MRM3m, (outs), (ins i8mem:$dst),
> - "rcr{b}\t{%cl, $dst|$dst, CL}", []>;
> + "rcr{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
> def RCR16mCL : I<0xD3, MRM3m, (outs), (ins i16mem:$dst),
> - "rcr{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
> + "rcr{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
> def RCR32mCL : I<0xD3, MRM3m, (outs), (ins i32mem:$dst),
> - "rcr{l}\t{%cl, $dst|$dst, CL}", []>;
> + "rcr{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
> def RCR64mCL : RI<0xD3, MRM3m, (outs), (ins i64mem:$dst),
> - "rcr{q}\t{%cl, $dst|$dst, CL}", []>;
> + "rcr{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
> }
>
> let Constraints = "$src1 = $dst" in {
> @@ -416,179 +461,217 @@
> let Uses = [CL] in {
> def ROL8rCL : I<0xD2, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1),
> "rol{b}\t{%cl, $dst|$dst, CL}",
> - [(set GR8:$dst, (rotl GR8:$src1, CL))]>;
> + [(set GR8:$dst, (rotl GR8:$src1, CL))], IIC_SR>;
> def ROL16rCL : I<0xD3, MRM0r, (outs GR16:$dst), (ins GR16:$src1),
> "rol{w}\t{%cl, $dst|$dst, CL}",
> - [(set GR16:$dst, (rotl GR16:$src1, CL))]>, OpSize;
> + [(set GR16:$dst, (rotl GR16:$src1, CL))], IIC_SR>, OpSize;
> def ROL32rCL : I<0xD3, MRM0r, (outs GR32:$dst), (ins GR32:$src1),
> "rol{l}\t{%cl, $dst|$dst, CL}",
> - [(set GR32:$dst, (rotl GR32:$src1, CL))]>;
> + [(set GR32:$dst, (rotl GR32:$src1, CL))], IIC_SR>;
> def ROL64rCL :
RI<0xD3, MRM0r, (outs GR64:$dst), (ins GR64:$src1),
> "rol{q}\t{%cl, $dst|$dst, CL}",
> - [(set GR64:$dst, (rotl GR64:$src1, CL))]>;
> + [(set GR64:$dst, (rotl GR64:$src1, CL))], IIC_SR>;
> }
>
> def ROL8ri : Ii8<0xC0, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2),
> "rol{b}\t{$src2, $dst|$dst, $src2}",
> - [(set GR8:$dst, (rotl GR8:$src1, (i8 imm:$src2)))]>;
> + [(set GR8:$dst, (rotl GR8:$src1, (i8 imm:$src2)))], IIC_SR>;
> def ROL16ri : Ii8<0xC1, MRM0r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2),
> "rol{w}\t{$src2, $dst|$dst, $src2}",
> - [(set GR16:$dst, (rotl GR16:$src1, (i8 imm:$src2)))]>,
> + [(set GR16:$dst, (rotl GR16:$src1, (i8 imm:$src2)))],
> + IIC_SR>,
> OpSize;
> def ROL32ri : Ii8<0xC1, MRM0r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2),
> "rol{l}\t{$src2, $dst|$dst, $src2}",
> - [(set GR32:$dst, (rotl GR32:$src1, (i8 imm:$src2)))]>;
> + [(set GR32:$dst, (rotl GR32:$src1, (i8 imm:$src2)))],
> + IIC_SR>;
> def ROL64ri : RIi8<0xC1, MRM0r, (outs GR64:$dst),
> (ins GR64:$src1, i8imm:$src2),
> "rol{q}\t{$src2, $dst|$dst, $src2}",
> - [(set GR64:$dst, (rotl GR64:$src1, (i8 imm:$src2)))]>;
> + [(set GR64:$dst, (rotl GR64:$src1, (i8 imm:$src2)))],
> + IIC_SR>;
>
> // Rotate by 1
> def ROL8r1 : I<0xD0, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1),
> "rol{b}\t$dst",
> - [(set GR8:$dst, (rotl GR8:$src1, (i8 1)))]>;
> + [(set GR8:$dst, (rotl GR8:$src1, (i8 1)))],
> + IIC_SR>;
> def ROL16r1 : I<0xD1, MRM0r, (outs GR16:$dst), (ins GR16:$src1),
> "rol{w}\t$dst",
> - [(set GR16:$dst, (rotl GR16:$src1, (i8 1)))]>, OpSize;
> + [(set GR16:$dst, (rotl GR16:$src1, (i8 1)))],
> + IIC_SR>, OpSize;
> def ROL32r1 : I<0xD1, MRM0r, (outs GR32:$dst), (ins GR32:$src1),
> "rol{l}\t$dst",
> - [(set GR32:$dst, (rotl GR32:$src1, (i8 1)))]>;
> + [(set GR32:$dst, (rotl GR32:$src1, (i8 1)))],
> + IIC_SR>;
> def ROL64r1 : RI<0xD1, MRM0r, (outs GR64:$dst), (ins GR64:$src1),
> "rol{q}\t$dst",
> - [(set GR64:$dst, (rotl GR64:$src1, (i8 1)))]>;
> + [(set GR64:$dst, (rotl
GR64:$src1, (i8 1)))],
> + IIC_SR>;
> } // Constraints = "$src = $dst"
>
> let Uses = [CL] in {
> def ROL8mCL : I<0xD2, MRM0m, (outs), (ins i8mem :$dst),
> "rol{b}\t{%cl, $dst|$dst, CL}",
> - [(store (rotl (loadi8 addr:$dst), CL), addr:$dst)]>;
> + [(store (rotl (loadi8 addr:$dst), CL), addr:$dst)],
> + IIC_SR>;
> def ROL16mCL : I<0xD3, MRM0m, (outs), (ins i16mem:$dst),
> "rol{w}\t{%cl, $dst|$dst, CL}",
> - [(store (rotl (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize;
> + [(store (rotl (loadi16 addr:$dst), CL), addr:$dst)],
> + IIC_SR>, OpSize;
> def ROL32mCL : I<0xD3, MRM0m, (outs), (ins i32mem:$dst),
> "rol{l}\t{%cl, $dst|$dst, CL}",
> - [(store (rotl (loadi32 addr:$dst), CL), addr:$dst)]>;
> + [(store (rotl (loadi32 addr:$dst), CL), addr:$dst)],
> + IIC_SR>;
> def ROL64mCL : RI<0xD3, MRM0m, (outs), (ins i64mem:$dst),
> "rol{q}\t{%cl, $dst|$dst, %cl}",
> - [(store (rotl (loadi64 addr:$dst), CL), addr:$dst)]>;
> + [(store (rotl (loadi64 addr:$dst), CL), addr:$dst)],
> + IIC_SR>;
> }
> def ROL8mi : Ii8<0xC0, MRM0m, (outs), (ins i8mem :$dst, i8imm:$src1),
> "rol{b}\t{$src1, $dst|$dst, $src1}",
> - [(store (rotl (loadi8 addr:$dst), (i8 imm:$src1)), addr:$dst)]>;
> + [(store (rotl (loadi8 addr:$dst), (i8 imm:$src1)), addr:$dst)],
> + IIC_SR>;
> def ROL16mi : Ii8<0xC1, MRM0m, (outs), (ins i16mem:$dst, i8imm:$src1),
> "rol{w}\t{$src1, $dst|$dst, $src1}",
> - [(store (rotl (loadi16 addr:$dst), (i8 imm:$src1)), addr:$dst)]>,
> + [(store (rotl (loadi16 addr:$dst), (i8 imm:$src1)), addr:$dst)],
> + IIC_SR>,
> OpSize;
> def ROL32mi : Ii8<0xC1, MRM0m, (outs), (ins i32mem:$dst, i8imm:$src1),
> "rol{l}\t{$src1, $dst|$dst, $src1}",
> - [(store (rotl (loadi32 addr:$dst), (i8 imm:$src1)), addr:$dst)]>;
> + [(store (rotl (loadi32 addr:$dst), (i8 imm:$src1)), addr:$dst)],
> + IIC_SR>;
> def ROL64mi : RIi8<0xC1, MRM0m, (outs), (ins i64mem:$dst, i8imm:$src1),
> "rol{q}\t{$src1, $dst|$dst, $src1}",
> - [(store (rotl (loadi64 addr:$dst), (i8 imm:$src1)), addr:$dst)]>;
> + [(store (rotl
(loadi64 addr:$dst), (i8 imm:$src1)), addr:$dst)],
> + IIC_SR>;
>
> // Rotate by 1
> def ROL8m1 : I<0xD0, MRM0m, (outs), (ins i8mem :$dst),
> "rol{b}\t$dst",
> - [(store (rotl (loadi8 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (rotl (loadi8 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
> def ROL16m1 : I<0xD1, MRM0m, (outs), (ins i16mem:$dst),
> "rol{w}\t$dst",
> - [(store (rotl (loadi16 addr:$dst), (i8 1)), addr:$dst)]>,
> + [(store (rotl (loadi16 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>,
> OpSize;
> def ROL32m1 : I<0xD1, MRM0m, (outs), (ins i32mem:$dst),
> "rol{l}\t$dst",
> - [(store (rotl (loadi32 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (rotl (loadi32 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
> def ROL64m1 : RI<0xD1, MRM0m, (outs), (ins i64mem:$dst),
> "rol{q}\t$dst",
> - [(store (rotl (loadi64 addr:$dst), (i8 1)), addr:$dst)]>;
> + [(store (rotl (loadi64 addr:$dst), (i8 1)), addr:$dst)],
> + IIC_SR>;
>
> let Constraints = "$src1 = $dst" in {
> let Uses = [CL] in {
> def ROR8rCL : I<0xD2, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1),
> "ror{b}\t{%cl, $dst|$dst, CL}",
> - [(set GR8:$dst, (rotr GR8:$src1, CL))]>;
> + [(set GR8:$dst, (rotr GR8:$src1, CL))], IIC_SR>;
> def ROR16rCL : I<0xD3, MRM1r, (outs GR16:$dst), (ins GR16:$src1),
> "ror{w}\t{%cl, $dst|$dst, CL}",
> - [(set GR16:$dst, (rotr GR16:$src1, CL))]>, OpSize;
> + [(set GR16:$dst, (rotr GR16:$src1, CL))], IIC_SR>, OpSize;
> def ROR32rCL : I<0xD3, MRM1r, (outs GR32:$dst), (ins GR32:$src1),
> "ror{l}\t{%cl, $dst|$dst, CL}",
> - [(set GR32:$dst, (rotr GR32:$src1, CL))]>;
> + [(set GR32:$dst, (rotr GR32:$src1, CL))], IIC_SR>;
> def ROR64rCL : RI<0xD3, MRM1r, (outs GR64:$dst), (ins GR64:$src1),
> "ror{q}\t{%cl, $dst|$dst, CL}",
> - [(set GR64:$dst, (rotr GR64:$src1, CL))]>;
> + [(set GR64:$dst, (rotr GR64:$src1, CL))], IIC_SR>;
> }
>
> def ROR8ri : Ii8<0xC0, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2),
> "ror{b}\t{$src2, $dst|$dst, $src2}",
> - [(set GR8:$dst, (rotr GR8:$src1, (i8
imm:$src2)))]>; > + [(set GR8:$dst, (rotr GR8:$src1, (i8 imm:$src2)))], IIC_SR>; > def ROR16ri : Ii8<0xC1, MRM1r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2), > "ror{w}\t{$src2, $dst|$dst, $src2}", > - [(set GR16:$dst, (rotr GR16:$src1, (i8 imm:$src2)))]>, > + [(set GR16:$dst, (rotr GR16:$src1, (i8 imm:$src2)))], > + IIC_SR>, > OpSize; > def ROR32ri : Ii8<0xC1, MRM1r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2), > "ror{l}\t{$src2, $dst|$dst, $src2}", > - [(set GR32:$dst, (rotr GR32:$src1, (i8 imm:$src2)))]>; > + [(set GR32:$dst, (rotr GR32:$src1, (i8 imm:$src2)))], > + IIC_SR>; > def ROR64ri : RIi8<0xC1, MRM1r, (outs GR64:$dst), > (ins GR64:$src1, i8imm:$src2), > "ror{q}\t{$src2, $dst|$dst, $src2}", > - [(set GR64:$dst, (rotr GR64:$src1, (i8 imm:$src2)))]>; > + [(set GR64:$dst, (rotr GR64:$src1, (i8 imm:$src2)))], > + IIC_SR>; > > // Rotate by 1 > def ROR8r1 : I<0xD0, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1), > "ror{b}\t$dst", > - [(set GR8:$dst, (rotr GR8:$src1, (i8 1)))]>; > + [(set GR8:$dst, (rotr GR8:$src1, (i8 1)))], > + IIC_SR>; > def ROR16r1 : I<0xD1, MRM1r, (outs GR16:$dst), (ins GR16:$src1), > "ror{w}\t$dst", > - [(set GR16:$dst, (rotr GR16:$src1, (i8 1)))]>, OpSize; > + [(set GR16:$dst, (rotr GR16:$src1, (i8 1)))], > + IIC_SR>, OpSize; > def ROR32r1 : I<0xD1, MRM1r, (outs GR32:$dst), (ins GR32:$src1), > "ror{l}\t$dst", > - [(set GR32:$dst, (rotr GR32:$src1, (i8 1)))]>; > + [(set GR32:$dst, (rotr GR32:$src1, (i8 1)))], > + IIC_SR>; > def ROR64r1 : RI<0xD1, MRM1r, (outs GR64:$dst), (ins GR64:$src1), > "ror{q}\t$dst", > - [(set GR64:$dst, (rotr GR64:$src1, (i8 1)))]>; > + [(set GR64:$dst, (rotr GR64:$src1, (i8 1)))], > + IIC_SR>; > } // Constraints = "$src = $dst" > > let Uses = [CL] in { > def ROR8mCL : I<0xD2, MRM1m, (outs), (ins i8mem :$dst), > "ror{b}\t{%cl, $dst|$dst, CL}", > - [(store (rotr (loadi8 addr:$dst), CL), addr:$dst)]>; > + [(store (rotr (loadi8 addr:$dst), CL), addr:$dst)], > + IIC_SR>; > def ROR16mCL : I<0xD3, MRM1m, (outs), (ins 
i16mem:$dst), > "ror{w}\t{%cl, $dst|$dst, CL}", > - [(store (rotr (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize; > + [(store (rotr (loadi16 addr:$dst), CL), addr:$dst)], > + IIC_SR>, OpSize; > def ROR32mCL : I<0xD3, MRM1m, (outs), (ins i32mem:$dst), > "ror{l}\t{%cl, $dst|$dst, CL}", > - [(store (rotr (loadi32 addr:$dst), CL), addr:$dst)]>; > + [(store (rotr (loadi32 addr:$dst), CL), addr:$dst)], > + IIC_SR>; > def ROR64mCL : RI<0xD3, MRM1m, (outs), (ins i64mem:$dst), > "ror{q}\t{%cl, $dst|$dst, CL}", > - [(store (rotr (loadi64 addr:$dst), CL), addr:$dst)]>; > + [(store (rotr (loadi64 addr:$dst), CL), addr:$dst)], > + IIC_SR>; > } > def ROR8mi : Ii8<0xC0, MRM1m, (outs), (ins i8mem :$dst, i8imm:$src), > "ror{b}\t{$src, $dst|$dst, $src}", > - [(store (rotr (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>; > + [(store (rotr (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)], > + IIC_SR>; > def ROR16mi : Ii8<0xC1, MRM1m, (outs), (ins i16mem:$dst, i8imm:$src), > "ror{w}\t{$src, $dst|$dst, $src}", > - [(store (rotr (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>, > + [(store (rotr (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)], > + IIC_SR>, > OpSize; > def ROR32mi : Ii8<0xC1, MRM1m, (outs), (ins i32mem:$dst, i8imm:$src), > "ror{l}\t{$src, $dst|$dst, $src}", > - [(store (rotr (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>; > + [(store (rotr (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)], > + IIC_SR>; > def ROR64mi : RIi8<0xC1, MRM1m, (outs), (ins i64mem:$dst, i8imm:$src), > "ror{q}\t{$src, $dst|$dst, $src}", > - [(store (rotr (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>; > + [(store (rotr (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)], > + IIC_SR>; > > // Rotate by 1 > def ROR8m1 : I<0xD0, MRM1m, (outs), (ins i8mem :$dst), > "ror{b}\t$dst", > - [(store (rotr (loadi8 addr:$dst), (i8 1)), addr:$dst)]>; > + [(store (rotr (loadi8 addr:$dst), (i8 1)), addr:$dst)], > + IIC_SR>; > def ROR16m1 : I<0xD1, MRM1m, (outs), (ins i16mem:$dst), > "ror{w}\t$dst", > - [(store 
(rotr (loadi16 addr:$dst), (i8 1)), addr:$dst)]>, > + [(store (rotr (loadi16 addr:$dst), (i8 1)), addr:$dst)], > + IIC_SR>, > OpSize; > def ROR32m1 : I<0xD1, MRM1m, (outs), (ins i32mem:$dst), > "ror{l}\t$dst", > - [(store (rotr (loadi32 addr:$dst), (i8 1)), addr:$dst)]>; > + [(store (rotr (loadi32 addr:$dst), (i8 1)), addr:$dst)], > + IIC_SR>; > def ROR64m1 : RI<0xD1, MRM1m, (outs), (ins i64mem:$dst), > "ror{q}\t$dst", > - [(store (rotr (loadi64 addr:$dst), (i8 1)), addr:$dst)]>; > + [(store (rotr (loadi64 addr:$dst), (i8 1)), addr:$dst)], > + IIC_SR>; > > > //===----------------------------------------------------------------------===// > @@ -601,30 +684,36 @@ > def SHLD16rrCL : I<0xA5, MRMDestReg, (outs GR16:$dst), > (ins GR16:$src1, GR16:$src2), > "shld{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", > - [(set GR16:$dst, (X86shld GR16:$src1, GR16:$src2, CL))]>, > + [(set GR16:$dst, (X86shld GR16:$src1, GR16:$src2, CL))], > + IIC_SHD16_REG_CL>, > TB, OpSize; > def SHRD16rrCL : I<0xAD, MRMDestReg, (outs GR16:$dst), > (ins GR16:$src1, GR16:$src2), > "shrd{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", > - [(set GR16:$dst, (X86shrd GR16:$src1, GR16:$src2, CL))]>, > + [(set GR16:$dst, (X86shrd GR16:$src1, GR16:$src2, CL))], > + IIC_SHD16_REG_CL>, > TB, OpSize; > def SHLD32rrCL : I<0xA5, MRMDestReg, (outs GR32:$dst), > (ins GR32:$src1, GR32:$src2), > "shld{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", > - [(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2, CL))]>, TB; > + [(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2, CL))], > + IIC_SHD32_REG_CL>, TB; > def SHRD32rrCL : I<0xAD, MRMDestReg, (outs GR32:$dst), > (ins GR32:$src1, GR32:$src2), > "shrd{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", > - [(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2, CL))]>, TB; > + [(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2, CL))], > + IIC_SHD32_REG_CL>, TB; > def SHLD64rrCL : RI<0xA5, MRMDestReg, (outs GR64:$dst), > (ins GR64:$src1, GR64:$src2), > "shld{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", > - [(set 
GR64:$dst, (X86shld GR64:$src1, GR64:$src2, CL))]>, > + [(set GR64:$dst, (X86shld GR64:$src1, GR64:$src2, CL))], > + IIC_SHD64_REG_CL>, > TB; > def SHRD64rrCL : RI<0xAD, MRMDestReg, (outs GR64:$dst), > (ins GR64:$src1, GR64:$src2), > "shrd{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", > - [(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2, CL))]>, > + [(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2, CL))], > + IIC_SHD64_REG_CL>, > TB; > } > > @@ -634,42 +723,42 @@ > (ins GR16:$src1, GR16:$src2, i8imm:$src3), > "shld{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(set GR16:$dst, (X86shld GR16:$src1, GR16:$src2, > - (i8 imm:$src3)))]>, > + (i8 imm:$src3)))], IIC_SHD16_REG_IM>, > TB, OpSize; > def SHRD16rri8 : Ii8<0xAC, MRMDestReg, > (outs GR16:$dst), > (ins GR16:$src1, GR16:$src2, i8imm:$src3), > "shrd{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(set GR16:$dst, (X86shrd GR16:$src1, GR16:$src2, > - (i8 imm:$src3)))]>, > + (i8 imm:$src3)))], IIC_SHD16_REG_IM>, > TB, OpSize; > def SHLD32rri8 : Ii8<0xA4, MRMDestReg, > (outs GR32:$dst), > (ins GR32:$src1, GR32:$src2, i8imm:$src3), > "shld{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2, > - (i8 imm:$src3)))]>, > + (i8 imm:$src3)))], IIC_SHD32_REG_IM>, > TB; > def SHRD32rri8 : Ii8<0xAC, MRMDestReg, > (outs GR32:$dst), > (ins GR32:$src1, GR32:$src2, i8imm:$src3), > "shrd{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2, > - (i8 imm:$src3)))]>, > + (i8 imm:$src3)))], IIC_SHD32_REG_IM>, > TB; > def SHLD64rri8 : RIi8<0xA4, MRMDestReg, > (outs GR64:$dst), > (ins GR64:$src1, GR64:$src2, i8imm:$src3), > "shld{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(set GR64:$dst, (X86shld GR64:$src1, GR64:$src2, > - (i8 imm:$src3)))]>, > + (i8 imm:$src3)))], IIC_SHD64_REG_IM>, > TB; > def SHRD64rri8 : RIi8<0xAC, MRMDestReg, > (outs GR64:$dst), > (ins GR64:$src1, GR64:$src2, i8imm:$src3), > "shrd{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", 
> [(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2, > - (i8 imm:$src3)))]>, > + (i8 imm:$src3)))], IIC_SHD64_REG_IM>, > TB; > } > } // Constraints = "$src = $dst" > @@ -678,68 +767,74 @@ > def SHLD16mrCL : I<0xA5, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2), > "shld{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", > [(store (X86shld (loadi16 addr:$dst), GR16:$src2, CL), > - addr:$dst)]>, TB, OpSize; > + addr:$dst)], IIC_SHD16_MEM_CL>, TB, OpSize; > def SHRD16mrCL : I<0xAD, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2), > "shrd{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", > [(store (X86shrd (loadi16 addr:$dst), GR16:$src2, CL), > - addr:$dst)]>, TB, OpSize; > + addr:$dst)], IIC_SHD16_MEM_CL>, TB, OpSize; > > def SHLD32mrCL : I<0xA5, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2), > "shld{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", > [(store (X86shld (loadi32 addr:$dst), GR32:$src2, CL), > - addr:$dst)]>, TB; > + addr:$dst)], IIC_SHD32_MEM_CL>, TB; > def SHRD32mrCL : I<0xAD, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2), > "shrd{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", > [(store (X86shrd (loadi32 addr:$dst), GR32:$src2, CL), > - addr:$dst)]>, TB; > + addr:$dst)], IIC_SHD32_MEM_CL>, TB; > > def SHLD64mrCL : RI<0xA5, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2), > "shld{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", > [(store (X86shld (loadi64 addr:$dst), GR64:$src2, CL), > - addr:$dst)]>, TB; > + addr:$dst)], IIC_SHD64_MEM_CL>, TB; > def SHRD64mrCL : RI<0xAD, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2), > "shrd{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", > [(store (X86shrd (loadi64 addr:$dst), GR64:$src2, CL), > - addr:$dst)]>, TB; > + addr:$dst)], IIC_SHD64_MEM_CL>, TB; > } > > def SHLD16mri8 : Ii8<0xA4, MRMDestMem, > (outs), (ins i16mem:$dst, GR16:$src2, i8imm:$src3), > "shld{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(store (X86shld (loadi16 addr:$dst), GR16:$src2, > - (i8 imm:$src3)), addr:$dst)]>, > + (i8 imm:$src3)), addr:$dst)], > + IIC_SHD16_MEM_IM>, 
> TB, OpSize; > def SHRD16mri8 : Ii8<0xAC, MRMDestMem, > (outs), (ins i16mem:$dst, GR16:$src2, i8imm:$src3), > "shrd{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(store (X86shrd (loadi16 addr:$dst), GR16:$src2, > - (i8 imm:$src3)), addr:$dst)]>, > + (i8 imm:$src3)), addr:$dst)], > + IIC_SHD16_MEM_IM>, > TB, OpSize; > > def SHLD32mri8 : Ii8<0xA4, MRMDestMem, > (outs), (ins i32mem:$dst, GR32:$src2, i8imm:$src3), > "shld{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(store (X86shld (loadi32 addr:$dst), GR32:$src2, > - (i8 imm:$src3)), addr:$dst)]>, > + (i8 imm:$src3)), addr:$dst)], > + IIC_SHD32_MEM_IM>, > TB; > def SHRD32mri8 : Ii8<0xAC, MRMDestMem, > (outs), (ins i32mem:$dst, GR32:$src2, i8imm:$src3), > "shrd{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(store (X86shrd (loadi32 addr:$dst), GR32:$src2, > - (i8 imm:$src3)), addr:$dst)]>, > + (i8 imm:$src3)), addr:$dst)], > + IIC_SHD32_MEM_IM>, > TB; > > def SHLD64mri8 : RIi8<0xA4, MRMDestMem, > (outs), (ins i64mem:$dst, GR64:$src2, i8imm:$src3), > "shld{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(store (X86shld (loadi64 addr:$dst), GR64:$src2, > - (i8 imm:$src3)), addr:$dst)]>, > + (i8 imm:$src3)), addr:$dst)], > + IIC_SHD64_MEM_IM>, > TB; > def SHRD64mri8 : RIi8<0xAC, MRMDestMem, > (outs), (ins i64mem:$dst, GR64:$src2, i8imm:$src3), > "shrd{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", > [(store (X86shrd (loadi64 addr:$dst), GR64:$src2, > - (i8 imm:$src3)), addr:$dst)]>, > + (i8 imm:$src3)), addr:$dst)], > + IIC_SHD64_MEM_IM>, > TB; > > } // Defs = [EFLAGS] > > Added: llvm/trunk/lib/Target/X86/X86Schedule.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Schedule.td?rev=149558&view=auto > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86Schedule.td (added) > +++ llvm/trunk/lib/Target/X86/X86Schedule.td Wed Feb 1 17:20:51 2012 > @@ -0,0 +1,115 @@ > +//===- X86Schedule.td - X86 Scheduling Definitions ---------*- 
tablegen -*-===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > + > +//===----------------------------------------------------------------------===// > +// Instruction Itinerary classes used for X86 > +def IIC_DEFAULT : InstrItinClass; > +def IIC_ALU_MEM : InstrItinClass; > +def IIC_ALU_NONMEM : InstrItinClass; > +def IIC_LEA : InstrItinClass; > +def IIC_LEA_16 : InstrItinClass; > +def IIC_MUL8 : InstrItinClass; > +def IIC_MUL16_MEM : InstrItinClass; > +def IIC_MUL16_REG : InstrItinClass; > +def IIC_MUL32_MEM : InstrItinClass; > +def IIC_MUL32_REG : InstrItinClass; > +def IIC_MUL64 : InstrItinClass; > +// imul by al, ax, eax, rax > +def IIC_IMUL8 : InstrItinClass; > +def IIC_IMUL16_MEM : InstrItinClass; > +def IIC_IMUL16_REG : InstrItinClass; > +def IIC_IMUL32_MEM : InstrItinClass; > +def IIC_IMUL32_REG : InstrItinClass; > +def IIC_IMUL64 : InstrItinClass; > +// imul reg by reg|mem > +def IIC_IMUL16_RM : InstrItinClass; > +def IIC_IMUL16_RR : InstrItinClass; > +def IIC_IMUL32_RM : InstrItinClass; > +def IIC_IMUL32_RR : InstrItinClass; > +def IIC_IMUL64_RM : InstrItinClass; > +def IIC_IMUL64_RR : InstrItinClass; > +// imul reg = reg/mem * imm > +def IIC_IMUL16_RMI : InstrItinClass; > +def IIC_IMUL16_RRI : InstrItinClass; > +def IIC_IMUL32_RMI : InstrItinClass; > +def IIC_IMUL32_RRI : InstrItinClass; > +def IIC_IMUL64_RMI : InstrItinClass; > +def IIC_IMUL64_RRI : InstrItinClass; > +// div > +def IIC_DIV8_MEM : InstrItinClass; > +def IIC_DIV8_REG : InstrItinClass; > +def IIC_DIV16 : InstrItinClass; > +def IIC_DIV32 : InstrItinClass; > +def IIC_DIV64 : InstrItinClass; > +// idiv > +def IIC_IDIV8 : InstrItinClass; > +def IIC_IDIV16 : InstrItinClass; > +def IIC_IDIV32 : InstrItinClass; > +def IIC_IDIV64 : InstrItinClass; > +// neg/not/inc/dec > +def 
IIC_UNARY_REG : InstrItinClass; > +def IIC_UNARY_MEM : InstrItinClass; > +// add/sub/and/or/xor/adc/sbc/cmp/test > +def IIC_BIN_MEM : InstrItinClass; > +def IIC_BIN_NONMEM : InstrItinClass; > +// shift/rotate > +def IIC_SR : InstrItinClass; > +// shift double > +def IIC_SHD16_REG_IM : InstrItinClass; > +def IIC_SHD16_REG_CL : InstrItinClass; > +def IIC_SHD16_MEM_IM : InstrItinClass; > +def IIC_SHD16_MEM_CL : InstrItinClass; > +def IIC_SHD32_REG_IM : InstrItinClass; > +def IIC_SHD32_REG_CL : InstrItinClass; > +def IIC_SHD32_MEM_IM : InstrItinClass; > +def IIC_SHD32_MEM_CL : InstrItinClass; > +def IIC_SHD64_REG_IM : InstrItinClass; > +def IIC_SHD64_REG_CL : InstrItinClass; > +def IIC_SHD64_MEM_IM : InstrItinClass; > +def IIC_SHD64_MEM_CL : InstrItinClass; > +// cmov > +def IIC_CMOV16_RM : InstrItinClass; > +def IIC_CMOV16_RR : InstrItinClass; > +def IIC_CMOV32_RM : InstrItinClass; > +def IIC_CMOV32_RR : InstrItinClass; > +def IIC_CMOV64_RM : InstrItinClass; > +def IIC_CMOV64_RR : InstrItinClass; > +// set > +def IIC_SET_R : InstrItinClass; > +def IIC_SET_M : InstrItinClass; > +// jmp/jcc/jcxz > +def IIC_Jcc : InstrItinClass; > +def IIC_JCXZ : InstrItinClass; > +def IIC_JMP_REL : InstrItinClass; > +def IIC_JMP_REG : InstrItinClass; > +def IIC_JMP_MEM : InstrItinClass; > +def IIC_JMP_FAR_MEM : InstrItinClass; > +def IIC_JMP_FAR_PTR : InstrItinClass; > +// loop > +def IIC_LOOP : InstrItinClass; > +def IIC_LOOPE : InstrItinClass; > +def IIC_LOOPNE : InstrItinClass; > +// call > +def IIC_CALL_RI : InstrItinClass; > +def IIC_CALL_MEM : InstrItinClass; > +def IIC_CALL_FAR_MEM : InstrItinClass; > +def IIC_CALL_FAR_PTR : InstrItinClass; > +// ret > +def IIC_RET : InstrItinClass; > +def IIC_RET_IMM : InstrItinClass; > + > +//===----------------------------------------------------------------------===// > +// Processor instruction itineraries. 
> + > +def GenericItineraries : ProcessorItineraries<[], [], []>; > + > +include "X86ScheduleAtom.td" > + > + > + > > Added: llvm/trunk/lib/Target/X86/X86ScheduleAtom.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ScheduleAtom.td?rev=149558&view=auto > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86ScheduleAtom.td (added) > +++ llvm/trunk/lib/Target/X86/X86ScheduleAtom.td Wed Feb 1 17:20:51 2012 > @@ -0,0 +1,136 @@ > +//=- X86ScheduleAtom.td - X86 Atom Scheduling Definitions -*- tablegen -*-=// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// This file defines the itinerary class data for the Intel Atom (Bonnell) > +// processors. > +// > +//===----------------------------------------------------------------------===// > + > +// > +// Scheduling information derived from the "Intel 64 and IA32 Architectures > +// Optimization Reference Manual", Chapter 13, Section 4. 
> +// Functional Units > +// Port 0 > +def Port0 : FuncUnit; // ALU: ALU0, shift/rotate, load/store > + // SIMD/FP: SIMD ALU, Shuffle,SIMD/FP multiply, divide > +def Port1 : FuncUnit; // ALU: ALU1, bit processing, jump, and LEA > + // SIMD/FP: SIMD ALU, FP Adder > + > +def AtomItineraries : ProcessorItineraries< > + [ Port0, Port1 ], > + [], [ > + // P0 only > + // InstrItinData] >, > + // P0 or P1 > + // InstrItinData] >, > + // P0 and P1 > + // InstrItinData, InstrStage] >, > + // > + // Default is 1 cycle, port0 or port1 > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + // mul > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + // imul by al, ax, eax, rax > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + // imul reg by reg|mem > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + // imul reg = reg/mem * imm > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + // idiv > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + // div > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + // neg/not/inc/dec > + InstrItinData] >, > + InstrItinData] >, > + // add/sub/and/or/xor/adc/sbc/cmp/test > + InstrItinData] >, > + InstrItinData] >, > + // shift/rotate > + InstrItinData] >, > + // shift double > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + // cmov > + 
InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + // set > + InstrItinData] >, > + InstrItinData] >, > + // jcc > + InstrItinData] >, > + // jcxz/jecxz/jrcxz > + InstrItinData] >, > + // jmp rel > + InstrItinData] >, > + // jmp indirect > + InstrItinData] >, > + InstrItinData] >, > + // jmp far > + InstrItinData] >, > + InstrItinData] >, > + // loop/loope/loopne > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + // call - all but reg/imm > + InstrItinData, InstrStage<1, [Port1]>] >, > + InstrItinData] >, > + InstrItinData] >, > + InstrItinData] >, > + //ret > + InstrItinData] >, > + InstrItinData, InstrStage<1, [Port1]>] > > +]>; > + > > Modified: llvm/trunk/lib/Target/X86/X86Subtarget.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.cpp?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86Subtarget.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86Subtarget.cpp Wed Feb 1 17:20:51 2012 > @@ -246,6 +246,7 @@ > IsBTMemSlow = true; > ToggleFeature(X86::FeatureSlowBTMem); > } > + > // If it's Nehalem, unaligned memory access is fast. > // FIXME: Nehalem is family 6. Also include Westmere and later processors? > if (Family == 15 && Model == 26) { > @@ -253,6 +254,11 @@ > ToggleFeature(X86::FeatureFastUAMem); > } > > + // Set processor type. Currently only Atom is detected. 
> + if (Family == 6 && Model == 28) { > + X86ProcFamily = IntelAtom; > + } > + > unsigned MaxExtLevel; > X86_MC::GetCpuIDAndInfo(0x80000000, &MaxExtLevel, &EBX, &ECX, &EDX); > > @@ -310,6 +316,7 @@ > const std::string &FS, > unsigned StackAlignOverride, bool is64Bit) > : X86GenSubtargetInfo(TT, CPU, FS) > + , X86ProcFamily(Others) > , PICStyle(PICStyles::None) > , X86SSELevel(NoMMXSSE) > , X863DNowLevel(NoThreeDNow) > @@ -333,14 +340,15 @@ > , IsUAMemFast(false) > , HasVectorUAMem(false) > , HasCmpxchg16b(false) > + , PostRAScheduler(false) > , stackAlignment(4) > // FIXME: this is a known good value for Yonah. How about others? > , MaxInlineSizeThreshold(128) > , TargetTriple(TT) > , In64BitMode(is64Bit) { > // Determine default and user specified characteristics > + std::string CPUName = CPU; > if (!FS.empty() || !CPU.empty()) { > - std::string CPUName = CPU; > if (CPUName.empty()) { > #if defined(i386) || defined(__i386__) || defined(__x86__) || defined(_M_IX86)\ > || defined(__x86_64__) || defined(_M_AMD64) || defined (_M_X64) > @@ -363,6 +371,13 @@ > // If feature string is not empty, parse features string. > ParseSubtargetFeatures(CPUName, FullFS); > } else { > + if (CPUName.empty()) { > +#if defined (__x86_64__) || defined(__i386__) > + CPUName = sys::getHostCPUName(); > +#else > + CPUName = "generic"; > +#endif > + } > // Otherwise, use CPUID to auto-detect feature set. > AutoDetectSubtargetFeatures(); > > @@ -379,6 +394,11 @@ > } > } > > + if (X86ProcFamily == IntelAtom) { > + PostRAScheduler = true; > + InstrItins = getInstrItineraryForCPU(CPUName); > + } > + > // It's important to keep the MCSubtargetInfo feature bits in sync with > // target data structure which is shared with MC code emitter, etc. 
> if (In64BitMode) > @@ -398,3 +418,12 @@ > isTargetSolaris() || In64BitMode) > stackAlignment = 16; > } > + > +bool X86Subtarget::enablePostRAScheduler( > + CodeGenOpt::Level OptLevel, > + TargetSubtargetInfo::AntiDepBreakMode& Mode, > + RegClassVector& CriticalPathRCs) const { > + Mode = TargetSubtargetInfo::ANTIDEP_CRITICAL; > + CriticalPathRCs.clear(); > + return PostRAScheduler && OptLevel >= CodeGenOpt::Default; > +} > > Modified: llvm/trunk/lib/Target/X86/X86Subtarget.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.h?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86Subtarget.h (original) > +++ llvm/trunk/lib/Target/X86/X86Subtarget.h Wed Feb 1 17:20:51 2012 > @@ -49,6 +49,13 @@ > NoThreeDNow, ThreeDNow, ThreeDNowA > }; > > + enum X86ProcFamilyEnum { > + Others, IntelAtom > + }; > + > + /// X86ProcFamily - X86 processor family: Intel Atom, and others > + X86ProcFamilyEnum X86ProcFamily; > + > /// PICStyle - Which PIC style to use > /// > PICStyles::Style PICStyle; > @@ -125,6 +132,9 @@ > /// this is true for most x86-64 chips, but not the first AMD chips. > bool HasCmpxchg16b; > > + /// PostRAScheduler - True if using post-register-allocation scheduler. > + bool PostRAScheduler; > + > /// stackAlignment - The minimum alignment known to hold of the stack frame on > /// entry to the function and which must be maintained by every function. > unsigned stackAlignment; > @@ -135,6 +145,9 @@ > > /// TargetTriple - What processor and OS we're targeting. > Triple TargetTriple; > + > + /// Instruction itineraries for scheduling > + InstrItineraryData InstrItins; > > private: > /// In64BitMode - True if compiling for 64-bit, false for 32-bit. 
> @@ -202,6 +215,8 @@ > bool hasVectorUAMem() const { return HasVectorUAMem; } > bool hasCmpxchg16b() const { return HasCmpxchg16b; } > > + bool isAtom() const { return X86ProcFamily == IntelAtom; } > + > const Triple &getTargetTriple() const { return TargetTriple; } > > bool isTargetDarwin() const { return TargetTriple.isOSDarwin(); } > @@ -291,6 +306,15 @@ > /// indicating the number of scheduling cycles of backscheduling that > /// should be attempted. > unsigned getSpecialAddressLatency() const; > + > + /// enablePostRAScheduler - run for Atom optimization. > + bool enablePostRAScheduler(CodeGenOpt::Level OptLevel, > + TargetSubtargetInfo::AntiDepBreakMode& Mode, > + RegClassVector& CriticalPathRCs) const; > + > + /// getInstrItins = Return the instruction itineraries based on the > + /// subtarget selection. > + const InstrItineraryData &getInstrItineraryData() const { return InstrItins; } > }; > > } // End llvm namespace > > Modified: llvm/trunk/lib/Target/X86/X86TargetMachine.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetMachine.cpp?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86TargetMachine.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86TargetMachine.cpp Wed Feb 1 17:20:51 2012 > @@ -78,7 +78,8 @@ > : LLVMTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL), > Subtarget(TT, CPU, FS, Options.StackAlignmentOverride, is64Bit), > FrameLowering(*this, Subtarget), > - ELFWriterInfo(is64Bit, true) { > + ELFWriterInfo(is64Bit, true), > + InstrItins(Subtarget.getInstrItineraryData()){ > // Determine the PICStyle based on the target selected. > if (getRelocationModel() == Reloc::Static) { > // Unless we're in PIC or DynamicNoPIC mode, set the PIC style to None. 
> > Modified: llvm/trunk/lib/Target/X86/X86TargetMachine.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetMachine.h?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86TargetMachine.h (original) > +++ llvm/trunk/lib/Target/X86/X86TargetMachine.h Wed Feb 1 17:20:51 2012 > @@ -32,9 +32,10 @@ > class StringRef; > > class X86TargetMachine : public LLVMTargetMachine { > - X86Subtarget Subtarget; > - X86FrameLowering FrameLowering; > - X86ELFWriterInfo ELFWriterInfo; > + X86Subtarget Subtarget; > + X86FrameLowering FrameLowering; > + X86ELFWriterInfo ELFWriterInfo; > + InstrItineraryData InstrItins; > > public: > X86TargetMachine(const Target &T, StringRef TT, > @@ -65,6 +66,9 @@ > virtual const X86ELFWriterInfo *getELFWriterInfo() const { > return Subtarget.isTargetELF() ? &ELFWriterInfo : 0; > } > + virtual const InstrItineraryData *getInstrItineraryData() const { > + return &InstrItins; > + } > > // Set up the pass pipeline. 
> virtual bool addInstSelector(PassManagerBase &PM); > > Modified: llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll (original) > +++ llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll Wed Feb 1 17:20:51 2012 > @@ -1,5 +1,5 @@ > ; PR1075 > -; RUN: llc < %s -mtriple=x86_64-apple-darwin -O3 | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-darwin -O3 | FileCheck %s > > define float @foo(float %x) nounwind { > %tmp1 = fmul float %x, 3.000000e+00 > > Modified: llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll (original) > +++ llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -march=x86 -mattr=+sse2 | not grep lea > +; RUN: llc < %s -march=x86 -mcpu=generic -mattr=+sse2 | not grep lea > > define float @foo(i32* %x, float* %y, i32 %c) nounwind { > entry: > > Modified: llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll (original) > +++ llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -march=x86 | grep {(%esp)} | count 2 > +; RUN: llc < %s -march=x86 -mcpu=generic | grep {(%esp)} | count 2 > ; PR1872 > 
> %struct.c34007g__designated___XUB = type { i32, i32, i32, i32 } > > Modified: llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll (original) > +++ llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -mtriple=i386-apple-darwin -asm-verbose=0 | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=i386-apple-darwin -asm-verbose=0 | FileCheck %s > ; PR3149 > ; Make sure the copy after inline asm is not coalesced away. > > > Modified: llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll (original) > +++ llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc -mtriple=x86_64-mingw32 < %s | FileCheck %s > +; RUN: llc -mcpu=generic -mtriple=x86_64-mingw32 < %s | FileCheck %s > ; CHECK: subq $40, %rsp > ; CHECK: movaps %xmm8, (%rsp) > ; CHECK: movaps %xmm7, 16(%rsp) > > Modified: llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll (original) > +++ llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc 
-mtriple=i386-apple-darwin -tailcallopt < %s | FileCheck %s > +; RUN: llc -mcpu=generic -mtriple=i386-apple-darwin -tailcallopt < %s | FileCheck %s > ; Check that lowered arguments do not overwrite the return address before it is moved. > ; Bug 6225 > ; > > Modified: llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll (original) > +++ llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s | FileCheck %s > +; RUN: llc < %s -mcpu=generic | FileCheck %s > ; PR6941 > target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" > target triple = "x86_64-apple-darwin10.0.0" > > Modified: llvm/trunk/test/CodeGen/X86/abi-isel.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/abi-isel.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/abi-isel.ll (original) > +++ llvm/trunk/test/CodeGen/X86/abi-isel.ll Wed Feb 1 17:20:51 2012 > @@ -1,16 +1,16 @@ > -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-STATIC > -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-PIC > +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-STATIC > +; RUN: llc < %s 
-asm-verbose=0 -mcpu=generic -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-PIC > > -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-64-STATIC > -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=LINUX-64-PIC > +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-64-STATIC > +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=LINUX-64-PIC > > -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-apple-darwin -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-32-STATIC > -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-apple-darwin -march=x86 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-DYNAMIC > -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-apple-darwin -march=x86 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-PIC > - > -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-64-STATIC > -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-DYNAMIC > -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-PIC > +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-apple-darwin -march=x86 -relocation-model=static -code-model=small | FileCheck %s 
-check-prefix=DARWIN-32-STATIC > +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-apple-darwin -march=x86 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-DYNAMIC > +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-apple-darwin -march=x86 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-PIC > + > +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-64-STATIC > +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-DYNAMIC > +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-PIC > > @src = external global [131072 x i32] > @dst = external global [131072 x i32] > > Modified: llvm/trunk/test/CodeGen/X86/add.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/add.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/add.ll (original) > +++ llvm/trunk/test/CodeGen/X86/add.ll Wed Feb 1 17:20:51 2012 > @@ -1,6 +1,6 @@ > -; RUN: llc < %s -march=x86 | FileCheck %s -check-prefix=X32 > -; RUN: llc < %s -mtriple=x86_64-linux -join-physregs | FileCheck %s -check-prefix=X64 > -; RUN: llc < %s -mtriple=x86_64-win32 -join-physregs | FileCheck %s -check-prefix=X64 > +; RUN: llc < %s -mcpu=generic -march=x86 | FileCheck %s -check-prefix=X32 > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -join-physregs | FileCheck %s -check-prefix=X64 > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-win32 -join-physregs | FileCheck %s -check-prefix=X64 > > ; Some of these tests depend on -join-physregs to commute instructions. 
> > > Added: llvm/trunk/test/CodeGen/X86/atom-sched.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/atom-sched.ll?rev=149558&view=auto > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/atom-sched.ll (added) > +++ llvm/trunk/test/CodeGen/X86/atom-sched.ll Wed Feb 1 17:20:51 2012 > @@ -0,0 +1,28 @@ > +; RUN: llc <%s -O2 -mcpu=atom -march=x86 -relocation-model=static | FileCheck -check-prefix=atom %s > +; RUN: llc <%s -O2 -mcpu=core2 -march=x86 -relocation-model=static | FileCheck %s > + > +@a = common global i32 0, align 4 > +@b = common global i32 0, align 4 > +@c = common global i32 0, align 4 > +@d = common global i32 0, align 4 > +@e = common global i32 0, align 4 > +@f = common global i32 0, align 4 > + > +define void @func() nounwind uwtable { > +; atom: imull > +; atom-NOT: movl > +; atom: imull > +; CHECK: imull > +; CHECK: movl > +; CHECK: imull > +entry: > + %0 = load i32* @b, align 4 > + %1 = load i32* @c, align 4 > + %mul = mul nsw i32 %0, %1 > + store i32 %mul, i32* @a, align 4 > + %2 = load i32* @e, align 4 > + %3 = load i32* @f, align 4 > + %mul1 = mul nsw i32 %2, %3 > + store i32 %mul1, i32* @d, align 4 > + ret void > +} > > Modified: llvm/trunk/test/CodeGen/X86/byval6.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/byval6.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/byval6.ll (original) > +++ llvm/trunk/test/CodeGen/X86/byval6.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -march=x86 | grep add | not grep 16 > +; RUN: llc < %s -mcpu=generic -march=x86 | grep add | not grep 16 > > %struct.W = type { x86_fp80, x86_fp80 } > @B = global %struct.W { x86_fp80 0xK4001A000000000000000, x86_fp80 0xK4001C000000000000000 }, align 32 > > Modified: llvm/trunk/test/CodeGen/X86/divide-by-constant.ll >
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/divide-by-constant.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/divide-by-constant.ll (original) > +++ llvm/trunk/test/CodeGen/X86/divide-by-constant.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -mtriple=i686-pc-linux-gnu -asm-verbose=0 | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=i686-pc-linux-gnu -asm-verbose=0 | FileCheck %s > target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32" > target triple = "i686-pc-linux-gnu" > > > Modified: llvm/trunk/test/CodeGen/X86/epilogue.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/epilogue.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/epilogue.ll (original) > +++ llvm/trunk/test/CodeGen/X86/epilogue.ll Wed Feb 1 17:20:51 2012 > @@ -1,5 +1,5 @@ > -; RUN: llc < %s -march=x86 | not grep lea > -; RUN: llc < %s -march=x86 | grep {movl %ebp} > +; RUN: llc < %s -mcpu=generic -march=x86 | not grep lea > +; RUN: llc < %s -mcpu=generic -march=x86 | grep {movl %ebp} > > declare void @bar(<2 x i64>* %n) > > > Modified: llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll (original) > +++ llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -march=x86 -x86-asm-syntax=intel | \ > +; RUN: llc < %s -mcpu=generic -march=x86 -x86-asm-syntax=intel | \ > ; RUN: grep {add ESP, 
8} > > target triple = "i686-pc-linux-gnu" > > Modified: llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll (original) > +++ llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc -fast-isel -O0 -mtriple=i386-apple-darwin10 -relocation-model=pic < %s | FileCheck %s > +; RUN: llc -fast-isel -O0 -mcpu=generic -mtriple=i386-apple-darwin10 -relocation-model=pic < %s | FileCheck %s > > ; This should use flds to set the return value. > ; CHECK: test0: > > Modified: llvm/trunk/test/CodeGen/X86/fold-load.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fold-load.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/fold-load.ll (original) > +++ llvm/trunk/test/CodeGen/X86/fold-load.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -march=x86 | FileCheck %s > +; RUN: llc < %s -mcpu=generic -march=x86 | FileCheck %s > %struct._obstack_chunk = type { i8*, %struct._obstack_chunk*, [4 x i8] } > %struct.obstack = type { i32, %struct._obstack_chunk*, i8*, i8*, i8*, i32, i32, %struct._obstack_chunk* (...)*, void (...)*, i8*, i8 } > @stmt_obstack = external global %struct.obstack ; <%struct.obstack*> [#uses=1] > > Modified: llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll (original) > +++ llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ 
> -; RUN: llc < %s -mtriple=i386-apple-darwin | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=i386-apple-darwin | FileCheck %s > > ; There should be no stack manipulations between the inline asm and ret. > ; CHECK: test1 > > Modified: llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll (original) > +++ llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -march=x86-64 > %t > +; RUN: llc < %s -mcpu=generic -march=x86-64 > %t > ; RUN: not grep and %t > ; RUN: not grep movz %t > ; RUN: not grep sar %t > > Modified: llvm/trunk/test/CodeGen/X86/optimize-max-3.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/optimize-max-3.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/optimize-max-3.ll (original) > +++ llvm/trunk/test/CodeGen/X86/optimize-max-3.ll Wed Feb 1 17:20:51 2012 > @@ -1,5 +1,5 @@ > -; RUN: llc < %s -mtriple=x86_64-linux -asm-verbose=false | FileCheck %s > -; RUN: llc < %s -mtriple=x86_64-win32 -asm-verbose=false | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -asm-verbose=false | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-win32 -asm-verbose=false | FileCheck %s > > ; LSR's OptimizeMax should eliminate the select (max). 
> > > Modified: llvm/trunk/test/CodeGen/X86/peep-test-3.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/peep-test-3.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/peep-test-3.ll (original) > +++ llvm/trunk/test/CodeGen/X86/peep-test-3.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -march=x86 -post-RA-scheduler=false | FileCheck %s > +; RUN: llc < %s -mcpu=generic -march=x86 -post-RA-scheduler=false | FileCheck %s > ; rdar://7226797 > > ; LLVM should omit the testl and use the flags result from the orl. > > Modified: llvm/trunk/test/CodeGen/X86/pic.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/pic.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/pic.ll (original) > +++ llvm/trunk/test/CodeGen/X86/pic.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -mtriple=i686-pc-linux-gnu -relocation-model=pic -asm-verbose=false -post-RA-scheduler=false | FileCheck %s -check-prefix=LINUX > +; RUN: llc < %s -mcpu=generic -mtriple=i686-pc-linux-gnu -relocation-model=pic -asm-verbose=false -post-RA-scheduler=false | FileCheck %s -check-prefix=LINUX > > @ptr = external global i32* > @dst = external global i32 > > Modified: llvm/trunk/test/CodeGen/X86/red-zone.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/red-zone.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/red-zone.ll (original) > +++ llvm/trunk/test/CodeGen/X86/red-zone.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux | FileCheck %s > > ; First without noredzone. 
> ; CHECK: f0: > > Modified: llvm/trunk/test/CodeGen/X86/red-zone2.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/red-zone2.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/red-zone2.ll (original) > +++ llvm/trunk/test/CodeGen/X86/red-zone2.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -march=x86-64 > %t > +; RUN: llc < %s -mcpu=generic -march=x86-64 > %t > ; RUN: grep subq %t | count 1 > ; RUN: grep addq %t | count 1 > > > Modified: llvm/trunk/test/CodeGen/X86/reghinting.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/reghinting.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/reghinting.ll (original) > +++ llvm/trunk/test/CodeGen/X86/reghinting.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -mtriple=x86_64-apple-macosx | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-macosx | FileCheck %s > ; PR10221 > > ;; The registers %x and %y must both spill across the finit call. 
> > Modified: llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll (original) > +++ llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll Wed Feb 1 17:20:51 2012 > @@ -1,7 +1,7 @@ > -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32 > -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64 > -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -filetype=obj > -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -filetype=obj > +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32 > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64 > +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -filetype=obj > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -filetype=obj > > ; Just to prevent the alloca from being optimized away > declare void @dummy_use(i32*, i32) > > Modified: llvm/trunk/test/CodeGen/X86/segmented-stacks.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/segmented-stacks.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/segmented-stacks.ll (original) > +++ llvm/trunk/test/CodeGen/X86/segmented-stacks.ll Wed Feb 1 17:20:51 2012 > @@ -1,23 +1,23 @@ > -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Linux > -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | 
FileCheck %s -check-prefix=X64-Linux > -; RUN: llc < %s -mtriple=i686-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Darwin > -; RUN: llc < %s -mtriple=x86_64-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Darwin > -; RUN: llc < %s -mtriple=i686-mingw32 -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-MinGW > -; RUN: llc < %s -mtriple=x86_64-freebsd -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-FreeBSD > +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Linux > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Linux > +; RUN: llc < %s -mcpu=generic -mtriple=i686-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Darwin > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Darwin > +; RUN: llc < %s -mcpu=generic -mtriple=i686-mingw32 -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-MinGW > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-freebsd -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-FreeBSD > > ; We used to crash with filetype=obj > -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -filetype=obj > -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -filetype=obj > -; RUN: llc < %s -mtriple=i686-darwin -segmented-stacks -filetype=obj > -; RUN: llc < %s -mtriple=x86_64-darwin -segmented-stacks -filetype=obj > -; RUN: llc < %s -mtriple=i686-mingw32 -segmented-stacks -filetype=obj > -; RUN: llc < %s -mtriple=x86_64-freebsd -segmented-stacks -filetype=obj > +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -filetype=obj > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -filetype=obj > +; RUN: llc < 
%s -mcpu=generic -mtriple=i686-darwin -segmented-stacks -filetype=obj > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-darwin -segmented-stacks -filetype=obj > +; RUN: llc < %s -mcpu=generic -mtriple=i686-mingw32 -segmented-stacks -filetype=obj > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-freebsd -segmented-stacks -filetype=obj > > -; RUN: not llc < %s -mtriple=x86_64-solaris -segmented-stacks 2> %t.log > +; RUN: not llc < %s -mcpu=generic -mtriple=x86_64-solaris -segmented-stacks 2> %t.log > ; RUN: FileCheck %s -input-file=%t.log -check-prefix=X64-Solaris > -; RUN: not llc < %s -mtriple=x86_64-mingw32 -segmented-stacks 2> %t.log > +; RUN: not llc < %s -mcpu=generic -mtriple=x86_64-mingw32 -segmented-stacks 2> %t.log > ; RUN: FileCheck %s -input-file=%t.log -check-prefix=X64-MinGW > -; RUN: not llc < %s -mtriple=i686-freebsd -segmented-stacks 2> %t.log > +; RUN: not llc < %s -mcpu=generic -mtriple=i686-freebsd -segmented-stacks 2> %t.log > ; RUN: FileCheck %s -input-file=%t.log -check-prefix=X32-FreeBSD > > ; X64-Solaris: Segmented stacks not supported on this platform > > Modified: llvm/trunk/test/CodeGen/X86/stack-align2.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/stack-align2.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/stack-align2.ll (original) > +++ llvm/trunk/test/CodeGen/X86/stack-align2.ll Wed Feb 1 17:20:51 2012 > @@ -1,9 +1,9 @@ > -; RUN: llc < %s -mtriple=i386-linux | FileCheck %s -check-prefix=LINUX-I386 > -; RUN: llc < %s -mtriple=i386-netbsd | FileCheck %s -check-prefix=NETBSD-I386 > -; RUN: llc < %s -mtriple=i686-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-I386 > -; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s -check-prefix=LINUX-X86_64 > -; RUN: llc < %s -mtriple=x86_64-netbsd | FileCheck %s -check-prefix=NETBSD-X86_64 > -; RUN: llc < %s -mtriple=x86_64-apple-darwin8 | FileCheck %s 
-check-prefix=DARWIN-X86_64 > +; RUN: llc < %s -mcpu=generic -mtriple=i386-linux | FileCheck %s -check-prefix=LINUX-I386 > +; RUN: llc < %s -mcpu=generic -mtriple=i386-netbsd | FileCheck %s -check-prefix=NETBSD-I386 > +; RUN: llc < %s -mcpu=generic -mtriple=i686-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-I386 > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux | FileCheck %s -check-prefix=LINUX-X86_64 > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-netbsd | FileCheck %s -check-prefix=NETBSD-X86_64 > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-X86_64 > > define i32 @test() nounwind { > entry: > > Modified: llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll (original) > +++ llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -mtriple=x86_64-linux -tailcallopt | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -tailcallopt | FileCheck %s > > ; FIXME: Win64 does not support byval. 
> > > Modified: llvm/trunk/test/CodeGen/X86/tailcallstack64.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcallstack64.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/tailcallstack64.ll (original) > +++ llvm/trunk/test/CodeGen/X86/tailcallstack64.ll Wed Feb 1 17:20:51 2012 > @@ -1,5 +1,5 @@ > -; RUN: llc < %s -tailcallopt -mtriple=x86_64-linux -post-RA-scheduler=true | FileCheck %s > -; RUN: llc < %s -tailcallopt -mtriple=x86_64-win32 -post-RA-scheduler=true | FileCheck %s > +; RUN: llc < %s -tailcallopt -mcpu=generic -mtriple=x86_64-linux -post-RA-scheduler=true | FileCheck %s > +; RUN: llc < %s -tailcallopt -mcpu=generic -mtriple=x86_64-win32 -post-RA-scheduler=true | FileCheck %s > > ; FIXME: Redundant unused stack allocation could be eliminated. > ; CHECK: subq ${{24|72|80}}, %rsp > > Modified: llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll (original) > +++ llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll Wed Feb 1 17:20:51 2012 > @@ -5,7 +5,7 @@ > ;; allocator turns the shift into an LEA. This also occurs for ADD. > > ; Check that the shift gets turned into an LEA. 
> -; RUN: llc < %s -mtriple=x86_64-apple-darwin | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-darwin | FileCheck %s > > @G = external global i32 > > > Modified: llvm/trunk/test/CodeGen/X86/v-binop-widen.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/v-binop-widen.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/v-binop-widen.ll (original) > +++ llvm/trunk/test/CodeGen/X86/v-binop-widen.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc -march=x86 -mattr=+sse < %s | FileCheck %s > +; RUN: llc -mcpu=generic -march=x86 -mattr=+sse < %s | FileCheck %s > ; CHECK: divss > ; CHECK: divps > ; CHECK: divps > > Modified: llvm/trunk/test/CodeGen/X86/vec_call.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vec_call.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/vec_call.ll (original) > +++ llvm/trunk/test/CodeGen/X86/vec_call.ll Wed Feb 1 17:20:51 2012 > @@ -1,6 +1,6 @@ > -; RUN: llc < %s -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ > +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ > ; RUN: grep {subl.*60} > -; RUN: llc < %s -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ > +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ > ; RUN: grep {movaps.*32} > > > > Modified: llvm/trunk/test/CodeGen/X86/widen_arith-1.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_arith-1.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/widen_arith-1.ll (original) > +++ llvm/trunk/test/CodeGen/X86/widen_arith-1.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc 
< %s -march=x86 -mattr=+sse42 | FileCheck %s > +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse42 | FileCheck %s > > define void @update(<3 x i8>* %dst, <3 x i8>* %src, i32 %n) nounwind { > entry: > > Modified: llvm/trunk/test/CodeGen/X86/widen_arith-3.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_arith-3.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/widen_arith-3.ll (original) > +++ llvm/trunk/test/CodeGen/X86/widen_arith-3.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -march=x86 -mattr=+sse42 -post-RA-scheduler=true | FileCheck %s > +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse42 -post-RA-scheduler=true | FileCheck %s > ; CHECK: incl > ; CHECK: incl > ; CHECK: incl > > Modified: llvm/trunk/test/CodeGen/X86/widen_load-2.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_load-2.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/widen_load-2.ll (original) > +++ llvm/trunk/test/CodeGen/X86/widen_load-2.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -o - -march=x86-64 -mattr=+sse42 | FileCheck %s > +; RUN: llc < %s -o - -mcpu=generic -march=x86-64 -mattr=+sse42 | FileCheck %s > > ; Test based on pr5626 to load/store > ; > > Modified: llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll (original) > +++ llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll Wed Feb 1 17:20:51 2012 > @@ -1,6 +1,6 @@ > -; RUN: llc < %s -join-physregs -mtriple=x86_64-mingw32 | FileCheck %s 
-check-prefix=M64 > -; RUN: llc < %s -join-physregs -mtriple=x86_64-win32 | FileCheck %s -check-prefix=W64 > -; RUN: llc < %s -join-physregs -mtriple=x86_64-win32-macho | FileCheck %s -check-prefix=EFI > +; RUN: llc < %s -join-physregs -mcpu=generic -mtriple=x86_64-mingw32 | FileCheck %s -check-prefix=M64 > +; RUN: llc < %s -join-physregs -mcpu=generic -mtriple=x86_64-win32 | FileCheck %s -check-prefix=W64 > +; RUN: llc < %s -join-physregs -mcpu=generic -mtriple=x86_64-win32-macho | FileCheck %s -check-prefix=EFI > ; PR8777 > ; PR8778 > > > Modified: llvm/trunk/test/CodeGen/X86/win64_vararg.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/win64_vararg.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/win64_vararg.ll (original) > +++ llvm/trunk/test/CodeGen/X86/win64_vararg.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -mtriple=x86_64-pc-win32 | FileCheck %s > +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-pc-win32 | FileCheck %s > > ; Verify that the var arg parameters which are passed in registers are stored > ; in home stack slots allocated by the caller and that AP is correctly > > Modified: llvm/trunk/test/CodeGen/X86/zext-fold.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/zext-fold.ll?rev=149558&r1=149557&r2=149558&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/zext-fold.ll (original) > +++ llvm/trunk/test/CodeGen/X86/zext-fold.ll Wed Feb 1 17:20:51 2012 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s -march=x86 | FileCheck %s > +; RUN: llc < %s -mcpu=generic -march=x86 | FileCheck %s > > ;; Simple case > define i32 @test1(i8 %x) nounwind readnone { > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits 
From peter_cooper at apple.com Wed Feb 1 17:43:12 2012 From: peter_cooper at apple.com (Pete Cooper) Date: Wed, 01 Feb 2012 23:43:12 -0000 Subject: [llvm-commits] [llvm] r149562 - /llvm/trunk/lib/AsmParser/LLParser.cpp Message-ID: <20120201234313.0B02A2A6C12C@llvm.org> Author: pete Date: Wed Feb 1 17:43:12 2012 New Revision: 149562 URL: http://llvm.org/viewvc/llvm-project?rev=149562&view=rev Log: Typo Modified: llvm/trunk/lib/AsmParser/LLParser.cpp Modified: llvm/trunk/lib/AsmParser/LLParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/AsmParser/LLParser.cpp?rev=149562&r1=149561&r2=149562&view=diff ============================================================================== --- llvm/trunk/lib/AsmParser/LLParser.cpp (original) +++ llvm/trunk/lib/AsmParser/LLParser.cpp Wed Feb 1 17:43:12 2012 @@ -3455,7 +3455,7 @@ return true; if (!ShuffleVectorInst::isValidOperands(Op0, Op1, Op2)) - return Error(Loc, "invalid extractelement operands"); + return Error(Loc, "invalid shufflevector operands"); Inst = new ShuffleVectorInst(Op0, Op1, Op2); return false; From mcrosier at apple.com Wed Feb 1 17:50:29 2012 From: mcrosier at apple.com (Chad Rosier) Date: Wed, 01 Feb 2012 15:50:29 -0800 Subject: [llvm-commits] [llvm] r149558 - in /llvm/trunk: lib/Target/X86/ test/CodeGen/X86/ In-Reply-To: References: <20120201232053.E046B2A6C12C@llvm.org> Message-ID: <678A657D-9AEF-4546-957F-0C6B5BBBD2C5@apple.com> And nevermind.. :D On Feb 1, 2012, at 3:47 PM, Chad Rosier wrote: > Andy, > > On Feb 1, 2012, at 3:20 PM, Andrew Trick wrote: > >> Author: atrick >> Date: Wed Feb 1 17:20:51 2012 >> New Revision: 149558 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=149558&view=rev >> Log: >> Instruction scheduling itinerary for Intel Atom. >> >> Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. 
>> >> Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. >> >> Adds a test to verify that the scheduler is working. >> >> Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP. >> >> Patch by Preston Gurd! >> >> Added: >> llvm/trunk/lib/Target/X86/X86Schedule.td >> llvm/trunk/lib/Target/X86/X86ScheduleAtom.td >> llvm/trunk/test/CodeGen/X86/atom-sched.ll > > Was CMakeLists.txt updated for these additions? > > Chad > >> Modified: >> llvm/trunk/lib/Target/X86/X86.td >> llvm/trunk/lib/Target/X86/X86ISelLowering.cpp >> llvm/trunk/lib/Target/X86/X86InstrArithmetic.td >> llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td >> llvm/trunk/lib/Target/X86/X86InstrControl.td >> llvm/trunk/lib/Target/X86/X86InstrFormats.td >> llvm/trunk/lib/Target/X86/X86InstrMMX.td >> llvm/trunk/lib/Target/X86/X86InstrSSE.td >> llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td >> llvm/trunk/lib/Target/X86/X86Subtarget.cpp >> llvm/trunk/lib/Target/X86/X86Subtarget.h >> llvm/trunk/lib/Target/X86/X86TargetMachine.cpp >> llvm/trunk/lib/Target/X86/X86TargetMachine.h >> llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll >> llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll >> llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll >> llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll >> llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll >> llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll >> llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll >> llvm/trunk/test/CodeGen/X86/abi-isel.ll >> llvm/trunk/test/CodeGen/X86/add.ll >> llvm/trunk/test/CodeGen/X86/byval6.ll >> llvm/trunk/test/CodeGen/X86/divide-by-constant.ll >> llvm/trunk/test/CodeGen/X86/epilogue.ll >> llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll >> 
llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll >> llvm/trunk/test/CodeGen/X86/fold-load.ll >> llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll >> llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll >> llvm/trunk/test/CodeGen/X86/optimize-max-3.ll >> llvm/trunk/test/CodeGen/X86/peep-test-3.ll >> llvm/trunk/test/CodeGen/X86/pic.ll >> llvm/trunk/test/CodeGen/X86/red-zone.ll >> llvm/trunk/test/CodeGen/X86/red-zone2.ll >> llvm/trunk/test/CodeGen/X86/reghinting.ll >> llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll >> llvm/trunk/test/CodeGen/X86/segmented-stacks.ll >> llvm/trunk/test/CodeGen/X86/stack-align2.ll >> llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll >> llvm/trunk/test/CodeGen/X86/tailcallstack64.ll >> llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll >> llvm/trunk/test/CodeGen/X86/v-binop-widen.ll >> llvm/trunk/test/CodeGen/X86/vec_call.ll >> llvm/trunk/test/CodeGen/X86/widen_arith-1.ll >> llvm/trunk/test/CodeGen/X86/widen_arith-3.ll >> llvm/trunk/test/CodeGen/X86/widen_load-2.ll >> llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll >> llvm/trunk/test/CodeGen/X86/win64_vararg.ll >> llvm/trunk/test/CodeGen/X86/zext-fold.ll >> >> Modified: llvm/trunk/lib/Target/X86/X86.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86.td?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86.td (original) >> +++ llvm/trunk/lib/Target/X86/X86.td Wed Feb 1 17:20:51 2012 >> @@ -120,8 +120,16 @@ >> // X86 processors supported. 
>> //===----------------------------------------------------------------------===// >> >> +include "X86Schedule.td" >> + >> +def ProcIntelAtom : SubtargetFeature<"atom", "X86ProcFamily", "IntelAtom", >> + "Intel Atom processors">; >> + >> class Proc<string Name, list<SubtargetFeature> Features> >> - : Processor<Name, NoItineraries, Features>; >> + : Processor<Name, GenericItineraries, Features>; >> + >> +class AtomProc<string Name, list<SubtargetFeature> Features> >> + : Processor<Name, AtomItineraries, Features>; >> >> def : Proc<"generic", []>; >> def : Proc<"i386", []>; >> @@ -146,8 +154,8 @@ >> FeatureSlowBTMem]>; >> def : Proc<"penryn", [FeatureSSE41, FeatureCMPXCHG16B, >> FeatureSlowBTMem]>; >> -def : Proc<"atom", [FeatureSSE3, FeatureCMPXCHG16B, FeatureMOVBE, >> - FeatureSlowBTMem]>; >> +def : AtomProc<"atom", [ProcIntelAtom, FeatureSSE3, FeatureCMPXCHG16B, >> + FeatureMOVBE, FeatureSlowBTMem]>; >> // "Arrandale" along with corei3 and corei5 >> def : Proc<"corei7", [FeatureSSE42, FeatureCMPXCHG16B, >> FeatureSlowBTMem, FeatureFastUAMem, >> >> Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) >> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Feb 1 17:20:51 2012 >> @@ -179,8 +179,11 @@ >> >> // For 64-bit since we have so many registers use the ILP scheduler, for >> // 32-bit code use the register pressure specific scheduling. >> + // For 32 bit Atom, use Hybrid (register pressure + latency) scheduling.
>> if (Subtarget->is64Bit()) >> setSchedulingPreference(Sched::ILP); >> + else if (Subtarget->isAtom()) >> + setSchedulingPreference(Sched::Hybrid); >> else >> setSchedulingPreference(Sched::RegPressure); >> setStackPointerRegisterToSaveRestore(X86StackPtr); >> >> Modified: llvm/trunk/lib/Target/X86/X86InstrArithmetic.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrArithmetic.td?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86InstrArithmetic.td (original) >> +++ llvm/trunk/lib/Target/X86/X86InstrArithmetic.td Wed Feb 1 17:20:51 2012 >> @@ -18,22 +18,24 @@ >> let neverHasSideEffects = 1 in >> def LEA16r : I<0x8D, MRMSrcMem, >> (outs GR16:$dst), (ins i32mem:$src), >> - "lea{w}\t{$src|$dst}, {$dst|$src}", []>, OpSize; >> + "lea{w}\t{$src|$dst}, {$dst|$src}", [], IIC_LEA_16>, OpSize; >> let isReMaterializable = 1 in >> def LEA32r : I<0x8D, MRMSrcMem, >> (outs GR32:$dst), (ins i32mem:$src), >> "lea{l}\t{$src|$dst}, {$dst|$src}", >> - [(set GR32:$dst, lea32addr:$src)]>, Requires<[In32BitMode]>; >> + [(set GR32:$dst, lea32addr:$src)], IIC_LEA>, >> + Requires<[In32BitMode]>; >> >> def LEA64_32r : I<0x8D, MRMSrcMem, >> (outs GR32:$dst), (ins lea64_32mem:$src), >> "lea{l}\t{$src|$dst}, {$dst|$src}", >> - [(set GR32:$dst, lea32addr:$src)]>, Requires<[In64BitMode]>; >> + [(set GR32:$dst, lea32addr:$src)], IIC_LEA>, >> + Requires<[In64BitMode]>; >> >> let isReMaterializable = 1 in >> def LEA64r : RI<0x8D, MRMSrcMem, (outs GR64:$dst), (ins i64mem:$src), >> "lea{q}\t{$src|$dst}, {$dst|$src}", >> - [(set GR64:$dst, lea64addr:$src)]>; >> + [(set GR64:$dst, lea64addr:$src)], IIC_LEA>; >> >> >> >> @@ -56,16 +58,18 @@ >> let Defs = [AX,DX,EFLAGS], Uses = [AX], neverHasSideEffects = 1 in >> def MUL16r : I<0xF7, MRM4r, (outs), (ins GR16:$src), >> "mul{w}\t$src", >> - []>, OpSize; // AX,DX = AX*GR16 >> + [], IIC_MUL16_REG>, OpSize; // AX,DX = 
AX*GR16 >> >> let Defs = [EAX,EDX,EFLAGS], Uses = [EAX], neverHasSideEffects = 1 in >> def MUL32r : I<0xF7, MRM4r, (outs), (ins GR32:$src), >> "mul{l}\t$src", // EAX,EDX = EAX*GR32 >> - [/*(set EAX, EDX, EFLAGS, (X86umul_flag EAX, GR32:$src))*/]>; >> + [/*(set EAX, EDX, EFLAGS, (X86umul_flag EAX, GR32:$src))*/], >> + IIC_MUL32_REG>; >> let Defs = [RAX,RDX,EFLAGS], Uses = [RAX], neverHasSideEffects = 1 in >> def MUL64r : RI<0xF7, MRM4r, (outs), (ins GR64:$src), >> "mul{q}\t$src", // RAX,RDX = RAX*GR64 >> - [/*(set RAX, RDX, EFLAGS, (X86umul_flag RAX, GR64:$src))*/]>; >> + [/*(set RAX, RDX, EFLAGS, (X86umul_flag RAX, GR64:$src))*/], >> + IIC_MUL64>; >> >> let Defs = [AL,EFLAGS,AX], Uses = [AL] in >> def MUL8m : I<0xF6, MRM4m, (outs), (ins i8mem :$src), >> @@ -74,21 +78,21 @@ >> // This probably ought to be moved to a def : Pat<> if the >> // syntax can be accepted. >> [(set AL, (mul AL, (loadi8 addr:$src))), >> - (implicit EFLAGS)]>; // AL,AH = AL*[mem8] >> + (implicit EFLAGS)], IIC_MUL8>; // AL,AH = AL*[mem8] >> >> let mayLoad = 1, neverHasSideEffects = 1 in { >> let Defs = [AX,DX,EFLAGS], Uses = [AX] in >> def MUL16m : I<0xF7, MRM4m, (outs), (ins i16mem:$src), >> "mul{w}\t$src", >> - []>, OpSize; // AX,DX = AX*[mem16] >> + [], IIC_MUL16_MEM>, OpSize; // AX,DX = AX*[mem16] >> >> let Defs = [EAX,EDX,EFLAGS], Uses = [EAX] in >> def MUL32m : I<0xF7, MRM4m, (outs), (ins i32mem:$src), >> "mul{l}\t$src", >> - []>; // EAX,EDX = EAX*[mem32] >> + [], IIC_MUL32_MEM>; // EAX,EDX = EAX*[mem32] >> let Defs = [RAX,RDX,EFLAGS], Uses = [RAX] in >> def MUL64m : RI<0xF7, MRM4m, (outs), (ins i64mem:$src), >> - "mul{q}\t$src", []>; // RAX,RDX = RAX*[mem64] >> + "mul{q}\t$src", [], IIC_MUL64>; // RAX,RDX = RAX*[mem64] >> } >> >> let neverHasSideEffects = 1 in { >> @@ -130,16 +134,19 @@ >> def IMUL16rr : I<0xAF, MRMSrcReg, (outs GR16:$dst), (ins GR16:$src1,GR16:$src2), >> "imul{w}\t{$src2, $dst|$dst, $src2}", >> [(set GR16:$dst, EFLAGS, >> - (X86smul_flag GR16:$src1, GR16:$src2))]>, TB, 
OpSize; >> + (X86smul_flag GR16:$src1, GR16:$src2))], IIC_IMUL16_RR>, >> + TB, OpSize; >> def IMUL32rr : I<0xAF, MRMSrcReg, (outs GR32:$dst), (ins GR32:$src1,GR32:$src2), >> "imul{l}\t{$src2, $dst|$dst, $src2}", >> [(set GR32:$dst, EFLAGS, >> - (X86smul_flag GR32:$src1, GR32:$src2))]>, TB; >> + (X86smul_flag GR32:$src1, GR32:$src2))], IIC_IMUL32_RR>, >> + TB; >> def IMUL64rr : RI<0xAF, MRMSrcReg, (outs GR64:$dst), >> (ins GR64:$src1, GR64:$src2), >> "imul{q}\t{$src2, $dst|$dst, $src2}", >> [(set GR64:$dst, EFLAGS, >> - (X86smul_flag GR64:$src1, GR64:$src2))]>, TB; >> + (X86smul_flag GR64:$src1, GR64:$src2))], IIC_IMUL64_RR>, >> + TB; >> } >> >> // Register-Memory Signed Integer Multiply >> @@ -147,18 +154,23 @@ >> (ins GR16:$src1, i16mem:$src2), >> "imul{w}\t{$src2, $dst|$dst, $src2}", >> [(set GR16:$dst, EFLAGS, >> - (X86smul_flag GR16:$src1, (load addr:$src2)))]>, >> + (X86smul_flag GR16:$src1, (load addr:$src2)))], >> + IIC_IMUL16_RM>, >> TB, OpSize; >> def IMUL32rm : I<0xAF, MRMSrcMem, (outs GR32:$dst), >> (ins GR32:$src1, i32mem:$src2), >> "imul{l}\t{$src2, $dst|$dst, $src2}", >> [(set GR32:$dst, EFLAGS, >> - (X86smul_flag GR32:$src1, (load addr:$src2)))]>, TB; >> + (X86smul_flag GR32:$src1, (load addr:$src2)))], >> + IIC_IMUL32_RM>, >> + TB; >> def IMUL64rm : RI<0xAF, MRMSrcMem, (outs GR64:$dst), >> (ins GR64:$src1, i64mem:$src2), >> "imul{q}\t{$src2, $dst|$dst, $src2}", >> [(set GR64:$dst, EFLAGS, >> - (X86smul_flag GR64:$src1, (load addr:$src2)))]>, TB; >> + (X86smul_flag GR64:$src1, (load addr:$src2)))], >> + IIC_IMUL64_RM>, >> + TB; >> } // Constraints = "$src1 = $dst" >> >> } // Defs = [EFLAGS] >> @@ -170,33 +182,39 @@ >> (outs GR16:$dst), (ins GR16:$src1, i16imm:$src2), >> "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR16:$dst, EFLAGS, >> - (X86smul_flag GR16:$src1, imm:$src2))]>, OpSize; >> + (X86smul_flag GR16:$src1, imm:$src2))], >> + IIC_IMUL16_RRI>, OpSize; >> def IMUL16rri8 : Ii8<0x6B, MRMSrcReg, // GR16 = GR16*I8 >> (outs 
GR16:$dst), (ins GR16:$src1, i16i8imm:$src2), >> "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR16:$dst, EFLAGS, >> - (X86smul_flag GR16:$src1, i16immSExt8:$src2))]>, >> + (X86smul_flag GR16:$src1, i16immSExt8:$src2))], >> + IIC_IMUL16_RRI>, >> OpSize; >> def IMUL32rri : Ii32<0x69, MRMSrcReg, // GR32 = GR32*I32 >> (outs GR32:$dst), (ins GR32:$src1, i32imm:$src2), >> "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR32:$dst, EFLAGS, >> - (X86smul_flag GR32:$src1, imm:$src2))]>; >> + (X86smul_flag GR32:$src1, imm:$src2))], >> + IIC_IMUL32_RRI>; >> def IMUL32rri8 : Ii8<0x6B, MRMSrcReg, // GR32 = GR32*I8 >> (outs GR32:$dst), (ins GR32:$src1, i32i8imm:$src2), >> "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR32:$dst, EFLAGS, >> - (X86smul_flag GR32:$src1, i32immSExt8:$src2))]>; >> + (X86smul_flag GR32:$src1, i32immSExt8:$src2))], >> + IIC_IMUL32_RRI>; >> def IMUL64rri32 : RIi32<0x69, MRMSrcReg, // GR64 = GR64*I32 >> (outs GR64:$dst), (ins GR64:$src1, i64i32imm:$src2), >> "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR64:$dst, EFLAGS, >> - (X86smul_flag GR64:$src1, i64immSExt32:$src2))]>; >> + (X86smul_flag GR64:$src1, i64immSExt32:$src2))], >> + IIC_IMUL64_RRI>; >> def IMUL64rri8 : RIi8<0x6B, MRMSrcReg, // GR64 = GR64*I8 >> (outs GR64:$dst), (ins GR64:$src1, i64i8imm:$src2), >> "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR64:$dst, EFLAGS, >> - (X86smul_flag GR64:$src1, i64immSExt8:$src2))]>; >> + (X86smul_flag GR64:$src1, i64immSExt8:$src2))], >> + IIC_IMUL64_RRI>; >> >> >> // Memory-Integer Signed Integer Multiply >> @@ -204,37 +222,43 @@ >> (outs GR16:$dst), (ins i16mem:$src1, i16imm:$src2), >> "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR16:$dst, EFLAGS, >> - (X86smul_flag (load addr:$src1), imm:$src2))]>, >> + (X86smul_flag (load addr:$src1), imm:$src2))], >> + IIC_IMUL16_RMI>, >> OpSize; >> def IMUL16rmi8 : Ii8<0x6B, MRMSrcMem, // GR16 = [mem16]*I8 >> (outs GR16:$dst), 
(ins i16mem:$src1, i16i8imm :$src2), >> "imul{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR16:$dst, EFLAGS, >> (X86smul_flag (load addr:$src1), >> - i16immSExt8:$src2))]>, OpSize; >> + i16immSExt8:$src2))], IIC_IMUL16_RMI>, >> + OpSize; >> def IMUL32rmi : Ii32<0x69, MRMSrcMem, // GR32 = [mem32]*I32 >> (outs GR32:$dst), (ins i32mem:$src1, i32imm:$src2), >> "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR32:$dst, EFLAGS, >> - (X86smul_flag (load addr:$src1), imm:$src2))]>; >> + (X86smul_flag (load addr:$src1), imm:$src2))], >> + IIC_IMUL32_RMI>; >> def IMUL32rmi8 : Ii8<0x6B, MRMSrcMem, // GR32 = [mem32]*I8 >> (outs GR32:$dst), (ins i32mem:$src1, i32i8imm: $src2), >> "imul{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR32:$dst, EFLAGS, >> (X86smul_flag (load addr:$src1), >> - i32immSExt8:$src2))]>; >> + i32immSExt8:$src2))], >> + IIC_IMUL32_RMI>; >> def IMUL64rmi32 : RIi32<0x69, MRMSrcMem, // GR64 = [mem64]*I32 >> (outs GR64:$dst), (ins i64mem:$src1, i64i32imm:$src2), >> "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR64:$dst, EFLAGS, >> (X86smul_flag (load addr:$src1), >> - i64immSExt32:$src2))]>; >> + i64immSExt32:$src2))], >> + IIC_IMUL64_RMI>; >> def IMUL64rmi8 : RIi8<0x6B, MRMSrcMem, // GR64 = [mem64]*I8 >> (outs GR64:$dst), (ins i64mem:$src1, i64i8imm: $src2), >> "imul{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}", >> [(set GR64:$dst, EFLAGS, >> (X86smul_flag (load addr:$src1), >> - i64immSExt8:$src2))]>; >> + i64immSExt8:$src2))], >> + IIC_IMUL64_RMI>; >> } // Defs = [EFLAGS] >> >> >> @@ -243,62 +267,62 @@ >> // unsigned division/remainder >> let Defs = [AL,EFLAGS,AX], Uses = [AX] in >> def DIV8r : I<0xF6, MRM6r, (outs), (ins GR8:$src), // AX/r8 = AL,AH >> - "div{b}\t$src", []>; >> + "div{b}\t$src", [], IIC_DIV8_REG>; >> let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in >> def DIV16r : I<0xF7, MRM6r, (outs), (ins GR16:$src), // DX:AX/r16 = AX,DX >> - "div{w}\t$src", []>, OpSize; >> + "div{w}\t$src", [], IIC_DIV16>, 
OpSize; >> let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in >> def DIV32r : I<0xF7, MRM6r, (outs), (ins GR32:$src), // EDX:EAX/r32 = EAX,EDX >> - "div{l}\t$src", []>; >> + "div{l}\t$src", [], IIC_DIV32>; >> // RDX:RAX/r64 = RAX,RDX >> let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in >> def DIV64r : RI<0xF7, MRM6r, (outs), (ins GR64:$src), >> - "div{q}\t$src", []>; >> + "div{q}\t$src", [], IIC_DIV64>; >> >> let mayLoad = 1 in { >> let Defs = [AL,EFLAGS,AX], Uses = [AX] in >> def DIV8m : I<0xF6, MRM6m, (outs), (ins i8mem:$src), // AX/[mem8] = AL,AH >> - "div{b}\t$src", []>; >> + "div{b}\t$src", [], IIC_DIV8_MEM>; >> let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in >> def DIV16m : I<0xF7, MRM6m, (outs), (ins i16mem:$src), // DX:AX/[mem16] = AX,DX >> - "div{w}\t$src", []>, OpSize; >> + "div{w}\t$src", [], IIC_DIV16>, OpSize; >> let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in // EDX:EAX/[mem32] = EAX,EDX >> def DIV32m : I<0xF7, MRM6m, (outs), (ins i32mem:$src), >> - "div{l}\t$src", []>; >> + "div{l}\t$src", [], IIC_DIV32>; >> // RDX:RAX/[mem64] = RAX,RDX >> let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in >> def DIV64m : RI<0xF7, MRM6m, (outs), (ins i64mem:$src), >> - "div{q}\t$src", []>; >> + "div{q}\t$src", [], IIC_DIV64>; >> } >> >> // Signed division/remainder. 
>> let Defs = [AL,EFLAGS,AX], Uses = [AX] in >> def IDIV8r : I<0xF6, MRM7r, (outs), (ins GR8:$src), // AX/r8 = AL,AH >> - "idiv{b}\t$src", []>; >> + "idiv{b}\t$src", [], IIC_IDIV8>; >> let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in >> def IDIV16r: I<0xF7, MRM7r, (outs), (ins GR16:$src), // DX:AX/r16 = AX,DX >> - "idiv{w}\t$src", []>, OpSize; >> + "idiv{w}\t$src", [], IIC_IDIV16>, OpSize; >> let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in >> def IDIV32r: I<0xF7, MRM7r, (outs), (ins GR32:$src), // EDX:EAX/r32 = EAX,EDX >> - "idiv{l}\t$src", []>; >> + "idiv{l}\t$src", [], IIC_IDIV32>; >> // RDX:RAX/r64 = RAX,RDX >> let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in >> def IDIV64r: RI<0xF7, MRM7r, (outs), (ins GR64:$src), >> - "idiv{q}\t$src", []>; >> + "idiv{q}\t$src", [], IIC_IDIV64>; >> >> let mayLoad = 1 in { >> let Defs = [AL,EFLAGS,AX], Uses = [AX] in >> def IDIV8m : I<0xF6, MRM7m, (outs), (ins i8mem:$src), // AX/[mem8] = AL,AH >> - "idiv{b}\t$src", []>; >> + "idiv{b}\t$src", [], IIC_IDIV8>; >> let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in >> def IDIV16m: I<0xF7, MRM7m, (outs), (ins i16mem:$src), // DX:AX/[mem16] = AX,DX >> - "idiv{w}\t$src", []>, OpSize; >> + "idiv{w}\t$src", [], IIC_IDIV16>, OpSize; >> let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in // EDX:EAX/[mem32] = EAX,EDX >> def IDIV32m: I<0xF7, MRM7m, (outs), (ins i32mem:$src), >> - "idiv{l}\t$src", []>; >> + "idiv{l}\t$src", [], IIC_IDIV32>; >> let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in // RDX:RAX/[mem64] = RAX,RDX >> def IDIV64m: RI<0xF7, MRM7m, (outs), (ins i64mem:$src), >> - "idiv{q}\t$src", []>; >> + "idiv{q}\t$src", [], IIC_IDIV64>; >> } >> >> //===----------------------------------------------------------------------===// >> @@ -312,35 +336,35 @@ >> def NEG8r : I<0xF6, MRM3r, (outs GR8 :$dst), (ins GR8 :$src1), >> "neg{b}\t$dst", >> [(set GR8:$dst, (ineg GR8:$src1)), >> - (implicit EFLAGS)]>; >> + (implicit EFLAGS)], IIC_UNARY_REG>; >> def NEG16r : I<0xF7, MRM3r, (outs GR16:$dst), (ins 
GR16:$src1), >> "neg{w}\t$dst", >> [(set GR16:$dst, (ineg GR16:$src1)), >> - (implicit EFLAGS)]>, OpSize; >> + (implicit EFLAGS)], IIC_UNARY_REG>, OpSize; >> def NEG32r : I<0xF7, MRM3r, (outs GR32:$dst), (ins GR32:$src1), >> "neg{l}\t$dst", >> [(set GR32:$dst, (ineg GR32:$src1)), >> - (implicit EFLAGS)]>; >> + (implicit EFLAGS)], IIC_UNARY_REG>; >> def NEG64r : RI<0xF7, MRM3r, (outs GR64:$dst), (ins GR64:$src1), "neg{q}\t$dst", >> [(set GR64:$dst, (ineg GR64:$src1)), >> - (implicit EFLAGS)]>; >> + (implicit EFLAGS)], IIC_UNARY_REG>; >> } // Constraints = "$src1 = $dst" >> >> def NEG8m : I<0xF6, MRM3m, (outs), (ins i8mem :$dst), >> "neg{b}\t$dst", >> [(store (ineg (loadi8 addr:$dst)), addr:$dst), >> - (implicit EFLAGS)]>; >> + (implicit EFLAGS)], IIC_UNARY_MEM>; >> def NEG16m : I<0xF7, MRM3m, (outs), (ins i16mem:$dst), >> "neg{w}\t$dst", >> [(store (ineg (loadi16 addr:$dst)), addr:$dst), >> - (implicit EFLAGS)]>, OpSize; >> + (implicit EFLAGS)], IIC_UNARY_MEM>, OpSize; >> def NEG32m : I<0xF7, MRM3m, (outs), (ins i32mem:$dst), >> "neg{l}\t$dst", >> [(store (ineg (loadi32 addr:$dst)), addr:$dst), >> - (implicit EFLAGS)]>; >> + (implicit EFLAGS)], IIC_UNARY_MEM>; >> def NEG64m : RI<0xF7, MRM3m, (outs), (ins i64mem:$dst), "neg{q}\t$dst", >> [(store (ineg (loadi64 addr:$dst)), addr:$dst), >> - (implicit EFLAGS)]>; >> + (implicit EFLAGS)], IIC_UNARY_MEM>; >> } // Defs = [EFLAGS] >> >> >> @@ -351,29 +375,30 @@ >> let AddedComplexity = 15 in { >> def NOT8r : I<0xF6, MRM2r, (outs GR8 :$dst), (ins GR8 :$src1), >> "not{b}\t$dst", >> - [(set GR8:$dst, (not GR8:$src1))]>; >> + [(set GR8:$dst, (not GR8:$src1))], IIC_UNARY_REG>; >> def NOT16r : I<0xF7, MRM2r, (outs GR16:$dst), (ins GR16:$src1), >> "not{w}\t$dst", >> - [(set GR16:$dst, (not GR16:$src1))]>, OpSize; >> + [(set GR16:$dst, (not GR16:$src1))], IIC_UNARY_REG>, OpSize; >> def NOT32r : I<0xF7, MRM2r, (outs GR32:$dst), (ins GR32:$src1), >> "not{l}\t$dst", >> - [(set GR32:$dst, (not GR32:$src1))]>; >> + [(set GR32:$dst, (not 
GR32:$src1))], IIC_UNARY_REG>; >> def NOT64r : RI<0xF7, MRM2r, (outs GR64:$dst), (ins GR64:$src1), "not{q}\t$dst", >> - [(set GR64:$dst, (not GR64:$src1))]>; >> + [(set GR64:$dst, (not GR64:$src1))], IIC_UNARY_REG>; >> } >> } // Constraints = "$src1 = $dst" >> >> def NOT8m : I<0xF6, MRM2m, (outs), (ins i8mem :$dst), >> "not{b}\t$dst", >> - [(store (not (loadi8 addr:$dst)), addr:$dst)]>; >> + [(store (not (loadi8 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>; >> def NOT16m : I<0xF7, MRM2m, (outs), (ins i16mem:$dst), >> "not{w}\t$dst", >> - [(store (not (loadi16 addr:$dst)), addr:$dst)]>, OpSize; >> + [(store (not (loadi16 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>, >> + OpSize; >> def NOT32m : I<0xF7, MRM2m, (outs), (ins i32mem:$dst), >> "not{l}\t$dst", >> - [(store (not (loadi32 addr:$dst)), addr:$dst)]>; >> + [(store (not (loadi32 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>; >> def NOT64m : RI<0xF7, MRM2m, (outs), (ins i64mem:$dst), "not{q}\t$dst", >> - [(store (not (loadi64 addr:$dst)), addr:$dst)]>; >> + [(store (not (loadi64 addr:$dst)), addr:$dst)], IIC_UNARY_MEM>; >> } // CodeSize >> >> // TODO: inc/dec is slow for P4, but fast for Pentium-M. >> @@ -382,19 +407,22 @@ >> let CodeSize = 2 in >> def INC8r : I<0xFE, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1), >> "inc{b}\t$dst", >> - [(set GR8:$dst, EFLAGS, (X86inc_flag GR8:$src1))]>; >> + [(set GR8:$dst, EFLAGS, (X86inc_flag GR8:$src1))], >> + IIC_UNARY_REG>; >> >> let isConvertibleToThreeAddress = 1, CodeSize = 1 in { // Can xform into LEA. 
>> def INC16r : I<0x40, AddRegFrm, (outs GR16:$dst), (ins GR16:$src1), >> "inc{w}\t$dst", >> - [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))]>, >> + [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))], IIC_UNARY_REG>, >> OpSize, Requires<[In32BitMode]>; >> def INC32r : I<0x40, AddRegFrm, (outs GR32:$dst), (ins GR32:$src1), >> "inc{l}\t$dst", >> - [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))]>, >> + [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))], >> + IIC_UNARY_REG>, >> Requires<[In32BitMode]>; >> def INC64r : RI<0xFF, MRM0r, (outs GR64:$dst), (ins GR64:$src1), "inc{q}\t$dst", >> - [(set GR64:$dst, EFLAGS, (X86inc_flag GR64:$src1))]>; >> + [(set GR64:$dst, EFLAGS, (X86inc_flag GR64:$src1))], >> + IIC_UNARY_REG>; >> } // isConvertibleToThreeAddress = 1, CodeSize = 1 >> >> >> @@ -403,19 +431,23 @@ >> // Can transform into LEA. >> def INC64_16r : I<0xFF, MRM0r, (outs GR16:$dst), (ins GR16:$src1), >> "inc{w}\t$dst", >> - [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))]>, >> + [(set GR16:$dst, EFLAGS, (X86inc_flag GR16:$src1))], >> + IIC_UNARY_REG>, >> OpSize, Requires<[In64BitMode]>; >> def INC64_32r : I<0xFF, MRM0r, (outs GR32:$dst), (ins GR32:$src1), >> "inc{l}\t$dst", >> - [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))]>, >> + [(set GR32:$dst, EFLAGS, (X86inc_flag GR32:$src1))], >> + IIC_UNARY_REG>, >> Requires<[In64BitMode]>; >> def DEC64_16r : I<0xFF, MRM1r, (outs GR16:$dst), (ins GR16:$src1), >> "dec{w}\t$dst", >> - [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))]>, >> + [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))], >> + IIC_UNARY_REG>, >> OpSize, Requires<[In64BitMode]>; >> def DEC64_32r : I<0xFF, MRM1r, (outs GR32:$dst), (ins GR32:$src1), >> "dec{l}\t$dst", >> - [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))]>, >> + [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))], >> + IIC_UNARY_REG>, >> Requires<[In64BitMode]>; >> } // isConvertibleToThreeAddress = 1, CodeSize = 2 >> >> @@ -424,37 +456,37 @@ >> let CodeSize = 2 in { >> 
def INC8m : I<0xFE, MRM0m, (outs), (ins i8mem :$dst), "inc{b}\t$dst", >> [(store (add (loadi8 addr:$dst), 1), addr:$dst), >> - (implicit EFLAGS)]>; >> + (implicit EFLAGS)], IIC_UNARY_MEM>; >> def INC16m : I<0xFF, MRM0m, (outs), (ins i16mem:$dst), "inc{w}\t$dst", >> [(store (add (loadi16 addr:$dst), 1), addr:$dst), >> - (implicit EFLAGS)]>, >> + (implicit EFLAGS)], IIC_UNARY_MEM>, >> OpSize, Requires<[In32BitMode]>; >> def INC32m : I<0xFF, MRM0m, (outs), (ins i32mem:$dst), "inc{l}\t$dst", >> [(store (add (loadi32 addr:$dst), 1), addr:$dst), >> - (implicit EFLAGS)]>, >> + (implicit EFLAGS)], IIC_UNARY_MEM>, >> Requires<[In32BitMode]>; >> def INC64m : RI<0xFF, MRM0m, (outs), (ins i64mem:$dst), "inc{q}\t$dst", >> [(store (add (loadi64 addr:$dst), 1), addr:$dst), >> - (implicit EFLAGS)]>; >> + (implicit EFLAGS)], IIC_UNARY_MEM>; >> >> // These are duplicates of their 32-bit counterparts. Only needed so X86 knows >> // how to unfold them. >> // FIXME: What is this for?? >> def INC64_16m : I<0xFF, MRM0m, (outs), (ins i16mem:$dst), "inc{w}\t$dst", >> [(store (add (loadi16 addr:$dst), 1), addr:$dst), >> - (implicit EFLAGS)]>, >> + (implicit EFLAGS)], IIC_UNARY_MEM>, >> OpSize, Requires<[In64BitMode]>; >> def INC64_32m : I<0xFF, MRM0m, (outs), (ins i32mem:$dst), "inc{l}\t$dst", >> [(store (add (loadi32 addr:$dst), 1), addr:$dst), >> - (implicit EFLAGS)]>, >> + (implicit EFLAGS)], IIC_UNARY_MEM>, >> Requires<[In64BitMode]>; >> def DEC64_16m : I<0xFF, MRM1m, (outs), (ins i16mem:$dst), "dec{w}\t$dst", >> [(store (add (loadi16 addr:$dst), -1), addr:$dst), >> - (implicit EFLAGS)]>, >> + (implicit EFLAGS)], IIC_UNARY_MEM>, >> OpSize, Requires<[In64BitMode]>; >> def DEC64_32m : I<0xFF, MRM1m, (outs), (ins i32mem:$dst), "dec{l}\t$dst", >> [(store (add (loadi32 addr:$dst), -1), addr:$dst), >> - (implicit EFLAGS)]>, >> + (implicit EFLAGS)], IIC_UNARY_MEM>, >> Requires<[In64BitMode]>; >> } // CodeSize = 2 >> >> @@ -462,18 +494,22 @@ >> let CodeSize = 2 in >> def DEC8r : I<0xFE, MRM1r, 
(outs GR8 :$dst), (ins GR8 :$src1), >> "dec{b}\t$dst", >> - [(set GR8:$dst, EFLAGS, (X86dec_flag GR8:$src1))]>; >> + [(set GR8:$dst, EFLAGS, (X86dec_flag GR8:$src1))], >> + IIC_UNARY_REG>; >> let isConvertibleToThreeAddress = 1, CodeSize = 1 in { // Can xform into LEA. >> def DEC16r : I<0x48, AddRegFrm, (outs GR16:$dst), (ins GR16:$src1), >> "dec{w}\t$dst", >> - [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))]>, >> + [(set GR16:$dst, EFLAGS, (X86dec_flag GR16:$src1))], >> + IIC_UNARY_REG>, >> OpSize, Requires<[In32BitMode]>; >> def DEC32r : I<0x48, AddRegFrm, (outs GR32:$dst), (ins GR32:$src1), >> "dec{l}\t$dst", >> - [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))]>, >> + [(set GR32:$dst, EFLAGS, (X86dec_flag GR32:$src1))], >> + IIC_UNARY_REG>, >> Requires<[In32BitMode]>; >> def DEC64r : RI<0xFF, MRM1r, (outs GR64:$dst), (ins GR64:$src1), "dec{q}\t$dst", >> - [(set GR64:$dst, EFLAGS, (X86dec_flag GR64:$src1))]>; >> + [(set GR64:$dst, EFLAGS, (X86dec_flag GR64:$src1))], >> + IIC_UNARY_REG>; >> } // CodeSize = 2 >> } // Constraints = "$src1 = $dst" >> >> @@ -481,18 +517,18 @@ >> let CodeSize = 2 in { >> def DEC8m : I<0xFE, MRM1m, (outs), (ins i8mem :$dst), "dec{b}\t$dst", >> [(store (add (loadi8 addr:$dst), -1), addr:$dst), >> - (implicit EFLAGS)]>; >> + (implicit EFLAGS)], IIC_UNARY_MEM>; >> def DEC16m : I<0xFF, MRM1m, (outs), (ins i16mem:$dst), "dec{w}\t$dst", >> [(store (add (loadi16 addr:$dst), -1), addr:$dst), >> - (implicit EFLAGS)]>, >> + (implicit EFLAGS)], IIC_UNARY_MEM>, >> OpSize, Requires<[In32BitMode]>; >> def DEC32m : I<0xFF, MRM1m, (outs), (ins i32mem:$dst), "dec{l}\t$dst", >> [(store (add (loadi32 addr:$dst), -1), addr:$dst), >> - (implicit EFLAGS)]>, >> + (implicit EFLAGS)], IIC_UNARY_MEM>, >> Requires<[In32BitMode]>; >> def DEC64m : RI<0xFF, MRM1m, (outs), (ins i64mem:$dst), "dec{q}\t$dst", >> [(store (add (loadi64 addr:$dst), -1), addr:$dst), >> - (implicit EFLAGS)]>; >> + (implicit EFLAGS)], IIC_UNARY_MEM>; >> } // CodeSize = 2 >> } // Defs 
= [EFLAGS] >> >> @@ -588,11 +624,13 @@ >> /// 4. Infers whether the low bit of the opcode should be 0 (for i8 operations) >> /// or 1 (for i16,i32,i64 operations). >> class ITy opcode, Format f, X86TypeInfo typeinfo, dag outs, dag ins, >> - string mnemonic, string args, list pattern> >> + string mnemonic, string args, list pattern, >> + InstrItinClass itin = IIC_BIN_NONMEM> >> : I<{opcode{7}, opcode{6}, opcode{5}, opcode{4}, >> opcode{3}, opcode{2}, opcode{1}, typeinfo.HasOddOpcode }, >> f, outs, ins, >> - !strconcat(mnemonic, "{", typeinfo.InstrSuffix, "}\t", args), pattern> { >> + !strconcat(mnemonic, "{", typeinfo.InstrSuffix, "}\t", args), pattern, >> + itin> { >> >> // Infer instruction prefixes from type info. >> let hasOpSizePrefix = typeinfo.HasOpSizePrefix; >> @@ -664,7 +702,7 @@ >> dag outlist, list pattern> >> : ITy> (ins typeinfo.RegClass:$src1, typeinfo.MemOperand:$src2), >> - mnemonic, "{$src2, $src1|$src1, $src2}", pattern>; >> + mnemonic, "{$src2, $src1|$src1, $src2}", pattern, IIC_BIN_MEM>; >> >> // BinOpRM_R - Instructions like "add reg, reg, [mem]". >> class BinOpRM_R opcode, string mnemonic, X86TypeInfo typeinfo, >> @@ -776,7 +814,7 @@ >> list pattern> >> : ITy> (outs), (ins typeinfo.MemOperand:$dst, typeinfo.RegClass:$src), >> - mnemonic, "{$src, $dst|$dst, $src}", pattern>; >> + mnemonic, "{$src, $dst|$dst, $src}", pattern, IIC_BIN_MEM>; >> >> // BinOpMR_RMW - Instructions like "add [mem], reg". 
>> class BinOpMR_RMW opcode, string mnemonic, X86TypeInfo typeinfo, >> @@ -804,7 +842,7 @@ >> Format f, list pattern, bits<8> opcode = 0x80> >> : ITy> (outs), (ins typeinfo.MemOperand:$dst, typeinfo.ImmOperand:$src), >> - mnemonic, "{$src, $dst|$dst, $src}", pattern> { >> + mnemonic, "{$src, $dst|$dst, $src}", pattern, IIC_BIN_MEM> { >> let ImmT = typeinfo.ImmEncoding; >> } >> >> @@ -837,7 +875,7 @@ >> Format f, list pattern> >> : ITy<0x82, f, typeinfo, >> (outs), (ins typeinfo.MemOperand:$dst, typeinfo.Imm8Operand:$src), >> - mnemonic, "{$src, $dst|$dst, $src}", pattern> { >> + mnemonic, "{$src, $dst|$dst, $src}", pattern, IIC_BIN_MEM> { >> let ImmT = Imm8; // Always 8-bit immediate. >> } >> >> @@ -1150,7 +1188,7 @@ >> // register class is constrained to GR8_NOREX. >> let isPseudo = 1 in >> def TEST8ri_NOREX : I<0, Pseudo, (outs), (ins GR8_NOREX:$src, i8imm:$mask), >> - "", []>; >> + "", [], IIC_BIN_NONMEM>; >> } >> >> //===----------------------------------------------------------------------===// >> @@ -1160,11 +1198,12 @@ >> PatFrag ld_frag> { >> def rr : I<0xF2, MRMSrcReg, (outs RC:$dst), (ins RC:$src1, RC:$src2), >> !strconcat(mnemonic, "\t{$src2, $src1, $dst|$dst, $src1, $src2}"), >> - [(set RC:$dst, EFLAGS, (X86andn_flag RC:$src1, RC:$src2))]>; >> + [(set RC:$dst, EFLAGS, (X86andn_flag RC:$src1, RC:$src2))], >> + IIC_BIN_NONMEM>; >> def rm : I<0xF2, MRMSrcMem, (outs RC:$dst), (ins RC:$src1, x86memop:$src2), >> !strconcat(mnemonic, "\t{$src2, $src1, $dst|$dst, $src1, $src2}"), >> [(set RC:$dst, EFLAGS, >> - (X86andn_flag RC:$src1, (ld_frag addr:$src2)))]>; >> + (X86andn_flag RC:$src1, (ld_frag addr:$src2)))], IIC_BIN_MEM>; >> } >> >> let Predicates = [HasBMI], Defs = [EFLAGS] in { >> >> Modified: llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> 
--- llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td (original) >> +++ llvm/trunk/lib/Target/X86/X86InstrCMovSetCC.td Wed Feb 1 17:20:51 2012 >> @@ -21,17 +21,20 @@ >> : I> !strconcat(Mnemonic, "{w}\t{$src2, $dst|$dst, $src2}"), >> [(set GR16:$dst, >> - (X86cmov GR16:$src1, GR16:$src2, CondNode, EFLAGS))]>,TB,OpSize; >> + (X86cmov GR16:$src1, GR16:$src2, CondNode, EFLAGS))], >> + IIC_CMOV16_RR>,TB,OpSize; >> def #NAME#32rr >> : I> !strconcat(Mnemonic, "{l}\t{$src2, $dst|$dst, $src2}"), >> [(set GR32:$dst, >> - (X86cmov GR32:$src1, GR32:$src2, CondNode, EFLAGS))]>, TB; >> + (X86cmov GR32:$src1, GR32:$src2, CondNode, EFLAGS))], >> + IIC_CMOV32_RR>, TB; >> def #NAME#64rr >> :RI> !strconcat(Mnemonic, "{q}\t{$src2, $dst|$dst, $src2}"), >> [(set GR64:$dst, >> - (X86cmov GR64:$src1, GR64:$src2, CondNode, EFLAGS))]>, TB; >> + (X86cmov GR64:$src1, GR64:$src2, CondNode, EFLAGS))], >> + IIC_CMOV32_RR>, TB; >> } >> >> let Uses = [EFLAGS], Predicates = [HasCMov], Constraints = "$src1 = $dst" in { >> @@ -39,17 +42,18 @@ >> : I> !strconcat(Mnemonic, "{w}\t{$src2, $dst|$dst, $src2}"), >> [(set GR16:$dst, (X86cmov GR16:$src1, (loadi16 addr:$src2), >> - CondNode, EFLAGS))]>, TB, OpSize; >> + CondNode, EFLAGS))], IIC_CMOV16_RM>, >> + TB, OpSize; >> def #NAME#32rm >> : I> !strconcat(Mnemonic, "{l}\t{$src2, $dst|$dst, $src2}"), >> [(set GR32:$dst, (X86cmov GR32:$src1, (loadi32 addr:$src2), >> - CondNode, EFLAGS))]>, TB; >> + CondNode, EFLAGS))], IIC_CMOV32_RM>, TB; >> def #NAME#64rm >> :RI> !strconcat(Mnemonic, "{q}\t{$src2, $dst|$dst, $src2}"), >> [(set GR64:$dst, (X86cmov GR64:$src1, (loadi64 addr:$src2), >> - CondNode, EFLAGS))]>, TB; >> + CondNode, EFLAGS))], IIC_CMOV32_RM>, TB; >> } // Uses = [EFLAGS], Predicates = [HasCMov], Constraints = "$src1 = $dst" >> } // end multiclass >> >> @@ -78,10 +82,12 @@ >> let Uses = [EFLAGS] in { >> def r : I> !strconcat(Mnemonic, "\t$dst"), >> - [(set GR8:$dst, (X86setcc OpNode, EFLAGS))]>, TB; >> + [(set GR8:$dst, (X86setcc OpNode, EFLAGS))], >> + 
IIC_SET_R>, TB; >> def m : I> !strconcat(Mnemonic, "\t$dst"), >> - [(store (X86setcc OpNode, EFLAGS), addr:$dst)]>, TB; >> + [(store (X86setcc OpNode, EFLAGS), addr:$dst)], >> + IIC_SET_M>, TB; >> } // Uses = [EFLAGS] >> } >> >> >> Modified: llvm/trunk/lib/Target/X86/X86InstrControl.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrControl.td?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86InstrControl.td (original) >> +++ llvm/trunk/lib/Target/X86/X86InstrControl.td Wed Feb 1 17:20:51 2012 >> @@ -20,41 +20,42 @@ >> hasCtrlDep = 1, FPForm = SpecialFP in { >> def RET : I <0xC3, RawFrm, (outs), (ins variable_ops), >> "ret", >> - [(X86retflag 0)]>; >> + [(X86retflag 0)], IIC_RET>; >> def RETI : Ii16<0xC2, RawFrm, (outs), (ins i16imm:$amt, variable_ops), >> "ret\t$amt", >> - [(X86retflag timm:$amt)]>; >> + [(X86retflag timm:$amt)], IIC_RET_IMM>; >> def RETIW : Ii16<0xC2, RawFrm, (outs), (ins i16imm:$amt, variable_ops), >> "retw\t$amt", >> - []>, OpSize; >> + [], IIC_RET_IMM>, OpSize; >> def LRETL : I <0xCB, RawFrm, (outs), (ins), >> - "lretl", []>; >> + "lretl", [], IIC_RET>; >> def LRETQ : RI <0xCB, RawFrm, (outs), (ins), >> - "lretq", []>; >> + "lretq", [], IIC_RET>; >> def LRETI : Ii16<0xCA, RawFrm, (outs), (ins i16imm:$amt), >> - "lret\t$amt", []>; >> + "lret\t$amt", [], IIC_RET>; >> def LRETIW : Ii16<0xCA, RawFrm, (outs), (ins i16imm:$amt), >> - "lretw\t$amt", []>, OpSize; >> + "lretw\t$amt", [], IIC_RET>, OpSize; >> } >> >> // Unconditional branches. 
>> let isBarrier = 1, isBranch = 1, isTerminator = 1 in { >> def JMP_4 : Ii32PCRel<0xE9, RawFrm, (outs), (ins brtarget:$dst), >> - "jmp\t$dst", [(br bb:$dst)]>; >> + "jmp\t$dst", [(br bb:$dst)], IIC_JMP_REL>; >> def JMP_1 : Ii8PCRel<0xEB, RawFrm, (outs), (ins brtarget8:$dst), >> - "jmp\t$dst", []>; >> + "jmp\t$dst", [], IIC_JMP_REL>; >> // FIXME : Intel syntax for JMP64pcrel32 such that it is not ambiguious >> // with JMP_1. >> def JMP64pcrel32 : I<0xE9, RawFrm, (outs), (ins brtarget:$dst), >> - "jmpq\t$dst", []>; >> + "jmpq\t$dst", [], IIC_JMP_REL>; >> } >> >> // Conditional Branches. >> let isBranch = 1, isTerminator = 1, Uses = [EFLAGS] in { >> multiclass ICBr opc1, bits<8> opc4, string asm, PatFrag Cond> { >> - def _1 : Ii8PCRel ; >> + def _1 : Ii8PCRel > + IIC_Jcc>; >> def _4 : Ii32PCRel> - [(X86brcond bb:$dst, Cond, EFLAGS)]>, TB; >> + [(X86brcond bb:$dst, Cond, EFLAGS)], IIC_Jcc>, TB; >> } >> } >> >> @@ -82,55 +83,55 @@ >> // jecxz. >> let Uses = [CX] in >> def JCXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), >> - "jcxz\t$dst", []>, AdSize, Requires<[In32BitMode]>; >> + "jcxz\t$dst", [], IIC_JCXZ>, AdSize, Requires<[In32BitMode]>; >> let Uses = [ECX] in >> def JECXZ_32 : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), >> - "jecxz\t$dst", []>, Requires<[In32BitMode]>; >> + "jecxz\t$dst", [], IIC_JCXZ>, Requires<[In32BitMode]>; >> >> // J*CXZ instruction: 64-bit versions of this instruction for the asmparser. >> // In 64-bit mode, the address size prefix is jecxz and the unprefixed version >> // is jrcxz. 
>> let Uses = [ECX] in >> def JECXZ_64 : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), >> - "jecxz\t$dst", []>, AdSize, Requires<[In64BitMode]>; >> + "jecxz\t$dst", [], IIC_JCXZ>, AdSize, Requires<[In64BitMode]>; >> let Uses = [RCX] in >> def JRCXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst), >> - "jrcxz\t$dst", []>, Requires<[In64BitMode]>; >> + "jrcxz\t$dst", [], IIC_JCXZ>, Requires<[In64BitMode]>; >> } >> >> // Indirect branches >> let isBranch = 1, isTerminator = 1, isBarrier = 1, isIndirectBranch = 1 in { >> def JMP32r : I<0xFF, MRM4r, (outs), (ins GR32:$dst), "jmp{l}\t{*}$dst", >> - [(brind GR32:$dst)]>, Requires<[In32BitMode]>; >> + [(brind GR32:$dst)], IIC_JMP_REG>, Requires<[In32BitMode]>; >> def JMP32m : I<0xFF, MRM4m, (outs), (ins i32mem:$dst), "jmp{l}\t{*}$dst", >> - [(brind (loadi32 addr:$dst))]>, Requires<[In32BitMode]>; >> + [(brind (loadi32 addr:$dst))], IIC_JMP_MEM>, Requires<[In32BitMode]>; >> >> def JMP64r : I<0xFF, MRM4r, (outs), (ins GR64:$dst), "jmp{q}\t{*}$dst", >> - [(brind GR64:$dst)]>, Requires<[In64BitMode]>; >> + [(brind GR64:$dst)], IIC_JMP_REG>, Requires<[In64BitMode]>; >> def JMP64m : I<0xFF, MRM4m, (outs), (ins i64mem:$dst), "jmp{q}\t{*}$dst", >> - [(brind (loadi64 addr:$dst))]>, Requires<[In64BitMode]>; >> + [(brind (loadi64 addr:$dst))], IIC_JMP_MEM>, Requires<[In64BitMode]>; >> >> def FARJMP16i : Iseg16<0xEA, RawFrmImm16, (outs), >> (ins i16imm:$off, i16imm:$seg), >> - "ljmp{w}\t{$seg, $off|$off, $seg}", []>, OpSize; >> + "ljmp{w}\t{$seg, $off|$off, $seg}", [], IIC_JMP_FAR_PTR>, OpSize; >> def FARJMP32i : Iseg32<0xEA, RawFrmImm16, (outs), >> (ins i32imm:$off, i16imm:$seg), >> - "ljmp{l}\t{$seg, $off|$off, $seg}", []>; >> + "ljmp{l}\t{$seg, $off|$off, $seg}", [], IIC_JMP_FAR_PTR>; >> def FARJMP64 : RI<0xFF, MRM5m, (outs), (ins opaque80mem:$dst), >> - "ljmp{q}\t{*}$dst", []>; >> + "ljmp{q}\t{*}$dst", [], IIC_JMP_FAR_MEM>; >> >> def FARJMP16m : I<0xFF, MRM5m, (outs), (ins opaque32mem:$dst), >> - "ljmp{w}\t{*}$dst", 
[]>, OpSize; >> + "ljmp{w}\t{*}$dst", [], IIC_JMP_FAR_MEM>, OpSize; >> def FARJMP32m : I<0xFF, MRM5m, (outs), (ins opaque48mem:$dst), >> - "ljmp{l}\t{*}$dst", []>; >> + "ljmp{l}\t{*}$dst", [], IIC_JMP_FAR_MEM>; >> } >> >> >> // Loop instructions >> >> -def LOOP : Ii8PCRel<0xE2, RawFrm, (outs), (ins brtarget8:$dst), "loop\t$dst", []>; >> -def LOOPE : Ii8PCRel<0xE1, RawFrm, (outs), (ins brtarget8:$dst), "loope\t$dst", []>; >> -def LOOPNE : Ii8PCRel<0xE0, RawFrm, (outs), (ins brtarget8:$dst), "loopne\t$dst", []>; >> +def LOOP : Ii8PCRel<0xE2, RawFrm, (outs), (ins brtarget8:$dst), "loop\t$dst", [], IIC_LOOP>; >> +def LOOPE : Ii8PCRel<0xE1, RawFrm, (outs), (ins brtarget8:$dst), "loope\t$dst", [], IIC_LOOPE>; >> +def LOOPNE : Ii8PCRel<0xE0, RawFrm, (outs), (ins brtarget8:$dst), "loopne\t$dst", [], IIC_LOOPNE>; >> >> //===----------------------------------------------------------------------===// >> // Call Instructions... >> @@ -147,25 +148,27 @@ >> Uses = [ESP] in { >> def CALLpcrel32 : Ii32PCRel<0xE8, RawFrm, >> (outs), (ins i32imm_pcrel:$dst,variable_ops), >> - "call{l}\t$dst", []>, Requires<[In32BitMode]>; >> + "call{l}\t$dst", [], IIC_CALL_RI>, Requires<[In32BitMode]>; >> def CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops), >> - "call{l}\t{*}$dst", [(X86call GR32:$dst)]>, >> + "call{l}\t{*}$dst", [(X86call GR32:$dst)], IIC_CALL_RI>, >> Requires<[In32BitMode]>; >> def CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops), >> - "call{l}\t{*}$dst", [(X86call (loadi32 addr:$dst))]>, >> + "call{l}\t{*}$dst", [(X86call (loadi32 addr:$dst))], IIC_CALL_MEM>, >> Requires<[In32BitMode]>; >> >> def FARCALL16i : Iseg16<0x9A, RawFrmImm16, (outs), >> (ins i16imm:$off, i16imm:$seg), >> - "lcall{w}\t{$seg, $off|$off, $seg}", []>, OpSize; >> + "lcall{w}\t{$seg, $off|$off, $seg}", [], >> + IIC_CALL_FAR_PTR>, OpSize; >> def FARCALL32i : Iseg32<0x9A, RawFrmImm16, (outs), >> (ins i32imm:$off, i16imm:$seg), >> - "lcall{l}\t{$seg, $off|$off, $seg}", []>; >> + 
"lcall{l}\t{$seg, $off|$off, $seg}", [], >> + IIC_CALL_FAR_PTR>; >> >> def FARCALL16m : I<0xFF, MRM3m, (outs), (ins opaque32mem:$dst), >> - "lcall{w}\t{*}$dst", []>, OpSize; >> + "lcall{w}\t{*}$dst", [], IIC_CALL_FAR_MEM>, OpSize; >> def FARCALL32m : I<0xFF, MRM3m, (outs), (ins opaque48mem:$dst), >> - "lcall{l}\t{*}$dst", []>; >> + "lcall{l}\t{*}$dst", [], IIC_CALL_FAR_MEM>; >> >> // callw for 16 bit code for the assembler. >> let isAsmParserOnly = 1 in >> @@ -196,13 +199,13 @@ >> // mcinst. >> def TAILJMPd : Ii32PCRel<0xE9, RawFrm, (outs), >> (ins i32imm_pcrel:$dst, variable_ops), >> - "jmp\t$dst # TAILCALL", >> - []>; >> + "jmp\t$dst # TAILCALL", >> + [], IIC_JMP_REL>; >> def TAILJMPr : I<0xFF, MRM4r, (outs), (ins GR32_TC:$dst, variable_ops), >> - "", []>; // FIXME: Remove encoding when JIT is dead. >> + "", [], IIC_JMP_REG>; // FIXME: Remove encoding when JIT is dead. >> let mayLoad = 1 in >> def TAILJMPm : I<0xFF, MRM4m, (outs), (ins i32mem_TC:$dst, variable_ops), >> - "jmp{l}\t{*}$dst # TAILCALL", []>; >> + "jmp{l}\t{*}$dst # TAILCALL", [], IIC_JMP_MEM>; >> } >> >> >> @@ -226,17 +229,19 @@ >> // the 32-bit pcrel field that we have. 
>> def CALL64pcrel32 : Ii32PCRel<0xE8, RawFrm, >> (outs), (ins i64i32imm_pcrel:$dst, variable_ops), >> - "call{q}\t$dst", []>, >> + "call{q}\t$dst", [], IIC_CALL_RI>, >> Requires<[In64BitMode, NotWin64]>; >> def CALL64r : I<0xFF, MRM2r, (outs), (ins GR64:$dst, variable_ops), >> - "call{q}\t{*}$dst", [(X86call GR64:$dst)]>, >> + "call{q}\t{*}$dst", [(X86call GR64:$dst)], >> + IIC_CALL_RI>, >> Requires<[In64BitMode, NotWin64]>; >> def CALL64m : I<0xFF, MRM2m, (outs), (ins i64mem:$dst, variable_ops), >> - "call{q}\t{*}$dst", [(X86call (loadi64 addr:$dst))]>, >> + "call{q}\t{*}$dst", [(X86call (loadi64 addr:$dst))], >> + IIC_CALL_MEM>, >> Requires<[In64BitMode, NotWin64]>; >> >> def FARCALL64 : RI<0xFF, MRM3m, (outs), (ins opaque80mem:$dst), >> - "lcall{q}\t{*}$dst", []>; >> + "lcall{q}\t{*}$dst", [], IIC_CALL_FAR_MEM>; >> } >> >> // FIXME: We need to teach codegen about single list of call-clobbered >> @@ -253,15 +258,16 @@ >> Uses = [RSP] in { >> def WINCALL64pcrel32 : Ii32PCRel<0xE8, RawFrm, >> (outs), (ins i64i32imm_pcrel:$dst, variable_ops), >> - "call{q}\t$dst", []>, >> + "call{q}\t$dst", [], IIC_CALL_RI>, >> Requires<[IsWin64]>; >> def WINCALL64r : I<0xFF, MRM2r, (outs), (ins GR64:$dst, variable_ops), >> "call{q}\t{*}$dst", >> - [(X86call GR64:$dst)]>, Requires<[IsWin64]>; >> + [(X86call GR64:$dst)], IIC_CALL_RI>, >> + Requires<[IsWin64]>; >> def WINCALL64m : I<0xFF, MRM2m, (outs), >> (ins i64mem:$dst,variable_ops), >> "call{q}\t{*}$dst", >> - [(X86call (loadi64 addr:$dst))]>, >> + [(X86call (loadi64 addr:$dst))], IIC_CALL_MEM>, >> Requires<[IsWin64]>; >> } >> >> @@ -272,7 +278,7 @@ >> Uses = [RSP] in { >> def W64ALLOCA : Ii32PCRel<0xE8, RawFrm, >> (outs), (ins i64i32imm_pcrel:$dst, variable_ops), >> - "call{q}\t$dst", []>, >> + "call{q}\t$dst", [], IIC_CALL_RI>, >> Requires<[IsWin64]>; >> } >> >> @@ -296,11 +302,11 @@ >> >> def TAILJMPd64 : Ii32PCRel<0xE9, RawFrm, (outs), >> (ins i64i32imm_pcrel:$dst, variable_ops), >> - "jmp\t$dst # TAILCALL", []>; >> + 
"jmp\t$dst # TAILCALL", [], IIC_JMP_REL>; >> def TAILJMPr64 : I<0xFF, MRM4r, (outs), (ins ptr_rc_tailcall:$dst, variable_ops), >> - "jmp{q}\t{*}$dst # TAILCALL", []>; >> + "jmp{q}\t{*}$dst # TAILCALL", [], IIC_JMP_MEM>; >> >> let mayLoad = 1 in >> def TAILJMPm64 : I<0xFF, MRM4m, (outs), (ins i64mem_TC:$dst, variable_ops), >> - "jmp{q}\t{*}$dst # TAILCALL", []>; >> + "jmp{q}\t{*}$dst # TAILCALL", [], IIC_JMP_MEM>; >> } >> >> Modified: llvm/trunk/lib/Target/X86/X86InstrFormats.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFormats.td?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86InstrFormats.td (original) >> +++ llvm/trunk/lib/Target/X86/X86InstrFormats.td Wed Feb 1 17:20:51 2012 >> @@ -123,7 +123,9 @@ >> class MemOp4 { bit hasMemOp4Prefix = 1; } >> class XOP { bit hasXOP_Prefix = 1; } >> class X86Inst opcod, Format f, ImmType i, dag outs, dag ins, >> - string AsmStr, Domain d = GenericDomain> >> + string AsmStr, >> + InstrItinClass itin, >> + Domain d = GenericDomain> >> : Instruction { >> let Namespace = "X86"; >> >> @@ -139,6 +141,8 @@ >> // If this is a pseudo instruction, mark it isCodeGenOnly. >> let isCodeGenOnly = !eq(!cast(f), "Pseudo"); >> >> + let Itinerary = itin; >> + >> // >> // Attributes specific to X86 instructions... 
>> // >> @@ -189,51 +193,53 @@ >> } >> >> class PseudoI pattern> >> - : X86Inst<0, Pseudo, NoImm, oops, iops, ""> { >> + : X86Inst<0, Pseudo, NoImm, oops, iops, "", NoItinerary> { >> let Pattern = pattern; >> } >> >> class I o, Format f, dag outs, dag ins, string asm, >> - list pattern, Domain d = GenericDomain> >> - : X86Inst { >> + list pattern, InstrItinClass itin = IIC_DEFAULT, >> + Domain d = GenericDomain> >> + : X86Inst { >> let Pattern = pattern; >> let CodeSize = 3; >> } >> class Ii8 o, Format f, dag outs, dag ins, string asm, >> - list pattern, Domain d = GenericDomain> >> - : X86Inst { >> + list pattern, InstrItinClass itin = IIC_DEFAULT, >> + Domain d = GenericDomain> >> + : X86Inst { >> let Pattern = pattern; >> let CodeSize = 3; >> } >> class Ii8PCRel o, Format f, dag outs, dag ins, string asm, >> - list pattern> >> - : X86Inst { >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : X86Inst { >> let Pattern = pattern; >> let CodeSize = 3; >> } >> class Ii16 o, Format f, dag outs, dag ins, string asm, >> - list pattern> >> - : X86Inst { >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : X86Inst { >> let Pattern = pattern; >> let CodeSize = 3; >> } >> class Ii32 o, Format f, dag outs, dag ins, string asm, >> - list pattern> >> - : X86Inst { >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : X86Inst { >> let Pattern = pattern; >> let CodeSize = 3; >> } >> >> class Ii16PCRel o, Format f, dag outs, dag ins, string asm, >> - list pattern> >> - : X86Inst { >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : X86Inst { >> let Pattern = pattern; >> let CodeSize = 3; >> } >> >> class Ii32PCRel o, Format f, dag outs, dag ins, string asm, >> - list pattern> >> - : X86Inst { >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : X86Inst { >> let Pattern = pattern; >> let CodeSize = 3; >> } >> @@ -244,8 +250,9 @@ >> : I {} >> >> // FpI_ - Floating Point Pseudo Instruction template. Not Predicated. 
>> -class FpI_ pattern> >> - : X86Inst<0, Pseudo, NoImm, outs, ins, ""> { >> +class FpI_ pattern, >> + InstrItinClass itin = IIC_DEFAULT> >> + : X86Inst<0, Pseudo, NoImm, outs, ins, "", itin> { >> let FPForm = fp; >> let Pattern = pattern; >> } >> @@ -257,20 +264,23 @@ >> // Iseg32 - 16-bit segment selector, 32-bit offset >> >> class Iseg16 o, Format f, dag outs, dag ins, string asm, >> - list pattern> : X86Inst { >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : X86Inst { >> let Pattern = pattern; >> let CodeSize = 3; >> } >> >> class Iseg32 o, Format f, dag outs, dag ins, string asm, >> - list pattern> : X86Inst { >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : X86Inst { >> let Pattern = pattern; >> let CodeSize = 3; >> } >> >> // SI - SSE 1 & 2 scalar instructions >> -class SI o, Format F, dag outs, dag ins, string asm, list pattern> >> - : I { >> +class SI o, Format F, dag outs, dag ins, string asm, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I { >> let Predicates = !if(hasVEXPrefix /* VEX */, [HasAVX], >> !if(!eq(Prefix, 12 /* XS */), [HasSSE1], [HasSSE2])); >> >> @@ -280,8 +290,8 @@ >> >> // SIi8 - SSE 1 & 2 scalar instructions >> class SIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8 { >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8 { >> let Predicates = !if(hasVEXPrefix /* VEX */, [HasAVX], >> !if(!eq(Prefix, 12 /* XS */), [HasSSE1], [HasSSE2])); >> >> @@ -291,8 +301,8 @@ >> >> // PI - SSE 1 & 2 packed instructions >> class PI o, Format F, dag outs, dag ins, string asm, list pattern, >> - Domain d> >> - : I { >> + InstrItinClass itin, Domain d> >> + : I { >> let Predicates = !if(hasVEXPrefix /* VEX */, [HasAVX], >> !if(hasOpSizePrefix /* OpSize */, [HasSSE2], [HasSSE1])); >> >> @@ -302,8 +312,8 @@ >> >> // PIi8 - SSE 1 & 2 packed instructions with immediate >> class PIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern, Domain d> >> - : Ii8 { >> + list 
pattern, InstrItinClass itin, Domain d> >> + : Ii8 { >> let Predicates = !if(hasVEX_4VPrefix /* VEX */, [HasAVX], >> !if(hasOpSizePrefix /* OpSize */, [HasSSE2], [HasSSE1])); >> >> @@ -319,25 +329,27 @@ >> // VSSI - SSE1 instructions with XS prefix in AVX form. >> // VPSI - SSE1 instructions with TB prefix in AVX form. >> >> -class SSI o, Format F, dag outs, dag ins, string asm, list pattern> >> - : I, XS, Requires<[HasSSE1]>; >> +class SSI o, Format F, dag outs, dag ins, string asm, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, XS, Requires<[HasSSE1]>; >> class SSIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, XS, Requires<[HasSSE1]>; >> -class PSI o, Format F, dag outs, dag ins, string asm, list pattern> >> - : I, TB, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, XS, Requires<[HasSSE1]>; >> +class PSI o, Format F, dag outs, dag ins, string asm, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, TB, >> Requires<[HasSSE1]>; >> class PSIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, TB, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TB, >> Requires<[HasSSE1]>; >> class VSSI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, XS, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, XS, >> Requires<[HasAVX]>; >> class VPSI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, TB, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, TB, >> Requires<[HasAVX]>; >> >> // SSE2 Instruction Templates: >> @@ -350,28 +362,30 @@ >> // VSDI - SSE2 instructions with XD prefix in AVX form. >> // VPDI - SSE2 instructions with TB and OpSize prefixes in AVX form. 
>> >> -class SDI o, Format F, dag outs, dag ins, string asm, list pattern> >> - : I, XD, Requires<[HasSSE2]>; >> +class SDI o, Format F, dag outs, dag ins, string asm, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, XD, Requires<[HasSSE2]>; >> class SDIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, XD, Requires<[HasSSE2]>; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, XD, Requires<[HasSSE2]>; >> class SSDIi8 o, Format F, dag outs, dag ins, string asm, >> list pattern> >> : Ii8, XS, Requires<[HasSSE2]>; >> -class PDI o, Format F, dag outs, dag ins, string asm, list pattern> >> - : I, TB, OpSize, >> +class PDI o, Format F, dag outs, dag ins, string asm, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, TB, OpSize, >> Requires<[HasSSE2]>; >> class PDIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, TB, OpSize, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TB, OpSize, >> Requires<[HasSSE2]>; >> class VSDI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, XD, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, XD, >> Requires<[HasAVX]>; >> class VPDI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, TB, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, TB, >> OpSize, Requires<[HasAVX]>; >> >> // SSE3 Instruction Templates: >> @@ -381,15 +395,16 @@ >> // S3DI - SSE3 instructions with XD prefix. 
>> >> class S3SI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, XS, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, XS, >> Requires<[HasSSE3]>; >> class S3DI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, XD, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, XD, >> Requires<[HasSSE3]>; >> -class S3I o, Format F, dag outs, dag ins, string asm, list pattern> >> - : I, TB, OpSize, >> +class S3I o, Format F, dag outs, dag ins, string asm, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, TB, OpSize, >> Requires<[HasSSE3]>; >> >> >> @@ -403,12 +418,12 @@ >> // classes. They need to be enabled even if AVX is enabled. >> >> class SS38I o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, T8, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, T8, >> Requires<[HasSSSE3]>; >> class SS3AI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, TA, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TA, >> Requires<[HasSSSE3]>; >> >> // SSE4.1 Instruction Templates: >> @@ -417,31 +432,31 @@ >> // SS41AIi8 - SSE 4.1 instructions with TA prefix and ImmT == Imm8. >> // >> class SS48I o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, T8, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, T8, >> Requires<[HasSSE41]>; >> class SS4AIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, TA, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TA, >> Requires<[HasSSE41]>; >> >> // SSE4.2 Instruction Templates: >> // >> // SS428I - SSE 4.2 instructions with T8 prefix. >> class SS428I o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, T8, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, T8, >> Requires<[HasSSE42]>; >> >> // SS42FI - SSE 4.2 instructions with T8XD prefix. 
>> class SS42FI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, T8XD, Requires<[HasSSE42]>; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, T8XD, Requires<[HasSSE42]>; >> >> // SS42AI = SSE 4.2 instructions with TA prefix >> class SS42AI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, TA, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TA, >> Requires<[HasSSE42]>; >> >> // AVX Instruction Templates: >> @@ -450,12 +465,12 @@ >> // AVX8I - AVX instructions with T8 and OpSize prefix. >> // AVXAIi8 - AVX instructions with TA, OpSize prefix and ImmT = Imm8. >> class AVX8I o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, T8, OpSize, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, T8, OpSize, >> Requires<[HasAVX]>; >> class AVXAIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, TA, OpSize, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TA, OpSize, >> Requires<[HasAVX]>; >> >> // AVX2 Instruction Templates: >> @@ -464,12 +479,12 @@ >> // AVX28I - AVX2 instructions with T8 and OpSize prefix. >> // AVX2AIi8 - AVX2 instructions with TA, OpSize prefix and ImmT = Imm8. >> class AVX28I o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, T8, OpSize, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, T8, OpSize, >> Requires<[HasAVX2]>; >> class AVX2AIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, TA, OpSize, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TA, OpSize, >> Requires<[HasAVX2]>; >> >> // AES Instruction Templates: >> @@ -477,87 +492,88 @@ >> // AES8I >> // These use the same encoding as the SSE4.2 T8 and TA encodings. 
>> class AES8I o, Format F, dag outs, dag ins, string asm, >> - listpattern> >> - : I, T8, >> + listpattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, T8, >> Requires<[HasSSE2, HasAES]>; >> >> class AESAI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, TA, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TA, >> Requires<[HasSSE2, HasAES]>; >> >> // CLMUL Instruction Templates >> class CLMULIi8 o, Format F, dag outs, dag ins, string asm, >> - listpattern> >> - : Ii8, TA, >> + listpattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TA, >> OpSize, Requires<[HasSSE2, HasCLMUL]>; >> >> class AVXCLMULIi8 o, Format F, dag outs, dag ins, string asm, >> - listpattern> >> - : Ii8, TA, >> + listpattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TA, >> OpSize, VEX_4V, Requires<[HasAVX, HasCLMUL]>; >> >> // FMA3 Instruction Templates >> class FMA3 o, Format F, dag outs, dag ins, string asm, >> - listpattern> >> - : I, T8, >> + listpattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, T8, >> OpSize, VEX_4V, Requires<[HasFMA3]>; >> >> // FMA4 Instruction Templates >> class FMA4 o, Format F, dag outs, dag ins, string asm, >> - listpattern> >> - : Ii8, TA, >> + listpattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, TA, >> OpSize, VEX_4V, VEX_I8IMM, Requires<[HasFMA4]>; >> >> // XOP 2, 3 and 4 Operand Instruction Template >> class IXOP o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, >> XOP, XOP9, Requires<[HasXOP]>; >> >> // XOP 2, 3 and 4 Operand Instruction Templates with imm byte >> class IXOPi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, >> XOP, XOP8, Requires<[HasXOP]>; >> >> // XOP 5 operand instruction (VEX encoding!) 
>> class IXOP5 o, Format F, dag outs, dag ins, string asm, >> - listpattern> >> - : Ii8, TA, >> + listpattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TA, >> OpSize, VEX_4V, VEX_I8IMM, Requires<[HasXOP]>; >> >> // X86-64 Instruction templates... >> // >> >> -class RI o, Format F, dag outs, dag ins, string asm, list pattern> >> - : I, REX_W; >> +class RI o, Format F, dag outs, dag ins, string asm, >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, REX_W; >> class RIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, REX_W; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, REX_W; >> class RIi32 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii32, REX_W; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii32, REX_W; >> >> class RIi64 o, Format f, dag outs, dag ins, string asm, >> - list pattern> >> - : X86Inst, REX_W { >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : X86Inst, REX_W { >> let Pattern = pattern; >> let CodeSize = 3; >> } >> >> class RSSI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : SSI, REX_W; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : SSI, REX_W; >> class RSDI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : SDI, REX_W; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : SDI, REX_W; >> class RPDI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : PDI, REX_W; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : PDI, REX_W; >> class VRPDI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : VPDI, VEX_W; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : VPDI, VEX_W; >> >> // MMX Instruction templates >> // >> @@ -570,23 +586,23 @@ >> // MMXID - MMX instructions with XD prefix. >> // MMXIS - MMX instructions with XS prefix. 
>> class MMXI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, TB, Requires<[HasMMX]>; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, TB, Requires<[HasMMX]>; >> class MMXI64 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, TB, Requires<[HasMMX,In64BitMode]>; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, TB, Requires<[HasMMX,In64BitMode]>; >> class MMXRI o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, TB, REX_W, Requires<[HasMMX]>; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, TB, REX_W, Requires<[HasMMX]>; >> class MMX2I o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : I, TB, OpSize, Requires<[HasMMX]>; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : I, TB, OpSize, Requires<[HasMMX]>; >> class MMXIi8 o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, TB, Requires<[HasMMX]>; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, TB, Requires<[HasMMX]>; >> class MMXID o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, XD, Requires<[HasMMX]>; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, XD, Requires<[HasMMX]>; >> class MMXIS o, Format F, dag outs, dag ins, string asm, >> - list pattern> >> - : Ii8, XS, Requires<[HasMMX]>; >> + list pattern, InstrItinClass itin = IIC_DEFAULT> >> + : Ii8, XS, Requires<[HasMMX]>; >> >> Modified: llvm/trunk/lib/Target/X86/X86InstrMMX.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrMMX.td?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86InstrMMX.td (original) >> +++ llvm/trunk/lib/Target/X86/X86InstrMMX.td Wed Feb 1 17:20:51 2012 >> @@ -105,19 +105,23 @@ >> Intrinsic Int, X86MemOperand x86memop, PatFrag ld_frag, >> string asm, Domain d> { >> def irr : PI> - [(set 
DstRC:$dst, (Int SrcRC:$src))], d>; >> + [(set DstRC:$dst, (Int SrcRC:$src))], >> + IIC_DEFAULT, d>; >> def irm : PI> - [(set DstRC:$dst, (Int (ld_frag addr:$src)))], d>; >> + [(set DstRC:$dst, (Int (ld_frag addr:$src)))], >> + IIC_DEFAULT, d>; >> } >> >> multiclass sse12_cvt_pint_3addr opc, RegisterClass SrcRC, >> RegisterClass DstRC, Intrinsic Int, X86MemOperand x86memop, >> PatFrag ld_frag, string asm, Domain d> { >> def irr : PI> - asm, [(set DstRC:$dst, (Int DstRC:$src1, SrcRC:$src2))], d>; >> + asm, [(set DstRC:$dst, (Int DstRC:$src1, SrcRC:$src2))], >> + IIC_DEFAULT, d>; >> def irm : PI> (ins DstRC:$src1, x86memop:$src2), asm, >> - [(set DstRC:$dst, (Int DstRC:$src1, (ld_frag addr:$src2)))], d>; >> + [(set DstRC:$dst, (Int DstRC:$src1, (ld_frag addr:$src2)))], >> + IIC_DEFAULT, d>; >> } >> >> //===----------------------------------------------------------------------===// >> >> Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original) >> +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Wed Feb 1 17:20:51 2012 >> @@ -67,13 +67,14 @@ >> !if(Is2Addr, >> !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, $src2}"), >> !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), >> - [(set RC:$dst, (vt (OpNode RC:$src1, RC:$src2)))], d>; >> + [(set RC:$dst, (vt (OpNode RC:$src1, RC:$src2)))], IIC_DEFAULT, d>; >> let mayLoad = 1 in >> def rm : PI> !if(Is2Addr, >> !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, $src2}"), >> !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), >> - [(set RC:$dst, (OpNode RC:$src1, (mem_frag addr:$src2)))], d>; >> + [(set RC:$dst, (OpNode RC:$src1, (mem_frag addr:$src2)))], >> + IIC_DEFAULT, d>; >> } >> >> /// sse12_fp_packed_logical_rm - SSE 1 & 2 packed instructions 
class >> @@ -87,12 +88,12 @@ >> !if(Is2Addr, >> !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, $src2}"), >> !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), >> - pat_rr, d>; >> + pat_rr, IIC_DEFAULT, d>; >> def rm : PI> !if(Is2Addr, >> !strconcat(OpcodeStr, "\t{$src2, $dst|$dst, $src2}"), >> !strconcat(OpcodeStr, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), >> - pat_rm, d>; >> + pat_rm, IIC_DEFAULT, d>; >> } >> >> /// sse12_fp_packed_int - SSE 1 & 2 packed instructions intrinsics class >> @@ -106,14 +107,14 @@ >> !strconcat(asm, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), >> [(set RC:$dst, (!cast( >> !strconcat("int_x86_", SSEVer, "_", OpcodeStr, FPSizeStr)) >> - RC:$src1, RC:$src2))], d>; >> + RC:$src1, RC:$src2))], IIC_DEFAULT, d>; >> def rm_Int : PI> !if(Is2Addr, >> !strconcat(asm, "\t{$src2, $dst|$dst, $src2}"), >> !strconcat(asm, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), >> [(set RC:$dst, (!cast( >> !strconcat("int_x86_", SSEVer, "_", OpcodeStr, FPSizeStr)) >> - RC:$src1, (mem_frag addr:$src2)))], d>; >> + RC:$src1, (mem_frag addr:$src2)))], IIC_DEFAULT, d>; >> } >> >> //===----------------------------------------------------------------------===// >> @@ -737,11 +738,11 @@ >> bit IsReMaterializable = 1> { >> let neverHasSideEffects = 1 in >> def rr : PI> - !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], d>; >> + !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], IIC_DEFAULT, d>; >> let canFoldAsLoad = 1, isReMaterializable = IsReMaterializable in >> def rm : PI> !strconcat(asm, "\t{$src, $dst|$dst, $src}"), >> - [(set RC:$dst, (ld_frag addr:$src))], d>; >> + [(set RC:$dst, (ld_frag addr:$src))], IIC_DEFAULT, d>; >> } >> >> defm VMOVAPS : sse12_mov_packed<0x28, VR128, f128mem, alignedloadv4f32, >> @@ -1003,14 +1004,14 @@ >> [(set RC:$dst, >> (mov_frag RC:$src1, >> (bc_v4f32 (v2f64 (scalar_to_vector (loadf64 addr:$src2))))))], >> - SSEPackedSingle>, TB; >> + IIC_DEFAULT, SSEPackedSingle>, TB; >> >> def PDrm : PI> (outs RC:$dst), (ins 
RC:$src1, f64mem:$src2), >> !strconcat(base_opc, "d", asm_opr), >> [(set RC:$dst, (v2f64 (mov_frag RC:$src1, >> (scalar_to_vector (loadf64 addr:$src2)))))], >> - SSEPackedDouble>, TB, OpSize; >> + IIC_DEFAULT, SSEPackedDouble>, TB, OpSize; >> } >> >> let AddedComplexity = 20 in { >> @@ -1413,9 +1414,11 @@ >> SDNode OpNode, X86MemOperand x86memop, PatFrag ld_frag, >> string asm, Domain d> { >> def rr : PI> - [(set DstRC:$dst, (OpNode SrcRC:$src))], d>; >> + [(set DstRC:$dst, (OpNode SrcRC:$src))], >> + IIC_DEFAULT, d>; >> def rm : PI> - [(set DstRC:$dst, (OpNode (ld_frag addr:$src)))], d>; >> + [(set DstRC:$dst, (OpNode (ld_frag addr:$src)))], >> + IIC_DEFAULT, d>; >> } >> >> multiclass sse12_vcvt_avx opc, RegisterClass SrcRC, RegisterClass DstRC, >> @@ -2124,11 +2127,13 @@ >> PatFrag ld_frag, string OpcodeStr, Domain d> { >> def rr: PI> !strconcat(OpcodeStr, "\t{$src2, $src1|$src1, $src2}"), >> - [(set EFLAGS, (OpNode (vt RC:$src1), RC:$src2))], d>; >> + [(set EFLAGS, (OpNode (vt RC:$src1), RC:$src2))], >> + IIC_DEFAULT, d>; >> def rm: PI> !strconcat(OpcodeStr, "\t{$src2, $src1|$src1, $src2}"), >> [(set EFLAGS, (OpNode (vt RC:$src1), >> - (ld_frag addr:$src2)))], d>; >> + (ld_frag addr:$src2)))], >> + IIC_DEFAULT, d>; >> } >> >> let Defs = [EFLAGS] in { >> @@ -2185,19 +2190,21 @@ >> let isAsmParserOnly = 1 in { >> def rri : PIi8<0xC2, MRMSrcReg, >> (outs RC:$dst), (ins RC:$src1, RC:$src2, SSECC:$cc), asm, >> - [(set RC:$dst, (Int RC:$src1, RC:$src2, imm:$cc))], d>; >> + [(set RC:$dst, (Int RC:$src1, RC:$src2, imm:$cc))], >> + IIC_DEFAULT, d>; >> def rmi : PIi8<0xC2, MRMSrcMem, >> (outs RC:$dst), (ins RC:$src1, x86memop:$src2, SSECC:$cc), asm, >> - [(set RC:$dst, (Int RC:$src1, (memop addr:$src2), imm:$cc))], d>; >> + [(set RC:$dst, (Int RC:$src1, (memop addr:$src2), imm:$cc))], >> + IIC_DEFAULT, d>; >> } >> >> // Accept explicit immediate argument form instead of comparison code. 
>> def rri_alt : PIi8<0xC2, MRMSrcReg, >> (outs RC:$dst), (ins RC:$src1, RC:$src2, i8imm:$cc), >> - asm_alt, [], d>; >> + asm_alt, [], IIC_DEFAULT, d>; >> def rmi_alt : PIi8<0xC2, MRMSrcMem, >> (outs RC:$dst), (ins RC:$src1, x86memop:$src2, i8imm:$cc), >> - asm_alt, [], d>; >> + asm_alt, [], IIC_DEFAULT, d>; >> } >> >> defm VCMPPS : sse12_cmp_packed> @@ -2272,12 +2279,14 @@ >> def rmi : PIi8<0xC6, MRMSrcMem, (outs RC:$dst), >> (ins RC:$src1, x86memop:$src2, i8imm:$src3), asm, >> [(set RC:$dst, (vt (shufp:$src3 >> - RC:$src1, (mem_frag addr:$src2))))], d>; >> + RC:$src1, (mem_frag addr:$src2))))], >> + IIC_DEFAULT, d>; >> let isConvertibleToThreeAddress = IsConvertibleToThreeAddress in >> def rri : PIi8<0xC6, MRMSrcReg, (outs RC:$dst), >> (ins RC:$src1, RC:$src2, i8imm:$src3), asm, >> [(set RC:$dst, >> - (vt (shufp:$src3 RC:$src1, RC:$src2)))], d>; >> + (vt (shufp:$src3 RC:$src1, RC:$src2)))], >> + IIC_DEFAULT, d>; >> } >> >> defm VSHUFPS : sse12_shuffle> @@ -2448,12 +2457,14 @@ >> def rr : PI> (outs RC:$dst), (ins RC:$src1, RC:$src2), >> asm, [(set RC:$dst, >> - (vt (OpNode RC:$src1, RC:$src2)))], d>; >> + (vt (OpNode RC:$src1, RC:$src2)))], >> + IIC_DEFAULT, d>; >> def rm : PI> (outs RC:$dst), (ins RC:$src1, x86memop:$src2), >> asm, [(set RC:$dst, >> (vt (OpNode RC:$src1, >> - (mem_frag addr:$src2))))], d>; >> + (mem_frag addr:$src2))))], >> + IIC_DEFAULT, d>; >> } >> >> let AddedComplexity = 10 in { >> @@ -2589,9 +2600,10 @@ >> Domain d> { >> def rr32 : PI<0x50, MRMSrcReg, (outs GR32:$dst), (ins RC:$src), >> !strconcat(asm, "\t{$src, $dst|$dst, $src}"), >> - [(set GR32:$dst, (Int RC:$src))], d>; >> + [(set GR32:$dst, (Int RC:$src))], IIC_DEFAULT, d>; >> def rr64 : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins RC:$src), >> - !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], d>, REX_W; >> + !strconcat(asm, "\t{$src, $dst|$dst, $src}"), [], >> + IIC_DEFAULT, d>, REX_W; >> } >> >> let Predicates = [HasAVX] in { >> @@ -2621,14 +2633,18 @@ >> >> // Assembler Only >> def 
VMOVMSKPSr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR128:$src), >> - "movmskps\t{$src, $dst|$dst, $src}", [], SSEPackedSingle>, TB, VEX; >> + "movmskps\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT, >> + SSEPackedSingle>, TB, VEX; >> def VMOVMSKPDr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR128:$src), >> - "movmskpd\t{$src, $dst|$dst, $src}", [], SSEPackedDouble>, TB, >> + "movmskpd\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT, >> + SSEPackedDouble>, TB, >> OpSize, VEX; >> def VMOVMSKPSYr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR256:$src), >> - "movmskps\t{$src, $dst|$dst, $src}", [], SSEPackedSingle>, TB, VEX; >> + "movmskps\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT, >> + SSEPackedSingle>, TB, VEX; >> def VMOVMSKPDYr64r : PI<0x50, MRMSrcReg, (outs GR64:$dst), (ins VR256:$src), >> - "movmskpd\t{$src, $dst|$dst, $src}", [], SSEPackedDouble>, TB, >> + "movmskpd\t{$src, $dst|$dst, $src}", [], IIC_DEFAULT, >> + SSEPackedDouble>, TB, >> OpSize, VEX; >> } >> >> @@ -6395,7 +6411,7 @@ >> !strconcat(OpcodeStr, >> "\t{$src3, $src2, $src1, $dst|$dst, $src1, $src2, $src3}"), >> [(set RC:$dst, (IntId RC:$src1, RC:$src2, RC:$src3))], >> - SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM; >> + IIC_DEFAULT, SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM; >> >> def rm : Ii8> (ins RC:$src1, x86memop:$src2, RC:$src3), >> @@ -6404,7 +6420,7 @@ >> [(set RC:$dst, >> (IntId RC:$src1, (bitconvert (mem_frag addr:$src2)), >> RC:$src3))], >> - SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM; >> + IIC_DEFAULT, SSEPackedInt>, OpSize, TA, VEX_4V, VEX_I8IMM; >> } >> >> let Predicates = [HasAVX] in { >> >> Modified: llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td (original) >> +++ 
llvm/trunk/lib/Target/X86/X86InstrShiftRotate.td Wed Feb 1 17:20:51 2012
>> @@ -19,44 +19,46 @@
>> let Uses = [CL] in {
>> def SHL8rCL : I<0xD2, MRM4r, (outs GR8 :$dst), (ins GR8 :$src1),
>> "shl{b}\t{%cl, $dst|$dst, CL}",
>> - [(set GR8:$dst, (shl GR8:$src1, CL))]>;
>> + [(set GR8:$dst, (shl GR8:$src1, CL))], IIC_SR>;
>> def SHL16rCL : I<0xD3, MRM4r, (outs GR16:$dst), (ins GR16:$src1),
>> "shl{w}\t{%cl, $dst|$dst, CL}",
>> - [(set GR16:$dst, (shl GR16:$src1, CL))]>, OpSize;
>> + [(set GR16:$dst, (shl GR16:$src1, CL))], IIC_SR>, OpSize;
>> def SHL32rCL : I<0xD3, MRM4r, (outs GR32:$dst), (ins GR32:$src1),
>> "shl{l}\t{%cl, $dst|$dst, CL}",
>> - [(set GR32:$dst, (shl GR32:$src1, CL))]>;
>> + [(set GR32:$dst, (shl GR32:$src1, CL))], IIC_SR>;
>> def SHL64rCL : RI<0xD3, MRM4r, (outs GR64:$dst), (ins GR64:$src1),
>> "shl{q}\t{%cl, $dst|$dst, CL}",
>> - [(set GR64:$dst, (shl GR64:$src1, CL))]>;
>> + [(set GR64:$dst, (shl GR64:$src1, CL))], IIC_SR>;
>> } // Uses = [CL]
>>
>> def SHL8ri : Ii8<0xC0, MRM4r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2),
>> "shl{b}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR8:$dst, (shl GR8:$src1, (i8 imm:$src2)))]>;
>> + [(set GR8:$dst, (shl GR8:$src1, (i8 imm:$src2)))], IIC_SR>;
>>
>> let isConvertibleToThreeAddress = 1 in { // Can transform into LEA.
>> def SHL16ri : Ii8<0xC1, MRM4r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2),
>> "shl{w}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR16:$dst, (shl GR16:$src1, (i8 imm:$src2)))]>, OpSize;
>> + [(set GR16:$dst, (shl GR16:$src1, (i8 imm:$src2)))], IIC_SR>,
>> + OpSize;
>> def SHL32ri : Ii8<0xC1, MRM4r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2),
>> "shl{l}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR32:$dst, (shl GR32:$src1, (i8 imm:$src2)))]>;
>> + [(set GR32:$dst, (shl GR32:$src1, (i8 imm:$src2)))], IIC_SR>;
>> def SHL64ri : RIi8<0xC1, MRM4r, (outs GR64:$dst),
>> (ins GR64:$src1, i8imm:$src2),
>> "shl{q}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR64:$dst, (shl GR64:$src1, (i8 imm:$src2)))]>;
>> + [(set GR64:$dst, (shl GR64:$src1, (i8 imm:$src2)))],
>> + IIC_SR>;
>>
>> // NOTE: We don't include patterns for shifts of a register by one, because
>> // 'add reg,reg' is cheaper (and we have a Pat pattern for shift-by-one).
>> def SHL8r1 : I<0xD0, MRM4r, (outs GR8:$dst), (ins GR8:$src1),
>> - "shl{b}\t$dst", []>;
>> + "shl{b}\t$dst", [], IIC_SR>;
>> def SHL16r1 : I<0xD1, MRM4r, (outs GR16:$dst), (ins GR16:$src1),
>> - "shl{w}\t$dst", []>, OpSize;
>> + "shl{w}\t$dst", [], IIC_SR>, OpSize;
>> def SHL32r1 : I<0xD1, MRM4r, (outs GR32:$dst), (ins GR32:$src1),
>> - "shl{l}\t$dst", []>;
>> + "shl{l}\t$dst", [], IIC_SR>;
>> def SHL64r1 : RI<0xD1, MRM4r, (outs GR64:$dst), (ins GR64:$src1),
>> - "shl{q}\t$dst", []>;
>> + "shl{q}\t$dst", [], IIC_SR>;
>> } // isConvertibleToThreeAddress = 1
>> } // Constraints = "$src = $dst"
>>
>> @@ -66,223 +68,266 @@
>> let Uses = [CL] in {
>> def SHL8mCL : I<0xD2, MRM4m, (outs), (ins i8mem :$dst),
>> "shl{b}\t{%cl, $dst|$dst, CL}",
>> - [(store (shl (loadi8 addr:$dst), CL), addr:$dst)]>;
>> + [(store (shl (loadi8 addr:$dst), CL), addr:$dst)], IIC_SR>;
>> def SHL16mCL : I<0xD3, MRM4m, (outs), (ins i16mem:$dst),
>> "shl{w}\t{%cl, $dst|$dst, CL}",
>> - [(store (shl (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize;
>> + [(store (shl (loadi16 addr:$dst), CL), addr:$dst)], IIC_SR>,
>> + OpSize;
>> def SHL32mCL : I<0xD3, MRM4m, (outs), (ins i32mem:$dst),
>> "shl{l}\t{%cl, $dst|$dst, CL}",
>> - [(store (shl (loadi32 addr:$dst), CL), addr:$dst)]>;
>> + [(store (shl (loadi32 addr:$dst), CL), addr:$dst)], IIC_SR>;
>> def SHL64mCL : RI<0xD3, MRM4m, (outs), (ins i64mem:$dst),
>> "shl{q}\t{%cl, $dst|$dst, CL}",
>> - [(store (shl (loadi64 addr:$dst), CL), addr:$dst)]>;
>> + [(store (shl (loadi64 addr:$dst), CL), addr:$dst)], IIC_SR>;
>> }
>> def SHL8mi : Ii8<0xC0, MRM4m, (outs), (ins i8mem :$dst, i8imm:$src),
>> "shl{b}\t{$src, $dst|$dst, $src}",
>> - [(store (shl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
>> + [(store (shl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>;
>> def SHL16mi : Ii8<0xC1, MRM4m, (outs), (ins i16mem:$dst, i8imm:$src),
>> "shl{w}\t{$src, $dst|$dst, $src}",
>> - [(store (shl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>,
>> + [(store (shl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>,
>> OpSize;
>> def SHL32mi : Ii8<0xC1, MRM4m, (outs), (ins i32mem:$dst, i8imm:$src),
>> "shl{l}\t{$src, $dst|$dst, $src}",
>> - [(store (shl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
>> + [(store (shl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>;
>> def SHL64mi : RIi8<0xC1, MRM4m, (outs), (ins i64mem:$dst, i8imm:$src),
>> "shl{q}\t{$src, $dst|$dst, $src}",
>> - [(store (shl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
>> + [(store (shl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>;
>>
>> // Shift by 1
>> def SHL8m1 : I<0xD0, MRM4m, (outs), (ins i8mem :$dst),
>> "shl{b}\t$dst",
>> - [(store (shl (loadi8 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (shl (loadi8 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>> def SHL16m1 : I<0xD1, MRM4m, (outs), (ins i16mem:$dst),
>> "shl{w}\t$dst",
>> - [(store (shl (loadi16 addr:$dst), (i8 1)), addr:$dst)]>,
>> + [(store (shl (loadi16 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>,
>> OpSize;
>> def SHL32m1 : I<0xD1, MRM4m, (outs), (ins i32mem:$dst),
>> "shl{l}\t$dst",
>> - [(store (shl (loadi32 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (shl (loadi32 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>> def SHL64m1 : RI<0xD1, MRM4m, (outs), (ins i64mem:$dst),
>> "shl{q}\t$dst",
>> - [(store (shl (loadi64 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (shl (loadi64 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>>
>> let Constraints = "$src1 = $dst" in {
>> let Uses = [CL] in {
>> def SHR8rCL : I<0xD2, MRM5r, (outs GR8 :$dst), (ins GR8 :$src1),
>> "shr{b}\t{%cl, $dst|$dst, CL}",
>> - [(set GR8:$dst, (srl GR8:$src1, CL))]>;
>> + [(set GR8:$dst, (srl GR8:$src1, CL))], IIC_SR>;
>> def SHR16rCL : I<0xD3, MRM5r, (outs GR16:$dst), (ins GR16:$src1),
>> "shr{w}\t{%cl, $dst|$dst, CL}",
>> - [(set GR16:$dst, (srl GR16:$src1, CL))]>, OpSize;
>> + [(set GR16:$dst, (srl GR16:$src1, CL))], IIC_SR>, OpSize;
>> def SHR32rCL : I<0xD3, MRM5r, (outs GR32:$dst), (ins GR32:$src1),
>> "shr{l}\t{%cl, $dst|$dst, CL}",
>> - [(set GR32:$dst, (srl GR32:$src1, CL))]>;
>> + [(set GR32:$dst, (srl GR32:$src1, CL))], IIC_SR>;
>> def SHR64rCL : RI<0xD3, MRM5r, (outs GR64:$dst), (ins GR64:$src1),
>> "shr{q}\t{%cl, $dst|$dst, CL}",
>> - [(set GR64:$dst, (srl GR64:$src1, CL))]>;
>> + [(set GR64:$dst, (srl GR64:$src1, CL))], IIC_SR>;
>> }
>>
>> def SHR8ri : Ii8<0xC0, MRM5r, (outs GR8:$dst), (ins GR8:$src1, i8imm:$src2),
>> "shr{b}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR8:$dst, (srl GR8:$src1, (i8 imm:$src2)))]>;
>> + [(set GR8:$dst, (srl GR8:$src1, (i8 imm:$src2)))], IIC_SR>;
>> def SHR16ri : Ii8<0xC1, MRM5r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2),
>> "shr{w}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR16:$dst, (srl GR16:$src1, (i8 imm:$src2)))]>, OpSize;
>> + [(set GR16:$dst, (srl GR16:$src1, (i8 imm:$src2)))],
>> + IIC_SR>, OpSize;
>> def SHR32ri : Ii8<0xC1, MRM5r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2),
>> "shr{l}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR32:$dst, (srl GR32:$src1, (i8 imm:$src2)))]>;
>> + [(set GR32:$dst, (srl GR32:$src1, (i8 imm:$src2)))],
>> + IIC_SR>;
>> def SHR64ri : RIi8<0xC1, MRM5r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$src2),
>> "shr{q}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR64:$dst, (srl GR64:$src1, (i8 imm:$src2)))]>;
>> + [(set GR64:$dst, (srl GR64:$src1, (i8 imm:$src2)))], IIC_SR>;
>>
>> // Shift right by 1
>> def SHR8r1 : I<0xD0, MRM5r, (outs GR8:$dst), (ins GR8:$src1),
>> "shr{b}\t$dst",
>> - [(set GR8:$dst, (srl GR8:$src1, (i8 1)))]>;
>> + [(set GR8:$dst, (srl GR8:$src1, (i8 1)))], IIC_SR>;
>> def SHR16r1 : I<0xD1, MRM5r, (outs GR16:$dst), (ins GR16:$src1),
>> "shr{w}\t$dst",
>> - [(set GR16:$dst, (srl GR16:$src1, (i8 1)))]>, OpSize;
>> + [(set GR16:$dst, (srl GR16:$src1, (i8 1)))], IIC_SR>, OpSize;
>> def SHR32r1 : I<0xD1, MRM5r, (outs GR32:$dst), (ins GR32:$src1),
>> "shr{l}\t$dst",
>> - [(set GR32:$dst, (srl GR32:$src1, (i8 1)))]>;
>> + [(set GR32:$dst, (srl GR32:$src1, (i8 1)))], IIC_SR>;
>> def SHR64r1 : RI<0xD1, MRM5r, (outs GR64:$dst), (ins GR64:$src1),
>> "shr{q}\t$dst",
>> - [(set GR64:$dst, (srl GR64:$src1, (i8 1)))]>;
>> + [(set GR64:$dst, (srl GR64:$src1, (i8 1)))], IIC_SR>;
>> } // Constraints = "$src = $dst"
>>
>>
>> let Uses = [CL] in {
>> def SHR8mCL : I<0xD2, MRM5m, (outs), (ins i8mem :$dst),
>> "shr{b}\t{%cl, $dst|$dst, CL}",
>> - [(store (srl (loadi8 addr:$dst), CL), addr:$dst)]>;
>> + [(store (srl (loadi8 addr:$dst), CL), addr:$dst)], IIC_SR>;
>> def SHR16mCL : I<0xD3, MRM5m, (outs), (ins i16mem:$dst),
>> "shr{w}\t{%cl, $dst|$dst, CL}",
>> - [(store (srl (loadi16 addr:$dst), CL), addr:$dst)]>,
>> + [(store (srl (loadi16 addr:$dst), CL), addr:$dst)], IIC_SR>,
>> OpSize;
>> def SHR32mCL : I<0xD3, MRM5m, (outs), (ins i32mem:$dst),
>> "shr{l}\t{%cl, $dst|$dst, CL}",
>> - [(store (srl (loadi32 addr:$dst), CL), addr:$dst)]>;
>> + [(store (srl (loadi32 addr:$dst), CL), addr:$dst)], IIC_SR>;
>> def SHR64mCL : RI<0xD3, MRM5m, (outs), (ins i64mem:$dst),
>> "shr{q}\t{%cl, $dst|$dst, CL}",
>> - [(store (srl (loadi64 addr:$dst), CL), addr:$dst)]>;
>> + [(store (srl (loadi64 addr:$dst), CL), addr:$dst)], IIC_SR>;
>> }
>> def SHR8mi : Ii8<0xC0, MRM5m, (outs), (ins i8mem :$dst, i8imm:$src),
>> "shr{b}\t{$src, $dst|$dst, $src}",
>> - [(store (srl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
>> + [(store (srl (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>;
>> def SHR16mi : Ii8<0xC1, MRM5m, (outs), (ins i16mem:$dst, i8imm:$src),
>> "shr{w}\t{$src, $dst|$dst, $src}",
>> - [(store (srl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>,
>> + [(store (srl (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>,
>> OpSize;
>> def SHR32mi : Ii8<0xC1, MRM5m, (outs), (ins i32mem:$dst, i8imm:$src),
>> "shr{l}\t{$src, $dst|$dst, $src}",
>> - [(store (srl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
>> + [(store (srl (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>;
>> def SHR64mi : RIi8<0xC1, MRM5m, (outs), (ins i64mem:$dst, i8imm:$src),
>> "shr{q}\t{$src, $dst|$dst, $src}",
>> - [(store (srl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
>> + [(store (srl (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>;
>>
>> // Shift by 1
>> def SHR8m1 : I<0xD0, MRM5m, (outs), (ins i8mem :$dst),
>> "shr{b}\t$dst",
>> - [(store (srl (loadi8 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (srl (loadi8 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>> def SHR16m1 : I<0xD1, MRM5m, (outs), (ins i16mem:$dst),
>> "shr{w}\t$dst",
>> - [(store (srl (loadi16 addr:$dst), (i8 1)), addr:$dst)]>,OpSize;
>> + [(store (srl (loadi16 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>,OpSize;
>> def SHR32m1 : I<0xD1, MRM5m, (outs), (ins i32mem:$dst),
>> "shr{l}\t$dst",
>> - [(store (srl (loadi32 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (srl (loadi32 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>> def SHR64m1 : RI<0xD1, MRM5m, (outs), (ins i64mem:$dst),
>> "shr{q}\t$dst",
>> - [(store (srl (loadi64 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (srl (loadi64 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>>
>> let Constraints = "$src1 = $dst" in {
>> let Uses = [CL] in {
>> def SAR8rCL : I<0xD2, MRM7r, (outs GR8 :$dst), (ins GR8 :$src1),
>> "sar{b}\t{%cl, $dst|$dst, CL}",
>> - [(set GR8:$dst, (sra GR8:$src1, CL))]>;
>> + [(set GR8:$dst, (sra GR8:$src1, CL))],
>> + IIC_SR>;
>> def SAR16rCL : I<0xD3, MRM7r, (outs GR16:$dst), (ins GR16:$src1),
>> "sar{w}\t{%cl, $dst|$dst, CL}",
>> - [(set GR16:$dst, (sra GR16:$src1, CL))]>, OpSize;
>> + [(set GR16:$dst, (sra GR16:$src1, CL))],
>> + IIC_SR>, OpSize;
>> def SAR32rCL : I<0xD3, MRM7r, (outs GR32:$dst), (ins GR32:$src1),
>> "sar{l}\t{%cl, $dst|$dst, CL}",
>> - [(set GR32:$dst, (sra GR32:$src1, CL))]>;
>> + [(set GR32:$dst, (sra GR32:$src1, CL))],
>> + IIC_SR>;
>> def SAR64rCL : RI<0xD3, MRM7r, (outs GR64:$dst), (ins GR64:$src1),
>> "sar{q}\t{%cl, $dst|$dst, CL}",
>> - [(set GR64:$dst, (sra GR64:$src1, CL))]>;
>> + [(set GR64:$dst, (sra GR64:$src1, CL))],
>> + IIC_SR>;
>> }
>>
>> def SAR8ri : Ii8<0xC0, MRM7r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2),
>> "sar{b}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR8:$dst, (sra GR8:$src1, (i8 imm:$src2)))]>;
>> + [(set GR8:$dst, (sra GR8:$src1, (i8 imm:$src2)))],
>> + IIC_SR>;
>> def SAR16ri : Ii8<0xC1, MRM7r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2),
>> "sar{w}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR16:$dst, (sra GR16:$src1, (i8 imm:$src2)))]>,
>> + [(set GR16:$dst, (sra GR16:$src1, (i8 imm:$src2)))],
>> + IIC_SR>,
>> OpSize;
>> def SAR32ri : Ii8<0xC1, MRM7r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2),
>> "sar{l}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR32:$dst, (sra GR32:$src1, (i8 imm:$src2)))]>;
>> + [(set GR32:$dst, (sra GR32:$src1, (i8 imm:$src2)))],
>> + IIC_SR>;
>> def SAR64ri : RIi8<0xC1, MRM7r, (outs GR64:$dst),
>> (ins GR64:$src1, i8imm:$src2),
>> "sar{q}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR64:$dst, (sra GR64:$src1, (i8 imm:$src2)))]>;
>> + [(set GR64:$dst, (sra GR64:$src1, (i8 imm:$src2)))],
>> + IIC_SR>;
>>
>> // Shift by 1
>> def SAR8r1 : I<0xD0, MRM7r, (outs GR8 :$dst), (ins GR8 :$src1),
>> "sar{b}\t$dst",
>> - [(set GR8:$dst, (sra GR8:$src1, (i8 1)))]>;
>> + [(set GR8:$dst, (sra GR8:$src1, (i8 1)))],
>> + IIC_SR>;
>> def SAR16r1 : I<0xD1, MRM7r, (outs GR16:$dst), (ins GR16:$src1),
>> "sar{w}\t$dst",
>> - [(set GR16:$dst, (sra GR16:$src1, (i8 1)))]>, OpSize;
>> + [(set GR16:$dst, (sra GR16:$src1, (i8 1)))],
>> + IIC_SR>, OpSize;
>> def SAR32r1 : I<0xD1, MRM7r, (outs GR32:$dst), (ins GR32:$src1),
>> "sar{l}\t$dst",
>> - [(set GR32:$dst, (sra GR32:$src1, (i8 1)))]>;
>> + [(set GR32:$dst, (sra GR32:$src1, (i8 1)))],
>> + IIC_SR>;
>> def SAR64r1 : RI<0xD1, MRM7r, (outs GR64:$dst), (ins GR64:$src1),
>> "sar{q}\t$dst",
>> - [(set GR64:$dst, (sra GR64:$src1, (i8 1)))]>;
>> + [(set GR64:$dst, (sra GR64:$src1, (i8 1)))],
>> + IIC_SR>;
>> } // Constraints = "$src = $dst"
>>
>>
>> let Uses = [CL] in {
>> def SAR8mCL : I<0xD2, MRM7m, (outs), (ins i8mem :$dst),
>> "sar{b}\t{%cl, $dst|$dst, CL}",
>> - [(store (sra (loadi8 addr:$dst), CL), addr:$dst)]>;
>> + [(store (sra (loadi8 addr:$dst), CL), addr:$dst)],
>> + IIC_SR>;
>> def SAR16mCL : I<0xD3, MRM7m, (outs), (ins i16mem:$dst),
>> "sar{w}\t{%cl, $dst|$dst, CL}",
>> - [(store (sra (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize;
>> + [(store (sra (loadi16 addr:$dst), CL), addr:$dst)],
>> + IIC_SR>, OpSize;
>> def SAR32mCL : I<0xD3, MRM7m, (outs), (ins i32mem:$dst),
>> "sar{l}\t{%cl, $dst|$dst, CL}",
>> - [(store (sra (loadi32 addr:$dst), CL), addr:$dst)]>;
>> + [(store (sra (loadi32 addr:$dst), CL), addr:$dst)],
>> + IIC_SR>;
>> def SAR64mCL : RI<0xD3, MRM7m, (outs), (ins i64mem:$dst),
>> "sar{q}\t{%cl, $dst|$dst, CL}",
>> - [(store (sra (loadi64 addr:$dst), CL), addr:$dst)]>;
>> + [(store (sra (loadi64 addr:$dst), CL), addr:$dst)],
>> + IIC_SR>;
>> }
>> def SAR8mi : Ii8<0xC0, MRM7m, (outs), (ins i8mem :$dst, i8imm:$src),
>> "sar{b}\t{$src, $dst|$dst, $src}",
>> - [(store (sra (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
>> + [(store (sra (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>;
>> def SAR16mi : Ii8<0xC1, MRM7m, (outs), (ins i16mem:$dst, i8imm:$src),
>> "sar{w}\t{$src, $dst|$dst, $src}",
>> - [(store (sra (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>,
>> + [(store (sra (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>,
>> OpSize;
>> def SAR32mi : Ii8<0xC1, MRM7m, (outs), (ins i32mem:$dst, i8imm:$src),
>> "sar{l}\t{$src, $dst|$dst, $src}",
>> - [(store (sra (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
>> + [(store (sra (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>;
>> def SAR64mi : RIi8<0xC1, MRM7m, (outs), (ins i64mem:$dst, i8imm:$src),
>> "sar{q}\t{$src, $dst|$dst, $src}",
>> - [(store (sra (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>;
>> + [(store (sra (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)],
>> + IIC_SR>;
>>
>> // Shift by 1
>> def SAR8m1 : I<0xD0, MRM7m, (outs), (ins i8mem :$dst),
>> "sar{b}\t$dst",
>> - [(store (sra (loadi8 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (sra (loadi8 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>> def SAR16m1 : I<0xD1, MRM7m, (outs), (ins i16mem:$dst),
>> "sar{w}\t$dst",
>> - [(store (sra (loadi16 addr:$dst), (i8 1)), addr:$dst)]>,
>> + [(store (sra (loadi16 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>,
>> OpSize;
>> def SAR32m1 : I<0xD1, MRM7m, (outs), (ins i32mem:$dst),
>> "sar{l}\t$dst",
>> - [(store (sra (loadi32 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (sra (loadi32 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>> def SAR64m1 : RI<0xD1, MRM7m, (outs), (ins i64mem:$dst),
>> "sar{q}\t$dst",
>> - [(store (sra (loadi64 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (sra (loadi64 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>>
>> //===----------------------------------------------------------------------===//
>> // Rotate instructions
>> @@ -290,125 +335,125 @@
>>
>> let Constraints = "$src1 = $dst" in {
>> def RCL8r1 : I<0xD0, MRM2r, (outs GR8:$dst), (ins GR8:$src1),
>> - "rcl{b}\t$dst", []>;
>> + "rcl{b}\t$dst", [], IIC_SR>;
>> def RCL8ri : Ii8<0xC0, MRM2r, (outs GR8:$dst), (ins GR8:$src1, i8imm:$cnt),
>> - "rcl{b}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcl{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>> let Uses = [CL] in
>> def RCL8rCL : I<0xD2, MRM2r, (outs GR8:$dst), (ins GR8:$src1),
>> - "rcl{b}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcl{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>>
>> def RCL16r1 : I<0xD1, MRM2r, (outs GR16:$dst), (ins GR16:$src1),
>> - "rcl{w}\t$dst", []>, OpSize;
>> + "rcl{w}\t$dst", [], IIC_SR>, OpSize;
>> def RCL16ri : Ii8<0xC1, MRM2r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$cnt),
>> - "rcl{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
>> + "rcl{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
>> let Uses = [CL] in
>> def RCL16rCL : I<0xD3, MRM2r, (outs GR16:$dst), (ins GR16:$src1),
>> - "rcl{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
>> + "rcl{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
>>
>> def RCL32r1 : I<0xD1, MRM2r, (outs GR32:$dst), (ins GR32:$src1),
>> - "rcl{l}\t$dst", []>;
>> + "rcl{l}\t$dst", [], IIC_SR>;
>> def RCL32ri : Ii8<0xC1, MRM2r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$cnt),
>> - "rcl{l}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcl{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>> let Uses = [CL] in
>> def RCL32rCL : I<0xD3, MRM2r, (outs GR32:$dst), (ins GR32:$src1),
>> - "rcl{l}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcl{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>>
>>
>> def RCL64r1 : RI<0xD1, MRM2r, (outs GR64:$dst), (ins GR64:$src1),
>> - "rcl{q}\t$dst", []>;
>> + "rcl{q}\t$dst", [], IIC_SR>;
>> def RCL64ri : RIi8<0xC1, MRM2r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$cnt),
>> - "rcl{q}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcl{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>> let Uses = [CL] in
>> def RCL64rCL : RI<0xD3, MRM2r, (outs GR64:$dst), (ins GR64:$src1),
>> - "rcl{q}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcl{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>>
>>
>> def RCR8r1 : I<0xD0, MRM3r, (outs GR8:$dst), (ins GR8:$src1),
>> - "rcr{b}\t$dst", []>;
>> + "rcr{b}\t$dst", [], IIC_SR>;
>> def RCR8ri : Ii8<0xC0, MRM3r, (outs GR8:$dst), (ins GR8:$src1, i8imm:$cnt),
>> - "rcr{b}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcr{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>> let Uses = [CL] in
>> def RCR8rCL : I<0xD2, MRM3r, (outs GR8:$dst), (ins GR8:$src1),
>> - "rcr{b}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcr{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>>
>> def RCR16r1 : I<0xD1, MRM3r, (outs GR16:$dst), (ins GR16:$src1),
>> - "rcr{w}\t$dst", []>, OpSize;
>> + "rcr{w}\t$dst", [], IIC_SR>, OpSize;
>> def RCR16ri : Ii8<0xC1, MRM3r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$cnt),
>> - "rcr{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
>> + "rcr{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
>> let Uses = [CL] in
>> def RCR16rCL : I<0xD3, MRM3r, (outs GR16:$dst), (ins GR16:$src1),
>> - "rcr{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
>> + "rcr{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
>>
>> def RCR32r1 : I<0xD1, MRM3r, (outs GR32:$dst), (ins GR32:$src1),
>> - "rcr{l}\t$dst", []>;
>> + "rcr{l}\t$dst", [], IIC_SR>;
>> def RCR32ri : Ii8<0xC1, MRM3r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$cnt),
>> - "rcr{l}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcr{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>> let Uses = [CL] in
>> def RCR32rCL : I<0xD3, MRM3r, (outs GR32:$dst), (ins GR32:$src1),
>> - "rcr{l}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcr{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>>
>> def RCR64r1 : RI<0xD1, MRM3r, (outs GR64:$dst), (ins GR64:$src1),
>> - "rcr{q}\t$dst", []>;
>> + "rcr{q}\t$dst", [], IIC_SR>;
>> def RCR64ri : RIi8<0xC1, MRM3r, (outs GR64:$dst), (ins GR64:$src1, i8imm:$cnt),
>> - "rcr{q}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcr{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>> let Uses = [CL] in
>> def RCR64rCL : RI<0xD3, MRM3r, (outs GR64:$dst), (ins GR64:$src1),
>> - "rcr{q}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcr{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>>
>> } // Constraints = "$src = $dst"
>>
>> def RCL8m1 : I<0xD0, MRM2m, (outs), (ins i8mem:$dst),
>> - "rcl{b}\t$dst", []>;
>> + "rcl{b}\t$dst", [], IIC_SR>;
>> def RCL8mi : Ii8<0xC0, MRM2m, (outs), (ins i8mem:$dst, i8imm:$cnt),
>> - "rcl{b}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcl{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>> def RCL16m1 : I<0xD1, MRM2m, (outs), (ins i16mem:$dst),
>> - "rcl{w}\t$dst", []>, OpSize;
>> + "rcl{w}\t$dst", [], IIC_SR>, OpSize;
>> def RCL16mi : Ii8<0xC1, MRM2m, (outs), (ins i16mem:$dst, i8imm:$cnt),
>> - "rcl{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
>> + "rcl{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
>> def RCL32m1 : I<0xD1, MRM2m, (outs), (ins i32mem:$dst),
>> - "rcl{l}\t$dst", []>;
>> + "rcl{l}\t$dst", [], IIC_SR>;
>> def RCL32mi : Ii8<0xC1, MRM2m, (outs), (ins i32mem:$dst, i8imm:$cnt),
>> - "rcl{l}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcl{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>> def RCL64m1 : RI<0xD1, MRM2m, (outs), (ins i64mem:$dst),
>> - "rcl{q}\t$dst", []>;
>> + "rcl{q}\t$dst", [], IIC_SR>;
>> def RCL64mi : RIi8<0xC1, MRM2m, (outs), (ins i64mem:$dst, i8imm:$cnt),
>> - "rcl{q}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcl{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>>
>> def RCR8m1 : I<0xD0, MRM3m, (outs), (ins i8mem:$dst),
>> - "rcr{b}\t$dst", []>;
>> + "rcr{b}\t$dst", [], IIC_SR>;
>> def RCR8mi : Ii8<0xC0, MRM3m, (outs), (ins i8mem:$dst, i8imm:$cnt),
>> - "rcr{b}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcr{b}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>> def RCR16m1 : I<0xD1, MRM3m, (outs), (ins i16mem:$dst),
>> - "rcr{w}\t$dst", []>, OpSize;
>> + "rcr{w}\t$dst", [], IIC_SR>, OpSize;
>> def RCR16mi : Ii8<0xC1, MRM3m, (outs), (ins i16mem:$dst, i8imm:$cnt),
>> - "rcr{w}\t{$cnt, $dst|$dst, $cnt}", []>, OpSize;
>> + "rcr{w}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>, OpSize;
>> def RCR32m1 : I<0xD1, MRM3m, (outs), (ins i32mem:$dst),
>> - "rcr{l}\t$dst", []>;
>> + "rcr{l}\t$dst", [], IIC_SR>;
>> def RCR32mi : Ii8<0xC1, MRM3m, (outs), (ins i32mem:$dst, i8imm:$cnt),
>> - "rcr{l}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcr{l}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>> def RCR64m1 : RI<0xD1, MRM3m, (outs), (ins i64mem:$dst),
>> - "rcr{q}\t$dst", []>;
>> + "rcr{q}\t$dst", [], IIC_SR>;
>> def RCR64mi : RIi8<0xC1, MRM3m, (outs), (ins i64mem:$dst, i8imm:$cnt),
>> - "rcr{q}\t{$cnt, $dst|$dst, $cnt}", []>;
>> + "rcr{q}\t{$cnt, $dst|$dst, $cnt}", [], IIC_SR>;
>>
>> let Uses = [CL] in {
>> def RCL8mCL : I<0xD2, MRM2m, (outs), (ins i8mem:$dst),
>> - "rcl{b}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcl{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>> def RCL16mCL : I<0xD3, MRM2m, (outs), (ins i16mem:$dst),
>> - "rcl{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
>> + "rcl{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
>> def RCL32mCL : I<0xD3, MRM2m, (outs), (ins i32mem:$dst),
>> - "rcl{l}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcl{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>> def RCL64mCL : RI<0xD3, MRM2m, (outs), (ins i64mem:$dst),
>> - "rcl{q}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcl{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>>
>> def RCR8mCL : I<0xD2, MRM3m, (outs), (ins i8mem:$dst),
>> - "rcr{b}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcr{b}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>> def RCR16mCL : I<0xD3, MRM3m, (outs), (ins i16mem:$dst),
>> - "rcr{w}\t{%cl, $dst|$dst, CL}", []>, OpSize;
>> + "rcr{w}\t{%cl, $dst|$dst, CL}", [], IIC_SR>, OpSize;
>> def RCR32mCL : I<0xD3, MRM3m, (outs), (ins i32mem:$dst),
>> - "rcr{l}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcr{l}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>> def RCR64mCL : RI<0xD3, MRM3m, (outs), (ins i64mem:$dst),
>> - "rcr{q}\t{%cl, $dst|$dst, CL}", []>;
>> + "rcr{q}\t{%cl, $dst|$dst, CL}", [], IIC_SR>;
>> }
>>
>> let Constraints = "$src1 = $dst" in {
>> @@ -416,179 +461,217 @@
>> let Uses = [CL] in {
>> def ROL8rCL : I<0xD2, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1),
>> "rol{b}\t{%cl, $dst|$dst, CL}",
>> - [(set GR8:$dst, (rotl GR8:$src1, CL))]>;
>> + [(set GR8:$dst, (rotl GR8:$src1, CL))], IIC_SR>;
>> def ROL16rCL : I<0xD3, MRM0r, (outs GR16:$dst), (ins GR16:$src1),
>> "rol{w}\t{%cl, $dst|$dst, CL}",
>> - [(set GR16:$dst, (rotl GR16:$src1, CL))]>, OpSize;
>> + [(set GR16:$dst, (rotl GR16:$src1, CL))], IIC_SR>, OpSize;
>> def ROL32rCL : I<0xD3, MRM0r, (outs GR32:$dst), (ins GR32:$src1),
>> "rol{l}\t{%cl, $dst|$dst, CL}",
>> - [(set GR32:$dst, (rotl GR32:$src1, CL))]>;
>> + [(set GR32:$dst, (rotl GR32:$src1, CL))], IIC_SR>;
>> def ROL64rCL : RI<0xD3, MRM0r, (outs GR64:$dst), (ins GR64:$src1),
>> "rol{q}\t{%cl, $dst|$dst, CL}",
>> - [(set GR64:$dst, (rotl GR64:$src1, CL))]>;
>> + [(set GR64:$dst, (rotl GR64:$src1, CL))], IIC_SR>;
>> }
>>
>> def ROL8ri : Ii8<0xC0, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2),
>> "rol{b}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR8:$dst, (rotl GR8:$src1, (i8 imm:$src2)))]>;
>> + [(set GR8:$dst, (rotl GR8:$src1, (i8 imm:$src2)))], IIC_SR>;
>> def ROL16ri : Ii8<0xC1, MRM0r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2),
>> "rol{w}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR16:$dst, (rotl GR16:$src1, (i8 imm:$src2)))]>,
>> + [(set GR16:$dst, (rotl GR16:$src1, (i8 imm:$src2)))],
>> + IIC_SR>,
>> OpSize;
>> def ROL32ri : Ii8<0xC1, MRM0r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2),
>> "rol{l}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR32:$dst, (rotl GR32:$src1, (i8 imm:$src2)))]>;
>> + [(set GR32:$dst, (rotl GR32:$src1, (i8 imm:$src2)))],
>> + IIC_SR>;
>> def ROL64ri : RIi8<0xC1, MRM0r, (outs GR64:$dst),
>> (ins GR64:$src1, i8imm:$src2),
>> "rol{q}\t{$src2, $dst|$dst, $src2}",
>> - [(set GR64:$dst, (rotl GR64:$src1, (i8 imm:$src2)))]>;
>> + [(set GR64:$dst, (rotl GR64:$src1, (i8 imm:$src2)))],
>> + IIC_SR>;
>>
>> // Rotate by 1
>> def ROL8r1 : I<0xD0, MRM0r, (outs GR8 :$dst), (ins GR8 :$src1),
>> "rol{b}\t$dst",
>> - [(set GR8:$dst, (rotl GR8:$src1, (i8 1)))]>;
>> + [(set GR8:$dst, (rotl GR8:$src1, (i8 1)))],
>> + IIC_SR>;
>> def ROL16r1 : I<0xD1, MRM0r, (outs GR16:$dst), (ins GR16:$src1),
>> "rol{w}\t$dst",
>> - [(set GR16:$dst, (rotl GR16:$src1, (i8 1)))]>, OpSize;
>> + [(set GR16:$dst, (rotl GR16:$src1, (i8 1)))],
>> + IIC_SR>, OpSize;
>> def ROL32r1 : I<0xD1, MRM0r, (outs GR32:$dst), (ins GR32:$src1),
>> "rol{l}\t$dst",
>> - [(set GR32:$dst, (rotl GR32:$src1, (i8 1)))]>;
>> + [(set GR32:$dst, (rotl GR32:$src1, (i8 1)))],
>> + IIC_SR>;
>> def ROL64r1 : RI<0xD1, MRM0r, (outs GR64:$dst), (ins GR64:$src1),
>> "rol{q}\t$dst",
>> - [(set GR64:$dst, (rotl GR64:$src1, (i8 1)))]>;
>> + [(set GR64:$dst, (rotl GR64:$src1, (i8 1)))],
>> + IIC_SR>;
>> } // Constraints = "$src = $dst"
>>
>> let Uses = [CL] in {
>> def ROL8mCL : I<0xD2, MRM0m, (outs), (ins i8mem :$dst),
>> "rol{b}\t{%cl, $dst|$dst, CL}",
>> - [(store (rotl (loadi8 addr:$dst), CL), addr:$dst)]>;
>> + [(store (rotl (loadi8 addr:$dst), CL), addr:$dst)],
>> + IIC_SR>;
>> def ROL16mCL : I<0xD3, MRM0m, (outs), (ins i16mem:$dst),
>> "rol{w}\t{%cl, $dst|$dst, CL}",
>> - [(store (rotl (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize;
>> + [(store (rotl (loadi16 addr:$dst), CL), addr:$dst)],
>> + IIC_SR>, OpSize;
>> def ROL32mCL : I<0xD3, MRM0m, (outs), (ins i32mem:$dst),
>> "rol{l}\t{%cl, $dst|$dst, CL}",
>> - [(store (rotl (loadi32 addr:$dst), CL), addr:$dst)]>;
>> + [(store (rotl (loadi32 addr:$dst), CL), addr:$dst)],
>> + IIC_SR>;
>> def ROL64mCL : RI<0xD3, MRM0m, (outs), (ins i64mem:$dst),
>> "rol{q}\t{%cl, $dst|$dst, %cl}",
>> - [(store (rotl (loadi64 addr:$dst), CL), addr:$dst)]>;
>> + [(store (rotl (loadi64 addr:$dst), CL), addr:$dst)],
>> + IIC_SR>;
>> }
>> def ROL8mi : Ii8<0xC0, MRM0m, (outs), (ins i8mem :$dst, i8imm:$src1),
>> "rol{b}\t{$src1, $dst|$dst, $src1}",
>> - [(store (rotl (loadi8 addr:$dst), (i8 imm:$src1)), addr:$dst)]>;
>> + [(store (rotl (loadi8 addr:$dst), (i8 imm:$src1)), addr:$dst)],
>> + IIC_SR>;
>> def ROL16mi : Ii8<0xC1, MRM0m, (outs), (ins i16mem:$dst, i8imm:$src1),
>> "rol{w}\t{$src1, $dst|$dst, $src1}",
>> - [(store (rotl (loadi16 addr:$dst), (i8 imm:$src1)), addr:$dst)]>,
>> + [(store (rotl (loadi16 addr:$dst), (i8 imm:$src1)), addr:$dst)],
>> + IIC_SR>,
>> OpSize;
>> def ROL32mi : Ii8<0xC1, MRM0m, (outs), (ins i32mem:$dst, i8imm:$src1),
>> "rol{l}\t{$src1, $dst|$dst, $src1}",
>> - [(store (rotl (loadi32 addr:$dst), (i8 imm:$src1)), addr:$dst)]>;
>> + [(store (rotl (loadi32 addr:$dst), (i8 imm:$src1)), addr:$dst)],
>> + IIC_SR>;
>> def ROL64mi : RIi8<0xC1, MRM0m, (outs), (ins i64mem:$dst, i8imm:$src1),
>> "rol{q}\t{$src1, $dst|$dst, $src1}",
>> - [(store (rotl (loadi64 addr:$dst), (i8 imm:$src1)), addr:$dst)]>;
>> + [(store (rotl (loadi64 addr:$dst), (i8 imm:$src1)), addr:$dst)],
>> + IIC_SR>;
>>
>> // Rotate by 1
>> def ROL8m1 : I<0xD0, MRM0m, (outs), (ins i8mem :$dst),
>> "rol{b}\t$dst",
>> - [(store (rotl (loadi8 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (rotl (loadi8 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>> def ROL16m1 : I<0xD1, MRM0m, (outs), (ins i16mem:$dst),
>> "rol{w}\t$dst",
>> - [(store (rotl (loadi16 addr:$dst), (i8 1)), addr:$dst)]>,
>> + [(store (rotl (loadi16 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>,
>> OpSize;
>> def ROL32m1 : I<0xD1, MRM0m, (outs), (ins i32mem:$dst),
>> "rol{l}\t$dst",
>> - [(store (rotl (loadi32 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (rotl (loadi32 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>> def ROL64m1 : RI<0xD1, MRM0m, (outs), (ins i64mem:$dst),
>> "rol{q}\t$dst",
>> - [(store (rotl (loadi64 addr:$dst), (i8 1)), addr:$dst)]>;
>> + [(store (rotl (loadi64 addr:$dst), (i8 1)), addr:$dst)],
>> + IIC_SR>;
>>
>> let Constraints = "$src1 = $dst" in {
>> let Uses = [CL] in {
>> def ROR8rCL : I<0xD2, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1),
>> "ror{b}\t{%cl, $dst|$dst, CL}",
>> - [(set GR8:$dst, (rotr GR8:$src1, CL))]>;
>> + [(set GR8:$dst, (rotr GR8:$src1, CL))], IIC_SR>;
>> def ROR16rCL : I<0xD3, MRM1r, (outs GR16:$dst), (ins GR16:$src1),
>> "ror{w}\t{%cl, $dst|$dst, CL}",
>> - [(set GR16:$dst, (rotr GR16:$src1,
CL))]>, OpSize; >> + [(set GR16:$dst, (rotr GR16:$src1, CL))], IIC_SR>, OpSize; >> def ROR32rCL : I<0xD3, MRM1r, (outs GR32:$dst), (ins GR32:$src1), >> "ror{l}\t{%cl, $dst|$dst, CL}", >> - [(set GR32:$dst, (rotr GR32:$src1, CL))]>; >> + [(set GR32:$dst, (rotr GR32:$src1, CL))], IIC_SR>; >> def ROR64rCL : RI<0xD3, MRM1r, (outs GR64:$dst), (ins GR64:$src1), >> "ror{q}\t{%cl, $dst|$dst, CL}", >> - [(set GR64:$dst, (rotr GR64:$src1, CL))]>; >> + [(set GR64:$dst, (rotr GR64:$src1, CL))], IIC_SR>; >> } >> >> def ROR8ri : Ii8<0xC0, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1, i8imm:$src2), >> "ror{b}\t{$src2, $dst|$dst, $src2}", >> - [(set GR8:$dst, (rotr GR8:$src1, (i8 imm:$src2)))]>; >> + [(set GR8:$dst, (rotr GR8:$src1, (i8 imm:$src2)))], IIC_SR>; >> def ROR16ri : Ii8<0xC1, MRM1r, (outs GR16:$dst), (ins GR16:$src1, i8imm:$src2), >> "ror{w}\t{$src2, $dst|$dst, $src2}", >> - [(set GR16:$dst, (rotr GR16:$src1, (i8 imm:$src2)))]>, >> + [(set GR16:$dst, (rotr GR16:$src1, (i8 imm:$src2)))], >> + IIC_SR>, >> OpSize; >> def ROR32ri : Ii8<0xC1, MRM1r, (outs GR32:$dst), (ins GR32:$src1, i8imm:$src2), >> "ror{l}\t{$src2, $dst|$dst, $src2}", >> - [(set GR32:$dst, (rotr GR32:$src1, (i8 imm:$src2)))]>; >> + [(set GR32:$dst, (rotr GR32:$src1, (i8 imm:$src2)))], >> + IIC_SR>; >> def ROR64ri : RIi8<0xC1, MRM1r, (outs GR64:$dst), >> (ins GR64:$src1, i8imm:$src2), >> "ror{q}\t{$src2, $dst|$dst, $src2}", >> - [(set GR64:$dst, (rotr GR64:$src1, (i8 imm:$src2)))]>; >> + [(set GR64:$dst, (rotr GR64:$src1, (i8 imm:$src2)))], >> + IIC_SR>; >> >> // Rotate by 1 >> def ROR8r1 : I<0xD0, MRM1r, (outs GR8 :$dst), (ins GR8 :$src1), >> "ror{b}\t$dst", >> - [(set GR8:$dst, (rotr GR8:$src1, (i8 1)))]>; >> + [(set GR8:$dst, (rotr GR8:$src1, (i8 1)))], >> + IIC_SR>; >> def ROR16r1 : I<0xD1, MRM1r, (outs GR16:$dst), (ins GR16:$src1), >> "ror{w}\t$dst", >> - [(set GR16:$dst, (rotr GR16:$src1, (i8 1)))]>, OpSize; >> + [(set GR16:$dst, (rotr GR16:$src1, (i8 1)))], >> + IIC_SR>, OpSize; >> def ROR32r1 : I<0xD1, 
MRM1r, (outs GR32:$dst), (ins GR32:$src1), >> "ror{l}\t$dst", >> - [(set GR32:$dst, (rotr GR32:$src1, (i8 1)))]>; >> + [(set GR32:$dst, (rotr GR32:$src1, (i8 1)))], >> + IIC_SR>; >> def ROR64r1 : RI<0xD1, MRM1r, (outs GR64:$dst), (ins GR64:$src1), >> "ror{q}\t$dst", >> - [(set GR64:$dst, (rotr GR64:$src1, (i8 1)))]>; >> + [(set GR64:$dst, (rotr GR64:$src1, (i8 1)))], >> + IIC_SR>; >> } // Constraints = "$src = $dst" >> >> let Uses = [CL] in { >> def ROR8mCL : I<0xD2, MRM1m, (outs), (ins i8mem :$dst), >> "ror{b}\t{%cl, $dst|$dst, CL}", >> - [(store (rotr (loadi8 addr:$dst), CL), addr:$dst)]>; >> + [(store (rotr (loadi8 addr:$dst), CL), addr:$dst)], >> + IIC_SR>; >> def ROR16mCL : I<0xD3, MRM1m, (outs), (ins i16mem:$dst), >> "ror{w}\t{%cl, $dst|$dst, CL}", >> - [(store (rotr (loadi16 addr:$dst), CL), addr:$dst)]>, OpSize; >> + [(store (rotr (loadi16 addr:$dst), CL), addr:$dst)], >> + IIC_SR>, OpSize; >> def ROR32mCL : I<0xD3, MRM1m, (outs), (ins i32mem:$dst), >> "ror{l}\t{%cl, $dst|$dst, CL}", >> - [(store (rotr (loadi32 addr:$dst), CL), addr:$dst)]>; >> + [(store (rotr (loadi32 addr:$dst), CL), addr:$dst)], >> + IIC_SR>; >> def ROR64mCL : RI<0xD3, MRM1m, (outs), (ins i64mem:$dst), >> "ror{q}\t{%cl, $dst|$dst, CL}", >> - [(store (rotr (loadi64 addr:$dst), CL), addr:$dst)]>; >> + [(store (rotr (loadi64 addr:$dst), CL), addr:$dst)], >> + IIC_SR>; >> } >> def ROR8mi : Ii8<0xC0, MRM1m, (outs), (ins i8mem :$dst, i8imm:$src), >> "ror{b}\t{$src, $dst|$dst, $src}", >> - [(store (rotr (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)]>; >> + [(store (rotr (loadi8 addr:$dst), (i8 imm:$src)), addr:$dst)], >> + IIC_SR>; >> def ROR16mi : Ii8<0xC1, MRM1m, (outs), (ins i16mem:$dst, i8imm:$src), >> "ror{w}\t{$src, $dst|$dst, $src}", >> - [(store (rotr (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)]>, >> + [(store (rotr (loadi16 addr:$dst), (i8 imm:$src)), addr:$dst)], >> + IIC_SR>, >> OpSize; >> def ROR32mi : Ii8<0xC1, MRM1m, (outs), (ins i32mem:$dst, i8imm:$src), >> "ror{l}\t{$src, 
$dst|$dst, $src}", >> - [(store (rotr (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)]>; >> + [(store (rotr (loadi32 addr:$dst), (i8 imm:$src)), addr:$dst)], >> + IIC_SR>; >> def ROR64mi : RIi8<0xC1, MRM1m, (outs), (ins i64mem:$dst, i8imm:$src), >> "ror{q}\t{$src, $dst|$dst, $src}", >> - [(store (rotr (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)]>; >> + [(store (rotr (loadi64 addr:$dst), (i8 imm:$src)), addr:$dst)], >> + IIC_SR>; >> >> // Rotate by 1 >> def ROR8m1 : I<0xD0, MRM1m, (outs), (ins i8mem :$dst), >> "ror{b}\t$dst", >> - [(store (rotr (loadi8 addr:$dst), (i8 1)), addr:$dst)]>; >> + [(store (rotr (loadi8 addr:$dst), (i8 1)), addr:$dst)], >> + IIC_SR>; >> def ROR16m1 : I<0xD1, MRM1m, (outs), (ins i16mem:$dst), >> "ror{w}\t$dst", >> - [(store (rotr (loadi16 addr:$dst), (i8 1)), addr:$dst)]>, >> + [(store (rotr (loadi16 addr:$dst), (i8 1)), addr:$dst)], >> + IIC_SR>, >> OpSize; >> def ROR32m1 : I<0xD1, MRM1m, (outs), (ins i32mem:$dst), >> "ror{l}\t$dst", >> - [(store (rotr (loadi32 addr:$dst), (i8 1)), addr:$dst)]>; >> + [(store (rotr (loadi32 addr:$dst), (i8 1)), addr:$dst)], >> + IIC_SR>; >> def ROR64m1 : RI<0xD1, MRM1m, (outs), (ins i64mem:$dst), >> "ror{q}\t$dst", >> - [(store (rotr (loadi64 addr:$dst), (i8 1)), addr:$dst)]>; >> + [(store (rotr (loadi64 addr:$dst), (i8 1)), addr:$dst)], >> + IIC_SR>; >> >> >> //===----------------------------------------------------------------------===// >> @@ -601,30 +684,36 @@ >> def SHLD16rrCL : I<0xA5, MRMDestReg, (outs GR16:$dst), >> (ins GR16:$src1, GR16:$src2), >> "shld{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> - [(set GR16:$dst, (X86shld GR16:$src1, GR16:$src2, CL))]>, >> + [(set GR16:$dst, (X86shld GR16:$src1, GR16:$src2, CL))], >> + IIC_SHD16_REG_CL>, >> TB, OpSize; >> def SHRD16rrCL : I<0xAD, MRMDestReg, (outs GR16:$dst), >> (ins GR16:$src1, GR16:$src2), >> "shrd{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> - [(set GR16:$dst, (X86shrd GR16:$src1, GR16:$src2, CL))]>, >> + [(set GR16:$dst, (X86shrd 
GR16:$src1, GR16:$src2, CL))], >> + IIC_SHD16_REG_CL>, >> TB, OpSize; >> def SHLD32rrCL : I<0xA5, MRMDestReg, (outs GR32:$dst), >> (ins GR32:$src1, GR32:$src2), >> "shld{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> - [(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2, CL))]>, TB; >> + [(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2, CL))], >> + IIC_SHD32_REG_CL>, TB; >> def SHRD32rrCL : I<0xAD, MRMDestReg, (outs GR32:$dst), >> (ins GR32:$src1, GR32:$src2), >> "shrd{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> - [(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2, CL))]>, TB; >> + [(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2, CL))], >> + IIC_SHD32_REG_CL>, TB; >> def SHLD64rrCL : RI<0xA5, MRMDestReg, (outs GR64:$dst), >> (ins GR64:$src1, GR64:$src2), >> "shld{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> - [(set GR64:$dst, (X86shld GR64:$src1, GR64:$src2, CL))]>, >> + [(set GR64:$dst, (X86shld GR64:$src1, GR64:$src2, CL))], >> + IIC_SHD64_REG_CL>, >> TB; >> def SHRD64rrCL : RI<0xAD, MRMDestReg, (outs GR64:$dst), >> (ins GR64:$src1, GR64:$src2), >> "shrd{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> - [(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2, CL))]>, >> + [(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2, CL))], >> + IIC_SHD64_REG_CL>, >> TB; >> } >> >> @@ -634,42 +723,42 @@ >> (ins GR16:$src1, GR16:$src2, i8imm:$src3), >> "shld{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> [(set GR16:$dst, (X86shld GR16:$src1, GR16:$src2, >> - (i8 imm:$src3)))]>, >> + (i8 imm:$src3)))], IIC_SHD16_REG_IM>, >> TB, OpSize; >> def SHRD16rri8 : Ii8<0xAC, MRMDestReg, >> (outs GR16:$dst), >> (ins GR16:$src1, GR16:$src2, i8imm:$src3), >> "shrd{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> [(set GR16:$dst, (X86shrd GR16:$src1, GR16:$src2, >> - (i8 imm:$src3)))]>, >> + (i8 imm:$src3)))], IIC_SHD16_REG_IM>, >> TB, OpSize; >> def SHLD32rri8 : Ii8<0xA4, MRMDestReg, >> (outs GR32:$dst), >> (ins GR32:$src1, GR32:$src2, i8imm:$src3), >> "shld{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> 
[(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2, >> - (i8 imm:$src3)))]>, >> + (i8 imm:$src3)))], IIC_SHD32_REG_IM>, >> TB; >> def SHRD32rri8 : Ii8<0xAC, MRMDestReg, >> (outs GR32:$dst), >> (ins GR32:$src1, GR32:$src2, i8imm:$src3), >> "shrd{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> [(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2, >> - (i8 imm:$src3)))]>, >> + (i8 imm:$src3)))], IIC_SHD32_REG_IM>, >> TB; >> def SHLD64rri8 : RIi8<0xA4, MRMDestReg, >> (outs GR64:$dst), >> (ins GR64:$src1, GR64:$src2, i8imm:$src3), >> "shld{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> [(set GR64:$dst, (X86shld GR64:$src1, GR64:$src2, >> - (i8 imm:$src3)))]>, >> + (i8 imm:$src3)))], IIC_SHD64_REG_IM>, >> TB; >> def SHRD64rri8 : RIi8<0xAC, MRMDestReg, >> (outs GR64:$dst), >> (ins GR64:$src1, GR64:$src2, i8imm:$src3), >> "shrd{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> [(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2, >> - (i8 imm:$src3)))]>, >> + (i8 imm:$src3)))], IIC_SHD64_REG_IM>, >> TB; >> } >> } // Constraints = "$src = $dst" >> @@ -678,68 +767,74 @@ >> def SHLD16mrCL : I<0xA5, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2), >> "shld{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> [(store (X86shld (loadi16 addr:$dst), GR16:$src2, CL), >> - addr:$dst)]>, TB, OpSize; >> + addr:$dst)], IIC_SHD16_MEM_CL>, TB, OpSize; >> def SHRD16mrCL : I<0xAD, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2), >> "shrd{w}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> [(store (X86shrd (loadi16 addr:$dst), GR16:$src2, CL), >> - addr:$dst)]>, TB, OpSize; >> + addr:$dst)], IIC_SHD16_MEM_CL>, TB, OpSize; >> >> def SHLD32mrCL : I<0xA5, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2), >> "shld{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> [(store (X86shld (loadi32 addr:$dst), GR32:$src2, CL), >> - addr:$dst)]>, TB; >> + addr:$dst)], IIC_SHD32_MEM_CL>, TB; >> def SHRD32mrCL : I<0xAD, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2), >> "shrd{l}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> [(store 
(X86shrd (loadi32 addr:$dst), GR32:$src2, CL), >> - addr:$dst)]>, TB; >> + addr:$dst)], IIC_SHD32_MEM_CL>, TB; >> >> def SHLD64mrCL : RI<0xA5, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2), >> "shld{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> [(store (X86shld (loadi64 addr:$dst), GR64:$src2, CL), >> - addr:$dst)]>, TB; >> + addr:$dst)], IIC_SHD64_MEM_CL>, TB; >> def SHRD64mrCL : RI<0xAD, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2), >> "shrd{q}\t{%cl, $src2, $dst|$dst, $src2, CL}", >> [(store (X86shrd (loadi64 addr:$dst), GR64:$src2, CL), >> - addr:$dst)]>, TB; >> + addr:$dst)], IIC_SHD64_MEM_CL>, TB; >> } >> >> def SHLD16mri8 : Ii8<0xA4, MRMDestMem, >> (outs), (ins i16mem:$dst, GR16:$src2, i8imm:$src3), >> "shld{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> [(store (X86shld (loadi16 addr:$dst), GR16:$src2, >> - (i8 imm:$src3)), addr:$dst)]>, >> + (i8 imm:$src3)), addr:$dst)], >> + IIC_SHD16_MEM_IM>, >> TB, OpSize; >> def SHRD16mri8 : Ii8<0xAC, MRMDestMem, >> (outs), (ins i16mem:$dst, GR16:$src2, i8imm:$src3), >> "shrd{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> [(store (X86shrd (loadi16 addr:$dst), GR16:$src2, >> - (i8 imm:$src3)), addr:$dst)]>, >> + (i8 imm:$src3)), addr:$dst)], >> + IIC_SHD16_MEM_IM>, >> TB, OpSize; >> >> def SHLD32mri8 : Ii8<0xA4, MRMDestMem, >> (outs), (ins i32mem:$dst, GR32:$src2, i8imm:$src3), >> "shld{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> [(store (X86shld (loadi32 addr:$dst), GR32:$src2, >> - (i8 imm:$src3)), addr:$dst)]>, >> + (i8 imm:$src3)), addr:$dst)], >> + IIC_SHD32_MEM_IM>, >> TB; >> def SHRD32mri8 : Ii8<0xAC, MRMDestMem, >> (outs), (ins i32mem:$dst, GR32:$src2, i8imm:$src3), >> "shrd{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> [(store (X86shrd (loadi32 addr:$dst), GR32:$src2, >> - (i8 imm:$src3)), addr:$dst)]>, >> + (i8 imm:$src3)), addr:$dst)], >> + IIC_SHD32_MEM_IM>, >> TB; >> >> def SHLD64mri8 : RIi8<0xA4, MRMDestMem, >> (outs), (ins i64mem:$dst, GR64:$src2, i8imm:$src3), >> "shld{q}\t{$src3, 
$src2, $dst|$dst, $src2, $src3}", >> [(store (X86shld (loadi64 addr:$dst), GR64:$src2, >> - (i8 imm:$src3)), addr:$dst)]>, >> + (i8 imm:$src3)), addr:$dst)], >> + IIC_SHD64_MEM_IM>, >> TB; >> def SHRD64mri8 : RIi8<0xAC, MRMDestMem, >> (outs), (ins i64mem:$dst, GR64:$src2, i8imm:$src3), >> "shrd{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}", >> [(store (X86shrd (loadi64 addr:$dst), GR64:$src2, >> - (i8 imm:$src3)), addr:$dst)]>, >> + (i8 imm:$src3)), addr:$dst)], >> + IIC_SHD64_MEM_IM>, >> TB; >> >> } // Defs = [EFLAGS] >> >> Added: llvm/trunk/lib/Target/X86/X86Schedule.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Schedule.td?rev=149558&view=auto >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86Schedule.td (added) >> +++ llvm/trunk/lib/Target/X86/X86Schedule.td Wed Feb 1 17:20:51 2012 >> @@ -0,0 +1,115 @@ >> +//===- X86Schedule.td - X86 Scheduling Definitions ---------*- tablegen -*-===// >> +// >> +// The LLVM Compiler Infrastructure >> +// >> +// This file is distributed under the University of Illinois Open Source >> +// License. See LICENSE.TXT for details. 
>> +// >> +//===----------------------------------------------------------------------===// >> + >> +//===----------------------------------------------------------------------===// >> +// Instruction Itinerary classes used for X86 >> +def IIC_DEFAULT : InstrItinClass; >> +def IIC_ALU_MEM : InstrItinClass; >> +def IIC_ALU_NONMEM : InstrItinClass; >> +def IIC_LEA : InstrItinClass; >> +def IIC_LEA_16 : InstrItinClass; >> +def IIC_MUL8 : InstrItinClass; >> +def IIC_MUL16_MEM : InstrItinClass; >> +def IIC_MUL16_REG : InstrItinClass; >> +def IIC_MUL32_MEM : InstrItinClass; >> +def IIC_MUL32_REG : InstrItinClass; >> +def IIC_MUL64 : InstrItinClass; >> +// imul by al, ax, eax, tax >> +def IIC_IMUL8 : InstrItinClass; >> +def IIC_IMUL16_MEM : InstrItinClass; >> +def IIC_IMUL16_REG : InstrItinClass; >> +def IIC_IMUL32_MEM : InstrItinClass; >> +def IIC_IMUL32_REG : InstrItinClass; >> +def IIC_IMUL64 : InstrItinClass; >> +// imul reg by reg|mem >> +def IIC_IMUL16_RM : InstrItinClass; >> +def IIC_IMUL16_RR : InstrItinClass; >> +def IIC_IMUL32_RM : InstrItinClass; >> +def IIC_IMUL32_RR : InstrItinClass; >> +def IIC_IMUL64_RM : InstrItinClass; >> +def IIC_IMUL64_RR : InstrItinClass; >> +// imul reg = reg/mem * imm >> +def IIC_IMUL16_RMI : InstrItinClass; >> +def IIC_IMUL16_RRI : InstrItinClass; >> +def IIC_IMUL32_RMI : InstrItinClass; >> +def IIC_IMUL32_RRI : InstrItinClass; >> +def IIC_IMUL64_RMI : InstrItinClass; >> +def IIC_IMUL64_RRI : InstrItinClass; >> +// div >> +def IIC_DIV8_MEM : InstrItinClass; >> +def IIC_DIV8_REG : InstrItinClass; >> +def IIC_DIV16 : InstrItinClass; >> +def IIC_DIV32 : InstrItinClass; >> +def IIC_DIV64 : InstrItinClass; >> +// idiv >> +def IIC_IDIV8 : InstrItinClass; >> +def IIC_IDIV16 : InstrItinClass; >> +def IIC_IDIV32 : InstrItinClass; >> +def IIC_IDIV64 : InstrItinClass; >> +// neg/not/inc/dec >> +def IIC_UNARY_REG : InstrItinClass; >> +def IIC_UNARY_MEM : InstrItinClass; >> +// add/sub/and/or/xor/adc/sbc/cmp/test >> +def IIC_BIN_MEM : 
InstrItinClass; >> +def IIC_BIN_NONMEM : InstrItinClass; >> +// shift/rotate >> +def IIC_SR : InstrItinClass; >> +// shift double >> +def IIC_SHD16_REG_IM : InstrItinClass; >> +def IIC_SHD16_REG_CL : InstrItinClass; >> +def IIC_SHD16_MEM_IM : InstrItinClass; >> +def IIC_SHD16_MEM_CL : InstrItinClass; >> +def IIC_SHD32_REG_IM : InstrItinClass; >> +def IIC_SHD32_REG_CL : InstrItinClass; >> +def IIC_SHD32_MEM_IM : InstrItinClass; >> +def IIC_SHD32_MEM_CL : InstrItinClass; >> +def IIC_SHD64_REG_IM : InstrItinClass; >> +def IIC_SHD64_REG_CL : InstrItinClass; >> +def IIC_SHD64_MEM_IM : InstrItinClass; >> +def IIC_SHD64_MEM_CL : InstrItinClass; >> +// cmov >> +def IIC_CMOV16_RM : InstrItinClass; >> +def IIC_CMOV16_RR : InstrItinClass; >> +def IIC_CMOV32_RM : InstrItinClass; >> +def IIC_CMOV32_RR : InstrItinClass; >> +def IIC_CMOV64_RM : InstrItinClass; >> +def IIC_CMOV64_RR : InstrItinClass; >> +// set >> +def IIC_SET_R : InstrItinClass; >> +def IIC_SET_M : InstrItinClass; >> +// jmp/jcc/jcxz >> +def IIC_Jcc : InstrItinClass; >> +def IIC_JCXZ : InstrItinClass; >> +def IIC_JMP_REL : InstrItinClass; >> +def IIC_JMP_REG : InstrItinClass; >> +def IIC_JMP_MEM : InstrItinClass; >> +def IIC_JMP_FAR_MEM : InstrItinClass; >> +def IIC_JMP_FAR_PTR : InstrItinClass; >> +// loop >> +def IIC_LOOP : InstrItinClass; >> +def IIC_LOOPE : InstrItinClass; >> +def IIC_LOOPNE : InstrItinClass; >> +// call >> +def IIC_CALL_RI : InstrItinClass; >> +def IIC_CALL_MEM : InstrItinClass; >> +def IIC_CALL_FAR_MEM : InstrItinClass; >> +def IIC_CALL_FAR_PTR : InstrItinClass; >> +// ret >> +def IIC_RET : InstrItinClass; >> +def IIC_RET_IMM : InstrItinClass; >> + >> +//===----------------------------------------------------------------------===// >> +// Processor instruction itineraries. 
>> + >> +def GenericItineraries : ProcessorItineraries<[], [], []>; >> + >> +include "X86ScheduleAtom.td" >> + >> + >> + >> >> Added: llvm/trunk/lib/Target/X86/X86ScheduleAtom.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ScheduleAtom.td?rev=149558&view=auto >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86ScheduleAtom.td (added) >> +++ llvm/trunk/lib/Target/X86/X86ScheduleAtom.td Wed Feb 1 17:20:51 2012 >> @@ -0,0 +1,136 @@ >> +//=- X86ScheduleAtom.td - X86 Atom Scheduling Definitions -*- tablegen -*-=// >> +// >> +// The LLVM Compiler Infrastructure >> +// >> +// This file is distributed under the University of Illinois Open Source >> +// License. See LICENSE.TXT for details. >> +// >> +//===----------------------------------------------------------------------===// >> +// >> +// This file defines the itinerary class data for the Intel Atom (Bonnell) >> +// processors. >> +// >> +//===----------------------------------------------------------------------===// >> + >> +// >> +// Scheduling information derived from the "Intel 64 and IA32 Architectures >> +// Optimization Reference Manual", Chapter 13, Section 4. 
>> +// Functional Units >> +// Port 0 >> +def Port0 : FuncUnit; // ALU: ALU0, shift/rotate, load/store >> + // SIMD/FP: SIMD ALU, Shuffle,SIMD/FP multiply, divide >> +def Port1 : FuncUnit; // ALU: ALU1, bit processing, jump, and LEA >> + // SIMD/FP: SIMD ALU, FP Adder >> + >> +def AtomItineraries : ProcessorItineraries< >> + [ Port0, Port1 ], >> + [], [ >> + // P0 only >> + // InstrItinData] >, >> + // P0 or P1 >> + // InstrItinData] >, >> + // P0 and P1 >> + // InstrItinData, InstrStage] >, >> + // >> + // Default is 1 cycle, port0 or port1 >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + // mul >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + // imul by al, ax, eax, rax >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + // imul reg by reg|mem >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + // imul reg = reg/mem * imm >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + // idiv >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + // div >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + // neg/not/inc/dec >> + InstrItinData] >, >> + InstrItinData] >, >> + // add/sub/and/or/xor/adc/sbc/cmp/test >> + InstrItinData] >, >> + InstrItinData] >, >> + // shift/rotate >> + InstrItinData] >, >> + // shift double >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + 
InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + // cmov >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + // set >> + InstrItinData] >, >> + InstrItinData] >, >> + // jcc >> + InstrItinData] >, >> + // jcxz/jecxz/jrcxz >> + InstrItinData] >, >> + // jmp rel >> + InstrItinData] >, >> + // jmp indirect >> + InstrItinData] >, >> + InstrItinData] >, >> + // jmp far >> + InstrItinData] >, >> + InstrItinData] >, >> + // loop/loope/loopne >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + // call - all but reg/imm >> + InstrItinData, InstrStage<1, [Port1]>] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + InstrItinData] >, >> + //ret >> + InstrItinData] >, >> + InstrItinData, InstrStage<1, [Port1]>] > >> +]>; >> + >> >> Modified: llvm/trunk/lib/Target/X86/X86Subtarget.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.cpp?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86Subtarget.cpp (original) >> +++ llvm/trunk/lib/Target/X86/X86Subtarget.cpp Wed Feb 1 17:20:51 2012 >> @@ -246,6 +246,7 @@ >> IsBTMemSlow = true; >> ToggleFeature(X86::FeatureSlowBTMem); >> } >> + >> // If it's Nehalem, unaligned memory access is fast. >> // FIXME: Nehalem is family 6. Also include Westmere and later processors? >> if (Family == 15 && Model == 26) { >> @@ -253,6 +254,11 @@ >> ToggleFeature(X86::FeatureFastUAMem); >> } >> >> + // Set processor type. Currently only Atom is detected. 
>> + if (Family == 6 && Model == 28) { >> + X86ProcFamily = IntelAtom; >> + } >> + >> unsigned MaxExtLevel; >> X86_MC::GetCpuIDAndInfo(0x80000000, &MaxExtLevel, &EBX, &ECX, &EDX); >> >> @@ -310,6 +316,7 @@ >> const std::string &FS, >> unsigned StackAlignOverride, bool is64Bit) >> : X86GenSubtargetInfo(TT, CPU, FS) >> + , X86ProcFamily(Others) >> , PICStyle(PICStyles::None) >> , X86SSELevel(NoMMXSSE) >> , X863DNowLevel(NoThreeDNow) >> @@ -333,14 +340,15 @@ >> , IsUAMemFast(false) >> , HasVectorUAMem(false) >> , HasCmpxchg16b(false) >> + , PostRAScheduler(false) >> , stackAlignment(4) >> // FIXME: this is a known good value for Yonah. How about others? >> , MaxInlineSizeThreshold(128) >> , TargetTriple(TT) >> , In64BitMode(is64Bit) { >> // Determine default and user specified characteristics >> + std::string CPUName = CPU; >> if (!FS.empty() || !CPU.empty()) { >> - std::string CPUName = CPU; >> if (CPUName.empty()) { >> #if defined(i386) || defined(__i386__) || defined(__x86__) || defined(_M_IX86)\ >> || defined(__x86_64__) || defined(_M_AMD64) || defined (_M_X64) >> @@ -363,6 +371,13 @@ >> // If feature string is not empty, parse features string. >> ParseSubtargetFeatures(CPUName, FullFS); >> } else { >> + if (CPUName.empty()) { >> +#if defined (__x86_64__) || defined(__i386__) >> + CPUName = sys::getHostCPUName(); >> +#else >> + CPUName = "generic"; >> +#endif >> + } >> // Otherwise, use CPUID to auto-detect feature set. >> AutoDetectSubtargetFeatures(); >> >> @@ -379,6 +394,11 @@ >> } >> } >> >> + if (X86ProcFamily == IntelAtom) { >> + PostRAScheduler = true; >> + InstrItins = getInstrItineraryForCPU(CPUName); >> + } >> + >> // It's important to keep the MCSubtargetInfo feature bits in sync with >> // target data structure which is shared with MC code emitter, etc. 
>> if (In64BitMode) >> @@ -398,3 +418,12 @@ >> isTargetSolaris() || In64BitMode) >> stackAlignment = 16; >> } >> + >> +bool X86Subtarget::enablePostRAScheduler( >> + CodeGenOpt::Level OptLevel, >> + TargetSubtargetInfo::AntiDepBreakMode& Mode, >> + RegClassVector& CriticalPathRCs) const { >> + Mode = TargetSubtargetInfo::ANTIDEP_CRITICAL; >> + CriticalPathRCs.clear(); >> + return PostRAScheduler && OptLevel >= CodeGenOpt::Default; >> +} >> >> Modified: llvm/trunk/lib/Target/X86/X86Subtarget.h >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Subtarget.h?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86Subtarget.h (original) >> +++ llvm/trunk/lib/Target/X86/X86Subtarget.h Wed Feb 1 17:20:51 2012 >> @@ -49,6 +49,13 @@ >> NoThreeDNow, ThreeDNow, ThreeDNowA >> }; >> >> + enum X86ProcFamilyEnum { >> + Others, IntelAtom >> + }; >> + >> + /// X86ProcFamily - X86 processor family: Intel Atom, and others >> + X86ProcFamilyEnum X86ProcFamily; >> + >> /// PICStyle - Which PIC style to use >> /// >> PICStyles::Style PICStyle; >> @@ -125,6 +132,9 @@ >> /// this is true for most x86-64 chips, but not the first AMD chips. >> bool HasCmpxchg16b; >> >> + /// PostRAScheduler - True if using post-register-allocation scheduler. >> + bool PostRAScheduler; >> + >> /// stackAlignment - The minimum alignment known to hold of the stack frame on >> /// entry to the function and which must be maintained by every function. >> unsigned stackAlignment; >> @@ -135,6 +145,9 @@ >> >> /// TargetTriple - What processor and OS we're targeting. >> Triple TargetTriple; >> + >> + /// Instruction itineraries for scheduling >> + InstrItineraryData InstrItins; >> >> private: >> /// In64BitMode - True if compiling for 64-bit, false for 32-bit. 
>> @@ -202,6 +215,8 @@ >> bool hasVectorUAMem() const { return HasVectorUAMem; } >> bool hasCmpxchg16b() const { return HasCmpxchg16b; } >> >> + bool isAtom() const { return X86ProcFamily == IntelAtom; } >> + >> const Triple &getTargetTriple() const { return TargetTriple; } >> >> bool isTargetDarwin() const { return TargetTriple.isOSDarwin(); } >> @@ -291,6 +306,15 @@ >> /// indicating the number of scheduling cycles of backscheduling that >> /// should be attempted. >> unsigned getSpecialAddressLatency() const; >> + >> + /// enablePostRAScheduler - run for Atom optimization. >> + bool enablePostRAScheduler(CodeGenOpt::Level OptLevel, >> + TargetSubtargetInfo::AntiDepBreakMode& Mode, >> + RegClassVector& CriticalPathRCs) const; >> + >> + /// getInstrItins = Return the instruction itineraries based on the >> + /// subtarget selection. >> + const InstrItineraryData &getInstrItineraryData() const { return InstrItins; } >> }; >> >> } // End llvm namespace >> >> Modified: llvm/trunk/lib/Target/X86/X86TargetMachine.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetMachine.cpp?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86TargetMachine.cpp (original) >> +++ llvm/trunk/lib/Target/X86/X86TargetMachine.cpp Wed Feb 1 17:20:51 2012 >> @@ -78,7 +78,8 @@ >> : LLVMTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL), >> Subtarget(TT, CPU, FS, Options.StackAlignmentOverride, is64Bit), >> FrameLowering(*this, Subtarget), >> - ELFWriterInfo(is64Bit, true) { >> + ELFWriterInfo(is64Bit, true), >> + InstrItins(Subtarget.getInstrItineraryData()){ >> // Determine the PICStyle based on the target selected. >> if (getRelocationModel() == Reloc::Static) { >> // Unless we're in PIC or DynamicNoPIC mode, set the PIC style to None. 
>> >> Modified: llvm/trunk/lib/Target/X86/X86TargetMachine.h >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetMachine.h?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/lib/Target/X86/X86TargetMachine.h (original) >> +++ llvm/trunk/lib/Target/X86/X86TargetMachine.h Wed Feb 1 17:20:51 2012 >> @@ -32,9 +32,10 @@ >> class StringRef; >> >> class X86TargetMachine : public LLVMTargetMachine { >> - X86Subtarget Subtarget; >> - X86FrameLowering FrameLowering; >> - X86ELFWriterInfo ELFWriterInfo; >> + X86Subtarget Subtarget; >> + X86FrameLowering FrameLowering; >> + X86ELFWriterInfo ELFWriterInfo; >> + InstrItineraryData InstrItins; >> >> public: >> X86TargetMachine(const Target &T, StringRef TT, >> @@ -65,6 +66,9 @@ >> virtual const X86ELFWriterInfo *getELFWriterInfo() const { >> return Subtarget.isTargetELF() ? &ELFWriterInfo : 0; >> } >> + virtual const InstrItineraryData *getInstrItineraryData() const { >> + return &InstrItins; >> + } >> >> // Set up the pass pipeline. 
>> virtual bool addInstSelector(PassManagerBase &PM); >> >> Modified: llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/2007-01-08-InstrSched.ll Wed Feb 1 17:20:51 2012 >> @@ -1,5 +1,5 @@ >> ; PR1075 >> -; RUN: llc < %s -mtriple=x86_64-apple-darwin -O3 | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-darwin -O3 | FileCheck %s >> >> define float @foo(float %x) nounwind { >> %tmp1 = fmul float %x, 3.000000e+00 >> >> Modified: llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/2007-11-06-InstrSched.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86 -mattr=+sse2 | not grep lea >> +; RUN: llc < %s -march=x86 -mcpu=generic -mattr=+sse2 | not grep lea >> >> define float @foo(i32* %x, float* %y, i32 %c) nounwind { >> entry: >> >> Modified: llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/2007-12-18-LoadCSEBug.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86 | grep {(%esp)} | count 2 >> +; RUN: llc < %s -march=x86 -mcpu=generic | 
grep {(%esp)} | count 2 >> ; PR1872 >> >> %struct.c34007g__designated___XUB = type { i32, i32, i32, i32 } >> >> Modified: llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/2008-12-19-EarlyClobberBug.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -mtriple=i386-apple-darwin -asm-verbose=0 | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -mtriple=i386-apple-darwin -asm-verbose=0 | FileCheck %s >> ; PR3149 >> ; Make sure the copy after inline asm is not coalesced away. >> >> >> Modified: llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/2009-06-03-Win64SpillXMM.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc -mtriple=x86_64-mingw32 < %s | FileCheck %s >> +; RUN: llc -mcpu=generic -mtriple=x86_64-mingw32 < %s | FileCheck %s >> ; CHECK: subq $40, %rsp >> ; CHECK: movaps %xmm8, (%rsp) >> ; CHECK: movaps %xmm7, 16(%rsp) >> >> Modified: llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll Wed Feb 1 
17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc -mtriple=i386-apple-darwin -tailcallopt < %s | FileCheck %s >> +; RUN: llc -mcpu=generic -mtriple=i386-apple-darwin -tailcallopt < %s | FileCheck %s >> ; Check that lowered argumens do not overwrite the return address before it is moved. >> ; Bug 6225 >> ; >> >> Modified: llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/2010-05-03-CoalescerSubRegClobber.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s | FileCheck %s >> +; RUN: llc < %s -mcpu=generic | FileCheck %s >> ; PR6941 >> target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" >> target triple = "x86_64-apple-darwin10.0.0" >> >> Modified: llvm/trunk/test/CodeGen/X86/abi-isel.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/abi-isel.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/abi-isel.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/abi-isel.ll Wed Feb 1 17:20:51 2012 >> @@ -1,16 +1,16 @@ >> -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-STATIC >> -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-PIC >> +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static 
-code-model=small | FileCheck %s -check-prefix=LINUX-32-STATIC >> +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-unknown-linux-gnu -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-32-PIC >> >> -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-64-STATIC >> -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=LINUX-64-PIC >> +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=LINUX-64-STATIC >> +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-unknown-linux-gnu -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=LINUX-64-PIC >> >> -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-apple-darwin -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-32-STATIC >> -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-apple-darwin -march=x86 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-DYNAMIC >> -; RUN: llc < %s -asm-verbose=0 -mtriple=i686-apple-darwin -march=x86 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-PIC >> - >> -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-64-STATIC >> -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-DYNAMIC >> -; RUN: llc < %s -asm-verbose=0 -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-PIC >> +; RUN: llc < %s -asm-verbose=0 -mcpu=generic 
-mtriple=i686-apple-darwin -march=x86 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-32-STATIC >> +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-apple-darwin -march=x86 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-DYNAMIC >> +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=i686-apple-darwin -march=x86 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-32-PIC >> + >> +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=static -code-model=small | FileCheck %s -check-prefix=DARWIN-64-STATIC >> +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=dynamic-no-pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-DYNAMIC >> +; RUN: llc < %s -asm-verbose=0 -mcpu=generic -mtriple=x86_64-apple-darwin -march=x86-64 -relocation-model=pic -code-model=small | FileCheck %s -check-prefix=DARWIN-64-PIC >> >> @src = external global [131072 x i32] >> @dst = external global [131072 x i32] >> >> Modified: llvm/trunk/test/CodeGen/X86/add.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/add.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/add.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/add.ll Wed Feb 1 17:20:51 2012 >> @@ -1,6 +1,6 @@ >> -; RUN: llc < %s -march=x86 | FileCheck %s -check-prefix=X32 >> -; RUN: llc < %s -mtriple=x86_64-linux -join-physregs | FileCheck %s -check-prefix=X64 >> -; RUN: llc < %s -mtriple=x86_64-win32 -join-physregs | FileCheck %s -check-prefix=X64 >> +; RUN: llc < %s -mcpu=generic -march=x86 | FileCheck %s -check-prefix=X32 >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -join-physregs | FileCheck %s -check-prefix=X64 >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-win32 
-join-physregs | FileCheck %s -check-prefix=X64 >> >> ; Some of these tests depend on -join-physregs to commute instructions. >> >> >> Added: llvm/trunk/test/CodeGen/X86/atom-sched.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/atom-sched.ll?rev=149558&view=auto >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/atom-sched.ll (added) >> +++ llvm/trunk/test/CodeGen/X86/atom-sched.ll Wed Feb 1 17:20:51 2012 >> @@ -0,0 +1,28 @@ >> +; RUN: llc <%s -O2 -mcpu=atom -march=x86 -relocation-model=static | FileCheck -check-prefix=atom %s >> +; RUN: llc <%s -O2 -mcpu=core2 -march=x86 -relocation-model=static | FileCheck %s >> + >> +@a = common global i32 0, align 4 >> +@b = common global i32 0, align 4 >> +@c = common global i32 0, align 4 >> +@d = common global i32 0, align 4 >> +@e = common global i32 0, align 4 >> +@f = common global i32 0, align 4 >> + >> +define void @func() nounwind uwtable { >> +; atom: imull >> +; atom-NOT: movl >> +; atom: imull >> +; CHECK: imull >> +; CHECK: movl >> +; CHECK: imull >> +entry: >> + %0 = load i32* @b, align 4 >> + %1 = load i32* @c, align 4 >> + %mul = mul nsw i32 %0, %1 >> + store i32 %mul, i32* @a, align 4 >> + %2 = load i32* @e, align 4 >> + %3 = load i32* @f, align 4 >> + %mul1 = mul nsw i32 %2, %3 >> + store i32 %mul1, i32* @d, align 4 >> + ret void >> +} >> >> Modified: llvm/trunk/test/CodeGen/X86/byval6.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/byval6.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/byval6.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/byval6.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86 | grep add | not grep 16 >> +; RUN: llc < %s -mcpu=generic -march=x86 | grep add | not grep 16 >> >> %struct.W = type { x86_fp80, x86_fp80 } 
>> @B = global %struct.W { x86_fp80 0xK4001A000000000000000, x86_fp80 0xK4001C000000000000000 }, align 32 >> >> Modified: llvm/trunk/test/CodeGen/X86/divide-by-constant.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/divide-by-constant.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/divide-by-constant.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/divide-by-constant.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -mtriple=i686-pc-linux-gnu -asm-verbose=0 | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-pc-linux-gnu -asm-verbose=0 | FileCheck %s >> target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32" >> target triple = "i686-pc-linux-gnu" >> >> >> Modified: llvm/trunk/test/CodeGen/X86/epilogue.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/epilogue.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/epilogue.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/epilogue.ll Wed Feb 1 17:20:51 2012 >> @@ -1,5 +1,5 @@ >> -; RUN: llc < %s -march=x86 | not grep lea >> -; RUN: llc < %s -march=x86 | grep {movl %ebp} >> +; RUN: llc < %s -mcpu=generic -march=x86 | not grep lea >> +; RUN: llc < %s -mcpu=generic -march=x86 | grep {movl %ebp} >> >> declare void @bar(<2 x i64>* %n) >> >> >> Modified: llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll (original) >> +++ 
llvm/trunk/test/CodeGen/X86/fast-cc-merge-stack-adj.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86 -x86-asm-syntax=intel | \ >> +; RUN: llc < %s -mcpu=generic -march=x86 -x86-asm-syntax=intel | \ >> ; RUN: grep {add ESP, 8} >> >> target triple = "i686-pc-linux-gnu" >> >> Modified: llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/fast-isel-x86.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc -fast-isel -O0 -mtriple=i386-apple-darwin10 -relocation-model=pic < %s | FileCheck %s >> +; RUN: llc -fast-isel -O0 -mcpu=generic -mtriple=i386-apple-darwin10 -relocation-model=pic < %s | FileCheck %s >> >> ; This should use flds to set the return value. >> ; CHECK: test0: >> >> Modified: llvm/trunk/test/CodeGen/X86/fold-load.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/fold-load.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/fold-load.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/fold-load.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86 | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -march=x86 | FileCheck %s >> %struct._obstack_chunk = type { i8*, %struct._obstack_chunk*, [4 x i8] } >> %struct.obstack = type { i32, %struct._obstack_chunk*, i8*, i8*, i8*, i32, i32, %struct._obstack_chunk* (...)*, void (...)*, i8*, i8 } >> @stmt_obstack = external global %struct.obstack ; <%struct.obstack*> [#uses=1] >> >> Modified: llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll >> URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -mtriple=i386-apple-darwin | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -mtriple=i386-apple-darwin | FileCheck %s >> >> ; There should be no stack manipulations between the inline asm and ret. >> ; CHECK: test1 >> >> Modified: llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/masked-iv-safe.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86-64 > %t >> +; RUN: llc < %s -mcpu=generic -march=x86-64 > %t >> ; RUN: not grep and %t >> ; RUN: not grep movz %t >> ; RUN: not grep sar %t >> >> Modified: llvm/trunk/test/CodeGen/X86/optimize-max-3.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/optimize-max-3.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/optimize-max-3.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/optimize-max-3.ll Wed Feb 1 17:20:51 2012 >> @@ -1,5 +1,5 @@ >> -; RUN: llc < %s -mtriple=x86_64-linux -asm-verbose=false | FileCheck %s >> -; RUN: llc < %s -mtriple=x86_64-win32 -asm-verbose=false | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -asm-verbose=false | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-win32 -asm-verbose=false | FileCheck %s >> >> ; LSR's 
OptimizeMax should eliminate the select (max). >> >> >> Modified: llvm/trunk/test/CodeGen/X86/peep-test-3.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/peep-test-3.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/peep-test-3.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/peep-test-3.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86 -post-RA-scheduler=false | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -march=x86 -post-RA-scheduler=false | FileCheck %s >> ; rdar://7226797 >> >> ; LLVM should omit the testl and use the flags result from the orl. >> >> Modified: llvm/trunk/test/CodeGen/X86/pic.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/pic.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/pic.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/pic.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -mtriple=i686-pc-linux-gnu -relocation-model=pic -asm-verbose=false -post-RA-scheduler=false | FileCheck %s -check-prefix=LINUX >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-pc-linux-gnu -relocation-model=pic -asm-verbose=false -post-RA-scheduler=false | FileCheck %s -check-prefix=LINUX >> >> @ptr = external global i32* >> @dst = external global i32 >> >> Modified: llvm/trunk/test/CodeGen/X86/red-zone.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/red-zone.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/red-zone.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/red-zone.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s >> +; RUN: llc < %s -mcpu=generic 
-mtriple=x86_64-linux | FileCheck %s >> >> ; First without noredzone. >> ; CHECK: f0: >> >> Modified: llvm/trunk/test/CodeGen/X86/red-zone2.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/red-zone2.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/red-zone2.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/red-zone2.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86-64 > %t >> +; RUN: llc < %s -mcpu=generic -march=x86-64 > %t >> ; RUN: grep subq %t | count 1 >> ; RUN: grep addq %t | count 1 >> >> >> Modified: llvm/trunk/test/CodeGen/X86/reghinting.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/reghinting.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/reghinting.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/reghinting.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -mtriple=x86_64-apple-macosx | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-macosx | FileCheck %s >> ; PR10221 >> >> ;; The registers %x and %y must both spill across the finit call. 
>> >> Modified: llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/segmented-stacks-dynamic.ll Wed Feb 1 17:20:51 2012 >> @@ -1,7 +1,7 @@ >> -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32 >> -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64 >> -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -filetype=obj >> -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -filetype=obj >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32 >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64 >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -filetype=obj >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -filetype=obj >> >> ; Just to prevent the alloca from being optimized away >> declare void @dummy_use(i32*, i32) >> >> Modified: llvm/trunk/test/CodeGen/X86/segmented-stacks.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/segmented-stacks.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/segmented-stacks.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/segmented-stacks.ll Wed Feb 1 17:20:51 2012 >> @@ -1,23 +1,23 @@ >> -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Linux >> -; RUN: llc < %s -mtriple=x86_64-linux 
-segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Linux >> -; RUN: llc < %s -mtriple=i686-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Darwin >> -; RUN: llc < %s -mtriple=x86_64-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Darwin >> -; RUN: llc < %s -mtriple=i686-mingw32 -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-MinGW >> -; RUN: llc < %s -mtriple=x86_64-freebsd -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-FreeBSD >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Linux >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Linux >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-Darwin >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-darwin -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-Darwin >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-mingw32 -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X32-MinGW >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-freebsd -segmented-stacks -verify-machineinstrs | FileCheck %s -check-prefix=X64-FreeBSD >> >> ; We used to crash with filetype=obj >> -; RUN: llc < %s -mtriple=i686-linux -segmented-stacks -filetype=obj >> -; RUN: llc < %s -mtriple=x86_64-linux -segmented-stacks -filetype=obj >> -; RUN: llc < %s -mtriple=i686-darwin -segmented-stacks -filetype=obj >> -; RUN: llc < %s -mtriple=x86_64-darwin -segmented-stacks -filetype=obj >> -; RUN: llc < %s -mtriple=i686-mingw32 -segmented-stacks -filetype=obj >> -; RUN: llc < %s -mtriple=x86_64-freebsd -segmented-stacks -filetype=obj >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-linux -segmented-stacks -filetype=obj >> +; RUN: llc < %s -mcpu=generic 
-mtriple=x86_64-linux -segmented-stacks -filetype=obj >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-darwin -segmented-stacks -filetype=obj >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-darwin -segmented-stacks -filetype=obj >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-mingw32 -segmented-stacks -filetype=obj >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-freebsd -segmented-stacks -filetype=obj >> >> -; RUN: not llc < %s -mtriple=x86_64-solaris -segmented-stacks 2> %t.log >> +; RUN: not llc < %s -mcpu=generic -mtriple=x86_64-solaris -segmented-stacks 2> %t.log >> ; RUN: FileCheck %s -input-file=%t.log -check-prefix=X64-Solaris >> -; RUN: not llc < %s -mtriple=x86_64-mingw32 -segmented-stacks 2> %t.log >> +; RUN: not llc < %s -mcpu=generic -mtriple=x86_64-mingw32 -segmented-stacks 2> %t.log >> ; RUN: FileCheck %s -input-file=%t.log -check-prefix=X64-MinGW >> -; RUN: not llc < %s -mtriple=i686-freebsd -segmented-stacks 2> %t.log >> +; RUN: not llc < %s -mcpu=generic -mtriple=i686-freebsd -segmented-stacks 2> %t.log >> ; RUN: FileCheck %s -input-file=%t.log -check-prefix=X32-FreeBSD >> >> ; X64-Solaris: Segmented stacks not supported on this platform >> >> Modified: llvm/trunk/test/CodeGen/X86/stack-align2.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/stack-align2.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/stack-align2.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/stack-align2.ll Wed Feb 1 17:20:51 2012 >> @@ -1,9 +1,9 @@ >> -; RUN: llc < %s -mtriple=i386-linux | FileCheck %s -check-prefix=LINUX-I386 >> -; RUN: llc < %s -mtriple=i386-netbsd | FileCheck %s -check-prefix=NETBSD-I386 >> -; RUN: llc < %s -mtriple=i686-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-I386 >> -; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s -check-prefix=LINUX-X86_64 >> -; RUN: llc < %s -mtriple=x86_64-netbsd | FileCheck %s 
-check-prefix=NETBSD-X86_64 >> -; RUN: llc < %s -mtriple=x86_64-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-X86_64 >> +; RUN: llc < %s -mcpu=generic -mtriple=i386-linux | FileCheck %s -check-prefix=LINUX-I386 >> +; RUN: llc < %s -mcpu=generic -mtriple=i386-netbsd | FileCheck %s -check-prefix=NETBSD-I386 >> +; RUN: llc < %s -mcpu=generic -mtriple=i686-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-I386 >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux | FileCheck %s -check-prefix=LINUX-X86_64 >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-netbsd | FileCheck %s -check-prefix=NETBSD-X86_64 >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-darwin8 | FileCheck %s -check-prefix=DARWIN-X86_64 >> >> define i32 @test() nounwind { >> entry: >> >> Modified: llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/tailcallbyval64.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -mtriple=x86_64-linux -tailcallopt | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux -tailcallopt | FileCheck %s >> >> ; FIXME: Win64 does not support byval. 
>> >> >> Modified: llvm/trunk/test/CodeGen/X86/tailcallstack64.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/tailcallstack64.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/tailcallstack64.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/tailcallstack64.ll Wed Feb 1 17:20:51 2012 >> @@ -1,5 +1,5 @@ >> -; RUN: llc < %s -tailcallopt -mtriple=x86_64-linux -post-RA-scheduler=true | FileCheck %s >> -; RUN: llc < %s -tailcallopt -mtriple=x86_64-win32 -post-RA-scheduler=true | FileCheck %s >> +; RUN: llc < %s -tailcallopt -mcpu=generic -mtriple=x86_64-linux -post-RA-scheduler=true | FileCheck %s >> +; RUN: llc < %s -tailcallopt -mcpu=generic -mtriple=x86_64-win32 -post-RA-scheduler=true | FileCheck %s >> >> ; FIXME: Redundant unused stack allocation could be eliminated. >> ; CHECK: subq ${{24|72|80}}, %rsp >> >> Modified: llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/twoaddr-lea.ll Wed Feb 1 17:20:51 2012 >> @@ -5,7 +5,7 @@ >> ;; allocator turns the shift into an LEA. This also occurs for ADD. >> >> ; Check that the shift gets turned into an LEA. 
>> -; RUN: llc < %s -mtriple=x86_64-apple-darwin | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-apple-darwin | FileCheck %s >> >> @G = external global i32 >> >> >> Modified: llvm/trunk/test/CodeGen/X86/v-binop-widen.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/v-binop-widen.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/v-binop-widen.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/v-binop-widen.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc -march=x86 -mattr=+sse < %s | FileCheck %s >> +; RUN: llc -mcpu=generic -march=x86 -mattr=+sse < %s | FileCheck %s >> ; CHECK: divss >> ; CHECK: divps >> ; CHECK: divps >> >> Modified: llvm/trunk/test/CodeGen/X86/vec_call.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vec_call.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/vec_call.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/vec_call.ll Wed Feb 1 17:20:51 2012 >> @@ -1,6 +1,6 @@ >> -; RUN: llc < %s -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ >> +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ >> ; RUN: grep {subl.*60} >> -; RUN: llc < %s -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ >> +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse2 -mtriple=i686-apple-darwin8 | \ >> ; RUN: grep {movaps.*32} >> >> >> >> Modified: llvm/trunk/test/CodeGen/X86/widen_arith-1.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_arith-1.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/widen_arith-1.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/widen_arith-1.ll Wed Feb 1 
17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86 -mattr=+sse42 | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse42 | FileCheck %s >> >> define void @update(<3 x i8>* %dst, <3 x i8>* %src, i32 %n) nounwind { >> entry: >> >> Modified: llvm/trunk/test/CodeGen/X86/widen_arith-3.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_arith-3.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/widen_arith-3.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/widen_arith-3.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86 -mattr=+sse42 -post-RA-scheduler=true | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -march=x86 -mattr=+sse42 -post-RA-scheduler=true | FileCheck %s >> ; CHECK: incl >> ; CHECK: incl >> ; CHECK: incl >> >> Modified: llvm/trunk/test/CodeGen/X86/widen_load-2.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_load-2.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/widen_load-2.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/widen_load-2.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -o - -march=x86-64 -mattr=+sse42 | FileCheck %s >> +; RUN: llc < %s -o - -mcpu=generic -march=x86-64 -mattr=+sse42 | FileCheck %s >> >> ; Test based on pr5626 to load/store >> ; >> >> Modified: llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/win64_alloca_dynalloca.ll Wed Feb 1 17:20:51 2012 >> @@ 
-1,6 +1,6 @@ >> -; RUN: llc < %s -join-physregs -mtriple=x86_64-mingw32 | FileCheck %s -check-prefix=M64 >> -; RUN: llc < %s -join-physregs -mtriple=x86_64-win32 | FileCheck %s -check-prefix=W64 >> -; RUN: llc < %s -join-physregs -mtriple=x86_64-win32-macho | FileCheck %s -check-prefix=EFI >> +; RUN: llc < %s -join-physregs -mcpu=generic -mtriple=x86_64-mingw32 | FileCheck %s -check-prefix=M64 >> +; RUN: llc < %s -join-physregs -mcpu=generic -mtriple=x86_64-win32 | FileCheck %s -check-prefix=W64 >> +; RUN: llc < %s -join-physregs -mcpu=generic -mtriple=x86_64-win32-macho | FileCheck %s -check-prefix=EFI >> ; PR8777 >> ; PR8778 >> >> >> Modified: llvm/trunk/test/CodeGen/X86/win64_vararg.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/win64_vararg.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/win64_vararg.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/win64_vararg.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -mtriple=x86_64-pc-win32 | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -mtriple=x86_64-pc-win32 | FileCheck %s >> >> ; Verify that the var arg parameters which are passed in registers are stored >> ; in home stack slots allocated by the caller and that AP is correctly >> >> Modified: llvm/trunk/test/CodeGen/X86/zext-fold.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/zext-fold.ll?rev=149558&r1=149557&r2=149558&view=diff >> ============================================================================== >> --- llvm/trunk/test/CodeGen/X86/zext-fold.ll (original) >> +++ llvm/trunk/test/CodeGen/X86/zext-fold.ll Wed Feb 1 17:20:51 2012 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -march=x86 | FileCheck %s >> +; RUN: llc < %s -mcpu=generic -march=x86 | FileCheck %s >> >> ;; Simple case >> define i32 @test1(i8 %x) nounwind readnone { >> >> >> 
From geek4civic at gmail.com Wed Feb 1 17:52:27 2012
From: geek4civic at gmail.com (NAKAMURA Takumi)
Date: Thu, 2 Feb 2012 08:52:27 +0900
Subject: [llvm-commits] [llvm] r149498 - in /llvm/trunk: autoconf/configure.ac configure
In-Reply-To: <20120201140622.2E4BD2A6C12C@llvm.org>
References: <20120201140622.2E4BD2A6C12C@llvm.org>
Message-ID:

2012/2/1 Dylan Noblesmith :
> Author: nobled
> Date: Wed Feb 1 08:06:21 2012
> New Revision: 149498
>
> URL: http://llvm.org/viewvc/llvm-project?rev=149498&view=rev
> Log:
> autoconf: generate clang's private config.h header
>
> The CMake build already generated one. Follows clang r149497.
>
> This brings us one step closer to compiling and configuring clang
> separately from LLVM using the autoconf build, too.

It is incompatible with --with-clang-srcdir.
My buildbot does not have tools/clang in each builder.

...Takumi

From nobled at dreamwidth.org Wed Feb 1 18:11:14 2012
From: nobled at dreamwidth.org (Dylan Noblesmith)
Date: Thu, 02 Feb 2012 00:11:14 -0000
Subject: [llvm-commits] [llvm] r149563 - /llvm/trunk/autoconf/configure.ac
Message-ID: <20120202001114.416892A6C12C@llvm.org>

Author: nobled
Date: Wed Feb 1 18:11:14 2012
New Revision: 149563

URL: http://llvm.org/viewvc/llvm-project?rev=149563&view=rev
Log:
autoconf: honor --with-clang-srcdir

configure was silently failing to produce anything in the case
where clang wasn't at tools/clang/, resulting in compilation
errors much later in the build when config.h didn't exist.
Modified: llvm/trunk/autoconf/configure.ac Modified: llvm/trunk/autoconf/configure.ac URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/autoconf/configure.ac?rev=149563&r1=149562&r2=149563&view=diff ============================================================================== --- llvm/trunk/autoconf/configure.ac (original) +++ llvm/trunk/autoconf/configure.ac Wed Feb 1 18:11:14 2012 @@ -813,7 +813,7 @@ [Directory to the out-of-tree Clang source]),, withval="-") case "$withval" in - -) clang_src_root="" ;; + -) clang_src_root="$ac_pwd/tools/clang" ;; /* | [[A-Za-z]]:[[\\/]]*) clang_src_root="$withval" ;; *) clang_src_root="$ac_pwd/$withval" ;; esac @@ -1588,9 +1588,9 @@ AC_CONFIG_FILES([docs/doxygen.cfg]) dnl Configure clang, if present -if test -f ${srcdir}/tools/clang/README.txt; then - AC_CONFIG_HEADERS([tools/clang/include/clang/Config/config.h]) - AC_CONFIG_FILES([tools/clang/docs/doxygen.cfg]) +if test -f ${clang_src_root}/README.txt; then + AC_CONFIG_HEADERS([${clang_src_root}/include/clang/Config/config.h]) + AC_CONFIG_FILES([${clang_src_root}/docs/doxygen.cfg]) fi dnl OCaml findlib META file From echristo at apple.com Wed Feb 1 18:16:56 2012 From: echristo at apple.com (Eric Christopher) Date: Thu, 02 Feb 2012 00:16:56 -0000 Subject: [llvm-commits] [llvm] r149567 - /llvm/trunk/configure Message-ID: <20120202001656.232962A6C12C@llvm.org> Author: echristo Date: Wed Feb 1 18:16:55 2012 New Revision: 149567 URL: http://llvm.org/viewvc/llvm-project?rev=149567&view=rev Log: Regenerate configure. 
Modified: llvm/trunk/configure Modified: llvm/trunk/configure URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/configure?rev=149567&r1=149566&r2=149567&view=diff ============================================================================== --- llvm/trunk/configure (original) +++ llvm/trunk/configure Wed Feb 1 18:16:55 2012 @@ -5535,7 +5535,7 @@ fi case "$withval" in - -) clang_src_root="" ;; + -) clang_src_root="$ac_pwd/tools/clang" ;; /* | [A-Za-z]:[\\/]*) clang_src_root="$withval" ;; *) clang_src_root="$ac_pwd/$withval" ;; esac @@ -21094,9 +21094,11 @@ ac_config_files="$ac_config_files docs/doxygen.cfg" -if test -f ${srcdir}/tools/clang/README.txt; then - ac_config_headers="$ac_config_headers tools/clang/include/clang/Config/config.h" - ac_config_files="$ac_config_files tools/clang/docs/doxygen.cfg" + +if test -f ${clang_src_root}/README.txt; then + ac_config_headers="$ac_config_headers ${clang_src_root}/include/clang/Config/config.h" + + ac_config_files="$ac_config_files ${clang_src_root}/docs/doxygen.cfg" fi @@ -21718,7 +21720,8 @@ "Makefile.config") CONFIG_FILES="$CONFIG_FILES Makefile.config" ;; "llvm.spec") CONFIG_FILES="$CONFIG_FILES llvm.spec" ;; "docs/doxygen.cfg") CONFIG_FILES="$CONFIG_FILES docs/doxygen.cfg" ;; - "tools/clang/docs/doxygen.cfg") CONFIG_FILES="$CONFIG_FILES tools/clang/docs/doxygen.cfg" ;; + "${clang_src_root}/include/clang/Config/config.h") CONFIG_HEADERS="$CONFIG_HEADERS ${clang_src_root}/include/clang/Config/config.h" ;; + "${clang_src_root}/docs/doxygen.cfg") CONFIG_FILES="$CONFIG_FILES ${clang_src_root}/docs/doxygen.cfg" ;; "bindings/ocaml/llvm/META.llvm") CONFIG_FILES="$CONFIG_FILES bindings/ocaml/llvm/META.llvm" ;; "setup") CONFIG_COMMANDS="$CONFIG_COMMANDS setup" ;; "Makefile") CONFIG_COMMANDS="$CONFIG_COMMANDS Makefile" ;; From nobled at dreamwidth.org Wed Feb 1 18:17:33 2012 From: nobled at dreamwidth.org (Dylan Noblesmith) Date: Thu, 02 Feb 2012 00:17:33 -0000 Subject: [llvm-commits] [llvm] r149568 - 
/llvm/trunk/autoconf/configure.ac Message-ID: <20120202001733.EF37C2A6C12C@llvm.org> Author: nobled Date: Wed Feb 1 18:17:33 2012 New Revision: 149568 URL: http://llvm.org/viewvc/llvm-project?rev=149568&view=rev Log: autoconf: restore old clang-srcdir behavior Keep the string empty when unspecified. Undoes part of r149563. Modified: llvm/trunk/autoconf/configure.ac Modified: llvm/trunk/autoconf/configure.ac URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/autoconf/configure.ac?rev=149568&r1=149567&r2=149568&view=diff ============================================================================== --- llvm/trunk/autoconf/configure.ac (original) +++ llvm/trunk/autoconf/configure.ac Wed Feb 1 18:17:33 2012 @@ -813,7 +813,7 @@ [Directory to the out-of-tree Clang source]),, withval="-") case "$withval" in - -) clang_src_root="$ac_pwd/tools/clang" ;; + -) clang_src_root="" ;; /* | [[A-Za-z]]:[[\\/]]*) clang_src_root="$withval" ;; *) clang_src_root="$ac_pwd/$withval" ;; esac @@ -1588,6 +1588,9 @@ AC_CONFIG_FILES([docs/doxygen.cfg]) dnl Configure clang, if present +if test ${clang_src_root} = ""; then + clang_src_root="$ac_pwd/tools/clang" +fi if test -f ${clang_src_root}/README.txt; then AC_CONFIG_HEADERS([${clang_src_root}/include/clang/Config/config.h]) AC_CONFIG_FILES([${clang_src_root}/docs/doxygen.cfg]) From echristo at apple.com Wed Feb 1 18:21:47 2012 From: echristo at apple.com (Eric Christopher) Date: Wed, 01 Feb 2012 16:21:47 -0800 Subject: [llvm-commits] [llvm] r149563 - /llvm/trunk/autoconf/configure.ac In-Reply-To: <20120202001114.416892A6C12C@llvm.org> References: <20120202001114.416892A6C12C@llvm.org> Message-ID: <0887B7AA-1C42-44D5-87F8-E528296F50A8@apple.com> On Feb 1, 2012, at 4:11 PM, Dylan Noblesmith wrote: > autoconf: honor --with-clang-srcdir > > configure was silently failing to produce anything in the case > where clang wasn't at tools/clang/, resulting in compilation > errors much later in the build when config.h didn't exist. 
> > > Modified: > llvm/trunk/autoconf/configure.ac Please get and build the proper autotools support or ask me to commit patches for you if you have to touch configure. Thanks! -eric From echristo at apple.com Wed Feb 1 18:19:05 2012 From: echristo at apple.com (Eric Christopher) Date: Thu, 02 Feb 2012 00:19:05 -0000 Subject: [llvm-commits] [llvm] r149569 - /llvm/trunk/configure Message-ID: <20120202001905.C25D62A6C12C@llvm.org> Author: echristo Date: Wed Feb 1 18:19:05 2012 New Revision: 149569 URL: http://llvm.org/viewvc/llvm-project?rev=149569&view=rev Log: Regenerate again. Modified: llvm/trunk/configure Modified: llvm/trunk/configure URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/configure?rev=149569&r1=149568&r2=149569&view=diff ============================================================================== --- llvm/trunk/configure (original) +++ llvm/trunk/configure Wed Feb 1 18:19:05 2012 @@ -5535,7 +5535,7 @@ fi case "$withval" in - -) clang_src_root="$ac_pwd/tools/clang" ;; + -) clang_src_root="" ;; /* | [A-Za-z]:[\\/]*) clang_src_root="$withval" ;; *) clang_src_root="$ac_pwd/$withval" ;; esac @@ -21095,6 +21095,9 @@ ac_config_files="$ac_config_files docs/doxygen.cfg" +if test ${clang_src_root} = ""; then + clang_src_root="$ac_pwd/tools/clang" +fi if test -f ${clang_src_root}/README.txt; then ac_config_headers="$ac_config_headers ${clang_src_root}/include/clang/Config/config.h" From nlewycky at google.com Wed Feb 1 18:32:23 2012 From: nlewycky at google.com (Nick Lewycky) Date: Wed, 1 Feb 2012 16:32:23 -0800 Subject: [llvm-commits] Proposal/patch: Enable bitcode streaming In-Reply-To: References: <583C473A-9640-46A7-89FE-63BF0B97E6C6@apple.com> Message-ID: On 26 January 2012 11:19, Derek Schuff wrote: > Thanks for the review. comments inline, and updated patches attached. Thanks. As for StreamableMemoryObject vs. StreamingMemoryObject; please just add pretty-much what you wrote in your email as a comment explaining the difference, and then commit it. 
It seems to me that once we have some experience with these APIs, we'll know what the right factoring of this code really ought to be. Nick On Wed, Jan 25, 2012 at 2:04 PM, Nick Lewycky wrote: > >> On 20 January 2012 11:55, Derek Schuff wrote: >> >>> And finally, the StreamingMemoryObject implementation, modified >>> BitcodeReader, and modified llvm-dis.cpp using the streaming interface. >>> Please take a look >>> >> >> Overall this looks good. I'm especially happy with some of the >> refactoring inside BitcodeReader! Comments: >> >> --- a/include/llvm/Bitcode/ReaderWriter.h >> +++ b/include/llvm/Bitcode/ReaderWriter.h >> @@ -21,31 +21,41 @@ namespace llvm { >> class MemoryBuffer; >> class ModulePass; >> class BitstreamWriter; >> + class DataStreamer; >> class LLVMContext; >> class raw_ostream; >> >> I realize these were unsorted when you got here, but please alphabetize >> them. >> > Done. > >> >> + /// If 'verify' is true, check that the file fits in the buffer. >> + static inline bool SkipBitcodeWrapperHeader(const unsigned char >> *&BufPtr, >> + const unsigned char >> *&BufEnd, >> + bool Verify) { >> >> I didn't really understand the comment. I think what you're doing is >> disabling the check that the buffer contained the whole header. Could you >> make it "bool SkipPastEnd" instead? >> > > There are 2 checks: the first checks that the buffer is large enough to > contain the whole header. That always runs. The check that's > conditional on 'verify' checks that the buffer is large enough to > contain the whole bitcode. This doesn't work with the streaming > implementation since it does not allocate a buffer that fits the whole file > ahead of time. It also always skips past the end of the header if found > (since this is how it returns the size of the buffer), so SkipPastEnd would > be a bad name for the variable. I clarified the comment and renamed the > variable VerifyBufferSize. (Also on this pass I found and deleted some > trailing whitespace).
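
The two checks described in the reply above can be sketched roughly as follows. This is an illustration only, not the code from the patch: parseWrapperHeader and WrapperInfo are hypothetical names, the function returns offsets instead of adjusting BufPtr/BufEnd, and the field layout (four little-endian 32-bit words: magic, version, offset, size) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout for this sketch: four little-endian 32-bit words
// (magic, version, offset, size) at the start of the wrapper.
enum : std::size_t { KnownHeaderSize = 16, OffsetField = 8, SizeField = 12 };

// Hypothetical result type; the patch itself adjusts two pointers instead.
struct WrapperInfo {
  bool Ok;         // false if a check failed
  uint32_t Offset; // where the bitcode starts, relative to the buffer
  uint32_t Size;   // how many bytes of bitcode the header announces
};

static uint32_t read32le(const unsigned char *P) {
  return uint32_t(P[0]) | (uint32_t(P[1]) << 8) |
         (uint32_t(P[2]) << 16) | (uint32_t(P[3]) << 24);
}

WrapperInfo parseWrapperHeader(const unsigned char *Buf, std::size_t Len,
                               bool VerifyBufferSize) {
  assert(Buf != nullptr && "caller must supply a buffer");
  WrapperInfo Info = {false, 0, 0};
  // Check 1 (always runs): the buffer must contain the whole header.
  if (Len < KnownHeaderSize)
    return Info;
  Info.Offset = read32le(Buf + OffsetField);
  Info.Size = read32le(Buf + SizeField);
  // Check 2 (optional): the buffer must contain the whole bitcode payload.
  // A streaming reader passes false, since it has only seen a prefix.
  if (VerifyBufferSize && uint64_t(Info.Offset) + Info.Size > Len)
    return Info;
  Info.Ok = true;
  return Info;
}
```

A streaming caller would pass VerifyBufferSize = false and still learn the announced payload size from the header, which matches the reasoning given for not calling the flag SkipPastEnd.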
> > >> >> +DataStreamer *getDataFileStreamer(const std::string &Filename, >> + std::string *Err); >> >> Please line up the argument to the (. >> > Done. > >> >> --- a/include/llvm/Support/StreamableMemoryObject.h >> +++ b/include/llvm/Support/StreamableMemoryObject.h >> @@ -12,6 +12,9 @@ >> #define STREAMABLEMEMORYOBJECT_H_ >> >> #include "llvm/Support/MemoryObject.h" >> +#include >> +#include "llvm/ADT/OwningPtr.h" >> +#include "llvm/Support/DataStream.h" >> >> ADT, Support, then headers. See >> http://llvm.org/docs/CodingStandards.html#scf_includes . >> >> +/// StreamingMemoryObject - interface to data which is actually streamed >> from >> +/// at DataStreamer. In addition to inherited members, it has the >> +/// dropLeadingBytes and setKnownObjectSize methods which are not >> applicable >> +/// to non-streamed objects >> +class StreamingMemoryObject : public StreamableMemoryObject { >> >> I think that's a full sentence missing a period. It feels *awfully weird* >> to have StreamableMemoryObject and StreamingMemoryObject, and both of them >> are interfaces. The comment doesn't seem to sufficiently explain what's >> going on here. (The DataStreamer can stream from a Streaming but not with a >> Streamable? What?) >> > > Yeah, this was kind of a tough naming problem. But there really does need > to be these 2 different kinds of interfaces. The one I called > StreamableMemoryObject (in which the data may or may not actually be > streamed) needs to have extra methods over and above MemoryObject, which > are directly due to the streamability: isValidAddress and isObjectEnd are > needed because if we don't know the length of the stream ahead of time, > then calling getExtent requires waiting until the entire stream is fetched. > (getPointer is basically just there to support BLOBs, avoiding extra > copies). 
Then you have RawMemoryObject, a non-streamed > StreamableMemoryObject, and StreamingMemoryObject is an interface because > there could be different implementations of StreamableMemoryObject (getting > data from different sources). > I'm open to ideas to simplify the situation. > > >> >> + // fetch enough bytes such that Pos can be read or EOF is reached >> + // (i.e. BytesRead > Pos). Return true if Pos can be read. >> + // Unlike most of the functions in BitcodeReader, returns true on >> success. >> + bool fetchToPos(size_t Pos) { >> >> Comment should start with a capital. >> >> + bool fetchToPos(size_t Pos) { >> + if (EOFReached) return Pos < ObjectSize; >> + while (Pos >= BytesRead) { >> + Bytes.resize(BytesRead + kChunkSize); >> + size_t bytes = Streamer->GetBytes(&Bytes[BytesRead + BytesSkipped], >> + kChunkSize); >> >> Why is kChunkSize so special? Why not ask for all the bytes up until Pos? >> The comment on DataStreamer::GetBytes doesn't give any reason not to ask >> for as many bytes as you want? >> > > The common case (actually the only case, currently) will actually be that > the requested size is much smaller than the chunk size, and the chunk size > just ensures that we batch them together rather than making a lot of > potentially expensive requests into the streamer. The 'while' loop is just > there instead of an 'if' to cover the corner case of a large request. I > updated the comment > > >> >> +bool BitcodeReader::SuspendModuleParse() { >> + // save our current position >> + NextUnreadBit = Stream.GetCurrentBitNo(); >> + return false; >> +} >> >> What's up with that returning bool? >> > Originally it was going to check for error and use the same convention as > the rest of the functions, but it ended up being simpler than I expected. I > removed it entirely now. > > >> >> + // ParseModule will parse the next body in the stream and set its >> + // position in the DeferredFunctionInfo map >> >> Sentence needs period. >> > Done. 
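
The chunked fetching discussed above (batching small requests into kChunkSize reads, with the loop covering the rare large request) can be sketched as below. This is a rough sketch under stated assumptions: StringStreamer and ChunkedReader are hypothetical stand-ins for the patch's DataStreamer and StreamingMemoryObject, the chunk size is deliberately tiny, and the BytesSkipped bookkeeping from the patch is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for DataStreamer: hands out bytes from a string.
class StringStreamer {
  std::string Data;
  std::size_t Offset = 0;

public:
  explicit StringStreamer(std::string D) : Data(std::move(D)) {}

  // Copies up to Len bytes into Buf; returns the number actually copied.
  std::size_t GetBytes(unsigned char *Buf, std::size_t Len) {
    std::size_t N = std::min(Len, Data.size() - Offset);
    std::memcpy(Buf, Data.data() + Offset, N);
    Offset += N;
    return N;
  }
};

// Hypothetical reader that pulls data from the streamer in fixed chunks.
class ChunkedReader {
  static constexpr std::size_t kChunkSize = 8; // tiny, for illustration only
  std::vector<unsigned char> Bytes;
  StringStreamer &Streamer;
  std::size_t BytesRead = 0;
  std::size_t ObjectSize = 0; // only meaningful once EOFReached is true
  bool EOFReached = false;

public:
  explicit ChunkedReader(StringStreamer &S) : Streamer(S) {}

  // Fetches until Pos can be read or EOF is hit. Returns true on success.
  bool fetchToPos(std::size_t Pos) {
    if (EOFReached)
      return Pos < ObjectSize;
    // Usually one iteration; the loop only matters for a large jump in Pos.
    while (Pos >= BytesRead) {
      Bytes.resize(BytesRead + kChunkSize);
      std::size_t N = Streamer.GetBytes(&Bytes[BytesRead], kChunkSize);
      assert(N <= kChunkSize && "streamer returned more than requested");
      BytesRead += N;
      if (N < kChunkSize) { // short read: the stream is exhausted
        ObjectSize = BytesRead;
        EOFReached = true;
        break;
      }
    }
    return Pos < BytesRead;
  }

  std::size_t bytesRead() const { return BytesRead; }
};
```

With a 20-byte stream, asking for position 0 pulls in one full chunk of 8 bytes, and asking for position 19 pulls in the remaining two reads, so BytesRead is generally larger than the position requested.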
> > >> >> + unsigned char buf[16]; >> + if (Bytes->readBytes(0, 16, buf, NULL) == -1) >> + return Error("Bitcode stream must be at least 16 bytes in length"); >> + >> + if (!isBitcode(buf, buf + 16)) { >> + return Error("Invalid bitcode signature"); >> + } >> >> So, uh, braces or no braces around one-line return statements? :-) >> > LLVM style seems to be no braces, but Google style dies hard :) > Fixed. > > >> >> +Module *llvm::getStreamedBitcodeModule(const std::string &name, >> + DataStreamer *streamer, >> + LLVMContext &Context, >> + std::string *ErrMsg) { >> >> These args don't line up. >> > Done. > >> >> --- /dev/null >> +++ b/lib/Support/DataStream.cpp >> @@ -0,0 +1,96 @@ >> +//===--- llvm/Support/DataStream.cpp - Lazy streamed Data -*- C++ >> -*-===// >> >> Don't include emacs mode markers (the -*- C++ -*- bit) on .cpp files, >> only on .h files. >> > Done. > >> >> +// Very simple stream backed by a file. Mostly useful for stdin and >> debugging; >> +// actual file access is probably still best done with mmap >> +class DataFileStreamer : public DataStreamer { >> + int Fd; >> >> Sentence seeking full stop. >> >> +DataStreamer *getDataFileStreamer(const std::string &Filename, >> + std::string *StrError) { >> >> Line up to the ( again. >> > Done. > >> >> + if (e != success) { >> + *StrError = std::string() + "Could not open " + Filename + ": " >> + + e.message() + "\n"; >> + return NULL; >> + } >> >> Optional: std::string("Could not open ") + Filename, and also putting the >> + on the previous line instead of starting a line with the operator. >> > Done. > >> >> --- a/lib/Support/StreamableMemoryObject.cpp >> +++ b/lib/Support/StreamableMemoryObject.cpp >> >> In this file you added some spurious blank lines. Please don't do that. >> > I think I got them all. Also fixed some argument alignment.
> >> >> +bool StreamingMemoryObject::isObjectEnd(uint64_t address) { >> + if (ObjectSize) return address == ObjectSize; >> + fetchToPos(address); >> + return address == BytesRead; >> +} >> >> Shouldn't that end with "return address == ObjectSize"? If the file is >> larger than 'address' bytes, won't this read up to address bytes, then >> stop, leaving BytesRead == address? ObjectSize on the other hand isn't set >> until EOF is reached. >> > > Yes, it should, although in practice BytesRead is unlikely to actually == > address due to the chunk-fetching behavior. No doubt that's why this > slipped through all the testing. Fixed (and covered the case where address > == 0; it's never the end because 0 is an invalid stream size). > > >> >> +int StreamingMemoryObject::readBytes(uint64_t address, >> + uint64_t size, >> + uint8_t* buf, >> + uint64_t* copied) { >> >> Misaligned. >> > Fixed. > >> >> + //StreamableMemoryObject >> + >> >> Please remove. >> > Done. From nobled at dreamwidth.org Wed Feb 1 18:54:19 2012 From: nobled at dreamwidth.org (Dylan Noblesmith) Date: Thu, 02 Feb 2012 00:54:19 -0000 Subject: [llvm-commits] [llvm] r149574 - /llvm/trunk/autoconf/configure.ac Message-ID: <20120202005419.1CFF02A6C12C@llvm.org> Author: nobled Date: Wed Feb 1 18:54:18 2012 New Revision: 149574 URL: http://llvm.org/viewvc/llvm-project?rev=149574&view=rev Log: autoconf: fix build/src dir confusion This was the cause of the silent failure to generate clang's config.h. My bad. Fix on r149563 / r149568.
Modified: llvm/trunk/autoconf/configure.ac Modified: llvm/trunk/autoconf/configure.ac URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/autoconf/configure.ac?rev=149574&r1=149573&r2=149574&view=diff ============================================================================== --- llvm/trunk/autoconf/configure.ac (original) +++ llvm/trunk/autoconf/configure.ac Wed Feb 1 18:54:18 2012 @@ -1589,7 +1589,7 @@ dnl Configure clang, if present if test ${clang_src_root} = ""; then - clang_src_root="$ac_pwd/tools/clang" + clang_src_root="$srcdir/tools/clang" fi if test -f ${clang_src_root}/README.txt; then AC_CONFIG_HEADERS([${clang_src_root}/include/clang/Config/config.h]) From echristo at apple.com Wed Feb 1 19:11:30 2012 From: echristo at apple.com (Eric Christopher) Date: Thu, 02 Feb 2012 01:11:30 -0000 Subject: [llvm-commits] [llvm] r149576 - /llvm/trunk/configure Message-ID: <20120202011130.8A7322A6C12C@llvm.org> Author: echristo Date: Wed Feb 1 19:11:30 2012 New Revision: 149576 URL: http://llvm.org/viewvc/llvm-project?rev=149576&view=rev Log: Regen one last time. 
Modified: llvm/trunk/configure Modified: llvm/trunk/configure URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/configure?rev=149576&r1=149575&r2=149576&view=diff ============================================================================== --- llvm/trunk/configure (original) +++ llvm/trunk/configure Wed Feb 1 19:11:30 2012 @@ -21096,7 +21096,7 @@ if test ${clang_src_root} = ""; then - clang_src_root="$ac_pwd/tools/clang" + clang_src_root="$srcdir/tools/clang" fi if test -f ${clang_src_root}/README.txt; then ac_config_headers="$ac_config_headers ${clang_src_root}/include/clang/Config/config.h" From clattner at apple.com Wed Feb 1 19:50:19 2012 From: clattner at apple.com (Chris Lattner) Date: Wed, 01 Feb 2012 17:50:19 -0800 Subject: [llvm-commits] [llvm] r149367 - in /llvm/trunk: include/llvm/IntrinsicsX86.td lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/2006-05-11-InstrSched.ll test/CodeGen/X86/avx-intrinsics-x86.ll test/CodeGen/X86/avx2-intrinsics-x86.ll In-Reply-To: <81F5B899-05F9-4C9D-A461-81777B249C77@apple.com> References: <20120131065244.AB1222A6C12C@llvm.org> <4F27C3F1.9060505@free.fr> <81F5B899-05F9-4C9D-A461-81777B249C77@apple.com> Message-ID: On Feb 1, 2012, at 10:29 AM, Evan Cheng wrote: > Can you add logic to bitcode upgrader to handle them? Great catch: this is something that is important, but that I overlooked. We really want LLVM to be able to read LLVM 3.0 bitcode (and later releases) files. If LLVM 3.0 generated these intrinsics, then we want the bitcode reader to be able to handle them, transparently rewriting them into the compare instructions they are now represented with. The code for this should just be dropped into lib/VMCore/AutoUpgrade.cpp. There are only a couple of intrinsics being upgraded now, but a lot more were supported back in the LLVM 3.0 release (and have been subsequently removed, since we don't need to support 2.x bitcode files). 
-Chris From craig.topper at gmail.com Wed Feb 1 20:24:12 2012 From: craig.topper at gmail.com (Craig Topper) Date: Wed, 1 Feb 2012 18:24:12 -0800 Subject: [llvm-commits] [llvm] r149367 - in /llvm/trunk: include/llvm/IntrinsicsX86.td lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/2006-05-11-InstrSched.ll test/CodeGen/X86/avx-intrinsics-x86.ll test/CodeGen/X86/avx2-intrinsics-x86.ll In-Reply-To: References: <20120131065244.AB1222A6C12C@llvm.org> <4F27C3F1.9060505@free.fr> <81F5B899-05F9-4C9D-A461-81777B249C77@apple.com> Message-ID: On Wed, Feb 1, 2012 at 5:50 PM, Chris Lattner wrote: > > On Feb 1, 2012, at 10:29 AM, Evan Cheng wrote: > > > Can you add logic to bitcode upgrader to handle them? > > Great catch: this is something that is important, but that I overlooked. > We really want LLVM to be able to read LLVM 3.0 bitcode (and later > releases) files. If LLVM 3.0 generated these intrinsics, then we want the > bitcode reader to be able to handle them, transparently rewriting them into > the compare instructions they are now represented with. > > The code for this should just be dropped into lib/VMCore/AutoUpgrade.cpp. > There are only a couple of intrinsics being upgraded now, but a lot more > were supported back in the LLVM 3.0 release (and have been subsequently > removed, since we don't need to support 2.x bitcode files). > I'll see if I can take a stab at doing this. > > -Chris > > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120201/bd663b6f/attachment.html From ahatanaka at mips.com Wed Feb 1 20:56:14 2012 From: ahatanaka at mips.com (Akira Hatanaka) Date: Thu, 02 Feb 2012 02:56:14 -0000 Subject: [llvm-commits] [llvm] r149583 - /llvm/trunk/lib/Target/Mips/MipsRegisterInfo.td Message-ID: <20120202025614.A5A0E2A6C12C@llvm.org> Author: ahatanak Date: Wed Feb 1 20:56:14 2012 New Revision: 149583 URL: http://llvm.org/viewvc/llvm-project?rev=149583&view=rev Log: Add DWARF numbers of 64-bit registers. Modified: llvm/trunk/lib/Target/Mips/MipsRegisterInfo.td Modified: llvm/trunk/lib/Target/Mips/MipsRegisterInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsRegisterInfo.td?rev=149583&r1=149582&r2=149583&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsRegisterInfo.td (original) +++ llvm/trunk/lib/Target/Mips/MipsRegisterInfo.td Wed Feb 1 20:56:14 2012 @@ -106,38 +106,38 @@ def RA : MipsGPRReg< 31, "RA">, DwarfRegNum<[31]>; // General Purpose 64-bit Registers - def ZERO_64 : Mips64GPRReg< 0, "ZERO", [ZERO]>; - def AT_64 : Mips64GPRReg< 1, "AT", [AT]>; - def V0_64 : Mips64GPRReg< 2, "2", [V0]>; - def V1_64 : Mips64GPRReg< 3, "3", [V1]>; - def A0_64 : Mips64GPRReg< 4, "4", [A0]>; - def A1_64 : Mips64GPRReg< 5, "5", [A1]>; - def A2_64 : Mips64GPRReg< 6, "6", [A2]>; - def A3_64 : Mips64GPRReg< 7, "7", [A3]>; - def T0_64 : Mips64GPRReg< 8, "8", [T0]>; - def T1_64 : Mips64GPRReg< 9, "9", [T1]>; - def T2_64 : Mips64GPRReg< 10, "10", [T2]>; - def T3_64 : Mips64GPRReg< 11, "11", [T3]>; - def T4_64 : Mips64GPRReg< 12, "12", [T4]>; - def T5_64 : Mips64GPRReg< 13, "13", [T5]>; - def T6_64 : Mips64GPRReg< 14, "14", [T6]>; - def T7_64 : Mips64GPRReg< 15, "15", [T7]>; - def S0_64 : Mips64GPRReg< 16, "16", [S0]>; - def S1_64 : Mips64GPRReg< 17, "17", [S1]>; - def S2_64 : Mips64GPRReg< 18, "18", [S2]>; - def S3_64 : Mips64GPRReg< 19, "19", 
[S3]>; - def S4_64 : Mips64GPRReg< 20, "20", [S4]>; - def S5_64 : Mips64GPRReg< 21, "21", [S5]>; - def S6_64 : Mips64GPRReg< 22, "22", [S6]>; - def S7_64 : Mips64GPRReg< 23, "23", [S7]>; - def T8_64 : Mips64GPRReg< 24, "24", [T8]>; - def T9_64 : Mips64GPRReg< 25, "25", [T9]>; - def K0_64 : Mips64GPRReg< 26, "26", [K0]>; - def K1_64 : Mips64GPRReg< 27, "27", [K1]>; - def GP_64 : Mips64GPRReg< 28, "GP", [GP]>; - def SP_64 : Mips64GPRReg< 29, "SP", [SP]>; - def FP_64 : Mips64GPRReg< 30, "FP", [FP]>; - def RA_64 : Mips64GPRReg< 31, "RA", [RA]>; + def ZERO_64 : Mips64GPRReg< 0, "ZERO", [ZERO]>, DwarfRegNum<[0]>; + def AT_64 : Mips64GPRReg< 1, "AT", [AT]>, DwarfRegNum<[1]>; + def V0_64 : Mips64GPRReg< 2, "2", [V0]>, DwarfRegNum<[2]>; + def V1_64 : Mips64GPRReg< 3, "3", [V1]>, DwarfRegNum<[3]>; + def A0_64 : Mips64GPRReg< 4, "4", [A0]>, DwarfRegNum<[4]>; + def A1_64 : Mips64GPRReg< 5, "5", [A1]>, DwarfRegNum<[5]>; + def A2_64 : Mips64GPRReg< 6, "6", [A2]>, DwarfRegNum<[6]>; + def A3_64 : Mips64GPRReg< 7, "7", [A3]>, DwarfRegNum<[7]>; + def T0_64 : Mips64GPRReg< 8, "8", [T0]>, DwarfRegNum<[8]>; + def T1_64 : Mips64GPRReg< 9, "9", [T1]>, DwarfRegNum<[9]>; + def T2_64 : Mips64GPRReg< 10, "10", [T2]>, DwarfRegNum<[10]>; + def T3_64 : Mips64GPRReg< 11, "11", [T3]>, DwarfRegNum<[11]>; + def T4_64 : Mips64GPRReg< 12, "12", [T4]>, DwarfRegNum<[12]>; + def T5_64 : Mips64GPRReg< 13, "13", [T5]>, DwarfRegNum<[13]>; + def T6_64 : Mips64GPRReg< 14, "14", [T6]>, DwarfRegNum<[14]>; + def T7_64 : Mips64GPRReg< 15, "15", [T7]>, DwarfRegNum<[15]>; + def S0_64 : Mips64GPRReg< 16, "16", [S0]>, DwarfRegNum<[16]>; + def S1_64 : Mips64GPRReg< 17, "17", [S1]>, DwarfRegNum<[17]>; + def S2_64 : Mips64GPRReg< 18, "18", [S2]>, DwarfRegNum<[18]>; + def S3_64 : Mips64GPRReg< 19, "19", [S3]>, DwarfRegNum<[19]>; + def S4_64 : Mips64GPRReg< 20, "20", [S4]>, DwarfRegNum<[20]>; + def S5_64 : Mips64GPRReg< 21, "21", [S5]>, DwarfRegNum<[21]>; + def S6_64 : Mips64GPRReg< 22, "22", [S6]>, DwarfRegNum<[22]>; + 
def S7_64 : Mips64GPRReg< 23, "23", [S7]>, DwarfRegNum<[23]>; + def T8_64 : Mips64GPRReg< 24, "24", [T8]>, DwarfRegNum<[24]>; + def T9_64 : Mips64GPRReg< 25, "25", [T9]>, DwarfRegNum<[25]>; + def K0_64 : Mips64GPRReg< 26, "26", [K0]>, DwarfRegNum<[26]>; + def K1_64 : Mips64GPRReg< 27, "27", [K1]>, DwarfRegNum<[27]>; + def GP_64 : Mips64GPRReg< 28, "GP", [GP]>, DwarfRegNum<[28]>; + def SP_64 : Mips64GPRReg< 29, "SP", [SP]>, DwarfRegNum<[29]>; + def FP_64 : Mips64GPRReg< 30, "FP", [FP]>, DwarfRegNum<[30]>; + def RA_64 : Mips64GPRReg< 31, "RA", [RA]>, DwarfRegNum<[31]>; /// Mips Single point precision FPU Registers def F0 : FPR< 0, "F0">, DwarfRegNum<[32]>; @@ -193,38 +193,38 @@ def D15 : AFPR<30
2011
2012
January
+ Improved support for the isl scheduling optimizer
+ Polly can now automatically optimize all polybench kernels without the help of
+ an external optimizer. The compile time is reasonable fast and we can show
+ notable speedups for various kernels.
+
2011
November

From eli.bendersky at intel.com Tue Jan 31 03:19:18 2012
From: eli.bendersky at intel.com (Bendersky, Eli)
Date: Tue, 31 Jan 2012 09:19:18 +0000
Subject: [llvm-commits] [PATCH] JIT profiling support with Intel Parallel Amplifier XE 2011 (VTune)
Message-ID: <9BBE4537D1BAAB479E9E8F9D4234619D326470@HASMSX103.ger.corp.intel.com>

Hello,

Currently, the only support for profiling LLVM JITted code is via OProfile. This patch adds profiling support for Intel Parallel Amplifier XE 2011 (formerly called "VTune"), does some refactoring to share code between the implementations, and adds unit tests for both the existing OProfile interface and the new Amplifier XE interface.

In more detail:

- Added an Intel JIT Events API compatible JITEventListener, and allowed OProfileJITEventListener to load libopagent.so at runtime
- Removed the link-time requirement on libopagent when building with OProfile support
- Added Intel JIT API and OProfile support to cmake builds (Boolean options LLVM_USE_OPROFILE and LLVM_USE_INTEL_JITEVENTS)
- Added IntelJITEventListener to connect to the Intel JIT API (support for profiling with Parallel Amplifier XE 2011)
- Added unit tests for both the IntelJIT and OProfile JITEventListener implementations, which can still be run in the absence of the respective 3rd party libraries

The change was broken into several patches. The first contains the new implementation and tests. The others are build system changes to incorporate the new code.

This is essentially similar to the patch sent by Daniel Malea at the beginning of December, which unfortunately hasn't received a reply. We updated it to apply cleanly to SVN trunk.

Please review.

Eli

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited.
If you are not the intended recipient, please contact the sender and delete all copies. -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-intel_jitevents_impl_and_tests_v2.patch Type: application/octet-stream Size: 48715 bytes Desc: 0001-intel_jitevents_impl_and_tests_v2.patch Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/c94ab33f/attachment-0004.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-intel_jitevents_cmake_v2.patch Type: application/octet-stream Size: 4550 bytes Desc: 0002-intel_jitevents_cmake_v2.patch Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/c94ab33f/attachment-0005.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: 0003-intel_jitevents_automake_v2.patch Type: application/octet-stream Size: 4169 bytes Desc: 0003-intel_jitevents_automake_v2.patch Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/c94ab33f/attachment-0006.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: 0004-intel_jitevents_autogenerated_v2.patch Type: application/octet-stream Size: 14098 bytes Desc: 0004-intel_jitevents_autogenerated_v2.patch Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/c94ab33f/attachment-0007.obj From pichet2000 at gmail.com Tue Jan 31 03:35:01 2012 From: pichet2000 at gmail.com (Francois Pichet) Date: Tue, 31 Jan 2012 09:35:01 -0000 Subject: [llvm-commits] [llvm] r149375 - /llvm/trunk/examples/BrainF/BrainF.cpp Message-ID: <20120131093501.EEBCA2A6C131@llvm.org> Author: fpichet Date: Tue Jan 31 03:35:01 2012 New Revision: 149375 URL: http://llvm.org/viewvc/llvm-project?rev=149375&view=rev Log: Fix BrainF compilation. 
Modified: llvm/trunk/examples/BrainF/BrainF.cpp Modified: llvm/trunk/examples/BrainF/BrainF.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/examples/BrainF/BrainF.cpp?rev=149375&r1=149374&r2=149375&view=diff ============================================================================== --- llvm/trunk/examples/BrainF/BrainF.cpp (original) +++ llvm/trunk/examples/BrainF/BrainF.cpp Tue Jan 31 03:35:01 2012 @@ -134,7 +134,8 @@ { //@aberrormsg = internal constant [%d x i8] c"\00" Constant *msg_0 = - ConstantArray::get(C, "Error: The head has left the tape.", true); + ConstantDataArray::getString(C, "Error: The head has left the tape.", + true); GlobalVariable *aberrormsg = new GlobalVariable( *module, From baldrick at free.fr Tue Jan 31 03:40:13 2012 From: baldrick at free.fr (Duncan Sands) Date: Tue, 31 Jan 2012 10:40:13 +0100 Subject: [llvm-commits] [llvm] r149335 - in /llvm/trunk/test: CodeGen/Generic/2007-12-31-UnusedSelector.ll CodeGen/Generic/2009-11-16-BadKillsCrash.ll CodeGen/Mips/eh.ll CodeGen/X86/2008-05-28-LocalRegAllocBug.ll CodeGen/X86/negate-add-zero.ll Transforms/Inline/inline-invoke-tail.ll Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll In-Reply-To: <20120131020907.DD6682A6C12C@llvm.org> References: <20120131020907.DD6682A6C12C@llvm.org> Message-ID: <4F27B6FD.7090302@free.fr> Hi Bill, does auto-upgrade still work with these changes you've been making? Thanks, Duncan. On 31/01/12 03:09, Bill Wendling wrote: > Author: void > Date: Mon Jan 30 20:09:07 2012 > New Revision: 149335 > > URL: http://llvm.org/viewvc/llvm-project?rev=149335&view=rev > Log: > Remove all references to the old EH. > > There was always the current EH. 
-- Ministry of Truth > > Modified: > llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll > llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll > llvm/trunk/test/CodeGen/Mips/eh.ll > llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll > llvm/trunk/test/CodeGen/X86/negate-add-zero.ll > llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll > llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll > > Modified: llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll?rev=149335&r1=149334&r2=149335&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll (original) > +++ llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll Mon Jan 30 20:09:07 2012 > @@ -30,8 +30,6 @@ > > declare void @__cxa_throw(i8*, i8*, void (i8*)*) noreturn > > -declare i32 @llvm.eh.selector.i32(i8*, i8*, ...) > - > declare void @__cxa_end_catch() > > declare i32 @__gxx_personality_v0(...) 
> > Modified: llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll?rev=149335&r1=149334&r2=149335&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll (original) > +++ llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll Mon Jan 30 20:09:07 2012 > @@ -15,8 +15,6 @@ > %"struct.std::locale::facet" = type { i32 (...)**, i32 } > %union..0._15 = type { i32 } > > -declare i8* @llvm.eh.exception() nounwind readonly > - > declare i8* @__cxa_begin_catch(i8*) nounwind > > declare %"struct.std::ctype"* @_ZSt9use_facetISt5ctypeIcEERKT_RKSt6locale(%"struct.std::locale"*) > > Modified: llvm/trunk/test/CodeGen/Mips/eh.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/eh.ll?rev=149335&r1=149334&r2=149335&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/Mips/eh.ll (original) > +++ llvm/trunk/test/CodeGen/Mips/eh.ll Mon Jan 30 20:09:07 2012 > @@ -54,16 +54,10 @@ > > declare i8* @__cxa_allocate_exception(i32) > > -declare i8* @llvm.eh.exception() nounwind readonly > - > declare i32 @__gxx_personality_v0(...) > > -declare i32 @llvm.eh.selector(i8*, i8*, ...) 
nounwind > - > declare i32 @llvm.eh.typeid.for(i8*) nounwind > > -declare void @llvm.eh.resume(i8*, i32) > - > declare void @__cxa_throw(i8*, i8*, i8*) > > declare i8* @__cxa_begin_catch(i8*) > > Modified: llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll?rev=149335&r1=149334&r2=149335&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll (original) > +++ llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll Mon Jan 30 20:09:07 2012 > @@ -2,8 +2,6 @@ > > @_ZTVN10Evaluation10GridOutputILi3EEE = external constant [5 x i32 (...)*] ;<[5 x i32 (...)*]*> [#uses=1] > > -declare i8* @llvm.eh.exception() nounwind > - > declare i8* @_Znwm(i32) > > declare i8* @__cxa_begin_catch(i8*) nounwind > > Modified: llvm/trunk/test/CodeGen/X86/negate-add-zero.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/negate-add-zero.ll?rev=149335&r1=149334&r2=149335&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/negate-add-zero.ll (original) > +++ llvm/trunk/test/CodeGen/X86/negate-add-zero.ll Mon Jan 30 20:09:07 2012 > @@ -486,10 +486,6 @@ > > declare i8* @_Znwm(i32) > > -declare i8* @llvm.eh.exception() nounwind > - > -declare i32 @llvm.eh.selector.i32(i8*, i8*, ...) 
nounwind > - > declare i32 @llvm.eh.typeid.for.i32(i8*) nounwind > > declare void @_ZdlPv(i8*) nounwind > > Modified: llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll?rev=149335&r1=149334&r2=149335&view=diff > ============================================================================== > --- llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll (original) > +++ llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll Mon Jan 30 20:09:07 2012 > @@ -28,10 +28,6 @@ > unreachable > } > > -declare i8* @llvm.eh.exception() nounwind readonly > - > -declare i32 @llvm.eh.selector(i8*, i8*, ...) nounwind > - > declare i32 @__gxx_personality_v0(...) > > declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind > > Modified: llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll?rev=149335&r1=149334&r2=149335&view=diff > ============================================================================== > --- llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll (original) > +++ llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll Mon Jan 30 20:09:07 2012 > @@ -21,10 +21,6 @@ > > declare i8* @__cxa_begin_catch(i8*) nounwind > > -declare i8* @llvm.eh.exception() nounwind > - > -declare i32 @llvm.eh.selector.i32(i8*, i8*, ...) nounwind > - > declare void @__cxa_end_catch() > > declare i32 @__gxx_personality_v0(...) 
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From baldrick at free.fr Tue Jan 31 03:49:58 2012
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 31 Jan 2012 10:49:58 +0100
Subject: [llvm-commits] [llvm] r149356 - in /llvm/trunk: include/llvm/Analysis/ValueTracking.h lib/Analysis/ValueTracking.cpp lib/CodeGen/SelectionDAG/SelectionDAG.cpp
In-Reply-To: <20120131050917.C16DD2A6C12C@llvm.org>
References: <20120131050917.C16DD2A6C12C@llvm.org>
Message-ID: <4F27B946.2020800@free.fr>

Hi Chris,

> --- llvm/trunk/include/llvm/Analysis/ValueTracking.h (original)
> +++ llvm/trunk/include/llvm/Analysis/ValueTracking.h Mon Jan 30 23:09:17 2012
> @@ -132,8 +132,8 @@
> uint64_t Offset = 0);
>
> // FIXME: Remove this.
> - bool GetConstantStringInfo(const Value *V, std::string&Str,
> - uint64_t Offset = 0);
> + // bool GetConstantStringInfo(const Value *V, std::string&Str,
> + // uint64_t Offset = 0);

did you mean to leave this in the file?

Ciao, Duncan.

From timurrrr at google.com Tue Jan 31 04:12:13 2012
From: timurrrr at google.com (Timur Iskhodzhanov)
Date: Tue, 31 Jan 2012 14:12:13 +0400
Subject: [llvm-commits] [PATCH][AddressSanitizer] A few more tweaks for the Visual Studio ASan build
Message-ID:

Hi Kostya,

Can you please review the attached patch and land it if everything's OK with it?
It fixes the wrong __WORDSIZE definition on Win x64 and also introduces the ASAN_INTERFACE_FUNCTION_ATTRIBUTE you suggested in the other discussion.

Thanks,
Timur Iskhodzhanov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: win2.patch Type: text/x-patch Size: 3419 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/31e5d510/attachment.bin From baldrick at free.fr Tue Jan 31 04:35:29 2012 From: baldrick at free.fr (Duncan Sands) Date: Tue, 31 Jan 2012 11:35:29 +0100 Subject: [llvm-commits] [llvm] r149367 - in /llvm/trunk: include/llvm/IntrinsicsX86.td lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/2006-05-11-InstrSched.ll test/CodeGen/X86/avx-intrinsics-x86.ll test/CodeGen/X86/avx2-intrinsics-x86.ll In-Reply-To: <20120131065244.AB1222A6C12C@llvm.org> References: <20120131065244.AB1222A6C12C@llvm.org> Message-ID: <4F27C3F1.9060505@free.fr> Hi Craig, > Remove pcmpgt/pcmpeq intrinsics as clang is not using them. dragonegg is using them. Can the same effect be obtained some other way? Ciao, Duncan. > > Modified: > llvm/trunk/include/llvm/IntrinsicsX86.td > llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll > llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll > llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll > > Modified: llvm/trunk/include/llvm/IntrinsicsX86.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IntrinsicsX86.td?rev=149367&r1=149366&r2=149367&view=diff > ============================================================================== > --- llvm/trunk/include/llvm/IntrinsicsX86.td (original) > +++ llvm/trunk/include/llvm/IntrinsicsX86.td Tue Jan 31 00:52:44 2012 > @@ -452,28 +452,6 @@ > llvm_i32_ty], [IntrNoMem]>; > } > > -// Integer comparison ops > -let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". 
> - def int_x86_sse2_pcmpeq_b : GCCBuiltin<"__builtin_ia32_pcmpeqb128">, > - Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, > - llvm_v16i8_ty], [IntrNoMem, Commutative]>; > - def int_x86_sse2_pcmpeq_w : GCCBuiltin<"__builtin_ia32_pcmpeqw128">, > - Intrinsic<[llvm_v8i16_ty], [llvm_v8i16_ty, > - llvm_v8i16_ty], [IntrNoMem, Commutative]>; > - def int_x86_sse2_pcmpeq_d : GCCBuiltin<"__builtin_ia32_pcmpeqd128">, > - Intrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, > - llvm_v4i32_ty], [IntrNoMem, Commutative]>; > - def int_x86_sse2_pcmpgt_b : GCCBuiltin<"__builtin_ia32_pcmpgtb128">, > - Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, > - llvm_v16i8_ty], [IntrNoMem]>; > - def int_x86_sse2_pcmpgt_w : GCCBuiltin<"__builtin_ia32_pcmpgtw128">, > - Intrinsic<[llvm_v8i16_ty], [llvm_v8i16_ty, > - llvm_v8i16_ty], [IntrNoMem]>; > - def int_x86_sse2_pcmpgt_d : GCCBuiltin<"__builtin_ia32_pcmpgtd128">, > - Intrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, > - llvm_v4i32_ty], [IntrNoMem]>; > -} > - > // Conversion ops > let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". > def int_x86_sse2_cvtdq2pd : GCCBuiltin<"__builtin_ia32_cvtdq2pd">, > @@ -792,12 +770,6 @@ > > // Vector compare, min, max > let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". > - def int_x86_sse41_pcmpeqq : GCCBuiltin<"__builtin_ia32_pcmpeqq">, > - Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, llvm_v2i64_ty], > - [IntrNoMem, Commutative]>; > - def int_x86_sse42_pcmpgtq : GCCBuiltin<"__builtin_ia32_pcmpgtq">, > - Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, llvm_v2i64_ty], > - [IntrNoMem]>; > def int_x86_sse41_pmaxsb : GCCBuiltin<"__builtin_ia32_pmaxsb128">, > Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, llvm_v16i8_ty], > [IntrNoMem, Commutative]>; > @@ -1515,34 +1487,6 @@ > llvm_i32_ty], [IntrNoMem]>; > } > > -// Integer comparison ops > -let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". 
> - def int_x86_avx2_pcmpeq_b : GCCBuiltin<"__builtin_ia32_pcmpeqb256">, > - Intrinsic<[llvm_v32i8_ty], [llvm_v32i8_ty, llvm_v32i8_ty], > - [IntrNoMem, Commutative]>; > - def int_x86_avx2_pcmpeq_w : GCCBuiltin<"__builtin_ia32_pcmpeqw256">, > - Intrinsic<[llvm_v16i16_ty], [llvm_v16i16_ty, llvm_v16i16_ty], > - [IntrNoMem, Commutative]>; > - def int_x86_avx2_pcmpeq_d : GCCBuiltin<"__builtin_ia32_pcmpeqd256">, > - Intrinsic<[llvm_v8i32_ty], [llvm_v8i32_ty, llvm_v8i32_ty], > - [IntrNoMem, Commutative]>; > - def int_x86_avx2_pcmpeq_q : GCCBuiltin<"__builtin_ia32_pcmpeqq256">, > - Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v4i64_ty], > - [IntrNoMem, Commutative]>; > - def int_x86_avx2_pcmpgt_b : GCCBuiltin<"__builtin_ia32_pcmpgtb256">, > - Intrinsic<[llvm_v32i8_ty], [llvm_v32i8_ty, llvm_v32i8_ty], > - [IntrNoMem]>; > - def int_x86_avx2_pcmpgt_w : GCCBuiltin<"__builtin_ia32_pcmpgtw256">, > - Intrinsic<[llvm_v16i16_ty], [llvm_v16i16_ty, llvm_v16i16_ty], > - [IntrNoMem]>; > - def int_x86_avx2_pcmpgt_d : GCCBuiltin<"__builtin_ia32_pcmpgtd256">, > - Intrinsic<[llvm_v8i32_ty], [llvm_v8i32_ty, llvm_v8i32_ty], > - [IntrNoMem]>; > - def int_x86_avx2_pcmpgt_q : GCCBuiltin<"__builtin_ia32_pcmpgtq256">, > - Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v4i64_ty], > - [IntrNoMem]>; > -} > - > // Pack ops. > let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". 
> def int_x86_avx2_packsswb : GCCBuiltin<"__builtin_ia32_packsswb256">, > > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149367&r1=149366&r2=149367&view=diff > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Jan 31 00:52:44 2012 > @@ -9492,26 +9492,6 @@ > case Intrinsic::x86_avx2_psrav_d_256: > return DAG.getNode(ISD::SRA, dl, Op.getValueType(), > Op.getOperand(1), Op.getOperand(2)); > - case Intrinsic::x86_sse2_pcmpeq_b: > - case Intrinsic::x86_sse2_pcmpeq_w: > - case Intrinsic::x86_sse2_pcmpeq_d: > - case Intrinsic::x86_sse41_pcmpeqq: > - case Intrinsic::x86_avx2_pcmpeq_b: > - case Intrinsic::x86_avx2_pcmpeq_w: > - case Intrinsic::x86_avx2_pcmpeq_d: > - case Intrinsic::x86_avx2_pcmpeq_q: > - return DAG.getNode(X86ISD::PCMPEQ, dl, Op.getValueType(), > - Op.getOperand(1), Op.getOperand(2)); > - case Intrinsic::x86_sse2_pcmpgt_b: > - case Intrinsic::x86_sse2_pcmpgt_w: > - case Intrinsic::x86_sse2_pcmpgt_d: > - case Intrinsic::x86_sse42_pcmpgtq: > - case Intrinsic::x86_avx2_pcmpgt_b: > - case Intrinsic::x86_avx2_pcmpgt_w: > - case Intrinsic::x86_avx2_pcmpgt_d: > - case Intrinsic::x86_avx2_pcmpgt_q: > - return DAG.getNode(X86ISD::PCMPGT, dl, Op.getValueType(), > - Op.getOperand(1), Op.getOperand(2)); > case Intrinsic::x86_ssse3_pshuf_b_128: > case Intrinsic::x86_avx2_pshuf_b: > return DAG.getNode(X86ISD::PSHUFB, dl, Op.getValueType(), > > Modified: llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll?rev=149367&r1=149366&r2=149367&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll (original) > +++ 
llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll Tue Jan 31 00:52:44 2012 > @@ -30,7 +30,7 @@ > %tmp87 = bitcast<16 x i8> %tmp66 to<4 x i32> ;<<4 x i32>> [#uses=1] > %tmp88 = add<4 x i32> %tmp87, %tmp77 ;<<4 x i32>> [#uses=2] > %tmp88.upgrd.4 = bitcast<4 x i32> %tmp88 to<2 x i64> ;<<2 x i64>> [#uses=1] > - %tmp99 = tail call<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32> %tmp88,<4 x i32> %tmp55 ) ;<<4 x i32>> [#uses=1] > + %tmp99 = tail call<4 x i32> @llvm.x86.sse2.psra.d(<4 x i32> %tmp88,<4 x i32> %tmp55 ) ;<<4 x i32>> [#uses=1] > %tmp99.upgrd.5 = bitcast<4 x i32> %tmp99 to<2 x i64> ;<<2 x i64>> [#uses=2] > %tmp110 = xor<2 x i64> %tmp99.upgrd.5,< i64 -1, i64 -1> ;<<2 x i64>> [#uses=1] > %tmp111 = and<2 x i64> %tmp110, %tmp55.upgrd.2 ;<<2 x i64>> [#uses=1] > @@ -48,4 +48,4 @@ > ret void > } > > -declare<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32>,<4 x i32>) > +declare<4 x i32> @llvm.x86.sse2.psra.d(<4 x i32>,<4 x i32>) > > Modified: llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll?rev=149367&r1=149366&r2=149367&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll (original) > +++ llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll Tue Jan 31 00:52:44 2012 > @@ -369,54 +369,6 @@ > declare<8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16>,<8 x i16>) nounwind readnone > > > -define<16 x i8> @test_x86_sse2_pcmpeq_b(<16 x i8> %a0,<16 x i8> %a1) { > - ; CHECK: vpcmpeqb > - %res = call<16 x i8> @llvm.x86.sse2.pcmpeq.b(<16 x i8> %a0,<16 x i8> %a1) ;<<16 x i8>> [#uses=1] > - ret<16 x i8> %res > -} > -declare<16 x i8> @llvm.x86.sse2.pcmpeq.b(<16 x i8>,<16 x i8>) nounwind readnone > - > - > -define<4 x i32> @test_x86_sse2_pcmpeq_d(<4 x i32> %a0,<4 x i32> %a1) { > - ; CHECK: vpcmpeqd > - %res = call<4 x i32> @llvm.x86.sse2.pcmpeq.d(<4 x i32> %a0,<4 x i32> %a1) ;<<4 x i32>> [#uses=1] > - ret<4 x i32> %res 
> -} > -declare<4 x i32> @llvm.x86.sse2.pcmpeq.d(<4 x i32>,<4 x i32>) nounwind readnone > - > - > -define<8 x i16> @test_x86_sse2_pcmpeq_w(<8 x i16> %a0,<8 x i16> %a1) { > - ; CHECK: vpcmpeqw > - %res = call<8 x i16> @llvm.x86.sse2.pcmpeq.w(<8 x i16> %a0,<8 x i16> %a1) ;<<8 x i16>> [#uses=1] > - ret<8 x i16> %res > -} > -declare<8 x i16> @llvm.x86.sse2.pcmpeq.w(<8 x i16>,<8 x i16>) nounwind readnone > - > - > -define<16 x i8> @test_x86_sse2_pcmpgt_b(<16 x i8> %a0,<16 x i8> %a1) { > - ; CHECK: vpcmpgtb > - %res = call<16 x i8> @llvm.x86.sse2.pcmpgt.b(<16 x i8> %a0,<16 x i8> %a1) ;<<16 x i8>> [#uses=1] > - ret<16 x i8> %res > -} > -declare<16 x i8> @llvm.x86.sse2.pcmpgt.b(<16 x i8>,<16 x i8>) nounwind readnone > - > - > -define<4 x i32> @test_x86_sse2_pcmpgt_d(<4 x i32> %a0,<4 x i32> %a1) { > - ; CHECK: vpcmpgtd > - %res = call<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32> %a0,<4 x i32> %a1) ;<<4 x i32>> [#uses=1] > - ret<4 x i32> %res > -} > -declare<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32>,<4 x i32>) nounwind readnone > - > - > -define<8 x i16> @test_x86_sse2_pcmpgt_w(<8 x i16> %a0,<8 x i16> %a1) { > - ; CHECK: vpcmpgtw > - %res = call<8 x i16> @llvm.x86.sse2.pcmpgt.w(<8 x i16> %a0,<8 x i16> %a1) ;<<8 x i16>> [#uses=1] > - ret<8 x i16> %res > -} > -declare<8 x i16> @llvm.x86.sse2.pcmpgt.w(<8 x i16>,<8 x i16>) nounwind readnone > - > - > define<4 x i32> @test_x86_sse2_pmadd_wd(<8 x i16> %a0,<8 x i16> %a1) { > ; CHECK: vpmaddwd > %res = call<4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a0,<8 x i16> %a1) ;<<4 x i32>> [#uses=1] > @@ -950,14 +902,6 @@ > declare<8 x i16> @llvm.x86.sse41.pblendw(<8 x i16>,<8 x i16>, i32) nounwind readnone > > > -define<2 x i64> @test_x86_sse41_pcmpeqq(<2 x i64> %a0,<2 x i64> %a1) { > - ; CHECK: vpcmpeqq > - %res = call<2 x i64> @llvm.x86.sse41.pcmpeqq(<2 x i64> %a0,<2 x i64> %a1) ;<<2 x i64>> [#uses=1] > - ret<2 x i64> %res > -} > -declare<2 x i64> @llvm.x86.sse41.pcmpeqq(<2 x i64>,<2 x i64>) nounwind readnone > - > - > define<8 x i16> 
@test_x86_sse41_phminposuw(<8 x i16> %a0) { > ; CHECK: vphminposuw > %res = call<8 x i16> @llvm.x86.sse41.phminposuw(<8 x i16> %a0) ;<<8 x i16>> [#uses=1] > @@ -1271,14 +1215,6 @@ > declare<16 x i8> @llvm.x86.sse42.pcmpestrm128(<16 x i8>, i32,<16 x i8>, i32, i8) nounwind readnone > > > -define<2 x i64> @test_x86_sse42_pcmpgtq(<2 x i64> %a0,<2 x i64> %a1) { > - ; CHECK: vpcmpgtq > - %res = call<2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64> %a0,<2 x i64> %a1) ;<<2 x i64>> [#uses=1] > - ret<2 x i64> %res > -} > -declare<2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64>,<2 x i64>) nounwind readnone > - > - > define i32 @test_x86_sse42_pcmpistri128(<16 x i8> %a0,<16 x i8> %a1) { > ; CHECK: vpcmpistri > ; CHECK: movl > > Modified: llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll?rev=149367&r1=149366&r2=149367&view=diff > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll (original) > +++ llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll Tue Jan 31 00:52:44 2012 > @@ -72,54 +72,6 @@ > declare<16 x i16> @llvm.x86.avx2.pavg.w(<16 x i16>,<16 x i16>) nounwind readnone > > > -define<32 x i8> @test_x86_avx2_pcmpeq_b(<32 x i8> %a0,<32 x i8> %a1) { > - ; CHECK: vpcmpeqb > - %res = call<32 x i8> @llvm.x86.avx2.pcmpeq.b(<32 x i8> %a0,<32 x i8> %a1) ;<<32 x i8>> [#uses=1] > - ret<32 x i8> %res > -} > -declare<32 x i8> @llvm.x86.avx2.pcmpeq.b(<32 x i8>,<32 x i8>) nounwind readnone > - > - > -define<8 x i32> @test_x86_avx2_pcmpeq_d(<8 x i32> %a0,<8 x i32> %a1) { > - ; CHECK: vpcmpeqd > - %res = call<8 x i32> @llvm.x86.avx2.pcmpeq.d(<8 x i32> %a0,<8 x i32> %a1) ;<<8 x i32>> [#uses=1] > - ret<8 x i32> %res > -} > -declare<8 x i32> @llvm.x86.avx2.pcmpeq.d(<8 x i32>,<8 x i32>) nounwind readnone > - > - > -define<16 x i16> @test_x86_avx2_pcmpeq_w(<16 x i16> %a0,<16 x i16> %a1) { > - ; CHECK: vpcmpeqw > - %res = call<16 x 
i16> @llvm.x86.avx2.pcmpeq.w(<16 x i16> %a0,<16 x i16> %a1) ;<<16 x i16>> [#uses=1] > - ret<16 x i16> %res > -} > -declare<16 x i16> @llvm.x86.avx2.pcmpeq.w(<16 x i16>,<16 x i16>) nounwind readnone > - > - > -define<32 x i8> @test_x86_avx2_pcmpgt_b(<32 x i8> %a0,<32 x i8> %a1) { > - ; CHECK: vpcmpgtb > - %res = call<32 x i8> @llvm.x86.avx2.pcmpgt.b(<32 x i8> %a0,<32 x i8> %a1) ;<<32 x i8>> [#uses=1] > - ret<32 x i8> %res > -} > -declare<32 x i8> @llvm.x86.avx2.pcmpgt.b(<32 x i8>,<32 x i8>) nounwind readnone > - > - > -define<8 x i32> @test_x86_avx2_pcmpgt_d(<8 x i32> %a0,<8 x i32> %a1) { > - ; CHECK: vpcmpgtd > - %res = call<8 x i32> @llvm.x86.avx2.pcmpgt.d(<8 x i32> %a0,<8 x i32> %a1) ;<<8 x i32>> [#uses=1] > - ret<8 x i32> %res > -} > -declare<8 x i32> @llvm.x86.avx2.pcmpgt.d(<8 x i32>,<8 x i32>) nounwind readnone > - > - > -define<16 x i16> @test_x86_avx2_pcmpgt_w(<16 x i16> %a0,<16 x i16> %a1) { > - ; CHECK: vpcmpgtw > - %res = call<16 x i16> @llvm.x86.avx2.pcmpgt.w(<16 x i16> %a0,<16 x i16> %a1) ;<<16 x i16>> [#uses=1] > - ret<16 x i16> %res > -} > -declare<16 x i16> @llvm.x86.avx2.pcmpgt.w(<16 x i16>,<16 x i16>) nounwind readnone > - > - > define<8 x i32> @test_x86_avx2_pmadd_wd(<16 x i16> %a0,<16 x i16> %a1) { > ; CHECK: vpmaddwd > %res = call<8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16> %a0,<16 x i16> %a1) ;<<8 x i32>> [#uses=1] > @@ -553,14 +505,6 @@ > declare<16 x i16> @llvm.x86.avx2.pblendw(<16 x i16>,<16 x i16>, i32) nounwind readnone > > > -define<4 x i64> @test_x86_avx2_pcmpeqq(<4 x i64> %a0,<4 x i64> %a1) { > - ; CHECK: vpcmpeqq > - %res = call<4 x i64> @llvm.x86.avx2.pcmpeq.q(<4 x i64> %a0,<4 x i64> %a1) ;<<4 x i64>> [#uses=1] > - ret<4 x i64> %res > -} > -declare<4 x i64> @llvm.x86.avx2.pcmpeq.q(<4 x i64>,<4 x i64>) nounwind readnone > - > - > define<32 x i8> @test_x86_avx2_pmaxsb(<32 x i8> %a0,<32 x i8> %a1) { > ; CHECK: vpmaxsb > %res = call<32 x i8> @llvm.x86.avx2.pmaxs.b(<32 x i8> %a0,<32 x i8> %a1) ;<<32 x i8>> [#uses=1] > @@ -729,14 +673,6 @@ 
> declare<4 x i64> @llvm.x86.avx2.pmul.dq(<8 x i32>,<8 x i32>) nounwind readnone > > > -define<4 x i64> @test_x86_avx2_pcmpgtq(<4 x i64> %a0,<4 x i64> %a1) { > - ; CHECK: vpcmpgtq > - %res = call<4 x i64> @llvm.x86.avx2.pcmpgt.q(<4 x i64> %a0,<4 x i64> %a1) ;<<4 x i64>> [#uses=1] > - ret<4 x i64> %res > -} > -declare<4 x i64> @llvm.x86.avx2.pcmpgt.q(<4 x i64>,<4 x i64>) nounwind readnone > - > - > define<4 x i64> @test_x86_avx2_vbroadcasti128(i8* %a0) { > ; CHECK: vbroadcasti128 > %res = call<4 x i64> @llvm.x86.avx2.vbroadcasti128(i8* %a0) ;<<4 x i64>> [#uses=1] > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From benny.kra at googlemail.com Tue Jan 31 04:39:36 2012 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Tue, 31 Jan 2012 10:39:36 -0000 Subject: [llvm-commits] [dragonegg] r149378 - in /dragonegg/trunk/src: Backend.cpp Constants.cpp Message-ID: <20120131103936.9BA7E2A6C131@llvm.org> Author: d0k Date: Tue Jan 31 04:39:36 2012 New Revision: 149378 URL: http://llvm.org/viewvc/llvm-project?rev=149378&view=rev Log: ConstantArray::get doesn't do strings anymore. Modified: dragonegg/trunk/src/Backend.cpp dragonegg/trunk/src/Constants.cpp Modified: dragonegg/trunk/src/Backend.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/src/Backend.cpp?rev=149378&r1=149377&r2=149378&view=diff ============================================================================== --- dragonegg/trunk/src/Backend.cpp (original) +++ dragonegg/trunk/src/Backend.cpp Tue Jan 31 04:39:36 2012 @@ -750,7 +750,7 @@ /// global if possible. Constant* ConvertMetadataStringToGV(const char *str) { - Constant *Init = ConstantArray::get(getGlobalContext(), std::string(str)); + Constant *Init = ConstantDataArray::getString(getGlobalContext(), str); // Use cached string if it exists. 
static std::map StringCSTCache; Modified: dragonegg/trunk/src/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/src/Constants.cpp?rev=149378&r1=149377&r2=149378&view=diff ============================================================================== --- dragonegg/trunk/src/Constants.cpp (original) +++ dragonegg/trunk/src/Constants.cpp Tue Jan 31 04:39:36 2012 @@ -788,8 +788,8 @@ assert(CharsWritten == SizeInChars && "Failed to fully encode expression!"); (void)CharsWritten; // Avoid unused variable warning when assertions disabled. // Turn it into an LLVM byte array. - return ConstantArray::get(Context, StringRef((char *)&Buffer[0], SizeInChars), - /*AddNull*/false); + StringRef Str((char *)&Buffer[0], SizeInChars); + return ConstantDataArray::getString(Context, str, /*AddNull*/false); } static Constant *ConvertSTRING_CST(tree exp, TargetFolder &) { From baldrick at free.fr Tue Jan 31 04:47:27 2012 From: baldrick at free.fr (Duncan Sands) Date: Tue, 31 Jan 2012 10:47:27 -0000 Subject: [llvm-commits] [dragonegg] r149379 - /dragonegg/trunk/src/Constants.cpp Message-ID: <20120131104727.C3CED2A6C131@llvm.org> Author: baldrick Date: Tue Jan 31 04:47:27 2012 New Revision: 149379 URL: http://llvm.org/viewvc/llvm-project?rev=149379&view=rev Log: Simplify after move to ConstantDataArray. Modified: dragonegg/trunk/src/Constants.cpp Modified: dragonegg/trunk/src/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/src/Constants.cpp?rev=149379&r1=149378&r2=149379&view=diff ============================================================================== --- dragonegg/trunk/src/Constants.cpp (original) +++ dragonegg/trunk/src/Constants.cpp Tue Jan 31 04:47:27 2012 @@ -783,13 +783,12 @@ unsigned SizeInChars = (TREE_INT_CST_LOW(TYPE_SIZE(type)) + CHAR_BIT - 1) / CHAR_BIT; // Encode the constant in Buffer in target format. 
- std::vector Buffer(SizeInChars); + SmallVector Buffer(SizeInChars); unsigned CharsWritten = native_encode_expr(exp, &Buffer[0], SizeInChars); assert(CharsWritten == SizeInChars && "Failed to fully encode expression!"); (void)CharsWritten; // Avoid unused variable warning when assertions disabled. // Turn it into an LLVM byte array. - StringRef Str((char *)&Buffer[0], SizeInChars); - return ConstantDataArray::getString(Context, str, /*AddNull*/false); + return ConstantDataArray::get(Context, Buffer); } static Constant *ConvertSTRING_CST(tree exp, TargetFolder &) { From baldrick at free.fr Tue Jan 31 04:53:34 2012 From: baldrick at free.fr (Duncan Sands) Date: Tue, 31 Jan 2012 11:53:34 +0100 Subject: [llvm-commits] [dragonegg] r149378 - in /dragonegg/trunk/src: Backend.cpp Constants.cpp In-Reply-To: <20120131103936.9BA7E2A6C131@llvm.org> References: <20120131103936.9BA7E2A6C131@llvm.org> Message-ID: <4F27C82E.6060106@free.fr> Hi Benjamin, > ConstantArray::get doesn't do strings anymore. thanks for doing this. I did this too but was still running my version through the dragonegg testsuite :) I've just committed some additional changes from my version on top of yours. Ciao, Duncan. From sabre at nondot.org Tue Jan 31 05:50:28 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 11:50:28 -0000 Subject: [llvm-commits] [www] r149380 - /www/trunk/OpenProjects.html Message-ID: <20120131115028.84D2A2A6C12C@llvm.org> Author: lattner Date: Tue Jan 31 05:50:28 2012 New Revision: 149380 URL: http://llvm.org/viewvc/llvm-project?rev=149380&view=rev Log: specifically list a few open projects, to raise their awareness. 
Modified: www/trunk/OpenProjects.html Modified: www/trunk/OpenProjects.html URL: http://llvm.org/viewvc/llvm-project/www/trunk/OpenProjects.html?rev=149380&r1=149379&r2=149380&view=diff ============================================================================== --- www/trunk/OpenProjects.html (original) +++ www/trunk/OpenProjects.html Tue Jan 31 05:50:28 2012 @@ -109,8 +109,22 @@ has "code-cleanup" bugs filed in it. Taking one of these and fixing it is a good way to get your feet wet in the -LLVM code and discover how some of its components work. +LLVM code and discover how some of its components work. Some of these include +some major IR redesign work, which is high-impact because it can simplify a lot +of things in the optimizer.

+
+
+Some specific ones that would be great to have:
+
+
+
+
Additionally, there are performance improvements in LLVM that need to get fixed. These are marked with the slow-compile keyword. Use this Bugzilla query
@@ -204,7 +218,7 @@
transactions to the PassManager for improved bugpoint.
  • Improve bugpoint to support running tests in parallel on MP machines.
-  • Add JIT support to the SPARC port.
+  • Add MC assembler/disassembler and JIT support to the SPARC port.
  • Move more optimizations out of the -instcombine pass and into InstructionSimplify. The optimizations that should be moved are those that do not create new instructions, for example turning sub i32 %x, 0 From sabre at nondot.org Tue Jan 31 05:56:15 2012 From: sabre at nondot.org (Chris Lattner) Date: Tue, 31 Jan 2012 03:56:15 -0800 Subject: [llvm-commits] [llvm] r149365 - in /llvm/trunk: include/llvm/Constants.h lib/AsmParser/LLParser.cpp lib/Transforms/Instrumentation/AddressSanitizer.cpp lib/VMCore/Constants.cpp lib/VMCore/Core.cpp lib/VMCore/IRBuilder.cpp tools/bugpoint/Miscompil In-Reply-To: References: Message-ID: On Jan 31, 2012, at 12:27 AM, Alexander Potapenko wrote: > On Tue, Jan 31, 2012 at 10:18 AM, Chris Lattner wrote: >> Author: lattner >> Date: Tue Jan 31 00:18:43 2012 >> New Revision: 149365 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=149365&view=rev >> Log: >> eliminate the "string" form of ConstantArray::get, using >> ConstantDataArray::getString instead. > Chris, > > our internal (for the moment) ASan buildbot is reporting test failures > starting at the range of r149363 to r149365 on Linux x64 and Mac x64 > Namely they are: That sounds bad, and I'd definitely like to fix these. Can you give me more information about how they are failing? What command line is failing, and with what stack trace? 
-Chris > ******************** TEST 'Clang :: CodeGenObjC/arc-ivar-layout.m' > FAILED ******************** > ******************** TEST 'Clang :: > CodeGenObjC/arc-block-ivar-layout.m' FAILED ******************** > ******************** TEST 'Clang :: CodeGenObjC/block-var-layout.m' > FAILED ******************** > ******************** TEST 'Clang :: > CodeGenObjC/ivar-layout-array0-struct.m' FAILED ******************** > ******************** TEST 'Clang :: CodeGenObjC/ivar-layout-64.m' > FAILED ******************** > ******************** TEST 'Clang :: > CodeGenObjC/ivar-layout-no-optimize.m' FAILED ******************** > ******************** TEST 'Clang :: CodeGenObjCXX/block-var-layout.mm' > FAILED ******************** > > HTH, > Alex From klimek at google.com Tue Jan 31 06:09:30 2012 From: klimek at google.com (Manuel Klimek) Date: Tue, 31 Jan 2012 13:09:30 +0100 Subject: [llvm-commits] [PATCH] Fix bug in RefCountedBaseVPTR and add regression tests for some of the latest changes Message-ID: RefCountedBaseVPTR needs the IntrusiveRefCntPtrInfo as friend, now that this handles the release / retain calls. Adds a regression test for that bug (which is a compile-time regression) and for the last two changes to the IntrusiveRefCntPtr, especially tests for the memory leak due to copy construction of the ref-counted object and ensuring that the traits are used for release / retain calls. Cheers, /Manuel -------------- next part -------------- A non-text attachment was scrubbed... 
Name: refcount.patch Type: text/x-patch Size: 3069 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/f5a5b4b1/attachment.bin From glider at google.com Tue Jan 31 06:22:41 2012 From: glider at google.com (Alexander Potapenko) Date: Tue, 31 Jan 2012 16:22:41 +0400 Subject: [llvm-commits] [llvm] r149365 - in /llvm/trunk: include/llvm/Constants.h lib/AsmParser/LLParser.cpp lib/Transforms/Instrumentation/AddressSanitizer.cpp lib/VMCore/Constants.cpp lib/VMCore/Core.cpp lib/VMCore/IRBuilder.cpp tools/bugpoint/Miscompil In-Reply-To: References: Message-ID: On Tue, Jan 31, 2012 at 3:56 PM, Chris Lattner wrote: > > On Jan 31, 2012, at 12:27 AM, Alexander Potapenko wrote: > >> On Tue, Jan 31, 2012 at 10:18 AM, Chris Lattner wrote: >>> Author: lattner >>> Date: Tue Jan 31 00:18:43 2012 >>> New Revision: 149365 >>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=149365&view=rev >>> Log: >>> eliminate the "string" form of ConstantArray::get, using >>> ConstantDataArray::getString instead. >> Chris, >> >> our internal (for the moment) ASan buildbot is reporting test failures >> starting at the range of r149363 to r149365 on Linux x64 and Mac x64 >> Namely they are: > > That sounds bad, and I'd definitely like to fix these. Can you give me more information about how they are failing? What command line is failing, and with what stack trace? > > -Chris Attached is the log of `make check-all` on my Snow Leopard machine. Feel free to ask for more information. -------------- next part -------------- llvm[0]: Running test suite make[1]: Entering directory `/usr/local/google/asan/asan-llvm-trunk/llvm/build/test' Making a new site.exp file... Making LLVM 'lit.site.cfg' file... Making LLVM unittest 'lit.site.cfg' file... 
make -C /usr/local/google/asan/asan-llvm-trunk/llvm/build/test/../tools/clang/test lit.site.cfg Unit/lit.site.cfg make[2]: Entering directory `/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test' Making Clang 'lit.site.cfg' file... Making Clang 'Unit/lit.site.cfg' file... make[2]: Leaving directory `/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test' ( ulimit -t 600 ; ulimit -d 512000 ; ulimit -m 512000 ; ulimit -v 1024000 ; \ /usr/local/google/asan/asan-llvm-trunk/llvm/utils/lit/lit.py -s -v . /usr/local/google/asan/asan-llvm-trunk/llvm/build/test/../tools/clang/test ) lit.py: lit.cfg:146: note: using clang: '/usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/clang' -- Testing: 9969 tests, 12 threads -- Testing: 0 .. 10. FAIL: Clang :: CodeGenObjC/arc-ivar-layout.m (1742 of 9969) ******************** TEST 'Clang :: CodeGenObjC/arc-ivar-layout.m' FAILED ******************** Script: -- /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/clang -cc1 -internal-isystem /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/../lib/clang/3.1/include -fobjc-arc -fobjc-runtime-has-weak -triple x86_64-apple-darwin -O0 -S /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/arc-ivar-layout.m -o /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/arc-ivar-layout.m.tmp-64.s FileCheck -check-prefix LP64 --input-file=/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/arc-ivar-layout.m.tmp-64.s /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/arc-ivar-layout.m -- Exit Code: 1 Command Output (stderr): -- /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/arc-ivar-layout.m:20:21: error: expected string not found in input // CHECK-LP64-NEXT: .asciz "\003" ^ 
/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/arc-ivar-layout.m.tmp-64.s:126:2: note: scanning from here .asciz "\003\000" ^ -- ******************** Testing: 0 .. 10. FAIL: Clang :: CodeGenObjC/arc-block-ivar-layout.m (1757 of 9969) ******************** TEST 'Clang :: CodeGenObjC/arc-block-ivar-layout.m' FAILED ******************** Script: -- /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/clang -cc1 -internal-isystem /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/../lib/clang/3.1/include -fblocks -fobjc-arc -fobjc-runtime-has-weak -triple x86_64-apple-darwin -O0 -emit-llvm /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/arc-block-ivar-layout.m -o /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/arc-block-ivar-layout.m.tmp-64.s FileCheck -check-prefix LP64 --input-file=/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/arc-block-ivar-layout.m.tmp-64.s /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/arc-block-ivar-layout.m -- Exit Code: 1 Command Output (stderr): -- /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/arc-block-ivar-layout.m:37:16: error: expected string not found in input // CHECK-LP64: @"\01L_OBJC_CLASS_NAME_{{.*}}" = internal global [4 x i8] c"\015\10\00" ^ /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/arc-block-ivar-layout.m.tmp-64.s:1:1: note: scanning from here ; ModuleID = '/usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/arc-block-ivar-layout.m' ^ /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/arc-block-ivar-layout.m.tmp-64.s:18:1: note: possible intended match here @"\01L_OBJC_CLASS_NAME_" = internal global [5 x i8] c"\015\10\00\00", section "__TEXT,__objc_classname,cstring_literals", align 1 ^ -- 
******************** Testing: 0 .. 10.. FAIL: Clang :: CodeGenObjC/block-var-layout.m (1796 of 9969) ******************** TEST 'Clang :: CodeGenObjC/block-var-layout.m' FAILED ******************** Script: -- /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/clang -cc1 -internal-isystem /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/../lib/clang/3.1/include -fblocks -fobjc-gc -triple x86_64-apple-darwin -fobjc-fragile-abi -O0 -emit-llvm /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/block-var-layout.m -o /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/block-var-layout.m.tmp-64.s FileCheck -check-prefix LP64 --input-file=/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/block-var-layout.m.tmp-64.s /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/block-var-layout.m -- Exit Code: 1 Command Output (stderr): -- /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/block-var-layout.m:50:16: error: expected string not found in input // CHECK-LP64: @"\01L_OBJC_CLASS_NAME_{{.*}}" = internal global [4 x i8] c"\015\10\00" ^ /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/block-var-layout.m.tmp-64.s:1:1: note: scanning from here ; ModuleID = '/usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/block-var-layout.m' ^ /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/block-var-layout.m.tmp-64.s:37:1: note: possible intended match here @"\01L_OBJC_CLASS_NAME_12" = internal global [6 x i8] c"\01A\11\11\00\00", section "__TEXT,__cstring,cstring_literals", align 1 ^ -- ******************** Testing: 0 .. 10.. 
FAIL: Clang :: CodeGenObjC/ivar-layout-64.m (1835 of 9969) ******************** TEST 'Clang :: CodeGenObjC/ivar-layout-64.m' FAILED ******************** Script: -- /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/clang -cc1 -internal-isystem /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/../lib/clang/3.1/include -triple x86_64-apple-darwin10 -fobjc-gc -emit-llvm -o /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-64.m.tmp /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/ivar-layout-64.m grep '@"\\01L_OBJC_CLASS_NAME_.*" = internal global .* c"A\\00"' /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-64.m.tmp grep '@"\\01L_OBJC_CLASS_NAME_.*" = internal global .* c"\\11q\\10\\00"' /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-64.m.tmp grep '@"\\01L_OBJC_CLASS_NAME_.*" = internal global .* c"!q\\00"' /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-64.m.tmp grep '@"\\01L_OBJC_CLASS_NAME_.*" = internal global .* c"\\01\\14\\00"' /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-64.m.tmp /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/clang -cc1 -internal-isystem /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/../lib/clang/3.1/include -x objective-c++ -triple x86_64-apple-darwin10 -fobjc-gc -emit-llvm -o /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-64.m.tmp /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/ivar-layout-64.m grep '@"\\01L_OBJC_CLASS_NAME_.*" = internal global .* c"A\\00"' /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-64.m.tmp grep '@"\\01L_OBJC_CLASS_NAME_.*" = internal global 
.* c"\\11q\\10\\00"' /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-64.m.tmp grep '@"\\01L_OBJC_CLASS_NAME_.*" = internal global .* c"!q\\00"' /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-64.m.tmp grep '@"\\01L_OBJC_CLASS_NAME_.*" = internal global .* c"\\01\\14\\00"' /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-64.m.tmp -- Exit Code: 1 Command Output (stdout): -- @"\01L_OBJC_CLASS_NAME_8" = internal global [2 x i8] c"A\00", section "__TEXT,__objc_classname,cstring_literals", align 1 -- ******************** Testing: 0 .. 10.. FAIL: Clang :: CodeGenObjC/ivar-layout-array0-struct.m (1838 of 9969) ******************** TEST 'Clang :: CodeGenObjC/ivar-layout-array0-struct.m' FAILED ******************** Script: -- /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/clang -cc1 -internal-isystem /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/../lib/clang/3.1/include -fobjc-gc -triple x86_64-apple-darwin -fobjc-fragile-abi -O0 -S /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/ivar-layout-array0-struct.m -o /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-array0-struct.m.tmp-64.s FileCheck -check-prefix LP64 --input-file=/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-array0-struct.m.tmp-64.s /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/ivar-layout-array0-struct.m -- Exit Code: 1 Command Output (stderr): -- /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/ivar-layout-array0-struct.m:23:21: error: expected string not found in input // CHECK-LP64-NEXT: .asciz "\001\020" ^ /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-array0-struct.m.tmp-64.s:66:2: note: 
scanning from here .asciz "\001\020\000" ^ -- ******************** Testing: 0 .. 10.. FAIL: Clang :: CodeGenObjC/ivar-layout-no-optimize.m (1839 of 9969) ******************** TEST 'Clang :: CodeGenObjC/ivar-layout-no-optimize.m' FAILED ******************** Script: -- /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/clang -cc1 -internal-isystem /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/../lib/clang/3.1/include -fobjc-gc -triple x86_64-apple-darwin -fobjc-fragile-abi -O0 -S /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/ivar-layout-no-optimize.m -o /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-no-optimize.m.tmp-64.s FileCheck -check-prefix LP64 --input-file=/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-no-optimize.m.tmp-64.s /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/ivar-layout-no-optimize.m /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/clang -cc1 -internal-isystem /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/../lib/clang/3.1/include -x objective-c++ -fobjc-gc -triple x86_64-apple-darwin -fobjc-fragile-abi -O0 -S /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/ivar-layout-no-optimize.m -o /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-no-optimize.m.tmp-64.s FileCheck -check-prefix LP64 --input-file=/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-no-optimize.m.tmp-64.s /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/ivar-layout-no-optimize.m -- Exit Code: 1 Command Output (stderr): -- /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjC/ivar-layout-no-optimize.m:20:21: error: expected string not found in input // CHECK-LP64-NEXT: .asciz "\004" ^ 
/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjC/Output/ivar-layout-no-optimize.m.tmp-64.s:76:2: note: scanning from here .asciz "\004\000" ^ -- ******************** Testing: 0 .. 10.. FAIL: Clang :: CodeGenObjCXX/block-var-layout.mm (1986 of 9969) ******************** TEST 'Clang :: CodeGenObjCXX/block-var-layout.mm' FAILED ******************** Script: -- /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/clang -cc1 -internal-isystem /usr/local/google/asan/asan-llvm-trunk/llvm/build/Release+Asserts/bin/../lib/clang/3.1/include -x objective-c++ -fblocks -fobjc-gc -triple x86_64-apple-darwin -fobjc-fragile-abi -emit-llvm /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjCXX/block-var-layout.mm -o /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjCXX/Output/block-var-layout.mm.tmp-64.ll FileCheck -check-prefix LP64 --input-file=/usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjCXX/Output/block-var-layout.mm.tmp-64.ll /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjCXX/block-var-layout.mm -- Exit Code: 1 Command Output (stderr): -- /usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjCXX/block-var-layout.mm:40:16: error: expected string not found in input // CHECK-LP64: @"\01L_OBJC_CLASS_NAME_{{.*}}" = internal global [4 x i8] c"\015\10\00" ^ /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjCXX/Output/block-var-layout.mm.tmp-64.ll:1:1: note: scanning from here ; ModuleID = '/usr/local/google/asan/asan-llvm-trunk/llvm/tools/clang/test/CodeGenObjCXX/block-var-layout.mm' ^ /usr/local/google/asan/asan-llvm-trunk/llvm/build/tools/clang/test/CodeGenObjCXX/Output/block-var-layout.mm.tmp-64.ll:37:1: note: possible intended match here @"\01L_OBJC_CLASS_NAME_12" = internal global [6 x i8] c"\01A\11\11\00\00", section "__TEXT,__cstring,cstring_literals", align 1 ^ -- ******************** 
Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. Testing Time: 33.96s ******************** Failing Tests (7): Clang :: CodeGenObjC/arc-block-ivar-layout.m Clang :: CodeGenObjC/arc-ivar-layout.m Clang :: CodeGenObjC/block-var-layout.m Clang :: CodeGenObjC/ivar-layout-64.m Clang :: CodeGenObjC/ivar-layout-array0-struct.m Clang :: CodeGenObjC/ivar-layout-no-optimize.m Clang :: CodeGenObjCXX/block-var-layout.mm Expected Passes : 9876 Expected Failures : 73 Unsupported Tests : 13 Unexpected Failures: 7 make[1]: *** [check-local-all] Error 1 make[1]: Leaving directory `/usr/local/google/asan/asan-llvm-trunk/llvm/build/test' make: *** [check-all] Error 2 From glider at google.com Tue Jan 31 06:24:05 2012 From: glider at google.com (Alexander Potapenko) Date: Tue, 31 Jan 2012 16:24:05 +0400 Subject: [llvm-commits] [llvm] r149365 - in /llvm/trunk: include/llvm/Constants.h lib/AsmParser/LLParser.cpp lib/Transforms/Instrumentation/AddressSanitizer.cpp lib/VMCore/Constants.cpp lib/VMCore/Core.cpp lib/VMCore/IRBuilder.cpp tools/bugpoint/Miscompil In-Reply-To: References: Message-ID: > Attached is the log of `make check-all` on my Snow Leopard machine. > Feel free to ask for more information. s/Snow Leopard/Ubuntu 10.04/ From glider at google.com Tue Jan 31 07:11:17 2012 From: glider at google.com (Alexander Potapenko) Date: Tue, 31 Jan 2012 17:11:17 +0400 Subject: [llvm-commits] [PATCH] AddressSanitizer: do not test memcpy on Lion Message-ID: The attached patch disables testing memcpy() on Mac OS 10.7, where memcpy() in fact aliases memmove() and thus calling it with overlapping parameters is not an error. -- Alexander Potapenko Software Engineer Google Moscow -------------- next part -------------- A non-text attachment was scrubbed... 
Name: memcpy_test_lion.patch Type: text/x-patch Size: 1106 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/2bc50094/attachment-0001.bin From glider at google.com Tue Jan 31 07:19:18 2012 From: glider at google.com (Alexander Potapenko) Date: Tue, 31 Jan 2012 13:19:18 -0000 Subject: [llvm-commits] [compiler-rt] r149382 - in /compiler-rt/trunk/lib/asan: asan_mac.cc asan_mac.h Message-ID: <20120131131918.E4B812A6C12C@llvm.org> Author: glider Date: Tue Jan 31 07:19:18 2012 New Revision: 149382 URL: http://llvm.org/viewvc/llvm-project?rev=149382&view=rev Log: Implement GetMacosVersion() to obtain the OS X version at runtime. Modified: compiler-rt/trunk/lib/asan/asan_mac.cc compiler-rt/trunk/lib/asan/asan_mac.h Modified: compiler-rt/trunk/lib/asan/asan_mac.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_mac.cc?rev=149382&r1=149381&r2=149382&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_mac.cc (original) +++ compiler-rt/trunk/lib/asan/asan_mac.cc Tue Jan 31 07:19:18 2012 @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include @@ -58,6 +59,28 @@ # endif // __WORDSIZE } +int GetMacosVersion() { + int mib[2] = { CTL_KERN, KERN_OSRELEASE }; + char version[100]; + size_t len = 0, maxlen = sizeof(version) / sizeof(version[0]); + for (int i = 0; i < maxlen; i++) version[i] = '\0'; + // Get the version length. + CHECK(sysctl(mib, 2, NULL, &len, NULL, 0) != -1); + CHECK(len < maxlen); + CHECK(sysctl(mib, 2, version, &len, NULL, 0) != -1); + switch (version[0]) { + case '9': return MACOS_VERSION_LEOPARD; + case '1': { + switch (version[1]) { + case '0': return MACOS_VERSION_SNOW_LEOPARD; + case '1': return MACOS_VERSION_LION; + default: return MACOS_VERSION_UNKNOWN; + } + } + default: return MACOS_VERSION_UNKNOWN; + } +} + // No-op. Mac does not support static linkage anyway. 
void *AsanDoesNotSupportStaticLinkage() { return NULL; Modified: compiler-rt/trunk/lib/asan/asan_mac.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_mac.h?rev=149382&r1=149381&r2=149382&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_mac.h (original) +++ compiler-rt/trunk/lib/asan/asan_mac.h Tue Jan 31 07:19:18 2012 @@ -24,6 +24,17 @@ #include #include +enum { + MACOS_VERSION_UNKNOWN = 0, + MACOS_VERSION_LEOPARD, + MACOS_VERSION_SNOW_LEOPARD, + MACOS_VERSION_LION, +}; + +namespace __asan { +int GetMacosVersion(); +} + typedef void* pthread_workqueue_t; typedef void* pthread_workitem_handle_t; From grosser at fim.uni-passau.de Tue Jan 31 07:26:29 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Tue, 31 Jan 2012 13:26:29 -0000 Subject: [llvm-commits] [polly] r149383 - /polly/trunk/lib/ScheduleOptimizer.cpp Message-ID: <20120131132629.823C52A6C12C@llvm.org> Author: grosser Date: Tue Jan 31 07:26:29 2012 New Revision: 149383 URL: http://llvm.org/viewvc/llvm-project?rev=149383&view=rev Log: Schedule: Sort includes and remove useless ones Modified: polly/trunk/lib/ScheduleOptimizer.cpp Modified: polly/trunk/lib/ScheduleOptimizer.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/ScheduleOptimizer.cpp?rev=149383&r1=149382&r2=149383&view=diff ============================================================================== --- polly/trunk/lib/ScheduleOptimizer.cpp (original) +++ polly/trunk/lib/ScheduleOptimizer.cpp Tue Jan 31 07:26:29 2012 @@ -19,20 +19,18 @@ #include "polly/ScheduleOptimizer.h" -#include "polly/Cloog.h" -#include "polly/LinkAllPasses.h" #include "polly/CodeGeneration.h" -#include "polly/Support/GICHelper.h" #include "polly/Dependences.h" +#include "polly/LinkAllPasses.h" #include "polly/ScopInfo.h" #include "isl/aff.h" -#include "isl/space.h" -#include "isl/map.h" -#include "isl/constraint.h" -#include "isl/schedule.h" #include 
"isl/band.h" +#include "isl/constraint.h" +#include "isl/map.h" #include "isl/options.h" +#include "isl/schedule.h" +#include "isl/space.h" #define DEBUG_TYPE "polly-opt-isl" #include "llvm/Support/Debug.h" @@ -503,10 +501,6 @@ if (!schedule) return false; - DEBUG(dbgs() << "Computed schedule: "); - DEBUG(dbgs() << stringFromIslObj(schedule)); - DEBUG(dbgs() << "Individual bands: "); - isl_union_map *ScheduleMap = getScheduleMap(schedule); for (Scop::iterator SI = S.begin(), SE = S.end(); SI != SE; ++SI) { From glider at google.com Tue Jan 31 07:36:17 2012 From: glider at google.com (Alexander Potapenko) Date: Tue, 31 Jan 2012 17:36:17 +0400 Subject: [llvm-commits] [PATCH] AddressSanitizer: do not wrap memcpy() on Mac OS 10.7 Message-ID: The attached patch disables wrapping memcpy() on Mac OS Lion, where it actually falls back to memmove. In this case we still need to initialize real_memcpy, so we set it to real_memmove We check for MACOS_VERSION_SNOW_LEOPARD, because currently only Snow Leopard and Lion are supported. -- Alexander Potapenko Software Engineer Google Moscow -------------- next part -------------- A non-text attachment was scrubbed... Name: asan_memcpy.patch Type: application/octet-stream Size: 1026 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/20bf7759/attachment.obj From klimek at google.com Tue Jan 31 07:40:47 2012 From: klimek at google.com (Manuel Klimek) Date: Tue, 31 Jan 2012 14:40:47 +0100 Subject: [llvm-commits] [PATCH] YAML parser. In-Reply-To: References: Message-ID: One first quick note: the yaml parser seems to use about twice as much memory during parsing (at least on my machine the yaml parser gets my machine to go into swap stop, while the json parser runs through fine when using 1GB test data). With -memory-limit 100 the json parser uses about 500MB VmPeak while the yaml parser has about 1GB VmPeak. 
Cheers,
/Manuel

On Mon, Jan 30, 2012 at 9:08 PM, Michael Spencer wrote:
> Attached is the patch for the YAML parser I've been working on. YAML
> is a super set of JSON that adds many features that I want for writing
> tests for lld (the llvm linker) and other places where we use object
> files.
>
> The API is very similar to the existing JSON API, but is not exactly
> the same as YAML has an extended data model.
>
> This parser is slower than the currently existing JSON parser. For
> files with large scalars, there is almost no difference. For medium
> scalars, there's a ~2x slowdown. And for small scalars, there's a ~6x
> slowdown.
>
> Here are some performance numbers for {yaml,json}-bench -memory-limit
> 100 (a 32bit build can't do much more than that :P). Note that for
> YAML. The Parsing time includes the Tokenizing time.
>
> c:\Users\mspencer\Projects\llvm-project\llvm>yaml-bench -memory-limit 100
> ===-------------------------------------------------------------------------===
>                             YAML parser benchmark
> ===-------------------------------------------------------------------------===
>  Total Execution Time: 5.2104 seconds (5.2185 wall clock)
>
>    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
>    2.8392 ( 56.3%)   0.1716 (100.0%)   3.0108 ( 57.8%)   3.0118 ( 57.7%)  Small Values: Parsing
>    2.1216 ( 42.1%)   0.0000 (  0.0%)   2.1216 ( 40.7%)   2.1257 ( 40.7%)  Small Values: Tokenizing
>    0.0780 (  1.5%)   0.0000 (  0.0%)   0.0780 (  1.5%)   0.0810 (  1.6%)  Small Values: Loop
>    5.0388 (100.0%)   0.1716 (100.0%)   5.2104 (100.0%)   5.2185 (100.0%)  Total
>
> ===-------------------------------------------------------------------------===
>                             YAML parser benchmark
> ===-------------------------------------------------------------------------===
>  Total Execution Time: 0.4836 seconds (0.4740 wall clock)
>
>    ---User Time---   --User+System--   ---Wall Time---  --- Name ---
>    0.2184 ( 45.2%)   0.2184 ( 45.2%)   0.2200 ( 46.4%)  Medium Values: Parsing
>    0.1716 ( 35.5%)   0.1716 ( 35.5%)   0.1710 ( 36.1%)  Medium Values: Tokenizing
>    0.0936 ( 19.4%)   0.0936 ( 19.4%)   0.0830 ( 17.5%)  Medium Values: Loop
>    0.4836 (100.0%)   0.4836 (100.0%)   0.4740 (100.0%)  Total
>
> ===-------------------------------------------------------------------------===
>                             YAML parser benchmark
> ===-------------------------------------------------------------------------===
>  Total Execution Time: 0.2496 seconds (0.2480 wall clock)
>
>    ---User Time---   --User+System--   ---Wall Time---  --- Name ---
>    0.0780 ( 31.3%)   0.0780 ( 31.3%)   0.0830 ( 33.5%)  Large Values: Parsing
>    0.0936 ( 37.5%)   0.0936 ( 37.5%)   0.0830 ( 33.5%)  Large Values: Tokenizing
>    0.0780 ( 31.3%)   0.0780 ( 31.3%)   0.0820 ( 33.1%)  Large Values: Loop
>    0.2496 (100.0%)   0.2496 (100.0%)   0.2480 (100.0%)  Total
>
> c:\Users\mspencer\Projects\llvm-project\llvm>json-bench -memory-limit 100
> ===-------------------------------------------------------------------------===
>                             JSON parser benchmark
> ===-------------------------------------------------------------------------===
>  Total Execution Time: 0.6552 seconds (0.6531 wall clock)
>
>    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
>    0.5460 ( 87.5%)   0.0312 (100.0%)   0.5772 ( 88.1%)   0.5721 ( 87.6%)  Small Values: Parsing
>    0.0780 ( 12.5%)   0.0000 (  0.0%)   0.0780 ( 11.9%)   0.0810 ( 12.4%)  Small Values: Loop
>    0.6240 (100.0%)   0.0312 (100.0%)   0.6552 (100.0%)   0.6531 (100.0%)  Total
>
> ===-------------------------------------------------------------------------===
>                             JSON parser benchmark
> ===-------------------------------------------------------------------------===
>  Total Execution Time: 0.1872 seconds (0.1830 wall clock)
>
>    ---User Time---   --User+System--   ---Wall Time---  --- Name ---
>    0.1092 ( 58.3%)   0.1092 ( 58.3%)   0.1030 ( 56.3%)  Medium Values: Parsing
>    0.0780 ( 41.7%)   0.0780 ( 41.7%)   0.0800 ( 43.7%)  Medium Values: Loop
>    0.1872 (100.0%)   0.1872 (100.0%)   0.1830 (100.0%)  Total
>
> ===-------------------------------------------------------------------------===
>                             JSON parser benchmark
> ===-------------------------------------------------------------------------===
>  Total Execution Time: 0.1716 seconds (0.1620 wall clock)
>
>    ---User Time---   --User+System--   ---Wall Time---  --- Name ---
>    0.0780 ( 45.5%)   0.0780 ( 45.5%)   0.0810 ( 50.0%)  Large Values: Parsing
>    0.0936 ( 54.5%)   0.0936 ( 54.5%)   0.0810 ( 50.0%)  Large Values: Loop
>    0.1716 (100.0%)   0.1716 (100.0%)   0.1620 (100.0%)  Total
>
>
> - Michael Spencer

From baldrick at free.fr Tue Jan 31 07:40:31 2012
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 31 Jan 2012 13:40:31 -0000
Subject: [llvm-commits] [dragonegg] r149384 - /dragonegg/trunk/src/Constants.cpp
Message-ID: <20120131134031.AA1A92A6C12C@llvm.org>

Author: baldrick
Date: Tue Jan 31 07:40:31 2012
New Revision: 149384

URL: http://llvm.org/viewvc/llvm-project?rev=149384&view=rev
Log:
Speed up handling of type mismatches with huge initial values. The origin of this kind of thing is usually that the two types are structurally identical struct types, one named the other anonymous. It's probably worth trying to avoid this kind of thing altogether by catching it earlier on.
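[Editor's sketch, not part of the commit.] The speed-up described in the log comes from answering the cheap cases before falling into the expensive general conversion. The idea can be illustrated with a small self-contained C++ toy. Everything here (ToyConstant, Kind, reinterpretAs, generalCaseCalls) is a hypothetical stand-in invented for illustration and is not the dragonegg or LLVM API; it also assumes the source and destination types have the same size, so that an undefined or all-zero value stays undefined or all-zero after reinterpretation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; NOT LLVM or dragonegg types.
enum class Kind { Undef, Zero, Bytes };

struct ToyConstant {
  Kind kind;                   // what sort of value this is
  int typeId;                  // pretend type identity
  std::vector<uint8_t> bytes;  // payload, only meaningful for Kind::Bytes
};

// Counts how often the expensive path runs, so the early exits are observable.
static int generalCaseCalls = 0;

// Reinterpret 'c' as a value of type 'newTypeId' (assumed to be the same size).
ToyConstant reinterpretAs(const ToyConstant &c, int newTypeId) {
  // Cheap case 1: already the requested type, nothing to do.
  if (c.typeId == newTypeId)
    return c;
  // Cheap case 2: an undefined value is undefined under any type.
  if (c.kind == Kind::Undef)
    return {Kind::Undef, newTypeId, {}};
  // Cheap case 3: an all-zero value is all-zero under any type.
  if (c.kind == Kind::Zero)
    return {Kind::Zero, newTypeId, {}};
  // General case: this is the only path that touches the whole payload.
  ++generalCaseCalls;
  return {Kind::Bytes, newTypeId, c.bytes};
}
```

With a huge zero or undefined initializer, the toy never reaches the general case, so its cost no longer depends on the size of the value; only genuinely non-trivial payloads pay for the full walk.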
Modified: dragonegg/trunk/src/Constants.cpp Modified: dragonegg/trunk/src/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/src/Constants.cpp?rev=149384&r1=149383&r2=149384&view=diff ============================================================================== --- dragonegg/trunk/src/Constants.cpp (original) +++ dragonegg/trunk/src/Constants.cpp Tue Jan 31 07:40:31 2012 @@ -412,9 +412,17 @@ /// value of type 'Ty' from the stored to memory location. static Constant *InterpretAsType(Constant *C, Type* Ty, int StartingBit, TargetFolder &Folder) { + // Efficient handling for some common cases. if (C->getType() == Ty) return C; + if (isa(C)) + return UndefValue::get(Ty); + + if (C->isNullValue()) + return Constant::getNullValue(Ty); + + // The general case. switch (Ty->getTypeID()) { default: DieAbjectly("Unsupported type!"); From clattner at apple.com Tue Jan 31 07:51:16 2012 From: clattner at apple.com (Chris Lattner) Date: Tue, 31 Jan 2012 05:51:16 -0800 Subject: [llvm-commits] [dragonegg] r149378 - in /dragonegg/trunk/src: Backend.cpp Constants.cpp In-Reply-To: <4F27C82E.6060106@free.fr> References: <20120131103936.9BA7E2A6C131@llvm.org> <4F27C82E.6060106@free.fr> Message-ID: Thanks both, I'm sorry I didn't think about this :( -Chris On Jan 31, 2012, at 2:53 AM, Duncan Sands wrote: > Hi Benjamin, > >> ConstantArray::get doesn't do strings anymore. > > thanks for doing this. I did this too but was still running my version through > the dragonegg testsuite :) I've just committed some additional changes from my > version on top of yours. > > Ciao, Duncan. 
> _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From clattner at apple.com Tue Jan 31 07:52:33 2012 From: clattner at apple.com (Chris Lattner) Date: Tue, 31 Jan 2012 05:52:33 -0800 Subject: [llvm-commits] [llvm] r149356 - in /llvm/trunk: include/llvm/Analysis/ValueTracking.h lib/Analysis/ValueTracking.cpp lib/CodeGen/SelectionDAG/SelectionDAG.cpp In-Reply-To: <4F27B946.2020800@free.fr> References: <20120131050917.C16DD2A6C12C@llvm.org> <4F27B946.2020800@free.fr> Message-ID: <6AF96E59-4277-40DC-A7BF-8BF047BD5C81@apple.com> Nope, I cleaned it up in a later commit. Thanks, -Chris On Jan 31, 2012, at 1:49 AM, Duncan Sands wrote: > Hi Chris, > >> --- llvm/trunk/include/llvm/Analysis/ValueTracking.h (original) >> +++ llvm/trunk/include/llvm/Analysis/ValueTracking.h Mon Jan 30 23:09:17 2012 >> @@ -132,8 +132,8 @@ >> uint64_t Offset = 0); >> >> // FIXME: Remove this. >> - bool GetConstantStringInfo(const Value *V, std::string&Str, >> - uint64_t Offset = 0); >> + // bool GetConstantStringInfo(const Value *V, std::string&Str, >> + // uint64_t Offset = 0); > > did you mean to leave this in the file? > > Ciao, Duncan. > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From grosser at fim.uni-passau.de Tue Jan 31 08:00:27 2012 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Tue, 31 Jan 2012 14:00:27 -0000 Subject: [llvm-commits] [polly] r149386 - in /polly/trunk: include/polly/LinkAllPasses.h lib/CMakeLists.txt lib/DeadCodeElimination.cpp lib/RegisterPasses.cpp Message-ID: <20120131140027.A1CBE2A6C12C@llvm.org> Author: grosser Date: Tue Jan 31 08:00:27 2012 New Revision: 149386 URL: http://llvm.org/viewvc/llvm-project?rev=149386&view=rev Log: Add a skeleton for a polyhedral dead code elimination. 

Such a dead code elimination can remove redundant stores to arrays. It can
also eliminate calculations where the results are stored to memory but where
they are overwritten before ever being read. It may also fix bugs like:
http://llvm.org/bugs/show_bug.cgi?id=5117

This commit just adds a skeleton without any functionality. If anybody is
interested in learning about polyhedral optimizations, this would be a good
task. Well defined, self-contained and pretty simple. Ping me if you want to
start and you need some pointers to get going.

Added:
    polly/trunk/lib/DeadCodeElimination.cpp
Modified:
    polly/trunk/include/polly/LinkAllPasses.h
    polly/trunk/lib/CMakeLists.txt
    polly/trunk/lib/RegisterPasses.cpp

Modified: polly/trunk/include/polly/LinkAllPasses.h
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/include/polly/LinkAllPasses.h?rev=149386&r1=149385&r2=149386&view=diff
==============================================================================
--- polly/trunk/include/polly/LinkAllPasses.h (original)
+++ polly/trunk/include/polly/LinkAllPasses.h Tue Jan 31 08:00:27 2012
@@ -33,6 +33,7 @@
   Pass *createCloogInfoPass();
   Pass *createCodeGenerationPass();
   Pass *createCodePreparationPass();
+  Pass *createDeadCodeElimPass();
   Pass *createDependencesPass();
   Pass *createDOTOnlyPrinterPass();
   Pass *createDOTOnlyViewerPass();
@@ -79,6 +80,7 @@
        createCloogInfoPass();
        createCodeGenerationPass();
        createCodePreparationPass();
+       createDeadCodeElimPass();
        createDependencesPass();
        createDOTOnlyPrinterPass();
        createDOTOnlyViewerPass();
@@ -111,6 +113,7 @@
   class PassRegistry;
   void initializeCodeGenerationPass(llvm::PassRegistry&);
   void initializeCodePreparationPass(llvm::PassRegistry&);
+  void initializeDeadCodeElimPass(llvm::PassRegistry&);
   void initializeIndependentBlocksPass(llvm::PassRegistry&);
   void initializeJSONExporterPass(llvm::PassRegistry&);
   void initializeJSONImporterPass(llvm::PassRegistry&);

Modified: polly/trunk/lib/CMakeLists.txt
URL:
http://llvm.org/viewvc/llvm-project/polly/trunk/lib/CMakeLists.txt?rev=149386&r1=149385&r2=149386&view=diff ============================================================================== --- polly/trunk/lib/CMakeLists.txt (original) +++ polly/trunk/lib/CMakeLists.txt Tue Jan 31 08:00:27 2012 @@ -21,6 +21,7 @@ Cloog.cpp CodePreparation.cpp CodeGeneration.cpp + DeadCodeElimination.cpp IndependentBlocks.cpp MayAliasSet.cpp Pocc.cpp Added: polly/trunk/lib/DeadCodeElimination.cpp URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/DeadCodeElimination.cpp?rev=149386&view=auto ============================================================================== --- polly/trunk/lib/DeadCodeElimination.cpp (added) +++ polly/trunk/lib/DeadCodeElimination.cpp Tue Jan 31 08:00:27 2012 @@ -0,0 +1,75 @@ +//===- DeadCodeElimination.cpp - Eliminate dead iteration ----------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// If values calculated within an iteration are not used later on the iteration +// can be removed entirely. This pass removes such iterations. 
+//===----------------------------------------------------------------------===//
+
+#include "polly/Dependences.h"
+#include "polly/LinkAllPasses.h"
+#include "polly/ScopInfo.h"
+
+#include "isl/union_map.h"
+
+using namespace llvm;
+using namespace polly;
+
+namespace {
+
+  class DeadCodeElim : public ScopPass {
+
+  public:
+    static char ID;
+    explicit DeadCodeElim() : ScopPass(ID) {}
+
+    virtual bool runOnScop(Scop &S);
+    void printScop(llvm::raw_ostream &OS) const;
+    void getAnalysisUsage(AnalysisUsage &AU) const;
+  };
+}
+
+char DeadCodeElim::ID = 0;
+
+bool DeadCodeElim::runOnScop(Scop &S) {
+  Dependences *D = &getAnalysis<Dependences>();
+
+  int dependencyKinds = Dependences::TYPE_RAW
+                        | Dependences::TYPE_WAR
+                        | Dependences::TYPE_WAW;
+
+  isl_union_map *dependences = D->getDependences(dependencyKinds);
+
+  // The idea of this pass is to loop over all statements and remove statement
+  // iterations that do not calculate any value that is read later on. We need
+  // to make sure to forward RAR and WAR dependences.
+  //
+  // A case where this pass might be useful is
+  // http://llvm.org/bugs/show_bug.cgi?id=5117
+  isl_union_map_free(dependences);
+  return false;
+}
+
+void DeadCodeElim::printScop(raw_ostream &OS) const {
+}
+
+void DeadCodeElim::getAnalysisUsage(AnalysisUsage &AU) const {
+  ScopPass::getAnalysisUsage(AU);
+  AU.addRequired<Dependences>();
+}
+
+INITIALIZE_PASS_BEGIN(DeadCodeElim, "polly-dce",
+                      "Polly - Remove dead iterations", false, false)
+INITIALIZE_PASS_DEPENDENCY(Dependences)
+INITIALIZE_PASS_DEPENDENCY(ScopInfo)
+INITIALIZE_PASS_END(DeadCodeElim, "polly-dce",
+                    "Polly - Remove dead iterations", false, false)
+
+Pass* polly::createDeadCodeElimPass() {
+  return new DeadCodeElim();
+}

Modified: polly/trunk/lib/RegisterPasses.cpp
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/lib/RegisterPasses.cpp?rev=149386&r1=149385&r2=149386&view=diff
==============================================================================
--- polly/trunk/lib/RegisterPasses.cpp (original)
+++ polly/trunk/lib/RegisterPasses.cpp Tue Jan 31 08:00:27 2012
@@ -61,6 +61,12 @@
        cl::desc("Enable the Polly DOT viewer in -O3"), cl::Hidden,
        cl::value_desc("Run the Polly DOT viewer at -O3"),
        cl::init(false));
+
+static cl::opt<bool>
+DeadCodeElim("polly-run-dce",
+             cl::desc("Run the dead code elimination"),
+             cl::Hidden, cl::init(false));
+
 static cl::opt<bool>
 PollyOnlyViewer("polly-show-only",
        cl::desc("Enable the Polly DOT viewer in -O3 (no BB content)"),
@@ -83,6 +89,7 @@
   initializeCloogInfoPass(Registry);
   initializeCodeGenerationPass(Registry);
   initializeCodePreparationPass(Registry);
+  initializeDeadCodeElimPass(Registry);
   initializeDependencesPass(Registry);
   initializeIndependentBlocksPass(Registry);
   initializeJSONExporterPass(Registry);
@@ -161,6 +168,9 @@
   if (ImportJScop)
     PM.add(polly::createJSONImporterPass());
 
+  if (DeadCodeElim)
+    PM.add(polly::createDeadCodeElimPass());
+
   if (RunScheduler) {
     if (Optimizer == "pocc") {
 #ifdef SCOPLIB_FOUND

From baldrick at free.fr Tue Jan 31 08:38:03 2012
From:
baldrick at free.fr (Duncan Sands) Date: Tue, 31 Jan 2012 15:38:03 +0100 Subject: [llvm-commits] [dragonegg] r149378 - in /dragonegg/trunk/src: Backend.cpp Constants.cpp In-Reply-To: References: <20120131103936.9BA7E2A6C131@llvm.org> <4F27C82E.6060106@free.fr> Message-ID: <4F27FCCB.9060603@free.fr> On 31/01/12 14:51, Chris Lattner wrote: > Thanks both, I'm sorry I didn't think about this :( I forgive you, though thousands wouldn't :) Ciao, Duncan. > > -Chris > > On Jan 31, 2012, at 2:53 AM, Duncan Sands wrote: > >> Hi Benjamin, >> >>> ConstantArray::get doesn't do strings anymore. >> >> thanks for doing this. I did this too but was still running my version through >> the dragonegg testsuite :) I've just committed some additional changes from my >> version on top of yours. >> >> Ciao, Duncan. >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From klimek at google.com Tue Jan 31 08:40:55 2012 From: klimek at google.com (Manuel Klimek) Date: Tue, 31 Jan 2012 15:40:55 +0100 Subject: [llvm-commits] [PATCH] YAML parser. In-Reply-To: References: Message-ID: I now ran some benchmarks on a real larger project configuration (chromium) and found that the YAML parser is about 30% slower than the JSONParser. Currently I'd say that's not too bad. Will do a more in-depth review of the code next. On Tue, Jan 31, 2012 at 2:40 PM, Manuel Klimek wrote: > One first quick note: the yaml parser seems to use about twice as much > memory during parsing (at least on my machine the yaml parser gets my > machine to go into swap stop, while the json parser runs through fine > when using 1GB test data). > > With -memory-limit 100 the json parser uses about 500MB VmPeak while > the yaml parser has about 1GB VmPeak. > > Cheers, > /Manuel > > On Mon, Jan 30, 2012 at 9:08 PM, Michael Spencer wrote: >> Attached is the patch for the YAML parser I've been working on. 
YAML
>> is a super set of JSON that adds many features that I want for writing
>> tests for lld (the llvm linker) and other places where we use object
>> files.
>>
>> The API is very similar to the existing JSON API, but is not exactly
>> the same as YAML has an extended data model.
>>
>> This parser is slower than the currently existing JSON parser. For
>> files with large scalars, there is almost no difference. For medium
>> scalars, there's a ~2x slowdown. And for small scalars, there's a ~6x
>> slowdown.
>>
>> Here are some performance numbers for {yaml,json}-bench -memory-limit
>> 100 (a 32bit build can't do much more than that :P). Note that for
>> YAML. The Parsing time includes the Tokenizing time.
>>
>> c:\Users\mspencer\Projects\llvm-project\llvm>yaml-bench -memory-limit 100
>> ===-------------------------------------------------------------------------===
>>                             YAML parser benchmark
>> ===-------------------------------------------------------------------------===
>>   Total Execution Time: 5.2104 seconds (5.2185 wall clock)
>>
>>    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
>>    2.8392 ( 56.3%)   0.1716 (100.0%)   3.0108 ( 57.8%)   3.0118 ( 57.7%)  Small Values: Parsing
>>    2.1216 ( 42.1%)   0.0000 (  0.0%)   2.1216 ( 40.7%)   2.1257 ( 40.7%)  Small Values: Tokenizing
>>    0.0780 (  1.5%)   0.0000 (  0.0%)   0.0780 (  1.5%)   0.0810 (  1.6%)  Small Values: Loop
>>    5.0388 (100.0%)   0.1716 (100.0%)   5.2104 (100.0%)   5.2185 (100.0%)  Total
>>
>> ===-------------------------------------------------------------------------===
>>                             YAML parser benchmark
>> ===-------------------------------------------------------------------------===
>>   Total Execution Time: 0.4836 seconds (0.4740 wall clock)
>>
>>    ---User Time---   --User+System--   ---Wall Time---  --- Name ---
>>    0.2184 ( 45.2%)   0.2184 ( 45.2%)   0.2200 ( 46.4%)  Medium Values: Parsing
>>    0.1716 ( 35.5%)   0.1716 ( 35.5%)   0.1710 ( 36.1%)  Medium Values: Tokenizing
>>    0.0936 ( 19.4%)   0.0936 ( 19.4%)   0.0830 ( 17.5%)  Medium Values: Loop
>>    0.4836 (100.0%)   0.4836 (100.0%)   0.4740 (100.0%)  Total
>>
>> ===-------------------------------------------------------------------------===
>>                             YAML parser benchmark
>> ===-------------------------------------------------------------------------===
>>   Total Execution Time: 0.2496 seconds (0.2480 wall clock)
>>
>>    ---User Time---   --User+System--   ---Wall Time---  --- Name ---
>>    0.0780 ( 31.3%)   0.0780 ( 31.3%)   0.0830 ( 33.5%)  Large Values: Parsing
>>    0.0936 ( 37.5%)   0.0936 ( 37.5%)   0.0830 ( 33.5%)  Large Values: Tokenizing
>>    0.0780 ( 31.3%)   0.0780 ( 31.3%)   0.0820 ( 33.1%)  Large Values: Loop
>>    0.2496 (100.0%)   0.2496 (100.0%)   0.2480 (100.0%)  Total
>>
>> c:\Users\mspencer\Projects\llvm-project\llvm>json-bench -memory-limit 100
>> ===-------------------------------------------------------------------------===
>>                             JSON parser benchmark
>> ===-------------------------------------------------------------------------===
>>   Total Execution Time: 0.6552 seconds (0.6531 wall clock)
>>
>>    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
>>    0.5460 ( 87.5%)   0.0312 (100.0%)   0.5772 ( 88.1%)   0.5721 ( 87.6%)  Small Values: Parsing
>>    0.0780 ( 12.5%)   0.0000 (  0.0%)   0.0780 ( 11.9%)   0.0810 ( 12.4%)  Small Values: Loop
>>    0.6240 (100.0%)   0.0312 (100.0%)   0.6552 (100.0%)   0.6531 (100.0%)  Total
>>
>> ===-------------------------------------------------------------------------===
>>                             JSON parser benchmark
>> ===-------------------------------------------------------------------------===
>>   Total Execution Time: 0.1872 seconds (0.1830 wall clock)
>>
>>    ---User Time---   --User+System--   ---Wall Time---  --- Name ---
>>    0.1092 ( 58.3%)   0.1092 ( 58.3%)   0.1030 ( 56.3%)  Medium Values: Parsing
>>    0.0780 ( 41.7%)   0.0780 ( 41.7%)   0.0800 ( 43.7%)  Medium Values: Loop
>>    0.1872 (100.0%)   0.1872 (100.0%)   0.1830 (100.0%)  Total
>>
>> ===-------------------------------------------------------------------------===
>>                             JSON parser benchmark
>> ===-------------------------------------------------------------------------===
>>   Total Execution Time: 0.1716 seconds (0.1620 wall clock)
>>
>>    ---User Time---   --User+System--   ---Wall Time---  --- Name ---
>>    0.0780 ( 45.5%)   0.0780 ( 45.5%)   0.0810 ( 50.0%)  Large Values: Parsing
>>    0.0936 ( 54.5%)   0.0936 ( 54.5%)   0.0810 ( 50.0%)  Large Values: Loop
>>    0.1716 (100.0%)   0.1716 (100.0%)   0.1620 (100.0%)  Total
>>
>>
>> - Michael Spencer

From slarin at codeaurora.org Tue Jan 31 09:18:05 2012
From: slarin at codeaurora.org (Sergei Larin)
Date: Tue, 31 Jan 2012 09:18:05 -0600
Subject: [llvm-commits] FW: Hexagon VLIW instruction scheduler framework patch for review
Message-ID: <07f401cce02b$87f18630$97d49290$@org>

Sorry, can someone please take a look?

Thanks!

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.

From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Sergei Larin
Sent: Friday, January 27, 2012 10:47 AM
To: llvm-commits at cs.uiuc.edu
Subject: [llvm-commits] Hexagon VLIW instruction scheduler framework patch for review

Hello everybody,

Attached is initial patch for a VLIW specific scheduler framework that
utilizes deterministic finite automaton (DFA).

Several key points:

- The scheduler is largely based on the existing framework, but introduces
  several VLIW specific concepts. It could be classified as a top down list
  scheduler, critical path first, with DFA used for parallel resources
  modeling. It also models and tracks register pressure in the way similar to
  the current RegPressure scheduler.
  It employs a slightly different way to compute the "cost" function for all
  SUs in AQ, which allows for somewhat easier balancing of multiple heuristic
  inputs. The current version does _not_ generate bundles/packets (but models
  them internally). It could be easily modified to do so, and it is our plan
  to make it a part of bundle generation in the near future.

- The scheduler is enabled for the Hexagon backend. Compared to any existing
  scheduler, for this VLIW target this code produces between a 1.9% slowdown
  and an 11% speedup on our internal test suite. This test set is comprised of
  a variety of real world applications ranging from DSP specific applications
  to SPEC. Some DSP kernels (when taken out of context) enjoy up to 20%
  speedup when compared to the "default" scheduling mechanism (RegPressure
  pre-RA + post RA). The main reason for this kind of corner case behavior is
  long chains of independent memory accesses that are conservatively
  serialized by the default scheduler (and there is no HW scheduler to sort it
  out at run time).

- This patch is an initial submission with a bare minimum of features, and
  more heuristics will be added to it later. We prefer to submit it in stages
  to simplify the review process and improve SW management.

- The patch also contains minor updates to two Hexagon specific tests in order
  to compensate for the new order of instructions generated by the Hexagon
  backend __with scheduler disabled__.

- SVN revision 149130. LLVM verification test run for x86 platform detects no
  additional failures.

Comments and reviews are eagerly anticipated :)

Thanks.

Sergei Larin

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/cc19741e/attachment-0001.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Hexagon_vliw_scheduler_framework.patch
Type: application/octet-stream
Size: 46656 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/cc19741e/attachment-0001.obj
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ATT01398.txt
Url: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/cc19741e/attachment-0001.txt

From klimek at google.com Tue Jan 31 09:52:11 2012
From: klimek at google.com (Manuel Klimek)
Date: Tue, 31 Jan 2012 16:52:11 +0100
Subject: [llvm-commits] [llvm] r149308 - /llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h
In-Reply-To: <20120131005705.2D0732A6C12C@llvm.org>
References: <20120131005705.2D0732A6C12C@llvm.org>
Message-ID:

Hi Ted,

this change broke the cfe/tooling branch. See
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/135932.html
for a fix proposal.

Cheers,
/Manuel

On Tue, Jan 31, 2012 at 1:57 AM, Ted Kremenek wrote:
> Author: kremenek
> Date: Mon Jan 30 18:57:04 2012
> New Revision: 149308
>
> URL: http://llvm.org/viewvc/llvm-project?rev=149308&view=rev
> Log:
> Use traits for IntrusiveRefCntPtr to determine how to increment/decrement a reference count.
>
> Modified:
>    llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h
>
> Modified: llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h?rev=149308&r1=149307&r2=149308&view=diff
> ==============================================================================
> --- llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h (original)
> +++ llvm/trunk/include/llvm/ADT/IntrusiveRefCntPtr.h Mon Jan 30 18:57:04 2012
> @@ -83,6 +83,12 @@
>      friend class IntrusiveRefCntPtr;
>    };
>
> +
> +  template <typename T> struct IntrusiveRefCntPtrInfo {
> +    static void retain(T *obj) { obj->Retain(); }
> +    static void release(T *obj) { obj->Release(); }
> +  };
> +
>  //===----------------------------------------------------------------------===//
>  /// IntrusiveRefCntPtr - A template class that implements a "smart pointer"
>  ///  that assumes the wrapped object has a reference count associated
> @@ -168,8 +174,8 @@
>      }
>
>    private:
> -    void retain() { if (Obj) Obj->Retain(); }
> -    void release() { if (Obj) Obj->Release(); }
> +    void retain() { if (Obj) IntrusiveRefCntPtrInfo<T>::retain(Obj); }
> +    void release() { if (Obj) IntrusiveRefCntPtrInfo<T>::release(Obj); }
>
>      void replace(T* S) {
>        this_type(S).swap(*this);
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From thomas.stellard at amd.com Tue Jan 31 10:08:38 2012
From: thomas.stellard at amd.com (Tom Stellard)
Date: Tue, 31 Jan 2012 11:08:38 -0500
Subject: [llvm-commits] FW: Hexagon VLIW instruction scheduler framework patch for review
In-Reply-To: <07f401cce02b$87f18630$97d49290$@org>
References: <07f401cce02b$87f18630$97d49290$@org>
Message-ID: <20120131160838.GD2075@L7-CNU1252LKR-172027226155.amd.com>

On Tue, Jan 31, 2012 at 09:18:05AM -0600, Sergei Larin wrote:
>
> Hello everybody,
>
>
>
> Attached is initial patch for a VLIW specific scheduler framework that
> utilizes deterministic finite automaton (DFA) .
>
>
>
> Several key points:
>
> - The scheduler is largely based on the existing framework, but
> introduces several VLIW specific concepts. It could be classified as a top
> down list scheduler, critical path first, with DFA used for parallel
> resources modeling. It also models and tracks register pressure in the way
> similar to the current RegPressure scheduler. It employs a slightly
> different way to compute "cost" function for all SUs in AQ which allows for
> somewhat easier balancing of multiple heuristic inputs.
Current version does > _not_ generates bundles/packets (but models them internally). It could be > easily modified to do so, and it is our plan to make it a part of bundle > generation in the near future. > > - The scheduler is enabled for the Hexagon backend. Comparing to > any existing scheduler, for this VLIW target this code produces between 1.9% > slowdown and 11% speedup on our internal test suite. This test set comprised > from a variety of real world applications ranging from DSP specific > applications to SPEC. Some DSP kernels (when taken out of context) enjoy up > to 20% speedup when compared to the "default" scheduling mechanism > (RegPressure pre-RA + post RA). Main reason for this kind of corner case > behavior is long chains of independent memory accesses that are > conservatively serialized by the default scheduler (and there is no HW > scheduler to sort it out at the run time). > > - This patch is an initial submission with a bare minimum of > features, and more heuristics will be added to it later. We prefer to submit > it in stages to simplify review process and improve SW management. > > - Patch also contains minor updates to two Hexagon specific tests > in order to compensate for new order of instructions generated by the > Hexagon backend __with scheduler disabled__. > > - SVN revision 149130. LLVM verification test run for x86 platform > detects no additional failures. > > > > Comments and reviews are eagerly anticipated J > Hi Sergei, I'm glad to see a VLIW scheduler proposed for LLVM. We are working on an LLVM backend for our Evergreen / Northern Islands open source drivers, which are also VLIW. I'm hoping we can use this in our backend as well. I just have a few questions and comments. When you start doing bundle generation in the scheduler will you be using the new MachineInstrBundle? How are you going to model bundle constraints? I'm not familiar with the Hexagon architecture, but our hardware has several bundle constraints. 
For example, some instructions can only be in a certain slot within the bundle, while other instructions fill all slots in the bundle. There is also a limit to the number of constant registers (these are in a different register space than the GPRs) that can be read from within the bundle, among other things. It would be nice to have some way to apply these constraints in the scheduler. A quick note on the patch, I noticed a few whitespace errors in LinkAllCodegenComponents.h, SchedulerRegistry.h, and HexagonInstrInfo.cpp Nice Work! -Tom Stellard > > > Thanks. > > > > Sergei Larin > > > > -- > > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. > > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From adasgupt at codeaurora.org Tue Jan 31 10:37:56 2012 From: adasgupt at codeaurora.org (Anshuman Dasgupta) Date: Tue, 31 Jan 2012 10:37:56 -0600 Subject: [llvm-commits] FW: Hexagon VLIW instruction scheduler framework patch for review In-Reply-To: <20120131160838.GD2075@L7-CNU1252LKR-172027226155.amd.com> References: <07f401cce02b$87f18630$97d49290$@org> <20120131160838.GD2075@L7-CNU1252LKR-172027226155.amd.com> Message-ID: <4F2818E4.3070900@codeaurora.org> Hi Tom, > How are you going to model bundle > constraints? I'm not familiar with the Hexagon architecture, but our > hardware has several bundle constraints. For example, some instructions > can only be in a certain slot within the bundle, while other instructions > fill all slots in the bundle. Yes, we had to solve the same problem for Hexagon. We authored a target-independent packetizer (or bundler) that examines available slots and automatically constructs a DFA to represent slot restrictions in a VLIW architecture. 
This DFA can be queried by a target while bundling instructions. You may be interested in this posting on llvm-commit: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20111128/132357.html The corresponding commit is: http://llvm.org/viewvc/llvm-project?view=rev&revision=145629 Feel free to ping me if you have any questions on the DFA packetizer. -Anshu -- Qualcomm Innovation Center, Inc is a member of Code Aurora Forum On 1/31/2012 10:08 AM, Tom Stellard wrote: > On Tue, Jan 31, 2012 at 09:18:05AM -0600, Sergei Larin wrote: >> >> Hello everybody, >> >> >> >> Attached is initial patch for a VLIW specific scheduler framework that >> utilizes deterministic finite automaton (DFA) . >> >> >> >> Several key points: >> >> - The scheduler is largely based on the existing framework, but >> introduces several VLIW specific concepts. It could be classified as a top >> down list scheduler, critical path first, with DFA used for parallel >> resources modeling. It also models and tracks register pressure in the way >> similar to the current RegPressure scheduler. It employs a slightly >> different way to compute "cost" function for all SUs in AQ which allows for >> somewhat easier balancing of multiple heuristic inputs. Current version does >> _not_ generates bundles/packets (but models them internally). It could be >> easily modified to do so, and it is our plan to make it a part of bundle >> generation in the near future. >> >> - The scheduler is enabled for the Hexagon backend. Comparing to >> any existing scheduler, for this VLIW target this code produces between 1.9% >> slowdown and 11% speedup on our internal test suite. This test set comprised >> from a variety of real world applications ranging from DSP specific >> applications to SPEC. Some DSP kernels (when taken out of context) enjoy up >> to 20% speedup when compared to the "default" scheduling mechanism >> (RegPressure pre-RA + post RA). 
Main reason for this kind of corner case >> behavior is long chains of independent memory accesses that are >> conservatively serialized by the default scheduler (and there is no HW >> scheduler to sort it out at the run time). >> >> - This patch is an initial submission with a bare minimum of >> features, and more heuristics will be added to it later. We prefer to submit >> it in stages to simplify review process and improve SW management. >> >> - Patch also contains minor updates to two Hexagon specific tests >> in order to compensate for new order of instructions generated by the >> Hexagon backend __with scheduler disabled__. >> >> - SVN revision 149130. LLVM verification test run for x86 platform >> detects no additional failures. >> >> >> >> Comments and reviews are eagerly anticipated J >> > > Hi Sergei, > > I'm glad to see a VLIW scheduler proposed for LLVM. We are working on > an LLVM backend for our Evergreen / Northern Islands open source drivers, > which are also VLIW. I'm hoping we can use this in our backend as well. > I just have a few questions and comments. > > When you start doing bundle generation in the scheduler will you be > using the new MachineInstrBundle? How are you going to model bundle > constraints? I'm not familiar with the Hexagon architecture, but our > hardware has several bundle constraints. For example, some instructions > can only be in a certain slot within the bundle, while other instructions > fill all slots in the bundle. There is also a limit to the number of > constant registers (these are in a different register space than the GPRs) > that can be read from within the bundle, among other things. It would > be nice to have some way to apply these constraints in the scheduler. > > A quick note on the patch, I noticed a few whitespace errors in > LinkAllCodegenComponents.h, SchedulerRegistry.h, and > HexagonInstrInfo.cpp > > Nice Work! > > -Tom Stellard >> >> >> Thanks. 
>>
>>
>>
>> Sergei Larin
>>
>>
>>
>> --
>>
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.
>>
>>
>>
>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/14ed4d77/attachment.html

From craig.topper at gmail.com Tue Jan 31 10:42:43 2012
From: craig.topper at gmail.com (Craig Topper)
Date: Tue, 31 Jan 2012 08:42:43 -0800
Subject: [llvm-commits] [llvm] r149367 - in /llvm/trunk: include/llvm/IntrinsicsX86.td lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/2006-05-11-InstrSched.ll test/CodeGen/X86/avx-intrinsics-x86.ll test/CodeGen/X86/avx2-intrinsics-x86.ll
In-Reply-To: <4F27C3F1.9060505@free.fr>
References: <20120131065244.AB1222A6C12C@llvm.org> <4F27C3F1.9060505@free.fr>
Message-ID:

Here's what clang has in its emmintrin.h file

static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_cmpeq_epi8(__m128i a, __m128i b)
{
  return (__m128i)((__v16qi)a == (__v16qi)b);
}

static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_cmpeq_epi16(__m128i a, __m128i b)
{
  return (__m128i)((__v8hi)a == (__v8hi)b);
}

static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_cmpeq_epi32(__m128i a, __m128i b)
{
  return (__m128i)((__v4si)a == (__v4si)b);
}

static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_cmpgt_epi8(__m128i a, __m128i b)
{
  return (__m128i)((__v16qi)a > (__v16qi)b);
}

static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_cmpgt_epi16(__m128i a, __m128i b)
{
  return (__m128i)((__v8hi)a > (__v8hi)b);
}

static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_cmpgt_epi32(__m128i a, __m128i b)
{
  return (__m128i)((__v4si)a > (__v4si)b);
}

On Tue, Jan 31, 2012 at 2:35 AM, Duncan Sands wrote:

> Hi Craig,
>
> > Remove pcmpgt/pcmpeq intrinsics as clang is not using them.
>
> dragonegg is using them. Can the same effect be obtained some other way?
>
> Ciao, Duncan.
>
> >
> > Modified:
> >     llvm/trunk/include/llvm/IntrinsicsX86.td
> >     llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> >     llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll
> >     llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll
> >     llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll
> >
> > Modified: llvm/trunk/include/llvm/IntrinsicsX86.td
> > URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IntrinsicsX86.td?rev=149367&r1=149366&r2=149367&view=diff
> >
> ==============================================================================
> > --- llvm/trunk/include/llvm/IntrinsicsX86.td (original)
> > +++ llvm/trunk/include/llvm/IntrinsicsX86.td Tue Jan 31 00:52:44 2012
> > @@ -452,28 +452,6 @@
> >                  llvm_i32_ty], [IntrNoMem]>;
> >  }
> >
> > -// Integer comparison ops
> > -let TargetPrefix = "x86" in {  // All intrinsics start with "llvm.x86.".
> > - def int_x86_sse2_pcmpeq_b : GCCBuiltin<"__builtin_ia32_pcmpeqb128">, > > - Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, > > - llvm_v16i8_ty], [IntrNoMem, Commutative]>; > > - def int_x86_sse2_pcmpeq_w : GCCBuiltin<"__builtin_ia32_pcmpeqw128">, > > - Intrinsic<[llvm_v8i16_ty], [llvm_v8i16_ty, > > - llvm_v8i16_ty], [IntrNoMem, Commutative]>; > > - def int_x86_sse2_pcmpeq_d : GCCBuiltin<"__builtin_ia32_pcmpeqd128">, > > - Intrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, > > - llvm_v4i32_ty], [IntrNoMem, Commutative]>; > > - def int_x86_sse2_pcmpgt_b : GCCBuiltin<"__builtin_ia32_pcmpgtb128">, > > - Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, > > - llvm_v16i8_ty], [IntrNoMem]>; > > - def int_x86_sse2_pcmpgt_w : GCCBuiltin<"__builtin_ia32_pcmpgtw128">, > > - Intrinsic<[llvm_v8i16_ty], [llvm_v8i16_ty, > > - llvm_v8i16_ty], [IntrNoMem]>; > > - def int_x86_sse2_pcmpgt_d : GCCBuiltin<"__builtin_ia32_pcmpgtd128">, > > - Intrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, > > - llvm_v4i32_ty], [IntrNoMem]>; > > -} > > - > > // Conversion ops > > let TargetPrefix = "x86" in { // All intrinsics start with > "llvm.x86.". > > def int_x86_sse2_cvtdq2pd : GCCBuiltin<"__builtin_ia32_cvtdq2pd">, > > @@ -792,12 +770,6 @@ > > > > // Vector compare, min, max > > let TargetPrefix = "x86" in { // All intrinsics start with > "llvm.x86.". > > - def int_x86_sse41_pcmpeqq : > GCCBuiltin<"__builtin_ia32_pcmpeqq">, > > - Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, llvm_v2i64_ty], > > - [IntrNoMem, Commutative]>; > > - def int_x86_sse42_pcmpgtq : > GCCBuiltin<"__builtin_ia32_pcmpgtq">, > > - Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, llvm_v2i64_ty], > > - [IntrNoMem]>; > > def int_x86_sse41_pmaxsb : > GCCBuiltin<"__builtin_ia32_pmaxsb128">, > > Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, > llvm_v16i8_ty], > > [IntrNoMem, Commutative]>; > > @@ -1515,34 +1487,6 @@ > > llvm_i32_ty], [IntrNoMem]>; > > } > > > > -// Integer comparison ops > > -let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". 
> > - def int_x86_avx2_pcmpeq_b : GCCBuiltin<"__builtin_ia32_pcmpeqb256">, > > - Intrinsic<[llvm_v32i8_ty], [llvm_v32i8_ty, llvm_v32i8_ty], > > - [IntrNoMem, Commutative]>; > > - def int_x86_avx2_pcmpeq_w : GCCBuiltin<"__builtin_ia32_pcmpeqw256">, > > - Intrinsic<[llvm_v16i16_ty], [llvm_v16i16_ty, > llvm_v16i16_ty], > > - [IntrNoMem, Commutative]>; > > - def int_x86_avx2_pcmpeq_d : GCCBuiltin<"__builtin_ia32_pcmpeqd256">, > > - Intrinsic<[llvm_v8i32_ty], [llvm_v8i32_ty, llvm_v8i32_ty], > > - [IntrNoMem, Commutative]>; > > - def int_x86_avx2_pcmpeq_q : GCCBuiltin<"__builtin_ia32_pcmpeqq256">, > > - Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v4i64_ty], > > - [IntrNoMem, Commutative]>; > > - def int_x86_avx2_pcmpgt_b : GCCBuiltin<"__builtin_ia32_pcmpgtb256">, > > - Intrinsic<[llvm_v32i8_ty], [llvm_v32i8_ty, llvm_v32i8_ty], > > - [IntrNoMem]>; > > - def int_x86_avx2_pcmpgt_w : GCCBuiltin<"__builtin_ia32_pcmpgtw256">, > > - Intrinsic<[llvm_v16i16_ty], [llvm_v16i16_ty, > llvm_v16i16_ty], > > - [IntrNoMem]>; > > - def int_x86_avx2_pcmpgt_d : GCCBuiltin<"__builtin_ia32_pcmpgtd256">, > > - Intrinsic<[llvm_v8i32_ty], [llvm_v8i32_ty, llvm_v8i32_ty], > > - [IntrNoMem]>; > > - def int_x86_avx2_pcmpgt_q : GCCBuiltin<"__builtin_ia32_pcmpgtq256">, > > - Intrinsic<[llvm_v4i64_ty], [llvm_v4i64_ty, llvm_v4i64_ty], > > - [IntrNoMem]>; > > -} > > - > > // Pack ops. > > let TargetPrefix = "x86" in { // All intrinsics start with > "llvm.x86.". 
> > def int_x86_avx2_packsswb : GCCBuiltin<"__builtin_ia32_packsswb256">, > > > > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=149367&r1=149366&r2=149367&view=diff > > > ============================================================================== > > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) > > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Jan 31 00:52:44 > 2012 > > @@ -9492,26 +9492,6 @@ > > case Intrinsic::x86_avx2_psrav_d_256: > > return DAG.getNode(ISD::SRA, dl, Op.getValueType(), > > Op.getOperand(1), Op.getOperand(2)); > > - case Intrinsic::x86_sse2_pcmpeq_b: > > - case Intrinsic::x86_sse2_pcmpeq_w: > > - case Intrinsic::x86_sse2_pcmpeq_d: > > - case Intrinsic::x86_sse41_pcmpeqq: > > - case Intrinsic::x86_avx2_pcmpeq_b: > > - case Intrinsic::x86_avx2_pcmpeq_w: > > - case Intrinsic::x86_avx2_pcmpeq_d: > > - case Intrinsic::x86_avx2_pcmpeq_q: > > - return DAG.getNode(X86ISD::PCMPEQ, dl, Op.getValueType(), > > - Op.getOperand(1), Op.getOperand(2)); > > - case Intrinsic::x86_sse2_pcmpgt_b: > > - case Intrinsic::x86_sse2_pcmpgt_w: > > - case Intrinsic::x86_sse2_pcmpgt_d: > > - case Intrinsic::x86_sse42_pcmpgtq: > > - case Intrinsic::x86_avx2_pcmpgt_b: > > - case Intrinsic::x86_avx2_pcmpgt_w: > > - case Intrinsic::x86_avx2_pcmpgt_d: > > - case Intrinsic::x86_avx2_pcmpgt_q: > > - return DAG.getNode(X86ISD::PCMPGT, dl, Op.getValueType(), > > - Op.getOperand(1), Op.getOperand(2)); > > case Intrinsic::x86_ssse3_pshuf_b_128: > > case Intrinsic::x86_avx2_pshuf_b: > > return DAG.getNode(X86ISD::PSHUFB, dl, Op.getValueType(), > > > > Modified: llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll > > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll?rev=149367&r1=149366&r2=149367&view=diff > > > ============================================================================== > > --- 
llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll (original) > > +++ llvm/trunk/test/CodeGen/X86/2006-05-11-InstrSched.ll Tue Jan 31 > 00:52:44 2012 > > @@ -30,7 +30,7 @@ > > %tmp87 = bitcast<16 x i8> %tmp66 to<4 x i32> ;<<4 x > i32>> [#uses=1] > > %tmp88 = add<4 x i32> %tmp87, %tmp77 ;<<4 x i32>> > [#uses=2] > > %tmp88.upgrd.4 = bitcast<4 x i32> %tmp88 to<2 x i64> > ;<<2 x i64>> [#uses=1] > > - %tmp99 = tail call<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32> > %tmp88,<4 x i32> %tmp55 ) ;<<4 x i32>> [#uses=1] > > + %tmp99 = tail call<4 x i32> @llvm.x86.sse2.psra.d(<4 x i32> > %tmp88,<4 x i32> %tmp55 ) ;<<4 x i32>> [#uses=1] > > %tmp99.upgrd.5 = bitcast<4 x i32> %tmp99 to<2 x i64> > ;<<2 x i64>> [#uses=2] > > %tmp110 = xor<2 x i64> %tmp99.upgrd.5,< i64 -1, i64 -1> > ;<<2 x i64>> [#uses=1] > > %tmp111 = and<2 x i64> %tmp110, %tmp55.upgrd.2 ;<<2 x > i64>> [#uses=1] > > @@ -48,4 +48,4 @@ > > ret void > > } > > > > -declare<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32>,<4 x i32>) > > +declare<4 x i32> @llvm.x86.sse2.psra.d(<4 x i32>,<4 x i32>) > > > > Modified: llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll > > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll?rev=149367&r1=149366&r2=149367&view=diff > > > ============================================================================== > > --- llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll (original) > > +++ llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll Tue Jan 31 > 00:52:44 2012 > > @@ -369,54 +369,6 @@ > > declare<8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16>,<8 x i16>) nounwind > readnone > > > > > > -define<16 x i8> @test_x86_sse2_pcmpeq_b(<16 x i8> %a0,<16 x i8> %a1) > { > > - ; CHECK: vpcmpeqb > > - %res = call<16 x i8> @llvm.x86.sse2.pcmpeq.b(<16 x i8> %a0,<16 x > i8> %a1) ;<<16 x i8>> [#uses=1] > > - ret<16 x i8> %res > > -} > > -declare<16 x i8> @llvm.x86.sse2.pcmpeq.b(<16 x i8>,<16 x i8>) nounwind > readnone > > - > > - > > -define<4 x i32> @test_x86_sse2_pcmpeq_d(<4 
x i32> %a0,<4 x i32> %a1) > { > > - ; CHECK: vpcmpeqd > > - %res = call<4 x i32> @llvm.x86.sse2.pcmpeq.d(<4 x i32> %a0,<4 x > i32> %a1) ;<<4 x i32>> [#uses=1] > > - ret<4 x i32> %res > > -} > > -declare<4 x i32> @llvm.x86.sse2.pcmpeq.d(<4 x i32>,<4 x i32>) nounwind > readnone > > - > > - > > -define<8 x i16> @test_x86_sse2_pcmpeq_w(<8 x i16> %a0,<8 x i16> %a1) > { > > - ; CHECK: vpcmpeqw > > - %res = call<8 x i16> @llvm.x86.sse2.pcmpeq.w(<8 x i16> %a0,<8 x > i16> %a1) ;<<8 x i16>> [#uses=1] > > - ret<8 x i16> %res > > -} > > -declare<8 x i16> @llvm.x86.sse2.pcmpeq.w(<8 x i16>,<8 x i16>) nounwind > readnone > > - > > - > > -define<16 x i8> @test_x86_sse2_pcmpgt_b(<16 x i8> %a0,<16 x i8> %a1) > { > > - ; CHECK: vpcmpgtb > > - %res = call<16 x i8> @llvm.x86.sse2.pcmpgt.b(<16 x i8> %a0,<16 x > i8> %a1) ;<<16 x i8>> [#uses=1] > > - ret<16 x i8> %res > > -} > > -declare<16 x i8> @llvm.x86.sse2.pcmpgt.b(<16 x i8>,<16 x i8>) nounwind > readnone > > - > > - > > -define<4 x i32> @test_x86_sse2_pcmpgt_d(<4 x i32> %a0,<4 x i32> %a1) > { > > - ; CHECK: vpcmpgtd > > - %res = call<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32> %a0,<4 x > i32> %a1) ;<<4 x i32>> [#uses=1] > > - ret<4 x i32> %res > > -} > > -declare<4 x i32> @llvm.x86.sse2.pcmpgt.d(<4 x i32>,<4 x i32>) nounwind > readnone > > - > > - > > -define<8 x i16> @test_x86_sse2_pcmpgt_w(<8 x i16> %a0,<8 x i16> %a1) > { > > - ; CHECK: vpcmpgtw > > - %res = call<8 x i16> @llvm.x86.sse2.pcmpgt.w(<8 x i16> %a0,<8 x > i16> %a1) ;<<8 x i16>> [#uses=1] > > - ret<8 x i16> %res > > -} > > -declare<8 x i16> @llvm.x86.sse2.pcmpgt.w(<8 x i16>,<8 x i16>) nounwind > readnone > > - > > - > > define<4 x i32> @test_x86_sse2_pmadd_wd(<8 x i16> %a0,<8 x i16> > %a1) { > > ; CHECK: vpmaddwd > > %res = call<4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a0,<8 x > i16> %a1) ;<<4 x i32>> [#uses=1] > > @@ -950,14 +902,6 @@ > > declare<8 x i16> @llvm.x86.sse41.pblendw(<8 x i16>,<8 x i16>, i32) > nounwind readnone > > > > > > -define<2 x i64> 
@test_x86_sse41_pcmpeqq(<2 x i64> %a0,<2 x i64> %a1) > { > > - ; CHECK: vpcmpeqq > > - %res = call<2 x i64> @llvm.x86.sse41.pcmpeqq(<2 x i64> %a0,<2 x > i64> %a1) ;<<2 x i64>> [#uses=1] > > - ret<2 x i64> %res > > -} > > -declare<2 x i64> @llvm.x86.sse41.pcmpeqq(<2 x i64>,<2 x i64>) nounwind > readnone > > - > > - > > define<8 x i16> @test_x86_sse41_phminposuw(<8 x i16> %a0) { > > ; CHECK: vphminposuw > > %res = call<8 x i16> @llvm.x86.sse41.phminposuw(<8 x i16> %a0) > ;<<8 x i16>> [#uses=1] > > @@ -1271,14 +1215,6 @@ > > declare<16 x i8> @llvm.x86.sse42.pcmpestrm128(<16 x i8>, i32,<16 x > i8>, i32, i8) nounwind readnone > > > > > > -define<2 x i64> @test_x86_sse42_pcmpgtq(<2 x i64> %a0,<2 x i64> %a1) > { > > - ; CHECK: vpcmpgtq > > - %res = call<2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64> %a0,<2 x > i64> %a1) ;<<2 x i64>> [#uses=1] > > - ret<2 x i64> %res > > -} > > -declare<2 x i64> @llvm.x86.sse42.pcmpgtq(<2 x i64>,<2 x i64>) nounwind > readnone > > - > > - > > define i32 @test_x86_sse42_pcmpistri128(<16 x i8> %a0,<16 x i8> %a1) > { > > ; CHECK: vpcmpistri > > ; CHECK: movl > > > > Modified: llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll > > URL: > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll?rev=149367&r1=149366&r2=149367&view=diff > > > ============================================================================== > > --- llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll (original) > > +++ llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll Tue Jan 31 > 00:52:44 2012 > > @@ -72,54 +72,6 @@ > > declare<16 x i16> @llvm.x86.avx2.pavg.w(<16 x i16>,<16 x i16>) > nounwind readnone > > > > > > -define<32 x i8> @test_x86_avx2_pcmpeq_b(<32 x i8> %a0,<32 x i8> %a1) > { > > - ; CHECK: vpcmpeqb > > - %res = call<32 x i8> @llvm.x86.avx2.pcmpeq.b(<32 x i8> %a0,<32 x > i8> %a1) ;<<32 x i8>> [#uses=1] > > - ret<32 x i8> %res > > -} > > -declare<32 x i8> @llvm.x86.avx2.pcmpeq.b(<32 x i8>,<32 x i8>) nounwind > readnone > > - > > - > 
> -define<8 x i32> @test_x86_avx2_pcmpeq_d(<8 x i32> %a0,<8 x i32> %a1) > { > > - ; CHECK: vpcmpeqd > > - %res = call<8 x i32> @llvm.x86.avx2.pcmpeq.d(<8 x i32> %a0,<8 x > i32> %a1) ;<<8 x i32>> [#uses=1] > > - ret<8 x i32> %res > > -} > > -declare<8 x i32> @llvm.x86.avx2.pcmpeq.d(<8 x i32>,<8 x i32>) nounwind > readnone > > - > > - > > -define<16 x i16> @test_x86_avx2_pcmpeq_w(<16 x i16> %a0,<16 x i16> > %a1) { > > - ; CHECK: vpcmpeqw > > - %res = call<16 x i16> @llvm.x86.avx2.pcmpeq.w(<16 x i16> %a0,<16 x > i16> %a1) ;<<16 x i16>> [#uses=1] > > - ret<16 x i16> %res > > -} > > -declare<16 x i16> @llvm.x86.avx2.pcmpeq.w(<16 x i16>,<16 x i16>) > nounwind readnone > > - > > - > > -define<32 x i8> @test_x86_avx2_pcmpgt_b(<32 x i8> %a0,<32 x i8> %a1) > { > > - ; CHECK: vpcmpgtb > > - %res = call<32 x i8> @llvm.x86.avx2.pcmpgt.b(<32 x i8> %a0,<32 x > i8> %a1) ;<<32 x i8>> [#uses=1] > > - ret<32 x i8> %res > > -} > > -declare<32 x i8> @llvm.x86.avx2.pcmpgt.b(<32 x i8>,<32 x i8>) nounwind > readnone > > - > > - > > -define<8 x i32> @test_x86_avx2_pcmpgt_d(<8 x i32> %a0,<8 x i32> %a1) > { > > - ; CHECK: vpcmpgtd > > - %res = call<8 x i32> @llvm.x86.avx2.pcmpgt.d(<8 x i32> %a0,<8 x > i32> %a1) ;<<8 x i32>> [#uses=1] > > - ret<8 x i32> %res > > -} > > -declare<8 x i32> @llvm.x86.avx2.pcmpgt.d(<8 x i32>,<8 x i32>) nounwind > readnone > > - > > - > > -define<16 x i16> @test_x86_avx2_pcmpgt_w(<16 x i16> %a0,<16 x i16> > %a1) { > > - ; CHECK: vpcmpgtw > > - %res = call<16 x i16> @llvm.x86.avx2.pcmpgt.w(<16 x i16> %a0,<16 x > i16> %a1) ;<<16 x i16>> [#uses=1] > > - ret<16 x i16> %res > > -} > > -declare<16 x i16> @llvm.x86.avx2.pcmpgt.w(<16 x i16>,<16 x i16>) > nounwind readnone > > - > > - > > define<8 x i32> @test_x86_avx2_pmadd_wd(<16 x i16> %a0,<16 x i16> > %a1) { > > ; CHECK: vpmaddwd > > %res = call<8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16> %a0,<16 x > i16> %a1) ;<<8 x i32>> [#uses=1] > > @@ -553,14 +505,6 @@ > > declare<16 x i16> @llvm.x86.avx2.pblendw(<16 x i16>,<16 x 
i16>, i32) > nounwind readnone > > > > > > -define<4 x i64> @test_x86_avx2_pcmpeqq(<4 x i64> %a0,<4 x i64> %a1) { > > - ; CHECK: vpcmpeqq > > - %res = call<4 x i64> @llvm.x86.avx2.pcmpeq.q(<4 x i64> %a0,<4 x > i64> %a1) ;<<4 x i64>> [#uses=1] > > - ret<4 x i64> %res > > -} > > -declare<4 x i64> @llvm.x86.avx2.pcmpeq.q(<4 x i64>,<4 x i64>) nounwind > readnone > > - > > - > > define<32 x i8> @test_x86_avx2_pmaxsb(<32 x i8> %a0,<32 x i8> %a1) { > > ; CHECK: vpmaxsb > > %res = call<32 x i8> @llvm.x86.avx2.pmaxs.b(<32 x i8> %a0,<32 x > i8> %a1) ;<<32 x i8>> [#uses=1] > > @@ -729,14 +673,6 @@ > > declare<4 x i64> @llvm.x86.avx2.pmul.dq(<8 x i32>,<8 x i32>) nounwind > readnone > > > > > > -define<4 x i64> @test_x86_avx2_pcmpgtq(<4 x i64> %a0,<4 x i64> %a1) { > > - ; CHECK: vpcmpgtq > > - %res = call<4 x i64> @llvm.x86.avx2.pcmpgt.q(<4 x i64> %a0,<4 x > i64> %a1) ;<<4 x i64>> [#uses=1] > > - ret<4 x i64> %res > > -} > > -declare<4 x i64> @llvm.x86.avx2.pcmpgt.q(<4 x i64>,<4 x i64>) nounwind > readnone > > - > > - > > define<4 x i64> @test_x86_avx2_vbroadcasti128(i8* %a0) { > > ; CHECK: vbroadcasti128 > > %res = call<4 x i64> @llvm.x86.avx2.vbroadcasti128(i8* %a0) ;<<4 x > i64>> [#uses=1] > > > > > > _______________________________________________ > > llvm-commits mailing list > > llvm-commits at cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > -- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/a587b7a4/attachment.html From slarin at codeaurora.org Tue Jan 31 10:50:45 2012 From: slarin at codeaurora.org (Sergei Larin) Date: Tue, 31 Jan 2012 10:50:45 -0600 Subject: [llvm-commits] FW: Hexagon VLIW instruction scheduler framework patch for review In-Reply-To: <20120131160838.GD2075@L7-CNU1252LKR-172027226155.amd.com> References: <07f401cce02b$87f18630$97d49290$@org> <20120131160838.GD2075@L7-CNU1252LKR-172027226155.amd.com> Message-ID: <07fa01cce038$79e73e10$6db5ba30$@org> Tom, Thank you for your comments. Please see some answers embedded below. Sergei -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. > -----Original Message----- > From: Tom Stellard [mailto:thomas.stellard at amd.com] > Sent: Tuesday, January 31, 2012 10:09 AM > To: Sergei Larin > Cc: llvm-commits at cs.uiuc.edu > Subject: Re: [llvm-commits] FW: Hexagon VLIW instruction scheduler > framework patch for review > > On Tue, Jan 31, 2012 at 09:18:05AM -0600, Sergei Larin wrote: > > > > Hello everybody, > > > > > > > > Attached is initial patch for a VLIW specific scheduler framework > that > > utilizes deterministic finite automaton (DFA) . > > > > > > > > Several key points: > > > > - The scheduler is largely based on the existing framework, > but > > introduces several VLIW specific concepts. It could be classified as > a top > > down list scheduler, critical path first, with DFA used for parallel > > resources modeling. It also models and tracks register pressure in > the way > > similar to the current RegPressure scheduler. It employs a slightly > > different way to compute "cost" function for all SUs in AQ which > allows for > > somewhat easier balancing of multiple heuristic inputs. Current > version does > > _not_ generates bundles/packets (but models them internally). It > could be > > easily modified to do so, and it is our plan to make it a part of > bundle > > generation in the near future. 
> > > > - The scheduler is enabled for the Hexagon backend. > Comparing to > > any existing scheduler, for this VLIW target this code produces > between 1.9% > > slowdown and 11% speedup on our internal test suite. This test set > comprised > > from a variety of real world applications ranging from DSP specific > > applications to SPEC. Some DSP kernels (when taken out of context) > enjoy up > > to 20% speedup when compared to the "default" scheduling mechanism > > (RegPressure pre-RA + post RA). Main reason for this kind of corner > case > > behavior is long chains of independent memory accesses that are > > conservatively serialized by the default scheduler (and there is no > HW > > scheduler to sort it out at the run time). > > > > - This patch is an initial submission with a bare minimum of > > features, and more heuristics will be added to it later. We prefer to > submit > > it in stages to simplify review process and improve SW management. > > > > - Patch also contains minor updates to two Hexagon specific > tests > > in order to compensate for new order of instructions generated by the > > Hexagon backend __with scheduler disabled__. > > > > - SVN revision 149130. LLVM verification test run for x86 > platform > > detects no additional failures. > > > > > > > > Comments and reviews are eagerly anticipated J > > > > > Hi Sergei, > > I'm glad to see a VLIW scheduler proposed for LLVM. We are working on > an LLVM backend for our Evergreen / Northern Islands open source > drivers, > which are also VLIW. I'm hoping we can use this in our backend as > well. > I just have a few questions and comments. > > When you start doing bundle generation in the scheduler will you be > using the new MachineInstrBundle? How are you going to model bundle [Larin, Sergei] Yes, definitely. This patch is just the first step. Once complete, hopefully we will have full blown VLIW centric scheduler. Any feedback from perspective of your architecture will be very welcomed. 
The goal here is not to target a single back end, but rather to introduce the best value for the whole project.

> constraints? I'm not familiar with the Hexagon architecture, but our
> hardware has several bundle constraints. For example, some instructions
> can only be in a certain slot within the bundle, while other instructions
> fill all slots in the bundle. There is also a limit to the number of
> constant registers (these are in a different register space than the GPRs)
> that can be read from within the bundle, among other things. It would
> be nice to have some way to apply these constraints in the scheduler.

[Larin, Sergei] Hexagon is very similar in terms of bundle slot restrictions. We use the DFA state machine (see this thread http://groups.google.com/group/llvm-commit/browse_thread/thread/d7c0fe14c70c6be6/e5fe95e2d1feb521?#e5fe95e2d1feb521 for example) which is automatically generated from your machine description, and statefully encodes (hopefully) all the above mentioned constraints. If there is some constraint in your architecture that could not be addressed by the DFA, we could augment it, or introduce some sort of target specific peephole/constraint checker hook into the scheduler. Have you looked at the current DFA implementation? It has been around for a couple of months now, and if you have questions my colleague could easily address those.

>
> A quick note on the patch, I noticed a few whitespace errors in
> LinkAllCodegenComponents.h, SchedulerRegistry.h, and
> HexagonInstrInfo.cpp

[Larin, Sergei] Thanks... I'll wait for more comments and will address them all at once :)

>
> Nice Work!

[Larin, Sergei] Thanks again.

>
> -Tom Stellard
>
> > Thanks.
> >
> > Sergei Larin
> >
> > --
> > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.
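The slot-restriction checking discussed in the reply above (some instructions limited to certain slots, others filling the whole bundle) can be sketched roughly as below. This is only a toy illustration under stated assumptions, not the DFA that LLVM generates from the machine description: `InstrClass`, `PacketTracker`, the four-slot bundle, and the bit masks are all made-up names and values, and the reachable states are computed on the fly rather than looked up in a pre-built table.

```cpp
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical instruction class. Each alternative is a bit mask of the
// slots that the instruction occupies if that alternative is chosen.
struct InstrClass {
  std::vector<uint8_t> alternatives;
};

// Tracks every slot-occupancy mask the current bundle could be in.
// Keeping the whole set (instead of committing to one slot early) plays
// the role of the automaton state in this sketch.
class PacketTracker {
  std::set<uint8_t> states;

public:
  PacketTracker() { reset(); }

  // Start a new, empty bundle.
  void reset() {
    states.clear();
    states.insert(0);
  }

  // True if at least one reachable state still has room for the instruction.
  bool canReserve(const InstrClass &ic) const {
    for (uint8_t s : states)
      for (uint8_t a : ic.alternatives)
        if ((s & a) == 0)
          return true;
    return false;
  }

  // Add the instruction to the bundle. Returns false, leaving the tracker
  // unchanged, if the instruction does not fit.
  bool reserve(const InstrClass &ic) {
    std::set<uint8_t> next;
    for (uint8_t s : states)
      for (uint8_t a : ic.alternatives)
        if ((s & a) == 0)
          next.insert(static_cast<uint8_t>(s | a));
    if (next.empty())
      return false;
    states = std::move(next);
    return true;
  }
};
```

With an assumed memory class limited to slots 0 and 1 (`{0x1, 0x2}`), two such instructions fill both of those slots, after which a third is rejected while an any-slot ALU class (`{0x1, 0x2, 0x4, 0x8}`) still fits. A constraint such as the constant-register read limit Tom mentions would not fall out of slot masks alone and would need extra state or a separate check.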
> > > > > > > > > > _______________________________________________ > > llvm-commits mailing list > > llvm-commits at cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > > _______________________________________________ > > llvm-commits mailing list > > llvm-commits at cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From rafael.espindola at gmail.com Tue Jan 31 11:18:47 2012 From: rafael.espindola at gmail.com (Rafael Espindola) Date: Tue, 31 Jan 2012 17:18:47 -0000 Subject: [llvm-commits] [llvm] r149391 - in /llvm/trunk: configure projects/sample/configure Message-ID: <20120131171848.184C72A6C12D@llvm.org> Author: rafael Date: Tue Jan 31 11:18:47 2012 New Revision: 149391 URL: http://llvm.org/viewvc/llvm-project?rev=149391&view=rev Log: Regenerate configure. Modified: llvm/trunk/configure llvm/trunk/projects/sample/configure Modified: llvm/trunk/configure URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/configure?rev=149391&r1=149390&r2=149391&view=diff ============================================================================== --- llvm/trunk/configure (original) +++ llvm/trunk/configure Tue Jan 31 11:18:47 2012 @@ -3903,7 +3903,6 @@ echo "$as_me: WARNING: Configuring LLVM for an unknown target archicture" >&2;} fi -# Determine the LLVM native architecture for the target case "$llvm_cv_target_arch" in x86) LLVM_NATIVE_ARCH="X86" ;; x86_64) LLVM_NATIVE_ARCH="X86" ;; @@ -5345,8 +5344,6 @@ TARGETS_TO_BUILD=$TARGETS_TO_BUILD -# Determine whether we are building LLVM support for the native architecture. -# If so, define LLVM_NATIVE_ARCH to that LLVM target. for a_target in $TARGETS_TO_BUILD; do if test "$a_target" = "$LLVM_NATIVE_ARCH"; then @@ -5391,8 +5388,6 @@ fi done -# Build the LLVM_TARGET and LLVM_... macros for Targets.def and the individual -# target feature def files. 
LLVM_ENUM_TARGETS="" LLVM_ENUM_ASM_PRINTERS="" LLVM_ENUM_ASM_PARSERS="" @@ -10502,7 +10497,7 @@ lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <>confdefs.h <<_ACEOF #define LLVM_PREFIX "$LLVM_PREFIX" @@ -20917,7 +20907,6 @@ _ACEOF -# Determine which bindings to build. if test "$BINDINGS_TO_BUILD" = auto ; then BINDINGS_TO_BUILD="" if test "x$OCAMLC" != x -a "x$OCAMLDEP" != x ; then @@ -20927,12 +20916,9 @@ BINDINGS_TO_BUILD=$BINDINGS_TO_BUILD -# This isn't really configurey, but it avoids having to repeat the list in -# other files. ALL_BINDINGS=ocaml -# Do any work necessary to ensure that bindings have what they need. binding_prereqs_failed=0 for a_binding in $BINDINGS_TO_BUILD ; do case "$a_binding" in Modified: llvm/trunk/projects/sample/configure URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/projects/sample/configure?rev=149391&r1=149390&r2=149391&view=diff ============================================================================== --- llvm/trunk/projects/sample/configure (original) +++ llvm/trunk/projects/sample/configure Tue Jan 31 11:18:47 2012 @@ -8854,7 +8854,7 @@ shlibpath_var=LD_LIBRARY_PATH ;; -freebsd1*) +freebsd1.*) dynamic_linker=no ;; @@ -8877,7 +8877,7 @@ objformat=`/usr/bin/objformat` else case $host_os in - freebsd[123]*) objformat=aout ;; + freebsd[123].*) objformat=aout ;; *) objformat=elf ;; esac fi @@ -8895,7 +8895,7 @@ esac shlibpath_var=LD_LIBRARY_PATH case $host_os in - freebsd2*) + freebsd2.*) shlibpath_overrides_runpath=yes ;; freebsd3.[01]* | freebsdelf3.[01]*) @@ -10409,7 +10409,7 @@ lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext < Author: kcc Date: Tue Jan 31 11:29:02 2012 New Revision: 149392 URL: http://llvm.org/viewvc/llvm-project?rev=149392&view=rev Log: [asan] remove dead code Modified: compiler-rt/trunk/lib/asan/asan_linux.cc compiler-rt/trunk/lib/asan/asan_mac.cc Modified: compiler-rt/trunk/lib/asan/asan_linux.cc 
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_linux.cc?rev=149392&r1=149391&r2=149392&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_linux.cc (original) +++ compiler-rt/trunk/lib/asan/asan_linux.cc Tue Jan 31 11:29:02 2012 @@ -97,13 +97,6 @@ 0, 0); } -void *AsanMmapFixedReserve(uintptr_t fixed_addr, size_t size) { - return asan_mmap((void*)fixed_addr, size, - PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANON | MAP_FIXED, - 0, 0); -} - void *AsanMprotect(uintptr_t fixed_addr, size_t size) { return asan_mmap((void*)fixed_addr, size, PROT_NONE, Modified: compiler-rt/trunk/lib/asan/asan_mac.cc URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_mac.cc?rev=149392&r1=149391&r2=149392&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_mac.cc (original) +++ compiler-rt/trunk/lib/asan/asan_mac.cc Tue Jan 31 11:29:02 2012 @@ -117,13 +117,6 @@ 0, 0); } -void *AsanMmapFixedReserve(uintptr_t fixed_addr, size_t size) { - return asan_mmap((void*)fixed_addr, size, - PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANON | MAP_FIXED, - 0, 0); -} - void *AsanMprotect(uintptr_t fixed_addr, size_t size) { return asan_mmap((void*)fixed_addr, size, PROT_NONE, From geek4civic at gmail.com Tue Jan 31 11:32:30 2012 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Wed, 1 Feb 2012 02:32:30 +0900 Subject: [llvm-commits] [llvm] r149348 - /llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp In-Reply-To: <20120131043922.528222A6C12C@llvm.org> References: <20120131043922.528222A6C12C@llvm.org> Message-ID: 2012/1/31 Chris Lattner : > Author: lattner > Date: Mon Jan 30 22:39:22 2012 > New Revision: 149348 > > URL: http://llvm.org/viewvc/llvm-project?rev=149348&view=rev > Log: > rework this logic to not depend on the last argument to GetConstantStringInfo, > which is going away. 
Chris, it seems it might trigger a failure on stage2(stage1-built-clang) build on my two builders, x86_64-linux and i686-cygwin. I have not investigated why yet.

...Takumi

******************** TEST 'LLVM-Unit :: VMCore/Release/VMCoreTests/MDStringTest.PrintingComplex' FAILED ********************
Note: Google Test filter = MDStringTest.PrintingComplex
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from MDStringTest
[ RUN ] MDStringTest.PrintingComplex
/home/bb/buildslave/clang-3stage-x86_64-linux/llvm-project/llvm/unittests/VMCore/MetadataTest.cpp:71: Failure
Value of: oss.str().c_str()
Actual: "metadata !"\00\00\00\00\00""
Expected: "metadata !\"\\00\\0A\\22\\5C\\FF\""
Which is: "metadata !"\00\0A\22\5C\FF""
[ FAILED ] MDStringTest.PrintingComplex (0 ms)
[----------] 1 test from MDStringTest (0 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (0 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] MDStringTest.PrintingComplex
1 FAILED TEST
********************

From spop at codeaurora.org Tue Jan 31 12:04:01 2012
From: spop at codeaurora.org (Sebastian Pop)
Date: Tue, 31 Jan 2012 12:04:01 -0600
Subject: [llvm-commits] [polly] r149374 - /polly/trunk/www/index.html
In-Reply-To: <20120131091312.BF16B2A6C131@llvm.org>
References: <20120131091312.BF16B2A6C131@llvm.org>
Message-ID:

> +  Polly can now automatically optimize all polybench kernels without the help of

It would be good to have a href to the polybench here, although not necessary.

> +  an external optimizer. The compile time is reasonable fast and we can show

please replace "reasonable fast" with "reasonably fast".
Sebastian -- Qualcomm Innovation Center, Inc is a member of Code Aurora Forum From kcc at google.com Tue Jan 31 12:09:50 2012 From: kcc at google.com (Kostya Serebryany) Date: Tue, 31 Jan 2012 10:09:50 -0800 Subject: [llvm-commits] [compiler-rt] r149296 - in /compiler-rt/trunk/lib/asan/tests: asan_mac_test.mm asan_test.cc In-Reply-To: References: <20120130232326.B4A2E2A6C12C@llvm.org> Message-ID: On Tue, Jan 31, 2012 at 12:55 AM, Alexander Potapenko wrote: > I dislike the idea of using noinline functions instead of actual > memory accesses. > This may mask problems with uninstrumented accesses, see > http://code.google.com/p/address-sanitizer/issues/detail?id=33#c9 But the previous code tested nothing because the LLVM optimized away the memory accesses. In the C++ test code we use the Ident(a) function which hides the fact that it simply returns 'a'. We may want to do the same in ObjC tests. --kcc > > > On Tue, Jan 31, 2012 at 3:23 AM, Kostya Serebryany wrote: > > Author: kcc > > Date: Mon Jan 30 17:23:26 2012 > > New Revision: 149296 > > > > URL: http://llvm.org/viewvc/llvm-project?rev=149296&view=rev > > Log: > > [asan] fix issue 35: don't let the optimizer to optimize the test code > away. > > > > Modified: > > compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm > > compiler-rt/trunk/lib/asan/tests/asan_test.cc > > > > Modified: compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm > > URL: > http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm?rev=149296&r1=149295&r2=149296&view=diff > > > ============================================================================== > > --- compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm (original) > > +++ compiler-rt/trunk/lib/asan/tests/asan_mac_test.mm Mon Jan 30 > 17:23:26 2012 > > @@ -32,6 +32,10 @@ > > CFAllocatorDeallocate(kCFAllocatorMallocZone, mem); > > } > > > > +__attribute__((noinline)) > > +void access_memory(char *a) { > > + *a = 0; > > +} > > > > // Test the +load instrumentation. 
> > // Because the +load methods are invoked before anything else is > initialized, > > @@ -51,7 +55,8 @@ > > > > +(void) load { > > for (int i = 0; i < strlen(kStartupStr); i++) { > > - volatile char ch = kStartupStr[i]; // make sure no optimizations > occur. > > + // TODO: this is currently broken, see Issue 33. > > + // access_memory(&kStartupStr[i]); // make sure no optimizations > occur. > > } > > // Don't print anything here not to interfere with the death tests. > > } > > @@ -66,7 +71,7 @@ > > > > void worker_do_crash(int size) { > > char * volatile mem = malloc(size); > > - mem[size] = 0; // BOOM > > + access_memory(&mem[size]); // BOOM > > free(mem); > > } > > > > @@ -162,7 +167,7 @@ > > dispatch_source_set_timer(timer, milestone, DISPATCH_TIME_FOREVER, 0); > > char * volatile mem = malloc(10); > > dispatch_source_set_event_handler(timer, ^{ > > - mem[10] = 1; > > + access_memory(&mem[10]); > > }); > > dispatch_resume(timer); > > sleep(2); > > @@ -186,7 +191,7 @@ > > dispatch_source_cancel(timer); > > }); > > dispatch_source_set_cancel_handler(timer, ^{ > > - mem[10] = 1; > > + access_memory(&mem[10]); > > }); > > dispatch_resume(timer); > > sleep(2); > > @@ -197,7 +202,7 @@ > > dispatch_group_t group = dispatch_group_create(); > > char * volatile mem = malloc(10); > > dispatch_group_async(group, queue, ^{ > > - mem[10] = 1; > > + access_memory(&mem[10]); > > }); > > dispatch_group_wait(group, DISPATCH_TIME_FOREVER); > > } > > > > Modified: compiler-rt/trunk/lib/asan/tests/asan_test.cc > > URL: > http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/tests/asan_test.cc?rev=149296&r1=149295&r2=149296&view=diff > > > ============================================================================== > > --- compiler-rt/trunk/lib/asan/tests/asan_test.cc (original) > > +++ compiler-rt/trunk/lib/asan/tests/asan_test.cc Mon Jan 30 17:23:26 > 2012 > > @@ -1668,7 +1668,7 @@ > > *Ident(&a) = *Ident(&a); > > } > > > > - __attribute__((no_address_safety_analysis)) 
> > +__attribute__((no_address_safety_analysis)) > > static void NoAddressSafety() { > > char *foo = new char[10]; > > Ident(foo)[10] = 0; > > > > > > _______________________________________________ > > llvm-commits mailing list > > llvm-commits at cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > > > -- > Alexander Potapenko > Software Engineer > Google Moscow > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/b94a5dc9/attachment.html From kcc at google.com Tue Jan 31 12:13:50 2012 From: kcc at google.com (Kostya Serebryany) Date: Tue, 31 Jan 2012 18:13:50 -0000 Subject: [llvm-commits] [compiler-rt] r149395 - in /compiler-rt/trunk/lib/asan: asan_interface.h asan_internal.h Message-ID: <20120131181350.919D72A6C12C@llvm.org> Author: kcc Date: Tue Jan 31 12:13:50 2012 New Revision: 149395 URL: http://llvm.org/viewvc/llvm-project?rev=149395&view=rev Log: [asan] fix the wrong __WORDSIZE definition on Win x64, add ASAN_INTERFACE_FUNCTION_ATTRIBUTE. Patch by timurrrr at google.com Modified: compiler-rt/trunk/lib/asan/asan_interface.h compiler-rt/trunk/lib/asan/asan_internal.h Modified: compiler-rt/trunk/lib/asan/asan_interface.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_interface.h?rev=149395&r1=149394&r2=149395&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_interface.h (original) +++ compiler-rt/trunk/lib/asan/asan_interface.h Tue Jan 31 12:13:50 2012 @@ -16,10 +16,11 @@ #define ASAN_INTERFACE_H #if !defined(_WIN32) -#include // for __WORDSIZE +#include // for uintptr_t +#define ASAN_INTERFACE_FUNCTION_ATTRIBUTE __attribute__((visibility("default"))) #else -// The __attribute__ keyword is not understood by Visual Studio. -#define __attribute__(x) +// TODO(timurrrr): find out what we need on Windows. __declspec(dllexport) ? 
+#define ASAN_INTERFACE_FUNCTION_ATTRIBUTE #endif #include // for size_t @@ -29,13 +30,12 @@ extern "C" { // This function should be called at the very beginning of the process, // before any instrumented code is executed and before any call to malloc. - void __asan_init() - __attribute__((visibility("default"))); + void __asan_init() ASAN_INTERFACE_FUNCTION_ATTRIBUTE; // This function should be called by the instrumented code. // 'addr' is the address of a global variable called 'name' of 'size' bytes. void __asan_register_global(uintptr_t addr, size_t size, const char *name) - __attribute__((visibility("default"))); + ASAN_INTERFACE_FUNCTION_ATTRIBUTE; // This structure describes an instrumented global variable. struct __asan_global { @@ -48,18 +48,18 @@ // These two functions should be called by the instrumented code. // 'globals' is an array of structures describing 'n' globals. void __asan_register_globals(__asan_global *globals, size_t n) - __attribute__((visibility("default"))); + ASAN_INTERFACE_FUNCTION_ATTRIBUTE; void __asan_unregister_globals(__asan_global *globals, size_t n) - __attribute__((visibility("default"))); + ASAN_INTERFACE_FUNCTION_ATTRIBUTE; // These two functions are used by the instrumented code in the // use-after-return mode. __asan_stack_malloc allocates size bytes of // fake stack and __asan_stack_free poisons it. real_stack is a pointer to // the real stack region. size_t __asan_stack_malloc(size_t size, size_t real_stack) - __attribute__((visibility("default"))); + ASAN_INTERFACE_FUNCTION_ATTRIBUTE; void __asan_stack_free(size_t ptr, size_t size, size_t real_stack) - __attribute__((visibility("default"))); + ASAN_INTERFACE_FUNCTION_ATTRIBUTE; // Marks memory region [addr, addr+size) as unaddressable. // This memory must be previously allocated by the user program. Accessing @@ -101,7 +101,7 @@ // set a breakpoint on this function in a debugger. 
void __asan_report_error(uintptr_t pc, uintptr_t bp, uintptr_t sp, uintptr_t addr, bool is_write, size_t access_size) - __attribute__((visibility("default"))); + ASAN_INTERFACE_FUNCTION_ATTRIBUTE; // Sets the exit code to use when reporting an error. // Returns the old value. @@ -137,4 +137,5 @@ void __asan_print_accumulated_stats(); } // namespace +#undef ASAN_INTERFACE_FUNCTION_ATTRIBUTE #endif // ASAN_INTERFACE_H Modified: compiler-rt/trunk/lib/asan/asan_internal.h URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/asan/asan_internal.h?rev=149395&r1=149394&r2=149395&view=diff ============================================================================== --- compiler-rt/trunk/lib/asan/asan_internal.h (original) +++ compiler-rt/trunk/lib/asan/asan_internal.h Tue Jan 31 12:13:50 2012 @@ -35,9 +35,9 @@ #endif // _WIN32 // If __WORDSIZE was undefined by the platform, define it in terms of the -// compiler built-in __LP64__. +// compiler built-ins __LP64__ and _WIN64. #ifndef __WORDSIZE -#if __LP64__ +#if __LP64__ || defined(_WIN64) #define __WORDSIZE 64 #else #define __WORDSIZE 32 From dpatel at apple.com Tue Jan 31 12:14:05 2012 From: dpatel at apple.com (Devang Patel) Date: Tue, 31 Jan 2012 18:14:05 -0000 Subject: [llvm-commits] [llvm] r149396 - in /llvm/trunk: include/llvm/MC/MCParser/MCAsmParser.h lib/MC/MCParser/AsmParser.cpp lib/Target/X86/AsmParser/X86AsmParser.cpp Message-ID: <20120131181405.F1C272A6C12C@llvm.org> Author: dpatel Date: Tue Jan 31 12:14:05 2012 New Revision: 149396 URL: http://llvm.org/viewvc/llvm-project?rev=149396&view=rev Log: Add assembler dialect attribute in asm parser which lets target specific asm parser change dialect on the fly. 
Modified: llvm/trunk/include/llvm/MC/MCParser/MCAsmParser.h llvm/trunk/lib/MC/MCParser/AsmParser.cpp llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp Modified: llvm/trunk/include/llvm/MC/MCParser/MCAsmParser.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCParser/MCAsmParser.h?rev=149396&r1=149395&r2=149396&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCParser/MCAsmParser.h (original) +++ llvm/trunk/include/llvm/MC/MCParser/MCAsmParser.h Tue Jan 31 12:14:05 2012 @@ -65,6 +65,7 @@ void setTargetParser(MCTargetAsmParser &P); virtual unsigned getAssemblerDialect() { return 0;} + virtual void setAssemblerDialect(unsigned i) { } bool getShowParsedOperands() const { return ShowParsedOperands; } void setShowParsedOperands(bool Value) { ShowParsedOperands = Value; } Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCParser/AsmParser.cpp?rev=149396&r1=149395&r2=149396&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCParser/AsmParser.cpp (original) +++ llvm/trunk/lib/MC/MCParser/AsmParser.cpp Tue Jan 31 12:14:05 2012 @@ -123,6 +123,9 @@ int64_t CppHashLineNumber; SMLoc CppHashLoc; + /// AssemblerDialect. ~OU means unset value and use value provided by MAI. 
+ unsigned AssemblerDialect; + public: AsmParser(SourceMgr &SM, MCContext &Ctx, MCStreamer &Out, const MCAsmInfo &MAI); @@ -144,7 +147,15 @@ virtual MCAsmLexer &getLexer() { return Lexer; } virtual MCContext &getContext() { return Ctx; } virtual MCStreamer &getStreamer() { return Out; } - virtual unsigned getAssemblerDialect() { return MAI.getAssemblerDialect(); } + virtual unsigned getAssemblerDialect() { + if (AssemblerDialect == ~0U) + return MAI.getAssemblerDialect(); + else + return AssemblerDialect; + } + virtual void setAssemblerDialect(unsigned i) { + AssemblerDialect = i; + } virtual bool Warning(SMLoc L, const Twine &Msg, ArrayRef Ranges = ArrayRef()); @@ -369,7 +380,8 @@ MCStreamer &_Out, const MCAsmInfo &_MAI) : Lexer(_MAI), Ctx(_Ctx), Out(_Out), MAI(_MAI), SrcMgr(_SM), GenericParser(new GenericAsmParser), PlatformParser(0), - CurBuffer(0), MacrosEnabled(true), CppHashLineNumber(0) { + CurBuffer(0), MacrosEnabled(true), CppHashLineNumber(0), + AssemblerDialect(~0U) { // Save the old handler. SavedDiagHandler = SrcMgr.getDiagHandler(); SavedDiagContext = SrcMgr.getDiagContext(); Modified: llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp?rev=149396&r1=149395&r2=149396&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp (original) +++ llvm/trunk/lib/Target/X86/AsmParser/X86AsmParser.cpp Tue Jan 31 12:14:05 2012 @@ -34,7 +34,6 @@ class X86AsmParser : public MCTargetAsmParser { MCSubtargetInfo &STI; MCAsmParser &Parser; - bool IntelSyntax; private: MCAsmParser &getParser() const { return Parser; } @@ -94,7 +93,7 @@ public: X86AsmParser(MCSubtargetInfo &sti, MCAsmParser &parser) - : MCTargetAsmParser(), STI(sti), Parser(parser), IntelSyntax(false) { + : MCTargetAsmParser(), STI(sti), Parser(parser) { // Initialize the set of available features. 
setAvailableFeatures(ComputeAvailableFeatures(STI.getFeatureBits())); @@ -107,7 +106,7 @@ virtual bool ParseDirective(AsmToken DirectiveID); bool isParsingIntelSyntax() { - return IntelSyntax || getParser().getAssemblerDialect(); + return getParser().getAssemblerDialect(); } }; } // end anonymous namespace @@ -1646,7 +1645,7 @@ else if (IDVal.startswith(".code")) return ParseDirectiveCode(IDVal, DirectiveID.getLoc()); else if (IDVal.startswith(".intel_syntax")) { - IntelSyntax = true; + getParser().setAssemblerDialect(1); if (getLexer().isNot(AsmToken::EndOfStatement)) { if(Parser.getTok().getString() == "noprefix") { // FIXME : Handle noprefix From kcc at google.com Tue Jan 31 12:18:08 2012 From: kcc at google.com (Kostya Serebryany) Date: Tue, 31 Jan 2012 10:18:08 -0800 Subject: [llvm-commits] [PATCH][AddressSanitizer] A few more tweaks for the Visual Studio ASan build In-Reply-To: References: Message-ID: r149395, thanks! On Tue, Jan 31, 2012 at 2:12 AM, Timur Iskhodzhanov wrote: > Hi Kostya, > > Can you please review the attached patch and land it if everything's OK > with it? > > It fixes the wrong __WORDSIZE definition on Win x64 and also > introduced ASAN_INTERFACE_FUNCTION_ATTRIBUTE you've suggested in the > other discussion. > > Thanks, > Timur Iskhodzhanov > From kcc at google.com Tue Jan 31 12:23:59 2012 From: kcc at google.com (Kostya Serebryany) Date: Tue, 31 Jan 2012 10:23:59 -0800 Subject: [llvm-commits] [PATCH] AddressSanitizer: do not test memcpy on Lion In-Reply-To: References: Message-ID: Looks good. Weird. Does this mean that valgrind will also have problems on Lion?
--kcc On Tue, Jan 31, 2012 at 5:11 AM, Alexander Potapenko wrote: > The attached patch disables testing memcpy() on Mac OS 10.7, > where memcpy() in fact aliases memmove() and thus calling it with > overlapping parameters is not an error. > > -- > Alexander Potapenko > Software Engineer > Google Moscow > From kcc at google.com Tue Jan 31 12:27:24 2012 From: kcc at google.com (Kostya Serebryany) Date: Tue, 31 Jan 2012 10:27:24 -0800 Subject: [llvm-commits] [PATCH] AddressSanitizer: do not wrap memcpy() on Mac OS 10.7 In-Reply-To: References: Message-ID: Looks good. But maybe we should simply drop memcpy on all variants of MacOS? --kcc On Tue, Jan 31, 2012 at 5:36 AM, Alexander Potapenko wrote: > The attached patch disables wrapping memcpy() on Mac OS Lion, where it > actually falls back to memmove. > In this case we still need to initialize real_memcpy, so we set it to > real_memmove > We check for MACOS_VERSION_SNOW_LEOPARD, because currently only Snow > Leopard and Lion are supported. > > -- > Alexander Potapenko > Software Engineer > Google Moscow >
From stoklund at 2pi.dk Tue Jan 31 12:27:52 2012 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Tue, 31 Jan 2012 10:27:52 -0800 Subject: [llvm-commits] [llvm] r149360 - /llvm/trunk/lib/CodeGen/RegAllocFast.cpp In-Reply-To: <20120131055532.950F62A6C12C@llvm.org> References: <20120131055532.950F62A6C12C@llvm.org> Message-ID: <5E4214BE-C5E2-41FF-9CA8-570F2C4A0679@2pi.dk> On Jan 30, 2012, at 9:55 PM, Andrew Trick wrote: > + unsigned OperReg = MO.getReg(); > + for (const unsigned *AS = TRI->getOverlaps(Reg); *AS; ++AS) { > + if (OperReg != *AS) > + continue; > + if (OperReg == Reg || TRI->isSuperRegister(OperReg, Reg)) { > + // If the ret already has an operand for this physreg or a superset, > + // don't duplicate it. Set the kill flag if the value is defined. > + if (hasDef && !MO.isKill()) > + MO.setIsKill(); > + Found = true; > + break; > + } > + } This loop looks weird. Can't you just go: > + if (OperReg == Reg || TRI->isSuperRegister(OperReg, Reg)) { > + // If the ret already has an operand for this physreg or a superset, > + // don't duplicate it. Set the kill flag if the value is defined.
> + if (hasDef && !MO.isKill()) > + MO.setIsKill(); > + Found = true; > + break; > + } From atrick at apple.com Tue Jan 31 12:29:30 2012 From: atrick at apple.com (Andrew Trick) Date: Tue, 31 Jan 2012 10:29:30 -0800 Subject: [llvm-commits] [llvm] r149360 - /llvm/trunk/lib/CodeGen/RegAllocFast.cpp In-Reply-To: <5E4214BE-C5E2-41FF-9CA8-570F2C4A0679@2pi.dk> References: <20120131055532.950F62A6C12C@llvm.org> <5E4214BE-C5E2-41FF-9CA8-570F2C4A0679@2pi.dk> Message-ID: <39ABAA3C-B9CC-4001-9DD4-E248B379BEF3@apple.com> On Jan 31, 2012, at 10:27 AM, Jakob Stoklund Olesen wrote: > > On Jan 30, 2012, at 9:55 PM, Andrew Trick wrote: > >> + unsigned OperReg = MO.getReg(); >> + for (const unsigned *AS = TRI->getOverlaps(Reg); *AS; ++AS) { >> + if (OperReg != *AS) >> + continue; >> + if (OperReg == Reg || TRI->isSuperRegister(OperReg, Reg)) { >> + // If the ret already has an operand for this physreg or a superset, >> + // don't duplicate it. Set the kill flag if the value is defined. >> + if (hasDef && !MO.isKill()) >> + MO.setIsKill(); >> + Found = true; >> + break; >> + } >> + } > > This loop looks weird. Can't you just go: > >> + if (OperReg == Reg || TRI->isSuperRegister(OperReg, Reg)) { >> + // If the ret already has an operand for this physreg or a superset, >> + // don't duplicate it. Set the kill flag if the value is defined. >> + if (hasDef && !MO.isKill()) >> + MO.setIsKill(); >> + Found = true; >> + break; >> + } > > Thanks. Some last minute cleanup gone awry here. -Andy From anat.shemer at intel.com Tue Jan 31 12:49:10 2012 From: anat.shemer at intel.com (Shemer, Anat) Date: Tue, 31 Jan 2012 18:49:10 +0000 Subject: [llvm-commits] [PATCH][Review request] replace multiple architecture flag arguments with Subtarget argument Message-ID: <042D0278E37F0F40BF8993003DC4291008B2B8@HASMSX104.ger.corp.intel.com> Hi, The function getZeroVector() in X86ISelLowering.cpp currently accepts two architecture flags: HasSSE2 and HasAVX. I replaced this with passing Subtarget. 
Now getZeroVector() queries Subtarget for the specific architecture features. The patch is attached. I would appreciate it if you could review it and approve or comment. Thanks, Anat -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: diff.txt Url: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20120131/4ca58824/attachment.txt From atrick at apple.com Tue Jan 31 12:54:19 2012 From: atrick at apple.com (Andrew Trick) Date: Tue, 31 Jan 2012 18:54:19 -0000 Subject: [llvm-commits] [llvm] r149398 - /llvm/trunk/lib/CodeGen/RegAllocFast.cpp Message-ID: <20120131185419.480FF2A6C12C@llvm.org> Author: atrick Date: Tue Jan 31 12:54:19 2012 New Revision: 149398 URL: http://llvm.org/viewvc/llvm-project?rev=149398&view=rev Log: Obvious unnecessary loop removal. Follow through from previous checkin.
Modified: llvm/trunk/lib/CodeGen/RegAllocFast.cpp Modified: llvm/trunk/lib/CodeGen/RegAllocFast.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/RegAllocFast.cpp?rev=149398&r1=149397&r2=149398&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/RegAllocFast.cpp (original) +++ llvm/trunk/lib/CodeGen/RegAllocFast.cpp Tue Jan 31 12:54:19 2012 @@ -775,17 +775,16 @@ continue; unsigned OperReg = MO.getReg(); - for (const unsigned *AS = TRI->getOverlaps(Reg); *AS; ++AS) { - if (OperReg != *AS) - continue; - if (OperReg == Reg || TRI->isSuperRegister(OperReg, Reg)) { - // If the ret already has an operand for this physreg or a superset, - // don't duplicate it. Set the kill flag if the value is defined. - if (hasDef && !MO.isKill()) - MO.setIsKill(); - Found = true; - break; - } + if (!TargetRegisterInfo::isPhysicalRegister(OperReg)) + continue; + + if (OperReg == Reg || TRI->isSuperRegister(OperReg, Reg)) { + // If the ret already has an operand for this physreg or a superset, + // don't duplicate it. Set the kill flag if the value is defined. 
+ if (hasDef && !MO.isKill()) + MO.setIsKill(); + Found = true; + break; } } if (!Found) From gkistanova at gmail.com Tue Jan 31 12:55:36 2012 From: gkistanova at gmail.com (Galina Kistanova) Date: Tue, 31 Jan 2012 18:55:36 -0000 Subject: [llvm-commits] [zorg] r149399 - /zorg/trunk/zorg/buildbot/builders/LLDBBuilder.py Message-ID: <20120131185536.2F8902A6C12C@llvm.org> Author: gkistanova Date: Tue Jan 31 12:55:35 2012 New Revision: 149399 URL: http://llvm.org/viewvc/llvm-project?rev=149399&view=rev Log: lldb builder changes; Patch for Mark Peek Modified: zorg/trunk/zorg/buildbot/builders/LLDBBuilder.py Modified: zorg/trunk/zorg/buildbot/builders/LLDBBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/LLDBBuilder.py?rev=149399&r1=149398&r2=149399&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/LLDBBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/LLDBBuilder.py Tue Jan 31 12:55:35 2012 @@ -3,91 +3,120 @@ import buildbot import buildbot.process.factory from buildbot.steps.source import SVN -from buildbot.steps.shell import SetProperty, ShellCommand, WarningCountingShellCommand +from buildbot.steps.shell import Configure, SetProperty +from buildbot.steps.shell import ShellCommand, WarningCountingShellCommand from buildbot.process.properties import WithProperties -import ClangBuilder - -def isNewLLVMRevision(build_status): - if build_status.getNumber() == 0: - return true - - current_llvmrev = build_status.getProperty('llvmrev') - try: - prev_build_no = build_status.getNumber()-1 - prev_build_status = build_status.getBuilder().getBuild(prev_build_no) - prev_llvmrev = prev_build_status.getProperty('llvmrev') - return prev_llvmrev != current_llvmrev - except IndexError: - return true - -def getLLDBBuildFactory(triple, outOfDir=False, useTwoStage=False, +def getLLDBBuildFactory(triple, outOfDir=False, useTwoStage=False, jobs=1, always_install=False, 
extra_configure_args=[], env={}, *args, **kwargs): - # FIXME: this code is copied from getClangBuildFactory + inDir = not outOfDir and not useTwoStage if inDir: llvm_srcdir = "llvm" - llvm_1_objdir = "llvm" - if always_install: - llvm_1_installdir = "llvm.install" - else: - llvm_1_installdir = None + llvm_objdir = "llvm" else: llvm_srcdir = "llvm.src" - llvm_1_objdir = "llvm.obj" - llvm_1_installdir = "llvm.install.1" - llvm_2_objdir = "llvm.obj.2" - llvm_2_installdir = "llvm.install" + llvm_objdir = "llvm.obj" f = buildbot.process.factory.BuildFactory() + # Determine the build directory. + f.addStep(buildbot.steps.shell.SetProperty(name="get_builddir", + command=["pwd"], + property="builddir", + description="set build dir", + workdir=".")) + + # We really want to revert the patched llvm/clang files but svn sometimes + # doesn't do the right thing. We're left with removing and rebuilding. + f.addStep(ShellCommand(name='rm-%s' % llvm_srcdir, + command=['rm', '-rf', llvm_srcdir], + haltOnFailure = True, + workdir='.', env=env)) + # Find out what version of llvm and clang are needed to build this version + # of lldb. Right now we will assume they use the same version. + # XXX - could this be done directly on the master instead of the slave? + f.addStep(SetProperty(command='svn cat http://llvm.org/svn/llvm-project/lldb/trunk/scripts/build-llvm.pl | grep ^our.*llvm_revision | cut -d \\" -f 2', + property='llvmrev')) + + # The SVN build step provides no mechanism to check out a specific revision + # based on a property, so just run the commands directly here. 
+ + svn_co = ['svn', 'checkout', '--force'] + svn_co += ['--revision', WithProperties('%(llvmrev)s')] + + # build llvm svn checkout command + svn_co_llvm = svn_co + \ + [WithProperties('http://llvm.org/svn/llvm-project/llvm/trunk@%(llvmrev)s'), + llvm_srcdir] + # build clang svn checkout command + svn_co_clang = svn_co + \ + [WithProperties('http://llvm.org/svn/llvm-project/cfe/trunk@%(llvmrev)s'), + '%s/tools/clang' % llvm_srcdir] + + f.addStep(ShellCommand(name='svn-llvm', + command=svn_co_llvm, + haltOnFailure=True, + workdir='.')) + f.addStep(ShellCommand(name='svn-clang', + command=svn_co_clang, + haltOnFailure=True, + workdir='.')) + f.addStep(SVN(name='svn-lldb', mode='update', baseURL='http://llvm.org/svn/llvm-project/lldb/', defaultBranch='trunk', + always_purge=True, workdir='%s/tools/lldb' % llvm_srcdir)) - f.addStep(SetProperty(command='grep ^our.*llvm_revision scripts/build-llvm.pl | cut -d \\" -f 2', - property='llvmrev', - workdir='%s/tools/lldb' % llvm_srcdir)) - - same_llvmrev = lambda step: not isNewLLVMRevision(step.build.getStatus()) - new_llvmrev = lambda step: isNewLLVMRevision(step.build.getStatus()) - - # Clean LLVM only if its revision number changed since the last build. - # Otherwise, only clean LLDB. 
- clean_lldb = \ - WarningCountingShellCommand(name="clean-lldb", - command=['make', "clean"], - haltOnFailure=True, - description="cleaning lldb", - descriptionDone="clean lldb", - workdir='%s/tools/lldb' % llvm_1_objdir, - doStepIf=same_llvmrev) - - # We use force_checkout to ensure the initial checkout is not aborted due to - # the presence of the tools/lldb directory - clangf = ClangBuilder.getClangBuildFactory(triple, test=False, - outOfDir=outOfDir, - useTwoStage=useTwoStage, - always_install=always_install, - extra_configure_args= - extra_configure_args+ - ['--enable-targets=host'], - env=env, - trunk_revision='%(llvmrev)s', - force_checkout=True, - clean=new_llvmrev, - extra_clean_step=clean_lldb, - *args, **kwargs) - f.steps += clangf.steps + + # Patch llvm with lldb changes + f.addStep(ShellCommand(name='patch.llvm', + command='for i in tools/lldb/scripts/llvm*.diff; do echo "Patching with file $i"; patch -p0 -i $i; done', + workdir=llvm_srcdir)) + + # Patch clang with lldb changes + f.addStep(ShellCommand(name='patch.clang', + command='for i in ../lldb/scripts/clang*.diff; do echo "Patching with file $i"; patch -p0 -i $i; done', + workdir='%s/tools/clang' % llvm_srcdir)) + + # Run configure + config_args = [WithProperties("%%(builddir)s/%s/configure" % llvm_srcdir), + "--disable-bindings", + "--without-llvmgcc", + "--without-llvmgxx", + ] + if triple: + config_args += ['--build=%s' % triple] + config_args += extra_configure_args + + f.addStep(Configure(name='configure', + command=config_args, + workdir=llvm_objdir)) + + f.addStep(WarningCountingShellCommand(name="compile", + command=['nice', '-n', '10', + 'make', WithProperties("-j%s" % jobs) + ], + haltOnFailure=True, + workdir=llvm_objdir)) + + # The tests are hanging on Linux at the moment due to some "expect" + # functionality not happening correctly. For now we will stub out the tests + # so we can at least get builds running and reinstate the tests later. 
+ + # Fixup file needed for tests + # f.addStep(ShellCommand(name"copy-gnu_libstdcpp.py", + # command="cp tools/lldb/examples/synthetic/gnu_libstdcpp.py Debug+Asserts/bin", + # workdir=llvm_srcdir)) # Test. - f.addStep(ShellCommand(name="test", - command=['nice', '-n', '10', - 'make'], - haltOnFailure=True, description="test lldb", - env=env, - workdir='%s/tools/lldb/test' % llvm_1_objdir)) + #f.addStep(ShellCommand(name="test", + # command=['nice', '-n', '10', + # 'make'], + # haltOnFailure=True, description="test lldb", + # env=env, + # workdir='%s/tools/lldb/test' % llvm_objdir)) return f From thomas.stellard at amd.com Tue Jan 31 13:17:41 2012 From: thomas.stellard at amd.com (Tom Stellard) Date: Tue, 31 Jan 2012 14:17:41 -0500 Subject: [llvm-commits] FW: Hexagon VLIW instruction scheduler framework patch for review In-Reply-To: <4F2818E4.3070900@codeaurora.org> References: <07f401cce02b$87f18630$97d49290$@org> <20120131160838.GD2075@L7-CNU1252LKR-172027226155.amd.com> <4F2818E4.3070900@codeaurora.org> Message-ID: <20120131191741.GF2075@L7-CNU1252LKR-172027226155.amd.com> On Tue, Jan 31, 2012 at 10:37:56AM -0600, Anshuman Dasgupta wrote: > Hi Tom, > > > How are you going to model bundle > > constraints? I'm not familiar with the Hexagon architecture, but our > > hardware has several bundle constraints. For example, some instructions > > can only be in a certain slot within the bundle, while other instructions > > fill all slots in the bundle. > > Yes, we had to solve the same problem for Hexagon. We authored a target-independent packetizer (or bundler) that examines available slots and automatically constructs a DFA to represent slot restrictions in a VLIW architecture. This DFA can be queried by a target while bundling instructions. 
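As an illustration of the idea described above, here is a minimal C++ sketch of a slot-tracking DFA. It is not the LLVM DFA packetizer and does not use its API: the SlotDFA class, its method names, and the encoding of each instruction class as a bitmask of permitted slots are all assumptions made for this example. It also assumes every instruction occupies exactly one slot.

```cpp
// Hypothetical sketch: SlotDFA and everything in it is invented for
// illustration and is not taken from LLVM.
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Bit s set => slot s. Used both for "slots a class may use" and for
// "slots already occupied in the current bundle".
using SlotMask = std::uint8_t;

// One DFA state: every slot-occupancy pattern still possible for the
// instructions reserved so far.
using State = std::set<SlotMask>;

class SlotDFA {
public:
  // classes[c] is the mask of slots that instruction class c may occupy.
  explicit SlotDFA(std::vector<SlotMask> classes)
      : classes_(std::move(classes)) {
    // Build the whole transition table up front with a worklist.
    std::map<State, int> ids;
    std::vector<State> work;
    State initial;
    initial.insert(0);
    ids[initial] = 0;
    work.push_back(initial);
    while (!work.empty()) {
      State from = work.back();
      work.pop_back();
      int fromId = ids[from];
      for (unsigned cls = 0; cls < classes_.size(); ++cls) {
        State to = step(from, classes_[cls]);
        if (to.empty())
          continue; // no free slot: no transition, so the class is rejected
        auto it = ids.find(to);
        if (it == ids.end()) {
          it = ids.emplace(to, static_cast<int>(ids.size())).first;
          work.push_back(to);
        }
        table_[std::make_pair(fromId, cls)] = it->second;
      }
    }
  }

  // True if an instruction of class cls still fits in the current bundle.
  bool canReserve(unsigned cls) const {
    assert(cls < classes_.size());
    return table_.count(std::make_pair(current_, cls)) != 0;
  }

  // Add an instruction of class cls to the current bundle.
  void reserve(unsigned cls) {
    assert(canReserve(cls));
    current_ = table_.at(std::make_pair(current_, cls));
  }

  // Start a new, empty bundle.
  void clear() { current_ = 0; }

private:
  // All occupancy patterns reachable by placing one more instruction.
  static State step(const State &from, SlotMask allowed) {
    State to;
    for (SlotMask used : from)
      for (unsigned s = 0; s < 8; ++s) {
        SlotMask bit = static_cast<SlotMask>(1u << s);
        if ((allowed & bit) && !(used & bit))
          to.insert(static_cast<SlotMask>(used | bit));
      }
    return to;
  }

  std::vector<SlotMask> classes_;
  std::map<std::pair<int, unsigned>, int> table_;
  int current_ = 0;
};
```

With classes {0x1, 0x3} (class 0 restricted to slot 0, class 1 allowed in slot 0 or 1), reserving class 1 and then class 0 is accepted, because each state records every slot assignment that is still open instead of committing class 1 to slot 0.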
You may be interested in this posting on llvm-commit: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20111128/132357.html The corresponding commit is: http://llvm.org/viewvc/llvm-project?view=rev&revision=145629 > Thanks for pointing this out to me. It looks really useful. I'll try to see if I can integrate it into our backend. I'll let you know if I have any questions. Thanks, Tom > Feel free to ping me if you have any questions on the DFA packetizer. > > -Anshu > > > -- > Qualcomm Innovation Center, Inc is a member of Code Aurora Forum > > > > > > > > On 1/31/2012 10:08 AM, Tom Stellard wrote: > >On Tue, Jan 31, 2012 at 09:18:05AM -0600, Sergei Larin wrote: > >> > >> Hello everybody, > >> > >> > >> > >> Attached is an initial patch for a VLIW specific scheduler framework that > >>utilizes a deterministic finite automaton (DFA). > >> > >> > >> > >>Several key points: > >> > >>- The scheduler is largely based on the existing framework, but > >>introduces several VLIW specific concepts. It could be classified as a top > >>down list scheduler, critical path first, with DFA used for parallel > >>resources modeling. It also models and tracks register pressure in the way > >>similar to the current RegPressure scheduler. It employs a slightly > >>different way to compute "cost" function for all SUs in AQ which allows for > >>somewhat easier balancing of multiple heuristic inputs. Current version does > >>_not_ generate bundles/packets (but models them internally). It could be > >>easily modified to do so, and it is our plan to make it a part of bundle > >>generation in the near future. > >> > >>- The scheduler is enabled for the Hexagon backend. Compared to > >>any existing scheduler, for this VLIW target this code produces between 1.9% > >>slowdown and 11% speedup on our internal test suite. This test set is composed > >>of a variety of real world applications ranging from DSP specific > >>applications to SPEC.
Some DSP kernels (when taken out of context) enjoy up > >>to 20% speedup when compared to the "default" scheduling mechanism > >>(RegPressure pre-RA + post RA). The main reason for this kind of corner case > >>behavior is long chains of independent memory accesses that are > >>conservatively serialized by the default scheduler (and there is no HW > >>scheduler to sort it out at the run time). > >> > >>- This patch is an initial submission with a bare minimum of > >>features, and more heuristics will be added to it later. We prefer to submit > >>it in stages to simplify review process and improve SW management. > >> > >>- The patch also contains minor updates to two Hexagon specific tests > >>in order to compensate for new order of instructions generated by the > >>Hexagon backend __with scheduler disabled__. > >> > >>- SVN revision 149130. LLVM verification test run for x86 platform > >>detects no additional failures. > >> > >> > >> > >> Comments and reviews are eagerly anticipated :) > >> > > > >Hi Sergei, > > > >I'm glad to see a VLIW scheduler proposed for LLVM. We are working on > >an LLVM backend for our Evergreen / Northern Islands open source drivers, > >which are also VLIW. I'm hoping we can use this in our backend as well. > >I just have a few questions and comments. > > > >When you start doing bundle generation in the scheduler will you be > >using the new MachineInstrBundle? How are you going to model bundle > >constraints? I'm not familiar with the Hexagon architecture, but our > >hardware has several bundle constraints. For example, some instructions > >can only be in a certain slot within the bundle, while other instructions > >fill all slots in the bundle. There is also a limit to the number of > >constant registers (these are in a different register space than the GPRs) > >that can be read from within the bundle, among other things. It would > >be nice to have some way to apply these constraints in the scheduler.
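The three constraints mentioned above (slot restrictions, instructions that fill the whole bundle, and a cap on constant registers read per bundle) could be checked roughly as in the C++ sketch below. This is only an illustration under stated assumptions: BundleInst, isLegalBundle, and the parameters numSlots and maxConstRegs are invented names, the constant-register limit is assumed to count distinct registers, and nothing here reflects an existing LLVM interface or the actual hardware rules.

```cpp
// Hypothetical sketch: none of these names come from LLVM or a real backend.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

struct BundleInst {
  std::uint8_t allowedSlots;       // bit s set => may be placed in slot s
  bool fillsBundle;                // true => must be alone in its bundle
  std::vector<unsigned> constRegs; // constant registers this instruction reads
};

// Backtracking search: try to give instructions i..end distinct legal slots.
static bool assignSlots(const std::vector<BundleInst> &bundle, std::size_t i,
                        unsigned used, unsigned numSlots) {
  if (i == bundle.size())
    return true;
  for (unsigned s = 0; s < numSlots; ++s) {
    unsigned bit = 1u << s;
    if ((bundle[i].allowedSlots & bit) && !(used & bit) &&
        assignSlots(bundle, i + 1, used | bit, numSlots))
      return true;
  }
  return false;
}

// Returns true if the instructions can legally share one bundle.
bool isLegalBundle(const std::vector<BundleInst> &bundle, unsigned numSlots,
                   unsigned maxConstRegs) {
  assert(numSlots <= 8 && "allowedSlots is an 8-bit mask");
  std::set<unsigned> constRegs;
  for (const BundleInst &inst : bundle) {
    // An instruction that fills all slots cannot share the bundle.
    if (inst.fillsBundle && bundle.size() != 1)
      return false;
    constRegs.insert(inst.constRegs.begin(), inst.constRegs.end());
  }
  // Assumed rule: the limit applies to distinct constant registers.
  if (constRegs.size() > maxConstRegs)
    return false;
  // A lone bundle-filling instruction needs no slot search.
  if (bundle.size() == 1 && bundle[0].fillsBundle)
    return true;
  return assignSlots(bundle, 0, 0, numSlots);
}
```

A scheduler could call such a predicate before adding a candidate instruction to the bundle being formed; the exhaustive search is only reasonable because a bundle holds a handful of instructions.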
> > > >A quick note on the patch, I noticed a few whitespace errors in > >LinkAllCodegenComponents.h, SchedulerRegistry.h, and > >HexagonInstrInfo.cpp > > > >Nice Work! > > > >-Tom Stellard > >> > >> > >>Thanks. > >> > >> > >> > >>Sergei Larin > >> > >> > >> > >>-- > >> > >>Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum. > >> > >> > >> > > > From wendling at apple.com Tue Jan 31 13:34:45 2012 From: wendling at apple.com (Bill Wendling) Date: Tue, 31 Jan 2012 11:34:45 -0800 Subject: [llvm-commits] [llvm] r149335 - in /llvm/trunk/test: CodeGen/Generic/2007-12-31-UnusedSelector.ll CodeGen/Generic/2009-11-16-BadKillsCrash.ll CodeGen/Mips/eh.ll CodeGen/X86/2008-05-28-LocalRegAllocBug.ll CodeGen/X86/negate-add-zero.ll Transforms/Inline/inline-invoke-tail.ll Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll In-Reply-To: <4F27B6FD.7090302@free.fr> References: <20120131020907.DD6682A6C12C@llvm.org> <4F27B6FD.7090302@free.fr> Message-ID: <9D6C39EB-79DF-4B4E-9BDD-2774D2ABE023@apple.com> Hi Duncan, The auto-upgrading of EH was removed back in November. -bw On Jan 31, 2012, at 1:40 AM, Duncan Sands wrote: > Hi Bill, does auto-upgrade still work with these changes you've been making? > > Thanks, Duncan. > > On 31/01/12 03:09, Bill Wendling wrote: >> Author: void >> Date: Mon Jan 30 20:09:07 2012 >> New Revision: 149335 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=149335&view=rev >> Log: >> Remove all references to the old EH.
>>
>> There was always the current EH. -- Ministry of Truth
>>
>> Modified:
>>     llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll
>>     llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll
>>     llvm/trunk/test/CodeGen/Mips/eh.ll
>>     llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll
>>     llvm/trunk/test/CodeGen/X86/negate-add-zero.ll
>>     llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll
>>     llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll
>>
>> Modified: llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll?rev=149335&r1=149334&r2=149335&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll (original)
>> +++ llvm/trunk/test/CodeGen/Generic/2007-12-31-UnusedSelector.ll Mon Jan 30 20:09:07 2012
>> @@ -30,8 +30,6 @@
>>
>>  declare void @__cxa_throw(i8*, i8*, void (i8*)*) noreturn
>>
>> -declare i32 @llvm.eh.selector.i32(i8*, i8*, ...)
>> -
>>  declare void @__cxa_end_catch()
>>
>>  declare i32 @__gxx_personality_v0(...)
>>
>> Modified: llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll?rev=149335&r1=149334&r2=149335&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll (original)
>> +++ llvm/trunk/test/CodeGen/Generic/2009-11-16-BadKillsCrash.ll Mon Jan 30 20:09:07 2012
>> @@ -15,8 +15,6 @@
>>  %"struct.std::locale::facet" = type { i32 (...)**, i32 }
>>  %union..0._15 = type { i32 }
>>
>> -declare i8* @llvm.eh.exception() nounwind readonly
>> -
>>  declare i8* @__cxa_begin_catch(i8*) nounwind
>>
>>  declare %"struct.std::ctype"* @_ZSt9use_facetISt5ctypeIcEERKT_RKSt6locale(%"struct.std::locale"*)
>>
>> Modified: llvm/trunk/test/CodeGen/Mips/eh.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/eh.ll?rev=149335&r1=149334&r2=149335&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/Mips/eh.ll (original)
>> +++ llvm/trunk/test/CodeGen/Mips/eh.ll Mon Jan 30 20:09:07 2012
>> @@ -54,16 +54,10 @@
>>
>>  declare i8* @__cxa_allocate_exception(i32)
>>
>> -declare i8* @llvm.eh.exception() nounwind readonly
>> -
>>  declare i32 @__gxx_personality_v0(...)
>>
>> -declare i32 @llvm.eh.selector(i8*, i8*, ...) nounwind
>> -
>>  declare i32 @llvm.eh.typeid.for(i8*) nounwind
>>
>> -declare void @llvm.eh.resume(i8*, i32)
>> -
>>  declare void @__cxa_throw(i8*, i8*, i8*)
>>
>>  declare i8* @__cxa_begin_catch(i8*)
>>
>> Modified: llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll?rev=149335&r1=149334&r2=149335&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll (original)
>> +++ llvm/trunk/test/CodeGen/X86/2008-05-28-LocalRegAllocBug.ll Mon Jan 30 20:09:07 2012
>> @@ -2,8 +2,6 @@
>>
>>  @_ZTVN10Evaluation10GridOutputILi3EEE = external constant [5 x i32 (...)*] ;<[5 x i32 (...)*]*> [#uses=1]
>>
>> -declare i8* @llvm.eh.exception() nounwind
>> -
>>  declare i8* @_Znwm(i32)
>>
>>  declare i8* @__cxa_begin_catch(i8*) nounwind
>>
>> Modified: llvm/trunk/test/CodeGen/X86/negate-add-zero.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/negate-add-zero.ll?rev=149335&r1=149334&r2=149335&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/X86/negate-add-zero.ll (original)
>> +++ llvm/trunk/test/CodeGen/X86/negate-add-zero.ll Mon Jan 30 20:09:07 2012
>> @@ -486,10 +486,6 @@
>>
>>  declare i8* @_Znwm(i32)
>>
>> -declare i8* @llvm.eh.exception() nounwind
>> -
>> -declare i32 @llvm.eh.selector.i32(i8*, i8*, ...) nounwind
>> -
>>  declare i32 @llvm.eh.typeid.for.i32(i8*) nounwind
>>
>>  declare void @_ZdlPv(i8*) nounwind
>>
>> Modified: llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll?rev=149335&r1=149334&r2=149335&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll (original)
>> +++ llvm/trunk/test/Transforms/Inline/inline-invoke-tail.ll Mon Jan 30 20:09:07 2012
>> @@ -28,10 +28,6 @@
>>  unreachable
>>  }
>>
>> -declare i8* @llvm.eh.exception() nounwind readonly
>> -
>> -declare i32 @llvm.eh.selector(i8*, i8*, ...) nounwind
>> -
>>  declare i32 @__gxx_personality_v0(...)
>>
>>  declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind
>>
>> Modified: llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll?rev=149335&r1=149334&r2=149335&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll (original)
>> +++ llvm/trunk/test/Transforms/SCCP/2009-01-14-IPSCCP-Invoke.ll Mon Jan 30 20:09:07 2012
>> @@ -21,10 +21,6 @@
>>
>>  declare i8* @__cxa_begin_catch(i8*) nounwind
>>
>> -declare i8* @llvm.eh.exception() nounwind
>> -
>> -declare i32 @llvm.eh.selector.i32(i8*, i8*, ...) nounwind
>> -
>>  declare void @__cxa_end_catch()
>>
>>  declare i32 @__gxx_personality_v0(...)
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From grosbach at apple.com Tue Jan 31 13:47:32 2012
From: grosbach at apple.com (Jim Grosbach)
Date: Tue, 31 Jan 2012 19:47:32 -0000
Subject: [llvm-commits] [llvm] r149408 - /llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h
Message-ID: <20120131194732.EEACB2A6C12C@llvm.org>

Author: grosbach
Date: Tue Jan 31 13:47:32 2012
New Revision: 149408

URL: http://llvm.org/viewvc/llvm-project?rev=149408&view=rev
Log:
Tidy up. Trailing whitespace.

Modified:
    llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h

Modified: llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h?rev=149408&r1=149407&r2=149408&view=diff
==============================================================================
--- llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h (original)
+++ llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h Tue Jan 31 13:47:32 2012
@@ -166,7 +166,7 @@
 bool operator!=(const bundle_iterator &x) const {
 return !operator==(x);
 }
-
+
 // Increment and decrement operators...
 bundle_iterator &operator--() { // predecrement - Back up
 do {
@@ -197,7 +197,7 @@

 IterTy getInstrIterator() const {
 return MII;
- }
+ }
 };

 typedef Instructions::iterator instr_iterator;
@@ -360,7 +360,7 @@
 const MachineBasicBlock *getLandingPadSuccessor() const;

 // Code Layout methods.
-
+
 /// moveBefore/moveAfter - move 'this' block before or after the specified
 /// block. This only moves the block, it does not modify the CFG or adjust
 /// potential fall-throughs at the end of the block.
@@ -407,7 +407,7 @@
 /// in transferSuccessors, and update PHI operands in the successor blocks
 /// which refer to fromMBB to refer to this.
 void transferSuccessorsAndUpdatePHIs(MachineBasicBlock *fromMBB);
-
+
 /// isSuccessor - Return true if the specified MBB is a successor of this
 /// block.
 bool isSuccessor(const MachineBasicBlock *MBB) const;
@@ -425,7 +425,7 @@
 /// branch to do so (e.g., a table jump). True is a conservative answer.
 bool canFallThrough();

- /// Returns a pointer to the first instructon in this block that is not a
+ /// Returns a pointer to the first instructon in this block that is not a
 /// PHINode instruction. When adding instruction to the beginning of the
 /// basic block, they should be added before the returned value, not before
 /// the first instruction, which might be PHI.
@@ -471,8 +471,8 @@
 instr_iterator insert(instr_iterator I, MachineInstr *M) {
 return Insts.insert(I, M);
 }
- instr_iterator insertAfter(instr_iterator I, MachineInstr *M) {
- return Insts.insertAfter(I, M);
+ instr_iterator insertAfter(instr_iterator I, MachineInstr *M) {
+ return Insts.insertAfter(I, M);
 }

 template
@@ -482,8 +482,8 @@
 iterator insert(iterator I, MachineInstr *M) {
 return Insts.insert(I.getInstrIterator(), M);
 }
- iterator insertAfter(iterator I, MachineInstr *M) {
- return Insts.insertAfter(I.getInstrIterator(), M);
+ iterator insertAfter(iterator I, MachineInstr *M) {
+ return Insts.insertAfter(I.getInstrIterator(), M);
 }

 /// erase - Remove the specified element or range from the instruction list.
@@ -544,7 +544,7 @@
 /// removeFromParent - This method unlinks 'this' from the containing
 /// function, and returns it, but does not delete it.
 MachineBasicBlock *removeFromParent();
-
+
 /// eraseFromParent - This method unlinks 'this' from the containing
 /// function and deletes it.
 void eraseFromParent();

From glider at google.com Tue Jan 31 13:56:06 2012
From: glider at google.com (Alexander Potapenko)
Date: Tue, 31 Jan 2012 23:56:06 +0400
Subject: [llvm-commits] [PATCH] AddressSanitizer: do not wrap memcpy() on Mac OS 10.7
In-Reply-To:
References:
Message-ID:

I think it's better not to. We may miss something interesting on 10.6

On Tue, Jan 31, 2012 at 10:27 PM, Kostya Serebryany wrote:
> Looks good.
> But maybe we should simply drop memcpy on all variants of MacOS?
>
> --kcc
>
>
> On Tue, Jan 31, 2012 at 5:36 AM, Alexander Potapenko
> wrote:
>>
>> The attached patch disables wrapping memcpy() on Mac OS Lion, where it
>> actually falls back to memmove.
>> In this case we still need to initialize real_memcpy, so we set it to
>> real_memmove
>> We check for MACOS_VERSION_SNOW_LEOPARD, because currently only Snow
>> Leopard and Lion are supported.
>>
>> --
>> Alexander Potapenko
>> Software Engineer
>> Google Moscow
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>

--
Alexander Potapenko
Software Engineer
Google Moscow

From glider at google.com Tue Jan 31 13:57:13 2012
From: glider at google.com (Alexander Potapenko)
Date: Tue, 31 Jan 2012 23:57:13 +0400
Subject: [llvm-commits] [PATCH] AddressSanitizer: do not test memcpy on Lion
In-Reply-To:
References:
Message-ID:

I think it already has, see
http://code.google.com/p/valgrind-variant/issues/detail?id=5

On Tue, Jan 31, 2012 at 10:23 PM, Kostya Serebryany wrote:
> Looks good.
> Weird. Does this mean that valgrind will also have problems on Lion?
>
> --kcc
>
>
> On Tue, Jan 31, 2012 at 5:11 AM, Alexander Potapenko
> wrote:
>>
>> The attached patch disables testing memcpy() on Mac OS 10.7,
>> where memcpy() in fact aliases memmove() and thus calling it with
>> overlapping parameters is not an error.
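
[Editorial illustration, not part of the original thread.] A minimal C sketch of why an overlap check stops being meaningful once memcpy() aliases memmove(): memmove() is defined for overlapping source and destination ranges, while the C standard leaves memcpy() undefined in that case. The helper name `shift_right` is hypothetical and does not come from the AddressSanitizer patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only (not from the patch):
 * moves the first `len` bytes of `buf` right by `by` positions.
 * Source and destination overlap whenever by < len, so memmove()
 * is required; calling memcpy() here would be undefined behavior
 * under the C standard. The caller must ensure buf has room for
 * at least len + by bytes. */
static void shift_right(char *buf, size_t len, size_t by) {
    memmove(buf + by, buf, len);
}
```

On a platform where memcpy() is implemented as memmove(), the same call through memcpy() would behave identically, which is presumably why the overlap test is disabled there.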
>>
>> --
>> Alexander Potapenko
>> Software Engineer
>> Google Moscow
>
>

--
Alexander Potapenko
Software Engineer
Google Moscow

From grosser at fim.uni-passau.de Tue Jan 31 13:54:51 2012
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Tue, 31 Jan 2012 19:54:51 -0000
Subject: [llvm-commits] [polly] r149410 - /polly/trunk/www/index.html
Message-ID: <20120131195451.296842A6C12C@llvm.org>

Author: grosser
Date: Tue Jan 31 13:54:50 2012
New Revision: 149410

URL: http://llvm.org/viewvc/llvm-project?rev=149410&view=rev
Log:
www: Spelling fixes

Reported by Sebastian Pop

Modified:
    polly/trunk/www/index.html

Modified: polly/trunk/www/index.html
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/index.html?rev=149410&r1=149409&r2=149410&view=diff
==============================================================================
--- polly/trunk/www/index.html (original)
+++ polly/trunk/www/index.html Tue Jan 31 13:54:50 2012
@@ -73,8 +73,10 @@
  • January

    Improved support for the isl scheduling optimizer

-    Polly can now automatically optimize all polybench kernels without the help of
-    an external optimizer. The compile time is reasonable fast and we can show
+    Polly can now automatically optimize all polybench
+    2.0 kernels without the help of
+    an external optimizer. The compile time is reasonable and we can show
     notable speedups for various kernels.
    -adce              Aggressive Dead Code Elimination
    -always-inline     Inliner for always_inline functions
    -argpromotion      Promote 'by reference' arguments to scalars
    -bb-vectorize      Combine instructions to form vector instructions within basic blocks
    -block-placement   Profile Guided Basic Block Placement
    -break-crit-edges  Break critical edges in CFG
    -codegenprepare    Optimize for code generation