From evan.cheng at apple.com Mon Dec 22 03:27:49 2008
From: evan.cheng at apple.com (Evan Cheng)
Date: Mon, 22 Dec 2008 01:27:49 -0800
Subject: [llvm-commits] [llvm-gcc-4.2] r61128 - /llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp
In-Reply-To: <38a0d8450812191738v2250c2e1l799b69e6e8c36fa6@mail.gmail.com>
References: <200812170738.mBH7cG9Y013859@zion.cs.uiuc.edu>
 <38a0d8450812191738v2250c2e1l799b69e6e8c36fa6@mail.gmail.com>
Message-ID: <3D4B9BAD-94A9-43BC-AB53-FE4BB9D1372A@apple.com>

On Dec 19, 2008, at 5:38 PM, Rafael Espindola wrote:

> 2008/12/17 Evan Cheng :
>> Author: evancheng
>> Date: Wed Dec 17 01:37:53 2008
>> New Revision: 61128
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=61128&view=rev
>> Log:
>> If this input operand is matching an output operand, e.g. '0',
>> check if this is something that llvm supports. If the operand types
>> are different, then emit an error if 1) one of the types is not
>> integer, 2) if size of input type is larger than the output type.
>> If the size of the integer input size is smaller than the integer
>> output type, then cast it to the larger type and shift the value if
>> the target is big endian.
>
>
> This broke FD_SET on linux. An example of preprocessed code
>
> ----------------------------------------------------------------
> typedef long int __fd_mask;
> typedef struct {
>   __fd_mask __fds_bits[1024 / (8 * sizeof (__fd_mask))];
> }
> fd_set;
> int xmlNanoFTPCheckResponse(void *ctx) {
>   fd_set rfd;
>   int __d0, __d1;
>   __asm__ __volatile__ ("cld; rep; stosl" : "=c" (__d0), "=D" (__d1) :
> "a" (0), "0" (sizeof (fd_set) / sizeof (__fd_mask)), "1"
> (&(((&rfd))->__fds_bits)[0]) : "memory");
> }
> ----------------------------------------------------------------
>
> Is this the intended consequence of your patch? Should there be an
> explicit cast to intptr_t in the C code?

How is it failing? A compile-time error?

You are probably right that it's missing a cast.

Evan

>
> --
> Rafael Avila de Espindola
>
> Google | Gordon House | Barrow Street | Dublin 4 | Ireland
> Registered in Dublin, Ireland | Registration Number: 368047

From espindola at google.com Mon Dec 22 03:49:08 2008
From: espindola at google.com (Rafael Espindola)
Date: Mon, 22 Dec 2008 09:49:08 +0000
Subject: [llvm-commits] [llvm-gcc-4.2] r61128 - /llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp
In-Reply-To: <3D4B9BAD-94A9-43BC-AB53-FE4BB9D1372A@apple.com>
References: <200812170738.mBH7cG9Y013859@zion.cs.uiuc.edu>
 <38a0d8450812191738v2250c2e1l799b69e6e8c36fa6@mail.gmail.com>
 <3D4B9BAD-94A9-43BC-AB53-FE4BB9D1372A@apple.com>
Message-ID: <38a0d8450812220149v79bfe823q94b5f1d0db64aa6c@mail.gmail.com>

> How is it failing? A compile-time error?
>
> You are probably right that it's missing a cast.

The error is

Unsupported inline asm: input constraint with a matching output
constraint of incompatible type!

My current fix is to change d1 to be of pointer type. I will try to
get the change to FD_ZERO upstream.
> Evan
>

Cheers,
--
Rafael Avila de Espindola

Google | Gordon House | Barrow Street | Dublin 4 | Ireland
Registered in Dublin, Ireland | Registration Number: 368047

From edwintorok at gmail.com Mon Dec 22 09:15:18 2008
From: edwintorok at gmail.com (Török Edwin)
Date: Mon, 22 Dec 2008 17:15:18 +0200
Subject: [llvm-commits] [llvm] r61297 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll
In-Reply-To: <494E0148.4030600@gmail.com>
References: <200812210019.mBL0JM32028246@zion.cs.uiuc.edu>
 <494D95DA.1030608@mxc.ca> <494E0148.4030600@gmail.com>
Message-ID: <494FAF06.4050600@gmail.com>

On 2008-12-21 10:41, Török Edwin wrote:
> On 2008-12-21 03:03, Nick Lewycky wrote:
>> Eli Friedman wrote:
>>> On Sat, Dec 20, 2008 at 4:38 PM, Eli Friedman wrote:
>>>> On Sat, Dec 20, 2008 at 4:19 PM, Nick Lewycky wrote:
>>>>> Author: nicholas
>>>>> Date: Sat Dec 20 18:19:21 2008
>>>>> New Revision: 61297
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=61297&view=rev
>>>>> Log:
>>>>> Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2).
>>>>
>>>> I'm pretty sure this isn't safe; take the following testcase:
>>>>
>>>> int foo(char* x) {return strcmp(x, "x", 2) == 0;}
>>>
>>> Oops, messed that up slightly; try the following:
>>> int foo(char* x) {return strcmp(x, "x") == 0;}
>>
>> The pathological case is:
>>
>> char x[1] = "\0"; // pretend we don't know its length
>> char y[2] = "x\0"; // we know its length
>> return strcmp(x, y);
>>
>> If we transform it into
>>
>> return memcmp(x, y, 2)
>>
>> then the existing MemCmpOpt will exercise these cases:
>>
>> // memcmp(S1,S2,2) != 0 -> (*(short*)LHS ^ *(short*)RHS) != 0
>> // memcmp(S1,S2,4) != 0 -> (*(int*)LHS ^ *(int*)RHS) != 0
>>
>> resulting in a 2-byte load of x[1], which is illegal.
>>
>> Could someone comment on what part of this transform is wrong? I recall
>> having a discussion about it on IRC, but can't recall the details...
>
> You need to check the alignment.
> I think something like:
> MinAlignStr = Minimum Alignment of Str1P, Str2P (max 16)
> MinLen = Minimum length of Str1P and Str2P
> AlignMinLen = AlignOf(MinLen) (max 16)
> If (MinAlignStr >= AlignMinLen) -> transform is safe
>

I found the IRC discussion; we assumed that memcmp won't do a 2 byte
load if unaligned:

edwin: memcmp may do a 2 byte read
(11:12:15 PM) _sabre_: memcmp can't do a 2 byte read
(11:12:19 PM) danchr: sdt: oh, I didn't know memcmp stopped at first difference
(11:12:19 PM) _sabre_: unless the pointers are aligned
(11:12:38 PM) _sabre_: or it knows it is safe some other way
(11:12:55 PM) baldrick: is that guaranteed?
(11:13:33 PM) edwin: for sure it can't transform strcmp(P,"1234\0") into memcmp(P,"1234\0",5)
(11:13:39 PM) baldrick: or just what is expected by a quality implementation :)
(11:13:58 PM) baldrick: edwin: why not?
(11:14:06 PM) edwin: because P may be 4 byte aligned

But try this with llvm-gcc:

#include <string.h>
int foo(const char *s)
{
    return !memcmp(s,"x",2);
}

It produces:

foo:
.Leh_func_begin1:
.Llabel1:
        cmpw    $120, (%rdi)
        sete    %al
        movzbl  %al, %eax
        ret

So is it a bug in MemCmp optimization?

Best regards,
--Edwin

From dpatel at apple.com Mon Dec 22 12:22:11 2008
From: dpatel at apple.com (Devang Patel)
Date: Mon, 22 Dec 2008 10:22:11 -0800
Subject: [llvm-commits] [llvm] r61297 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll
In-Reply-To: <200812210019.mBL0JM32028246@zion.cs.uiuc.edu>
References: <200812210019.mBL0JM32028246@zion.cs.uiuc.edu>
Message-ID:

Nick,

char *s1 = "hi\0how are you";
char *s2 = "hi\0I am fine ";

s1 and s2 are identical as per strcmp, but memcmp does not agree.
Do you handle this case?
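
The difference raised in the question above can be checked with a small standalone program. What follows is only a sketch and is not code from the thread or from the patch: the helper names (sign, compareAsStrings, compareWholeBuffers, compareUpToTerminator, checkComparisons) are made up for illustration, and the buffers are declared as arrays, an assumption made so that sizeof covers the bytes after the embedded NUL. It shows that strcmp stops at the first terminator, that memcmp over the full buffers does not, and that memcmp bounded by the string length plus its terminator (the same shape as the strcmp(P, "x") -> memcmp(P, "x", 2) example in the r61297 log) agrees with strcmp for this particular input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Two buffers that share the prefix "hi" and a NUL, then differ.
// Declared as arrays (an assumption of this sketch) so that sizeof
// includes the bytes that follow the embedded NUL.
static const char s1[] = "hi\0how are you";
static const char s2[] = "hi\0I am fine";

// Reduce a comparison result to -1, 0 or 1; only the sign is specified.
static int sign(int v) { return (v > 0) - (v < 0); }

// strcmp stops at the first NUL, so both operands look like "hi".
int compareAsStrings() { return sign(std::strcmp(s1, s2)); }

// memcmp over every byte both arrays have also sees the differing tails.
int compareWholeBuffers() {
  const std::size_t n = std::min(sizeof(s1), sizeof(s2));
  return sign(std::memcmp(s1, s2, n));
}

// memcmp bounded by strlen + 1 covers "hi" and its terminator only.
int compareUpToTerminator() {
  const std::size_t n = std::strlen(s1) + 1;
  return sign(std::memcmp(s1, s2, n));
}

// Hypothetical helper: checks the relationships described above.
void checkComparisons() {
  assert(compareAsStrings() == 0);       // equal as C strings
  assert(compareWholeBuffers() != 0);    // different as raw buffers
  assert(compareUpToTerminator() == 0);  // equal up to the first NUL
}
```

This only illustrates the choice of length operand. It says nothing about the alignment and over-read concerns discussed earlier in the thread, which apply when one operand is shorter than the chosen length.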
-
Devang

On Dec 20, 2008, at 4:19 PM, Nick Lewycky wrote:

> Author: nicholas
> Date: Sat Dec 20 18:19:21 2008
> New Revision: 61297
>
> URL: http://llvm.org/viewvc/llvm-project?rev=61297&view=rev
> Log:
> Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2).
>
> Added:
>     llvm/trunk/test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll
> Modified:
>     llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp
>
> Modified: llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp?rev=61297&r1=61296&r2=61297&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp (original)
> +++ llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Sat Dec 20
> 18:19:21 2008
> @@ -80,7 +80,10 @@
>    /// EmitMemChr - Emit a call to the memchr function. This assumes
> that Ptr is
>    /// a pointer, Val is an i32 value, and Len is an 'intptr_t' value.
>    Value *EmitMemChr(Value *Ptr, Value *Val, Value *Len, IRBuilder<>
> &B);
> -
> +
> +  /// EmitMemCmp - Emit a call to the memcmp function.
> +  Value *EmitMemCmp(Value *Ptr1, Value *Ptr2, Value *Len,
> IRBuilder<> &B);
> +
>    /// EmitUnaryFloatFnCall - Emit a call to the unary function named
> 'Name' (e.g.
>    /// 'floor'). This function is known to take a single of type
> matching 'Op'
>    /// and returns one value with the same type. If 'Op' is a long
> double, 'l'
> @@ -106,7 +109,7 @@
>    /// EmitFWrite - Emit a call to the fwrite function. This assumes
> that Ptr is
>    /// a pointer, Size is an 'intptr_t', and File is a pointer to FILE.
>    void EmitFWrite(Value *Ptr, Value *Size, Value *File, IRBuilder<>
> &B);
> -
> +
>  };
>  } // End anonymous namespace.
>
> @@ -151,6 +154,19 @@
>    return B.CreateCall3(MemChr, CastToCStr(Ptr, B), Val, Len,
> "memchr");
>  }
>
> +/// EmitMemCmp - Emit a call to the memcmp function.
> +Value *LibCallOptimization::EmitMemCmp(Value *Ptr1, Value *Ptr2,
> +                                       Value *Len, IRBuilder<> &B) {
> +  Module *M = Caller->getParent();
> +  Value *MemCmp = M->getOrInsertFunction("memcmp",
> +                                         Type::Int32Ty,
> +
> PointerType::getUnqual(Type::Int8Ty),
> +
> PointerType::getUnqual(Type::Int8Ty),
> +                                         TD->getIntPtrType(), NULL);
> +  return B.CreateCall3(MemCmp, CastToCStr(Ptr1, B),
> CastToCStr(Ptr2, B),
> +                       Len, "memcmp");
> +}
> +
>  /// EmitUnaryFloatFnCall - Emit a call to the unary function named
> 'Name' (e.g.
>  /// 'floor'). This function is known to take a single of type
> matching 'Op' and
>  /// returns one value with the same type. If 'Op' is a long double,
> 'l' is
> @@ -537,6 +553,18 @@
>      // strcmp(x, y) -> cnst (if both x and y are constant strings)
>      if (HasStr1 && HasStr2)
>        return ConstantInt::get(CI->getType(),
> strcmp(Str1.c_str(),Str2.c_str()));
> +
> +    // strcmp(P, "x") -> memcmp(P, "x", 2)
> +    uint64_t Len1 = GetStringLength(Str1P);
> +    uint64_t Len2 = GetStringLength(Str2P);
> +    if (Len1 || Len2) {
> +      // Choose the smallest Len excluding 0 which means 'unknown'.
> +      if (!Len1 || (Len2 && Len2 < Len1))
> +        Len1 = Len2;
> +      return EmitMemCmp(Str1P, Str2P,
> +                        ConstantInt::get(TD->getIntPtrType(),
> Len1), B);
> +    }
> +
>      return 0;
>    }
>  };
>
> Added: llvm/trunk/test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll?rev=61297&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll (added)
> +++ llvm/trunk/test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll Sat Dec 20 18:19:21 2008
> @@ -0,0 +1,10 @@
> +; RUN: llvm-as < %s | opt -simplify-libcalls | llvm-dis | grep
> call.*memcmp
> +
> +@.str = internal constant [2 x i8] c"x\00"
> +
> +declare i32 @strcmp(i8* %dest, i8* %src)
> +
> +define i32 @foo(i8* %x, i8* %y) {
> +  %A = call i32 @strcmp(i8* %x, i8* getelementptr ([2 x i8]* @.str,
> i32 0, i32 0))
> +  ret i32 %A
> +}
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From gohman at apple.com Mon Dec 22 13:44:43 2008
From: gohman at apple.com (Dan Gohman)
Date: Mon, 22 Dec 2008 19:44:43 -0000
Subject: [llvm-commits] [llvm] r61338 - /llvm/trunk/include/llvm/Target/TargetInstrDesc.h
Message-ID: <200812221944.mBMJihNE021258@zion.cs.uiuc.edu>

Author: djg
Date: Mon Dec 22 13:44:39 2008
New Revision: 61338

URL: http://llvm.org/viewvc/llvm-project?rev=61338&view=rev
Log:
Clarify a comment.

Modified:
    llvm/trunk/include/llvm/Target/TargetInstrDesc.h

Modified: llvm/trunk/include/llvm/Target/TargetInstrDesc.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrDesc.h?rev=61338&r1=61337&r2=61338&view=diff

==============================================================================
--- llvm/trunk/include/llvm/Target/TargetInstrDesc.h (original)
+++ llvm/trunk/include/llvm/Target/TargetInstrDesc.h Mon Dec 22 13:44:39 2008
@@ -159,7 +159,8 @@
   /// getNumDefs - Return the number of MachineOperands that are register
   /// definitions. Register definitions always occur at the start of the
-  /// machine operand list. This is the number of "outs" in the .td file.
+  /// machine operand list. This is the number of "outs" in the .td file,
+  /// and does not include implicit defs.
   unsigned getNumDefs() const {
     return NumDefs;
   }

From gohman at apple.com Mon Dec 22 15:03:26 2008
From: gohman at apple.com (Dan Gohman)
Date: Mon, 22 Dec 2008 13:03:26 -0800
Subject: [llvm-commits] Patch: Adding unit tests to LLVM
In-Reply-To:
References:
Message-ID:

On Dec 18, 2008, at 1:10 PM, Talin wrote:

> This patch adds a unit test framework to LLVM, along with a sample
> unit test for DenseMap. I don't expect this patch to be accepted
> as-is, this is mainly a trial balloon and proof of concept.

Hi Talin,

I think this looks useful. I'm contemplating a change to the
DenseMap class right now, actually, and I may try out your
patch to help test it :-).

> 3) I did not actually include the testing framework in the patch; It
> will need to be checked in separately. There are two approaches to
> this. One approach is to use the svn:external feature to create a
> link to the googletest svn repository from the LLVM svn repository.
> The other approach is to take a snapshot of googletest and check it
> in to the LLVM repository.

I've found svn:externals to repositories on different servers
to be inconvenient. If the framework isn't inconveniently large,
I think it'd be best to take snapshots.

>
> The GoogleTest tar archive is here:
> http://code.google.com/p/googletest/downloads/list
> I've located it within the LLVM source tree in the location
> "third-party/googletest".

My sense is that the "third-party" part is unnecessary, and it'd
be nicer to just have googletest at the top level, but I don't
have a strong opinion here.

Dan

From gohman at apple.com Mon Dec 22 15:06:21 2008
From: gohman at apple.com (Dan Gohman)
Date: Mon, 22 Dec 2008 21:06:21 -0000
Subject: [llvm-commits] [llvm] r61341 - /llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h
Message-ID: <200812222106.mBML6LY8023728@zion.cs.uiuc.edu>

Author: djg
Date: Mon Dec 22 15:06:20 2008
New Revision: 61341

URL: http://llvm.org/viewvc/llvm-project?rev=61341&view=rev
Log:
Add an assertion to catch SUnits reallocations. And add a doxygen
comment for the ScheduleDAGSDNodes class.

Modified:
    llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h

Modified: llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h?rev=61341&r1=61340&r2=61341&view=diff

==============================================================================
--- llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h (original)
+++ llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h Mon Dec 22 15:06:20 2008
@@ -59,6 +59,20 @@
     virtual void EmitNoop() {}
   };

+  /// ScheduleDAGSDNodes - A ScheduleDAG for scheduling SDNode-based DAGs.
+  ///
+  /// Edges between SUnits are initially based on edges in the SelectionDAG,
+  /// and additional edges can be added by the schedulers as heuristics.
+  /// SDNodes such as Constants, Registers, and a few others that are not
+  /// interesting to schedulers are not allocated SUnits.
+  ///
+  /// SDNodes with MVT::Flag operands are grouped along with the flagged
+  /// nodes into a single SUnit so that they are scheduled together.
+  ///
+  /// SDNode-based scheduling graphs do not use SDep::Anti or SDep::Output
+  /// edges. Physical register dependence information is not carried in
+  /// the DAG and must be handled explicitly by schedulers.
+  ///
   class ScheduleDAGSDNodes : public ScheduleDAG {
   public:
     SmallSet CommuteSet; // Nodes that should be commuted.
@@ -88,7 +102,11 @@
     /// NewSUnit - Creates a new SUnit and return a ptr to it.
     ///
     SUnit *NewSUnit(SDNode *N) {
+#ifndef NDEBUG
+      const SUnit *Addr = &SUnits[0];
+#endif
       SUnits.push_back(SUnit(N, (unsigned)SUnits.size()));
+      assert(Addr == &SUnits[0] && "SUnits std::vector reallocated on the fly!");
       SUnits.back().OrigNode = &SUnits.back();
       return &SUnits.back();
     }

From gohman at apple.com Mon Dec 22 15:06:57 2008
From: gohman at apple.com (Dan Gohman)
Date: Mon, 22 Dec 2008 21:06:57 -0000
Subject: [llvm-commits] [llvm] r61342 - /llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h
Message-ID: <200812222106.mBML6vPF023754@zion.cs.uiuc.edu>

Author: djg
Date: Mon Dec 22 15:06:56 2008
New Revision: 61342

URL: http://llvm.org/viewvc/llvm-project?rev=61342&view=rev
Log:
Add an accessor for the isNormalMemory field in the SDep class.

Modified:
    llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h

Modified: llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h?rev=61342&r1=61341&r2=61342&view=diff

==============================================================================
--- llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h (original)
+++ llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h Mon Dec 22 15:06:56 2008
@@ -166,6 +166,13 @@
       return getKind() != Data;
     }

+    /// isNormalMemory - Test if this is an Order dependence between two
+    /// memory accesses where both sides of the dependence access memory
+    /// in non-volatile and fully modeled ways.
+    bool isNormalMemory() const {
+      return getKind() == Order && Contents.Order.isNormalMemory;
+    }
+
     /// isMustAlias - Test if this is an Order dependence that is marked
     /// as "must alias", meaning that the SUnits at either end of the edge
     /// have a memory dependence on a known memory location.

From gohman at apple.com Mon Dec 22 15:08:09 2008
From: gohman at apple.com (Dan Gohman)
Date: Mon, 22 Dec 2008 21:08:09 -0000
Subject: [llvm-commits] [llvm] r61343 - /llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h
Message-ID: <200812222108.mBML8ARP023806@zion.cs.uiuc.edu>

Author: djg
Date: Mon Dec 22 15:08:08 2008
New Revision: 61343

URL: http://llvm.org/viewvc/llvm-project?rev=61343&view=rev
Log:
Add an assertion to the ScheduleDAGInstrs class to catch SUnits
reallocations. We don't do cloning on MachineInstr schedule DAGs,
but this is a worthwhile sanity check regardless.

Modified:
    llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h

Modified: llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h?rev=61343&r1=61342&r2=61343&view=diff

==============================================================================
--- llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h (original)
+++ llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h Mon Dec 22 15:08:08 2008
@@ -36,7 +36,11 @@
     /// NewSUnit - Creates a new SUnit and return a ptr to it.
     ///
     SUnit *NewSUnit(MachineInstr *MI) {
+#ifndef NDEBUG
+      const SUnit *Addr = &SUnits[0];
+#endif
       SUnits.push_back(SUnit(MI, (unsigned)SUnits.size()));
+      assert(Addr == &SUnits[0] && "SUnits std::vector reallocated on the fly!");
       SUnits.back().OrigNode = &SUnits.back();
       return &SUnits.back();
     }

From echristo at apple.com Mon Dec 22 15:09:53 2008
From: echristo at apple.com (Eric Christopher)
Date: Mon, 22 Dec 2008 13:09:53 -0800
Subject: [llvm-commits] [llvm] r61239 - in /llvm/trunk: docs/LangRef.html include/llvm/Attributes.h lib/Analysis/BasicAliasAnalysis.cpp test/Analysis/BasicAA/nocapture.ll
In-Reply-To: <494B6A94.3010504@mxc.ca>
References: <200812190639.mBJ6dMeR028570@zion.cs.uiuc.edu>
 <494B6A94.3010504@mxc.ca>
Message-ID: <16590569-0BFE-4B98-B082-A0EE8A8F797B@apple.com>

I'm seeing failures on darwin due to this:

    /// getParamAlignment - Return the alignment for the specified function
    /// parameter.
    unsigned getParamAlignment(unsigned Idx) const {
      Attributes Align = getAttributes(Idx) & Attribute::Alignment;
      if (Align == 0)
        return 0;

      return 1ull << ((Align >> 16) - 1);
    }

There's a warning (local to darwin, -Wshorten64-to-32) here that we're
truncating the return value from 64 to 32 bits. :)

-eric

From gohman at apple.com Mon Dec 22 15:11:34 2008
From: gohman at apple.com (Dan Gohman)
Date: Mon, 22 Dec 2008 21:11:34 -0000
Subject: [llvm-commits] [llvm] r61344 - /llvm/trunk/lib/CodeGen/ScheduleDAG.cpp
Message-ID: <200812222111.mBMLBYQY023926@zion.cs.uiuc.edu>

Author: djg
Date: Mon Dec 22 15:11:33 2008
New Revision: 61344

URL: http://llvm.org/viewvc/llvm-project?rev=61344&view=rev
Log:
Optimize setDepthDirty and setHeightDirty a little, as they showed
up on a profile.

Modified:
    llvm/trunk/lib/CodeGen/ScheduleDAG.cpp

Modified: llvm/trunk/lib/CodeGen/ScheduleDAG.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ScheduleDAG.cpp?rev=61344&r1=61343&r2=61344&view=diff

==============================================================================
--- llvm/trunk/lib/CodeGen/ScheduleDAG.cpp (original)
+++ llvm/trunk/lib/CodeGen/ScheduleDAG.cpp Mon Dec 22 15:11:33 2008
@@ -119,29 +119,35 @@
 }

 void SUnit::setDepthDirty() {
+  if (!isDepthCurrent) return;
   SmallVector WorkList;
   WorkList.push_back(this);
-  while (!WorkList.empty()) {
+  do {
     SUnit *SU = WorkList.pop_back_val();
-    if (!SU->isDepthCurrent) continue;
     SU->isDepthCurrent = false;
     for (SUnit::const_succ_iterator I = SU->Succs.begin(),
-         E = SU->Succs.end(); I != E; ++I)
-      WorkList.push_back(I->getSUnit());
-  }
+         E = SU->Succs.end(); I != E; ++I) {
+      SUnit *SuccSU = I->getSUnit();
+      if (SuccSU->isDepthCurrent)
+        WorkList.push_back(SuccSU);
+    }
+  } while (!WorkList.empty());
 }

 void SUnit::setHeightDirty() {
+  if (!isHeightCurrent) return;
   SmallVector WorkList;
   WorkList.push_back(this);
-  while (!WorkList.empty()) {
+  do {
     SUnit *SU = WorkList.pop_back_val();
-    if (!SU->isHeightCurrent) continue;
     SU->isHeightCurrent = false;
     for (SUnit::const_pred_iterator I = SU->Preds.begin(),
-         E = SU->Preds.end(); I != E; ++I)
-      WorkList.push_back(I->getSUnit());
-  }
+         E = SU->Preds.end(); I != E; ++I) {
+      SUnit *PredSU = I->getSUnit();
+      if (PredSU->isHeightCurrent)
+        WorkList.push_back(PredSU);
+    }
+  } while (!WorkList.empty());
 }

 /// setDepthToAtLeast - Update this node's successors to reflect the

From gohman at apple.com Mon Dec 22 15:14:27 2008
From: gohman at apple.com (Dan Gohman)
Date: Mon, 22 Dec 2008 21:14:27 -0000
Subject: [llvm-commits] [llvm] r61345 - in /llvm/trunk: include/llvm/CodeGen/AsmPrinter.h lib/CodeGen/AsmPrinter/AsmPrinter.cpp
Message-ID: <200812222114.mBMLESO7024023@zion.cs.uiuc.edu>

Author: djg
Date: Mon Dec 22 15:14:27 2008
New Revision: 61345

URL:
http://llvm.org/viewvc/llvm-project?rev=61345&view=rev Log: Refactor a bunch of code out of AsmPrinter::EmitGlobalConstant into separate functions. Modified: llvm/trunk/include/llvm/CodeGen/AsmPrinter.h llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Modified: llvm/trunk/include/llvm/CodeGen/AsmPrinter.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/AsmPrinter.h?rev=61345&r1=61344&r2=61345&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/AsmPrinter.h (original) +++ llvm/trunk/include/llvm/CodeGen/AsmPrinter.h Mon Dec 22 15:14:27 2008 @@ -25,6 +25,9 @@ class GCStrategy; class Constant; class ConstantArray; + class ConstantInt; + class ConstantStruct; + class ConstantVector; class GCMetadataPrinter; class GlobalVariable; class GlobalAlias; @@ -369,6 +372,11 @@ const GlobalValue *findGlobalValue(const Constant* CV); void EmitLLVMUsedList(Constant *List); void EmitXXStructorList(Constant *List); + void EmitGlobalConstantStruct(const ConstantStruct* CVS); + void EmitGlobalConstantArray(const ConstantArray* CVA); + void EmitGlobalConstantVector(const ConstantVector* CP); + void EmitGlobalConstantFP(const ConstantFP* CFP); + void EmitGlobalConstantLargeInt(const ConstantInt* CI); GCMetadataPrinter *GetOrCreateGCPrinter(GCStrategy *C); }; } Modified: llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp?rev=61345&r1=61344&r2=61345&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Mon Dec 22 15:14:27 2008 @@ -939,206 +939,231 @@ O << '\n'; } +void AsmPrinter::EmitGlobalConstantArray(const ConstantArray *CVA) { + if (CVA->isString()) { + EmitString(CVA); + } else { // Not a string. 
Print the values in successive locations + for (unsigned i = 0, e = CVA->getNumOperands(); i != e; ++i) + EmitGlobalConstant(CVA->getOperand(i)); + } +} + +void AsmPrinter::EmitGlobalConstantVector(const ConstantVector *CP) { + const VectorType *PTy = CP->getType(); + + for (unsigned I = 0, E = PTy->getNumElements(); I < E; ++I) + EmitGlobalConstant(CP->getOperand(I)); +} + +void AsmPrinter::EmitGlobalConstantStruct(const ConstantStruct *CVS) { + // Print the fields in successive locations. Pad to align if needed! + const TargetData *TD = TM.getTargetData(); + unsigned Size = TD->getABITypeSize(CVS->getType()); + const StructLayout *cvsLayout = TD->getStructLayout(CVS->getType()); + uint64_t sizeSoFar = 0; + for (unsigned i = 0, e = CVS->getNumOperands(); i != e; ++i) { + const Constant* field = CVS->getOperand(i); + + // Check if padding is needed and insert one or more 0s. + uint64_t fieldSize = TD->getABITypeSize(field->getType()); + uint64_t padSize = ((i == e-1 ? Size : cvsLayout->getElementOffset(i+1)) + - cvsLayout->getElementOffset(i)) - fieldSize; + sizeSoFar += fieldSize + padSize; + + // Now print the actual field value. + EmitGlobalConstant(field); + + // Insert padding - this may include padding to increase the size of the + // current field up to the ABI size (if the struct is not packed) as well + // as padding to ensure that the next field starts at the right offset. + EmitZeros(padSize); + } + assert(sizeSoFar == cvsLayout->getSizeInBytes() && + "Layout of constant struct may be incorrect!"); +} + +void AsmPrinter::EmitGlobalConstantFP(const ConstantFP *CFP) { + // FP Constants are printed as integer constants to avoid losing + // precision... 
+ const TargetData *TD = TM.getTargetData(); + if (CFP->getType() == Type::DoubleTy) { + double Val = CFP->getValueAPF().convertToDouble(); // for comment only + uint64_t i = CFP->getValueAPF().bitcastToAPInt().getZExtValue(); + if (TAI->getData64bitsDirective()) + O << TAI->getData64bitsDirective() << i << '\t' + << TAI->getCommentString() << " double value: " << Val << '\n'; + else if (TD->isBigEndian()) { + O << TAI->getData32bitsDirective() << unsigned(i >> 32) + << '\t' << TAI->getCommentString() + << " double most significant word " << Val << '\n'; + O << TAI->getData32bitsDirective() << unsigned(i) + << '\t' << TAI->getCommentString() + << " double least significant word " << Val << '\n'; + } else { + O << TAI->getData32bitsDirective() << unsigned(i) + << '\t' << TAI->getCommentString() + << " double least significant word " << Val << '\n'; + O << TAI->getData32bitsDirective() << unsigned(i >> 32) + << '\t' << TAI->getCommentString() + << " double most significant word " << Val << '\n'; + } + return; + } else if (CFP->getType() == Type::FloatTy) { + float Val = CFP->getValueAPF().convertToFloat(); // for comment only + O << TAI->getData32bitsDirective() + << CFP->getValueAPF().bitcastToAPInt().getZExtValue() + << '\t' << TAI->getCommentString() << " float " << Val << '\n'; + return; + } else if (CFP->getType() == Type::X86_FP80Ty) { + // all long double variants are printed as hex + // api needed to prevent premature destruction + APInt api = CFP->getValueAPF().bitcastToAPInt(); + const uint64_t *p = api.getRawData(); + // Convert to double so we can print the approximate val as a comment. 
+ APFloat DoubleVal = CFP->getValueAPF(); + bool ignored; + DoubleVal.convert(APFloat::IEEEdouble, APFloat::rmNearestTiesToEven, + &ignored); + if (TD->isBigEndian()) { + O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 48) + << '\t' << TAI->getCommentString() + << " long double most significant halfword of ~" + << DoubleVal.convertToDouble() << '\n'; + O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 32) + << '\t' << TAI->getCommentString() + << " long double next halfword\n"; + O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 16) + << '\t' << TAI->getCommentString() + << " long double next halfword\n"; + O << TAI->getData16bitsDirective() << uint16_t(p[0]) + << '\t' << TAI->getCommentString() + << " long double next halfword\n"; + O << TAI->getData16bitsDirective() << uint16_t(p[1]) + << '\t' << TAI->getCommentString() + << " long double least significant halfword\n"; + } else { + O << TAI->getData16bitsDirective() << uint16_t(p[1]) + << '\t' << TAI->getCommentString() + << " long double least significant halfword of ~" + << DoubleVal.convertToDouble() << '\n'; + O << TAI->getData16bitsDirective() << uint16_t(p[0]) + << '\t' << TAI->getCommentString() + << " long double next halfword\n"; + O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 16) + << '\t' << TAI->getCommentString() + << " long double next halfword\n"; + O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 32) + << '\t' << TAI->getCommentString() + << " long double next halfword\n"; + O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 48) + << '\t' << TAI->getCommentString() + << " long double most significant halfword\n"; + } + EmitZeros(TD->getABITypeSize(Type::X86_FP80Ty) - + TD->getTypeStoreSize(Type::X86_FP80Ty)); + return; + } else if (CFP->getType() == Type::PPC_FP128Ty) { + // all long double variants are printed as hex + // api needed to prevent premature destruction + APInt api = CFP->getValueAPF().bitcastToAPInt(); + const uint64_t *p = api.getRawData(); + if 
(TD->isBigEndian()) { + O << TAI->getData32bitsDirective() << uint32_t(p[0] >> 32) + << '\t' << TAI->getCommentString() + << " long double most significant word\n"; + O << TAI->getData32bitsDirective() << uint32_t(p[0]) + << '\t' << TAI->getCommentString() + << " long double next word\n"; + O << TAI->getData32bitsDirective() << uint32_t(p[1] >> 32) + << '\t' << TAI->getCommentString() + << " long double next word\n"; + O << TAI->getData32bitsDirective() << uint32_t(p[1]) + << '\t' << TAI->getCommentString() + << " long double least significant word\n"; + } else { + O << TAI->getData32bitsDirective() << uint32_t(p[1]) + << '\t' << TAI->getCommentString() + << " long double least significant word\n"; + O << TAI->getData32bitsDirective() << uint32_t(p[1] >> 32) + << '\t' << TAI->getCommentString() + << " long double next word\n"; + O << TAI->getData32bitsDirective() << uint32_t(p[0]) + << '\t' << TAI->getCommentString() + << " long double next word\n"; + O << TAI->getData32bitsDirective() << uint32_t(p[0] >> 32) + << '\t' << TAI->getCommentString() + << " long double most significant word\n"; + } + return; + } else assert(0 && "Floating point constant type not handled"); +} + +void AsmPrinter::EmitGlobalConstantLargeInt(const ConstantInt *CI) { + const TargetData *TD = TM.getTargetData(); + unsigned BitWidth = CI->getBitWidth(); + assert(isPowerOf2_32(BitWidth) && + "Non-power-of-2-sized integers not handled!"); + + // We don't expect assemblers to support integer data directives + // for more than 64 bits, so we emit the data in at most 64-bit + // quantities at a time. 
+ const uint64_t *RawData = CI->getValue().getRawData(); + for (unsigned i = 0, e = BitWidth / 64; i != e; ++i) { + uint64_t Val; + if (TD->isBigEndian()) + Val = RawData[e - i - 1]; + else + Val = RawData[i]; + + if (TAI->getData64bitsDirective()) + O << TAI->getData64bitsDirective() << Val << '\n'; + else if (TD->isBigEndian()) { + O << TAI->getData32bitsDirective() << unsigned(Val >> 32) + << '\t' << TAI->getCommentString() + << " Double-word most significant word " << Val << '\n'; + O << TAI->getData32bitsDirective() << unsigned(Val) + << '\t' << TAI->getCommentString() + << " Double-word least significant word " << Val << '\n'; + } else { + O << TAI->getData32bitsDirective() << unsigned(Val) + << '\t' << TAI->getCommentString() + << " Double-word least significant word " << Val << '\n'; + O << TAI->getData32bitsDirective() << unsigned(Val >> 32) + << '\t' << TAI->getCommentString() + << " Double-word most significant word " << Val << '\n'; + } + } +} + /// EmitGlobalConstant - Print a general LLVM constant to the .s file. void AsmPrinter::EmitGlobalConstant(const Constant *CV) { const TargetData *TD = TM.getTargetData(); - unsigned Size = TD->getABITypeSize(CV->getType()); + const Type *type = CV->getType(); + unsigned Size = TD->getABITypeSize(type); if (CV->isNullValue() || isa(CV)) { EmitZeros(Size); return; } else if (const ConstantArray *CVA = dyn_cast(CV)) { - if (CVA->isString()) { - EmitString(CVA); - } else { // Not a string. Print the values in successive locations - for (unsigned i = 0, e = CVA->getNumOperands(); i != e; ++i) - EmitGlobalConstant(CVA->getOperand(i)); - } + EmitGlobalConstantArray(CVA); return; } else if (const ConstantStruct *CVS = dyn_cast(CV)) { - // Print the fields in successive locations. Pad to align if needed! 
- const StructLayout *cvsLayout = TD->getStructLayout(CVS->getType()); - uint64_t sizeSoFar = 0; - for (unsigned i = 0, e = CVS->getNumOperands(); i != e; ++i) { - const Constant* field = CVS->getOperand(i); - - // Check if padding is needed and insert one or more 0s. - uint64_t fieldSize = TD->getABITypeSize(field->getType()); - uint64_t padSize = ((i == e-1 ? Size : cvsLayout->getElementOffset(i+1)) - - cvsLayout->getElementOffset(i)) - fieldSize; - sizeSoFar += fieldSize + padSize; - - // Now print the actual field value. - EmitGlobalConstant(field); - - // Insert padding - this may include padding to increase the size of the - // current field up to the ABI size (if the struct is not packed) as well - // as padding to ensure that the next field starts at the right offset. - EmitZeros(padSize); - } - assert(sizeSoFar == cvsLayout->getSizeInBytes() && - "Layout of constant struct may be incorrect!"); + EmitGlobalConstantStruct(CVS); return; } else if (const ConstantFP *CFP = dyn_cast(CV)) { - // FP Constants are printed as integer constants to avoid losing - // precision... 
- if (CFP->getType() == Type::DoubleTy) { - double Val = CFP->getValueAPF().convertToDouble(); // for comment only - uint64_t i = CFP->getValueAPF().bitcastToAPInt().getZExtValue(); - if (TAI->getData64bitsDirective()) - O << TAI->getData64bitsDirective() << i << '\t' - << TAI->getCommentString() << " double value: " << Val << '\n'; - else if (TD->isBigEndian()) { - O << TAI->getData32bitsDirective() << unsigned(i >> 32) - << '\t' << TAI->getCommentString() - << " double most significant word " << Val << '\n'; - O << TAI->getData32bitsDirective() << unsigned(i) - << '\t' << TAI->getCommentString() - << " double least significant word " << Val << '\n'; - } else { - O << TAI->getData32bitsDirective() << unsigned(i) - << '\t' << TAI->getCommentString() - << " double least significant word " << Val << '\n'; - O << TAI->getData32bitsDirective() << unsigned(i >> 32) - << '\t' << TAI->getCommentString() - << " double most significant word " << Val << '\n'; - } - return; - } else if (CFP->getType() == Type::FloatTy) { - float Val = CFP->getValueAPF().convertToFloat(); // for comment only - O << TAI->getData32bitsDirective() - << CFP->getValueAPF().bitcastToAPInt().getZExtValue() - << '\t' << TAI->getCommentString() << " float " << Val << '\n'; - return; - } else if (CFP->getType() == Type::X86_FP80Ty) { - // all long double variants are printed as hex - // api needed to prevent premature destruction - APInt api = CFP->getValueAPF().bitcastToAPInt(); - const uint64_t *p = api.getRawData(); - // Convert to double so we can print the approximate val as a comment. 
- APFloat DoubleVal = CFP->getValueAPF(); - bool ignored; - DoubleVal.convert(APFloat::IEEEdouble, APFloat::rmNearestTiesToEven, - &ignored); - if (TD->isBigEndian()) { - O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 48) - << '\t' << TAI->getCommentString() - << " long double most significant halfword of ~" - << DoubleVal.convertToDouble() << '\n'; - O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 32) - << '\t' << TAI->getCommentString() - << " long double next halfword\n"; - O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 16) - << '\t' << TAI->getCommentString() - << " long double next halfword\n"; - O << TAI->getData16bitsDirective() << uint16_t(p[0]) - << '\t' << TAI->getCommentString() - << " long double next halfword\n"; - O << TAI->getData16bitsDirective() << uint16_t(p[1]) - << '\t' << TAI->getCommentString() - << " long double least significant halfword\n"; - } else { - O << TAI->getData16bitsDirective() << uint16_t(p[1]) - << '\t' << TAI->getCommentString() - << " long double least significant halfword of ~" - << DoubleVal.convertToDouble() << '\n'; - O << TAI->getData16bitsDirective() << uint16_t(p[0]) - << '\t' << TAI->getCommentString() - << " long double next halfword\n"; - O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 16) - << '\t' << TAI->getCommentString() - << " long double next halfword\n"; - O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 32) - << '\t' << TAI->getCommentString() - << " long double next halfword\n"; - O << TAI->getData16bitsDirective() << uint16_t(p[0] >> 48) - << '\t' << TAI->getCommentString() - << " long double most significant halfword\n"; - } - EmitZeros(Size - TD->getTypeStoreSize(Type::X86_FP80Ty)); - return; - } else if (CFP->getType() == Type::PPC_FP128Ty) { - // all long double variants are printed as hex - // api needed to prevent premature destruction - APInt api = CFP->getValueAPF().bitcastToAPInt(); - const uint64_t *p = api.getRawData(); - if (TD->isBigEndian()) { - O << 
TAI->getData32bitsDirective() << uint32_t(p[0] >> 32) - << '\t' << TAI->getCommentString() - << " long double most significant word\n"; - O << TAI->getData32bitsDirective() << uint32_t(p[0]) - << '\t' << TAI->getCommentString() - << " long double next word\n"; - O << TAI->getData32bitsDirective() << uint32_t(p[1] >> 32) - << '\t' << TAI->getCommentString() - << " long double next word\n"; - O << TAI->getData32bitsDirective() << uint32_t(p[1]) - << '\t' << TAI->getCommentString() - << " long double least significant word\n"; - } else { - O << TAI->getData32bitsDirective() << uint32_t(p[1]) - << '\t' << TAI->getCommentString() - << " long double least significant word\n"; - O << TAI->getData32bitsDirective() << uint32_t(p[1] >> 32) - << '\t' << TAI->getCommentString() - << " long double next word\n"; - O << TAI->getData32bitsDirective() << uint32_t(p[0]) - << '\t' << TAI->getCommentString() - << " long double next word\n"; - O << TAI->getData32bitsDirective() << uint32_t(p[0] >> 32) - << '\t' << TAI->getCommentString() - << " long double most significant word\n"; - } - return; - } else assert(0 && "Floating point constant type not handled"); - } else if (CV->getType()->isInteger() && - cast(CV->getType())->getBitWidth() >= 64) { - if (const ConstantInt *CI = dyn_cast(CV)) { - unsigned BitWidth = CI->getBitWidth(); - assert(isPowerOf2_32(BitWidth) && - "Non-power-of-2-sized integers not handled!"); - - // We don't expect assemblers to support integer data directives - // for more than 64 bits, so we emit the data in at most 64-bit - // quantities at a time. 
- const uint64_t *RawData = CI->getValue().getRawData(); - for (unsigned i = 0, e = BitWidth / 64; i != e; ++i) { - uint64_t Val; - if (TD->isBigEndian()) - Val = RawData[e - i - 1]; - else - Val = RawData[i]; - - if (TAI->getData64bitsDirective()) - O << TAI->getData64bitsDirective() << Val << '\n'; - else if (TD->isBigEndian()) { - O << TAI->getData32bitsDirective() << unsigned(Val >> 32) - << '\t' << TAI->getCommentString() - << " Double-word most significant word " << Val << '\n'; - O << TAI->getData32bitsDirective() << unsigned(Val) - << '\t' << TAI->getCommentString() - << " Double-word least significant word " << Val << '\n'; - } else { - O << TAI->getData32bitsDirective() << unsigned(Val) - << '\t' << TAI->getCommentString() - << " Double-word least significant word " << Val << '\n'; - O << TAI->getData32bitsDirective() << unsigned(Val >> 32) - << '\t' << TAI->getCommentString() - << " Double-word most significant word " << Val << '\n'; - } - } + EmitGlobalConstantFP(CFP); + return; + } else if (const ConstantInt *CI = dyn_cast(CV)) { + // Small integers are handled below; large integers are handled here. 
+ if (Size > 4) { + EmitGlobalConstantLargeInt(CI); return; } } else if (const ConstantVector *CP = dyn_cast<ConstantVector>(CV)) { - const VectorType *PTy = CP->getType(); - - for (unsigned I = 0, E = PTy->getNumElements(); I < E; ++I) - EmitGlobalConstant(CP->getOperand(I)); - + EmitGlobalConstantVector(CP); return; } - const Type *type = CV->getType(); printDataDirective(type); EmitConstantValueOnly(CV); if (const ConstantInt *CI = dyn_cast<ConstantInt>(CV)) { From isanbard at gmail.com Mon Dec 22 15:36:08 2008 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 22 Dec 2008 21:36:08 -0000 Subject: [llvm-commits] [llvm] r61347 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <200812222136.mBMLa9hC024796@zion.cs.uiuc.edu> Author: void Date: Mon Dec 22 15:36:08 2008 New Revision: 61347 URL: http://llvm.org/viewvc/llvm-project?rev=61347&view=rev Log: Add verification functions to GVN which check to see that an instruction was truly deleted. These will be expanded with further checks of all of the data structures. Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=61347&r1=61346&r2=61347&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Mon Dec 22 15:36:08 2008 @@ -173,6 +173,7 @@ void setMemDep(MemoryDependenceAnalysis* M) { MD = M; } void setDomTree(DominatorTree* D) { DT = D; } uint32_t getNextUnusedValueNumber() { return nextValueNumber; } + void verifyRemoved(const Value *) const; }; } @@ -678,6 +679,15 @@ valueNumbering.erase(V); } +/// verifyRemoved - Verify that the value is removed from all internal data +/// structures.
+void ValueTable::verifyRemoved(const Value *V) const { + for (DenseMap::iterator + I = valueNumbering.begin(), E = valueNumbering.end(); I != E; ++I) { + assert(I->first != V && "Inst still occurs in value numbering map!"); + } +} + //===----------------------------------------------------------------------===// // GVN Pass //===----------------------------------------------------------------------===// @@ -741,6 +751,7 @@ bool mergeBlockIntoPredecessor(BasicBlock* BB); Value* AttemptRedundancyElimination(Instruction* orig, unsigned valno); void cleanupGlobalSets(); + void verifyRemoved(const Instruction *I) const; }; char GVN::ID = 0; @@ -859,6 +870,7 @@ DEBUG(cerr << "GVN removed: " << *PN); MD->removeInstruction(PN); PN->eraseFromParent(); + DEBUG(verifyRemoved(PN)); Phis[BB] = v; return v; @@ -1640,3 +1652,9 @@ delete I->second; localAvail.clear(); } + +/// verifyRemoved - Verify that the specified instruction does not occur in our +/// internal data structures. +void GVN::verifyRemoved(const Instruction *I) const { + VN.verifyRemoved(I); +} From isanbard at gmail.com Mon Dec 22 15:57:30 2008 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 22 Dec 2008 21:57:30 -0000 Subject: [llvm-commits] [llvm] r61349 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <200812222157.mBMLvVch025504@zion.cs.uiuc.edu> Author: void Date: Mon Dec 22 15:57:30 2008 New Revision: 61349 URL: http://llvm.org/viewvc/llvm-project?rev=61349&view=rev Log: Verify removed in a few more places. 
Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=61349&r1=61348&r2=61349&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Mon Dec 22 15:57:30 2008 @@ -1461,6 +1461,7 @@ DEBUG(cerr << "GVN removed: " << **I); MD->removeInstruction(*I); (*I)->eraseFromParent(); + DEBUG(verifyRemoved(*I)); } toErase.clear(); @@ -1611,6 +1612,7 @@ DEBUG(cerr << "GVN PRE removed: " << *CurInst); MD->removeInstruction(CurInst); CurInst->eraseFromParent(); + DEBUG(verifyRemoved(CurInst)); Changed = true; } } From isanbard at gmail.com Mon Dec 22 16:14:08 2008 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 22 Dec 2008 22:14:08 -0000 Subject: [llvm-commits] [llvm] r61350 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <200812222214.mBMME9F8026068@zion.cs.uiuc.edu> Author: void Date: Mon Dec 22 16:14:07 2008 New Revision: 61350 URL: http://llvm.org/viewvc/llvm-project?rev=61350&view=rev Log: Add verification that deleted instruction isn't hiding in the PHI map. 
Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=61350&r1=61349&r2=61350&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Mon Dec 22 16:14:07 2008 @@ -40,11 +40,11 @@ #include using namespace llvm; -STATISTIC(NumGVNInstr, "Number of instructions deleted"); -STATISTIC(NumGVNLoad, "Number of loads deleted"); -STATISTIC(NumGVNPRE, "Number of instructions PRE'd"); +STATISTIC(NumGVNInstr, "Number of instructions deleted"); +STATISTIC(NumGVNLoad, "Number of loads deleted"); +STATISTIC(NumGVNPRE, "Number of instructions PRE'd"); STATISTIC(NumGVNBlocks, "Number of blocks merged"); -STATISTIC(NumPRELoad, "Number of loads PRE'd"); +STATISTIC(NumPRELoad, "Number of loads PRE'd"); static cl::opt EnablePRE("enable-pre", cl::init(true), cl::Hidden); @@ -1581,6 +1581,7 @@ // are not value numbered precisely. if (!success) { delete PREInstr; + DEBUG(verifyRemoved(PREInstr)); continue; } @@ -1659,4 +1660,16 @@ /// internal data structures. void GVN::verifyRemoved(const Instruction *I) const { VN.verifyRemoved(I); + + // Walk through the PHI map to make sure the instruction isn't hiding in there + // somewhere. 
+ for (PhiMapType::iterator + II = phiMap.begin(), IE = phiMap.end(); II != IE; ++II) { + assert(II->first != I && "Inst is still a key in PHI map!"); + + for (SmallPtrSet::iterator + SI = II->second.begin(), SE = II->second.end(); SI != SE; ++SI) { + assert(*SI != I && "Inst is still a value in PHI map!"); + } + } } From isanbard at gmail.com Mon Dec 22 16:16:32 2008 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 22 Dec 2008 22:16:32 -0000 Subject: [llvm-commits] [llvm] r61352 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <200812222216.mBMMGW66026169@zion.cs.uiuc.edu> Author: void Date: Mon Dec 22 16:16:31 2008 New Revision: 61352 URL: http://llvm.org/viewvc/llvm-project?rev=61352&view=rev Log: Simplification: Negate the operator== method instead of implementing a full operator!= method. Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=61352&r1=61351&r2=61352&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Mon Dec 22 16:16:31 2008 @@ -110,30 +110,7 @@ } bool operator!=(const Expression &other) const { - if (opcode != other.opcode) - return true; - else if (opcode == EMPTY || opcode == TOMBSTONE) - return false; - else if (type != other.type) - return true; - else if (function != other.function) - return true; - else if (firstVN != other.firstVN) - return true; - else if (secondVN != other.secondVN) - return true; - else if (thirdVN != other.thirdVN) - return true; - else { - if (varargs.size() != other.varargs.size()) - return true; - - for (size_t i = 0; i < varargs.size(); ++i) - if (varargs[i] != other.varargs[i]) - return true; - - return false; - } + return !(*this == other); } }; From isanbard at gmail.com Mon Dec 22 16:28:56 2008 From: isanbard at gmail.com (Bill 
Wendling) Date: Mon, 22 Dec 2008 22:28:56 -0000 Subject: [llvm-commits] [llvm] r61353 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <200812222228.mBMMSvRO026526@zion.cs.uiuc.edu> Author: void Date: Mon Dec 22 16:28:56 2008 New Revision: 61353 URL: http://llvm.org/viewvc/llvm-project?rev=61353&view=rev Log: Check that the instruction isn't in the value numbering scope. Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=61353&r1=61352&r2=61353&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Mon Dec 22 16:28:56 2008 @@ -1635,18 +1635,34 @@ /// verifyRemoved - Verify that the specified instruction does not occur in our /// internal data structures. -void GVN::verifyRemoved(const Instruction *I) const { - VN.verifyRemoved(I); +void GVN::verifyRemoved(const Instruction *Inst) const { + VN.verifyRemoved(Inst); // Walk through the PHI map to make sure the instruction isn't hiding in there // somewhere. for (PhiMapType::iterator - II = phiMap.begin(), IE = phiMap.end(); II != IE; ++II) { - assert(II->first != I && "Inst is still a key in PHI map!"); + I = phiMap.begin(), E = phiMap.end(); I != E; ++I) { + assert(I->first != Inst && "Inst is still a key in PHI map!"); for (SmallPtrSet::iterator - SI = II->second.begin(), SE = II->second.end(); SI != SE; ++SI) { - assert(*SI != I && "Inst is still a value in PHI map!"); + II = I->second.begin(), IE = I->second.end(); II != IE; ++II) { + assert(*II != Inst && "Inst is still a value in PHI map!"); + } + } + + // Walk through the value number scope to make sure the instruction isn't + // ferreted away in it. 
+ for (DenseMap::iterator + I = localAvail.begin(), E = localAvail.end(); I != E; ++I) { + const ValueNumberScope *VNS = I->second; + + while (VNS) { + for (DenseMap::iterator + II = VNS->table.begin(), IE = VNS->table.end(); II != IE; ++II) { + assert(II->second != Inst && "Inst still in value numbering scope!"); + } + + VNS = VNS->parent; } } } From isanbard at gmail.com Mon Dec 22 16:32:23 2008 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 22 Dec 2008 22:32:23 -0000 Subject: [llvm-commits] [llvm] r61354 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <200812222232.mBMMWN5R026682@zion.cs.uiuc.edu> Author: void Date: Mon Dec 22 16:32:22 2008 New Revision: 61354 URL: http://llvm.org/viewvc/llvm-project?rev=61354&view=rev Log: Comment clean-ups. No functionality change. Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=61354&r1=61353&r2=61354&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Mon Dec 22 16:32:22 2008 @@ -666,7 +666,7 @@ } //===----------------------------------------------------------------------===// -// GVN Pass +// GVN Pass //===----------------------------------------------------------------------===// namespace { @@ -1353,9 +1353,7 @@ return false; } -// GVN::runOnFunction - This is the main transformation entry point for a -// function. -// +/// runOnFunction - This is the main transformation entry point for a function. 
bool GVN::runOnFunction(Function& F) { MD = &getAnalysis(); DT = &getAnalysis(); @@ -1602,7 +1600,7 @@ return Changed || toSplit.size(); } -// iterateOnFunction - Executes one iteration of GVN +/// iterateOnFunction - Executes one iteration of GVN bool GVN::iterateOnFunction(Function &F) { cleanupGlobalSets(); From echristo at apple.com Mon Dec 22 17:26:40 2008 From: echristo at apple.com (Eric Christopher) Date: Mon, 22 Dec 2008 15:26:40 -0800 Subject: [llvm-commits] [llvm] r61239 - in /llvm/trunk: docs/LangRef.html include/llvm/Attributes.h lib/Analysis/BasicAliasAnalysis.cpp test/Analysis/BasicAA/nocapture.ll In-Reply-To: <16590569-0BFE-4B98-B082-A0EE8A8F797B@apple.com> References: <200812190639.mBJ6dMeR028570@zion.cs.uiuc.edu> <494B6A94.3010504@mxc.ca> <16590569-0BFE-4B98-B082-A0EE8A8F797B@apple.com> Message-ID: <01441A6D-6D89-42AC-B691-5332F6ED6CA6@apple.com> On Dec 22, 2008, at 1:09 PM, Eric Christopher wrote: > I'm seeing failures on darwin due to this: > > /// getParamAlignment - Return the alignment for the specified > function > /// parameter. > unsigned getParamAlignment(unsigned Idx) const { > Attributes Align = getAttributes(Idx) & Attribute::Alignment; > if (Align == 0) > return 0; > > return 1ull << ((Align >> 16) - 1); > } > > there's a warning (local to darwin -Wshorten64-to-32) here that we're > truncating the return value from 64 to 32. fwiw the obvious cast to unsigned works, but isn't preferable. What happened to the changes in the patch you were testing? 
-eric From nicholas at mxc.ca Mon Dec 22 18:09:51 2008 From: nicholas at mxc.ca (Nick Lewycky) Date: Mon, 22 Dec 2008 16:09:51 -0800 Subject: [llvm-commits] [llvm] r61297 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll In-Reply-To: References: <200812210019.mBL0JM32028246@zion.cs.uiuc.edu> Message-ID: <49502C4F.2040700@mxc.ca> Devang Patel wrote: > Nick, > > char *s1 = "hi\0how are you"; > char *s2 = "hi\0I am fine "; > > s1 and s2 are identical as per strcmp, but memcmp does not agree. Do > you handle this case? > Yes. GetStringLength is supposed to return the same length that strlen would have returned, stopping at the first null. > - > Devang > > On Dec 20, 2008, at 4:19 PM, Nick Lewycky wrote: > > >> Author: nicholas >> Date: Sat Dec 20 18:19:21 2008 >> New Revision: 61297 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=61297&view=rev >> Log: >> Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2). >> >> Added: >> llvm/trunk/test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll >> Modified: >> llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp >> >> Modified: llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp?rev=61297&r1=61296&r2=61297&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp (original) >> +++ llvm/trunk/lib/Transforms/Scalar/SimplifyLibCalls.cpp Sat Dec 20 >> 18:19:21 2008 >> @@ -80,7 +80,10 @@ >> /// EmitMemChr - Emit a call to the memchr function. This assumes >> that Ptr is >> /// a pointer, Val is an i32 value, and Len is an 'intptr_t' value. >> Value *EmitMemChr(Value *Ptr, Value *Val, Value *Len, IRBuilder<> >> &B); >> - >> + >> + /// EmitMemCmp - Emit a call to the memcmp function.
>> + Value *EmitMemCmp(Value *Ptr1, Value *Ptr2, Value *Len, >> IRBuilder<> &B); >> + >> /// EmitUnaryFloatFnCall - Emit a call to the unary function named >> 'Name' (e.g. >> /// 'floor'). This function is known to take a single of type >> matching 'Op' >> /// and returns one value with the same type. If 'Op' is a long >> double, 'l' >> @@ -106,7 +109,7 @@ >> /// EmitFWrite - Emit a call to the fwrite function. This assumes >> that Ptr is >> /// a pointer, Size is an 'intptr_t', and File is a pointer to FILE. >> void EmitFWrite(Value *Ptr, Value *Size, Value *File, IRBuilder<> >> &B); >> - >> + >> }; >> } // End anonymous namespace. >> >> @@ -151,6 +154,19 @@ >> return B.CreateCall3(MemChr, CastToCStr(Ptr, B), Val, Len, >> "memchr"); >> } >> >> +/// EmitMemCmp - Emit a call to the memcmp function. >> +Value *LibCallOptimization::EmitMemCmp(Value *Ptr1, Value *Ptr2, >> + Value *Len, IRBuilder<> &B) { >> + Module *M = Caller->getParent(); >> + Value *MemCmp = M->getOrInsertFunction("memcmp", >> + Type::Int32Ty, >> + >> PointerType::getUnqual(Type::Int8Ty), >> + >> PointerType::getUnqual(Type::Int8Ty), >> + TD->getIntPtrType(), NULL); >> + return B.CreateCall3(MemCmp, CastToCStr(Ptr1, B), >> CastToCStr(Ptr2, B), >> + Len, "memcmp"); >> +} >> + >> /// EmitUnaryFloatFnCall - Emit a call to the unary function named >> 'Name' (e.g. >> /// 'floor'). This function is known to take a single of type >> matching 'Op' and >> /// returns one value with the same type. If 'Op' is a long double, >> 'l' is >> @@ -537,6 +553,18 @@ >> // strcmp(x, y) -> cnst (if both x and y are constant strings) >> if (HasStr1 && HasStr2) >> return ConstantInt::get(CI->getType(), >> strcmp(Str1.c_str(),Str2.c_str())); >> + >> + // strcmp(P, "x") -> memcmp(P, "x", 2) >> + uint64_t Len1 = GetStringLength(Str1P); >> + uint64_t Len2 = GetStringLength(Str2P); >> + if (Len1 || Len2) { >> + // Choose the smallest Len excluding 0 which means 'unknown'. 
>> + if (!Len1 || (Len2 && Len2 < Len1)) >> + Len1 = Len2; >> + return EmitMemCmp(Str1P, Str2P, >> + ConstantInt::get(TD->getIntPtrType(), >> Len1), B); >> + } >> + >> return 0; >> } >> }; >> >> Added: llvm/trunk/test/Transforms/SimplifyLibCalls/2008-12-20- >> StrcmpMemcmp.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll?rev=61297&view=auto >> >> = >> = >> = >> = >> = >> = >> = >> = >> ====================================================================== >> --- llvm/trunk/test/Transforms/SimplifyLibCalls/2008-12-20- >> StrcmpMemcmp.ll (added) >> +++ llvm/trunk/test/Transforms/SimplifyLibCalls/2008-12-20- >> StrcmpMemcmp.ll Sat Dec 20 18:19:21 2008 >> @@ -0,0 +1,10 @@ >> +; RUN: llvm-as < %s | opt -simplify-libcalls | llvm-dis | grep >> call.*memcmp >> + >> + at .str = internal constant [2 x i8] c"x\00" >> + >> +declare i32 @strcmp(i8* %dest, i8* %src) >> + >> +define i32 @foo(i8* %x, i8* %y) { >> + %A = call i32 @strcmp(i8* %x, i8* getelementptr ([2 x i8]* @.str, >> i32 0, i32 0)) >> + ret i32 %A >> +} >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From gohman at apple.com Mon Dec 22 18:19:21 2008 From: gohman at apple.com (Dan Gohman) Date: Tue, 23 Dec 2008 00:19:21 -0000 Subject: [llvm-commits] [llvm] r61356 - /llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Message-ID: <200812230019.mBN0JLSv029740@zion.cs.uiuc.edu> Author: djg Date: Mon Dec 22 18:19:20 2008 New Revision: 61356 URL: http://llvm.org/viewvc/llvm-project?rev=61356&view=rev Log: Make the fuse-failed debug output human-readable. 
Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=61356&r1=61355&r2=61356&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original) +++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Mon Dec 22 18:19:20 2008 @@ -2029,7 +2029,7 @@ // No fusion if (PrintFailedFusing) - cerr << "We failed to fuse operand " << i << *MI; + cerr << "We failed to fuse operand " << i << " in " << *MI; return NULL; } From resistor at mac.com Mon Dec 22 18:49:52 2008 From: resistor at mac.com (Owen Anderson) Date: Tue, 23 Dec 2008 00:49:52 -0000 Subject: [llvm-commits] [llvm] r61358 - /llvm/trunk/lib/Transforms/Scalar/GVN.cpp Message-ID: <200812230049.mBN0nqHb030580@zion.cs.uiuc.edu> Author: resistor Date: Mon Dec 22 18:49:51 2008 New Revision: 61358 URL: http://llvm.org/viewvc/llvm-project?rev=61358&view=rev Log: Don't forget to remove phi nodes from the value numbering table after we collapse them. 
Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=61358&r1=61357&r2=61358&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Mon Dec 22 18:49:51 2008 @@ -1313,6 +1313,8 @@ p->replaceAllUsesWith(constVal); if (isa(constVal->getType())) MD->invalidateCachedPointerInfo(constVal); + VN.erase(p); + toErase.push_back(p); } else { localAvail[I->getParent()]->table.insert(std::make_pair(num, I)); From clattner at apple.com Mon Dec 22 18:56:36 2008 From: clattner at apple.com (Chris Lattner) Date: Mon, 22 Dec 2008 16:56:36 -0800 Subject: [llvm-commits] [llvm] r61297 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll In-Reply-To: <494FAF06.4050600@gmail.com> References: <200812210019.mBL0JM32028246@zion.cs.uiuc.edu> <494D95DA.1030608@mxc.ca> <494E0148.4030600@gmail.com> <494FAF06.4050600@gmail.com> Message-ID: <3AC2BB5D-7ADC-4A5F-9DC7-E41C6A030E72@apple.com> On Dec 22, 2008, at 7:15 AM, Török Edwin wrote: >>> You need to check the alignment. >> I think something like: >> MinAlignStr = Minimum Alignment of Str1P, Str2P (max 16) >> MinLen = Minimum length of Str1P and Str2P >> AlignMinLen = AlignOf(MinLen) (max 16) >> If (MinAlignStr >= AlignMinLen) -> transform is safe >> > > But try this with llvm-gcc: > #include <string.h> > int foo(const char *s) > { > return !memcmp(s,"x",2); > } > > It produces: > foo: > .Leh_func_begin1: > .Llabel1: > cmpw $120, (%rdi) > sete %al > movzbl %al, %eax > ret > > So is it a bug in MemCmp optimization? hi Edwin, This is a really important question that I don't know the answer to.
My understanding is that memcmp only touches the bytes necessary to make a decision: it is not allowed to touch the full size if unneeded. However, I don't really *know* that, and nothing I find online in a quick scan comes up with an obvious answer. Can you try asking on comp.lang.c or something like that? It would also be interesting to look at the source for various memcmp implementations to decide if they are safe on commonly available systems (worst-case this becomes a target-specific optimization). For example, darwin/ppc contains this comment: // We optimize the compare by doing it word parallel. This introduces // a complication: if we blindly did word loads from both sides until // finding a difference, we might get a spurious page fault by // reading bytes past the difference. To avoid this, we never do a "lwz" // that crosses a page boundary. Darwin/x86 has a similar comment. This implies that the strcmp->memcmp optimization is legal (on darwin at least!), but that "unaligned memcmp" -> i16 load is not (and should be fixed!). -Chris From dalej at apple.com Mon Dec 22 19:27:56 2008 From: dalej at apple.com (Dale Johannesen) Date: Mon, 22 Dec 2008 17:27:56 -0800 Subject: [llvm-commits] [llvm] r61297 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll In-Reply-To: <3AC2BB5D-7ADC-4A5F-9DC7-E41C6A030E72@apple.com> References: <200812210019.mBL0JM32028246@zion.cs.uiuc.edu> <494D95DA.1030608@mxc.ca> <494E0148.4030600@gmail.com> <494FAF06.4050600@gmail.com> <3AC2BB5D-7ADC-4A5F-9DC7-E41C6A030E72@apple.com> Message-ID: On Dec 22, 2008, at 4:56 PM PST, Chris Lattner wrote: > On Dec 22, 2008, at 7:15 AM, Török Edwin wrote: >>>> >> So is it a bug in MemCmp optimization? > > hi Edwin, > > This is a really important question that I don't know the answer to. > My understanding is that memcmp only touches the bytes necessary to > make a decision: it is not allowed to touch the full size if > unneeded.
However, I don't really *know* that, and nothing I find > online in a quick scan comes up with an obvious answer. The standard defines memcmp functionality as: The memcmp function compares the first n characters of the object pointed to by s1 to the first n characters of the object pointed to by s2. It doesn't say anything about comparing fewer than n characters, let alone requiring this under some circumstances. So I don't think it is required, although an implementation may compare fewer under the as-if rule. It does say that accessing one of its arguments beyond the end of an object is undefined behavior; this would presumably always be the case if we're getting a page fault, but that's a user error. But legalese aside, IMO the current darwin implementation is much friendlier to users, and should be left alone. I'm also reasonably sure it will break some things if we change it. From dalej at apple.com Mon Dec 22 19:59:55 2008 From: dalej at apple.com (Dale Johannesen) Date: Tue, 23 Dec 2008 01:59:55 -0000 Subject: [llvm-commits] [llvm] r61361 - in /llvm/trunk: lib/CodeGen/SelectionDAG/DAGCombiner.cpp test/CodeGen/X86/2008-12-22-dagcombine-5.ll Message-ID: <200812230159.mBN1xuxJ032539@zion.cs.uiuc.edu> Author: johannes Date: Mon Dec 22 19:59:54 2008 New Revision: 61361 URL: http://llvm.org/viewvc/llvm-project?rev=61361&view=rev Log: One more permutation of subtracting off a base value.
Added: llvm/trunk/test/CodeGen/X86/2008-12-22-dagcombine-5.ll Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=61361&r1=61360&r2=61361&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Mon Dec 22 19:59:54 2008 @@ -1184,6 +1184,12 @@ N0.getOperand(1).getOperand(0) == N1) return DAG.getNode(ISD::SUB, VT, N0.getOperand(0), N0.getOperand(1).getOperand(1)); + // fold ((A-(B-C))-C) -> A-B + if (N0.getOpcode() == ISD::SUB && + N0.getOperand(1).getOpcode() == ISD::SUB && + N0.getOperand(1).getOperand(1) == N1) + return DAG.getNode(ISD::SUB, VT, N0.getOperand(0), + N0.getOperand(1).getOperand(0)); // fold (sub x, (select cc, 0, c)) -> (select cc, x, (sub, x, c)) if (N1.getOpcode() == ISD::SELECT && N1.getNode()->hasOneUse()) { SDValue Result = combineSelectAndUse(N, N1, N0, DAG, TLI, LegalOperations); Added: llvm/trunk/test/CodeGen/X86/2008-12-22-dagcombine-5.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-12-22-dagcombine-5.ll?rev=61361&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/2008-12-22-dagcombine-5.ll (added) +++ llvm/trunk/test/CodeGen/X86/2008-12-22-dagcombine-5.ll Mon Dec 22 19:59:54 2008 @@ -0,0 +1,14 @@ +; RUN: llvm-as < %s | llc -march=x86 | grep "(%esp)" | count 2 +target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128" +target triple = "i386-apple-darwin9.5" +; -(-a) - a should be found and removed, leaving refs to only L and P +define i32 @test(i32 %a, i32 %L, i32 %P) nounwind { +entry: + %0 = sub i32 %L, %a + %1 = sub i32 %P, %0 + %2 = sub 
i32 %1, %a + br label %return + +return: ; preds = %bb3 + ret i32 %2 +} From dalej at apple.com Mon Dec 22 20:12:53 2008 From: dalej at apple.com (Dale Johannesen) Date: Tue, 23 Dec 2008 02:12:53 -0000 Subject: [llvm-commits] [llvm] r61362 - /llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Message-ID: <200812230212.mBN2CrwP000419@zion.cs.uiuc.edu> Author: johannes Date: Mon Dec 22 20:12:52 2008 New Revision: 61362 URL: http://llvm.org/viewvc/llvm-project?rev=61362&view=rev Log: Fix the time regression I introduced in 464.h264ref with my last patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV*2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IV*scale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. 
And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Also, the mechanism for keeping SCEV's corresponding to GEP's no longer works, as the GEP might change after its SCEV is remembered, invalidating the SCEV, and we might get a bad SCEV value when looking up the GEP again for a later loop. This also couldn't happen before, as we weren't recursing into GEP's outside the loop. I owe some testcases for this, want to get it in for nightly runs. Modified: llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Modified: llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp?rev=61362&r1=61361&r2=61362&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Mon Dec 22 20:12:52 2008 @@ -130,6 +130,12 @@ /// dependent on random ordering of pointers in the process. SmallVector StrideOrder; + /// GEPlist - A list of the GEP's that have been remembered in the SCEV + /// data structures. SCEV does not know to update these when the operands + /// of the GEP are changed, which means we cannot leave them live across + /// loops. + SmallVector GEPlist; + /// CastedValues - As we need to cast values to uintptr_t, this keeps track /// of the casted version of each value. This is accessed by /// getCastedVersionOf. 
@@ -191,7 +197,7 @@ bool FindIVUserForCond(ICmpInst *Cond, IVStrideUse *&CondUse, const SCEVHandle *&CondStride); bool RequiresTypeConversion(const Type *Ty, const Type *NewTy); - int64_t CheckForIVReuse(bool, bool, bool, const SCEVHandle&, + SCEVHandle CheckForIVReuse(bool, bool, bool, const SCEVHandle&, IVExpr&, const Type*, const std::vector& UsersToProcess); bool ValidStride(bool, int64_t, @@ -340,6 +346,7 @@ } SE->setSCEV(GEP, GEPVal); + GEPlist.push_back(GEP); return GEPVal; } @@ -508,14 +515,22 @@ if (isa(User) && Processed.count(User)) continue; - // If this is an instruction defined in a nested loop, or outside this loop, - // don't recurse into it. + // Descend recursively, but not into PHI nodes outside the current loop. + // It's important to see the entire expression outside the loop to get + // choices that depend on addressing mode use right, although we won't + // consider references ouside the loop in all cases. + // If User is already in Processed, we don't want to recurse into it again, + // but do want to record a second reference in the same instruction. 
bool AddUserToIVUsers = false; if (LI->getLoopFor(User->getParent()) != L) { - DOUT << "FOUND USER in other loop: " << *User - << " OF SCEV: " << *ISE << "\n"; - AddUserToIVUsers = true; - } else if (!AddUsersIfInteresting(User, L, Processed)) { + if (isa(User) || Processed.count(User) || + !AddUsersIfInteresting(User, L, Processed)) { + DOUT << "FOUND USER in other loop: " << *User + << " OF SCEV: " << *ISE << "\n"; + AddUserToIVUsers = true; + } + } else if (Processed.count(User) || + !AddUsersIfInteresting(User, L, Processed)) { DOUT << "FOUND USER: " << *User << " OF SCEV: " << *ISE << "\n"; AddUserToIVUsers = true; @@ -704,34 +719,45 @@ PHINode *PN = cast(Inst); for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) { if (PN->getIncomingValue(i) == OperandValToReplace) { - // If this is a critical edge, split the edge so that we do not insert the - // code on all predecessor/successor paths. We do this unless this is the - // canonical backedge for this loop, as this can make some inserted code - // be in an illegal position. - BasicBlock *PHIPred = PN->getIncomingBlock(i); - if (e != 1 && PHIPred->getTerminator()->getNumSuccessors() > 1 && - (PN->getParent() != L->getHeader() || !L->contains(PHIPred))) { - - // First step, split the critical edge. - SplitCriticalEdge(PHIPred, PN->getParent(), P, false); - - // Next step: move the basic block. In particular, if the PHI node - // is outside of the loop, and PredTI is in the loop, we want to - // move the block to be immediately before the PHI block, not - // immediately after PredTI. - if (L->contains(PHIPred) && !L->contains(PN->getParent())) { - BasicBlock *NewBB = PN->getIncomingBlock(i); - NewBB->moveBefore(PN->getParent()); + // If the original expression is outside the loop, put the replacement + // code in the same place as the original expression, + // which need not be an immediate predecessor of this PHI. 
This way we + // need only one copy of it even if it is referenced multiple times in + // the PHI. We don't do this when the original expression is inside the + // loop because multiple copies sometimes do useful sinking of code in that + // case(?). + Instruction *OldLoc = dyn_cast(OperandValToReplace); + if (L->contains(OldLoc->getParent())) { + // If this is a critical edge, split the edge so that we do not insert the + // code on all predecessor/successor paths. We do this unless this is the + // canonical backedge for this loop, as this can make some inserted code + // be in an illegal position. + BasicBlock *PHIPred = PN->getIncomingBlock(i); + if (e != 1 && PHIPred->getTerminator()->getNumSuccessors() > 1 && + (PN->getParent() != L->getHeader() || !L->contains(PHIPred))) { + + // First step, split the critical edge. + SplitCriticalEdge(PHIPred, PN->getParent(), P, false); + + // Next step: move the basic block. In particular, if the PHI node + // is outside of the loop, and PredTI is in the loop, we want to + // move the block to be immediately before the PHI block, not + // immediately after PredTI. + if (L->contains(PHIPred) && !L->contains(PN->getParent())) { + BasicBlock *NewBB = PN->getIncomingBlock(i); + NewBB->moveBefore(PN->getParent()); + } + + // Splitting the edge can reduce the number of PHI entries we have. + e = PN->getNumIncomingValues(); } - - // Splitting the edge can reduce the number of PHI entries we have. - e = PN->getNumIncomingValues(); } - Value *&Code = InsertedCode[PN->getIncomingBlock(i)]; if (!Code) { // Insert the code into the end of the predecessor block. - Instruction *InsertPt = PN->getIncomingBlock(i)->getTerminator(); + Instruction *InsertPt = (L->contains(OldLoc->getParent())) ? + PN->getIncomingBlock(i)->getTerminator() : + OldLoc->getParent()->getTerminator(); Code = InsertCodeForBaseAtPosition(NewBase, Rewriter, InsertPt, L); // Adjust the type back to match the PHI. 
Note that we can't use @@ -1168,7 +1194,11 @@ /// mode scale component and optional base reg. This allows the users of /// this stride to be rewritten as prev iv * factor. It returns 0 if no /// reuse is possible. Factors can be negative on same targets, e.g. ARM. -int64_t LoopStrengthReduce::CheckForIVReuse(bool HasBaseReg, +/// +/// If all uses are outside the loop, we don't require that all multiplies +/// be folded into the addressing mode; a multiply (executed once) outside +/// the loop is better than another IV within. Well, usually. +SCEVHandle LoopStrengthReduce::CheckForIVReuse(bool HasBaseReg, bool AllUsesAreAddresses, bool AllUsesAreOutsideLoop, const SCEVHandle &Stride, @@ -1180,7 +1210,7 @@ ++NewStride) { std::map::iterator SI = IVsByStride.find(StrideOrder[NewStride]); - if (SI == IVsByStride.end()) + if (SI == IVsByStride.end() || !isa(SI->first)) continue; int64_t SSInt = cast(SI->first)->getValue()->getSExtValue(); if (SI->first != Stride && @@ -1202,11 +1232,53 @@ if (II->Base->isZero() && !RequiresTypeConversion(II->Base->getType(), Ty)) { IV = *II; - return Scale; + return SE->getIntegerSCEV(Scale, Stride->getType()); } } + } else if (AllUsesAreOutsideLoop) { + // Accept nonconstant strides here; it is really really right to substitute + // an existing IV if we can. + for (unsigned NewStride = 0, e = StrideOrder.size(); NewStride != e; + ++NewStride) { + std::map::iterator SI = + IVsByStride.find(StrideOrder[NewStride]); + if (SI == IVsByStride.end() || !isa(SI->first)) + continue; + int64_t SSInt = cast(SI->first)->getValue()->getSExtValue(); + if (SI->first != Stride && SSInt != 1) + continue; + for (std::vector::iterator II = SI->second.IVs.begin(), + IE = SI->second.IVs.end(); II != IE; ++II) + // Accept nonzero base here. + // Only reuse previous IV if it would not require a type conversion. + if (!RequiresTypeConversion(II->Base->getType(), Ty)) { + IV = *II; + return Stride; + } + } + // Special case, old IV is -1*x and this one is x. 
Can treat this one as + // -1*old. + for (unsigned NewStride = 0, e = StrideOrder.size(); NewStride != e; + ++NewStride) { + std::map::iterator SI = + IVsByStride.find(StrideOrder[NewStride]); + if (SI == IVsByStride.end()) + continue; + if (SCEVMulExpr *ME = dyn_cast(SI->first)) + if (SCEVConstant *SC = dyn_cast(ME->getOperand(0))) + if (Stride == ME->getOperand(1) && + SC->getValue()->getSExtValue() == -1LL) + for (std::vector::iterator II = SI->second.IVs.begin(), + IE = SI->second.IVs.end(); II != IE; ++II) + // Accept nonzero base here. + // Only reuse previous IV if it would not require type conversion. + if (!RequiresTypeConversion(II->Base->getType(), Ty)) { + IV = *II; + return SE->getIntegerSCEV(-1LL, Stride->getType()); + } + } } - return 0; + return SE->getIntegerSCEV(0, Stride->getType()); } /// PartitionByIsUseOfPostIncrementedValue - Simple boolean predicate that @@ -1357,12 +1429,13 @@ IVExpr ReuseIV(SE->getIntegerSCEV(0, Type::Int32Ty), SE->getIntegerSCEV(0, Type::Int32Ty), 0, 0); - int64_t RewriteFactor = 0; - RewriteFactor = CheckForIVReuse(HaveCommonExprs, AllUsesAreAddresses, + SCEVHandle RewriteFactor = + CheckForIVReuse(HaveCommonExprs, AllUsesAreAddresses, AllUsesAreOutsideLoop, Stride, ReuseIV, CommonExprs->getType(), UsersToProcess); - if (RewriteFactor != 0) { + if (!isa(RewriteFactor) || + !cast(RewriteFactor)->isZero()) { DOUT << "BASED ON IV of STRIDE " << *ReuseIV.Stride << " and BASE " << *ReuseIV.Base << " :\n"; NewPHI = ReuseIV.PHI; @@ -1390,7 +1463,8 @@ Value *CommonBaseV = PreheaderRewriter.expandCodeFor(CommonExprs, PreInsertPt); - if (RewriteFactor == 0) { + if (isa(RewriteFactor) && + cast(RewriteFactor)->isZero()) { // Create a new Phi for this base, and stick it in the loop header. NewPHI = PHINode::Create(ReplacedTy, "iv.", PhiInsertBefore); ++NumInserted; @@ -1537,9 +1611,17 @@ // If we are reusing the iv, then it must be multiplied by a constant // factor take advantage of addressing mode scale component. 
- if (RewriteFactor != 0) { - RewriteExpr = SE->getMulExpr(SE->getIntegerSCEV(RewriteFactor, - RewriteExpr->getType()), + if (!isa(RewriteFactor) || + !cast(RewriteFactor)->isZero()) { + // If we're reusing an IV with a nonzero base (currently this happens + // only when all reuses are outside the loop) subtract that base here. + // The base has been used to initialize the PHI node but we don't want + // it here. + if (!ReuseIV.Base->isZero()) + RewriteExpr = SE->getMinusSCEV(RewriteExpr, ReuseIV.Base); + + // Multiply old variable, with base removed, by new scale factor. + RewriteExpr = SE->getMulExpr(RewriteFactor, RewriteExpr); // The common base is emitted in the loop preheader. But since we @@ -2174,6 +2256,9 @@ IVUsesByStride.clear(); IVsByStride.clear(); StrideOrder.clear(); + for (unsigned i=0; ideleteValueFromRecords(GEPlist[i]); + GEPlist.clear(); // Clean up after ourselves if (!DeadInsts.empty()) { From wangmp at apple.com Mon Dec 22 22:03:29 2008 From: wangmp at apple.com (Mon P Wang) Date: Tue, 23 Dec 2008 04:03:29 -0000 Subject: [llvm-commits] [llvm] r61365 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <200812230403.mBN43Uir003573@zion.cs.uiuc.edu> Author: wangmp Date: Mon Dec 22 22:03:27 2008 New Revision: 61365 URL: http://llvm.org/viewvc/llvm-project?rev=61365&view=rev Log: Fixed code generation for v8i16 and v16i8 splats on X86. Fixed lowering of v8i16 shuffles for v8i16 when we fall back to extract/insert. 
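
The splat fix in this commit can be pictured with a small simulation. This is a sketch under stated assumptions, not the code in X86ISelLowering.cpp: it models only the v16i8 case on a plain byte array, and the helper names unpackSelf, broadcastLane32 and splatV16i8 are hypothetical. Each unpack of the vector with itself doubles every element of the chosen half, so the wanted element has to be tracked as either the low or the high half is kept:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes = std::array<std::uint8_t, 16>;

// Model of punpcklbw/punpckhbw with both operands equal to v: interleave the
// low (or high) eight bytes of v with themselves, doubling each byte.
Bytes unpackSelf(const Bytes &v, bool high) {
  Bytes out{};
  const std::size_t base = high ? 8 : 0;
  for (std::size_t i = 0; i < 8; ++i) {
    out[2 * i] = v[base + i];
    out[2 * i + 1] = v[base + i];
  }
  return out;
}

// Model of pshufd with the same lane index in all four selector fields:
// copy 32-bit lane `lane` into every 32-bit lane.
Bytes broadcastLane32(const Bytes &v, unsigned lane) {
  Bytes out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = v[4 * lane + i % 4];
  return out;
}

// Splat byte eltNo of v across the whole vector.
Bytes splatV16i8(Bytes v, unsigned eltNo) {
  assert(eltNo < 16 && "element index out of range");
  unsigned numElems = 16;
  while (numElems > 4) {
    // Keep the half that holds the wanted element, then renumber the
    // element relative to that half.
    const bool high = eltNo >= numElems / 2;
    if (high)
      eltNo -= numElems / 2;
    v = unpackSelf(v, high);
    numElems >>= 1;
  }
  // Four lanes remain, each holding one source byte repeated four times.
  return broadcastLane32(v, eltNo);
}
```

The final lane index is in the range 0 to 3, which corresponds to a pshufd immediate of 0, 85, 170 or 255 (the index repeated in all four 2-bit fields).
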
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=61365&r1=61364&r2=61365&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Dec 22 22:03:27 2008 @@ -2554,6 +2554,23 @@ return cast(ElementBase)->getZExtValue() < NumElems; } +/// getSplatMaskEltNo - Given a splat mask, return the index to the element +/// we want to splat. +static SDValue getSplatMaskEltNo(SDNode *N) { + assert(isSplatMask(N) && "Not a splat mask"); + unsigned NumElems = N->getNumOperands(); + SDValue ElementBase; + unsigned i = 0; + for (; i != NumElems; ++i) { + SDValue Elt = N->getOperand(i); + if (isa(Elt)) + return Elt; + } + assert(0 && " No splat value found!"); + return SDValue(); +} + + /// isSplatMask - Return true if the specified VECTOR_SHUFFLE operand specifies /// a splat of a single element and it's a 2 or 4 element mask. bool X86::isSplatMask(SDNode *N) { @@ -3000,15 +3017,26 @@ return Op; SDValue V1 = Op.getOperand(0); SDValue Mask = Op.getOperand(2); - unsigned NumElems = Mask.getNumOperands(); + unsigned MaskNumElems = Mask.getNumOperands(); + unsigned NumElems = MaskNumElems; // Special handling of v4f32 -> v4i32. if (VT != MVT::v4f32) { - Mask = getUnpacklMask(NumElems, DAG); + // Find which element we want to splat. 
+ SDNode* EltNoNode = getSplatMaskEltNo(Mask.getNode()).getNode(); + unsigned EltNo = cast(EltNoNode)->getZExtValue(); + // unpack elements to the correct location while (NumElems > 4) { + if (EltNo < NumElems/2) { + Mask = getUnpacklMask(MaskNumElems, DAG); + } else { + Mask = getUnpackhMask(MaskNumElems, DAG); + EltNo -= NumElems/2; + } V1 = DAG.getNode(ISD::VECTOR_SHUFFLE, VT, V1, V1, Mask); NumElems >>= 1; } - Mask = getZeroVector(MVT::v4i32, true, DAG); + SDValue Cst = DAG.getConstant(EltNo, MVT::i32); + Mask = DAG.getNode(ISD::BUILD_VECTOR, MVT::v4i32, Cst, Cst, Cst, Cst); } V1 = DAG.getNode(ISD::BIT_CONVERT, PVT, V1); @@ -3661,8 +3689,10 @@ ++V2InOrder; } else if (EltIdx < 8) { V1Elts.push_back(Elt); + V2Elts.push_back(DAG.getConstant(i+8, MaskEVT)); ++V1FromV1; } else { + V1Elts.push_back(Elt); V2Elts.push_back(DAG.getConstant(EltIdx-8, MaskEVT)); ++V2FromV2; } From wangmp at apple.com Mon Dec 22 22:05:10 2008 From: wangmp at apple.com (Mon P Wang) Date: Tue, 23 Dec 2008 04:05:10 -0000 Subject: [llvm-commits] [llvm] r61366 - in /llvm/trunk/test/CodeGen/X86: vec_shuffle-28.ll vec_splat-3.ll vec_splat-4.ll Message-ID: <200812230405.mBN45AaS003652@zion.cs.uiuc.edu> Author: wangmp Date: Mon Dec 22 22:05:08 2008 New Revision: 61366 URL: http://llvm.org/viewvc/llvm-project?rev=61366&view=rev Log: Added shuffle and splat test cases for r61365. 
Added: llvm/trunk/test/CodeGen/X86/vec_shuffle-28.ll llvm/trunk/test/CodeGen/X86/vec_splat-3.ll llvm/trunk/test/CodeGen/X86/vec_splat-4.ll Added: llvm/trunk/test/CodeGen/X86/vec_shuffle-28.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vec_shuffle-28.ll?rev=61366&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/vec_shuffle-28.ll (added) +++ llvm/trunk/test/CodeGen/X86/vec_shuffle-28.ll Mon Dec 22 22:05:08 2008 @@ -0,0 +1,33 @@ +; RUN: llvm-as < %s | llc -march=x86 -mattr=sse41 -o %t -f +; RUN: grep punpcklwd %t | count 1 +; RUN: grep pextrw %t | count 8 +; RUN: grep pinsrw %t | count 8 + + +; Pack various elements via shuffles. +define <8 x i16> @shuf1(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp7 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 1, i32 8, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef , i32 undef > + ret <8 x i16> %tmp7 +} + + +define <8 x i16> @shuf2(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp8 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 undef, i32 undef, i32 7, i32 2, i32 8, i32 undef, i32 undef , i32 undef > + ret <8 x i16> %tmp8 +} + + +define <8 x i16> @shuf3(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp9 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 0, i32 1, i32 undef, i32 undef, i32 3, i32 11, i32 undef , i32 undef > + ret <8 x i16> %tmp9 +} + + +define <8 x i16> @shuf4(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp9 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 8, i32 9, i32 undef, i32 undef, i32 11, i32 3, i32 undef , i32 undef > + ret <8 x i16> %tmp9 +} Added: llvm/trunk/test/CodeGen/X86/vec_splat-3.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vec_splat-3.ll?rev=61366&view=auto ============================================================================== --- 
llvm/trunk/test/CodeGen/X86/vec_splat-3.ll (added) +++ llvm/trunk/test/CodeGen/X86/vec_splat-3.ll Mon Dec 22 22:05:08 2008 @@ -0,0 +1,55 @@ +; RUN: llvm-as < %s | llc -march=x86 -mattr=sse41 -o %t -f +; RUN: grep punpcklwd %t | count 4 +; RUN: grep punpckhwd %t | count 4 +; RUN: grep "pshufd" %t | count 8 + +; Splat test for v8i16 +; Should generate with pshufd with masks $0, $85, $170, $255 (each mask is used twice) +define <8 x i16> @shuf_8i16_0(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 0, i32 undef, i32 undef, i32 0, i32 undef, i32 undef, i32 undef , i32 undef > + ret <8 x i16> %tmp6 +} + +define <8 x i16> @shuf_8i16_1(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 1, i32 1, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef , i32 undef > + ret <8 x i16> %tmp6 +} + +define <8 x i16> @shuf_8i16_2(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 2, i32 undef, i32 undef, i32 2, i32 undef, i32 2, i32 undef , i32 undef > + ret <8 x i16> %tmp6 +} + +define <8 x i16> @shuf_8i16_3(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 3, i32 3, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef , i32 undef > + ret <8 x i16> %tmp6 +} + +define <8 x i16> @shuf_8i16_4(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 4, i32 undef, i32 undef, i32 undef, i32 4, i32 undef, i32 undef , i32 undef > + ret <8 x i16> %tmp6 +} + +define <8 x i16> @shuf_8i16_5(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 5, i32 undef, i32 undef, i32 5, i32 undef, i32 undef, i32 undef , i32 undef > + ret <8 x i16> %tmp6 +} 
+ +define <8 x i16> @shuf_8i16_6(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 6, i32 6, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef , i32 undef > + ret <8 x i16> %tmp6 +} + + +define <8 x i16> @shuf_8i16_7(<8 x i16> %T0, <8 x i16> %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <8 x i16> %T0, <8 x i16> %T1, <8 x i32> < i32 7, i32 undef, i32 undef, i32 7, i32 undef, i32 undef, i32 undef , i32 undef > + ret <8 x i16> %tmp6 +} Added: llvm/trunk/test/CodeGen/X86/vec_splat-4.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vec_splat-4.ll?rev=61366&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/vec_splat-4.ll (added) +++ llvm/trunk/test/CodeGen/X86/vec_splat-4.ll Mon Dec 22 22:05:08 2008 @@ -0,0 +1,104 @@ +; RUN: llvm-as < %s | llc -march=x86 -mattr=sse41 -o %t -f +; RUN: grep punpcklbw %t | count 16 +; RUN: grep punpckhbw %t | count 16 +; RUN: grep "pshufd" %t | count 16 + +; Should generate with pshufd with masks $0, $85, $170, $255 (each mask is used 4 times) + +; Splat test for v16i8 +define <16 x i8 > @shuf_16i8_0(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 0, i32 undef, i32 undef, i32 0, i32 undef, i32 0, i32 0 , i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_1(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 1, i32 1, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef , i32 undef, i32 undef, i32 undef, i32 undef, i32 undef , i32 undef, i32 undef, i32 undef, i32 undef > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_2(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > 
%T0, <16 x i8 > %T1, <16 x i32> < i32 2, i32 undef, i32 undef, i32 2, i32 undef, i32 2, i32 2 , i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_3(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 3, i32 undef, i32 undef, i32 3, i32 undef, i32 3, i32 3 , i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3 > + ret <16 x i8 > %tmp6 +} + + +define <16 x i8 > @shuf_16i8_4(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 4, i32 undef, i32 undef, i32 undef, i32 4, i32 undef, i32 undef , i32 undef, i32 undef, i32 undef, i32 undef , i32 undef, i32 undef, i32 undef, i32 undef , i32 undef > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_5(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 5, i32 undef, i32 undef, i32 5, i32 undef, i32 5, i32 5 , i32 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5, i32 5 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_6(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 6, i32 undef, i32 undef, i32 6, i32 undef, i32 6, i32 6 , i32 6, i32 6, i32 6, i32 6, i32 6, i32 6, i32 6, i32 6, i32 6 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_7(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 7, i32 undef, i32 undef, i32 7, i32 undef, i32 undef, i32 undef , i32 undef, i32 undef, i32 undef, i32 undef , i32 undef , i32 undef, i32 undef, i32 undef , i32 undef > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_8(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 
> %T1, <16 x i32> < i32 8, i32 undef, i32 undef, i32 8, i32 undef, i32 8, i32 8 , i32 8, i32 8, i32 8, i32 8, i32 8, i32 8, i32 8, i32 8, i32 8 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_9(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 9, i32 undef, i32 undef, i32 9, i32 undef, i32 9, i32 9 , i32 9, i32 9, i32 9, i32 9, i32 9, i32 9, i32 9, i32 9, i32 9 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_10(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 10, i32 undef, i32 undef, i32 10, i32 undef, i32 10, i32 10 , i32 10, i32 10, i32 10, i32 10, i32 10, i32 10, i32 10, i32 10, i32 10 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_11(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 11, i32 undef, i32 undef, i32 11, i32 undef, i32 11, i32 11 , i32 11, i32 11, i32 11, i32 11, i32 11, i32 11, i32 11, i32 11, i32 11 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_12(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 12, i32 undef, i32 undef, i32 12, i32 undef, i32 12, i32 12 , i32 12, i32 12, i32 12, i32 12, i32 12, i32 12, i32 12, i32 12, i32 12 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_13(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 13, i32 undef, i32 undef, i32 13, i32 undef, i32 13, i32 13 , i32 13, i32 13, i32 13, i32 13, i32 13, i32 13, i32 13, i32 13, i32 13 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_14(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 14, i32 undef, i32 undef, 
i32 14, i32 undef, i32 14, i32 14 , i32 14, i32 14, i32 14, i32 14, i32 14, i32 14, i32 14, i32 14, i32 14 > + ret <16 x i8 > %tmp6 +} + +define <16 x i8 > @shuf_16i8_15(<16 x i8 > %T0, <16 x i8 > %T1) nounwind readnone { +entry: + %tmp6 = shufflevector <16 x i8 > %T0, <16 x i8 > %T1, <16 x i32> < i32 15, i32 undef, i32 undef, i32 15, i32 undef, i32 15, i32 15 , i32 15, i32 15, i32 15, i32 15, i32 15, i32 15, i32 15, i32 15, i32 15 > + ret <16 x i8 > %tmp6 +} From nicholas at mxc.ca Mon Dec 22 23:20:34 2008 From: nicholas at mxc.ca (Nick Lewycky) Date: Mon, 22 Dec 2008 21:20:34 -0800 Subject: [llvm-commits] [llvm] r61239 - in /llvm/trunk: docs/LangRef.html include/llvm/Attributes.h lib/Analysis/BasicAliasAnalysis.cpp test/Analysis/BasicAA/nocapture.ll In-Reply-To: <01441A6D-6D89-42AC-B691-5332F6ED6CA6@apple.com> References: <200812190639.mBJ6dMeR028570@zion.cs.uiuc.edu> <494B6A94.3010504@mxc.ca> <16590569-0BFE-4B98-B082-A0EE8A8F797B@apple.com> <01441A6D-6D89-42AC-B691-5332F6ED6CA6@apple.com> Message-ID: <49507522.8080107@mxc.ca> Eric Christopher wrote: > On Dec 22, 2008, at 1:09 PM, Eric Christopher wrote: > > >> I'm seeing failures on darwin due to this: >> >> /// getParamAlignment - Return the alignment for the specified >> function >> /// parameter. >> unsigned getParamAlignment(unsigned Idx) const { >> Attributes Align = getAttributes(Idx) & Attribute::Alignment; >> if (Align == 0) >> return 0; >> >> return 1ull << ((Align >> 16) - 1); >> } >> >> there's a warning (local to darwin -Wshorten64-to-32) here that we're >> truncating the return value from 64 to 32. >> > > fwiw the obvious cast to unsigned works, but isn't preferable. > The alignment is currently stored as 5 bits, which can only produce a 32-bit value. Anything wrong with the blunt cast in this case? Feel free to commit that change if it unbreaks your stuff, I won't get the chance to until end of day tomorrow. > What happened to the changes in the patch you were testing? 
>
Sorry, but I'm not sure what you mean here?

Nick

From xuzhongxing at gmail.com Mon Dec 22 23:30:52 2008
From: xuzhongxing at gmail.com (Zhongxing Xu)
Date: Tue, 23 Dec 2008 05:30:52 -0000
Subject: [llvm-commits] [llvm] r61368 - /llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp
Message-ID: <200812230530.mBN5UsUF006040@zion.cs.uiuc.edu>

Author: zhongxingxu
Date: Mon Dec 22 23:30:44 2008
New Revision: 61368

URL: http://llvm.org/viewvc/llvm-project?rev=61368&view=rev
Log:
Remove dead code.

Modified:
    llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp

Modified: llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp?rev=61368&r1=61367&r2=61368&view=diff

==============================================================================
--- llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp (original)
+++ llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp Mon Dec 22 23:30:44 2008
@@ -85,7 +85,7 @@

       case bitc::END_BLOCK: {
         bool x = Stream.ReadBlockEnd();
-        assert(!x && "Error at block end."); x=x;
+        assert(!x && "Error at block end.");
         BlockStack.pop_back();
         continue;
       }

From nicholas at mxc.ca Mon Dec 22 23:36:48 2008
From: nicholas at mxc.ca (Nick Lewycky)
Date: Mon, 22 Dec 2008 21:36:48 -0800
Subject: [llvm-commits] [llvm] r61368 - /llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp
In-Reply-To: <200812230530.mBN5UsUF006040@zion.cs.uiuc.edu>
References: <200812230530.mBN5UsUF006040@zion.cs.uiuc.edu>
Message-ID: <495078F0.7020403@mxc.ca>

That's there to prevent us from emitting a warning about an unused variable
'x' in release builds. Please revert.

Nick

Zhongxing Xu wrote:
> Author: zhongxingxu
> Date: Mon Dec 22 23:30:44 2008
> New Revision: 61368
>
> URL: http://llvm.org/viewvc/llvm-project?rev=61368&view=rev
> Log:
> Remove dead code.
> > Modified: > llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp > > Modified: llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp?rev=61368&r1=61367&r2=61368&view=diff > > ============================================================================== > --- llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp (original) > +++ llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp Mon Dec 22 23:30:44 2008 > @@ -85,7 +85,7 @@ > > case bitc::END_BLOCK: { > bool x = Stream.ReadBlockEnd(); > - assert(!x && "Error at block end."); x=x; > + assert(!x && "Error at block end."); > BlockStack.pop_back(); > continue; > } > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From xuzhongxing at gmail.com Mon Dec 22 23:42:22 2008 From: xuzhongxing at gmail.com (Zhongxing Xu) Date: Tue, 23 Dec 2008 13:42:22 +0800 Subject: [llvm-commits] [llvm] r61368 - /llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp In-Reply-To: <495078F0.7020403@mxc.ca> References: <200812230530.mBN5UsUF006040@zion.cs.uiuc.edu> <495078F0.7020403@mxc.ca> Message-ID: <5400aeb80812222142m7245aefi2e09a8b9cf6e1763@mail.gmail.com> Sorry. I'll revert. On Tue, Dec 23, 2008 at 1:36 PM, Nick Lewycky wrote: > That's there to prevent us from emitting a warning about an unused variable > 'x' in release builds. Please revert. > > Nick > > > Zhongxing Xu wrote: > >> Author: zhongxingxu >> Date: Mon Dec 22 23:30:44 2008 >> New Revision: 61368 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=61368&view=rev >> Log: >> Remove dead code. 
>> >> Modified: >> llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp >> >> Modified: llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp >> URL: >> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp?rev=61368&r1=61367&r2=61368&view=diff >> >> >> ============================================================================== >> --- llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp (original) >> +++ llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp Mon Dec 22 23:30:44 2008 >> @@ -85,7 +85,7 @@ >> case bitc::END_BLOCK: { >> bool x = Stream.ReadBlockEnd(); >> - assert(!x && "Error at block end."); x=x; >> + assert(!x && "Error at block end."); >> BlockStack.pop_back(); >> continue; >> } >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> >> > > From xuzhongxing at gmail.com Mon Dec 22 23:44:00 2008 From: xuzhongxing at gmail.com (Zhongxing Xu) Date: Tue, 23 Dec 2008 05:44:00 -0000 Subject: [llvm-commits] [llvm] r61369 - /llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp Message-ID: <200812230544.mBN5i1Q6006423@zion.cs.uiuc.edu> Author: zhongxingxu Date: Mon Dec 22 23:43:56 2008 New Revision: 61369 URL: http://llvm.org/viewvc/llvm-project?rev=61369&view=rev Log: revert r61368. 
Modified: llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp Modified: llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp?rev=61369&r1=61368&r2=61369&view=diff ============================================================================== --- llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp (original) +++ llvm/trunk/lib/Bitcode/Reader/Deserialize.cpp Mon Dec 22 23:43:56 2008 @@ -85,7 +85,7 @@ case bitc::END_BLOCK: { bool x = Stream.ReadBlockEnd(); - assert(!x && "Error at block end."); + assert(!x && "Error at block end."); x=x; BlockStack.pop_back(); continue; } From echristo at apple.com Tue Dec 23 04:10:14 2008 From: echristo at apple.com (Eric Christopher) Date: Tue, 23 Dec 2008 02:10:14 -0800 Subject: [llvm-commits] [llvm] r61239 - in /llvm/trunk: docs/LangRef.html include/llvm/Attributes.h lib/Analysis/BasicAliasAnalysis.cpp test/Analysis/BasicAA/nocapture.ll In-Reply-To: <49507522.8080107@mxc.ca> References: <200812190639.mBJ6dMeR028570@zion.cs.uiuc.edu> <494B6A94.3010504@mxc.ca> <16590569-0BFE-4B98-B082-A0EE8A8F797B@apple.com> <01441A6D-6D89-42AC-B691-5332F6ED6CA6@apple.com> <49507522.8080107@mxc.ca> Message-ID: >> >> fwiw the obvious cast to unsigned works, but isn't preferable. >> > The alignment is currently stored as 5 bits, which can only produce > a 32-bit value. Anything wrong with the blunt cast in this case? > Not really I guess, no. Of course could just use 1 << ...? > Feel free to commit that change if it unbreaks your stuff, I won't > get the chance to until end of day tomorrow. >> What happened to the changes in the patch you were testing? >> > Sorry, but I'm not sure what you mean here? Some patch that went between you and Bill had a completely different one. 
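[Editorial sketch, not part of the thread.] The point about the 5-bit alignment field can be illustrated as follows. This is a minimal standalone sketch, assuming the encoding described above: a 5-bit field starting at bit 16 that holds log2(alignment) + 1, with 0 meaning no alignment is recorded. The names kAlignShift, kAlignMask and decodeParamAlignment are hypothetical and are not the LLVM API. Because the encoded value is at most 31, the shift count is at most 30, so a plain 32-bit `1u << ...` suffices and neither `1ull` nor a narrowing cast is needed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the attribute layout discussed in the thread:
// a 5-bit field at bit 16 holding log2(alignment) + 1; 0 means "unset".
constexpr uint32_t kAlignShift = 16;
constexpr uint32_t kAlignMask = 31u << kAlignShift;

// Decode the alignment without any 64-bit intermediate value.
unsigned decodeParamAlignment(uint32_t Attrs) {
  uint32_t Encoded = (Attrs & kAlignMask) >> kAlignShift;
  if (Encoded == 0)
    return 0;
  // Encoded is at most 31, so the shift count is at most 30 and the
  // result always fits in an unsigned 32-bit value.
  assert(Encoded <= 31 && "alignment field is only 5 bits wide");
  return 1u << (Encoded - 1);
}
```

Under these assumptions the largest representable alignment is 1 << 30, which is why truncating the 64-bit result is harmless in practice, although computing in 32 bits from the start avoids the warning altogether.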
-eric From edwintorok at gmail.com Tue Dec 23 06:17:32 2008 From: edwintorok at gmail.com (Török Edwin) Date: Tue, 23 Dec 2008 14:17:32 +0200 Subject: [llvm-commits] [llvm] r61297 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll In-Reply-To: <3AC2BB5D-7ADC-4A5F-9DC7-E41C6A030E72@apple.com> References: <200812210019.mBL0JM32028246@zion.cs.uiuc.edu> <494D95DA.1030608@mxc.ca> <494E0148.4030600@gmail.com> <494FAF06.4050600@gmail.com> <3AC2BB5D-7ADC-4A5F-9DC7-E41C6A030E72@apple.com> Message-ID: <4950D6DC.9040802@gmail.com> On 2008-12-23 02:56, Chris Lattner wrote: > On Dec 22, 2008, at 7:15 AM, Török Edwin wrote: > >>>> You need to check the alignment. >>>> >>> I think something like: >>> MinAlignStr = Minimum Alignment of Str1P, Str2P (max 16) >>> MinLen = Minimum length of Str1P and Str2P >>> AlignMinLen = AlignOf(MinLen) (max 16) >>> If (MinAlignStr >= AlignMinLen) -> transform is safe >>> >>> >> But try this with llvm-gcc: >> #include <string.h> >> int foo(const char *s) >> { >> return !memcmp(s,"x",2); >> } >> >> It produces: >> foo: >> .Leh_func_begin1: >> .Llabel1: >> cmpw $120, (%rdi) >> sete %al >> movzbl %al, %eax >> ret >> >> So is it a bug in MemCmp optimization? >> > > hi Edwin, > > This is a really important question that I don't know the answer to. > My understanding is that memcmp only touches the bytes necessary to > make a decision: it is not allowed to touch the full size if > unneeded. However, I don't really *know* that, and nothing I find > online in a quick scan comes up with an obvious answer. > > Can you try asking on comp.lang.c or something like that? 
I found 2 messages on comp.lang.c about this: http://groups.google.com/group/comp.lang.c/msg/28fcc4a0e29e5339 http://groups.google.com/group/comp.lang.c/msg/786c9f1cb2e9b085 " /* or can probably use memcmp since in practice it will safely stop on or near null != nonnull and return nonmatch, but not AFAICT guaranteed, and being not str* may alarm some */ You seem to be correct - "7.21.4.1 The memcmp function" doesn't require it to stop as long as it ultimately returns the correct result - but it's in the implementation's interests to make it as efficient as possible." > It would > also be interesting to look at the source for various memcmp > implementations glibc: /* The strategy of this memcmp is: 1. Compare bytes until one of the block pointers is aligned. 2. Compare using memcmp_common_alignment or memcmp_not_common_alignment, regarding the alignment of the other block after the initial byte operations. The maximum number of full words (of type op_t) are compared in this way. 3. Compare the few remaining bytes. */ However in memcmp_not_common_alignment: /* memcmp_not_common_alignment -- Compare blocks at SRCP1 and SRCP2 with LEN `op_t' objects (not LEN bytes!). SRCP2 should be aligned for memory operations on `op_t', but SRCP1 *should be unaligned*. */ switch (len % 4) { case 2: a1 = ((op_t *) srcp1)[0]; a2 = ((op_t *) srcp1)[1]; ... } It does a 2-byte read on srcp1, which is unaligned, so I think it may cross a page boundary if we aren't allowed to do 2-byte reads on all the region we pass to memcmp. > to decide if they are safe on commonly available > systems (worst-case this becomes a target-specific optimization). > How about using the alignment and length of the string to determine that memcmp won't cross a page boundary, even with an implementation that does 8-byte reads always? Alternatively we may align the operands with some extra code inserted. 
I remember seeing valgrind warnings with older versions about strcmp, and valgrind has --partial-loads-ok, so I guess strcmp already does loads that cross the boundary of one of its arguments, if it knows that it is safe to do so (i.e. it won't cross page boundaries). Best regards, --Edwin From edwintorok at gmail.com Tue Dec 23 06:19:33 2008 From: edwintorok at gmail.com (Török Edwin) Date: Tue, 23 Dec 2008 14:19:33 +0200 Subject: [llvm-commits] [llvm] r61297 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll In-Reply-To: <4950D6DC.9040802@gmail.com> References: <200812210019.mBL0JM32028246@zion.cs.uiuc.edu> <494D95DA.1030608@mxc.ca> <494E0148.4030600@gmail.com> <494FAF06.4050600@gmail.com> <3AC2BB5D-7ADC-4A5F-9DC7-E41C6A030E72@apple.com> <4950D6DC.9040802@gmail.com> Message-ID: <4950D755.8060701@gmail.com> On 2008-12-23 14:17, Török Edwin wrote: > On 2008-12-23 02:56, Chris Lattner wrote: > >> On Dec 22, 2008, at 7:15 AM, Török Edwin wrote: >> >> >>>>> You need to check the alignment. >>>>> >>>>> >>>> I think something like: >>>> MinAlignStr = Minimum Alignment of Str1P, Str2P (max 16) >>>> MinLen = Minimum length of Str1P and Str2P >>>> AlignMinLen = AlignOf(MinLen) (max 16) >>>> If (MinAlignStr >= AlignMinLen) -> transform is safe >>>> >>>> >>>> >>> But try this with llvm-gcc: >>> #include <string.h> >>> int foo(const char *s) >>> { >>> return !memcmp(s,"x",2); >>> } >>> >>> It produces: >>> foo: >>> .Leh_func_begin1: >>> .Llabel1: >>> cmpw $120, (%rdi) >>> sete %al >>> movzbl %al, %eax >>> ret >>> >>> So is it a bug in MemCmp optimization? >>> >>> >> hi Edwin, >> >> This is a really important question that I don't know the answer to. >> My understanding is that memcmp only touches the bytes necessary to >> make a decision: it is not allowed to touch the full size if >> unneeded. 
However, I don't really *know* that, and nothing I find >> online in a quick scan comes up with an obvious answer. >> >> Can you try asking on comp.lang.c or something like that? >> > > I found 2 messages on comp.lang.c about this: > http://groups.google.com/group/comp.lang.c/msg/28fcc4a0e29e5339 > http://groups.google.com/group/comp.lang.c/msg/786c9f1cb2e9b085 > > " /* or can probably use memcmp since in practice it will safely > stop on or near null != nonnull and return nonmatch, but not > AFAICT guaranteed, and being not str* may alarm some */ > > You seem to be correct - "7.21.4.1 The memcmp function" doesn't require it > to stop as long as it ultimately returns the correct result - but it's in > the implementation's interests to make it as efficient as possible." > > >> It would >> also be interesting to look at the source for various memcmp >> implementations >> > > glibc: > /* The strategy of this memcmp is: > > 1. Compare bytes until one of the block pointers is aligned. > > 2. Compare using memcmp_common_alignment or > memcmp_not_common_alignment, regarding the alignment of the other > block after the initial byte operations. The maximum number of > full words (of type op_t) are compared in this way. > > 3. Compare the few remaining bytes. */ > > However in memcmp_not_common_alignment: > /* memcmp_not_common_alignment -- Compare blocks at SRCP1 and SRCP2 with LEN > `op_t' objects (not LEN bytes!). SRCP2 should be aligned for memory > operations on `op_t', but SRCP1 *should be unaligned*. */ > switch (len % 4) { > case 2: > a1 = ((op_t *) srcp1)[0]; > a2 = ((op_t *) srcp1)[1]; > ... > } > > It does a 2-byte read on srcp1, which is unaligned, so I think it may > cross a page boundary if we aren't allowed to do 2-byte reads on all the > region we pass to memcmp. > > > >> to decide if they are safe on commonly available >> systems (worst-case this becomes a target-specific optimization). 
>> >> > > How about using the alignment and length of the string to determine that > memcmp won't cross a page boundary, even > with an implementation that does 8-byte reads always? > Alternatively we may align the operands with some extra code inserted. > > I remember seeing valgrind warnings with older versions about strcmp, > and valgrind has --partial-loads-ok, so > I guess strcmp already does loads that cross the boundary of one of its > arguments, if it knows that it is safe to do so (i.e. it won't cross > page boundaries). [sorry, hit Send button too soon] We should also add a testcase to test/ or in test-suite/ that specifically tests these corner cases. We can allocate a single page with mmap, so that the OS will kill the program if it crosses page boundary. Best regards, --Edwin From espindola at google.com Tue Dec 23 10:41:31 2008 From: espindola at google.com (Rafael Espindola) Date: Tue, 23 Dec 2008 16:41:31 +0000 Subject: [llvm-commits] [llvm-gcc-4.2] r61128 - /llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp In-Reply-To: <38a0d8450812220149v79bfe823q94b5f1d0db64aa6c@mail.gmail.com> References: <200812170738.mBH7cG9Y013859@zion.cs.uiuc.edu> <38a0d8450812191738v2250c2e1l799b69e6e8c36fa6@mail.gmail.com> <3D4B9BAD-94A9-43BC-AB53-FE4BB9D1372A@apple.com> <38a0d8450812220149v79bfe823q94b5f1d0db64aa6c@mail.gmail.com> Message-ID: <38a0d8450812230841l6e4e8184w949ceca7d3859f62@mail.gmail.com> > My current fix is to change d1 to be of pointer type. I will try to > get the change to FD_ZERO upstream. I posted a patch to libc (http://sourceware.org/ml/libc-alpha/2008-12/msg00064.html), but it got rejected. I will keep a local patch. Posting the link here in case someone else has the same problem. 
Cheers, -- Rafael Avila de Espindola Google | Gordon House | Barrow Street | Dublin 4 | Ireland Registered in Dublin, Ireland | Registration Number: 368047 From clattner at apple.com Tue Dec 23 11:14:40 2008 From: clattner at apple.com (Chris Lattner) Date: Tue, 23 Dec 2008 09:14:40 -0800 Subject: [llvm-commits] [llvm-gcc-4.2] r61128 - /llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp In-Reply-To: <38a0d8450812230841l6e4e8184w949ceca7d3859f62@mail.gmail.com> References: <200812170738.mBH7cG9Y013859@zion.cs.uiuc.edu> <38a0d8450812191738v2250c2e1l799b69e6e8c36fa6@mail.gmail.com> <3D4B9BAD-94A9-43BC-AB53-FE4BB9D1372A@apple.com> <38a0d8450812220149v79bfe823q94b5f1d0db64aa6c@mail.gmail.com> <38a0d8450812230841l6e4e8184w949ceca7d3859f62@mail.gmail.com> Message-ID: On Dec 23, 2008, at 8:41 AM, Rafael Espindola wrote: >> My current fix is to change d1 to be of pointer type. I will try to >> get the change to FD_ZERO upstream. > > I posted a patch to libc > (http://sourceware.org/ml/libc-alpha/2008-12/msg00064.html), but it > got rejected. I will keep a local patch. Posting the link here in case > someone else has the same problem. Wow, that is quite the strong response. Evan, is it possible to handle this by having llvm-gcc do the appropriate promotion when building the asm? Breaking FD_ZERO seems unacceptable :( -Chris From gohman at apple.com Tue Dec 23 11:22:37 2008 From: gohman at apple.com (Dan Gohman) Date: Tue, 23 Dec 2008 17:22:37 -0000 Subject: [llvm-commits] [llvm] r61371 - /llvm/trunk/lib/CodeGen/ScheduleDAG.cpp Message-ID: <200812231722.mBNHMdpJ005552@zion.cs.uiuc.edu> Author: djg Date: Tue Dec 23 11:22:32 2008 New Revision: 61371 URL: http://llvm.org/viewvc/llvm-project?rev=61371&view=rev Log: Minor code simplifications. 
Modified: llvm/trunk/lib/CodeGen/ScheduleDAG.cpp Modified: llvm/trunk/lib/CodeGen/ScheduleDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ScheduleDAG.cpp?rev=61371&r1=61370&r2=61371&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/ScheduleDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/ScheduleDAG.cpp Tue Dec 23 11:22:32 2008 @@ -177,7 +177,7 @@ void SUnit::ComputeDepth() { SmallVector WorkList; WorkList.push_back(this); - while (!WorkList.empty()) { + do { SUnit *Cur = WorkList.back(); bool Done = true; @@ -202,7 +202,7 @@ } Cur->isDepthCurrent = true; } - } + } while (!WorkList.empty()); } /// ComputeHeight - Calculate the maximal path from the node to the entry. @@ -210,7 +210,7 @@ void SUnit::ComputeHeight() { SmallVector WorkList; WorkList.push_back(this); - while (!WorkList.empty()) { + do { SUnit *Cur = WorkList.back(); bool Done = true; @@ -235,7 +235,7 @@ } Cur->isHeightCurrent = true; } - } + } while (!WorkList.empty()); } /// SUnit - Scheduling unit. 
It's an wrapper around either a single SDNode or @@ -467,7 +467,7 @@ WorkList.reserve(SUnits.size()); WorkList.push_back(SU); - while (!WorkList.empty()) { + do { SU = WorkList.back(); WorkList.pop_back(); Visited.set(SU->NodeNum); @@ -482,7 +482,7 @@ WorkList.push_back(SU->Succs[I].getSUnit()); } } - } + } while (!WorkList.empty()); } /// Shift - Renumber the nodes so that the topological ordering is From clattner at apple.com Tue Dec 23 11:24:13 2008 From: clattner at apple.com (Chris Lattner) Date: Tue, 23 Dec 2008 09:24:13 -0800 Subject: [llvm-commits] [llvm] r61297 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll In-Reply-To: <4950D6DC.9040802@gmail.com> References: <200812210019.mBL0JM32028246@zion.cs.uiuc.edu> <494D95DA.1030608@mxc.ca> <494E0148.4030600@gmail.com> <494FAF06.4050600@gmail.com> <3AC2BB5D-7ADC-4A5F-9DC7-E41C6A030E72@apple.com> <4950D6DC.9040802@gmail.com> Message-ID: On Dec 23, 2008, at 4:17 AM, Török Edwin wrote: >> This is a really important question that I don't know the answer to. >> My understanding is that memcmp only touches the bytes necessary to >> make a decision: it is not allowed to touch the full size if >> unneeded. However, I don't really *know* that, and nothing I find >> online in a quick scan comes up with an obvious answer. >> >> Can you try asking on comp.lang.c or something like that? 
> > I found 2 messages on comp.lang.c about this: > http://groups.google.com/group/comp.lang.c/msg/28fcc4a0e29e5339 > http://groups.google.com/group/comp.lang.c/msg/786c9f1cb2e9b085 > > " /* or can probably use memcmp since in practice it will safely > stop on or near null != nonnull and return nonmatch, but not > AFAICT guaranteed, and being not str* may alarm some */ > > You seem to be correct - "7.21.4.1 The memcmp function" doesn't > require it > to stop as long as it ultimately returns the correct result - but > it's in > the implementation's interests to make it as efficient as possible." Ok. This means that it should be opt-in on a per-target basis. >> It would >> also be interesting to look at the source for various memcmp >> implementations > > glibc: > However in memcmp_not_common_alignment: > /* memcmp_not_common_alignment -- Compare blocks at SRCP1 and SRCP2 > with LEN > `op_t' objects (not LEN bytes!). SRCP2 should be aligned for memory > operations on `op_t', but SRCP1 *should be unaligned*. */ > switch (len % 4) { > case 2: > a1 = ((op_t *) srcp1)[0]; > a2 = ((op_t *) srcp1)[1]; > ... > } > > It does a 2-byte read on srcp1, which is unaligned, so I think it may > cross a page boundary if we aren't allowed to do 2-byte reads on all > the > region we pass to memcmp. opt_t is 16-bits? That is unfortunate, it means we can't do the optimization on glibc. :( >> to decide if they are safe on commonly available >> systems (worst-case this becomes a target-specific optimization). >> > > How about using the alignment and length of the string to determine > that > memcmp won't cross a page boundary, even > with an implementation that does 8-byte reads always? > Alternatively we may align the operands with some extra code inserted. The problem is that we usually can't know the alignment of the non- constant string. 
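[Editorial sketch, not part of the thread.] The alignment argument above can be made concrete. This is a minimal sketch under stated assumptions: pages are 4096 bytes (a parameter here, not something the thread specifies), and the helper names crossesPage and wideLoadIsSafe are hypothetical. It shows why a power-of-two-sized load from an address aligned to at least that size cannot straddle a page boundary, and why an unknown alignment forces the conservative answer.

```cpp
#include <cassert>
#include <cstdint>

// True if reading Width bytes starting at Addr touches two different pages.
// PageSize defaults to 4096, which is an assumption for illustration only.
bool crossesPage(uint64_t Addr, uint64_t Width, uint64_t PageSize = 4096) {
  assert(Width > 0 && Width <= PageSize);
  return (Addr / PageSize) != ((Addr + Width - 1) / PageSize);
}

// Compile-time style check: when both values are powers of two and the
// pointer is known to be aligned to at least Width, a Width-byte load stays
// within one page (page sizes being larger powers of two). With an unknown
// alignment, i.e. KnownAlign == 1, any load wider than one byte is rejected.
bool wideLoadIsSafe(uint64_t KnownAlign, uint64_t Width) {
  auto isPow2 = [](uint64_t V) { return V != 0 && (V & (V - 1)) == 0; };
  return isPow2(KnownAlign) && isPow2(Width) && KnownAlign >= Width;
}
```

For the `memcmp(s, "x", 2)` case quoted earlier, `s` has no known alignment beyond 1, so such a check would not allow the 2-byte compare; an address such as 4095 is exactly where the 16-bit load would touch the next page.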
-Chris From gohman at apple.com Tue Dec 23 11:24:57 2008 From: gohman at apple.com (Dan Gohman) Date: Tue, 23 Dec 2008 17:24:57 -0000 Subject: [llvm-commits] [llvm] r61372 - /llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp Message-ID: <200812231725.mBNHP1vj005624@zion.cs.uiuc.edu> Author: djg Date: Tue Dec 23 11:24:50 2008 New Revision: 61372 URL: http://llvm.org/viewvc/llvm-project?rev=61372&view=rev Log: Avoid an unnecessary call to allnodes_size(), which is linear. Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp?rev=61372&r1=61371&r2=61372&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp Tue Dec 23 11:24:50 2008 @@ -68,20 +68,23 @@ /// This SUnit graph is similar to the SelectionDAG, but represents flagged /// together nodes with a single SUnit. void ScheduleDAGSDNodes::BuildSchedUnits() { - // Reserve entries in the vector for each of the SUnits we are creating. This - // ensure that reallocation of the vector won't happen, so SUnit*'s won't get - // invalidated. - // FIXME: Multiply by 2 because we may clone nodes during scheduling. - // This is a temporary workaround. - SUnits.reserve(DAG->allnodes_size() * 2); - // During scheduling, the NodeId field of SDNode is used to map SDNodes // to their associated SUnits by holding SUnits table indices. A value // of -1 means the SDNode does not yet have an associated SUnit. 
+ unsigned NumNodes = 0; for (SelectionDAG::allnodes_iterator NI = DAG->allnodes_begin(), - E = DAG->allnodes_end(); NI != E; ++NI) + E = DAG->allnodes_end(); NI != E; ++NI) { NI->setNodeId(-1); + ++NumNodes; + } + // Reserve entries in the vector for each of the SUnits we are creating. This + // ensure that reallocation of the vector won't happen, so SUnit*'s won't get + // invalidated. + // FIXME: Multiply by 2 because we may clone nodes during scheduling. + // This is a temporary workaround. + SUnits.reserve(NumNodes * 2); + // Check to see if the scheduler cares about latencies. bool UnitLatencies = ForceUnitLatencies(); From edwintorok at gmail.com Tue Dec 23 11:27:53 2008 From: edwintorok at gmail.com (Török Edwin) Date: Tue, 23 Dec 2008 19:27:53 +0200 Subject: [llvm-commits] [llvm] r61297 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll In-Reply-To: References: <200812210019.mBL0JM32028246@zion.cs.uiuc.edu> <494D95DA.1030608@mxc.ca> <494E0148.4030600@gmail.com> <494FAF06.4050600@gmail.com> <3AC2BB5D-7ADC-4A5F-9DC7-E41C6A030E72@apple.com> <4950D6DC.9040802@gmail.com> Message-ID: <49511F99.9010500@gmail.com> On 2008-12-23 19:24, Chris Lattner wrote: > On Dec 23, 2008, at 4:17 AM, Török Edwin wrote: > >>> This is a really important question that I don't know the answer to. >>> My understanding is that memcmp only touches the bytes necessary to >>> make a decision: it is not allowed to touch the full size if >>> unneeded. However, I don't really *know* that, and nothing I find >>> online in a quick scan comes up with an obvious answer. >>> >>> Can you try asking on comp.lang.c or something like that? 
>>> >> I found 2 messages on comp.lang.c about this: >> http://groups.google.com/group/comp.lang.c/msg/28fcc4a0e29e5339 >> http://groups.google.com/group/comp.lang.c/msg/786c9f1cb2e9b085 >> >> " /* or can probably use memcmp since in practice it will safely >> stop on or near null != nonnull and return nonmatch, but not >> AFAICT guaranteed, and being not str* may alarm some */ >> >> You seem to be correct - "7.21.4.1 The memcmp function" doesn't >> require it >> to stop as long as it ultimately returns the correct result - but >> it's in >> the implementation's interests to make it as efficient as possible." >> > > Ok. This means that it should be opt-in on a per-target basis. > > >>> It would >>> also be interesting to look at the source for various memcmp >>> implementations >>> >> glibc: >> However in memcmp_not_common_alignment: >> /* memcmp_not_common_alignment -- Compare blocks at SRCP1 and SRCP2 >> with LEN >> `op_t' objects (not LEN bytes!). SRCP2 should be aligned for memory >> operations on `op_t', but SRCP1 *should be unaligned*. */ >> switch (len % 4) { >> case 2: >> a1 = ((op_t *) srcp1)[0]; >> a2 = ((op_t *) srcp1)[1]; >> ... >> } >> >> It does a 2-byte read on srcp1, which is unaligned, so I think it may >> cross a page boundary if we aren't allowed to do 2-byte reads on all >> the >> region we pass to memcmp. >> > > opt_t is 16-bits? That is unfortunate, it means we can't do the > optimization on glibc. 
:( > op_t is an unsigned long, so 64-bits on my platform, and not 16 ;) From gohman at apple.com Tue Dec 23 11:28:53 2008 From: gohman at apple.com (Dan Gohman) Date: Tue, 23 Dec 2008 17:28:53 -0000 Subject: [llvm-commits] [llvm] r61373 - in /llvm/trunk/lib/CodeGen: MachineInstr.cpp MachineLICM.cpp ScheduleDAGInstrs.cpp Message-ID: <200812231728.mBNHSsGc005732@zion.cs.uiuc.edu> Author: djg Date: Tue Dec 23 11:28:50 2008 New Revision: 61373 URL: http://llvm.org/viewvc/llvm-project?rev=61373&view=rev Log: Use isTerminator() instead of isBranch()||isReturn() in several places. isTerminator() returns true for a superset of cases, and includes things like FP_REG_KILL, which are nither return or branch but aren't safe to move/remat/etc. Modified: llvm/trunk/lib/CodeGen/MachineInstr.cpp llvm/trunk/lib/CodeGen/MachineLICM.cpp llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp Modified: llvm/trunk/lib/CodeGen/MachineInstr.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineInstr.cpp?rev=61373&r1=61372&r2=61373&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/MachineInstr.cpp (original) +++ llvm/trunk/lib/CodeGen/MachineInstr.cpp Tue Dec 23 11:28:50 2008 @@ -707,7 +707,7 @@ SawStore = true; return false; } - if (TID->isReturn() || TID->isBranch() || TID->hasUnmodeledSideEffects()) + if (TID->isTerminator() || TID->hasUnmodeledSideEffects()) return false; // See if this instruction does a load. 
If so, we have to guarantee that the Modified: llvm/trunk/lib/CodeGen/MachineLICM.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineLICM.cpp?rev=61373&r1=61372&r2=61373&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/MachineLICM.cpp (original) +++ llvm/trunk/lib/CodeGen/MachineLICM.cpp Tue Dec 23 11:28:50 2008 @@ -205,7 +205,7 @@ const TargetInstrDesc &TID = I.getDesc(); // Ignore stuff that we obviously can't hoist. - if (TID.mayStore() || TID.isCall() || TID.isReturn() || TID.isBranch() || + if (TID.mayStore() || TID.isCall() || TID.isTerminator() || TID.hasUnmodeledSideEffects()) return false; Modified: llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp?rev=61373&r1=61372&r2=61373&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp (original) +++ llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp Tue Dec 23 11:28:50 2008 @@ -275,8 +275,7 @@ // after stack slots are lowered to actual addresses. // TODO: Use an AliasAnalysis and do real alias-analysis queries, and // produce more precise dependence information. - if (TID.isCall() || TID.isReturn() || TID.isBranch() || - TID.hasUnmodeledSideEffects()) { + if (TID.isCall() || TID.isTerminator() || TID.hasUnmodeledSideEffects()) { new_chain: // This is the conservative case. Add dependencies on all memory // references. @@ -300,7 +299,7 @@ // See if it is known to just have a single memory reference. 
MachineInstr *ChainMI = Chain->getInstr(); const TargetInstrDesc &ChainTID = ChainMI->getDesc(); - if (!ChainTID.isCall() && !ChainTID.isReturn() && !ChainTID.isBranch() && + if (!ChainTID.isCall() && !ChainTID.isTerminator() && !ChainTID.hasUnmodeledSideEffects() && ChainMI->hasOneMemOperand() && !ChainMI->memoperands_begin()->isVolatile() && From danchr at gmail.com Tue Dec 23 11:28:48 2008 From: danchr at gmail.com (Dan Villiom Podlaski Christiansen) Date: Tue, 23 Dec 2008 18:28:48 +0100 Subject: [llvm-commits] [llvm-gcc-4.2] r61128 - /llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp In-Reply-To: References: <200812170738.mBH7cG9Y013859@zion.cs.uiuc.edu> <38a0d8450812191738v2250c2e1l799b69e6e8c36fa6@mail.gmail.com> <3D4B9BAD-94A9-43BC-AB53-FE4BB9D1372A@apple.com> <38a0d8450812220149v79bfe823q94b5f1d0db64aa6c@mail.gmail.com> <38a0d8450812230841l6e4e8184w949ceca7d3859f62@mail.gmail.com> Message-ID: <6A763C60-ACBA-404F-A662-48A1F1530A68@gmail.com> On 23 Dec 2008, at 18:14, Chris Lattner wrote: > > On Dec 23, 2008, at 8:41 AM, Rafael Espindola wrote: > >>> My current fix is to change d1 to be of pointer type. I will try to >>> get the change to FD_ZERO upstream. >> >> I posted a patch to libc >> (http://sourceware.org/ml/libc-alpha/2008-12/msg00064.html), but it >> got rejected. I will keep a local patch. Posting the link here in >> case >> someone else has the same problem. > > Wow, that is quite the strong response. Evan, is it possible to > handle this by having llvm-gcc do the appropriate promotion when > building the asm? Breaking FD_ZERO seems unacceptable :( How about using the GCC fixincludes machinery? I don't know much about how it actually works, but from the name it seems possibly related :) -- Dan Villiom Podlaski Christiansen, stud. 
scient., danchr at cs.au.dk, danchr at gmail.com From anton at korobeynikov.info Tue Dec 23 11:56:10 2008 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Tue, 23 Dec 2008 20:56:10 +0300 Subject: [llvm-commits] [llvm-gcc-4.2] r61128 - /llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp In-Reply-To: <6A763C60-ACBA-404F-A662-48A1F1530A68@gmail.com> References: <200812170738.mBH7cG9Y013859@zion.cs.uiuc.edu> <38a0d8450812191738v2250c2e1l799b69e6e8c36fa6@mail.gmail.com> <3D4B9BAD-94A9-43BC-AB53-FE4BB9D1372A@apple.com> <38a0d8450812220149v79bfe823q94b5f1d0db64aa6c@mail.gmail.com> <38a0d8450812230841l6e4e8184w949ceca7d3859f62@mail.gmail.com> <6A763C60-ACBA-404F-A662-48A1F1530A68@gmail.com> Message-ID: Hi, > How about using the GCC fixincludes machinery? I don't know much about > how it actually works, but from the name it seems possibly related :) This will work for system headers, but unfortunately won't - for user-provided headers, which can contain arbitrary weird stuff. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From danchr at gmail.com Tue Dec 23 12:00:24 2008 From: danchr at gmail.com (Dan Villiom Podlaski Christiansen) Date: Tue, 23 Dec 2008 19:00:24 +0100 Subject: [llvm-commits] [llvm-gcc-4.2] r61128 - /llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp In-Reply-To: References: <200812170738.mBH7cG9Y013859@zion.cs.uiuc.edu> <38a0d8450812191738v2250c2e1l799b69e6e8c36fa6@mail.gmail.com> <3D4B9BAD-94A9-43BC-AB53-FE4BB9D1372A@apple.com> <38a0d8450812220149v79bfe823q94b5f1d0db64aa6c@mail.gmail.com> <38a0d8450812230841l6e4e8184w949ceca7d3859f62@mail.gmail.com> <6A763C60-ACBA-404F-A662-48A1F1530A68@gmail.com> Message-ID: On 23 Dec 2008, at 18:56, Anton Korobeynikov wrote: > Hi, > >> How about using the GCC fixincludes machinery? 
I don't know much >> about >> how it actually works, but from the name it seems possibly related :) > This will work for system headers, but unfortunately won't - for > user-provided headers, which can contain arbitrary weird stuff. In that case, it might work for this. After all, FD_SET and glibc are quite likely to be installed as system headers. -- Dan Villiom Podlaski Christiansen, stud. scient., danchr at cs.au.dk, danchr at gmail.com From espindola at google.com Tue Dec 23 12:02:03 2008 From: espindola at google.com (Rafael Espindola) Date: Tue, 23 Dec 2008 18:02:03 +0000 Subject: [llvm-commits] [llvm-gcc-4.2] r61128 - /llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp In-Reply-To: References: <200812170738.mBH7cG9Y013859@zion.cs.uiuc.edu> <38a0d8450812191738v2250c2e1l799b69e6e8c36fa6@mail.gmail.com> <3D4B9BAD-94A9-43BC-AB53-FE4BB9D1372A@apple.com> <38a0d8450812220149v79bfe823q94b5f1d0db64aa6c@mail.gmail.com> <38a0d8450812230841l6e4e8184w949ceca7d3859f62@mail.gmail.com> <6A763C60-ACBA-404F-A662-48A1F1530A68@gmail.com> Message-ID: <38a0d8450812231002n3fb90ad6r316b053754e5666@mail.gmail.com> > This will work for system headers, but unfortunately won't - for > user-provided headers, which can contain arbitrary weird stuff. > This was the only case I found, so using fixinclude looks like a good idea unless the problem is more common than I have noticed. Cheers, -- Rafael Avila de Espindola Google | Gordon House | Barrow Street | Dublin 4 | Ireland Registered in Dublin, Ireland | Registration Number: 368047 From gohman at apple.com Tue Dec 23 12:20:21 2008 From: gohman at apple.com (Dan Gohman) Date: Tue, 23 Dec 2008 18:20:21 -0000 Subject: [llvm-commits] [llvm] r61374 - /llvm/trunk/include/llvm/CodeGen/MachineOperand.h Message-ID: <200812231820.mBNIKLOL007196@zion.cs.uiuc.edu> Author: djg Date: Tue Dec 23 12:20:16 2008 New Revision: 61374 URL: http://llvm.org/viewvc/llvm-project?rev=61374&view=rev Log: Comment MO_FPImmediate and doxygenate surrounding comments. 
Modified: llvm/trunk/include/llvm/CodeGen/MachineOperand.h Modified: llvm/trunk/include/llvm/CodeGen/MachineOperand.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineOperand.h?rev=61374&r1=61373&r2=61374&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/MachineOperand.h (original) +++ llvm/trunk/include/llvm/CodeGen/MachineOperand.h Tue Dec 23 12:20:16 2008 @@ -33,15 +33,15 @@ class MachineOperand { public: enum MachineOperandType { - MO_Register, // Register operand. - MO_Immediate, // Immediate Operand - MO_FPImmediate, - MO_MachineBasicBlock, // MachineBasicBlock reference - MO_FrameIndex, // Abstract Stack Frame Index - MO_ConstantPoolIndex, // Address of indexed Constant in Constant Pool - MO_JumpTableIndex, // Address of indexed Jump Table for switch - MO_ExternalSymbol, // Name of external global symbol - MO_GlobalAddress // Address of a global value + MO_Register, ///< Register operand. 
+ MO_Immediate, ///< Immediate operand + MO_FPImmediate, ///< Floating-point immediate operand + MO_MachineBasicBlock, ///< MachineBasicBlock reference + MO_FrameIndex, ///< Abstract Stack Frame Index + MO_ConstantPoolIndex, ///< Address of indexed Constant in Constant Pool + MO_JumpTableIndex, ///< Address of indexed Jump Table for switch + MO_ExternalSymbol, ///< Name of external global symbol + MO_GlobalAddress ///< Address of a global value }; private: From gohman at apple.com Tue Dec 23 12:36:59 2008 From: gohman at apple.com (Dan Gohman) Date: Tue, 23 Dec 2008 18:36:59 -0000 Subject: [llvm-commits] [llvm] r61376 - in /llvm/trunk: include/llvm/CodeGen/ScheduleDAG.h include/llvm/CodeGen/ScheduleDAGInstrs.h include/llvm/CodeGen/ScheduleDAGSDNodes.h lib/CodeGen/PostRASchedulerList.cpp lib/CodeGen/ScheduleDAGInstrs.cpp lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp Message-ID: <200812231837.mBNIb1T9007712@zion.cs.uiuc.edu> Author: djg Date: Tue Dec 23 12:36:58 2008 New Revision: 61376 URL: http://llvm.org/viewvc/llvm-project?rev=61376&view=rev Log: Rename BuildSchedUnits to BuildSchedGraph, and refactor the code in ScheduleDAGSDNodes' BuildSchedGraph into separate functions. 
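A minimal sketch of the split described in the log above, offered as an illustration rather than the LLVM code itself. Only the names BuildSchedGraph, BuildSchedUnits and AddSchedEdges come from the commit; ToyNode, ToySUnit and ToyScheduleDAG are hypothetical stand-ins, and the edge rule (one predecessor per operand) is an assumption made to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DAG node: it only lists the indices of the
// nodes it uses as operands.
struct ToyNode {
  std::vector<unsigned> Operands;
};

// Hypothetical stand-in for a scheduling unit.
struct ToySUnit {
  unsigned NodeIndex;
  std::vector<unsigned> Preds;
  std::vector<unsigned> Succs;
};

class ToyScheduleDAG {
public:
  explicit ToyScheduleDAG(std::vector<ToyNode> Input)
      : Nodes(std::move(Input)) {}

  // Public entry point: populate the units, then connect them.
  void BuildSchedGraph() {
    BuildSchedUnits();
    AddSchedEdges();
  }

  const std::vector<ToySUnit> &units() const { return SUnits; }

private:
  // Pass 1: create one unit per node (the real code may fold or skip nodes).
  void BuildSchedUnits() {
    SUnits.clear();
    SUnits.reserve(Nodes.size());
    for (std::size_t i = 0; i != Nodes.size(); ++i)
      SUnits.push_back(ToySUnit{static_cast<unsigned>(i), {}, {}});
  }

  // Pass 2: add predecessor and successor edges between existing units.
  void AddSchedEdges() {
    for (std::size_t su = 0, e = SUnits.size(); su != e; ++su) {
      for (unsigned Op : Nodes[SUnits[su].NodeIndex].Operands) {
        SUnits[su].Preds.push_back(Op);
        SUnits[Op].Succs.push_back(static_cast<unsigned>(su));
      }
    }
  }

  std::vector<ToyNode> Nodes;
  std::vector<ToySUnit> SUnits;
};
```

Callers only see BuildSchedGraph; the two helpers stay private, which mirrors how the diff below adds them to ScheduleDAGSDNodes as helper functions.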
Modified: llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h llvm/trunk/lib/CodeGen/PostRASchedulerList.cpp llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp Modified: llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h?rev=61376&r1=61375&r2=61376&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h (original) +++ llvm/trunk/include/llvm/CodeGen/ScheduleDAG.h Tue Dec 23 12:36:58 2008 @@ -442,10 +442,10 @@ /// void Run(); - /// BuildSchedUnits - Build SUnits and set up their Preds and Succs + /// BuildSchedGraph - Build SUnits and set up their Preds and Succs /// to form the scheduling dependency graph. /// - virtual void BuildSchedUnits() = 0; + virtual void BuildSchedGraph() = 0; /// ComputeLatency - Compute node latency. /// Modified: llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h?rev=61376&r1=61375&r2=61376&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h (original) +++ llvm/trunk/include/llvm/CodeGen/ScheduleDAGInstrs.h Tue Dec 23 12:36:58 2008 @@ -45,9 +45,9 @@ return &SUnits.back(); } - /// BuildSchedUnits - Build SUnits from the MachineBasicBlock that we are + /// BuildSchedGraph - Build SUnits from the MachineBasicBlock that we are /// input. - virtual void BuildSchedUnits(); + virtual void BuildSchedGraph(); /// ComputeLatency - Compute node latency. 
/// Modified: llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h?rev=61376&r1=61375&r2=61376&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h (original) +++ llvm/trunk/include/llvm/CodeGen/ScheduleDAGSDNodes.h Tue Dec 23 12:36:58 2008 @@ -118,10 +118,11 @@ virtual SelectionDAG *getDAG() { return DAG; } - /// BuildSchedUnits - Build SUnits from the selection dag that we are input. - /// This SUnit graph is similar to the SelectionDAG, but represents flagged - /// together nodes with a single SUnit. - virtual void BuildSchedUnits(); + /// BuildSchedGraph - Build the SUnit graph from the selection dag that we + /// are input. This SUnit graph is similar to the SelectionDAG, but + /// excludes nodes that aren't interesting to scheduling, and represents + /// flagged together nodes with a single SUnit. + virtual void BuildSchedGraph(); /// ComputeLatency - Compute node latency. /// @@ -189,6 +190,10 @@ void CreateVirtualRegisters(SDNode *Node, MachineInstr *MI, const TargetInstrDesc &II, DenseMap &VRBaseMap); + + /// BuildSchedUnits, AddSchedEdges - Helper functions for BuildSchedGraph. + void BuildSchedUnits(); + void AddSchedEdges(); }; } Modified: llvm/trunk/lib/CodeGen/PostRASchedulerList.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/PostRASchedulerList.cpp?rev=61376&r1=61375&r2=61376&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/PostRASchedulerList.cpp (original) +++ llvm/trunk/lib/CodeGen/PostRASchedulerList.cpp Tue Dec 23 12:36:58 2008 @@ -121,8 +121,8 @@ void SchedulePostRATDList::Schedule() { DOUT << "********** List Scheduling **********\n"; - // Build scheduling units. - BuildSchedUnits(); + // Build the scheduling graph. 
+ BuildSchedGraph(); if (EnableAntiDepBreaking) { if (BreakAntiDependencies()) { @@ -133,7 +133,7 @@ // that register, and add new anti-dependence and output-dependence // edges based on the next live range of the register. SUnits.clear(); - BuildSchedUnits(); + BuildSchedGraph(); } } Modified: llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp?rev=61376&r1=61375&r2=61376&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp (original) +++ llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp Tue Dec 23 12:36:58 2008 @@ -89,7 +89,7 @@ const MachineDominatorTree &mdt) : ScheduleDAG(0, bb, tm), MLI(mli), MDT(mdt) {} -void ScheduleDAGInstrs::BuildSchedUnits() { +void ScheduleDAGInstrs::BuildSchedGraph() { SUnits.clear(); SUnits.reserve(BB->size()); Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp?rev=61376&r1=61375&r2=61376&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp Tue Dec 23 12:36:58 2008 @@ -114,8 +114,8 @@ LiveRegDefs.resize(TRI->getNumRegs(), NULL); LiveRegCycles.resize(TRI->getNumRegs(), 0); - // Build scheduling units. - BuildSchedUnits(); + // Build the scheduling graph. 
+ BuildSchedGraph(); DEBUG(for (unsigned su = 0, e = SUnits.size(); su != e; ++su) SUnits[su].dumpAll(this)); Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp?rev=61376&r1=61375&r2=61376&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGList.cpp Tue Dec 23 12:36:58 2008 @@ -91,8 +91,8 @@ void ScheduleDAGList::Schedule() { DOUT << "********** List Scheduling **********\n"; - // Build scheduling units. - BuildSchedUnits(); + // Build the scheduling graph. + BuildSchedGraph(); AvailableQueue->initNodes(SUnits); Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp?rev=61376&r1=61375&r2=61376&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp Tue Dec 23 12:36:58 2008 @@ -170,8 +170,8 @@ LiveRegDefs.resize(TRI->getNumRegs(), NULL); LiveRegCycles.resize(TRI->getNumRegs(), 0); - // Build scheduling units. - BuildSchedUnits(); + // Build the scheduling graph. 
+ BuildSchedGraph(); DEBUG(for (unsigned su = 0, e = SUnits.size(); su != e; ++su) SUnits[su].dumpAll(this)); Modified: llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp?rev=61376&r1=61375&r2=61376&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp Tue Dec 23 12:36:58 2008 @@ -64,9 +64,6 @@ } } -/// BuildSchedUnits - Build SUnits from the selection dag that we are input. -/// This SUnit graph is similar to the SelectionDAG, but represents flagged -/// together nodes with a single SUnit. void ScheduleDAGSDNodes::BuildSchedUnits() { // During scheduling, the NodeId field of SDNode is used to map SDNodes // to their associated SUnits by holding SUnits table indices. A value @@ -146,7 +143,9 @@ else ComputeLatency(NodeSUnit); } - +} + +void ScheduleDAGSDNodes::AddSchedEdges() { // Pass 2: add the preds, succs, etc. for (unsigned su = 0, e = SUnits.size(); su != e; ++su) { SUnit *SU = &SUnits[su]; @@ -196,6 +195,17 @@ } } +/// BuildSchedGraph - Build the SUnit graph from the selection dag that we +/// are input. This SUnit graph is similar to the SelectionDAG, but +/// excludes nodes that aren't interesting to scheduling, and represents +/// flagged together nodes with a single SUnit. +void ScheduleDAGSDNodes::BuildSchedGraph() { + // Populate the SUnits array. + BuildSchedUnits(); + // Compute all the scheduling dependencies between nodes. 
+  AddSchedEdges();
+}
+
 void ScheduleDAGSDNodes::ComputeLatency(SUnit *SU) {
   const InstrItineraryData &InstrItins = TM.getInstrItineraryData();


From snaroff at apple.com Tue Dec 23 12:41:49 2008
From: snaroff at apple.com (Steve Naroff)
Date: Tue, 23 Dec 2008 18:41:49 -0000
Subject: [llvm-commits] [llvm] r61378 - /llvm/trunk/lib/Support/CommandLine.cpp
Message-ID: <200812231841.mBNIfn1c007859@zion.cs.uiuc.edu>

Author: snaroff
Date: Tue Dec 23 12:41:47 2008
New Revision: 61378

URL: http://llvm.org/viewvc/llvm-project?rev=61378&view=rev
Log:
Tweak --version to include the date and time.

Modified:
    llvm/trunk/lib/Support/CommandLine.cpp

Modified: llvm/trunk/lib/Support/CommandLine.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/CommandLine.cpp?rev=61378&r1=61377&r2=61378&view=diff
==============================================================================
--- llvm/trunk/lib/Support/CommandLine.cpp (original)
+++ llvm/trunk/lib/Support/CommandLine.cpp Tue Dec 23 12:41:47 2008
@@ -1095,6 +1095,7 @@
     cout << " with assertions";
 #endif
     cout << ".\n";
+    cout << " Built " << __DATE__ << "(" << __TIME__ << ").\n";
   }
   void operator=(bool OptionWasSpecified) {
     if (OptionWasSpecified) {


From sabre at nondot.org Tue Dec 23 12:52:08 2008
From: sabre at nondot.org (Chris Lattner)
Date: Tue, 23 Dec 2008 18:52:08 -0000
Subject: [llvm-commits] [llvm-gcc-4.2] r61379 - /llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp
Message-ID: <200812231852.mBNIq8Nf008191@zion.cs.uiuc.edu>

Author: lattner
Date: Tue Dec 23 12:52:07 2008
New Revision: 61379

URL: http://llvm.org/viewvc/llvm-project?rev=61379&view=rev
Log:
Allow tying together pointers with integers of the same size.
This should fix FD_ZERO on linux.
Testcase here: test/FrontendC/2008-12-23-AsmIntPointerTie.c

Modified:
    llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp

Modified: llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp?rev=61379&r1=61378&r2=61379&view=diff
==============================================================================
--- llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp (original)
+++ llvm-gcc-4.2/trunk/gcc/llvm-convert.cpp Tue Dec 23 12:52:07 2008
@@ -4196,25 +4196,26 @@
       const Type *OpTy = Op->getType();
       // If this input operand is matching an output operand, e.g. '0', check if
       // this is something that llvm supports. If the operand types are
-      // different, then emit an error if 1) one of the types is not integer,
-      // 2) if size of input type is larger than the output type. If the size
-      // of the integer input size is smaller than the integer output type, then
-      // cast it to the larger type and shift the value if the target is big
-      // endian.
+      // different, then emit an error if 1) one of the types is not integer or
+      // pointer, 2) if size of input type is larger than the output type. If
+      // the size of the integer input size is smaller than the integer output
+      // type, then cast it to the larger type and shift the value if the target
+      // is big endian.
       if (ISDIGIT(Constraint[0])) {
         unsigned Match = atoi(Constraint);
         const Type *OTy = CallResultTypes[Match];
         if (OTy != OpTy) {
-          if (!OTy->isInteger() || !OpTy->isInteger()) {
-            error("%HUnsupported inline asm: input constraint with a matching "
+          if (!(isa<IntegerType>(OTy) || isa<PointerType>(OTy)) ||
+              !(isa<IntegerType>(OpTy) || isa<PointerType>(OpTy))) {
+            error("%Hunsupported inline asm: input constraint with a matching "
                   "output constraint of incompatible type!",
                   &EXPR_LOCATION(exp));
             return 0;
           }
-          unsigned OTyBits = OTy->getPrimitiveSizeInBits();
-          unsigned OpTyBits = OpTy->getPrimitiveSizeInBits();
+          unsigned OTyBits = TD.getTypeSizeInBits(OTy);
+          unsigned OpTyBits = TD.getTypeSizeInBits(OpTy);
           if (OTyBits == 0 || OpTyBits == 0 || OTyBits < OpTyBits) {
-            error("%HUnsupported inline asm: input constraint with a matching "
+            error("%Hunsupported inline asm: input constraint with a matching "
                   "output constraint of incompatible type!",
                   &EXPR_LOCATION(exp));
             return 0;


From sabre at nondot.org Tue Dec 23 12:52:27 2008
From: sabre at nondot.org (Chris Lattner)
Date: Tue, 23 Dec 2008 18:52:27 -0000
Subject: [llvm-commits] [llvm] r61380 - /llvm/trunk/test/FrontendC/2008-12-23-AsmIntPointerTie.c
Message-ID: <200812231852.mBNIqSRU008220@zion.cs.uiuc.edu>

Author: lattner
Date: Tue Dec 23 12:52:26 2008
New Revision: 61380

URL: http://llvm.org/viewvc/llvm-project?rev=61380&view=rev
Log:
Testcase to show we can tie together integers and pointers of the same size.
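The compatibility rule that the r61379 diff introduces for tied operands can be sketched roughly as follows. This is a hedged illustration, not the llvm-gcc code: ToyKind, ToyType, TieResult and checkTiedOperand are hypothetical names, the bit sizes are passed in directly instead of being queried from target data, and the SameSize/NeedsExtend distinction is an assumption about what happens after the checks shown in the diff.

```cpp
#include <cassert>

// Hypothetical type model: just a kind and a size in bits.
enum class ToyKind { Integer, Pointer, Float };

struct ToyType {
  ToyKind Kind;
  unsigned Bits; // size as a target data layout would report it (assumed)
};

enum class TieResult { SameType, SameSize, NeedsExtend, Unsupported };

inline bool isIntOrPtr(const ToyType &T) {
  return T.Kind == ToyKind::Integer || T.Kind == ToyKind::Pointer;
}

// Decide whether an input operand may be tied to an output operand.
inline TieResult checkTiedOperand(const ToyType &OutTy, const ToyType &InTy) {
  if (OutTy.Kind == InTy.Kind && OutTy.Bits == InTy.Bits)
    return TieResult::SameType;
  // 1) Both sides must be integers or pointers.
  if (!isIntOrPtr(OutTy) || !isIntOrPtr(InTy))
    return TieResult::Unsupported;
  // 2) Sizes must be known and the input must not be wider than the output.
  if (OutTy.Bits == 0 || InTy.Bits == 0 || OutTy.Bits < InTy.Bits)
    return TieResult::Unsupported;
  // Equal sizes (for example a pointer tied to intptr_t) need no widening;
  // a narrower input would have to be extended to the output width.
  return OutTy.Bits == InTy.Bits ? TieResult::SameSize : TieResult::NeedsExtend;
}
```

Under this rule the testcase added below, which ties a void * input to an intptr_t output, falls into the equal-size case on a target where both are the same width.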
Added:
    llvm/trunk/test/FrontendC/2008-12-23-AsmIntPointerTie.c

Added: llvm/trunk/test/FrontendC/2008-12-23-AsmIntPointerTie.c
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/FrontendC/2008-12-23-AsmIntPointerTie.c?rev=61380&view=auto
==============================================================================
--- llvm/trunk/test/FrontendC/2008-12-23-AsmIntPointerTie.c (added)
+++ llvm/trunk/test/FrontendC/2008-12-23-AsmIntPointerTie.c Tue Dec 23 12:52:26 2008
@@ -0,0 +1,9 @@
+// RUN: %llvmgcc %s -S -emit-llvm -O1 -o -
+
+#include <stdint.h>
+
+int test(void *b) {
+  intptr_t a;
+  __asm__ __volatile__ ("%0 %1 " : "=r" (a): "0" (b));
+  return a;
+}


From sabre at nondot.org Tue Dec 23 14:52:53 2008
From: sabre at nondot.org (Chris Lattner)
Date: Tue, 23 Dec 2008 20:52:53 -0000
Subject: [llvm-commits] [llvm] r61385 - /llvm/trunk/lib/Target/README.txt
Message-ID: <200812232052.mBNKqrpi012036@zion.cs.uiuc.edu>

Author: lattner
Date: Tue Dec 23 14:52:52 2008
New Revision: 61385

URL: http://llvm.org/viewvc/llvm-project?rev=61385&view=rev
Log:
add some notes for simplifylibcalls optimizations

Modified:
    llvm/trunk/lib/Target/README.txt

Modified: llvm/trunk/lib/Target/README.txt
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/README.txt?rev=61385&r1=61384&r2=61385&view=diff
==============================================================================
--- llvm/trunk/lib/Target/README.txt (original)
+++ llvm/trunk/lib/Target/README.txt Tue Dec 23 14:52:52 2008
@@ -1461,3 +1461,28 @@
 
 //===---------------------------------------------------------------------===//
 
+simplifylibcalls should do several optimizations for strspn/strcspn:
+
+strcspn(x, "") -> strlen(x)
+strcspn("", x) -> 0
+strspn("", x) -> 0
+strspn(x, "") -> strlen(x)
+strspn(x, "a") -> strchr(x, 'a')-x
+
+strcspn(x, "a") -> inlined loop for up to 3 letters (similarly for strspn):
+
+size_t __strcspn_c3 (__const char *__s, int __reject1, int __reject2,
+                     int __reject3) {
+  register size_t __result = 0;
+  while (__s[__result] != '\0' && __s[__result] != __reject1 &&
+         __s[__result] != __reject2 && __s[__result] != __reject3)
+    ++__result;
+  return __result;
+}
+
+This should turn into a switch on the character. See PR3253 for some notes on
+codegen.
+
+456.hmmer apparently uses strcspn and strspn a lot. 471.omnetpp uses strspn.
+
+//===---------------------------------------------------------------------===//


From gohman at apple.com Tue Dec 23 15:37:05 2008
From: gohman at apple.com (Dan Gohman)
Date: Tue, 23 Dec 2008 21:37:05 -0000
Subject: [llvm-commits] [llvm] r61389 - in /llvm/trunk: include/llvm/CodeGen/SelectionDAG.h include/llvm/CodeGen/SelectionDAGNodes.h include/llvm/Target/TargetLowering.h include/llvm/Target/TargetSelectionDAG.td lib/CodeGen/ScheduleDAGInstrs.cpp lib/CodeGen/SelectionDAG/LegalizeDAG.cpp lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp lib/CodeGen/SelectionDAG/SelectionDAG.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp lib/Target/X86/X86ISelLowering.cpp
Message-ID: <200812232137.mBNLb7i8013394@zion.cs.uiuc.edu>

Author: djg
Date: Tue Dec 23 15:37:04 2008
New Revision: 61389

URL: http://llvm.org/viewvc/llvm-project?rev=61389&view=rev
Log:
Clean up the atomic opcodes in SelectionDAG. This removes all the
_8, _16, _32, and _64 opcodes and replaces each group with an
unsuffixed opcode. The MemoryVT field of the AtomicSDNode is now
used to carry the size information. In tablegen, the size-specific
opcodes are replaced by size-independent opcodes that utilize the
ability to compose them with predicates.

This shrinks the per-opcode tables and makes the code that handles
atomics much more concise.
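To make the shape of this cleanup concrete, here is a small sketch of the idea in the log above: one opcode per atomic operation, with the access width carried as data on the node rather than encoded in the opcode name. It is only an illustration under stated assumptions; ToyOpcode, ToyAtomicNode, isToyAtomic and isToyCompareAndSwap are hypothetical names, and a plain bit count stands in for the MemoryVT field.

```cpp
#include <cassert>

// Hypothetical opcode list: one entry per operation, no _8/_16/_32/_64 suffix.
enum ToyOpcode {
  TOY_LOAD,
  TOY_ATOMIC_CMP_SWAP,
  TOY_ATOMIC_SWAP,
  TOY_ATOMIC_LOAD_ADD
};

// Hypothetical node: the memory width lives next to the opcode.
struct ToyAtomicNode {
  ToyOpcode Opcode;
  unsigned MemoryBits; // stand-in for the node's memory value type
};

// Classification needs one comparison per operation instead of one per
// (operation, size) pair.
inline bool isToyAtomic(ToyOpcode Op) {
  return Op == TOY_ATOMIC_CMP_SWAP || Op == TOY_ATOMIC_SWAP ||
         Op == TOY_ATOMIC_LOAD_ADD;
}

inline bool isToyCompareAndSwap(const ToyAtomicNode &N) {
  return N.Opcode == TOY_ATOMIC_CMP_SWAP;
}
```

Code that cares about the width reads it from the node, which is why the long opcode comparison chains in the diff below collapse to a quarter of their former length.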
Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAG.h llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h llvm/trunk/include/llvm/Target/TargetLowering.h llvm/trunk/include/llvm/Target/TargetSelectionDAG.td llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAG.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAG.h?rev=61389&r1=61388&r2=61389&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/SelectionDAG.h (original) +++ llvm/trunk/include/llvm/CodeGen/SelectionDAG.h Tue Dec 23 15:37:04 2008 @@ -464,13 +464,13 @@ /// getAtomic - Gets a node for an atomic op, produces result and chain and /// takes 3 operands - SDValue getAtomic(unsigned Opcode, SDValue Chain, SDValue Ptr, + SDValue getAtomic(unsigned Opcode, MVT MemVT, SDValue Chain, SDValue Ptr, SDValue Cmp, SDValue Swp, const Value* PtrVal, unsigned Alignment=0); /// getAtomic - Gets a node for an atomic op, produces result and chain and /// takes 2 operands. 
- SDValue getAtomic(unsigned Opcode, SDValue Chain, SDValue Ptr, + SDValue getAtomic(unsigned Opcode, MVT MemVT, SDValue Chain, SDValue Ptr, SDValue Val, const Value* PtrVal, unsigned Alignment = 0); Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h?rev=61389&r1=61388&r2=61389&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h (original) +++ llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h Tue Dec 23 15:37:04 2008 @@ -628,64 +628,28 @@ // this corresponds to the atomic.lcs intrinsic. // cmp is compared to *ptr, and if equal, swap is stored in *ptr. // the return is always the original value in *ptr - ATOMIC_CMP_SWAP_8, - ATOMIC_CMP_SWAP_16, - ATOMIC_CMP_SWAP_32, - ATOMIC_CMP_SWAP_64, + ATOMIC_CMP_SWAP, // Val, OUTCHAIN = ATOMIC_SWAP(INCHAIN, ptr, amt) // this corresponds to the atomic.swap intrinsic. // amt is stored to *ptr atomically. // the return is always the original value in *ptr - ATOMIC_SWAP_8, - ATOMIC_SWAP_16, - ATOMIC_SWAP_32, - ATOMIC_SWAP_64, + ATOMIC_SWAP, // Val, OUTCHAIN = ATOMIC_L[OpName]S(INCHAIN, ptr, amt) // this corresponds to the atomic.[OpName] intrinsic. // op(*ptr, amt) is stored to *ptr atomically. 
// the return is always the original value in *ptr - ATOMIC_LOAD_ADD_8, - ATOMIC_LOAD_SUB_8, - ATOMIC_LOAD_AND_8, - ATOMIC_LOAD_OR_8, - ATOMIC_LOAD_XOR_8, - ATOMIC_LOAD_NAND_8, - ATOMIC_LOAD_MIN_8, - ATOMIC_LOAD_MAX_8, - ATOMIC_LOAD_UMIN_8, - ATOMIC_LOAD_UMAX_8, - ATOMIC_LOAD_ADD_16, - ATOMIC_LOAD_SUB_16, - ATOMIC_LOAD_AND_16, - ATOMIC_LOAD_OR_16, - ATOMIC_LOAD_XOR_16, - ATOMIC_LOAD_NAND_16, - ATOMIC_LOAD_MIN_16, - ATOMIC_LOAD_MAX_16, - ATOMIC_LOAD_UMIN_16, - ATOMIC_LOAD_UMAX_16, - ATOMIC_LOAD_ADD_32, - ATOMIC_LOAD_SUB_32, - ATOMIC_LOAD_AND_32, - ATOMIC_LOAD_OR_32, - ATOMIC_LOAD_XOR_32, - ATOMIC_LOAD_NAND_32, - ATOMIC_LOAD_MIN_32, - ATOMIC_LOAD_MAX_32, - ATOMIC_LOAD_UMIN_32, - ATOMIC_LOAD_UMAX_32, - ATOMIC_LOAD_ADD_64, - ATOMIC_LOAD_SUB_64, - ATOMIC_LOAD_AND_64, - ATOMIC_LOAD_OR_64, - ATOMIC_LOAD_XOR_64, - ATOMIC_LOAD_NAND_64, - ATOMIC_LOAD_MIN_64, - ATOMIC_LOAD_MAX_64, - ATOMIC_LOAD_UMIN_64, - ATOMIC_LOAD_UMAX_64, + ATOMIC_LOAD_ADD, + ATOMIC_LOAD_SUB, + ATOMIC_LOAD_AND, + ATOMIC_LOAD_OR, + ATOMIC_LOAD_XOR, + ATOMIC_LOAD_NAND, + ATOMIC_LOAD_MIN, + ATOMIC_LOAD_MAX, + ATOMIC_LOAD_UMIN, + ATOMIC_LOAD_UMAX, // BUILTIN_OP_END - This must be the last enum value in this list. BUILTIN_OP_END @@ -1615,58 +1579,18 @@ // with either an intrinsic or a target opcode. 
return N->getOpcode() == ISD::LOAD || N->getOpcode() == ISD::STORE || - N->getOpcode() == ISD::ATOMIC_CMP_SWAP_8 || - N->getOpcode() == ISD::ATOMIC_SWAP_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_ADD_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_SUB_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_AND_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_OR_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_XOR_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_NAND_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_MIN_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_MAX_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMIN_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMAX_8 || - - N->getOpcode() == ISD::ATOMIC_CMP_SWAP_16 || - N->getOpcode() == ISD::ATOMIC_SWAP_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_ADD_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_SUB_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_AND_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_OR_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_XOR_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_NAND_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_MIN_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_MAX_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMIN_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMAX_16 || - - N->getOpcode() == ISD::ATOMIC_CMP_SWAP_32 || - N->getOpcode() == ISD::ATOMIC_SWAP_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_ADD_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_SUB_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_AND_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_OR_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_XOR_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_NAND_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_MIN_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_MAX_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMIN_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMAX_32 || - - N->getOpcode() == ISD::ATOMIC_CMP_SWAP_64 || - N->getOpcode() == ISD::ATOMIC_SWAP_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_ADD_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_SUB_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_AND_64 || - N->getOpcode() 
== ISD::ATOMIC_LOAD_OR_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_XOR_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_NAND_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_MIN_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_MAX_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMIN_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMAX_64 || - + N->getOpcode() == ISD::ATOMIC_CMP_SWAP || + N->getOpcode() == ISD::ATOMIC_SWAP || + N->getOpcode() == ISD::ATOMIC_LOAD_ADD || + N->getOpcode() == ISD::ATOMIC_LOAD_SUB || + N->getOpcode() == ISD::ATOMIC_LOAD_AND || + N->getOpcode() == ISD::ATOMIC_LOAD_OR || + N->getOpcode() == ISD::ATOMIC_LOAD_XOR || + N->getOpcode() == ISD::ATOMIC_LOAD_NAND || + N->getOpcode() == ISD::ATOMIC_LOAD_MIN || + N->getOpcode() == ISD::ATOMIC_LOAD_MAX || + N->getOpcode() == ISD::ATOMIC_LOAD_UMIN || + N->getOpcode() == ISD::ATOMIC_LOAD_UMAX || N->getOpcode() == ISD::INTRINSIC_W_CHAIN || N->getOpcode() == ISD::INTRINSIC_VOID || N->isTargetOpcode(); @@ -1688,10 +1612,11 @@ // Swp: swap value // SrcVal: address to update as a Value (used for MemOperand) // Align: alignment of memory - AtomicSDNode(unsigned Opc, SDVTList VTL, SDValue Chain, SDValue Ptr, + AtomicSDNode(unsigned Opc, SDVTList VTL, MVT MemVT, + SDValue Chain, SDValue Ptr, SDValue Cmp, SDValue Swp, const Value* SrcVal, unsigned Align=0) - : MemSDNode(Opc, VTL, Cmp.getValueType(), SrcVal, /*SVOffset=*/0, + : MemSDNode(Opc, VTL, MemVT, SrcVal, /*SVOffset=*/0, Align, /*isVolatile=*/true) { Ops[0] = Chain; Ops[1] = Ptr; @@ -1699,9 +1624,10 @@ Ops[3] = Swp; InitOperands(Ops, 4); } - AtomicSDNode(unsigned Opc, SDVTList VTL, SDValue Chain, SDValue Ptr, + AtomicSDNode(unsigned Opc, SDVTList VTL, MVT MemVT, + SDValue Chain, SDValue Ptr, SDValue Val, const Value* SrcVal, unsigned Align=0) - : MemSDNode(Opc, VTL, Val.getValueType(), SrcVal, /*SVOffset=*/0, + : MemSDNode(Opc, VTL, MemVT, SrcVal, /*SVOffset=*/0, Align, /*isVolatile=*/true) { Ops[0] = Chain; Ops[1] = Ptr; @@ -1714,63 +1640,24 @@ bool isCompareAndSwap() const { 
unsigned Op = getOpcode(); - return Op == ISD::ATOMIC_CMP_SWAP_8 || - Op == ISD::ATOMIC_CMP_SWAP_16 || - Op == ISD::ATOMIC_CMP_SWAP_32 || - Op == ISD::ATOMIC_CMP_SWAP_64; + return Op == ISD::ATOMIC_CMP_SWAP; } // Methods to support isa and dyn_cast static bool classof(const AtomicSDNode *) { return true; } static bool classof(const SDNode *N) { - return N->getOpcode() == ISD::ATOMIC_CMP_SWAP_8 || - N->getOpcode() == ISD::ATOMIC_SWAP_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_ADD_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_SUB_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_AND_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_OR_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_XOR_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_NAND_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_MIN_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_MAX_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMIN_8 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMAX_8 || - N->getOpcode() == ISD::ATOMIC_CMP_SWAP_16 || - N->getOpcode() == ISD::ATOMIC_SWAP_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_ADD_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_SUB_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_AND_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_OR_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_XOR_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_NAND_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_MIN_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_MAX_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMIN_16 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMAX_16 || - N->getOpcode() == ISD::ATOMIC_CMP_SWAP_32 || - N->getOpcode() == ISD::ATOMIC_SWAP_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_ADD_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_SUB_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_AND_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_OR_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_XOR_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_NAND_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_MIN_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_MAX_32 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMIN_32 || - N->getOpcode() == 
ISD::ATOMIC_LOAD_UMAX_32 || - N->getOpcode() == ISD::ATOMIC_CMP_SWAP_64 || - N->getOpcode() == ISD::ATOMIC_SWAP_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_ADD_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_SUB_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_AND_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_OR_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_XOR_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_NAND_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_MIN_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_MAX_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMIN_64 || - N->getOpcode() == ISD::ATOMIC_LOAD_UMAX_64; + return N->getOpcode() == ISD::ATOMIC_CMP_SWAP || + N->getOpcode() == ISD::ATOMIC_SWAP || + N->getOpcode() == ISD::ATOMIC_LOAD_ADD || + N->getOpcode() == ISD::ATOMIC_LOAD_SUB || + N->getOpcode() == ISD::ATOMIC_LOAD_AND || + N->getOpcode() == ISD::ATOMIC_LOAD_OR || + N->getOpcode() == ISD::ATOMIC_LOAD_XOR || + N->getOpcode() == ISD::ATOMIC_LOAD_NAND || + N->getOpcode() == ISD::ATOMIC_LOAD_MIN || + N->getOpcode() == ISD::ATOMIC_LOAD_MAX || + N->getOpcode() == ISD::ATOMIC_LOAD_UMIN || + N->getOpcode() == ISD::ATOMIC_LOAD_UMAX; } }; Modified: llvm/trunk/include/llvm/Target/TargetLowering.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetLowering.h?rev=61389&r1=61388&r2=61389&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/TargetLowering.h (original) +++ llvm/trunk/include/llvm/Target/TargetLowering.h Tue Dec 23 15:37:04 2008 @@ -1492,7 +1492,7 @@ MVT TransformToType[MVT::LAST_VALUETYPE]; // Defines the capacity of the TargetLowering::OpActions table - static const int OpActionsCapacity = 222; + static const int OpActionsCapacity = 184; /// OpActions - For each operation and each value type, keep a LegalizeAction /// that indicates how instruction selection should deal with the operation. 
Modified: llvm/trunk/include/llvm/Target/TargetSelectionDAG.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetSelectionDAG.td?rev=61389&r1=61388&r2=61389&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/TargetSelectionDAG.td (original) +++ llvm/trunk/include/llvm/Target/TargetSelectionDAG.td Tue Dec 23 15:37:04 2008 @@ -363,101 +363,29 @@ def membarrier : SDNode<"ISD::MEMBARRIER" , STDMemBarrier, [SDNPHasChain, SDNPSideEffect]>; -def atomic_cmp_swap_8 : SDNode<"ISD::ATOMIC_CMP_SWAP_8" , STDAtomic3, +def atomic_cmp_swap : SDNode<"ISD::ATOMIC_CMP_SWAP" , STDAtomic3, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_add_8 : SDNode<"ISD::ATOMIC_LOAD_ADD_8" , STDAtomic2, +def atomic_load_add : SDNode<"ISD::ATOMIC_LOAD_ADD" , STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_swap_8 : SDNode<"ISD::ATOMIC_SWAP_8", STDAtomic2, +def atomic_swap : SDNode<"ISD::ATOMIC_SWAP", STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_sub_8 : SDNode<"ISD::ATOMIC_LOAD_SUB_8" , STDAtomic2, +def atomic_load_sub : SDNode<"ISD::ATOMIC_LOAD_SUB" , STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_and_8 : SDNode<"ISD::ATOMIC_LOAD_AND_8" , STDAtomic2, +def atomic_load_and : SDNode<"ISD::ATOMIC_LOAD_AND" , STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_or_8 : SDNode<"ISD::ATOMIC_LOAD_OR_8" , STDAtomic2, +def atomic_load_or : SDNode<"ISD::ATOMIC_LOAD_OR" , STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_xor_8 : SDNode<"ISD::ATOMIC_LOAD_XOR_8" , STDAtomic2, +def atomic_load_xor : SDNode<"ISD::ATOMIC_LOAD_XOR" , STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_nand_8: SDNode<"ISD::ATOMIC_LOAD_NAND_8", STDAtomic2, +def atomic_load_nand: 
SDNode<"ISD::ATOMIC_LOAD_NAND", STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_min_8 : SDNode<"ISD::ATOMIC_LOAD_MIN_8", STDAtomic2, +def atomic_load_min : SDNode<"ISD::ATOMIC_LOAD_MIN", STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_max_8 : SDNode<"ISD::ATOMIC_LOAD_MAX_8", STDAtomic2, +def atomic_load_max : SDNode<"ISD::ATOMIC_LOAD_MAX", STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_umin_8 : SDNode<"ISD::ATOMIC_LOAD_UMIN_8", STDAtomic2, +def atomic_load_umin : SDNode<"ISD::ATOMIC_LOAD_UMIN", STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_umax_8 : SDNode<"ISD::ATOMIC_LOAD_UMAX_8", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_cmp_swap_16 : SDNode<"ISD::ATOMIC_CMP_SWAP_16" , STDAtomic3, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_add_16 : SDNode<"ISD::ATOMIC_LOAD_ADD_16" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_swap_16 : SDNode<"ISD::ATOMIC_SWAP_16", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_sub_16 : SDNode<"ISD::ATOMIC_LOAD_SUB_16" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_and_16 : SDNode<"ISD::ATOMIC_LOAD_AND_16" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_or_16 : SDNode<"ISD::ATOMIC_LOAD_OR_16" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_xor_16 : SDNode<"ISD::ATOMIC_LOAD_XOR_16" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_nand_16: SDNode<"ISD::ATOMIC_LOAD_NAND_16", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_min_16 : SDNode<"ISD::ATOMIC_LOAD_MIN_16", STDAtomic2, - [SDNPHasChain, 
SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_max_16 : SDNode<"ISD::ATOMIC_LOAD_MAX_16", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_umin_16 : SDNode<"ISD::ATOMIC_LOAD_UMIN_16", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_umax_16 : SDNode<"ISD::ATOMIC_LOAD_UMAX_16", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_cmp_swap_32 : SDNode<"ISD::ATOMIC_CMP_SWAP_32" , STDAtomic3, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_add_32 : SDNode<"ISD::ATOMIC_LOAD_ADD_32" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_swap_32 : SDNode<"ISD::ATOMIC_SWAP_32", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_sub_32 : SDNode<"ISD::ATOMIC_LOAD_SUB_32" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_and_32 : SDNode<"ISD::ATOMIC_LOAD_AND_32" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_or_32 : SDNode<"ISD::ATOMIC_LOAD_OR_32" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_xor_32 : SDNode<"ISD::ATOMIC_LOAD_XOR_32" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_nand_32: SDNode<"ISD::ATOMIC_LOAD_NAND_32", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_min_32 : SDNode<"ISD::ATOMIC_LOAD_MIN_32", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_max_32 : SDNode<"ISD::ATOMIC_LOAD_MAX_32", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_umin_32 : SDNode<"ISD::ATOMIC_LOAD_UMIN_32", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_umax_32 : SDNode<"ISD::ATOMIC_LOAD_UMAX_32", STDAtomic2, - 
[SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_cmp_swap_64 : SDNode<"ISD::ATOMIC_CMP_SWAP_64" , STDAtomic3, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_add_64 : SDNode<"ISD::ATOMIC_LOAD_ADD_64" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_swap_64 : SDNode<"ISD::ATOMIC_SWAP_64", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_sub_64 : SDNode<"ISD::ATOMIC_LOAD_SUB_64" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_and_64 : SDNode<"ISD::ATOMIC_LOAD_AND_64" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_or_64 : SDNode<"ISD::ATOMIC_LOAD_OR_64" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_xor_64 : SDNode<"ISD::ATOMIC_LOAD_XOR_64" , STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_nand_64: SDNode<"ISD::ATOMIC_LOAD_NAND_64", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_min_64 : SDNode<"ISD::ATOMIC_LOAD_MIN_64", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_max_64 : SDNode<"ISD::ATOMIC_LOAD_MAX_64", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_umin_64 : SDNode<"ISD::ATOMIC_LOAD_UMIN_64", STDAtomic2, - [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; -def atomic_load_umax_64 : SDNode<"ISD::ATOMIC_LOAD_UMAX_64", STDAtomic2, +def atomic_load_umax : SDNode<"ISD::ATOMIC_LOAD_UMAX", STDAtomic2, [SDNPHasChain, SDNPMayStore, SDNPMayLoad, SDNPMemOperand]>; // Do not use ld, st directly. 
Use load, extload, sextload, zextload, store,
@@ -794,6 +722,58 @@
 def setne  : PatFrag<(ops node:$lhs, node:$rhs),
                      (setcc node:$lhs, node:$rhs, SETNE)>;
+def atomic_cmp_swap_8 :
+  PatFrag<(ops node:$ptr, node:$cmp, node:$swap),
+          (atomic_cmp_swap node:$ptr, node:$cmp, node:$swap), [{
+  return cast<AtomicSDNode>(N)->getMemoryVT() == MVT::i8;
+}]>;
+def atomic_cmp_swap_16 :
+  PatFrag<(ops node:$ptr, node:$cmp, node:$swap),
+          (atomic_cmp_swap node:$ptr, node:$cmp, node:$swap), [{
+  return cast<AtomicSDNode>(N)->getMemoryVT() == MVT::i16;
+}]>;
+def atomic_cmp_swap_32 :
+  PatFrag<(ops node:$ptr, node:$cmp, node:$swap),
+          (atomic_cmp_swap node:$ptr, node:$cmp, node:$swap), [{
+  return cast<AtomicSDNode>(N)->getMemoryVT() == MVT::i32;
+}]>;
+def atomic_cmp_swap_64 :
+  PatFrag<(ops node:$ptr, node:$cmp, node:$swap),
+          (atomic_cmp_swap node:$ptr, node:$cmp, node:$swap), [{
+  return cast<AtomicSDNode>(N)->getMemoryVT() == MVT::i64;
+}]>;
+
+multiclass binary_atomic_op<SDNode atomic_op> {
+  def _8 : PatFrag<(ops node:$ptr, node:$val),
+                   (atomic_op node:$ptr, node:$val), [{
+    return cast<AtomicSDNode>(N)->getMemoryVT() == MVT::i8;
+  }]>;
+  def _16 : PatFrag<(ops node:$ptr, node:$val),
+                    (atomic_op node:$ptr, node:$val), [{
+    return cast<AtomicSDNode>(N)->getMemoryVT() == MVT::i16;
+  }]>;
+  def _32 : PatFrag<(ops node:$ptr, node:$val),
+                    (atomic_op node:$ptr, node:$val), [{
+    return cast<AtomicSDNode>(N)->getMemoryVT() == MVT::i32;
+  }]>;
+  def _64 : PatFrag<(ops node:$ptr, node:$val),
+                    (atomic_op node:$ptr, node:$val), [{
+    return cast<AtomicSDNode>(N)->getMemoryVT() == MVT::i64;
+  }]>;
+}
+
+defm atomic_load_add  : binary_atomic_op<atomic_load_add>;
+defm atomic_swap      : binary_atomic_op<atomic_swap>;
+defm atomic_load_sub  : binary_atomic_op<atomic_load_sub>;
+defm atomic_load_and  : binary_atomic_op<atomic_load_and>;
+defm atomic_load_or   : binary_atomic_op<atomic_load_or>;
+defm atomic_load_xor  : binary_atomic_op<atomic_load_xor>;
+defm atomic_load_nand : binary_atomic_op<atomic_load_nand>;
+defm atomic_load_min  : binary_atomic_op<atomic_load_min>;
+defm atomic_load_max  : binary_atomic_op<atomic_load_max>;
+defm atomic_load_umin : binary_atomic_op<atomic_load_umin>;
+defm atomic_load_umax : binary_atomic_op<atomic_load_umax>;
+
//===----------------------------------------------------------------------===// // Selection DAG CONVERT_RNDSAT patterns Modified: llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp?rev=61389&r1=61388&r2=61389&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp (original) +++ llvm/trunk/lib/CodeGen/ScheduleDAGInstrs.cpp Tue Dec 23 15:37:04 2008 @@ -413,6 +413,7 @@ while (!BB->empty()) BB->remove(BB->begin()); + // Then re-insert them according to the given schedule. for (unsigned i = 0, e = Sequence.size(); i != e; i++) { SUnit *SU = Sequence[i]; if (!SU) { Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp?rev=61389&r1=61388&r2=61389&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Tue Dec 23 15:37:04 2008 @@ -1403,10 +1403,7 @@ break; } - case ISD::ATOMIC_CMP_SWAP_8: - case ISD::ATOMIC_CMP_SWAP_16: - case ISD::ATOMIC_CMP_SWAP_32: - case ISD::ATOMIC_CMP_SWAP_64: { + case ISD::ATOMIC_CMP_SWAP: { unsigned int num_operands = 4; assert(Node->getNumOperands() == num_operands && "Invalid Atomic node!"); SDValue Ops[4]; @@ -1426,50 +1423,17 @@ AddLegalizedOperand(SDValue(Node, 1), Result.getValue(1)); return Result.getValue(Op.getResNo()); } - case ISD::ATOMIC_LOAD_ADD_8: - case ISD::ATOMIC_LOAD_SUB_8: - case ISD::ATOMIC_LOAD_AND_8: - case ISD::ATOMIC_LOAD_OR_8: - case ISD::ATOMIC_LOAD_XOR_8: - case ISD::ATOMIC_LOAD_NAND_8: - case ISD::ATOMIC_LOAD_MIN_8: - case ISD::ATOMIC_LOAD_MAX_8: - case ISD::ATOMIC_LOAD_UMIN_8: - case ISD::ATOMIC_LOAD_UMAX_8: - case ISD::ATOMIC_SWAP_8: - case ISD::ATOMIC_LOAD_ADD_16: - case ISD::ATOMIC_LOAD_SUB_16: - case 
ISD::ATOMIC_LOAD_AND_16:
-  case ISD::ATOMIC_LOAD_OR_16:
-  case ISD::ATOMIC_LOAD_XOR_16:
-  case ISD::ATOMIC_LOAD_NAND_16:
-  case ISD::ATOMIC_LOAD_MIN_16:
-  case ISD::ATOMIC_LOAD_MAX_16:
-  case ISD::ATOMIC_LOAD_UMIN_16:
-  case ISD::ATOMIC_LOAD_UMAX_16:
-  case ISD::ATOMIC_SWAP_16:
-  case ISD::ATOMIC_LOAD_ADD_32:
-  case ISD::ATOMIC_LOAD_SUB_32:
-  case ISD::ATOMIC_LOAD_AND_32:
-  case ISD::ATOMIC_LOAD_OR_32:
-  case ISD::ATOMIC_LOAD_XOR_32:
-  case ISD::ATOMIC_LOAD_NAND_32:
-  case ISD::ATOMIC_LOAD_MIN_32:
-  case ISD::ATOMIC_LOAD_MAX_32:
-  case ISD::ATOMIC_LOAD_UMIN_32:
-  case ISD::ATOMIC_LOAD_UMAX_32:
-  case ISD::ATOMIC_SWAP_32:
-  case ISD::ATOMIC_LOAD_ADD_64:
-  case ISD::ATOMIC_LOAD_SUB_64:
-  case ISD::ATOMIC_LOAD_AND_64:
-  case ISD::ATOMIC_LOAD_OR_64:
-  case ISD::ATOMIC_LOAD_XOR_64:
-  case ISD::ATOMIC_LOAD_NAND_64:
-  case ISD::ATOMIC_LOAD_MIN_64:
-  case ISD::ATOMIC_LOAD_MAX_64:
-  case ISD::ATOMIC_LOAD_UMIN_64:
-  case ISD::ATOMIC_LOAD_UMAX_64:
-  case ISD::ATOMIC_SWAP_64: {
+  case ISD::ATOMIC_LOAD_ADD:
+  case ISD::ATOMIC_LOAD_SUB:
+  case ISD::ATOMIC_LOAD_AND:
+  case ISD::ATOMIC_LOAD_OR:
+  case ISD::ATOMIC_LOAD_XOR:
+  case ISD::ATOMIC_LOAD_NAND:
+  case ISD::ATOMIC_LOAD_MIN:
+  case ISD::ATOMIC_LOAD_MAX:
+  case ISD::ATOMIC_LOAD_UMIN:
+  case ISD::ATOMIC_LOAD_UMAX:
+  case ISD::ATOMIC_SWAP: {
     unsigned int num_operands = 3;
     assert(Node->getNumOperands() == num_operands && "Invalid Atomic node!");
     SDValue Ops[3];
@@ -4718,14 +4682,12 @@
     break;
   }
-  case ISD::ATOMIC_CMP_SWAP_8:
-  case ISD::ATOMIC_CMP_SWAP_16:
-  case ISD::ATOMIC_CMP_SWAP_32:
-  case ISD::ATOMIC_CMP_SWAP_64: {
+  case ISD::ATOMIC_CMP_SWAP: {
     AtomicSDNode* AtomNode = cast<AtomicSDNode>(Node);
     Tmp2 = PromoteOp(Node->getOperand(2));
     Tmp3 = PromoteOp(Node->getOperand(3));
-    Result = DAG.getAtomic(Node->getOpcode(), AtomNode->getChain(),
+    Result = DAG.getAtomic(Node->getOpcode(), AtomNode->getMemoryVT(),
+                           AtomNode->getChain(),
                            AtomNode->getBasePtr(), Tmp2, Tmp3,
                            AtomNode->getSrcValue(),
                            AtomNode->getAlignment());
@@ -4733,53 +4695,21 @@
     AddLegalizedOperand(Op.getValue(1), LegalizeOp(Result.getValue(1)));
     break;
   }
-  case ISD::ATOMIC_LOAD_ADD_8:
-  case ISD::ATOMIC_LOAD_SUB_8:
-  case ISD::ATOMIC_LOAD_AND_8:
-  case ISD::ATOMIC_LOAD_OR_8:
-  case ISD::ATOMIC_LOAD_XOR_8:
-  case ISD::ATOMIC_LOAD_NAND_8:
-  case ISD::ATOMIC_LOAD_MIN_8:
-  case ISD::ATOMIC_LOAD_MAX_8:
-  case ISD::ATOMIC_LOAD_UMIN_8:
-  case ISD::ATOMIC_LOAD_UMAX_8:
-  case ISD::ATOMIC_SWAP_8:
-  case ISD::ATOMIC_LOAD_ADD_16:
-  case ISD::ATOMIC_LOAD_SUB_16:
-  case ISD::ATOMIC_LOAD_AND_16:
-  case ISD::ATOMIC_LOAD_OR_16:
-  case ISD::ATOMIC_LOAD_XOR_16:
-  case ISD::ATOMIC_LOAD_NAND_16:
-  case ISD::ATOMIC_LOAD_MIN_16:
-  case ISD::ATOMIC_LOAD_MAX_16:
-  case ISD::ATOMIC_LOAD_UMIN_16:
-  case ISD::ATOMIC_LOAD_UMAX_16:
-  case ISD::ATOMIC_SWAP_16:
-  case ISD::ATOMIC_LOAD_ADD_32:
-  case ISD::ATOMIC_LOAD_SUB_32:
-  case ISD::ATOMIC_LOAD_AND_32:
-  case ISD::ATOMIC_LOAD_OR_32:
-  case ISD::ATOMIC_LOAD_XOR_32:
-  case ISD::ATOMIC_LOAD_NAND_32:
-  case ISD::ATOMIC_LOAD_MIN_32:
-  case ISD::ATOMIC_LOAD_MAX_32:
-  case ISD::ATOMIC_LOAD_UMIN_32:
-  case ISD::ATOMIC_LOAD_UMAX_32:
-  case ISD::ATOMIC_SWAP_32:
-  case ISD::ATOMIC_LOAD_ADD_64:
-  case ISD::ATOMIC_LOAD_SUB_64:
-  case ISD::ATOMIC_LOAD_AND_64:
-  case ISD::ATOMIC_LOAD_OR_64:
-  case ISD::ATOMIC_LOAD_XOR_64:
-  case ISD::ATOMIC_LOAD_NAND_64:
-  case ISD::ATOMIC_LOAD_MIN_64:
-  case ISD::ATOMIC_LOAD_MAX_64:
-  case ISD::ATOMIC_LOAD_UMIN_64:
-  case ISD::ATOMIC_LOAD_UMAX_64:
-  case ISD::ATOMIC_SWAP_64: {
+  case ISD::ATOMIC_LOAD_ADD:
+  case ISD::ATOMIC_LOAD_SUB:
+  case ISD::ATOMIC_LOAD_AND:
+  case ISD::ATOMIC_LOAD_OR:
+  case ISD::ATOMIC_LOAD_XOR:
+  case ISD::ATOMIC_LOAD_NAND:
+  case ISD::ATOMIC_LOAD_MIN:
+  case ISD::ATOMIC_LOAD_MAX:
+  case ISD::ATOMIC_LOAD_UMIN:
+  case ISD::ATOMIC_LOAD_UMAX:
+  case ISD::ATOMIC_SWAP: {
     AtomicSDNode* AtomNode = cast<AtomicSDNode>(Node);
     Tmp2 = PromoteOp(Node->getOperand(2));
-    Result = DAG.getAtomic(Node->getOpcode(), AtomNode->getChain(),
+    Result = DAG.getAtomic(Node->getOpcode(), AtomNode->getMemoryVT(),
+                           AtomNode->getChain(),
                            AtomNode->getBasePtr(), Tmp2,
                            AtomNode->getSrcValue(),
                            AtomNode->getAlignment());
@@ -6769,7 +6699,7 @@
     break;
   }
-  case ISD::ATOMIC_CMP_SWAP_64: {
+  case ISD::ATOMIC_CMP_SWAP: {
     // This operation does not need a loop.
     SDValue Tmp = TLI.LowerOperation(Op, DAG);
     assert(Tmp.getNode() && "Node must be custom expanded!");
@@ -6779,13 +6709,13 @@
     break;
   }
-  case ISD::ATOMIC_LOAD_ADD_64:
-  case ISD::ATOMIC_LOAD_SUB_64:
-  case ISD::ATOMIC_LOAD_AND_64:
-  case ISD::ATOMIC_LOAD_OR_64:
-  case ISD::ATOMIC_LOAD_XOR_64:
-  case ISD::ATOMIC_LOAD_NAND_64:
-  case ISD::ATOMIC_SWAP_64: {
+  case ISD::ATOMIC_LOAD_ADD:
+  case ISD::ATOMIC_LOAD_SUB:
+  case ISD::ATOMIC_LOAD_AND:
+  case ISD::ATOMIC_LOAD_OR:
+  case ISD::ATOMIC_LOAD_XOR:
+  case ISD::ATOMIC_LOAD_NAND:
+  case ISD::ATOMIC_SWAP: {
     // These operations require a loop to be generated. We can't do that yet,
     // so substitute a target-dependent pseudo and expand that later.
     SDValue In2Lo, In2Hi, In2;
@@ -6793,7 +6723,8 @@
     In2 = DAG.getNode(ISD::BUILD_PAIR, VT, In2Lo, In2Hi);
     AtomicSDNode* Anode = cast<AtomicSDNode>(Node);
     SDValue Replace =
-      DAG.getAtomic(Op.getOpcode(), Op.getOperand(0), Op.getOperand(1), In2,
+      DAG.getAtomic(Op.getOpcode(), Anode->getMemoryVT(),
+                    Op.getOperand(0), Op.getOperand(1), In2,
                     Anode->getSrcValue(), Anode->getAlignment());
     SDValue Result = TLI.LowerOperation(Replace, DAG);
     ExpandOp(Result.getValue(0), Lo, Hi);
@@ -8318,54 +8249,18 @@
                          Node->getOperand(2));
     break;
   }
-  case ISD::ATOMIC_CMP_SWAP_8:
-  case ISD::ATOMIC_CMP_SWAP_16:
-  case ISD::ATOMIC_CMP_SWAP_32:
-  case ISD::ATOMIC_CMP_SWAP_64:
-  case ISD::ATOMIC_LOAD_ADD_8:
-  case ISD::ATOMIC_LOAD_SUB_8:
-  case ISD::ATOMIC_LOAD_AND_8:
-  case ISD::ATOMIC_LOAD_OR_8:
-  case ISD::ATOMIC_LOAD_XOR_8:
-  case ISD::ATOMIC_LOAD_NAND_8:
-  case ISD::ATOMIC_LOAD_MIN_8:
-  case ISD::ATOMIC_LOAD_MAX_8:
-  case ISD::ATOMIC_LOAD_UMIN_8:
-  case ISD::ATOMIC_LOAD_UMAX_8:
-  case ISD::ATOMIC_SWAP_8:
-  case ISD::ATOMIC_LOAD_ADD_16:
-  case ISD::ATOMIC_LOAD_SUB_16:
-  case
ISD::ATOMIC_LOAD_AND_16: - case ISD::ATOMIC_LOAD_OR_16: - case ISD::ATOMIC_LOAD_XOR_16: - case ISD::ATOMIC_LOAD_NAND_16: - case ISD::ATOMIC_LOAD_MIN_16: - case ISD::ATOMIC_LOAD_MAX_16: - case ISD::ATOMIC_LOAD_UMIN_16: - case ISD::ATOMIC_LOAD_UMAX_16: - case ISD::ATOMIC_SWAP_16: - case ISD::ATOMIC_LOAD_ADD_32: - case ISD::ATOMIC_LOAD_SUB_32: - case ISD::ATOMIC_LOAD_AND_32: - case ISD::ATOMIC_LOAD_OR_32: - case ISD::ATOMIC_LOAD_XOR_32: - case ISD::ATOMIC_LOAD_NAND_32: - case ISD::ATOMIC_LOAD_MIN_32: - case ISD::ATOMIC_LOAD_MAX_32: - case ISD::ATOMIC_LOAD_UMIN_32: - case ISD::ATOMIC_LOAD_UMAX_32: - case ISD::ATOMIC_SWAP_32: - case ISD::ATOMIC_LOAD_ADD_64: - case ISD::ATOMIC_LOAD_SUB_64: - case ISD::ATOMIC_LOAD_AND_64: - case ISD::ATOMIC_LOAD_OR_64: - case ISD::ATOMIC_LOAD_XOR_64: - case ISD::ATOMIC_LOAD_NAND_64: - case ISD::ATOMIC_LOAD_MIN_64: - case ISD::ATOMIC_LOAD_MAX_64: - case ISD::ATOMIC_LOAD_UMIN_64: - case ISD::ATOMIC_LOAD_UMAX_64: - case ISD::ATOMIC_SWAP_64: { + case ISD::ATOMIC_CMP_SWAP: + case ISD::ATOMIC_LOAD_ADD: + case ISD::ATOMIC_LOAD_SUB: + case ISD::ATOMIC_LOAD_AND: + case ISD::ATOMIC_LOAD_OR: + case ISD::ATOMIC_LOAD_XOR: + case ISD::ATOMIC_LOAD_NAND: + case ISD::ATOMIC_LOAD_MIN: + case ISD::ATOMIC_LOAD_MAX: + case ISD::ATOMIC_LOAD_UMIN: + case ISD::ATOMIC_LOAD_UMAX: + case ISD::ATOMIC_SWAP: { // For now, we assume that using vectors for these operations don't make // much sense so we just split it. 
We return an empty result SDValue X, Y; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp?rev=61389&r1=61388&r2=61389&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp Tue Dec 23 15:37:04 2008 @@ -98,56 +98,20 @@ case ISD::SMULO: case ISD::UMULO: Result = PromoteIntRes_XMULO(N, ResNo); break; - case ISD::ATOMIC_LOAD_ADD_8: - case ISD::ATOMIC_LOAD_SUB_8: - case ISD::ATOMIC_LOAD_AND_8: - case ISD::ATOMIC_LOAD_OR_8: - case ISD::ATOMIC_LOAD_XOR_8: - case ISD::ATOMIC_LOAD_NAND_8: - case ISD::ATOMIC_LOAD_MIN_8: - case ISD::ATOMIC_LOAD_MAX_8: - case ISD::ATOMIC_LOAD_UMIN_8: - case ISD::ATOMIC_LOAD_UMAX_8: - case ISD::ATOMIC_SWAP_8: - case ISD::ATOMIC_LOAD_ADD_16: - case ISD::ATOMIC_LOAD_SUB_16: - case ISD::ATOMIC_LOAD_AND_16: - case ISD::ATOMIC_LOAD_OR_16: - case ISD::ATOMIC_LOAD_XOR_16: - case ISD::ATOMIC_LOAD_NAND_16: - case ISD::ATOMIC_LOAD_MIN_16: - case ISD::ATOMIC_LOAD_MAX_16: - case ISD::ATOMIC_LOAD_UMIN_16: - case ISD::ATOMIC_LOAD_UMAX_16: - case ISD::ATOMIC_SWAP_16: - case ISD::ATOMIC_LOAD_ADD_32: - case ISD::ATOMIC_LOAD_SUB_32: - case ISD::ATOMIC_LOAD_AND_32: - case ISD::ATOMIC_LOAD_OR_32: - case ISD::ATOMIC_LOAD_XOR_32: - case ISD::ATOMIC_LOAD_NAND_32: - case ISD::ATOMIC_LOAD_MIN_32: - case ISD::ATOMIC_LOAD_MAX_32: - case ISD::ATOMIC_LOAD_UMIN_32: - case ISD::ATOMIC_LOAD_UMAX_32: - case ISD::ATOMIC_SWAP_32: - case ISD::ATOMIC_LOAD_ADD_64: - case ISD::ATOMIC_LOAD_SUB_64: - case ISD::ATOMIC_LOAD_AND_64: - case ISD::ATOMIC_LOAD_OR_64: - case ISD::ATOMIC_LOAD_XOR_64: - case ISD::ATOMIC_LOAD_NAND_64: - case ISD::ATOMIC_LOAD_MIN_64: - case ISD::ATOMIC_LOAD_MAX_64: - case ISD::ATOMIC_LOAD_UMIN_64: - case ISD::ATOMIC_LOAD_UMAX_64: - case ISD::ATOMIC_SWAP_64: + case 
ISD::ATOMIC_LOAD_ADD:
+  case ISD::ATOMIC_LOAD_SUB:
+  case ISD::ATOMIC_LOAD_AND:
+  case ISD::ATOMIC_LOAD_OR:
+  case ISD::ATOMIC_LOAD_XOR:
+  case ISD::ATOMIC_LOAD_NAND:
+  case ISD::ATOMIC_LOAD_MIN:
+  case ISD::ATOMIC_LOAD_MAX:
+  case ISD::ATOMIC_LOAD_UMIN:
+  case ISD::ATOMIC_LOAD_UMAX:
+  case ISD::ATOMIC_SWAP:
     Result = PromoteIntRes_Atomic1(cast<AtomicSDNode>(N)); break;
-  case ISD::ATOMIC_CMP_SWAP_8:
-  case ISD::ATOMIC_CMP_SWAP_16:
-  case ISD::ATOMIC_CMP_SWAP_32:
-  case ISD::ATOMIC_CMP_SWAP_64:
+  case ISD::ATOMIC_CMP_SWAP:
     Result = PromoteIntRes_Atomic2(cast<AtomicSDNode>(N)); break;
   }
@@ -170,7 +134,8 @@
 SDValue DAGTypeLegalizer::PromoteIntRes_Atomic1(AtomicSDNode *N) {
   SDValue Op2 = GetPromotedInteger(N->getOperand(2));
-  SDValue Res = DAG.getAtomic(N->getOpcode(), N->getChain(), N->getBasePtr(),
+  SDValue Res = DAG.getAtomic(N->getOpcode(), N->getMemoryVT(),
+                              N->getChain(), N->getBasePtr(),
                               Op2, N->getSrcValue(), N->getAlignment());
   // Legalized the chain result - switch anything that used the old chain to
   // use the new one.
@@ -181,7 +146,8 @@
 SDValue DAGTypeLegalizer::PromoteIntRes_Atomic2(AtomicSDNode *N) {
   SDValue Op2 = GetPromotedInteger(N->getOperand(2));
   SDValue Op3 = GetPromotedInteger(N->getOperand(3));
-  SDValue Res = DAG.getAtomic(N->getOpcode(), N->getChain(), N->getBasePtr(),
+  SDValue Res = DAG.getAtomic(N->getOpcode(), N->getMemoryVT(),
+                              N->getChain(), N->getBasePtr(),
                               Op2, Op3, N->getSrcValue(), N->getAlignment());
   // Legalized the chain result - switch anything that used the old chain to
   // use the new one.
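Editorial sketch, not part of r61389: the PromoteIntRes_Atomic1 and PromoteIntRes_Atomic2 hunks above forward N->getMemoryVT() unchanged while the value operands are promoted. A toy model of why the memory width has to travel with the node, once it is no longer encoded in the opcode, might look like the following. ToyAtomicNode, promoteResult and isAtomicLoadAdd8 are hypothetical names invented for this illustration; they are not LLVM APIs and the enums are stand-ins for ISD opcodes and MVT.

```cpp
// Toy stand-ins for ISD opcodes and MVT; these are NOT the real LLVM types.
enum class Opcode { AtomicCmpSwap, AtomicSwap, AtomicLoadAdd };
enum class ValueType { i8, i16, i32, i64 };

// Hypothetical node: one width-independent opcode plus an explicit memory type.
struct ToyAtomicNode {
  Opcode op;
  ValueType resultVT;  // type of the value operand/result; may be promoted
  ValueType memVT;     // width actually accessed in memory; never promoted
};

// Loosely mirrors the idea behind PromoteIntRes_Atomic1: the result type is
// widened, but the memory type is copied through untouched.
ToyAtomicNode promoteResult(const ToyAtomicNode &n, ValueType wider) {
  return ToyAtomicNode{n.op, wider, n.memVT};
}

// Loosely mirrors the idea behind an atomic_load_add_8 pattern predicate:
// match on the generic opcode, then check the memory type.
bool isAtomicLoadAdd8(const ToyAtomicNode &n) {
  return n.op == Opcode::AtomicLoadAdd && n.memVT == ValueType::i8;
}
```

In this model an i8 atomic add whose result is promoted to i32 still satisfies the 8-bit predicate, which is the property the per-width opcodes used to provide implicitly.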
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=61389&r1=61388&r2=61389&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Tue Dec 23 15:37:04 2008 @@ -442,54 +442,18 @@ ID.AddInteger(ST->getRawFlags()); break; } - case ISD::ATOMIC_CMP_SWAP_8: - case ISD::ATOMIC_SWAP_8: - case ISD::ATOMIC_LOAD_ADD_8: - case ISD::ATOMIC_LOAD_SUB_8: - case ISD::ATOMIC_LOAD_AND_8: - case ISD::ATOMIC_LOAD_OR_8: - case ISD::ATOMIC_LOAD_XOR_8: - case ISD::ATOMIC_LOAD_NAND_8: - case ISD::ATOMIC_LOAD_MIN_8: - case ISD::ATOMIC_LOAD_MAX_8: - case ISD::ATOMIC_LOAD_UMIN_8: - case ISD::ATOMIC_LOAD_UMAX_8: - case ISD::ATOMIC_CMP_SWAP_16: - case ISD::ATOMIC_SWAP_16: - case ISD::ATOMIC_LOAD_ADD_16: - case ISD::ATOMIC_LOAD_SUB_16: - case ISD::ATOMIC_LOAD_AND_16: - case ISD::ATOMIC_LOAD_OR_16: - case ISD::ATOMIC_LOAD_XOR_16: - case ISD::ATOMIC_LOAD_NAND_16: - case ISD::ATOMIC_LOAD_MIN_16: - case ISD::ATOMIC_LOAD_MAX_16: - case ISD::ATOMIC_LOAD_UMIN_16: - case ISD::ATOMIC_LOAD_UMAX_16: - case ISD::ATOMIC_CMP_SWAP_32: - case ISD::ATOMIC_SWAP_32: - case ISD::ATOMIC_LOAD_ADD_32: - case ISD::ATOMIC_LOAD_SUB_32: - case ISD::ATOMIC_LOAD_AND_32: - case ISD::ATOMIC_LOAD_OR_32: - case ISD::ATOMIC_LOAD_XOR_32: - case ISD::ATOMIC_LOAD_NAND_32: - case ISD::ATOMIC_LOAD_MIN_32: - case ISD::ATOMIC_LOAD_MAX_32: - case ISD::ATOMIC_LOAD_UMIN_32: - case ISD::ATOMIC_LOAD_UMAX_32: - case ISD::ATOMIC_CMP_SWAP_64: - case ISD::ATOMIC_SWAP_64: - case ISD::ATOMIC_LOAD_ADD_64: - case ISD::ATOMIC_LOAD_SUB_64: - case ISD::ATOMIC_LOAD_AND_64: - case ISD::ATOMIC_LOAD_OR_64: - case ISD::ATOMIC_LOAD_XOR_64: - case ISD::ATOMIC_LOAD_NAND_64: - case ISD::ATOMIC_LOAD_MIN_64: - case ISD::ATOMIC_LOAD_MAX_64: - case ISD::ATOMIC_LOAD_UMIN_64: - case 
ISD::ATOMIC_LOAD_UMAX_64: {
+  case ISD::ATOMIC_CMP_SWAP:
+  case ISD::ATOMIC_SWAP:
+  case ISD::ATOMIC_LOAD_ADD:
+  case ISD::ATOMIC_LOAD_SUB:
+  case ISD::ATOMIC_LOAD_AND:
+  case ISD::ATOMIC_LOAD_OR:
+  case ISD::ATOMIC_LOAD_XOR:
+  case ISD::ATOMIC_LOAD_NAND:
+  case ISD::ATOMIC_LOAD_MIN:
+  case ISD::ATOMIC_LOAD_MAX:
+  case ISD::ATOMIC_LOAD_UMIN:
+  case ISD::ATOMIC_LOAD_UMAX: {
     const AtomicSDNode *AT = cast<AtomicSDNode>(N);
     ID.AddInteger(AT->getRawFlags());
     break;
@@ -3287,20 +3251,18 @@
   return CallResult.second;
 }
-SDValue SelectionDAG::getAtomic(unsigned Opcode, SDValue Chain,
+SDValue SelectionDAG::getAtomic(unsigned Opcode, MVT MemVT,
+                                SDValue Chain,
                                 SDValue Ptr, SDValue Cmp,
                                 SDValue Swp, const Value* PtrVal,
                                 unsigned Alignment) {
-  assert((Opcode == ISD::ATOMIC_CMP_SWAP_8 ||
-          Opcode == ISD::ATOMIC_CMP_SWAP_16 ||
-          Opcode == ISD::ATOMIC_CMP_SWAP_32 ||
-          Opcode == ISD::ATOMIC_CMP_SWAP_64) && "Invalid Atomic Op");
+  assert(Opcode == ISD::ATOMIC_CMP_SWAP && "Invalid Atomic Op");
   assert(Cmp.getValueType() == Swp.getValueType() && "Invalid Atomic Op Types");
   MVT VT = Cmp.getValueType();
   if (Alignment == 0)  // Ensure that codegen never sees alignment 0
-    Alignment = getMVTAlignment(VT);
+    Alignment = getMVTAlignment(MemVT);
   SDVTList VTs = getVTList(VT, MVT::Other);
   FoldingSetNodeID ID;
@@ -3310,65 +3272,35 @@
   if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
     return SDValue(E, 0);
   SDNode* N = NodeAllocator.Allocate<AtomicSDNode>();
-  new (N) AtomicSDNode(Opcode, VTs, Chain, Ptr, Cmp, Swp, PtrVal, Alignment);
+  new (N) AtomicSDNode(Opcode, VTs, MemVT,
+                       Chain, Ptr, Cmp, Swp, PtrVal, Alignment);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
   return SDValue(N, 0);
 }
-SDValue SelectionDAG::getAtomic(unsigned Opcode, SDValue Chain,
+SDValue SelectionDAG::getAtomic(unsigned Opcode, MVT MemVT,
+                                SDValue Chain,
                                 SDValue Ptr, SDValue Val,
                                 const Value* PtrVal,
                                 unsigned Alignment) {
-  assert((Opcode == ISD::ATOMIC_LOAD_ADD_8 ||
-          Opcode == ISD::ATOMIC_LOAD_SUB_8 ||
-          Opcode == ISD::ATOMIC_LOAD_AND_8 ||
-
Opcode == ISD::ATOMIC_LOAD_OR_8 || - Opcode == ISD::ATOMIC_LOAD_XOR_8 || - Opcode == ISD::ATOMIC_LOAD_NAND_8 || - Opcode == ISD::ATOMIC_LOAD_MIN_8 || - Opcode == ISD::ATOMIC_LOAD_MAX_8 || - Opcode == ISD::ATOMIC_LOAD_UMIN_8 || - Opcode == ISD::ATOMIC_LOAD_UMAX_8 || - Opcode == ISD::ATOMIC_SWAP_8 || - Opcode == ISD::ATOMIC_LOAD_ADD_16 || - Opcode == ISD::ATOMIC_LOAD_SUB_16 || - Opcode == ISD::ATOMIC_LOAD_AND_16 || - Opcode == ISD::ATOMIC_LOAD_OR_16 || - Opcode == ISD::ATOMIC_LOAD_XOR_16 || - Opcode == ISD::ATOMIC_LOAD_NAND_16 || - Opcode == ISD::ATOMIC_LOAD_MIN_16 || - Opcode == ISD::ATOMIC_LOAD_MAX_16 || - Opcode == ISD::ATOMIC_LOAD_UMIN_16 || - Opcode == ISD::ATOMIC_LOAD_UMAX_16 || - Opcode == ISD::ATOMIC_SWAP_16 || - Opcode == ISD::ATOMIC_LOAD_ADD_32 || - Opcode == ISD::ATOMIC_LOAD_SUB_32 || - Opcode == ISD::ATOMIC_LOAD_AND_32 || - Opcode == ISD::ATOMIC_LOAD_OR_32 || - Opcode == ISD::ATOMIC_LOAD_XOR_32 || - Opcode == ISD::ATOMIC_LOAD_NAND_32 || - Opcode == ISD::ATOMIC_LOAD_MIN_32 || - Opcode == ISD::ATOMIC_LOAD_MAX_32 || - Opcode == ISD::ATOMIC_LOAD_UMIN_32 || - Opcode == ISD::ATOMIC_LOAD_UMAX_32 || - Opcode == ISD::ATOMIC_SWAP_32 || - Opcode == ISD::ATOMIC_LOAD_ADD_64 || - Opcode == ISD::ATOMIC_LOAD_SUB_64 || - Opcode == ISD::ATOMIC_LOAD_AND_64 || - Opcode == ISD::ATOMIC_LOAD_OR_64 || - Opcode == ISD::ATOMIC_LOAD_XOR_64 || - Opcode == ISD::ATOMIC_LOAD_NAND_64 || - Opcode == ISD::ATOMIC_LOAD_MIN_64 || - Opcode == ISD::ATOMIC_LOAD_MAX_64 || - Opcode == ISD::ATOMIC_LOAD_UMIN_64 || - Opcode == ISD::ATOMIC_LOAD_UMAX_64 || - Opcode == ISD::ATOMIC_SWAP_64) && "Invalid Atomic Op"); + assert((Opcode == ISD::ATOMIC_LOAD_ADD || + Opcode == ISD::ATOMIC_LOAD_SUB || + Opcode == ISD::ATOMIC_LOAD_AND || + Opcode == ISD::ATOMIC_LOAD_OR || + Opcode == ISD::ATOMIC_LOAD_XOR || + Opcode == ISD::ATOMIC_LOAD_NAND || + Opcode == ISD::ATOMIC_LOAD_MIN || + Opcode == ISD::ATOMIC_LOAD_MAX || + Opcode == ISD::ATOMIC_LOAD_UMIN || + Opcode == ISD::ATOMIC_LOAD_UMAX || + Opcode == 
ISD::ATOMIC_SWAP) &&
+         "Invalid Atomic Op");
   MVT VT = Val.getValueType();
   if (Alignment == 0)  // Ensure that codegen never sees alignment 0
-    Alignment = getMVTAlignment(VT);
+    Alignment = getMVTAlignment(MemVT);
   SDVTList VTs = getVTList(VT, MVT::Other);
   FoldingSetNodeID ID;
@@ -3378,7 +3310,8 @@
   if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
     return SDValue(E, 0);
   SDNode* N = NodeAllocator.Allocate<AtomicSDNode>();
-  new (N) AtomicSDNode(Opcode, VTs, Chain, Ptr, Val, PtrVal, Alignment);
+  new (N) AtomicSDNode(Opcode, VTs, MemVT,
+                       Chain, Ptr, Val, PtrVal, Alignment);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
   return SDValue(N, 0);
@@ -5060,54 +4993,18 @@
 #endif
   case ISD::PREFETCH:      return "Prefetch";
   case ISD::MEMBARRIER:    return "MemBarrier";
-  case ISD::ATOMIC_CMP_SWAP_8:   return "AtomicCmpSwap8";
-  case ISD::ATOMIC_SWAP_8:       return "AtomicSwap8";
-  case ISD::ATOMIC_LOAD_ADD_8:   return "AtomicLoadAdd8";
-  case ISD::ATOMIC_LOAD_SUB_8:   return "AtomicLoadSub8";
-  case ISD::ATOMIC_LOAD_AND_8:   return "AtomicLoadAnd8";
-  case ISD::ATOMIC_LOAD_OR_8:    return "AtomicLoadOr8";
-  case ISD::ATOMIC_LOAD_XOR_8:   return "AtomicLoadXor8";
-  case ISD::ATOMIC_LOAD_NAND_8:  return "AtomicLoadNand8";
-  case ISD::ATOMIC_LOAD_MIN_8:   return "AtomicLoadMin8";
-  case ISD::ATOMIC_LOAD_MAX_8:   return "AtomicLoadMax8";
-  case ISD::ATOMIC_LOAD_UMIN_8:  return "AtomicLoadUMin8";
-  case ISD::ATOMIC_LOAD_UMAX_8:  return "AtomicLoadUMax8";
-  case ISD::ATOMIC_CMP_SWAP_16:  return "AtomicCmpSwap16";
-  case ISD::ATOMIC_SWAP_16:      return "AtomicSwap16";
-  case ISD::ATOMIC_LOAD_ADD_16:  return "AtomicLoadAdd16";
-  case ISD::ATOMIC_LOAD_SUB_16:  return "AtomicLoadSub16";
-  case ISD::ATOMIC_LOAD_AND_16:  return "AtomicLoadAnd16";
-  case ISD::ATOMIC_LOAD_OR_16:   return "AtomicLoadOr16";
-  case ISD::ATOMIC_LOAD_XOR_16:  return "AtomicLoadXor16";
-  case ISD::ATOMIC_LOAD_NAND_16: return "AtomicLoadNand16";
-  case ISD::ATOMIC_LOAD_MIN_16:  return "AtomicLoadMin16";
-  case ISD::ATOMIC_LOAD_MAX_16:  return "AtomicLoadMax16";
-  case
ISD::ATOMIC_LOAD_UMIN_16: return "AtomicLoadUMin16"; - case ISD::ATOMIC_LOAD_UMAX_16: return "AtomicLoadUMax16"; - case ISD::ATOMIC_CMP_SWAP_32: return "AtomicCmpSwap32"; - case ISD::ATOMIC_SWAP_32: return "AtomicSwap32"; - case ISD::ATOMIC_LOAD_ADD_32: return "AtomicLoadAdd32"; - case ISD::ATOMIC_LOAD_SUB_32: return "AtomicLoadSub32"; - case ISD::ATOMIC_LOAD_AND_32: return "AtomicLoadAnd32"; - case ISD::ATOMIC_LOAD_OR_32: return "AtomicLoadOr32"; - case ISD::ATOMIC_LOAD_XOR_32: return "AtomicLoadXor32"; - case ISD::ATOMIC_LOAD_NAND_32: return "AtomicLoadNand32"; - case ISD::ATOMIC_LOAD_MIN_32: return "AtomicLoadMin32"; - case ISD::ATOMIC_LOAD_MAX_32: return "AtomicLoadMax32"; - case ISD::ATOMIC_LOAD_UMIN_32: return "AtomicLoadUMin32"; - case ISD::ATOMIC_LOAD_UMAX_32: return "AtomicLoadUMax32"; - case ISD::ATOMIC_CMP_SWAP_64: return "AtomicCmpSwap64"; - case ISD::ATOMIC_SWAP_64: return "AtomicSwap64"; - case ISD::ATOMIC_LOAD_ADD_64: return "AtomicLoadAdd64"; - case ISD::ATOMIC_LOAD_SUB_64: return "AtomicLoadSub64"; - case ISD::ATOMIC_LOAD_AND_64: return "AtomicLoadAnd64"; - case ISD::ATOMIC_LOAD_OR_64: return "AtomicLoadOr64"; - case ISD::ATOMIC_LOAD_XOR_64: return "AtomicLoadXor64"; - case ISD::ATOMIC_LOAD_NAND_64: return "AtomicLoadNand64"; - case ISD::ATOMIC_LOAD_MIN_64: return "AtomicLoadMin64"; - case ISD::ATOMIC_LOAD_MAX_64: return "AtomicLoadMax64"; - case ISD::ATOMIC_LOAD_UMIN_64: return "AtomicLoadUMin64"; - case ISD::ATOMIC_LOAD_UMAX_64: return "AtomicLoadUMax64"; + case ISD::ATOMIC_CMP_SWAP: return "AtomicCmpSwap"; + case ISD::ATOMIC_SWAP: return "AtomicSwap"; + case ISD::ATOMIC_LOAD_ADD: return "AtomicLoadAdd"; + case ISD::ATOMIC_LOAD_SUB: return "AtomicLoadSub"; + case ISD::ATOMIC_LOAD_AND: return "AtomicLoadAnd"; + case ISD::ATOMIC_LOAD_OR: return "AtomicLoadOr"; + case ISD::ATOMIC_LOAD_XOR: return "AtomicLoadXor"; + case ISD::ATOMIC_LOAD_NAND: return "AtomicLoadNand"; + case ISD::ATOMIC_LOAD_MIN: return "AtomicLoadMin"; + case ISD::ATOMIC_LOAD_MAX: 
return "AtomicLoadMax"; + case ISD::ATOMIC_LOAD_UMIN: return "AtomicLoadUMin"; + case ISD::ATOMIC_LOAD_UMAX: return "AtomicLoadUMax"; case ISD::PCMARKER: return "PCMarker"; case ISD::READCYCLECOUNTER: return "ReadCycleCounter"; case ISD::SRCVALUE: return "SrcValue"; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp?rev=61389&r1=61388&r2=61389&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Tue Dec 23 15:37:04 2008 @@ -2959,10 +2959,12 @@ const char * SelectionDAGLowering::implVisitBinaryAtomic(CallInst& I, ISD::NodeType Op) { SDValue Root = getRoot(); - SDValue L = DAG.getAtomic(Op, Root, - getValue(I.getOperand(1)), - getValue(I.getOperand(2)), - I.getOperand(1)); + SDValue L = + DAG.getAtomic(Op, getValue(I.getOperand(2)).getValueType().getSimpleVT(), + Root, + getValue(I.getOperand(1)), + getValue(I.getOperand(2)), + I.getOperand(1)); setValue(&I, L); DAG.setRoot(L.getValue(1)); return 0; @@ -4145,198 +4147,40 @@ } case Intrinsic::atomic_cmp_swap: { SDValue Root = getRoot(); - SDValue L; - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - L = DAG.getAtomic(ISD::ATOMIC_CMP_SWAP_8, Root, - getValue(I.getOperand(1)), - getValue(I.getOperand(2)), - getValue(I.getOperand(3)), - I.getOperand(1)); - break; - case MVT::i16: - L = DAG.getAtomic(ISD::ATOMIC_CMP_SWAP_16, Root, - getValue(I.getOperand(1)), - getValue(I.getOperand(2)), - getValue(I.getOperand(3)), - I.getOperand(1)); - break; - case MVT::i32: - L = DAG.getAtomic(ISD::ATOMIC_CMP_SWAP_32, Root, - getValue(I.getOperand(1)), - getValue(I.getOperand(2)), - getValue(I.getOperand(3)), - I.getOperand(1)); - break; - case MVT::i64: - L = DAG.getAtomic(ISD::ATOMIC_CMP_SWAP_64, Root, - 
getValue(I.getOperand(1)), - getValue(I.getOperand(2)), - getValue(I.getOperand(3)), - I.getOperand(1)); - break; - default: - assert(0 && "Invalid atomic type"); - abort(); - } + SDValue L = + DAG.getAtomic(ISD::ATOMIC_CMP_SWAP, + getValue(I.getOperand(2)).getValueType().getSimpleVT(), + Root, + getValue(I.getOperand(1)), + getValue(I.getOperand(2)), + getValue(I.getOperand(3)), + I.getOperand(1)); setValue(&I, L); DAG.setRoot(L.getValue(1)); return 0; } case Intrinsic::atomic_load_add: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_ADD_8); - case MVT::i16: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_ADD_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_ADD_32); - case MVT::i64: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_ADD_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_ADD); case Intrinsic::atomic_load_sub: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_SUB_8); - case MVT::i16: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_SUB_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_SUB_32); - case MVT::i64: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_SUB_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_SUB); case Intrinsic::atomic_load_or: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_OR_8); - case MVT::i16: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_OR_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_OR_32); - case MVT::i64: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_OR_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return 
implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_OR); case Intrinsic::atomic_load_xor: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_XOR_8); - case MVT::i16: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_XOR_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_XOR_32); - case MVT::i64: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_XOR_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_XOR); case Intrinsic::atomic_load_and: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_AND_8); - case MVT::i16: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_AND_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_AND_32); - case MVT::i64: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_AND_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_AND); case Intrinsic::atomic_load_nand: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_NAND_8); - case MVT::i16: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_NAND_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_NAND_32); - case MVT::i64: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_NAND_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_NAND); case Intrinsic::atomic_load_max: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MAX_8); - case MVT::i16: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MAX_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MAX_32); - case MVT::i64: - return 
implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MAX_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MAX); case Intrinsic::atomic_load_min: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MIN_8); - case MVT::i16: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MIN_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MIN_32); - case MVT::i64: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MIN_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MIN); case Intrinsic::atomic_load_umin: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMIN_8); - case MVT::i16: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMIN_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMIN_32); - case MVT::i64: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMIN_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMIN); case Intrinsic::atomic_load_umax: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMAX_8); - case MVT::i16: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMAX_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMAX_32); - case MVT::i64: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMAX_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMAX); case Intrinsic::atomic_swap: - switch (getValue(I.getOperand(2)).getValueType().getSimpleVT()) { - case MVT::i8: - return implVisitBinaryAtomic(I, ISD::ATOMIC_SWAP_8); - case MVT::i16: - return implVisitBinaryAtomic(I, 
ISD::ATOMIC_SWAP_16); - case MVT::i32: - return implVisitBinaryAtomic(I, ISD::ATOMIC_SWAP_32); - case MVT::i64: - return implVisitBinaryAtomic(I, ISD::ATOMIC_SWAP_64); - default: - assert(0 && "Invalid atomic type"); - abort(); - } + return implVisitBinaryAtomic(I, ISD::ATOMIC_SWAP); } } Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=61389&r1=61388&r2=61389&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Dec 23 15:37:04 2008 @@ -306,24 +306,24 @@ setOperationAction(ISD::MEMBARRIER , MVT::Other, Expand); // Expand certain atomics - setOperationAction(ISD::ATOMIC_CMP_SWAP_8 , MVT::i8, Custom); - setOperationAction(ISD::ATOMIC_CMP_SWAP_16, MVT::i16, Custom); - setOperationAction(ISD::ATOMIC_CMP_SWAP_32, MVT::i32, Custom); - setOperationAction(ISD::ATOMIC_CMP_SWAP_64, MVT::i64, Custom); - - setOperationAction(ISD::ATOMIC_LOAD_SUB_8 , MVT::i8, Custom); - setOperationAction(ISD::ATOMIC_LOAD_SUB_16, MVT::i16, Custom); - setOperationAction(ISD::ATOMIC_LOAD_SUB_32, MVT::i32, Custom); - setOperationAction(ISD::ATOMIC_LOAD_SUB_64, MVT::i64, Custom); + setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i8, Custom); + setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i16, Custom); + setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i32, Custom); + setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i64, Custom); + + setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i8, Custom); + setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i16, Custom); + setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, Custom); + setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i64, Custom); if (!Subtarget->is64Bit()) { - setOperationAction(ISD::ATOMIC_LOAD_ADD_64, MVT::i64, Custom); - setOperationAction(ISD::ATOMIC_LOAD_SUB_64, MVT::i64, Custom); - 
setOperationAction(ISD::ATOMIC_LOAD_AND_64, MVT::i64, Custom); - setOperationAction(ISD::ATOMIC_LOAD_OR_64, MVT::i64, Custom); - setOperationAction(ISD::ATOMIC_LOAD_XOR_64, MVT::i64, Custom); - setOperationAction(ISD::ATOMIC_LOAD_NAND_64, MVT::i64, Custom); - setOperationAction(ISD::ATOMIC_SWAP_64, MVT::i64, Custom); + setOperationAction(ISD::ATOMIC_LOAD_ADD, MVT::i64, Custom); + setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i64, Custom); + setOperationAction(ISD::ATOMIC_LOAD_AND, MVT::i64, Custom); + setOperationAction(ISD::ATOMIC_LOAD_OR, MVT::i64, Custom); + setOperationAction(ISD::ATOMIC_LOAD_XOR, MVT::i64, Custom); + setOperationAction(ISD::ATOMIC_LOAD_NAND, MVT::i64, Custom); + setOperationAction(ISD::ATOMIC_SWAP, MVT::i64, Custom); } // Use the default ISD::DBG_STOPPOINT, ISD::DECLARE expansion. @@ -6313,13 +6313,8 @@ MVT T = Node->getValueType(0); SDValue negOp = DAG.getNode(ISD::SUB, T, DAG.getConstant(0, T), Node->getOperand(2)); - return DAG.getAtomic((Op.getOpcode()==ISD::ATOMIC_LOAD_SUB_8 ? - ISD::ATOMIC_LOAD_ADD_8 : - Op.getOpcode()==ISD::ATOMIC_LOAD_SUB_16 ? - ISD::ATOMIC_LOAD_ADD_16 : - Op.getOpcode()==ISD::ATOMIC_LOAD_SUB_32 ? 
- ISD::ATOMIC_LOAD_ADD_32 : - ISD::ATOMIC_LOAD_ADD_64), + return DAG.getAtomic(ISD::ATOMIC_LOAD_ADD, + cast(Node)->getMemoryVT(), Node->getOperand(0), Node->getOperand(1), negOp, cast(Node)->getSrcValue(), @@ -6331,14 +6326,8 @@ SDValue X86TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) { switch (Op.getOpcode()) { default: assert(0 && "Should not custom lower this!"); - case ISD::ATOMIC_CMP_SWAP_8: - case ISD::ATOMIC_CMP_SWAP_16: - case ISD::ATOMIC_CMP_SWAP_32: - case ISD::ATOMIC_CMP_SWAP_64: return LowerCMP_SWAP(Op,DAG); - case ISD::ATOMIC_LOAD_SUB_8: - case ISD::ATOMIC_LOAD_SUB_16: - case ISD::ATOMIC_LOAD_SUB_32: return LowerLOAD_SUB(Op,DAG); - case ISD::ATOMIC_LOAD_SUB_64: return LowerLOAD_SUB(Op,DAG); + case ISD::ATOMIC_CMP_SWAP: return LowerCMP_SWAP(Op,DAG); + case ISD::ATOMIC_LOAD_SUB: return LowerLOAD_SUB(Op,DAG); case ISD::BUILD_VECTOR: return LowerBUILD_VECTOR(Op, DAG); case ISD::VECTOR_SHUFFLE: return LowerVECTOR_SHUFFLE(Op, DAG); case ISD::EXTRACT_VECTOR_ELT: return LowerEXTRACT_VECTOR_ELT(Op, DAG); @@ -6445,7 +6434,7 @@ Results.push_back(edx.getValue(1)); return; } - case ISD::ATOMIC_CMP_SWAP_64: { + case ISD::ATOMIC_CMP_SWAP: { MVT T = N->getValueType(0); assert (T == MVT::i64 && "Only know how to expand i64 Cmp and Swap"); SDValue cpInL, cpInH; @@ -6479,25 +6468,25 @@ Results.push_back(cpOutH.getValue(1)); return; } - case ISD::ATOMIC_LOAD_ADD_64: + case ISD::ATOMIC_LOAD_ADD: ReplaceATOMIC_BINARY_64(N, Results, DAG, X86ISD::ATOMADD64_DAG); return; - case ISD::ATOMIC_LOAD_AND_64: + case ISD::ATOMIC_LOAD_AND: ReplaceATOMIC_BINARY_64(N, Results, DAG, X86ISD::ATOMAND64_DAG); return; - case ISD::ATOMIC_LOAD_NAND_64: + case ISD::ATOMIC_LOAD_NAND: ReplaceATOMIC_BINARY_64(N, Results, DAG, X86ISD::ATOMNAND64_DAG); return; - case ISD::ATOMIC_LOAD_OR_64: + case ISD::ATOMIC_LOAD_OR: ReplaceATOMIC_BINARY_64(N, Results, DAG, X86ISD::ATOMOR64_DAG); return; - case ISD::ATOMIC_LOAD_SUB_64: + case ISD::ATOMIC_LOAD_SUB: ReplaceATOMIC_BINARY_64(N, Results, 
DAG, X86ISD::ATOMSUB64_DAG);
     return;
-  case ISD::ATOMIC_LOAD_XOR_64:
+  case ISD::ATOMIC_LOAD_XOR:
     ReplaceATOMIC_BINARY_64(N, Results, DAG, X86ISD::ATOMXOR64_DAG);
     return;
-  case ISD::ATOMIC_SWAP_64:
+  case ISD::ATOMIC_SWAP:
     ReplaceATOMIC_BINARY_64(N, Results, DAG, X86ISD::ATOMSWAP64_DAG);
     return;
   }

From dpatel at apple.com Tue Dec 23 15:55:05 2008
From: dpatel at apple.com (Devang Patel)
Date: Tue, 23 Dec 2008 21:55:05 -0000
Subject: [llvm-commits] [llvm] r61390 - in /llvm/trunk/lib/CodeGen: RegAllocSimple.cpp RegisterScavenging.cpp
Message-ID: <200812232155.mBNLt56N013950@zion.cs.uiuc.edu>

Author: dpatel
Date: Tue Dec 23 15:55:04 2008
New Revision: 61390

URL: http://llvm.org/viewvc/llvm-project?rev=61390&view=rev
Log:
Silence unused warnings.

Modified:
    llvm/trunk/lib/CodeGen/RegAllocSimple.cpp
    llvm/trunk/lib/CodeGen/RegisterScavenging.cpp

Modified: llvm/trunk/lib/CodeGen/RegAllocSimple.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/RegAllocSimple.cpp?rev=61390&r1=61389&r2=61390&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/RegAllocSimple.cpp (original)
+++ llvm/trunk/lib/CodeGen/RegAllocSimple.cpp Tue Dec 23 15:55:04 2008
@@ -122,7 +122,9 @@
 unsigned RegAllocSimple::getFreeReg(unsigned virtualReg) {
   const TargetRegisterClass* RC = MF->getRegInfo().getRegClass(virtualReg);
   TargetRegisterClass::iterator RI = RC->allocation_order_begin(*MF);
+#ifndef NDEBUG
   TargetRegisterClass::iterator RE = RC->allocation_order_end(*MF);
+#endif

   while (1) {
     unsigned regIdx = RegClassIdx[RC]++;

Modified: llvm/trunk/lib/CodeGen/RegisterScavenging.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/RegisterScavenging.cpp?rev=61390&r1=61389&r2=61390&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/RegisterScavenging.cpp (original)
+++ llvm/trunk/lib/CodeGen/RegisterScavenging.cpp Tue Dec 23 15:55:04 2008
@@ -139,6 +139,7 @@
   ScavengedRC = NULL;
 }

+#ifndef NDEBUG
 /// isLiveInButUnusedBefore - Return true if register is livein the MBB not
 /// not used before it reaches the MI that defines register.
 static bool isLiveInButUnusedBefore(unsigned Reg, MachineInstr *MI,
@@ -172,6 +173,7 @@
     return false;
   return true;
 }
+#endif

 void RegScavenger::forward() {
   // Move ptr forward.

From dpatel at apple.com Tue Dec 23 15:55:39 2008
From: dpatel at apple.com (Devang Patel)
Date: Tue, 23 Dec 2008 21:55:39 -0000
Subject: [llvm-commits] [llvm] r61391 - /llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
Message-ID: <200812232155.mBNLteni013987@zion.cs.uiuc.edu>

Author: dpatel
Date: Tue Dec 23 15:55:38 2008
New Revision: 61391

URL: http://llvm.org/viewvc/llvm-project?rev=61391&view=rev
Log:
Fix typo. Silence unused variable warning.

Modified:
    llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp

Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp?rev=61391&r1=61390&r2=61391&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp (original)
+++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp Tue Dec 23 15:55:38 2008
@@ -1616,7 +1616,7 @@
   while (FromTy) {
     if (FromTy->getTag() != DW_TAG_typedef) {
       FieldSize = FromTy->getSize();
-      FieldAlign = FromTy->getSize();
+      FieldAlign = FromTy->getAlign();
       break;
     }

@@ -2776,6 +2776,7 @@
     sys::Path FullPath(Directories[SourceFiles[i].getDirectoryID()]);
     bool AppendOk = FullPath.appendComponent(SourceFiles[i].getName());
     assert(AppendOk && "Could not append filename to directory!");
+    AppendOk = false;
     Asm->EmitFile(i, FullPath.toString());
     Asm->EOL();
   }

From dpatel at apple.com Tue Dec 23 15:56:29 2008
From: dpatel at apple.com (Devang Patel)
Date: Tue, 23 Dec 2008 21:56:29 -0000
Subject: [llvm-commits] [llvm] r61392 - in /llvm/trunk/lib/Target/X86: X86FastISel.cpp X86RegisterInfo.cpp
Message-ID: <200812232156.mBNLuTTN014025@zion.cs.uiuc.edu>

Author: dpatel
Date: Tue Dec 23 15:56:28 2008
New Revision: 61392

URL: http://llvm.org/viewvc/llvm-project?rev=61392&view=rev
Log:
Silence unused variable warnings.

Modified:
    llvm/trunk/lib/Target/X86/X86FastISel.cpp
    llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp

Modified: llvm/trunk/lib/Target/X86/X86FastISel.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86FastISel.cpp?rev=61392&r1=61391&r2=61392&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86FastISel.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86FastISel.cpp Tue Dec 23 15:56:28 2008
@@ -1220,6 +1220,7 @@
       bool Emitted = X86FastEmitExtend(ISD::SIGN_EXTEND, VA.getLocVT(),
                                        Arg, ArgVT, Arg);
       assert(Emitted && "Failed to emit a sext!"); Emitted=Emitted;
+      Emitted = true;
       ArgVT = VA.getLocVT();
       break;
     }
@@ -1227,6 +1228,7 @@
       bool Emitted = X86FastEmitExtend(ISD::ZERO_EXTEND, VA.getLocVT(),
                                        Arg, ArgVT, Arg);
       assert(Emitted && "Failed to emit a zext!"); Emitted=Emitted;
+      Emitted = true;
       ArgVT = VA.getLocVT();
       break;
     }
@@ -1251,6 +1253,7 @@
       bool Emitted = TII.copyRegToReg(*MBB, MBB->end(), VA.getLocReg(),
                                       Arg, RC, RC);
       assert(Emitted && "Failed to emit a copy instruction!"); Emitted=Emitted;
+      Emitted = true;
       RegArgs.push_back(VA.getLocReg());
     } else {
       unsigned LocMemOffset = VA.getLocMemOffset();
@@ -1278,6 +1281,7 @@
     unsigned Base = getInstrInfo()->getGlobalBaseReg(&MF);
     bool Emitted = TII.copyRegToReg(*MBB, MBB->end(), X86::EBX, Base, RC, RC);
     assert(Emitted && "Failed to emit a copy instruction!"); Emitted=Emitted;
+    Emitted = true;
   }

   // Issue the call.
@@ -1329,6 +1333,7 @@
     bool Emitted = TII.copyRegToReg(*MBB, MBB->end(), ResultReg,
                                     RVLocs[0].getLocReg(), DstRC, SrcRC);
     assert(Emitted && "Failed to emit a copy instruction!"); Emitted=Emitted;
+    Emitted = true;
     if (CopyVT != RVLocs[0].getValVT()) {
       // Round the F80 the right size, which also moves to the appropriate xmm
       // register. This is accomplished by storing the F80 value in memory and

Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp?rev=61392&r1=61391&r2=61392&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp Tue Dec 23 15:56:28 2008
@@ -349,6 +349,7 @@
   else {
     unsigned Align = MF.getFrameInfo()->getObjectAlignment(FI);
     assert( (-(Offset + StackSize)) % Align == 0);
+    Align = 0;
     return Offset + StackSize;
   }

@@ -501,6 +502,7 @@
                                     TailCallReturnAddrDelta);
     assert(FrameIdx == MF.getFrameInfo()->getObjectIndexBegin() &&
            "Slot for EBP register must be last in order to be found!");
+    FrameIdx = 0;
   }
 }

From asl at math.spbu.ru Tue Dec 23 16:25:28 2008
From: asl at math.spbu.ru (Anton Korobeynikov)
Date: Tue, 23 Dec 2008 22:25:28 -0000
Subject: [llvm-commits] [llvm] r61395 - in /llvm/trunk/lib/CodeGen/SelectionDAG: SelectionDAGBuild.cpp SelectionDAGBuild.h
Message-ID: <200812232225.mBNMPTx4015251@zion.cs.uiuc.edu>

Author: asl
Date: Tue Dec 23 16:25:27 2008
New Revision: 61395

URL: http://llvm.org/viewvc/llvm-project?rev=61395&view=rev
Log:
Initial checkin of APInt'ification of switch lowering

Modified:
    llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp
    llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h

Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp?rev=61395&r1=61394&r2=61395&view=diff
============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Tue Dec 23 16:25:27 2008 @@ -1259,8 +1259,8 @@ void SelectionDAGLowering::visitSwitchCase(CaseBlock &CB) { SDValue Cond; SDValue CondLHS = getValue(CB.CmpLHS); - - // Build the setcc now. + + // Build the setcc now. if (CB.CmpMHS == NULL) { // Fold "(X == true)" to X and "(X == false)" to !X to // handle common cases produced by branch lowering. @@ -1274,8 +1274,8 @@ } else { assert(CB.CC == ISD::SETLE && "Can handle only LE ranges now"); - uint64_t Low = cast(CB.CmpLHS)->getSExtValue(); - uint64_t High = cast(CB.CmpRHS)->getSExtValue(); + const APInt& Low = cast(CB.CmpLHS)->getValue(); + const APInt& High = cast(CB.CmpRHS)->getValue(); SDValue CmpOp = getValue(CB.CmpMHS); MVT VT = CmpOp.getValueType(); @@ -1288,18 +1288,18 @@ DAG.getConstant(High-Low, VT), ISD::SETULE); } } - + // Update successor info CurMBB->addSuccessor(CB.TrueBB); CurMBB->addSuccessor(CB.FalseBB); - + // Set NextBlock to be the MBB immediately after the current one, if any. // This is used to avoid emitting unnecessary branches to the next block. MachineBasicBlock *NextBlock = 0; MachineFunction::iterator BBI = CurMBB; if (++BBI != CurMBB->getParent()->end()) NextBlock = BBI; - + // If the lhs block is the next block, invert the condition so that we can // fall through to the lhs instead of the rhs block. if (CB.TrueBB == NextBlock) { @@ -1309,20 +1309,20 @@ } SDValue BrCond = DAG.getNode(ISD::BRCOND, MVT::Other, getControlRoot(), Cond, DAG.getBasicBlock(CB.TrueBB)); - + // If the branch was constant folded, fix up the CFG. if (BrCond.getOpcode() == ISD::BR) { CurMBB->removeSuccessor(CB.FalseBB); DAG.setRoot(BrCond); } else { // Otherwise, go ahead and insert the false branch. 
- if (BrCond == getControlRoot()) + if (BrCond == getControlRoot()) CurMBB->removeSuccessor(CB.TrueBB); - + if (CB.FalseBB == NextBlock) DAG.setRoot(BrCond); else - DAG.setRoot(DAG.getNode(ISD::BR, MVT::Other, BrCond, + DAG.setRoot(DAG.getNode(ISD::BR, MVT::Other, BrCond, DAG.getBasicBlock(CB.FalseBB))); } } @@ -1350,7 +1350,7 @@ MVT VT = SwitchOp.getValueType(); SDValue SUB = DAG.getNode(ISD::SUB, VT, SwitchOp, DAG.getConstant(JTH.First, VT)); - + // The SDNode we just created, which holds the value being switched on // minus the the smallest case value, needs to be copied to a virtual // register so it can be used as an index into the jump table in a @@ -1360,7 +1360,7 @@ SwitchOp = DAG.getNode(ISD::TRUNCATE, TLI.getPointerTy(), SUB); else SwitchOp = DAG.getNode(ISD::ZERO_EXTEND, TLI.getPointerTy(), SUB); - + unsigned JumpTableReg = FuncInfo.MakeReg(TLI.getPointerTy()); SDValue CopyTo = DAG.getCopyToReg(getControlRoot(), JumpTableReg, SwitchOp); JT.Reg = JumpTableReg; @@ -1434,7 +1434,7 @@ SDValue BrRange = DAG.getNode(ISD::BRCOND, MVT::Other, CopyTo, RangeCmp, DAG.getBasicBlock(B.Default)); - + if (MBB == NextBlock) DAG.setRoot(BrRange); else @@ -1449,9 +1449,9 @@ unsigned Reg, BitTestCase &B) { // Emit bit tests and jumps - SDValue SwitchVal = DAG.getCopyFromReg(getControlRoot(), Reg, + SDValue SwitchVal = DAG.getCopyFromReg(getControlRoot(), Reg, TLI.getPointerTy()); - + SDValue AndOp = DAG.getNode(ISD::AND, TLI.getPointerTy(), SwitchVal, DAG.getConstant(B.Mask, TLI.getPointerTy())); SDValue AndCmp = DAG.getSetCC(TLI.getSetCCResultType(AndOp), AndOp, @@ -1460,7 +1460,7 @@ CurMBB->addSuccessor(B.TargetBB); CurMBB->addSuccessor(NextMBB); - + SDValue BrAnd = DAG.getNode(ISD::BRCOND, MVT::Other, getControlRoot(), AndCmp, DAG.getBasicBlock(B.TargetBB)); @@ -1517,15 +1517,15 @@ Value* SV, MachineBasicBlock* Default) { Case& BackCase = *(CR.Range.second-1); - + // Size is the number of Cases represented by this range. 
- unsigned Size = CR.Range.second - CR.Range.first; + size_t Size = CR.Range.second - CR.Range.first; if (Size > 3) - return false; - + return false; + // Get the MachineFunction which holds the current MBB. This is used when // inserting any additional MBBs necessary to represent the switch. - MachineFunction *CurMF = CurMBB->getParent(); + MachineFunction *CurMF = CurMBB->getParent(); // Figure out which block is immediately after the current one. MachineBasicBlock *NextBlock = 0; @@ -1538,7 +1538,7 @@ // is the same as the other, but has one bit unset that the other has set, // use bit manipulation to do two compares at once. For example: // "if (X == 6 || X == 4)" -> "if ((X|2) == 6)" - + // Rearrange the case blocks so that the last one falls through if possible. if (NextBlock && Default != NextBlock && BackCase.BB != NextBlock) { // The last case block won't fall through into 'NextBlock' if we emit the @@ -1550,7 +1550,7 @@ } } } - + // Create a CaseBlock record representing a conditional branch to // the Case's target mbb if the value being switched on SV is equal // to C. @@ -1576,7 +1576,7 @@ LHS = I->Low; MHS = SV; RHS = I->High; } CaseBlock CB(CC, LHS, RHS, MHS, I->BB, FallThrough, CurBlock); - + // If emitting the first comparison, just call visitSwitchCase to emit the // code into the current block. 
Otherwise, push the CaseBlock onto the // vector to be later processed by SDISel, and insert the node's MBB @@ -1585,7 +1585,7 @@ visitSwitchCase(CB); else SwitchCases.push_back(CB); - + CurBlock = FallThrough; } @@ -1597,7 +1597,7 @@ (TLI.isOperationLegal(ISD::BR_JT, MVT::Other) || TLI.isOperationLegal(ISD::BRIND, MVT::Other)); } - + /// handleJTSwitchCase - Emit jumptable for current switch case range bool SelectionDAGLowering::handleJTSwitchCase(CaseRec& CR, CaseRecVector& WorkList, @@ -1606,24 +1606,25 @@ Case& FrontCase = *CR.Range.first; Case& BackCase = *(CR.Range.second-1); - int64_t First = cast(FrontCase.Low)->getSExtValue(); - int64_t Last = cast(BackCase.High)->getSExtValue(); + const APInt& First = cast(FrontCase.Low)->getValue(); + const APInt& Last = cast(BackCase.High)->getValue(); - uint64_t TSize = 0; + size_t TSize = 0; for (CaseItr I = CR.Range.first, E = CR.Range.second; I!=E; ++I) TSize += I->size(); if (!areJTsAllowed(TLI) || TSize <= 3) return false; - - double Density = (double)TSize / (double)((Last - First) + 1ULL); + + APInt Range = Last - First + 1ULL; + double Density = (double)TSize / Range.roundToDouble(); if (Density < 0.4) return false; - DOUT << "Lowering jump table\n" + /*DOUT << "Lowering jump table\n" << "First entry: " << First << ". Last entry: " << Last << "\n" - << "Size: " << TSize << ". Density: " << Density << "\n\n"; + << "Size: " << TSize << ". Density: " << Density << "\n\n";*/ // Get the MachineFunction which holds the current MBB. This is used when // inserting any additional MBBs necessary to represent the switch. @@ -1646,18 +1647,18 @@ CurMF->insert(BBI, JumpTableBB); CR.CaseBB->addSuccessor(Default); CR.CaseBB->addSuccessor(JumpTableBB); - + // Build a vector of destination BBs, corresponding to each target // of the jump table. If the value of the jump table slot corresponds to // a case statement, push the case's BB onto the vector, otherwise, push // the default BB. 
std::vector DestBBs; - int64_t TEI = First; + APInt TEI = First; for (CaseItr I = CR.Range.first, E = CR.Range.second; I != E; ++TEI) { - int64_t Low = cast(I->Low)->getSExtValue(); - int64_t High = cast(I->High)->getSExtValue(); - - if ((Low <= TEI) && (TEI <= High)) { + const APInt& Low = cast(I->Low)->getValue(); + const APInt& High = cast(I->High)->getValue(); + + if (Low.sle(TEI) && TEI.sle(High)) { DestBBs.push_back(I->BB); if (TEI==High) ++I; @@ -1665,28 +1666,28 @@ DestBBs.push_back(Default); } } - + // Update successor info. Add one edge to each unique successor. - BitVector SuccsHandled(CR.CaseBB->getParent()->getNumBlockIDs()); - for (std::vector::iterator I = DestBBs.begin(), + BitVector SuccsHandled(CR.CaseBB->getParent()->getNumBlockIDs()); + for (std::vector::iterator I = DestBBs.begin(), E = DestBBs.end(); I != E; ++I) { if (!SuccsHandled[(*I)->getNumber()]) { SuccsHandled[(*I)->getNumber()] = true; JumpTableBB->addSuccessor(*I); } } - + // Create a jump table index for this jump table, or return an existing // one. unsigned JTI = CurMF->getJumpTableInfo()->getJumpTableIndex(DestBBs); - + // Set the jump table information so that we can codegen it as a second // MachineBasicBlock JumpTable JT(-1U, JTI, JumpTableBB, Default); JumpTableHeader JTH(First, Last, SV, CR.CaseBB, (CR.CaseBB == CurMBB)); if (CR.CaseBB == CurMBB) visitJumpTableHeader(JT, JTH); - + JTCases.push_back(JumpTableBlock(JTH, JT)); return true; @@ -1700,7 +1701,7 @@ MachineBasicBlock* Default) { // Get the MachineFunction which holds the current MBB. This is used when // inserting any additional MBBs necessary to represent the switch. - MachineFunction *CurMF = CurMBB->getParent(); + MachineFunction *CurMF = CurMBB->getParent(); // Figure out which block is immediately after the current one. MachineBasicBlock *NextBlock = 0; @@ -1716,36 +1717,36 @@ // Size is the number of Cases represented by this range. 
unsigned Size = CR.Range.second - CR.Range.first; - int64_t First = cast(FrontCase.Low)->getSExtValue(); - int64_t Last = cast(BackCase.High)->getSExtValue(); + const APInt& First = cast(FrontCase.Low)->getValue(); + const APInt& Last = cast(BackCase.High)->getValue(); double FMetric = 0; CaseItr Pivot = CR.Range.first + Size/2; // Select optimal pivot, maximizing sum density of LHS and RHS. This will // (heuristically) allow us to emit JumpTable's later. - uint64_t TSize = 0; + size_t TSize = 0; for (CaseItr I = CR.Range.first, E = CR.Range.second; I!=E; ++I) TSize += I->size(); - uint64_t LSize = FrontCase.size(); - uint64_t RSize = TSize-LSize; - DOUT << "Selecting best pivot: \n" + size_t LSize = FrontCase.size(); + size_t RSize = TSize-LSize; + /*DOUT << "Selecting best pivot: \n" << "First: " << First << ", Last: " << Last <<"\n" - << "LSize: " << LSize << ", RSize: " << RSize << "\n"; + << "LSize: " << LSize << ", RSize: " << RSize << "\n";*/ for (CaseItr I = CR.Range.first, J=I+1, E = CR.Range.second; J!=E; ++I, ++J) { - int64_t LEnd = cast(I->High)->getSExtValue(); - int64_t RBegin = cast(J->Low)->getSExtValue(); - assert((RBegin-LEnd>=1) && "Invalid case distance"); - double LDensity = (double)LSize / (double)((LEnd - First) + 1ULL); - double RDensity = (double)RSize / (double)((Last - RBegin) + 1ULL); - double Metric = Log2_64(RBegin-LEnd)*(LDensity+RDensity); + const APInt& LEnd = cast(I->High)->getValue(); + const APInt& RBegin = cast(J->Low)->getValue(); + assert((RBegin - LEnd - 1).isNonNegative() && "Invalid case distance"); + double LDensity = (double)LSize / (LEnd - First + 1ULL).roundToDouble(); + double RDensity = (double)RSize / (Last - RBegin + 1ULL).roundToDouble(); + double Metric = (RBegin-LEnd).logBase2()*(LDensity+RDensity); // Should always split in some non-trivial place - DOUT <<"=>Step\n" + /*DOUT <<"=>Step\n" << "LEnd: " << LEnd << ", RBegin: " << RBegin << "\n" << "LDensity: " << LDensity << ", RDensity: " << RDensity << "\n" - << 
"Metric: " << Metric << "\n"; + << "Metric: " << Metric << "\n";*/ if (FMetric < Metric) { Pivot = J; FMetric = Metric; @@ -1761,12 +1762,12 @@ } else { Pivot = CR.Range.first + Size/2; } - + CaseRange LHSR(CR.Range.first, Pivot); CaseRange RHSR(Pivot, CR.Range.second); Constant *C = Pivot->Low; MachineBasicBlock *FalseBB = 0, *TrueBB = 0; - + // We know that we branch to the LHS if the Value being switched on is // less than the Pivot value, C. We use this to optimize our binary // tree a bit, by recognizing that if SV is greater than or equal to the @@ -1775,22 +1776,22 @@ // rather than creating a leaf node for it. if ((LHSR.second - LHSR.first) == 1 && LHSR.first->High == CR.GE && - cast(C)->getSExtValue() == - (cast(CR.GE)->getSExtValue() + 1LL)) { + cast(C)->getValue() == + (cast(CR.GE)->getValue() + 1LL)) { TrueBB = LHSR.first->BB; } else { TrueBB = CurMF->CreateMachineBasicBlock(LLVMBB); CurMF->insert(BBI, TrueBB); WorkList.push_back(CaseRec(TrueBB, C, CR.GE, LHSR)); } - + // Similar to the optimization above, if the Value being switched on is // known to be less than the Constant CR.LT, and the current Case Value // is CR.LT - 1, then we can branch directly to the target block for // the current Case Value, rather than emitting a RHS leaf node for it. if ((RHSR.second - RHSR.first) == 1 && CR.LT && - cast(RHSR.first->Low)->getSExtValue() == - (cast(CR.LT)->getSExtValue() - 1LL)) { + cast(RHSR.first->Low)->getValue() == + (cast(CR.LT)->getValue() - 1LL)) { FalseBB = RHSR.first->BB; } else { FalseBB = CurMF->CreateMachineBasicBlock(LLVMBB); @@ -1825,18 +1826,15 @@ // Get the MachineFunction which holds the current MBB. This is used when // inserting any additional MBBs necessary to represent the switch. - MachineFunction *CurMF = CurMBB->getParent(); + MachineFunction *CurMF = CurMBB->getParent(); - unsigned numCmps = 0; + size_t numCmps = 0; for (CaseItr I = CR.Range.first, E = CR.Range.second; I!=E; ++I) { // Single case counts one, case range - two. 
- if (I->Low == I->High) - numCmps +=1; - else - numCmps +=2; + numCmps += (I->Low == I->High ? 1 : 2); } - + // Count unique destinations SmallSet Dests; for (CaseItr I = CR.Range.first, E = CR.Range.second; I!=E; ++I) { @@ -1847,35 +1845,34 @@ } DOUT << "Total number of unique destinations: " << Dests.size() << "\n" << "Total number of comparisons: " << numCmps << "\n"; - + // Compute span of values. - Constant* minValue = FrontCase.Low; - Constant* maxValue = BackCase.High; - uint64_t range = cast(maxValue)->getSExtValue() - - cast(minValue)->getSExtValue(); - DOUT << "Compare range: " << range << "\n" - << "Low bound: " << cast(minValue)->getSExtValue() << "\n" - << "High bound: " << cast(maxValue)->getSExtValue() << "\n"; - - if (range>=IntPtrBits || + const APInt& minValue = cast(FrontCase.Low)->getValue(); + const APInt& maxValue = cast(BackCase.High)->getValue(); + APInt cmpRange = maxValue - minValue; + /*DOUT << "Compare range: " << Range << "\n" + << "Low bound: " << cast(minValue)->getValue() << "\n" + << "High bound: " << cast(maxValue)->getValue() << "\n";*/ + + if (cmpRange.uge(APInt(cmpRange.getBitWidth(), IntPtrBits)) || (!(Dests.size() == 1 && numCmps >= 3) && !(Dests.size() == 2 && numCmps >= 5) && !(Dests.size() >= 3 && numCmps >= 6))) return false; - + DOUT << "Emitting bit tests\n"; - int64_t lowBound = 0; - + APInt lowBound = APInt::getNullValue(cmpRange.getBitWidth()); + // Optimize the case where all the case values fit in a // word without having to subtract minValue. In this case, // we can optimize away the subtraction. 
- if (cast(minValue)->getSExtValue() >= 0 && - cast(maxValue)->getSExtValue() < IntPtrBits) { - range = cast(maxValue)->getSExtValue(); + if (minValue.isNonNegative() && + maxValue.slt(APInt(maxValue.getBitWidth(), IntPtrBits))) { + cmpRange = maxValue; } else { - lowBound = cast(minValue)->getSExtValue(); + lowBound = minValue; } - + CaseBitsVector CasesBits; unsigned i, count = 0; @@ -1884,24 +1881,27 @@ for (i = 0; i < count; ++i) if (Dest == CasesBits[i].BB) break; - + if (i == count) { assert((count < 3) && "Too much destinations to test!"); CasesBits.push_back(CaseBits(0, Dest, 0)); count++; } - - uint64_t lo = cast(I->Low)->getSExtValue() - lowBound; - uint64_t hi = cast(I->High)->getSExtValue() - lowBound; - + + const APInt& lowValue = cast(I->Low)->getValue(); + const APInt& highValue = cast(I->High)->getValue(); + + uint64_t lo = (lowValue - lowBound).getZExtValue(); + uint64_t hi = (highValue - lowBound).getZExtValue(); + for (uint64_t j = lo; j <= hi; j++) { CasesBits[i].Mask |= 1ULL << j; CasesBits[i].Bits++; } - + } std::sort(CasesBits.begin(), CasesBits.end(), CaseBitsCmp()); - + BitTestInfo BTC; // Figure out which block is immediately after the current one. 
@@ -1921,14 +1921,14 @@ CaseBB, CasesBits[i].BB)); } - - BitTestBlock BTB(lowBound, range, SV, + + BitTestBlock BTB(lowBound, cmpRange, SV, -1U, (CR.CaseBB == CurMBB), CR.CaseBB, Default, BTC); if (CR.CaseBB == CurMBB) visitBitTestHeader(BTB); - + BitTestCases.push_back(BTB); return true; @@ -1936,12 +1936,12 @@ /// Clusterify - Transform simple list of Cases into list of CaseRange's -unsigned SelectionDAGLowering::Clusterify(CaseVector& Cases, +size_t SelectionDAGLowering::Clusterify(CaseVector& Cases, const SwitchInst& SI) { - unsigned numCmps = 0; + size_t numCmps = 0; // Start with "simple" cases - for (unsigned i = 1; i < SI.getNumSuccessors(); ++i) { + for (size_t i = 1; i < SI.getNumSuccessors(); ++i) { MachineBasicBlock *SMBB = FuncInfo.MBBMap[SI.getSuccessor(i)]; Cases.push_back(Case(SI.getSuccessorValue(i), SI.getSuccessorValue(i), @@ -1950,18 +1950,18 @@ std::sort(Cases.begin(), Cases.end(), CaseCmp()); // Merge case into clusters - if (Cases.size()>=2) + if (Cases.size() >= 2) // Must recompute end() each iteration because it may be // invalidated by erase if we hold on to it - for (CaseItr I=Cases.begin(), J=++(Cases.begin()); J!=Cases.end(); ) { - int64_t nextValue = cast(J->Low)->getSExtValue(); - int64_t currentValue = cast(I->High)->getSExtValue(); + for (CaseItr I = Cases.begin(), J = ++(Cases.begin()); J != Cases.end(); ) { + const APInt& nextValue = cast(J->Low)->getValue(); + const APInt& currentValue = cast(I->High)->getValue(); MachineBasicBlock* nextBB = J->BB; MachineBasicBlock* currentBB = I->BB; // If the two neighboring cases go to the same destination, merge them // into a single case. 
- if ((nextValue-currentValue==1) && (currentBB == nextBB)) { + if ((nextValue - currentValue == 1) && (currentBB == nextBB)) { I->High = J->High; J = Cases.erase(J); } else { @@ -1978,7 +1978,7 @@ return numCmps; } -void SelectionDAGLowering::visitSwitch(SwitchInst &SI) { +void SelectionDAGLowering::visitSwitch(SwitchInst &SI) { // Figure out which block is immediately after the current one. MachineBasicBlock *NextBlock = 0; MachineFunction::iterator BBI = CurMBB; @@ -1995,15 +1995,14 @@ if (Default != NextBlock) DAG.setRoot(DAG.getNode(ISD::BR, MVT::Other, getControlRoot(), DAG.getBasicBlock(Default))); - return; } - + // If there are any non-default case statements, create a vector of Cases // representing each one, and sort the vector so that we can efficiently // create a binary search tree from them. CaseVector Cases; - unsigned numCmps = Clusterify(Cases, SI); + size_t numCmps = Clusterify(Cases, SI); DOUT << "Clusterify finished. Total clusters: " << Cases.size() << ". Total compares: " << numCmps << "\n"; @@ -2023,18 +2022,18 @@ if (handleBitTestsSwitchCase(CR, WorkList, SV, Default)) continue; - + // If the range has few cases (two or less) emit a series of specific // tests. if (handleSmallSwitchRange(CR, WorkList, SV, Default)) continue; - + // If the switch has more than 5 blocks, and at least 40% dense, and the // target supports indirect branches, then emit a jump table rather than // lowering the switch to a binary tree of conditional branches. if (handleJTSwitchCase(CR, WorkList, SV, Default)) continue; - + // Emit binary tree. We need to pick a pivot, and push left and right ranges // onto the worklist. Leafs are handled via handleSmallSwitchRange() call. 
handleBTSplitSwitchCase(CR, WorkList, SV, Default); Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h?rev=61395&r1=61394&r2=61395&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h Tue Dec 23 16:25:27 2008 @@ -246,8 +246,8 @@ } }; - unsigned Clusterify(CaseVector& Cases, const SwitchInst &SI); - + size_t Clusterify(CaseVector& Cases, const SwitchInst &SI); + /// CaseBlock - This structure is used to communicate between SDLowering and /// SDISel for the code generation of additional basic blocks needed by multi- /// case switch statements. @@ -284,11 +284,11 @@ MachineBasicBlock *Default; }; struct JumpTableHeader { - JumpTableHeader(uint64_t F, uint64_t L, Value* SV, MachineBasicBlock* H, + JumpTableHeader(APInt F, APInt L, Value* SV, MachineBasicBlock* H, bool E = false): First(F), Last(L), SValue(SV), HeaderBB(H), Emitted(E) {} - uint64_t First; - uint64_t Last; + APInt First; + APInt Last; Value *SValue; MachineBasicBlock *HeaderBB; bool Emitted; @@ -306,14 +306,14 @@ typedef SmallVector BitTestInfo; struct BitTestBlock { - BitTestBlock(uint64_t F, uint64_t R, Value* SV, + BitTestBlock(APInt F, APInt R, Value* SV, unsigned Rg, bool E, MachineBasicBlock* P, MachineBasicBlock* D, const BitTestInfo& C): First(F), Range(R), SValue(SV), Reg(Rg), Emitted(E), Parent(P), Default(D), Cases(C) { } - uint64_t First; - uint64_t Range; + APInt First; + APInt Range; Value *SValue; unsigned Reg; bool Emitted; From asl at math.spbu.ru Tue Dec 23 16:25:46 2008 From: asl at math.spbu.ru (Anton Korobeynikov) Date: Tue, 23 Dec 2008 22:25:46 -0000 Subject: [llvm-commits] [llvm] r61396 - /llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Message-ID: <200812232225.mBNMPkRO015268@zion.cs.uiuc.edu> 
Author: asl Date: Tue Dec 23 16:25:45 2008 New Revision: 61396 URL: http://llvm.org/viewvc/llvm-project?rev=61396&view=rev Log: Indent stuff properly Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp?rev=61396&r1=61395&r2=61396&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Tue Dec 23 16:25:45 2008 @@ -1343,19 +1343,19 @@ /// in the JumpTable from switch case. void SelectionDAGLowering::visitJumpTableHeader(JumpTable &JT, JumpTableHeader &JTH) { - // Subtract the lowest switch case value from the value being switched on - // and conditional branch to default mbb if the result is greater than the + // Subtract the lowest switch case value from the value being switched on and + // conditional branch to default mbb if the result is greater than the // difference between smallest and largest cases. SDValue SwitchOp = getValue(JTH.SValue); MVT VT = SwitchOp.getValueType(); SDValue SUB = DAG.getNode(ISD::SUB, VT, SwitchOp, - DAG.getConstant(JTH.First, VT)); + DAG.getConstant(JTH.First, VT)); - // The SDNode we just created, which holds the value being switched on - // minus the the smallest case value, needs to be copied to a virtual - // register so it can be used as an index into the jump table in a - // subsequent basic block. This value may be smaller or larger than the - // target's pointer type, and therefore require extension or truncating. + // The SDNode we just created, which holds the value being switched on minus + // the the smallest case value, needs to be copied to a virtual register so it + // can be used as an index into the jump table in a subsequent basic block. 
+ // This value may be smaller or larger than the target's pointer type, and + // therefore require extension or truncating. if (VT.bitsGT(TLI.getPointerTy())) SwitchOp = DAG.getNode(ISD::TRUNCATE, TLI.getPointerTy(), SUB); else @@ -1365,12 +1365,12 @@ SDValue CopyTo = DAG.getCopyToReg(getControlRoot(), JumpTableReg, SwitchOp); JT.Reg = JumpTableReg; - // Emit the range check for the jump table, and branch to the default - // block for the switch statement if the value being switched on exceeds - // the largest case in the switch. + // Emit the range check for the jump table, and branch to the default block + // for the switch statement if the value being switched on exceeds the largest + // case in the switch. SDValue CMP = DAG.getSetCC(TLI.getSetCCResultType(SUB), SUB, - DAG.getConstant(JTH.Last-JTH.First,VT), - ISD::SETUGT); + DAG.getConstant(JTH.Last-JTH.First,VT), + ISD::SETUGT); // Set NextBlock to be the MBB immediately after the current one, if any. // This is used to avoid emitting unnecessary branches to the next block. 
@@ -1380,12 +1380,12 @@ NextBlock = BBI; SDValue BrCond = DAG.getNode(ISD::BRCOND, MVT::Other, CopyTo, CMP, - DAG.getBasicBlock(JT.Default)); + DAG.getBasicBlock(JT.Default)); if (JT.MBB == NextBlock) DAG.setRoot(BrCond); else - DAG.setRoot(DAG.getNode(ISD::BR, MVT::Other, BrCond, + DAG.setRoot(DAG.getNode(ISD::BR, MVT::Other, BrCond, DAG.getBasicBlock(JT.MBB))); return; @@ -1398,12 +1398,12 @@ SDValue SwitchOp = getValue(B.SValue); MVT VT = SwitchOp.getValueType(); SDValue SUB = DAG.getNode(ISD::SUB, VT, SwitchOp, - DAG.getConstant(B.First, VT)); + DAG.getConstant(B.First, VT)); // Check range SDValue RangeCmp = DAG.getSetCC(TLI.getSetCCResultType(SUB), SUB, - DAG.getConstant(B.Range, VT), - ISD::SETUGT); + DAG.getConstant(B.Range, VT), + ISD::SETUGT); SDValue ShiftOp; if (VT.bitsGT(TLI.getShiftAmountTy())) @@ -1413,8 +1413,8 @@ // Make desired shift SDValue SwitchVal = DAG.getNode(ISD::SHL, TLI.getPointerTy(), - DAG.getConstant(1, TLI.getPointerTy()), - ShiftOp); + DAG.getConstant(1, TLI.getPointerTy()), + ShiftOp); unsigned SwitchReg = FuncInfo.MakeReg(TLI.getPointerTy()); SDValue CopyTo = DAG.getCopyToReg(getControlRoot(), SwitchReg, SwitchVal); @@ -1433,7 +1433,7 @@ CurMBB->addSuccessor(MBB); SDValue BrRange = DAG.getNode(ISD::BRCOND, MVT::Other, CopyTo, RangeCmp, - DAG.getBasicBlock(B.Default)); + DAG.getBasicBlock(B.Default)); if (MBB == NextBlock) DAG.setRoot(BrRange); @@ -1453,16 +1453,16 @@ TLI.getPointerTy()); SDValue AndOp = DAG.getNode(ISD::AND, TLI.getPointerTy(), SwitchVal, - DAG.getConstant(B.Mask, TLI.getPointerTy())); + DAG.getConstant(B.Mask, TLI.getPointerTy())); SDValue AndCmp = DAG.getSetCC(TLI.getSetCCResultType(AndOp), AndOp, - DAG.getConstant(0, TLI.getPointerTy()), - ISD::SETNE); + DAG.getConstant(0, TLI.getPointerTy()), + ISD::SETNE); CurMBB->addSuccessor(B.TargetBB); CurMBB->addSuccessor(NextMBB); SDValue BrAnd = DAG.getNode(ISD::BRCOND, MVT::Other, getControlRoot(), - AndCmp, DAG.getBasicBlock(B.TargetBB)); + AndCmp, 
DAG.getBasicBlock(B.TargetBB)); // Set NextBlock to be the MBB immediately after the current one, if any. // This is used to avoid emitting unnecessary branches to the next block. From asl at math.spbu.ru Tue Dec 23 16:26:02 2008 From: asl at math.spbu.ru (Anton Korobeynikov) Date: Tue, 23 Dec 2008 22:26:02 -0000 Subject: [llvm-commits] [llvm] r61397 - /llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Message-ID: <200812232226.mBNMQ3vG015286@zion.cs.uiuc.edu> Author: asl Date: Tue Dec 23 16:26:01 2008 New Revision: 61397 URL: http://llvm.org/viewvc/llvm-project?rev=61397&view=rev Log: Sometimes APInt syntax is really ugly... :( Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp?rev=61397&r1=61396&r2=61397&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Tue Dec 23 16:26:01 2008 @@ -1598,6 +1598,13 @@ TLI.isOperationLegal(ISD::BRIND, MVT::Other)); } +static APInt ComputeRange(const APInt &First, const APInt &Last) { + APInt LastExt(Last), FirstExt(First); + uint32_t BitWidth = std::max(Last.getBitWidth(), First.getBitWidth()) + 1; + LastExt.sext(BitWidth); FirstExt.sext(BitWidth); + return (LastExt - FirstExt + 1ULL); +} + /// handleJTSwitchCase - Emit jumptable for current switch case range bool SelectionDAGLowering::handleJTSwitchCase(CaseRec& CR, CaseRecVector& WorkList, @@ -1617,13 +1624,14 @@ if (!areJTsAllowed(TLI) || TSize <= 3) return false; - APInt Range = Last - First + 1ULL; + APInt Range = ComputeRange(First, Last); double Density = (double)TSize / Range.roundToDouble(); if (Density < 0.4) return false; /*DOUT << "Lowering jump table\n" - << "First entry: " << First << ". 
Last entry: " << Last << "\n" + << "First entry: " << First.getSExtValue() << ". Last entry: " << Last.getSExtValue() << "\n" + << "Range: " << Range.getSExtValue() << "Size: " << TSize << ". Density: " << Density << "\n\n";*/ // Get the MachineFunction which holds the current MBB. This is used when @@ -1732,19 +1740,21 @@ size_t LSize = FrontCase.size(); size_t RSize = TSize-LSize; /*DOUT << "Selecting best pivot: \n" - << "First: " << First << ", Last: " << Last <<"\n" + << "First: " << First.getSExtValue() << ", Last: " << Last.getSExtValue() <<"\n" << "LSize: " << LSize << ", RSize: " << RSize << "\n";*/ for (CaseItr I = CR.Range.first, J=I+1, E = CR.Range.second; J!=E; ++I, ++J) { const APInt& LEnd = cast(I->High)->getValue(); const APInt& RBegin = cast(J->Low)->getValue(); - assert((RBegin - LEnd - 1).isNonNegative() && "Invalid case distance"); + APInt Range = ComputeRange(LEnd, RBegin); + assert((Range - 2ULL).isNonNegative() && + "Invalid case distance"); double LDensity = (double)LSize / (LEnd - First + 1ULL).roundToDouble(); double RDensity = (double)RSize / (Last - RBegin + 1ULL).roundToDouble(); - double Metric = (RBegin-LEnd).logBase2()*(LDensity+RDensity); + double Metric = Range.logBase2()*(LDensity+RDensity); // Should always split in some non-trivial place /*DOUT <<"=>Step\n" - << "LEnd: " << LEnd << ", RBegin: " << RBegin << "\n" + << "LEnd: " << LEnd.getSExtValue() << ", RBegin: " << RBegin.getSExtValue() << "\n" << "LDensity: " << LDensity << ", RDensity: " << RDensity << "\n" << "Metric: " << Metric << "\n";*/ if (FMetric < Metric) { @@ -1849,8 +1859,9 @@ // Compute span of values. 
const APInt& minValue = cast(FrontCase.Low)->getValue(); const APInt& maxValue = cast(BackCase.High)->getValue(); - APInt cmpRange = maxValue - minValue; - /*DOUT << "Compare range: " << Range << "\n" + APInt cmpRange = maxValue - minValue; + + /*DOUT << "Compare range: " << Range.getSExtValue() << "\n" << "Low bound: " << cast(minValue)->getValue() << "\n" << "High bound: " << cast(maxValue)->getValue() << "\n";*/ @@ -2028,8 +2039,8 @@ if (handleSmallSwitchRange(CR, WorkList, SV, Default)) continue; - // If the switch has more than 5 blocks, and at least 40% dense, and the - // target supports indirect branches, then emit a jump table rather than + // If the switch has more than 5 blocks, and at least 40% dense, and the + // target supports indirect branches, then emit a jump table rather than // lowering the switch to a binary tree of conditional branches. if (handleJTSwitchCase(CR, WorkList, SV, Default)) continue; From asl at math.spbu.ru Tue Dec 23 16:26:19 2008 From: asl at math.spbu.ru (Anton Korobeynikov) Date: Tue, 23 Dec 2008 22:26:19 -0000 Subject: [llvm-commits] [llvm] r61398 - /llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Message-ID: <200812232226.mBNMQKD7015306@zion.cs.uiuc.edu> Author: asl Date: Tue Dec 23 16:26:18 2008 New Revision: 61398 URL: http://llvm.org/viewvc/llvm-project?rev=61398&view=rev Log: Restore debug printing Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp?rev=61398&r1=61397&r2=61398&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Tue Dec 23 16:26:18 2008 @@ -47,6 +47,7 @@ #include "llvm/Support/Compiler.h" #include "llvm/Support/Debug.h" #include 
"llvm/Support/MathExtras.h" +#include "llvm/Support/raw_ostream.h" #include using namespace llvm; @@ -1629,10 +1630,10 @@ if (Density < 0.4) return false; - /*DOUT << "Lowering jump table\n" - << "First entry: " << First.getSExtValue() << ". Last entry: " << Last.getSExtValue() << "\n" - << "Range: " << Range.getSExtValue() - << "Size: " << TSize << ". Density: " << Density << "\n\n";*/ + DEBUG(errs() << "Lowering jump table\n" + << "First entry: " << First << ". Last entry: " << Last << '\n' + << "Range: " << Range + << "Size: " << TSize << ". Density: " << Density << "\n\n"); // Get the MachineFunction which holds the current MBB. This is used when // inserting any additional MBBs necessary to represent the switch. @@ -1739,9 +1740,9 @@ size_t LSize = FrontCase.size(); size_t RSize = TSize-LSize; - /*DOUT << "Selecting best pivot: \n" - << "First: " << First.getSExtValue() << ", Last: " << Last.getSExtValue() <<"\n" - << "LSize: " << LSize << ", RSize: " << RSize << "\n";*/ + DEBUG(errs() << "Selecting best pivot: \n" + << "First: " << First << ", Last: " << Last <<'\n' + << "LSize: " << LSize << ", RSize: " << RSize << '\n'); for (CaseItr I = CR.Range.first, J=I+1, E = CR.Range.second; J!=E; ++I, ++J) { const APInt& LEnd = cast(I->High)->getValue(); @@ -1753,14 +1754,15 @@ double RDensity = (double)RSize / (Last - RBegin + 1ULL).roundToDouble(); double Metric = Range.logBase2()*(LDensity+RDensity); // Should always split in some non-trivial place - /*DOUT <<"=>Step\n" - << "LEnd: " << LEnd.getSExtValue() << ", RBegin: " << RBegin.getSExtValue() << "\n" - << "LDensity: " << LDensity << ", RDensity: " << RDensity << "\n" - << "Metric: " << Metric << "\n";*/ + DEBUG(errs() <<"=>Step\n" + << "LEnd: " << LEnd << ", RBegin: " << RBegin << '\n' + << "LDensity: " << LDensity + << ", RDensity: " << RDensity << '\n' + << "Metric: " << Metric << '\n'); if (FMetric < Metric) { Pivot = J; FMetric = Metric; - DOUT << "Current metric set to: " << FMetric << "\n"; + 
DEBUG(errs() << "Current metric set to: " << FMetric << '\n'); } LSize += J->size(); @@ -1853,17 +1855,17 @@ // Don't bother the code below, if there are too much unique destinations return false; } - DOUT << "Total number of unique destinations: " << Dests.size() << "\n" - << "Total number of comparisons: " << numCmps << "\n"; + DEBUG(errs() << "Total number of unique destinations: " << Dests.size() << '\n' + << "Total number of comparisons: " << numCmps << '\n'); // Compute span of values. const APInt& minValue = cast(FrontCase.Low)->getValue(); const APInt& maxValue = cast(BackCase.High)->getValue(); APInt cmpRange = maxValue - minValue; - /*DOUT << "Compare range: " << Range.getSExtValue() << "\n" - << "Low bound: " << cast(minValue)->getValue() << "\n" - << "High bound: " << cast(maxValue)->getValue() << "\n";*/ + DEBUG(errs() << "Compare range: " << cmpRange << '\n' + << "Low bound: " << minValue << '\n' + << "High bound: " << maxValue << '\n'); if (cmpRange.uge(APInt(cmpRange.getBitWidth(), IntPtrBits)) || (!(Dests.size() == 1 && numCmps >= 3) && @@ -1871,7 +1873,7 @@ !(Dests.size() >= 3 && numCmps >= 6))) return false; - DOUT << "Emitting bit tests\n"; + DEBUG(errs() << "Emitting bit tests\n"); APInt lowBound = APInt::getNullValue(cmpRange.getBitWidth()); // Optimize the case where all the case values fit in a @@ -1921,10 +1923,11 @@ const BasicBlock *LLVMBB = CR.CaseBB->getBasicBlock(); - DOUT << "Cases:\n"; + DEBUG(errs() << "Cases:\n"); for (unsigned i = 0, e = CasesBits.size(); i!=e; ++i) { - DOUT << "Mask: " << CasesBits[i].Mask << ", Bits: " << CasesBits[i].Bits - << ", BB: " << CasesBits[i].BB << "\n"; + DEBUG(errs() << "Mask: " << CasesBits[i].Mask + << ", Bits: " << CasesBits[i].Bits + << ", BB: " << CasesBits[i].BB << '\n'); MachineBasicBlock *CaseBB = CurMF->CreateMachineBasicBlock(LLVMBB); CurMF->insert(BBI, CaseBB); @@ -2014,8 +2017,8 @@ // create a binary search tree from them. 
CaseVector Cases; size_t numCmps = Clusterify(Cases, SI); - DOUT << "Clusterify finished. Total clusters: " << Cases.size() - << ". Total compares: " << numCmps << "\n"; + DEBUG(errs() << "Clusterify finished. Total clusters: " << Cases.size() + << ". Total compares: " << numCmps << '\n'); // Get the Value to be switched on and default basic blocks, which will be // inserted into CaseBlock records, representing basic blocks in the binary From asl at math.spbu.ru Tue Dec 23 16:26:38 2008 From: asl at math.spbu.ru (Anton Korobeynikov) Date: Tue, 23 Dec 2008 22:26:38 -0000 Subject: [llvm-commits] [llvm] r61399 - /llvm/trunk/test/CodeGen/Generic/switch-lower.ll Message-ID: <200812232226.mBNMQcWx015327@zion.cs.uiuc.edu> Author: asl Date: Tue Dec 23 16:26:37 2008 New Revision: 61399 URL: http://llvm.org/viewvc/llvm-project?rev=61399&view=rev Log: Update test Modified: llvm/trunk/test/CodeGen/Generic/switch-lower.ll Modified: llvm/trunk/test/CodeGen/Generic/switch-lower.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Generic/switch-lower.ll?rev=61399&r1=61398&r2=61399&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Generic/switch-lower.ll (original) +++ llvm/trunk/test/CodeGen/Generic/switch-lower.ll Tue Dec 23 16:26:37 2008 @@ -1,16 +1,13 @@ -; RUN: llvm-as < %s | llc +; RUN: llvm-as < %s | llc -march=x86-64 > %t +; RUN: grep BB1_5 %t | count 131 +; RUN: grep 119 %t +; RUN: grep 128 %t ; PR1197 -define void @exp_attr__expand_n_attribute_reference() { -entry: - br i1 false, label %cond_next954, label %cond_true924 - -cond_true924: ; preds = %entry - ret void - +define void @exp_attr__expand_n_attribute_reference(i8 %in) { cond_next954: ; preds = %entry - switch i8 0, label %cleanup7419 [ + switch i8 %in, label %cleanup7419 [ i8 1, label %bb956 i8 2, label %bb1069 i8 4, label %bb7328 @@ -90,12 +87,11 @@ i8 126, label %bb6955 i8 127, label %bb6990 i8 -128, label %bb7027 - i8 -127, label 
%bb3879 - i8 -126, label %bb4700 - i8 -125, label %bb7076 - i8 -124, label %bb2366 - i8 -123, label %bb2366 - i8 -122, label %bb5490 + i8 -127, label %bb7027 + i8 -126, label %bb7027 + i8 -124, label %bb7027 + i8 -123, label %bb7027 + i8 -122, label %bb7027 ] bb956: ; preds = %cond_next954 From gohman at apple.com Tue Dec 23 16:45:24 2008 From: gohman at apple.com (Dan Gohman) Date: Tue, 23 Dec 2008 22:45:24 -0000 Subject: [llvm-commits] [llvm] r61400 - in /llvm/trunk/lib/Target/X86: X86ISelLowering.cpp X86ISelLowering.h X86Instr64bit.td X86InstrInfo.td Message-ID: <200812232245.mBNMjPw3015866@zion.cs.uiuc.edu> Author: djg Date: Tue Dec 23 16:45:23 2008 New Revision: 61400 URL: http://llvm.org/viewvc/llvm-project?rev=61400&view=rev Log: Add instruction patterns and encodings for the x86 bt instructions. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/lib/Target/X86/X86ISelLowering.h llvm/trunk/lib/Target/X86/X86Instr64bit.td llvm/trunk/lib/Target/X86/X86InstrInfo.td Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=61400&r1=61399&r2=61400&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Dec 23 16:45:23 2008 @@ -6513,6 +6513,7 @@ case X86ISD::CALL: return "X86ISD::CALL"; case X86ISD::TAILCALL: return "X86ISD::TAILCALL"; case X86ISD::RDTSC_DAG: return "X86ISD::RDTSC_DAG"; + case X86ISD::BT: return "X86ISD::BT"; case X86ISD::CMP: return "X86ISD::CMP"; case X86ISD::COMI: return "X86ISD::COMI"; case X86ISD::UCOMI: return "X86ISD::UCOMI"; Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.h?rev=61400&r1=61399&r2=61400&view=diff ============================================================================== --- 
llvm/trunk/lib/Target/X86/X86ISelLowering.h (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.h Tue Dec 23 16:45:23 2008 @@ -115,6 +115,9 @@ /// X86 compare and logical compare instructions. CMP, COMI, UCOMI, + /// X86 bit-test instructions. + BT, + /// X86 SetCC. Operand 1 is condition code, and operand 2 is the flag /// operand produced by a CMP instruction. SETCC, Modified: llvm/trunk/lib/Target/X86/X86Instr64bit.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Instr64bit.td?rev=61400&r1=61399&r2=61400&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86Instr64bit.td (original) +++ llvm/trunk/lib/Target/X86/X86Instr64bit.td Tue Dec 23 16:45:23 2008 @@ -917,6 +917,20 @@ (implicit EFLAGS)]>; } // Defs = [EFLAGS] +// Bit tests. +// TODO: BT with immediate operands. +// TODO: BTC, BTR, and BTS +let Defs = [EFLAGS] in { +def BT64rr : RI<0xA3, MRMSrcReg, (outs), (ins GR64:$src1, GR64:$src2), + "bt{q}\t{$src2, $src1|$src1, $src2}", + [(X86bt GR64:$src1, GR64:$src2), + (implicit EFLAGS)]>; +def BT64mr : RI<0xA3, MRMSrcMem, (outs), (ins i64mem:$src1, GR64:$src2), + "bt{q}\t{$src2, $src1|$src1, $src2}", + [(X86bt addr:$src1, GR64:$src2), + (implicit EFLAGS)]>; +} // Defs = [EFLAGS] + // Conditional moves let Uses = [EFLAGS], isTwoAddress = 1 in { let isCommutable = 1 in { Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.td?rev=61400&r1=61399&r2=61400&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrInfo.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Tue Dec 23 16:45:23 2008 @@ -76,6 +76,8 @@ def X86cmp : SDNode<"X86ISD::CMP" , SDTX86CmpTest>; +def X86bt : SDNode<"X86ISD::BT", SDTX86CmpTest>; + def X86cmov : SDNode<"X86ISD::CMOV", SDTX86Cmov>; def X86brcond : SDNode<"X86ISD::BRCOND", SDTX86BrCond, 
[SDNPHasChain]>; @@ -2649,6 +2651,28 @@ (implicit EFLAGS)]>; } // Defs = [EFLAGS] +// Bit tests. +// TODO: BT with immediate operands +// TODO: BTC, BTR, and BTS +let Defs = [EFLAGS] in { +def BT16rr : I<0xA3, MRMSrcReg, (outs), (ins GR16:$src1, GR16:$src2), + "bt{w}\t{$src2, $src1|$src1, $src2}", + [(X86bt GR16:$src1, GR16:$src2), + (implicit EFLAGS)]>, OpSize; +def BT32rr : I<0xA3, MRMSrcReg, (outs), (ins GR32:$src1, GR32:$src2), + "bt{l}\t{$src2, $src1|$src1, $src2}", + [(X86bt GR32:$src1, GR32:$src2), + (implicit EFLAGS)]>; +def BT16mr : I<0xA3, MRMSrcMem, (outs), (ins i16mem:$src1, GR16:$src2), + "bt{w}\t{$src2, $src1|$src1, $src2}", + [(X86bt addr:$src1, GR16:$src2), + (implicit EFLAGS)]>, OpSize; +def BT32mr : I<0xA3, MRMSrcMem, (outs), (ins i32mem:$src1, GR32:$src2), + "bt{l}\t{$src2, $src1|$src1, $src2}", + [(X86bt addr:$src1, GR32:$src2), + (implicit EFLAGS)]>; +} // Defs = [EFLAGS] + // Sign/Zero extenders // Use movsbl intead of movsbw; we don't care about the high 16 bits // of the register here. This has a smaller encoding and avoids a From dalej at apple.com Tue Dec 23 17:01:28 2008 From: dalej at apple.com (Dale Johannesen) Date: Tue, 23 Dec 2008 23:01:28 -0000 Subject: [llvm-commits] [llvm] r61401 - in /llvm/trunk: lib/CodeGen/SelectionDAG/DAGCombiner.cpp test/CodeGen/X86/2008-12-23-dagcombine-6.ll Message-ID: <200812232301.mBNN1TZ8016352@zion.cs.uiuc.edu> Author: johannes Date: Tue Dec 23 17:01:27 2008 New Revision: 61401 URL: http://llvm.org/viewvc/llvm-project?rev=61401&view=rev Log: Add another permutation where we should get rid of a-a. 
Added:
    llvm/trunk/test/CodeGen/X86/2008-12-23-dagcombine-6.ll
Modified:
    llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=61401&r1=61400&r2=61401&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Tue Dec 23 17:01:27 2008
@@ -1177,13 +1177,20 @@
     return N0.getOperand(1);
   // fold (A+B)-B -> A
   if (N0.getOpcode() == ISD::ADD && N0.getOperand(1) == N1)
-    return N0.getOperand(0);
-  // fold ((A+(B-C))-B) -> A-C
+    return N0.getOperand(0);
+  // fold ((A+(B+-C))-B) -> A+-C
   if (N0.getOpcode() == ISD::ADD &&
-      N0.getOperand(1).getOpcode() == ISD::SUB &&
+      (N0.getOperand(1).getOpcode() == ISD::SUB ||
+       N0.getOperand(1).getOpcode() == ISD::ADD) &&
       N0.getOperand(1).getOperand(0) == N1)
-    return DAG.getNode(ISD::SUB, VT, N0.getOperand(0),
+    return DAG.getNode(N0.getOperand(1).getOpcode(), VT, N0.getOperand(0),
                        N0.getOperand(1).getOperand(1));
+  // fold ((A+(C+B))-B) -> A+C
+  if (N0.getOpcode() == ISD::ADD &&
+      N0.getOperand(1).getOpcode() == ISD::ADD &&
+      N0.getOperand(1).getOperand(1) == N1)
+    return DAG.getNode(ISD::ADD, VT, N0.getOperand(0),
+                       N0.getOperand(1).getOperand(0));
   // fold ((A-(B-C))-C) -> A-B
   if (N0.getOpcode() == ISD::SUB &&
       N0.getOperand(1).getOpcode() == ISD::SUB &&

Added: llvm/trunk/test/CodeGen/X86/2008-12-23-dagcombine-6.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-12-23-dagcombine-6.ll?rev=61401&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/X86/2008-12-23-dagcombine-6.ll (added)
+++ llvm/trunk/test/CodeGen/X86/2008-12-23-dagcombine-6.ll Tue Dec 23 17:01:27 2008
@@ -0,0 +1,24 @@
+; RUN: llvm-as < %s | llc -march=x86 | grep "(%esp)" | count 4
+target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
+target triple = "i386-apple-darwin9.5"
+; a - a should be found and removed, leaving refs to only L and P
+define i32 @test(i32 %a, i32 %L, i32 %P) nounwind {
+entry:
+  %0 = add i32 %a, %L
+  %1 = add i32 %P, %0
+  %2 = sub i32 %1, %a
+  br label %return
+
+return: ; preds = %bb3
+  ret i32 %2
+}
+define i32 @test2(i32 %a, i32 %L, i32 %P) nounwind {
+entry:
+  %0 = add i32 %L, %a
+  %1 = add i32 %P, %0
+  %2 = sub i32 %1, %a
+  br label %return
+
+return: ; preds = %bb3
+  ret i32 %2
+}

From dalej at apple.com Tue Dec 23 17:05:26 2008
From: dalej at apple.com (Dale Johannesen)
Date: Tue, 23 Dec 2008 23:05:26 -0000
Subject: [llvm-commits] [llvm] r61402 - /llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp
Message-ID: <200812232305.mBNN5RaU016463@zion.cs.uiuc.edu>

Author: johannes
Date: Tue Dec 23 17:05:26 2008
New Revision: 61402

URL: http://llvm.org/viewvc/llvm-project?rev=61402&view=rev
Log:
This fixes the bug in 175.vpr. It doesn't fix the other SPEC breakage.
I'll be reverting all recent changes shortly; this checkin is mostly so
this change doesn't get lost.

Modified:
    llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp

Modified: llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp?rev=61402&r1=61401&r2=61402&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Tue Dec 23 17:05:26 2008
@@ -1627,10 +1627,17 @@
         // The common base is emitted in the loop preheader. But since we
         // are reusing an IV, it has not been used to initialize the PHI node.
         // Add it to the expression used to rewrite the uses.
+        // When this use is outside the loop, we earlier subtracted the
+        // common base, and are adding it back here.  Use the same expression
+        // as before, rather than CommonBaseV, so DAGCombiner will zap it.
         if (!isa(CommonBaseV) ||
-            !cast(CommonBaseV)->isZero())
-          RewriteExpr = SE->getAddExpr(RewriteExpr,
+            !cast(CommonBaseV)->isZero()) {
+          if (L->contains(User.Inst->getParent()))
+            RewriteExpr = SE->getAddExpr(RewriteExpr,
                                        SE->getUnknown(CommonBaseV));
+          else
+            RewriteExpr = SE->getAddExpr(RewriteExpr, CommonExprs);
+        }
       }
 
       // Now that we know what we need to do, insert code before User for the

From clattner at apple.com Tue Dec 23 17:14:41 2008
From: clattner at apple.com (Chris Lattner)
Date: Tue, 23 Dec 2008 15:14:41 -0800
Subject: [llvm-commits] [llvm] r61401 - in /llvm/trunk: lib/CodeGen/SelectionDAG/DAGCombiner.cpp test/CodeGen/X86/2008-12-23-dagcombine-6.ll
In-Reply-To: <200812232301.mBNN1TZ8016352@zion.cs.uiuc.edu>
References: <200812232301.mBNN1TZ8016352@zion.cs.uiuc.edu>
Message-ID:

>
> URL: http://llvm.org/viewvc/llvm-project?rev=61401&view=rev
> Log:
> Add another permutation where we should get rid of a-a.

> @@ -1177,13 +1177,20 @@
>       return N0.getOperand(1);
>     // fold (A+B)-B -> A
>     if (N0.getOpcode() == ISD::ADD && N0.getOperand(1) == N1)
> -    return N0.getOperand(0);
> -  // fold ((A+(B-C))-B) -> A-C
> +    return N0.getOperand(0);
> +  // fold ((A+(B+-C))-B) -> A+-C

I don't see why you'd ever want to match on A + (-B).  Shouldn't dag
combine independently fold this into A-B?

-Chris

From dalej at apple.com Tue Dec 23 17:16:13 2008
From: dalej at apple.com (Dale Johannesen)
Date: Tue, 23 Dec 2008 15:16:13 -0800
Subject: [llvm-commits] [llvm] r61401 - in /llvm/trunk: lib/CodeGen/SelectionDAG/DAGCombiner.cpp test/CodeGen/X86/2008-12-23-dagcombine-6.ll
In-Reply-To:
References: <200812232301.mBNN1TZ8016352@zion.cs.uiuc.edu>
Message-ID: <41B8EB23-3920-40C4-8F72-2284C18AFD66@apple.com>

On Dec 23, 2008, at 3:14 PM PST, Chris Lattner wrote:

>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=61401&view=rev
>> Log:
>> Add another permutation where we should get rid of a-a.
>
>> @@ -1177,13 +1177,20 @@
>>       return N0.getOperand(1);
>>     // fold (A+B)-B -> A
>>     if (N0.getOpcode() == ISD::ADD && N0.getOperand(1) == N1)
>> -    return N0.getOperand(0);
>> -  // fold ((A+(B-C))-B) -> A-C
>> +    return N0.getOperand(0);
>> +  // fold ((A+(B+-C))-B) -> A+-C
>
> I don't see why you'd ever want to match on A + (-B).  Shouldn't dag
> combine independently fold this into A-B?

That should be read "plus or minus".

From dalej at apple.com Tue Dec 23 17:21:36 2008
From: dalej at apple.com (Dale Johannesen)
Date: Tue, 23 Dec 2008 23:21:36 -0000
Subject: [llvm-commits] [llvm] r61403 - /llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp
Message-ID: <200812232321.mBNNLaJT016902@zion.cs.uiuc.edu>

Author: johannes
Date: Tue Dec 23 17:21:35 2008
New Revision: 61403

URL: http://llvm.org/viewvc/llvm-project?rev=61403&view=rev
Log:
Revert 61362 and 61402 until SPEC breakage is fixed.

Modified: llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Modified: llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp?rev=61403&r1=61402&r2=61403&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Tue Dec 23 17:21:35 2008 @@ -130,12 +130,6 @@ /// dependent on random ordering of pointers in the process. SmallVector StrideOrder; - /// GEPlist - A list of the GEP's that have been remembered in the SCEV - /// data structures. SCEV does not know to update these when the operands - /// of the GEP are changed, which means we cannot leave them live across - /// loops. - SmallVector GEPlist; - /// CastedValues - As we need to cast values to uintptr_t, this keeps track /// of the casted version of each value. This is accessed by /// getCastedVersionOf. @@ -197,7 +191,7 @@ bool FindIVUserForCond(ICmpInst *Cond, IVStrideUse *&CondUse, const SCEVHandle *&CondStride); bool RequiresTypeConversion(const Type *Ty, const Type *NewTy); - SCEVHandle CheckForIVReuse(bool, bool, bool, const SCEVHandle&, + int64_t CheckForIVReuse(bool, bool, bool, const SCEVHandle&, IVExpr&, const Type*, const std::vector& UsersToProcess); bool ValidStride(bool, int64_t, @@ -346,7 +340,6 @@ } SE->setSCEV(GEP, GEPVal); - GEPlist.push_back(GEP); return GEPVal; } @@ -515,22 +508,14 @@ if (isa(User) && Processed.count(User)) continue; - // Descend recursively, but not into PHI nodes outside the current loop. - // It's important to see the entire expression outside the loop to get - // choices that depend on addressing mode use right, although we won't - // consider references ouside the loop in all cases. 
- // If User is already in Processed, we don't want to recurse into it again, - // but do want to record a second reference in the same instruction. + // If this is an instruction defined in a nested loop, or outside this loop, + // don't recurse into it. bool AddUserToIVUsers = false; if (LI->getLoopFor(User->getParent()) != L) { - if (isa(User) || Processed.count(User) || - !AddUsersIfInteresting(User, L, Processed)) { - DOUT << "FOUND USER in other loop: " << *User - << " OF SCEV: " << *ISE << "\n"; - AddUserToIVUsers = true; - } - } else if (Processed.count(User) || - !AddUsersIfInteresting(User, L, Processed)) { + DOUT << "FOUND USER in other loop: " << *User + << " OF SCEV: " << *ISE << "\n"; + AddUserToIVUsers = true; + } else if (!AddUsersIfInteresting(User, L, Processed)) { DOUT << "FOUND USER: " << *User << " OF SCEV: " << *ISE << "\n"; AddUserToIVUsers = true; @@ -719,45 +704,34 @@ PHINode *PN = cast(Inst); for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) { if (PN->getIncomingValue(i) == OperandValToReplace) { - // If the original expression is outside the loop, put the replacement - // code in the same place as the original expression, - // which need not be an immediate predecessor of this PHI. This way we - // need only one copy of it even if it is referenced multiple times in - // the PHI. We don't do this when the original expression is inside the - // loop because multiple copies sometimes do useful sinking of code in that - // case(?). - Instruction *OldLoc = dyn_cast(OperandValToReplace); - if (L->contains(OldLoc->getParent())) { - // If this is a critical edge, split the edge so that we do not insert the - // code on all predecessor/successor paths. We do this unless this is the - // canonical backedge for this loop, as this can make some inserted code - // be in an illegal position. 
- BasicBlock *PHIPred = PN->getIncomingBlock(i); - if (e != 1 && PHIPred->getTerminator()->getNumSuccessors() > 1 && - (PN->getParent() != L->getHeader() || !L->contains(PHIPred))) { - - // First step, split the critical edge. - SplitCriticalEdge(PHIPred, PN->getParent(), P, false); - - // Next step: move the basic block. In particular, if the PHI node - // is outside of the loop, and PredTI is in the loop, we want to - // move the block to be immediately before the PHI block, not - // immediately after PredTI. - if (L->contains(PHIPred) && !L->contains(PN->getParent())) { - BasicBlock *NewBB = PN->getIncomingBlock(i); - NewBB->moveBefore(PN->getParent()); - } - - // Splitting the edge can reduce the number of PHI entries we have. - e = PN->getNumIncomingValues(); + // If this is a critical edge, split the edge so that we do not insert the + // code on all predecessor/successor paths. We do this unless this is the + // canonical backedge for this loop, as this can make some inserted code + // be in an illegal position. + BasicBlock *PHIPred = PN->getIncomingBlock(i); + if (e != 1 && PHIPred->getTerminator()->getNumSuccessors() > 1 && + (PN->getParent() != L->getHeader() || !L->contains(PHIPred))) { + + // First step, split the critical edge. + SplitCriticalEdge(PHIPred, PN->getParent(), P, false); + + // Next step: move the basic block. In particular, if the PHI node + // is outside of the loop, and PredTI is in the loop, we want to + // move the block to be immediately before the PHI block, not + // immediately after PredTI. + if (L->contains(PHIPred) && !L->contains(PN->getParent())) { + BasicBlock *NewBB = PN->getIncomingBlock(i); + NewBB->moveBefore(PN->getParent()); } + + // Splitting the edge can reduce the number of PHI entries we have. + e = PN->getNumIncomingValues(); } + Value *&Code = InsertedCode[PN->getIncomingBlock(i)]; if (!Code) { // Insert the code into the end of the predecessor block. - Instruction *InsertPt = (L->contains(OldLoc->getParent())) ? 
- PN->getIncomingBlock(i)->getTerminator() : - OldLoc->getParent()->getTerminator(); + Instruction *InsertPt = PN->getIncomingBlock(i)->getTerminator(); Code = InsertCodeForBaseAtPosition(NewBase, Rewriter, InsertPt, L); // Adjust the type back to match the PHI. Note that we can't use @@ -1194,11 +1168,7 @@ /// mode scale component and optional base reg. This allows the users of /// this stride to be rewritten as prev iv * factor. It returns 0 if no /// reuse is possible. Factors can be negative on same targets, e.g. ARM. -/// -/// If all uses are outside the loop, we don't require that all multiplies -/// be folded into the addressing mode; a multiply (executed once) outside -/// the loop is better than another IV within. Well, usually. -SCEVHandle LoopStrengthReduce::CheckForIVReuse(bool HasBaseReg, +int64_t LoopStrengthReduce::CheckForIVReuse(bool HasBaseReg, bool AllUsesAreAddresses, bool AllUsesAreOutsideLoop, const SCEVHandle &Stride, @@ -1210,7 +1180,7 @@ ++NewStride) { std::map::iterator SI = IVsByStride.find(StrideOrder[NewStride]); - if (SI == IVsByStride.end() || !isa(SI->first)) + if (SI == IVsByStride.end()) continue; int64_t SSInt = cast(SI->first)->getValue()->getSExtValue(); if (SI->first != Stride && @@ -1232,53 +1202,11 @@ if (II->Base->isZero() && !RequiresTypeConversion(II->Base->getType(), Ty)) { IV = *II; - return SE->getIntegerSCEV(Scale, Stride->getType()); + return Scale; } } - } else if (AllUsesAreOutsideLoop) { - // Accept nonconstant strides here; it is really really right to substitute - // an existing IV if we can. 
- for (unsigned NewStride = 0, e = StrideOrder.size(); NewStride != e; - ++NewStride) { - std::map::iterator SI = - IVsByStride.find(StrideOrder[NewStride]); - if (SI == IVsByStride.end() || !isa(SI->first)) - continue; - int64_t SSInt = cast(SI->first)->getValue()->getSExtValue(); - if (SI->first != Stride && SSInt != 1) - continue; - for (std::vector::iterator II = SI->second.IVs.begin(), - IE = SI->second.IVs.end(); II != IE; ++II) - // Accept nonzero base here. - // Only reuse previous IV if it would not require a type conversion. - if (!RequiresTypeConversion(II->Base->getType(), Ty)) { - IV = *II; - return Stride; - } - } - // Special case, old IV is -1*x and this one is x. Can treat this one as - // -1*old. - for (unsigned NewStride = 0, e = StrideOrder.size(); NewStride != e; - ++NewStride) { - std::map::iterator SI = - IVsByStride.find(StrideOrder[NewStride]); - if (SI == IVsByStride.end()) - continue; - if (SCEVMulExpr *ME = dyn_cast(SI->first)) - if (SCEVConstant *SC = dyn_cast(ME->getOperand(0))) - if (Stride == ME->getOperand(1) && - SC->getValue()->getSExtValue() == -1LL) - for (std::vector::iterator II = SI->second.IVs.begin(), - IE = SI->second.IVs.end(); II != IE; ++II) - // Accept nonzero base here. - // Only reuse previous IV if it would not require type conversion. 
- if (!RequiresTypeConversion(II->Base->getType(), Ty)) { - IV = *II; - return SE->getIntegerSCEV(-1LL, Stride->getType()); - } - } } - return SE->getIntegerSCEV(0, Stride->getType()); + return 0; } /// PartitionByIsUseOfPostIncrementedValue - Simple boolean predicate that @@ -1429,13 +1357,12 @@ IVExpr ReuseIV(SE->getIntegerSCEV(0, Type::Int32Ty), SE->getIntegerSCEV(0, Type::Int32Ty), 0, 0); - SCEVHandle RewriteFactor = - CheckForIVReuse(HaveCommonExprs, AllUsesAreAddresses, + int64_t RewriteFactor = 0; + RewriteFactor = CheckForIVReuse(HaveCommonExprs, AllUsesAreAddresses, AllUsesAreOutsideLoop, Stride, ReuseIV, CommonExprs->getType(), UsersToProcess); - if (!isa(RewriteFactor) || - !cast(RewriteFactor)->isZero()) { + if (RewriteFactor != 0) { DOUT << "BASED ON IV of STRIDE " << *ReuseIV.Stride << " and BASE " << *ReuseIV.Base << " :\n"; NewPHI = ReuseIV.PHI; @@ -1463,8 +1390,7 @@ Value *CommonBaseV = PreheaderRewriter.expandCodeFor(CommonExprs, PreInsertPt); - if (isa(RewriteFactor) && - cast(RewriteFactor)->isZero()) { + if (RewriteFactor == 0) { // Create a new Phi for this base, and stick it in the loop header. NewPHI = PHINode::Create(ReplacedTy, "iv.", PhiInsertBefore); ++NumInserted; @@ -1611,33 +1537,18 @@ // If we are reusing the iv, then it must be multiplied by a constant // factor take advantage of addressing mode scale component. - if (!isa(RewriteFactor) || - !cast(RewriteFactor)->isZero()) { - // If we're reusing an IV with a nonzero base (currently this happens - // only when all reuses are outside the loop) subtract that base here. - // The base has been used to initialize the PHI node but we don't want - // it here. - if (!ReuseIV.Base->isZero()) - RewriteExpr = SE->getMinusSCEV(RewriteExpr, ReuseIV.Base); - - // Multiply old variable, with base removed, by new scale factor. 
- RewriteExpr = SE->getMulExpr(RewriteFactor, + if (RewriteFactor != 0) { + RewriteExpr = SE->getMulExpr(SE->getIntegerSCEV(RewriteFactor, + RewriteExpr->getType()), RewriteExpr); // The common base is emitted in the loop preheader. But since we // are reusing an IV, it has not been used to initialize the PHI node. // Add it to the expression used to rewrite the uses. - // When this use is outside the loop, we earlier subtracted the - // common base, and are adding it back here. Use the same expression - // as before, rather than CommonBaseV, so DAGCombiner will zap it. if (!isa(CommonBaseV) || - !cast(CommonBaseV)->isZero()) { - if (L->contains(User.Inst->getParent())) - RewriteExpr = SE->getAddExpr(RewriteExpr, + !cast(CommonBaseV)->isZero()) + RewriteExpr = SE->getAddExpr(RewriteExpr, SE->getUnknown(CommonBaseV)); - else - RewriteExpr = SE->getAddExpr(RewriteExpr, CommonExprs); - } } // Now that we know what we need to do, insert code before User for the @@ -2263,9 +2174,6 @@ IVUsesByStride.clear(); IVsByStride.clear(); StrideOrder.clear(); - for (unsigned i=0; ideleteValueFromRecords(GEPlist[i]); - GEPlist.clear(); // Clean up after ourselves if (!DeadInsts.empty()) { From clattner at apple.com Tue Dec 23 17:30:45 2008 From: clattner at apple.com (Chris Lattner) Date: Tue, 23 Dec 2008 15:30:45 -0800 Subject: [llvm-commits] [llvm] r61401 - in /llvm/trunk: lib/CodeGen/SelectionDAG/DAGCombiner.cpp test/CodeGen/X86/2008-12-23-dagcombine-6.ll In-Reply-To: <41B8EB23-3920-40C4-8F72-2284C18AFD66@apple.com> References: <200812232301.mBNN1TZ8016352@zion.cs.uiuc.edu> <41B8EB23-3920-40C4-8F72-2284C18AFD66@apple.com> Message-ID: On Dec 23, 2008, at 3:16 PM, Dale Johannesen wrote: > > On Dec 23, 2008, at 3:14 PMPST, Chris Lattner wrote: > >>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=61401&view=rev >>> Log: >>> Add another permutation where we should get rid of a-a. 
>>
>>> @@ -1177,13 +1177,20 @@
>>>       return N0.getOperand(1);
>>>     // fold (A+B)-B -> A
>>>     if (N0.getOpcode() == ISD::ADD && N0.getOperand(1) == N1)
>>> -    return N0.getOperand(0);
>>> -  // fold ((A+(B-C))-B) -> A-C
>>> +    return N0.getOperand(0);
>>> +  // fold ((A+(B+-C))-B) -> A+-C
>>
>> I don't see why you'd ever want to match on A + (-B).  Shouldn't dag
>> combine independently fold this into A-B?
>
> That should be read "plus or minus".

Ah, please clarify the comment, thanks!

-Chris

From dalej at apple.com Tue Dec 23 17:34:20 2008
From: dalej at apple.com (Dale Johannesen)
Date: Tue, 23 Dec 2008 15:34:20 -0800
Subject: [llvm-commits] [llvm] r61401 - in /llvm/trunk: lib/CodeGen/SelectionDAG/DAGCombiner.cpp test/CodeGen/X86/2008-12-23-dagcombine-6.ll
In-Reply-To:
References: <200812232301.mBNN1TZ8016352@zion.cs.uiuc.edu> <41B8EB23-3920-40C4-8F72-2284C18AFD66@apple.com>
Message-ID: <77EA5170-F5AC-4EC5-89AA-F6A2E45029BF@apple.com>

On Dec 23, 2008, at 3:30 PM PST, Chris Lattner wrote:

>
> On Dec 23, 2008, at 3:16 PM, Dale Johannesen wrote:
>
>>
>> On Dec 23, 2008, at 3:14 PM PST, Chris Lattner wrote:
>>
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=61401&view=rev
>>>> Log:
>>>> Add another permutation where we should get rid of a-a.
>>>
>>>> @@ -1177,13 +1177,20 @@
>>>>       return N0.getOperand(1);
>>>>     // fold (A+B)-B -> A
>>>>     if (N0.getOpcode() == ISD::ADD && N0.getOperand(1) == N1)
>>>> -    return N0.getOperand(0);
>>>> -  // fold ((A+(B-C))-B) -> A-C
>>>> +    return N0.getOperand(0);
>>>> +  // fold ((A+(B+-C))-B) -> A+-C
>>>
>>> I don't see why you'd ever want to match on A + (-B).  Shouldn't dag
>>> combine independently fold this into A-B?
>>
>> That should be read "plus or minus".
>
> Ah, please clarify the comment, thanks!

What would you prefer?  I've used this elsewhere without confusing
anybody.

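The "+-" in the fold comment stands for two separate identities, one with + and one with -, and r61401 adds a third with the inner operands swapped. A minimal sketch of the arithmetic only, assuming i32 add/sub behave like wrapping uint32_t arithmetic; the helper names are hypothetical and are not LLVM APIs:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; none of these are LLVM APIs.
// uint32_t arithmetic wraps modulo 2^32, which is the assumption made here
// for how i32 add/sub behave.

// ((A+(B-C))-B) and its folded form A-C.
static uint32_t beforeSub(uint32_t A, uint32_t B, uint32_t C) {
  return (A + (B - C)) - B;
}
static uint32_t afterSub(uint32_t A, uint32_t C) { return A - C; }

// ((A+(B+C))-B) and ((A+(C+B))-B), both of which fold to A+C.
static uint32_t beforeAddLHS(uint32_t A, uint32_t B, uint32_t C) {
  return (A + (B + C)) - B;
}
static uint32_t beforeAddRHS(uint32_t A, uint32_t B, uint32_t C) {
  return (A + (C + B)) - B;
}
static uint32_t afterAdd(uint32_t A, uint32_t C) { return A + C; }

// Checks all three identities on every combination of a few sample values,
// including values that make the intermediate sums wrap around.
// Returns the number of (A, B, C) combinations checked.
static int verifyFolds() {
  const uint32_t samples[] = {0u, 1u, 7u, 0x7fffffffu, 0x80000000u, 0xffffffffu};
  int checked = 0;
  for (uint32_t A : samples)
    for (uint32_t B : samples)
      for (uint32_t C : samples) {
        assert(beforeSub(A, B, C) == afterSub(A, C));
        assert(beforeAddLHS(A, B, C) == afterAdd(A, C));
        assert(beforeAddRHS(A, B, C) == afterAdd(A, C));
        ++checked;
      }
  return checked;
}
```

Calling verifyFolds() from any main() runs the check over 6 x 6 x 6 = 216 combinations. The two add forms correspond to the operand orders exercised by @test and @test2 in the new test file.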
From echristo at apple.com Tue Dec 23 17:38:10 2008
From: echristo at apple.com (Eric Christopher)
Date: Tue, 23 Dec 2008 15:38:10 -0800
Subject: [llvm-commits] [llvm] r61401 - in /llvm/trunk: lib/CodeGen/SelectionDAG/DAGCombiner.cpp test/CodeGen/X86/2008-12-23-dagcombine-6.ll
In-Reply-To: <77EA5170-F5AC-4EC5-89AA-F6A2E45029BF@apple.com>
References: <200812232301.mBNN1TZ8016352@zion.cs.uiuc.edu> <41B8EB23-3920-40C4-8F72-2284C18AFD66@apple.com> <77EA5170-F5AC-4EC5-89AA-F6A2E45029BF@apple.com>
Message-ID: <7BE08CC1-C977-4A5F-9C7C-F98FD3463263@apple.com>

On Dec 23, 2008, at 3:34 PM, Dale Johannesen wrote:

>
> On Dec 23, 2008, at 3:30 PM PST, Chris Lattner wrote:
>
>>
>> On Dec 23, 2008, at 3:16 PM, Dale Johannesen wrote:
>>
>>>
>>> On Dec 23, 2008, at 3:14 PM PST, Chris Lattner wrote:
>>>
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=61401&view=rev
>>>>> Log:
>>>>> Add another permutation where we should get rid of a-a.
>>>>
>>>>> @@ -1177,13 +1177,20 @@
>>>>>       return N0.getOperand(1);
>>>>>     // fold (A+B)-B -> A
>>>>>     if (N0.getOpcode() == ISD::ADD && N0.getOperand(1) == N1)
>>>>> -    return N0.getOperand(0);
>>>>> -  // fold ((A+(B-C))-B) -> A-C
>>>>> +    return N0.getOperand(0);
>>>>> +  // fold ((A+(B+-C))-B) -> A+-C
>>>>
>>>> I don't see why you'd ever want to match on A + (-B).  Shouldn't dag
>>>> combine independently fold this into A-B?
>>>
>>> That should be read "plus or minus".
>>
>> Ah, please clarify the comment, thanks!
>
> What would you prefer?  I've used this elsewhere without confusing
> anybody.

*raises hand*

Now that I'm reading it I'd be confused.

A +/- B ?

-eric

From sabre at nondot.org Tue Dec 23 17:42:28 2008
From: sabre at nondot.org (Chris Lattner)
Date: Tue, 23 Dec 2008 23:42:28 -0000
Subject: [llvm-commits] [llvm] r61404 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
Message-ID: <200812232342.mBNNgTi1017494@zion.cs.uiuc.edu>

Author: lattner
Date: Tue Dec 23 17:42:27 2008
New Revision: 61404

URL: http://llvm.org/viewvc/llvm-project?rev=61404&view=rev
Log:
simplify some control flow and reduce indentation, no functionality change.

Modified:
    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp

Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=61404&r1=61403&r2=61404&view=diff

==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Dec 23 17:42:27 2008
@@ -1988,7 +1988,7 @@
     }
 
     switch (SetCCOpcode) {
-    default: break;
+    default: assert(0 && "Invalid integer condition!");
     case ISD::SETEQ:  X86CC = X86::COND_E;  break;
     case ISD::SETGT:  X86CC = X86::COND_G;  break;
     case ISD::SETGE:  X86CC = X86::COND_GE; break;
@@ -2000,72 +2000,55 @@
     case ISD::SETULE: X86CC = X86::COND_BE; break;
     case ISD::SETUGE: X86CC = X86::COND_AE; break;
     }
-  } else {
-    // First determine if it is required or is profitable to flip the operands.
-
-    // If LHS is a foldable load, but RHS is not, flip the condition.
-    if ((ISD::isNON_EXTLoad(LHS.getNode()) && LHS.hasOneUse()) &&
-        !(ISD::isNON_EXTLoad(RHS.getNode()) && RHS.hasOneUse())) {
-      SetCCOpcode = getSetCCSwappedOperands(SetCCOpcode);
-      std::swap(LHS, RHS);
-    }
+    return true;
+  }
+
+  // First determine if it is required or is profitable to flip the operands.
- switch (SetCCOpcode) { - default: break; - case ISD::SETOLT: - case ISD::SETOLE: - case ISD::SETUGT: - case ISD::SETUGE: - std::swap(LHS, RHS); - break; - } + // If LHS is a foldable load, but RHS is not, flip the condition. + if ((ISD::isNON_EXTLoad(LHS.getNode()) && LHS.hasOneUse()) && + !(ISD::isNON_EXTLoad(RHS.getNode()) && RHS.hasOneUse())) { + SetCCOpcode = getSetCCSwappedOperands(SetCCOpcode); + std::swap(LHS, RHS); + } - // On a floating point condition, the flags are set as follows: - // ZF PF CF op - // 0 | 0 | 0 | X > Y - // 0 | 0 | 1 | X < Y - // 1 | 0 | 0 | X == Y - // 1 | 1 | 1 | unordered - switch (SetCCOpcode) { - default: break; - case ISD::SETUEQ: - case ISD::SETEQ: - X86CC = X86::COND_E; - break; - case ISD::SETOLT: // flipped - case ISD::SETOGT: - case ISD::SETGT: - X86CC = X86::COND_A; - break; - case ISD::SETOLE: // flipped - case ISD::SETOGE: - case ISD::SETGE: - X86CC = X86::COND_AE; - break; - case ISD::SETUGT: // flipped - case ISD::SETULT: - case ISD::SETLT: - X86CC = X86::COND_B; - break; - case ISD::SETUGE: // flipped - case ISD::SETULE: - case ISD::SETLE: - X86CC = X86::COND_BE; - break; - case ISD::SETONE: - case ISD::SETNE: - X86CC = X86::COND_NE; - break; - case ISD::SETUO: - X86CC = X86::COND_P; - break; - case ISD::SETO: - X86CC = X86::COND_NP; - break; - } + switch (SetCCOpcode) { + default: break; + case ISD::SETOLT: + case ISD::SETOLE: + case ISD::SETUGT: + case ISD::SETUGE: + std::swap(LHS, RHS); + break; } - return X86CC != X86::COND_INVALID; + // On a floating point condition, the flags are set as follows: + // ZF PF CF op + // 0 | 0 | 0 | X > Y + // 0 | 0 | 1 | X < Y + // 1 | 0 | 0 | X == Y + // 1 | 1 | 1 | unordered + switch (SetCCOpcode) { + default: return false; + case ISD::SETUEQ: + case ISD::SETEQ: X86CC = X86::COND_E; return true; + case ISD::SETOLT: // flipped + case ISD::SETOGT: + case ISD::SETGT: X86CC = X86::COND_A; return true; + case ISD::SETOLE: // flipped + case ISD::SETOGE: + case ISD::SETGE: X86CC = 
X86::COND_AE; return true; + case ISD::SETUGT: // flipped + case ISD::SETULT: + case ISD::SETLT: X86CC = X86::COND_B; return true; + case ISD::SETUGE: // flipped + case ISD::SETULE: + case ISD::SETLE: X86CC = X86::COND_BE; return true; + case ISD::SETONE: + case ISD::SETNE: X86CC = X86::COND_NE; return true; + case ISD::SETUO: X86CC = X86::COND_P; return true; + case ISD::SETO: X86CC = X86::COND_NP; return true; + } } /// hasFPCMov - is there a floating point cmov for the specific X86 condition From dalej at apple.com Tue Dec 23 17:47:23 2008 From: dalej at apple.com (Dale Johannesen) Date: Tue, 23 Dec 2008 23:47:23 -0000 Subject: [llvm-commits] [llvm] r61405 - /llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Message-ID: <200812232347.mBNNlOZV017636@zion.cs.uiuc.edu> Author: johannes Date: Tue Dec 23 17:47:22 2008 New Revision: 61405 URL: http://llvm.org/viewvc/llvm-project?rev=61405&view=rev Log: Change comments so everybody can understand them, hopefully. Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=61405&r1=61404&r2=61405&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Tue Dec 23 17:47:22 2008 @@ -1025,7 +1025,7 @@ return DAG.getNode(ISD::SUB, VT, N1.getOperand(0), N1.getOperand(1).getOperand(0)); } - // fold (A+((B-A)+-C)) to (B+-C) + // fold (A+((B-A)+or-C)) to (B+or-C) if ((N1.getOpcode() == ISD::SUB || N1.getOpcode() == ISD::ADD) && N1.getOperand(0).getOpcode() == ISD::SUB && N0 == N1.getOperand(0).getOperand(1)) { @@ -1178,7 +1178,7 @@ // fold (A+B)-B -> A if (N0.getOpcode() == ISD::ADD && N0.getOperand(1) == N1) return N0.getOperand(0); - // fold ((A+(B+-C))-B) -> A+-C + // fold ((A+(B+or-C))-B) -> A+or-C if 
(N0.getOpcode() == ISD::ADD && (N0.getOperand(1).getOpcode() == ISD::SUB || N0.getOperand(1).getOpcode() == ISD::ADD) && From sabre at nondot.org Tue Dec 23 18:11:38 2008 From: sabre at nondot.org (Chris Lattner) Date: Wed, 24 Dec 2008 00:11:38 -0000 Subject: [llvm-commits] [llvm] r61407 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <200812240011.mBO0BceT018447@zion.cs.uiuc.edu> Author: lattner Date: Tue Dec 23 18:11:37 2008 New Revision: 61407 URL: http://llvm.org/viewvc/llvm-project?rev=61407&view=rev Log: indentation Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=61407&r1=61406&r2=61407&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Dec 23 18:11:37 2008 @@ -5031,15 +5031,14 @@ bool isFP = Op.getOperand(1).getValueType().isFloatingPoint(); unsigned X86CC; - if (translateX86CC(cast(CC)->get(), isFP, X86CC, - Op0, Op1, DAG)) { - Cond = DAG.getNode(X86ISD::CMP, MVT::i32, Op0, Op1); - return DAG.getNode(X86ISD::SETCC, MVT::i8, - DAG.getConstant(X86CC, MVT::i8), Cond); - } - - assert(0 && "Illegal SetCC!"); - return SDValue(); + if (!translateX86CC(cast(CC)->get(), isFP, X86CC, + Op0, Op1, DAG)) + assert(0 && "Illegal SetCC!"); + + + Cond = DAG.getNode(X86ISD::CMP, MVT::i32, Op0, Op1); + return DAG.getNode(X86ISD::SETCC, MVT::i8, + DAG.getConstant(X86CC, MVT::i8), Cond); } SDValue X86TargetLowering::LowerVSETCC(SDValue Op, SelectionDAG &DAG) { From gohman at apple.com Tue Dec 23 18:27:52 2008 From: gohman at apple.com (Dan Gohman) Date: Wed, 24 Dec 2008 00:27:52 -0000 Subject: [llvm-commits] [llvm] r61408 - in /llvm/trunk: lib/Target/X86/X86RegisterInfo.cpp test/CodeGen/X86/2008-12-23-crazy-address.ll Message-ID: 
<200812240027.mBO0RqZi018926@zion.cs.uiuc.edu> Author: djg Date: Tue Dec 23 18:27:51 2008 New Revision: 61408 URL: http://llvm.org/viewvc/llvm-project?rev=61408&view=rev Log: Fix a compiler-abort on a testcase where the stack-pointer is added to a symbolic constant. This is unlikely to be intentional, but it shouldn't crash the compiler. Added: llvm/trunk/test/CodeGen/X86/2008-12-23-crazy-address.ll Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp?rev=61408&r1=61407&r2=61408&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp (original) +++ llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp Tue Dec 23 18:27:51 2008 @@ -454,12 +454,19 @@ // FrameIndex with base register with EBP. Add an offset to the offset. MI.getOperand(i).ChangeToRegister(BasePtr, false); - // Now add the frame object offset to the offset from EBP. Offset is a - // 32-bit integer. - int Offset = getFrameIndexOffset(MF, FrameIndex) + - (int)(MI.getOperand(i+3).getImm()); - - MI.getOperand(i+3).ChangeToImmediate(Offset); + // Now add the frame object offset to the offset from EBP. + if (MI.getOperand(i+3).isImm()) { + // Offset is a 32-bit integer. + int Offset = getFrameIndexOffset(MF, FrameIndex) + + (int)(MI.getOperand(i+3).getImm()); + + MI.getOperand(i+3).ChangeToImmediate(Offset); + } else { + // Offset is symbolic. This is extremely rare. 
+    uint64_t Offset = getFrameIndexOffset(MF, FrameIndex) +
+                      (uint64_t)MI.getOperand(i+3).getOffset();
+    MI.getOperand(i+3).setOffset(Offset);
+  }
 }
 
 void

Added: llvm/trunk/test/CodeGen/X86/2008-12-23-crazy-address.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-12-23-crazy-address.ll?rev=61408&view=auto

==============================================================================
--- llvm/trunk/test/CodeGen/X86/2008-12-23-crazy-address.ll (added)
+++ llvm/trunk/test/CodeGen/X86/2008-12-23-crazy-address.ll Tue Dec 23 18:27:51 2008
@@ -0,0 +1,33 @@
+; RUN: llvm-as < %s | llc -march=x86 -relocation-model=static | grep {lea.*X.*esp} | count 2
+
+@X = external global [0 x i32]
+
+define void @foo() nounwind {
+entry:
+  %Y = alloca i32
+  call void @frob(i32* %Y) nounwind
+  %Y3 = bitcast i32* %Y to i8*
+  %ctg2 = getelementptr i8* %Y3, i32 ptrtoint ([0 x i32]* @X to i32)
+  %0 = ptrtoint i8* %ctg2 to i32
+  call void @borf(i32 %0) nounwind
+  ret void
+}
+
+define void @bar(i32 %i) nounwind {
+entry:
+  %Y = alloca [10 x i32]
+  %0 = getelementptr [10 x i32]* %Y, i32 0, i32 0
+  call void @frob(i32* %0) nounwind
+  %1 = getelementptr [0 x i32]* @X, i32 0, i32 %i
+  %2 = getelementptr [10 x i32]* %Y, i32 0, i32 0
+  %3 = ptrtoint i32* %2 to i32
+  %4 = bitcast i32* %1 to i8*
+  %ctg2 = getelementptr i8* %4, i32 %3
+  %5 = ptrtoint i8* %ctg2 to i32
+  call void @borf(i32 %5) nounwind
+  ret void
+}
+
+declare void @frob(i32*)
+
+declare void @borf(i32)

From isanbard at gmail.com Tue Dec 23 19:18:01 2008
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 23 Dec 2008 17:18:01 -0800
Subject: [llvm-commits] [llvm] r61395 - in /llvm/trunk/lib/CodeGen/SelectionDAG: SelectionDAGBuild.cpp SelectionDAGBuild.h
In-Reply-To: <200812232225.mBNMPTx4015251@zion.cs.uiuc.edu>
References: <200812232225.mBNMPTx4015251@zion.cs.uiuc.edu>
Message-ID: <7FBCEECD-DAD9-46DE-A7C0-E1911BE942B8@gmail.com>

Anton,

I think that your switch lowering stuff broke this test.
Could you please verify? Running /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/ CodeGen/Generic/dg.exp ... FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/ CodeGen/Generic/switch-lower.ll for PR1197 Failed with exit(1) at line 2 while running: grep BB1_5 switch-lower.ll.tmp | count 131 count: expected 131 lines and got 5. child process exited abnormally -bw On Dec 23, 2008, at 2:25 PM, Anton Korobeynikov wrote: > Author: asl > Date: Tue Dec 23 16:25:27 2008 > New Revision: 61395 > > URL: http://llvm.org/viewvc/llvm-project?rev=61395&view=rev > Log: > Initial checkin of APInt'ififcation of switch lowering > > Modified: > llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp > llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h > > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp?rev=61395&r1=61394&r2=61395&view=diff > > = > = > = > = > = > = > = > = > ====================================================================== > --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp > (original) > +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Tue > Dec 23 16:25:27 2008 > @@ -1259,8 +1259,8 @@ > void SelectionDAGLowering::visitSwitchCase(CaseBlock &CB) { > SDValue Cond; > SDValue CondLHS = getValue(CB.CmpLHS); > - > - // Build the setcc now. > + > + // Build the setcc now. > if (CB.CmpMHS == NULL) { > // Fold "(X == true)" to X and "(X == false)" to !X to > // handle common cases produced by branch lowering. 
> @@ -1274,8 +1274,8 @@ > } else { > assert(CB.CC == ISD::SETLE && "Can handle only LE ranges now"); > > - uint64_t Low = cast(CB.CmpLHS)->getSExtValue(); > - uint64_t High = cast(CB.CmpRHS)->getSExtValue(); > + const APInt& Low = cast(CB.CmpLHS)->getValue(); > + const APInt& High = cast(CB.CmpRHS)->getValue(); > > SDValue CmpOp = getValue(CB.CmpMHS); > MVT VT = CmpOp.getValueType(); > @@ -1288,18 +1288,18 @@ > DAG.getConstant(High-Low, VT), ISD::SETULE); > } > } > - > + > // Update successor info > CurMBB->addSuccessor(CB.TrueBB); > CurMBB->addSuccessor(CB.FalseBB); > - > + > // Set NextBlock to be the MBB immediately after the current one, > if any. > // This is used to avoid emitting unnecessary branches to the next > block. > MachineBasicBlock *NextBlock = 0; > MachineFunction::iterator BBI = CurMBB; > if (++BBI != CurMBB->getParent()->end()) > NextBlock = BBI; > - > + > // If the lhs block is the next block, invert the condition so > that we can > // fall through to the lhs instead of the rhs block. > if (CB.TrueBB == NextBlock) { > @@ -1309,20 +1309,20 @@ > } > SDValue BrCond = DAG.getNode(ISD::BRCOND, MVT::Other, > getControlRoot(), Cond, > DAG.getBasicBlock(CB.TrueBB)); > - > + > // If the branch was constant folded, fix up the CFG. > if (BrCond.getOpcode() == ISD::BR) { > CurMBB->removeSuccessor(CB.FalseBB); > DAG.setRoot(BrCond); > } else { > // Otherwise, go ahead and insert the false branch. 
> - if (BrCond == getControlRoot()) > + if (BrCond == getControlRoot()) > CurMBB->removeSuccessor(CB.TrueBB); > - > + > if (CB.FalseBB == NextBlock) > DAG.setRoot(BrCond); > else > - DAG.setRoot(DAG.getNode(ISD::BR, MVT::Other, BrCond, > + DAG.setRoot(DAG.getNode(ISD::BR, MVT::Other, BrCond, > DAG.getBasicBlock(CB.FalseBB))); > } > } > @@ -1350,7 +1350,7 @@ > MVT VT = SwitchOp.getValueType(); > SDValue SUB = DAG.getNode(ISD::SUB, VT, SwitchOp, > DAG.getConstant(JTH.First, VT)); > - > + > // The SDNode we just created, which holds the value being > switched on > // minus the the smallest case value, needs to be copied to a > virtual > // register so it can be used as an index into the jump table in a > @@ -1360,7 +1360,7 @@ > SwitchOp = DAG.getNode(ISD::TRUNCATE, TLI.getPointerTy(), SUB); > else > SwitchOp = DAG.getNode(ISD::ZERO_EXTEND, TLI.getPointerTy(), SUB); > - > + > unsigned JumpTableReg = FuncInfo.MakeReg(TLI.getPointerTy()); > SDValue CopyTo = DAG.getCopyToReg(getControlRoot(), JumpTableReg, > SwitchOp); > JT.Reg = JumpTableReg; > @@ -1434,7 +1434,7 @@ > > SDValue BrRange = DAG.getNode(ISD::BRCOND, MVT::Other, CopyTo, > RangeCmp, > DAG.getBasicBlock(B.Default)); > - > + > if (MBB == NextBlock) > DAG.setRoot(BrRange); > else > @@ -1449,9 +1449,9 @@ > unsigned Reg, > BitTestCase &B) { > // Emit bit tests and jumps > - SDValue SwitchVal = DAG.getCopyFromReg(getControlRoot(), Reg, > + SDValue SwitchVal = DAG.getCopyFromReg(getControlRoot(), Reg, > TLI.getPointerTy()); > - > + > SDValue AndOp = DAG.getNode(ISD::AND, TLI.getPointerTy(), SwitchVal, > DAG.getConstant(B.Mask, > TLI.getPointerTy())); > SDValue AndCmp = DAG.getSetCC(TLI.getSetCCResultType(AndOp), AndOp, > @@ -1460,7 +1460,7 @@ > > CurMBB->addSuccessor(B.TargetBB); > CurMBB->addSuccessor(NextMBB); > - > + > SDValue BrAnd = DAG.getNode(ISD::BRCOND, MVT::Other, > getControlRoot(), > AndCmp, > DAG.getBasicBlock(B.TargetBB)); > > @@ -1517,15 +1517,15 @@ > Value* SV, > MachineBasicBlock* > Default) { > 
Case& BackCase = *(CR.Range.second-1); > - > + > // Size is the number of Cases represented by this range. > - unsigned Size = CR.Range.second - CR.Range.first; > + size_t Size = CR.Range.second - CR.Range.first; > if (Size > 3) > - return false; > - > + return false; > + > // Get the MachineFunction which holds the current MBB. This is > used when > // inserting any additional MBBs necessary to represent the switch. > - MachineFunction *CurMF = CurMBB->getParent(); > + MachineFunction *CurMF = CurMBB->getParent(); > > // Figure out which block is immediately after the current one. > MachineBasicBlock *NextBlock = 0; > @@ -1538,7 +1538,7 @@ > // is the same as the other, but has one bit unset that the other > has set, > // use bit manipulation to do two compares at once. For example: > // "if (X == 6 || X == 4)" -> "if ((X|2) == 6)" > - > + > // Rearrange the case blocks so that the last one falls through if > possible. > if (NextBlock && Default != NextBlock && BackCase.BB != NextBlock) { > // The last case block won't fall through into 'NextBlock' if we > emit the > @@ -1550,7 +1550,7 @@ > } > } > } > - > + > // Create a CaseBlock record representing a conditional branch to > // the Case's target mbb if the value being switched on SV is equal > // to C. > @@ -1576,7 +1576,7 @@ > LHS = I->Low; MHS = SV; RHS = I->High; > } > CaseBlock CB(CC, LHS, RHS, MHS, I->BB, FallThrough, CurBlock); > - > + > // If emitting the first comparison, just call visitSwitchCase > to emit the > // code into the current block. 
Otherwise, push the CaseBlock > onto the > // vector to be later processed by SDISel, and insert the node's > MBB > @@ -1585,7 +1585,7 @@ > visitSwitchCase(CB); > else > SwitchCases.push_back(CB); > - > + > CurBlock = FallThrough; > } > > @@ -1597,7 +1597,7 @@ > (TLI.isOperationLegal(ISD::BR_JT, MVT::Other) || > TLI.isOperationLegal(ISD::BRIND, MVT::Other)); > } > - > + > /// handleJTSwitchCase - Emit jumptable for current switch case range > bool SelectionDAGLowering::handleJTSwitchCase(CaseRec& CR, > CaseRecVector& WorkList, > @@ -1606,24 +1606,25 @@ > Case& FrontCase = *CR.Range.first; > Case& BackCase = *(CR.Range.second-1); > > - int64_t First = cast(FrontCase.Low)->getSExtValue(); > - int64_t Last = cast(BackCase.High)->getSExtValue(); > + const APInt& First = cast(FrontCase.Low)->getValue(); > + const APInt& Last = cast(BackCase.High)->getValue(); > > - uint64_t TSize = 0; > + size_t TSize = 0; > for (CaseItr I = CR.Range.first, E = CR.Range.second; > I!=E; ++I) > TSize += I->size(); > > if (!areJTsAllowed(TLI) || TSize <= 3) > return false; > - > - double Density = (double)TSize / (double)((Last - First) + 1ULL); > + > + APInt Range = Last - First + 1ULL; > + double Density = (double)TSize / Range.roundToDouble(); > if (Density < 0.4) > return false; > > - DOUT << "Lowering jump table\n" > + /*DOUT << "Lowering jump table\n" > << "First entry: " << First << ". Last entry: " << Last << "\n" > - << "Size: " << TSize << ". Density: " << Density << "\n\n"; > + << "Size: " << TSize << ". Density: " << Density << "\n\n";*/ > > // Get the MachineFunction which holds the current MBB. This is > used when > // inserting any additional MBBs necessary to represent the switch. > @@ -1646,18 +1647,18 @@ > CurMF->insert(BBI, JumpTableBB); > CR.CaseBB->addSuccessor(Default); > CR.CaseBB->addSuccessor(JumpTableBB); > - > + > // Build a vector of destination BBs, corresponding to each target > // of the jump table. 
If the value of the jump table slot > corresponds to > // a case statement, push the case's BB onto the vector, > otherwise, push > // the default BB. > std::vector DestBBs; > - int64_t TEI = First; > + APInt TEI = First; > for (CaseItr I = CR.Range.first, E = CR.Range.second; I != E; + > +TEI) { > - int64_t Low = cast(I->Low)->getSExtValue(); > - int64_t High = cast(I->High)->getSExtValue(); > - > - if ((Low <= TEI) && (TEI <= High)) { > + const APInt& Low = cast(I->Low)->getValue(); > + const APInt& High = cast(I->High)->getValue(); > + > + if (Low.sle(TEI) && TEI.sle(High)) { > DestBBs.push_back(I->BB); > if (TEI==High) > ++I; > @@ -1665,28 +1666,28 @@ > DestBBs.push_back(Default); > } > } > - > + > // Update successor info. Add one edge to each unique successor. > - BitVector SuccsHandled(CR.CaseBB->getParent()->getNumBlockIDs()); > - for (std::vector::iterator I = DestBBs.begin(), > + BitVector SuccsHandled(CR.CaseBB->getParent()->getNumBlockIDs()); > + for (std::vector::iterator I = DestBBs.begin(), > E = DestBBs.end(); I != E; ++I) { > if (!SuccsHandled[(*I)->getNumber()]) { > SuccsHandled[(*I)->getNumber()] = true; > JumpTableBB->addSuccessor(*I); > } > } > - > + > // Create a jump table index for this jump table, or return an > existing > // one. > unsigned JTI = CurMF->getJumpTableInfo()- > >getJumpTableIndex(DestBBs); > - > + > // Set the jump table information so that we can codegen it as a > second > // MachineBasicBlock > JumpTable JT(-1U, JTI, JumpTableBB, Default); > JumpTableHeader JTH(First, Last, SV, CR.CaseBB, (CR.CaseBB == > CurMBB)); > if (CR.CaseBB == CurMBB) > visitJumpTableHeader(JT, JTH); > - > + > JTCases.push_back(JumpTableBlock(JTH, JT)); > > return true; > @@ -1700,7 +1701,7 @@ > > MachineBasicBlock* Default) { > // Get the MachineFunction which holds the current MBB. This is > used when > // inserting any additional MBBs necessary to represent the switch. 
> - MachineFunction *CurMF = CurMBB->getParent(); > + MachineFunction *CurMF = CurMBB->getParent(); > > // Figure out which block is immediately after the current one. > MachineBasicBlock *NextBlock = 0; > @@ -1716,36 +1717,36 @@ > // Size is the number of Cases represented by this range. > unsigned Size = CR.Range.second - CR.Range.first; > > - int64_t First = cast(FrontCase.Low)->getSExtValue(); > - int64_t Last = cast(BackCase.High)->getSExtValue(); > + const APInt& First = cast(FrontCase.Low)->getValue(); > + const APInt& Last = cast(BackCase.High)->getValue(); > double FMetric = 0; > CaseItr Pivot = CR.Range.first + Size/2; > > // Select optimal pivot, maximizing sum density of LHS and RHS. > This will > // (heuristically) allow us to emit JumpTable's later. > - uint64_t TSize = 0; > + size_t TSize = 0; > for (CaseItr I = CR.Range.first, E = CR.Range.second; > I!=E; ++I) > TSize += I->size(); > > - uint64_t LSize = FrontCase.size(); > - uint64_t RSize = TSize-LSize; > - DOUT << "Selecting best pivot: \n" > + size_t LSize = FrontCase.size(); > + size_t RSize = TSize-LSize; > + /*DOUT << "Selecting best pivot: \n" > << "First: " << First << ", Last: " << Last <<"\n" > - << "LSize: " << LSize << ", RSize: " << RSize << "\n"; > + << "LSize: " << LSize << ", RSize: " << RSize << "\n";*/ > for (CaseItr I = CR.Range.first, J=I+1, E = CR.Range.second; > J!=E; ++I, ++J) { > - int64_t LEnd = cast(I->High)->getSExtValue(); > - int64_t RBegin = cast(J->Low)->getSExtValue(); > - assert((RBegin-LEnd>=1) && "Invalid case distance"); > - double LDensity = (double)LSize / (double)((LEnd - First) + > 1ULL); > - double RDensity = (double)RSize / (double)((Last - RBegin) + > 1ULL); > - double Metric = Log2_64(RBegin-LEnd)*(LDensity+RDensity); > + const APInt& LEnd = cast(I->High)->getValue(); > + const APInt& RBegin = cast(J->Low)->getValue(); > + assert((RBegin - LEnd - 1).isNonNegative() && "Invalid case > distance"); > + double LDensity = (double)LSize / (LEnd - First + > 
1ULL).roundToDouble(); > + double RDensity = (double)RSize / (Last - RBegin + > 1ULL).roundToDouble(); > + double Metric = (RBegin-LEnd).logBase2()*(LDensity+RDensity); > // Should always split in some non-trivial place > - DOUT <<"=>Step\n" > + /*DOUT <<"=>Step\n" > << "LEnd: " << LEnd << ", RBegin: " << RBegin << "\n" > << "LDensity: " << LDensity << ", RDensity: " << RDensity > << "\n" > - << "Metric: " << Metric << "\n"; > + << "Metric: " << Metric << "\n";*/ > if (FMetric < Metric) { > Pivot = J; > FMetric = Metric; > @@ -1761,12 +1762,12 @@ > } else { > Pivot = CR.Range.first + Size/2; > } > - > + > CaseRange LHSR(CR.Range.first, Pivot); > CaseRange RHSR(Pivot, CR.Range.second); > Constant *C = Pivot->Low; > MachineBasicBlock *FalseBB = 0, *TrueBB = 0; > - > + > // We know that we branch to the LHS if the Value being switched > on is > // less than the Pivot value, C. We use this to optimize our binary > // tree a bit, by recognizing that if SV is greater than or equal > to the > @@ -1775,22 +1776,22 @@ > // rather than creating a leaf node for it. > if ((LHSR.second - LHSR.first) == 1 && > LHSR.first->High == CR.GE && > - cast(C)->getSExtValue() == > - (cast(CR.GE)->getSExtValue() + 1LL)) { > + cast(C)->getValue() == > + (cast(CR.GE)->getValue() + 1LL)) { > TrueBB = LHSR.first->BB; > } else { > TrueBB = CurMF->CreateMachineBasicBlock(LLVMBB); > CurMF->insert(BBI, TrueBB); > WorkList.push_back(CaseRec(TrueBB, C, CR.GE, LHSR)); > } > - > + > // Similar to the optimization above, if the Value being switched > on is > // known to be less than the Constant CR.LT, and the current Case > Value > // is CR.LT - 1, then we can branch directly to the target block for > // the current Case Value, rather than emitting a RHS leaf node > for it. 
> if ((RHSR.second - RHSR.first) == 1 && CR.LT && > - cast(RHSR.first->Low)->getSExtValue() == > - (cast(CR.LT)->getSExtValue() - 1LL)) { > + cast(RHSR.first->Low)->getValue() == > + (cast(CR.LT)->getValue() - 1LL)) { > FalseBB = RHSR.first->BB; > } else { > FalseBB = CurMF->CreateMachineBasicBlock(LLVMBB); > @@ -1825,18 +1826,15 @@ > > // Get the MachineFunction which holds the current MBB. This is > used when > // inserting any additional MBBs necessary to represent the switch. > - MachineFunction *CurMF = CurMBB->getParent(); > + MachineFunction *CurMF = CurMBB->getParent(); > > - unsigned numCmps = 0; > + size_t numCmps = 0; > for (CaseItr I = CR.Range.first, E = CR.Range.second; > I!=E; ++I) { > // Single case counts one, case range - two. > - if (I->Low == I->High) > - numCmps +=1; > - else > - numCmps +=2; > + numCmps += (I->Low == I->High ? 1 : 2); > } > - > + > // Count unique destinations > SmallSet Dests; > for (CaseItr I = CR.Range.first, E = CR.Range.second; I!=E; ++I) { > @@ -1847,35 +1845,34 @@ > } > DOUT << "Total number of unique destinations: " << Dests.size() << > "\n" > << "Total number of comparisons: " << numCmps << "\n"; > - > + > // Compute span of values. 
> - Constant* minValue = FrontCase.Low; > - Constant* maxValue = BackCase.High; > - uint64_t range = cast(maxValue)->getSExtValue() - > - cast(minValue)->getSExtValue(); > - DOUT << "Compare range: " << range << "\n" > - << "Low bound: " << cast(minValue)- > >getSExtValue() << "\n" > - << "High bound: " << cast(maxValue)- > >getSExtValue() << "\n"; > - > - if (range>=IntPtrBits || > + const APInt& minValue = cast(FrontCase.Low)- > >getValue(); > + const APInt& maxValue = cast(BackCase.High)- > >getValue(); > + APInt cmpRange = maxValue - minValue; > + /*DOUT << "Compare range: " << Range << "\n" > + << "Low bound: " << cast(minValue)->getValue() > << "\n" > + << "High bound: " << cast(maxValue)->getValue() > << "\n";*/ > + > + if (cmpRange.uge(APInt(cmpRange.getBitWidth(), IntPtrBits)) || > (!(Dests.size() == 1 && numCmps >= 3) && > !(Dests.size() == 2 && numCmps >= 5) && > !(Dests.size() >= 3 && numCmps >= 6))) > return false; > - > + > DOUT << "Emitting bit tests\n"; > - int64_t lowBound = 0; > - > + APInt lowBound = APInt::getNullValue(cmpRange.getBitWidth()); > + > // Optimize the case where all the case values fit in a > // word without having to subtract minValue. In this case, > // we can optimize away the subtraction. 
> - if (cast(minValue)->getSExtValue() >= 0 && > - cast(maxValue)->getSExtValue() < IntPtrBits) { > - range = cast(maxValue)->getSExtValue(); > + if (minValue.isNonNegative() && > + maxValue.slt(APInt(maxValue.getBitWidth(), IntPtrBits))) { > + cmpRange = maxValue; > } else { > - lowBound = cast(minValue)->getSExtValue(); > + lowBound = minValue; > } > - > + > CaseBitsVector CasesBits; > unsigned i, count = 0; > > @@ -1884,24 +1881,27 @@ > for (i = 0; i < count; ++i) > if (Dest == CasesBits[i].BB) > break; > - > + > if (i == count) { > assert((count < 3) && "Too much destinations to test!"); > CasesBits.push_back(CaseBits(0, Dest, 0)); > count++; > } > - > - uint64_t lo = cast(I->Low)->getSExtValue() - > lowBound; > - uint64_t hi = cast(I->High)->getSExtValue() - > lowBound; > - > + > + const APInt& lowValue = cast(I->Low)->getValue(); > + const APInt& highValue = cast(I->High)->getValue(); > + > + uint64_t lo = (lowValue - lowBound).getZExtValue(); > + uint64_t hi = (highValue - lowBound).getZExtValue(); > + > for (uint64_t j = lo; j <= hi; j++) { > CasesBits[i].Mask |= 1ULL << j; > CasesBits[i].Bits++; > } > - > + > } > std::sort(CasesBits.begin(), CasesBits.end(), CaseBitsCmp()); > - > + > BitTestInfo BTC; > > // Figure out which block is immediately after the current one. 
> @@ -1921,14 +1921,14 @@ > CaseBB, > CasesBits[i].BB)); > } > - > - BitTestBlock BTB(lowBound, range, SV, > + > + BitTestBlock BTB(lowBound, cmpRange, SV, > -1U, (CR.CaseBB == CurMBB), > CR.CaseBB, Default, BTC); > > if (CR.CaseBB == CurMBB) > visitBitTestHeader(BTB); > - > + > BitTestCases.push_back(BTB); > > return true; > @@ -1936,12 +1936,12 @@ > > > /// Clusterify - Transform simple list of Cases into list of > CaseRange's > -unsigned SelectionDAGLowering::Clusterify(CaseVector& Cases, > +size_t SelectionDAGLowering::Clusterify(CaseVector& Cases, > const SwitchInst& SI) { > - unsigned numCmps = 0; > + size_t numCmps = 0; > > // Start with "simple" cases > - for (unsigned i = 1; i < SI.getNumSuccessors(); ++i) { > + for (size_t i = 1; i < SI.getNumSuccessors(); ++i) { > MachineBasicBlock *SMBB = FuncInfo.MBBMap[SI.getSuccessor(i)]; > Cases.push_back(Case(SI.getSuccessorValue(i), > SI.getSuccessorValue(i), > @@ -1950,18 +1950,18 @@ > std::sort(Cases.begin(), Cases.end(), CaseCmp()); > > // Merge case into clusters > - if (Cases.size()>=2) > + if (Cases.size() >= 2) > // Must recompute end() each iteration because it may be > // invalidated by erase if we hold on to it > - for (CaseItr I=Cases.begin(), J=++(Cases.begin()); J! > =Cases.end(); ) { > - int64_t nextValue = cast(J->Low)->getSExtValue(); > - int64_t currentValue = cast(I->High)- > >getSExtValue(); > + for (CaseItr I = Cases.begin(), J = ++(Cases.begin()); J != > Cases.end(); ) { > + const APInt& nextValue = cast(J->Low)->getValue(); > + const APInt& currentValue = cast(I->High)- > >getValue(); > MachineBasicBlock* nextBB = J->BB; > MachineBasicBlock* currentBB = I->BB; > > // If the two neighboring cases go to the same destination, > merge them > // into a single case. 
> - if ((nextValue-currentValue==1) && (currentBB == nextBB)) { > + if ((nextValue - currentValue == 1) && (currentBB == nextBB)) { > I->High = J->High; > J = Cases.erase(J); > } else { > @@ -1978,7 +1978,7 @@ > return numCmps; > } > > -void SelectionDAGLowering::visitSwitch(SwitchInst &SI) { > +void SelectionDAGLowering::visitSwitch(SwitchInst &SI) { > // Figure out which block is immediately after the current one. > MachineBasicBlock *NextBlock = 0; > MachineFunction::iterator BBI = CurMBB; > @@ -1995,15 +1995,14 @@ > if (Default != NextBlock) > DAG.setRoot(DAG.getNode(ISD::BR, MVT::Other, getControlRoot(), > DAG.getBasicBlock(Default))); > - > return; > } > - > + > // If there are any non-default case statements, create a vector > of Cases > // representing each one, and sort the vector so that we can > efficiently > // create a binary search tree from them. > CaseVector Cases; > - unsigned numCmps = Clusterify(Cases, SI); > + size_t numCmps = Clusterify(Cases, SI); > DOUT << "Clusterify finished. Total clusters: " << Cases.size() > << ". Total compares: " << numCmps << "\n"; > > @@ -2023,18 +2022,18 @@ > > if (handleBitTestsSwitchCase(CR, WorkList, SV, Default)) > continue; > - > + > // If the range has few cases (two or less) emit a series of > specific > // tests. > if (handleSmallSwitchRange(CR, WorkList, SV, Default)) > continue; > - > + > // If the switch has more than 5 blocks, and at least 40% dense, > and the > // target supports indirect branches, then emit a jump table > rather than > // lowering the switch to a binary tree of conditional branches. > if (handleJTSwitchCase(CR, WorkList, SV, Default)) > continue; > - > + > // Emit binary tree. We need to pick a pivot, and push left and > right ranges > // onto the worklist. Leafs are handled via > handleSmallSwitchRange() call. 
> handleBTSplitSwitchCase(CR, WorkList, SV, Default); > > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h?rev=61395&r1=61394&r2=61395&view=diff > > = > = > = > = > = > = > = > = > ====================================================================== > --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h (original) > +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h Tue Dec > 23 16:25:27 2008 > @@ -246,8 +246,8 @@ > } > }; > > - unsigned Clusterify(CaseVector& Cases, const SwitchInst &SI); > - > + size_t Clusterify(CaseVector& Cases, const SwitchInst &SI); > + > /// CaseBlock - This structure is used to communicate between > SDLowering and > /// SDISel for the code generation of additional basic blocks > needed by multi- > /// case switch statements. > @@ -284,11 +284,11 @@ > MachineBasicBlock *Default; > }; > struct JumpTableHeader { > - JumpTableHeader(uint64_t F, uint64_t L, Value* SV, > MachineBasicBlock* H, > + JumpTableHeader(APInt F, APInt L, Value* SV, MachineBasicBlock* > H, > bool E = false): > First(F), Last(L), SValue(SV), HeaderBB(H), Emitted(E) {} > - uint64_t First; > - uint64_t Last; > + APInt First; > + APInt Last; > Value *SValue; > MachineBasicBlock *HeaderBB; > bool Emitted; > @@ -306,14 +306,14 @@ > typedef SmallVector BitTestInfo; > > struct BitTestBlock { > - BitTestBlock(uint64_t F, uint64_t R, Value* SV, > + BitTestBlock(APInt F, APInt R, Value* SV, > unsigned Rg, bool E, > MachineBasicBlock* P, MachineBasicBlock* D, > const BitTestInfo& C): > First(F), Range(R), SValue(SV), Reg(Rg), Emitted(E), > Parent(P), Default(D), Cases(C) { } > - uint64_t First; > - uint64_t Range; > + APInt First; > + APInt Range; > Value *SValue; > unsigned Reg; > bool Emitted; > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > 
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From isanbard at gmail.com Tue Dec 23 21:38:47 2008
From: isanbard at gmail.com (Bill Wendling)
Date: Tue, 23 Dec 2008 19:38:47 -0800
Subject: [llvm-commits] [llvm] r61391 - /llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
In-Reply-To: <200812232155.mBNLteni013987@zion.cs.uiuc.edu>
References: <200812232155.mBNLteni013987@zion.cs.uiuc.edu>
Message-ID: <0E75F175-7039-4BA6-807E-039BCCD3E616@gmail.com>

On Dec 23, 2008, at 1:55 PM, Devang Patel wrote:

> Author: dpatel
> Date: Tue Dec 23 15:55:38 2008
> New Revision: 61391
>
> URL: http://llvm.org/viewvc/llvm-project?rev=61391&view=rev
> Log:
> Fix typo.
> Silence unused variable warning.
>
> Modified:
>     llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
>
> Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp?rev=61391&r1=61390&r2=61391&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp (original)
> +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp Tue Dec 23 15:55:38 2008
> @@ -1616,7 +1616,7 @@
>      while (FromTy) {
>        if (FromTy->getTag() != DW_TAG_typedef) {
>          FieldSize = FromTy->getSize();
> -        FieldAlign = FromTy->getSize();
> +        FieldAlign = FromTy->getAlign();

Did you mean to commit this with this patch? :-)

-bw

>
>          break;
>        }
>
> @@ -2776,6 +2776,7 @@
>        sys::Path FullPath(Directories[SourceFiles[i].getDirectoryID()]);
>        bool AppendOk = FullPath.appendComponent(SourceFiles[i].getName());
>        assert(AppendOk && "Could not append filename to directory!");
> +      AppendOk = false;
>        Asm->EmitFile(i, FullPath.toString());
>        Asm->EOL();
>      }
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

From isanbard at gmail.com Tue Dec 23 23:23:39 2008
From: isanbard at gmail.com (Bill Wendling)
Date: Wed, 24 Dec 2008 05:23:39 -0000
Subject: [llvm-commits] [llvm] r61414 - /llvm/trunk/test/CodeGen/Generic/switch-lower.ll
Message-ID: <200812240523.mBO5NeP4027317@zion.cs.uiuc.edu>

Author: void
Date: Tue Dec 23 23:23:34 2008
New Revision: 61414

URL: http://llvm.org/viewvc/llvm-project?rev=61414&view=rev
Log:
Revert the changes in this testcase until Anton can fix them.

Modified:
    llvm/trunk/test/CodeGen/Generic/switch-lower.ll

Modified: llvm/trunk/test/CodeGen/Generic/switch-lower.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Generic/switch-lower.ll?rev=61414&r1=61413&r2=61414&view=diff

==============================================================================
--- llvm/trunk/test/CodeGen/Generic/switch-lower.ll (original)
+++ llvm/trunk/test/CodeGen/Generic/switch-lower.ll Tue Dec 23 23:23:34 2008
@@ -1,13 +1,16 @@
-; RUN: llvm-as < %s | llc -march=x86-64 > %t
-; RUN: grep BB1_5 %t | count 131
-; RUN: grep 119 %t
-; RUN: grep 128 %t
+; RUN: llvm-as < %s | llc
 ; PR1197
 
-define void @exp_attr__expand_n_attribute_reference(i8 %in) {
+define void @exp_attr__expand_n_attribute_reference() {
+entry:
+  br i1 false, label %cond_next954, label %cond_true924
+
+cond_true924:    ; preds = %entry
+  ret void
+
 cond_next954:    ; preds = %entry
-  switch i8 %in, label %cleanup7419 [
+  switch i8 0, label %cleanup7419 [
      i8 1, label %bb956
      i8 2, label %bb1069
      i8 4, label %bb7328
@@ -87,11 +90,12 @@
      i8 126, label %bb6955
      i8 127, label %bb6990
      i8 -128, label %bb7027
-     i8 -127, label %bb7027
-     i8 -126, label %bb7027
-     i8 -124, label %bb7027
-     i8 -123, label %bb7027
-     i8 -122, label %bb7027
+     i8 -127, label %bb3879
+     i8 -126, label %bb4700
+     i8 -125, label %bb7076
+     i8 -124, label %bb2366
+     i8 -123, label %bb2366
+     i8 -122, label %bb5490
   ]
 
 bb956:    ; preds = %cond_next954

From isanbard at gmail.com Tue Dec 23 23:25:50 2008
From: isanbard at gmail.com (Bill Wendling)
Date: Wed, 24 Dec 2008 05:25:50 -0000
Subject: [llvm-commits] [llvm] r61415 - in /llvm/trunk: include/llvm/Target/TargetAsmInfo.h lib/CodeGen/AsmPrinter/DwarfWriter.cpp lib/Target/TargetAsmInfo.cpp lib/Target/X86/X86TargetAsmInfo.cpp
Message-ID: <200812240525.mBO5PpsY027387@zion.cs.uiuc.edu>

Author: void
Date: Tue Dec 23 23:25:49 2008
New Revision: 61415

URL: http://llvm.org/viewvc/llvm-project?rev=61415&view=rev
Log:
GCC doesn't emit DW_EH_PE_sdata4 for the FDE encoding on Darwin. I'm not sure about other platforms.

Modified:
    llvm/trunk/include/llvm/Target/TargetAsmInfo.h
    llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
    llvm/trunk/lib/Target/TargetAsmInfo.cpp
    llvm/trunk/lib/Target/X86/X86TargetAsmInfo.cpp

Modified: llvm/trunk/include/llvm/Target/TargetAsmInfo.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetAsmInfo.h?rev=61415&r1=61414&r2=61415&view=diff

==============================================================================
--- llvm/trunk/include/llvm/Target/TargetAsmInfo.h (original)
+++ llvm/trunk/include/llvm/Target/TargetAsmInfo.h Tue Dec 23 23:25:49 2008
@@ -448,6 +448,11 @@
     ///
     bool DwarfRequiresFrameSection; // Defaults to true.
 
+    /// FDEEncodingRequiresSData4 - If set, the FDE Encoding in the EH section
+    /// includes DW_EH_PE_sdata4.
+    ///
+    bool FDEEncodingRequiresSData4; // Defaults to true
+
     /// GlobalEHDirective - This is the directive used to make exception frame
     /// tables globally visible.
     ///
@@ -814,6 +819,9 @@
     bool doesDwarfRequireFrameSection() const {
       return DwarfRequiresFrameSection;
     }
+    bool doesFDEEncodingRequireSData4() const {
+      return FDEEncodingRequiresSData4;
+    }
     const char *getGlobalEHDirective() const {
       return GlobalEHDirective;
     }

Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp?rev=61415&r1=61414&r2=61415&view=diff

==============================================================================
--- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp (original)
+++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp Tue Dec 23 23:25:49 2008
@@ -3017,13 +3017,25 @@
       Asm->EmitInt8(DW_EH_PE_pcrel | DW_EH_PE_sdata4);
       Asm->EOL("LSDA Encoding (pcrel sdata4)");
-      Asm->EmitInt8(DW_EH_PE_pcrel | DW_EH_PE_sdata4);
-      Asm->EOL("FDE Encoding (pcrel sdata4)");
+
+      if (TAI->doesFDEEncodingRequireSData4()) {
+        Asm->EmitInt8(DW_EH_PE_pcrel | DW_EH_PE_sdata4);
+        Asm->EOL("FDE Encoding (pcrel sdata4)");
+      } else {
+        Asm->EmitInt8(DW_EH_PE_pcrel);
+        Asm->EOL("FDE Encoding (pcrel)");
+      }
     } else {
       Asm->EmitULEB128Bytes(1);
       Asm->EOL("Augmentation Size");
-      Asm->EmitInt8(DW_EH_PE_pcrel | DW_EH_PE_sdata4);
-      Asm->EOL("FDE Encoding (pcrel sdata4)");
+
+      if (TAI->doesFDEEncodingRequireSData4()) {
+        Asm->EmitInt8(DW_EH_PE_pcrel | DW_EH_PE_sdata4);
+        Asm->EOL("FDE Encoding (pcrel sdata4)");
+      } else {
+        Asm->EmitInt8(DW_EH_PE_pcrel);
+        Asm->EOL("FDE Encoding (pcrel)");
+      }
     }
 
     // Indicate locations of general callee saved registers in frame.

Modified: llvm/trunk/lib/Target/TargetAsmInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/TargetAsmInfo.cpp?rev=61415&r1=61414&r2=61415&view=diff

==============================================================================
--- llvm/trunk/lib/Target/TargetAsmInfo.cpp (original)
+++ llvm/trunk/lib/Target/TargetAsmInfo.cpp Tue Dec 23 23:25:49 2008
@@ -99,6 +99,7 @@
   SupportsDebugInformation = false;
   SupportsExceptionHandling = false;
   DwarfRequiresFrameSection = true;
+  FDEEncodingRequiresSData4 = true;
   GlobalEHDirective = 0;
   SupportsWeakOmittedEHFrame = true;
   DwarfSectionOffsetDirective = 0;

Modified: llvm/trunk/lib/Target/X86/X86TargetAsmInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetAsmInfo.cpp?rev=61415&r1=61414&r2=61415&view=diff

==============================================================================
--- llvm/trunk/lib/Target/X86/X86TargetAsmInfo.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86TargetAsmInfo.cpp Tue Dec 23 23:25:49 2008
@@ -70,6 +70,7 @@
   COMMDirectiveTakesAlignment = (Subtarget->getDarwinVers() >= 9);
   HasDotTypeDotSizeDirective = false;
   HasSingleParameterDotFile = false;
+  FDEEncodingRequiresSData4 = false;
   if (TM.getRelocationModel() == Reloc::Static) {
     StaticCtorsSection = ".constructor";
     StaticDtorsSection = ".destructor";

From isanbard at gmail.com Wed Dec 24 02:05:20 2008
From: isanbard at gmail.com (Bill Wendling)
Date: Wed, 24 Dec 2008 08:05:20 -0000
Subject: [llvm-commits] [llvm] r61420 - in /llvm/trunk: include/llvm/Target/TargetAsmInfo.h lib/CodeGen/AsmPrinter/DwarfWriter.cpp lib/Target/TargetAsmInfo.cpp lib/Target/X86/X86TargetAsmInfo.cpp
Message-ID: <200812240805.mBO85LI5000396@zion.cs.uiuc.edu>

Author: void
Date: Wed Dec 24 02:05:17 2008
New Revision: 61420

URL: http://llvm.org/viewvc/llvm-project?rev=61420&view=rev
Log:
Darwin likes for the EH frame to be non-local.

Modified:
    llvm/trunk/include/llvm/Target/TargetAsmInfo.h
    llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
    llvm/trunk/lib/Target/TargetAsmInfo.cpp
    llvm/trunk/lib/Target/X86/X86TargetAsmInfo.cpp

Modified: llvm/trunk/include/llvm/Target/TargetAsmInfo.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetAsmInfo.h?rev=61420&r1=61419&r2=61420&view=diff

==============================================================================
--- llvm/trunk/include/llvm/Target/TargetAsmInfo.h (original)
+++ llvm/trunk/include/llvm/Target/TargetAsmInfo.h Wed Dec 24 02:05:17 2008
@@ -453,6 +453,10 @@
     ///
     bool FDEEncodingRequiresSData4; // Defaults to true
 
+    /// NonLocalEHFrameLabel - If set, the EH_frame label needs to be non-local.
+    ///
+    bool NonLocalEHFrameLabel; // Defaults to false.
+
     /// GlobalEHDirective - This is the directive used to make exception frame
     /// tables globally visible.
     ///
@@ -822,6 +826,9 @@
     bool doesFDEEncodingRequireSData4() const {
       return FDEEncodingRequiresSData4;
     }
+    bool doesRequireNonLocalEHFrameLabel() const {
+      return NonLocalEHFrameLabel;
+    }
     const char *getGlobalEHDirective() const {
       return GlobalEHDirective;
     }

Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp?rev=61420&r1=61419&r2=61420&view=diff

==============================================================================
--- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp (original)
+++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp Wed Dec 24 02:05:17 2008
@@ -2963,7 +2963,10 @@
 
     // Begin eh frame section.
     Asm->SwitchToTextSection(TAI->getDwarfEHFrameSection());
-    O << TAI->getEHGlobalPrefix() << "EH_frame" << Index << ":\n";
+
+    if (!TAI->doesRequireNonLocalEHFrameLabel())
+      O << TAI->getEHGlobalPrefix();
+    O << "EH_frame" << Index << ":\n";
     EmitLabel("section_eh_frame", Index);
 
     // Define base labels.
@@ -3102,9 +3105,18 @@
 
     EmitLabel("eh_frame_begin", EHFrameInfo.Number);
 
-    EmitSectionOffset("eh_frame_begin", "eh_frame_common",
-                      EHFrameInfo.Number, EHFrameInfo.PersonalityIndex,
-                      true, true, false);
+    if (TAI->doesRequireNonLocalEHFrameLabel()) {
+      PrintRelDirective(true, true);
+      PrintLabelName("eh_frame_begin", EHFrameInfo.Number);
+
+      if (!TAI->isAbsoluteEHSectionOffsets())
+        O << "-EH_frame" << EHFrameInfo.PersonalityIndex;
+    } else {
+      EmitSectionOffset("eh_frame_begin", "eh_frame_common",
+                        EHFrameInfo.Number, EHFrameInfo.PersonalityIndex,
+                        true, true, false);
+    }
+
     Asm->EOL("FDE CIE offset");
 
     EmitReference("eh_func_begin", EHFrameInfo.Number, true, true);

Modified: llvm/trunk/lib/Target/TargetAsmInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/TargetAsmInfo.cpp?rev=61420&r1=61419&r2=61420&view=diff

==============================================================================
--- llvm/trunk/lib/Target/TargetAsmInfo.cpp (original)
+++ llvm/trunk/lib/Target/TargetAsmInfo.cpp Wed Dec 24 02:05:17 2008
@@ -100,6 +100,7 @@
   SupportsExceptionHandling = false;
   DwarfRequiresFrameSection = true;
   FDEEncodingRequiresSData4 = true;
+  NonLocalEHFrameLabel = false;
   GlobalEHDirective = 0;
   SupportsWeakOmittedEHFrame = true;
   DwarfSectionOffsetDirective = 0;

Modified: llvm/trunk/lib/Target/X86/X86TargetAsmInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetAsmInfo.cpp?rev=61420&r1=61419&r2=61420&view=diff

==============================================================================
--- llvm/trunk/lib/Target/X86/X86TargetAsmInfo.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86TargetAsmInfo.cpp Wed Dec 24 02:05:17 2008
@@ -71,6 +71,7 @@
   HasDotTypeDotSizeDirective = false;
   HasSingleParameterDotFile = false;
   FDEEncodingRequiresSData4 = false;
+  NonLocalEHFrameLabel = true;
   if (TM.getRelocationModel() == Reloc::Static) {
     StaticCtorsSection = ".constructor";
     StaticDtorsSection = ".destructor";

From anton at korobeynikov.info Wed Dec 24 03:42:09 2008
From: anton at korobeynikov.info (Anton Korobeynikov)
Date: Wed, 24 Dec 2008 12:42:09 +0300
Subject: [llvm-commits] [llvm] r61395 - in /llvm/trunk/lib/CodeGen/SelectionDAG: SelectionDAGBuild.cpp SelectionDAGBuild.h
In-Reply-To: <7FBCEECD-DAD9-46DE-A7C0-E1911BE942B8@gmail.com>
References: <200812232225.mBNMPTx4015251@zion.cs.uiuc.edu> <7FBCEECD-DAD9-46DE-A7C0-E1911BE942B8@gmail.com>
Message-ID:

Hi, Bill

On which platform are you running the test?

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From brukman at cs.uiuc.edu Wed Dec 24 12:11:46 2008
From: brukman at cs.uiuc.edu (Misha Brukman)
Date: Wed, 24 Dec 2008 12:11:46 -0600
Subject: [llvm-commits] CVS: llvm-www/releases/index.html
Message-ID: <200812241811.mBOIBkok028298@zion.cs.uiuc.edu>

Changes in directory llvm-www/releases:

index.html updated: 1.48 -> 1.49
---
Log message:

By default, align table cell contents 'center' in the download table.

---
Diffs of the changes: (+2 -4)

 index.html | 6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

Index: llvm-www/releases/index.html
diff -u llvm-www/releases/index.html:1.48 llvm-www/releases/index.html:1.49
--- llvm-www/releases/index.html:1.48 Mon Dec 15 15:26:05 2008
+++ llvm-www/releases/index.html Wed Dec 24 12:09:38 2008
@@ -67,9 +67,7 @@
 function createCell(contents, align) {
   var cell = document.createElement('td');
   cell.innerHTML = contents;
-  if (align) {
-    cell.style.textAlign = align;
-  }
+  cell.style.textAlign = align ? align : 'center';
   return cell;
 }
 var table = document.getElementById('download');
@@ -80,7 +78,7 @@
   // table.insertRow(-1).innerHTML = '...';
   var row = table.insertRow(-1);
   row.appendChild(createCell(date, 'right'));
-  row.appendChild(createCell(version, 'center'));
+  row.appendChild(createCell(version));
   row.appendChild(createCell('download'));
   row.appendChild(createCell('release notes'));
 }

From brukman at cs.uiuc.edu Wed Dec 24 13:03:00 2008
From: brukman at cs.uiuc.edu (Misha Brukman)
Date: Wed, 24 Dec 2008 13:03:00 -0600
Subject: [llvm-commits] CVS: llvm-www/header.incl
Message-ID: <200812241903.mBOJ30fW029862@zion.cs.uiuc.edu>

Changes in directory llvm-www:

header.incl updated: 1.74 -> 1.75
---
Log message:

* Charset: "iso-8859-1" -> "utf-8" (Apache says we're UTF-8 as it is)
* Comply with HTML 4.01 Strict (that's what the doctype declares)

---
Diffs of the changes: (+25 -25)

 header.incl | 50 +++++++++++++++++++++++++-------------------------
 1 files changed, 25 insertions(+), 25 deletions(-)

Index: llvm-www/header.incl
diff -u llvm-www/header.incl:1.74 llvm-www/header.incl:1.75
--- llvm-www/header.incl:1.74 Mon Nov 10 00:49:25 2008
+++ llvm-www/header.incl Wed Dec 24 13:01:39 2008
@@ -2,7 +2,7 @@
 "http://www.w3.org/TR/html4/strict.dtd">
-
+
 The LLVM Compiler Infrastructure Project
@@ -19,47 +19,47 @@
 Site Map:
-
+
Download!
Download now: -LLVM 2.4
+LLVM 2.4
-
-Try the
-online demo
-
+
+Try the
+online demo
+
-View the open-source
+View the open-source
license
-
+
Search this Site
- -
- + +
+
From brukman at cs.uiuc.edu Wed Dec 24 16:13:57 2008 From: brukman at cs.uiuc.edu (Misha Brukman) Date: Wed, 24 Dec 2008 16:13:57 -0600 Subject: [llvm-commits] CVS: llvm-www/releases/index.html Message-ID: <200812242213.mBOMDvOl002689@zion.cs.uiuc.edu> Changes in directory llvm-www/releases: index.html updated: 1.49 -> 1.50 --- Log message: * Comply with HTML 4.01 Strict + Added doctype, content type, fixed + From brukman at cs.uiuc.edu Wed Dec 24 17:09:23 2008 From: brukman at cs.uiuc.edu (Misha Brukman) Date: Wed, 24 Dec 2008 17:09:23 -0600 Subject: [llvm-commits] CVS: llvm-www/pubs/pubs.js index.html Message-ID: <200812242309.mBON9N2F004306@zion.cs.uiuc.edu> Changes in directory llvm-www/pubs: pubs.js added (r1.1) index.html updated: 1.87 -> 1.88 --- Log message: * Comply with HTML 4.01 Strict * Added pubs.js to store the publications in a quasi-BibTeX-like manner * Converted the 2 oldest papers to the new Javascript format Tested on various major platforms/browsers via http://browsershots.org . --- Diffs of the changes: (+74 -13) index.html | 19 +++++------------ pubs.js | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+), 13 deletions(-) Index: llvm-www/pubs/pubs.js diff -c /dev/null llvm-www/pubs/pubs.js:1.1 *** /dev/null Wed Dec 24 17:08:49 2008 --- llvm-www/pubs/pubs.js Wed Dec 24 17:08:39 2008 *************** *** 0 **** --- 1,68 ---- + // The array should be sorted reverse-chronologically, and will be displayed on + // the page in the order listed. + var PUBS = [ + {url: '2002-08-08-CASES02-ControlC.html', + title: 'Ensuring Code Safety Without Runtime Checks for Real-Time ' + + 'Control Systems', + author: 'Sumant Kowshik, Dinakar Dhurjati, and Vikram Adve', + published: "Proc. Int'l Conf. on Compilers, Architecture and Synthesis " + + "for Embedded Systems (CASES02)", + location: 'Grenoble, France', + date: 'Oct. 
2002'}, + + {url: '2002-06-AutomaticPoolAllocation.html', + title: 'Automatic Pool Allocation for Disjoint Data Structures', + author: 'Chris Lattner & Vikram Adve', + published: 'ACM SIGPLAN Workshop on Memory System Performance (MSP)', + location: 'Berlin, Germany', + date: 'June 2002'} + ]; + + /** + * HTML-escapes entities for display in a web page. + * + * @param {string} str The input to be escaped. + * @return {string} HTML-escaped version of str. + */ + function htmlEscape(str) { + return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;'); + } + + /** + * Tests to see if the given object is defined or not (and not null). + * + * @param {Object?} + * @return {bool} Whether or not the object is defined and not null. + */ + function isDef(obj) { + return typeof obj != 'undefined' && obj != null; + } + + /** + * Displays all publications by attaching them to the element with the passed-in + * id. + * + * @param {string} id ID of the OL/UL element that will serve as the parent of + * the publications, each of which will be a list item (LI). + */ + function displayAllPubs(id) { + var list = document.getElementById(id); + for (var i = 0; i < PUBS.length; ++i) { + var pub = PUBS[i]; + var item = document.createElement('li'); + item.innerHTML += '"' + htmlEscape(pub.title) + + '"
'; + item.innerHTML += htmlEscape(pub.author) + '
'; + item.innerHTML += '' + htmlEscape(pub.published) + ''; + if (isDef(pub.location)) { + item.innerHTML += ', ' + htmlEscape(pub.location); + } + if (isDef(pub.date)) { + item.innerHTML += ', ' + htmlEscape(pub.date); + } + item.innerHTML += '.'; + list.appendChild(item); + } + } + Index: llvm-www/pubs/index.html diff -u llvm-www/pubs/index.html:1.87 llvm-www/pubs/index.html:1.88 --- llvm-www/pubs/index.html:1.87 Thu Dec 18 12:01:05 2008 +++ llvm-www/pubs/index.html Wed Dec 24 17:08:39 2008 @@ -1,7 +1,7 @@
LLVM Related Publications
-
    +
    1. "Recovery Domains: An Organizing Principle for Recoverable Operating Systems"
      Andrew Lenharth, Samuel T. King, Vikram Adve
      @@ -256,7 +256,7 @@
    2. "Automatic Pool Allocation: Improving Performance by Controlling Data Structure Layout in the Heap"
      Chris Lattner and Vikram Adve
      - Proc. of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'05), Chicago, Illinois, + Proc. of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'05), Chicago, Illinois, June, 2005.
      Received PLDI 2005 Best Paper Award.
    3. @@ -338,17 +338,10 @@ Computer Science Dept., University of Illinois at Urbana-Champaign, Dec. 2002 -
    4. "Ensuring Code Safety Without -Runtime Checks for Real-Time Control Systems"
      Sumant Kowshik, Dinakar -Dhurjati, and Vikram Adve
      Proc. Int'l Conf. -on Compilers, Architecture and Synthesis for Embedded Systems (CASES02), -Grenoble, France, Oct. 2002.
    5. - -
    6. "Automatic Pool Allocation -for Disjoint Data Structures,"
      Chris Lattner & Vikram Adve
      -ACM SIGPLAN Workshop on Memory System Performance (MSP), Berlin, Germany, June 2002.
      -
    7. -
    + + + + From sabre at nondot.org Wed Dec 24 17:53:06 2008 From: sabre at nondot.org (Chris Lattner) Date: Wed, 24 Dec 2008 23:53:06 -0000 Subject: [llvm-commits] [llvm] r61423 - /llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Message-ID: <200812242353.mBONr7KJ005469@zion.cs.uiuc.edu> Author: lattner Date: Wed Dec 24 17:53:05 2008 New Revision: 61423 URL: http://llvm.org/viewvc/llvm-project?rev=61423&view=rev Log: translateX86CC can never fail. Simplify it based on this. Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=61423&r1=61422&r2=61423&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Dec 24 17:53:05 2008 @@ -1960,47 +1960,40 @@ } -/// translateX86CC - do a one to one translation of a ISD::CondCode to the X86 -/// specific condition code. It returns a false if it cannot do a direct -/// translation. X86CC is the translated CondCode. LHS/RHS are modified as -/// needed. -static bool translateX86CC(ISD::CondCode SetCCOpcode, bool isFP, - unsigned &X86CC, SDValue &LHS, SDValue &RHS, - SelectionDAG &DAG) { - X86CC = X86::COND_INVALID; +/// TranslateX86CC - do a one to one translation of a ISD::CondCode to the X86 +/// specific condition code, returning the condition code and the LHS/RHS of the +/// comparison to make. +static unsigned TranslateX86CC(ISD::CondCode SetCCOpcode, bool isFP, + SDValue &LHS, SDValue &RHS, SelectionDAG &DAG) { if (!isFP) { if (ConstantSDNode *RHSC = dyn_cast<ConstantSDNode>(RHS)) { if (SetCCOpcode == ISD::SETGT && RHSC->isAllOnesValue()) { // X > -1 -> X == 0, jump !sign. 
RHS = DAG.getConstant(0, RHS.getValueType()); - X86CC = X86::COND_NS; - return true; + return X86::COND_NS; } else if (SetCCOpcode == ISD::SETLT && RHSC->isNullValue()) { // X < 0 -> X == 0, jump on sign. - X86CC = X86::COND_S; - return true; + return X86::COND_S; } else if (SetCCOpcode == ISD::SETLT && RHSC->getZExtValue() == 1) { // X < 1 -> X <= 0 RHS = DAG.getConstant(0, RHS.getValueType()); - X86CC = X86::COND_LE; - return true; + return X86::COND_LE; } } switch (SetCCOpcode) { default: assert(0 && "Invalid integer condition!"); - case ISD::SETEQ: X86CC = X86::COND_E; break; - case ISD::SETGT: X86CC = X86::COND_G; break; - case ISD::SETGE: X86CC = X86::COND_GE; break; - case ISD::SETLT: X86CC = X86::COND_L; break; - case ISD::SETLE: X86CC = X86::COND_LE; break; - case ISD::SETNE: X86CC = X86::COND_NE; break; - case ISD::SETULT: X86CC = X86::COND_B; break; - case ISD::SETUGT: X86CC = X86::COND_A; break; - case ISD::SETULE: X86CC = X86::COND_BE; break; - case ISD::SETUGE: X86CC = X86::COND_AE; break; + case ISD::SETEQ: return X86::COND_E; + case ISD::SETGT: return X86::COND_G; + case ISD::SETGE: return X86::COND_GE; + case ISD::SETLT: return X86::COND_L; + case ISD::SETLE: return X86::COND_LE; + case ISD::SETNE: return X86::COND_NE; + case ISD::SETULT: return X86::COND_B; + case ISD::SETUGT: return X86::COND_A; + case ISD::SETULE: return X86::COND_BE; + case ISD::SETUGE: return X86::COND_AE; } - return true; } // First determine if it is required or is profitable to flip the operands. 
@@ -2029,25 +2022,25 @@ // 1 | 0 | 0 | X == Y // 1 | 1 | 1 | unordered switch (SetCCOpcode) { - default: return false; + default: assert(0 && "Condcode should be pre-legalized away"); case ISD::SETUEQ: - case ISD::SETEQ: X86CC = X86::COND_E; return true; + case ISD::SETEQ: return X86::COND_E; case ISD::SETOLT: // flipped case ISD::SETOGT: - case ISD::SETGT: X86CC = X86::COND_A; return true; + case ISD::SETGT: return X86::COND_A; case ISD::SETOLE: // flipped case ISD::SETOGE: - case ISD::SETGE: X86CC = X86::COND_AE; return true; + case ISD::SETGE: return X86::COND_AE; case ISD::SETUGT: // flipped case ISD::SETULT: - case ISD::SETLT: X86CC = X86::COND_B; return true; + case ISD::SETLT: return X86::COND_B; case ISD::SETUGE: // flipped case ISD::SETULE: - case ISD::SETLE: X86CC = X86::COND_BE; return true; + case ISD::SETLE: return X86::COND_BE; case ISD::SETONE: - case ISD::SETNE: X86CC = X86::COND_NE; return true; - case ISD::SETUO: X86CC = X86::COND_P; return true; - case ISD::SETO: X86CC = X86::COND_NP; return true; + case ISD::SETNE: return X86::COND_NE; + case ISD::SETUO: return X86::COND_P; + case ISD::SETO: return X86::COND_NP; } } @@ -5029,12 +5022,9 @@ SDValue Op1 = Op.getOperand(1); SDValue CC = Op.getOperand(2); bool isFP = Op.getOperand(1).getValueType().isFloatingPoint(); - unsigned X86CC; - if (!translateX86CC(cast<CondCodeSDNode>(CC)->get(), isFP, X86CC, - Op0, Op1, DAG)) - assert(0 && "Illegal SetCC!"); - + unsigned X86CC = TranslateX86CC(cast<CondCodeSDNode>(CC)->get(), isFP, + Op0, Op1, DAG); Cond = DAG.getNode(X86ISD::CMP, MVT::i32, Op0, Op1); return DAG.getNode(X86ISD::SETCC, MVT::i8, @@ -5746,11 +5736,9 @@ break; } - unsigned X86CC; SDValue LHS = Op.getOperand(1); SDValue RHS = Op.getOperand(2); - translateX86CC(CC, true, X86CC, LHS, RHS, DAG); - + unsigned X86CC = TranslateX86CC(CC, true, LHS, RHS, DAG); SDValue Cond = DAG.getNode(Opc, MVT::i32, LHS, RHS); SDValue SetCC = DAG.getNode(X86ISD::SETCC, MVT::i8, DAG.getConstant(X86CC, MVT::i8), Cond); 
From isanbard at gmail.com Wed 
Dec 24 18:57:53 2008 From: isanbard at gmail.com (Bill Wendling) Date: Wed, 24 Dec 2008 16:57:53 -0800 Subject: [llvm-commits] [llvm] r61395 - in /llvm/trunk/lib/CodeGen/SelectionDAG: SelectionDAGBuild.cpp SelectionDAGBuild.h In-Reply-To: References: <200812232225.mBNMPTx4015251@zion.cs.uiuc.edu> <7FBCEECD-DAD9-46DE-A7C0-E1911BE942B8@gmail.com> Message-ID: <38B95FD7-1FDA-4D9B-92CA-62FDC42C669F@gmail.com> Darwin Leopard. -bw On Dec 24, 2008, at 1:42 AM, "Anton Korobeynikov" wrote: > Hi, Bill > > On which platform you're running the test? > > -- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State > University > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From sabre at nondot.org Wed Dec 24 19:27:11 2008 From: sabre at nondot.org (Chris Lattner) Date: Thu, 25 Dec 2008 01:27:11 -0000 Subject: [llvm-commits] [llvm] r61424 - in /llvm/trunk/lib/Target/X86: X86Instr64bit.td X86InstrInfo.td Message-ID: <200812250127.mBP1RCda007933@zion.cs.uiuc.edu> Author: lattner Date: Wed Dec 24 19:27:10 2008 New Revision: 61424 URL: http://llvm.org/viewvc/llvm-project?rev=61424&view=rev Log: BT memory operands load from their address operand. 
Modified: llvm/trunk/lib/Target/X86/X86Instr64bit.td llvm/trunk/lib/Target/X86/X86InstrInfo.td Modified: llvm/trunk/lib/Target/X86/X86Instr64bit.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Instr64bit.td?rev=61424&r1=61423&r2=61424&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86Instr64bit.td (original) +++ llvm/trunk/lib/Target/X86/X86Instr64bit.td Wed Dec 24 19:27:10 2008 @@ -927,7 +927,7 @@ (implicit EFLAGS)]>; def BT64mr : RI<0xA3, MRMSrcMem, (outs), (ins i64mem:$src1, GR64:$src2), "bt{q}\t{$src2, $src1|$src1, $src2}", - [(X86bt addr:$src1, GR64:$src2), + [(X86bt (loadi64 addr:$src1), GR64:$src2), (implicit EFLAGS)]>; } // Defs = [EFLAGS] Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.td?rev=61424&r1=61423&r2=61424&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrInfo.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Wed Dec 24 19:27:10 2008 @@ -2665,11 +2665,11 @@ (implicit EFLAGS)]>; def BT16mr : I<0xA3, MRMSrcMem, (outs), (ins i16mem:$src1, GR16:$src2), "bt{w}\t{$src2, $src1|$src1, $src2}", - [(X86bt addr:$src1, GR16:$src2), + [(X86bt (loadi16 addr:$src1), GR16:$src2), (implicit EFLAGS)]>, OpSize; def BT32mr : I<0xA3, MRMSrcMem, (outs), (ins i32mem:$src1, GR32:$src2), "bt{l}\t{$src2, $src1|$src1, $src2}", - [(X86bt addr:$src1, GR32:$src2), + [(X86bt (loadi32 addr:$src1), GR32:$src2), (implicit EFLAGS)]>; } // Defs = [EFLAGS] From sabre at nondot.org Wed Dec 24 19:32:50 2008 From: sabre at nondot.org (Chris Lattner) Date: Thu, 25 Dec 2008 01:32:50 -0000 Subject: [llvm-commits] [llvm] r61425 - in /llvm/trunk/lib/Target/X86: X86Instr64bit.td X86InstrInfo.td Message-ID: <200812250132.mBP1WofS008095@zion.cs.uiuc.edu> Author: lattner Date: Wed Dec 24 19:32:49 2008 New Revision: 61425 URL: 
http://llvm.org/viewvc/llvm-project?rev=61425&view=rev Log: Fix some JIT encodings. Modified: llvm/trunk/lib/Target/X86/X86Instr64bit.td llvm/trunk/lib/Target/X86/X86InstrInfo.td Modified: llvm/trunk/lib/Target/X86/X86Instr64bit.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86Instr64bit.td?rev=61425&r1=61424&r2=61425&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86Instr64bit.td (original) +++ llvm/trunk/lib/Target/X86/X86Instr64bit.td Wed Dec 24 19:32:49 2008 @@ -921,14 +921,14 @@ // TODO: BT with immediate operands. // TODO: BTC, BTR, and BTS let Defs = [EFLAGS] in { -def BT64rr : RI<0xA3, MRMSrcReg, (outs), (ins GR64:$src1, GR64:$src2), +def BT64rr : RI<0xA3, MRMDestReg, (outs), (ins GR64:$src1, GR64:$src2), "bt{q}\t{$src2, $src1|$src1, $src2}", [(X86bt GR64:$src1, GR64:$src2), - (implicit EFLAGS)]>; -def BT64mr : RI<0xA3, MRMSrcMem, (outs), (ins i64mem:$src1, GR64:$src2), + (implicit EFLAGS)]>, TB; +def BT64mr : RI<0xA3, MRMDestMem, (outs), (ins i64mem:$src1, GR64:$src2), "bt{q}\t{$src2, $src1|$src1, $src2}", [(X86bt (loadi64 addr:$src1), GR64:$src2), - (implicit EFLAGS)]>; + (implicit EFLAGS)]>, TB; } // Defs = [EFLAGS] // Conditional moves Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.td?rev=61425&r1=61424&r2=61425&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrInfo.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Wed Dec 24 19:32:49 2008 @@ -2658,19 +2658,19 @@ def BT16rr : I<0xA3, MRMSrcReg, (outs), (ins GR16:$src1, GR16:$src2), "bt{w}\t{$src2, $src1|$src1, $src2}", [(X86bt GR16:$src1, GR16:$src2), - (implicit EFLAGS)]>, OpSize; + (implicit EFLAGS)]>, OpSize, TB; def BT32rr : I<0xA3, MRMSrcReg, (outs), (ins GR32:$src1, GR32:$src2), "bt{l}\t{$src2, $src1|$src1, $src2}", [(X86bt 
GR32:$src1, GR32:$src2), - (implicit EFLAGS)]>; -def BT16mr : I<0xA3, MRMSrcMem, (outs), (ins i16mem:$src1, GR16:$src2), + (implicit EFLAGS)]>, TB; +def BT16mr : I<0xA3, MRMDestMem, (outs), (ins i16mem:$src1, GR16:$src2), "bt{w}\t{$src2, $src1|$src1, $src2}", [(X86bt (loadi16 addr:$src1), GR16:$src2), - (implicit EFLAGS)]>, OpSize; -def BT32mr : I<0xA3, MRMSrcMem, (outs), (ins i32mem:$src1, GR32:$src2), + (implicit EFLAGS)]>, OpSize, TB; +def BT32mr : I<0xA3, MRMDestMem, (outs), (ins i32mem:$src1, GR32:$src2), "bt{l}\t{$src2, $src1|$src1, $src2}", [(X86bt (loadi32 addr:$src1), GR32:$src2), - (implicit EFLAGS)]>; + (implicit EFLAGS)]>, TB; } // Defs = [EFLAGS] // Sign/Zero extenders From brukman at cs.uiuc.edu Wed Dec 24 20:46:47 2008 From: brukman at cs.uiuc.edu (Misha Brukman) Date: Wed, 24 Dec 2008 20:46:47 -0600 Subject: [llvm-commits] CVS: llvm-www/pubs/index.html pubs.js Message-ID: <200812250246.mBP2klHk010319@zion.cs.uiuc.edu> Changes in directory llvm-www/pubs: index.html updated: 1.88 -> 1.89 pubs.js updated: 1.1 -> 1.2 --- Log message: * Split month vs. year in publication dates * Converted 2 more papers to the new Javascript format --- Diffs of the changes: (+33 -14) index.html | 10 ---------- pubs.js | 37 +++++++++++++++++++++++++++++++++---- 2 files changed, 33 insertions(+), 14 deletions(-) Index: llvm-www/pubs/index.html diff -u llvm-www/pubs/index.html:1.88 llvm-www/pubs/index.html:1.89 --- llvm-www/pubs/index.html:1.88 Wed Dec 24 17:08:39 2008 +++ llvm-www/pubs/index.html Wed Dec 24 20:45:50 2008 @@ -328,16 +328,6 @@ GCC"
    Chris Lattner & Vikram Adve
    First Annual GCC Developers' Summit, Ottawa, Canada, May 2003.
    -
  1. "Data Structure Analysis: -An Efficient Context-Sensitive Heap Analysis"
    Chris Lattner & Vikram -Adve
    Technical Report #UIUCDCS-R-2003-2340, Computer Science Dept., Univ. of -Illinois, Apr. 2003.
  2. - -
  3. "LLVM: An Infrastructure for -Multi-Stage Optimization"
    Chris Lattner
    Masters Thesis, -Computer Science Dept., University of Illinois at Urbana-Champaign, -Dec. 2002
  4. -
Index: llvm-www/pubs/pubs.js diff -u llvm-www/pubs/pubs.js:1.1 llvm-www/pubs/pubs.js:1.2 --- llvm-www/pubs/pubs.js:1.1 Wed Dec 24 17:08:39 2008 +++ llvm-www/pubs/pubs.js Wed Dec 24 20:45:50 2008 @@ -1,6 +1,22 @@ // The array should be sorted reverse-chronologically, and will be displayed on // the page in the order listed. var PUBS = [ + {url: '2003-04-29-DataStructureAnalysisTR.html', + title: 'Data Structure Analysis: An Efficient Context-Sensitive Heap Analysis', + author: 'Chris Lattner & Vikram Adve', + published: 'Technical Report #UIUCDCS-R-2003-2340', + location: 'Computer Science Dept., Univ. of Illinois', + month: 4, + year: 2003}, + + {url: '2002-12-LattnerMSThesis.html', + title: 'LLVM: An Infrastructure for Multi-Stage Optimization', + author: 'Chris Lattner', + published: 'Masters Thesis', + location: 'Computer Science Dept., University of Illinois at Urbana-Champaign', + month: 12, + year: 2002}, + {url: '2002-08-08-CASES02-ControlC.html', title: 'Ensuring Code Safety Without Runtime Checks for Real-Time ' + 'Control Systems', @@ -8,14 +24,16 @@ published: "Proc. Int'l Conf. on Compilers, Architecture and Synthesis " + "for Embedded Systems (CASES02)", location: 'Grenoble, France', - date: 'Oct. 2002'}, + month: 10, + year: 2002}, {url: '2002-06-AutomaticPoolAllocation.html', title: 'Automatic Pool Allocation for Disjoint Data Structures', author: 'Chris Lattner & Vikram Adve', published: 'ACM SIGPLAN Workshop on Memory System Performance (MSP)', location: 'Berlin, Germany', - date: 'June 2002'} + month: 6, + year: 2002} ]; /** @@ -47,6 +65,10 @@ * the publications, each of which will be a list item (LI). */ function displayAllPubs(id) { + var MONTHS = [ + 'Jan.', 'Feb.', 'Mar.', 'Apr.', 'May', 'June', + 'July', 'Aug.', 'Sep.', 'Oct.', 'Nov.', 'Dec.' 
+ ]; var list = document.getElementById(id); for (var i = 0; i < PUBS.length; ++i) { var pub = PUBS[i]; @@ -58,8 +80,15 @@ if (isDef(pub.location)) { item.innerHTML += ', ' + htmlEscape(pub.location); } - if (isDef(pub.date)) { - item.innerHTML += ', ' + htmlEscape(pub.date); + if (isDef(pub.month) || isDef(pub.year)) { + var date = []; + if (isDef(pub.month)) { + date.push(MONTHS[pub.month - 1]); + } + if (isDef(pub.year)) { + date.push(pub.year); + } + item.innerHTML += ', ' + htmlEscape(date.join(' ')); } item.innerHTML += '.'; list.appendChild(item); From eli.friedman at gmail.com Wed Dec 24 21:28:24 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 24 Dec 2008 19:28:24 -0800 Subject: [llvm-commits] [llvm] r61424 - in /llvm/trunk/lib/Target/X86: X86Instr64bit.td X86InstrInfo.td In-Reply-To: <200812250127.mBP1RCda007933@zion.cs.uiuc.edu> References: <200812250127.mBP1RCda007933@zion.cs.uiuc.edu> Message-ID: On Wed, Dec 24, 2008 at 5:27 PM, Chris Lattner wrote: > Author: lattner > Date: Wed Dec 24 19:27:10 2008 > New Revision: 61424 > > URL: http://llvm.org/viewvc/llvm-project?rev=61424&view=rev > Log: > BT memory operands load from their address operand. Do we really want to map this to the same intrinsic as the register form? "mov (%esp), %eax; bt %ebx, %eax;" has different semantics from "bt %ebx, (%esp), ", and the latter form is significantly slower. 
-Eli From clattner at apple.com Wed Dec 24 23:00:33 2008 From: clattner at apple.com (Chris Lattner) Date: Wed, 24 Dec 2008 21:00:33 -0800 Subject: [llvm-commits] [llvm] r61424 - in /llvm/trunk/lib/Target/X86: X86Instr64bit.td X86InstrInfo.td In-Reply-To: References: <200812250127.mBP1RCda007933@zion.cs.uiuc.edu> Message-ID: <6566EF58-EB07-4CB4-B164-EDA15444C4E9@apple.com> On Dec 24, 2008, at 7:28 PM, Eli Friedman wrote: > On Wed, Dec 24, 2008 at 5:27 PM, Chris Lattner > wrote: >> Author: lattner >> Date: Wed Dec 24 19:27:10 2008 >> New Revision: 61424 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=61424&view=rev >> Log: >> BT memory operands load from their address operand. > > Do we really want to map this to the same intrinsic as the register > form? "mov (%esp), %eax; bt %ebx, %eax;" has different semantics from > "bt %ebx, (%esp), ", and the latter form is significantly slower. It does? How so? The latter form reduces register pressure, so absent a difference in semantics it is preferable. -Chris From eli.friedman at gmail.com Wed Dec 24 23:15:21 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 24 Dec 2008 21:15:21 -0800 Subject: [llvm-commits] [llvm] r61424 - in /llvm/trunk/lib/Target/X86: X86Instr64bit.td X86InstrInfo.td In-Reply-To: <6566EF58-EB07-4CB4-B164-EDA15444C4E9@apple.com> References: <200812250127.mBP1RCda007933@zion.cs.uiuc.edu> <6566EF58-EB07-4CB4-B164-EDA15444C4E9@apple.com> Message-ID: On Wed, Dec 24, 2008 at 9:00 PM, Chris Lattner wrote: > > On Dec 24, 2008, at 7:28 PM, Eli Friedman wrote: > >> On Wed, Dec 24, 2008 at 5:27 PM, Chris Lattner >> wrote: >>> Author: lattner >>> Date: Wed Dec 24 19:27:10 2008 >>> New Revision: 61424 >>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=61424&view=rev >>> Log: >>> BT memory operands load from their address operand. >> >> Do we really want to map this to the same intrinsic as the register >> form? 
"mov (%esp), %eax; bt %ebx, %eax;" has different semantics from >> "bt %ebx, (%esp)", and the latter form is significantly slower. > > It does? How so? The later form reduces register pressure, so absent > a difference in semantics it is preferable. Suppose ebx contains the number 32. The register form modifies the bottom bit of eax; the memory form modifies bottom bit of the *following* integer in memory. Also, even ignoring that, performance is hugely different: on a Core 2, "bt %ebx, %eax" is one uop, but "bt %ebx, (%esp)" is 10 uops. The difference isn't quite as severe on other processors, but the reg-reg form is still significantly faster even if a load from memory is necessary. -Eli From clattner at apple.com Wed Dec 24 23:19:56 2008 From: clattner at apple.com (Chris Lattner) Date: Wed, 24 Dec 2008 21:19:56 -0800 Subject: [llvm-commits] [llvm] r61424 - in /llvm/trunk/lib/Target/X86: X86Instr64bit.td X86InstrInfo.td In-Reply-To: References: <200812250127.mBP1RCda007933@zion.cs.uiuc.edu> <6566EF58-EB07-4CB4-B164-EDA15444C4E9@apple.com> Message-ID: On Dec 24, 2008, at 9:15 PM, Eli Friedman wrote: > On Wed, Dec 24, 2008 at 9:00 PM, Chris Lattner > wrote: >> >> On Dec 24, 2008, at 7:28 PM, Eli Friedman wrote: >> >>> On Wed, Dec 24, 2008 at 5:27 PM, Chris Lattner >>> wrote: >>>> Author: lattner >>>> Date: Wed Dec 24 19:27:10 2008 >>>> New Revision: 61424 >>>> >>>> URL: http://llvm.org/viewvc/llvm-project?rev=61424&view=rev >>>> Log: >>>> BT memory operands load from their address operand. >>> >>> Do we really want to map this to the same intrinsic as the register >>> form? "mov (%esp), %eax; bt %ebx, %eax;" has different semantics >>> from >>> "bt %ebx, (%esp)", and the latter form is significantly slower. >> >> It does? How so? The later form reduces register pressure, so >> absent >> a difference in semantics it is preferable. > > Suppose ebx contains the number 32. 
The register form modifies the > bottom bit of eax; the memory form modifies bottom bit of the > *following* integer in memory. > > Also, even ignoring that, performance is hugely different: on a Core > 2, "bt %ebx, %eax" is one uop, but "bt %ebx, (%esp)" is 10 uops. The > difference isn't quite as severe on other processors, but the reg-reg > form is still significantly faster even if a load from memory is > necessary. Are you sure you aren't thinking of btc/bts? bt doesn't modify any operands. -Chris From eli.friedman at gmail.com Wed Dec 24 23:21:53 2008 From: eli.friedman at gmail.com (Eli Friedman) Date: Wed, 24 Dec 2008 21:21:53 -0800 Subject: [llvm-commits] [llvm] r61424 - in /llvm/trunk/lib/Target/X86: X86Instr64bit.td X86InstrInfo.td In-Reply-To: References: <200812250127.mBP1RCda007933@zion.cs.uiuc.edu> <6566EF58-EB07-4CB4-B164-EDA15444C4E9@apple.com> Message-ID: On Wed, Dec 24, 2008 at 9:19 PM, Chris Lattner wrote: > > On Dec 24, 2008, at 9:15 PM, Eli Friedman wrote: > >> On Wed, Dec 24, 2008 at 9:00 PM, Chris Lattner >> wrote: >>> >>> On Dec 24, 2008, at 7:28 PM, Eli Friedman wrote: >>> >>>> On Wed, Dec 24, 2008 at 5:27 PM, Chris Lattner >>>> wrote: >>>>> Author: lattner >>>>> Date: Wed Dec 24 19:27:10 2008 >>>>> New Revision: 61424 >>>>> >>>>> URL: http://llvm.org/viewvc/llvm-project?rev=61424&view=rev >>>>> Log: >>>>> BT memory operands load from their address operand. >>>> >>>> Do we really want to map this to the same intrinsic as the register >>>> form? "mov (%esp), %eax; bt %ebx, %eax;" has different semantics >>>> from >>>> "bt %ebx, (%esp)", and the latter form is significantly slower. >>> >>> It does? How so? The later form reduces register pressure, so >>> absent >>> a difference in semantics it is preferable. >> >> Suppose ebx contains the number 32. The register form modifies the >> bottom bit of eax; the memory form modifies bottom bit of the >> *following* integer in memory. 
>> >> Also, even ignoring that, performance is hugely different: on a Core >> 2, "bt %ebx, %eax" is one uop, but "bt %ebx, (%esp)" is 10 uops. The >> difference isn't quite as severe on other processors, but the reg-reg >> form is still significantly faster even if a load from memory is >> necessary. > > Are you sure you aren't thinking of btc/bts? bt doesn't modify any > operands. Oh, oops, s/modifies/tests/. The rest is correct. -Eli From brukman at cs.uiuc.edu Wed Dec 24 23:28:43 2008 From: brukman at cs.uiuc.edu (Misha Brukman) Date: Wed, 24 Dec 2008 23:28:43 -0600 Subject: [llvm-commits] CVS: llvm-www/pubs/index.html pubs.js Message-ID: <200812250528.mBP5Sho6014869@zion.cs.uiuc.edu> Changes in directory llvm-www/pubs: index.html updated: 1.89 -> 1.90 pubs.js updated: 1.2 -> 1.3 --- Log message: Converted all papers from hand-coded HTML to BibTeX-like Javascript. --- Diffs of the changes: (+477 -328) index.html | 326 ----------------------------------------- pubs.js | 479 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 477 insertions(+), 328 deletions(-) Index: llvm-www/pubs/index.html diff -u llvm-www/pubs/index.html:1.89 llvm-www/pubs/index.html:1.90 --- llvm-www/pubs/index.html:1.89 Wed Dec 24 20:45:50 2008 +++ llvm-www/pubs/index.html Wed Dec 24 23:27:51 2008 @@ -2,332 +2,6 @@
LLVM Related Publications
    -
  1. "Recovery Domains: An Organizing -Principle for Recoverable Operating Systems"
    -Andrew Lenharth, Samuel T. King, Vikram Adve
    -Proceedings of the Fourteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '09)
  2. - -
  3. "KLEE: Unassisted and Automatic -Generation of High-Coverage Tests for Complex Systems Programs"
    -Cristian Cadar, Daniel Dunbar, Dawson Engler
    -Proc. 8th USENIX Symposium on Operating Systems Design and Implementation -(OSDI 2008), December 2008
  4. - -
  5. "Introduction to the LLVM Compiler System"
    - Chris Lattner
    - Plenary Talk, ACAT 2008: Advanced Computing and Analysis Techniques in Physics Research, Erice, Sicily, Italy, November 2008
  6. - - -
  7. "Volatiles Are Miscompiled, -and What to Do about It"
    -Eric Eide, John Regehr
    -Proc. EMSOFT International Conference on Embedded Software (EMSOFT 2008), October 2008
  8. - - - -
  9. "A Lazy Developer Approach: -Building a JVM with Third Party Software"
    -Nicolas Geoffray, Gael Thomas, Charles Clement and Bertil Folliot
    -Proc. International Conference on Principles and Practice of Programming -In Java (PPPJ 2008), September 2008
  10. - -
  11. "Run-Time Code Generation for Materials"
    Stephan Reiter
    IEEE Symposium on Interactive Ray Tracing (RT'08), August 2008
  12. - -
  13. "Verifying Multi-threaded C Programs -with SPIN"
    -Anna Zaks and Rajeev Joshi
    -Proc. International SPIN Workshop on Model Checking of Software (SPIN -2008), August 2008
  14. - -
  15. "Generalized Instruction Selection -using SSA-Graphs"
    -Dietmar Ebner, Florian Brandner, Bernhard Scholz, Andreas Krall, Peter Wiedermann and Albrecht Kadlec
    -Proc. Languages Compilers and Tools for Embedded Systems 2008 (LCTES'08), June, 2008
  16. - - -
  17. "Automatic Data Partitioning in Software -Transactional Memories"
    -Torvald Riegel, Christof Fetzer, and Pascal Felber
    -Proc. 20th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'08), June, 2008
  18. - -
  19. "Register Allocation by Puzzle Solving"
    -Fernando Magno Quintao Pereira and Jens Palsberg
    -Proc. ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation (PLDI'08), June, 2008
  20. - -
  21. "CoVaC: Compiler Validation by Program Analysis -of the Cross-Product"
    -Anna Zaks and Amir Pnueli
    -Proc. International Symposium on Formal Methods (FM 2008), May 2008
  22. - -
  23. "LLVM and Clang: Next Generation Compiler Technology"
    -Chris Lattner
    -BSDCan 2008: The BSD Conference, Ottawa, Canada, May 2008.
  24. - -
  25. "Cycle-approximate Retargetable Performance -Estimation at the Transaction Level"
    -Y. Hwang, S. Abdi, and D. Gajski
    -Proc. of Design Automation and Test in Europe (DATE'08), Munich, Germany, March 2008
  26. - -
  27. "User-Input Dependence Analysis via Graph Reachability"
    -Bernard Scholz, Chenyi Zhang, and Cristina Cifuentes -
    -Technical Report #TR-2008-171, Sun Microsystems, March 2008
  28. - -
  29. "Impeding Malware -Analysis Using Conditional Code Obfuscation"
    -Monirul Sharif, Andrea Lanzi, Jonathon Giffin and Wenke Lee
    -Network and Distributed System Security Symposium (NDSS'08), San Diego, CA, February 2008
  30. - - -
  31. "Making Object-Based STM Practical in Unmanaged Environments"
    -Torvald Riegel and Diogo Becker de Brum
    -ACM SIGPLAN Workshop on Transactional Computing (TRANSACT 2008), Salt Lake City, Utah, 2008
  32. - -
  33. "Near-Optimal Instruction Selection on DAGs"
    -David Ryan Koes and Seth Copen Goldstein
    -Proc. of the 2008 International Symposium on Code Generation and Optimization (CGO'08), Boston, MA, 2008.
  34. - - -
  35. -Secure Virtual Architecture: A Safe Execution Environment for Commodity -Operating Systems
    John Criswell, Andrew Lenharth, Dinakar Dhurjati, and -Vikram Adve -
    -Proceedings of the Twenty First ACM Symposium on Operating Systems Principles (SOSP '07), Stevenson, WA, October 2007. -
    -Received an SOSP 2007 Audience Choice Award. -
  36. - -
  37. "Transactifying Applications -Using an Open Compiler Framework"
    Pascal Felber, Christof Fetzer, -Ulrich Mueller, Torvald Riegel, Martin Suesskraut, and Heiko Sturzrehm
    -TRANSACT 2007, August 2007.
  38. - -
  39. "LLVM 2.0 -and Beyond!"
    -Chris Lattner
    -Google Tech Talk, Mountain View, CA, July 2007.
  40. - - -
  41. "Structural Abstraction of - Software Verification Conditions"
    - Domagoj Babic and Alan J. Hu.
    - Proc. of the 19th Int. Conf. on Computer Aided Verification - (CAV'07), Berlin, Germany, Jul, 2007.
  42. - -
  43. "Making Context-Sensitive Points-to - Analysis with Heap Cloning Practical For The Real World"
    - Chris Lattner, Andrew Lenharth, and Vikram Adve.
    - Proc. of the 2007 ACM SIGPLAN Conference on Programming Language - Design and Implementation (PLDI'07), San Diego, CA, Jun, 2007.
  44. - -
  45. "Improving Switch Lowering for - The LLVM Compiler System"
    - Anton Korobeynikov.
    - Proc. of the 2007 Spring Young Researchers Colloquium on Software - Engineering (SYRCoSE'2007), Moscow, Russia, May, 2007.
  46. - -
  47. "A Change Framework based on the Low Level -Virtual Machine Compiler Infrastructure"
    -Jakob Praher
    - Masters Thesis, Institute for System Software -Johannes Kepler University Linz, April 2007.
  48. - -
  49. "An Aspect for Idiom-based Exception - Handling (using local continuation join points, join point properties, - annotations and type parameters)"
    - Bram Adams and Kris De Schutter.
    - Proc. of the 5th Software-Engineering Properties of Languages and Aspect - Technologies Workshop (SPLAT), - AOSD 2007, Vancouver, Canada, March, 2007.
  50. - -
  51. "The LLVM Compiler System"
    - Chris Lattner
    - 2007 Bossa Conference on Open Source, Mobile Internet and Multimedia, - Recife, Brazil, March 2007.
  52. - -
  53. "Scaling Task Graphs for - Network Processors"
    - Martin Labrecque and J. Gregory Steffan
    - IFIP International Conference on Network and Parallel Computing, Tokyo, - Japan, October, 2006.
  54. - -
  55. "Automated Compile-Time and -Run-Time Techniques to Increase Usable Memory in MMU-Less Embedded Systems"
    -L. Bai, L. Yang, and R. P. Dick
    -Proc. Int. Conf. Compilers, Architecture & Synthesis for Embedded Systems, -pp. 125-135, Oct. 2006.
  56. - -
  57. "Platform-Based Behavior-Level and System-Level Synthesis"
    -J. Cong, Y. Fan, G. Han, W. Jiang, and Z. Zhang
    -Proceedings of IEEE International SOC Conference, pp. 199-202, Austin, Texas, Sept. 2006.
  58. - -
  59. "Efficiently Detecting All Dangling Pointer Uses in Production Servers"
    Dinakar Dhurjati and Vikram Adve.
    -Proceedings of the International Conference on Dependable Systems and Networks (DSN '06), Philadelphia, Pennsylvania, 2006.
  60. - -
  61. "A Virtual Instruction Set Interface for Operating System Kernels"
    John Criswell, Brent Monroe, and Vikram Adve.
    -Workshop on the Interaction between Operating Systems and Computer Architecture (WIOSCA '06), Boston, Massachusetts, 2006.
  62. - -
  63. "SAFECode: Enforcing Alias Analysis for Weakly Typed Languages"
    Dinakar Dhurjati, Sumant, Kowshik, and Vikram Adve.
    -Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '06), Ottawa, Canada, 2006.
  64. - -
  65. " -Backwards-Compatible Array Bounds Checking for C with Very Low Overhead" -
    By Dinakar Dhurjati and Vikram Adve, -
    Proceedings of the 28th -International Conference on Software Engineering (ICSE '06), -Shanghai, China, 2006.
  66. - -
  67. "Vector LLVA: A Virtual Vector Instruction Set for Media Processing"
    Robert L. Bocchino Jr. and Vikram S. Adve.
    -Proc. of the Second International Conference on Virtual Execution Environments (VEE'06), Ottawa, Canada, 2006.
  68. - -
  69. "Checker: a Static Program Checker"
    -Nicholas Lewycky
    B.Sc. Thesis, Computer Science Dept., Ryerson University, June 2006.
  70. - -
  71. "Introduction to the LLVM Compiler Infrastructure"
    Chris Lattner
    - 2006 Itanium Conference and Expo, San Jose, California, April 2006.
  72. - -
  73. "Tailoring Graph-coloring Register Allocation For Runtime Compilation"
    Keith D. Cooper and Anshuman Dasgupta
    Proc. of the 2006 International Symposium on Code Generation and Optimization (CGO'06), New York, New York, 2006.
  74. - - -
  75. "Towards a Compilation Infrastructure for Network Processors"
    -Martin Labrecque
    -Masters Thesis, Department of Electrical and Computer Engineering, University of Toronto, January, 2006.
  76. - - - -
  77. "How Successful is Data Structure Analysis in Isolating and Analyzing - Linked Data Structures?"
    Patrick Meredith, Balpreet Pankaj, Swarup Sahoo, Chris Lattner and Vikram Adve
    Technical Report #UIUCDCS-R-2005-2658, Computer Science Dept., Univ. of -Illinois, Dec. 2005.
  78. - - -
  79. "Enforcing Alias Analysis for - Weakly Typed Languages"
    Dinakar Dhurjati, Sumant Kowshik, and Vikram -Adve
    Technical Report #UIUCDCS-R-2005-2657, Computer Science Dept., Univ. of -Illinois, Nov. 2005.
  80. - -
  81. "Revisiting Graph Coloring Register - Allocation: A Study of the Chaitin-Briggs and Callahan-Koblenz - Algorithms"
    - By Keith Cooper, Anshuman Dasgupta, and Jason Eckhardt.
    - Proc. of the Workshop on Languages and Compilers for Parallel - Computing (LCPC'05), Hawthorne, NY, October 20-22, 2005
  82. - -
  83. "Segment Protection for - Embedded Systems Using Run-time Checks"
    - By Matthew Simpson, Bhuvan Middha and Rajeev Barua
    - Proc. of the ACM International Conference on Compilers, - Architecture, and Synthesis for Embedded Systems (CASES'05), - San Francisco, CA, September, 2005
  84. - -
  85. " -A Concept Analysis Inspired Greedy Algorithm for Test Suite Minimization"
    -By Sriraman Tallam and Neelam Gupta
    -ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and -Engineering (PASTE 2005), Lisbon, Portugal, September 5-6, 2005.
  86. - - -
  87. "Deciding Where to Call - Performance Libraries"
    - By C. Alias and D. Barthou
    - Proc. of the International IEEE Euro-Par Conference, August, 2005
  88. - -
  89. "Practical Techniques for Performance Estimation of Processors"
    - Abhijit Ray, Thambipillai Srikanthan and Wu Jigang.
    - Proceedings of the 9th International Database Engineering & Application Sy -mposium (IDEAS'05), July 2005.
  90. - -
  91. "Profile-directed If-Conversion in - Superscalar Microprocessors"
    -Eric Zimmerman
    -Masters Thesis, Computer Science Dept., University of Illinois at -Urbana-Champaign, July 2005.
  92. - - -
  93. "An Implementation of Swing Modulo Scheduling with Extensions for Superblocks"
    -Tanya M. Lattner.
    M.S. Thesis, Computer Science Dept., University of Illinois at -Urbana-Champaign, June 2005.
  94. - -
  95. "Macroscopic Data Structure -Analysis and Optimization"
    -Chris Lattner
    Ph.D. Thesis, Computer Science Dept., University of Illinois at -Urbana-Champaign, May 2005.
  96. - -
  97. "Automatic Pool Allocation: -Improving Performance by Controlling Data Structure Layout in the Heap"
    -Chris Lattner and Vikram Adve
    - Proc. of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'05), Chicago, Illinois, -June, 2005.
    -Received PLDI 2005 Best Paper Award.
  98. - -
  99. "Transparent Pointer Compression -for Linked Data Structures"
    -Chris Lattner and Vikram Adve
    Proc. of the -ACM Workshop on Memory System Performance (MSP'05), Chicago, Illinois, June, 2005. -
  100. - -
  101. Using a Low-Level Virtual - Machine to Improve Dynamic Aspect Support in Operating System Kernels
    - By Michael Engel and Bernd Freisleben.
    -Proc. of the 4th AOSD Workshop on Aspects, Components, and Patterns for - Infrastructure Software (ACP4IS'05), March 14-18, Chicago, 2005 -
  102. - -
  103. "Memory Safety Without - Garbage Collection for Embedded Applications"
    Dinakar - Dhurjati, Sumant Kowshik, Vikram Adve and Chris Lattner
    - ACM Transactions in Embedded Computing Systems (TECS), February 2005.
  104. - -
  105. "The LLVM Compiler Framework and -Infrastructure Tutorial"
    Chris Lattner and Vikram Adve
    -LCPC'04 Mini Workshop on Compiler Research Infrastructures, West Lafayette, Indiana, Sep. 2004.
  106. - -
  107. "RubyComp - A Ruby-to-LLVM -Compiler Prototype"
    Anders Alexandersson
    Masters Thesis, -Division of Computer Science at the Department of Informatics and Mathematics, -University of Trollhättan/Uddevalla, Sweden, Spring 2004
  108. - -
  109. "A Task Optimization Framework for -MSSP"
    Rahul Ulhas Joshi
    Masters Thesis, -Computer Science Dept., University of Illinois at Urbana-Champaign, May 2004.
  110. - -
  111. "Coordinating Adaptations in -Distributed Systems"
    -Brian Ensink and Vikram Adve
    Proc. of the 24th International Conference on -Distributed Computing Systems (ICDCS 2004), Tokyo, Japan, March 2004
  112. - -
  113. "LLVM: A Compilation Framework for -Lifelong Program Analysis & Transformation"
    Chris Lattner and Vikram -Adve
    Proc. of the 2004 International Symposium -on Code Generation and Optimization (CGO'04), Palo Alto, California, Mar. -2004.
  114. - -
  115. "LLVA: A Low-level Virtual Instruction Set -Architecture"
    Vikram Adve, Chris Lattner, Michael Brukman, Anand Shukla, -and Brian Gaeke
    Proc. of the -36th annual ACM/IEEE international symposium on Microarchitecture -(MICRO-36), San Diego, CA, December 2003.
  116. - -
  117. "Language Extensions for -Performance-Oriented Programming"
    Joel Stanley
    Masters Thesis, -Computer Science Dept., University of Illinois at Urbana-Champaign, -July 2003
  118. - -
  119. "Lightweight, Cross-Procedure -Tracing for Runtime Optimization"
    Anand Shukla
    Masters Thesis, -Computer Science Dept., University of Illinois at Urbana-Champaign, -July 2003
  120. - -
  121. "Memory Safety Without Runtime -Checks or Garbage Collection"
    Dinakar Dhurjati, Sumant Kowshik, Vikram -Adve and Chris Lattner
    Proc. of -Languages Compilers and Tools for Embedded Systems 2003 (LCTES 03), San -Diego, CA, June 2003.
  122. - -
  123. "Architecture For a Next-Generation -GCC"
    Chris Lattner & Vikram Adve
    First Annual GCC Developers' -Summit, Ottawa, Canada, May 2003.
  124. -
Index: llvm-www/pubs/pubs.js diff -u llvm-www/pubs/pubs.js:1.2 llvm-www/pubs/pubs.js:1.3 --- llvm-www/pubs/pubs.js:1.2 Wed Dec 24 20:45:50 2008 +++ llvm-www/pubs/pubs.js Wed Dec 24 23:27:51 2008 @@ -1,6 +1,476 @@ // The array should be sorted reverse-chronologically, and will be displayed on // the page in the order listed. var PUBS = [ + {url: '2009-03-ASPLOS-Recovery.html', + title: 'Recovery Domains: An Organizing Principle for Recoverable Operating Systems', + author: 'Andrew Lenharth, Samuel T. King, Vikram Adve', + published: "Proceedings of the Fourteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '09)"}, + + {url: '2008-12-OSDI-KLEE.html', + title: 'KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs', + author: 'Cristian Cadar, Daniel Dunbar, Dawson Engler', + published: "Proc. 8th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2008)", + month: 12, + year: 2008}, + + {url: '2008-10-04-ACAT-LLVM-Intro.html', + title: 'Introduction to the LLVM Compiler System', + author: 'Chris Lattner', + published: 'Plenary Talk, ACAT 2008: Advanced Computing and Analysis Techniques in Physics Research', + location: 'Erice, Sicily, Italy', + month: 11, + year: 2008}, + + {url: '2008-10-EMSOFT-Volatiles.html', + title: 'Volatiles Are Miscompiled, and What to Do about It', + author: 'Eric Eide, John Regehr', + published: "Proc. EMSOFT International Conference on Embedded Software (EMSOFT 2008)", + month: 10, + year: 2008}, + + {url: '2008-09-LadyVM.html', + title: 'A Lazy Developer Approach: Building a JVM with Third Party Software', + author: 'Nicolas Geoffray, Gael Thomas, Charles Clement and Bertil Folliot', + published: "Proc. 
International Conference on Principles and Practice of Programming In Java (PPPJ 2008)", + month: 9, + year: 2008}, + + {url: '2008-08-RTCodegen.html', + title: 'Run-Time Code Generation for Materials', + author: 'Stephan Reiter', + published: "IEEE Symposium on Interactive Ray Tracing (RT'08)", + month: 8, + year: 2008}, + + {url: '2008-08-SPIN-Pancam.html', + title: 'Verifying Multi-threaded C Programs with SPIN', + author: 'Anna Zaks and Rajeev Joshi', + published: "Proc. International SPIN Workshop on Model Checking of Software (SPIN 2008)", + month: 8, + year: 2008}, + + {url: '2008-06-LCTES-ISelUsingSSAGraphs.html', + title: 'Generalized Instruction Selection using SSA-Graphs', + author: 'Dietmar Ebner, Florian Brandner, Bernhard Scholz, Andreas Krall, Peter Wiedermann and Albrecht Kadlec', + published: "Proc. Languages Compilers and Tools for Embedded Systems 2008 (LCTES'08)", + month: 6, + year: 2008}, + + {url: '2008-06-13-SPAA-STMDataPartitioning.html', + title: 'Automatic Data Partitioning in Software Transactional Memories', + author: 'Torvald Riegel, Christof Fetzer, and Pascal Felber', + published: "Proc. 20th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'08)", + month: 6, + year: 2008}, + + {url: '2008-06-PLDI-PuzzleSolving.html', + title: 'Register Allocation by Puzzle Solving', + author: 'Fernando Magno Quintao Pereira and Jens Palsberg', + published: "Proc. ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation (PLDI'08)", + month: 6, + year: 2008}, + + {url: '2008-05-CoVaC.html', + title: 'CoVaC: Compiler Validation by Program Analysis of the Cross-Product', + author: 'Anna Zaks and Amir Pnueli', + published: "Proc. 
International Symposium on Formal Methods (FM 2008)", + month: 5, + year: 2008}, + + {url: '2008-05-17-BSDCan-LLVMIntro.html', + title: 'LLVM and Clang: Next Generation Compiler Technology', + author: 'Chris Lattner', + published: 'BSDCan 2008: The BSD Conference', + location: 'Ottawa, Canada', + month: 5, + year: 2008}, + + {url: '2008-03-DATE-TLM_Estimation.html', + title: 'Cycle-approximate Retargetable Performance Estimation at the Transaction Level', + author: 'Y. Hwang, S. Abdi, and D. Gajski', + published: "Proc. of Design Automation and Test in Europe (DATE'08)", + location: 'Munich, Germany', + month: 3, + year: 2008}, + + {url: '2008-03-TR-UIDependAnalysis.html', + title: 'User-Input Dependence Analysis via Graph Reachability', + author: 'Bernard Scholz, Chenyi Zhang, and Cristina Cifuentes', + published: "Technical Report #TR-2008-171, Sun Microsystems", + month: 3, + year: 2008}, + + {url: '2008-02-ImpedingMalwareAnalysis.html', + title: 'Impeding Malware Analysis Using Conditional Code Obfuscation', + author: 'Monirul Sharif, Andrea Lanzi, Jonathon Giffin and Wenke Lee', + published: "Network and Distributed System Security Symposium (NDSS'08)", + location: 'San Diego, CA', + month: 2, + year: 2008}, + + {url: '2008-02-23-TRANSACT-TangerObjBased.html', + title: 'Making Object-Based STM Practical in Unmanaged Environments', + author: 'Torvald Riegel and Diogo Becker de Brum', + published: "ACM SIGPLAN Workshop on Transactional Computing (TRANSACT 2008)", + location: 'Salt Lake City, Utah', + year: 2008}, + + {url: '2008-CGO-DagISel.html', + title: 'Near-Optimal Instruction Selection on DAGs', + author: 'David Ryan Koes and Seth Copen Goldstein', + published: "Proc. 
of the 2008 International Symposium on Code Generation and Optimization (CGO'08)", + location: 'Boston, MA', + year: 2008}, + + {url: '2007-SOSP-SVA.html', + title: 'Secure Virtual Architecture: A Safe Execution Environment for Commodity Operating Systems', + author: 'John Criswell, Andrew Lenharth, Dinakar Dhurjati, and Vikram Adve', + published: "Proceedings of the Twenty First ACM Symposium on Operating Systems Principles (SOSP '07)", + award: 'Received an SOSP 2007 Audience Choice Award', + location: 'Stevenson, WA', + month: 10, + year: 2007}, + + {url: '2007-08-16-TRANSACT-Tanger.html', + title: 'Transactifying Applications Using an Open Compiler Framework', + author: 'Pascal Felber, Christof Fetzer, Ulrich Mueller, Torvald Riegel, Martin Suesskraut, and Heiko Sturzrehm', + published: "TRANSACT 2007", + month: 8, + year: 2007}, + + {url: '2007-07-25-LLVM-2.0-and-Beyond.html', + title: 'LLVM 2.0 and Beyond!', + author: 'Chris Lattner', + published: "Google Tech Talk", + location: 'Mountain View, CA', + month: 7, + year: 2007}, + + {url: '2007-07-CAV-StructuralAbstraction.html', + title: 'Structural Abstraction of Software Verification Conditions', + author: 'Domagoj Babic and Alan J. Hu', + published: "Proc. of the 19th Int. Conf. on Computer Aided Verification (CAV'07)", + location: 'Berlin, Germany', + month: 7, + year: 2007}, + + {url: '2007-06-10-PLDI-DSA.html', + title: 'Making Context-Sensitive Points-to Analysis with Heap Cloning Practical For The Real World', + author: 'Chris Lattner, Andrew Lenharth, and Vikram Adve', + published: "Proc. of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'07)", + location: 'San Diego, CA', + month: 6, + year: 2007}, + + {url: '2007-05-31-Switch-Lowering.html', + title: 'Improving Switch Lowering for The LLVM Compiler System', + author: 'Anton Korobeynikov', + published: "Proc. 
of the 2007 Spring Young Researchers Colloquium on Software Engineering (SYRCoSE'2007)", + location: 'Moscow, Russia', + month: 5, + year: 2007}, + + {url: '2007-04-PraherMSThesis.html', + title: 'A Change Framework based on the Low Level Virtual Machine Compiler Infrastructure', + author: 'Jakob Praher', + published: "Masters Thesis", + location: 'Institute for System Software Johannes Kepler University Linz', + month: 4, + year: 2007}, + + {url: '2007-03-SPLAT-Aspects.html', + title: 'An Aspect for Idiom-based Exception Handling (using local continuation join points, join point properties, annotations and type parameters)', + author: 'Bram Adams and Kris De Schutter', + published: "Proc. of the 5th Software-Engineering Properties of Languages and Aspect Technologies Workshop (SPLAT)", + location: 'AOSD 2007, Vancouver, Canada', + month: 3, + year: 2007}, + + {url: '2007-03-12-BossaLLVMIntro.html', + title: 'The LLVM Compiler System', + author: 'Chris Lattner', + published: "2007 Bossa Conference on Open Source, Mobile Internet and Multimedia", + location: 'Recife, Brazil', + month: 3, + year: 2007}, + + {url: '2006-10-ICNPC-ScalingTaskGraphs.html', + title: 'Scaling Task Graphs for Network Processors', + author: 'Martin Labrecque and J. Gregory Steffan', + published: "IFIP International Conference on Network and Parallel Computing", + location: 'Tokyo, Japan', + month: 10, + year: 2006}, + + {url: '2006-10-CASES-IncreaseMem.html', + title: 'Automated Compile-Time and Run-Time Techniques to Increase Usable Memory in MMU-Less Embedded Systems', + author: 'L. Bai, L. Yang, and R. P. Dick', + published: "Proc. Int. Conf. Compilers, Architecture & Synthesis for Embedded Systems", + location: 'pp. 125-135', + month: 10, + year: 2006}, + + {url: '2006-09-SOC-Synthesis.html', + title: 'Platform-Based Behavior-Level and System-Level Synthesis', + author: 'J. Cong, Y. Fan, G. Han, W. Jiang, and Z. 
Zhang', + published: "Proceedings of IEEE International SOC Conference", + location: 'pp. 199-202, Austin, Texas', + month: 9, + year: 2006}, + + {url: '2006-DSN-DanglingPointers.html', + title: 'Efficiently Detecting All Dangling Pointer Uses in Production Servers', + author: 'Dinakar Dhurjati and Vikram Adve', + published: "Proceedings of the International Conference on Dependable Systems and Networks (DSN '06)", + location: 'Philadelphia, Pennsylvania', + year: 2006}, + + {url: '2006-06-18-WIOSCA-LLVAOS.html', + title: 'A Virtual Instruction Set Interface for Operating System Kernels', + author: 'John Criswell, Brent Monroe, and Vikram Adve', + published: "Workshop on the Interaction between Operating Systems and Computer Architecture (WIOSCA '06)", + location: 'Boston, Massachusetts', + year: 2006}, + + {url: '2006-06-12-PLDI-SAFECode.html', + title: 'SAFECode: Enforcing Alias Analysis for Weakly Typed Languages', + author: 'Dinakar Dhurjati, Sumant, Kowshik, and Vikram Adve', + published: "Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '06)", + location: 'Ottawa, Canada', + year: 2006}, + + {url: 'http://llvm.org/pubs/2006-05-24-SAFECode-BoundsCheck.html', + title: 'Backwards-Compatible Array Bounds Checking for C with Very Low Overhead', + author: 'Dinakar Dhurjati and Vikram Adve', + published: "Proceedings of the 28th International Conference on Software Engineering (ICSE '06)", + location: 'Shanghai, China', + year: 2006}, + + {url: '2006-06-15-VEE-VectorLLVA.html', + title: 'Vector LLVA: A Virtual Vector Instruction Set for Media Processing', + author: 'Robert L. Bocchino Jr. and Vikram S. Adve', + published: "Proc. of the Second International Conference on Virtual Execution Environments (VEE'06)", + location: 'Ottawa, Canada', + year: 2006}, + + {url: '2006-06-07-LewyckyChecker.html', + title: 'Checker: a Static Program Checker', + author: 'Nicholas Lewycky', + published: "B.Sc. 
Thesis", + location: 'Computer Science Dept., Ryerson University', + month: 6, + year: 2006}, + + {url: '2006-04-25-GelatoLLVMIntro.html', + title: 'Introduction to the LLVM Compiler Infrastructure', + author: 'Chris Lattner', + published: "2006 Itanium Conference and Expo", + location: 'San Jose, California', + month: 4, + year: 2006}, + + {url: '2006-04-04-CGO-GraphColoring.html', + title: 'Tailoring Graph-coloring Register Allocation For Runtime Compilation', + author: 'Keith D. Cooper and Anshuman Dasgupta', + published: "Proc. of the 2006 International Symposium on Code Generation and Optimization (CGO'06)", + location: 'New York, New York', + year: 2006}, + + {url: '2006-01-LabrecqueMSThesis.html', + title: 'Towards a Compilation Infrastructure for Network Processors', + author: 'Martin Labrecque', + published: "Masters Thesis", + location: 'Department of Electrical and Computer Engineering, University of Toronto', + month: 1, + year: 2006}, + + {url: '2005-TR-DSAEvaluation.html', + title: 'How Successful is Data Structure Analysis in Isolating and Analyzing Linked Data Structures?', + author: 'Patrick Meredith, Balpreet Pankaj, Swarup Sahoo, Chris Lattner and Vikram Adve', + published: "Technical Report #UIUCDCS-R-2005-2658, Computer Science Dept., Univ. of Illinois", + month: 12, + year: 2005}, + + {url: '2005-11-SAFECodeTR.html', + title: 'Enforcing Alias Analysis for Weakly Typed Languages', + author: 'Dinakar Dhurjati, Sumant Kowshik, and Vikram Adve', + published: "Technical Report #UIUCDCS-R-2005-2657, Computer Science Dept., Univ. of Illinois", + month: 11, + year: 2005}, + + {url: '2005-10-20-LCPC-RegAlloc.html', + title: 'Revisiting Graph Coloring Register Allocation: A Study of the Chaitin-Briggs and Callahan-Koblenz Algorithms', + author: 'By Keith Cooper, Anshuman Dasgupta, and Jason Eckhardt', + published: "Proc. 
of the Workshop on Languages and Compilers for Parallel Computing (LCPC'05)", + location: 'Hawthorne, NY', + month: 10, + year: 2005}, + + {url: '2005-09-25-CASES05-SegmentProtection.html', + title: 'Segment Protection for Embedded Systems Using Run-time Checks', + author: 'By Matthew Simpson, Bhuvan Middha and Rajeev Barua', + published: "Proc. of the ACM International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES'05)", + location: 'San Francisco, CA', + month: 9, + year: 2005}, + + {url: '2005-09-PASTE-GreedySuiteMinimization.html', + title: 'A Concept Analysis Inspired Greedy Algorithm for Test Suite Minimization', + author: 'By Sriraman Tallam and Neelam Gupta', + published: "ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE 2005)", + location: 'Lisbon, Portugal', + month: 9, + year: 2005}, + + {url: '2005-08-EUROPAR-PerformanceLibs.html', + title: 'Deciding Where to Call Performance Libraries', + author: 'By C. Alias and D. Barthou', + published: "Proc. of the International IEEE Euro-Par Conference", + month: 8, + year: 2005}, + + {url: '2005-07-IDEAS-PerfEstimation.html', + title: 'Practical Techniques for Performance Estimation of Processors', + author: 'Abhijit Ray, Thambipillai Srikanthan and Wu Jigang', + published: "Proceedings of the 9th International Database Engineering & Application Symposium (IDEAS'05)", + month: 7, + year: 2005}, + + {url: '2005-07-ZimmermanMSThesis.html', + title: 'Profile-directed If-Conversion in Superscalar Microprocessors', + author: 'Eric Zimmerman', + published: "Masters Thesis", + location: 'Computer Science Dept., University of Illinois at Urbana-Champaign', + month: 7, + year: 2005}, + + {url: '2005-06-17-LattnerMSThesis.html', + title: 'An Implementation of Swing Modulo Scheduling with Extensions for Superblocks', + author: 'Tanya M. Lattner', + published: "M.S. 
Thesis", + location: 'Computer Science Dept., University of Illinois at Urbana-Champaign', + month: 6, + year: 2005}, + + {url: '2005-05-04-LattnerPHDThesis.html', + title: 'Macroscopic Data Structure Analysis and Optimization', + author: 'Chris Lattner', + published: "Ph.D. Thesis", + location: 'Computer Science Dept., University of Illinois at Urbana-Champaign', + month: 5, + year: 2005}, + + {url: '2005-05-21-PLDI-PoolAlloc.html', + title: 'Automatic Pool Allocation: Improving Performance by Controlling Data Structure Layout in the Heap', + author: 'Chris Lattner and Vikram Adve', + published: "Proc. of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'05)", + award: 'Received PLDI 2005 Best Paper Award', + location: 'Chicago, Illinois', + month: 6, + year: 2005}, + + {url: '2005-06-12-MSP-PointerComp.html', + title: 'Transparent Pointer Compression for Linked Data Structures', + author: 'Chris Lattner and Vikram Adve', + published: "Proc. of the ACM Workshop on Memory System Performance (MSP'05)", + location: 'Chicago, Illinois', + month: 6, + year: 2005}, + + {url: '2005-03-14-ACP4IS-AspectsKernel.html', + title: 'Using a Low-Level Virtual Machine to Improve Dynamic Aspect Support in Operating System Kernels', + author: 'By Michael Engel and Bernd Freisleben', + published: "Proc. 
of the 4th AOSD Workshop on Aspects, Components, and Patterns for Infrastructure Software (ACP4IS'05)", + location: 'Chicago', + month: 3, + year: 2005}, + + {url: '2005-02-TECS-SAFECode.html', + title: 'Memory Safety Without Garbage Collection for Embedded Applications', + author: 'Dinakar Dhurjati, Sumant Kowshik, Vikram Adve and Chris Lattner', + published: "ACM Transactions in Embedded Computing Systems (TECS)", + month: 2, + year: 2005}, + + {url: '2004-09-22-LCPCLLVMTutorial.html', + title: 'The LLVM Compiler Framework and Infrastructure Tutorial', + author: 'Chris Lattner and Vikram Adve', + published: "LCPC'04 Mini Workshop on Compiler Research Infrastructures", + location: 'West Lafayette, Indiana', + month: 9, + year: 2004}, + + {url: '2004-Spring-AlexanderssonMSThesis.html', + title: 'RubyComp - A Ruby-to-LLVM Compiler Prototype', + author: 'Anders Alexandersson', + published: "Masters Thesis", + location: 'Division of Computer Science at the Department of Informatics and Mathematics, University of Trollhättan/Uddevalla, Sweden', + year: 2004}, + + {url: '2004-05-JoshiMSThesis.html', + title: 'A Task Optimization Framework for MSSP', + author: 'Rahul Ulhas Joshi', + published: "Masters Thesis", + location: 'Computer Science Dept., University of Illinois at Urbana-Champaign', + month: 5, + year: 2004}, + + {url: '2004-03-ICDCS-Adaptions.html', + title: 'Coordinating Adaptations in Distributed Systems', + author: 'Brian Ensink and Vikram Adve', + published: "Proc. of the 24th International Conference on Distributed Computing Systems (ICDCS 2004)", + location: 'Tokyo, Japan', + month: 3, + year: 2004}, + + {url: '2004-01-30-CGO-LLVM.html', + title: 'LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation', + author: 'Chris Lattner and Vikram Adve', + published: "Proc. 
of the 2004 International Symposium on Code Generation and Optimization (CGO'04)", + location: 'Palo Alto, California', + month: 3, + year: 2004}, + + {url: '2003-10-01-LLVA.html', + title: 'LLVA: A Low-level Virtual Instruction Set Architecture', + author: 'Vikram Adve, Chris Lattner, Michael Brukman, Anand Shukla, and Brian Gaeke', + published: "Proc. of the 36th annual ACM/IEEE International Symposium on Microarchitecture (MICRO-36)", + location: 'San Diego, CA', + month: 12, + year: 2003}, + + {url: '2003-07-18-StanleyMSThesis.html', + title: 'Language Extensions for Performance-Oriented Programming', + author: 'Joel Stanley', + published: "Masters Thesis", + location: 'Computer Science Dept., University of Illinois at Urbana-Champaign', + month: 7, + year: 2003}, + + {url: '2003-07-18-ShuklaMSThesis.html', + title: 'Lightweight, Cross-Procedure Tracing for Runtime Optimization', + author: 'Anand Shukla', + published: "Masters Thesis", + location: 'Computer Science Dept., University of Illinois at Urbana-Champaign', + month: 7, + year: 2003}, + + {url: '2003-05-05-LCTES03-CodeSafety.html', + title: 'Memory Safety Without Runtime Checks or Garbage Collection', + author: 'Dinakar Dhurjati, Sumant Kowshik, Vikram Adve and Chris Lattner', + published: "Proc. of Languages Compilers and Tools for Embedded Systems 2003 (LCTES 03)", + location: 'San Diego, CA', + month: 6, + year: 2003}, + + {url: '2003-05-01-GCCSummit2003.html', + title: 'Architecture For a Next-Generation GCC', + author: 'Chris Lattner & Vikram Adve', + published: "First Annual GCC Developers' Summit", + location: 'Ottawa, Canada', + month: 5, + year: 2003}, + {url: '2003-04-29-DataStructureAnalysisTR.html', title: 'Data Structure Analysis: An Efficient Context-Sensitive Heap Analysis', author: 'Chris Lattner & Vikram Adve', @@ -76,9 +546,11 @@ item.innerHTML += '"' + htmlEscape(pub.title) + '"
'; item.innerHTML += htmlEscape(pub.author) + '
'; - item.innerHTML += '' + htmlEscape(pub.published) + ''; + if (isDef(pub.published)) { + item.innerHTML += '' + pub.published + ''; + } if (isDef(pub.location)) { - item.innerHTML += ', ' + htmlEscape(pub.location); + item.innerHTML += ', ' + pub.location; } if (isDef(pub.month) || isDef(pub.year)) { var date = []; @@ -91,6 +563,9 @@ item.innerHTML += ', ' + htmlEscape(date.join(' ')); } item.innerHTML += '.'; + if (isDef(pub.award)) { + item.innerHTML += '
' + pub.award + '.'; + } list.appendChild(item); } }

From clattner at apple.com Wed Dec 24 23:32:29 2008
From: clattner at apple.com (Chris Lattner)
Date: Wed, 24 Dec 2008 21:32:29 -0800
Subject: [llvm-commits] [llvm] r61424 - in /llvm/trunk/lib/Target/X86: X86Instr64bit.td X86InstrInfo.td
In-Reply-To:
References: <200812250127.mBP1RCda007933@zion.cs.uiuc.edu> <6566EF58-EB07-4CB4-B164-EDA15444C4E9@apple.com>
Message-ID:

On Dec 24, 2008, at 9:21 PM, Eli Friedman wrote:
>>> Also, even ignoring that, performance is hugely different: on a Core
>>> 2, "bt %ebx, %eax" is one uop, but "bt %ebx, (%esp)" is 10 uops. The
>>> difference isn't quite as severe on other processors, but the reg-reg
>>> form is still significantly faster even if a load from memory is
>>> necessary.
>>
>> Are you sure you aren't thinking of btc/bts? bt doesn't modify any
>> operands.
>
> Oh, oops, s/modifies/tests/. The rest is correct.

Do you have a benchmark to show this? If it shows that it is slower in practice, I think it would make sense to have a "has slow bt from memory" subtarget flag that would be a nice predicate for the memory form of the instructions.

-Chris

From sabre at nondot.org Wed Dec 24 23:34:38 2008
From: sabre at nondot.org (Chris Lattner)
Date: Thu, 25 Dec 2008 05:34:38 -0000
Subject: [llvm-commits] [llvm] r61426 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/bt.ll
Message-ID: <200812250534.mBP5YdRQ015027@zion.cs.uiuc.edu>

Author: lattner
Date: Wed Dec 24 23:34:37 2008
New Revision: 61426

URL: http://llvm.org/viewvc/llvm-project?rev=61426&view=rev
Log:
Add a simple pattern for matching 'bt'.
Added: llvm/trunk/test/CodeGen/X86/bt.ll Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=61426&r1=61425&r2=61426&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Dec 24 23:34:37 2008 @@ -5017,16 +5017,48 @@ SDValue X86TargetLowering::LowerSETCC(SDValue Op, SelectionDAG &DAG) { assert(Op.getValueType() == MVT::i8 && "SetCC type must be 8-bit integer"); - SDValue Cond; SDValue Op0 = Op.getOperand(0); SDValue Op1 = Op.getOperand(1); - SDValue CC = Op.getOperand(2); - bool isFP = Op.getOperand(1).getValueType().isFloatingPoint(); + ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(2))->get(); + + // Lower (X & (1 << N)) == 0 to BT. + // Lower ((X >>u N) & 1) != 0 to BT. + // Lower ((X >>s N) & 1) != 0 to BT. + // FIXME: Is i386 or later or available only on some chips? + if (Op0.getOpcode() == ISD::AND && Op1.getOpcode() == ISD::Constant && + Op0.getOperand(1).getOpcode() == ISD::Constant && + (CC == ISD::SETEQ || CC == ISD::SETNE)) { + ConstantSDNode *AndRHS = cast<ConstantSDNode>(Op0.getOperand(1)); + ConstantSDNode *CmpRHS = cast<ConstantSDNode>(Op1); + SDValue AndLHS = Op0.getOperand(0); + if (CmpRHS->getZExtValue() == 0 && AndRHS->getZExtValue() == 1 && + AndLHS.getOpcode() == ISD::SRL) { + SDValue LHS = AndLHS.getOperand(0); + SDValue RHS = AndLHS.getOperand(1); + + // If LHS is i8, promote it to i16 with any_extend. There is no i8 BT + // instruction. Since the shift amount is in-range-or-undefined, we know + // that doing a bittest on the i16 value is ok. We extend to i32 because + // the encoding for the i16 version is larger than the i32 version.
+ if (LHS.getValueType() == MVT::i8) + LHS = DAG.getNode(ISD::ANY_EXTEND, MVT::i32, LHS); + + // If the operand types disagree, extend the shift amount to match. Since + // BT ignores high bits (like shifts) we can use anyextend. + if (LHS.getValueType() != RHS.getValueType()) + RHS = DAG.getNode(ISD::ANY_EXTEND, LHS.getValueType(), RHS); + + SDValue BT = DAG.getNode(X86ISD::BT, MVT::i32, LHS, RHS); + unsigned Cond = CC == ISD::SETEQ ? X86::COND_NC : X86::COND_C; + return DAG.getNode(X86ISD::SETCC, MVT::i8, + DAG.getConstant(Cond, MVT::i8), BT); + } + } - unsigned X86CC = TranslateX86CC(cast<CondCodeSDNode>(CC)->get(), isFP, - Op0, Op1, DAG); + bool isFP = Op.getOperand(1).getValueType().isFloatingPoint(); + unsigned X86CC = TranslateX86CC(CC, isFP, Op0, Op1, DAG); - Cond = DAG.getNode(X86ISD::CMP, MVT::i32, Op0, Op1); + SDValue Cond = DAG.getNode(X86ISD::CMP, MVT::i32, Op0, Op1); return DAG.getNode(X86ISD::SETCC, MVT::i8, DAG.getConstant(X86CC, MVT::i8), Cond); } @@ -5219,12 +5251,15 @@ if (Cond.getOpcode() == ISD::SETCC) Cond = LowerSETCC(Cond, DAG); +#if 0 + // FIXME: LowerXALUO doesn't handle these!! else if (Cond.getOpcode() == X86ISD::ADD || Cond.getOpcode() == X86ISD::SUB || Cond.getOpcode() == X86ISD::SMUL || Cond.getOpcode() == X86ISD::UMUL) Cond = LowerXALUO(Cond, DAG); - +#endif + // If condition flag is set by a X86ISD::CMP, then use it as the condition // setting operand in place of the X86ISD::SETCC. if (Cond.getOpcode() == X86ISD::SETCC) { @@ -5232,7 +5267,8 @@ SDValue Cmp = Cond.getOperand(1); unsigned Opc = Cmp.getOpcode(); - if (isX86LogicalCmp(Opc)) { + // FIXME: WHY THE SPECIAL CASING OF LogicalCmp?? + if (isX86LogicalCmp(Opc) || Opc == X86ISD::BT) { Cond = Cmp; addTest = false; } else { @@ -5240,8 +5276,8 @@ default: break; case X86::COND_O: case X86::COND_C: - // These can only come from an arithmetic instruction with overflow, e.g. - // SADDO, UADDO. + // These can only come from an arithmetic instruction with overflow, + // e.g. SADDO, UADDO.
Cond = Cond.getNode()->getOperand(1); addTest = false; break; Added: llvm/trunk/test/CodeGen/X86/bt.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/bt.ll?rev=61426&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/bt.ll (added) +++ llvm/trunk/test/CodeGen/X86/bt.ll Wed Dec 24 23:34:37 2008 @@ -0,0 +1,20 @@ +; RUN: llvm-as < %s | llc | grep btl +target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128" +target triple = "i386-apple-darwin8" + +define void @test2(i32 %x, i32 %n) nounwind { +entry: + %tmp29 = lshr i32 %x, %n ; [#uses=1] + %tmp3 = and i32 %tmp29, 1 ; [#uses=1] + %tmp4 = icmp eq i32 %tmp3, 0 ; [#uses=1] + br i1 %tmp4, label %bb, label %UnifiedReturnBlock + +bb: ; preds = %entry + call void @foo() + ret void + +UnifiedReturnBlock: ; preds = %entry + ret void +} + +declare void @foo() From sabre at nondot.org Wed Dec 24 23:40:39 2008 From: sabre at nondot.org (Chris Lattner) Date: Thu, 25 Dec 2008 05:40:39 -0000 Subject: [llvm-commits] [llvm] r61427 - /llvm/trunk/test/CodeGen/X86/bt.ll Message-ID: <200812250540.mBP5eeWx015250@zion.cs.uiuc.edu> Author: lattner Date: Wed Dec 24 23:40:38 2008 New Revision: 61427 URL: http://llvm.org/viewvc/llvm-project?rev=61427&view=rev Log: add PR # Modified: llvm/trunk/test/CodeGen/X86/bt.ll Modified: llvm/trunk/test/CodeGen/X86/bt.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/bt.ll?rev=61427&r1=61426&r2=61427&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/X86/bt.ll (original) +++ llvm/trunk/test/CodeGen/X86/bt.ll Wed Dec 24 23:40:38 2008 @@ -1,4 +1,5 @@ ; RUN: llvm-as < %s | llc | grep btl +; PR3253 target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128" target triple 
= "i386-apple-darwin8"

From eli.friedman at gmail.com Thu Dec 25 00:01:25 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 24 Dec 2008 22:01:25 -0800
Subject: [llvm-commits] [llvm] r61424 - in /llvm/trunk/lib/Target/X86: X86Instr64bit.td X86InstrInfo.td
In-Reply-To:
References: <200812250127.mBP1RCda007933@zion.cs.uiuc.edu> <6566EF58-EB07-4CB4-B164-EDA15444C4E9@apple.com>
Message-ID:

On Wed, Dec 24, 2008 at 9:32 PM, Chris Lattner wrote:
> On Dec 24, 2008, at 9:21 PM, Eli Friedman wrote:
>>>> Also, even ignoring that, performance is hugely different: on a Core
>>>> 2, "bt %ebx, %eax" is one uop, but "bt %ebx, (%esp)" is 10 uops. The
>>>> difference isn't quite as severe on other processors, but the reg-reg
>>>> form is still significantly faster even if a load from memory is
>>>> necessary.
>>>
>>> Are you sure you aren't thinking of btc/bts? bt doesn't modify any
>>> operands.
>>
>> Oh, oops, s/modifies/tests/. The rest is correct.
>
> Do you have a benchmark to show this? If it shows that it is slower
> in practice, I think it would make sense to have a "has slow bt from
> memory" subtarget flag that would be a nice predicate for the memory
> form of the instructions.

I'm going by timings from http://www.agner.org/optimize/. If you want a benchmark, try the following; it's a completely silly benchmark, but it shows the issue at hand.

#include <stdlib.h>
int main() {
  int testlen = 1000000000;
  int* a = malloc(testlen/8);
  unsigned i;
#if 1
  for (i = 0; i < testlen; i++) {
    asm volatile ("btl %0, (%1)" : : "r"(i), "r"(a));
  }
#else
  for (i = 0; i < testlen; i++) {
    asm volatile ("mov %0, %%eax;"
                  "shrl $5, %%eax;"
                  "mov (%1,%%eax,4), %%eax;"
                  "btl %0, %%eax"
                  : : "r"(i), "r"(a) : "eax");
  }
#endif
}

The two branches do approximately the same thing; the second version is almost twice as fast as the first on my computer (a Core Duo).
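For readers comparing the two loops, here is a minimal portable C sketch (not part of the original message) of what each iteration computes: testing bit i of a bit array stored as 32-bit words. The helper names test_bit and test_bit_mask are hypothetical and chosen only for illustration; the sketch assumes a non-negative index, whereas btl with a memory operand treats a register bit offset as signed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: test bit `i` of a bit array held in 32-bit words,
 * following the second (shift, load, reg-reg btl) loop of the benchmark. */
static int test_bit(const uint32_t *words, size_t i)
{
    assert(words != NULL);
    /* "shrl $5" plus the scaled indexed load: select the word holding bit i. */
    uint32_t word = words[i >> 5];
    /* A reg-reg btl takes the bit offset modulo 32. */
    return (int)((word >> (i & 31)) & 1u);
}

/* The same test written as a mask, in the "(X & (1 << N)) != 0" shape. */
static int test_bit_mask(const uint32_t *words, size_t i)
{
    assert(words != NULL);
    return (words[i >> 5] & ((uint32_t)1 << (i & 31))) != 0;
}
```

Both helpers return the same value for every in-range index, which is why the shift form and the mask form can, in principle, be served by the same bit-test instruction.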
-Eli

From eli.friedman at gmail.com Thu Dec 25 00:07:15 2008
From: eli.friedman at gmail.com (Eli Friedman)
Date: Wed, 24 Dec 2008 22:07:15 -0800
Subject: [llvm-commits] [llvm] r61426 - in /llvm/trunk: lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/bt.ll
In-Reply-To: <200812250534.mBP5YdRQ015027@zion.cs.uiuc.edu>
References: <200812250534.mBP5YdRQ015027@zion.cs.uiuc.edu>
Message-ID:

On Wed, Dec 24, 2008 at 9:34 PM, Chris Lattner wrote:
> + // Lower (X & (1 << N)) == 0 to BT.
> + // Lower ((X >>u N) & 1) != 0 to BT.
> + // Lower ((X >>s N) & 1) != 0 to BT.
> + // FIXME: Is i386 or later or available only on some chips?

My reference says it's 386 or later.

-Eli

From nunoplopes at sapo.pt Fri Dec 26 10:12:00 2008
From: nunoplopes at sapo.pt (Nuno Lopes)
Date: Fri, 26 Dec 2008 16:12:00 -0000
Subject: [llvm-commits] CVS: llvm-www/pubs/index.html pubs.js
In-Reply-To: <200812250528.mBP5Sho6014869@zion.cs.uiuc.edu>
References: <200812250528.mBP5Sho6014869@zion.cs.uiuc.edu>
Message-ID: <72A5AFEB5F174A80B9A1A41FA91F3107@pc07654>

Hi,

> Changes in directory llvm-www/pubs:
>
> index.html updated: 1.89 -> 1.90
> pubs.js updated: 1.2 -> 1.3
> ---
> Log message:
>
> Converted all papers from hand-coded HTML to BibTeX-like Javascript.

I don't really agree with this change. This will rule-out non-js aware browsers plus some search engines' crawlers.
If such abstraction is desired, I think it could be done at server-side with PHP (although I'm slightly biased :)

Nuno

From clattner at apple.com Fri Dec 26 12:26:55 2008
From: clattner at apple.com (Chris Lattner)
Date: Fri, 26 Dec 2008 10:26:55 -0800
Subject: [llvm-commits] CVS: llvm-www/pubs/index.html pubs.js
In-Reply-To: <72A5AFEB5F174A80B9A1A41FA91F3107@pc07654>
References: <200812250528.mBP5Sho6014869@zion.cs.uiuc.edu> <72A5AFEB5F174A80B9A1A41FA91F3107@pc07654>
Message-ID: <34A4249C-B983-4B2A-BAB3-B8C209DBD488@apple.com>

On Dec 26, 2008, at 8:12 AM, Nuno Lopes wrote:
> Hi,
>
>> Changes in directory llvm-www/pubs:
>>
>> index.html updated: 1.89 -> 1.90
>> pubs.js updated: 1.2 -> 1.3
>> ---
>> Log message:
>>
>> Converted all papers from hand-coded HTML to BibTeX-like Javascript.
>
> I don't really agree with this change. This will rule-out non-js aware
> browsers plus some search engines' crawlers.
> If such abstraction is desired, I think it could be done at server-side
> with PHP (although I'm slightly biased :)

Hi Nuno,

Maintaining the papers list is a major pain (for me, since I am the only person who seems to add papers). All the individual papers' pages are still plain HTML. Besides that, I wouldn't be surprised if major search engines had some way to handle this sort of JS.

-Chris

From brukman at cs.uiuc.edu Fri Dec 26 12:44:17 2008
From: brukman at cs.uiuc.edu (Misha Brukman)
Date: Fri, 26 Dec 2008 12:44:17 -0600
Subject: [llvm-commits] CVS: llvm-www/pubs/index.html pubs.js
Message-ID: <200812261844.mBQIiHam032642@zion.cs.uiuc.edu>

Changes in directory llvm-www/pubs:

index.html updated: 1.90 -> 1.91
pubs.js updated: 1.3 -> 1.4
---
Log message:

Split up list of publications into sections by year.
--- Diffs of the changes: (+18 -7) index.html | 4 ++-- pubs.js | 21 ++++++++++++++++----- 2 files changed, 18 insertions(+), 7 deletions(-) Index: llvm-www/pubs/index.html diff -u llvm-www/pubs/index.html:1.90 llvm-www/pubs/index.html:1.91 --- llvm-www/pubs/index.html:1.90 Wed Dec 24 23:27:51 2008 +++ llvm-www/pubs/index.html Fri Dec 26 12:43:11 2008 @@ -1,8 +1,8 @@
LLVM Related Publications
-
    -
+
+
Index: llvm-www/pubs/pubs.js diff -u llvm-www/pubs/pubs.js:1.3 llvm-www/pubs/pubs.js:1.4 --- llvm-www/pubs/pubs.js:1.3 Wed Dec 24 23:27:51 2008 +++ llvm-www/pubs/pubs.js Fri Dec 26 12:43:11 2008 @@ -4,7 +4,8 @@ {url: '2009-03-ASPLOS-Recovery.html', title: 'Recovery Domains: An Organizing Principle for Recoverable Operating Systems', author: 'Andrew Lenharth, Samuel T. King, Vikram Adve', - published: "Proceedings of the Fourteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '09)"}, + published: "Proceedings of the Fourteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '09)", + year: 2009}, {url: '2008-12-OSDI-KLEE.html', title: 'KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs', @@ -529,19 +530,29 @@ /** * Displays all publications by attaching them to the element with the passed-in - * id. + * id. The publications will be separated by year, and within each, will be in + * an ordered list (OL). * - * @param {string} id ID of the OL/UL element that will serve as the parent of - * the publications, each of which will be a list item (LI). + * @param {string} id ID of the element that will serve as the parent of + * the publications list. */ function displayAllPubs(id) { var MONTHS = [ 'Jan.', 'Feb.', 'Mar.', 'Apr.', 'May', 'June', 'July', 'Aug.', 'Sep.', 'Oct.', 'Nov.', 'Dec.' ]; - var list = document.getElementById(id); + var container = document.getElementById(id); + var list = null; + var current_year = -1; for (var i = 0; i < PUBS.length; ++i) { var pub = PUBS[i]; + if (current_year != pub.year) { + var header = document.createElement('h2'); + header.innerHTML = current_year = pub.year; + container.appendChild(header); + list = document.createElement('ol'); + container.appendChild(list); + } var item = document.createElement('li'); item.innerHTML += '"' + htmlEscape(pub.title) + '"
'; From brukman at cs.uiuc.edu Fri Dec 26 17:47:51 2008 From: brukman at cs.uiuc.edu (Misha Brukman) Date: Fri, 26 Dec 2008 17:47:51 -0600 Subject: [llvm-commits] CVS: llvm-www/attrib.incl Message-ID: <200812262347.mBQNlpTR011423@zion.cs.uiuc.edu> Changes in directory llvm-www: attrib.incl updated: 1.4 -> 1.5 --- Log message: Comply with HTML 4.01 Strict. --- Diffs of the changes: (+2 -2) attrib.incl | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Index: llvm-www/attrib.incl diff -u llvm-www/attrib.incl:1.4 llvm-www/attrib.incl:1.5 --- llvm-www/attrib.incl:1.4 Tue Sep 12 15:35:18 2006 +++ llvm-www/attrib.incl Fri Dec 26 17:46:48 2008 @@ -1,7 +1,7 @@
- +

This web site is hosted by the Computer Science Department at the University of Illinois at Urbana-Champaign. - +

From brukman at cs.uiuc.edu Fri Dec 26 17:53:34 2008 From: brukman at cs.uiuc.edu (Misha Brukman) Date: Fri, 26 Dec 2008 17:53:34 -0600 Subject: [llvm-commits] CVS: llvm-www/releases/index.html Message-ID: <200812262353.mBQNrYET012348@zion.cs.uiuc.edu> Changes in directory llvm-www/releases: index.html updated: 1.50 -> 1.51 --- Log message: * Combined "Download" and "Documentation" sections together, since they're both presenting per-release information and links * Made the link to download LLVM from SVN just like one of the "releases" * Added link to LLVM homepage --- Diffs of the changes: (+20 -36) index.html | 56 ++++++++++++++++++++------------------------------------ 1 files changed, 20 insertions(+), 36 deletions(-) Index: llvm-www/releases/index.html diff -u llvm-www/releases/index.html:1.50 llvm-www/releases/index.html:1.51 --- llvm-www/releases/index.html:1.50 Wed Dec 24 16:12:51 2008 +++ llvm-www/releases/index.html Fri Dec 26 17:52:56 2008 @@ -18,8 +18,8 @@ - + From brukman at cs.uiuc.edu Fri Dec 26 18:02:01 2008 From: brukman at cs.uiuc.edu (Misha Brukman) Date: Fri, 26 Dec 2008 18:02:01 -0600 Subject: [llvm-commits] CVS: llvm-www/www-index.html Message-ID: <200812270002.mBR021FQ012805@zion.cs.uiuc.edu> Changes in directory llvm-www: www-index.html updated: 1.170 -> 1.171 --- Log message: Comply with HTML 4.01 Strict (no user-visible changes): * No such property 'cellborder' on a table element * Replaced width=X with style="width:X" * Dropped underline () on a link * Dropped

around an unordered list --- Diffs of the changes: (+5 -8) www-index.html | 13 +++++-------- 1 files changed, 5 insertions(+), 8 deletions(-) Index: llvm-www/www-index.html diff -u llvm-www/www-index.html:1.170 llvm-www/www-index.html:1.171 --- llvm-www/www-index.html:1.170 Fri Dec 5 15:49:13 2008 +++ llvm-www/www-index.html Fri Dec 26 18:01:27 2008 @@ -1,10 +1,9 @@ - +
- - + -
+
LLVM Overview
@@ -21,7 +20,7 @@ installed), while remaining transparent to developers and maintaining compatibility with existing build scripts.

-
  • A virtual instruction set - LLVM +

  • A virtual instruction set - LLVM is a low-level object code representation that uses simple RISC-like instructions, but provides rich, language-independent, type information and dataflow (SSA) information about operands. This combination enables @@ -84,9 +83,9 @@

  • +
    Latest LLVM Release!
    @@ -106,7 +105,6 @@

    LLVM 2.5 release schedule:

    -

    • Jan 21 - Code Freeze/Branch Creation (9PM PST).
    • Jan 25 - Pre-release 1 testing begins.
    • @@ -115,7 +113,6 @@
    • Feb 9 - Pre-release 2 testing ends.
    • Feb 11 - Release.
    -


    From scottm at aero.org Fri Dec 26 22:51:40 2008 From: scottm at aero.org (Scott Michel) Date: Sat, 27 Dec 2008 04:51:40 -0000 Subject: [llvm-commits] [llvm] r61447 - in /llvm/trunk: lib/Target/CellSPU/ lib/Target/CellSPU/AsmPrinter/ test/CodeGen/CellSPU/ test/CodeGen/CellSPU/useful-harnesses/ Message-ID: <200812270451.mBR4phHY001611@zion.cs.uiuc.edu> Author: pingbak Date: Fri Dec 26 22:51:36 2008 New Revision: 61447 URL: http://llvm.org/viewvc/llvm-project?rev=61447&view=rev Log: - Remove Tilmann's custom truncate lowering: it completely hosed over DAGcombine's ability to find reasons to remove truncates when they were not needed. Consequently, the CellSPU backend would produce correct, but _really slow and horrible_, code. Replaced with instruction sequences that do the equivalent truncation in SPUInstrInfo.td. - Re-examine how unaligned loads and stores work. Generated unaligned load code has been tested on the CellSPU hardware; see the i32operations.c and i64operations.c in CodeGen/CellSPU/useful-harnesses. (While they may be toy test code, it does prove that some real world code does compile correctly.) - Fix truncating stores in bug 3193 (note: unpack_df.ll will still make llc fault because i64 ult is not yet implemented.) - Added i64 eq and neq for setcc and select/setcc; started new instruction information file for them in SPU64InstrInfo.td. Additional i64 operations should be added to this file and not to SPUInstrInfo.td. 
Added: llvm/trunk/lib/Target/CellSPU/SPU64InstrInfo.td llvm/trunk/test/CodeGen/CellSPU/icmp64.ll llvm/trunk/test/CodeGen/CellSPU/useful-harnesses/i32operations.c llvm/trunk/test/CodeGen/CellSPU/useful-harnesses/i64operations.c Modified: llvm/trunk/lib/Target/CellSPU/AsmPrinter/SPUAsmPrinter.cpp llvm/trunk/lib/Target/CellSPU/SPUISelDAGToDAG.cpp llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp llvm/trunk/lib/Target/CellSPU/SPUISelLowering.h llvm/trunk/lib/Target/CellSPU/SPUInstrFormats.td llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.cpp llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td llvm/trunk/lib/Target/CellSPU/SPUNodes.td llvm/trunk/lib/Target/CellSPU/SPUOperands.td llvm/trunk/lib/Target/CellSPU/SPURegisterInfo.cpp llvm/trunk/lib/Target/CellSPU/SPUTargetAsmInfo.cpp llvm/trunk/test/CodeGen/CellSPU/call_indirect.ll llvm/trunk/test/CodeGen/CellSPU/stores.ll llvm/trunk/test/CodeGen/CellSPU/struct_1.ll llvm/trunk/test/CodeGen/CellSPU/trunc.ll Modified: llvm/trunk/lib/Target/CellSPU/AsmPrinter/SPUAsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/AsmPrinter/SPUAsmPrinter.cpp?rev=61447&r1=61446&r2=61447&view=diff ============================================================================== --- llvm/trunk/lib/Target/CellSPU/AsmPrinter/SPUAsmPrinter.cpp (original) +++ llvm/trunk/lib/Target/CellSPU/AsmPrinter/SPUAsmPrinter.cpp Fri Dec 26 22:51:36 2008 @@ -117,7 +117,7 @@ } void - printMemRegImmS7(const MachineInstr *MI, unsigned OpNo) + printShufAddr(const MachineInstr *MI, unsigned OpNo) { char value = MI->getOperand(OpNo).getImm(); O << (int) value; @@ -183,16 +183,16 @@ } void - printMemRegImmS10(const MachineInstr *MI, unsigned OpNo) + printDFormAddr(const MachineInstr *MI, unsigned OpNo) { const MachineOperand &MO = MI->getOperand(OpNo); assert(MO.isImm() && - "printMemRegImmS10 first operand is not immedate"); + "printDFormAddr first operand is not immedate"); int64_t value = int64_t(MI->getOperand(OpNo).getImm()); int16_t value16 = 
int16_t(value); assert((value16 >= -(1 << (9+4)) && value16 <= (1 << (9+4)) - 1) && "Invalid dform s10 offset argument"); - O << value16 << "("; + O << (value16 & ~0xf) << "("; printOperand(MI, OpNo+1); O << ")"; } Added: llvm/trunk/lib/Target/CellSPU/SPU64InstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPU64InstrInfo.td?rev=61447&view=auto ============================================================================== --- llvm/trunk/lib/Target/CellSPU/SPU64InstrInfo.td (added) +++ llvm/trunk/lib/Target/CellSPU/SPU64InstrInfo.td Fri Dec 26 22:51:36 2008 @@ -0,0 +1,77 @@ +//-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ +// 64-bit comparisons: +// +// 1. The instruction sequences for vector vice scalar differ by a +// constant. +// +// 2. There are no "immediate" forms, since loading 64-bit constants +// could be a constant pool load. +// +// 3. i64 setcc results are i32, which are subsequently converted to a FSM +// mask when used in a select pattern. +// +// 4. v2i64 setcc results are v4i32, which can be converted to a FSM mask +// (TODO) +// +// M00$E Kan be Pretty N at sTi!!!!! (appologies to Monty!) +//-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ + +// selb instruction definition for i64. 
Note that the selection mask is +// a vector, produced by various forms of FSM: +def SELBr64_cond: + SELBInst<(outs R64C:$rT), (ins R64C:$rA, R64C:$rB, VECREG:$rC), + [/* no pattern */]>; + +class CodeFrag { + dag Fragment = frag; +} + +class I64SELECTNegCond: + Pat<(select (i32 (cond R64C:$rA, R64C:$rB)), R64C:$rTrue, R64C:$rFalse), + (SELBr64_cond R64C:$rTrue, R64C:$rFalse, (FSMr32 cmpare.Fragment))>; + +class I64SETCCNegCond: + Pat<(cond R64C:$rA, R64C:$rB), + (XORIr32 cmpare.Fragment, -1)>; + +// The i64 seteq fragment that does the scalar->vector conversion and +// comparison: +def CEQr64compare: + CodeFrag<(CGTIv4i32 (GBv4i32 (CEQv4i32 (ORv2i64_i64 R64C:$rA), + (ORv2i64_i64 R64C:$rB))), + 0x0000000c)>; + + +// The i64 seteq fragment that does the vector comparison +def CEQv2i64compare: + CodeFrag<(CGTIv4i32 (GBv4i32 (CEQv4i32 VECREG:$rA, VECREG:$rB)), + 0x0000000f)>; + +// i64 seteq (equality): the setcc result is i32, which is converted to a +// vector FSM mask when used in a select pattern. 
+// +// v2i64 seteq (equality): the setcc result is v4i32 +multiclass CompareEqual64 { + // Plain old comparison, converts back to i32 scalar + def r64: CodeFrag<(ORi32_v4i32 CEQr64compare.Fragment)>; + def v2i64: CodeFrag<(ORi32_v4i32 CEQv2i64compare.Fragment)>; + + // SELB mask from FSM: + def r64mask: CodeFrag<(ORi32_v4i32 (FSMv4i32 CEQr64compare.Fragment))>; + def v2i64mask: CodeFrag<(ORi32_v4i32 (FSMv4i32 CEQv2i64compare.Fragment))>; +} + +defm I64EQ: CompareEqual64; + +def : Pat<(seteq R64C:$rA, R64C:$rB), I64EQr64.Fragment>; + +def : Pat<(seteq (v2i64 VECREG:$rA), (v2i64 VECREG:$rB)), + I64EQv2i64.Fragment>; + +def I64Select: + Pat<(select R32C:$rC, R64C:$rB, R64C:$rA), + (SELBr64_cond R64C:$rA, R64C:$rB, (FSMr32 R32C:$rC))>; + +def : I64SETCCNegCond; + +def : I64SELECTNegCond; \ No newline at end of file Modified: llvm/trunk/lib/Target/CellSPU/SPUISelDAGToDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUISelDAGToDAG.cpp?rev=61447&r1=61446&r2=61447&view=diff ============================================================================== --- llvm/trunk/lib/Target/CellSPU/SPUISelDAGToDAG.cpp (original) +++ llvm/trunk/lib/Target/CellSPU/SPUISelDAGToDAG.cpp Fri Dec 26 22:51:36 2008 @@ -165,24 +165,23 @@ MVT VT; unsigned ldresult_ins; /// LDRESULT instruction (0 = undefined) bool ldresult_imm; /// LDRESULT instruction requires immediate? 
- int prefslot_byte; /// Byte offset of the "preferred" slot + unsigned lrinst; /// LR instruction }; const valtype_map_s valtype_map[] = { - { MVT::i1, 0, false, 3 }, - { MVT::i8, SPU::ORBIr8, true, 3 }, - { MVT::i16, SPU::ORHIr16, true, 2 }, - { MVT::i32, SPU::ORIr32, true, 0 }, - { MVT::i64, SPU::ORr64, false, 0 }, - { MVT::f32, SPU::ORf32, false, 0 }, - { MVT::f64, SPU::ORf64, false, 0 }, + { MVT::i8, SPU::ORBIr8, true, SPU::LRr8 }, + { MVT::i16, SPU::ORHIr16, true, SPU::LRr16 }, + { MVT::i32, SPU::ORIr32, true, SPU::LRr32 }, + { MVT::i64, SPU::ORr64, false, SPU::LRr64 }, + { MVT::f32, SPU::ORf32, false, SPU::LRf32 }, + { MVT::f64, SPU::ORf64, false, SPU::LRf64 }, // vector types... (sigh!) - { MVT::v16i8, 0, false, 0 }, - { MVT::v8i16, 0, false, 0 }, - { MVT::v4i32, 0, false, 0 }, - { MVT::v2i64, 0, false, 0 }, - { MVT::v4f32, 0, false, 0 }, - { MVT::v2f64, 0, false, 0 } + { MVT::v16i8, 0, false, SPU::LRv16i8 }, + { MVT::v8i16, 0, false, SPU::LRv8i16 }, + { MVT::v4i32, 0, false, SPU::LRv4i32 }, + { MVT::v2i64, 0, false, SPU::LRv2i64 }, + { MVT::v4f32, 0, false, SPU::LRv4f32 }, + { MVT::v2f64, 0, false, SPU::LRv2f64 } }; const size_t n_valtype_map = sizeof(valtype_map) / sizeof(valtype_map[0]); @@ -686,31 +685,32 @@ Result = CurDAG->getTargetNode(Opc, VT, MVT::Other, Arg, Arg, Chain); } - Chain = SDValue(Result, 1); - return Result; } else if (Opc == SPUISD::IndirectAddr) { - SDValue Op0 = Op.getOperand(0); - if (Op0.getOpcode() == SPUISD::LDRESULT) { - /* || Op0.getOpcode() == SPUISD::AFormAddr) */ - // (IndirectAddr (LDRESULT, imm)) - SDValue Op1 = Op.getOperand(1); - MVT VT = Op.getValueType(); - - DEBUG(cerr << "CellSPU: IndirectAddr(LDRESULT, imm):\nOp0 = "); - DEBUG(Op.getOperand(0).getNode()->dump(CurDAG)); - DEBUG(cerr << "\nOp1 = "); - DEBUG(Op.getOperand(1).getNode()->dump(CurDAG)); - DEBUG(cerr << "\n"); - + // Look at the operands: SelectCode() will catch the cases that aren't + // specifically handled here. 
+ // + // SPUInstrInfo catches the following patterns: + // (SPUindirect (SPUhi ...), (SPUlo ...)) + // (SPUindirect $sp, imm) + MVT VT = Op.getValueType(); + SDValue Op0 = N->getOperand(0); + SDValue Op1 = N->getOperand(1); + RegisterSDNode *RN; + + if ((Op0.getOpcode() != SPUISD::Hi && Op1.getOpcode() != SPUISD::Lo) + || (Op0.getOpcode() == ISD::Register + && ((RN = dyn_cast<RegisterSDNode>(Op0.getNode())) != 0 + && RN->getReg() != SPU::R1))) { + NewOpc = SPU::Ar32; if (Op1.getOpcode() == ISD::Constant) { ConstantSDNode *CN = cast<ConstantSDNode>(Op1); - Op1 = CurDAG->getTargetConstant(CN->getZExtValue(), VT); + Op1 = CurDAG->getTargetConstant(CN->getSExtValue(), VT); NewOpc = (isI32IntS10Immediate(CN) ? SPU::AIr32 : SPU::Ar32); - Ops[0] = Op0; - Ops[1] = Op1; - n_ops = 2; } + Ops[0] = Op0; + Ops[1] = Op1; + n_ops = 2; } } Modified: llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp?rev=61447&r1=61446&r2=61447&view=diff ============================================================================== --- llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp (original) +++ llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp Fri Dec 26 22:51:36 2008 @@ -15,6 +15,7 @@ #include "SPUISelLowering.h" #include "SPUTargetMachine.h" #include "SPUFrameInfo.h" +#include "llvm/ADT/APInt.h" #include "llvm/ADT/VectorExtras.h" #include "llvm/CodeGen/CallingConvLower.h" #include "llvm/CodeGen/MachineFrameInfo.h" @@ -77,37 +78,6 @@ return retval; } - - //! Predicate that returns true if operand is a memory target - /*! - \arg Op Operand to test - \return true if the operand is a memory target (i.e., global - address, external symbol, constant pool) or an A-form - address.
- */ - bool isMemoryOperand(const SDValue &Op) - { - const unsigned Opc = Op.getOpcode(); - return (Opc == ISD::GlobalAddress - || Opc == ISD::GlobalTLSAddress - || Opc == ISD::JumpTable - || Opc == ISD::ConstantPool - || Opc == ISD::ExternalSymbol - || Opc == ISD::TargetGlobalAddress - || Opc == ISD::TargetGlobalTLSAddress - || Opc == ISD::TargetJumpTable - || Opc == ISD::TargetConstantPool - || Opc == ISD::TargetExternalSymbol - || Opc == SPUISD::AFormAddr); - } - - //! Predicate that returns true if the operand is an indirect target - bool isIndirectOperand(const SDValue &Op) - { - const unsigned Opc = Op.getOpcode(); - return (Opc == ISD::Register - || Opc == SPUISD::LDRESULT); - } } SPUTargetLowering::SPUTargetLowering(SPUTargetMachine &TM) @@ -135,20 +105,8 @@ setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote); setLoadExtAction(ISD::ZEXTLOAD, MVT::i1, Promote); - setLoadExtAction(ISD::EXTLOAD, MVT::i8, Custom); - setLoadExtAction(ISD::SEXTLOAD, MVT::i8, Custom); - setLoadExtAction(ISD::ZEXTLOAD, MVT::i8, Custom); - setTruncStoreAction(MVT::i8, MVT::i8, Custom); - setTruncStoreAction(MVT::i16, MVT::i8, Custom); - setTruncStoreAction(MVT::i32, MVT::i8, Custom); - setTruncStoreAction(MVT::i64, MVT::i8, Custom); - setTruncStoreAction(MVT::i128, MVT::i8, Custom); - - setLoadExtAction(ISD::EXTLOAD, MVT::i16, Custom); - setLoadExtAction(ISD::SEXTLOAD, MVT::i16, Custom); - setLoadExtAction(ISD::ZEXTLOAD, MVT::i16, Custom); - - setLoadExtAction(ISD::EXTLOAD, MVT::f32, Custom); + setLoadExtAction(ISD::EXTLOAD, MVT::f32, Expand); + setLoadExtAction(ISD::EXTLOAD, MVT::f64, Expand); // SPU constant load actions are custom lowered: setOperationAction(ISD::Constant, MVT::i64, Custom); @@ -160,11 +118,33 @@ ++sctype) { MVT VT = (MVT::SimpleValueType)sctype; - setOperationAction(ISD::LOAD, VT, Custom); - setOperationAction(ISD::STORE, VT, Custom); + setOperationAction(ISD::LOAD, VT, Custom); + setOperationAction(ISD::STORE, VT, Custom); + setLoadExtAction(ISD::EXTLOAD, VT, 
Custom); + setLoadExtAction(ISD::ZEXTLOAD, VT, Custom); + setLoadExtAction(ISD::SEXTLOAD, VT, Custom); + + for (unsigned stype = sctype - 1; stype >= (unsigned) MVT::i8; --stype) { + MVT StoreVT = (MVT::SimpleValueType) stype; + setTruncStoreAction(VT, StoreVT, Expand); + } + } + + for (unsigned sctype = (unsigned) MVT::f32; sctype < (unsigned) MVT::f64; + ++sctype) { + MVT VT = (MVT::SimpleValueType) sctype; + + setOperationAction(ISD::LOAD, VT, Custom); + setOperationAction(ISD::STORE, VT, Custom); + + for (unsigned stype = sctype - 1; stype >= (unsigned) MVT::f32; --stype) { + MVT StoreVT = (MVT::SimpleValueType) stype; + setTruncStoreAction(VT, StoreVT, Expand); + } } - // Custom lower BRCOND for i8 to "promote" the result to i16 + // Custom lower BRCOND for i8 to "promote" the result to whatever the result + // operand happens to be: setOperationAction(ISD::BRCOND, MVT::Other, Custom); // Expand the jumptable branches @@ -176,14 +156,12 @@ setOperationAction(ISD::SELECT_CC, MVT::i8, Custom); setOperationAction(ISD::SELECT_CC, MVT::i16, Custom); setOperationAction(ISD::SELECT_CC, MVT::i32, Custom); -#if 0 setOperationAction(ISD::SELECT_CC, MVT::i64, Custom); -#endif // SPU has no intrinsics for these particular operations: setOperationAction(ISD::MEMBARRIER, MVT::Other, Expand); - // PowerPC has no SREM/UREM instructions + // SPU has no SREM/UREM instructions setOperationAction(ISD::SREM, MVT::i32, Expand); setOperationAction(ISD::UREM, MVT::i32, Expand); setOperationAction(ISD::SREM, MVT::i64, Expand); @@ -232,14 +210,6 @@ setOperationAction(ISD::MUL, MVT::i32, Custom); setOperationAction(ISD::MUL, MVT::i64, Expand); // libcall - // SMUL_LOHI, UMUL_LOHI -#if 0 - setOperationAction(ISD::SMUL_LOHI, MVT::i32, Expand); - setOperationAction(ISD::UMUL_LOHI, MVT::i32, Expand); - setOperationAction(ISD::SMUL_LOHI, MVT::i64, Expand); - setOperationAction(ISD::UMUL_LOHI, MVT::i64, Expand); -#endif - // Need to custom handle (some) common i8, i64 math ops 
setOperationAction(ISD::ADD, MVT::i64, Custom); setOperationAction(ISD::SUB, MVT::i8, Custom); @@ -265,12 +235,12 @@ setOperationAction(ISD::SELECT, MVT::i8, Legal); setOperationAction(ISD::SELECT, MVT::i16, Legal); setOperationAction(ISD::SELECT, MVT::i32, Legal); - setOperationAction(ISD::SELECT, MVT::i64, Expand); + setOperationAction(ISD::SELECT, MVT::i64, Legal); setOperationAction(ISD::SETCC, MVT::i8, Legal); setOperationAction(ISD::SETCC, MVT::i16, Legal); - setOperationAction(ISD::SETCC, MVT::i32, Legal); - setOperationAction(ISD::SETCC, MVT::i64, Expand); + setOperationAction(ISD::SETCC, MVT::i32, Custom); + setOperationAction(ISD::SETCC, MVT::i64, Custom); // Zero extension and sign extension for i64 have to be // custom legalized @@ -278,10 +248,7 @@ setOperationAction(ISD::SIGN_EXTEND, MVT::i64, Custom); setOperationAction(ISD::ANY_EXTEND, MVT::i64, Custom); - // Custom lower truncates - setOperationAction(ISD::TRUNCATE, MVT::i8, Custom); - setOperationAction(ISD::TRUNCATE, MVT::i16, Custom); - setOperationAction(ISD::TRUNCATE, MVT::i32, Custom); + // Custom lower i128 -> i64 truncates setOperationAction(ISD::TRUNCATE, MVT::i64, Custom); // SPU has a legal FP -> signed INT instruction @@ -292,7 +259,7 @@ // FDIV on SPU requires custom lowering setOperationAction(ISD::FDIV, MVT::f32, Custom); - //setOperationAction(ISD::FDIV, MVT::f64, Custom); + setOperationAction(ISD::FDIV, MVT::f64, Expand); // libcall // SPU has [U|S]INT_TO_FP setOperationAction(ISD::SINT_TO_FP, MVT::i32, Legal); @@ -402,7 +369,7 @@ setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v4f32, Custom); setShiftAmountType(MVT::i32); - setBooleanContents(ZeroOrOneBooleanContent); + setBooleanContents(ZeroOrNegativeOneBooleanContent); setStackPointerRegisterToSaveRestore(SPU::R1); @@ -435,7 +402,7 @@ node_names[(unsigned) SPUISD::SHUFB] = "SPUISD::SHUFB"; node_names[(unsigned) SPUISD::SHUFFLE_MASK] = "SPUISD::SHUFFLE_MASK"; node_names[(unsigned) SPUISD::CNTB] = "SPUISD::CNTB"; - 
node_names[(unsigned) SPUISD::PROMOTE_SCALAR] = "SPUISD::PROMOTE_SCALAR"; + node_names[(unsigned) SPUISD::PREFSLOT2VEC] = "SPUISD::PROMOTE_SCALAR"; node_names[(unsigned) SPUISD::VEC2PREFSLOT] = "SPUISD::VEC2PREFSLOT"; node_names[(unsigned) SPUISD::MPY] = "SPUISD::MPY"; node_names[(unsigned) SPUISD::MPYU] = "SPUISD::MPYU"; @@ -471,9 +438,14 @@ return ((i != node_names.end()) ? i->second : 0); } +//===----------------------------------------------------------------------===// +// Return the Cell SPU's SETCC result type +//===----------------------------------------------------------------------===// + MVT SPUTargetLowering::getSetCCResultType(const SDValue &Op) const { MVT VT = Op.getValueType(); - return (VT.isInteger() ? VT : MVT(MVT::i32)); + // i16 and i32 are valid SETCC result types + return ((VT == MVT::i8 || VT == MVT::i16 || VT == MVT::i32) ? VT : MVT::i32); } //===----------------------------------------------------------------------===// @@ -486,105 +458,6 @@ // LowerOperation implementation //===----------------------------------------------------------------------===// -/// Aligned load common code for CellSPU -/*! - \param[in] Op The SelectionDAG load or store operand - \param[in] DAG The selection DAG - \param[in] ST CellSPU subtarget information structure - \param[in,out] alignment Caller initializes this to the load or store node's - value from getAlignment(), may be updated while generating the aligned load - \param[in,out] alignOffs Aligned offset; set by AlignedLoad to the aligned - offset (divisible by 16, modulo 16 == 0) - \param[in,out] prefSlotOffs Preferred slot offset; set by AlignedLoad to the - offset of the preferred slot (modulo 16 != 0) - \param[in,out] VT Caller initializes this value type to the the load or store - node's loaded or stored value type; may be updated if an i1-extended load or - store. - \param[out] was16aligned true if the base pointer had 16-byte alignment, - otherwise false. 
Can help to determine if the chunk needs to be rotated. - - Both load and store lowering load a block of data aligned on a 16-byte - boundary. This is the common aligned load code shared between both. - */ -static SDValue -AlignedLoad(SDValue Op, SelectionDAG &DAG, const SPUSubtarget *ST, - LSBaseSDNode *LSN, - unsigned &alignment, int &alignOffs, int &prefSlotOffs, - MVT &VT, bool &was16aligned) -{ - MVT PtrVT = DAG.getTargetLoweringInfo().getPointerTy(); - const valtype_map_s *vtm = getValueTypeMapEntry(VT); - SDValue basePtr = LSN->getBasePtr(); - SDValue chain = LSN->getChain(); - - if (basePtr.getOpcode() == ISD::ADD) { - SDValue Op1 = basePtr.getNode()->getOperand(1); - - if (Op1.getOpcode() == ISD::Constant - || Op1.getOpcode() == ISD::TargetConstant) { - const ConstantSDNode *CN = cast(basePtr.getOperand(1)); - - alignOffs = (int) CN->getZExtValue(); - prefSlotOffs = (int) (alignOffs & 0xf); - - // Adjust the rotation amount to ensure that the final result ends up in - // the preferred slot: - prefSlotOffs -= vtm->prefslot_byte; - basePtr = basePtr.getOperand(0); - - // Loading from memory, can we adjust alignment? 
- if (basePtr.getOpcode() == SPUISD::AFormAddr) { - SDValue APtr = basePtr.getOperand(0); - if (APtr.getOpcode() == ISD::TargetGlobalAddress) { - GlobalAddressSDNode *GSDN = cast(APtr); - alignment = GSDN->getGlobal()->getAlignment(); - } - } - } else { - alignOffs = 0; - prefSlotOffs = -vtm->prefslot_byte; - } - } else if (basePtr.getOpcode() == ISD::FrameIndex) { - FrameIndexSDNode *FIN = cast(basePtr); - alignOffs = int(FIN->getIndex() * SPUFrameInfo::stackSlotSize()); - prefSlotOffs = (int) (alignOffs & 0xf); - prefSlotOffs -= vtm->prefslot_byte; - } else { - alignOffs = 0; - prefSlotOffs = -vtm->prefslot_byte; - } - - if (alignment == 16) { - // Realign the base pointer as a D-Form address: - if (!isMemoryOperand(basePtr) || (alignOffs & ~0xf) != 0) { - basePtr = DAG.getNode(ISD::ADD, PtrVT, - basePtr, - DAG.getConstant((alignOffs & ~0xf), PtrVT)); - } - - // Emit the vector load: - was16aligned = true; - return DAG.getLoad(MVT::v16i8, chain, basePtr, - LSN->getSrcValue(), LSN->getSrcValueOffset(), - LSN->isVolatile(), 16); - } - - // Unaligned load or we're using the "large memory" model, which means that - // we have to be very pessimistic: - if (isMemoryOperand(basePtr) || isIndirectOperand(basePtr)) { - basePtr = DAG.getNode(SPUISD::IndirectAddr, PtrVT, basePtr, - DAG.getConstant(0, PtrVT)); - } - - // Add the offset - basePtr = DAG.getNode(ISD::ADD, PtrVT, basePtr, - DAG.getConstant((alignOffs & ~0xf), PtrVT)); - was16aligned = false; - return DAG.getLoad(MVT::v16i8, chain, basePtr, - LSN->getSrcValue(), LSN->getSrcValueOffset(), - LSN->isVolatile(), 16); -} - /// Custom lower loads for CellSPU /*! 
All CellSPU loads and stores are aligned to 16-byte boundaries, so for elements @@ -605,42 +478,109 @@ LowerLOAD(SDValue Op, SelectionDAG &DAG, const SPUSubtarget *ST) { LoadSDNode *LN = cast(Op); SDValue the_chain = LN->getChain(); + MVT PtrVT = DAG.getTargetLoweringInfo().getPointerTy(); MVT InVT = LN->getMemoryVT(); MVT OutVT = Op.getValueType(); ISD::LoadExtType ExtType = LN->getExtensionType(); unsigned alignment = LN->getAlignment(); - SDValue Ops[8]; + const valtype_map_s *vtm = getValueTypeMapEntry(InVT); switch (LN->getAddressingMode()) { case ISD::UNINDEXED: { - int offset, rotamt; - bool was16aligned; - SDValue result = - AlignedLoad(Op, DAG, ST, LN,alignment, offset, rotamt, InVT, - was16aligned); + SDValue result; + SDValue basePtr = LN->getBasePtr(); + SDValue rotate; - if (result.getNode() == 0) - return result; + if (alignment == 16) { + ConstantSDNode *CN; - the_chain = result.getValue(1); - // Rotate the chunk if necessary - if (rotamt < 0) - rotamt += 16; - if (rotamt != 0 || !was16aligned) { - SDVTList vecvts = DAG.getVTList(MVT::v16i8, MVT::Other); - - Ops[0] = result; - if (was16aligned) { - Ops[1] = DAG.getConstant(rotamt, MVT::i16); + // Special cases for a known aligned load to simplify the base pointer + // and the rotation amount: + if (basePtr.getOpcode() == ISD::ADD + && (CN = dyn_cast (basePtr.getOperand(1))) != 0) { + // Known offset into basePtr + int64_t offset = CN->getSExtValue(); + int64_t rotamt = int64_t((offset & 0xf) - vtm->prefslot_byte); + + if (rotamt < 0) + rotamt += 16; + + rotate = DAG.getConstant(rotamt, MVT::i16); + + // Simplify the base pointer for this case: + basePtr = basePtr.getOperand(0); + if ((offset & ~0xf) > 0) { + basePtr = DAG.getNode(SPUISD::IndirectAddr, PtrVT, + basePtr, + DAG.getConstant((offset & ~0xf), PtrVT)); + } + } else if ((basePtr.getOpcode() == SPUISD::AFormAddr) + || (basePtr.getOpcode() == SPUISD::IndirectAddr + && basePtr.getOperand(0).getOpcode() == SPUISD::Hi + && 
basePtr.getOperand(1).getOpcode() == SPUISD::Lo)) { + // Plain aligned a-form address: rotate into preferred slot + // Same for (SPUindirect (SPUhi ...), (SPUlo ...)) + int64_t rotamt = -vtm->prefslot_byte; + if (rotamt < 0) + rotamt += 16; + rotate = DAG.getConstant(rotamt, MVT::i16); } else { - MVT PtrVT = DAG.getTargetLoweringInfo().getPointerTy(); - LoadSDNode *LN1 = cast(result); - Ops[1] = DAG.getNode(ISD::ADD, PtrVT, LN1->getBasePtr(), + // Offset the rotate amount by the basePtr and the preferred slot + // byte offset + int64_t rotamt = -vtm->prefslot_byte; + if (rotamt < 0) + rotamt += 16; + rotate = DAG.getNode(ISD::ADD, PtrVT, + basePtr, DAG.getConstant(rotamt, PtrVT)); } + } else { + // Unaligned load: must be more pessimistic about addressing modes: + if (basePtr.getOpcode() == ISD::ADD) { + MachineFunction &MF = DAG.getMachineFunction(); + MachineRegisterInfo &RegInfo = MF.getRegInfo(); + unsigned VReg = RegInfo.createVirtualRegister(&SPU::R32CRegClass); + SDValue Flag; + + SDValue Op0 = basePtr.getOperand(0); + SDValue Op1 = basePtr.getOperand(1); + + if (isa(Op1)) { + // Convert the (add , ) to an indirect address contained + // in a register. Note that this is done because we need to avoid + // creating a 0(reg) d-form address due to the SPU's block loads. + basePtr = DAG.getNode(SPUISD::IndirectAddr, PtrVT, Op0, Op1); + the_chain = DAG.getCopyToReg(the_chain, VReg, basePtr, Flag); + basePtr = DAG.getCopyFromReg(the_chain, VReg, PtrVT); + } else { + // Convert the (add , ) to an indirect address, which + // will likely be lowered as a reg(reg) x-form address. 
+ basePtr = DAG.getNode(SPUISD::IndirectAddr, PtrVT, Op0, Op1); + } + } else { + basePtr = DAG.getNode(SPUISD::IndirectAddr, PtrVT, + basePtr, + DAG.getConstant(0, PtrVT)); + } + + // Offset the rotate amount by the basePtr and the preferred slot + // byte offset + rotate = DAG.getNode(ISD::ADD, PtrVT, + basePtr, + DAG.getConstant(-vtm->prefslot_byte, PtrVT)); + } + + // Re-emit as a v16i8 vector load + result = DAG.getLoad(MVT::v16i8, the_chain, basePtr, + LN->getSrcValue(), LN->getSrcValueOffset(), + LN->isVolatile(), 16); - result = DAG.getNode(SPUISD::ROTBYTES_LEFT, MVT::v16i8, Ops, 2); - } + // Update the chain + the_chain = result.getValue(1); + + // Rotate into the preferred slot: + result = DAG.getNode(SPUISD::ROTBYTES_LEFT, MVT::v16i8, + result.getValue(0), rotate); // Convert the loaded v16i8 vector to the appropriate vector type // specified by the operand: @@ -704,23 +644,86 @@ switch (SN->getAddressingMode()) { case ISD::UNINDEXED: { - int chunk_offset, slot_offset; - bool was16aligned; - // The vector type we really want to load from the 16-byte chunk. 
MVT vecVT = MVT::getVectorVT(VT, (128 / VT.getSizeInBits())), stVecVT = MVT::getVectorVT(StVT, (128 / StVT.getSizeInBits())); - SDValue alignLoadVec = - AlignedLoad(Op, DAG, ST, SN, alignment, - chunk_offset, slot_offset, VT, was16aligned); + SDValue alignLoadVec; + SDValue basePtr = SN->getBasePtr(); + SDValue the_chain = SN->getChain(); + SDValue insertEltOffs; + + if (alignment == 16) { + ConstantSDNode *CN; + + // Special cases for a known aligned load to simplify the base pointer + // and insertion byte: + if (basePtr.getOpcode() == ISD::ADD + && (CN = dyn_cast(basePtr.getOperand(1))) != 0) { + // Known offset into basePtr + int64_t offset = CN->getSExtValue(); + + // Simplify the base pointer for this case: + basePtr = basePtr.getOperand(0); + insertEltOffs = DAG.getNode(SPUISD::IndirectAddr, PtrVT, + basePtr, + DAG.getConstant((offset & 0xf), PtrVT)); + + if ((offset & ~0xf) > 0) { + basePtr = DAG.getNode(SPUISD::IndirectAddr, PtrVT, + basePtr, + DAG.getConstant((offset & ~0xf), PtrVT)); + } + } else { + // Otherwise, assume it's at byte 0 of basePtr + insertEltOffs = DAG.getNode(SPUISD::IndirectAddr, PtrVT, + basePtr, + DAG.getConstant(0, PtrVT)); + } + } else { + // Unaligned load: must be more pessimistic about addressing modes: + if (basePtr.getOpcode() == ISD::ADD) { + MachineFunction &MF = DAG.getMachineFunction(); + MachineRegisterInfo &RegInfo = MF.getRegInfo(); + unsigned VReg = RegInfo.createVirtualRegister(&SPU::R32CRegClass); + SDValue Flag; + + SDValue Op0 = basePtr.getOperand(0); + SDValue Op1 = basePtr.getOperand(1); + + if (isa(Op1)) { + // Convert the (add , ) to an indirect address contained + // in a register. Note that this is done because we need to avoid + // creating a 0(reg) d-form address due to the SPU's block loads. 
+ basePtr = DAG.getNode(SPUISD::IndirectAddr, PtrVT, Op0, Op1); + the_chain = DAG.getCopyToReg(the_chain, VReg, basePtr, Flag); + basePtr = DAG.getCopyFromReg(the_chain, VReg, PtrVT); + } else { + // Convert the (add , ) to an indirect address, which + // will likely be lowered as a reg(reg) x-form address. + basePtr = DAG.getNode(SPUISD::IndirectAddr, PtrVT, Op0, Op1); + } + } else { + basePtr = DAG.getNode(SPUISD::IndirectAddr, PtrVT, + basePtr, + DAG.getConstant(0, PtrVT)); + } + + // Insertion point is solely determined by basePtr's contents + insertEltOffs = DAG.getNode(ISD::ADD, PtrVT, + basePtr, + DAG.getConstant(0, PtrVT)); + } + + // Re-emit as a v16i8 vector load + alignLoadVec = DAG.getLoad(MVT::v16i8, the_chain, basePtr, + SN->getSrcValue(), SN->getSrcValueOffset(), + SN->isVolatile(), 16); - if (alignLoadVec.getNode() == 0) - return alignLoadVec; + // Update the chain + the_chain = alignLoadVec.getValue(1); LoadSDNode *LN = cast(alignLoadVec); - SDValue basePtr = LN->getBasePtr(); - SDValue the_chain = alignLoadVec.getValue(1); SDValue theValue = SN->getValue(); SDValue result; @@ -732,29 +735,20 @@ theValue = theValue.getOperand(0); } - chunk_offset &= 0xf; - - SDValue insertEltOffs = DAG.getConstant(chunk_offset, PtrVT); - SDValue insertEltPtr; - // If the base pointer is already a D-form address, then just create // a new D-form address with a slot offset and the orignal base pointer. // Otherwise generate a D-form address with the slot offset relative // to the stack pointer, which is always aligned. 
- DEBUG(cerr << "CellSPU LowerSTORE: basePtr = "); - DEBUG(basePtr.getNode()->dump(&DAG)); - DEBUG(cerr << "\n"); - - if (basePtr.getOpcode() == SPUISD::IndirectAddr || - (basePtr.getOpcode() == ISD::ADD - && basePtr.getOperand(0).getOpcode() == SPUISD::IndirectAddr)) { - insertEltPtr = basePtr; - } else { - insertEltPtr = DAG.getNode(ISD::ADD, PtrVT, basePtr, insertEltOffs); - } +#if !defined(NDEBUG) + if (DebugFlag && isCurrentDebugType(DEBUG_TYPE)) { + cerr << "CellSPU LowerSTORE: basePtr = "; + basePtr.getNode()->dump(&DAG); + cerr << "\n"; + } +#endif SDValue insertEltOp = - DAG.getNode(SPUISD::SHUFFLE_MASK, vecVT, insertEltPtr); + DAG.getNode(SPUISD::SHUFFLE_MASK, vecVT, insertEltOffs); SDValue vectorizeOp = DAG.getNode(ISD::SCALAR_TO_VECTOR, vecVT, theValue); @@ -919,22 +913,31 @@ return SDValue(); } -//! Lower MVT::i8 brcond to a promoted type (MVT::i32, MVT::i16) static SDValue -LowerBRCOND(SDValue Op, SelectionDAG &DAG) -{ +LowerBRCOND(SDValue Op, SelectionDAG &DAG, const TargetLowering &TLI) { SDValue Cond = Op.getOperand(1); MVT CondVT = Cond.getValueType(); - MVT CondNVT; + unsigned CondOpc; if (CondVT == MVT::i8) { - CondNVT = MVT::i16; + SDValue CondOp0 = Cond.getOperand(0); + if (Cond.getOpcode() == ISD::TRUNCATE) { + // Use the truncate's value type and ANY_EXTEND the condition (DAGcombine + // will then remove the truncate) + CondVT = CondOp0.getValueType(); + CondOpc = ISD::ANY_EXTEND; + } else { + CondVT = MVT::i32; // default to something reasonable + CondOpc = ISD::ZERO_EXTEND; + } + + Cond = DAG.getNode(CondOpc, CondVT, Op.getOperand(1)); + return DAG.getNode(ISD::BRCOND, Op.getValueType(), - Op.getOperand(0), - DAG.getNode(ISD::ZERO_EXTEND, CondNVT, Op.getOperand(1)), - Op.getOperand(2)); - } else - return SDValue(); // Unchanged + Op.getOperand(0), Cond, Op.getOperand(2)); + } + + return SDValue(); // Unchanged } static SDValue @@ -1896,7 +1899,7 @@ case MVT::i64: case MVT::f32: case MVT::f64: - return DAG.getNode(SPUISD::PROMOTE_SCALAR, 
Op.getValueType(), Op0, Op0); + return DAG.getNode(SPUISD::PREFSLOT2VEC, Op.getValueType(), Op0, Op0); } } @@ -2274,9 +2277,11 @@ return result; } -static SDValue LowerI8Math(SDValue Op, SelectionDAG &DAG, unsigned Opc) +static SDValue LowerI8Math(SDValue Op, SelectionDAG &DAG, unsigned Opc, + const TargetLowering &TLI) { SDValue N0 = Op.getOperand(0); // Everything has at least one operand + MVT ShiftVT = TLI.getShiftAmountTy(); assert(Op.getValueType() == MVT::i8); switch (Opc) { @@ -2290,11 +2295,11 @@ SDValue N1 = Op.getOperand(1); N0 = (N0.getOpcode() != ISD::Constant ? DAG.getNode(ISD::SIGN_EXTEND, MVT::i16, N0) - : DAG.getConstant(cast(N0)->getZExtValue(), + : DAG.getConstant(cast(N0)->getSExtValue(), MVT::i16)); N1 = (N1.getOpcode() != ISD::Constant ? DAG.getNode(ISD::SIGN_EXTEND, MVT::i16, N1) - : DAG.getConstant(cast(N1)->getZExtValue(), + : DAG.getConstant(cast(N1)->getSExtValue(), MVT::i16)); return DAG.getNode(ISD::TRUNCATE, MVT::i8, DAG.getNode(Opc, MVT::i16, N0, N1)); @@ -2307,13 +2312,13 @@ ? DAG.getNode(ISD::ZERO_EXTEND, MVT::i16, N0) : DAG.getConstant(cast(N0)->getZExtValue(), MVT::i16)); - N1Opc = N1.getValueType().bitsLT(MVT::i32) + N1Opc = N1.getValueType().bitsLT(ShiftVT) ? ISD::ZERO_EXTEND : ISD::TRUNCATE; N1 = (N1.getOpcode() != ISD::Constant - ? DAG.getNode(N1Opc, MVT::i32, N1) + ? DAG.getNode(N1Opc, ShiftVT, N1) : DAG.getConstant(cast(N1)->getZExtValue(), - MVT::i32)); + TLI.getShiftAmountTy())); SDValue ExpandArg = DAG.getNode(ISD::OR, MVT::i16, N0, DAG.getNode(ISD::SHL, MVT::i16, @@ -2328,14 +2333,13 @@ N0 = (N0.getOpcode() != ISD::Constant ? DAG.getNode(ISD::ZERO_EXTEND, MVT::i16, N0) : DAG.getConstant(cast(N0)->getZExtValue(), - MVT::i16)); - N1Opc = N1.getValueType().bitsLT(MVT::i16) + MVT::i32)); + N1Opc = N1.getValueType().bitsLT(ShiftVT) ? ISD::ZERO_EXTEND : ISD::TRUNCATE; N1 = (N1.getOpcode() != ISD::Constant - ? DAG.getNode(N1Opc, MVT::i16, N1) - : DAG.getConstant(cast(N1)->getZExtValue(), - MVT::i16)); + ? 
DAG.getNode(N1Opc, ShiftVT, N1) + : DAG.getConstant(cast(N1)->getZExtValue(), ShiftVT)); return DAG.getNode(ISD::TRUNCATE, MVT::i8, DAG.getNode(Opc, MVT::i16, N0, N1)); } @@ -2344,15 +2348,15 @@ unsigned N1Opc; N0 = (N0.getOpcode() != ISD::Constant ? DAG.getNode(ISD::SIGN_EXTEND, MVT::i16, N0) - : DAG.getConstant(cast(N0)->getZExtValue(), + : DAG.getConstant(cast(N0)->getSExtValue(), MVT::i16)); - N1Opc = N1.getValueType().bitsLT(MVT::i16) + N1Opc = N1.getValueType().bitsLT(ShiftVT) ? ISD::SIGN_EXTEND : ISD::TRUNCATE; N1 = (N1.getOpcode() != ISD::Constant - ? DAG.getNode(N1Opc, MVT::i16, N1) + ? DAG.getNode(N1Opc, ShiftVT, N1) : DAG.getConstant(cast(N1)->getZExtValue(), - MVT::i16)); + ShiftVT)); return DAG.getNode(ISD::TRUNCATE, MVT::i8, DAG.getNode(Opc, MVT::i16, N0, N1)); } @@ -2366,7 +2370,7 @@ N1Opc = N1.getValueType().bitsLT(MVT::i16) ? ISD::SIGN_EXTEND : ISD::TRUNCATE; N1 = (N1.getOpcode() != ISD::Constant ? DAG.getNode(N1Opc, MVT::i16, N1) - : DAG.getConstant(cast(N1)->getZExtValue(), + : DAG.getConstant(cast(N1)->getSExtValue(), MVT::i16)); return DAG.getNode(ISD::TRUNCATE, MVT::i8, DAG.getNode(Opc, MVT::i16, N0, N1)); @@ -2397,7 +2401,7 @@ DEBUG(cerr << "CellSPU.LowerI64Math: lowering zero/sign/any extend\n"); SDValue PromoteScalar = - DAG.getNode(SPUISD::PROMOTE_SCALAR, Op0VecVT, Op0); + DAG.getNode(SPUISD::PREFSLOT2VEC, Op0VecVT, Op0); if (Opc != ISD::SIGN_EXTEND) { // Use a shuffle to zero extend the i32 to i64 directly: @@ -2438,9 +2442,9 @@ // Turn operands into vectors to satisfy type checking (shufb works on // vectors) SDValue Op0 = - DAG.getNode(SPUISD::PROMOTE_SCALAR, MVT::v2i64, Op.getOperand(0)); + DAG.getNode(SPUISD::PREFSLOT2VEC, MVT::v2i64, Op.getOperand(0)); SDValue Op1 = - DAG.getNode(SPUISD::PROMOTE_SCALAR, MVT::v2i64, Op.getOperand(1)); + DAG.getNode(SPUISD::PREFSLOT2VEC, MVT::v2i64, Op.getOperand(1)); SmallVector ShufBytes; // Create the shuffle mask for "rotating" the borrow up one register slot @@ -2467,9 +2471,9 @@ // Turn operands 
into vectors to satisfy type checking (shufb works on // vectors) SDValue Op0 = - DAG.getNode(SPUISD::PROMOTE_SCALAR, MVT::v2i64, Op.getOperand(0)); + DAG.getNode(SPUISD::PREFSLOT2VEC, MVT::v2i64, Op.getOperand(0)); SDValue Op1 = - DAG.getNode(SPUISD::PROMOTE_SCALAR, MVT::v2i64, Op.getOperand(1)); + DAG.getNode(SPUISD::PREFSLOT2VEC, MVT::v2i64, Op.getOperand(1)); SmallVector ShufBytes; // Create the shuffle mask for "rotating" the borrow up one register slot @@ -2495,7 +2499,7 @@ case ISD::SHL: { SDValue ShiftAmt = Op.getOperand(1); MVT ShiftAmtVT = ShiftAmt.getValueType(); - SDValue Op0Vec = DAG.getNode(SPUISD::PROMOTE_SCALAR, VecVT, Op0); + SDValue Op0Vec = DAG.getNode(SPUISD::PREFSLOT2VEC, VecVT, Op0); SDValue MaskLower = DAG.getNode(SPUISD::SELB, VecVT, Op0Vec, @@ -2540,7 +2544,7 @@ case ISD::SRA: { // Promote Op0 to vector SDValue Op0 = - DAG.getNode(SPUISD::PROMOTE_SCALAR, MVT::v2i64, Op.getOperand(0)); + DAG.getNode(SPUISD::PREFSLOT2VEC, MVT::v2i64, Op.getOperand(0)); SDValue ShiftAmt = Op.getOperand(1); MVT ShiftVT = ShiftAmt.getValueType(); @@ -2669,7 +2673,7 @@ SDValue N = Op.getOperand(0); SDValue Elt0 = DAG.getConstant(0, MVT::i32); - SDValue Promote = DAG.getNode(SPUISD::PROMOTE_SCALAR, vecVT, N, N); + SDValue Promote = DAG.getNode(SPUISD::PREFSLOT2VEC, vecVT, N, N); SDValue CNTB = DAG.getNode(SPUISD::CNTB, vecVT, Promote); return DAG.getNode(ISD::EXTRACT_VECTOR_ELT, MVT::i8, CNTB, Elt0); @@ -2686,7 +2690,7 @@ SDValue Mask0 = DAG.getConstant(0x0f, MVT::i16); SDValue Shift1 = DAG.getConstant(8, MVT::i32); - SDValue Promote = DAG.getNode(SPUISD::PROMOTE_SCALAR, vecVT, N, N); + SDValue Promote = DAG.getNode(SPUISD::PREFSLOT2VEC, vecVT, N, N); SDValue CNTB = DAG.getNode(SPUISD::CNTB, vecVT, Promote); // CNTB_result becomes the chain to which all of the virtual registers @@ -2720,7 +2724,7 @@ SDValue Shift1 = DAG.getConstant(16, MVT::i32); SDValue Shift2 = DAG.getConstant(8, MVT::i32); - SDValue Promote = DAG.getNode(SPUISD::PROMOTE_SCALAR, vecVT, N, N); + 
SDValue Promote = DAG.getNode(SPUISD::PREFSLOT2VEC, vecVT, N, N); SDValue CNTB = DAG.getNode(SPUISD::CNTB, vecVT, Promote); // CNTB_result becomes the chain to which all of the virtual registers @@ -2760,6 +2764,32 @@ return SDValue(); } +//! Lower ISD::SETCC +/*! + Lower i64 condition code handling. + */ + +static SDValue LowerSETCC(SDValue Op, SelectionDAG &DAG) { + MVT VT = Op.getValueType(); + SDValue lhs = Op.getOperand(0); + SDValue rhs = Op.getOperand(1); + SDValue condition = Op.getOperand(2); + + if (VT == MVT::i32 && lhs.getValueType() == MVT::i64) { + // Expand the i64 comparisons to what Cell can actually support, + // which is eq, ugt and sgt: +#if 0 + CondCodeSDNode *ccvalue = dyn_cast(condition); + + switch (ccvalue->get()) { + case + } +#endif + } + + return SDValue(); +} + //! Lower ISD::SELECT_CC /*! ISD::SELECT_CC can (generally) be implemented directly on the SPU using the @@ -2772,7 +2802,8 @@ assumption, given the simplisitc uses so far. */ -static SDValue LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) { +static SDValue LowerSELECT_CC(SDValue Op, SelectionDAG &DAG, + const TargetLowering &TLI) { MVT VT = Op.getValueType(); SDValue lhs = Op.getOperand(0); SDValue rhs = Op.getOperand(1); @@ -2780,12 +2811,20 @@ SDValue falseval = Op.getOperand(3); SDValue condition = Op.getOperand(4); + // NOTE: SELB's arguments: $rA, $rB, $mask + // + // SELB selects bits from $rA where bits in $mask are 0, bits from $rB + // where bits in $mask are 1. CCond will be inverted, having 1s where the + // condition was true and 0s where the condition was false. Hence, the + // arguments to SELB get reversed. 
+ // Note: Really should be ISD::SELECT instead of SPUISD::SELB, but LLVM's // legalizer insists on combining SETCC/SELECT into SELECT_CC, so we end up // with another "cannot select select_cc" assert: - SDValue compare = DAG.getNode(ISD::SETCC, VT, lhs, rhs, condition); - return DAG.getNode(SPUISD::SELB, VT, trueval, falseval, compare); + SDValue compare = DAG.getNode(ISD::SETCC, TLI.getSetCCResultType(Op), + lhs, rhs, condition); + return DAG.getNode(SPUISD::SELB, VT, falseval, trueval, compare); } //! Custom lower ISD::TRUNCATE @@ -2799,89 +2838,29 @@ MVT Op0VT = Op0.getValueType(); MVT Op0VecVT = MVT::getVectorVT(Op0VT, (128 / Op0VT.getSizeInBits())); - SDValue PromoteScalar = DAG.getNode(SPUISD::PROMOTE_SCALAR, Op0VecVT, Op0); + // Create shuffle mask + if (Op0VT.getSimpleVT() == MVT::i128 && simpleVT == MVT::i64) { + // least significant doubleword of quadword + unsigned maskHigh = 0x08090a0b; + unsigned maskLow = 0x0c0d0e0f; + // Use a shuffle to perform the truncation + SDValue shufMask = DAG.getNode(ISD::BUILD_VECTOR, MVT::v4i32, + DAG.getConstant(maskHigh, MVT::i32), + DAG.getConstant(maskLow, MVT::i32), + DAG.getConstant(maskHigh, MVT::i32), + DAG.getConstant(maskLow, MVT::i32)); - unsigned maskLow; - unsigned maskHigh; - // Create shuffle mask - switch (Op0VT.getSimpleVT()) { - case MVT::i128: - switch (simpleVT) { - case MVT::i64: - // least significant doubleword of quadword - maskHigh = 0x08090a0b; - maskLow = 0x0c0d0e0f; - break; - case MVT::i32: - // least significant word of quadword - maskHigh = maskLow = 0x0c0d0e0f; - break; - case MVT::i16: - // least significant halfword of quadword - maskHigh = maskLow = 0x0e0f0e0f; - break; - case MVT::i8: - // least significant byte of quadword - maskHigh = maskLow = 0x0f0f0f0f; - break; - default: - cerr << "Truncation to illegal type!"; - abort(); - } - break; - case MVT::i64: - switch (simpleVT) { - case MVT::i32: - // least significant word of doubleword - maskHigh = maskLow = 0x04050607; - break; - 
case MVT::i16: - // least significant halfword of doubleword - maskHigh = maskLow = 0x06070607; - break; - case MVT::i8: - // least significant byte of doubleword - maskHigh = maskLow = 0x07070707; - break; - default: - cerr << "Truncation to illegal type!"; - abort(); - } - break; - case MVT::i32: - case MVT::i16: - switch (simpleVT) { - case MVT::i16: - // least significant halfword of word - maskHigh = maskLow = 0x02030203; - break; - case MVT::i8: - // least significant byte of word/halfword - maskHigh = maskLow = 0x03030303; - break; - default: - cerr << "Truncation to illegal type!"; - abort(); - } - break; - default: - cerr << "Trying to lower truncation from illegal type!"; - abort(); - } + SDValue PromoteScalar = DAG.getNode(SPUISD::PREFSLOT2VEC, Op0VecVT, Op0); - // Use a shuffle to perform the truncation - SDValue shufMask = DAG.getNode(ISD::BUILD_VECTOR, MVT::v4i32, - DAG.getConstant(maskHigh, MVT::i32), - DAG.getConstant(maskLow, MVT::i32), - DAG.getConstant(maskHigh, MVT::i32), - DAG.getConstant(maskLow, MVT::i32)); + SDValue truncShuffle = DAG.getNode(SPUISD::SHUFB, Op0VecVT, + PromoteScalar, PromoteScalar, shufMask); - SDValue truncShuffle = DAG.getNode(SPUISD::SHUFB, Op0VecVT, - PromoteScalar, PromoteScalar, shufMask); + return DAG.getNode(SPUISD::VEC2PREFSLOT, VT, + DAG.getNode(ISD::BIT_CONVERT, VecVT, truncShuffle)); + } - return DAG.getNode(SPUISD::VEC2PREFSLOT, VT, - DAG.getNode(ISD::BIT_CONVERT, VecVT, truncShuffle)); + return SDValue(); // Leave the truncate unmolested } //! 
Custom (target-specific) lowering entry point @@ -2921,7 +2900,7 @@ case ISD::ConstantFP: return LowerConstantFP(Op, DAG); case ISD::BRCOND: - return LowerBRCOND(Op, DAG); + return LowerBRCOND(Op, DAG, *this); case ISD::FORMAL_ARGUMENTS: return LowerFORMAL_ARGUMENTS(Op, DAG, VarArgsFrameIndex); case ISD::CALL: @@ -2942,7 +2921,7 @@ case ISD::SHL: case ISD::SRA: { if (VT == MVT::i8) - return LowerI8Math(Op, DAG, Opc); + return LowerI8Math(Op, DAG, Opc, *this); else if (VT == MVT::i64) return LowerI64Math(Op, DAG, Opc); break; @@ -2971,7 +2950,7 @@ if (VT.isVector()) return LowerVectorMUL(Op, DAG); else if (VT == MVT::i8) - return LowerI8Math(Op, DAG, Opc); + return LowerI8Math(Op, DAG, Opc, *this); else return LowerMUL(Op, DAG, VT, Opc); @@ -2990,10 +2969,13 @@ return LowerCTPOP(Op, DAG); case ISD::SELECT_CC: - return LowerSELECT_CC(Op, DAG); + return LowerSELECT_CC(Op, DAG, *this); case ISD::TRUNCATE: return LowerTRUNCATE(Op, DAG); + + case ISD::SETCC: + return LowerSETCC(Op, DAG); } return SDValue(); @@ -3036,7 +3018,7 @@ SelectionDAG &DAG = DCI.DAG; SDValue Op0 = N->getOperand(0); // everything has at least one operand MVT NodeVT = N->getValueType(0); // The node's value type - MVT Op0VT = Op0.getValueType(); // The first operand's result + MVT Op0VT = Op0.getValueType(); // The first operand's result SDValue Result; // Initially, empty result switch (N->getOpcode()) { @@ -3044,49 +3026,53 @@ case ISD::ADD: { SDValue Op1 = N->getOperand(1); - if (isa(Op1) && Op0.getOpcode() == SPUISD::IndirectAddr) { - SDValue Op01 = Op0.getOperand(1); - if (Op01.getOpcode() == ISD::Constant - || Op01.getOpcode() == ISD::TargetConstant) { - // (add , (SPUindirect , )) -> - // (SPUindirect , ) - ConstantSDNode *CN0 = cast(Op1); - ConstantSDNode *CN1 = cast(Op01); - SDValue combinedConst = - DAG.getConstant(CN0->getZExtValue() + CN1->getZExtValue(), Op0VT); + if (Op0.getOpcode() == SPUISD::IndirectAddr + || Op1.getOpcode() == SPUISD::IndirectAddr) { + // Normalize the operands to 
reduce repeated code + SDValue IndirectArg = Op0, AddArg = Op1; + + if (Op1.getOpcode() == SPUISD::IndirectAddr) { + IndirectArg = Op1; + AddArg = Op0; + } + + if (isa(AddArg)) { + ConstantSDNode *CN0 = cast (AddArg); + SDValue IndOp1 = IndirectArg.getOperand(1); + + if (CN0->isNullValue()) { + // (add (SPUindirect , ), 0) -> + // (SPUindirect , ) #if !defined(NDEBUG) - if (DebugFlag && isCurrentDebugType(DEBUG_TYPE)) { + if (DebugFlag && isCurrentDebugType(DEBUG_TYPE)) { cerr << "\n" - << "Replace: (add " << CN0->getZExtValue() << ", " - << "(SPUindirect , " << CN1->getZExtValue() << "))\n" - << "With: (SPUindirect , " - << CN0->getZExtValue() + CN1->getZExtValue() << ")\n"; - } + << "Replace: (add (SPUindirect , ), 0)\n" + << "With: (SPUindirect , )\n"; + } #endif - return DAG.getNode(SPUISD::IndirectAddr, Op0VT, - Op0.getOperand(0), combinedConst); - } - } else if (isa(Op0) - && Op1.getOpcode() == SPUISD::IndirectAddr) { - SDValue Op11 = Op1.getOperand(1); - if (Op11.getOpcode() == ISD::Constant - || Op11.getOpcode() == ISD::TargetConstant) { - // (add (SPUindirect , ), ) -> - // (SPUindirect , ) - ConstantSDNode *CN0 = cast(Op0); - ConstantSDNode *CN1 = cast(Op11); - SDValue combinedConst = - DAG.getConstant(CN0->getZExtValue() + CN1->getZExtValue(), Op0VT); - - DEBUG(cerr << "Replace: (add " << CN0->getZExtValue() << ", " - << "(SPUindirect , " << CN1->getZExtValue() << "))\n"); - DEBUG(cerr << "With: (SPUindirect , " - << CN0->getZExtValue() + CN1->getZExtValue() << ")\n"); + return IndirectArg; + } else if (isa(IndOp1)) { + // (add (SPUindirect , ), ) -> + // (SPUindirect , ) + ConstantSDNode *CN1 = cast (IndOp1); + int64_t combinedConst = CN0->getSExtValue() + CN1->getSExtValue(); + SDValue combinedValue = DAG.getConstant(combinedConst, Op0VT); + +#if !defined(NDEBUG) + if (DebugFlag && isCurrentDebugType(DEBUG_TYPE)) { + cerr << "\n" + << "Replace: (add (SPUindirect , " << CN1->getSExtValue() + << "), " << CN0->getSExtValue() << ")\n" + << "With: 
(SPUindirect , " + << combinedConst << ")\n"; + } +#endif - return DAG.getNode(SPUISD::IndirectAddr, Op1.getValueType(), - Op1.getOperand(0), combinedConst); + return DAG.getNode(SPUISD::IndirectAddr, Op0VT, + IndirectArg, combinedValue); + } } } break; @@ -3127,6 +3113,25 @@ return Op0; } + } else if (Op0.getOpcode() == ISD::ADD) { + SDValue Op1 = N->getOperand(1); + if (ConstantSDNode *CN1 = dyn_cast(Op1)) { + // (SPUindirect (add , ), 0) -> + // (SPUindirect , ) + if (CN1->isNullValue()) { + +#if !defined(NDEBUG) + if (DebugFlag && isCurrentDebugType(DEBUG_TYPE)) { + cerr << "\n" + << "Replace: (SPUindirect (add , ), 0)\n" + << "With: (SPUindirect , )\n"; + } +#endif + + return DAG.getNode(SPUISD::IndirectAddr, Op0VT, + Op0.getOperand(0), Op0.getOperand(1)); + } + } } break; } @@ -3136,19 +3141,19 @@ case SPUISD::VEC_SRL: case SPUISD::VEC_SRA: case SPUISD::ROTQUAD_RZ_BYTES: - case SPUISD::ROTQUAD_RZ_BITS: { + case SPUISD::ROTQUAD_RZ_BITS: + case SPUISD::ROTBYTES_LEFT: { SDValue Op1 = N->getOperand(1); - if (isa(Op1)) { - // Kill degenerate vector shifts: - ConstantSDNode *CN = cast(Op1); - if (CN->getZExtValue() == 0) { + // Kill degenerate vector shifts: + if (ConstantSDNode *CN = dyn_cast(Op1)) { + if (CN->isNullValue()) { Result = Op0; } } break; } - case SPUISD::PROMOTE_SCALAR: { + case SPUISD::PREFSLOT2VEC: { switch (Op0.getOpcode()) { default: break; @@ -3263,7 +3268,7 @@ case CNTB: #endif - case SPUISD::PROMOTE_SCALAR: { + case SPUISD::PREFSLOT2VEC: { SDValue Op0 = Op.getOperand(0); MVT Op0VT = Op0.getValueType(); unsigned Op0VTBits = Op0VT.getSizeInBits(); @@ -3306,7 +3311,25 @@ #endif } } + +unsigned +SPUTargetLowering::ComputeNumSignBitsForTargetNode(SDValue Op, + unsigned Depth) const { + switch (Op.getOpcode()) { + default: + return 1; + case ISD::SETCC: { + MVT VT = Op.getValueType(); + + if (VT != MVT::i8 && VT != MVT::i16 && VT != MVT::i32) { + VT = MVT::i32; + } + return VT.getSizeInBits(); + } + } +} + // LowerAsmOperandForConstraint void 
 SPUTargetLowering::LowerAsmOperandForConstraint(SDValue Op,

Modified: llvm/trunk/lib/Target/CellSPU/SPUISelLowering.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUISelLowering.h?rev=61447&r1=61446&r2=61447&view=diff
==============================================================================
--- llvm/trunk/lib/Target/CellSPU/SPUISelLowering.h (original)
+++ llvm/trunk/lib/Target/CellSPU/SPUISelLowering.h Fri Dec 26 22:51:36 2008
@@ -39,7 +39,7 @@
       SHUFB,                    ///< Vector shuffle (permute)
       SHUFFLE_MASK,             ///< Shuffle mask
       CNTB,                     ///< Count leading ones in bytes
-      PROMOTE_SCALAR,           ///< Promote scalar->vector
+      PREFSLOT2VEC,             ///< Promote scalar->vector
       VEC2PREFSLOT,             ///< Extract element 0
       MPY,                      ///< 16-bit Multiply (low parts of a 32-bit)
       MPYU,                     ///< Multiply Unsigned
@@ -58,6 +58,7 @@
       ROTBYTES_LEFT_BITS,       ///< Rotate bytes left by bit shift count
       SELECT_MASK,              ///< Select Mask (FSM, FSMB, FSMH, FSMBI)
       SELB,                     ///< Select bits -> (b & mask) | (a & ~mask)
+      GATHER_BITS,              ///< Gather bits from bytes/words/halfwords
       ADD_EXTENDED,             ///< Add extended, with carry
       CARRY_GENERATE,           ///< Carry generate for ADD_EXTENDED
       SUB_EXTENDED,             ///< Subtract extended, with borrow
@@ -120,6 +121,9 @@
                                         const SelectionDAG &DAG,
                                         unsigned Depth = 0) const;
+    virtual unsigned ComputeNumSignBitsForTargetNode(SDValue Op,
+                                                   unsigned Depth = 0) const;
+
     ConstraintType getConstraintType(const std::string &ConstraintLetter) const;
     std::pair

Modified: llvm/trunk/lib/Target/CellSPU/SPUInstrFormats.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUInstrFormats.td?rev=61447&r1=61446&r2=61447&view=diff
==============================================================================
--- llvm/trunk/lib/Target/CellSPU/SPUInstrFormats.td (original)
+++ llvm/trunk/lib/Target/CellSPU/SPUInstrFormats.td Fri Dec 26 22:51:36 2008
@@ -120,9 +120,8 @@
 }
 let RA = 0 in {
-  class BICondForm opcode, string asmstr, list pattern>
-    : RRForm
+  class BICondForm opcode, dag OOL, dag IOL, string asmstr, list pattern>
+    : RRForm
   { }
   let RT = 0 in {

Modified: llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.cpp?rev=61447&r1=61446&r2=61447&view=diff
==============================================================================
--- llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.cpp (original)
+++ llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.cpp Fri Dec 26 22:51:36 2008
@@ -34,10 +34,14 @@
   inline bool isCondBranch(const MachineInstr *I) {
     unsigned opc = I->getOpcode();
-    return (opc == SPU::BRNZ
-            || opc == SPU::BRZ
-            || opc == SPU::BRHNZ
-            || opc == SPU::BRHZ);
+    return (opc == SPU::BRNZr32
+            || opc == SPU::BRNZv4i32
+            || opc == SPU::BRZr32
+            || opc == SPU::BRZv4i32
+            || opc == SPU::BRHNZr16
+            || opc == SPU::BRHNZv8i16
+            || opc == SPU::BRHZr16
+            || opc == SPU::BRHZv8i16);
   }
 }
@@ -103,6 +107,19 @@
       return true;
     }
     break;
+  case SPU::LRr8:
+  case SPU::LRr16:
+  case SPU::LRr32:
+  case SPU::LRf32:
+  case SPU::LRr64:
+  case SPU::LRf64:
+  case SPU::LRr128:
+  case SPU::LRv16i8:
+  case SPU::LRv8i16:
+  case SPU::LRv4i32:
+  case SPU::LRv4f32:
+  case SPU::LRv2i64:
+  case SPU::LRv2f64:
   case SPU::ORv16i8_i8:
   case SPU::ORv8i16_i16:
   case SPU::ORv4i32_i32:
@@ -114,7 +131,18 @@
   case SPU::ORi32_v4i32:
   case SPU::ORi64_v2i64:
   case SPU::ORf32_v4f32:
-  case SPU::ORf64_v2f64:
+  case SPU::ORf64_v2f64: {
+    assert(MI.getNumOperands() == 2 &&
+           MI.getOperand(0).isReg() &&
+           MI.getOperand(1).isReg() &&
+           "invalid SPU OR_ instruction!");
+    if (MI.getOperand(0).getReg() == MI.getOperand(1).getReg()) {
+      sourceReg = MI.getOperand(0).getReg();
+      destReg = MI.getOperand(0).getReg();
+      return true;
+    }
+    break;
+  }
   case SPU::ORv16i8:
   case SPU::ORv8i16:
   case SPU::ORv4i32:
@@ -198,18 +226,14 @@
   case SPU::STQDr8: {
     const MachineOperand MOp1 = MI->getOperand(1);
     const MachineOperand MOp2 = MI->getOperand(2);
-    if (MOp1.isImm()
-        && (MOp2.isFI()
-            || (MOp2.isReg() && MOp2.getReg() == SPU::R1))) {
-      if (MOp2.isFI())
-        FrameIndex = MOp2.getIndex();
-      else
-        FrameIndex = MOp1.getImm() / SPUFrameInfo::stackSlotSize();
+    if (MOp1.isImm() && MOp2.isFI()) {
+      FrameIndex = MOp2.getIndex();
       return MI->getOperand(0).getReg();
     }
     break;
   }
-  case SPU::STQXv16i8:
+#if 0
+  case SPU::STQXv16i8:
   case SPU::STQXv8i16:
   case SPU::STQXv4i32:
   case SPU::STQXv4f32:
@@ -226,6 +250,7 @@
       return MI->getOperand(0).getReg();
     }
     break;
+#endif
   }
   return 0;
 }
@@ -292,6 +317,8 @@
     opc = (isValidFrameIdx ? SPU::STQDr16 : SPU::STQXr16);
   } else if (RC == SPU::R8CRegisterClass) {
     opc = (isValidFrameIdx ? SPU::STQDr8 : SPU::STQXr8);
+  } else if (RC == SPU::VECREGRegisterClass) {
+    opc = (isValidFrameIdx) ? SPU::STQDv16i8 : SPU::STQXv16i8;
   } else {
     assert(0 && "Unknown regclass!");
     abort();
@@ -366,6 +393,8 @@
     opc = (isValidFrameIdx ? SPU::LQDr16 : SPU::LQXr16);
   } else if (RC == SPU::R8CRegisterClass) {
     opc = (isValidFrameIdx ? SPU::LQDr8 : SPU::LQXr8);
+  } else if (RC == SPU::VECREGRegisterClass) {
+    opc = (isValidFrameIdx) ? SPU::LQDv16i8 : SPU::LQXv16i8;
   } else {
     assert(0 && "Unknown regclass in loadRegFromStackSlot!");
     abort();

Modified: llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td?rev=61447&r1=61446&r2=61447&view=diff
==============================================================================
--- llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td (original)
+++ llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td Fri Dec 26 22:51:36 2008
@@ -1,10 +1,10 @@
 //==- SPUInstrInfo.td - Describe the Cell SPU Instructions -*- tablegen -*-==//
-//
+//
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
-// +// //===----------------------------------------------------------------------===// // Cell SPU Instructions: //===----------------------------------------------------------------------===// @@ -49,14 +49,14 @@ let canFoldAsLoad = 1 in { class LoadDFormVec - : RI10Form<0b00101100, (outs VECREG:$rT), (ins memri10:$src), + : RI10Form<0b00101100, (outs VECREG:$rT), (ins dformaddr:$src), "lqd\t$rT, $src", LoadStore, [(set (vectype VECREG:$rT), (load dform_addr:$src))]> { } class LoadDForm - : RI10Form<0b00101100, (outs rclass:$rT), (ins memri10:$src), + : RI10Form<0b00101100, (outs rclass:$rT), (ins dformaddr:$src), "lqd\t$rT, $src", LoadStore, [(set rclass:$rT, (load dform_addr:$src))]> @@ -161,14 +161,14 @@ // Stores: //===----------------------------------------------------------------------===// class StoreDFormVec - : RI10Form<0b00100100, (outs), (ins VECREG:$rT, memri10:$src), + : RI10Form<0b00100100, (outs), (ins VECREG:$rT, dformaddr:$src), "stqd\t$rT, $src", LoadStore, [(store (vectype VECREG:$rT), dform_addr:$src)]> { } class StoreDForm - : RI10Form<0b00100100, (outs), (ins rclass:$rT, memri10:$src), + : RI10Form<0b00100100, (outs), (ins rclass:$rT, dformaddr:$src), "stqd\t$rT, $src", LoadStore, [(store rclass:$rT, dform_addr:$src)]> @@ -269,7 +269,7 @@ // Generate Controls for Insertion: //===----------------------------------------------------------------------===// -def CBD: RI7Form<0b10101111100, (outs VECREG:$rT), (ins memri7:$src), +def CBD: RI7Form<0b10101111100, (outs VECREG:$rT), (ins shufaddr:$src), "cbd\t$rT, $src", ShuffleOp, [(set (v16i8 VECREG:$rT), (SPUshufmask dform2_addr:$src))]>; @@ -277,7 +277,7 @@ "cbx\t$rT, $src", ShuffleOp, [(set (v16i8 VECREG:$rT), (SPUshufmask xform_addr:$src))]>; -def CHD: RI7Form<0b10101111100, (outs VECREG:$rT), (ins memri7:$src), +def CHD: RI7Form<0b10101111100, (outs VECREG:$rT), (ins shufaddr:$src), "chd\t$rT, $src", ShuffleOp, [(set (v8i16 VECREG:$rT), (SPUshufmask dform2_addr:$src))]>; @@ -285,7 +285,7 @@ 
"chx\t$rT, $src", ShuffleOp, [(set (v8i16 VECREG:$rT), (SPUshufmask xform_addr:$src))]>; -def CWD: RI7Form<0b01101111100, (outs VECREG:$rT), (ins memri7:$src), +def CWD: RI7Form<0b01101111100, (outs VECREG:$rT), (ins shufaddr:$src), "cwd\t$rT, $src", ShuffleOp, [(set (v4i32 VECREG:$rT), (SPUshufmask dform2_addr:$src))]>; @@ -293,7 +293,7 @@ "cwx\t$rT, $src", ShuffleOp, [(set (v4i32 VECREG:$rT), (SPUshufmask xform_addr:$src))]>; -def CWDf32: RI7Form<0b01101111100, (outs VECREG:$rT), (ins memri7:$src), +def CWDf32: RI7Form<0b01101111100, (outs VECREG:$rT), (ins shufaddr:$src), "cwd\t$rT, $src", ShuffleOp, [(set (v4f32 VECREG:$rT), (SPUshufmask dform2_addr:$src))]>; @@ -301,7 +301,7 @@ "cwx\t$rT, $src", ShuffleOp, [(set (v4f32 VECREG:$rT), (SPUshufmask xform_addr:$src))]>; -def CDD: RI7Form<0b11101111100, (outs VECREG:$rT), (ins memri7:$src), +def CDD: RI7Form<0b11101111100, (outs VECREG:$rT), (ins shufaddr:$src), "cdd\t$rT, $src", ShuffleOp, [(set (v2i64 VECREG:$rT), (SPUshufmask dform2_addr:$src))]>; @@ -309,7 +309,7 @@ "cdx\t$rT, $src", ShuffleOp, [(set (v2i64 VECREG:$rT), (SPUshufmask xform_addr:$src))]>; -def CDDf64: RI7Form<0b11101111100, (outs VECREG:$rT), (ins memri7:$src), +def CDDf64: RI7Form<0b11101111100, (outs VECREG:$rT), (ins shufaddr:$src), "cdd\t$rT, $src", ShuffleOp, [(set (v2f64 VECREG:$rT), (SPUshufmask dform2_addr:$src))]>; @@ -421,6 +421,7 @@ def f32: ILARegInst; def f64: ILARegInst; + def hi: ILARegInst; def lo: ILARegInst; def lsa: ILAInst<(outs R32C:$rT), (ins symbolLSA:$val), @@ -481,37 +482,77 @@ defm FSMBI : FormSelectMaskBytesImm; // fsmb: Form select mask for bytes. N.B. 
Input operand, $rA, is 16-bits -def FSMB: - RRForm_1<0b01101101100, (outs VECREG:$rT), (ins R16C:$rA), - "fsmb\t$rT, $rA", SelectOp, - [(set (v16i8 VECREG:$rT), (SPUselmask R16C:$rA))]>; +class FSMBInst pattern>: + RRForm_1<0b01101101100, OOL, IOL, "fsmb\t$rT, $rA", SelectOp, + pattern>; + +class FSMBRegInst: + FSMBInst<(outs VECREG:$rT), (ins rclass:$rA), + [(set (vectype VECREG:$rT), (SPUselmask rclass:$rA))]>; + +class FSMBVecInst: + FSMBInst<(outs VECREG:$rT), (ins VECREG:$rA), + [(set (vectype VECREG:$rT), + (SPUselmask (vectype VECREG:$rA)))]>; + +multiclass FormSelectMaskBits { + def v16i8_r16: FSMBRegInst; + def v16i8: FSMBVecInst; +} + +defm FSMB: FormSelectMaskBits; // fsmh: Form select mask for halfwords. N.B., Input operand, $rA, is // only 8-bits wide (even though it's input as 16-bits here) -def FSMH: - RRForm_1<0b10101101100, (outs VECREG:$rT), (ins R16C:$rA), - "fsmh\t$rT, $rA", SelectOp, - [(set (v8i16 VECREG:$rT), (SPUselmask R16C:$rA))]>; + +class FSMHInst pattern>: + RRForm_1<0b10101101100, OOL, IOL, "fsmh\t$rT, $rA", SelectOp, + pattern>; + +class FSMHRegInst: + FSMHInst<(outs VECREG:$rT), (ins rclass:$rA), + [(set (vectype VECREG:$rT), (SPUselmask rclass:$rA))]>; + +class FSMHVecInst: + FSMHInst<(outs VECREG:$rT), (ins VECREG:$rA), + [(set (vectype VECREG:$rT), + (SPUselmask (vectype VECREG:$rA)))]>; + +multiclass FormSelectMaskHalfword { + def v8i16_r16: FSMHRegInst; + def v8i16: FSMHVecInst; +} + +defm FSMH: FormSelectMaskHalfword; // fsm: Form select mask for words. Like the other fsm* instructions, // only the lower 4 bits of $rA are significant. 
-class FSMInst: - RRForm_1<0b00101101100, (outs VECREG:$rT), (ins rclass:$rA), - "fsm\t$rT, $rA", - SelectOp, - [(set (vectype VECREG:$rT), (SPUselmask rclass:$rA))]>; + +class FSMInst pattern>: + RRForm_1<0b00101101100, OOL, IOL, "fsm\t$rT, $rA", SelectOp, + pattern>; + +class FSMRegInst: + FSMInst<(outs VECREG:$rT), (ins rclass:$rA), + [(set (vectype VECREG:$rT), (SPUselmask rclass:$rA))]>; + +class FSMVecInst: + FSMInst<(outs VECREG:$rT), (ins VECREG:$rA), + [(set (vectype VECREG:$rT), (SPUselmask (vectype VECREG:$rA)))]>; multiclass FormSelectMaskWord { - def r32 : FSMInst; - def r16 : FSMInst; + def v4i32: FSMVecInst; + + def r32 : FSMRegInst; + def r16 : FSMRegInst; } defm FSM : FormSelectMaskWord; // Special case when used for i64 math operations multiclass FormSelectMaskWord64 { - def r32 : FSMInst; - def r16 : FSMInst; + def r32 : FSMRegInst; + def r16 : FSMRegInst; } defm FSM64 : FormSelectMaskWord64; @@ -736,7 +777,7 @@ // BGX: Borrow generate, extended. def BGXvec: RRForm<0b11000010110, (outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB, - VECREG:$rCarry), + VECREG:$rCarry), "bgx\t$rT, $rA, $rB", IntegerOp, []>, RegConstraint<"$rCarry = $rT">, @@ -898,20 +939,31 @@ []>; // clz: Count leading zeroes -def CLZv4i32: - RRForm_1<0b10100101010, (outs VECREG:$rT), (ins VECREG:$rA), - "clz\t$rT, $rA", IntegerOp, - [/* intrinsic */]>; - -def CLZr32: - RRForm_1<0b10100101010, (outs R32C:$rT), (ins R32C:$rA), - "clz\t$rT, $rA", IntegerOp, - [(set R32C:$rT, (ctlz R32C:$rA))]>; +class CLZInst pattern>: + RRForm_1<0b10100101010, OOL, IOL, "clz\t$rT, $rA", + IntegerOp, pattern>; + +class CLZRegInst: + CLZInst<(outs rclass:$rT), (ins rclass:$rA), + [(set rclass:$rT, (ctlz rclass:$rA))]>; + +class CLZVecInst: + CLZInst<(outs VECREG:$rT), (ins VECREG:$rA), + [(set (vectype VECREG:$rT), (ctlz (vectype VECREG:$rA)))]>; + +multiclass CountLeadingZeroes { + def v4i32 : CLZVecInst; + def r32 : CLZRegInst; +} + +defm CLZ : CountLeadingZeroes; // cntb: Count ones in bytes (aka 
"population count") +// // NOTE: This instruction is really a vector instruction, but the custom // lowering code uses it in unorthodox ways to support CTPOP for other // data types! + def CNTBv16i8: RRForm_1<0b00101101010, (outs VECREG:$rT), (ins VECREG:$rA), "cntb\t$rT, $rA", IntegerOp, @@ -927,26 +979,88 @@ "cntb\t$rT, $rA", IntegerOp, [(set (v4i32 VECREG:$rT), (SPUcntb (v4i32 VECREG:$rA)))]>; -// gbb: Gather all low order bits from each byte in $rA into a single 16-bit -// quantity stored into $rT -def GBB: - RRForm_1<0b01001101100, (outs R16C:$rT), (ins VECREG:$rA), - "gbb\t$rT, $rA", GatherOp, - []>; +// gbb: Gather the low order bits from each byte in $rA into a single 16-bit +// quantity stored into $rT's slot 0, upper 16 bits are zeroed, as are +// slots 1-3. +// +// Note: This instruction "pairs" with the fsmb instruction for all of the +// various types defined here. +// +// Note 2: The "VecInst" and "RegInst" forms refer to the result being either +// a vector or register. + +class GBBInst pattern>: + RRForm_1<0b01001101100, OOL, IOL, "gbb\t$rT, $rA", GatherOp, pattern>; + +class GBBRegInst: + GBBInst<(outs rclass:$rT), (ins VECREG:$rA), + [(set rclass:$rT, (SPUgatherbits (vectype VECREG:$rA)))]>; + +class GBBVecInst: + GBBInst<(outs VECREG:$rT), (ins VECREG:$rA), + [(set (vectype VECREG:$rT), (SPUgatherbits (vectype VECREG:$rA)))]>; + +multiclass GatherBitsFromBytes { + def v16i8_r32: GBBRegInst; + def v16i8_r16: GBBRegInst; + def v16i8: GBBVecInst; +} + +defm GBB: GatherBitsFromBytes; // gbh: Gather all low order bits from each halfword in $rA into a single -// 8-bit quantity stored in $rT -def GBH: - RRForm_1<0b10001101100, (outs R16C:$rT), (ins VECREG:$rA), - "gbh\t$rT, $rA", GatherOp, - []>; +// 8-bit quantity stored in $rT's slot 0, with the upper bits of $rT set to 0 +// and slots 1-3 also set to 0. +// +// See notes for GBBInst, above. 
+ +class GBHInst pattern>: + RRForm_1<0b10001101100, OOL, IOL, "gbh\t$rT, $rA", GatherOp, + pattern>; + +class GBHRegInst: + GBHInst<(outs rclass:$rT), (ins VECREG:$rA), + [(set rclass:$rT, (SPUgatherbits (vectype VECREG:$rA)))]>; + +class GBHVecInst: + GBHInst<(outs VECREG:$rT), (ins VECREG:$rA), + [(set (vectype VECREG:$rT), + (SPUgatherbits (vectype VECREG:$rA)))]>; + +multiclass GatherBitsHalfword { + def v8i16_r32: GBHRegInst; + def v8i16_r16: GBHRegInst; + def v8i16: GBHVecInst; +} + +defm GBH: GatherBitsHalfword; // gb: Gather all low order bits from each word in $rA into a single -// 4-bit quantity stored in $rT -def GB: - RRForm_1<0b00001101100, (outs R16C:$rT), (ins VECREG:$rA), - "gb\t$rT, $rA", GatherOp, - []>; +// 4-bit quantity stored in $rT's slot 0, upper bits in $rT set to 0, +// as well as slots 1-3. +// +// See notes for gbb, above. + +class GBInst pattern>: + RRForm_1<0b00001101100, OOL, IOL, "gb\t$rT, $rA", GatherOp, + pattern>; + +class GBRegInst: + GBInst<(outs rclass:$rT), (ins VECREG:$rA), + [(set rclass:$rT, (SPUgatherbits (vectype VECREG:$rA)))]>; + +class GBVecInst: + GBInst<(outs VECREG:$rT), (ins VECREG:$rA), + [(set (vectype VECREG:$rT), + (SPUgatherbits (vectype VECREG:$rA)))]>; + +multiclass GatherBitsWord { + def v4i32_r32: GBRegInst; + def v4i32_r16: GBRegInst; + def v4i32: GBVecInst; +} + +defm GB: GatherBitsWord; // avgb: average bytes def AVGB: @@ -976,30 +1090,26 @@ XSBHInst<(outs VECREG:$rDst), (ins VECREG:$rSrc), [(set (v8i16 VECREG:$rDst), (sext (vectype VECREG:$rSrc)))]>; -class XSBHRegInst: +class XSBHInRegInst: XSBHInst<(outs rclass:$rDst), (ins rclass:$rSrc), [(set rclass:$rDst, (sext_inreg rclass:$rSrc, i8))]>; multiclass ExtendByteHalfword { def v16i8: XSBHVecInst; - def r16: XSBHRegInst; + def r16: XSBHInRegInst; + def r8: XSBHInst<(outs R16C:$rDst), (ins R8C:$rSrc), + [(set R16C:$rDst, (sext R8C:$rSrc))]>; // 32-bit form for XSBH: used to sign extend 8-bit quantities to 16-bit // quantities to 32-bit quantities via 
a 32-bit register (see the sext 8->32 // pattern below). Intentionally doesn't match a pattern because we want the // sext 8->32 pattern to do the work for us, namely because we need the extra // XSHWr32. - def r32: XSBHRegInst; + def r32: XSBHInRegInst; } defm XSBH : ExtendByteHalfword; -// Sign-extend, but take an 8-bit register to a 16-bit register (not done as -// sext_inreg) -def XSBHr8: - XSBHInst<(outs R16C:$rDst), (ins R8C:$rSrc), - [(set R16C:$rDst, (sext R8C:$rSrc))]>; - // Sign extend halfwords to words: def XSHWvec: RRForm_1<0b01101101010, (outs VECREG:$rDest), (ins VECREG:$rSrc), @@ -1208,13 +1318,44 @@ ORInst<(outs rclass:$rT), (ins rclass:$rA, rclass:$rB), [(set rclass:$rT, (or rclass:$rA, rclass:$rB))]>; +// ORCvtForm: OR conversion form +// +// This is used to "convert" the preferred slot to its vector equivalent, as +// well as convert a vector back to its preferred slot. +// +// These are effectively no-ops, but need to exist for proper type conversion +// and type coercion. 
+ +class ORCvtForm + : SPUInstr { + bits<7> RA; + bits<7> RT; + + let Pattern = [/* no pattern */]; + + let Inst{0-10} = 0b10000010000; + let Inst{11-17} = RA; + let Inst{18-24} = RA; + let Inst{25-31} = RT; +} + class ORPromoteScalar: - ORInst<(outs VECREG:$rT), (ins rclass:$rA, rclass:$rB), - [/* no pattern */]>; + ORCvtForm<(outs VECREG:$rT), (ins rclass:$rA)>; class ORExtractElt: - ORInst<(outs rclass:$rT), (ins VECREG:$rA, VECREG:$rB), - [/* no pattern */]>; + ORCvtForm<(outs rclass:$rT), (ins VECREG:$rA)>; + +class ORCvtRegGPRC: + ORCvtForm<(outs GPRC:$rT), (ins rclass:$rA)>; + +class ORCvtVecGPRC: + ORCvtForm<(outs GPRC:$rT), (ins VECREG:$rA)>; + +class ORCvtGPRCReg: + ORCvtForm<(outs rclass:$rT), (ins GPRC:$rA)>; + +class ORCvtGPRCVec: + ORCvtForm<(outs VECREG:$rT), (ins GPRC:$rA)>; multiclass BitwiseOr { @@ -1229,7 +1370,7 @@ (v4i32 VECREG:$rB)))))]>; def v2f64: ORInst<(outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB), - [(set (v2f64 VECREG:$rT), + [(set (v2f64 VECREG:$rT), (v2f64 (bitconvert (or (v2i64 VECREG:$rA), (v2i64 VECREG:$rB)))))]>; @@ -1260,48 +1401,115 @@ def i64_v2i64: ORExtractElt; def f32_v4f32: ORExtractElt; def f64_v2f64: ORExtractElt; + + // Conversion from GPRC to register + def i128_r64: ORCvtRegGPRC; + def i128_f64: ORCvtRegGPRC; + def i128_r32: ORCvtRegGPRC; + def i128_f32: ORCvtRegGPRC; + def i128_r16: ORCvtRegGPRC; + def i128_r8: ORCvtRegGPRC; + + // Conversion from GPRC to vector + def i128_vec: ORCvtVecGPRC; + + // Conversion from register to GPRC + def r64_i128: ORCvtGPRCReg; + def f64_i128: ORCvtGPRCReg; + def r32_i128: ORCvtGPRCReg; + def f32_i128: ORCvtGPRCReg; + def r16_i128: ORCvtGPRCReg; + def r8_i128: ORCvtGPRCReg; + + // Conversion from vector to GPRC + def vec_i128: ORCvtGPRCVec; } defm OR : BitwiseOr; -// scalar->vector promotion patterns: -def : Pat<(v16i8 (SPUpromote_scalar R8C:$rA)), - (ORv16i8_i8 R8C:$rA, R8C:$rA)>; +// scalar->vector promotion patterns (preferred slot to vector): +def : Pat<(v16i8 (SPUprefslot2vec 
R8C:$rA)), + (ORv16i8_i8 R8C:$rA)>; -def : Pat<(v8i16 (SPUpromote_scalar R16C:$rA)), - (ORv8i16_i16 R16C:$rA, R16C:$rA)>; +def : Pat<(v8i16 (SPUprefslot2vec R16C:$rA)), + (ORv8i16_i16 R16C:$rA)>; -def : Pat<(v4i32 (SPUpromote_scalar R32C:$rA)), - (ORv4i32_i32 R32C:$rA, R32C:$rA)>; +def : Pat<(v4i32 (SPUprefslot2vec R32C:$rA)), + (ORv4i32_i32 R32C:$rA)>; -def : Pat<(v2i64 (SPUpromote_scalar R64C:$rA)), - (ORv2i64_i64 R64C:$rA, R64C:$rA)>; +def : Pat<(v2i64 (SPUprefslot2vec R64C:$rA)), + (ORv2i64_i64 R64C:$rA)>; -def : Pat<(v4f32 (SPUpromote_scalar R32FP:$rA)), - (ORv4f32_f32 R32FP:$rA, R32FP:$rA)>; +def : Pat<(v4f32 (SPUprefslot2vec R32FP:$rA)), + (ORv4f32_f32 R32FP:$rA)>; -def : Pat<(v2f64 (SPUpromote_scalar R64FP:$rA)), - (ORv2f64_f64 R64FP:$rA, R64FP:$rA)>; +def : Pat<(v2f64 (SPUprefslot2vec R64FP:$rA)), + (ORv2f64_f64 R64FP:$rA)>; -// ORi*_v*: Used to extract vector element 0 (the preferred slot) +// ORi*_v*: Used to extract vector element 0 (the preferred slot), otherwise +// known as converting the vector back to its preferred slot def : Pat<(SPUvec2prefslot (v16i8 VECREG:$rA)), - (ORi8_v16i8 VECREG:$rA, VECREG:$rA)>; + (ORi8_v16i8 VECREG:$rA)>; def : Pat<(SPUvec2prefslot (v8i16 VECREG:$rA)), - (ORi16_v8i16 VECREG:$rA, VECREG:$rA)>; + (ORi16_v8i16 VECREG:$rA)>; def : Pat<(SPUvec2prefslot (v4i32 VECREG:$rA)), - (ORi32_v4i32 VECREG:$rA, VECREG:$rA)>; + (ORi32_v4i32 VECREG:$rA)>; def : Pat<(SPUvec2prefslot (v2i64 VECREG:$rA)), - (ORi64_v2i64 VECREG:$rA, VECREG:$rA)>; + (ORi64_v2i64 VECREG:$rA)>; def : Pat<(SPUvec2prefslot (v4f32 VECREG:$rA)), - (ORf32_v4f32 VECREG:$rA, VECREG:$rA)>; + (ORf32_v4f32 VECREG:$rA)>; def : Pat<(SPUvec2prefslot (v2f64 VECREG:$rA)), - (ORf64_v2f64 VECREG:$rA, VECREG:$rA)>; + (ORf64_v2f64 VECREG:$rA)>; + +// Load Register: This is an assembler alias for a bitwise OR of a register +// against itself. It's here because it brings some clarity to assembly +// language output. 
+ +let hasCtrlDep = 1 in { + class LRInst + : SPUInstr { + bits<7> RA; + bits<7> RT; + + let Pattern = [/*no pattern*/]; + + let Inst{0-10} = 0b10000010000; /* It's an OR operation */ + let Inst{11-17} = RA; + let Inst{18-24} = RA; + let Inst{25-31} = RT; + } + + class LRVecInst: + LRInst<(outs VECREG:$rT), (ins VECREG:$rA)>; + + class LRRegInst: + LRInst<(outs rclass:$rT), (ins rclass:$rA)>; + + multiclass LoadRegister { + def v2i64: LRVecInst; + def v2f64: LRVecInst; + def v4i32: LRVecInst; + def v4f32: LRVecInst; + def v8i16: LRVecInst; + def v16i8: LRVecInst; + + def r128: LRRegInst; + def r64: LRRegInst; + def f64: LRRegInst; + def r32: LRRegInst; + def f32: LRRegInst; + def r16: LRRegInst; + def r8: LRRegInst; + } + + defm LR: LoadRegister; +} // ORC: Bitwise "or" with complement (c = a | ~b) @@ -1585,12 +1793,24 @@ (and (vnot (vectype VECREG:$rC)), (vectype VECREG:$rA))))]>; +class SELBVecCondInst: + SELBInst<(outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB, R32C:$rC), + [(set (vectype VECREG:$rT), + (select R32C:$rC, + (vectype VECREG:$rB), + (vectype VECREG:$rA)))]>; + class SELBRegInst: SELBInst<(outs rclass:$rT), (ins rclass:$rA, rclass:$rB, rclass:$rC), [(set rclass:$rT, (or (and rclass:$rA, rclass:$rC), (and rclass:$rB, (not rclass:$rC))))]>; +class SELBRegCondInst: + SELBInst<(outs rclass:$rT), (ins rclass:$rA, rclass:$rB, rcond:$rC), + [(set rclass:$rT, + (select rcond:$rC, rclass:$rB, rclass:$rA))]>; + multiclass SelectBits { def v16i8: SELBVecInst; @@ -1603,6 +1823,16 @@ def r32: SELBRegInst; def r16: SELBRegInst; def r8: SELBRegInst; + + def v16i8_cond: SELBVecCondInst; + def v8i16_cond: SELBVecCondInst; + def v4i32_cond: SELBVecCondInst; + def v2i64_cond: SELBVecCondInst; + + // SELBr64_cond is defined further down, look for i64 comparisons + def r32_cond: SELBRegCondInst; + def r16_cond: SELBRegCondInst; + def r8_cond: SELBRegCondInst; } defm SELB : SelectBits; @@ -1625,14 +1855,6 @@ def : SPUselbPatReg; def : SPUselbPatReg; -class 
SelectConditional: - Pat<(select rclass:$rCond, rclass:$rTrue, rclass:$rFalse), - (inst rclass:$rFalse, rclass:$rTrue, rclass:$rCond)>; - -def : SelectConditional; -def : SelectConditional; -def : SelectConditional; - // EQV: Equivalence (1 for each same bit, otherwise 0) // // Note: There are a lot of ways to match this bit operator and these patterns @@ -1753,6 +1975,10 @@ (resultvec VECREG:$rB), (maskvec VECREG:$rC)))]>; +class SHUFBGPRCInst: + SHUFBInst<(outs VECREG:$rT), (ins GPRC:$rA, GPRC:$rB, VECREG:$rC), + [/* no pattern */]>; + multiclass ShuffleBytes { def v16i8 : SHUFBVecInst; @@ -1769,6 +1995,8 @@ def v2f64 : SHUFBVecInst; def v2f64_m32 : SHUFBVecInst; + + def gprc : SHUFBGPRCInst; } defm SHUFB : ShuffleBytes; @@ -2027,7 +2255,7 @@ def : Pat<(SPUvec_rotl VECREG:$rA, (i32 uimm7:$val)), (ROTHIv8i16 VECREG:$rA, imm:$val)>; - + //-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ // Rotate word: //-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ @@ -2207,7 +2435,7 @@ } defm ROTQBI: RotateQuadByBitCount; - + class ROTQBIIInst pattern>: RI7Form<0b00011111100, OOL, IOL, "rotqbii\t$rT, $rA, $val", RotateShift, pattern>; @@ -2298,7 +2526,7 @@ def: Pat<(SPUvec_srl (v8i16 VECREG:$rA), (i16 imm:$val)), (ROTHMIv8i16 VECREG:$rA, imm:$val)>; - + def: Pat<(SPUvec_srl (v8i16 VECREG:$rA), (i8 imm:$val)), (ROTHMIv8i16 VECREG:$rA, imm:$val)>; @@ -2359,7 +2587,7 @@ def : Pat<(SPUvec_srl VECREG:$rA, (i16 uimm7:$val)), (ROTMIv4i32 VECREG:$rA, uimm7:$val)>; - + def : Pat<(SPUvec_srl VECREG:$rA, (i8 uimm7:$val)), (ROTMIv4i32 VECREG:$rA, uimm7:$val)>; @@ -2682,7 +2910,7 @@ "hgt\t$rA, $rB", BranchResolv, [/* no pattern to match */]>; - def HGTIr32: + def HGTIr32: RI10Form_2<0b11110010, (outs), (ins R32C:$rA, s10imm:$val), "hgti\t$rA, $val", BranchResolv, [/* no pattern to match */]>; @@ -2698,9 +2926,9 @@ [/* no pattern to match */]>; } -//------------------------------------------------------------------------ -// Comparison operators: 
-//------------------------------------------------------------------------ +//-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ +// Comparison operators for i8, i16 and i32: +//-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ class CEQBInst pattern> : RRForm<0b00001011110, OOL, IOL, "ceqb\t$rT, $rA, $rB", @@ -2990,8 +3218,14 @@ // define a pattern to generate the right code, as a binary operator // (in a manner of speaking.) // -// N.B.: This only matches the setcc set of conditionals. Special pattern -// matching is used for select conditionals. +// Notes: +// 1. This only matches the setcc set of conditionals. Special pattern +// matching is used for select conditionals. +// +// 2. The "DAG" versions of these classes is almost exclusively used for +// i64 comparisons. See the tblgen fundamentals documentation for what +// ".ResultInstrs[0]" means; see TargetSelectionDAG.td and the Pattern +// class for where ResultInstrs originates. //-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ class SETCCNegCondReg; -def : SETCCNegCondReg; +def : SETCCNegCondReg; def : SETCCNegCondImm; -def : SETCCNegCondReg; +def : SETCCNegCondReg; def : SETCCNegCondImm; def : SETCCNegCondReg; @@ -3128,8 +3362,8 @@ SPUInstr selinstr, SPUInstr binop, SPUInstr cmpOp1, SPUInstr cmpOp2>: Pat<(select (inttype (cond rclass:$rA, rclass:$rB)), - rclass:$rFalse, rclass:$rTrue), - (selinstr rclass:$rTrue, rclass:$rFalse, + rclass:$rTrue, rclass:$rFalse), + (selinstr rclass:$rFalse, rclass:$rTrue, (binop (cmpOp1 rclass:$rA, rclass:$rB), (cmpOp2 rclass:$rA, rclass:$rB)))>; @@ -3226,54 +3460,129 @@ BIForm<0b00010101100, "bi\t$func", [(brind R32C:$func)]>; // Various branches: - def BRNZ: - RI16Form<0b010000100, (outs), (ins R32C:$rCond, brtarget:$dest), - "brnz\t$rCond,$dest", - BranchResolv, - [(brcond R32C:$rCond, bb:$dest)]>; - - def BRZ: - RI16Form<0b000000100, (outs), (ins R32C:$rT, brtarget:$dest), - "brz\t$rT,$dest", - 
BranchResolv, - [/* no pattern */]>; + class BRNZInst pattern>: + RI16Form<0b010000100, (outs), IOL, "brnz\t$rCond,$dest", + BranchResolv, pattern>; + + class BRNZRegInst: + BRNZInst<(ins rclass:$rCond, brtarget:$dest), + [(brcond rclass:$rCond, bb:$dest)]>; + + class BRNZVecInst: + BRNZInst<(ins VECREG:$rCond, brtarget:$dest), + [(brcond (vectype VECREG:$rCond), bb:$dest)]>; + + multiclass BranchNotZero { + def v4i32 : BRNZVecInst; + def r32 : BRNZRegInst; + } - def BRHNZ: - RI16Form<0b011000100, (outs), (ins R16C:$rCond, brtarget:$dest), - "brhnz\t$rCond,$dest", - BranchResolv, - [(brcond R16C:$rCond, bb:$dest)]>; - - def BRHZ: - RI16Form<0b001000100, (outs), (ins R16C:$rT, brtarget:$dest), - "brhz\t$rT,$dest", - BranchResolv, - [/* no pattern */]>; - -/* - def BINZ: - BICondForm<0b10010100100, "binz\t$rA, $func", - [(SPUbinz R32C:$rA, R32C:$func)]>; - - def BIZ: - BICondForm<0b00010100100, "biz\t$rA, $func", - [(SPUbiz R32C:$rA, R32C:$func)]>; -*/ + defm BRNZ : BranchNotZero; + + class BRZInst pattern>: + RI16Form<0b000000100, (outs), IOL, "brz\t$rT,$dest", + BranchResolv, pattern>; + + class BRZRegInst: + BRZInst<(ins rclass:$rT, brtarget:$dest), [/* no pattern */]>; + + class BRZVecInst: + BRZInst<(ins VECREG:$rT, brtarget:$dest), [/* no pattern */]>; + + multiclass BranchZero { + def v4i32: BRZVecInst; + def r32: BRZRegInst; + } + + defm BRZ: BranchZero; + + // Note: LLVM doesn't do branch conditional, indirect. 
Otherwise these would + // be useful: + /* + class BINZInst pattern>: + BICondForm<0b10010100100, (outs), IOL, "binz\t$rA, $dest", pattern>; + + class BINZRegInst: + BINZInst<(ins rclass:$rA, brtarget:$dest), + [(brcond rclass:$rA, R32C:$dest)]>; + + class BINZVecInst: + BINZInst<(ins VECREG:$rA, R32C:$dest), + [(brcond (vectype VECREG:$rA), R32C:$dest)]>; + + multiclass BranchNotZeroIndirect { + def v4i32: BINZVecInst; + def r32: BINZRegInst; + } + + defm BINZ: BranchNotZeroIndirect; + + class BIZInst pattern>: + BICondForm<0b00010100100, (outs), IOL, "biz\t$rA, $func", pattern>; + + class BIZRegInst: + BIZInst<(ins rclass:$rA, R32C:$func), [/* no pattern */]>; + + class BIZVecInst: + BIZInst<(ins VECREG:$rA, R32C:$func), [/* no pattern */]>; + + multiclass BranchZeroIndirect { + def v4i32: BIZVecInst; + def r32: BIZRegInst; + } + + defm BIZ: BranchZeroIndirect; + */ + + class BRHNZInst pattern>: + RI16Form<0b011000100, (outs), IOL, "brhnz\t$rCond,$dest", BranchResolv, + pattern>; + + class BRHNZRegInst: + BRHNZInst<(ins rclass:$rCond, brtarget:$dest), + [(brcond rclass:$rCond, bb:$dest)]>; + + class BRHNZVecInst: + BRHNZInst<(ins VECREG:$rCond, brtarget:$dest), [/* no pattern */]>; + + multiclass BranchNotZeroHalfword { + def v8i16: BRHNZVecInst; + def r16: BRHNZRegInst; + } + + defm BRHNZ: BranchNotZeroHalfword; + + class BRHZInst pattern>: + RI16Form<0b001000100, (outs), IOL, "brhz\t$rT,$dest", BranchResolv, + pattern>; + + class BRHZRegInst: + BRHZInst<(ins rclass:$rT, brtarget:$dest), [/* no pattern */]>; + + class BRHZVecInst: + BRHZInst<(ins VECREG:$rT, brtarget:$dest), [/* no pattern */]>; + + multiclass BranchZeroHalfword { + def v8i16: BRHZVecInst; + def r16: BRHZRegInst; + } + + defm BRHZ: BranchZeroHalfword; } //===----------------------------------------------------------------------===// // setcc and brcond patterns: //===----------------------------------------------------------------------===// -def : Pat<(brcond (i16 (seteq R16C:$rA, 0)), 
bb:$dest), - (BRHZ R16C:$rA, bb:$dest)>; -def : Pat<(brcond (i16 (setne R16C:$rA, 0)), bb:$dest), - (BRHNZ R16C:$rA, bb:$dest)>; - -def : Pat<(brcond (i32 (seteq R32C:$rA, 0)), bb:$dest), - (BRZ R32C:$rA, bb:$dest)>; -def : Pat<(brcond (i32 (setne R32C:$rA, 0)), bb:$dest), - (BRNZ R32C:$rA, bb:$dest)>; +def : Pat<(brcond (i16 (seteq R16C:$rA, 0)), bb:$dest), + (BRHZr16 R16C:$rA, bb:$dest)>; +def : Pat<(brcond (i16 (setne R16C:$rA, 0)), bb:$dest), + (BRHNZr16 R16C:$rA, bb:$dest)>; + +def : Pat<(brcond (i32 (seteq R32C:$rA, 0)), bb:$dest), + (BRZr32 R32C:$rA, bb:$dest)>; +def : Pat<(brcond (i32 (setne R32C:$rA, 0)), bb:$dest), + (BRNZr32 R32C:$rA, bb:$dest)>; multiclass BranchCondEQ { @@ -3290,8 +3599,8 @@ (brinst32 (CEQr32 R32C:$rA, R32C:$rB), bb:$dest)>; } -defm BRCONDeq : BranchCondEQ; -defm BRCONDne : BranchCondEQ; +defm BRCONDeq : BranchCondEQ; +defm BRCONDne : BranchCondEQ; multiclass BranchCondLGT { @@ -3308,8 +3617,8 @@ (brinst32 (CLGTr32 R32C:$rA, R32C:$rB), bb:$dest)>; } -defm BRCONDugt : BranchCondLGT; -defm BRCONDule : BranchCondLGT; +defm BRCONDugt : BranchCondLGT; +defm BRCONDule : BranchCondLGT; multiclass BranchCondLGTEQ @@ -3335,8 +3644,8 @@ bb:$dest)>; } -defm BRCONDuge : BranchCondLGTEQ; -defm BRCONDult : BranchCondLGTEQ; +defm BRCONDuge : BranchCondLGTEQ; +defm BRCONDult : BranchCondLGTEQ; multiclass BranchCondGT { @@ -3353,8 +3662,8 @@ (brinst32 (CGTr32 R32C:$rA, R32C:$rB), bb:$dest)>; } -defm BRCONDgt : BranchCondGT; -defm BRCONDle : BranchCondGT; +defm BRCONDgt : BranchCondGT; +defm BRCONDle : BranchCondGT; multiclass BranchCondGTEQ @@ -3380,8 +3689,8 @@ bb:$dest)>; } -defm BRCONDge : BranchCondGTEQ; -defm BRCONDlt : BranchCondGTEQ; +defm BRCONDge : BranchCondGTEQ; +defm BRCONDlt : BranchCondGTEQ; let isTerminator = 1, isBarrier = 1 in { let isReturn = 1 in { @@ -3397,10 +3706,12 @@ class FAInst pattern>: RRForm<0b01011000100, OOL, IOL, "fa\t$rT, $rA, $rB", SPrecFP, pattern>; + class FAVecInst: FAInst<(outs VECREG:$rT), (ins VECREG:$rA, 
VECREG:$rB), [(set (vectype VECREG:$rT), (fadd (vectype VECREG:$rA), (vectype VECREG:$rB)))]>; + multiclass SFPAdd { def v4f32: FAVecInst; @@ -3548,7 +3859,7 @@ // floating reciprocal absolute square root estimate (frsqest) // The following are probably just intrinsics -// status and control register write +// status and control register write // status and control register read //-------------------------------------- @@ -3603,7 +3914,7 @@ // = c - a * b // NOTE: subtraction order // fsub a b = a - b -// fs a b = b - a? +// fs a b = b - a? def FNMSf32 : RRRForm<0b1101, (outs R32FP:$rT), (ins R32FP:$rA, R32FP:$rB, R32FP:$rC), "fnms\t$rT, $rA, $rB, $rC", SPrecFP, @@ -3612,9 +3923,9 @@ def FNMSv4f32 : RRRForm<0b1101, (outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB, VECREG:$rC), "fnms\t$rT, $rA, $rB, $rC", SPrecFP, - [(set (v4f32 VECREG:$rT), - (fsub (v4f32 VECREG:$rC), - (fmul (v4f32 VECREG:$rA), + [(set (v4f32 VECREG:$rT), + (fsub (v4f32 VECREG:$rC), + (fmul (v4f32 VECREG:$rA), (v4f32 VECREG:$rB))))]>; //-------------------------------------- @@ -3625,7 +3936,7 @@ "csflt\t$rT, $rA, 0", SPrecFP, [(set (v4f32 VECREG:$rT), (sint_to_fp (v4i32 VECREG:$rA)))]>; -// Convert signed integer to floating point +// Convert signed integer to floating point def CSiFf32 : CVTIntFPForm<0b0101101110, (outs R32FP:$rT), (ins R32C:$rA), "csflt\t$rT, $rA, 0", SPrecFP, @@ -3642,7 +3953,7 @@ "cuflt\t$rT, $rA, 0", SPrecFP, [(set R32FP:$rT, (uint_to_fp R32C:$rA))]>; -// Convert float to unsigned int +// Convert float to unsigned int // Assume that scale = 0 def CFUiv4f32 : @@ -3655,7 +3966,7 @@ "cfltu\t$rT, $rA, 0", SPrecFP, [(set R32C:$rT, (fp_to_uint R32FP:$rA))]>; -// Convert float to signed int +// Convert float to signed int // Assume that scale = 0 def CFSiv4f32 : @@ -3788,9 +4099,9 @@ RRForm<0b01111010110, (outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB, VECREG:$rC), "dfnms\t$rT, $rA, $rB", DPrecFP, - [(set (v2f64 VECREG:$rT), - (fsub (v2f64 VECREG:$rC), - (fmul (v2f64 VECREG:$rA), + 
[(set (v2f64 VECREG:$rT), + (fsub (v2f64 VECREG:$rC), + (fmul (v2f64 VECREG:$rA), (v2f64 VECREG:$rB))))]>, RegConstraint<"$rC = $rT">, NoEncode<"$rC">; @@ -3813,9 +4124,9 @@ RRForm<0b11111010110, (outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB, VECREG:$rC), "dfnma\t$rT, $rA, $rB", DPrecFP, - [(set (v2f64 VECREG:$rT), - (fneg (fadd (v2f64 VECREG:$rC), - (fmul (v2f64 VECREG:$rA), + [(set (v2f64 VECREG:$rT), + (fneg (fadd (v2f64 VECREG:$rC), + (fmul (v2f64 VECREG:$rA), (v2f64 VECREG:$rB)))))]>, RegConstraint<"$rC = $rT">, NoEncode<"$rC">; @@ -3825,7 +4136,7 @@ //===----------------------------------------------------------------------==// def : Pat<(fneg (v4f32 VECREG:$rA)), - (XORfnegvec (v4f32 VECREG:$rA), + (XORfnegvec (v4f32 VECREG:$rA), (v4f32 (ILHUv4i32 0x8000)))>; def : Pat<(fneg R32FP:$rA), @@ -3944,7 +4255,7 @@ def : Pat<(v4i32 v4i32Imm:$imm), (IOHLv4i32 (v4i32 (ILHUv4i32 (HI16_vec v4i32Imm:$imm))), (LO16_vec v4i32Imm:$imm))>; - + // 8-bit constants def : Pat<(i8 imm:$imm), (ILHr8 imm:$imm)>; @@ -4001,6 +4312,69 @@ (ORIi16i32 R16C:$rSrc, 0)>; //===----------------------------------------------------------------------===// +// Truncates: +// These truncates are for the SPU's supported types (i8, i16, i32). i64 and +// above are custom lowered. 
+//===----------------------------------------------------------------------===// + +def : Pat<(i8 (trunc GPRC:$src)), + (ORi8_v16i8 + (SHUFBgprc GPRC:$src, GPRC:$src, + (IOHLv4i32 (ILHUv4i32 0x0f0f), 0x0f0f)))>; + +def : Pat<(i8 (trunc R64C:$src)), + (ORi8_v16i8 + (SHUFBv2i64_m32 + (ORv2i64_i64 R64C:$src), + (ORv2i64_i64 R64C:$src), + (IOHLv4i32 (ILHUv4i32 0x0707), 0x0707)))>; + +def : Pat<(i8 (trunc R32C:$src)), + (ORi8_v16i8 + (SHUFBv4i32_m32 + (ORv4i32_i32 R32C:$src), + (ORv4i32_i32 R32C:$src), + (IOHLv4i32 (ILHUv4i32 0x0303), 0x0303)))>; + +def : Pat<(i8 (trunc R16C:$src)), + (ORi8_v16i8 + (SHUFBv4i32_m32 + (ORv8i16_i16 R16C:$src), + (ORv8i16_i16 R16C:$src), + (IOHLv4i32 (ILHUv4i32 0x0303), 0x0303)))>; + +def : Pat<(i16 (trunc GPRC:$src)), + (ORi16_v8i16 + (SHUFBgprc GPRC:$src, GPRC:$src, + (IOHLv4i32 (ILHUv4i32 0x0e0f), 0x0e0f)))>; + +def : Pat<(i16 (trunc R64C:$src)), + (ORi16_v8i16 + (SHUFBv2i64_m32 + (ORv2i64_i64 R64C:$src), + (ORv2i64_i64 R64C:$src), + (IOHLv4i32 (ILHUv4i32 0x0607), 0x0607)))>; + +def : Pat<(i16 (trunc R32C:$src)), + (ORi16_v8i16 + (SHUFBv4i32_m32 + (ORv4i32_i32 R32C:$src), + (ORv4i32_i32 R32C:$src), + (IOHLv4i32 (ILHUv4i32 0x0203), 0x0203)))>; + +def : Pat<(i32 (trunc GPRC:$src)), + (ORi32_v4i32 + (SHUFBgprc GPRC:$src, GPRC:$src, + (IOHLv4i32 (ILHUv4i32 0x0c0d), 0x0e0f)))>; + +def : Pat<(i32 (trunc R64C:$src)), + (ORi32_v4i32 + (SHUFBv2i64_m32 + (ORv2i64_i64 R64C:$src), + (ORv2i64_i64 R64C:$src), + (IOHLv4i32 (ILHUv4i32 0x0405), 0x0607)))>; + +//===----------------------------------------------------------------------===// // Address generation: SPU, like PPC, has to split addresses into high and // low parts in order to load them into a register. 
//===----------------------------------------------------------------------===// @@ -4047,3 +4421,5 @@ // Instrinsics: include "CellSDKIntrinsics.td" +// 64-bit "instructions"/support +include "SPU64InstrInfo.td" Modified: llvm/trunk/lib/Target/CellSPU/SPUNodes.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUNodes.td?rev=61447&r1=61446&r2=61447&view=diff ============================================================================== --- llvm/trunk/lib/Target/CellSPU/SPUNodes.td (original) +++ llvm/trunk/lib/Target/CellSPU/SPUNodes.td Fri Dec 26 22:51:36 2008 @@ -66,6 +66,13 @@ def SPUvecshift_type: SDTypeProfile<1, 2, [ SDTCisSameAs<0, 1>, SDTCisInt<2>]>; +// SPU gather bits: +// This instruction looks at each vector (word|halfword|byte) slot's low bit +// and forms a mask in the low order bits of the first word's preferred slot. +def SPUgatherbits_type: SDTypeProfile<1, 1, [ + /* no type constraints defined */ +]>; + //===----------------------------------------------------------------------===// // Synthetic/pseudo-instructions //===----------------------------------------------------------------------===// @@ -137,14 +144,17 @@ // SPU select bits instruction def SPUselb: SDNode<"SPUISD::SELB", SPUselb_type, []>; +// SPU gather bits instruction: +def SPUgatherbits: SDNode<"SPUISD::GATHER_BITS", SPUgatherbits_type, []>; + // SPU floating point interpolate def SPUinterpolate : SDNode<"SPUISD::FPInterp", SDTFPBinOp, []>; // SPU floating point reciprocal estimate (used for fdiv) def SPUreciprocalEst: SDNode<"SPUISD::FPRecipEst", SDTFPUnaryOp, []>; -def SDTpromote_scalar: SDTypeProfile<1, 1, []>; -def SPUpromote_scalar: SDNode<"SPUISD::PROMOTE_SCALAR", SDTpromote_scalar, []>; +def SDTprefslot2vec: SDTypeProfile<1, 1, []>; +def SPUprefslot2vec: SDNode<"SPUISD::PREFSLOT2VEC", SDTprefslot2vec, []>; def SPU_vec_demote : SDTypeProfile<1, 1, []>; def SPUvec2prefslot: SDNode<"SPUISD::VEC2PREFSLOT", SPU_vec_demote, []>; Modified: 
llvm/trunk/lib/Target/CellSPU/SPUOperands.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUOperands.td?rev=61447&r1=61446&r2=61447&view=diff
==============================================================================
--- llvm/trunk/lib/Target/CellSPU/SPUOperands.td (original)
+++ llvm/trunk/lib/Target/CellSPU/SPUOperands.td Fri Dec 26 22:51:36 2008
@@ -609,15 +609,15 @@
   let PrintMethod = "printSymbolLSA";
 }

-// memory s7imm(reg) operaand
-def memri7 : Operand<iPTR> {
-  let PrintMethod = "printMemRegImmS7";
+// Shuffle address memory operaand [s7imm(reg) d-format]
+def shufaddr : Operand<iPTR> {
+  let PrintMethod = "printShufAddr";
   let MIOperandInfo = (ops s7imm:$imm, ptr_rc:$reg);
 }

 // memory s10imm(reg) operand
-def memri10 : Operand<iPTR> {
-  let PrintMethod = "printMemRegImmS10";
+def dformaddr : Operand<iPTR> {
+  let PrintMethod = "printDFormAddr";
   let MIOperandInfo = (ops s10imm:$imm, ptr_rc:$reg);
 }

Modified: llvm/trunk/lib/Target/CellSPU/SPURegisterInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPURegisterInfo.cpp?rev=61447&r1=61446&r2=61447&view=diff
==============================================================================
--- llvm/trunk/lib/Target/CellSPU/SPURegisterInfo.cpp (original)
+++ llvm/trunk/lib/Target/CellSPU/SPURegisterInfo.cpp Fri Dec 26 22:51:36 2008
@@ -403,11 +403,6 @@
 void SPURegisterInfo::processFunctionBeforeCalleeSavedScan(MachineFunction &MF,
                                                             RegScavenger *RS) const {
-#if 0
-  // Save and clear the LR state.
-  SPUFunctionInfo *FI = MF.getInfo<SPUFunctionInfo>();
-  FI->setUsesLR(MF.getRegInfo().isPhysRegUsed(LR));
-#endif
   // Mark LR and SP unused, since the prolog spills them to stack and
   // we don't want anyone else to spill them for us.
// Modified: llvm/trunk/lib/Target/CellSPU/SPUTargetAsmInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUTargetAsmInfo.cpp?rev=61447&r1=61446&r2=61447&view=diff ============================================================================== --- llvm/trunk/lib/Target/CellSPU/SPUTargetAsmInfo.cpp (original) +++ llvm/trunk/lib/Target/CellSPU/SPUTargetAsmInfo.cpp Fri Dec 26 22:51:36 2008 @@ -26,6 +26,13 @@ PrivateGlobalPrefix = ".L"; // This corresponds to what the gcc SPU compiler emits, for consistency. CStringSection = ".rodata.str"; + + // BSS section needs to be emitted as ".section" + BSSSection = "\t.section\t.bss"; + BSSSection_ = getUnnamedSection("\t.section\t.bss", + SectionFlags::Writeable | SectionFlags::BSS, + true); + } /// PreferredEHDataFormat - This hook allows the target to select data Modified: llvm/trunk/test/CodeGen/CellSPU/call_indirect.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/CellSPU/call_indirect.ll?rev=61447&r1=61446&r2=61447&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/CellSPU/call_indirect.ll (original) +++ llvm/trunk/test/CodeGen/CellSPU/call_indirect.ll Fri Dec 26 22:51:36 2008 @@ -2,7 +2,7 @@ ; RUN: llvm-as -o - %s | llc -march=cellspu -mattr=large_mem > %t2.s ; RUN: grep bisl %t1.s | count 7 ; RUN: grep ila %t1.s | count 1 -; RUN: grep rotqbyi %t1.s | count 4 +; RUN: grep rotqby %t1.s | count 6 ; RUN: grep lqa %t1.s | count 1 ; RUN: grep lqd %t1.s | count 12 ; RUN: grep dispatch_tab %t1.s | count 5 Added: llvm/trunk/test/CodeGen/CellSPU/icmp64.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/CellSPU/icmp64.ll?rev=61447&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/CellSPU/icmp64.ll (added) +++ llvm/trunk/test/CodeGen/CellSPU/icmp64.ll Fri Dec 26 22:51:36 2008 @@ -0,0 +1,144 @@ +; RUN: llvm-as -o - %s | llc -march=cellspu 
> %t1.s +; RUN: grep ceq %t1.s | count 4 +; RUN: grep cgti %t1.s | count 4 +; RUN: grep gb %t1.s | count 4 +; RUN: grep fsm %t1.s | count 2 +; RUN: grep xori %t1.s | count 1 +; RUN: grep selb %t1.s | count 2 + +target datalayout = "E-p:32:32:128-f64:64:128-f32:32:128-i64:32:128-i32:32:128-i16:16:128-i8:8:128-i1:8:128-a0:0:128-v128:128:128-s0:128:128" +target triple = "spu" + +; $3 = %arg1, $4 = %arg2, $5 = %val1, $6 = %val2 +; $3 = %arg1, $4 = %val1, $5 = %val2 +; +; i64 integer comparisons: +define i64 @icmp_eq_select_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +entry: + %A = icmp eq i64 %arg1, %arg2 + %B = select i1 %A, i64 %val1, i64 %val2 + ret i64 %B +} + +define i1 @icmp_eq_setcc_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +entry: + %A = icmp eq i64 %arg1, %arg2 + ret i1 %A +} + +define i64 @icmp_ne_select_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +entry: + %A = icmp ne i64 %arg1, %arg2 + %B = select i1 %A, i64 %val1, i64 %val2 + ret i64 %B +} + +define i1 @icmp_ne_setcc_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +entry: + %A = icmp ne i64 %arg1, %arg2 + ret i1 %A +} + +;; define i64 @icmp_ugt_select_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp ugt i64 %arg1, %arg2 +;; %B = select i1 %A, i64 %val1, i64 %val2 +;; ret i64 %B +;; } +;; +;; define i1 @icmp_ugt_setcc_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp ugt i64 %arg1, %arg2 +;; ret i1 %A +;; } +;; +;; define i64 @icmp_uge_select_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp uge i64 %arg1, %arg2 +;; %B = select i1 %A, i64 %val1, i64 %val2 +;; ret i64 %B +;; } +;; +;; define i1 @icmp_uge_setcc_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp uge i64 %arg1, %arg2 +;; ret i1 %A +;; } +;; +;; define i64 @icmp_ult_select_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = 
icmp ult i64 %arg1, %arg2 +;; %B = select i1 %A, i64 %val1, i64 %val2 +;; ret i64 %B +;; } +;; +;; define i1 @icmp_ult_setcc_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp ult i64 %arg1, %arg2 +;; ret i1 %A +;; } +;; +;; define i64 @icmp_ule_select_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp ule i64 %arg1, %arg2 +;; %B = select i1 %A, i64 %val1, i64 %val2 +;; ret i64 %B +;; } +;; +;; define i1 @icmp_ule_setcc_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp ule i64 %arg1, %arg2 +;; ret i1 %A +;; } +;; +;; define i64 @icmp_sgt_select_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp sgt i64 %arg1, %arg2 +;; %B = select i1 %A, i64 %val1, i64 %val2 +;; ret i64 %B +;; } +;; +;; define i1 @icmp_sgt_setcc_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp sgt i64 %arg1, %arg2 +;; ret i1 %A +;; } +;; +;; define i64 @icmp_sge_select_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp sge i64 %arg1, %arg2 +;; %B = select i1 %A, i64 %val1, i64 %val2 +;; ret i64 %B +;; } +;; +;; define i1 @icmp_sge_setcc_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp sge i64 %arg1, %arg2 +;; ret i1 %A +;; } +;; +;; define i64 @icmp_slt_select_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp slt i64 %arg1, %arg2 +;; %B = select i1 %A, i64 %val1, i64 %val2 +;; ret i64 %B +;; } +;; +;; define i1 @icmp_slt_setcc_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp slt i64 %arg1, %arg2 +;; ret i1 %A +;; } +;; +;; define i64 @icmp_sle_select_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) nounwind { +;; entry: +;; %A = icmp sle i64 %arg1, %arg2 +;; %B = select i1 %A, i64 %val1, i64 %val2 +;; ret i64 %B +;; } +;; +;; define i1 @icmp_sle_setcc_i64(i64 %arg1, i64 %arg2, i64 %val1, i64 %val2) 
nounwind { +;; entry: +;; %A = icmp sle i64 %arg1, %arg2 +;; ret i1 %A +;; } Modified: llvm/trunk/test/CodeGen/CellSPU/stores.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/CellSPU/stores.ll?rev=61447&r1=61446&r2=61447&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/CellSPU/stores.ll (original) +++ llvm/trunk/test/CodeGen/CellSPU/stores.ll Fri Dec 26 22:51:36 2008 @@ -3,8 +3,17 @@ ; RUN: grep {stqd.*16(\$3)} %t1.s | count 4 ; RUN: grep 16256 %t1.s | count 2 ; RUN: grep 16384 %t1.s | count 1 +; RUN: grep 771 %t1.s | count 4 +; RUN: grep 515 %t1.s | count 2 +; RUN: grep 1799 %t1.s | count 2 +; RUN: grep 1543 %t1.s | count 5 +; RUN: grep 1029 %t1.s | count 3 ; RUN: grep {shli.*, 4} %t1.s | count 4 ; RUN: grep stqx %t1.s | count 4 +; RUN: grep ilhu %t1.s | count 11 +; RUN: grep iohl %t1.s | count 8 +; RUN: grep shufb %t1.s | count 15 +; RUN: grep frds %t1.s | count 1 ; ModuleID = 'stores.bc' target datalayout = "E-p:32:32:128-f64:64:128-f32:32:128-i64:32:128-i32:32:128-i16:16:128-i8:8:128-i1:8:128-a0:0:128-v128:128:128-s0:128:128" @@ -89,3 +98,54 @@ store <4 x float> < float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00 >, <4 x float>* %arrayidx ret void } + +; Test truncating stores: + +define zeroext i8 @tstore_i16_i8(i16 signext %val, i8* %dest) nounwind { +entry: + %conv = trunc i16 %val to i8 + store i8 %conv, i8* %dest + ret i8 %conv +} + +define zeroext i8 @tstore_i32_i8(i32 %val, i8* %dest) nounwind { +entry: + %conv = trunc i32 %val to i8 + store i8 %conv, i8* %dest + ret i8 %conv +} + +define signext i16 @tstore_i32_i16(i32 %val, i16* %dest) nounwind { +entry: + %conv = trunc i32 %val to i16 + store i16 %conv, i16* %dest + ret i16 %conv +} + +define zeroext i8 @tstore_i64_i8(i64 %val, i8* %dest) nounwind { +entry: + %conv = trunc i64 %val to i8 + store i8 %conv, i8* %dest + ret i8 %conv +} + +define signext i16 @tstore_i64_i16(i64 %val, i16* 
%dest) nounwind { +entry: + %conv = trunc i64 %val to i16 + store i16 %conv, i16* %dest + ret i16 %conv +} + +define i32 @tstore_i64_i32(i64 %val, i32* %dest) nounwind { +entry: + %conv = trunc i64 %val to i32 + store i32 %conv, i32* %dest + ret i32 %conv +} + +define float @tstore_f64_f32(double %val, float* %dest) nounwind { +entry: + %conv = fptrunc double %val to float + store float %conv, float* %dest + ret float %conv +} Modified: llvm/trunk/test/CodeGen/CellSPU/struct_1.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/CellSPU/struct_1.ll?rev=61447&r1=61446&r2=61447&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/CellSPU/struct_1.ll (original) +++ llvm/trunk/test/CodeGen/CellSPU/struct_1.ll Fri Dec 26 22:51:36 2008 @@ -35,7 +35,7 @@ ; int i2; // offset 12 [ignored] ; unsigned char c4; // offset 16 [ignored] ; unsigned char c5; // offset 17 [ignored] -; unsigned char c6; // offset 18 [ignored] +; unsigned char c6; // offset 18 (rotate left by 14 bytes to byte 3) ; unsigned char c7; // offset 19 (no rotate, in preferred slot) ; int i3; // offset 20 [ignored] ; int i4; // offset 24 [ignored] Modified: llvm/trunk/test/CodeGen/CellSPU/trunc.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/CellSPU/trunc.ll?rev=61447&r1=61446&r2=61447&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/CellSPU/trunc.ll (original) +++ llvm/trunk/test/CodeGen/CellSPU/trunc.ll Fri Dec 26 22:51:36 2008 @@ -1,16 +1,12 @@ ; RUN: llvm-as -o - %s | llc -march=cellspu > %t1.s -; RUN: grep shufb %t1.s | count 9 +; RUN: grep shufb %t1.s | count 10 ; RUN: grep {ilhu.*1799} %t1.s | count 1 -; RUN: grep {ilhu.*771} %t1.s | count 3 +; RUN: grep {ilhu.*771} %t1.s | count 1 ; RUN: grep {ilhu.*1543} %t1.s | count 1 ; RUN: grep {ilhu.*1029} %t1.s | count 1 -; RUN: grep {ilhu.*515} %t1.s | count 1 -; RUN: grep {iohl.*1799} %t1.s | 
count 1 -; RUN: grep {iohl.*771} %t1.s | count 3 -; RUN: grep {iohl.*1543} %t1.s | count 2 -; RUN: grep {iohl.*515} %t1.s | count 1 -; RUN: grep xsbh %t1.s | count 6 -; RUN: grep sfh %t1.s | count 5 +; RUN: grep {ilhu.*515} %t1.s | count 2 +; RUN: grep xsbh %t1.s | count 2 +; RUN: grep sfh %t1.s | count 1 ; ModuleID = 'trunc.bc' target datalayout = "E-p:32:32:128-i1:8:128-i8:8:128-i16:16:128-i32:32:128-i64:32:128-f32:32:128-f64:64:128-v64:64:64-v128:128:128-a0:0:128-s0:128:128" @@ -41,23 +37,22 @@ ; ret i64 %0 ;} -define i8 @trunc_i64_i8(i64 %u, i8 %v) nounwind readnone { +define <16 x i8> @trunc_i64_i8(i64 %u, <16 x i8> %v) nounwind readnone { entry: %0 = trunc i64 %u to i8 - %1 = sub i8 %0, %v - ret i8 %1 + %tmp1 = insertelement <16 x i8> %v, i8 %0, i32 10 + ret <16 x i8> %tmp1 } -define i16 @trunc_i64_i16(i64 %u, i16 %v) nounwind readnone { +define <8 x i16> @trunc_i64_i16(i64 %u, <8 x i16> %v) nounwind readnone { entry: %0 = trunc i64 %u to i16 - %1 = sub i16 %0, %v - ret i16 %1 + %tmp1 = insertelement <8 x i16> %v, i16 %0, i32 6 + ret <8 x i16> %tmp1 } define i32 @trunc_i64_i32(i64 %u, i32 %v) nounwind readnone { entry: %0 = trunc i64 %u to i32 - %1 = sub i32 %0, %v - ret i32 %1 + ret i32 %0 } define i8 @trunc_i32_i8(i32 %u, i8 %v) nounwind readnone { @@ -66,16 +61,16 @@ %1 = sub i8 %0, %v ret i8 %1 } -define i16 @trunc_i32_i16(i32 %u, i16 %v) nounwind readnone { +define <8 x i16> @trunc_i32_i16(i32 %u, <8 x i16> %v) nounwind readnone { entry: %0 = trunc i32 %u to i16 - %1 = sub i16 %0, %v - ret i16 %1 + %tmp1 = insertelement <8 x i16> %v, i16 %0, i32 3 + ret <8 x i16> %tmp1 } -define i8 @trunc_i16_i8(i16 %u, i8 %v) nounwind readnone { +define <16 x i8> @trunc_i16_i8(i16 %u, <16 x i8> %v) nounwind readnone { entry: %0 = trunc i16 %u to i8 - %1 = sub i8 %0, %v - ret i8 %1 + %tmp1 = insertelement <16 x i8> %v, i8 %0, i32 5 + ret <16 x i8> %tmp1 } Added: llvm/trunk/test/CodeGen/CellSPU/useful-harnesses/i32operations.c URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/CellSPU/useful-harnesses/i32operations.c?rev=61447&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/CellSPU/useful-harnesses/i32operations.c (added)
+++ llvm/trunk/test/CodeGen/CellSPU/useful-harnesses/i32operations.c Fri Dec 26 22:51:36 2008
@@ -0,0 +1,69 @@
+#include <stdio.h>
+
+typedef unsigned int uint32_t;
+typedef int int32_t;
+
+const char *boolstring(int val) {
+  return val ? "true" : "false";
+}
+
+int i32_eq(int32_t a, int32_t b) {
+  return (a == b);
+}
+
+int i32_neq(int32_t a, int32_t b) {
+  return (a != b);
+}
+
+int32_t i32_eq_select(int32_t a, int32_t b, int32_t c, int32_t d) {
+  return ((a == b) ? c : d);
+}
+
+int32_t i32_neq_select(int32_t a, int32_t b, int32_t c, int32_t d) {
+  return ((a != b) ? c : d);
+}
+
+struct pred_s {
+  const char *name;
+  int (*predfunc)(int32_t, int32_t);
+  int (*selfunc)(int32_t, int32_t, int32_t, int32_t);
+};
+
+struct pred_s preds[] = {
+  { "eq", i32_eq, i32_eq_select },
+  { "neq", i32_neq, i32_neq_select }
+};
+
+int main(void) {
+  int i;
+  int32_t a = 1234567890;
+  int32_t b = 345678901;
+  int32_t c = 1234500000;
+  int32_t d = 10001;
+  int32_t e = 10000;
+
+  printf("a = %12d (0x%08x)\n", a, a);
+  printf("b = %12d (0x%08x)\n", b, b);
+  printf("c = %12d (0x%08x)\n", c, c);
+  printf("d = %12d (0x%08x)\n", d, d);
+  printf("e = %12d (0x%08x)\n", e, e);
+  printf("----------------------------------------\n");
+
+  for (i = 0; i < sizeof(preds)/sizeof(preds[0]); ++i) {
+    printf("a %s a = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(a, a)));
+    printf("a %s a = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(a, a)));
+    printf("a %s b = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(a, b)));
+    printf("a %s c = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(a, c)));
+    printf("d %s e = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(d, e)));
+    printf("e %s e = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(e, e)));
+
+    printf("a %s a ? c : d = %d\n", preds[i].name, (*preds[i].selfunc)(a, a, c, d));
+    printf("a %s a ? c : d == c (%s)\n", preds[i].name, boolstring((*preds[i].selfunc)(a, a, c, d) == c));
+    printf("a %s b ? c : d = %d\n", preds[i].name, (*preds[i].selfunc)(a, b, c, d));
+    printf("a %s b ? c : d == d (%s)\n", preds[i].name, boolstring((*preds[i].selfunc)(a, b, c, d) == d));
+
+    printf("----------------------------------------\n");
+  }
+
+  return 0;
+}

Added: llvm/trunk/test/CodeGen/CellSPU/useful-harnesses/i64operations.c
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/CellSPU/useful-harnesses/i64operations.c?rev=61447&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/CellSPU/useful-harnesses/i64operations.c (added)
+++ llvm/trunk/test/CodeGen/CellSPU/useful-harnesses/i64operations.c Fri Dec 26 22:51:36 2008
@@ -0,0 +1,68 @@
+#include <stdio.h>
+
+typedef unsigned long long int uint64_t;
+typedef long long int int64_t;
+
+const char *boolstring(int val) {
+  return val ? "true" : "false";
+}
+
+int i64_eq(int64_t a, int64_t b) {
+  return (a == b);
+}
+
+int i64_neq(int64_t a, int64_t b) {
+  return (a != b);
+}
+
+int64_t i64_eq_select(int64_t a, int64_t b, int64_t c, int64_t d) {
+  return ((a == b) ? c : d);
+}
+
+int64_t i64_neq_select(int64_t a, int64_t b, int64_t c, int64_t d) {
+  return ((a != b) ? c : d);
+}
+
+struct pred_s {
+  const char *name;
+  int (*predfunc)(int64_t, int64_t);
+  int64_t (*selfunc)(int64_t, int64_t, int64_t, int64_t);
+};
+
+struct pred_s preds[] = {
+  { "eq", i64_eq, i64_eq_select },
+  { "neq", i64_neq, i64_neq_select }
+};
+
+int main(void) {
+  int i;
+  int64_t a = 1234567890000LL;
+  int64_t b = 2345678901234LL;
+  int64_t c = 1234567890001LL;
+  int64_t d = 10001LL;
+  int64_t e = 10000LL;
+
+  printf("a = %16lld (0x%016llx)\n", a, a);
+  printf("b = %16lld (0x%016llx)\n", b, b);
+  printf("c = %16lld (0x%016llx)\n", c, c);
+  printf("d = %16lld (0x%016llx)\n", d, d);
+  printf("e = %16lld (0x%016llx)\n", e, e);
+  printf("----------------------------------------\n");
+
+  for (i = 0; i < sizeof(preds)/sizeof(preds[0]); ++i) {
+    printf("a %s a = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(a, a)));
+    printf("a %s b = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(a, b)));
+    printf("a %s c = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(a, c)));
+    printf("d %s e = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(d, e)));
+    printf("e %s e = %s\n", preds[i].name, boolstring((*preds[i].predfunc)(e, e)));
+
+    printf("a %s a ? c : d = %lld\n", preds[i].name, (*preds[i].selfunc)(a, a, c, d));
+    printf("a %s a ? c : d == c (%s)\n", preds[i].name, boolstring((*preds[i].selfunc)(a, a, c, d) == c));
+    printf("a %s b ? c : d = %lld\n", preds[i].name, (*preds[i].selfunc)(a, b, c, d));
+    printf("a %s b ? c : d == d (%s)\n", preds[i].name, boolstring((*preds[i].selfunc)(a, b, c, d) == d));
+
+    printf("----------------------------------------\n");
+  }
+
+  return 0;
+}

From sabre at nondot.org Sat Dec 27 01:47:41 2008
From: sabre at nondot.org (Chris Lattner)
Date: Sat, 27 Dec 2008 07:47:41 -0000
Subject: [llvm-commits] [llvm] r61448 - /llvm/trunk/include/llvm/AbstractTypeUser.h
Message-ID: <200812270747.mBR7lfHN006522@zion.cs.uiuc.edu>

Author: lattner
Date: Sat Dec 27 01:47:40 2008
New Revision: 61448

URL: http://llvm.org/viewvc/llvm-project?rev=61448&view=rev
Log:
implement simplify_type for PATypeHolder so that isa(PATypeHolder) works.

Modified:
    llvm/trunk/include/llvm/AbstractTypeUser.h

Modified: llvm/trunk/include/llvm/AbstractTypeUser.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/AbstractTypeUser.h?rev=61448&r1=61447&r2=61448&view=diff
==============================================================================
--- llvm/trunk/include/llvm/AbstractTypeUser.h (original)
+++ llvm/trunk/include/llvm/AbstractTypeUser.h Sat Dec 27 01:47:40 2008
@@ -33,6 +33,7 @@
 class Type;
 class DerivedType;
+template<typename T> struct simplify_type;

 /// The AbstractTypeUser class is an interface to be implemented by classes who
 /// could possibly use an abstract type. Abstract types are denoted by the
@@ -174,6 +175,21 @@
   void dropRef();
 };

+// simplify_type - Allow clients to treat uses just like values when using
+// casting operators.
+template<> struct simplify_type<PATypeHolder> {
+  typedef const Type* SimpleType;
+  static SimpleType getSimplifiedValue(const PATypeHolder &Val) {
+    return static_cast<SimpleType>(Val.get());
+  }
+};
+template<> struct simplify_type<const PATypeHolder> {
+  typedef const Type* SimpleType;
+  static SimpleType getSimplifiedValue(const PATypeHolder &Val) {
+    return static_cast<SimpleType>(Val.get());
+  }
+};
+
 } // End llvm namespace

 #endif

From sabre at nondot.org Sat Dec 27 02:10:47 2008
From: sabre at nondot.org (Chris Lattner)
Date: Sat, 27 Dec 2008 08:10:47 -0000
Subject: [llvm-commits] [llvm] r61449 - /llvm/trunk/test/Feature/testtype.ll
Message-ID: <200812270810.mBR8Amv3007198@zion.cs.uiuc.edu>

Author: lattner
Date: Sat Dec 27 02:10:46 2008
New Revision: 61449

URL: http://llvm.org/viewvc/llvm-project?rev=61449&view=rev
Log:
add testcase for type parsing.

Added:
    llvm/trunk/test/Feature/testtype.ll

Added: llvm/trunk/test/Feature/testtype.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Feature/testtype.ll?rev=61449&view=auto
==============================================================================
--- llvm/trunk/test/Feature/testtype.ll (added)
+++ llvm/trunk/test/Feature/testtype.ll Sat Dec 27 02:10:46 2008
@@ -0,0 +1,21 @@
+; RUN: llvm-as < %s | llvm-dis > %t1.ll
+; RUN: llvm-as %t1.ll -o - | llvm-dis > %t2.ll
+; RUN: diff %t1.ll %t2.ll
+
+%X = type i32* addrspace(4)*
+
+ %inners = type { float, { i8 } }
+ %struct = type { i32, %inners, i64 }
+
+%fwdref = type { %fwd* }
+%fwd = type %fwdref*
+
+; same as above with unnamed types
+type { %1* }
+type %0*
+%test = type %1
+
+%test2 = type [2 x i32]
+;%x = type %undefined*
+
+%test3 = type i32 (i32()*, float(...)*, ...)*

From nicholas at mxc.ca Sat Dec 27 10:20:55 2008
From: nicholas at mxc.ca (Nick Lewycky)
Date: Sat, 27 Dec 2008 16:20:55 -0000
Subject: [llvm-commits] [llvm] r61451 - in /llvm/trunk: lib/Analysis/IPA/Andersens.cpp test/Analysis/Andersens/2008-12-27-BuiltinWrongType.ll
Message-ID: <200812271620.mBRGKtoO031382@zion.cs.uiuc.edu>

Author: nicholas
Date: Sat Dec 27 10:20:53 2008
New Revision: 61451

URL: http://llvm.org/viewvc/llvm-project?rev=61451&view=rev
Log:
Check that the function prototypes are correct before assuming that the
parameters are pointers.

Added:
    llvm/trunk/test/Analysis/Andersens/2008-12-27-BuiltinWrongType.ll
Modified:
    llvm/trunk/lib/Analysis/IPA/Andersens.cpp

Modified: llvm/trunk/lib/Analysis/IPA/Andersens.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/IPA/Andersens.cpp?rev=61451&r1=61450&r2=61451&view=diff
==============================================================================
--- llvm/trunk/lib/Analysis/IPA/Andersens.cpp (original)
+++ llvm/trunk/lib/Analysis/IPA/Andersens.cpp Sat Dec 27 10:20:53 2008
@@ -945,30 +945,40 @@
       F->getName() == "llvm.memmove" ||
       F->getName() == "memmove") {

-    // *Dest = *Src, which requires an artificial graph node to represent the
-    // constraint. It is broken up into *Dest = temp, temp = *Src
-    unsigned FirstArg = getNode(CS.getArgument(0));
-    unsigned SecondArg = getNode(CS.getArgument(1));
-    unsigned TempArg = GraphNodes.size();
-    GraphNodes.push_back(Node());
-    Constraints.push_back(Constraint(Constraint::Store,
-                                     FirstArg, TempArg));
-    Constraints.push_back(Constraint(Constraint::Load,
-                                     TempArg, SecondArg));
-    // In addition, Dest = Src
-    Constraints.push_back(Constraint(Constraint::Copy,
-                                     FirstArg, SecondArg));
-    return true;
+    const FunctionType *FTy = F->getFunctionType();
+    if (FTy->getNumParams() > 1 &&
+        isa<PointerType>(FTy->getParamType(0)) &&
+        isa<PointerType>(FTy->getParamType(1))) {
+
+      // *Dest = *Src, which requires an artificial graph node to represent the
+      // constraint. It is broken up into *Dest = temp, temp = *Src
+      unsigned FirstArg = getNode(CS.getArgument(0));
+      unsigned SecondArg = getNode(CS.getArgument(1));
+      unsigned TempArg = GraphNodes.size();
+      GraphNodes.push_back(Node());
+      Constraints.push_back(Constraint(Constraint::Store,
+                                       FirstArg, TempArg));
+      Constraints.push_back(Constraint(Constraint::Load,
+                                       TempArg, SecondArg));
+      // In addition, Dest = Src
+      Constraints.push_back(Constraint(Constraint::Copy,
+                                       FirstArg, SecondArg));
+      return true;
+    }
   }

   // Result = Arg0
   if (F->getName() == "realloc" || F->getName() == "strchr" ||
       F->getName() == "strrchr" || F->getName() == "strstr" ||
       F->getName() == "strtok") {
-    Constraints.push_back(Constraint(Constraint::Copy,
-                                     getNode(CS.getInstruction()),
-                                     getNode(CS.getArgument(0))));
-    return true;
+    const FunctionType *FTy = F->getFunctionType();
+    if (FTy->getNumParams() > 0 &&
+        isa<PointerType>(FTy->getParamType(0))) {
+      Constraints.push_back(Constraint(Constraint::Copy,
+                                       getNode(CS.getInstruction()),
+                                       getNode(CS.getArgument(0))));
+      return true;
+    }
   }

   return false;

Added: llvm/trunk/test/Analysis/Andersens/2008-12-27-BuiltinWrongType.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/Andersens/2008-12-27-BuiltinWrongType.ll?rev=61451&view=auto
==============================================================================
--- llvm/trunk/test/Analysis/Andersens/2008-12-27-BuiltinWrongType.ll (added)
+++ llvm/trunk/test/Analysis/Andersens/2008-12-27-BuiltinWrongType.ll Sat Dec 27 10:20:53 2008
@@ -0,0 +1,19 @@
+; RUN: llvm-as < %s | opt -anders-aa
+; PR3262
+
+@.str15 = external global [3 x i8] ; <[3 x i8]*> [#uses=1]
+
+declare i8* @strtok(...)
+declare i8* @memmove(...)
+ +define void @test1(i8* %want1) nounwind { +entry: + %0 = call i8* (...)* @strtok(i32 0, i8* getelementptr ([3 x i8]* @.str15, i32 0, i32 0)) nounwind ; [#uses=0] + unreachable +} + +define void @test2() nounwind { +entry: + %0 = call i8* (...)* @memmove() + unreachable +} From wurstgebaeck at googlemail.com Sat Dec 27 10:16:35 2008 From: wurstgebaeck at googlemail.com (Jan Rehders) Date: Sat, 27 Dec 2008 17:16:35 +0100 Subject: [llvm-commits] [LLVMdev] ParseAssemblyString change of behaviour In-Reply-To: <422C0CB2-5D13-451A-A6C7-D94747F4BD6D@apple.com> References: <0E604620-7BAD-4492-A8EE-770C534EC180@gmail.com> <422C0CB2-5D13-451A-A6C7-D94747F4BD6D@apple.com> Message-ID: <08F95122-2FAF-45D6-84AB-56D097247DF2@gmail.com> Hi, here is a patch which fixes the bug for me. I'm not sure the format is right, I created it using git show (I started from the 2.4 source release, no svn version here). Let me know if you need some other format HTH, Jan On 23.12.2008, at 23:20, Chris Lattner wrote: > > On Dec 23, 2008, at 7:15 AM, Jan Rehders wrote: > >> Hi, >> >> when upgrading my compiler from LLVM 2.1 to 2.4 I stumbled upon a >> change of behaviour in ParseAssemblyString. For an interactive >> toplevel I am generating .ll source and feeding it into >> ParseAssemblyString like this: > > Hi Jan, > > I don't think that there is any intentional change here. It sounds > like a bug. > > -Chris > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev Original message: > when upgrading my compiler from LLVM 2.1 to 2.4 I stumbled upon a > change of behaviour in ParseAssemblyString. For an interactive > toplevel I am generating .ll source and feeding it into > ParseAssemblyString like this: > > Module* parsedModule = ParseAssemblyString( code, targetModule, > &errorInfo ); > > where targetModule is the module I expect all the LLVM code to go. 
> Until 2.1 the globals, types and functions in code where added to > targetModule. Since 2.2 this does not happen anymore. > > The documentation still states that targetModule is "A module to add > the assembly too.". Is the new behaviour a bug? If not why has this > been changed? > > And most important for me: how do I work around this? I intend to > copy the types, globals and functions from the newly parsed module > into my other module. Is this feasible or is there another > recommended way? (Using the LLVM API instead of generating .ll > source is not really practical for me at this time) -------------- next part -------------- A non-text attachment was scrubbed... Name: llvm-2.4.patch Type: application/octet-stream Size: 1527 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20081227/8dfb7233/attachment.obj From dpatel at apple.com Sat Dec 27 21:28:06 2008 From: dpatel at apple.com (Devang Patel) Date: Sat, 27 Dec 2008 19:28:06 -0800 Subject: [llvm-commits] [llvm] r61391 - /llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp In-Reply-To: <0E75F175-7039-4BA6-807E-039BCCD3E616@gmail.com> References: <200812232155.mBNLteni013987@zion.cs.uiuc.edu> <0E75F175-7039-4BA6-807E-039BCCD3E616@gmail.com> Message-ID: On Dec 23, 2008, at 7:38 PM, Bill Wendling wrote: > On Dec 23, 2008, at 1:55 PM, Devang Patel wrote: > >> Author: dpatel >> Date: Tue Dec 23 15:55:38 2008 >> New Revision: 61391 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=61391&view=rev >> Log: >> Fix typo. >> Silence unused variable warning. 
>>
>> Modified:
>> llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
>>
>> Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp?rev=61391&r1=61390&r2=61391&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfWriter.cpp Tue Dec 23 15:55:38 2008
>> @@ -1616,7 +1616,7 @@
>>    while (FromTy) {
>>      if (FromTy->getTag() != DW_TAG_typedef) {
>>        FieldSize = FromTy->getSize();
>> -      FieldAlign = FromTy->getSize();
>> +      FieldAlign = FromTy->getAlign();
>
> Did you mean to commit this with this patch? :-)

yup, this fixes a typo.
-
Devang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20081227/374c9e0f/attachment.html

From edwintorok at gmail.com Sun Dec 28 14:07:02 2008
From: edwintorok at gmail.com (Török Edwin)
Date: Sun, 28 Dec 2008 22:07:02 +0200
Subject: [llvm-commits] [PATCH] Teach IRBuilder about simplifying BinOp(Value, Constant)
Message-ID: <4957DC66.1010406@gmail.com>

Hi,

I have a pass that produces lots of and, or, xor where one of the
operands is a constant (0 or all ones). Initially I thought to run
instcombine afterwards to clean up, but now I think it is more
efficient to not create those instructions in the first place: why
create possibly hundreds of instructions just to delete them later?

I could do this "simplification" in my pass, but I thought that a
better place is IRBuilder, which already does constant folding by
default (when both operands are constant). Another alternative would
be a utility function SimplifyInstruction (similar to
ConstantFoldInstruction). Which one is more appropriate?
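
As a rough illustration of the idea in the message above (deciding before creation whether an and/or/xor with a constant 0 or all-ones operand needs an instruction at all), here is a minimal standalone sketch. It is not the attached patch and does not use the LLVM API: `BinOp`, `Simplified`, `simplifyWithConstant`, `evaluate`, and `checkAgainstEvaluation` are hypothetical names invented for this example, and values are modelled as plain 32-bit integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy model, NOT LLVM classes: the three bitwise opcodes
// mentioned in the message.
enum class BinOp { And, Or, Xor };

// Outcome of the pre-creation check.
struct Simplified {
  enum Kind { CreateInstruction, UseOperand, UseConstant };
  Kind kind;
  std::uint32_t constant;  // meaningful only when kind == UseConstant
};

// Decide what "op X, c" reduces to when c is 0 or all ones.
// Any other constant falls through to CreateInstruction.
Simplified simplifyWithConstant(BinOp op, std::uint32_t c) {
  const std::uint32_t allOnes = ~std::uint32_t(0);
  switch (op) {
  case BinOp::And:
    if (c == 0) return {Simplified::UseConstant, 0};        // X & 0  == 0
    if (c == allOnes) return {Simplified::UseOperand, 0};   // X & ~0 == X
    break;
  case BinOp::Or:
    if (c == 0) return {Simplified::UseOperand, 0};         // X | 0  == X
    if (c == allOnes) return {Simplified::UseConstant, allOnes};  // X | ~0 == ~0
    break;
  case BinOp::Xor:
    if (c == 0) return {Simplified::UseOperand, 0};         // X ^ 0  == X
    break;
  }
  return {Simplified::CreateInstruction, 0};
}

// Reference semantics of the three operations on concrete values.
std::uint32_t evaluate(BinOp op, std::uint32_t x, std::uint32_t c) {
  switch (op) {
  case BinOp::And: return x & c;
  case BinOp::Or:  return x | c;
  case BinOp::Xor: return x ^ c;
  }
  return 0;  // unreachable for valid opcodes
}

// For one concrete x, assert that the shortcut agrees with real evaluation.
void checkAgainstEvaluation(BinOp op, std::uint32_t x, std::uint32_t c) {
  Simplified s = simplifyWithConstant(op, c);
  if (s.kind == Simplified::UseOperand)
    assert(evaluate(op, x, c) == x);
  else if (s.kind == Simplified::UseConstant)
    assert(evaluate(op, x, c) == s.constant);
}
```

A builder-style wrapper would call `simplifyWithConstant` first and only construct a new instruction when it reports `CreateInstruction`; whether that check belongs in IRBuilder or in a separate utility is exactly the question the message raises.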

The attached patch adds another template parameter (that defaults to a
no-op implementation) to simplify BinOps where one operand is constant.

There is also a BinOpSimplifier class in my patch that handles some
very simple (and common for me) cases:

* I have only implemented very simple cases; more complex cases belong
  to instcombine (and AFAICT they are there already)
* if instruction already exists in BB before or at insertion point,
  reuse it
  If BinOpSimplifier ever becomes the default for IRBuilder, this could
  cause problems if somebody inserts an instruction using IRBuilder,
  and later uses RAUW if !isa, because it can potentially replace more
  than intended!
* if the outcome of a binop is known to be a constant, return the
  constant:
  and X, 0 -> 0; and X, ~0 -> X; or X, 0 -> X; or X, ~0 -> ~0;
  xor X, 0 -> X;
* simplify chained XORs with same constant:
  If we want to create xor %1, C; and there is a "%1 = xor %0, C",
  then return %0

While we're here I also implemented:
* add X, 0 -> X
* sub X, 0 -> X
* mul X, 0 -> 0
* mul X, 1 -> X
* udiv/sdiv 0, X -> 0
* udiv/sdiv X, 1 -> X
* urem X, 1 -> 0

I haven't implemented shifts, because I am not sure how complicated
we'd want to make this: use ComputeMaskedBits, and make decision based
on that, or just simplify shift by 0, and shift > bitwidth?

I am submitting this for discussion, and to see if I am going in the
right direction; the patch currently lacks a test case, for example.

What do you think?
Is this patch acceptable as a default no-op (using NoSimplifier in
IRBuilder)?
Does anybody want this to be default on? I can think of clang as a
possible user (besides my pass).
llvm-gcc -O0 appears to do this kind of simplification by default.

Best regards,
--Edwin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: simplifier.patch Type: text/x-patch Size: 16322 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20081228/c2e10f5f/attachment.bin From resistor at mac.com Sun Dec 28 15:48:50 2008 From: resistor at mac.com (Owen Anderson) Date: Sun, 28 Dec 2008 21:48:50 -0000 Subject: [llvm-commits] [llvm] r61458 - /llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp Message-ID: <200812282148.mBSLmoGs019936@zion.cs.uiuc.edu> Author: resistor Date: Sun Dec 28 15:48:48 2008 New Revision: 61458 URL: http://llvm.org/viewvc/llvm-project?rev=61458&view=rev Log: Add prototype code for recomputing a live interval's ranges and valnos through recursive phi construction. Modified: llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp Modified: llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp?rev=61458&r1=61457&r2=61458&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp (original) +++ llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp Sun Dec 28 15:48:48 2008 @@ -171,7 +171,15 @@ int& SS, SmallPtrSet& RefsInMBB); void RenumberValno(VNInfo* VN); - }; + void ReconstructLiveInterval(LiveInterval* LI); + VNInfo* PerformPHIConstruction(MachineBasicBlock::iterator use, + LiveInterval* LI, + DenseMap >& Defs, + DenseMap >& Uses, + DenseMap& NewVNs, + DenseMap& Visited, + bool toplevel = false); +}; } // end anonymous namespace char PreAllocSplitting::ID = 0; @@ -577,6 +585,257 @@ return LastMI; } +/// PerformPHIConstruction - From properly set up use and def lists, use a PHI +/// construction algorithm to compute the ranges and valnos for an interval. +VNInfo* PreAllocSplitting::PerformPHIConstruction( + MachineBasicBlock::iterator use, + LiveInterval* LI, + DenseMap >& Defs, + DenseMap >& Uses, + DenseMap& NewVNs, + DenseMap& Visited, + bool toplevel) { + // Return memoized result if it's available. 
+ if (Visited.count(use->getParent())) + return Visited[use->getParent()]; + + typedef DenseMap > RegMap; + + // Check if our block contains any uses or defs. + bool ContainsDefs = Defs.count(use->getParent()); + bool ContainsUses = Uses.count(use->getParent()); + + VNInfo* ret = 0; + + // Enumerate the cases of use/def contaning blocks. + if (!ContainsDefs && !ContainsUses) { + Fallback: + // NOTE: Because this is the fallback case from other cases, we do NOT + // assume that we are not at toplevel here. + + // If there are no uses or defs between our starting point and the beginning + // of the block, then recursive perform phi construction on our predecessors + MachineBasicBlock* MBB = use->getParent(); + DenseMap IncomingVNs; + for (MachineBasicBlock::pred_iterator PI = MBB->pred_begin(), + PE = MBB->pred_end(); PI != PE; ++PI) { + VNInfo* Incoming = PerformPHIConstruction((*PI)->end(), LI, Defs, Uses, + NewVNs, Visited, false); + IncomingVNs[*PI] = Incoming; + } + + // If only one VNInfo came back from our predecessors, just use that one... + if (IncomingVNs.size() == 1) { + ret = IncomingVNs.begin()->second; + unsigned StartIndex = LIs->getMBBStartIdx(use->getParent()); + unsigned EndIndex = 0; + if (toplevel) { + EndIndex = LIs->getInstructionIndex(use); + EndIndex = LiveIntervals::getUseIndex(EndIndex); + } else + EndIndex = LIs->getMBBEndIdx(use->getParent()); + + LI->addRange(LiveRange(StartIndex, EndIndex, ret)); + } else { + // Otherwise, merge the incoming VNInfos with a phi join. Create a new + // VNInfo to represent the joined value. 
+ for (DenseMap::iterator I = + IncomingVNs.begin(), E = IncomingVNs.end(); I != E; ++I) { + I->second->hasPHIKill = true; + unsigned KillIndex = LIs->getMBBEndIdx(I->first); + LI->addKill(I->second, KillIndex); + } + + unsigned StartIndex = LIs->getMBBStartIdx(use->getParent()); + unsigned EndIndex = 0; + if (toplevel) { + EndIndex = LIs->getInstructionIndex(use); + EndIndex = LiveIntervals::getUseIndex(EndIndex); + } else + EndIndex = LIs->getMBBEndIdx(use->getParent()); + ret = LI->getNextValue(StartIndex, /*FIXME*/ 0, + LIs->getVNInfoAllocator()); + LI->addRange(LiveRange(StartIndex, EndIndex, ret)); + } + } else if (ContainsDefs && !ContainsUses) { + SmallPtrSet& BlockDefs = Defs[use->getParent()]; + + // Search for the def in this block. If we don't find it before the + // instruction we care about, go to the fallback case. Note that that + // should never happen: this cannot be a toplevel block, so use should + // always be an end() iterator. + assert(use == use->getParent()->end() && "No use marked in toplevel block"); + + MachineBasicBlock::iterator walker = use; + --walker; + while (walker != use->getParent()->begin()) + if (BlockDefs.count(walker)) { + break; + } else + --walker; + + // Once we've found it, extend its VNInfo to our instruction. + unsigned DefIndex = LIs->getInstructionIndex(walker); + DefIndex = LiveIntervals::getDefIndex(DefIndex); + unsigned EndIndex = LIs->getMBBEndIdx(use->getParent()); + + ret = NewVNs[walker]; + LI->addRange(LiveRange(DefIndex, EndIndex, ret)); + } else if (!ContainsDefs && ContainsUses) { + SmallPtrSet& BlockUses = Uses[use->getParent()]; + + // Search for the use in this block that precedes the instruction we care + // about, going to the fallback case if we don't find it. 
+ + if (use == use->getParent()->begin()) + goto Fallback; + + MachineBasicBlock::iterator walker = use; + --walker; + bool found = false; + while (walker != use->getParent()->begin()) + if (BlockUses.count(walker)) { + found = true; + break; + } else + --walker; + + // Must check begin() too. + if (!found) + if (BlockUses.count(walker)) + found = true; + else + goto Fallback; + + unsigned UseIndex = LIs->getInstructionIndex(walker); + UseIndex = LiveIntervals::getUseIndex(UseIndex); + unsigned EndIndex = 0; + if (toplevel) { + EndIndex = LIs->getInstructionIndex(walker); + EndIndex = LiveIntervals::getUseIndex(EndIndex); + } else + EndIndex = LIs->getMBBEndIdx(use->getParent()); + + // Now, recursively phi construct the VNInfo for the use we found, + // and then extend it to include the instruction we care about + ret = PerformPHIConstruction(walker, LI, Defs, Uses, + NewVNs, Visited, false); + + // FIXME: Need to set kills properly for inter-block stuff. + if (toplevel) { + if (LI->isKill(ret, UseIndex)) LI->removeKill(ret, UseIndex); + LI->addKill(ret, EndIndex); + } + + LI->addRange(LiveRange(UseIndex, EndIndex, ret)); + } else if (ContainsDefs && ContainsUses){ + SmallPtrSet& BlockDefs = Defs[use->getParent()]; + SmallPtrSet& BlockUses = Uses[use->getParent()]; + + // This case is basically a merging of the two preceding case, with the + // special note that checking for defs must take precedence over checking + // for uses, because of two-address instructions. + + if (use == use->getParent()->begin()) + goto Fallback; + + MachineBasicBlock::iterator walker = use; + --walker; + bool foundDef = false; + bool foundUse = false; + while (walker != use->getParent()->begin()) + if (BlockDefs.count(walker)) { + foundDef = true; + break; + } else if (BlockUses.count(walker)) { + foundUse = true; + break; + } else + --walker; + + // Must check begin() too. 
+ if (!foundDef && !foundUse) + if (BlockDefs.count(walker)) + foundDef = true; + else if (BlockUses.count(walker)) + foundUse = true; + else + goto Fallback; + + unsigned StartIndex = LIs->getInstructionIndex(walker); + StartIndex = foundDef ? LiveIntervals::getDefIndex(StartIndex) : + LiveIntervals::getUseIndex(StartIndex); + unsigned EndIndex = 0; + if (toplevel) { + EndIndex = LIs->getInstructionIndex(walker); + EndIndex = LiveIntervals::getUseIndex(EndIndex); + } else + EndIndex = LIs->getMBBEndIdx(use->getParent()); + + if (foundDef) + ret = NewVNs[walker]; + else + ret = PerformPHIConstruction(walker, LI, Defs, Uses, + NewVNs, Visited, false); + + // FIXME: Need to set kills properly for inter-block stuff. + if (toplevel) { + if (foundUse && LI->isKill(ret, StartIndex)) + LI->removeKill(ret, StartIndex); + LI->addKill(ret, EndIndex); + } + + LI->addRange(LiveRange(StartIndex, EndIndex, ret)); + } + + // Memoize results so we don't have to recompute them. + if (!toplevel) Visited[use->getParent()] = ret; + + return ret; +} + +/// ReconstructLiveInterval - Recompute a live interval from scratch. +void PreAllocSplitting::ReconstructLiveInterval(LiveInterval* LI) { + BumpPtrAllocator& Alloc = LIs->getVNInfoAllocator(); + + // Clear the old ranges and valnos; + LI->clear(); + + // Cache the uses and defs of the register + typedef DenseMap > RegMap; + RegMap Defs, Uses; + + // Keep track of the new VNs we're creating. + DenseMap NewVNs; + SmallPtrSet PhiVNs; + + // Cache defs, and create a new VNInfo for each def. + for (MachineRegisterInfo::def_iterator DI = MRI->def_begin(LI->reg), + DE = MRI->def_end(); DI != DE; ++DI) { + Defs[(*DI).getParent()].insert(&*DI); + + unsigned DefIdx = LIs->getInstructionIndex(&*DI); + DefIdx = LiveIntervals::getDefIndex(DefIdx); + + VNInfo* NewVN = LI->getNextValue(DefIdx, /*FIXME*/ 0, Alloc); + NewVNs[&*DI] = NewVN; + } + + // Cache uses as a separate pass from actually processing them. 
+ for (MachineRegisterInfo::use_iterator UI = MRI->use_begin(LI->reg), + UE = MRI->use_end(); UI != UE; ++UI) + Uses[(*UI).getParent()].insert(&*UI); + + // Now, actually process every use and use a phi construction algorithm + // to walk from it to its reaching definitions, building VNInfos along + // the way. + for (MachineRegisterInfo::use_iterator UI = MRI->use_begin(LI->reg), + UE = MRI->use_end(); UI != UE; ++UI) { + DenseMap Visited; + PerformPHIConstruction(&*UI, LI, Defs, Uses, NewVNs, Visited, true); + } +} + /// ShrinkWrapLiveInterval - Recursively traverse the predecessor /// chain to find the new 'kills' and shrink wrap the live interval to the /// new kill indices. From resistor at mac.com Sun Dec 28 15:57:03 2008 From: resistor at mac.com (Owen Anderson) Date: Sun, 28 Dec 2008 21:57:03 -0000 Subject: [llvm-commits] [llvm] r61459 - /llvm/trunk/include/llvm/CodeGen/LiveInterval.h Message-ID: <200812282157.mBSLv4lB020162@zion.cs.uiuc.edu> Author: resistor Date: Sun Dec 28 15:57:02 2008 New Revision: 61459 URL: http://llvm.org/viewvc/llvm-project?rev=61459&view=rev Log: Forgot to commit this file. Add a clear() method to remove all ranges and value numbers for a live interval. Modified: llvm/trunk/include/llvm/CodeGen/LiveInterval.h Modified: llvm/trunk/include/llvm/CodeGen/LiveInterval.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/LiveInterval.h?rev=61459&r1=61458&r2=61459&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/LiveInterval.h (original) +++ llvm/trunk/include/llvm/CodeGen/LiveInterval.h Sun Dec 28 15:57:02 2008 @@ -145,6 +145,16 @@ while (I->end <= Pos) ++I; return I; } + + void clear() { + while (!valnos.empty()) { + VNInfo *VNI = valnos.back(); + valnos.pop_back(); + VNI->~VNInfo(); + } + + ranges.clear(); + } /// isStackSlot - Return true if this is a stack slot interval. 
/// From resistor at mac.com Sun Dec 28 17:35:14 2008 From: resistor at mac.com (Owen Anderson) Date: Sun, 28 Dec 2008 23:35:14 -0000 Subject: [llvm-commits] [llvm] r61460 - /llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp Message-ID: <200812282335.mBSNZEqd023322@zion.cs.uiuc.edu> Author: resistor Date: Sun Dec 28 17:35:13 2008 New Revision: 61460 URL: http://llvm.org/viewvc/llvm-project?rev=61460&view=rev Log: Fix up kill/dead marking in the new live interval reconstruction code. Modified: llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp Modified: llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp?rev=61460&r1=61459&r2=61460&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp (original) +++ llvm/trunk/lib/CodeGen/PreAllocSplitting.cpp Sun Dec 28 17:35:13 2008 @@ -722,10 +722,9 @@ NewVNs, Visited, false); // FIXME: Need to set kills properly for inter-block stuff. - if (toplevel) { - if (LI->isKill(ret, UseIndex)) LI->removeKill(ret, UseIndex); + if (LI->isKill(ret, UseIndex)) LI->removeKill(ret, UseIndex); + if (toplevel) LI->addKill(ret, EndIndex); - } LI->addRange(LiveRange(UseIndex, EndIndex, ret)); } else if (ContainsDefs && ContainsUses){ @@ -778,10 +777,9 @@ ret = PerformPHIConstruction(walker, LI, Defs, Uses, NewVNs, Visited, false); - // FIXME: Need to set kills properly for inter-block stuff. 
+ if (foundUse && LI->isKill(ret, StartIndex)) + LI->removeKill(ret, StartIndex); if (toplevel) { - if (foundUse && LI->isKill(ret, StartIndex)) - LI->removeKill(ret, StartIndex); LI->addKill(ret, EndIndex); } @@ -834,6 +832,20 @@ DenseMap Visited; PerformPHIConstruction(&*UI, LI, Defs, Uses, NewVNs, Visited, true); } + + // Add ranges for dead defs + for (MachineRegisterInfo::def_iterator DI = MRI->def_begin(LI->reg), + DE = MRI->def_end(); DI != DE; ++DI) { + unsigned DefIdx = LIs->getInstructionIndex(&*DI); + DefIdx = LiveIntervals::getDefIndex(DefIdx); + unsigned UseIdx = LiveIntervals::getUseIndex(DefIdx); + + if (LI->liveAt(DefIdx)) continue; + + VNInfo* DeadVN = NewVNs[&*DI]; + LI->addRange(LiveRange(DefIdx, UseIdx, DeadVN)); + LI->addKill(DeadVN, DefIdx); + } } /// ShrinkWrapLiveInterval - Recursively traverse the predecessor From sabre at nondot.org Sun Dec 28 18:12:51 2008 From: sabre at nondot.org (Chris Lattner) Date: Mon, 29 Dec 2008 00:12:51 -0000 Subject: [llvm-commits] [llvm] r61461 - in /llvm/trunk: include/llvm/Instructions.h lib/VMCore/Instructions.cpp lib/VMCore/Verifier.cpp Message-ID: <200812290012.mBT0Cp80024442@zion.cs.uiuc.edu> Author: lattner Date: Sun Dec 28 18:12:50 2008 New Revision: 61461 URL: http://llvm.org/viewvc/llvm-project?rev=61461&view=rev Log: move select validation logic into a shared place where the select ctor, verifier, asm parser, etc can share it. 
Modified: llvm/trunk/include/llvm/Instructions.h llvm/trunk/lib/VMCore/Instructions.cpp llvm/trunk/lib/VMCore/Verifier.cpp Modified: llvm/trunk/include/llvm/Instructions.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Instructions.h?rev=61461&r1=61460&r2=61461&view=diff ============================================================================== --- llvm/trunk/include/llvm/Instructions.h (original) +++ llvm/trunk/include/llvm/Instructions.h Sun Dec 28 18:12:50 2008 @@ -1208,6 +1208,7 @@ /// class SelectInst : public Instruction { void init(Value *C, Value *S1, Value *S2) { + assert(!areInvalidOperands(C, S1, S2) && "Invalid operands for select"); Op<0>() = C; Op<1>() = S1; Op<2>() = S2; @@ -1246,6 +1247,10 @@ Value *getCondition() const { return Op<0>(); } Value *getTrueValue() const { return Op<1>(); } Value *getFalseValue() const { return Op<2>(); } + + /// areInvalidOperands - Return a string if the specified operands are invalid + /// for a select operation, otherwise return null. + static const char *areInvalidOperands(Value *Cond, Value *True, Value *False); /// Transparently provide more efficient getOperand methods. DECLARE_TRANSPARENT_OPERAND_ACCESSORS(Value); Modified: llvm/trunk/lib/VMCore/Instructions.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Instructions.cpp?rev=61461&r1=61460&r2=61461&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Instructions.cpp (original) +++ llvm/trunk/lib/VMCore/Instructions.cpp Sun Dec 28 18:12:50 2008 @@ -139,6 +139,33 @@ } //===----------------------------------------------------------------------===// +// SelectInst Class +//===----------------------------------------------------------------------===// + +/// areInvalidOperands - Return a string if the specified operands are invalid +/// for a select operation, otherwise return null. 
+const char *SelectInst::areInvalidOperands(Value *Op0, Value *Op1, Value *Op2) { + if (Op1->getType() != Op2->getType()) + return "both values to select must have same type"; + + if (const VectorType *VT = dyn_cast(Op0->getType())) { + // Vector select. + if (VT->getElementType() != Type::Int1Ty) + return "vector select condition element type must be i1"; + const VectorType *ET = dyn_cast(Op1->getType()); + if (ET == 0) + return "selected values for vector select must be vectors"; + if (ET->getNumElements() != VT->getNumElements()) + return "vector select requires selected vectors to have " + "the same vector length as select condition"; + } else if (Op0->getType() != Type::Int1Ty) { + return "select condition must be i1 or "; + } + return 0; +} + + +//===----------------------------------------------------------------------===// // PHINode Class //===----------------------------------------------------------------------===// Modified: llvm/trunk/lib/VMCore/Verifier.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Verifier.cpp?rev=61461&r1=61460&r2=61461&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Verifier.cpp (original) +++ llvm/trunk/lib/VMCore/Verifier.cpp Sun Dec 28 18:12:50 2008 @@ -687,23 +687,10 @@ } void Verifier::visitSelectInst(SelectInst &SI) { - if (const VectorType* vt - = dyn_cast(SI.getCondition()->getType())) { - Assert1( vt->getElementType() == Type::Int1Ty, - "Select condition type must be vector of bool!", &SI); - if (const VectorType* val_vt - = dyn_cast(SI.getTrueValue()->getType())) { - Assert1( vt->getNumElements() == val_vt->getNumElements(), - "Select vector size != value vector size", &SI); - } else { - Assert1(0, "Vector select values must have vector types", &SI); - } - } else { - Assert1(SI.getCondition()->getType() == Type::Int1Ty, - "Select condition type must be bool!", &SI); - } - Assert1(SI.getTrueValue()->getType() == 
SI.getFalseValue()->getType(), - "Select values must have identical types!", &SI); + Assert1(!SelectInst::areInvalidOperands(SI.getOperand(0), SI.getOperand(1), + SI.getOperand(2)), + "Invalid operands for select instruction!", &SI); + Assert1(SI.getTrueValue()->getType() == SI.getType(), "Select values must have same type as select instruction!", &SI); visitInstruction(SI); From sabre at nondot.org Sun Dec 28 18:16:13 2008 From: sabre at nondot.org (Chris Lattner) Date: Mon, 29 Dec 2008 00:16:13 -0000 Subject: [llvm-commits] [llvm] r61462 - /llvm/trunk/lib/VMCore/Constants.cpp Message-ID: <200812290016.mBT0GDGX024558@zion.cs.uiuc.edu> Author: lattner Date: Sun Dec 28 18:16:12 2008 New Revision: 61462 URL: http://llvm.org/viewvc/llvm-project?rev=61462&view=rev Log: select constant exprs should have the same constraints as select instructions, notably, they should support vectors and aggregates. Modified: llvm/trunk/lib/VMCore/Constants.cpp Modified: llvm/trunk/lib/VMCore/Constants.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Constants.cpp?rev=61462&r1=61461&r2=61462&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Constants.cpp (original) +++ llvm/trunk/lib/VMCore/Constants.cpp Sun Dec 28 18:16:12 2008 @@ -2111,9 +2111,7 @@ Constant *ConstantExpr::getSelectTy(const Type *ReqTy, Constant *C, Constant *V1, Constant *V2) { - assert(C->getType() == Type::Int1Ty && "Select condition must be i1!"); - assert(V1->getType() == V2->getType() && "Select value types must match!"); - assert(V1->getType()->isFirstClassType() && "Cannot select aggregate type!"); + assert(!SelectInst::areInvalidOperands(C, V1, V2)&&"Invalid select operands"); if (ReqTy == V1->getType()) if (Constant *SC = ConstantFoldSelectInstruction(C, V1, V2)) From scottm at aero.org Sun Dec 28 21:21:38 2008 From: scottm at aero.org (Scott Michel) Date: Mon, 29 Dec 2008 03:21:38 -0000 Subject: [llvm-commits] [llvm] r61463 - 
/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
Message-ID: <200812290321.mBT3Lc6N030531@zion.cs.uiuc.edu>

Author: pingbak
Date: Sun Dec 28 21:21:37 2008
New Revision: 61463

URL: http://llvm.org/viewvc/llvm-project?rev=61463&view=rev
Log:
Teach LegalizeDAG that i64 mul can be a libcall.

Modified:
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp?rev=61463&r1=61462&r2=61463&view=diff

==============================================================================
--- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Sun Dec 28 21:21:37 2008
@@ -3281,6 +3281,8 @@
   case ISD::MUL:
     if (VT == MVT::i32)
       LC = RTLIB::MUL_I32;
+    else if (VT == MVT::i64)
+      LC = RTLIB::MUL_I64;
     break;
   case ISD::FPOW:
     LC = GetFPLibCall(VT, RTLIB::POW_F32, RTLIB::POW_F64, RTLIB::POW_F80,

From scottm at aero.org Sun Dec 28 21:23:37 2008
From: scottm at aero.org (Scott Michel)
Date: Mon, 29 Dec 2008 03:23:37 -0000
Subject: [llvm-commits] [llvm] r61464 - in /llvm/trunk/lib/Target/CellSPU: SPUISelLowering.cpp SPUInstrInfo.td SPUNodes.td
Message-ID: <200812290323.mBT3Nc2G030602@zion.cs.uiuc.edu>

Author: pingbak
Date: Sun Dec 28 21:23:36 2008
New Revision: 61464

URL: http://llvm.org/viewvc/llvm-project?rev=61464&view=rev
Log:
- Various '#if 0' cleanups.
- Move v4i32, i32 mul into SPUInstrInfo.td, with a few more instruction
  cleanups there as well.
- Make SMUL_LOHI, UMUL_LOHI completely illegal for Cell SPU, to better
  assist Chris to see the problem in bug 3101.

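
For readers following the i32 mul item in the log above: the lowering code this commit removes from SPUISelLowering.cpp builds a 32-bit product as MPYH(rA, rB) + MPYH(rB, rA) + MPYU(rA, rB). The sketch below checks that identity on plain integers. It assumes the usual reading of the two SPU operations (MPYH: high 16 bits of the first operand times low 16 bits of the second, shifted left by 16; MPYU: unsigned product of the two low 16-bit halves); the function names `mpyh`, `mpyu`, `mul32`, and `checkMul32` are made up for this example and are not LLVM or SPU APIs.

```cpp
#include <cassert>
#include <cstdint>

// Assumed model of MPYH: (high half of a) * (low half of b), shifted left 16.
// Arithmetic is unsigned, so the shift wraps modulo 2^32.
inline std::uint32_t mpyh(std::uint32_t a, std::uint32_t b) {
  return ((a >> 16) * (b & 0xFFFFu)) << 16;
}

// Assumed model of MPYU: unsigned product of the two low 16-bit halves.
inline std::uint32_t mpyu(std::uint32_t a, std::uint32_t b) {
  return (a & 0xFFFFu) * (b & 0xFFFFu);
}

// 32-bit product assembled from 16-bit partial products. The ah*bh term
// is omitted because it only affects bits 32 and above.
inline std::uint32_t mul32(std::uint32_t a, std::uint32_t b) {
  return mpyh(a, b) + mpyh(b, a) + mpyu(a, b);
}

// Compare the decomposition against native wrapping multiplication.
inline void checkMul32(std::uint32_t a, std::uint32_t b) {
  assert(mul32(a, b) == a * b);
}
```

Writing a = ah*2^16 + al and b = bh*2^16 + bl, the product modulo 2^32 is (ah*bl + bh*al)*2^16 + al*bl, which is the sum of the three terms above.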
Modified: llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td llvm/trunk/lib/Target/CellSPU/SPUNodes.td Modified: llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp?rev=61464&r1=61463&r2=61464&view=diff ============================================================================== --- llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp (original) +++ llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp Sun Dec 28 21:23:36 2008 @@ -124,6 +124,10 @@ setLoadExtAction(ISD::ZEXTLOAD, VT, Custom); setLoadExtAction(ISD::SEXTLOAD, VT, Custom); + // SMUL_LOHI, UMUL_LOHI are not legal for Cell: + setOperationAction(ISD::SMUL_LOHI, VT, Expand); + setOperationAction(ISD::UMUL_LOHI, VT, Expand); + for (unsigned stype = sctype - 1; stype >= (unsigned) MVT::i8; --stype) { MVT StoreVT = (MVT::SimpleValueType) stype; setTruncStoreAction(VT, StoreVT, Expand); @@ -207,7 +211,7 @@ // Custom lower i8, i32 and i64 multiplications setOperationAction(ISD::MUL, MVT::i8, Custom); - setOperationAction(ISD::MUL, MVT::i32, Custom); + setOperationAction(ISD::MUL, MVT::i32, Legal); setOperationAction(ISD::MUL, MVT::i64, Expand); // libcall // Need to custom handle (some) common i8, i64 math ops @@ -239,8 +243,8 @@ setOperationAction(ISD::SETCC, MVT::i8, Legal); setOperationAction(ISD::SETCC, MVT::i16, Legal); - setOperationAction(ISD::SETCC, MVT::i32, Custom); - setOperationAction(ISD::SETCC, MVT::i64, Custom); + setOperationAction(ISD::SETCC, MVT::i32, Legal); + setOperationAction(ISD::SETCC, MVT::i64, Legal); // Zero extension and sign extension for i64 have to be // custom legalized @@ -289,9 +293,9 @@ ++sctype) { MVT VT = (MVT::SimpleValueType)sctype; - setOperationAction(ISD::GlobalAddress, VT, Custom); - setOperationAction(ISD::ConstantPool, VT, Custom); - setOperationAction(ISD::JumpTable, VT, Custom); + setOperationAction(ISD::GlobalAddress, VT, Custom); + 
setOperationAction(ISD::ConstantPool, VT, Custom); + setOperationAction(ISD::JumpTable, VT, Custom); } // RET must be custom lowered, to meet ABI requirements @@ -362,12 +366,15 @@ setOperationAction(ISD::VECTOR_SHUFFLE, VT, Custom); } - setOperationAction(ISD::MUL, MVT::v16i8, Custom); setOperationAction(ISD::AND, MVT::v16i8, Custom); setOperationAction(ISD::OR, MVT::v16i8, Custom); setOperationAction(ISD::XOR, MVT::v16i8, Custom); setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v4f32, Custom); + // FIXME: This is only temporary until I put all vector multiplications in + // SPUInstrInfo.td: + setOperationAction(ISD::MUL, MVT::v4i32, Legal); + setShiftAmountType(MVT::i32); setBooleanContents(ZeroOrNegativeOneBooleanContent); @@ -402,7 +409,7 @@ node_names[(unsigned) SPUISD::SHUFB] = "SPUISD::SHUFB"; node_names[(unsigned) SPUISD::SHUFFLE_MASK] = "SPUISD::SHUFFLE_MASK"; node_names[(unsigned) SPUISD::CNTB] = "SPUISD::CNTB"; - node_names[(unsigned) SPUISD::PREFSLOT2VEC] = "SPUISD::PROMOTE_SCALAR"; + node_names[(unsigned) SPUISD::PREFSLOT2VEC] = "SPUISD::PREFSLOT2VEC"; node_names[(unsigned) SPUISD::VEC2PREFSLOT] = "SPUISD::VEC2PREFSLOT"; node_names[(unsigned) SPUISD::MPY] = "SPUISD::MPY"; node_names[(unsigned) SPUISD::MPYU] = "SPUISD::MPYU"; @@ -467,9 +474,9 @@ emitted, e.g. 
for MVT::f32 extending load to MVT::f64: \verbatim -%1 v16i8,ch = load +%1 v16i8,ch = load %2 v16i8,ch = rotate %1 -%3 v4f8, ch = bitconvert %2 +%3 v4f8, ch = bitconvert %2 %4 f32 = vec2perfslot %3 %5 f64 = fp_extend %4 \endverbatim @@ -902,7 +909,7 @@ assert((FP != 0) && "LowerConstantFP: Node is not ConstantFPSDNode"); - + uint64_t dbits = DoubleToBits(FP->getValueAPF().convertToDouble()); SDValue T = DAG.getConstant(dbits, MVT::i64); SDValue Tvec = DAG.getNode(ISD::BUILD_VECTOR, MVT::v2i64, T, T); @@ -936,7 +943,7 @@ return DAG.getNode(ISD::BRCOND, Op.getValueType(), Op.getOperand(0), Cond, Op.getOperand(2)); } - + return SDValue(); // Unchanged } @@ -1197,9 +1204,18 @@ // address pairs: Callee = DAG.getNode(SPUISD::IndirectAddr, PtrVT, GA, Zero); } - } else if (ExternalSymbolSDNode *S = dyn_cast<ExternalSymbolSDNode>(Callee)) - Callee = DAG.getExternalSymbol(S->getSymbol(), Callee.getValueType()); - else if (SDNode *Dest = isLSAAddress(Callee, DAG)) { + } else if (ExternalSymbolSDNode *S = dyn_cast<ExternalSymbolSDNode>(Callee)) { + MVT CalleeVT = Callee.getValueType(); + SDValue Zero = DAG.getConstant(0, PtrVT); + SDValue ExtSym = DAG.getTargetExternalSymbol(S->getSymbol(), + Callee.getValueType()); + + if (!ST->usingLargeMem()) { + Callee = DAG.getNode(SPUISD::AFormAddr, CalleeVT, ExtSym, Zero); + } else { + Callee = DAG.getNode(SPUISD::IndirectAddr, PtrVT, ExtSym, Zero); + } + } else if (SDNode *Dest = isLSAAddress(Callee, DAG)) { // If this is an absolute destination address that appears to be a legal // local store address, use the munged value. 
Callee = SDValue(Dest, 0); @@ -1831,7 +1847,7 @@ return DAG.getNode(SPUISD::SHUFB, V1.getValueType(), V2, V1, ShufMaskOp); } else if (rotate) { int rotamt = (MaxElts - V0Elt) * EltVT.getSizeInBits()/8; - + return DAG.getNode(SPUISD::ROTBYTES_LEFT, V1.getValueType(), V1, DAG.getConstant(rotamt, MVT::i16)); } else { @@ -1915,17 +1931,8 @@ abort(); /*NOTREACHED*/ - case MVT::v4i32: { - SDValue rA = Op.getOperand(0); - SDValue rB = Op.getOperand(1); - SDValue HiProd1 = DAG.getNode(SPUISD::MPYH, MVT::v4i32, rA, rB); - SDValue HiProd2 = DAG.getNode(SPUISD::MPYH, MVT::v4i32, rB, rA); - SDValue LoProd = DAG.getNode(SPUISD::MPYU, MVT::v4i32, rA, rB); - SDValue Residual1 = DAG.getNode(ISD::ADD, MVT::v4i32, LoProd, HiProd1); - - return DAG.getNode(ISD::ADD, MVT::v4i32, Residual1, HiProd2); - break; - } + case MVT::v4i32: + break; // Multiply two v8i16 vectors (pipeline friendly version): // a) multiply lower halves, mask off upper 16-bit of 32-bit product @@ -2271,7 +2278,7 @@ SDValue result = DAG.getNode(SPUISD::SHUFB, VT, DAG.getNode(ISD::SCALAR_TO_VECTOR, VT, ValOp), - VecOp, + VecOp, DAG.getNode(ISD::BIT_CONVERT, MVT::v4i32, ShufMask)); return result; @@ -2630,32 +2637,6 @@ return Op; } -//! Lower i32 multiplication -static SDValue LowerMUL(SDValue Op, SelectionDAG &DAG, MVT VT, - unsigned Opc) { - switch (VT.getSimpleVT()) { - default: - cerr << "CellSPU: Unknown LowerMUL value type, got " - << Op.getValueType().getMVTString() - << "\n"; - abort(); - /*NOTREACHED*/ - - case MVT::i32: { - SDValue rA = Op.getOperand(0); - SDValue rB = Op.getOperand(1); - - return DAG.getNode(ISD::ADD, MVT::i32, - DAG.getNode(ISD::ADD, MVT::i32, - DAG.getNode(SPUISD::MPYH, MVT::i32, rA, rB), - DAG.getNode(SPUISD::MPYH, MVT::i32, rB, rA)), - DAG.getNode(SPUISD::MPYU, MVT::i32, rA, rB)); - } - } - - return SDValue(); -} - //! Custom lowering for CTPOP (count population) /*! 
Custom lowering code that counts the number ones in the input @@ -2951,8 +2932,6 @@ return LowerVectorMUL(Op, DAG); else if (VT == MVT::i8) return LowerI8Math(Op, DAG, Opc, *this); - else - return LowerMUL(Op, DAG, VT, Opc); case ISD::FDIV: if (VT == MVT::f32 || VT == MVT::v4f32) @@ -3030,7 +3009,7 @@ || Op1.getOpcode() == SPUISD::IndirectAddr) { // Normalize the operands to reduce repeated code SDValue IndirectArg = Op0, AddArg = Op1; - + if (Op1.getOpcode() == SPUISD::IndirectAddr) { IndirectArg = Op1; AddArg = Op0; @@ -3160,9 +3139,9 @@ case ISD::ANY_EXTEND: case ISD::ZERO_EXTEND: case ISD::SIGN_EXTEND: { - // (SPUpromote_scalar (any|zero|sign_extend (SPUvec2prefslot ))) -> + // (SPUprefslot2vec (any|zero|sign_extend (SPUvec2prefslot ))) -> // - // but only if the SPUpromote_scalar and types match. + // but only if the SPUprefslot2vec and types match. SDValue Op00 = Op0.getOperand(0); if (Op00.getOpcode() == SPUISD::VEC2PREFSLOT) { SDValue Op000 = Op00.getOperand(0); @@ -3173,7 +3152,7 @@ break; } case SPUISD::VEC2PREFSLOT: { - // (SPUpromote_scalar (SPUvec2prefslot )) -> + // (SPUprefslot2vec (SPUvec2prefslot )) -> // Result = Op0.getOperand(0); break; @@ -3329,7 +3308,7 @@ } } } - + // LowerAsmOperandForConstraint void SPUTargetLowering::LowerAsmOperandForConstraint(SDValue Op, Modified: llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td?rev=61464&r1=61463&r2=61464&view=diff ============================================================================== --- llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td (original) +++ llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td Sun Dec 28 21:23:36 2008 @@ -585,23 +585,29 @@ "ahi\t$rT, $rA, $val", IntegerOp, [(set R16C:$rT, (add R16C:$rA, v8i16SExt10Imm:$val))]>; -def Avec: - RRForm<0b00000011000, (outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB), - "a\t$rT, $rA, $rB", IntegerOp, - [(set (v4i32 VECREG:$rT), (add (v4i32 VECREG:$rA), (v4i32 
VECREG:$rB)))]>; - -def : Pat<(add (v16i8 VECREG:$rA), (v16i8 VECREG:$rB)), - (Avec VECREG:$rA, VECREG:$rB)>; - -def Ar32: - RRForm<0b00000011000, (outs R32C:$rT), (ins R32C:$rA, R32C:$rB), - "a\t$rT, $rA, $rB", IntegerOp, - [(set R32C:$rT, (add R32C:$rA, R32C:$rB))]>; - -def Ar8: - RRForm<0b00000011000, (outs R8C:$rT), (ins R8C:$rA, R8C:$rB), - "a\t$rT, $rA, $rB", IntegerOp, - [/* no pattern */]>; +class AInst<dag OOL, dag IOL, list<dag> pattern>: + RRForm<0b00000011000, OOL, IOL, + "a\t$rT, $rA, $rB", IntegerOp, + pattern>; + +class AVecInst<ValueType vectype>: + AInst<(outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB), + [(set (vectype VECREG:$rT), (add (vectype VECREG:$rA), + (vectype VECREG:$rB)))]>; + +class ARegInst<RegisterClass rclass>: + AInst<(outs rclass:$rT), (ins rclass:$rA, rclass:$rB), + [(set rclass:$rT, (add rclass:$rA, rclass:$rB))]>; + +multiclass AddInstruction { + def v4i32: AVecInst<v4i32>; + def v16i8: AVecInst<v16i8>; + + def r32: ARegInst<R32C>; + def r8: AInst<(outs R8C:$rT), (ins R8C:$rA, R8C:$rB), [/* no pattern */]>; +} + +defm A : AddInstruction; def AIvec: RI10Form<0b00111000, (outs VECREG:$rT), (ins VECREG:$rA, s10imm:$val), @@ -789,96 +795,109 @@ def MPYv8i16: RRForm<0b00100011110, (outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB), "mpy\t$rT, $rA, $rB", IntegerMulDiv, - [(set (v8i16 VECREG:$rT), (SPUmpy_v8i16 (v8i16 VECREG:$rA), - (v8i16 VECREG:$rB)))]>; + [(set (v8i16 VECREG:$rT), (SPUmpy_vec (v8i16 VECREG:$rA), + (v8i16 VECREG:$rB)))]>; def MPYr16: RRForm<0b00100011110, (outs R16C:$rT), (ins R16C:$rA, R16C:$rB), "mpy\t$rT, $rA, $rB", IntegerMulDiv, [(set R16C:$rT, (mul R16C:$rA, R16C:$rB))]>; +// Unsigned 16-bit multiply: + +class MPYUInst<dag OOL, dag IOL, list<dag> pattern>: + RRForm<0b00110011110, OOL, IOL, + "mpyu\t$rT, $rA, $rB", IntegerMulDiv, + pattern>; + def MPYUv4i32: - RRForm<0b00110011110, (outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB), - "mpyu\t$rT, $rA, $rB", IntegerMulDiv, - [(set (v4i32 VECREG:$rT), - (SPUmpyu_v4i32 (v4i32 VECREG:$rA), (v4i32 VECREG:$rB)))]>; + MPYUInst<(outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB), + [(set (v4i32 
VECREG:$rT), + (SPUmpyu_vec (v4i32 VECREG:$rA), (v4i32 VECREG:$rB)))]>; def MPYUr16: - RRForm<0b00110011110, (outs R32C:$rT), (ins R16C:$rA, R16C:$rB), - "mpyu\t$rT, $rA, $rB", IntegerMulDiv, - [(set R32C:$rT, (mul (zext R16C:$rA), - (zext R16C:$rB)))]>; + MPYUInst<(outs R32C:$rT), (ins R16C:$rA, R16C:$rB), + [(set R32C:$rT, (mul (zext R16C:$rA), (zext R16C:$rB)))]>; def MPYUr32: - RRForm<0b00110011110, (outs R32C:$rT), (ins R32C:$rA, R32C:$rB), - "mpyu\t$rT, $rA, $rB", IntegerMulDiv, - [(set R32C:$rT, (SPUmpyu_i32 R32C:$rA, R32C:$rB))]>; + MPYUInst<(outs R32C:$rT), (ins R32C:$rA, R32C:$rB), + [(set R32C:$rT, (SPUmpyu_int R32C:$rA, R32C:$rB))]>; -// mpyi: multiply 16 x s10imm -> 32 result (custom lowering for 32 bit result, -// this only produces the lower 16 bits) -def MPYIvec: - RI10Form<0b00101110, (outs VECREG:$rT), (ins VECREG:$rA, s10imm:$val), +// mpyi: multiply 16 x s10imm -> 32 result. + +class MPYIInst<dag OOL, dag IOL, list<dag> pattern>: + RI10Form<0b00101110, OOL, IOL, "mpyi\t$rT, $rA, $val", IntegerMulDiv, - [(set (v8i16 VECREG:$rT), (mul (v8i16 VECREG:$rA), v8i16SExt10Imm:$val))]>; + pattern>; + +def MPYIvec: + MPYIInst<(outs VECREG:$rT), (ins VECREG:$rA, s10imm:$val), + [(set (v8i16 VECREG:$rT), + (mul (v8i16 VECREG:$rA), v8i16SExt10Imm:$val))]>; def MPYIr16: - RI10Form<0b00101110, (outs R16C:$rT), (ins R16C:$rA, s10imm:$val), - "mpyi\t$rT, $rA, $val", IntegerMulDiv, - [(set R16C:$rT, (mul R16C:$rA, i16ImmSExt10:$val))]>; + MPYIInst<(outs R16C:$rT), (ins R16C:$rA, s10imm:$val), + [(set R16C:$rT, (mul R16C:$rA, i16ImmSExt10:$val))]>; // mpyui: same issues as other multiplies, plus, this doesn't match a // pattern... 
but may be used during target DAG selection or lowering + +class MPYUIInst<dag OOL, dag IOL, list<dag> pattern>: + RI10Form<0b10101110, OOL, IOL, + "mpyui\t$rT, $rA, $val", IntegerMulDiv, + pattern>; + def MPYUIvec: - RI10Form<0b10101110, (outs VECREG:$rT), (ins VECREG:$rA, s10imm:$val), - "mpyui\t$rT, $rA, $val", IntegerMulDiv, - []>; + MPYUIInst<(outs VECREG:$rT), (ins VECREG:$rA, s10imm:$val), + []>; def MPYUIr16: - RI10Form<0b10101110, (outs R16C:$rT), (ins R16C:$rA, s10imm:$val), - "mpyui\t$rT, $rA, $val", IntegerMulDiv, - []>; + MPYUIInst<(outs R16C:$rT), (ins R16C:$rA, s10imm:$val), + []>; // mpya: 16 x 16 + 16 -> 32 bit result +class MPYAInst<dag OOL, dag IOL, list<dag> pattern>: + RRRForm<0b0011, OOL, IOL, + "mpya\t$rT, $rA, $rB, $rC", IntegerMulDiv, + pattern>; + def MPYAvec: - RRRForm<0b0011, (outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB, VECREG:$rC), - "mpya\t$rT, $rA, $rB, $rC", IntegerMulDiv, - [(set (v4i32 VECREG:$rT), (add (v4i32 (bitconvert (mul (v8i16 VECREG:$rA), - (v8i16 VECREG:$rB)))), - (v4i32 VECREG:$rC)))]>; + MPYAInst<(outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB, VECREG:$rC), + [(set (v4i32 VECREG:$rT), + (add (v4i32 (bitconvert (mul (v8i16 VECREG:$rA), + (v8i16 VECREG:$rB)))), + (v4i32 VECREG:$rC)))]>; def MPYAr32: - RRRForm<0b0011, (outs R32C:$rT), (ins R16C:$rA, R16C:$rB, R32C:$rC), - "mpya\t$rT, $rA, $rB, $rC", IntegerMulDiv, - [(set R32C:$rT, (add (sext (mul R16C:$rA, R16C:$rB)), - R32C:$rC))]>; - -def : Pat<(add (mul (sext R16C:$rA), (sext R16C:$rB)), R32C:$rC), - (MPYAr32 R16C:$rA, R16C:$rB, R32C:$rC)>; + MPYAInst<(outs R32C:$rT), (ins R16C:$rA, R16C:$rB, R32C:$rC), + [(set R32C:$rT, (add (sext (mul R16C:$rA, R16C:$rB)), + R32C:$rC))]>; + +def MPYAr32_sext: + MPYAInst<(outs R32C:$rT), (ins R16C:$rA, R16C:$rB, R32C:$rC), + [(set R32C:$rT, (add (mul (sext R16C:$rA), (sext R16C:$rB)), + R32C:$rC))]>; def MPYAr32_sextinreg: - RRRForm<0b0011, (outs R32C:$rT), (ins R32C:$rA, R32C:$rB, R32C:$rC), - "mpya\t$rT, $rA, $rB, $rC", IntegerMulDiv, - [(set R32C:$rT, (add (mul (sext_inreg R32C:$rA, 
i16), - (sext_inreg R32C:$rB, i16)), - R32C:$rC))]>; - -//def MPYAr32: -// RRRForm<0b0011, (outs R32C:$rT), (ins R16C:$rA, R16C:$rB, R32C:$rC), -// "mpya\t$rT, $rA, $rB, $rC", IntegerMulDiv, -// [(set R32C:$rT, (add (sext (mul R16C:$rA, R16C:$rB)), -// R32C:$rC))]>; + MPYAInst<(outs R32C:$rT), (ins R32C:$rA, R32C:$rB, R32C:$rC), + [(set R32C:$rT, (add (mul (sext_inreg R32C:$rA, i16), + (sext_inreg R32C:$rB, i16)), + R32C:$rC))]>; // mpyh: multiply high, used to synthesize 32-bit multiplies +class MPYHInst<dag OOL, dag IOL, list<dag> pattern>: + RRForm<0b10100011110, OOL, IOL, + "mpyh\t$rT, $rA, $rB", IntegerMulDiv, + pattern>; + def MPYHv4i32: - RRForm<0b10100011110, (outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB), - "mpyh\t$rT, $rA, $rB", IntegerMulDiv, - [(set (v4i32 VECREG:$rT), - (SPUmpyh_v4i32 (v4i32 VECREG:$rA), (v4i32 VECREG:$rB)))]>; + MPYHInst<(outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB), + [(set (v4i32 VECREG:$rT), + (SPUmpyh_vec (v4i32 VECREG:$rA), (v4i32 VECREG:$rB)))]>; def MPYHr32: - RRForm<0b10100011110, (outs R32C:$rT), (ins R32C:$rA, R32C:$rB), - "mpyh\t$rT, $rA, $rB", IntegerMulDiv, - [(set R32C:$rT, (SPUmpyh_i32 R32C:$rA, R32C:$rB))]>; + MPYHInst<(outs R32C:$rT), (ins R32C:$rA, R32C:$rB), + [(set R32C:$rT, (SPUmpyh_int R32C:$rA, R32C:$rB))]>; // mpys: multiply high and shift right (returns the top half of // a 16-bit multiply, sign extended to 32 bits.) 
@@ -898,7 +917,7 @@ RRForm<0b01100011110, (outs VECREG:$rT), (ins VECREG:$rA, VECREG:$rB), "mpyhh\t$rT, $rA, $rB", IntegerMulDiv, [(set (v8i16 VECREG:$rT), - (SPUmpyhh_v8i16 (v8i16 VECREG:$rA), (v8i16 VECREG:$rB)))]>; + (SPUmpyhh_vec (v8i16 VECREG:$rA), (v8i16 VECREG:$rB)))]>; def MPYHHr32: RRForm<0b01100011110, (outs R32C:$rT), (ins R32C:$rA, R32C:$rB), @@ -938,7 +957,26 @@ "mpyhhau\t$rT, $rA, $rB", IntegerMulDiv, []>; +//-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ +// v4i32, i32 multiply instruction sequence: +//-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ +def MPYv4i32: + Pat<(mul (v4i32 VECREG:$rA), (v4i32 VECREG:$rB)), + (Av4i32 + (Av4i32 (MPYHv4i32 VECREG:$rA, VECREG:$rB), + (MPYHv4i32 VECREG:$rB, VECREG:$rA)), + (MPYUv4i32 VECREG:$rA, VECREG:$rB))>; + +def MPYi32: + Pat<(mul R32C:$rA, R32C:$rB), + (Ar32 + (Ar32 (MPYHr32 R32C:$rA, R32C:$rB), + (MPYHr32 R32C:$rB, R32C:$rA)), + (MPYUr32 R32C:$rA, R32C:$rB))>; + +//-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ // clz: Count leading zeroes +//-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ class CLZInst<dag OOL, dag IOL, list<dag> pattern>: RRForm_1<0b10100101010, OOL, IOL, "clz\t$rT, $rA", IntegerOp, pattern>; @@ -1803,8 +1841,8 @@ class SELBRegInst<RegisterClass rclass>: SELBInst<(outs rclass:$rT), (ins rclass:$rA, rclass:$rB, rclass:$rC), [(set rclass:$rT, - (or (and rclass:$rA, rclass:$rC), - (and rclass:$rB, (not rclass:$rC))))]>; + (or (and rclass:$rB, rclass:$rC), + (and rclass:$rA, (not rclass:$rC))))]>; class SELBRegCondInst: SELBInst<(outs rclass:$rT), (ins rclass:$rA, rclass:$rB, rcond:$rC), @@ -3442,6 +3480,13 @@ BIForm<0b10010101100, "bisl\t$$lr, $func", [(SPUcall R32C:$func)]>; } +// Support calls to external symbols: +def : Pat<(SPUcall (SPUpcrel texternalsym:$func, 0)), + (BRSL texternalsym:$func)>; + +def : Pat<(SPUcall (SPUaform texternalsym:$func, 0)), + (BRASL texternalsym:$func)>; + // Unconditional branches: let isBranch = 1, isTerminator = 
1, hasCtrlDep = 1, isBarrier = 1 in { def BR : Modified: llvm/trunk/lib/Target/CellSPU/SPUNodes.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUNodes.td?rev=61464&r1=61463&r2=61464&view=diff ============================================================================== --- llvm/trunk/lib/Target/CellSPU/SPUNodes.td (original) +++ llvm/trunk/lib/Target/CellSPU/SPUNodes.td Sun Dec 28 21:23:36 2008 @@ -35,17 +35,12 @@ SDTCisSameAs<0, 1>, SDTCisSameAs<1, 2> ]>; -// Unary, binary v16i8 operator type constraints: -def SPUv16i8_binop: SDTypeProfile<1, 2, [ - SDTCisVT<0, v16i8>, SDTCisSameAs<0, 1>, SDTCisSameAs<1, 2>]>; - -// Binary v8i16 operator type constraints: -def SPUv8i16_binop: SDTypeProfile<1, 2, [ - SDTCisVT<0, v8i16>, SDTCisSameAs<0, 1>, SDTCisSameAs<1, 2>]>; - -// Binary v4i32 operator type constraints: -def SPUv4i32_binop: SDTypeProfile<1, 2, [ - SDTCisVT<0, v4i32>, SDTCisSameAs<0, 1>, SDTCisSameAs<1, 2>]>; +// Vector binary operator type constraints (needs a further constraint to +// ensure that operand 0 is a vector...): + +def SPUVecBinop: SDTypeProfile<1, 2, [ + SDTCisSameAs<0, 1>, SDTCisSameAs<1, 2> +]>; // Trinary operators, e.g., addx, carry generate def SPUIntTrinaryOp : SDTypeProfile<1, 3, [ @@ -93,23 +88,22 @@ def SPUshuffle: SDNode<"SPUISD::SHUFB", SDT_SPUshuffle, []>; // SPU 16-bit multiply -def SPUmpy_v16i8: SDNode<"SPUISD::MPY", SPUv16i8_binop, []>; -def SPUmpy_v8i16: SDNode<"SPUISD::MPY", SPUv8i16_binop, []>; -def SPUmpy_v4i32: SDNode<"SPUISD::MPY", SPUv4i32_binop, []>; +def SPUmpy_vec: SDNode<"SPUISD::MPY", SPUVecBinop, []>; // SPU multiply unsigned, used in instruction lowering for v4i32 // multiplies: -def SPUmpyu_v4i32: SDNode<"SPUISD::MPYU", SPUv4i32_binop, []>; -def SPUmpyu_i32: SDNode<"SPUISD::MPYU", SDTIntBinOp, []>; +def SPUmpyu_vec: SDNode<"SPUISD::MPYU", SPUVecBinop, []>; +def SPUmpyu_int: SDNode<"SPUISD::MPYU", SDTIntBinOp, []>; // SPU 16-bit multiply high x low, shift result 16-bits // Used to compute 
intermediate products for 32-bit multiplies -def SPUmpyh_v4i32: SDNode<"SPUISD::MPYH", SPUv4i32_binop, []>; -def SPUmpyh_i32: SDNode<"SPUISD::MPYH", SDTIntBinOp, []>; +def SPUmpyh_vec: SDNode<"SPUISD::MPYH", SPUVecBinop, []>; +def SPUmpyh_int: SDNode<"SPUISD::MPYH", SDTIntBinOp, []>; // SPU 16-bit multiply high x high, 32-bit product // Used to compute intermediate products for 16-bit multiplies -def SPUmpyhh_v8i16: SDNode<"SPUISD::MPYHH", SPUv8i16_binop, []>; +def SPUmpyhh_vec: SDNode<"SPUISD::MPYHH", SPUVecBinop, []>; +def SPUmpyhh_int: SDNode<"SPUISD::MPYHH", SDTIntBinOp, []>; // Shift left quadword by bits and bytes def SPUshlquad_l_bits: SDNode<"SPUISD::SHLQUAD_L_BITS", SPUvecshift_type, []>;
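Editorial note on the MPYi32 / MPYv4i32 patterns in the SPUInstrInfo.td hunk: they build a 32-bit product from two mpyh results and one mpyu result. The sketch below is a rough C++ model of why that sum gives the low 32 bits of a*b. The helper names (mpyh_model, mpyu_model, mul32_model) are hypothetical and are not part of the patch; the instruction semantics are assumed from the comments in the diff (mpyh: high halfword times low halfword, result shifted left 16 bits; mpyu: unsigned low halfword times low halfword).

```cpp
#include <cstdint>

// Hypothetical model of SPU mpyh: high 16 bits of a times low 16 bits of b,
// shifted left by 16. Only the low 32 bits of the result are kept, so the
// signedness of the halfword multiply does not affect the outcome.
static uint32_t mpyh_model(uint32_t a, uint32_t b) {
  return ((a >> 16) * (b & 0xFFFFu)) << 16;
}

// Hypothetical model of SPU mpyu: unsigned low 16 bits of a times unsigned
// low 16 bits of b, giving a full 32-bit product.
static uint32_t mpyu_model(uint32_t a, uint32_t b) {
  return (a & 0xFFFFu) * (b & 0xFFFFu);
}

// Same shape as the MPYi32 pattern: (mpyh(a, b) + mpyh(b, a)) + mpyu(a, b).
// With a = ah*2^16 + al and b = bh*2^16 + bl, the product modulo 2^32 is
// (ah*bl + bh*al)*2^16 + al*bl; the ah*bh term is shifted out entirely.
static uint32_t mul32_model(uint32_t a, uint32_t b) {
  return (mpyh_model(a, b) + mpyh_model(b, a)) + mpyu_model(a, b);
}
```

Under these assumptions, mul32_model(a, b) should agree with plain unsigned 32-bit multiplication for every input, which is what allows ISD::MUL on i32 and v4i32 to be marked Legal and matched by a pattern instead of being custom lowered.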