From baldrick at free.fr Mon Nov 23 02:14:36 2009
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 23 Nov 2009 09:14:36 +0100
Subject: [llvm-commits] [llvm] r89639 - in /llvm/trunk: lib/Transforms/Scalar/InstructionCombining.cpp test/Transforms/InstCombine/compare-signs.ll
In-Reply-To: <200911230317.nAN3HYvG017794@zion.cs.uiuc.edu>
References: <200911230317.nAN3HYvG017794@zion.cs.uiuc.edu>
Message-ID: <4B0A446C.10104@free.fr>

Hi Nick,

> +      if (KnownZeroLHS.countLeadingOnes() == BitWidth-1 &&
> +          KnownZeroRHS.countLeadingOnes() == BitWidth-1) {

== -> >= :)

Ciao,

Duncan.

From nicholas at mxc.ca Mon Nov 23 02:18:32 2009
From: nicholas at mxc.ca (Nick Lewycky)
Date: Mon, 23 Nov 2009 00:18:32 -0800
Subject: [llvm-commits] [llvm] r89639 - in /llvm/trunk: lib/Transforms/Scalar/InstructionCombining.cpp test/Transforms/InstCombine/compare-signs.ll
In-Reply-To: <4B0A446C.10104@free.fr>
References: <200911230317.nAN3HYvG017794@zion.cs.uiuc.edu> <4B0A446C.10104@free.fr>
Message-ID: <4B0A4558.8070803@mxc.ca>

Duncan Sands wrote:
> Hi Nick,
>
>> +      if (KnownZeroLHS.countLeadingOnes() == BitWidth-1 &&
>> +          KnownZeroRHS.countLeadingOnes() == BitWidth-1) {
>
> == -> >= :)

Nope, look again!

+  APInt TypeMask(APInt::getHighBitsSet(BitWidth, BitWidth-1));

Thus, it will never return a knownzero with all bits set. :)

Nick

From baldrick at free.fr Mon Nov 23 02:24:10 2009
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 23 Nov 2009 09:24:10 +0100
Subject: [llvm-commits] [llvm] r89602 - /llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
In-Reply-To: <200911221616.nAMGGm5b026391@zion.cs.uiuc.edu>
References: <200911221616.nAMGGm5b026391@zion.cs.uiuc.edu>
Message-ID: <4B0A46AA.4040207@free.fr>

Hi Chris,

>  bool BasicAliasAnalysis::pointsToConstantMemory(const Value *P) {
>    if (const GlobalVariable *GV =
>          dyn_cast<GlobalVariable>(P->getUnderlyingObject()))
> +    // FIXME: shouldn't this require GV to be "ODR"?

I'm not sure, but I think it's the case that something declared weak and
constant can safely be considered constant, i.e. readonly. However you can't
assume that the initializer it has is correct - some other (read-only)
initializer may be substituted at link time.

Ciao,

Duncan.

From baldrick at free.fr Mon Nov 23 02:25:01 2009
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 23 Nov 2009 09:25:01 +0100
Subject: [llvm-commits] [llvm] r89602 - /llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
In-Reply-To:
References: <200911221616.nAMGGm5b026391@zion.cs.uiuc.edu> <4B099CF2.7020402@mxc.ca>
Message-ID: <4B0A46DD.6060705@free.fr>

> No; there isn't any guarantee it won't be overridden, just that it
> won't be modified at runtime.

I think Eli is right.

Ciao,

Duncan.

From baldrick at free.fr Mon Nov 23 02:32:34 2009
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 23 Nov 2009 09:32:34 +0100
Subject: [llvm-commits] patch: make memdep scan memory use intrinsics
In-Reply-To: <4B09F47B.1040008@mxc.ca>
References: <4B09F47B.1040008@mxc.ca>
Message-ID: <4B0A48A2.3030306@free.fr>

Hi Nick,

> +    case Intrinsic::lifetime_start:
> +    case Intrinsic::invariant_start:
> +    case Intrinsic::invariant_end:
> +      MemPtr = QueryInst->getOperand(1);
> +      MemSize = cast<ConstantInt>(QueryInst->getOperand(0))->getZExtValue();

isn't the pointer operand 2, and the size operand 1?

> +      break;
> +    case Intrinsic::lifetime_end:
> +      MemPtr = QueryInst->getOperand(2);
> +      MemSize = cast<ConstantInt>(QueryInst->getOperand(1))->getZExtValue();

And here operands 3 and 2?

Ciao,

Duncan.

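The operand-numbering question above can be illustrated with a toy model. This is only a sketch, assuming (as in the LLVM of this period) that operand 0 of a call holds the callee, so that argument i of the intrinsic is operand i + 1. `ToyCall`, `getArg` and `makeLifetimeStart` are hypothetical names invented for the illustration; they are not LLVM API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a call instruction; NOT the LLVM CallInst class.
// Assumption: operand 0 holds the callee, so argument i is operand i + 1.
struct ToyCall {
  std::vector<std::string> Operands;  // [callee, arg0, arg1, ...]

  // Raw operand access, callee included.
  const std::string &getOperand(unsigned Idx) const { return Operands.at(Idx); }

  // Argument access, skipping the callee slot.
  const std::string &getArg(unsigned ArgNo) const { return getOperand(ArgNo + 1); }
};

// Hypothetical helper: a toy call to llvm.lifetime.start(size, ptr).
inline ToyCall makeLifetimeStart(const std::string &Size, const std::string &Ptr) {
  return ToyCall{{"llvm.lifetime.start", Size, Ptr}};
}
```

Under that assumption the size (argument 0) is found at operand 1 and the pointer (argument 1) at operand 2, which is the off-by-one being pointed out in the review.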
From baldrick at free.fr Mon Nov 23 02:33:15 2009
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 23 Nov 2009 09:33:15 +0100
Subject: [llvm-commits] [llvm] r89639 - in /llvm/trunk: lib/Transforms/Scalar/InstructionCombining.cpp test/Transforms/InstCombine/compare-signs.ll
In-Reply-To: <4B0A4558.8070803@mxc.ca>
References: <200911230317.nAN3HYvG017794@zion.cs.uiuc.edu> <4B0A446C.10104@free.fr> <4B0A4558.8070803@mxc.ca>
Message-ID: <4B0A48CB.4020201@free.fr>

>>> +      if (KnownZeroLHS.countLeadingOnes() == BitWidth-1 &&
>>> +          KnownZeroRHS.countLeadingOnes() == BitWidth-1) {
>>
>> == -> >= :)
>
> Nope, look again!
>
> +  APInt TypeMask(APInt::getHighBitsSet(BitWidth, BitWidth-1));
>
> Thus, it will never return a knownzero with all bits set. :)

Ha ha, you got me there!

Ciao,

Duncan.

From baldrick at free.fr Mon Nov 23 04:43:55 2009
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 23 Nov 2009 11:43:55 +0100
Subject: [llvm-commits] [llvm] r88910 - /llvm/trunk/lib/VMCore/Core.cpp
In-Reply-To: <4B09ADBC.3010807@mxc.ca>
References: <200911161315.nAGDFTMi017583@zion.cs.uiuc.edu> <4B09ADBC.3010807@mxc.ca>
Message-ID: <4B0A676B.6010104@free.fr>

Hi Nick,

> Also, this is the C API, so you can't fix this by changing the signature
> on the C function, if it's ever been through a release.

why not?

Ciao,

Duncan.

From baldrick at free.fr Mon Nov 23 04:49:09 2009
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 23 Nov 2009 10:49:09 -0000
Subject: [llvm-commits] [llvm] r89648 - in /llvm/trunk: include/llvm-c/Core.h lib/VMCore/Core.cpp
Message-ID: <200911231049.nANAnBT0015538@zion.cs.uiuc.edu>

Author: baldrick
Date: Mon Nov 23 04:49:03 2009
New Revision: 89648

URL: http://llvm.org/viewvc/llvm-project?rev=89648&view=rev
Log:
I forgot to update the prototype for LLVMBuildIntCast when correcting the
body to not pass the name for the isSigned parameter.
However it seems that changing prototypes is a big-no-no, so here I revert
the previous change and pass "true" for isSigned, meaning this always does
a signed cast, which was the previous behaviour assuming the name was not
NULL! Some other C function needs to be introduced for the general case of
signed or unsigned casts. This hopefully unbreaks the ocaml binding.

Modified:
    llvm/trunk/include/llvm-c/Core.h
    llvm/trunk/lib/VMCore/Core.cpp

Modified: llvm/trunk/include/llvm-c/Core.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/Core.h?rev=89648&r1=89647&r2=89648&view=diff
==============================================================================
--- llvm/trunk/include/llvm-c/Core.h (original)
+++ llvm/trunk/include/llvm-c/Core.h Mon Nov 23 04:49:03 2009
@@ -870,7 +870,7 @@
                                   LLVMTypeRef DestTy, const char *Name);
 LLVMValueRef LLVMBuildPointerCast(LLVMBuilderRef, LLVMValueRef Val,
                                   LLVMTypeRef DestTy, const char *Name);
-LLVMValueRef LLVMBuildIntCast(LLVMBuilderRef, LLVMValueRef Val,
+LLVMValueRef LLVMBuildIntCast(LLVMBuilderRef, LLVMValueRef Val, /*Signed cast!*/
                               LLVMTypeRef DestTy, const char *Name);
 LLVMValueRef LLVMBuildFPCast(LLVMBuilderRef, LLVMValueRef Val,
                              LLVMTypeRef DestTy, const char *Name);

Modified: llvm/trunk/lib/VMCore/Core.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Core.cpp?rev=89648&r1=89647&r2=89648&view=diff
==============================================================================
--- llvm/trunk/lib/VMCore/Core.cpp (original)
+++ llvm/trunk/lib/VMCore/Core.cpp Mon Nov 23 04:49:03 2009
@@ -1860,9 +1860,9 @@
 }
 
 LLVMValueRef LLVMBuildIntCast(LLVMBuilderRef B, LLVMValueRef Val,
-                              LLVMTypeRef DestTy, int isSigned,
-                              const char *Name) {
-  return wrap(unwrap(B)->CreateIntCast(unwrap(Val), unwrap(DestTy), isSigned, Name));
+                              LLVMTypeRef DestTy, const char *Name) {
+  return wrap(unwrap(B)->CreateIntCast(unwrap(Val), unwrap(DestTy),
+                                       /*isSigned*/true, Name));
 }
 
 LLVMValueRef LLVMBuildFPCast(LLVMBuilderRef B,
LLVMValueRef Val,

From baldrick at free.fr Mon Nov 23 04:54:14 2009
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 23 Nov 2009 11:54:14 +0100
Subject: [llvm-commits] [llvm] r88979 - /llvm/trunk/Makefile
In-Reply-To: <6a8523d60911221055m6389806q789720c839e93a82@mail.gmail.com>
References: <200911162238.nAGMc189009001@zion.cs.uiuc.edu> <4B0256AD.2040909@free.fr> <6a8523d60911221055m6389806q789720c839e93a82@mail.gmail.com>
Message-ID: <4B0A69D6.9060700@free.fr>

Hi Daniel,

>>> Don't build examples by default, use BUILD_EXAMPLES=1 to build them. The
>>> only utility of this is testing that we keep the examples up to date, I will
>>> just make the buildbots run with this flag.
>> is this really a good idea?
>
> Yes, I think so. I don't think the examples break that often.
> Additionally, developers changing APIs widely are already probably
> using grep or other tools to find the code to fix, and can always test
> the examples directly.
>
>> If someone breaks an example, wouldn't it be better
>> for them to discover that directly themselves rather than via a buildbot?
>
> No, this optimizes for the rare case (single developer, breaking
> single build) at the expense of the extraordinarily common case
> (many many developers and buildbots, rebuilding llvm).

doesn't it just mean that breakage to examples won't be noticed for a long
time, and probably by some poor soul who was interested in giving LLVM a go,
but now won't because the examples don't even build... Also, I had understood
that the buildbots would build the examples, but here you seem to suggest they
won't?

Anyway, I'd be happier with this change if it was possible to say "make all"
and have everything be built, examples and all, while "make" would build a
sensible subset.

Ciao,

Duncan.

From baldrick at free.fr Mon Nov 23 08:22:04 2009
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 23 Nov 2009 15:22:04 +0100
Subject: [llvm-commits] [llvm] r89421 - /llvm/trunk/lib/Analysis/CaptureTracking.cpp
In-Reply-To: <6A5CF08E-94EF-4893-81CC-5AEF0A19EB25@apple.com>
References: <200911200050.nAK0or7J026222@zion.cs.uiuc.edu> <4B068327.1070103@free.fr> <784C47FB-FB86-404E-B39F-8EAF9C4A98E0@apple.com> <4B080CEF.5060301@free.fr> <6A5CF08E-94EF-4893-81CC-5AEF0A19EB25@apple.com>
Message-ID: <4B0A9A8C.9040905@free.fr>

Hi Chris,

>>>> I think this is wrong, consider the following pseudocode example:
>>> While this example is "possible" I really don't think this is worth worrying about. It is not valid C code, is not likely to exist in practice, etc. Beyond that, comparison against null is really common and we really do want "nocapture" in these cases.
>> well, it's a slippery slope :) Dan later changed this to only allow
>> comparisons against malloc return values and other noalias function
>> results.
>
> Is such paranoia really worthwhile?

I'm not against loosening the definition of nocapture as long as it is
clearly defined what nocapture means. I think the obvious thing to do
is to say that if the value of the pointer is reconstructed *entirely by
control flow* then it is not considered to be captured. Consider the
following standard crazy example of capturing a pointer P. This is a
prime example where the pointer was captured via control flow:

  n = 0
loop:
  Q = GEP P, -n
  if (Q == null) break;
  n = n + 1;
  goto loop;

  captured_value = GEP null, n

Notice how if you start with "captured_value", and look at how it is
defined you see it is defined in terms of null and "n" via a GEP. "n"
itself is defined in terms of "n" and 0 via a PHI node. Nowhere do you
see the original pointer P. Using my proposed definition here we would
say that P is *not* captured. What do you think?

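The pseudocode above can be mimicked in plain C++ with integers standing in for addresses. This is only a sketch under stated assumptions: `std::uintptr_t` arithmetic replaces the GEPs, zero plays the role of null, `reconstructByControlFlow` is a hypothetical name, and the caller is expected to pass a small made-up "address" so the loop finishes quickly.

```cpp
#include <cassert>
#include <cstdint>

// Rebuild the numeric value of P using only comparisons against zero
// (standing in for null) and a counter. The returned value is computed from
// the constant 0 and N alone; P only ever feeds the comparison.
inline std::uintptr_t reconstructByControlFlow(std::uintptr_t P) {
  std::uintptr_t N = 0;
  for (;;) {
    std::uintptr_t Q = P - N;  // Q = GEP P, -n
    if (Q == 0)                // if (Q == null) break;
      break;
    N = N + 1;                 // n = n + 1
  }
  return 0 + N;                // captured_value = GEP null, n
}
```

Following the data dependencies of the return value leads back only to 0 and N, which is the property the proposed definition relies on.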
> Doing such a thing introduces an 'abstraction penalty' where malloc wrappers now get pessimized where direct calls to malloc don't.

A malloc wrapper should also be marked noalias, so your argument seems weak :)
Of course currently we have no pass that can deduce noalias attributes, which
would help here.

Ciao,

Duncan.

From dag at cray.com Mon Nov 23 09:01:47 2009
From: dag at cray.com (David Greene)
Date: Mon, 23 Nov 2009 09:01:47 -0600
Subject: [llvm-commits] [PATCH] More Spill Annotations
In-Reply-To: <200911201622.37994.dag@cray.com>
References: <200911201622.37994.dag@cray.com>
Message-ID: <200911230901.47371.dag@cray.com>

On Friday 20 November 2009 16:22, you wrote:
> This patch adds information to spill/reload comments as to whether they are
> vector or scalar. This is helpful when doing static code analysis of
> performance issues and other things. It's only implemented for X86.
> Experts on other architectures will have to fill things in.
>
> Please review. Thanks!

Ping!

-Dave

> Index: include/llvm/Target/TargetInstrInfo.h
> ===================================================================
> --- include/llvm/Target/TargetInstrInfo.h (revision 89484)
> +++ include/llvm/Target/TargetInstrInfo.h (working copy)
> @@ -142,6 +142,23 @@
>      return false;
>    }
>
> +  /// isVectorInstr - Return true if the instruction is a vector operation.
> +  virtual bool isVectorInstr(const MachineInstr& MI) const {
> +    return false;
> +  }
> +
> +  /// isVectorOperand - Return true if the operand is of vector type.
> +  virtual bool isVectorOperand(const MachineInstr &MI,
> +                               const MachineOperand *MO) const {
> +    return false;
> +  }
> +
> +  /// isVectorOperand - Return true if the mem operand is of vector type.
> + virtual bool isVectorOperand(const MachineInstr &MI, > + const MachineMemOperand *MMO) const { > + return false; > + } > + > /// isIdentityCopy - Return true if the instruction is a copy (or > /// extract_subreg, insert_subreg, subreg_to_reg) where the source and > /// destination registers are the same. > @@ -182,11 +199,13 @@ > > /// hasLoadFromStackSlot - If the specified machine instruction has > /// a load from a stack slot, return true along with the FrameIndex > - /// of the loaded stack slot. If not, return false. Unlike > + /// of the loaded stack slot and the machine mem operand containing > + /// the reference. If not, return false. Unlike > /// isLoadFromStackSlot, this returns true for any instructions that > /// loads from the stack. This is just a hint, as some cases may be > /// missed. > virtual bool hasLoadFromStackSlot(const MachineInstr *MI, > + const MachineMemOperand *&MMO, > int &FrameIndex) const { > return 0; > } > @@ -205,17 +224,18 @@ > /// stack locations as well. This uses a heuristic so it isn't > /// reliable for correctness. > virtual unsigned isStoreToStackSlotPostFE(const MachineInstr *MI, > - int &FrameIndex) const { > + int &FrameIndex) const { > return 0; > } > > /// hasStoreToStackSlot - If the specified machine instruction has a > /// store to a stack slot, return true along with the FrameIndex of > - /// the loaded stack slot. If not, return false. Unlike > - /// isStoreToStackSlot, this returns true for any instructions that > - /// loads from the stack. This is just a hint, as some cases may be > - /// missed. > + /// the loaded stack slot and the machine mem operand containing the > + /// reference. If not, return false. Unlike isStoreToStackSlot, > + /// this returns true for any instructions that loads from the > + /// stack. This is just a hint, as some cases may be missed. 
> virtual bool hasStoreToStackSlot(const MachineInstr *MI, > + const MachineMemOperand *&MMO, > int &FrameIndex) const { > return 0; > } > Index: lib/CodeGen/AsmPrinter/AsmPrinter.cpp > =================================================================== > --- lib/CodeGen/AsmPrinter/AsmPrinter.cpp (revision 89484) > +++ lib/CodeGen/AsmPrinter/AsmPrinter.cpp (working copy) > @@ -1854,35 +1854,46 @@ > > // We assume a single instruction only has a spill or reload, not > // both. > + const MachineMemOperand *MMO; > if (TM.getInstrInfo()->isLoadFromStackSlotPostFE(&MI, FI)) { > if (FrameInfo->isSpillSlotObjectIndex(FI)) { > + MMO = *MI.memoperands_begin(); > + bool isVector = TM.getInstrInfo()->isVectorOperand(MI, MMO); > if (Newline) O << '\n'; > O.PadToColumn(MAI->getCommentColumn()); > - O << MAI->getCommentString() << " Reload"; > + O << MAI->getCommentString() << (isVector? " Vector" : " Scalar") > + << " Reload"; > Newline = true; > } > } > - else if (TM.getInstrInfo()->hasLoadFromStackSlot(&MI, FI)) { > + else if (TM.getInstrInfo()->hasLoadFromStackSlot(&MI, MMO, FI)) { > if (FrameInfo->isSpillSlotObjectIndex(FI)) { > + bool isVector = TM.getInstrInfo()->isVectorOperand(MI, MMO); > if (Newline) O << '\n'; > O.PadToColumn(MAI->getCommentColumn()); > - O << MAI->getCommentString() << " Folded Reload"; > + O << MAI->getCommentString() << (isVector? " Vector" : " Scalar") > + << " Folded Reload"; > Newline = true; > } > } > else if (TM.getInstrInfo()->isStoreToStackSlotPostFE(&MI, FI)) { > if (FrameInfo->isSpillSlotObjectIndex(FI)) { > + MMO = *MI.memoperands_begin(); > + bool isVector = TM.getInstrInfo()->isVectorOperand(MI, MMO); > if (Newline) O << '\n'; > O.PadToColumn(MAI->getCommentColumn()); > - O << MAI->getCommentString() << " Spill"; > + O << MAI->getCommentString() << (isVector? 
" Vector" : " Scalar") > + << " Spill"; > Newline = true; > } > } > - else if (TM.getInstrInfo()->hasStoreToStackSlot(&MI, FI)) { > + else if (TM.getInstrInfo()->hasStoreToStackSlot(&MI, MMO, FI)) { > if (FrameInfo->isSpillSlotObjectIndex(FI)) { > + bool isVector = TM.getInstrInfo()->isVectorOperand(MI, MMO); > if (Newline) O << '\n'; > O.PadToColumn(MAI->getCommentColumn()); > - O << MAI->getCommentString() << " Folded Spill"; > + O << MAI->getCommentString() << (isVector? " Vector" : " Scalar") > + << " Folded Spill"; > Newline = true; > } > } > @@ -1892,9 +1903,11 @@ > if (TM.getInstrInfo()->isMoveInstr(MI, SrcReg, DstReg, > SrcSubIdx, DstSubIdx)) { > if (MI.getAsmPrinterFlag(ReloadReuse)) { > + bool isVector = TM.getInstrInfo()->isVectorInstr(MI); > if (Newline) O << '\n'; > O.PadToColumn(MAI->getCommentColumn()); > - O << MAI->getCommentString() << " Reload Reuse"; > + O << MAI->getCommentString() << (isVector? " Vector" : " Scalar") > + << " Reload Reuse"; > Newline = true; > } > } > Index: lib/Target/X86/X86InstrInfo.cpp > =================================================================== > --- lib/Target/X86/X86InstrInfo.cpp (revision 89484) > +++ lib/Target/X86/X86InstrInfo.cpp (working copy) > @@ -34,6 +34,7 @@ > #include "llvm/MC/MCAsmInfo.h" > > #include > +#include > > using namespace llvm; > > @@ -711,6 +712,393 @@ > } > } > > +bool X86InstrInfo::isVectorInstr(const MachineInstr &MI) const{ > + // Handle special cases here. > + switch(MI.getOpcode()) { > + case X86::MOVDDUPrr: > + case X86::MOVDDUPrm: > + case X86::MOVSHDUPrr: > + case X86::MOVSHDUPrm: > + case X86::MOVSLDUPrr: > + case X86::MOVSLDUPrm: > + case X86::MPSADBWrri: // "PS" is lucky. Be explicit. > + case X86::MPSADBWrmi: > + return true; > + case X86::MMX_MOVQ2DQrr: > + return false; > + } > + > + // Look for the common cases. 
> + const TargetInstrDesc &InstrDesc = get(MI.getOpcode()); > + const char *Name = InstrDesc.getName(); > + if (std::strstr(Name, "PS") != 0 // SSE packed single > + || std::strstr(Name, "PD") != 0 // SSE packed double > + || std::strstr(Name, "DQ") != 0 // SSE packed integer > + || Name[0] == 'P' // MMX/SSE packed integer > + || Name[0] == 'V' && Name[1] == 'P') // AVX packed integer > + return true; > + > + return false; > +} > + > +bool X86InstrInfo::isVectorOperand(const MachineInstr &MI, > + const MachineOperand *MO) const { > + // Handle special cases here. These are for mixed vector/scalar > + // instructions. > + if (MO->getType() != MachineOperand::MO_Register > + && MO->getType() != MachineOperand::MO_FrameIndex > + && MO->getType() != MachineOperand::MO_ExternalSymbol > + && MO->getType() != MachineOperand::MO_GlobalAddress) > + return false; > + > + // Operands that are part of memory addresses are never vector. > + // Come Larrabee, we will need to handle vector address operands so > + // this will get more complicated. > + for (unsigned OpNum = 0; OpNum < MI.getNumOperands(); ++OpNum) { > + if (&MI.getOperand(OpNum) == MO) { > + switch(MI.getOpcode()) { > + case X86::EXTRACTPSmr: > + case X86::EXTRACTPSrr: > + return OpNum == MI.getNumExplicitOperands() - 1; > + case X86::INSERTPSrm: > + case X86::INSERTPSrr: > + return OpNum == 0; > + case X86::MOVDDUPrm: > + case X86::MOVDDUPrr: > + return OpNum == 0; > + case X86::MOVHPDmr: > + return OpNum == MI.getNumExplicitOperands() - 1; > + case X86::MOVHPDrm: > + // Address operands are never vector. 
> + return false; > + case X86::MOVLPDmr: > + return OpNum == MI.getNumExplicitOperands() - 1; > + case X86::MOVLPDrr: > + case X86::MOVLPDrm: > + return OpNum == 0; > + case X86::MOVMSKPDrr: > + case X86::MOVMSKPSrr: > + return OpNum == 1; > + case X86::PBLENDVBrr0: > + case X86::PBLENDVBrm0: > + return !(MO->isReg() && MO->isImplicit()); > + case X86::PCMPESTRIrr: > + case X86::PCMPESTRIrm: > + case X86::PCMPESTRIArr: > + case X86::PCMPESTRIArm: > + case X86::PCMPESTRICrr: > + case X86::PCMPESTRICrm: > + case X86::PCMPESTRIOrr: > + case X86::PCMPESTRIOrm: > + case X86::PCMPESTRISrr: > + case X86::PCMPESTRISrm: > + case X86::PCMPESTRIZrr: > + case X86::PCMPESTRIZrm: > + case X86::PCMPESTRM128MEM: > + case X86::PCMPESTRM128REG: > + case X86::PCMPESTRM128rr: > + case X86::PCMPESTRM128rm: > + case X86::PCMPISTRIrr: > + case X86::PCMPISTRIrm: > + case X86::PCMPISTRIArr: > + case X86::PCMPISTRIArm: > + case X86::PCMPISTRICrr: > + case X86::PCMPISTRICrm: > + case X86::PCMPISTRIOrr: > + case X86::PCMPISTRIOrm: > + case X86::PCMPISTRISrr: > + case X86::PCMPISTRISrm: > + case X86::PCMPISTRIZrr: > + case X86::PCMPISTRIZrm: > + case X86::PCMPISTRM128MEM: > + case X86::PCMPISTRM128REG: > + case X86::PCMPISTRM128rr: > + case X86::PCMPISTRM128rm: > + return !(MO->isReg() && MO->isImplicit()); > + case X86::PEXTRBrr: > + case X86::MMX_PEXTRWri: > + case X86::PEXTRWri: > + case X86::PEXTRDrr: > + case X86::PEXTRQrr: > + case X86::PEXTRBmr: > + case X86::PEXTRWmr: > + case X86::PEXTRDmr: > + case X86::PEXTRQmr: > + // Account for the immediate operand. 
> + return OpNum == MI.getNumExplicitOperands() - 2; > + case X86::PINSRBrr: > + case X86::PINSRBrm: > + case X86::MMX_PINSRWrri: > + case X86::PINSRWrri: > + case X86::MMX_PINSRWrmi: > + case X86::PINSRWrmi: > + case X86::PINSRDrr: > + case X86::PINSRDrm: > + case X86::PINSRQrr: > + case X86::PINSRQrm: > + return OpNum == 0; > + case X86::PMOVMSKBrr: > + return OpNum == 1; > + case X86::MMX_PSLLWrr: > + case X86::MMX_PSLLWri: > + case X86::MMX_PSLLWrm: > + case X86::PSLLWrr: > + case X86::PSLLWri: > + case X86::PSLLWrm: > + case X86::MMX_PSLLDrr: > + case X86::MMX_PSLLDri: > + case X86::MMX_PSLLDrm: > + case X86::PSLLDrr: > + case X86::PSLLDri: > + case X86::PSLLDrm: > + case X86::MMX_PSLLQrr: > + case X86::MMX_PSLLQri: > + case X86::MMX_PSLLQrm: > + case X86::PSLLQrr: > + case X86::PSLLQri: > + case X86::PSLLQrm: > + case X86::MMX_PSRAWrr: > + case X86::MMX_PSRAWri: > + case X86::MMX_PSRAWrm: > + case X86::PSRAWrr: > + case X86::PSRAWri: > + case X86::PSRAWrm: > + case X86::MMX_PSRADrr: > + case X86::MMX_PSRADri: > + case X86::MMX_PSRADrm: > + case X86::PSRADrr: > + case X86::PSRADri: > + case X86::PSRADrm: > + case X86::MMX_PSRLWrr: > + case X86::MMX_PSRLWri: > + case X86::MMX_PSRLWrm: > + case X86::PSRLWrr: > + case X86::PSRLWri: > + case X86::PSRLWrm: > + case X86::MMX_PSRLDrr: > + case X86::MMX_PSRLDri: > + case X86::MMX_PSRLDrm: > + case X86::PSRLDrr: > + case X86::PSRLDri: > + case X86::PSRLDrm: > + case X86::MMX_PSRLQrr: > + case X86::MMX_PSRLQri: > + case X86::MMX_PSRLQrm: > + case X86::PSRLQrr: > + case X86::PSRLQri: > + case X86::PSRLQrm: > + return OpNum == 0; > + case X86::PTESTrr: > + case X86::PTESTrm: > + return !(MO->isReg() && MO->isImplicit()); > + case X86::UNPCKLPDrr: > + case X86::UNPCKLPDrm: > + return OpNum == 0; > + } > + return isVectorInstr(MI); > + } > + } > + > + assert(0 && "Did not find operand in instruction!"); > + > + return false; > +} > + > +bool X86InstrInfo::isVectorOperand(const MachineInstr &MI, > + const MachineMemOperand 
*MMO) const { > + bool found = false; > + for (MachineInstr::mmo_iterator m = MI.memoperands_begin(), > + mend = MI.memoperands_end(); > + m != mend; > + ++m) { > + if (*m == MMO) > + found = true; > + } > + > + if (!found) > + assert(0 && "Wrong machine mem operands for instruction!"); > + > + // Handle special cases here. These are for mixed vector/scalar > + // instructions. > + switch(MI.getOpcode()) { > + case X86::EXTRACTPSrr: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::EXTRACTPSmr: > + assert(MMO->isStore() && "Wrong machine mem operand for > instruction!"); + return false; > + case X86::INSERTPSrr: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::INSERTPSrm: > + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::MOVDDUPrr: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::MOVDDUPrm: > + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::MOVHPDmr: > + assert(MMO->isStore() && "Wrong machine mem operand for > instruction!"); + return false; > + case X86::MOVHPDrm: > + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::MOVLPDmr: > + assert(MMO->isStore() && "Wrong machine mem operand for > instruction!"); + return false; > + case X86::MOVLPDrr: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::MOVLPDrm: > + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::MOVMSKPDrr: > + case X86::MOVMSKPSrr: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::PBLENDVBrr0: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::PBLENDVBrm0: > + assert(MMO->isLoad() && "Wrong machine mem operand for 
instruction!"); > + return true; > + case X86::PCMPESTRIrm: > + case X86::PCMPESTRIArm: > + case X86::PCMPESTRICrm: > + case X86::PCMPESTRIOrm: > + case X86::PCMPESTRISrm: > + case X86::PCMPESTRIZrm: > + case X86::PCMPESTRM128MEM: > + case X86::PCMPESTRM128rm: > + case X86::PCMPISTRIrm: > + case X86::PCMPISTRIArm: > + case X86::PCMPISTRICrm: > + case X86::PCMPISTRIOrm: > + case X86::PCMPISTRISrm: > + case X86::PCMPISTRIZrm: > + case X86::PCMPISTRM128MEM: > + case X86::PCMPISTRM128rm: > + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::PCMPESTRIrr: > + case X86::PCMPESTRIArr: > + case X86::PCMPESTRICrr: > + case X86::PCMPESTRIOrr: > + case X86::PCMPESTRISrr: > + case X86::PCMPESTRIZrr: > + case X86::PCMPESTRM128REG: > + case X86::PCMPESTRM128rr: > + case X86::PCMPISTRIrr: > + case X86::PCMPISTRIArr: > + case X86::PCMPISTRICrr: > + case X86::PCMPISTRIOrr: > + case X86::PCMPISTRISrr: > + case X86::PCMPISTRIZrr: > + case X86::PCMPISTRM128REG: > + case X86::PCMPISTRM128rr: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::PEXTRBrr: > + case X86::MMX_PEXTRWri: > + case X86::PEXTRWri: > + case X86::PEXTRDrr: > + case X86::PEXTRQrr: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::PEXTRBmr: > + case X86::PEXTRWmr: > + case X86::PEXTRDmr: > + case X86::PEXTRQmr: > + assert(MMO->isStore() && "Wrong machine mem operand for > instruction!"); + return false; > + case X86::PINSRBrr: > + case X86::MMX_PINSRWrri: > + case X86::PINSRWrri: > + case X86::PINSRDrr: > + case X86::PINSRQrr: > + assert(MMO->isStore() && "Wrong machine mem operand for > instruction!"); + return false; > + case X86::PINSRBrm: > + case X86::MMX_PINSRWrmi: > + case X86::PINSRWrmi: > + case X86::PINSRDrm: > + case X86::PINSRQrm: > + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::PMOVMSKBrr: > + assert(0 && 
"Wrong machine mem operand for instruction!"); > + return false; > + case X86::MMX_PSLLWrm: > + case X86::PSLLWrm: > + case X86::MMX_PSLLDrm: > + case X86::PSLLDrm: > + case X86::MMX_PSLLQrm: > + case X86::PSLLQrm: > + case X86::MMX_PSRAWrm: > + case X86::PSRAWrm: > + case X86::MMX_PSRADrm: > + case X86::PSRADrm: > + case X86::MMX_PSRLWrm: > + case X86::PSRLWrm: > + case X86::MMX_PSRLDrm: > + case X86::PSRLDrm: > + case X86::MMX_PSRLQrm: > + case X86::PSRLQrm: > + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::MMX_PSLLWrr: > + case X86::MMX_PSLLWri: > + case X86::PSLLWrr: > + case X86::PSLLWri: > + case X86::MMX_PSLLDrr: > + case X86::MMX_PSLLDri: > + case X86::PSLLDrr: > + case X86::PSLLDri: > + case X86::MMX_PSLLQrr: > + case X86::MMX_PSLLQri: > + case X86::PSLLQrr: > + case X86::PSLLQri: > + case X86::MMX_PSRAWrr: > + case X86::MMX_PSRAWri: > + case X86::PSRAWrr: > + case X86::PSRAWri: > + case X86::MMX_PSRADrr: > + case X86::MMX_PSRADri: > + case X86::PSRADrr: > + case X86::PSRADri: > + case X86::MMX_PSRLWrr: > + case X86::MMX_PSRLWri: > + case X86::PSRLWrr: > + case X86::PSRLWri: > + case X86::MMX_PSRLDrr: > + case X86::MMX_PSRLDri: > + case X86::PSRLDrr: > + case X86::PSRLDri: > + case X86::MMX_PSRLQrr: > + case X86::MMX_PSRLQri: > + case X86::PSRLQrr: > + case X86::PSRLQri: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::PTESTrr: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::PTESTrm: > + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::UNPCKLPDrr: > + assert(0 && "Wrong machine mem operand for instruction!"); > + return false; > + case X86::UNPCKLPDrm: > + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); > + return false; > + } > + > + return isVectorInstr(MI); > +} > + > /// isFrameOperand - Return true and the FrameIndex if the 
specified > /// operand and follow operands form a reference to the stack frame. > bool X86InstrInfo::isFrameOperand(const MachineInstr *MI, unsigned int Op, > @@ -783,12 +1171,14 @@ > if ((Reg = isLoadFromStackSlot(MI, FrameIndex))) > return Reg; > // Check for post-frame index elimination operations > - return hasLoadFromStackSlot(MI, FrameIndex); > + const MachineMemOperand *Dummy; > + return hasLoadFromStackSlot(MI, Dummy, FrameIndex); > } > return 0; > } > > bool X86InstrInfo::hasLoadFromStackSlot(const MachineInstr *MI, > + const MachineMemOperand *&MMO, > int &FrameIndex) const { > for (MachineInstr::mmo_iterator o = MI->memoperands_begin(), > oe = MI->memoperands_end(); > @@ -798,6 +1188,7 @@ > if (const FixedStackPseudoSourceValue *Value = > dyn_cast((*o)->getValue())) { > FrameIndex = Value->getFrameIndex(); > + MMO = *o; > return true; > } > } > @@ -819,12 +1210,14 @@ > if ((Reg = isStoreToStackSlot(MI, FrameIndex))) > return Reg; > // Check for post-frame index elimination operations > - return hasStoreToStackSlot(MI, FrameIndex); > + const MachineMemOperand *Dummy; > + return hasStoreToStackSlot(MI, Dummy, FrameIndex); > } > return 0; > } > > bool X86InstrInfo::hasStoreToStackSlot(const MachineInstr *MI, > + const MachineMemOperand *&MMO, > int &FrameIndex) const { > for (MachineInstr::mmo_iterator o = MI->memoperands_begin(), > oe = MI->memoperands_end(); > @@ -834,6 +1227,7 @@ > if (const FixedStackPseudoSourceValue *Value = > dyn_cast((*o)->getValue())) { > FrameIndex = Value->getFrameIndex(); > + MMO = *o; > return true; > } > } > Index: lib/Target/X86/X86InstrInfo.h > =================================================================== > --- lib/Target/X86/X86InstrInfo.h (revision 89484) > +++ lib/Target/X86/X86InstrInfo.h (working copy) > @@ -448,6 +448,17 @@ > unsigned &SrcReg, unsigned &DstReg, > unsigned &SrcSubIdx, unsigned &DstSubIdx) > const; > > + /// isVectorInstr - Return true if the instruction is a vector > operation. 
+ virtual bool isVectorInstr(const MachineInstr& MI) const; > + > + /// isVectorOperand - Return true if the operand is of vector type.. > + virtual bool isVectorOperand(const MachineInstr& MI, > + const MachineOperand *MO) const; > + > + /// isVectorOperand - Return true if the mem operand is of vector type.. > + virtual bool isVectorOperand(const MachineInstr& MI, > + const MachineMemOperand *MMO) const; > + > unsigned isLoadFromStackSlot(const MachineInstr *MI, int &FrameIndex) > const; > /// isLoadFromStackSlotPostFE - Check for post-frame ptr elimination > /// stack locations as well. This uses a heuristic so it isn't > @@ -457,11 +468,14 @@ > > /// hasLoadFromStackSlot - If the specified machine instruction has > /// a load from a stack slot, return true along with the FrameIndex > - /// of the loaded stack slot. If not, return false. Unlike > + /// of the loaded stack slot and the machine mem operand containing > + /// the reference. If not, return false. Unlike > /// isLoadFromStackSlot, this returns true for any instructions that > /// loads from the stack. This is a hint only and may not catch all > /// cases. > - bool hasLoadFromStackSlot(const MachineInstr *MI, int &FrameIndex) > const; + bool hasLoadFromStackSlot(const MachineInstr *MI, > + const MachineMemOperand *&MMO, > + int &FrameIndex) const; > > unsigned isStoreToStackSlot(const MachineInstr *MI, int &FrameIndex) > const; /// isStoreToStackSlotPostFE - Check for post-frame ptr elimination > @@ -472,11 +486,13 @@ > > /// hasStoreToStackSlot - If the specified machine instruction has a > /// store to a stack slot, return true along with the FrameIndex of > - /// the loaded stack slot. If not, return false. Unlike > - /// isStoreToStackSlot, this returns true for any instructions that > - /// loads from the stack. This is a hint only and may not catch all > - /// cases. 
> - bool hasStoreToStackSlot(const MachineInstr *MI, int &FrameIndex) const; > + /// the loaded stack slot and the machine mem operand containing the > + /// reference. If not, return false. Unlike isStoreToStackSlot, > + /// this returns true for any instructions that loads from the > + /// stack. This is a hint only and may not catch all cases. > + bool hasStoreToStackSlot(const MachineInstr *MI, > + const MachineMemOperand *&MMO, > + int &FrameIndex) const; > > bool isReallyTriviallyReMaterializable(const MachineInstr *MI, > AliasAnalysis *AA) const; > Index: test/CodeGen/X86/2009-11-20-VectorSpillComments.ll > =================================================================== > --- test/CodeGen/X86/2009-11-20-VectorSpillComments.ll (revision 0) > +++ test/CodeGen/X86/2009-11-20-VectorSpillComments.ll (revision 0) > @@ -0,0 +1,19 @@ > +; RUN: llc < %s -march=x86-64 | FileCheck %s > +; CHECK: Vector Spill > +; CHECK: Vector Reload > +; CHECK: Vector Folded Reload > +; CHECK: Scalar Spill > +; CHECK: Scalar Folded Reload > + > +define <8 x i32> @foo(<8 x i32> %t, <8 x i32> %u) { > + %m = srem <8 x i32> %t, %u > + ret <8 x i32> %m > +} > +define <8 x i32> @bar(<8 x i32> %t, <8 x i32> %u) { > + %m = urem <8 x i32> %t, %u > + ret <8 x i32> %m > +} > +define <8 x float> @qux(<8 x float> %t, <8 x float> %u) { > + %m = frem <8 x float> %t, %u > + ret <8 x float> %m > +} From gohman at apple.com Mon Nov 23 10:13:39 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 23 Nov 2009 16:13:39 -0000 Subject: [llvm-commits] [llvm] r89658 - in /llvm/trunk: lib/Transforms/Scalar/SCCP.cpp test/Transforms/IPConstantProp/user-with-multiple-uses.ll Message-ID: <200911231613.nANGDdFb028965@zion.cs.uiuc.edu> Author: djg Date: Mon Nov 23 10:13:39 2009 New Revision: 89658 URL: http://llvm.org/viewvc/llvm-project?rev=89658&view=rev Log: Fix a use of an invalidated iterator in the case where there are multiple adjacent uses of a dead basic block from the same user. This fixes PR5596. 
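For readers following the log above, the pattern can be sketched outside LLVM. The sketch below is illustrative only and is not the SCCP code: UseList, deleteUser and deleteAllUsers are hypothetical names, and a std::list of user ids stands in for a use list, on the assumption that deleting a user drops all of its entries. It reads the user, steps past every adjacent entry from that same user, and only then deletes it, so the iterator never rests on an entry that is about to disappear.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical stand-in for a use list: each entry names the user that holds
// the use. This is an assumption for illustration, not LLVM's data structure.
using UseList = std::list<int>;

// Deleting a user drops every use it holds, so all of its entries vanish.
void deleteUser(UseList &uses, int user) {
  uses.remove(user);
}

// Visit each user once, delete it, and return the order of deletion.
std::vector<int> deleteAllUsers(UseList &uses) {
  std::vector<int> deleted;
  for (UseList::iterator ui = uses.begin(), ue = uses.end(); ui != ue;) {
    // Grab the user, then advance early past all adjacent uses from the
    // same user, because those entries are removed by the deletion below.
    int user = *ui;
    do { ++ui; } while (ui != ue && *ui == user);

    deleteUser(uses, user);
    deleted.push_back(user);
  }
  return deleted;
}
```

With uses {7, 7, 7, 3, 9, 9} each user is visited once. Reading the user with a single post-increment instead would leave the iterator on the second 7, which the deletion then removes.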
Added: llvm/trunk/test/Transforms/IPConstantProp/user-with-multiple-uses.ll Modified: llvm/trunk/lib/Transforms/Scalar/SCCP.cpp Modified: llvm/trunk/lib/Transforms/Scalar/SCCP.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SCCP.cpp?rev=89658&r1=89657&r2=89658&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/SCCP.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/SCCP.cpp Mon Nov 23 10:13:39 2009 @@ -1871,8 +1871,12 @@ BasicBlock *DeadBB = BlocksToErase[i]; for (Value::use_iterator UI = DeadBB->use_begin(), UE = DeadBB->use_end(); UI != UE; ) { + // Grab the user and then increment the iterator early, as the user + // will be deleted. Step past all adjacent uses from the same user. + Instruction *I = dyn_cast<Instruction>(*UI); + do { ++UI; } while (UI != UE && *UI == I); + // Ignore blockaddress users; BasicBlock's dtor will handle them. - Instruction *I = dyn_cast<Instruction>(*UI++); if (!I) continue; bool Folded = ConstantFoldTerminator(I->getParent()); Added: llvm/trunk/test/Transforms/IPConstantProp/user-with-multiple-uses.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/IPConstantProp/user-with-multiple-uses.ll?rev=89658&view=auto ============================================================================== --- llvm/trunk/test/Transforms/IPConstantProp/user-with-multiple-uses.ll (added) +++ llvm/trunk/test/Transforms/IPConstantProp/user-with-multiple-uses.ll Mon Nov 23 10:13:39 2009 @@ -0,0 +1,30 @@ +; RUN: opt < %s -S -ipsccp | FileCheck %s +; PR5596 + +; IPSCCP should propagate the 0 argument, eliminate the switch, and propagate +; the result.
+ +; CHECK: define i32 @main() noreturn nounwind { +; CHECK-NEXT: entry: +; CHECK-NEXT: %call2 = tail call i32 @wwrite(i64 0) nounwind +; CHECK-NEXT: ret i32 123 + +define i32 @main() noreturn nounwind { +entry: + %call2 = tail call i32 @wwrite(i64 0) nounwind + ret i32 %call2 +} + +define internal i32 @wwrite(i64 %i) nounwind readnone { +entry: + switch i64 %i, label %sw.default [ + i64 3, label %return + i64 10, label %return + ] + +sw.default: + ret i32 123 + +return: + ret i32 0 +} From gohman at apple.com Mon Nov 23 10:22:21 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 23 Nov 2009 16:22:21 -0000 Subject: [llvm-commits] [llvm] r89659 - in /llvm/trunk: lib/Analysis/ConstantFolding.cpp lib/Transforms/IPO/GlobalOpt.cpp test/Transforms/GlobalOpt/constantfold-initializers.ll test/Transforms/InstCombine/cast.ll test/Transforms/InstCombine/shufflevec-constant.ll Message-ID: <200911231622.nANGMLiD029357@zion.cs.uiuc.edu> Author: djg Date: Mon Nov 23 10:22:21 2009 New Revision: 89659 URL: http://llvm.org/viewvc/llvm-project?rev=89659&view=rev Log: Make ConstantFoldConstantExpression recursively visit the entire ConstantExpr, not just the top-level operator. This allows it to fold many more constants. Also, make GlobalOpt call ConstantFoldConstantExpression on GlobalVariable initializers. 
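As a rough illustration of the recursive visit described in the log above, here is a toy folder over a hypothetical expression tree. Expr, lit, op and fold are invented for this sketch and are not LLVM's ConstantExpr API. The point is only that operands are folded first, so the top-level operator always sees plain values. A folder that looked only at the top-level operator could do nothing with a nested operand.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical toy expression: either a literal or a binary operator applied
// to operand sub-expressions. An assumption for illustration only.
struct Expr {
  enum Kind { Literal, Add, Sub, NotEqual };
  Kind kind = Literal;
  long value = 0;                          // meaningful only for Literal
  std::vector<std::shared_ptr<Expr>> ops;  // meaningful only for operators
};
using ExprPtr = std::shared_ptr<Expr>;

// Build a literal node.
ExprPtr lit(long v) {
  ExprPtr e = std::make_shared<Expr>();
  e->value = v;
  return e;
}

// Build a binary operator node.
ExprPtr op(Expr::Kind k, ExprPtr lhs, ExprPtr rhs) {
  ExprPtr e = std::make_shared<Expr>();
  e->kind = k;
  e->ops.push_back(lhs);
  e->ops.push_back(rhs);
  return e;
}

// Fold every operand recursively, then apply the top-level operator.
long fold(const Expr &e) {
  if (e.kind == Expr::Literal)
    return e.value;
  std::vector<long> vals;
  for (const ExprPtr &o : e.ops)
    vals.push_back(fold(*o));  // recursive visit of each operand
  switch (e.kind) {
  case Expr::Add:      return vals[0] + vals[1];
  case Expr::Sub:      return vals[0] - vals[1];
  case Expr::NotEqual: return vals[0] != vals[1];
  case Expr::Literal:  break;  // handled above
  }
  return 0;
}
```

For example, a "not equal" whose left operand is itself a subtraction, such as (101 - 100) != 1, folds to 0 once the inner subtraction has been folded. That has the same nested shape as the initializer in the new GlobalOpt test.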
Added: llvm/trunk/test/Transforms/GlobalOpt/constantfold-initializers.ll Modified: llvm/trunk/lib/Analysis/ConstantFolding.cpp llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp llvm/trunk/test/Transforms/InstCombine/cast.ll llvm/trunk/test/Transforms/InstCombine/shufflevec-constant.ll Modified: llvm/trunk/lib/Analysis/ConstantFolding.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ConstantFolding.cpp?rev=89659&r1=89658&r2=89659&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/ConstantFolding.cpp (original) +++ llvm/trunk/lib/Analysis/ConstantFolding.cpp Mon Nov 23 10:22:21 2009 @@ -671,8 +671,13 @@ Constant *llvm::ConstantFoldConstantExpression(ConstantExpr *CE, const TargetData *TD) { SmallVector<Constant*, 8> Ops; - for (User::op_iterator i = CE->op_begin(), e = CE->op_end(); i != e; ++i) - Ops.push_back(cast<Constant>(*i)); + for (User::op_iterator i = CE->op_begin(), e = CE->op_end(); i != e; ++i) { + Constant *NewC = cast<Constant>(*i); + // Recursively fold the ConstantExpr's operands. + if (ConstantExpr *NewCE = dyn_cast<ConstantExpr>(NewC)) + NewC = ConstantFoldConstantExpression(NewCE, TD); + Ops.push_back(NewC); + } if (CE->isCompare()) return ConstantFoldCompareInstOperands(CE->getPredicate(), Ops[0], Ops[1], @@ -687,6 +692,10 @@ /// attempting to fold instructions like loads and stores, which have no /// constant expression form. /// +/// TODO: This function neither utilizes nor preserves nsw/nuw/inbounds/etc +/// information, due to only being passed an opcode and operands. Constant +/// folding using this function strips this information.
+/// Constant *llvm::ConstantFoldInstOperands(unsigned Opcode, const Type *DestTy, Constant* const* Ops, unsigned NumOps, const TargetData *TD) { Modified: llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp?rev=89659&r1=89658&r2=89659&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp (original) +++ llvm/trunk/lib/Transforms/IPO/GlobalOpt.cpp Mon Nov 23 10:22:21 2009 @@ -1898,6 +1898,15 @@ // Global variables without names cannot be referenced outside this module. if (!GV->hasName() && !GV->isDeclaration()) GV->setLinkage(GlobalValue::InternalLinkage); + // Simplify the initializer. + if (GV->hasInitializer()) + if (ConstantExpr *CE = dyn_cast<ConstantExpr>(GV->getInitializer())) { + TargetData *TD = getAnalysisIfAvailable<TargetData>(); + Constant *New = ConstantFoldConstantExpression(CE, TD); + if (New && New != CE) + GV->setInitializer(New); + } + // Do more involved optimizations if the global is internal.
if (!GV->isConstant() && GV->hasLocalLinkage() && GV->hasInitializer()) Changed |= ProcessInternalGlobal(GV, GVI); Added: llvm/trunk/test/Transforms/GlobalOpt/constantfold-initializers.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GlobalOpt/constantfold-initializers.ll?rev=89659&view=auto ============================================================================== --- llvm/trunk/test/Transforms/GlobalOpt/constantfold-initializers.ll (added) +++ llvm/trunk/test/Transforms/GlobalOpt/constantfold-initializers.ll Mon Nov 23 10:22:21 2009 @@ -0,0 +1,8 @@ +; RUN: opt < %s -S -globalopt | FileCheck %s + +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128" + +@.str91250 = global [3 x i8] zeroinitializer + +; CHECK: @A = global i1 false +@A = global i1 icmp ne (i64 sub nsw (i64 ptrtoint (i8* getelementptr inbounds ([3 x i8]* @.str91250, i64 0, i64 1) to i64), i64 ptrtoint ([3 x i8]* @.str91250 to i64)), i64 1) Modified: llvm/trunk/test/Transforms/InstCombine/cast.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/cast.ll?rev=89659&r1=89658&r2=89659&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/cast.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/cast.ll Mon Nov 23 10:22:21 2009 @@ -103,7 +103,7 @@ %p = malloc [4 x i8] ; <[4 x i8]*> [#uses=1] %c = bitcast [4 x i8]* %p to i32* ; <i32*> [#uses=1] ret i32* %c -; CHECK: %malloccall = tail call i8* @malloc(i32 ptrtoint ([4 x i8]* getelementptr ([4 x i8]* null, i32 1) to i32)) +; CHECK: %malloccall = tail call i8* @malloc(i32 4) ; CHECK: ret i32* %c } @@ -275,7 +275,7 @@ %tmp8.upgrd.1 = bitcast [16 x i8]* %tmp8 to double* ; <double*> [#uses=1] store double* %tmp8.upgrd.1, double** %tmp ret void -; CHECK: %malloccall = tail call i8* @malloc(i32 ptrtoint ([16 x i8]* getelementptr ([16 x i8]*
null, i32 1) to i32)) +; CHECK: %malloccall = tail call i8* @malloc(i32 16) ; CHECK: %tmp8.upgrd.1 = bitcast i8* %malloccall to double* ; CHECK: store double* %tmp8.upgrd.1, double** %tmp ; CHECK: ret void Modified: llvm/trunk/test/Transforms/InstCombine/shufflevec-constant.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/shufflevec-constant.ll?rev=89659&r1=89658&r2=89659&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/shufflevec-constant.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/shufflevec-constant.ll Mon Nov 23 10:22:21 2009 @@ -1,4 +1,4 @@ -; RUN: opt < %s -instcombine -S | grep "2 x float" +; RUN: opt < %s -instcombine -S | grep {ret <4 x float> } target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128" target triple = "i386-apple-darwin9" From gohman at apple.com Mon Nov 23 10:24:19 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 23 Nov 2009 16:24:19 -0000 Subject: [llvm-commits] [llvm] r89660 - /llvm/trunk/lib/VMCore/PassManager.cpp Message-ID: <200911231624.nANGOJCN029476@zion.cs.uiuc.edu> Author: djg Date: Mon Nov 23 10:24:18 2009 New Revision: 89660 URL: http://llvm.org/viewvc/llvm-project?rev=89660&view=rev Log: Move FunctionPassManagerImpl's dumpArguments and dumpPasses calls out of its run function and into its doInitialization method, so that it does the dump once instead of once per function. 
Modified: llvm/trunk/lib/VMCore/PassManager.cpp Modified: llvm/trunk/lib/VMCore/PassManager.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/PassManager.cpp?rev=89660&r1=89659&r2=89660&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/PassManager.cpp (original) +++ llvm/trunk/lib/VMCore/PassManager.cpp Mon Nov 23 10:24:18 2009 @@ -1231,6 +1231,9 @@ bool FunctionPassManagerImpl::doInitialization(Module &M) { bool Changed = false; + dumpArguments(); + dumpPasses(); + for (unsigned Index = 0; Index < getNumContainedManagers(); ++Index) Changed |= getContainedManager(Index)->doInitialization(M); @@ -1274,9 +1277,6 @@ bool Changed = false; TimingInfo::createTheTimeInfo(); - dumpArguments(); - dumpPasses(); - initializeAllAnalysisInfo(); for (unsigned Index = 0; Index < getNumContainedManagers(); ++Index) Changed |= getContainedManager(Index)->runOnFunction(F); From sabre at nondot.org Mon Nov 23 10:38:55 2009 From: sabre at nondot.org (Chris Lattner) Date: Mon, 23 Nov 2009 16:38:55 -0000 Subject: [llvm-commits] [llvm] r89662 - /llvm/trunk/include/llvm/Analysis/AliasAnalysis.h Message-ID: <200911231638.nANGct28030001@zion.cs.uiuc.edu> Author: lattner Date: Mon Nov 23 10:38:54 2009 New Revision: 89662 URL: http://llvm.org/viewvc/llvm-project?rev=89662&view=rev Log: add a helper Modified: llvm/trunk/include/llvm/Analysis/AliasAnalysis.h Modified: llvm/trunk/include/llvm/Analysis/AliasAnalysis.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/AliasAnalysis.h?rev=89662&r1=89661&r2=89662&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/AliasAnalysis.h (original) +++ llvm/trunk/include/llvm/Analysis/AliasAnalysis.h Mon Nov 23 10:38:54 2009 @@ -94,6 +94,13 @@ virtual AliasResult alias(const Value *V1, unsigned V1Size, const Value *V2, unsigned V2Size); + /// isNoAlias - A trivial helper function 
to check to see if the specified + /// pointers are no-alias. + bool isNoAlias(const Value *V1, unsigned V1Size, + const Value *V2, unsigned V2Size) { + return alias(V1, V1Size, V2, V2Size) == NoAlias; + } + /// pointsToConstantMemory - If the specified pointer is known to point into /// constant global memory, return true. This allows disambiguation of store /// instructions from constant pointers. From sabre at nondot.org Mon Nov 23 10:44:44 2009 From: sabre at nondot.org (Chris Lattner) Date: Mon, 23 Nov 2009 16:44:44 -0000 Subject: [llvm-commits] [llvm] r89663 - /llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Message-ID: <200911231644.nANGiiSl030197@zion.cs.uiuc.edu> Author: lattner Date: Mon Nov 23 10:44:43 2009 New Revision: 89663 URL: http://llvm.org/viewvc/llvm-project?rev=89663&view=rev Log: speed up BasicAA a bit by implementing a long-standing TODO. Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89663&r1=89662&r2=89663&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Mon Nov 23 10:44:43 2009 @@ -289,18 +289,29 @@ return NoModRef; // If the pointer is to a locally allocated object that does not escape, - // then the call can not mod/ref the pointer unless the call takes the - // argument without capturing it. + // then the call can not mod/ref the pointer unless the call takes the pointer + // as an argument, and itself doesn't capture it. if (isNonEscapingLocalObject(Object) && CS.getInstruction() != Object) { - bool passedAsArg = false; - // TODO: Eventually only check 'nocapture' arguments. 
+ bool PassedAsArg = false; + unsigned ArgNo = 0; for (CallSite::arg_iterator CI = CS.arg_begin(), CE = CS.arg_end(); - CI != CE; ++CI) - if (isa((*CI)->getType()) && - alias(cast(CI), ~0U, P, ~0U) != NoAlias) - passedAsArg = true; + CI != CE; ++CI, ++ArgNo) { + // Only look at the no-capture pointer arguments. + if (!isa((*CI)->getType()) || + !CS.paramHasAttr(ArgNo+1, Attribute::NoCapture)) + continue; + + // If this is a no-capture pointer argument, see if we can tell that it + // is impossible to alias the pointer we're checking. If not, we have to + // assume that the call could touch the pointer, even though it doesn't + // escape. + if (alias(cast(CI), ~0U, P, ~0U) != NoAlias) { + PassedAsArg = true; + break; + } + } - if (!passedAsArg) + if (!PassedAsArg) return NoModRef; } From sabre at nondot.org Mon Nov 23 10:45:27 2009 From: sabre at nondot.org (Chris Lattner) Date: Mon, 23 Nov 2009 16:45:27 -0000 Subject: [llvm-commits] [llvm] r89664 - /llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Message-ID: <200911231645.nANGjR0F030233@zion.cs.uiuc.edu> Author: lattner Date: Mon Nov 23 10:45:27 2009 New Revision: 89664 URL: http://llvm.org/viewvc/llvm-project?rev=89664&view=rev Log: whitespace cleanup, tidying Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89664&r1=89663&r2=89664&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Mon Nov 23 10:45:27 2009 @@ -217,13 +217,13 @@ // VisitedPHIs - Track PHI nodes visited by a aliasCheck() call. SmallPtrSet VisitedPHIs; - // aliasGEP - Provide a bunch of ad-hoc rules to disambiguate a GEP instruction - // against another. 
+ // aliasGEP - Provide a bunch of ad-hoc rules to disambiguate a GEP + // instruction against another. AliasResult aliasGEP(const Value *V1, unsigned V1Size, const Value *V2, unsigned V2Size); - // aliasPHI - Provide a bunch of ad-hoc rules to disambiguate a PHI instruction - // against another. + // aliasPHI - Provide a bunch of ad-hoc rules to disambiguate a PHI + // instruction against another. AliasResult aliasPHI(const PHINode *PN, unsigned PNSize, const Value *V2, unsigned V2Size); @@ -236,7 +236,7 @@ // CheckGEPInstructions - Check two GEP instructions with known // must-aliasing base pointers. This checks to see if the index expressions - // preclude the pointers from aliasing... + // preclude the pointers from aliasing. AliasResult CheckGEPInstructions(const Type* BasePtr1Ty, Value **GEP1Ops, unsigned NumGEP1Ops, unsigned G1Size, @@ -269,11 +269,10 @@ } -// getModRefInfo - Check to see if the specified callsite can clobber the -// specified memory object. Since we only look at local properties of this -// function, we really can't say much about this query. We do, however, use -// simple "address taken" analysis on local objects. -// +/// getModRefInfo - Check to see if the specified callsite can clobber the +/// specified memory object. Since we only look at local properties of this +/// function, we really can't say much about this query. We do, however, use +/// simple "address taken" analysis on local objects. AliasAnalysis::ModRefResult BasicAliasAnalysis::getModRefInfo(CallSite CS, Value *P, unsigned Size) { const Value *Object = P->getUnderlyingObject(); @@ -534,8 +533,8 @@ return MayAlias; } -// aliasSelect - Provide a bunch of ad-hoc rules to disambiguate a Select instruction -// against another. +/// aliasSelect - Provide a bunch of ad-hoc rules to disambiguate a Select +/// instruction against another. 
AliasAnalysis::AliasResult BasicAliasAnalysis::aliasSelect(const SelectInst *SI, unsigned SISize, const Value *V2, unsigned V2Size) { @@ -701,10 +700,12 @@ // isNonEscapingLocalObject considers all stores to be escapes (it // passes true for the StoreCaptures argument to PointerMayBeCaptured). if (O1 != O2) { - if ((isa(O1) || isa(O1) || isa(O1)) && + if ((isa(O1) || isa(O1) || isa(O1) || + isa(O2) || isa(O2) || isa(O2)) && + if ((isa(O2) || isa(O2) || isa(O2) || + isa Author: lattner Date: Mon Nov 23 10:46:41 2009 New Revision: 89665 URL: http://llvm.org/viewvc/llvm-project?rev=89665&view=rev Log: use the new isNoAlias method to simplify some code, only do an escaping check if we have a non-constant pointer. Constant pointers can't be local. Modified: llvm/trunk/lib/Analysis/AliasAnalysis.cpp llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Modified: llvm/trunk/lib/Analysis/AliasAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/AliasAnalysis.cpp?rev=89665&r1=89664&r2=89665&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/AliasAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/AliasAnalysis.cpp Mon Nov 23 10:46:41 2009 @@ -127,17 +127,18 @@ AliasAnalysis::ModRefResult AliasAnalysis::getModRefInfo(CallSite CS, Value *P, unsigned Size) { - ModRefResult Mask = ModRef; ModRefBehavior MRB = getModRefBehavior(CS); if (MRB == DoesNotAccessMemory) return NoModRef; - else if (MRB == OnlyReadsMemory) + + ModRefResult Mask = ModRef; + if (MRB == OnlyReadsMemory) Mask = Ref; else if (MRB == AliasAnalysis::AccessesArguments) { bool doesAlias = false; for (CallSite::arg_iterator AI = CS.arg_begin(), AE = CS.arg_end(); AI != AE; ++AI) - if (alias(*AI, ~0U, P, Size) != NoAlias) { + if (!isNoAlias(*AI, ~0U, P, Size)) { doesAlias = true; break; } Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89665&r1=89664&r2=89665&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Mon Nov 23 10:46:41 2009 @@ -290,7 +290,8 @@ // If the pointer is to a locally allocated object that does not escape, // then the call can not mod/ref the pointer unless the call takes the pointer // as an argument, and itself doesn't capture it. - if (isNonEscapingLocalObject(Object) && CS.getInstruction() != Object) { + if (!isa(Object) && CS.getInstruction() != Object && + isNonEscapingLocalObject(Object)) { bool PassedAsArg = false; unsigned ArgNo = 0; for (CallSite::arg_iterator CI = CS.arg_begin(), CE = CS.arg_end(); @@ -304,7 +305,7 @@ // is impossible to alias the pointer we're checking. If not, we have to // assume that the call could touch the pointer, even though it doesn't // escape. - if (alias(cast(CI), ~0U, P, ~0U) != NoAlias) { + if (!isNoAlias(cast(CI), ~0U, P, ~0U)) { PassedAsArg = true; break; } @@ -328,18 +329,20 @@ Len = LenCI->getZExtValue(); Value *Dest = II->getOperand(1); Value *Src = II->getOperand(2); - if (alias(Dest, Len, P, Size) == NoAlias) { - if (alias(Src, Len, P, Size) == NoAlias) + if (isNoAlias(Dest, Len, P, Size)) { + if (isNoAlias(Src, Len, P, Size)) return NoModRef; return Ref; } break; } case Intrinsic::memset: + // Since memset is 'accesses arguments' only, the AliasAnalysis base class + // will handle it for the variable length case. 
if (ConstantInt *LenCI = dyn_cast<ConstantInt>(II->getOperand(3))) { unsigned Len = LenCI->getZExtValue(); Value *Dest = II->getOperand(1); - if (alias(Dest, Len, P, Size) == NoAlias) + if (isNoAlias(Dest, Len, P, Size)) return NoModRef; } break; @@ -358,7 +361,7 @@ if (TD) { Value *Op1 = II->getOperand(1); unsigned Op1Size = TD->getTypeStoreSize(Op1->getType()); - if (alias(Op1, Op1Size, P, Size) == NoAlias) + if (isNoAlias(Op1, Op1Size, P, Size)) return NoModRef; } break; @@ -366,13 +369,13 @@ case Intrinsic::lifetime_end: case Intrinsic::invariant_start: { unsigned PtrSize = cast<ConstantInt>(II->getOperand(1))->getZExtValue(); - if (alias(II->getOperand(2), PtrSize, P, Size) == NoAlias) + if (isNoAlias(II->getOperand(2), PtrSize, P, Size)) return NoModRef; break; } case Intrinsic::invariant_end: { unsigned PtrSize = cast<ConstantInt>(II->getOperand(2))->getZExtValue(); - if (alias(II->getOperand(3), PtrSize, P, Size) == NoAlias) + if (isNoAlias(II->getOperand(3), PtrSize, P, Size)) return NoModRef; break; } @@ -701,11 +704,11 @@ // passes true for the StoreCaptures argument to PointerMayBeCaptured).
if (O1 != O2) { if ((isa(O1) || isa(O1) || isa(O1) || - isa(O1)) && isNonEscapingLocalObject(O2)) return NoAlias; if ((isa(O2) || isa(O2) || isa(O2) || - isa(O2)) && isNonEscapingLocalObject(O1)) return NoAlias; } From sabre at nondot.org Mon Nov 23 11:07:35 2009 From: sabre at nondot.org (Chris Lattner) Date: Mon, 23 Nov 2009 17:07:35 -0000 Subject: [llvm-commits] [llvm] r89666 - /llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Message-ID: <200911231707.nANH7Zwg031063@zion.cs.uiuc.edu> Author: lattner Date: Mon Nov 23 11:07:35 2009 New Revision: 89666 URL: http://llvm.org/viewvc/llvm-project?rev=89666&view=rev Log: fix comment, thanks all :) Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89666&r1=89665&r2=89666&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Mon Nov 23 11:07:35 2009 @@ -263,7 +263,9 @@ bool BasicAliasAnalysis::pointsToConstantMemory(const Value *P) { if (const GlobalVariable *GV = dyn_cast(P->getUnderlyingObject())) - // FIXME: shouldn't this require GV to be "ODR"? + // Note: this doesn't require GV to be "ODR" because it isn't legal for a + // global to be marked constant in some modules and non-constant in others. + // GV may even be a declaration, not a definition. 
return GV->isConstant(); return false; } From sabre at nondot.org Mon Nov 23 11:08:23 2009 From: sabre at nondot.org (Chris Lattner) Date: Mon, 23 Nov 2009 09:08:23 -0800 Subject: [llvm-commits] [llvm] r89602 - /llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp In-Reply-To: <4B0A46AA.4040207@free.fr> References: <200911221616.nAMGGm5b026391@zion.cs.uiuc.edu> <4B0A46AA.4040207@free.fr> Message-ID: On Nov 23, 2009, at 12:24 AM, Duncan Sands wrote: > Hi Chris, > >> bool BasicAliasAnalysis::pointsToConstantMemory(const Value *P) { >> if (const GlobalVariable *GV = dyn_cast(P->getUnderlyingObject())) >> + // FIXME: shouldn't this require GV to be "ODR"? > > I'm not sure, but I think it's the case that something declared weak and > constant can safely be considered constant, i.e. readonly. However you > can't assume that the initializer it has is correct - some other (read-only) > initializer may be substituted at link time. Makes sense, I committed the patch. The test is verifying whether it is constant (thus a store is guaranteed not to alias it) not anything about its specific value. Thanks all! -Chris From gohman at apple.com Mon Nov 23 11:16:22 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 23 Nov 2009 17:16:22 -0000 Subject: [llvm-commits] [llvm] r89667 - in /llvm/trunk/lib/CodeGen/SelectionDAG: FunctionLoweringInfo.cpp FunctionLoweringInfo.h SelectionDAGBuild.cpp SelectionDAGBuild.h SelectionDAGISel.cpp Message-ID: <200911231716.nANHGNlI031330@zion.cs.uiuc.edu> Author: djg Date: Mon Nov 23 11:16:22 2009 New Revision: 89667 URL: http://llvm.org/viewvc/llvm-project?rev=89667&view=rev Log: Move the FunctionLoweringInfo class and some related utility functions out of SelectionDAGBuild.h/cpp into its own files, to help separate general lowering logic from SelectionDAG-specific lowering logic. 
Added: llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp - copied, changed from r89623, llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h - copied, changed from r88738, llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Copied: llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp (from r89623, llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp) URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp?p2=llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp&p1=llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp&r1=89623&r2=89667&rev=89667&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp Mon Nov 23 11:16:22 2009 @@ -1,4 +1,4 @@ -//===-- SelectionDAGBuild.cpp - Selection-DAG building --------------------===// +//===-- FunctionLoweringInfo.cpp ------------------------------------------===// // // The LLVM Compiler Infrastructure // @@ -7,39 +7,24 @@ // //===----------------------------------------------------------------------===// // -// This implements routines for translating from LLVM IR into SelectionDAG IR. +// This implements routines for translating functions from LLVM IR into +// Machine IR. 
// //===----------------------------------------------------------------------===// -#define DEBUG_TYPE "isel" -#include "SelectionDAGBuild.h" -#include "llvm/ADT/BitVector.h" -#include "llvm/ADT/SmallSet.h" -#include "llvm/Analysis/AliasAnalysis.h" -#include "llvm/Constants.h" -#include "llvm/Constants.h" +#define DEBUG_TYPE "function-lowering-info" +#include "FunctionLoweringInfo.h" #include "llvm/CallingConv.h" #include "llvm/DerivedTypes.h" #include "llvm/Function.h" -#include "llvm/GlobalVariable.h" -#include "llvm/InlineAsm.h" #include "llvm/Instructions.h" -#include "llvm/Intrinsics.h" -#include "llvm/IntrinsicInst.h" #include "llvm/LLVMContext.h" #include "llvm/Module.h" -#include "llvm/CodeGen/FastISel.h" -#include "llvm/CodeGen/GCStrategy.h" -#include "llvm/CodeGen/GCMetadata.h" #include "llvm/CodeGen/MachineFunction.h" #include "llvm/CodeGen/MachineFrameInfo.h" #include "llvm/CodeGen/MachineInstrBuilder.h" -#include "llvm/CodeGen/MachineJumpTableInfo.h" #include "llvm/CodeGen/MachineModuleInfo.h" #include "llvm/CodeGen/MachineRegisterInfo.h" -#include "llvm/CodeGen/PseudoSourceValue.h" -#include "llvm/CodeGen/SelectionDAG.h" -#include "llvm/CodeGen/DwarfWriter.h" #include "llvm/Analysis/DebugInfo.h" #include "llvm/Target/TargetRegisterInfo.h" #include "llvm/Target/TargetData.h" @@ -57,25 +42,14 @@ #include using namespace llvm; -/// LimitFloatPrecision - Generate low-precision inline sequences for -/// some float libcalls (6, 8 or 12 bits). -static unsigned LimitFloatPrecision; - -static cl::opt -LimitFPPrecision("limit-float-precision", - cl::desc("Generate low-precision inline sequences " - "for some float libcalls"), - cl::location(LimitFloatPrecision), - cl::init(0)); - /// ComputeLinearIndex - Given an LLVM IR aggregate type and a sequence /// of insertvalue or extractvalue indices that identify a member, return /// the linearized index of the start of the member. 
/// -static unsigned ComputeLinearIndex(const TargetLowering &TLI, const Type *Ty, - const unsigned *Indices, - const unsigned *IndicesEnd, - unsigned CurIndex = 0) { +unsigned llvm::ComputeLinearIndex(const TargetLowering &TLI, const Type *Ty, + const unsigned *Indices, + const unsigned *IndicesEnd, + unsigned CurIndex) { // Base case: We're done. if (Indices && Indices == IndicesEnd) return CurIndex; @@ -113,10 +87,10 @@ /// If Offsets is non-null, it points to a vector to be filled in /// with the in-memory offsets of each of the individual values. /// -static void ComputeValueVTs(const TargetLowering &TLI, const Type *Ty, - SmallVectorImpl &ValueVTs, - SmallVectorImpl *Offsets = 0, - uint64_t StartingOffset = 0) { +void llvm::ComputeValueVTs(const TargetLowering &TLI, const Type *Ty, + SmallVectorImpl &ValueVTs, + SmallVectorImpl *Offsets, + uint64_t StartingOffset) { // Given a struct type, recursively traverse the elements. if (const StructType *STy = dyn_cast(Ty)) { const StructLayout *SL = TLI.getTargetData()->getStructLayout(STy); @@ -146,101 +120,6 @@ Offsets->push_back(StartingOffset); } -namespace llvm { - /// RegsForValue - This struct represents the registers (physical or virtual) - /// that a particular set of values is assigned, and the type information about - /// the value. The most common situation is to represent one value at a time, - /// but struct or array values are handled element-wise as multiple values. - /// The splitting of aggregates is performed recursively, so that we never - /// have aggregate-typed registers. The values at this point do not necessarily - /// have legal types, so each value may require one or more registers of some - /// legal type. - /// - struct VISIBILITY_HIDDEN RegsForValue { - /// TLI - The TargetLowering object. - /// - const TargetLowering *TLI; - - /// ValueVTs - The value types of the values, which may not be legal, and - /// may need be promoted or synthesized from one or more registers. 
- /// - SmallVector ValueVTs; - - /// RegVTs - The value types of the registers. This is the same size as - /// ValueVTs and it records, for each value, what the type of the assigned - /// register or registers are. (Individual values are never synthesized - /// from more than one type of register.) - /// - /// With virtual registers, the contents of RegVTs is redundant with TLI's - /// getRegisterType member function, however when with physical registers - /// it is necessary to have a separate record of the types. - /// - SmallVector RegVTs; - - /// Regs - This list holds the registers assigned to the values. - /// Each legal or promoted value requires one register, and each - /// expanded value requires multiple registers. - /// - SmallVector Regs; - - RegsForValue() : TLI(0) {} - - RegsForValue(const TargetLowering &tli, - const SmallVector &regs, - EVT regvt, EVT valuevt) - : TLI(&tli), ValueVTs(1, valuevt), RegVTs(1, regvt), Regs(regs) {} - RegsForValue(const TargetLowering &tli, - const SmallVector &regs, - const SmallVector &regvts, - const SmallVector &valuevts) - : TLI(&tli), ValueVTs(valuevts), RegVTs(regvts), Regs(regs) {} - RegsForValue(LLVMContext &Context, const TargetLowering &tli, - unsigned Reg, const Type *Ty) : TLI(&tli) { - ComputeValueVTs(tli, Ty, ValueVTs); - - for (unsigned Value = 0, e = ValueVTs.size(); Value != e; ++Value) { - EVT ValueVT = ValueVTs[Value]; - unsigned NumRegs = TLI->getNumRegisters(Context, ValueVT); - EVT RegisterVT = TLI->getRegisterType(Context, ValueVT); - for (unsigned i = 0; i != NumRegs; ++i) - Regs.push_back(Reg + i); - RegVTs.push_back(RegisterVT); - Reg += NumRegs; - } - } - - /// append - Add the specified values to this one.
- void append(const RegsForValue &RHS) { - TLI = RHS.TLI; - ValueVTs.append(RHS.ValueVTs.begin(), RHS.ValueVTs.end()); - RegVTs.append(RHS.RegVTs.begin(), RHS.RegVTs.end()); - Regs.append(RHS.Regs.begin(), RHS.Regs.end()); - } - - - /// getCopyFromRegs - Emit a series of CopyFromReg nodes that copies from - /// this value and returns the result as a ValueVTs value. This uses - /// Chain/Flag as the input and updates them for the output Chain/Flag. - /// If the Flag pointer is NULL, no flag is used. - SDValue getCopyFromRegs(SelectionDAG &DAG, DebugLoc dl, - SDValue &Chain, SDValue *Flag) const; - - /// getCopyToRegs - Emit a series of CopyToReg nodes that copies the - /// specified value into the registers specified by this object. This uses - /// Chain/Flag as the input and updates them for the output Chain/Flag. - /// If the Flag pointer is NULL, no flag is used. - void getCopyToRegs(SDValue Val, SelectionDAG &DAG, DebugLoc dl, - SDValue &Chain, SDValue *Flag) const; - - /// AddInlineAsmOperands - Add this value to the specified inlineasm node - /// operand list. This adds the code marker, matching input operand index - /// (if applicable), and includes the number of values added into it. - void AddInlineAsmOperands(unsigned Code, - bool HasMatching, unsigned MatchingIdx, - SelectionDAG &DAG, std::vector &Ops) const; - }; -} - /// isUsedOutsideOfDefiningBlock - Return true if this instruction is used by /// PHI nodes or outside of the basic block that defines it, or used by a /// switch or atomic instruction, which may expand to multiple basic blocks. 
@@ -276,7 +155,6 @@ } void FunctionLoweringInfo::set(Function &fn, MachineFunction &mf, - SelectionDAG &DAG, bool EnableFastISel) { Fn = &fn; MF = &mf; @@ -346,7 +224,7 @@ ComputeValueVTs(TLI, PN->getType(), ValueVTs); for (unsigned vti = 0, vte = ValueVTs.size(); vti != vte; ++vti) { EVT VT = ValueVTs[vti]; - unsigned NumRegisters = TLI.getNumRegisters(*DAG.getContext(), VT); + unsigned NumRegisters = TLI.getNumRegisters(Fn->getContext(), VT); const TargetInstrInfo *TII = MF->getTarget().getInstrInfo(); for (unsigned i = 0; i != NumRegisters; ++i) BuildMI(MBB, DL, TII->get(TargetInstrInfo::PHI), PHIReg + i); @@ -356,6 +234,20 @@ } } +/// clear - Clear out all the function-specific state. This returns this +/// FunctionLoweringInfo to an empty state, ready to be used for a +/// different function. +void FunctionLoweringInfo::clear() { + MBBMap.clear(); + ValueMap.clear(); + StaticAllocaMap.clear(); +#ifndef NDEBUG + CatchInfoLost.clear(); + CatchInfoFound.clear(); +#endif + LiveOutRegInfo.clear(); +} + unsigned FunctionLoweringInfo::MakeReg(EVT VT) { return RegInfo->createVirtualRegister(TLI.getRegClassFor(VT)); } @@ -384,5727 +276,3 @@ } return FirstReg; } - -/// getCopyFromParts - Create a value that contains the specified legal parts -/// combined into the value they represent. If the parts combine to a type -/// larger then ValueVT then AssertOp can be used to specify whether the extra -/// bits are known to be zero (ISD::AssertZext) or sign extended from ValueVT -/// (ISD::AssertSext). -static SDValue getCopyFromParts(SelectionDAG &DAG, DebugLoc dl, - const SDValue *Parts, - unsigned NumParts, EVT PartVT, EVT ValueVT, - ISD::NodeType AssertOp = ISD::DELETED_NODE) { - assert(NumParts > 0 && "No parts to assemble!"); - const TargetLowering &TLI = DAG.getTargetLoweringInfo(); - SDValue Val = Parts[0]; - - if (NumParts > 1) { - // Assemble the value from multiple parts. 
- if (!ValueVT.isVector() && ValueVT.isInteger()) { - unsigned PartBits = PartVT.getSizeInBits(); - unsigned ValueBits = ValueVT.getSizeInBits(); - - // Assemble the power of 2 part. - unsigned RoundParts = NumParts & (NumParts - 1) ? - 1 << Log2_32(NumParts) : NumParts; - unsigned RoundBits = PartBits * RoundParts; - EVT RoundVT = RoundBits == ValueBits ? - ValueVT : EVT::getIntegerVT(*DAG.getContext(), RoundBits); - SDValue Lo, Hi; - - EVT HalfVT = EVT::getIntegerVT(*DAG.getContext(), RoundBits/2); - - if (RoundParts > 2) { - Lo = getCopyFromParts(DAG, dl, Parts, RoundParts/2, PartVT, HalfVT); - Hi = getCopyFromParts(DAG, dl, Parts+RoundParts/2, RoundParts/2, - PartVT, HalfVT); - } else { - Lo = DAG.getNode(ISD::BIT_CONVERT, dl, HalfVT, Parts[0]); - Hi = DAG.getNode(ISD::BIT_CONVERT, dl, HalfVT, Parts[1]); - } - if (TLI.isBigEndian()) - std::swap(Lo, Hi); - Val = DAG.getNode(ISD::BUILD_PAIR, dl, RoundVT, Lo, Hi); - - if (RoundParts < NumParts) { - // Assemble the trailing non-power-of-2 part. - unsigned OddParts = NumParts - RoundParts; - EVT OddVT = EVT::getIntegerVT(*DAG.getContext(), OddParts * PartBits); - Hi = getCopyFromParts(DAG, dl, - Parts+RoundParts, OddParts, PartVT, OddVT); - - // Combine the round and odd parts. - Lo = Val; - if (TLI.isBigEndian()) - std::swap(Lo, Hi); - EVT TotalVT = EVT::getIntegerVT(*DAG.getContext(), NumParts * PartBits); - Hi = DAG.getNode(ISD::ANY_EXTEND, dl, TotalVT, Hi); - Hi = DAG.getNode(ISD::SHL, dl, TotalVT, Hi, - DAG.getConstant(Lo.getValueType().getSizeInBits(), - TLI.getPointerTy())); - Lo = DAG.getNode(ISD::ZERO_EXTEND, dl, TotalVT, Lo); - Val = DAG.getNode(ISD::OR, dl, TotalVT, Lo, Hi); - } - } else if (ValueVT.isVector()) { - // Handle a multi-element vector. 
- EVT IntermediateVT, RegisterVT; - unsigned NumIntermediates; - unsigned NumRegs = - TLI.getVectorTypeBreakdown(*DAG.getContext(), ValueVT, IntermediateVT, - NumIntermediates, RegisterVT); - assert(NumRegs == NumParts && "Part count doesn't match vector breakdown!"); - NumParts = NumRegs; // Silence a compiler warning. - assert(RegisterVT == PartVT && "Part type doesn't match vector breakdown!"); - assert(RegisterVT == Parts[0].getValueType() && - "Part type doesn't match part!"); - - // Assemble the parts into intermediate operands. - SmallVector Ops(NumIntermediates); - if (NumIntermediates == NumParts) { - // If the register was not expanded, truncate or copy the value, - // as appropriate. - for (unsigned i = 0; i != NumParts; ++i) - Ops[i] = getCopyFromParts(DAG, dl, &Parts[i], 1, - PartVT, IntermediateVT); - } else if (NumParts > 0) { - // If the intermediate type was expanded, build the intermediate operands - // from the parts. - assert(NumParts % NumIntermediates == 0 && - "Must expand into a divisible number of parts!"); - unsigned Factor = NumParts / NumIntermediates; - for (unsigned i = 0; i != NumIntermediates; ++i) - Ops[i] = getCopyFromParts(DAG, dl, &Parts[i * Factor], Factor, - PartVT, IntermediateVT); - } - - // Build a vector with BUILD_VECTOR or CONCAT_VECTORS from the intermediate - // operands. - Val = DAG.getNode(IntermediateVT.isVector() ? 
- ISD::CONCAT_VECTORS : ISD::BUILD_VECTOR, dl, - ValueVT, &Ops[0], NumIntermediates); - } else if (PartVT.isFloatingPoint()) { - // FP split into multiple FP parts (for ppcf128) - assert(ValueVT == EVT(MVT::ppcf128) && PartVT == EVT(MVT::f64) && - "Unexpected split"); - SDValue Lo, Hi; - Lo = DAG.getNode(ISD::BIT_CONVERT, dl, EVT(MVT::f64), Parts[0]); - Hi = DAG.getNode(ISD::BIT_CONVERT, dl, EVT(MVT::f64), Parts[1]); - if (TLI.isBigEndian()) - std::swap(Lo, Hi); - Val = DAG.getNode(ISD::BUILD_PAIR, dl, ValueVT, Lo, Hi); - } else { - // FP split into integer parts (soft fp) - assert(ValueVT.isFloatingPoint() && PartVT.isInteger() && - !PartVT.isVector() && "Unexpected split"); - EVT IntVT = EVT::getIntegerVT(*DAG.getContext(), ValueVT.getSizeInBits()); - Val = getCopyFromParts(DAG, dl, Parts, NumParts, PartVT, IntVT); - } - } - - // There is now one part, held in Val. Correct it to match ValueVT. - PartVT = Val.getValueType(); - - if (PartVT == ValueVT) - return Val; - - if (PartVT.isVector()) { - assert(ValueVT.isVector() && "Unknown vector conversion!"); - return DAG.getNode(ISD::BIT_CONVERT, dl, ValueVT, Val); - } - - if (ValueVT.isVector()) { - assert(ValueVT.getVectorElementType() == PartVT && - ValueVT.getVectorNumElements() == 1 && - "Only trivial scalar-to-vector conversions should get here!"); - return DAG.getNode(ISD::BUILD_VECTOR, dl, ValueVT, Val); - } - - if (PartVT.isInteger() && - ValueVT.isInteger()) { - if (ValueVT.bitsLT(PartVT)) { - // For a truncate, see if we have any information to - // indicate whether the truncated bits will always be - // zero or sign-extension. 
- if (AssertOp != ISD::DELETED_NODE) - Val = DAG.getNode(AssertOp, dl, PartVT, Val, - DAG.getValueType(ValueVT)); - return DAG.getNode(ISD::TRUNCATE, dl, ValueVT, Val); - } else { - return DAG.getNode(ISD::ANY_EXTEND, dl, ValueVT, Val); - } - } - - if (PartVT.isFloatingPoint() && ValueVT.isFloatingPoint()) { - if (ValueVT.bitsLT(Val.getValueType())) - // FP_ROUND's are always exact here. - return DAG.getNode(ISD::FP_ROUND, dl, ValueVT, Val, - DAG.getIntPtrConstant(1)); - return DAG.getNode(ISD::FP_EXTEND, dl, ValueVT, Val); - } - - if (PartVT.getSizeInBits() == ValueVT.getSizeInBits()) - return DAG.getNode(ISD::BIT_CONVERT, dl, ValueVT, Val); - - llvm_unreachable("Unknown mismatch!"); - return SDValue(); -} - -/// getCopyToParts - Create a series of nodes that contain the specified value -/// split into legal parts. If the parts contain more bits than Val, then, for -/// integers, ExtendKind can be used to specify how to generate the extra bits. -static void getCopyToParts(SelectionDAG &DAG, DebugLoc dl, SDValue Val, - SDValue *Parts, unsigned NumParts, EVT PartVT, - ISD::NodeType ExtendKind = ISD::ANY_EXTEND) { - const TargetLowering &TLI = DAG.getTargetLoweringInfo(); - EVT PtrVT = TLI.getPointerTy(); - EVT ValueVT = Val.getValueType(); - unsigned PartBits = PartVT.getSizeInBits(); - unsigned OrigNumParts = NumParts; - assert(TLI.isTypeLegal(PartVT) && "Copying to an illegal type!"); - - if (!NumParts) - return; - - if (!ValueVT.isVector()) { - if (PartVT == ValueVT) { - assert(NumParts == 1 && "No-op copy with multiple parts!"); - Parts[0] = Val; - return; - } - - if (NumParts * PartBits > ValueVT.getSizeInBits()) { - // If the parts cover more bits than the value has, promote the value. 
- if (PartVT.isFloatingPoint() && ValueVT.isFloatingPoint()) { - assert(NumParts == 1 && "Do not know what to promote to!"); - Val = DAG.getNode(ISD::FP_EXTEND, dl, PartVT, Val); - } else if (PartVT.isInteger() && ValueVT.isInteger()) { - ValueVT = EVT::getIntegerVT(*DAG.getContext(), NumParts * PartBits); - Val = DAG.getNode(ExtendKind, dl, ValueVT, Val); - } else { - llvm_unreachable("Unknown mismatch!"); - } - } else if (PartBits == ValueVT.getSizeInBits()) { - // Different types of the same size. - assert(NumParts == 1 && PartVT != ValueVT); - Val = DAG.getNode(ISD::BIT_CONVERT, dl, PartVT, Val); - } else if (NumParts * PartBits < ValueVT.getSizeInBits()) { - // If the parts cover less bits than value has, truncate the value. - if (PartVT.isInteger() && ValueVT.isInteger()) { - ValueVT = EVT::getIntegerVT(*DAG.getContext(), NumParts * PartBits); - Val = DAG.getNode(ISD::TRUNCATE, dl, ValueVT, Val); - } else { - llvm_unreachable("Unknown mismatch!"); - } - } - - // The value may have changed - recompute ValueVT. - ValueVT = Val.getValueType(); - assert(NumParts * PartBits == ValueVT.getSizeInBits() && - "Failed to tile the value with PartVT!"); - - if (NumParts == 1) { - assert(PartVT == ValueVT && "Type conversion failed!"); - Parts[0] = Val; - return; - } - - // Expand the value into multiple parts. - if (NumParts & (NumParts - 1)) { - // The number of parts is not a power of 2. Split off and copy the tail. - assert(PartVT.isInteger() && ValueVT.isInteger() && - "Do not know what to expand to!"); - unsigned RoundParts = 1 << Log2_32(NumParts); - unsigned RoundBits = RoundParts * PartBits; - unsigned OddParts = NumParts - RoundParts; - SDValue OddVal = DAG.getNode(ISD::SRL, dl, ValueVT, Val, - DAG.getConstant(RoundBits, - TLI.getPointerTy())); - getCopyToParts(DAG, dl, OddVal, Parts + RoundParts, OddParts, PartVT); - if (TLI.isBigEndian()) - // The odd parts were reversed by getCopyToParts - unreverse them. 
- std::reverse(Parts + RoundParts, Parts + NumParts); - NumParts = RoundParts; - ValueVT = EVT::getIntegerVT(*DAG.getContext(), NumParts * PartBits); - Val = DAG.getNode(ISD::TRUNCATE, dl, ValueVT, Val); - } - - // The number of parts is a power of 2. Repeatedly bisect the value using - // EXTRACT_ELEMENT. - Parts[0] = DAG.getNode(ISD::BIT_CONVERT, dl, - EVT::getIntegerVT(*DAG.getContext(), ValueVT.getSizeInBits()), - Val); - for (unsigned StepSize = NumParts; StepSize > 1; StepSize /= 2) { - for (unsigned i = 0; i < NumParts; i += StepSize) { - unsigned ThisBits = StepSize * PartBits / 2; - EVT ThisVT = EVT::getIntegerVT(*DAG.getContext(), ThisBits); - SDValue &Part0 = Parts[i]; - SDValue &Part1 = Parts[i+StepSize/2]; - - Part1 = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, - ThisVT, Part0, - DAG.getConstant(1, PtrVT)); - Part0 = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, - ThisVT, Part0, - DAG.getConstant(0, PtrVT)); - - if (ThisBits == PartBits && ThisVT != PartVT) { - Part0 = DAG.getNode(ISD::BIT_CONVERT, dl, - PartVT, Part0); - Part1 = DAG.getNode(ISD::BIT_CONVERT, dl, - PartVT, Part1); - } - } - } - - if (TLI.isBigEndian()) - std::reverse(Parts, Parts + OrigNumParts); - - return; - } - - // Vector ValueVT. - if (NumParts == 1) { - if (PartVT != ValueVT) { - if (PartVT.isVector()) { - Val = DAG.getNode(ISD::BIT_CONVERT, dl, PartVT, Val); - } else { - assert(ValueVT.getVectorElementType() == PartVT && - ValueVT.getVectorNumElements() == 1 && - "Only trivial vector-to-scalar conversions should get here!"); - Val = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, - PartVT, Val, - DAG.getConstant(0, PtrVT)); - } - } - - Parts[0] = Val; - return; - } - - // Handle a multi-element vector. 
- EVT IntermediateVT, RegisterVT; - unsigned NumIntermediates; - unsigned NumRegs = TLI.getVectorTypeBreakdown(*DAG.getContext(), ValueVT, - IntermediateVT, NumIntermediates, RegisterVT); - unsigned NumElements = ValueVT.getVectorNumElements(); - - assert(NumRegs == NumParts && "Part count doesn't match vector breakdown!"); - NumParts = NumRegs; // Silence a compiler warning. - assert(RegisterVT == PartVT && "Part type doesn't match vector breakdown!"); - - // Split the vector into intermediate operands. - SmallVector Ops(NumIntermediates); - for (unsigned i = 0; i != NumIntermediates; ++i) - if (IntermediateVT.isVector()) - Ops[i] = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, - IntermediateVT, Val, - DAG.getConstant(i * (NumElements / NumIntermediates), - PtrVT)); - else - Ops[i] = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, - IntermediateVT, Val, - DAG.getConstant(i, PtrVT)); - - // Split the intermediate operands into legal parts. - if (NumParts == NumIntermediates) { - // If the register was not expanded, promote or copy the value, - // as appropriate. - for (unsigned i = 0; i != NumParts; ++i) - getCopyToParts(DAG, dl, Ops[i], &Parts[i], 1, PartVT); - } else if (NumParts > 0) { - // If the intermediate type was expanded, split each the value into - // legal parts. - assert(NumParts % NumIntermediates == 0 && - "Must expand into a divisible number of parts!"); - unsigned Factor = NumParts / NumIntermediates; - for (unsigned i = 0; i != NumIntermediates; ++i) - getCopyToParts(DAG, dl, Ops[i], &Parts[i * Factor], Factor, PartVT); - } -} - - -void SelectionDAGLowering::init(GCFunctionInfo *gfi, AliasAnalysis &aa) { - AA = &aa; - GFI = gfi; - TD = DAG.getTarget().getTargetData(); -} - -/// clear - Clear out the curret SelectionDAG and the associated -/// state and prepare this SelectionDAGLowering object to be used -/// for a new block. 
This doesn't clear out information about -/// additional blocks that are needed to complete switch lowering -/// or PHI node updating; that information is cleared out as it is -/// consumed. -void SelectionDAGLowering::clear() { - NodeMap.clear(); - PendingLoads.clear(); - PendingExports.clear(); - EdgeMapping.clear(); - DAG.clear(); - CurDebugLoc = DebugLoc::getUnknownLoc(); - HasTailCall = false; -} - -/// getRoot - Return the current virtual root of the Selection DAG, -/// flushing any PendingLoad items. This must be done before emitting -/// a store or any other node that may need to be ordered after any -/// prior load instructions. -/// -SDValue SelectionDAGLowering::getRoot() { - if (PendingLoads.empty()) - return DAG.getRoot(); - - if (PendingLoads.size() == 1) { - SDValue Root = PendingLoads[0]; - DAG.setRoot(Root); - PendingLoads.clear(); - return Root; - } - - // Otherwise, we have to make a token factor node. - SDValue Root = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), MVT::Other, - &PendingLoads[0], PendingLoads.size()); - PendingLoads.clear(); - DAG.setRoot(Root); - return Root; -} - -/// getControlRoot - Similar to getRoot, but instead of flushing all the -/// PendingLoad items, flush all the PendingExports items. It is necessary -/// to do this before emitting a terminator instruction. -/// -SDValue SelectionDAGLowering::getControlRoot() { - SDValue Root = DAG.getRoot(); - - if (PendingExports.empty()) - return Root; - - // Turn all of the CopyToReg chains into one factored node. - if (Root.getOpcode() != ISD::EntryToken) { - unsigned i = 0, e = PendingExports.size(); - for (; i != e; ++i) { - assert(PendingExports[i].getNode()->getNumOperands() > 1); - if (PendingExports[i].getNode()->getOperand(0) == Root) - break; // Don't add the root if we already indirectly depend on it. 
- } - - if (i == e) - PendingExports.push_back(Root); - } - - Root = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), MVT::Other, - &PendingExports[0], - PendingExports.size()); - PendingExports.clear(); - DAG.setRoot(Root); - return Root; -} - -void SelectionDAGLowering::visit(Instruction &I) { - visit(I.getOpcode(), I); -} - -void SelectionDAGLowering::visit(unsigned Opcode, User &I) { - // Note: this doesn't use InstVisitor, because it has to work with - // ConstantExpr's in addition to instructions. - switch (Opcode) { - default: llvm_unreachable("Unknown instruction type encountered!"); - // Build the switch statement using the Instruction.def file. -#define HANDLE_INST(NUM, OPCODE, CLASS) \ - case Instruction::OPCODE:return visit##OPCODE((CLASS&)I); -#include "llvm/Instruction.def" - } -} - -SDValue SelectionDAGLowering::getValue(const Value *V) { - SDValue &N = NodeMap[V]; - if (N.getNode()) return N; - - if (Constant *C = const_cast(dyn_cast(V))) { - EVT VT = TLI.getValueType(V->getType(), true); - - if (ConstantInt *CI = dyn_cast(C)) - return N = DAG.getConstant(*CI, VT); - - if (GlobalValue *GV = dyn_cast(C)) - return N = DAG.getGlobalAddress(GV, VT); - - if (isa(C)) - return N = DAG.getConstant(0, TLI.getPointerTy()); - - if (ConstantFP *CFP = dyn_cast(C)) - return N = DAG.getConstantFP(*CFP, VT); - - if (isa(C) && !V->getType()->isAggregateType()) - return N = DAG.getUNDEF(VT); - - if (ConstantExpr *CE = dyn_cast(C)) { - visit(CE->getOpcode(), *CE); - SDValue N1 = NodeMap[V]; - assert(N1.getNode() && "visit didn't populate the ValueMap!"); - return N1; - } - - if (isa(C) || isa(C)) { - SmallVector Constants; - for (User::const_op_iterator OI = C->op_begin(), OE = C->op_end(); - OI != OE; ++OI) { - SDNode *Val = getValue(*OI).getNode(); - // If the operand is an empty aggregate, there are no values. - if (!Val) continue; - // Add each leaf value from the operand to the Constants list - // to form a flattened list of all the values. 
- for (unsigned i = 0, e = Val->getNumValues(); i != e; ++i) - Constants.push_back(SDValue(Val, i)); - } - return DAG.getMergeValues(&Constants[0], Constants.size(), - getCurDebugLoc()); - } - - if (isa(C->getType()) || isa(C->getType())) { - assert((isa(C) || isa(C)) && - "Unknown struct or array constant!"); - - SmallVector ValueVTs; - ComputeValueVTs(TLI, C->getType(), ValueVTs); - unsigned NumElts = ValueVTs.size(); - if (NumElts == 0) - return SDValue(); // empty struct - SmallVector Constants(NumElts); - for (unsigned i = 0; i != NumElts; ++i) { - EVT EltVT = ValueVTs[i]; - if (isa(C)) - Constants[i] = DAG.getUNDEF(EltVT); - else if (EltVT.isFloatingPoint()) - Constants[i] = DAG.getConstantFP(0, EltVT); - else - Constants[i] = DAG.getConstant(0, EltVT); - } - return DAG.getMergeValues(&Constants[0], NumElts, getCurDebugLoc()); - } - - if (BlockAddress *BA = dyn_cast(C)) - return DAG.getBlockAddress(BA, VT); - - const VectorType *VecTy = cast(V->getType()); - unsigned NumElements = VecTy->getNumElements(); - - // Now that we know the number and type of the elements, get that number of - // elements into the Ops array based on what kind of constant it is. - SmallVector Ops; - if (ConstantVector *CP = dyn_cast(C)) { - for (unsigned i = 0; i != NumElements; ++i) - Ops.push_back(getValue(CP->getOperand(i))); - } else { - assert(isa(C) && "Unknown vector constant!"); - EVT EltVT = TLI.getValueType(VecTy->getElementType()); - - SDValue Op; - if (EltVT.isFloatingPoint()) - Op = DAG.getConstantFP(0, EltVT); - else - Op = DAG.getConstant(0, EltVT); - Ops.assign(NumElements, Op); - } - - // Create a BUILD_VECTOR node. - return NodeMap[V] = DAG.getNode(ISD::BUILD_VECTOR, getCurDebugLoc(), - VT, &Ops[0], Ops.size()); - } - - // If this is a static alloca, generate it as the frameindex instead of - // computation. 
- if (const AllocaInst *AI = dyn_cast(V)) { - DenseMap::iterator SI = - FuncInfo.StaticAllocaMap.find(AI); - if (SI != FuncInfo.StaticAllocaMap.end()) - return DAG.getFrameIndex(SI->second, TLI.getPointerTy()); - } - - unsigned InReg = FuncInfo.ValueMap[V]; - assert(InReg && "Value not in map!"); - - RegsForValue RFV(*DAG.getContext(), TLI, InReg, V->getType()); - SDValue Chain = DAG.getEntryNode(); - return RFV.getCopyFromRegs(DAG, getCurDebugLoc(), Chain, NULL); -} - -/// Get the EVTs and ArgFlags collections that represent the return type -/// of the given function. This does not require a DAG or a return value, and -/// is suitable for use before any DAGs for the function are constructed. -static void getReturnInfo(const Type* ReturnType, - Attributes attr, SmallVectorImpl &OutVTs, - SmallVectorImpl &OutFlags, - TargetLowering &TLI, - SmallVectorImpl *Offsets = 0) { - SmallVector ValueVTs; - ComputeValueVTs(TLI, ReturnType, ValueVTs, Offsets); - unsigned NumValues = ValueVTs.size(); - if ( NumValues == 0 ) return; - - for (unsigned j = 0, f = NumValues; j != f; ++j) { - EVT VT = ValueVTs[j]; - ISD::NodeType ExtendKind = ISD::ANY_EXTEND; - - if (attr & Attribute::SExt) - ExtendKind = ISD::SIGN_EXTEND; - else if (attr & Attribute::ZExt) - ExtendKind = ISD::ZERO_EXTEND; - - // FIXME: C calling convention requires the return type to be promoted to - // at least 32-bit. But this is not necessary for non-C calling - // conventions. The frontend should mark functions whose return values - // require promoting with signext or zeroext attributes. 
- if (ExtendKind != ISD::ANY_EXTEND && VT.isInteger()) { - EVT MinVT = TLI.getRegisterType(ReturnType->getContext(), MVT::i32); - if (VT.bitsLT(MinVT)) - VT = MinVT; - } - - unsigned NumParts = TLI.getNumRegisters(ReturnType->getContext(), VT); - EVT PartVT = TLI.getRegisterType(ReturnType->getContext(), VT); - // 'inreg' on function refers to return value - ISD::ArgFlagsTy Flags = ISD::ArgFlagsTy(); - if (attr & Attribute::InReg) - Flags.setInReg(); - - // Propagate extension type if any - if (attr & Attribute::SExt) - Flags.setSExt(); - else if (attr & Attribute::ZExt) - Flags.setZExt(); - - for (unsigned i = 0; i < NumParts; ++i) { - OutVTs.push_back(PartVT); - OutFlags.push_back(Flags); - } - } -} - -void SelectionDAGLowering::visitRet(ReturnInst &I) { - SDValue Chain = getControlRoot(); - SmallVector Outs; - FunctionLoweringInfo &FLI = DAG.getFunctionLoweringInfo(); - - if (!FLI.CanLowerReturn) { - unsigned DemoteReg = FLI.DemoteRegister; - const Function *F = I.getParent()->getParent(); - - // Emit a store of the return value through the virtual register. - // Leave Outs empty so that LowerReturn won't try to load return - // registers the usual way. 
- SmallVector PtrValueVTs; - ComputeValueVTs(TLI, PointerType::getUnqual(F->getReturnType()), - PtrValueVTs); - - SDValue RetPtr = DAG.getRegister(DemoteReg, PtrValueVTs[0]); - SDValue RetOp = getValue(I.getOperand(0)); - - SmallVector ValueVTs; - SmallVector Offsets; - ComputeValueVTs(TLI, I.getOperand(0)->getType(), ValueVTs, &Offsets); - unsigned NumValues = ValueVTs.size(); - - SmallVector Chains(NumValues); - EVT PtrVT = PtrValueVTs[0]; - for (unsigned i = 0; i != NumValues; ++i) - Chains[i] = DAG.getStore(Chain, getCurDebugLoc(), - SDValue(RetOp.getNode(), RetOp.getResNo() + i), - DAG.getNode(ISD::ADD, getCurDebugLoc(), PtrVT, RetPtr, - DAG.getConstant(Offsets[i], PtrVT)), - NULL, Offsets[i], false, 0); - Chain = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), - MVT::Other, &Chains[0], NumValues); - } - else { - for (unsigned i = 0, e = I.getNumOperands(); i != e; ++i) { - SmallVector ValueVTs; - ComputeValueVTs(TLI, I.getOperand(i)->getType(), ValueVTs); - unsigned NumValues = ValueVTs.size(); - if (NumValues == 0) continue; - - SDValue RetOp = getValue(I.getOperand(i)); - for (unsigned j = 0, f = NumValues; j != f; ++j) { - EVT VT = ValueVTs[j]; - - ISD::NodeType ExtendKind = ISD::ANY_EXTEND; - - const Function *F = I.getParent()->getParent(); - if (F->paramHasAttr(0, Attribute::SExt)) - ExtendKind = ISD::SIGN_EXTEND; - else if (F->paramHasAttr(0, Attribute::ZExt)) - ExtendKind = ISD::ZERO_EXTEND; - - // FIXME: C calling convention requires the return type to be promoted to - // at least 32-bit. But this is not necessary for non-C calling - // conventions. The frontend should mark functions whose return values - // require promoting with signext or zeroext attributes. 
- if (ExtendKind != ISD::ANY_EXTEND && VT.isInteger()) { - EVT MinVT = TLI.getRegisterType(*DAG.getContext(), MVT::i32); - if (VT.bitsLT(MinVT)) - VT = MinVT; - } - - unsigned NumParts = TLI.getNumRegisters(*DAG.getContext(), VT); - EVT PartVT = TLI.getRegisterType(*DAG.getContext(), VT); - SmallVector Parts(NumParts); - getCopyToParts(DAG, getCurDebugLoc(), - SDValue(RetOp.getNode(), RetOp.getResNo() + j), - &Parts[0], NumParts, PartVT, ExtendKind); - - // 'inreg' on function refers to return value - ISD::ArgFlagsTy Flags = ISD::ArgFlagsTy(); - if (F->paramHasAttr(0, Attribute::InReg)) - Flags.setInReg(); - - // Propagate extension type if any - if (F->paramHasAttr(0, Attribute::SExt)) - Flags.setSExt(); - else if (F->paramHasAttr(0, Attribute::ZExt)) - Flags.setZExt(); - - for (unsigned i = 0; i < NumParts; ++i) - Outs.push_back(ISD::OutputArg(Flags, Parts[i], /*isfixed=*/true)); - } - } - } - - bool isVarArg = DAG.getMachineFunction().getFunction()->isVarArg(); - CallingConv::ID CallConv = - DAG.getMachineFunction().getFunction()->getCallingConv(); - Chain = TLI.LowerReturn(Chain, CallConv, isVarArg, - Outs, getCurDebugLoc(), DAG); - - // Verify that the target's LowerReturn behaved as expected. - assert(Chain.getNode() && Chain.getValueType() == MVT::Other && - "LowerReturn didn't return a valid chain!"); - - // Update the DAG with the new chain value resulting from return lowering. - DAG.setRoot(Chain); -} - -/// CopyToExportRegsIfNeeded - If the given value has virtual registers -/// created for it, emit nodes to copy the value into the virtual -/// registers. 
-void SelectionDAGLowering::CopyToExportRegsIfNeeded(Value *V) { - if (!V->use_empty()) { - DenseMap::iterator VMI = FuncInfo.ValueMap.find(V); - if (VMI != FuncInfo.ValueMap.end()) - CopyValueToVirtualRegister(V, VMI->second); - } -} - -/// ExportFromCurrentBlock - If this condition isn't known to be exported from -/// the current basic block, add it to ValueMap now so that we'll get a -/// CopyTo/FromReg. -void SelectionDAGLowering::ExportFromCurrentBlock(Value *V) { - // No need to export constants. - if (!isa(V) && !isa(V)) return; - - // Already exported? - if (FuncInfo.isExportedInst(V)) return; - - unsigned Reg = FuncInfo.InitializeRegForValue(V); - CopyValueToVirtualRegister(V, Reg); -} - -bool SelectionDAGLowering::isExportableFromCurrentBlock(Value *V, - const BasicBlock *FromBB) { - // The operands of the setcc have to be in this block. We don't know - // how to export them from some other block. - if (Instruction *VI = dyn_cast(V)) { - // Can export from current BB. - if (VI->getParent() == FromBB) - return true; - - // Is already exported, noop. - return FuncInfo.isExportedInst(V); - } - - // If this is an argument, we can export it if the BB is the entry block or - // if it is already exported. - if (isa(V)) { - if (FromBB == &FromBB->getParent()->getEntryBlock()) - return true; - - // Otherwise, can only export this if it is already exported. - return FuncInfo.isExportedInst(V); - } - - // Otherwise, constants can always be exported. - return true; -} - -static bool InBlock(const Value *V, const BasicBlock *BB) { - if (const Instruction *I = dyn_cast(V)) - return I->getParent() == BB; - return true; -} - -/// getFCmpCondCode - Return the ISD condition code corresponding to -/// the given LLVM IR floating-point condition code. This includes -/// consideration of global floating-point math flags. 
-/// -static ISD::CondCode getFCmpCondCode(FCmpInst::Predicate Pred) { - ISD::CondCode FPC, FOC; - switch (Pred) { - case FCmpInst::FCMP_FALSE: FOC = FPC = ISD::SETFALSE; break; - case FCmpInst::FCMP_OEQ: FOC = ISD::SETEQ; FPC = ISD::SETOEQ; break; - case FCmpInst::FCMP_OGT: FOC = ISD::SETGT; FPC = ISD::SETOGT; break; - case FCmpInst::FCMP_OGE: FOC = ISD::SETGE; FPC = ISD::SETOGE; break; - case FCmpInst::FCMP_OLT: FOC = ISD::SETLT; FPC = ISD::SETOLT; break; - case FCmpInst::FCMP_OLE: FOC = ISD::SETLE; FPC = ISD::SETOLE; break; - case FCmpInst::FCMP_ONE: FOC = ISD::SETNE; FPC = ISD::SETONE; break; - case FCmpInst::FCMP_ORD: FOC = FPC = ISD::SETO; break; - case FCmpInst::FCMP_UNO: FOC = FPC = ISD::SETUO; break; - case FCmpInst::FCMP_UEQ: FOC = ISD::SETEQ; FPC = ISD::SETUEQ; break; - case FCmpInst::FCMP_UGT: FOC = ISD::SETGT; FPC = ISD::SETUGT; break; - case FCmpInst::FCMP_UGE: FOC = ISD::SETGE; FPC = ISD::SETUGE; break; - case FCmpInst::FCMP_ULT: FOC = ISD::SETLT; FPC = ISD::SETULT; break; - case FCmpInst::FCMP_ULE: FOC = ISD::SETLE; FPC = ISD::SETULE; break; - case FCmpInst::FCMP_UNE: FOC = ISD::SETNE; FPC = ISD::SETUNE; break; - case FCmpInst::FCMP_TRUE: FOC = FPC = ISD::SETTRUE; break; - default: - llvm_unreachable("Invalid FCmp predicate opcode!"); - FOC = FPC = ISD::SETFALSE; - break; - } - if (FiniteOnlyFPMath()) - return FOC; - else - return FPC; -} - -/// getICmpCondCode - Return the ISD condition code corresponding to -/// the given LLVM IR integer condition code. 
-///
-static ISD::CondCode getICmpCondCode(ICmpInst::Predicate Pred) {
-  switch (Pred) {
-  case ICmpInst::ICMP_EQ: return ISD::SETEQ;
-  case ICmpInst::ICMP_NE: return ISD::SETNE;
-  case ICmpInst::ICMP_SLE: return ISD::SETLE;
-  case ICmpInst::ICMP_ULE: return ISD::SETULE;
-  case ICmpInst::ICMP_SGE: return ISD::SETGE;
-  case ICmpInst::ICMP_UGE: return ISD::SETUGE;
-  case ICmpInst::ICMP_SLT: return ISD::SETLT;
-  case ICmpInst::ICMP_ULT: return ISD::SETULT;
-  case ICmpInst::ICMP_SGT: return ISD::SETGT;
-  case ICmpInst::ICMP_UGT: return ISD::SETUGT;
-  default:
-    llvm_unreachable("Invalid ICmp predicate opcode!");
-    return ISD::SETNE;
-  }
-}
-
-/// EmitBranchForMergedCondition - Helper method for FindMergedConditions.
-/// This function emits a branch and is used at the leaves of an OR or an
-/// AND operator tree.
-///
-void
-SelectionDAGLowering::EmitBranchForMergedCondition(Value *Cond,
-                                                   MachineBasicBlock *TBB,
-                                                   MachineBasicBlock *FBB,
-                                                   MachineBasicBlock *CurBB) {
-  const BasicBlock *BB = CurBB->getBasicBlock();
-
-  // If the leaf of the tree is a comparison, merge the condition into
-  // the caseblock.
-  if (CmpInst *BOp = dyn_cast(Cond)) {
-    // The operands of the cmp have to be in this block. We don't know
-    // how to export them from some other block. If this is the first block
-    // of the sequence, no exporting is needed.
-    if (CurBB == CurMBB ||
-        (isExportableFromCurrentBlock(BOp->getOperand(0), BB) &&
-         isExportableFromCurrentBlock(BOp->getOperand(1), BB))) {
-      ISD::CondCode Condition;
-      if (ICmpInst *IC = dyn_cast(Cond)) {
-        Condition = getICmpCondCode(IC->getPredicate());
-      } else if (FCmpInst *FC = dyn_cast(Cond)) {
-        Condition = getFCmpCondCode(FC->getPredicate());
-      } else {
-        Condition = ISD::SETEQ; // silence warning.
-        llvm_unreachable("Unknown compare instruction");
-      }
-
-      CaseBlock CB(Condition, BOp->getOperand(0),
-                   BOp->getOperand(1), NULL, TBB, FBB, CurBB);
-      SwitchCases.push_back(CB);
-      return;
-    }
-  }
-
-  // Create a CaseBlock record representing this branch.
-  CaseBlock CB(ISD::SETEQ, Cond, ConstantInt::getTrue(*DAG.getContext()),
-               NULL, TBB, FBB, CurBB);
-  SwitchCases.push_back(CB);
-}
-
-/// FindMergedConditions - If Cond is an expression like
-void SelectionDAGLowering::FindMergedConditions(Value *Cond,
-                                                MachineBasicBlock *TBB,
-                                                MachineBasicBlock *FBB,
-                                                MachineBasicBlock *CurBB,
-                                                unsigned Opc) {
-  // If this node is not part of the or/and tree, emit it as a branch.
-  Instruction *BOp = dyn_cast(Cond);
-  if (!BOp || !(isa(BOp) || isa(BOp)) ||
-      (unsigned)BOp->getOpcode() != Opc || !BOp->hasOneUse() ||
-      BOp->getParent() != CurBB->getBasicBlock() ||
-      !InBlock(BOp->getOperand(0), CurBB->getBasicBlock()) ||
-      !InBlock(BOp->getOperand(1), CurBB->getBasicBlock())) {
-    EmitBranchForMergedCondition(Cond, TBB, FBB, CurBB);
-    return;
-  }
-
-  // Create TmpBB after CurBB.
-  MachineFunction::iterator BBI = CurBB;
-  MachineFunction &MF = DAG.getMachineFunction();
-  MachineBasicBlock *TmpBB = MF.CreateMachineBasicBlock(CurBB->getBasicBlock());
-  CurBB->getParent()->insert(++BBI, TmpBB);
-
-  if (Opc == Instruction::Or) {
-    // Codegen X | Y as:
-    //   jmp_if_X TBB
-    //   jmp TmpBB
-    // TmpBB:
-    //   jmp_if_Y TBB
-    //   jmp FBB
-    //
-
-    // Emit the LHS condition.
-    FindMergedConditions(BOp->getOperand(0), TBB, TmpBB, CurBB, Opc);
-
-    // Emit the RHS condition into TmpBB.
-    FindMergedConditions(BOp->getOperand(1), TBB, FBB, TmpBB, Opc);
-  } else {
-    assert(Opc == Instruction::And && "Unknown merge op!");
-    // Codegen X & Y as:
-    //   jmp_if_X TmpBB
-    //   jmp FBB
-    // TmpBB:
-    //   jmp_if_Y TBB
-    //   jmp FBB
-    //
-    // This requires creation of TmpBB after CurBB.
-
-    // Emit the LHS condition.
-    FindMergedConditions(BOp->getOperand(0), TmpBB, FBB, CurBB, Opc);
-
-    // Emit the RHS condition into TmpBB.
-    FindMergedConditions(BOp->getOperand(1), TBB, FBB, TmpBB, Opc);
-  }
-}
-
-/// If the set of cases should be emitted as a series of branches, return true.
-/// If we should emit this as a bunch of and/or'd together conditions, return
-/// false.
-bool
-SelectionDAGLowering::ShouldEmitAsBranches(const std::vector &Cases){
-  if (Cases.size() != 2) return true;
-
-  // If this is two comparisons of the same values or'd or and'd together, they
-  // will get folded into a single comparison, so don't emit two blocks.
-  if ((Cases[0].CmpLHS == Cases[1].CmpLHS &&
-       Cases[0].CmpRHS == Cases[1].CmpRHS) ||
-      (Cases[0].CmpRHS == Cases[1].CmpLHS &&
-       Cases[0].CmpLHS == Cases[1].CmpRHS)) {
-    return false;
-  }
-
-  return true;
-}
-
-void SelectionDAGLowering::visitBr(BranchInst &I) {
-  // Update machine-CFG edges.
-  MachineBasicBlock *Succ0MBB = FuncInfo.MBBMap[I.getSuccessor(0)];
-
-  // Figure out which block is immediately after the current one.
-  MachineBasicBlock *NextBlock = 0;
-  MachineFunction::iterator BBI = CurMBB;
-  if (++BBI != FuncInfo.MF->end())
-    NextBlock = BBI;
-
-  if (I.isUnconditional()) {
-    // Update machine-CFG edges.
-    CurMBB->addSuccessor(Succ0MBB);
-
-    // If this is not a fall-through branch, emit the branch.
-    if (Succ0MBB != NextBlock)
-      DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(),
-                              MVT::Other, getControlRoot(),
-                              DAG.getBasicBlock(Succ0MBB)));
-    return;
-  }
-
-  // If this condition is one of the special cases we handle, do special stuff
-  // now.
-  Value *CondVal = I.getCondition();
-  MachineBasicBlock *Succ1MBB = FuncInfo.MBBMap[I.getSuccessor(1)];
-
-  // If this is a series of conditions that are or'd or and'd together, emit
-  // this as a sequence of branches instead of setcc's with and/or operations.
-  // For example, instead of something like:
-  //     cmp A, B
-  //     C = seteq
-  //     cmp D, E
-  //     F = setle
-  //     or C, F
-  //     jnz foo
-  // Emit:
-  //     cmp A, B
-  //     je foo
-  //     cmp D, E
-  //     jle foo
-  //
-  if (BinaryOperator *BOp = dyn_cast(CondVal)) {
-    if (BOp->hasOneUse() &&
-        (BOp->getOpcode() == Instruction::And ||
-         BOp->getOpcode() == Instruction::Or)) {
-      FindMergedConditions(BOp, Succ0MBB, Succ1MBB, CurMBB, BOp->getOpcode());
-      // If the compares in later blocks need to use values not currently
-      // exported from this block, export them now. This block should always
-      // be the first entry.
-      assert(SwitchCases[0].ThisBB == CurMBB && "Unexpected lowering!");
-
-      // Allow some cases to be rejected.
-      if (ShouldEmitAsBranches(SwitchCases)) {
-        for (unsigned i = 1, e = SwitchCases.size(); i != e; ++i) {
-          ExportFromCurrentBlock(SwitchCases[i].CmpLHS);
-          ExportFromCurrentBlock(SwitchCases[i].CmpRHS);
-        }
-
-        // Emit the branch for this block.
-        visitSwitchCase(SwitchCases[0]);
-        SwitchCases.erase(SwitchCases.begin());
-        return;
-      }
-
-      // Okay, we decided not to do this, remove any inserted MBB's and clear
-      // SwitchCases.
-      for (unsigned i = 1, e = SwitchCases.size(); i != e; ++i)
-        FuncInfo.MF->erase(SwitchCases[i].ThisBB);
-
-      SwitchCases.clear();
-    }
-  }
-
-  // Create a CaseBlock record representing this branch.
-  CaseBlock CB(ISD::SETEQ, CondVal, ConstantInt::getTrue(*DAG.getContext()),
-               NULL, Succ0MBB, Succ1MBB, CurMBB);
-  // Use visitSwitchCase to actually insert the fast branch sequence for this
-  // cond branch.
-  visitSwitchCase(CB);
-}
-
-/// visitSwitchCase - Emits the necessary code to represent a single node in
-/// the binary search tree resulting from lowering a switch instruction.
-void SelectionDAGLowering::visitSwitchCase(CaseBlock &CB) {
-  SDValue Cond;
-  SDValue CondLHS = getValue(CB.CmpLHS);
-  DebugLoc dl = getCurDebugLoc();
-
-  // Build the setcc now.
-  if (CB.CmpMHS == NULL) {
-    // Fold "(X == true)" to X and "(X == false)" to !X to
-    // handle common cases produced by branch lowering.
-    if (CB.CmpRHS == ConstantInt::getTrue(*DAG.getContext()) &&
-        CB.CC == ISD::SETEQ)
-      Cond = CondLHS;
-    else if (CB.CmpRHS == ConstantInt::getFalse(*DAG.getContext()) &&
-             CB.CC == ISD::SETEQ) {
-      SDValue True = DAG.getConstant(1, CondLHS.getValueType());
-      Cond = DAG.getNode(ISD::XOR, dl, CondLHS.getValueType(), CondLHS, True);
-    } else
-      Cond = DAG.getSetCC(dl, MVT::i1, CondLHS, getValue(CB.CmpRHS), CB.CC);
-  } else {
-    assert(CB.CC == ISD::SETLE && "Can handle only LE ranges now");
-
-    const APInt& Low = cast(CB.CmpLHS)->getValue();
-    const APInt& High = cast(CB.CmpRHS)->getValue();
-
-    SDValue CmpOp = getValue(CB.CmpMHS);
-    EVT VT = CmpOp.getValueType();
-
-    if (cast(CB.CmpLHS)->isMinValue(true)) {
-      Cond = DAG.getSetCC(dl, MVT::i1, CmpOp, DAG.getConstant(High, VT),
-                          ISD::SETLE);
-    } else {
-      SDValue SUB = DAG.getNode(ISD::SUB, dl,
-                                VT, CmpOp, DAG.getConstant(Low, VT));
-      Cond = DAG.getSetCC(dl, MVT::i1, SUB,
-                          DAG.getConstant(High-Low, VT), ISD::SETULE);
-    }
-  }
-
-  // Update successor info
-  CurMBB->addSuccessor(CB.TrueBB);
-  CurMBB->addSuccessor(CB.FalseBB);
-
-  // Set NextBlock to be the MBB immediately after the current one, if any.
-  // This is used to avoid emitting unnecessary branches to the next block.
-  MachineBasicBlock *NextBlock = 0;
-  MachineFunction::iterator BBI = CurMBB;
-  if (++BBI != FuncInfo.MF->end())
-    NextBlock = BBI;
-
-  // If the lhs block is the next block, invert the condition so that we can
-  // fall through to the lhs instead of the rhs block.
-  if (CB.TrueBB == NextBlock) {
-    std::swap(CB.TrueBB, CB.FalseBB);
-    SDValue True = DAG.getConstant(1, Cond.getValueType());
-    Cond = DAG.getNode(ISD::XOR, dl, Cond.getValueType(), Cond, True);
-  }
-  SDValue BrCond = DAG.getNode(ISD::BRCOND, dl,
-                               MVT::Other, getControlRoot(), Cond,
-                               DAG.getBasicBlock(CB.TrueBB));
-
-  // If the branch was constant folded, fix up the CFG.
-  if (BrCond.getOpcode() == ISD::BR) {
-    CurMBB->removeSuccessor(CB.FalseBB);
-    DAG.setRoot(BrCond);
-  } else {
-    // Otherwise, go ahead and insert the false branch.
-    if (BrCond == getControlRoot())
-      CurMBB->removeSuccessor(CB.TrueBB);
-
-    if (CB.FalseBB == NextBlock)
-      DAG.setRoot(BrCond);
-    else
-      DAG.setRoot(DAG.getNode(ISD::BR, dl, MVT::Other, BrCond,
-                              DAG.getBasicBlock(CB.FalseBB)));
-  }
-}
-
-/// visitJumpTable - Emit JumpTable node in the current MBB
-void SelectionDAGLowering::visitJumpTable(JumpTable &JT) {
-  // Emit the code for the jump table
-  assert(JT.Reg != -1U && "Should lower JT Header first!");
-  EVT PTy = TLI.getPointerTy();
-  SDValue Index = DAG.getCopyFromReg(getControlRoot(), getCurDebugLoc(),
-                                     JT.Reg, PTy);
-  SDValue Table = DAG.getJumpTable(JT.JTI, PTy);
-  DAG.setRoot(DAG.getNode(ISD::BR_JT, getCurDebugLoc(),
-                          MVT::Other, Index.getValue(1),
-                          Table, Index));
-}
-
-/// visitJumpTableHeader - This function emits necessary code to produce index
-/// in the JumpTable from switch case.
-void SelectionDAGLowering::visitJumpTableHeader(JumpTable &JT,
-                                                JumpTableHeader &JTH) {
-  // Subtract the lowest switch case value from the value being switched on and
-  // conditional branch to default mbb if the result is greater than the
-  // difference between smallest and largest cases.
-  SDValue SwitchOp = getValue(JTH.SValue);
-  EVT VT = SwitchOp.getValueType();
-  SDValue SUB = DAG.getNode(ISD::SUB, getCurDebugLoc(), VT, SwitchOp,
-                            DAG.getConstant(JTH.First, VT));
-
-  // The SDNode we just created, which holds the value being switched on minus
-  // the the smallest case value, needs to be copied to a virtual register so it
-  // can be used as an index into the jump table in a subsequent basic block.
-  // This value may be smaller or larger than the target's pointer type, and
-  // therefore require extension or truncating.
-  SwitchOp = DAG.getZExtOrTrunc(SUB, getCurDebugLoc(), TLI.getPointerTy());
-
-  unsigned JumpTableReg = FuncInfo.MakeReg(TLI.getPointerTy());
-  SDValue CopyTo = DAG.getCopyToReg(getControlRoot(), getCurDebugLoc(),
-                                    JumpTableReg, SwitchOp);
-  JT.Reg = JumpTableReg;
-
-  // Emit the range check for the jump table, and branch to the default block
-  // for the switch statement if the value being switched on exceeds the largest
-  // case in the switch.
-  SDValue CMP = DAG.getSetCC(getCurDebugLoc(),
-                             TLI.getSetCCResultType(SUB.getValueType()), SUB,
-                             DAG.getConstant(JTH.Last-JTH.First,VT),
-                             ISD::SETUGT);
-
-  // Set NextBlock to be the MBB immediately after the current one, if any.
-  // This is used to avoid emitting unnecessary branches to the next block.
-  MachineBasicBlock *NextBlock = 0;
-  MachineFunction::iterator BBI = CurMBB;
-  if (++BBI != FuncInfo.MF->end())
-    NextBlock = BBI;
-
-  SDValue BrCond = DAG.getNode(ISD::BRCOND, getCurDebugLoc(),
-                               MVT::Other, CopyTo, CMP,
-                               DAG.getBasicBlock(JT.Default));
-
-  if (JT.MBB == NextBlock)
-    DAG.setRoot(BrCond);
-  else
-    DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(), MVT::Other, BrCond,
-                            DAG.getBasicBlock(JT.MBB)));
-}
-
-/// visitBitTestHeader - This function emits necessary code to produce value
-/// suitable for "bit tests"
-void SelectionDAGLowering::visitBitTestHeader(BitTestBlock &B) {
-  // Subtract the minimum value
-  SDValue SwitchOp = getValue(B.SValue);
-  EVT VT = SwitchOp.getValueType();
-  SDValue SUB = DAG.getNode(ISD::SUB, getCurDebugLoc(), VT, SwitchOp,
-                            DAG.getConstant(B.First, VT));
-
-  // Check range
-  SDValue RangeCmp = DAG.getSetCC(getCurDebugLoc(),
-                                  TLI.getSetCCResultType(SUB.getValueType()),
-                                  SUB, DAG.getConstant(B.Range, VT),
-                                  ISD::SETUGT);
-
-  SDValue ShiftOp = DAG.getZExtOrTrunc(SUB, getCurDebugLoc(), TLI.getPointerTy());
-
-  B.Reg = FuncInfo.MakeReg(TLI.getPointerTy());
-  SDValue CopyTo = DAG.getCopyToReg(getControlRoot(), getCurDebugLoc(),
-                                    B.Reg, ShiftOp);
-
-  // Set NextBlock to be the MBB immediately after the current one, if any.
-  // This is used to avoid emitting unnecessary branches to the next block.
-  MachineBasicBlock *NextBlock = 0;
-  MachineFunction::iterator BBI = CurMBB;
-  if (++BBI != FuncInfo.MF->end())
-    NextBlock = BBI;
-
-  MachineBasicBlock* MBB = B.Cases[0].ThisBB;
-
-  CurMBB->addSuccessor(B.Default);
-  CurMBB->addSuccessor(MBB);
-
-  SDValue BrRange = DAG.getNode(ISD::BRCOND, getCurDebugLoc(),
-                                MVT::Other, CopyTo, RangeCmp,
-                                DAG.getBasicBlock(B.Default));
-
-  if (MBB == NextBlock)
-    DAG.setRoot(BrRange);
-  else
-    DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(), MVT::Other, CopyTo,
-                            DAG.getBasicBlock(MBB)));
-}
-
-/// visitBitTestCase - this function produces one "bit test"
-void SelectionDAGLowering::visitBitTestCase(MachineBasicBlock* NextMBB,
-                                            unsigned Reg,
-                                            BitTestCase &B) {
-  // Make desired shift
-  SDValue ShiftOp = DAG.getCopyFromReg(getControlRoot(), getCurDebugLoc(), Reg,
-                                       TLI.getPointerTy());
-  SDValue SwitchVal = DAG.getNode(ISD::SHL, getCurDebugLoc(),
-                                  TLI.getPointerTy(),
-                                  DAG.getConstant(1, TLI.getPointerTy()),
-                                  ShiftOp);
-
-  // Emit bit tests and jumps
-  SDValue AndOp = DAG.getNode(ISD::AND, getCurDebugLoc(),
-                              TLI.getPointerTy(), SwitchVal,
-                              DAG.getConstant(B.Mask, TLI.getPointerTy()));
-  SDValue AndCmp = DAG.getSetCC(getCurDebugLoc(),
-                                TLI.getSetCCResultType(AndOp.getValueType()),
-                                AndOp, DAG.getConstant(0, TLI.getPointerTy()),
-                                ISD::SETNE);
-
-  CurMBB->addSuccessor(B.TargetBB);
-  CurMBB->addSuccessor(NextMBB);
-
-  SDValue BrAnd = DAG.getNode(ISD::BRCOND, getCurDebugLoc(),
-                              MVT::Other, getControlRoot(),
-                              AndCmp, DAG.getBasicBlock(B.TargetBB));
-
-  // Set NextBlock to be the MBB immediately after the current one, if any.
-  // This is used to avoid emitting unnecessary branches to the next block.
-  MachineBasicBlock *NextBlock = 0;
-  MachineFunction::iterator BBI = CurMBB;
-  if (++BBI != FuncInfo.MF->end())
-    NextBlock = BBI;
-
-  if (NextMBB == NextBlock)
-    DAG.setRoot(BrAnd);
-  else
-    DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(), MVT::Other, BrAnd,
-                            DAG.getBasicBlock(NextMBB)));
-}
-
-void SelectionDAGLowering::visitInvoke(InvokeInst &I) {
-  // Retrieve successors.
-  MachineBasicBlock *Return = FuncInfo.MBBMap[I.getSuccessor(0)];
-  MachineBasicBlock *LandingPad = FuncInfo.MBBMap[I.getSuccessor(1)];
-
-  const Value *Callee(I.getCalledValue());
-  if (isa(Callee))
-    visitInlineAsm(&I);
-  else
-    LowerCallTo(&I, getValue(Callee), false, LandingPad);
-
-  // If the value of the invoke is used outside of its defining block, make it
-  // available as a virtual register.
-  CopyToExportRegsIfNeeded(&I);
-
-  // Update successor info
-  CurMBB->addSuccessor(Return);
-  CurMBB->addSuccessor(LandingPad);
-
-  // Drop into normal successor.
-  DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(),
-                          MVT::Other, getControlRoot(),
-                          DAG.getBasicBlock(Return)));
-}
-
-void SelectionDAGLowering::visitUnwind(UnwindInst &I) {
-}
-
-/// handleSmallSwitchCaseRange - Emit a series of specific tests (suitable for
-/// small case ranges).
-bool SelectionDAGLowering::handleSmallSwitchRange(CaseRec& CR,
-                                                  CaseRecVector& WorkList,
-                                                  Value* SV,
-                                                  MachineBasicBlock* Default) {
-  Case& BackCase = *(CR.Range.second-1);
-
-  // Size is the number of Cases represented by this range.
-  size_t Size = CR.Range.second - CR.Range.first;
-  if (Size > 3)
-    return false;
-
-  // Get the MachineFunction which holds the current MBB. This is used when
-  // inserting any additional MBBs necessary to represent the switch.
-  MachineFunction *CurMF = FuncInfo.MF;
-
-  // Figure out which block is immediately after the current one.
-  MachineBasicBlock *NextBlock = 0;
-  MachineFunction::iterator BBI = CR.CaseBB;
-
-  if (++BBI != FuncInfo.MF->end())
-    NextBlock = BBI;
-
-  // TODO: If any two of the cases has the same destination, and if one value
-  // is the same as the other, but has one bit unset that the other has set,
-  // use bit manipulation to do two compares at once. For example:
-  // "if (X == 6 || X == 4)" -> "if ((X|2) == 6)"
-
-  // Rearrange the case blocks so that the last one falls through if possible.
-  if (NextBlock && Default != NextBlock && BackCase.BB != NextBlock) {
-    // The last case block won't fall through into 'NextBlock' if we emit the
-    // branches in this order. See if rearranging a case value would help.
-    for (CaseItr I = CR.Range.first, E = CR.Range.second-1; I != E; ++I) {
-      if (I->BB == NextBlock) {
-        std::swap(*I, BackCase);
-        break;
-      }
-    }
-  }
-
-  // Create a CaseBlock record representing a conditional branch to
-  // the Case's target mbb if the value being switched on SV is equal
-  // to C.
-  MachineBasicBlock *CurBlock = CR.CaseBB;
-  for (CaseItr I = CR.Range.first, E = CR.Range.second; I != E; ++I) {
-    MachineBasicBlock *FallThrough;
-    if (I != E-1) {
-      FallThrough = CurMF->CreateMachineBasicBlock(CurBlock->getBasicBlock());
-      CurMF->insert(BBI, FallThrough);
-
-      // Put SV in a virtual register to make it available from the new blocks.
-      ExportFromCurrentBlock(SV);
-    } else {
-      // If the last case doesn't match, go to the default block.
-      FallThrough = Default;
-    }
-
-    Value *RHS, *LHS, *MHS;
-    ISD::CondCode CC;
-    if (I->High == I->Low) {
-      // This is just small small case range :) containing exactly 1 case
-      CC = ISD::SETEQ;
-      LHS = SV; RHS = I->High; MHS = NULL;
-    } else {
-      CC = ISD::SETLE;
-      LHS = I->Low; MHS = SV; RHS = I->High;
-    }
-    CaseBlock CB(CC, LHS, RHS, MHS, I->BB, FallThrough, CurBlock);
-
-    // If emitting the first comparison, just call visitSwitchCase to emit the
-    // code into the current block. Otherwise, push the CaseBlock onto the
-    // vector to be later processed by SDISel, and insert the node's MBB
-    // before the next MBB.
-    if (CurBlock == CurMBB)
-      visitSwitchCase(CB);
-    else
-      SwitchCases.push_back(CB);
-
-    CurBlock = FallThrough;
-  }
-
-  return true;
-}
-
-static inline bool areJTsAllowed(const TargetLowering &TLI) {
-  return !DisableJumpTables &&
-          (TLI.isOperationLegalOrCustom(ISD::BR_JT, MVT::Other) ||
-           TLI.isOperationLegalOrCustom(ISD::BRIND, MVT::Other));
-}
-
-static APInt ComputeRange(const APInt &First, const APInt &Last) {
-  APInt LastExt(Last), FirstExt(First);
-  uint32_t BitWidth = std::max(Last.getBitWidth(), First.getBitWidth()) + 1;
-  LastExt.sext(BitWidth); FirstExt.sext(BitWidth);
-  return (LastExt - FirstExt + 1ULL);
-}
-
-/// handleJTSwitchCase - Emit jumptable for current switch case range
-bool SelectionDAGLowering::handleJTSwitchCase(CaseRec& CR,
-                                              CaseRecVector& WorkList,
-                                              Value* SV,
-                                              MachineBasicBlock* Default) {
-  Case& FrontCase = *CR.Range.first;
-  Case& BackCase = *(CR.Range.second-1);
-
-  const APInt &First = cast(FrontCase.Low)->getValue();
-  const APInt &Last = cast(BackCase.High)->getValue();
-
-  APInt TSize(First.getBitWidth(), 0);
-  for (CaseItr I = CR.Range.first, E = CR.Range.second;
-       I!=E; ++I)
-    TSize += I->size();
-
-  if (!areJTsAllowed(TLI) || TSize.ult(APInt(First.getBitWidth(), 4)))
-    return false;
-
-  APInt Range = ComputeRange(First, Last);
-  double Density = TSize.roundToDouble() / Range.roundToDouble();
-  if (Density < 0.4)
-    return false;
-
-  DEBUG(errs() << "Lowering jump table\n"
-               << "First entry: " << First << ". Last entry: " << Last << '\n'
-               << "Range: " << Range
-               << "Size: " << TSize << ". Density: " << Density << "\n\n");
-
-  // Get the MachineFunction which holds the current MBB. This is used when
-  // inserting any additional MBBs necessary to represent the switch.
-  MachineFunction *CurMF = FuncInfo.MF;
-
-  // Figure out which block is immediately after the current one.
-  MachineFunction::iterator BBI = CR.CaseBB;
-  ++BBI;
-
-  const BasicBlock *LLVMBB = CR.CaseBB->getBasicBlock();
-
-  // Create a new basic block to hold the code for loading the address
-  // of the jump table, and jumping to it. Update successor information;
-  // we will either branch to the default case for the switch, or the jump
-  // table.
-  MachineBasicBlock *JumpTableBB = CurMF->CreateMachineBasicBlock(LLVMBB);
-  CurMF->insert(BBI, JumpTableBB);
-  CR.CaseBB->addSuccessor(Default);
-  CR.CaseBB->addSuccessor(JumpTableBB);
-
-  // Build a vector of destination BBs, corresponding to each target
-  // of the jump table. If the value of the jump table slot corresponds to
-  // a case statement, push the case's BB onto the vector, otherwise, push
-  // the default BB.
-  std::vector DestBBs;
-  APInt TEI = First;
-  for (CaseItr I = CR.Range.first, E = CR.Range.second; I != E; ++TEI) {
-    const APInt& Low = cast(I->Low)->getValue();
-    const APInt& High = cast(I->High)->getValue();
-
-    if (Low.sle(TEI) && TEI.sle(High)) {
-      DestBBs.push_back(I->BB);
-      if (TEI==High)
-        ++I;
-    } else {
-      DestBBs.push_back(Default);
-    }
-  }
-
-  // Update successor info. Add one edge to each unique successor.
-  BitVector SuccsHandled(CR.CaseBB->getParent()->getNumBlockIDs());
-  for (std::vector::iterator I = DestBBs.begin(),
-         E = DestBBs.end(); I != E; ++I) {
-    if (!SuccsHandled[(*I)->getNumber()]) {
-      SuccsHandled[(*I)->getNumber()] = true;
-      JumpTableBB->addSuccessor(*I);
-    }
-  }
-
-  // Create a jump table index for this jump table, or return an existing
-  // one.
-  unsigned JTI = CurMF->getJumpTableInfo()->getJumpTableIndex(DestBBs);
-
-  // Set the jump table information so that we can codegen it as a second
-  // MachineBasicBlock
-  JumpTable JT(-1U, JTI, JumpTableBB, Default);
-  JumpTableHeader JTH(First, Last, SV, CR.CaseBB, (CR.CaseBB == CurMBB));
-  if (CR.CaseBB == CurMBB)
-    visitJumpTableHeader(JT, JTH);
-
-  JTCases.push_back(JumpTableBlock(JTH, JT));
-
-  return true;
-}
-
-/// handleBTSplitSwitchCase - emit comparison and split binary search tree into
-/// 2 subtrees.
-bool SelectionDAGLowering::handleBTSplitSwitchCase(CaseRec& CR,
-                                                   CaseRecVector& WorkList,
-                                                   Value* SV,
-                                                   MachineBasicBlock* Default) {
-  // Get the MachineFunction which holds the current MBB. This is used when
-  // inserting any additional MBBs necessary to represent the switch.
-  MachineFunction *CurMF = FuncInfo.MF;
-
-  // Figure out which block is immediately after the current one.
-  MachineFunction::iterator BBI = CR.CaseBB;
-  ++BBI;
-
-  Case& FrontCase = *CR.Range.first;
-  Case& BackCase = *(CR.Range.second-1);
-  const BasicBlock *LLVMBB = CR.CaseBB->getBasicBlock();
-
-  // Size is the number of Cases represented by this range.
-  unsigned Size = CR.Range.second - CR.Range.first;
-
-  const APInt &First = cast(FrontCase.Low)->getValue();
-  const APInt &Last = cast(BackCase.High)->getValue();
-  double FMetric = 0;
-  CaseItr Pivot = CR.Range.first + Size/2;
-
-  // Select optimal pivot, maximizing sum density of LHS and RHS. This will
-  // (heuristically) allow us to emit JumpTable's later.
-  APInt TSize(First.getBitWidth(), 0);
-  for (CaseItr I = CR.Range.first, E = CR.Range.second;
-       I!=E; ++I)
-    TSize += I->size();
-
-  APInt LSize = FrontCase.size();
-  APInt RSize = TSize-LSize;
-  DEBUG(errs() << "Selecting best pivot: \n"
-               << "First: " << First << ", Last: " << Last <<'\n'
-               << "LSize: " << LSize << ", RSize: " << RSize << '\n');
-  for (CaseItr I = CR.Range.first, J=I+1, E = CR.Range.second;
-       J!=E; ++I, ++J) {
-    const APInt &LEnd = cast(I->High)->getValue();
-    const APInt &RBegin = cast(J->Low)->getValue();
-    APInt Range = ComputeRange(LEnd, RBegin);
-    assert((Range - 2ULL).isNonNegative() &&
-           "Invalid case distance");
-    double LDensity = (double)LSize.roundToDouble() /
-                        (LEnd - First + 1ULL).roundToDouble();
-    double RDensity = (double)RSize.roundToDouble() /
-                        (Last - RBegin + 1ULL).roundToDouble();
-    double Metric = Range.logBase2()*(LDensity+RDensity);
-    // Should always split in some non-trivial place
-    DEBUG(errs() <<"=>Step\n"
-                 << "LEnd: " << LEnd << ", RBegin: " << RBegin << '\n'
-                 << "LDensity: " << LDensity
-                 << ", RDensity: " << RDensity << '\n'
-                 << "Metric: " << Metric << '\n');
-    if (FMetric < Metric) {
-      Pivot = J;
-      FMetric = Metric;
-      DEBUG(errs() << "Current metric set to: " << FMetric << '\n');
-    }
-
-    LSize += J->size();
-    RSize -= J->size();
-  }
-  if (areJTsAllowed(TLI)) {
-    // If our case is dense we *really* should handle it earlier!
-    assert((FMetric > 0) && "Should handle dense range earlier!");
-  } else {
-    Pivot = CR.Range.first + Size/2;
-  }
-
-  CaseRange LHSR(CR.Range.first, Pivot);
-  CaseRange RHSR(Pivot, CR.Range.second);
-  Constant *C = Pivot->Low;
-  MachineBasicBlock *FalseBB = 0, *TrueBB = 0;
-
-  // We know that we branch to the LHS if the Value being switched on is
-  // less than the Pivot value, C. We use this to optimize our binary
-  // tree a bit, by recognizing that if SV is greater than or equal to the
-  // LHS's Case Value, and that Case Value is exactly one less than the
-  // Pivot's Value, then we can branch directly to the LHS's Target,
-  // rather than creating a leaf node for it.
-  if ((LHSR.second - LHSR.first) == 1 &&
-      LHSR.first->High == CR.GE &&
-      cast(C)->getValue() ==
-      (cast(CR.GE)->getValue() + 1LL)) {
-    TrueBB = LHSR.first->BB;
-  } else {
-    TrueBB = CurMF->CreateMachineBasicBlock(LLVMBB);
-    CurMF->insert(BBI, TrueBB);
-    WorkList.push_back(CaseRec(TrueBB, C, CR.GE, LHSR));
-
-    // Put SV in a virtual register to make it available from the new blocks.
-    ExportFromCurrentBlock(SV);
-  }
-
-  // Similar to the optimization above, if the Value being switched on is
-  // known to be less than the Constant CR.LT, and the current Case Value
-  // is CR.LT - 1, then we can branch directly to the target block for
-  // the current Case Value, rather than emitting a RHS leaf node for it.
-  if ((RHSR.second - RHSR.first) == 1 && CR.LT &&
-      cast(RHSR.first->Low)->getValue() ==
-      (cast(CR.LT)->getValue() - 1LL)) {
-    FalseBB = RHSR.first->BB;
-  } else {
-    FalseBB = CurMF->CreateMachineBasicBlock(LLVMBB);
-    CurMF->insert(BBI, FalseBB);
-    WorkList.push_back(CaseRec(FalseBB,CR.LT,C,RHSR));
-
-    // Put SV in a virtual register to make it available from the new blocks.
-    ExportFromCurrentBlock(SV);
-  }
-
-  // Create a CaseBlock record representing a conditional branch to
-  // the LHS node if the value being switched on SV is less than C.
-  // Otherwise, branch to LHS.
-  CaseBlock CB(ISD::SETLT, SV, C, NULL, TrueBB, FalseBB, CR.CaseBB);
-
-  if (CR.CaseBB == CurMBB)
-    visitSwitchCase(CB);
-  else
-    SwitchCases.push_back(CB);
-
-  return true;
-}
-
-/// handleBitTestsSwitchCase - if current case range has few destination and
-/// range span less, than machine word bitwidth, encode case range into series
-/// of masks and emit bit tests with these masks.
-bool SelectionDAGLowering::handleBitTestsSwitchCase(CaseRec& CR,
-                                                    CaseRecVector& WorkList,
-                                                    Value* SV,
-                                                    MachineBasicBlock* Default){
-  EVT PTy = TLI.getPointerTy();
-  unsigned IntPtrBits = PTy.getSizeInBits();
-
-  Case& FrontCase = *CR.Range.first;
-  Case& BackCase = *(CR.Range.second-1);
-
-  // Get the MachineFunction which holds the current MBB. This is used when
-  // inserting any additional MBBs necessary to represent the switch.
-  MachineFunction *CurMF = FuncInfo.MF;
-
-  // If target does not have legal shift left, do not emit bit tests at all.
-  if (!TLI.isOperationLegal(ISD::SHL, TLI.getPointerTy()))
-    return false;
-
-  size_t numCmps = 0;
-  for (CaseItr I = CR.Range.first, E = CR.Range.second;
-       I!=E; ++I) {
-    // Single case counts one, case range - two.
-    numCmps += (I->Low == I->High ? 1 : 2);
-  }
-
-  // Count unique destinations
-  SmallSet Dests;
-  for (CaseItr I = CR.Range.first, E = CR.Range.second; I!=E; ++I) {
-    Dests.insert(I->BB);
-    if (Dests.size() > 3)
-      // Don't bother the code below, if there are too much unique destinations
-      return false;
-  }
-  DEBUG(errs() << "Total number of unique destinations: " << Dests.size() << '\n'
-               << "Total number of comparisons: " << numCmps << '\n');
-
-  // Compute span of values.
-  const APInt& minValue = cast(FrontCase.Low)->getValue();
-  const APInt& maxValue = cast(BackCase.High)->getValue();
-  APInt cmpRange = maxValue - minValue;
-
-  DEBUG(errs() << "Compare range: " << cmpRange << '\n'
-               << "Low bound: " << minValue << '\n'
-               << "High bound: " << maxValue << '\n');
-
-  if (cmpRange.uge(APInt(cmpRange.getBitWidth(), IntPtrBits)) ||
-      (!(Dests.size() == 1 && numCmps >= 3) &&
-       !(Dests.size() == 2 && numCmps >= 5) &&
-       !(Dests.size() >= 3 && numCmps >= 6)))
-    return false;
-
-  DEBUG(errs() << "Emitting bit tests\n");
-  APInt lowBound = APInt::getNullValue(cmpRange.getBitWidth());
-
-  // Optimize the case where all the case values fit in a
-  // word without having to subtract minValue. In this case,
-  // we can optimize away the subtraction.
-  if (minValue.isNonNegative() &&
-      maxValue.slt(APInt(maxValue.getBitWidth(), IntPtrBits))) {
-    cmpRange = maxValue;
-  } else {
-    lowBound = minValue;
-  }
-
-  CaseBitsVector CasesBits;
-  unsigned i, count = 0;
-
-  for (CaseItr I = CR.Range.first, E = CR.Range.second; I!=E; ++I) {
-    MachineBasicBlock* Dest = I->BB;
-    for (i = 0; i < count; ++i)
-      if (Dest == CasesBits[i].BB)
-        break;
-
-    if (i == count) {
-      assert((count < 3) && "Too much destinations to test!");
-      CasesBits.push_back(CaseBits(0, Dest, 0));
-      count++;
-    }
-
-    const APInt& lowValue = cast(I->Low)->getValue();
-    const APInt& highValue = cast(I->High)->getValue();
-
-    uint64_t lo = (lowValue - lowBound).getZExtValue();
-    uint64_t hi = (highValue - lowBound).getZExtValue();
-
-    for (uint64_t j = lo; j <= hi; j++) {
-      CasesBits[i].Mask |= 1ULL << j;
-      CasesBits[i].Bits++;
-    }
-
-  }
-  std::sort(CasesBits.begin(), CasesBits.end(), CaseBitsCmp());
-
-  BitTestInfo BTC;
-
-  // Figure out which block is immediately after the current one.
-  MachineFunction::iterator BBI = CR.CaseBB;
-  ++BBI;
-
-  const BasicBlock *LLVMBB = CR.CaseBB->getBasicBlock();
-
-  DEBUG(errs() << "Cases:\n");
-  for (unsigned i = 0, e = CasesBits.size(); i!=e; ++i) {
-    DEBUG(errs() << "Mask: " << CasesBits[i].Mask
-                 << ", Bits: " << CasesBits[i].Bits
-                 << ", BB: " << CasesBits[i].BB << '\n');
-
-    MachineBasicBlock *CaseBB = CurMF->CreateMachineBasicBlock(LLVMBB);
-    CurMF->insert(BBI, CaseBB);
-    BTC.push_back(BitTestCase(CasesBits[i].Mask,
-                              CaseBB,
-                              CasesBits[i].BB));
-
-    // Put SV in a virtual register to make it available from the new blocks.
-    ExportFromCurrentBlock(SV);
-  }
-
-  BitTestBlock BTB(lowBound, cmpRange, SV,
-                   -1U, (CR.CaseBB == CurMBB),
-                   CR.CaseBB, Default, BTC);
-
-  if (CR.CaseBB == CurMBB)
-    visitBitTestHeader(BTB);
-
-  BitTestCases.push_back(BTB);
-
-  return true;
-}
-
-
-/// Clusterify - Transform simple list of Cases into list of CaseRange's
-size_t SelectionDAGLowering::Clusterify(CaseVector& Cases,
-                                        const SwitchInst& SI) {
-  size_t numCmps = 0;
-
-  // Start with "simple" cases
-  for (size_t i = 1; i < SI.getNumSuccessors(); ++i) {
-    MachineBasicBlock *SMBB = FuncInfo.MBBMap[SI.getSuccessor(i)];
-    Cases.push_back(Case(SI.getSuccessorValue(i),
-                         SI.getSuccessorValue(i),
-                         SMBB));
-  }
-  std::sort(Cases.begin(), Cases.end(), CaseCmp());
-
-  // Merge case into clusters
-  if (Cases.size() >= 2)
-    // Must recompute end() each iteration because it may be
-    // invalidated by erase if we hold on to it
-    for (CaseItr I = Cases.begin(), J = ++(Cases.begin()); J != Cases.end(); ) {
-      const APInt& nextValue = cast(J->Low)->getValue();
-      const APInt& currentValue = cast(I->High)->getValue();
-      MachineBasicBlock* nextBB = J->BB;
-      MachineBasicBlock* currentBB = I->BB;
-
-      // If the two neighboring cases go to the same destination, merge them
-      // into a single case.
-      if ((nextValue - currentValue == 1) && (currentBB == nextBB)) {
-        I->High = J->High;
-        J = Cases.erase(J);
-      } else {
-        I = J++;
-      }
-    }
-
-  for (CaseItr I=Cases.begin(), E=Cases.end(); I!=E; ++I, ++numCmps) {
-    if (I->Low != I->High)
-      // A range counts double, since it requires two compares.
-      ++numCmps;
-  }
-
-  return numCmps;
-}
-
-void SelectionDAGLowering::visitSwitch(SwitchInst &SI) {
-  // Figure out which block is immediately after the current one.
-  MachineBasicBlock *NextBlock = 0;
-
-  MachineBasicBlock *Default = FuncInfo.MBBMap[SI.getDefaultDest()];
-
-  // If there is only the default destination, branch to it if it is not the
-  // next basic block. Otherwise, just fall through.
- if (SI.getNumOperands() == 2) { - // Update machine-CFG edges. - - // If this is not a fall-through branch, emit the branch. - CurMBB->addSuccessor(Default); - if (Default != NextBlock) - DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(), - MVT::Other, getControlRoot(), - DAG.getBasicBlock(Default))); - return; - } - - // If there are any non-default case statements, create a vector of Cases - // representing each one, and sort the vector so that we can efficiently - // create a binary search tree from them. - CaseVector Cases; - size_t numCmps = Clusterify(Cases, SI); - DEBUG(errs() << "Clusterify finished. Total clusters: " << Cases.size() - << ". Total compares: " << numCmps << '\n'); - numCmps = 0; - - // Get the Value to be switched on and default basic blocks, which will be - // inserted into CaseBlock records, representing basic blocks in the binary - // search tree. - Value *SV = SI.getOperand(0); - - // Push the initial CaseRec onto the worklist - CaseRecVector WorkList; - WorkList.push_back(CaseRec(CurMBB,0,0,CaseRange(Cases.begin(),Cases.end()))); - - while (!WorkList.empty()) { - // Grab a record representing a case range to process off the worklist - CaseRec CR = WorkList.back(); - WorkList.pop_back(); - - if (handleBitTestsSwitchCase(CR, WorkList, SV, Default)) - continue; - - // If the range has few cases (two or less) emit a series of specific - // tests. - if (handleSmallSwitchRange(CR, WorkList, SV, Default)) - continue; - - // If the switch has more than 5 blocks, and at least 40% dense, and the - // target supports indirect branches, then emit a jump table rather than - // lowering the switch to a binary tree of conditional branches. - if (handleJTSwitchCase(CR, WorkList, SV, Default)) - continue; - - // Emit binary tree. We need to pick a pivot, and push left and right ranges - // onto the worklist. Leafs are handled via handleSmallSwitchRange() call. 
- handleBTSplitSwitchCase(CR, WorkList, SV, Default); - } -} - -void SelectionDAGLowering::visitIndirectBr(IndirectBrInst &I) { - // Update machine-CFG edges. - for (unsigned i = 0, e = I.getNumSuccessors(); i != e; ++i) - CurMBB->addSuccessor(FuncInfo.MBBMap[I.getSuccessor(i)]); - - DAG.setRoot(DAG.getNode(ISD::BRIND, getCurDebugLoc(), - MVT::Other, getControlRoot(), - getValue(I.getAddress()))); -} - - -void SelectionDAGLowering::visitFSub(User &I) { - // -0.0 - X --> fneg - const Type *Ty = I.getType(); - if (isa<VectorType>(Ty)) { - if (ConstantVector *CV = dyn_cast<ConstantVector>(I.getOperand(0))) { - const VectorType *DestTy = cast<VectorType>(I.getType()); - const Type *ElTy = DestTy->getElementType(); - unsigned VL = DestTy->getNumElements(); - std::vector<Constant*> NZ(VL, ConstantFP::getNegativeZero(ElTy)); - Constant *CNZ = ConstantVector::get(&NZ[0], NZ.size()); - if (CV == CNZ) { - SDValue Op2 = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FNEG, getCurDebugLoc(), - Op2.getValueType(), Op2)); - return; - } - } - } - if (ConstantFP *CFP = dyn_cast<ConstantFP>(I.getOperand(0))) - if (CFP->isExactlyValue(ConstantFP::getNegativeZero(Ty)->getValueAPF())) { - SDValue Op2 = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FNEG, getCurDebugLoc(), - Op2.getValueType(), Op2)); - return; - } - - visitBinary(I, ISD::FSUB); -} - -void SelectionDAGLowering::visitBinary(User &I, unsigned OpCode) { - SDValue Op1 = getValue(I.getOperand(0)); - SDValue Op2 = getValue(I.getOperand(1)); - - setValue(&I, DAG.getNode(OpCode, getCurDebugLoc(), - Op1.getValueType(), Op1, Op2)); -} - -void SelectionDAGLowering::visitShift(User &I, unsigned Opcode) { - SDValue Op1 = getValue(I.getOperand(0)); - SDValue Op2 = getValue(I.getOperand(1)); - if (!isa<VectorType>(I.getType()) && - Op2.getValueType() != TLI.getShiftAmountTy()) { - // If the operand is smaller than the shift count type, promote it. 
- EVT PTy = TLI.getPointerTy(); - EVT STy = TLI.getShiftAmountTy(); - if (STy.bitsGT(Op2.getValueType())) - Op2 = DAG.getNode(ISD::ANY_EXTEND, getCurDebugLoc(), - TLI.getShiftAmountTy(), Op2); - // If the operand is larger than the shift count type but the shift - // count type has enough bits to represent any shift value, truncate - // it now. This is a common case and it exposes the truncate to - // optimization early. - else if (STy.getSizeInBits() >= - Log2_32_Ceil(Op2.getValueType().getSizeInBits())) - Op2 = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), - TLI.getShiftAmountTy(), Op2); - // Otherwise we'll need to temporarily settle for some other - // convenient type; type legalization will make adjustments as - // needed. - else if (PTy.bitsLT(Op2.getValueType())) - Op2 = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), - TLI.getPointerTy(), Op2); - else if (PTy.bitsGT(Op2.getValueType())) - Op2 = DAG.getNode(ISD::ANY_EXTEND, getCurDebugLoc(), - TLI.getPointerTy(), Op2); - } - - setValue(&I, DAG.getNode(Opcode, getCurDebugLoc(), - Op1.getValueType(), Op1, Op2)); -} - -void SelectionDAGLowering::visitICmp(User &I) { - ICmpInst::Predicate predicate = ICmpInst::BAD_ICMP_PREDICATE; - if (ICmpInst *IC = dyn_cast<ICmpInst>(&I)) - predicate = IC->getPredicate(); - else if (ConstantExpr *IC = dyn_cast<ConstantExpr>(&I)) - predicate = ICmpInst::Predicate(IC->getPredicate()); - SDValue Op1 = getValue(I.getOperand(0)); - SDValue Op2 = getValue(I.getOperand(1)); - ISD::CondCode Opcode = getICmpCondCode(predicate); - - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getSetCC(getCurDebugLoc(), DestVT, Op1, Op2, Opcode)); -} - -void SelectionDAGLowering::visitFCmp(User &I) { - FCmpInst::Predicate predicate = FCmpInst::BAD_FCMP_PREDICATE; - if (FCmpInst *FC = dyn_cast<FCmpInst>(&I)) - predicate = FC->getPredicate(); - else if (ConstantExpr *FC = dyn_cast<ConstantExpr>(&I)) - predicate = FCmpInst::Predicate(FC->getPredicate()); - SDValue Op1 = getValue(I.getOperand(0)); - SDValue Op2 = getValue(I.getOperand(1)); 
- ISD::CondCode Condition = getFCmpCondCode(predicate); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getSetCC(getCurDebugLoc(), DestVT, Op1, Op2, Condition)); -} - -void SelectionDAGLowering::visitSelect(User &I) { - SmallVector ValueVTs; - ComputeValueVTs(TLI, I.getType(), ValueVTs); - unsigned NumValues = ValueVTs.size(); - if (NumValues != 0) { - SmallVector Values(NumValues); - SDValue Cond = getValue(I.getOperand(0)); - SDValue TrueVal = getValue(I.getOperand(1)); - SDValue FalseVal = getValue(I.getOperand(2)); - - for (unsigned i = 0; i != NumValues; ++i) - Values[i] = DAG.getNode(ISD::SELECT, getCurDebugLoc(), - TrueVal.getValueType(), Cond, - SDValue(TrueVal.getNode(), TrueVal.getResNo() + i), - SDValue(FalseVal.getNode(), FalseVal.getResNo() + i)); - - setValue(&I, DAG.getNode(ISD::MERGE_VALUES, getCurDebugLoc(), - DAG.getVTList(&ValueVTs[0], NumValues), - &Values[0], NumValues)); - } -} - - -void SelectionDAGLowering::visitTrunc(User &I) { - // TruncInst cannot be a no-op cast because sizeof(src) > sizeof(dest). - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitZExt(User &I) { - // ZExt cannot be a no-op cast because sizeof(src) < sizeof(dest). - // ZExt also can't be a cast to bool for same reason. So, nothing much to do - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::ZERO_EXTEND, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitSExt(User &I) { - // SExt cannot be a no-op cast because sizeof(src) < sizeof(dest). - // SExt also can't be a cast to bool for same reason. 
So, nothing much to do - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::SIGN_EXTEND, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitFPTrunc(User &I) { - // FPTrunc is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::FP_ROUND, getCurDebugLoc(), - DestVT, N, DAG.getIntPtrConstant(0))); -} - -void SelectionDAGLowering::visitFPExt(User &I){ - // FPTrunc is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::FP_EXTEND, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitFPToUI(User &I) { - // FPToUI is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::FP_TO_UINT, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitFPToSI(User &I) { - // FPToSI is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::FP_TO_SINT, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitUIToFP(User &I) { - // UIToFP is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::UINT_TO_FP, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitSIToFP(User &I){ - // SIToFP is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::SINT_TO_FP, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitPtrToInt(User &I) { - // What to do depends on the size of the integer and the size of the pointer. 
- // We can either truncate, zero extend, or no-op, accordingly. - SDValue N = getValue(I.getOperand(0)); - EVT SrcVT = N.getValueType(); - EVT DestVT = TLI.getValueType(I.getType()); - SDValue Result = DAG.getZExtOrTrunc(N, getCurDebugLoc(), DestVT); - setValue(&I, Result); -} - -void SelectionDAGLowering::visitIntToPtr(User &I) { - // What to do depends on the size of the integer and the size of the pointer. - // We can either truncate, zero extend, or no-op, accordingly. - SDValue N = getValue(I.getOperand(0)); - EVT SrcVT = N.getValueType(); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getZExtOrTrunc(N, getCurDebugLoc(), DestVT)); -} - -void SelectionDAGLowering::visitBitCast(User &I) { - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - - // BitCast assures us that source and destination are the same size so this - // is either a BIT_CONVERT or a no-op. - if (DestVT != N.getValueType()) - setValue(&I, DAG.getNode(ISD::BIT_CONVERT, getCurDebugLoc(), - DestVT, N)); // convert types - else - setValue(&I, N); // noop cast. 
-} - -void SelectionDAGLowering::visitInsertElement(User &I) { - SDValue InVec = getValue(I.getOperand(0)); - SDValue InVal = getValue(I.getOperand(1)); - SDValue InIdx = DAG.getNode(ISD::ZERO_EXTEND, getCurDebugLoc(), - TLI.getPointerTy(), - getValue(I.getOperand(2))); - - setValue(&I, DAG.getNode(ISD::INSERT_VECTOR_ELT, getCurDebugLoc(), - TLI.getValueType(I.getType()), - InVec, InVal, InIdx)); -} - -void SelectionDAGLowering::visitExtractElement(User &I) { - SDValue InVec = getValue(I.getOperand(0)); - SDValue InIdx = DAG.getNode(ISD::ZERO_EXTEND, getCurDebugLoc(), - TLI.getPointerTy(), - getValue(I.getOperand(1))); - setValue(&I, DAG.getNode(ISD::EXTRACT_VECTOR_ELT, getCurDebugLoc(), - TLI.getValueType(I.getType()), InVec, InIdx)); -} - - -// Utility for visitShuffleVector - Returns true if the mask is mask starting -// from SIndx and increasing to the element length (undefs are allowed). -static bool SequentialMask(SmallVectorImpl &Mask, unsigned SIndx) { - unsigned MaskNumElts = Mask.size(); - for (unsigned i = 0; i != MaskNumElts; ++i) - if ((Mask[i] >= 0) && (Mask[i] != (int)(i + SIndx))) - return false; - return true; -} - -void SelectionDAGLowering::visitShuffleVector(User &I) { - SmallVector Mask; - SDValue Src1 = getValue(I.getOperand(0)); - SDValue Src2 = getValue(I.getOperand(1)); - - // Convert the ConstantVector mask operand into an array of ints, with -1 - // representing undef values. 
- SmallVector MaskElts; - cast(I.getOperand(2))->getVectorElements(*DAG.getContext(), - MaskElts); - unsigned MaskNumElts = MaskElts.size(); - for (unsigned i = 0; i != MaskNumElts; ++i) { - if (isa(MaskElts[i])) - Mask.push_back(-1); - else - Mask.push_back(cast(MaskElts[i])->getSExtValue()); - } - - EVT VT = TLI.getValueType(I.getType()); - EVT SrcVT = Src1.getValueType(); - unsigned SrcNumElts = SrcVT.getVectorNumElements(); - - if (SrcNumElts == MaskNumElts) { - setValue(&I, DAG.getVectorShuffle(VT, getCurDebugLoc(), Src1, Src2, - &Mask[0])); - return; - } - - // Normalize the shuffle vector since mask and vector length don't match. - if (SrcNumElts < MaskNumElts && MaskNumElts % SrcNumElts == 0) { - // Mask is longer than the source vectors and is a multiple of the source - // vectors. We can use concatenate vector to make the mask and vectors - // lengths match. - if (SrcNumElts*2 == MaskNumElts && SequentialMask(Mask, 0)) { - // The shuffle is concatenating two vectors together. - setValue(&I, DAG.getNode(ISD::CONCAT_VECTORS, getCurDebugLoc(), - VT, Src1, Src2)); - return; - } - - // Pad both vectors with undefs to make them the same length as the mask. - unsigned NumConcat = MaskNumElts / SrcNumElts; - bool Src1U = Src1.getOpcode() == ISD::UNDEF; - bool Src2U = Src2.getOpcode() == ISD::UNDEF; - SDValue UndefVal = DAG.getUNDEF(SrcVT); - - SmallVector MOps1(NumConcat, UndefVal); - SmallVector MOps2(NumConcat, UndefVal); - MOps1[0] = Src1; - MOps2[0] = Src2; - - Src1 = Src1U ? DAG.getUNDEF(VT) : DAG.getNode(ISD::CONCAT_VECTORS, - getCurDebugLoc(), VT, - &MOps1[0], NumConcat); - Src2 = Src2U ? DAG.getUNDEF(VT) : DAG.getNode(ISD::CONCAT_VECTORS, - getCurDebugLoc(), VT, - &MOps2[0], NumConcat); - - // Readjust mask for new input vector length. 
- SmallVector MappedOps; - for (unsigned i = 0; i != MaskNumElts; ++i) { - int Idx = Mask[i]; - if (Idx < (int)SrcNumElts) - MappedOps.push_back(Idx); - else - MappedOps.push_back(Idx + MaskNumElts - SrcNumElts); - } - setValue(&I, DAG.getVectorShuffle(VT, getCurDebugLoc(), Src1, Src2, - &MappedOps[0])); - return; - } - - if (SrcNumElts > MaskNumElts) { - // Analyze the access pattern of the vector to see if we can extract - // two subvectors and do the shuffle. The analysis is done by calculating - // the range of elements the mask access on both vectors. - int MinRange[2] = { SrcNumElts+1, SrcNumElts+1}; - int MaxRange[2] = {-1, -1}; - - for (unsigned i = 0; i != MaskNumElts; ++i) { - int Idx = Mask[i]; - int Input = 0; - if (Idx < 0) - continue; - - if (Idx >= (int)SrcNumElts) { - Input = 1; - Idx -= SrcNumElts; - } - if (Idx > MaxRange[Input]) - MaxRange[Input] = Idx; - if (Idx < MinRange[Input]) - MinRange[Input] = Idx; - } - - // Check if the access is smaller than the vector size and can we find - // a reasonable extract index. - int RangeUse[2] = { 2, 2 }; // 0 = Unused, 1 = Extract, 2 = Can not Extract. - int StartIdx[2]; // StartIdx to extract from - for (int Input=0; Input < 2; ++Input) { - if (MinRange[Input] == (int)(SrcNumElts+1) && MaxRange[Input] == -1) { - RangeUse[Input] = 0; // Unused - StartIdx[Input] = 0; - } else if (MaxRange[Input] - MinRange[Input] < (int)MaskNumElts) { - // Fits within range but we should see if we can find a good - // start index that is a multiple of the mask length. - if (MaxRange[Input] < (int)MaskNumElts) { - RangeUse[Input] = 1; // Extract from beginning of the vector - StartIdx[Input] = 0; - } else { - StartIdx[Input] = (MinRange[Input]/MaskNumElts)*MaskNumElts; - if (MaxRange[Input] - StartIdx[Input] < (int)MaskNumElts && - StartIdx[Input] + MaskNumElts < SrcNumElts) - RangeUse[Input] = 1; // Extract from a multiple of the mask length. 
- } - } - } - - if (RangeUse[0] == 0 && RangeUse[1] == 0) { - setValue(&I, DAG.getUNDEF(VT)); // Vectors are not used. - return; - } - else if (RangeUse[0] < 2 && RangeUse[1] < 2) { - // Extract appropriate subvector and generate a vector shuffle - for (int Input=0; Input < 2; ++Input) { - SDValue& Src = Input == 0 ? Src1 : Src2; - if (RangeUse[Input] == 0) { - Src = DAG.getUNDEF(VT); - } else { - Src = DAG.getNode(ISD::EXTRACT_SUBVECTOR, getCurDebugLoc(), VT, - Src, DAG.getIntPtrConstant(StartIdx[Input])); - } - } - // Calculate new mask. - SmallVector MappedOps; - for (unsigned i = 0; i != MaskNumElts; ++i) { - int Idx = Mask[i]; - if (Idx < 0) - MappedOps.push_back(Idx); - else if (Idx < (int)SrcNumElts) - MappedOps.push_back(Idx - StartIdx[0]); - else - MappedOps.push_back(Idx - SrcNumElts - StartIdx[1] + MaskNumElts); - } - setValue(&I, DAG.getVectorShuffle(VT, getCurDebugLoc(), Src1, Src2, - &MappedOps[0])); - return; - } - } - - // We can't use either concat vectors or extract subvectors so fall back to - // replacing the shuffle with extract and build vector. - // to insert and build vector. 
- EVT EltVT = VT.getVectorElementType(); - EVT PtrVT = TLI.getPointerTy(); - SmallVector Ops; - for (unsigned i = 0; i != MaskNumElts; ++i) { - if (Mask[i] < 0) { - Ops.push_back(DAG.getUNDEF(EltVT)); - } else { - int Idx = Mask[i]; - if (Idx < (int)SrcNumElts) - Ops.push_back(DAG.getNode(ISD::EXTRACT_VECTOR_ELT, getCurDebugLoc(), - EltVT, Src1, DAG.getConstant(Idx, PtrVT))); - else - Ops.push_back(DAG.getNode(ISD::EXTRACT_VECTOR_ELT, getCurDebugLoc(), - EltVT, Src2, - DAG.getConstant(Idx - SrcNumElts, PtrVT))); - } - } - setValue(&I, DAG.getNode(ISD::BUILD_VECTOR, getCurDebugLoc(), - VT, &Ops[0], Ops.size())); -} - -void SelectionDAGLowering::visitInsertValue(InsertValueInst &I) { - const Value *Op0 = I.getOperand(0); - const Value *Op1 = I.getOperand(1); - const Type *AggTy = I.getType(); - const Type *ValTy = Op1->getType(); - bool IntoUndef = isa(Op0); - bool FromUndef = isa(Op1); - - unsigned LinearIndex = ComputeLinearIndex(TLI, AggTy, - I.idx_begin(), I.idx_end()); - - SmallVector AggValueVTs; - ComputeValueVTs(TLI, AggTy, AggValueVTs); - SmallVector ValValueVTs; - ComputeValueVTs(TLI, ValTy, ValValueVTs); - - unsigned NumAggValues = AggValueVTs.size(); - unsigned NumValValues = ValValueVTs.size(); - SmallVector Values(NumAggValues); - - SDValue Agg = getValue(Op0); - SDValue Val = getValue(Op1); - unsigned i = 0; - // Copy the beginning value(s) from the original aggregate. - for (; i != LinearIndex; ++i) - Values[i] = IntoUndef ? DAG.getUNDEF(AggValueVTs[i]) : - SDValue(Agg.getNode(), Agg.getResNo() + i); - // Copy values from the inserted value(s). - for (; i != LinearIndex + NumValValues; ++i) - Values[i] = FromUndef ? DAG.getUNDEF(AggValueVTs[i]) : - SDValue(Val.getNode(), Val.getResNo() + i - LinearIndex); - // Copy remaining value(s) from the original aggregate. - for (; i != NumAggValues; ++i) - Values[i] = IntoUndef ? 
DAG.getUNDEF(AggValueVTs[i]) : - SDValue(Agg.getNode(), Agg.getResNo() + i); - - setValue(&I, DAG.getNode(ISD::MERGE_VALUES, getCurDebugLoc(), - DAG.getVTList(&AggValueVTs[0], NumAggValues), - &Values[0], NumAggValues)); -} - -void SelectionDAGLowering::visitExtractValue(ExtractValueInst &I) { - const Value *Op0 = I.getOperand(0); - const Type *AggTy = Op0->getType(); - const Type *ValTy = I.getType(); - bool OutOfUndef = isa(Op0); - - unsigned LinearIndex = ComputeLinearIndex(TLI, AggTy, - I.idx_begin(), I.idx_end()); - - SmallVector ValValueVTs; - ComputeValueVTs(TLI, ValTy, ValValueVTs); - - unsigned NumValValues = ValValueVTs.size(); - SmallVector Values(NumValValues); - - SDValue Agg = getValue(Op0); - // Copy out the selected value(s). - for (unsigned i = LinearIndex; i != LinearIndex + NumValValues; ++i) - Values[i - LinearIndex] = - OutOfUndef ? - DAG.getUNDEF(Agg.getNode()->getValueType(Agg.getResNo() + i)) : - SDValue(Agg.getNode(), Agg.getResNo() + i); - - setValue(&I, DAG.getNode(ISD::MERGE_VALUES, getCurDebugLoc(), - DAG.getVTList(&ValValueVTs[0], NumValValues), - &Values[0], NumValValues)); -} - - -void SelectionDAGLowering::visitGetElementPtr(User &I) { - SDValue N = getValue(I.getOperand(0)); - const Type *Ty = I.getOperand(0)->getType(); - - for (GetElementPtrInst::op_iterator OI = I.op_begin()+1, E = I.op_end(); - OI != E; ++OI) { - Value *Idx = *OI; - if (const StructType *StTy = dyn_cast(Ty)) { - unsigned Field = cast(Idx)->getZExtValue(); - if (Field) { - // N = N + Offset - uint64_t Offset = TD->getStructLayout(StTy)->getElementOffset(Field); - N = DAG.getNode(ISD::ADD, getCurDebugLoc(), N.getValueType(), N, - DAG.getIntPtrConstant(Offset)); - } - Ty = StTy->getElementType(Field); - } else { - Ty = cast(Ty)->getElementType(); - - // If this is a constant subscript, handle it quickly. 
- if (ConstantInt *CI = dyn_cast<ConstantInt>(Idx)) { - if (CI->getZExtValue() == 0) continue; - uint64_t Offs = - TD->getTypeAllocSize(Ty)*cast<ConstantInt>(CI)->getSExtValue(); - SDValue OffsVal; - EVT PTy = TLI.getPointerTy(); - unsigned PtrBits = PTy.getSizeInBits(); - if (PtrBits < 64) { - OffsVal = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), - TLI.getPointerTy(), - DAG.getConstant(Offs, MVT::i64)); - } else - OffsVal = DAG.getIntPtrConstant(Offs); - N = DAG.getNode(ISD::ADD, getCurDebugLoc(), N.getValueType(), N, - OffsVal); - continue; - } - - // N = N + Idx * ElementSize; - APInt ElementSize = APInt(TLI.getPointerTy().getSizeInBits(), - TD->getTypeAllocSize(Ty)); - SDValue IdxN = getValue(Idx); - - // If the index is smaller or larger than intptr_t, truncate or extend - // it. - IdxN = DAG.getSExtOrTrunc(IdxN, getCurDebugLoc(), N.getValueType()); - - // If this is a multiply by a power of two, turn it into a shl - // immediately. This is a very common case. - if (ElementSize != 1) { - if (ElementSize.isPowerOf2()) { - unsigned Amt = ElementSize.logBase2(); - IdxN = DAG.getNode(ISD::SHL, getCurDebugLoc(), - N.getValueType(), IdxN, - DAG.getConstant(Amt, TLI.getPointerTy())); - } else { - SDValue Scale = DAG.getConstant(ElementSize, TLI.getPointerTy()); - IdxN = DAG.getNode(ISD::MUL, getCurDebugLoc(), - N.getValueType(), IdxN, Scale); - } - } - - N = DAG.getNode(ISD::ADD, getCurDebugLoc(), - N.getValueType(), N, IdxN); - } - } - setValue(&I, N); -} - -void SelectionDAGLowering::visitAlloca(AllocaInst &I) { - // If this is a fixed sized alloca in the entry block of the function, - // allocate it statically on the stack. - if (FuncInfo.StaticAllocaMap.count(&I)) - return; // getValue will auto-populate this. 
- - const Type *Ty = I.getAllocatedType(); - uint64_t TySize = TLI.getTargetData()->getTypeAllocSize(Ty); - unsigned Align = - std::max((unsigned)TLI.getTargetData()->getPrefTypeAlignment(Ty), - I.getAlignment()); - - SDValue AllocSize = getValue(I.getArraySize()); - - AllocSize = DAG.getNode(ISD::MUL, getCurDebugLoc(), AllocSize.getValueType(), - AllocSize, - DAG.getConstant(TySize, AllocSize.getValueType())); - - - - EVT IntPtr = TLI.getPointerTy(); - AllocSize = DAG.getZExtOrTrunc(AllocSize, getCurDebugLoc(), IntPtr); - - // Handle alignment. If the requested alignment is less than or equal to - // the stack alignment, ignore it. If the size is greater than or equal to - // the stack alignment, we note this in the DYNAMIC_STACKALLOC node. - unsigned StackAlign = - TLI.getTargetMachine().getFrameInfo()->getStackAlignment(); - if (Align <= StackAlign) - Align = 0; - - // Round the size of the allocation up to the stack alignment size - // by add SA-1 to the size. - AllocSize = DAG.getNode(ISD::ADD, getCurDebugLoc(), - AllocSize.getValueType(), AllocSize, - DAG.getIntPtrConstant(StackAlign-1)); - // Mask out the low bits for alignment purposes. - AllocSize = DAG.getNode(ISD::AND, getCurDebugLoc(), - AllocSize.getValueType(), AllocSize, - DAG.getIntPtrConstant(~(uint64_t)(StackAlign-1))); - - SDValue Ops[] = { getRoot(), AllocSize, DAG.getIntPtrConstant(Align) }; - SDVTList VTs = DAG.getVTList(AllocSize.getValueType(), MVT::Other); - SDValue DSA = DAG.getNode(ISD::DYNAMIC_STACKALLOC, getCurDebugLoc(), - VTs, Ops, 3); - setValue(&I, DSA); - DAG.setRoot(DSA.getValue(1)); - - // Inform the Frame Information that we have just allocated a variable-sized - // object. 
- FuncInfo.MF->getFrameInfo()->CreateVariableSizedObject(); -} - -void SelectionDAGLowering::visitLoad(LoadInst &I) { - const Value *SV = I.getOperand(0); - SDValue Ptr = getValue(SV); - - const Type *Ty = I.getType(); - bool isVolatile = I.isVolatile(); - unsigned Alignment = I.getAlignment(); - - SmallVector ValueVTs; - SmallVector Offsets; - ComputeValueVTs(TLI, Ty, ValueVTs, &Offsets); - unsigned NumValues = ValueVTs.size(); - if (NumValues == 0) - return; - - SDValue Root; - bool ConstantMemory = false; - if (I.isVolatile()) - // Serialize volatile loads with other side effects. - Root = getRoot(); - else if (AA->pointsToConstantMemory(SV)) { - // Do not serialize (non-volatile) loads of constant memory with anything. - Root = DAG.getEntryNode(); - ConstantMemory = true; - } else { - // Do not serialize non-volatile loads against each other. - Root = DAG.getRoot(); - } - - SmallVector Values(NumValues); - SmallVector Chains(NumValues); - EVT PtrVT = Ptr.getValueType(); - for (unsigned i = 0; i != NumValues; ++i) { - SDValue L = DAG.getLoad(ValueVTs[i], getCurDebugLoc(), Root, - DAG.getNode(ISD::ADD, getCurDebugLoc(), - PtrVT, Ptr, - DAG.getConstant(Offsets[i], PtrVT)), - SV, Offsets[i], isVolatile, Alignment); - Values[i] = L; - Chains[i] = L.getValue(1); - } - - if (!ConstantMemory) { - SDValue Chain = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), - MVT::Other, - &Chains[0], NumValues); - if (isVolatile) - DAG.setRoot(Chain); - else - PendingLoads.push_back(Chain); - } - - setValue(&I, DAG.getNode(ISD::MERGE_VALUES, getCurDebugLoc(), - DAG.getVTList(&ValueVTs[0], NumValues), - &Values[0], NumValues)); -} - - -void SelectionDAGLowering::visitStore(StoreInst &I) { - Value *SrcV = I.getOperand(0); - Value *PtrV = I.getOperand(1); - - SmallVector ValueVTs; - SmallVector Offsets; - ComputeValueVTs(TLI, SrcV->getType(), ValueVTs, &Offsets); - unsigned NumValues = ValueVTs.size(); - if (NumValues == 0) - return; - - // Get the lowered operands. 
Note that we do this after - // checking if NumResults is zero, because with zero results - // the operands won't have values in the map. - SDValue Src = getValue(SrcV); - SDValue Ptr = getValue(PtrV); - - SDValue Root = getRoot(); - SmallVector Chains(NumValues); - EVT PtrVT = Ptr.getValueType(); - bool isVolatile = I.isVolatile(); - unsigned Alignment = I.getAlignment(); - for (unsigned i = 0; i != NumValues; ++i) - Chains[i] = DAG.getStore(Root, getCurDebugLoc(), - SDValue(Src.getNode(), Src.getResNo() + i), - DAG.getNode(ISD::ADD, getCurDebugLoc(), - PtrVT, Ptr, - DAG.getConstant(Offsets[i], PtrVT)), - PtrV, Offsets[i], isVolatile, Alignment); - - DAG.setRoot(DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), - MVT::Other, &Chains[0], NumValues)); -} - -/// visitTargetIntrinsic - Lower a call of a target intrinsic to an INTRINSIC -/// node. -void SelectionDAGLowering::visitTargetIntrinsic(CallInst &I, - unsigned Intrinsic) { - bool HasChain = !I.doesNotAccessMemory(); - bool OnlyLoad = HasChain && I.onlyReadsMemory(); - - // Build the operand list. - SmallVector Ops; - if (HasChain) { // If this intrinsic has side-effects, chainify it. - if (OnlyLoad) { - // We don't need to serialize loads against other loads. - Ops.push_back(DAG.getRoot()); - } else { - Ops.push_back(getRoot()); - } - } - - // Info is set by getTgtMemInstrinsic - TargetLowering::IntrinsicInfo Info; - bool IsTgtIntrinsic = TLI.getTgtMemIntrinsic(Info, I, Intrinsic); - - // Add the intrinsic ID as an integer operand if it's not a target intrinsic. - if (!IsTgtIntrinsic) - Ops.push_back(DAG.getConstant(Intrinsic, TLI.getPointerTy())); - - // Add all operands of the call to the operand list. 
- for (unsigned i = 1, e = I.getNumOperands(); i != e; ++i) { - SDValue Op = getValue(I.getOperand(i)); - assert(TLI.isTypeLegal(Op.getValueType()) && - "Intrinsic uses a non-legal type?"); - Ops.push_back(Op); - } - - SmallVector ValueVTs; - ComputeValueVTs(TLI, I.getType(), ValueVTs); -#ifndef NDEBUG - for (unsigned Val = 0, E = ValueVTs.size(); Val != E; ++Val) { - assert(TLI.isTypeLegal(ValueVTs[Val]) && - "Intrinsic uses a non-legal type?"); - } -#endif // NDEBUG - if (HasChain) - ValueVTs.push_back(MVT::Other); - - SDVTList VTs = DAG.getVTList(ValueVTs.data(), ValueVTs.size()); - - // Create the node. - SDValue Result; - if (IsTgtIntrinsic) { - // This is target intrinsic that touches memory - Result = DAG.getMemIntrinsicNode(Info.opc, getCurDebugLoc(), - VTs, &Ops[0], Ops.size(), - Info.memVT, Info.ptrVal, Info.offset, - Info.align, Info.vol, - Info.readMem, Info.writeMem); - } - else if (!HasChain) - Result = DAG.getNode(ISD::INTRINSIC_WO_CHAIN, getCurDebugLoc(), - VTs, &Ops[0], Ops.size()); - else if (I.getType() != Type::getVoidTy(*DAG.getContext())) - Result = DAG.getNode(ISD::INTRINSIC_W_CHAIN, getCurDebugLoc(), - VTs, &Ops[0], Ops.size()); - else - Result = DAG.getNode(ISD::INTRINSIC_VOID, getCurDebugLoc(), - VTs, &Ops[0], Ops.size()); - - if (HasChain) { - SDValue Chain = Result.getValue(Result.getNode()->getNumValues()-1); - if (OnlyLoad) - PendingLoads.push_back(Chain); - else - DAG.setRoot(Chain); - } - if (I.getType() != Type::getVoidTy(*DAG.getContext())) { - if (const VectorType *PTy = dyn_cast(I.getType())) { - EVT VT = TLI.getValueType(PTy); - Result = DAG.getNode(ISD::BIT_CONVERT, getCurDebugLoc(), VT, Result); - } - setValue(&I, Result); - } -} - -/// ExtractTypeInfo - Returns the type info, possibly bitcast, encoded in V. 
-static GlobalVariable *ExtractTypeInfo(Value *V) { - V = V->stripPointerCasts(); - GlobalVariable *GV = dyn_cast<GlobalVariable>(V); - assert ((GV || isa<ConstantPointerNull>(V)) && - "TypeInfo must be a global variable or NULL"); - return GV; -} - -namespace llvm { - -/// AddCatchInfo - Extract the personality and type infos from an eh.selector -/// call, and add them to the specified machine basic block. -void AddCatchInfo(CallInst &I, MachineModuleInfo *MMI, - MachineBasicBlock *MBB) { - // Inform the MachineModuleInfo of the personality for this landing pad. - ConstantExpr *CE = cast<ConstantExpr>(I.getOperand(2)); - assert(CE->getOpcode() == Instruction::BitCast && - isa<Function>(CE->getOperand(0)) && - "Personality should be a function"); - MMI->addPersonality(MBB, cast<Function>(CE->getOperand(0))); - - // Gather all the type infos for this landing pad and pass them along to - // MachineModuleInfo. - std::vector<GlobalVariable *> TyInfo; - unsigned N = I.getNumOperands(); - - for (unsigned i = N - 1; i > 2; --i) { - if (ConstantInt *CI = dyn_cast<ConstantInt>(I.getOperand(i))) { - unsigned FilterLength = CI->getZExtValue(); - unsigned FirstCatch = i + FilterLength + !FilterLength; - assert (FirstCatch <= N && "Invalid filter length"); - - if (FirstCatch < N) { - TyInfo.reserve(N - FirstCatch); - for (unsigned j = FirstCatch; j < N; ++j) - TyInfo.push_back(ExtractTypeInfo(I.getOperand(j))); - MMI->addCatchTypeInfo(MBB, TyInfo); - TyInfo.clear(); - } - - if (!FilterLength) { - // Cleanup. - MMI->addCleanup(MBB); - } else { - // Filter. 
- TyInfo.reserve(FilterLength - 1); - for (unsigned j = i + 1; j < FirstCatch; ++j) - TyInfo.push_back(ExtractTypeInfo(I.getOperand(j))); - MMI->addFilterTypeInfo(MBB, TyInfo); - TyInfo.clear(); - } - - N = i; - } - } - - if (N > 3) { - TyInfo.reserve(N - 3); - for (unsigned j = 3; j < N; ++j) - TyInfo.push_back(ExtractTypeInfo(I.getOperand(j))); - MMI->addCatchTypeInfo(MBB, TyInfo); - } -} - -} - -/// GetSignificand - Get the significand and build it into a floating-point -/// number with exponent of 1: -/// -/// Op = (Op & 0x007fffff) | 0x3f800000; -/// -/// where Op is the hexidecimal representation of floating point value. -static SDValue -GetSignificand(SelectionDAG &DAG, SDValue Op, DebugLoc dl) { - SDValue t1 = DAG.getNode(ISD::AND, dl, MVT::i32, Op, - DAG.getConstant(0x007fffff, MVT::i32)); - SDValue t2 = DAG.getNode(ISD::OR, dl, MVT::i32, t1, - DAG.getConstant(0x3f800000, MVT::i32)); - return DAG.getNode(ISD::BIT_CONVERT, dl, MVT::f32, t2); -} - -/// GetExponent - Get the exponent: -/// -/// (float)(int)(((Op & 0x7f800000) >> 23) - 127); -/// -/// where Op is the hexidecimal representation of floating point value. -static SDValue -GetExponent(SelectionDAG &DAG, SDValue Op, const TargetLowering &TLI, - DebugLoc dl) { - SDValue t0 = DAG.getNode(ISD::AND, dl, MVT::i32, Op, - DAG.getConstant(0x7f800000, MVT::i32)); - SDValue t1 = DAG.getNode(ISD::SRL, dl, MVT::i32, t0, - DAG.getConstant(23, TLI.getPointerTy())); - SDValue t2 = DAG.getNode(ISD::SUB, dl, MVT::i32, t1, - DAG.getConstant(127, MVT::i32)); - return DAG.getNode(ISD::SINT_TO_FP, dl, MVT::f32, t2); -} - -/// getF32Constant - Get 32-bit floating point constant. 
-static SDValue -getF32Constant(SelectionDAG &DAG, unsigned Flt) { - return DAG.getConstantFP(APFloat(APInt(32, Flt)), MVT::f32); -} - -/// Inlined utility function to implement binary input atomic intrinsics for -/// visitIntrinsicCall: I is a call instruction -/// Op is the associated NodeType for I -const char * -SelectionDAGLowering::implVisitBinaryAtomic(CallInst& I, ISD::NodeType Op) { - SDValue Root = getRoot(); - SDValue L = - DAG.getAtomic(Op, getCurDebugLoc(), - getValue(I.getOperand(2)).getValueType().getSimpleVT(), - Root, - getValue(I.getOperand(1)), - getValue(I.getOperand(2)), - I.getOperand(1)); - setValue(&I, L); - DAG.setRoot(L.getValue(1)); - return 0; -} - -// implVisitAluOverflow - Lower arithmetic overflow instrinsics. -const char * -SelectionDAGLowering::implVisitAluOverflow(CallInst &I, ISD::NodeType Op) { - SDValue Op1 = getValue(I.getOperand(1)); - SDValue Op2 = getValue(I.getOperand(2)); - - SDVTList VTs = DAG.getVTList(Op1.getValueType(), MVT::i1); - SDValue Result = DAG.getNode(Op, getCurDebugLoc(), VTs, Op1, Op2); - - setValue(&I, Result); - return 0; -} - -/// visitExp - Lower an exp intrinsic. Handles the special sequences for -/// limited-precision mode. 
-void -SelectionDAGLowering::visitExp(CallInst &I) { - SDValue result; - DebugLoc dl = getCurDebugLoc(); - - if (getValue(I.getOperand(1)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(1)); - - // Put the exponent in the right bit position for later addition to the - // final result: - // - // #define LOG2OFe 1.4426950f - // IntegerPartOfX = ((int32_t)(X * LOG2OFe)); - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, Op, - getF32Constant(DAG, 0x3fb8aa3b)); - SDValue IntegerPartOfX = DAG.getNode(ISD::FP_TO_SINT, dl, MVT::i32, t0); - - // FractionalPartOfX = (X * LOG2OFe) - (float)IntegerPartOfX; - SDValue t1 = DAG.getNode(ISD::SINT_TO_FP, dl, MVT::f32, IntegerPartOfX); - SDValue X = DAG.getNode(ISD::FSUB, dl, MVT::f32, t0, t1); - - // IntegerPartOfX <<= 23; - IntegerPartOfX = DAG.getNode(ISD::SHL, dl, MVT::i32, IntegerPartOfX, - DAG.getConstant(23, TLI.getPointerTy())); - - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // TwoToFractionalPartOfX = - // 0.997535578f + - // (0.735607626f + 0.252464424f * x) * x; - // - // error 0.0144103317, which is 6 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3e814304)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f3c50c8)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f7f5e7e)); - SDValue TwoToFracPartOfX = DAG.getNode(ISD::BIT_CONVERT, dl,MVT::i32, t5); - - // Add the exponent into the result in integer domain. 
- SDValue t6 = DAG.getNode(ISD::ADD, dl, MVT::i32, - TwoToFracPartOfX, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::f32, t6); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // TwoToFractionalPartOfX = - // 0.999892986f + - // (0.696457318f + - // (0.224338339f + 0.792043434e-1f * x) * x) * x; - // - // 0.000107046256 error, which is 13 to 14 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3da235e3)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3e65b8f3)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f324b07)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3f7ff8fd)); - SDValue TwoToFracPartOfX = DAG.getNode(ISD::BIT_CONVERT, dl,MVT::i32, t7); - - // Add the exponent into the result in integer domain. 
- SDValue t8 = DAG.getNode(ISD::ADD, dl, MVT::i32, - TwoToFracPartOfX, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::f32, t8); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // TwoToFractionalPartOfX = - // 0.999999982f + - // (0.693148872f + - // (0.240227044f + - // (0.554906021e-1f + - // (0.961591928e-2f + - // (0.136028312e-2f + 0.157059148e-3f *x)*x)*x)*x)*x)*x; - // - // error 2.47208000*10^(-7), which is better than 18 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3924b03e)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3ab24b87)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3c1d8c17)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3d634a1d)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue t9 = DAG.getNode(ISD::FADD, dl, MVT::f32, t8, - getF32Constant(DAG, 0x3e75fe14)); - SDValue t10 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t9, X); - SDValue t11 = DAG.getNode(ISD::FADD, dl, MVT::f32, t10, - getF32Constant(DAG, 0x3f317234)); - SDValue t12 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t11, X); - SDValue t13 = DAG.getNode(ISD::FADD, dl, MVT::f32, t12, - getF32Constant(DAG, 0x3f800000)); - SDValue TwoToFracPartOfX = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::i32, t13); - - // Add the exponent into the result in integer domain. - SDValue t14 = DAG.getNode(ISD::ADD, dl, MVT::i32, - TwoToFracPartOfX, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::f32, t14); - } - } else { - // No special expansion. - result = DAG.getNode(ISD::FEXP, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1))); - } - - setValue(&I, result); -} - -/// visitLog - Lower a log intrinsic. 
Handles the special sequences for -/// limited-precision mode. -void -SelectionDAGLowering::visitLog(CallInst &I) { - SDValue result; - DebugLoc dl = getCurDebugLoc(); - - if (getValue(I.getOperand(1)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(1)); - SDValue Op1 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, Op); - - // Scale the exponent by log(2) [0.69314718f]. - SDValue Exp = GetExponent(DAG, Op1, TLI, dl); - SDValue LogOfExponent = DAG.getNode(ISD::FMUL, dl, MVT::f32, Exp, - getF32Constant(DAG, 0x3f317218)); - - // Get the significand and build it into a floating-point number with - // exponent of 1. - SDValue X = GetSignificand(DAG, Op1, dl); - - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // LogofMantissa = - // -1.1609546f + - // (1.4034025f - 0.23903021f * x) * x; - // - // error 0.0034276066, which is better than 8 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbe74c456)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3fb3a2b1)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue LogOfMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f949a29)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, LogOfMantissa); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // LogOfMantissa = - // -1.7417939f + - // (2.8212026f + - // (-1.4699568f + - // (0.44717955f - 0.56570851e-1f * x) * x) * x) * x; - // - // error 0.000061011436, which is 14 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbd67b6d6)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3ee4f4b8)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - 
getF32Constant(DAG, 0x3fbc278b)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x40348e95)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue LogOfMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3fdef31a)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, LogOfMantissa); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // LogOfMantissa = - // -2.1072184f + - // (4.2372794f + - // (-3.7029485f + - // (2.2781945f + - // (-0.87823314f + - // (0.19073739f - 0.17809712e-1f * x) * x) * x) * x) * x)*x; - // - // error 0.0000023660568, which is better than 18 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbc91e5ac)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3e4350aa)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f60d3e3)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x4011cdf0)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t6, - getF32Constant(DAG, 0x406cfd1c)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue t9 = DAG.getNode(ISD::FADD, dl, MVT::f32, t8, - getF32Constant(DAG, 0x408797cb)); - SDValue t10 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t9, X); - SDValue LogOfMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t10, - getF32Constant(DAG, 0x4006dcab)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, LogOfMantissa); - } - } else { - // No special expansion. 
- result = DAG.getNode(ISD::FLOG, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1))); - } - - setValue(&I, result); -} - -/// visitLog2 - Lower a log2 intrinsic. Handles the special sequences for -/// limited-precision mode. -void -SelectionDAGLowering::visitLog2(CallInst &I) { - SDValue result; - DebugLoc dl = getCurDebugLoc(); - - if (getValue(I.getOperand(1)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(1)); - SDValue Op1 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, Op); - - // Get the exponent. - SDValue LogOfExponent = GetExponent(DAG, Op1, TLI, dl); - - // Get the significand and build it into a floating-point number with - // exponent of 1. - SDValue X = GetSignificand(DAG, Op1, dl); - - // Different possible minimax approximations of significand in - // floating-point for various degrees of accuracy over [1,2]. - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // Log2ofMantissa = -1.6749035f + (2.0246817f - .34484768f * x) * x; - // - // error 0.0049451742, which is more than 7 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbeb08fe0)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x40019463)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue Log2ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3fd6633d)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, Log2ofMantissa); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // Log2ofMantissa = - // -2.51285454f + - // (4.07009056f + - // (-2.12067489f + - // (.645142248f - 0.816157886e-1f * x) * x) * x) * x; - // - // error 0.0000876136000, which is better than 13 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbda7262e)); - SDValue t1 = 
DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3f25280b)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x4007b923)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x40823e2f)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue Log2ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t6, - getF32Constant(DAG, 0x4020d29c)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, Log2ofMantissa); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // Log2ofMantissa = - // -3.0400495f + - // (6.1129976f + - // (-5.3420409f + - // (3.2865683f + - // (-1.2669343f + - // (0.27515199f - - // 0.25691327e-1f * x) * x) * x) * x) * x) * x; - // - // error 0.0000018516, which is better than 18 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbcd2769e)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3e8ce0b9)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3fa22ae7)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x40525723)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t6, - getF32Constant(DAG, 0x40aaf200)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue t9 = DAG.getNode(ISD::FADD, dl, MVT::f32, t8, - getF32Constant(DAG, 0x40c39dad)); - SDValue t10 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t9, X); - SDValue Log2ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t10, - getF32Constant(DAG, 0x4042902c)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, 
LogOfExponent, Log2ofMantissa); - } - } else { - // No special expansion. - result = DAG.getNode(ISD::FLOG2, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1))); - } - - setValue(&I, result); -} - -/// visitLog10 - Lower a log10 intrinsic. Handles the special sequences for -/// limited-precision mode. -void -SelectionDAGLowering::visitLog10(CallInst &I) { - SDValue result; - DebugLoc dl = getCurDebugLoc(); - - if (getValue(I.getOperand(1)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(1)); - SDValue Op1 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, Op); - - // Scale the exponent by log10(2) [0.30102999f]. - SDValue Exp = GetExponent(DAG, Op1, TLI, dl); - SDValue LogOfExponent = DAG.getNode(ISD::FMUL, dl, MVT::f32, Exp, - getF32Constant(DAG, 0x3e9a209a)); - - // Get the significand and build it into a floating-point number with - // exponent of 1. - SDValue X = GetSignificand(DAG, Op1, dl); - - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // Log10ofMantissa = - // -0.50419619f + - // (0.60948995f - 0.10380950f * x) * x; - // - // error 0.0014886165, which is 6 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbdd49a13)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3f1c0789)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue Log10ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f011300)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, Log10ofMantissa); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // Log10ofMantissa = - // -0.64831180f + - // (0.91751397f + - // (-0.31664806f + 0.47637168e-1f * x) * x) * x; - // - // error 0.00019228036, which is better than 12 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - 
getF32Constant(DAG, 0x3d431f31)); - SDValue t1 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3ea21fb2)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f6ae232)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue Log10ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f25f7c3)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, Log10ofMantissa); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // Log10ofMantissa = - // -0.84299375f + - // (1.5327582f + - // (-1.0688956f + - // (0.49102474f + - // (-0.12539807f + 0.13508273e-1f * x) * x) * x) * x) * x; - // - // error 0.0000037995730, which is better than 18 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3c5d51ce)); - SDValue t1 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3e00685a)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3efb6798)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f88d192)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3fc4316c)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue Log10ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t8, - getF32Constant(DAG, 0x3f57ce70)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, Log10ofMantissa); - } - } else { - // No special expansion. - result = DAG.getNode(ISD::FLOG10, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1))); - } - - setValue(&I, result); -} - -/// visitExp2 - Lower an exp2 intrinsic. 
Handles the special sequences for -/// limited-precision mode. -void -SelectionDAGLowering::visitExp2(CallInst &I) { - SDValue result; - DebugLoc dl = getCurDebugLoc(); - - if (getValue(I.getOperand(1)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(1)); - - SDValue IntegerPartOfX = DAG.getNode(ISD::FP_TO_SINT, dl, MVT::i32, Op); - - // FractionalPartOfX = x - (float)IntegerPartOfX; - SDValue t1 = DAG.getNode(ISD::SINT_TO_FP, dl, MVT::f32, IntegerPartOfX); - SDValue X = DAG.getNode(ISD::FSUB, dl, MVT::f32, Op, t1); - - // IntegerPartOfX <<= 23; - IntegerPartOfX = DAG.getNode(ISD::SHL, dl, MVT::i32, IntegerPartOfX, - DAG.getConstant(23, TLI.getPointerTy())); - - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // TwoToFractionalPartOfX = - // 0.997535578f + - // (0.735607626f + 0.252464424f * x) * x; - // - // error 0.0144103317, which is 6 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3e814304)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f3c50c8)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f7f5e7e)); - SDValue t6 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t5); - SDValue TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t6, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // TwoToFractionalPartOfX = - // 0.999892986f + - // (0.696457318f + - // (0.224338339f + 0.792043434e-1f * x) * x) * x; - // - // error 0.000107046256, which is 13 to 14 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3da235e3)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - 
getF32Constant(DAG, 0x3e65b8f3)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f324b07)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3f7ff8fd)); - SDValue t8 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t7); - SDValue TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t8, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // TwoToFractionalPartOfX = - // 0.999999982f + - // (0.693148872f + - // (0.240227044f + - // (0.554906021e-1f + - // (0.961591928e-2f + - // (0.136028312e-2f + 0.157059148e-3f *x)*x)*x)*x)*x)*x; - // error 2.47208000*10^(-7), which is better than 18 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3924b03e)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3ab24b87)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3c1d8c17)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3d634a1d)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue t9 = DAG.getNode(ISD::FADD, dl, MVT::f32, t8, - getF32Constant(DAG, 0x3e75fe14)); - SDValue t10 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t9, X); - SDValue t11 = DAG.getNode(ISD::FADD, dl, MVT::f32, t10, - getF32Constant(DAG, 0x3f317234)); - SDValue t12 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t11, X); - SDValue t13 = DAG.getNode(ISD::FADD, dl, MVT::f32, t12, - getF32Constant(DAG, 0x3f800000)); - SDValue t14 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t13); - SDValue 
TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t14, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } - } else { - // No special expansion. - result = DAG.getNode(ISD::FEXP2, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1))); - } - - setValue(&I, result); -} - -/// visitPow - Lower a pow intrinsic. Handles the special sequences for -/// limited-precision mode with x == 10.0f. -void -SelectionDAGLowering::visitPow(CallInst &I) { - SDValue result; - Value *Val = I.getOperand(1); - DebugLoc dl = getCurDebugLoc(); - bool IsExp10 = false; - - if (getValue(Val).getValueType() == MVT::f32 && - getValue(I.getOperand(2)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - if (Constant *C = const_cast<Constant*>(dyn_cast<Constant>(Val))) { - if (ConstantFP *CFP = dyn_cast<ConstantFP>(C)) { - APFloat Ten(10.0f); - IsExp10 = CFP->getValueAPF().bitwiseIsEqual(Ten); - } - } - } - - if (IsExp10 && LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(2)); - - // Put the exponent in the right bit position for later addition to the - // final result: - // - // #define LOG2OF10 3.3219281f - // IntegerPartOfX = (int32_t)(x * LOG2OF10); - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, Op, - getF32Constant(DAG, 0x40549a78)); - SDValue IntegerPartOfX = DAG.getNode(ISD::FP_TO_SINT, dl, MVT::i32, t0); - - // FractionalPartOfX = x - (float)IntegerPartOfX; - SDValue t1 = DAG.getNode(ISD::SINT_TO_FP, dl, MVT::f32, IntegerPartOfX); - SDValue X = DAG.getNode(ISD::FSUB, dl, MVT::f32, t0, t1); - - // IntegerPartOfX <<= 23; - IntegerPartOfX = DAG.getNode(ISD::SHL, dl, MVT::i32, IntegerPartOfX, - DAG.getConstant(23, TLI.getPointerTy())); - - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // twoToFractionalPartOfX = - // 0.997535578f + - // (0.735607626f + 0.252464424f * x) * x; - // - // error 0.0144103317, which is 6 bits 
- SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3e814304)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f3c50c8)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f7f5e7e)); - SDValue t6 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t5); - SDValue TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t6, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // TwoToFractionalPartOfX = - // 0.999892986f + - // (0.696457318f + - // (0.224338339f + 0.792043434e-1f * x) * x) * x; - // - // error 0.000107046256, which is 13 to 14 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3da235e3)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3e65b8f3)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f324b07)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3f7ff8fd)); - SDValue t8 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t7); - SDValue TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t8, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // TwoToFractionalPartOfX = - // 0.999999982f + - // (0.693148872f + - // (0.240227044f + - // (0.554906021e-1f + - // (0.961591928e-2f + - // (0.136028312e-2f + 0.157059148e-3f *x)*x)*x)*x)*x)*x; - // error 2.47208000*10^(-7), which is better than 18 bits - SDValue t2 = 
DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3924b03e)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3ab24b87)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3c1d8c17)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3d634a1d)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue t9 = DAG.getNode(ISD::FADD, dl, MVT::f32, t8, - getF32Constant(DAG, 0x3e75fe14)); - SDValue t10 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t9, X); - SDValue t11 = DAG.getNode(ISD::FADD, dl, MVT::f32, t10, - getF32Constant(DAG, 0x3f317234)); - SDValue t12 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t11, X); - SDValue t13 = DAG.getNode(ISD::FADD, dl, MVT::f32, t12, - getF32Constant(DAG, 0x3f800000)); - SDValue t14 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t13); - SDValue TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t14, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } - } else { - // No special expansion. - result = DAG.getNode(ISD::FPOW, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)), - getValue(I.getOperand(2))); - } - - setValue(&I, result); -} - -/// visitIntrinsicCall - Lower the call to the specified intrinsic function. If -/// we want to emit this as a call to a named external function, return the name -/// otherwise lower it and return null. -const char * -SelectionDAGLowering::visitIntrinsicCall(CallInst &I, unsigned Intrinsic) { - DebugLoc dl = getCurDebugLoc(); - switch (Intrinsic) { - default: - // By default, turn this into a target intrinsic node. 
- visitTargetIntrinsic(I, Intrinsic); - return 0; - case Intrinsic::vastart: visitVAStart(I); return 0; - case Intrinsic::vaend: visitVAEnd(I); return 0; - case Intrinsic::vacopy: visitVACopy(I); return 0; - case Intrinsic::returnaddress: - setValue(&I, DAG.getNode(ISD::RETURNADDR, dl, TLI.getPointerTy(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::frameaddress: - setValue(&I, DAG.getNode(ISD::FRAMEADDR, dl, TLI.getPointerTy(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::setjmp: - return "_setjmp"+!TLI.usesUnderscoreSetJmp(); - break; - case Intrinsic::longjmp: - return "_longjmp"+!TLI.usesUnderscoreLongJmp(); - break; - case Intrinsic::memcpy: { - SDValue Op1 = getValue(I.getOperand(1)); - SDValue Op2 = getValue(I.getOperand(2)); - SDValue Op3 = getValue(I.getOperand(3)); - unsigned Align = cast<ConstantInt>(I.getOperand(4))->getZExtValue(); - DAG.setRoot(DAG.getMemcpy(getRoot(), dl, Op1, Op2, Op3, Align, false, - I.getOperand(1), 0, I.getOperand(2), 0)); - return 0; - } - case Intrinsic::memset: { - SDValue Op1 = getValue(I.getOperand(1)); - SDValue Op2 = getValue(I.getOperand(2)); - SDValue Op3 = getValue(I.getOperand(3)); - unsigned Align = cast<ConstantInt>(I.getOperand(4))->getZExtValue(); - DAG.setRoot(DAG.getMemset(getRoot(), dl, Op1, Op2, Op3, Align, - I.getOperand(1), 0)); - return 0; - } - case Intrinsic::memmove: { - SDValue Op1 = getValue(I.getOperand(1)); - SDValue Op2 = getValue(I.getOperand(2)); - SDValue Op3 = getValue(I.getOperand(3)); - unsigned Align = cast<ConstantInt>(I.getOperand(4))->getZExtValue(); - - // If the source and destination are known to not be aliases, we can - // lower memmove as memcpy. 
- uint64_t Size = -1ULL; - if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Op3)) - Size = C->getZExtValue(); - if (AA->alias(I.getOperand(1), Size, I.getOperand(2), Size) == - AliasAnalysis::NoAlias) { - DAG.setRoot(DAG.getMemcpy(getRoot(), dl, Op1, Op2, Op3, Align, false, - I.getOperand(1), 0, I.getOperand(2), 0)); - return 0; - } - - DAG.setRoot(DAG.getMemmove(getRoot(), dl, Op1, Op2, Op3, Align, - I.getOperand(1), 0, I.getOperand(2), 0)); - return 0; - } - case Intrinsic::dbg_stoppoint: - case Intrinsic::dbg_region_start: - case Intrinsic::dbg_region_end: - case Intrinsic::dbg_func_start: - // FIXME - Remove this instructions once the dust settles. - return 0; - case Intrinsic::dbg_declare: { - if (OptLevel != CodeGenOpt::None) - // FIXME: Variable debug info is not supported here. - return 0; - DwarfWriter *DW = DAG.getDwarfWriter(); - if (!DW) - return 0; - DbgDeclareInst &DI = cast<DbgDeclareInst>(I); - if (!isValidDebugInfoIntrinsic(DI, CodeGenOpt::None)) - return 0; - - MDNode *Variable = DI.getVariable(); - Value *Address = DI.getAddress(); - if (BitCastInst *BCI = dyn_cast<BitCastInst>(Address)) - Address = BCI->getOperand(0); - AllocaInst *AI = dyn_cast<AllocaInst>(Address); - // Don't handle byval struct arguments or VLAs, for example. - if (!AI) - return 0; - DenseMap<const AllocaInst*, int>::iterator SI = - FuncInfo.StaticAllocaMap.find(AI); - if (SI == FuncInfo.StaticAllocaMap.end()) - return 0; // VLAs. - int FI = SI->second; - - MachineModuleInfo *MMI = DAG.getMachineModuleInfo(); - if (MMI) { - MetadataContext &TheMetadata = - DI.getParent()->getContext().getMetadata(); - unsigned MDDbgKind = TheMetadata.getMDKind("dbg"); - MDNode *Dbg = TheMetadata.getMD(MDDbgKind, &DI); - MMI->setVariableDbgInfo(Variable, FI, Dbg); - } - return 0; - } - case Intrinsic::eh_exception: { - // Insert the EXCEPTIONADDR instruction. 
- assert(CurMBB->isLandingPad() &&"Call to eh.exception not in landing pad!"); - SDVTList VTs = DAG.getVTList(TLI.getPointerTy(), MVT::Other); - SDValue Ops[1]; - Ops[0] = DAG.getRoot(); - SDValue Op = DAG.getNode(ISD::EXCEPTIONADDR, dl, VTs, Ops, 1); - setValue(&I, Op); - DAG.setRoot(Op.getValue(1)); - return 0; - } - - case Intrinsic::eh_selector: { - MachineModuleInfo *MMI = DAG.getMachineModuleInfo(); - - if (CurMBB->isLandingPad()) - AddCatchInfo(I, MMI, CurMBB); - else { -#ifndef NDEBUG - FuncInfo.CatchInfoLost.insert(&I); -#endif - // FIXME: Mark exception selector register as live in. Hack for PR1508. - unsigned Reg = TLI.getExceptionSelectorRegister(); - if (Reg) CurMBB->addLiveIn(Reg); - } - - // Insert the EHSELECTION instruction. - SDVTList VTs = DAG.getVTList(TLI.getPointerTy(), MVT::Other); - SDValue Ops[2]; - Ops[0] = getValue(I.getOperand(1)); - Ops[1] = getRoot(); - SDValue Op = DAG.getNode(ISD::EHSELECTION, dl, VTs, Ops, 2); - - DAG.setRoot(Op.getValue(1)); - - setValue(&I, DAG.getSExtOrTrunc(Op, dl, MVT::i32)); - return 0; - } - - case Intrinsic::eh_typeid_for: { - MachineModuleInfo *MMI = DAG.getMachineModuleInfo(); - - if (MMI) { - // Find the type id for the given typeinfo. - GlobalVariable *GV = ExtractTypeInfo(I.getOperand(1)); - - unsigned TypeID = MMI->getTypeIDFor(GV); - setValue(&I, DAG.getConstant(TypeID, MVT::i32)); - } else { - // Return something different to eh_selector. 
- setValue(&I, DAG.getConstant(1, MVT::i32)); - } - - return 0; - } - - case Intrinsic::eh_return_i32: - case Intrinsic::eh_return_i64: - if (MachineModuleInfo *MMI = DAG.getMachineModuleInfo()) { - MMI->setCallsEHReturn(true); - DAG.setRoot(DAG.getNode(ISD::EH_RETURN, dl, - MVT::Other, - getControlRoot(), - getValue(I.getOperand(1)), - getValue(I.getOperand(2)))); - } else { - setValue(&I, DAG.getConstant(0, TLI.getPointerTy())); - } - - return 0; - case Intrinsic::eh_unwind_init: - if (MachineModuleInfo *MMI = DAG.getMachineModuleInfo()) { - MMI->setCallsUnwindInit(true); - } - - return 0; - - case Intrinsic::eh_dwarf_cfa: { - EVT VT = getValue(I.getOperand(1)).getValueType(); - SDValue CfaArg = DAG.getSExtOrTrunc(getValue(I.getOperand(1)), dl, - TLI.getPointerTy()); - - SDValue Offset = DAG.getNode(ISD::ADD, dl, - TLI.getPointerTy(), - DAG.getNode(ISD::FRAME_TO_ARGS_OFFSET, dl, - TLI.getPointerTy()), - CfaArg); - setValue(&I, DAG.getNode(ISD::ADD, dl, - TLI.getPointerTy(), - DAG.getNode(ISD::FRAMEADDR, dl, - TLI.getPointerTy(), - DAG.getConstant(0, - TLI.getPointerTy())), - Offset)); - return 0; - } - case Intrinsic::convertff: - case Intrinsic::convertfsi: - case Intrinsic::convertfui: - case Intrinsic::convertsif: - case Intrinsic::convertuif: - case Intrinsic::convertss: - case Intrinsic::convertsu: - case Intrinsic::convertus: - case Intrinsic::convertuu: { - ISD::CvtCode Code = ISD::CVT_INVALID; - switch (Intrinsic) { - case Intrinsic::convertff: Code = ISD::CVT_FF; break; - case Intrinsic::convertfsi: Code = ISD::CVT_FS; break; - case Intrinsic::convertfui: Code = ISD::CVT_FU; break; - case Intrinsic::convertsif: Code = ISD::CVT_SF; break; - case Intrinsic::convertuif: Code = ISD::CVT_UF; break; - case Intrinsic::convertss: Code = ISD::CVT_SS; break; - case Intrinsic::convertsu: Code = ISD::CVT_SU; break; - case Intrinsic::convertus: Code = ISD::CVT_US; break; - case Intrinsic::convertuu: Code = ISD::CVT_UU; break; - } - EVT DestVT = 
TLI.getValueType(I.getType()); - Value* Op1 = I.getOperand(1); - setValue(&I, DAG.getConvertRndSat(DestVT, getCurDebugLoc(), getValue(Op1), - DAG.getValueType(DestVT), - DAG.getValueType(getValue(Op1).getValueType()), - getValue(I.getOperand(2)), - getValue(I.getOperand(3)), - Code)); - return 0; - } - - case Intrinsic::sqrt: - setValue(&I, DAG.getNode(ISD::FSQRT, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::powi: - setValue(&I, DAG.getNode(ISD::FPOWI, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)), - getValue(I.getOperand(2)))); - return 0; - case Intrinsic::sin: - setValue(&I, DAG.getNode(ISD::FSIN, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::cos: - setValue(&I, DAG.getNode(ISD::FCOS, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::log: - visitLog(I); - return 0; - case Intrinsic::log2: - visitLog2(I); - return 0; - case Intrinsic::log10: - visitLog10(I); - return 0; - case Intrinsic::exp: - visitExp(I); - return 0; - case Intrinsic::exp2: - visitExp2(I); - return 0; - case Intrinsic::pow: - visitPow(I); - return 0; - case Intrinsic::pcmarker: { - SDValue Tmp = getValue(I.getOperand(1)); - DAG.setRoot(DAG.getNode(ISD::PCMARKER, dl, MVT::Other, getRoot(), Tmp)); - return 0; - } - case Intrinsic::readcyclecounter: { - SDValue Op = getRoot(); - SDValue Tmp = DAG.getNode(ISD::READCYCLECOUNTER, dl, - DAG.getVTList(MVT::i64, MVT::Other), - &Op, 1); - setValue(&I, Tmp); - DAG.setRoot(Tmp.getValue(1)); - return 0; - } - case Intrinsic::bswap: - setValue(&I, DAG.getNode(ISD::BSWAP, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::cttz: { - SDValue Arg = getValue(I.getOperand(1)); - EVT Ty = Arg.getValueType(); - SDValue result = DAG.getNode(ISD::CTTZ, dl, Ty, Arg); - setValue(&I, result); - return 
0; - } - case Intrinsic::ctlz: { - SDValue Arg = getValue(I.getOperand(1)); - EVT Ty = Arg.getValueType(); - SDValue result = DAG.getNode(ISD::CTLZ, dl, Ty, Arg); - setValue(&I, result); - return 0; - } - case Intrinsic::ctpop: { - SDValue Arg = getValue(I.getOperand(1)); - EVT Ty = Arg.getValueType(); - SDValue result = DAG.getNode(ISD::CTPOP, dl, Ty, Arg); - setValue(&I, result); - return 0; - } - case Intrinsic::stacksave: { - SDValue Op = getRoot(); - SDValue Tmp = DAG.getNode(ISD::STACKSAVE, dl, - DAG.getVTList(TLI.getPointerTy(), MVT::Other), &Op, 1); - setValue(&I, Tmp); - DAG.setRoot(Tmp.getValue(1)); - return 0; - } - case Intrinsic::stackrestore: { - SDValue Tmp = getValue(I.getOperand(1)); - DAG.setRoot(DAG.getNode(ISD::STACKRESTORE, dl, MVT::Other, getRoot(), Tmp)); - return 0; - } - case Intrinsic::stackprotector: { - // Emit code into the DAG to store the stack guard onto the stack. - MachineFunction &MF = DAG.getMachineFunction(); - MachineFrameInfo *MFI = MF.getFrameInfo(); - EVT PtrTy = TLI.getPointerTy(); - - SDValue Src = getValue(I.getOperand(1)); // The guard's value. - AllocaInst *Slot = cast<AllocaInst>(I.getOperand(2)); - - int FI = FuncInfo.StaticAllocaMap[Slot]; - MFI->setStackProtectorIndex(FI); - - SDValue FIN = DAG.getFrameIndex(FI, PtrTy); - - // Store the stack protector onto the stack. - SDValue Result = DAG.getStore(getRoot(), getCurDebugLoc(), Src, FIN, - PseudoSourceValue::getFixedStack(FI), - 0, true); - setValue(&I, Result); - DAG.setRoot(Result); - return 0; - } - case Intrinsic::objectsize: { - // If we don't know by now, we're never going to know.
- ConstantInt *CI = dyn_cast<ConstantInt>(I.getOperand(2)); - - assert(CI && "Non-constant type in __builtin_object_size?"); - - SDValue Arg = getValue(I.getOperand(0)); - EVT Ty = Arg.getValueType(); - - if (CI->getZExtValue() < 2) - setValue(&I, DAG.getConstant(-1ULL, Ty)); - else - setValue(&I, DAG.getConstant(0, Ty)); - return 0; - } - case Intrinsic::var_annotation: - // Discard annotate attributes - return 0; - - case Intrinsic::init_trampoline: { - const Function *F = cast<Function>(I.getOperand(2)->stripPointerCasts()); - - SDValue Ops[6]; - Ops[0] = getRoot(); - Ops[1] = getValue(I.getOperand(1)); - Ops[2] = getValue(I.getOperand(2)); - Ops[3] = getValue(I.getOperand(3)); - Ops[4] = DAG.getSrcValue(I.getOperand(1)); - Ops[5] = DAG.getSrcValue(F); - - SDValue Tmp = DAG.getNode(ISD::TRAMPOLINE, dl, - DAG.getVTList(TLI.getPointerTy(), MVT::Other), - Ops, 6); - - setValue(&I, Tmp); - DAG.setRoot(Tmp.getValue(1)); - return 0; - } - - case Intrinsic::gcroot: - if (GFI) { - Value *Alloca = I.getOperand(1); - Constant *TypeMap = cast<Constant>(I.getOperand(2)); - - FrameIndexSDNode *FI = cast<FrameIndexSDNode>(getValue(Alloca).getNode()); - GFI->addStackRoot(FI->getIndex(), TypeMap); - } - return 0; - - case Intrinsic::gcread: - case Intrinsic::gcwrite: - llvm_unreachable("GC failed to lower gcread/gcwrite intrinsics!"); - return 0; - - case Intrinsic::flt_rounds: { - setValue(&I, DAG.getNode(ISD::FLT_ROUNDS_, dl, MVT::i32)); - return 0; - } - - case Intrinsic::trap: { - DAG.setRoot(DAG.getNode(ISD::TRAP, dl,MVT::Other, getRoot())); - return 0; - } - - case Intrinsic::uadd_with_overflow: - return implVisitAluOverflow(I, ISD::UADDO); - case Intrinsic::sadd_with_overflow: - return implVisitAluOverflow(I, ISD::SADDO); - case Intrinsic::usub_with_overflow: - return implVisitAluOverflow(I, ISD::USUBO); - case Intrinsic::ssub_with_overflow: - return implVisitAluOverflow(I, ISD::SSUBO); - case Intrinsic::umul_with_overflow: - return implVisitAluOverflow(I, ISD::UMULO); - case Intrinsic::smul_with_overflow: - return
implVisitAluOverflow(I, ISD::SMULO); - - case Intrinsic::prefetch: { - SDValue Ops[4]; - Ops[0] = getRoot(); - Ops[1] = getValue(I.getOperand(1)); - Ops[2] = getValue(I.getOperand(2)); - Ops[3] = getValue(I.getOperand(3)); - DAG.setRoot(DAG.getNode(ISD::PREFETCH, dl, MVT::Other, &Ops[0], 4)); - return 0; - } - - case Intrinsic::memory_barrier: { - SDValue Ops[6]; - Ops[0] = getRoot(); - for (int x = 1; x < 6; ++x) - Ops[x] = getValue(I.getOperand(x)); - - DAG.setRoot(DAG.getNode(ISD::MEMBARRIER, dl, MVT::Other, &Ops[0], 6)); - return 0; - } - case Intrinsic::atomic_cmp_swap: { - SDValue Root = getRoot(); - SDValue L = - DAG.getAtomic(ISD::ATOMIC_CMP_SWAP, getCurDebugLoc(), - getValue(I.getOperand(2)).getValueType().getSimpleVT(), - Root, - getValue(I.getOperand(1)), - getValue(I.getOperand(2)), - getValue(I.getOperand(3)), - I.getOperand(1)); - setValue(&I, L); - DAG.setRoot(L.getValue(1)); - return 0; - } - case Intrinsic::atomic_load_add: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_ADD); - case Intrinsic::atomic_load_sub: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_SUB); - case Intrinsic::atomic_load_or: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_OR); - case Intrinsic::atomic_load_xor: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_XOR); - case Intrinsic::atomic_load_and: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_AND); - case Intrinsic::atomic_load_nand: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_NAND); - case Intrinsic::atomic_load_max: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MAX); - case Intrinsic::atomic_load_min: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MIN); - case Intrinsic::atomic_load_umin: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMIN); - case Intrinsic::atomic_load_umax: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMAX); - case Intrinsic::atomic_swap: - return implVisitBinaryAtomic(I, ISD::ATOMIC_SWAP); - - case Intrinsic::invariant_start: - case Intrinsic::lifetime_start: 
- // Discard region information. - setValue(&I, DAG.getUNDEF(TLI.getPointerTy())); - return 0; - case Intrinsic::invariant_end: - case Intrinsic::lifetime_end: - // Discard region information. - return 0; - } -} - -/// Test if the given instruction is in a position to be optimized -/// with a tail-call. This roughly means that it's in a block with -/// a return and there's nothing that needs to be scheduled -/// between it and the return. -/// -/// This function only tests target-independent requirements. -/// For target-dependent requirements, a target should override -/// TargetLowering::IsEligibleForTailCallOptimization. -/// -static bool -isInTailCallPosition(const Instruction *I, Attributes CalleeRetAttr, - const TargetLowering &TLI) { - const BasicBlock *ExitBB = I->getParent(); - const TerminatorInst *Term = ExitBB->getTerminator(); - const ReturnInst *Ret = dyn_cast<ReturnInst>(Term); - const Function *F = ExitBB->getParent(); - - // The block must end in a return statement or an unreachable. - if (!Ret && !isa<UnreachableInst>(Term)) return false; - - // If I will have a chain, make sure no other instruction that will have a - // chain interposes between I and the return. - if (I->mayHaveSideEffects() || I->mayReadFromMemory() || - !I->isSafeToSpeculativelyExecute()) - for (BasicBlock::const_iterator BBI = prior(prior(ExitBB->end())); ; - --BBI) { - if (&*BBI == I) - break; - if (BBI->mayHaveSideEffects() || BBI->mayReadFromMemory() || - !BBI->isSafeToSpeculativelyExecute()) - return false; - } - - // If the block ends with a void return or unreachable, it doesn't matter - // what the call's return type is. - if (!Ret || Ret->getNumOperands() == 0) return true; - - // If the return value is undef, it doesn't matter what the call's - // return type is. - if (isa<UndefValue>(Ret->getOperand(0))) return true; - - // Conservatively require the attributes of the call to match those of - // the return. Ignore noalias because it doesn't affect the call sequence.
- unsigned CallerRetAttr = F->getAttributes().getRetAttributes(); - if ((CalleeRetAttr ^ CallerRetAttr) & ~Attribute::NoAlias) - return false; - - // Otherwise, make sure the unmodified return value of I is the return value. - for (const Instruction *U = dyn_cast<Instruction>(Ret->getOperand(0)); ; - U = dyn_cast<Instruction>(U->getOperand(0))) { - if (!U) - return false; - if (!U->hasOneUse()) - return false; - if (U == I) - break; - // Check for a truly no-op truncate. - if (isa<TruncInst>(U) && - TLI.isTruncateFree(U->getOperand(0)->getType(), U->getType())) - continue; - // Check for a truly no-op bitcast. - if (isa<BitCastInst>(U) && - (U->getOperand(0)->getType() == U->getType() || - (isa<PointerType>(U->getOperand(0)->getType()) && - isa<PointerType>(U->getType())))) - continue; - // Otherwise it's not a true no-op. - return false; - } - - return true; -} - -void SelectionDAGLowering::LowerCallTo(CallSite CS, SDValue Callee, - bool isTailCall, - MachineBasicBlock *LandingPad) { - const PointerType *PT = cast<PointerType>(CS.getCalledValue()->getType()); - const FunctionType *FTy = cast<FunctionType>(PT->getElementType()); - const Type *RetTy = FTy->getReturnType(); - MachineModuleInfo *MMI = DAG.getMachineModuleInfo(); - unsigned BeginLabel = 0, EndLabel = 0; - - TargetLowering::ArgListTy Args; - TargetLowering::ArgListEntry Entry; - Args.reserve(CS.arg_size()); - - // Check whether the function can return without sret-demotion.
- SmallVector OutVTs; - SmallVector OutsFlags; - SmallVector Offsets; - getReturnInfo(RetTy, CS.getAttributes().getRetAttributes(), - OutVTs, OutsFlags, TLI, &Offsets); - - - bool CanLowerReturn = TLI.CanLowerReturn(CS.getCallingConv(), - FTy->isVarArg(), OutVTs, OutsFlags, DAG); - - SDValue DemoteStackSlot; - - if (!CanLowerReturn) { - uint64_t TySize = TLI.getTargetData()->getTypeAllocSize( - FTy->getReturnType()); - unsigned Align = TLI.getTargetData()->getPrefTypeAlignment( - FTy->getReturnType()); - MachineFunction &MF = DAG.getMachineFunction(); - int SSFI = MF.getFrameInfo()->CreateStackObject(TySize, Align, false); - const Type *StackSlotPtrType = PointerType::getUnqual(FTy->getReturnType()); - - DemoteStackSlot = DAG.getFrameIndex(SSFI, TLI.getPointerTy()); - Entry.Node = DemoteStackSlot; - Entry.Ty = StackSlotPtrType; - Entry.isSExt = false; - Entry.isZExt = false; - Entry.isInReg = false; - Entry.isSRet = true; - Entry.isNest = false; - Entry.isByVal = false; - Entry.Alignment = Align; - Args.push_back(Entry); - RetTy = Type::getVoidTy(FTy->getContext()); - } - - for (CallSite::arg_iterator i = CS.arg_begin(), e = CS.arg_end(); - i != e; ++i) { - SDValue ArgNode = getValue(*i); - Entry.Node = ArgNode; Entry.Ty = (*i)->getType(); - - unsigned attrInd = i - CS.arg_begin() + 1; - Entry.isSExt = CS.paramHasAttr(attrInd, Attribute::SExt); - Entry.isZExt = CS.paramHasAttr(attrInd, Attribute::ZExt); - Entry.isInReg = CS.paramHasAttr(attrInd, Attribute::InReg); - Entry.isSRet = CS.paramHasAttr(attrInd, Attribute::StructRet); - Entry.isNest = CS.paramHasAttr(attrInd, Attribute::Nest); - Entry.isByVal = CS.paramHasAttr(attrInd, Attribute::ByVal); - Entry.Alignment = CS.getParamAlignment(attrInd); - Args.push_back(Entry); - } - - if (LandingPad && MMI) { - // Insert a label before the invoke call to mark the try range. This can be - // used to detect deletion of the invoke via the MachineModuleInfo. 
- BeginLabel = MMI->NextLabelID(); - - // Both PendingLoads and PendingExports must be flushed here; - // this call might not return. - (void)getRoot(); - DAG.setRoot(DAG.getLabel(ISD::EH_LABEL, getCurDebugLoc(), - getControlRoot(), BeginLabel)); - } - - // Check if target-independent constraints permit a tail call here. - // Target-dependent constraints are checked within TLI.LowerCallTo. - if (isTailCall && - !isInTailCallPosition(CS.getInstruction(), - CS.getAttributes().getRetAttributes(), - TLI)) - isTailCall = false; - - std::pair Result = - TLI.LowerCallTo(getRoot(), RetTy, - CS.paramHasAttr(0, Attribute::SExt), - CS.paramHasAttr(0, Attribute::ZExt), FTy->isVarArg(), - CS.paramHasAttr(0, Attribute::InReg), FTy->getNumParams(), - CS.getCallingConv(), - isTailCall, - !CS.getInstruction()->use_empty(), - Callee, Args, DAG, getCurDebugLoc()); - assert((isTailCall || Result.second.getNode()) && - "Non-null chain expected with non-tail call!"); - assert((Result.second.getNode() || !Result.first.getNode()) && - "Null value expected with tail call!"); - if (Result.first.getNode()) - setValue(CS.getInstruction(), Result.first); - else if (!CanLowerReturn && Result.second.getNode()) { - // The instruction result is the result of loading from the - // hidden sret parameter. 
- SmallVector PVTs; - const Type *PtrRetTy = PointerType::getUnqual(FTy->getReturnType()); - - ComputeValueVTs(TLI, PtrRetTy, PVTs); - assert(PVTs.size() == 1 && "Pointers should fit in one register"); - EVT PtrVT = PVTs[0]; - unsigned NumValues = OutVTs.size(); - SmallVector Values(NumValues); - SmallVector Chains(NumValues); - - for (unsigned i = 0; i < NumValues; ++i) { - SDValue L = DAG.getLoad(OutVTs[i], getCurDebugLoc(), Result.second, - DAG.getNode(ISD::ADD, getCurDebugLoc(), PtrVT, DemoteStackSlot, - DAG.getConstant(Offsets[i], PtrVT)), - NULL, Offsets[i], false, 1); - Values[i] = L; - Chains[i] = L.getValue(1); - } - SDValue Chain = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), - MVT::Other, &Chains[0], NumValues); - PendingLoads.push_back(Chain); - - setValue(CS.getInstruction(), DAG.getNode(ISD::MERGE_VALUES, - getCurDebugLoc(), DAG.getVTList(&OutVTs[0], NumValues), - &Values[0], NumValues)); - } - // As a special case, a null chain means that a tail call has - // been emitted and the DAG root is already updated. - if (Result.second.getNode()) - DAG.setRoot(Result.second); - else - HasTailCall = true; - - if (LandingPad && MMI) { - // Insert a label at the end of the invoke call to mark the try range. This - // can be used to detect deletion of the invoke via the MachineModuleInfo. - EndLabel = MMI->NextLabelID(); - DAG.setRoot(DAG.getLabel(ISD::EH_LABEL, getCurDebugLoc(), - getRoot(), EndLabel)); - - // Inform MachineModuleInfo of range. 
- MMI->addInvoke(LandingPad, BeginLabel, EndLabel); - } -} - - -void SelectionDAGLowering::visitCall(CallInst &I) { - const char *RenameFn = 0; - if (Function *F = I.getCalledFunction()) { - if (F->isDeclaration()) { - const TargetIntrinsicInfo *II = TLI.getTargetMachine().getIntrinsicInfo(); - if (II) { - if (unsigned IID = II->getIntrinsicID(F)) { - RenameFn = visitIntrinsicCall(I, IID); - if (!RenameFn) - return; - } - } - if (unsigned IID = F->getIntrinsicID()) { - RenameFn = visitIntrinsicCall(I, IID); - if (!RenameFn) - return; - } - } - - // Check for well-known libc/libm calls. If the function is internal, it - // can't be a library call. - if (!F->hasLocalLinkage() && F->hasName()) { - StringRef Name = F->getName(); - if (Name == "copysign" || Name == "copysignf") { - if (I.getNumOperands() == 3 && // Basic sanity checks. - I.getOperand(1)->getType()->isFloatingPoint() && - I.getType() == I.getOperand(1)->getType() && - I.getType() == I.getOperand(2)->getType()) { - SDValue LHS = getValue(I.getOperand(1)); - SDValue RHS = getValue(I.getOperand(2)); - setValue(&I, DAG.getNode(ISD::FCOPYSIGN, getCurDebugLoc(), - LHS.getValueType(), LHS, RHS)); - return; - } - } else if (Name == "fabs" || Name == "fabsf" || Name == "fabsl") { - if (I.getNumOperands() == 2 && // Basic sanity checks. - I.getOperand(1)->getType()->isFloatingPoint() && - I.getType() == I.getOperand(1)->getType()) { - SDValue Tmp = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FABS, getCurDebugLoc(), - Tmp.getValueType(), Tmp)); - return; - } - } else if (Name == "sin" || Name == "sinf" || Name == "sinl") { - if (I.getNumOperands() == 2 && // Basic sanity checks. 
- I.getOperand(1)->getType()->isFloatingPoint() && - I.getType() == I.getOperand(1)->getType() && - I.onlyReadsMemory()) { - SDValue Tmp = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FSIN, getCurDebugLoc(), - Tmp.getValueType(), Tmp)); - return; - } - } else if (Name == "cos" || Name == "cosf" || Name == "cosl") { - if (I.getNumOperands() == 2 && // Basic sanity checks. - I.getOperand(1)->getType()->isFloatingPoint() && - I.getType() == I.getOperand(1)->getType() && - I.onlyReadsMemory()) { - SDValue Tmp = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FCOS, getCurDebugLoc(), - Tmp.getValueType(), Tmp)); - return; - } - } else if (Name == "sqrt" || Name == "sqrtf" || Name == "sqrtl") { - if (I.getNumOperands() == 2 && // Basic sanity checks. - I.getOperand(1)->getType()->isFloatingPoint() && - I.getType() == I.getOperand(1)->getType() && - I.onlyReadsMemory()) { - SDValue Tmp = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FSQRT, getCurDebugLoc(), - Tmp.getValueType(), Tmp)); - return; - } - } - } - } else if (isa<InlineAsm>(I.getOperand(0))) { - visitInlineAsm(&I); - return; - } - - SDValue Callee; - if (!RenameFn) - Callee = getValue(I.getOperand(0)); - else - Callee = DAG.getExternalSymbol(RenameFn, TLI.getPointerTy()); - - // Check if we can potentially perform a tail call. More detailed - // checking is be done within LowerCallTo, after more information - // about the call is known. - bool isTailCall = PerformTailCallOpt && I.isTailCall(); - - LowerCallTo(&I, Callee, isTailCall); -} - - -/// getCopyFromRegs - Emit a series of CopyFromReg nodes that copies from -/// this value and returns the result as a ValueVT value. This uses -/// Chain/Flag as the input and updates them for the output Chain/Flag. -/// If the Flag pointer is NULL, no flag is used. -SDValue RegsForValue::getCopyFromRegs(SelectionDAG &DAG, DebugLoc dl, - SDValue &Chain, - SDValue *Flag) const { - // Assemble the legal parts into the final values.
- SmallVector Values(ValueVTs.size()); - SmallVector Parts; - for (unsigned Value = 0, Part = 0, e = ValueVTs.size(); Value != e; ++Value) { - // Copy the legal parts from the registers. - EVT ValueVT = ValueVTs[Value]; - unsigned NumRegs = TLI->getNumRegisters(*DAG.getContext(), ValueVT); - EVT RegisterVT = RegVTs[Value]; - - Parts.resize(NumRegs); - for (unsigned i = 0; i != NumRegs; ++i) { - SDValue P; - if (Flag == 0) - P = DAG.getCopyFromReg(Chain, dl, Regs[Part+i], RegisterVT); - else { - P = DAG.getCopyFromReg(Chain, dl, Regs[Part+i], RegisterVT, *Flag); - *Flag = P.getValue(2); - } - Chain = P.getValue(1); - - // If the source register was virtual and if we know something about it, - // add an assert node. - if (TargetRegisterInfo::isVirtualRegister(Regs[Part+i]) && - RegisterVT.isInteger() && !RegisterVT.isVector()) { - unsigned SlotNo = Regs[Part+i]-TargetRegisterInfo::FirstVirtualRegister; - FunctionLoweringInfo &FLI = DAG.getFunctionLoweringInfo(); - if (FLI.LiveOutRegInfo.size() > SlotNo) { - FunctionLoweringInfo::LiveOutInfo &LOI = FLI.LiveOutRegInfo[SlotNo]; - - unsigned RegSize = RegisterVT.getSizeInBits(); - unsigned NumSignBits = LOI.NumSignBits; - unsigned NumZeroBits = LOI.KnownZero.countLeadingOnes(); - - // FIXME: We capture more information than the dag can represent. For - // now, just use the tightest assertzext/assertsext possible. 
- bool isSExt = true; - EVT FromVT(MVT::Other); - if (NumSignBits == RegSize) - isSExt = true, FromVT = MVT::i1; // ASSERT SEXT 1 - else if (NumZeroBits >= RegSize-1) - isSExt = false, FromVT = MVT::i1; // ASSERT ZEXT 1 - else if (NumSignBits > RegSize-8) - isSExt = true, FromVT = MVT::i8; // ASSERT SEXT 8 - else if (NumZeroBits >= RegSize-8) - isSExt = false, FromVT = MVT::i8; // ASSERT ZEXT 8 - else if (NumSignBits > RegSize-16) - isSExt = true, FromVT = MVT::i16; // ASSERT SEXT 16 - else if (NumZeroBits >= RegSize-16) - isSExt = false, FromVT = MVT::i16; // ASSERT ZEXT 16 - else if (NumSignBits > RegSize-32) - isSExt = true, FromVT = MVT::i32; // ASSERT SEXT 32 - else if (NumZeroBits >= RegSize-32) - isSExt = false, FromVT = MVT::i32; // ASSERT ZEXT 32 - - if (FromVT != MVT::Other) { - P = DAG.getNode(isSExt ? ISD::AssertSext : ISD::AssertZext, dl, - RegisterVT, P, DAG.getValueType(FromVT)); - - } - } - } - - Parts[i] = P; - } - - Values[Value] = getCopyFromParts(DAG, dl, Parts.begin(), - NumRegs, RegisterVT, ValueVT); - Part += NumRegs; - Parts.clear(); - } - - return DAG.getNode(ISD::MERGE_VALUES, dl, - DAG.getVTList(&ValueVTs[0], ValueVTs.size()), - &Values[0], ValueVTs.size()); -} - -/// getCopyToRegs - Emit a series of CopyToReg nodes that copies the -/// specified value into the registers specified by this object. This uses -/// Chain/Flag as the input and updates them for the output Chain/Flag. -/// If the Flag pointer is NULL, no flag is used. -void RegsForValue::getCopyToRegs(SDValue Val, SelectionDAG &DAG, DebugLoc dl, - SDValue &Chain, SDValue *Flag) const { - // Get the list of the values's legal parts. 
- unsigned NumRegs = Regs.size(); - SmallVector Parts(NumRegs); - for (unsigned Value = 0, Part = 0, e = ValueVTs.size(); Value != e; ++Value) { - EVT ValueVT = ValueVTs[Value]; - unsigned NumParts = TLI->getNumRegisters(*DAG.getContext(), ValueVT); - EVT RegisterVT = RegVTs[Value]; - - getCopyToParts(DAG, dl, Val.getValue(Val.getResNo() + Value), - &Parts[Part], NumParts, RegisterVT); - Part += NumParts; - } - - // Copy the parts into the registers. - SmallVector Chains(NumRegs); - for (unsigned i = 0; i != NumRegs; ++i) { - SDValue Part; - if (Flag == 0) - Part = DAG.getCopyToReg(Chain, dl, Regs[i], Parts[i]); - else { - Part = DAG.getCopyToReg(Chain, dl, Regs[i], Parts[i], *Flag); - *Flag = Part.getValue(1); - } - Chains[i] = Part.getValue(0); - } - - if (NumRegs == 1 || Flag) - // If NumRegs > 1 && Flag is used then the use of the last CopyToReg is - // flagged to it. That is the CopyToReg nodes and the user are considered - // a single scheduling unit. If we create a TokenFactor and return it as - // chain, then the TokenFactor is both a predecessor (operand) of the - // user as well as a successor (the TF operands are flagged to the user). - // c1, f1 = CopyToReg - // c2, f2 = CopyToReg - // c3 = TokenFactor c1, c2 - // ... - // = op c3, ..., f2 - Chain = Chains[NumRegs-1]; - else - Chain = DAG.getNode(ISD::TokenFactor, dl, MVT::Other, &Chains[0], NumRegs); -} - -/// AddInlineAsmOperands - Add this value to the specified inlineasm node -/// operand list. This adds the code marker and includes the number of -/// values added into it. 
-void RegsForValue::AddInlineAsmOperands(unsigned Code, - bool HasMatching,unsigned MatchingIdx, - SelectionDAG &DAG, - std::vector &Ops) const { - EVT IntPtrTy = DAG.getTargetLoweringInfo().getPointerTy(); - assert(Regs.size() < (1 << 13) && "Too many inline asm outputs!"); - unsigned Flag = Code | (Regs.size() << 3); - if (HasMatching) - Flag |= 0x80000000 | (MatchingIdx << 16); - Ops.push_back(DAG.getTargetConstant(Flag, IntPtrTy)); - for (unsigned Value = 0, Reg = 0, e = ValueVTs.size(); Value != e; ++Value) { - unsigned NumRegs = TLI->getNumRegisters(*DAG.getContext(), ValueVTs[Value]); - EVT RegisterVT = RegVTs[Value]; - for (unsigned i = 0; i != NumRegs; ++i) { - assert(Reg < Regs.size() && "Mismatch in # registers expected"); - Ops.push_back(DAG.getRegister(Regs[Reg++], RegisterVT)); - } - } -} - -/// isAllocatableRegister - If the specified register is safe to allocate, -/// i.e. it isn't a stack pointer or some other special register, return the -/// register class for the register. Otherwise, return null. -static const TargetRegisterClass * -isAllocatableRegister(unsigned Reg, MachineFunction &MF, - const TargetLowering &TLI, - const TargetRegisterInfo *TRI) { - EVT FoundVT = MVT::Other; - const TargetRegisterClass *FoundRC = 0; - for (TargetRegisterInfo::regclass_iterator RCI = TRI->regclass_begin(), - E = TRI->regclass_end(); RCI != E; ++RCI) { - EVT ThisVT = MVT::Other; - - const TargetRegisterClass *RC = *RCI; - // If none of the the value types for this register class are valid, we - // can't use it. For example, 64-bit reg classes on 32-bit targets. - for (TargetRegisterClass::vt_iterator I = RC->vt_begin(), E = RC->vt_end(); - I != E; ++I) { - if (TLI.isTypeLegal(*I)) { - // If we have already found this register in a different register class, - // choose the one with the largest VT specified. For example, on - // PowerPC, we favor f64 register classes over f32. 
- if (FoundVT == MVT::Other || FoundVT.bitsLT(*I)) { - ThisVT = *I; - break; - } - } - } - - if (ThisVT == MVT::Other) continue; - - // NOTE: This isn't ideal. In particular, this might allocate the - // frame pointer in functions that need it (due to them not being taken - // out of allocation, because a variable sized allocation hasn't been seen - // yet). This is a slight code pessimization, but should still work. - for (TargetRegisterClass::iterator I = RC->allocation_order_begin(MF), - E = RC->allocation_order_end(MF); I != E; ++I) - if (*I == Reg) { - // We found a matching register class. Keep looking at others in case - // we find one with larger registers that this physreg is also in. - FoundRC = RC; - FoundVT = ThisVT; - break; - } - } - return FoundRC; -} - - -namespace llvm { -/// AsmOperandInfo - This contains information for each constraint that we are -/// lowering. -class VISIBILITY_HIDDEN SDISelAsmOperandInfo : - public TargetLowering::AsmOperandInfo { -public: - /// CallOperand - If this is the result output operand or a clobber - /// this is null, otherwise it is the incoming operand to the CallInst. - /// This gets modified as the asm is processed. - SDValue CallOperand; - - /// AssignedRegs - If this is a register or register class operand, this - /// contains the set of register corresponding to the operand. - RegsForValue AssignedRegs; - - explicit SDISelAsmOperandInfo(const InlineAsm::ConstraintInfo &info) - : TargetLowering::AsmOperandInfo(info), CallOperand(0,0) { - } - - /// MarkAllocatedRegs - Once AssignedRegs is set, mark the assigned registers - /// busy in OutputRegs/InputRegs. 
- void MarkAllocatedRegs(bool isOutReg, bool isInReg, - std::set<unsigned> &OutputRegs, - std::set<unsigned> &InputRegs, - const TargetRegisterInfo &TRI) const { - if (isOutReg) { - for (unsigned i = 0, e = AssignedRegs.Regs.size(); i != e; ++i) - MarkRegAndAliases(AssignedRegs.Regs[i], OutputRegs, TRI); - } - if (isInReg) { - for (unsigned i = 0, e = AssignedRegs.Regs.size(); i != e; ++i) - MarkRegAndAliases(AssignedRegs.Regs[i], InputRegs, TRI); - } - } - - /// getCallOperandValEVT - Return the EVT of the Value* that this operand - /// corresponds to. If there is no Value* for this operand, it returns - /// MVT::Other. - EVT getCallOperandValEVT(LLVMContext &Context, - const TargetLowering &TLI, - const TargetData *TD) const { - if (CallOperandVal == 0) return MVT::Other; - - if (isa<BasicBlock>(CallOperandVal)) - return TLI.getPointerTy(); - - const llvm::Type *OpTy = CallOperandVal->getType(); - - // If this is an indirect operand, the operand is a pointer to the - // accessed type. - if (isIndirect) - OpTy = cast<PointerType>(OpTy)->getElementType(); - - // If OpTy is not a single value, it may be a struct/union that we - // can tile with integers. - if (!OpTy->isSingleValueType() && OpTy->isSized()) { - unsigned BitSize = TD->getTypeSizeInBits(OpTy); - switch (BitSize) { - default: break; - case 1: - case 8: - case 16: - case 32: - case 64: - case 128: - OpTy = IntegerType::get(Context, BitSize); - break; - } - } - - return TLI.getValueType(OpTy, true); - } - -private: - /// MarkRegAndAliases - Mark the specified register and all aliases in the - /// specified set. - static void MarkRegAndAliases(unsigned Reg, std::set<unsigned> &Regs, - const TargetRegisterInfo &TRI) { - assert(TargetRegisterInfo::isPhysicalRegister(Reg) && "Isn't a physreg"); - Regs.insert(Reg); - if (const unsigned *Aliases = TRI.getAliasSet(Reg)) - for (; *Aliases; ++Aliases) - Regs.insert(*Aliases); - } -}; -} // end llvm namespace. - - -/// GetRegistersForValue - Assign registers (virtual or physical) for the -/// specified operand.
We prefer to assign virtual registers, to allow the -/// register allocator handle the assignment process. However, if the asm uses -/// features that we can't model on machineinstrs, we have SDISel do the -/// allocation. This produces generally horrible, but correct, code. -/// -/// OpInfo describes the operand. -/// Input and OutputRegs are the set of already allocated physical registers. -/// -void SelectionDAGLowering:: -GetRegistersForValue(SDISelAsmOperandInfo &OpInfo, - std::set &OutputRegs, - std::set &InputRegs) { - LLVMContext &Context = FuncInfo.Fn->getContext(); - - // Compute whether this value requires an input register, an output register, - // or both. - bool isOutReg = false; - bool isInReg = false; - switch (OpInfo.Type) { - case InlineAsm::isOutput: - isOutReg = true; - - // If there is an input constraint that matches this, we need to reserve - // the input register so no other inputs allocate to it. - isInReg = OpInfo.hasMatchingInput(); - break; - case InlineAsm::isInput: - isInReg = true; - isOutReg = false; - break; - case InlineAsm::isClobber: - isOutReg = true; - isInReg = true; - break; - } - - - MachineFunction &MF = DAG.getMachineFunction(); - SmallVector Regs; - - // If this is a constraint for a single physreg, or a constraint for a - // register class, find it. - std::pair PhysReg = - TLI.getRegForInlineAsmConstraint(OpInfo.ConstraintCode, - OpInfo.ConstraintVT); - - unsigned NumRegs = 1; - if (OpInfo.ConstraintVT != MVT::Other) { - // If this is a FP input in an integer register (or visa versa) insert a bit - // cast of the input value. More generally, handle any case where the input - // value disagrees with the register class we plan to stick this in. - if (OpInfo.Type == InlineAsm::isInput && - PhysReg.second && !PhysReg.second->hasType(OpInfo.ConstraintVT)) { - // Try to convert to the first EVT that the reg class contains. If the - // types are identical size, use a bitcast to convert (e.g. two differing - // vector types). 
-      EVT RegVT = *PhysReg.second->vt_begin();
-      if (RegVT.getSizeInBits() == OpInfo.ConstraintVT.getSizeInBits()) {
-        OpInfo.CallOperand = DAG.getNode(ISD::BIT_CONVERT, getCurDebugLoc(),
-                                         RegVT, OpInfo.CallOperand);
-        OpInfo.ConstraintVT = RegVT;
-      } else if (RegVT.isInteger() && OpInfo.ConstraintVT.isFloatingPoint()) {
-        // If the input is a FP value and we want it in FP registers, do a
-        // bitcast to the corresponding integer type. This turns an f64 value
-        // into i64, which can be passed with two i32 values on a 32-bit
-        // machine.
-        RegVT = EVT::getIntegerVT(Context,
-                                  OpInfo.ConstraintVT.getSizeInBits());
-        OpInfo.CallOperand = DAG.getNode(ISD::BIT_CONVERT, getCurDebugLoc(),
-                                         RegVT, OpInfo.CallOperand);
-        OpInfo.ConstraintVT = RegVT;
-      }
-    }
-
-    NumRegs = TLI.getNumRegisters(Context, OpInfo.ConstraintVT);
-  }
-
-  EVT RegVT;
-  EVT ValueVT = OpInfo.ConstraintVT;
-
-  // If this is a constraint for a specific physical register, like {r17},
-  // assign it now.
-  if (unsigned AssignedReg = PhysReg.first) {
-    const TargetRegisterClass *RC = PhysReg.second;
-    if (OpInfo.ConstraintVT == MVT::Other)
-      ValueVT = *RC->vt_begin();
-
-    // Get the actual register value type. This is important, because the user
-    // may have asked for (e.g.) the AX register in i32 type. We need to
-    // remember that AX is actually i16 to get the right extension.
-    RegVT = *RC->vt_begin();
-
-    // This is a explicit reference to a physical register.
-    Regs.push_back(AssignedReg);
-
-    // If this is an expanded reference, add the rest of the regs to Regs.
-    if (NumRegs != 1) {
-      TargetRegisterClass::iterator I = RC->begin();
-      for (; *I != AssignedReg; ++I)
-        assert(I != RC->end() && "Didn't find reg!");
-
-      // Already added the first reg.
-      --NumRegs; ++I;
-      for (; NumRegs; --NumRegs, ++I) {
-        assert(I != RC->end() && "Ran out of registers to allocate!");
-        Regs.push_back(*I);
-      }
-    }
-    OpInfo.AssignedRegs = RegsForValue(TLI, Regs, RegVT, ValueVT);
-    const TargetRegisterInfo *TRI = DAG.getTarget().getRegisterInfo();
-    OpInfo.MarkAllocatedRegs(isOutReg, isInReg, OutputRegs, InputRegs, *TRI);
-    return;
-  }
-
-  // Otherwise, if this was a reference to an LLVM register class, create vregs
-  // for this reference.
-  if (const TargetRegisterClass *RC = PhysReg.second) {
-    RegVT = *RC->vt_begin();
-    if (OpInfo.ConstraintVT == MVT::Other)
-      ValueVT = RegVT;
-
-    // Create the appropriate number of virtual registers.
-    MachineRegisterInfo &RegInfo = MF.getRegInfo();
-    for (; NumRegs; --NumRegs)
-      Regs.push_back(RegInfo.createVirtualRegister(RC));
-
-    OpInfo.AssignedRegs = RegsForValue(TLI, Regs, RegVT, ValueVT);
-    return;
-  }
-
-  // This is a reference to a register class that doesn't directly correspond
-  // to an LLVM register class. Allocate NumRegs consecutive, available,
-  // registers from the class.
-  std::vector RegClassRegs
-    = TLI.getRegClassForInlineAsmConstraint(OpInfo.ConstraintCode,
-                                            OpInfo.ConstraintVT);
-
-  const TargetRegisterInfo *TRI = DAG.getTarget().getRegisterInfo();
-  unsigned NumAllocated = 0;
-  for (unsigned i = 0, e = RegClassRegs.size(); i != e; ++i) {
-    unsigned Reg = RegClassRegs[i];
-    // See if this register is available.
-    if ((isOutReg && OutputRegs.count(Reg)) || // Already used.
-        (isInReg && InputRegs.count(Reg))) { // Already used.
-      // Make sure we find consecutive registers.
-      NumAllocated = 0;
-      continue;
-    }
-
-    // Check to see if this register is allocatable (i.e. don't give out the
-    // stack pointer).
-    const TargetRegisterClass *RC = isAllocatableRegister(Reg, MF, TLI, TRI);
-    if (!RC) { // Couldn't allocate this register.
-      // Reset NumAllocated to make sure we return consecutive registers.
-      NumAllocated = 0;
-      continue;
-    }
-
-    // Okay, this register is good, we can use it.
-    ++NumAllocated;
-
-    // If we allocated enough consecutive registers, succeed.
-    if (NumAllocated == NumRegs) {
-      unsigned RegStart = (i-NumAllocated)+1;
-      unsigned RegEnd = i+1;
-      // Mark all of the allocated registers used.
-      for (unsigned i = RegStart; i != RegEnd; ++i)
-        Regs.push_back(RegClassRegs[i]);
-
-      OpInfo.AssignedRegs = RegsForValue(TLI, Regs, *RC->vt_begin(),
-                                         OpInfo.ConstraintVT);
-      OpInfo.MarkAllocatedRegs(isOutReg, isInReg, OutputRegs, InputRegs, *TRI);
-      return;
-    }
-  }
-
-  // Otherwise, we couldn't allocate enough registers for this.
-}
-
-/// hasInlineAsmMemConstraint - Return true if the inline asm instruction being
-/// processed uses a memory 'm' constraint.
-static bool
-hasInlineAsmMemConstraint(std::vector &CInfos,
-                          const TargetLowering &TLI) {
-  for (unsigned i = 0, e = CInfos.size(); i != e; ++i) {
-    InlineAsm::ConstraintInfo &CI = CInfos[i];
-    for (unsigned j = 0, ee = CI.Codes.size(); j != ee; ++j) {
-      TargetLowering::ConstraintType CType = TLI.getConstraintType(CI.Codes[j]);
-      if (CType == TargetLowering::C_Memory)
-        return true;
-    }
-
-    // Indirect operand accesses access memory.
-    if (CI.isIndirect)
-      return true;
-  }
-
-  return false;
-}
-
-/// visitInlineAsm - Handle a call to an InlineAsm object.
-///
-void SelectionDAGLowering::visitInlineAsm(CallSite CS) {
-  InlineAsm *IA = cast(CS.getCalledValue());
-
-  /// ConstraintOperands - Information about all of the constraints.
-  std::vector ConstraintOperands;
-
-  std::set OutputRegs, InputRegs;
-
-  // Do a prepass over the constraints, canonicalizing them, and building up the
-  // ConstraintOperands list.
-  std::vector
-    ConstraintInfos = IA->ParseConstraints();
-
-  bool hasMemory = hasInlineAsmMemConstraint(ConstraintInfos, TLI);
-
-  SDValue Chain, Flag;
-
-  // We won't need to flush pending loads if this asm doesn't touch
-  // memory and is nonvolatile.
-  if (hasMemory || IA->hasSideEffects())
-    Chain = getRoot();
-  else
-    Chain = DAG.getRoot();
-
-  unsigned ArgNo = 0; // ArgNo - The argument of the CallInst.
-  unsigned ResNo = 0; // ResNo - The result number of the next output.
-  for (unsigned i = 0, e = ConstraintInfos.size(); i != e; ++i) {
-    ConstraintOperands.push_back(SDISelAsmOperandInfo(ConstraintInfos[i]));
-    SDISelAsmOperandInfo &OpInfo = ConstraintOperands.back();
-
-    EVT OpVT = MVT::Other;
-
-    // Compute the value type for each operand.
-    switch (OpInfo.Type) {
-    case InlineAsm::isOutput:
-      // Indirect outputs just consume an argument.
-      if (OpInfo.isIndirect) {
-        OpInfo.CallOperandVal = CS.getArgument(ArgNo++);
-        break;
-      }
-
-      // The return value of the call is this value. As such, there is no
-      // corresponding argument.
-      assert(CS.getType() != Type::getVoidTy(*DAG.getContext()) &&
-             "Bad inline asm!");
-      if (const StructType *STy = dyn_cast(CS.getType())) {
-        OpVT = TLI.getValueType(STy->getElementType(ResNo));
-      } else {
-        assert(ResNo == 0 && "Asm only has one result!");
-        OpVT = TLI.getValueType(CS.getType());
-      }
-      ++ResNo;
-      break;
-    case InlineAsm::isInput:
-      OpInfo.CallOperandVal = CS.getArgument(ArgNo++);
-      break;
-    case InlineAsm::isClobber:
-      // Nothing to do.
-      break;
-    }
-
-    // If this is an input or an indirect output, process the call argument.
-    // BasicBlocks are labels, currently appearing only in asm's.
-    if (OpInfo.CallOperandVal) {
-      // Strip bitcasts, if any. This mostly comes up for functions.
-      OpInfo.CallOperandVal = OpInfo.CallOperandVal->stripPointerCasts();
-
-      if (BasicBlock *BB = dyn_cast(OpInfo.CallOperandVal)) {
-        OpInfo.CallOperand = DAG.getBasicBlock(FuncInfo.MBBMap[BB]);
-      } else {
-        OpInfo.CallOperand = getValue(OpInfo.CallOperandVal);
-      }
-
-      OpVT = OpInfo.getCallOperandValEVT(*DAG.getContext(), TLI, TD);
-    }
-
-    OpInfo.ConstraintVT = OpVT;
-  }
-
-  // Second pass over the constraints: compute which constraint option to use
-  // and assign registers to constraints that want a specific physreg.
-  for (unsigned i = 0, e = ConstraintInfos.size(); i != e; ++i) {
-    SDISelAsmOperandInfo &OpInfo = ConstraintOperands[i];
-
-    // If this is an output operand with a matching input operand, look up the
-    // matching input. If their types mismatch, e.g. one is an integer, the
-    // other is floating point, or their sizes are different, flag it as an
-    // error.
-    if (OpInfo.hasMatchingInput()) {
-      SDISelAsmOperandInfo &Input = ConstraintOperands[OpInfo.MatchingInput];
-      if (OpInfo.ConstraintVT != Input.ConstraintVT) {
-        if ((OpInfo.ConstraintVT.isInteger() !=
-             Input.ConstraintVT.isInteger()) ||
-            (OpInfo.ConstraintVT.getSizeInBits() !=
-             Input.ConstraintVT.getSizeInBits())) {
-          llvm_report_error("Unsupported asm: input constraint"
-                            " with a matching output constraint of incompatible"
-                            " type!");
-        }
-        Input.ConstraintVT = OpInfo.ConstraintVT;
-      }
-    }
-
-    // Compute the constraint code and ConstraintType to use.
-    TLI.ComputeConstraintToUse(OpInfo, OpInfo.CallOperand, hasMemory, &DAG);
-
-    // If this is a memory input, and if the operand is not indirect, do what we
-    // need to to provide an address for the memory input.
-    if (OpInfo.ConstraintType == TargetLowering::C_Memory &&
-        !OpInfo.isIndirect) {
-      assert(OpInfo.Type == InlineAsm::isInput &&
-             "Can only indirectify direct input operands!");
-
-      // Memory operands really want the address of the value. If we don't have
-      // an indirect input, put it in the constpool if we can, otherwise spill
-      // it to a stack slot.
-
-      // If the operand is a float, integer, or vector constant, spill to a
-      // constant pool entry to get its address.
-      Value *OpVal = OpInfo.CallOperandVal;
-      if (isa(OpVal) || isa(OpVal) ||
-          isa(OpVal)) {
-        OpInfo.CallOperand = DAG.getConstantPool(cast(OpVal),
-                                                 TLI.getPointerTy());
-      } else {
-        // Otherwise, create a stack slot and emit a store to it before the
-        // asm.
-        const Type *Ty = OpVal->getType();
-        uint64_t TySize = TLI.getTargetData()->getTypeAllocSize(Ty);
-        unsigned Align = TLI.getTargetData()->getPrefTypeAlignment(Ty);
-        MachineFunction &MF = DAG.getMachineFunction();
-        int SSFI = MF.getFrameInfo()->CreateStackObject(TySize, Align, false);
-        SDValue StackSlot = DAG.getFrameIndex(SSFI, TLI.getPointerTy());
-        Chain = DAG.getStore(Chain, getCurDebugLoc(),
-                             OpInfo.CallOperand, StackSlot, NULL, 0);
-        OpInfo.CallOperand = StackSlot;
-      }
-
-      // There is no longer a Value* corresponding to this operand.
-      OpInfo.CallOperandVal = 0;
-      // It is now an indirect operand.
-      OpInfo.isIndirect = true;
-    }
-
-    // If this constraint is for a specific register, allocate it before
-    // anything else.
-    if (OpInfo.ConstraintType == TargetLowering::C_Register)
-      GetRegistersForValue(OpInfo, OutputRegs, InputRegs);
-  }
-  ConstraintInfos.clear();
-
-
-  // Second pass - Loop over all of the operands, assigning virtual or physregs
-  // to register class operands.
-  for (unsigned i = 0, e = ConstraintOperands.size(); i != e; ++i) {
-    SDISelAsmOperandInfo &OpInfo = ConstraintOperands[i];
-
-    // C_Register operands have already been allocated, Other/Memory don't need
-    // to be.
-    if (OpInfo.ConstraintType == TargetLowering::C_RegisterClass)
-      GetRegistersForValue(OpInfo, OutputRegs, InputRegs);
-  }
-
-  // AsmNodeOperands - The operands for the ISD::INLINEASM node.
-  std::vector AsmNodeOperands;
-  AsmNodeOperands.push_back(SDValue()); // reserve space for input chain
-  AsmNodeOperands.push_back(
-          DAG.getTargetExternalSymbol(IA->getAsmString().c_str(), MVT::Other));
-
-
-  // Loop over all of the inputs, copying the operand values into the
-  // appropriate registers and processing the output regs.
-  RegsForValue RetValRegs;
-
-  // IndirectStoresToEmit - The set of stores to emit after the inline asm node.
-  std::vector > IndirectStoresToEmit;
-
-  for (unsigned i = 0, e = ConstraintOperands.size(); i != e; ++i) {
-    SDISelAsmOperandInfo &OpInfo = ConstraintOperands[i];
-
-    switch (OpInfo.Type) {
-    case InlineAsm::isOutput: {
-      if (OpInfo.ConstraintType != TargetLowering::C_RegisterClass &&
-          OpInfo.ConstraintType != TargetLowering::C_Register) {
-        // Memory output, or 'other' output (e.g. 'X' constraint).
-        assert(OpInfo.isIndirect && "Memory output must be indirect operand");
-
-        // Add information to the INLINEASM node to know about this output.
-        unsigned ResOpType = 4/*MEM*/ | (1<<3);
-        AsmNodeOperands.push_back(DAG.getTargetConstant(ResOpType,
-                                                        TLI.getPointerTy()));
-        AsmNodeOperands.push_back(OpInfo.CallOperand);
-        break;
-      }
-
-      // Otherwise, this is a register or register class output.
-
-      // Copy the output from the appropriate register. Find a register that
-      // we can use.
-      if (OpInfo.AssignedRegs.Regs.empty()) {
-        llvm_report_error("Couldn't allocate output reg for"
-                          " constraint '" + OpInfo.ConstraintCode + "'!");
-      }
-
-      // If this is an indirect operand, store through the pointer after the
-      // asm.
-      if (OpInfo.isIndirect) {
-        IndirectStoresToEmit.push_back(std::make_pair(OpInfo.AssignedRegs,
-                                                      OpInfo.CallOperandVal));
-      } else {
-        // This is the result value of the call.
-        assert(CS.getType() != Type::getVoidTy(*DAG.getContext()) &&
-               "Bad inline asm!");
-        // Concatenate this output onto the outputs list.
-        RetValRegs.append(OpInfo.AssignedRegs);
-      }
-
-      // Add information to the INLINEASM node to know that this register is
-      // set.
-      OpInfo.AssignedRegs.AddInlineAsmOperands(OpInfo.isEarlyClobber ?
-                                               6 /* EARLYCLOBBER REGDEF */ :
-                                               2 /* REGDEF */ ,
-                                               false,
-                                               0,
-                                               DAG, AsmNodeOperands);
-      break;
-    }
-    case InlineAsm::isInput: {
-      SDValue InOperandVal = OpInfo.CallOperand;
-
-      if (OpInfo.isMatchingInputConstraint()) { // Matching constraint?
-        // If this is required to match an output register we have already set,
-        // just use its register.
-        unsigned OperandNo = OpInfo.getMatchedOperand();
-
-        // Scan until we find the definition we already emitted of this operand.
-        // When we find it, create a RegsForValue operand.
-        unsigned CurOp = 2; // The first operand.
-        for (; OperandNo; --OperandNo) {
-          // Advance to the next operand.
-          unsigned OpFlag =
-            cast(AsmNodeOperands[CurOp])->getZExtValue();
-          assert(((OpFlag & 7) == 2 /*REGDEF*/ ||
-                  (OpFlag & 7) == 6 /*EARLYCLOBBER REGDEF*/ ||
-                  (OpFlag & 7) == 4 /*MEM*/) &&
-                 "Skipped past definitions?");
-          CurOp += InlineAsm::getNumOperandRegisters(OpFlag)+1;
-        }
-
-        unsigned OpFlag =
-          cast(AsmNodeOperands[CurOp])->getZExtValue();
-        if ((OpFlag & 7) == 2 /*REGDEF*/
-            || (OpFlag & 7) == 6 /* EARLYCLOBBER REGDEF */) {
-          // Add (OpFlag&0xffff)>>3 registers to MatchedRegs.
-          if (OpInfo.isIndirect) {
-            llvm_report_error("Don't know how to handle tied indirect "
-                              "register inputs yet!");
-          }
-          RegsForValue MatchedRegs;
-          MatchedRegs.TLI = &TLI;
-          MatchedRegs.ValueVTs.push_back(InOperandVal.getValueType());
-          EVT RegVT = AsmNodeOperands[CurOp+1].getValueType();
-          MatchedRegs.RegVTs.push_back(RegVT);
-          MachineRegisterInfo &RegInfo = DAG.getMachineFunction().getRegInfo();
-          for (unsigned i = 0, e = InlineAsm::getNumOperandRegisters(OpFlag);
-               i != e; ++i)
-            MatchedRegs.Regs.
-              push_back(RegInfo.createVirtualRegister(TLI.getRegClassFor(RegVT)));
-
-          // Use the produced MatchedRegs object to
-          MatchedRegs.getCopyToRegs(InOperandVal, DAG, getCurDebugLoc(),
-                                    Chain, &Flag);
-          MatchedRegs.AddInlineAsmOperands(1 /*REGUSE*/,
-                                           true, OpInfo.getMatchedOperand(),
-                                           DAG, AsmNodeOperands);
-          break;
-        } else {
-          assert(((OpFlag & 7) == 4) && "Unknown matching constraint!");
-          assert((InlineAsm::getNumOperandRegisters(OpFlag)) == 1 &&
-                 "Unexpected number of operands");
-          // Add information to the INLINEASM node to know about this input.
-          // See InlineAsm.h isUseOperandTiedToDef.
-          OpFlag |= 0x80000000 | (OpInfo.getMatchedOperand() << 16);
-          AsmNodeOperands.push_back(DAG.getTargetConstant(OpFlag,
-                                                          TLI.getPointerTy()));
-          AsmNodeOperands.push_back(AsmNodeOperands[CurOp+1]);
-          break;
-        }
-      }
-
-      if (OpInfo.ConstraintType == TargetLowering::C_Other) {
-        assert(!OpInfo.isIndirect &&
-               "Don't know how to handle indirect other inputs yet!");
-
-        std::vector Ops;
-        TLI.LowerAsmOperandForConstraint(InOperandVal, OpInfo.ConstraintCode[0],
-                                         hasMemory, Ops, DAG);
-        if (Ops.empty()) {
-          llvm_report_error("Invalid operand for inline asm"
-                            " constraint '" + OpInfo.ConstraintCode + "'!");
-        }
-
-        // Add information to the INLINEASM node to know about this input.
-        unsigned ResOpType = 3 /*IMM*/ | (Ops.size() << 3);
-        AsmNodeOperands.push_back(DAG.getTargetConstant(ResOpType,
-                                                        TLI.getPointerTy()));
-        AsmNodeOperands.insert(AsmNodeOperands.end(), Ops.begin(), Ops.end());
-        break;
-      } else if (OpInfo.ConstraintType == TargetLowering::C_Memory) {
-        assert(OpInfo.isIndirect && "Operand must be indirect to be a mem!");
-        assert(InOperandVal.getValueType() == TLI.getPointerTy() &&
-               "Memory operands expect pointer values");
-
-        // Add information to the INLINEASM node to know about this input.
-        unsigned ResOpType = 4/*MEM*/ | (1<<3);
-        AsmNodeOperands.push_back(DAG.getTargetConstant(ResOpType,
-                                                        TLI.getPointerTy()));
-        AsmNodeOperands.push_back(InOperandVal);
-        break;
-      }
-
-      assert((OpInfo.ConstraintType == TargetLowering::C_RegisterClass ||
-              OpInfo.ConstraintType == TargetLowering::C_Register) &&
-             "Unknown constraint type!");
-      assert(!OpInfo.isIndirect &&
-             "Don't know how to handle indirect register inputs yet!");
-
-      // Copy the input into the appropriate registers.
-      if (OpInfo.AssignedRegs.Regs.empty()) {
-        llvm_report_error("Couldn't allocate input reg for"
-                          " constraint '"+ OpInfo.ConstraintCode +"'!");
-      }
-
-      OpInfo.AssignedRegs.getCopyToRegs(InOperandVal, DAG, getCurDebugLoc(),
-                                        Chain, &Flag);
-
-      OpInfo.AssignedRegs.AddInlineAsmOperands(1/*REGUSE*/, false, 0,
-                                               DAG, AsmNodeOperands);
-      break;
-    }
-    case InlineAsm::isClobber: {
-      // Add the clobbered value to the operand list, so that the register
-      // allocator is aware that the physreg got clobbered.
-      if (!OpInfo.AssignedRegs.Regs.empty())
-        OpInfo.AssignedRegs.AddInlineAsmOperands(6 /* EARLYCLOBBER REGDEF */,
-                                                 false, 0, DAG,AsmNodeOperands);
-      break;
-    }
-    }
-  }
-
-  // Finish up input operands.
-  AsmNodeOperands[0] = Chain;
-  if (Flag.getNode()) AsmNodeOperands.push_back(Flag);
-
-  Chain = DAG.getNode(ISD::INLINEASM, getCurDebugLoc(),
-                      DAG.getVTList(MVT::Other, MVT::Flag),
-                      &AsmNodeOperands[0], AsmNodeOperands.size());
-  Flag = Chain.getValue(1);
-
-  // If this asm returns a register value, copy the result from that register
-  // and set it as the value of the call.
-  if (!RetValRegs.Regs.empty()) {
-    SDValue Val = RetValRegs.getCopyFromRegs(DAG, getCurDebugLoc(),
-                                             Chain, &Flag);
-
-    // FIXME: Why don't we do this for inline asms with MRVs?
-    if (CS.getType()->isSingleValueType() && CS.getType()->isSized()) {
-      EVT ResultType = TLI.getValueType(CS.getType());
-
-      // If any of the results of the inline asm is a vector, it may have the
-      // wrong width/num elts. This can happen for register classes that can
-      // contain multiple different value types. The preg or vreg allocated may
-      // not have the same VT as was expected. Convert it to the right type
-      // with bit_convert.
-      if (ResultType != Val.getValueType() && Val.getValueType().isVector()) {
-        Val = DAG.getNode(ISD::BIT_CONVERT, getCurDebugLoc(),
-                          ResultType, Val);
-
-      } else if (ResultType != Val.getValueType() &&
-                 ResultType.isInteger() && Val.getValueType().isInteger()) {
-        // If a result value was tied to an input value, the computed result may
-        // have a wider width than the expected result. Extract the relevant
-        // portion.
-        Val = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), ResultType, Val);
-      }
-
-      assert(ResultType == Val.getValueType() && "Asm result value mismatch!");
-    }
-
-    setValue(CS.getInstruction(), Val);
-    // Don't need to use this as a chain in this case.
-    if (!IA->hasSideEffects() && !hasMemory && IndirectStoresToEmit.empty())
-      return;
-  }
-
-  std::vector > StoresToEmit;
-
-  // Process indirect outputs, first output all of the flagged copies out of
-  // physregs.
-  for (unsigned i = 0, e = IndirectStoresToEmit.size(); i != e; ++i) {
-    RegsForValue &OutRegs = IndirectStoresToEmit[i].first;
-    Value *Ptr = IndirectStoresToEmit[i].second;
-    SDValue OutVal = OutRegs.getCopyFromRegs(DAG, getCurDebugLoc(),
-                                             Chain, &Flag);
-    StoresToEmit.push_back(std::make_pair(OutVal, Ptr));
-
-  }
-
-  // Emit the non-flagged stores from the physregs.
-  SmallVector OutChains;
-  for (unsigned i = 0, e = StoresToEmit.size(); i != e; ++i)
-    OutChains.push_back(DAG.getStore(Chain, getCurDebugLoc(),
-                                     StoresToEmit[i].first,
-                                     getValue(StoresToEmit[i].second),
-                                     StoresToEmit[i].second, 0));
-  if (!OutChains.empty())
-    Chain = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), MVT::Other,
-                        &OutChains[0], OutChains.size());
-  DAG.setRoot(Chain);
-}
-
-void SelectionDAGLowering::visitVAStart(CallInst &I) {
-  DAG.setRoot(DAG.getNode(ISD::VASTART, getCurDebugLoc(),
-                          MVT::Other, getRoot(),
-                          getValue(I.getOperand(1)),
-                          DAG.getSrcValue(I.getOperand(1))));
-}
-
-void SelectionDAGLowering::visitVAArg(VAArgInst &I) {
-  SDValue V = DAG.getVAArg(TLI.getValueType(I.getType()), getCurDebugLoc(),
-                           getRoot(), getValue(I.getOperand(0)),
-                           DAG.getSrcValue(I.getOperand(0)));
-  setValue(&I, V);
-  DAG.setRoot(V.getValue(1));
-}
-
-void SelectionDAGLowering::visitVAEnd(CallInst &I) {
-  DAG.setRoot(DAG.getNode(ISD::VAEND, getCurDebugLoc(),
-                          MVT::Other, getRoot(),
-                          getValue(I.getOperand(1)),
-                          DAG.getSrcValue(I.getOperand(1))));
-}
-
-void SelectionDAGLowering::visitVACopy(CallInst &I) {
-  DAG.setRoot(DAG.getNode(ISD::VACOPY, getCurDebugLoc(),
-                          MVT::Other, getRoot(),
-                          getValue(I.getOperand(1)),
-                          getValue(I.getOperand(2)),
-                          DAG.getSrcValue(I.getOperand(1)),
-                          DAG.getSrcValue(I.getOperand(2))));
-}
-
-/// TargetLowering::LowerCallTo - This is the default LowerCallTo
-/// implementation, which just calls LowerCall.
-/// FIXME: When all targets are
-/// migrated to using LowerCall, this hook should be integrated into SDISel.
-std::pair
-TargetLowering::LowerCallTo(SDValue Chain, const Type *RetTy,
-                            bool RetSExt, bool RetZExt, bool isVarArg,
-                            bool isInreg, unsigned NumFixedArgs,
-                            CallingConv::ID CallConv, bool isTailCall,
-                            bool isReturnValueUsed,
-                            SDValue Callee,
-                            ArgListTy &Args, SelectionDAG &DAG, DebugLoc dl) {
-
-  assert((!isTailCall || PerformTailCallOpt) &&
-         "isTailCall set when tail-call optimizations are disabled!");
-
-  // Handle all of the outgoing arguments.
-  SmallVector Outs;
-  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
-    SmallVector ValueVTs;
-    ComputeValueVTs(*this, Args[i].Ty, ValueVTs);
-    for (unsigned Value = 0, NumValues = ValueVTs.size();
-         Value != NumValues; ++Value) {
-      EVT VT = ValueVTs[Value];
-      const Type *ArgTy = VT.getTypeForEVT(RetTy->getContext());
-      SDValue Op = SDValue(Args[i].Node.getNode(),
-                           Args[i].Node.getResNo() + Value);
-      ISD::ArgFlagsTy Flags;
-      unsigned OriginalAlignment =
-        getTargetData()->getABITypeAlignment(ArgTy);
-
-      if (Args[i].isZExt)
-        Flags.setZExt();
-      if (Args[i].isSExt)
-        Flags.setSExt();
-      if (Args[i].isInReg)
-        Flags.setInReg();
-      if (Args[i].isSRet)
-        Flags.setSRet();
-      if (Args[i].isByVal) {
-        Flags.setByVal();
-        const PointerType *Ty = cast(Args[i].Ty);
-        const Type *ElementTy = Ty->getElementType();
-        unsigned FrameAlign = getByValTypeAlignment(ElementTy);
-        unsigned FrameSize = getTargetData()->getTypeAllocSize(ElementTy);
-        // For ByVal, alignment should come from FE. BE will guess if this
-        // info is not there but there are cases it cannot get right.
-        if (Args[i].Alignment)
-          FrameAlign = Args[i].Alignment;
-        Flags.setByValAlign(FrameAlign);
-        Flags.setByValSize(FrameSize);
-      }
-      if (Args[i].isNest)
-        Flags.setNest();
-      Flags.setOrigAlign(OriginalAlignment);
-
-      EVT PartVT = getRegisterType(RetTy->getContext(), VT);
-      unsigned NumParts = getNumRegisters(RetTy->getContext(), VT);
-      SmallVector Parts(NumParts);
-      ISD::NodeType ExtendKind = ISD::ANY_EXTEND;
-
-      if (Args[i].isSExt)
-        ExtendKind = ISD::SIGN_EXTEND;
-      else if (Args[i].isZExt)
-        ExtendKind = ISD::ZERO_EXTEND;
-
-      getCopyToParts(DAG, dl, Op, &Parts[0], NumParts, PartVT, ExtendKind);
-
-      for (unsigned j = 0; j != NumParts; ++j) {
-        // if it isn't first piece, alignment must be 1
-        ISD::OutputArg MyFlags(Flags, Parts[j], i < NumFixedArgs);
-        if (NumParts > 1 && j == 0)
-          MyFlags.Flags.setSplit();
-        else if (j != 0)
-          MyFlags.Flags.setOrigAlign(1);
-
-        Outs.push_back(MyFlags);
-      }
-    }
-  }
-
-  // Handle the incoming return values from the call.
-  SmallVector Ins;
-  SmallVector RetTys;
-  ComputeValueVTs(*this, RetTy, RetTys);
-  for (unsigned I = 0, E = RetTys.size(); I != E; ++I) {
-    EVT VT = RetTys[I];
-    EVT RegisterVT = getRegisterType(RetTy->getContext(), VT);
-    unsigned NumRegs = getNumRegisters(RetTy->getContext(), VT);
-    for (unsigned i = 0; i != NumRegs; ++i) {
-      ISD::InputArg MyFlags;
-      MyFlags.VT = RegisterVT;
-      MyFlags.Used = isReturnValueUsed;
-      if (RetSExt)
-        MyFlags.Flags.setSExt();
-      if (RetZExt)
-        MyFlags.Flags.setZExt();
-      if (isInreg)
-        MyFlags.Flags.setInReg();
-      Ins.push_back(MyFlags);
-    }
-  }
-
-  // Check if target-dependent constraints permit a tail call here.
-  // Target-independent constraints should be checked by the caller.
-  if (isTailCall &&
-      !IsEligibleForTailCallOptimization(Callee, CallConv, isVarArg, Ins, DAG))
-    isTailCall = false;
-
-  SmallVector InVals;
-  Chain = LowerCall(Chain, Callee, CallConv, isVarArg, isTailCall,
-                    Outs, Ins, dl, DAG, InVals);
-
-  // Verify that the target's LowerCall behaved as expected.
-  assert(Chain.getNode() && Chain.getValueType() == MVT::Other &&
-         "LowerCall didn't return a valid chain!");
-  assert((!isTailCall || InVals.empty()) &&
-         "LowerCall emitted a return value for a tail call!");
-  assert((isTailCall || InVals.size() == Ins.size()) &&
-         "LowerCall didn't emit the correct number of values!");
-  DEBUG(for (unsigned i = 0, e = Ins.size(); i != e; ++i) {
-          assert(InVals[i].getNode() &&
-                 "LowerCall emitted a null value!");
-          assert(Ins[i].VT == InVals[i].getValueType() &&
-                 "LowerCall emitted a value with the wrong type!");
-        });
-
-  // For a tail call, the return value is merely live-out and there aren't
-  // any nodes in the DAG representing it. Return a special value to
-  // indicate that a tail call has been emitted and no more Instructions
-  // should be processed in the current block.
-  if (isTailCall) {
-    DAG.setRoot(Chain);
-    return std::make_pair(SDValue(), SDValue());
-  }
-
-  // Collect the legal value parts into potentially illegal values
-  // that correspond to the original function's return values.
-  ISD::NodeType AssertOp = ISD::DELETED_NODE;
-  if (RetSExt)
-    AssertOp = ISD::AssertSext;
-  else if (RetZExt)
-    AssertOp = ISD::AssertZext;
-  SmallVector ReturnValues;
-  unsigned CurReg = 0;
-  for (unsigned I = 0, E = RetTys.size(); I != E; ++I) {
-    EVT VT = RetTys[I];
-    EVT RegisterVT = getRegisterType(RetTy->getContext(), VT);
-    unsigned NumRegs = getNumRegisters(RetTy->getContext(), VT);
-
-    SDValue ReturnValue =
-      getCopyFromParts(DAG, dl, &InVals[CurReg], NumRegs, RegisterVT, VT,
-                       AssertOp);
-    ReturnValues.push_back(ReturnValue);
-    CurReg += NumRegs;
-  }
-
-  // For a function returning void, there is no return value. We can't create
-  // such a node, so we just return a null return value in that case. In
-  // that case, nothing will actualy look at the value.
-  if (ReturnValues.empty())
-    return std::make_pair(SDValue(), Chain);
-
-  SDValue Res = DAG.getNode(ISD::MERGE_VALUES, dl,
-                            DAG.getVTList(&RetTys[0], RetTys.size()),
-                            &ReturnValues[0], ReturnValues.size());
-
-  return std::make_pair(Res, Chain);
-}
-
-void TargetLowering::LowerOperationWrapper(SDNode *N,
-                                           SmallVectorImpl &Results,
-                                           SelectionDAG &DAG) {
-  SDValue Res = LowerOperation(SDValue(N, 0), DAG);
-  if (Res.getNode())
-    Results.push_back(Res);
-}
-
-SDValue TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
-  llvm_unreachable("LowerOperation not implemented for this target!");
-  return SDValue();
-}
-
-
-void SelectionDAGLowering::CopyValueToVirtualRegister(Value *V, unsigned Reg) {
-  SDValue Op = getValue(V);
-  assert((Op.getOpcode() != ISD::CopyFromReg ||
-          cast(Op.getOperand(1))->getReg() != Reg) &&
-         "Copy from a reg to the same reg!");
-  assert(!TargetRegisterInfo::isPhysicalRegister(Reg) && "Is a physreg");
-
-  RegsForValue RFV(V->getContext(), TLI, Reg, V->getType());
-  SDValue Chain = DAG.getEntryNode();
-  RFV.getCopyToRegs(Op, DAG, getCurDebugLoc(), Chain, 0);
-  PendingExports.push_back(Chain);
-}
-
-#include "llvm/CodeGen/SelectionDAGISel.h"
-
-void SelectionDAGISel::LowerArguments(BasicBlock *LLVMBB) {
-  // If this is the entry block, emit arguments.
-  Function &F = *LLVMBB->getParent();
-  SelectionDAG &DAG = SDL->DAG;
-  SDValue OldRoot = DAG.getRoot();
-  DebugLoc dl = SDL->getCurDebugLoc();
-  const TargetData *TD = TLI.getTargetData();
-  SmallVector Ins;
-
-  // Check whether the function can return without sret-demotion.
-  SmallVector OutVTs;
-  SmallVector OutsFlags;
-  getReturnInfo(F.getReturnType(), F.getAttributes().getRetAttributes(),
-                OutVTs, OutsFlags, TLI);
-  FunctionLoweringInfo &FLI = DAG.getFunctionLoweringInfo();
-
-  FLI.CanLowerReturn = TLI.CanLowerReturn(F.getCallingConv(), F.isVarArg(),
-                                          OutVTs, OutsFlags, DAG);
-  if (!FLI.CanLowerReturn) {
-    // Put in an sret pointer parameter before all the other parameters.
-    SmallVector ValueVTs;
-    ComputeValueVTs(TLI, PointerType::getUnqual(F.getReturnType()), ValueVTs);
-
-    // NOTE: Assuming that a pointer will never break down to more than one VT
-    // or one register.
-    ISD::ArgFlagsTy Flags;
-    Flags.setSRet();
-    EVT RegisterVT = TLI.getRegisterType(*CurDAG->getContext(), ValueVTs[0]);
-    ISD::InputArg RetArg(Flags, RegisterVT, true);
-    Ins.push_back(RetArg);
-  }
-
-  // Set up the incoming argument description vector.
-  unsigned Idx = 1;
-  for (Function::arg_iterator I = F.arg_begin(), E = F.arg_end();
-       I != E; ++I, ++Idx) {
-    SmallVector ValueVTs;
-    ComputeValueVTs(TLI, I->getType(), ValueVTs);
-    bool isArgValueUsed = !I->use_empty();
-    for (unsigned Value = 0, NumValues = ValueVTs.size();
-         Value != NumValues; ++Value) {
-      EVT VT = ValueVTs[Value];
-      const Type *ArgTy = VT.getTypeForEVT(*DAG.getContext());
-      ISD::ArgFlagsTy Flags;
-      unsigned OriginalAlignment =
-        TD->getABITypeAlignment(ArgTy);
-
-      if (F.paramHasAttr(Idx, Attribute::ZExt))
-        Flags.setZExt();
-      if (F.paramHasAttr(Idx, Attribute::SExt))
-        Flags.setSExt();
-      if (F.paramHasAttr(Idx, Attribute::InReg))
-        Flags.setInReg();
-      if (F.paramHasAttr(Idx, Attribute::StructRet))
-        Flags.setSRet();
-      if (F.paramHasAttr(Idx, Attribute::ByVal)) {
-        Flags.setByVal();
-        const PointerType *Ty = cast(I->getType());
-        const Type *ElementTy = Ty->getElementType();
-        unsigned FrameAlign = TLI.getByValTypeAlignment(ElementTy);
-        unsigned FrameSize = TD->getTypeAllocSize(ElementTy);
-        // For ByVal, alignment should be passed from FE. BE will guess if
-        // this info is not there but there are cases it cannot get right.
-        if (F.getParamAlignment(Idx))
-          FrameAlign = F.getParamAlignment(Idx);
-        Flags.setByValAlign(FrameAlign);
-        Flags.setByValSize(FrameSize);
-      }
-      if (F.paramHasAttr(Idx, Attribute::Nest))
-        Flags.setNest();
-      Flags.setOrigAlign(OriginalAlignment);
-
-      EVT RegisterVT = TLI.getRegisterType(*CurDAG->getContext(), VT);
-      unsigned NumRegs = TLI.getNumRegisters(*CurDAG->getContext(), VT);
-      for (unsigned i = 0; i != NumRegs; ++i) {
-        ISD::InputArg MyFlags(Flags, RegisterVT, isArgValueUsed);
-        if (NumRegs > 1 && i == 0)
-          MyFlags.Flags.setSplit();
-        // if it isn't first piece, alignment must be 1
-        else if (i > 0)
-          MyFlags.Flags.setOrigAlign(1);
-        Ins.push_back(MyFlags);
-      }
-    }
-  }
-
-  // Call the target to set up the argument values.
-  SmallVector InVals;
-  SDValue NewRoot = TLI.LowerFormalArguments(DAG.getRoot(), F.getCallingConv(),
-                                             F.isVarArg(), Ins,
-                                             dl, DAG, InVals);
-
-  // Verify that the target's LowerFormalArguments behaved as expected.
-  assert(NewRoot.getNode() && NewRoot.getValueType() == MVT::Other &&
-         "LowerFormalArguments didn't return a valid chain!");
-  assert(InVals.size() == Ins.size() &&
-         "LowerFormalArguments didn't emit the correct number of values!");
-  DEBUG(for (unsigned i = 0, e = Ins.size(); i != e; ++i) {
-          assert(InVals[i].getNode() &&
-                 "LowerFormalArguments emitted a null value!");
-          assert(Ins[i].VT == InVals[i].getValueType() &&
-                 "LowerFormalArguments emitted a value with the wrong type!");
-        });
-
-  // Update the DAG with the new chain value resulting from argument lowering.
-  DAG.setRoot(NewRoot);
-
-  // Set up the argument values.
-  unsigned i = 0;
-  Idx = 1;
-  if (!FLI.CanLowerReturn) {
-    // Create a virtual register for the sret pointer, and put in a copy
-    // from the sret argument into it.
- SmallVector ValueVTs; - ComputeValueVTs(TLI, PointerType::getUnqual(F.getReturnType()), ValueVTs); - EVT VT = ValueVTs[0]; - EVT RegVT = TLI.getRegisterType(*CurDAG->getContext(), VT); - ISD::NodeType AssertOp = ISD::DELETED_NODE; - SDValue ArgValue = getCopyFromParts(DAG, dl, &InVals[0], 1, RegVT, - VT, AssertOp); - - MachineFunction& MF = SDL->DAG.getMachineFunction(); - MachineRegisterInfo& RegInfo = MF.getRegInfo(); - unsigned SRetReg = RegInfo.createVirtualRegister(TLI.getRegClassFor(RegVT)); - FLI.DemoteRegister = SRetReg; - NewRoot = SDL->DAG.getCopyToReg(NewRoot, SDL->getCurDebugLoc(), SRetReg, ArgValue); - DAG.setRoot(NewRoot); - - // i indexes lowered arguments. Bump it past the hidden sret argument. - // Idx indexes LLVM arguments. Don't touch it. - ++i; - } - for (Function::arg_iterator I = F.arg_begin(), E = F.arg_end(); I != E; - ++I, ++Idx) { - SmallVector ArgValues; - SmallVector ValueVTs; - ComputeValueVTs(TLI, I->getType(), ValueVTs); - unsigned NumValues = ValueVTs.size(); - for (unsigned Value = 0; Value != NumValues; ++Value) { - EVT VT = ValueVTs[Value]; - EVT PartVT = TLI.getRegisterType(*CurDAG->getContext(), VT); - unsigned NumParts = TLI.getNumRegisters(*CurDAG->getContext(), VT); - - if (!I->use_empty()) { - ISD::NodeType AssertOp = ISD::DELETED_NODE; - if (F.paramHasAttr(Idx, Attribute::SExt)) - AssertOp = ISD::AssertSext; - else if (F.paramHasAttr(Idx, Attribute::ZExt)) - AssertOp = ISD::AssertZext; - - ArgValues.push_back(getCopyFromParts(DAG, dl, &InVals[i], NumParts, - PartVT, VT, AssertOp)); - } - i += NumParts; - } - if (!I->use_empty()) { - SDL->setValue(I, DAG.getMergeValues(&ArgValues[0], NumValues, - SDL->getCurDebugLoc())); - // If this argument is live outside of the entry block, insert a copy from - // whereever we got it to the vreg that other BB's will reference it as. 
- SDL->CopyToExportRegsIfNeeded(I); - } - } - assert(i == InVals.size() && "Argument register count mismatch!"); - - // Finally, if the target has anything special to do, allow it to do so. - // FIXME: this should insert code into the DAG! - EmitFunctionEntryCode(F, SDL->DAG.getMachineFunction()); -} - -/// Handle PHI nodes in successor blocks. Emit code into the SelectionDAG to -/// ensure constants are generated when needed. Remember the virtual registers -/// that need to be added to the Machine PHI nodes as input. We cannot just -/// directly add them, because expansion might result in multiple MBB's for one -/// BB. As such, the start of the BB might correspond to a different MBB than -/// the end. -/// -void -SelectionDAGISel::HandlePHINodesInSuccessorBlocks(BasicBlock *LLVMBB) { - TerminatorInst *TI = LLVMBB->getTerminator(); - - SmallPtrSet SuccsHandled; - - // Check successor nodes' PHI nodes that expect a constant to be available - // from this block. - for (unsigned succ = 0, e = TI->getNumSuccessors(); succ != e; ++succ) { - BasicBlock *SuccBB = TI->getSuccessor(succ); - if (!isa(SuccBB->begin())) continue; - MachineBasicBlock *SuccMBB = FuncInfo->MBBMap[SuccBB]; - - // If this terminator has multiple identical successors (common for - // switches), only handle each succ once. - if (!SuccsHandled.insert(SuccMBB)) continue; - - MachineBasicBlock::iterator MBBI = SuccMBB->begin(); - PHINode *PN; - - // At this point we know that there is a 1-1 correspondence between LLVM PHI - // nodes and Machine PHI nodes, but the incoming operands have not been - // emitted yet. - for (BasicBlock::iterator I = SuccBB->begin(); - (PN = dyn_cast(I)); ++I) { - // Ignore dead phi's. 
- if (PN->use_empty()) continue; - - unsigned Reg; - Value *PHIOp = PN->getIncomingValueForBlock(LLVMBB); - - if (Constant *C = dyn_cast(PHIOp)) { - unsigned &RegOut = SDL->ConstantsOut[C]; - if (RegOut == 0) { - RegOut = FuncInfo->CreateRegForValue(C); - SDL->CopyValueToVirtualRegister(C, RegOut); - } - Reg = RegOut; - } else { - Reg = FuncInfo->ValueMap[PHIOp]; - if (Reg == 0) { - assert(isa(PHIOp) && - FuncInfo->StaticAllocaMap.count(cast(PHIOp)) && - "Didn't codegen value into a register!??"); - Reg = FuncInfo->CreateRegForValue(PHIOp); - SDL->CopyValueToVirtualRegister(PHIOp, Reg); - } - } - - // Remember that this register needs to added to the machine PHI node as - // the input for this MBB. - SmallVector ValueVTs; - ComputeValueVTs(TLI, PN->getType(), ValueVTs); - for (unsigned vti = 0, vte = ValueVTs.size(); vti != vte; ++vti) { - EVT VT = ValueVTs[vti]; - unsigned NumRegisters = TLI.getNumRegisters(*CurDAG->getContext(), VT); - for (unsigned i = 0, e = NumRegisters; i != e; ++i) - SDL->PHINodesToUpdate.push_back(std::make_pair(MBBI++, Reg+i)); - Reg += NumRegisters; - } - } - } - SDL->ConstantsOut.clear(); -} - -/// This is the Fast-ISel version of HandlePHINodesInSuccessorBlocks. It only -/// supports legal types, and it emits MachineInstrs directly instead of -/// creating SelectionDAG nodes. -/// -bool -SelectionDAGISel::HandlePHINodesInSuccessorBlocksFast(BasicBlock *LLVMBB, - FastISel *F) { - TerminatorInst *TI = LLVMBB->getTerminator(); - - SmallPtrSet SuccsHandled; - unsigned OrigNumPHINodesToUpdate = SDL->PHINodesToUpdate.size(); - - // Check successor nodes' PHI nodes that expect a constant to be available - // from this block. 
- for (unsigned succ = 0, e = TI->getNumSuccessors(); succ != e; ++succ) { - BasicBlock *SuccBB = TI->getSuccessor(succ); - if (!isa(SuccBB->begin())) continue; - MachineBasicBlock *SuccMBB = FuncInfo->MBBMap[SuccBB]; - - // If this terminator has multiple identical successors (common for - // switches), only handle each succ once. - if (!SuccsHandled.insert(SuccMBB)) continue; - - MachineBasicBlock::iterator MBBI = SuccMBB->begin(); - PHINode *PN; - - // At this point we know that there is a 1-1 correspondence between LLVM PHI - // nodes and Machine PHI nodes, but the incoming operands have not been - // emitted yet. - for (BasicBlock::iterator I = SuccBB->begin(); - (PN = dyn_cast(I)); ++I) { - // Ignore dead phi's. - if (PN->use_empty()) continue; - - // Only handle legal types. Two interesting things to note here. First, - // by bailing out early, we may leave behind some dead instructions, - // since SelectionDAG's HandlePHINodesInSuccessorBlocks will insert its - // own moves. Second, this check is necessary becuase FastISel doesn't - // use CreateRegForValue to create registers, so it always creates - // exactly one register for each non-void instruction. - EVT VT = TLI.getValueType(PN->getType(), /*AllowUnknown=*/true); - if (VT == MVT::Other || !TLI.isTypeLegal(VT)) { - // Promote MVT::i1. 
- if (VT == MVT::i1) - VT = TLI.getTypeToTransformTo(*CurDAG->getContext(), VT); - else { - SDL->PHINodesToUpdate.resize(OrigNumPHINodesToUpdate); - return false; - } - } - - Value *PHIOp = PN->getIncomingValueForBlock(LLVMBB); - - unsigned Reg = F->getRegForValue(PHIOp); - if (Reg == 0) { - SDL->PHINodesToUpdate.resize(OrigNumPHINodesToUpdate); - return false; - } - SDL->PHINodesToUpdate.push_back(std::make_pair(MBBI++, Reg)); - } - } - - return true; -} Copied: llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h (from r88738, llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h) URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h?p2=llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h&p1=llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h&r1=88738&r2=89667&rev=89667&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h Mon Nov 23 11:16:22 2009 @@ -1,4 +1,4 @@ -//===-- SelectionDAGBuild.h - Selection-DAG building ----------------------===// +//===-- FunctionLoweringInfo.h - Lower functions from LLVM IR to CodeGen --===// // // The LLVM Compiler Infrastructure // @@ -7,77 +7,33 @@ // //===----------------------------------------------------------------------===// // -// This implements routines for translating from LLVM IR into SelectionDAG IR. +// This implements routines for translating functions from LLVM IR into +// Machine IR. 
// //===----------------------------------------------------------------------===// -#ifndef SELECTIONDAGBUILD_H -#define SELECTIONDAGBUILD_H +#ifndef FUNCTIONLOWERINGINFO_H +#define FUNCTIONLOWERINGINFO_H -#include "llvm/Constants.h" -#include "llvm/CodeGen/SelectionDAG.h" #include "llvm/ADT/APInt.h" #include "llvm/ADT/DenseMap.h" #ifndef NDEBUG #include "llvm/ADT/SmallSet.h" #endif -#include "llvm/CodeGen/SelectionDAGNodes.h" #include "llvm/CodeGen/ValueTypes.h" -#include "llvm/Support/CallSite.h" -#include "llvm/Support/ErrorHandling.h" -#include "llvm/Target/TargetMachine.h" #include -#include namespace llvm { -class AliasAnalysis; class AllocaInst; class BasicBlock; -class BitCastInst; -class BranchInst; -class CallInst; -class ExtractElementInst; -class ExtractValueInst; -class FCmpInst; -class FPExtInst; -class FPToSIInst; -class FPToUIInst; -class FPTruncInst; class Function; -class GetElementPtrInst; -class GCFunctionInfo; -class ICmpInst; -class IntToPtrInst; -class IndirectBrInst; -class InvokeInst; -class InsertElementInst; -class InsertValueInst; class Instruction; -class LoadInst; class MachineBasicBlock; class MachineFunction; -class MachineInstr; -class MachineModuleInfo; class MachineRegisterInfo; -class PHINode; -class PtrToIntInst; -class ReturnInst; -class SDISelAsmOperandInfo; -class SExtInst; -class SelectInst; -class ShuffleVectorInst; -class SIToFPInst; -class StoreInst; -class SwitchInst; -class TargetData; class TargetLowering; -class TruncInst; -class UIToFPInst; -class UnreachableInst; -class UnwindInst; -class VAArgInst; -class ZExtInst; +class Value; //===--------------------------------------------------------------------===// /// FunctionLoweringInfo - This contains information that is global to a @@ -103,8 +59,7 @@ /// set - Initialize this FunctionLoweringInfo with the given Function /// and its associated MachineFunction. 
/// - void set(Function &Fn, MachineFunction &MF, SelectionDAG &DAG, - bool EnableFastISel); + void set(Function &Fn, MachineFunction &MF, bool EnableFastISel); /// MBBMap - A mapping from LLVM basic blocks to their machine code entry. DenseMap MBBMap; @@ -153,426 +108,29 @@ /// clear - Clear out all the function-specific state. This returns this /// FunctionLoweringInfo to an empty state, ready to be used for a /// different function. - void clear() { - MBBMap.clear(); - ValueMap.clear(); - StaticAllocaMap.clear(); -#ifndef NDEBUG - CatchInfoLost.clear(); - CatchInfoFound.clear(); -#endif - LiveOutRegInfo.clear(); - } -}; - -//===----------------------------------------------------------------------===// -/// SelectionDAGLowering - This is the common target-independent lowering -/// implementation that is parameterized by a TargetLowering object. -/// Also, targets can overload any lowering method. -/// -class SelectionDAGLowering { - MachineBasicBlock *CurMBB; - - /// CurDebugLoc - current file + line number. Changes as we build the DAG. - DebugLoc CurDebugLoc; - - DenseMap NodeMap; - - /// PendingLoads - Loads are not emitted to the program immediately. We bunch - /// them up and then emit token factor nodes when possible. This allows us to - /// get simple disambiguation between loads without worrying about alias - /// analysis. - SmallVector PendingLoads; - - /// PendingExports - CopyToReg nodes that copy values to virtual registers - /// for export to other blocks need to be emitted before any terminator - /// instruction, but they have no other ordering requirements. We bunch them - /// up and the emit a single tokenfactor for them just before terminator - /// instructions. - SmallVector PendingExports; - - /// Case - A struct to record the Value for a switch case, and the - /// case's target basic block. 
- struct Case { - Constant* Low; - Constant* High; - MachineBasicBlock* BB; - - Case() : Low(0), High(0), BB(0) { } - Case(Constant* low, Constant* high, MachineBasicBlock* bb) : - Low(low), High(high), BB(bb) { } - APInt size() const { - const APInt &rHigh = cast(High)->getValue(); - const APInt &rLow = cast(Low)->getValue(); - return (rHigh - rLow + 1ULL); - } - }; - - struct CaseBits { - uint64_t Mask; - MachineBasicBlock* BB; - unsigned Bits; - - CaseBits(uint64_t mask, MachineBasicBlock* bb, unsigned bits): - Mask(mask), BB(bb), Bits(bits) { } - }; - - typedef std::vector CaseVector; - typedef std::vector CaseBitsVector; - typedef CaseVector::iterator CaseItr; - typedef std::pair CaseRange; - - /// CaseRec - A struct with ctor used in lowering switches to a binary tree - /// of conditional branches. - struct CaseRec { - CaseRec(MachineBasicBlock *bb, Constant *lt, Constant *ge, CaseRange r) : - CaseBB(bb), LT(lt), GE(ge), Range(r) {} - - /// CaseBB - The MBB in which to emit the compare and branch - MachineBasicBlock *CaseBB; - /// LT, GE - If nonzero, we know the current case value must be less-than or - /// greater-than-or-equal-to these Constants. - Constant *LT; - Constant *GE; - /// Range - A pair of iterators representing the range of case values to be - /// processed at this point in the binary search tree. - CaseRange Range; - }; - - typedef std::vector CaseRecVector; - - /// The comparison function for sorting the switch case values in the vector. - /// WARNING: Case ranges should be disjoint! 
- struct CaseCmp { - bool operator () (const Case& C1, const Case& C2) { - assert(isa(C1.Low) && isa(C2.High)); - const ConstantInt* CI1 = cast(C1.Low); - const ConstantInt* CI2 = cast(C2.High); - return CI1->getValue().slt(CI2->getValue()); - } - }; - - struct CaseBitsCmp { - bool operator () (const CaseBits& C1, const CaseBits& C2) { - return C1.Bits > C2.Bits; - } - }; - - size_t Clusterify(CaseVector& Cases, const SwitchInst &SI); - - /// CaseBlock - This structure is used to communicate between SDLowering and - /// SDISel for the code generation of additional basic blocks needed by multi- - /// case switch statements. - struct CaseBlock { - CaseBlock(ISD::CondCode cc, Value *cmplhs, Value *cmprhs, Value *cmpmiddle, - MachineBasicBlock *truebb, MachineBasicBlock *falsebb, - MachineBasicBlock *me) - : CC(cc), CmpLHS(cmplhs), CmpMHS(cmpmiddle), CmpRHS(cmprhs), - TrueBB(truebb), FalseBB(falsebb), ThisBB(me) {} - // CC - the condition code to use for the case block's setcc node - ISD::CondCode CC; - // CmpLHS/CmpRHS/CmpMHS - The LHS/MHS/RHS of the comparison to emit. - // Emit by default LHS op RHS. MHS is used for range comparisons: - // If MHS is not null: (LHS <= MHS) and (MHS <= RHS). - Value *CmpLHS, *CmpMHS, *CmpRHS; - // TrueBB/FalseBB - the block to branch to if the setcc is true/false. - MachineBasicBlock *TrueBB, *FalseBB; - // ThisBB - the block into which to emit the code for the setcc and branches - MachineBasicBlock *ThisBB; - }; - struct JumpTable { - JumpTable(unsigned R, unsigned J, MachineBasicBlock *M, - MachineBasicBlock *D): Reg(R), JTI(J), MBB(M), Default(D) {} - - /// Reg - the virtual register containing the index of the jump table entry - //. to jump to. - unsigned Reg; - /// JTI - the JumpTableIndex for this jump table in the function. - unsigned JTI; - /// MBB - the MBB into which to emit the code for the indirect jump. - MachineBasicBlock *MBB; - /// Default - the MBB of the default bb, which is a successor of the range - /// check MBB. 
This is when updating PHI nodes in successors. - MachineBasicBlock *Default; - }; - struct JumpTableHeader { - JumpTableHeader(APInt F, APInt L, Value* SV, MachineBasicBlock* H, - bool E = false): - First(F), Last(L), SValue(SV), HeaderBB(H), Emitted(E) {} - APInt First; - APInt Last; - Value *SValue; - MachineBasicBlock *HeaderBB; - bool Emitted; - }; - typedef std::pair JumpTableBlock; - - struct BitTestCase { - BitTestCase(uint64_t M, MachineBasicBlock* T, MachineBasicBlock* Tr): - Mask(M), ThisBB(T), TargetBB(Tr) { } - uint64_t Mask; - MachineBasicBlock* ThisBB; - MachineBasicBlock* TargetBB; - }; - - typedef SmallVector BitTestInfo; - - struct BitTestBlock { - BitTestBlock(APInt F, APInt R, Value* SV, - unsigned Rg, bool E, - MachineBasicBlock* P, MachineBasicBlock* D, - const BitTestInfo& C): - First(F), Range(R), SValue(SV), Reg(Rg), Emitted(E), - Parent(P), Default(D), Cases(C) { } - APInt First; - APInt Range; - Value *SValue; - unsigned Reg; - bool Emitted; - MachineBasicBlock *Parent; - MachineBasicBlock *Default; - BitTestInfo Cases; - }; - -public: - // TLI - This is information that describes the available target features we - // need for lowering. This indicates when operations are unavailable, - // implemented with a libcall, etc. - TargetLowering &TLI; - SelectionDAG &DAG; - const TargetData *TD; - AliasAnalysis *AA; - - /// SwitchCases - Vector of CaseBlock structures used to communicate - /// SwitchInst code generation information. - std::vector SwitchCases; - /// JTCases - Vector of JumpTable structures used to communicate - /// SwitchInst code generation information. - std::vector JTCases; - /// BitTestCases - Vector of BitTestBlock structures used to communicate - /// SwitchInst code generation information. - std::vector BitTestCases; - - /// PHINodesToUpdate - A list of phi instructions whose operand list will - /// be updated after processing the current basic block. 
- std::vector > PHINodesToUpdate; - - /// EdgeMapping - If an edge from CurMBB to any MBB is changed (e.g. due to - /// scheduler custom lowering), track the change here. - DenseMap EdgeMapping; - - // Emit PHI-node-operand constants only once even if used by multiple - // PHI nodes. - DenseMap ConstantsOut; - - /// FuncInfo - Information about the function as a whole. - /// - FunctionLoweringInfo &FuncInfo; - - /// OptLevel - What optimization level we're generating code for. - /// - CodeGenOpt::Level OptLevel; - - /// GFI - Garbage collection metadata for the function. - GCFunctionInfo *GFI; - - /// HasTailCall - This is set to true if a call in the current - /// block has been translated as a tail call. In this case, - /// no subsequent DAG nodes should be created. - /// - bool HasTailCall; - - LLVMContext *Context; - - SelectionDAGLowering(SelectionDAG &dag, TargetLowering &tli, - FunctionLoweringInfo &funcinfo, - CodeGenOpt::Level ol) - : CurDebugLoc(DebugLoc::getUnknownLoc()), - TLI(tli), DAG(dag), FuncInfo(funcinfo), OptLevel(ol), - HasTailCall(false), - Context(dag.getContext()) { - } - - void init(GCFunctionInfo *gfi, AliasAnalysis &aa); - - /// clear - Clear out the curret SelectionDAG and the associated - /// state and prepare this SelectionDAGLowering object to be used - /// for a new block. This doesn't clear out information about - /// additional blocks that are needed to complete switch lowering - /// or PHI node updating; that information is cleared out as it is - /// consumed. void clear(); - - /// getRoot - Return the current virtual root of the Selection DAG, - /// flushing any PendingLoad items. This must be done before emitting - /// a store or any other node that may need to be ordered after any - /// prior load instructions. - /// - SDValue getRoot(); - - /// getControlRoot - Similar to getRoot, but instead of flushing all the - /// PendingLoad items, flush all the PendingExports items. 
It is necessary - /// to do this before emitting a terminator instruction. - /// - SDValue getControlRoot(); - - DebugLoc getCurDebugLoc() const { return CurDebugLoc; } - void setCurDebugLoc(DebugLoc dl) { CurDebugLoc = dl; } - - void CopyValueToVirtualRegister(Value *V, unsigned Reg); - - void visit(Instruction &I); - - void visit(unsigned Opcode, User &I); - - void setCurrentBasicBlock(MachineBasicBlock *MBB) { CurMBB = MBB; } - - SDValue getValue(const Value *V); - - void setValue(const Value *V, SDValue NewN) { - SDValue &N = NodeMap[V]; - assert(N.getNode() == 0 && "Already set a value for this node!"); - N = NewN; - } - - void GetRegistersForValue(SDISelAsmOperandInfo &OpInfo, - std::set &OutputRegs, - std::set &InputRegs); - - void FindMergedConditions(Value *Cond, MachineBasicBlock *TBB, - MachineBasicBlock *FBB, MachineBasicBlock *CurBB, - unsigned Opc); - void EmitBranchForMergedCondition(Value *Cond, MachineBasicBlock *TBB, - MachineBasicBlock *FBB, - MachineBasicBlock *CurBB); - bool ShouldEmitAsBranches(const std::vector &Cases); - bool isExportableFromCurrentBlock(Value *V, const BasicBlock *FromBB); - void CopyToExportRegsIfNeeded(Value *V); - void ExportFromCurrentBlock(Value *V); - void LowerCallTo(CallSite CS, SDValue Callee, bool IsTailCall, - MachineBasicBlock *LandingPad = NULL); - -private: - // Terminator instructions. 
- void visitRet(ReturnInst &I); - void visitBr(BranchInst &I); - void visitSwitch(SwitchInst &I); - void visitIndirectBr(IndirectBrInst &I); - void visitUnreachable(UnreachableInst &I) { /* noop */ } - - // Helpers for visitSwitch - bool handleSmallSwitchRange(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default); - bool handleJTSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default); - bool handleBTSplitSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default); - bool handleBitTestsSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default); -public: - void visitSwitchCase(CaseBlock &CB); - void visitBitTestHeader(BitTestBlock &B); - void visitBitTestCase(MachineBasicBlock* NextMBB, - unsigned Reg, - BitTestCase &B); - void visitJumpTable(JumpTable &JT); - void visitJumpTableHeader(JumpTable &JT, JumpTableHeader &JTH); - -private: - // These all get lowered before this pass. 
- void visitInvoke(InvokeInst &I); - void visitUnwind(UnwindInst &I); - - void visitBinary(User &I, unsigned OpCode); - void visitShift(User &I, unsigned Opcode); - void visitAdd(User &I) { visitBinary(I, ISD::ADD); } - void visitFAdd(User &I) { visitBinary(I, ISD::FADD); } - void visitSub(User &I) { visitBinary(I, ISD::SUB); } - void visitFSub(User &I); - void visitMul(User &I) { visitBinary(I, ISD::MUL); } - void visitFMul(User &I) { visitBinary(I, ISD::FMUL); } - void visitURem(User &I) { visitBinary(I, ISD::UREM); } - void visitSRem(User &I) { visitBinary(I, ISD::SREM); } - void visitFRem(User &I) { visitBinary(I, ISD::FREM); } - void visitUDiv(User &I) { visitBinary(I, ISD::UDIV); } - void visitSDiv(User &I) { visitBinary(I, ISD::SDIV); } - void visitFDiv(User &I) { visitBinary(I, ISD::FDIV); } - void visitAnd (User &I) { visitBinary(I, ISD::AND); } - void visitOr (User &I) { visitBinary(I, ISD::OR); } - void visitXor (User &I) { visitBinary(I, ISD::XOR); } - void visitShl (User &I) { visitShift(I, ISD::SHL); } - void visitLShr(User &I) { visitShift(I, ISD::SRL); } - void visitAShr(User &I) { visitShift(I, ISD::SRA); } - void visitICmp(User &I); - void visitFCmp(User &I); - // Visit the conversion instructions - void visitTrunc(User &I); - void visitZExt(User &I); - void visitSExt(User &I); - void visitFPTrunc(User &I); - void visitFPExt(User &I); - void visitFPToUI(User &I); - void visitFPToSI(User &I); - void visitUIToFP(User &I); - void visitSIToFP(User &I); - void visitPtrToInt(User &I); - void visitIntToPtr(User &I); - void visitBitCast(User &I); - - void visitExtractElement(User &I); - void visitInsertElement(User &I); - void visitShuffleVector(User &I); - - void visitExtractValue(ExtractValueInst &I); - void visitInsertValue(InsertValueInst &I); - - void visitGetElementPtr(User &I); - void visitSelect(User &I); - - void visitAlloca(AllocaInst &I); - void visitLoad(LoadInst &I); - void visitStore(StoreInst &I); - void visitPHI(PHINode &I) { } // PHI 
nodes are handled specially. - void visitCall(CallInst &I); - void visitInlineAsm(CallSite CS); - const char *visitIntrinsicCall(CallInst &I, unsigned Intrinsic); - void visitTargetIntrinsic(CallInst &I, unsigned Intrinsic); - - void visitPow(CallInst &I); - void visitExp2(CallInst &I); - void visitExp(CallInst &I); - void visitLog(CallInst &I); - void visitLog2(CallInst &I); - void visitLog10(CallInst &I); - - void visitVAStart(CallInst &I); - void visitVAArg(VAArgInst &I); - void visitVAEnd(CallInst &I); - void visitVACopy(CallInst &I); - - void visitUserOp1(Instruction &I) { - llvm_unreachable("UserOp1 should not exist at instruction selection time!"); - } - void visitUserOp2(Instruction &I) { - llvm_unreachable("UserOp2 should not exist at instruction selection time!"); - } - - const char *implVisitBinaryAtomic(CallInst& I, ISD::NodeType Op); - const char *implVisitAluOverflow(CallInst &I, ISD::NodeType Op); }; -/// AddCatchInfo - Extract the personality and type infos from an eh.selector -/// call, and add them to the specified machine basic block. -void AddCatchInfo(CallInst &I, MachineModuleInfo *MMI, - MachineBasicBlock *MBB); +/// ComputeLinearIndex - Given an LLVM IR aggregate type and a sequence +/// of insertvalue or extractvalue indices that identify a member, return +/// the linearized index of the start of the member. +/// +unsigned ComputeLinearIndex(const TargetLowering &TLI, const Type *Ty, + const unsigned *Indices, + const unsigned *IndicesEnd, + unsigned CurIndex = 0); + +/// ComputeValueVTs - Given an LLVM IR type, compute a sequence of +/// EVTs that represent all the individual underlying +/// non-aggregate types that comprise it. +/// +/// If Offsets is non-null, it points to a vector to be filled in +/// with the in-memory offsets of each of the individual values. 
+/// +void ComputeValueVTs(const TargetLowering &TLI, const Type *Ty, + SmallVectorImpl &ValueVTs, + SmallVectorImpl *Offsets = 0, + uint64_t StartingOffset = 0); } // end namespace llvm Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp?rev=89667&r1=89666&r2=89667&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Mon Nov 23 11:16:22 2009 @@ -13,11 +13,11 @@ #define DEBUG_TYPE "isel" #include "SelectionDAGBuild.h" +#include "FunctionLoweringInfo.h" #include "llvm/ADT/BitVector.h" #include "llvm/ADT/SmallSet.h" #include "llvm/Analysis/AliasAnalysis.h" #include "llvm/Constants.h" -#include "llvm/Constants.h" #include "llvm/CallingConv.h" #include "llvm/DerivedTypes.h" #include "llvm/Function.h" @@ -68,84 +68,6 @@ cl::location(LimitFloatPrecision), cl::init(0)); -/// ComputeLinearIndex - Given an LLVM IR aggregate type and a sequence -/// of insertvalue or extractvalue indices that identify a member, return -/// the linearized index of the start of the member. -/// -static unsigned ComputeLinearIndex(const TargetLowering &TLI, const Type *Ty, - const unsigned *Indices, - const unsigned *IndicesEnd, - unsigned CurIndex = 0) { - // Base case: We're done. - if (Indices && Indices == IndicesEnd) - return CurIndex; - - // Given a struct type, recursively traverse the elements. 
- if (const StructType *STy = dyn_cast(Ty)) { - for (StructType::element_iterator EB = STy->element_begin(), - EI = EB, - EE = STy->element_end(); - EI != EE; ++EI) { - if (Indices && *Indices == unsigned(EI - EB)) - return ComputeLinearIndex(TLI, *EI, Indices+1, IndicesEnd, CurIndex); - CurIndex = ComputeLinearIndex(TLI, *EI, 0, 0, CurIndex); - } - return CurIndex; - } - // Given an array type, recursively traverse the elements. - else if (const ArrayType *ATy = dyn_cast(Ty)) { - const Type *EltTy = ATy->getElementType(); - for (unsigned i = 0, e = ATy->getNumElements(); i != e; ++i) { - if (Indices && *Indices == i) - return ComputeLinearIndex(TLI, EltTy, Indices+1, IndicesEnd, CurIndex); - CurIndex = ComputeLinearIndex(TLI, EltTy, 0, 0, CurIndex); - } - return CurIndex; - } - // We haven't found the type we're looking for, so keep searching. - return CurIndex + 1; -} - -/// ComputeValueVTs - Given an LLVM IR type, compute a sequence of -/// EVTs that represent all the individual underlying -/// non-aggregate types that comprise it. -/// -/// If Offsets is non-null, it points to a vector to be filled in -/// with the in-memory offsets of each of the individual values. -/// -static void ComputeValueVTs(const TargetLowering &TLI, const Type *Ty, - SmallVectorImpl &ValueVTs, - SmallVectorImpl *Offsets = 0, - uint64_t StartingOffset = 0) { - // Given a struct type, recursively traverse the elements. - if (const StructType *STy = dyn_cast(Ty)) { - const StructLayout *SL = TLI.getTargetData()->getStructLayout(STy); - for (StructType::element_iterator EB = STy->element_begin(), - EI = EB, - EE = STy->element_end(); - EI != EE; ++EI) - ComputeValueVTs(TLI, *EI, ValueVTs, Offsets, - StartingOffset + SL->getElementOffset(EI - EB)); - return; - } - // Given an array type, recursively traverse the elements. 
- if (const ArrayType *ATy = dyn_cast(Ty)) { - const Type *EltTy = ATy->getElementType(); - uint64_t EltSize = TLI.getTargetData()->getTypeAllocSize(EltTy); - for (unsigned i = 0, e = ATy->getNumElements(); i != e; ++i) - ComputeValueVTs(TLI, EltTy, ValueVTs, Offsets, - StartingOffset + i * EltSize); - return; - } - // Interpret void as zero return values. - if (Ty == Type::getVoidTy(Ty->getContext())) - return; - // Base case: we can get an EVT for this LLVM IR type. - ValueVTs.push_back(TLI.getValueType(Ty)); - if (Offsets) - Offsets->push_back(StartingOffset); -} - namespace llvm { /// RegsForValue - This struct represents the registers (physical or virtual) /// that a particular set of values is assigned, and the type information about @@ -241,150 +163,6 @@ }; } -/// isUsedOutsideOfDefiningBlock - Return true if this instruction is used by -/// PHI nodes or outside of the basic block that defines it, or used by a -/// switch or atomic instruction, which may expand to multiple basic blocks. -static bool isUsedOutsideOfDefiningBlock(Instruction *I) { - if (isa(I)) return true; - BasicBlock *BB = I->getParent(); - for (Value::use_iterator UI = I->use_begin(), E = I->use_end(); UI != E; ++UI) - if (cast(*UI)->getParent() != BB || isa(*UI)) - return true; - return false; -} - -/// isOnlyUsedInEntryBlock - If the specified argument is only used in the -/// entry block, return true. This includes arguments used by switches, since -/// the switch may expand into multiple basic blocks. -static bool isOnlyUsedInEntryBlock(Argument *A, bool EnableFastISel) { - // With FastISel active, we may be splitting blocks, so force creation - // of virtual registers for all non-dead arguments. - // Don't force virtual registers for byval arguments though, because - // fast-isel can't handle those in all cases. 
- if (EnableFastISel && !A->hasByValAttr()) - return A->use_empty(); - - BasicBlock *Entry = A->getParent()->begin(); - for (Value::use_iterator UI = A->use_begin(), E = A->use_end(); UI != E; ++UI) - if (cast(*UI)->getParent() != Entry || isa(*UI)) - return false; // Use not in entry block. - return true; -} - -FunctionLoweringInfo::FunctionLoweringInfo(TargetLowering &tli) - : TLI(tli) { -} - -void FunctionLoweringInfo::set(Function &fn, MachineFunction &mf, - SelectionDAG &DAG, - bool EnableFastISel) { - Fn = &fn; - MF = &mf; - RegInfo = &MF->getRegInfo(); - - // Create a vreg for each argument register that is not dead and is used - // outside of the entry block for the function. - for (Function::arg_iterator AI = Fn->arg_begin(), E = Fn->arg_end(); - AI != E; ++AI) - if (!isOnlyUsedInEntryBlock(AI, EnableFastISel)) - InitializeRegForValue(AI); - - // Initialize the mapping of values to registers. This is only set up for - // instruction values that are used outside of the block that defines - // them. - Function::iterator BB = Fn->begin(), EB = Fn->end(); - for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I) - if (AllocaInst *AI = dyn_cast(I)) - if (ConstantInt *CUI = dyn_cast(AI->getArraySize())) { - const Type *Ty = AI->getAllocatedType(); - uint64_t TySize = TLI.getTargetData()->getTypeAllocSize(Ty); - unsigned Align = - std::max((unsigned)TLI.getTargetData()->getPrefTypeAlignment(Ty), - AI->getAlignment()); - - TySize *= CUI->getZExtValue(); // Get total allocated size. - if (TySize == 0) TySize = 1; // Don't create zero-sized stack objects. - StaticAllocaMap[AI] = - MF->getFrameInfo()->CreateStackObject(TySize, Align, false); - } - - for (; BB != EB; ++BB) - for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I) - if (!I->use_empty() && isUsedOutsideOfDefiningBlock(I)) - if (!isa(I) || - !StaticAllocaMap.count(cast(I))) - InitializeRegForValue(I); - - // Create an initial MachineBasicBlock for each LLVM BasicBlock in F. 
This - // also creates the initial PHI MachineInstrs, though none of the input - // operands are populated. - for (BB = Fn->begin(), EB = Fn->end(); BB != EB; ++BB) { - MachineBasicBlock *MBB = mf.CreateMachineBasicBlock(BB); - MBBMap[BB] = MBB; - MF->push_back(MBB); - - // Transfer the address-taken flag. This is necessary because there could - // be multiple MachineBasicBlocks corresponding to one BasicBlock, and only - // the first one should be marked. - if (BB->hasAddressTaken()) - MBB->setHasAddressTaken(); - - // Create Machine PHI nodes for LLVM PHI nodes, lowering them as - // appropriate. - PHINode *PN; - DebugLoc DL; - for (BasicBlock::iterator - I = BB->begin(), E = BB->end(); I != E; ++I) { - - PN = dyn_cast(I); - if (!PN || PN->use_empty()) continue; - - unsigned PHIReg = ValueMap[PN]; - assert(PHIReg && "PHI node does not have an assigned virtual register!"); - - SmallVector ValueVTs; - ComputeValueVTs(TLI, PN->getType(), ValueVTs); - for (unsigned vti = 0, vte = ValueVTs.size(); vti != vte; ++vti) { - EVT VT = ValueVTs[vti]; - unsigned NumRegisters = TLI.getNumRegisters(*DAG.getContext(), VT); - const TargetInstrInfo *TII = MF->getTarget().getInstrInfo(); - for (unsigned i = 0; i != NumRegisters; ++i) - BuildMI(MBB, DL, TII->get(TargetInstrInfo::PHI), PHIReg + i); - PHIReg += NumRegisters; - } - } - } -} - -unsigned FunctionLoweringInfo::MakeReg(EVT VT) { - return RegInfo->createVirtualRegister(TLI.getRegClassFor(VT)); -} - -/// CreateRegForValue - Allocate the appropriate number of virtual registers of -/// the correctly promoted or expanded types. Assign these registers -/// consecutive vreg numbers and return the first assigned number. -/// -/// In the case that the given value has struct or array type, this function -/// will assign registers for each member or element. 
-/// -unsigned FunctionLoweringInfo::CreateRegForValue(const Value *V) { - SmallVector ValueVTs; - ComputeValueVTs(TLI, V->getType(), ValueVTs); - - unsigned FirstReg = 0; - for (unsigned Value = 0, e = ValueVTs.size(); Value != e; ++Value) { - EVT ValueVT = ValueVTs[Value]; - EVT RegisterVT = TLI.getRegisterType(V->getContext(), ValueVT); - - unsigned NumRegs = TLI.getNumRegisters(V->getContext(), ValueVT); - for (unsigned i = 0; i != NumRegs; ++i) { - unsigned R = MakeReg(RegisterVT); - if (!FirstReg) FirstReg = R; - } - } - return FirstReg; -} - /// getCopyFromParts - Create a value that contains the specified legal parts /// combined into the value they represent. If the parts combine to a type /// larger then ValueVT then AssertOp can be used to specify whether the extra Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h?rev=89667&r1=89666&r2=89667&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h Mon Nov 23 11:16:22 2009 @@ -45,6 +45,7 @@ class FPToUIInst; class FPTruncInst; class Function; +class FunctionLoweringInfo; class GetElementPtrInst; class GCFunctionInfo; class ICmpInst; @@ -79,92 +80,6 @@ class VAArgInst; class ZExtInst; -//===--------------------------------------------------------------------===// -/// FunctionLoweringInfo - This contains information that is global to a -/// function that is used when lowering a region of the function. -/// -class FunctionLoweringInfo { -public: - TargetLowering &TLI; - Function *Fn; - MachineFunction *MF; - MachineRegisterInfo *RegInfo; - - /// CanLowerReturn - true iff the function's return value can be lowered to - /// registers. 
- bool CanLowerReturn; - - /// DemoteRegister - if CanLowerReturn is false, DemoteRegister is a vreg - /// allocated to hold a pointer to the hidden sret parameter. - unsigned DemoteRegister; - - explicit FunctionLoweringInfo(TargetLowering &TLI); - - /// set - Initialize this FunctionLoweringInfo with the given Function - /// and its associated MachineFunction. - /// - void set(Function &Fn, MachineFunction &MF, SelectionDAG &DAG, - bool EnableFastISel); - - /// MBBMap - A mapping from LLVM basic blocks to their machine code entry. - DenseMap MBBMap; - - /// ValueMap - Since we emit code for the function a basic block at a time, - /// we must remember which virtual registers hold the values for - /// cross-basic-block values. - DenseMap ValueMap; - - /// StaticAllocaMap - Keep track of frame indices for fixed sized allocas in - /// the entry block. This allows the allocas to be efficiently referenced - /// anywhere in the function. - DenseMap StaticAllocaMap; - -#ifndef NDEBUG - SmallSet CatchInfoLost; - SmallSet CatchInfoFound; -#endif - - unsigned MakeReg(EVT VT); - - /// isExportedInst - Return true if the specified value is an instruction - /// exported from its block. - bool isExportedInst(const Value *V) { - return ValueMap.count(V); - } - - unsigned CreateRegForValue(const Value *V); - - unsigned InitializeRegForValue(const Value *V) { - unsigned &R = ValueMap[V]; - assert(R == 0 && "Already initialized this value register!"); - return R = CreateRegForValue(V); - } - - struct LiveOutInfo { - unsigned NumSignBits; - APInt KnownOne, KnownZero; - LiveOutInfo() : NumSignBits(0), KnownOne(1, 0), KnownZero(1, 0) {} - }; - - /// LiveOutRegInfo - Information about live out vregs, indexed by their - /// register number offset by 'FirstVirtualRegister'. - std::vector LiveOutRegInfo; - - /// clear - Clear out all the function-specific state. This returns this - /// FunctionLoweringInfo to an empty state, ready to be used for a - /// different function. 
- void clear() { - MBBMap.clear(); - ValueMap.clear(); - StaticAllocaMap.clear(); -#ifndef NDEBUG - CatchInfoLost.clear(); - CatchInfoFound.clear(); -#endif - LiveOutRegInfo.clear(); - } -}; - //===----------------------------------------------------------------------===// /// SelectionDAGLowering - This is the common target-independent lowering /// implementation that is parameterized by a TargetLowering object. Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp?rev=89667&r1=89666&r2=89667&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Mon Nov 23 11:16:22 2009 @@ -14,6 +14,7 @@ #define DEBUG_TYPE "isel" #include "ScheduleDAGSDNodes.h" #include "SelectionDAGBuild.h" +#include "FunctionLoweringInfo.h" #include "llvm/CodeGen/SelectionDAGISel.h" #include "llvm/Analysis/AliasAnalysis.h" #include "llvm/Analysis/DebugInfo.h" @@ -331,7 +332,7 @@ MachineModuleInfo *MMI = getAnalysisIfAvailable(); DwarfWriter *DW = getAnalysisIfAvailable(); CurDAG->init(*MF, MMI, DW); - FuncInfo->set(Fn, *MF, *CurDAG, EnableFastISel); + FuncInfo->set(Fn, *MF, EnableFastISel); SDL->init(GFI, *AA); for (Function::iterator I = Fn.begin(), E = Fn.end(); I != E; ++I) From kremenek at apple.com Mon Nov 23 11:26:04 2009 From: kremenek at apple.com (Ted Kremenek) Date: Mon, 23 Nov 2009 17:26:04 -0000 Subject: [llvm-commits] [llvm] r89671 - /llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt Message-ID: <200911231726.nANHQ4nm031780@zion.cs.uiuc.edu> Author: kremenek Date: Mon Nov 23 11:26:04 2009 New Revision: 89671 URL: http://llvm.org/viewvc/llvm-project?rev=89671&view=rev Log: Update CMake file. 
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt Modified: llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt?rev=89671&r1=89670&r2=89671&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt Mon Nov 23 11:26:04 2009 @@ -2,6 +2,7 @@ CallingConvLower.cpp DAGCombiner.cpp FastISel.cpp + FunctionLoweringInfo.cpp InstrEmitter.cpp LegalizeDAG.cpp LegalizeFloatTypes.cpp From david_goodwin at apple.com Mon Nov 23 11:34:13 2009 From: david_goodwin at apple.com (David Goodwin) Date: Mon, 23 Nov 2009 17:34:13 -0000 Subject: [llvm-commits] [llvm] r89672 - /llvm/trunk/lib/Target/ARM/ARMScheduleV7.td Message-ID: <200911231734.nANHYDCV032172@zion.cs.uiuc.edu> Author: david_goodwin Date: Mon Nov 23 11:34:12 2009 New Revision: 89672 URL: http://llvm.org/viewvc/llvm-project?rev=89672&view=rev Log: Minor itinerary fixes for FP instructions. 
Modified: llvm/trunk/lib/Target/ARM/ARMScheduleV7.td Modified: llvm/trunk/lib/Target/ARM/ARMScheduleV7.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMScheduleV7.td?rev=89672&r1=89671&r2=89672&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMScheduleV7.td (original) +++ llvm/trunk/lib/Target/ARM/ARMScheduleV7.td Mon Nov 23 11:34:12 2009 @@ -180,7 +180,7 @@ // Double-precision FP Unary InstrItinData, InstrStage<4, [FU_NPipe], 0>, - InstrStage<4, [FU_NLSPipe]>]>, + InstrStage<4, [FU_NLSPipe]>], [4, 1]>, // // Single-precision FP Compare InstrItinData, @@ -189,17 +189,17 @@ // Double-precision FP Compare InstrItinData, InstrStage<4, [FU_NPipe], 0>, - InstrStage<4, [FU_NLSPipe]>]>, + InstrStage<4, [FU_NLSPipe]>], [4, 1]>, // // Single to Double FP Convert InstrItinData, InstrStage<7, [FU_NPipe], 0>, - InstrStage<7, [FU_NLSPipe]>]>, + InstrStage<7, [FU_NLSPipe]>], [7, 1]>, // // Double to Single FP Convert InstrItinData, InstrStage<5, [FU_NPipe], 0>, - InstrStage<5, [FU_NLSPipe]>]>, + InstrStage<5, [FU_NLSPipe]>], [5, 1]>, // // Single-Precision FP to Integer Convert InstrItinData, @@ -208,7 +208,7 @@ // Double-Precision FP to Integer Convert InstrItinData, InstrStage<8, [FU_NPipe], 0>, - InstrStage<8, [FU_NLSPipe]>]>, + InstrStage<8, [FU_NLSPipe]>], [8, 1]>, // // Integer to Single-Precision FP Convert InstrItinData, @@ -217,7 +217,7 @@ // Integer to Double-Precision FP Convert InstrItinData, InstrStage<8, [FU_NPipe], 0>, - InstrStage<8, [FU_NLSPipe]>]>, + InstrStage<8, [FU_NLSPipe]>], [8, 1]>, // // Single-precision FP ALU InstrItinData, @@ -226,7 +226,7 @@ // Double-precision FP ALU InstrItinData, InstrStage<9, [FU_NPipe], 0>, - InstrStage<9, [FU_NLSPipe]>]>, + InstrStage<9, [FU_NLSPipe]>], [9, 1, 1]>, // // Single-precision FP Multiply InstrItinData, @@ -235,7 +235,7 @@ // Double-precision FP Multiply InstrItinData, InstrStage<11, [FU_NPipe], 0>, - InstrStage<11, 
[FU_NLSPipe]>]>, + InstrStage<11, [FU_NLSPipe]>], [11, 1, 1]>, // // Single-precision FP MAC InstrItinData, @@ -244,27 +244,27 @@ // Double-precision FP MAC InstrItinData, InstrStage<19, [FU_NPipe], 0>, - InstrStage<19, [FU_NLSPipe]>]>, + InstrStage<19, [FU_NLSPipe]>], [19, 2, 1, 1]>, // // Single-precision FP DIV InstrItinData, InstrStage<20, [FU_NPipe], 0>, - InstrStage<20, [FU_NLSPipe]>]>, + InstrStage<20, [FU_NLSPipe]>], [20, 1, 1]>, // // Double-precision FP DIV InstrItinData, InstrStage<29, [FU_NPipe], 0>, - InstrStage<29, [FU_NLSPipe]>]>, + InstrStage<29, [FU_NLSPipe]>], [29, 1, 1]>, // // Single-precision FP SQRT InstrItinData, InstrStage<19, [FU_NPipe], 0>, - InstrStage<19, [FU_NLSPipe]>]>, + InstrStage<19, [FU_NLSPipe]>], [19, 1]>, // // Double-precision FP SQRT InstrItinData, InstrStage<29, [FU_NPipe], 0>, - InstrStage<29, [FU_NLSPipe]>]>, + InstrStage<29, [FU_NLSPipe]>], [29, 1]>, // // Single-precision FP Load // use FU_Issue to enforce the 1 load/store per cycle limit From clattner at apple.com Mon Nov 23 11:37:11 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 09:37:11 -0800 Subject: [llvm-commits] [llvm] r89516 - in /llvm/trunk: include/llvm/Target/SubtargetFeature.h lib/Target/SubtargetFeature.cpp tools/lto/LTOCodeGenerator.cpp In-Reply-To: <6a8523d60911220941u6e633e0dhd4ee9da39dedb2ce@mail.gmail.com> References: <200911210000.nAL003q6027547@zion.cs.uiuc.edu> <6a8523d60911220941u6e633e0dhd4ee9da39dedb2ce@mail.gmail.com> Message-ID: <6E59EC20-D700-4C76-9E8B-A4B12F2617F5@apple.com> On Nov 22, 2009, at 9:41 AM, Daniel Dunbar wrote: > On Sun, Nov 22, 2009 at 6:03 AM, Chris Lattner wrote: >> >> On Nov 20, 2009, at 4:00 PM, Viktor Kutuzov wrote: >> >>> Author: vkutuzov >>> Date: Fri Nov 20 18:00:02 2009 >>> New Revision: 89516 >>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=89516&view=rev >>> Log: >>> Added two SubtargetFeatures::AddFeatures methods, which accept a comma-separated string or already parsed command line 
parameters as input, and some code re-factoring to use these new methods. > > I don't think this code belongs in SubtargetFeatures at all. All it is > doing is calling AddFeature on each string, the client is perfectly > capable of doing this, which obviates thinking about how best to pass > the vector. Yes, I agree. Viktor, please remove this part of the patch, pushing the logic into the LTO client. -Chris > > Similarly, AddFeatures shouldn't impose some kind of discipline like > comma separate strings, clients should handle this if it is what they > have (and StringRef::split makes it easy for them to split the > string). > > - Daniel > >> Ok, a couple comments below: >> >>> +++ llvm/trunk/include/llvm/Target/SubtargetFeature.h Fri Nov 20 18:00:02 2009 >>> @@ -22,6 +22,7 @@ >>> #include >>> #include >>> #include "llvm/ADT/Triple.h" >>> +#include "llvm/Support/CommandLine.h" >>> #include "llvm/System/DataTypes.h" >> >> Please drop this #include. >> >>> @@ -93,6 +94,12 @@ >>> /// Adding Features. >>> void AddFeature(const std::string &String, bool IsEnabled = true); >>> >>> + /// Add a set of features from the comma-separated string. >>> + void AddFeatures(const std::string &String); >> >> This should take a StringRef instead of std::string. >> >>> + >>> + /// Add a set of features from the parsed command line parameters. >>> + void AddFeatures(const cl::list &List); >> >> cl::list inherits from std::vector, so you should be able to pass in a std::vector directly. However, it would be much much better to expose this as taking an array of StringRef's and require the caller to do the unpacking. 
>> >> -Chris >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> From clattner at apple.com Mon Nov 23 11:38:52 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 09:38:52 -0800 Subject: [llvm-commits] [llvm] r89530 - in /llvm/trunk: include/llvm/CodeGen/LiveVariables.h lib/CodeGen/LiveVariables.cpp lib/CodeGen/PHIElimination.cpp lib/CodeGen/PHIElimination.h In-Reply-To: <8F6DD22D-65D7-478C-8F62-FAF933194FE1@2pi.dk> References: <200911210205.nAL25M36031984@zion.cs.uiuc.edu> <58E5E7B6-3416-472E-8C1A-363DDDE5A930@apple.com> <8F6DD22D-65D7-478C-8F62-FAF933194FE1@2pi.dk> Message-ID: <85F16498-351A-4D3B-AEA0-B62D5E0181D6@apple.com> On Nov 22, 2009, at 10:11 AM, Jakob Stoklund Olesen wrote: > > On Nov 22, 2009, at 6:09 AM, Chris Lattner wrote: >> >> I haven't looked at the code at all, but I want to bring up one subtlety of critical edge splitting. In switches which have multiple edges to the same destination (on different values), these edges are always critical. However, they should be split together as a unit, not individually. I don't know if your code considers this or not, but this is the idea behind the "SplitEdgeNicely" logic that appears in a couple places. > > I hadn't considered that issue, but I think I have managed to avoid it. > > At the machine code level, switches become either branching trees or jumptable branches. In the first case, the switch has been converted to if/else code, and the issue disappears. In the second case I leave the edge alone because AnalyzeBranch fails. > > I don't want to start tampering with jump tables because I think they can be shared. Ah, that makes sense. So long as you don't muck with edges from switches, everything should be good, thanks! 
-Chris From clattner at apple.com Mon Nov 23 11:40:12 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 09:40:12 -0800 Subject: [llvm-commits] [llvm] r89211 - /llvm/trunk/CMakeLists.txt In-Reply-To: <4B09BAEA.2030207@mxc.ca> References: <200911181742.nAIHgMXe005175@zion.cs.uiuc.edu> <87lji25i81.fsf@telefonica.net> <6a8523d60911221021t75d68290j30060188af1734ca@mail.gmail.com> <4B098818.4000506@mxc.ca> <6a8523d60911221409o474ae2atb51155eab3f56890@mail.gmail.com> <4B09BAEA.2030207@mxc.ca> Message-ID: <33B0309B-C3D1-48AD-8574-917639E27C6B@apple.com> On Nov 22, 2009, at 2:27 PM, Nick Lewycky wrote: >>>> >>>> I will update the doc. >>> >>> The rationale for copying each Kaleidoscope chapter into the examples/ >>> directory was to make sure that our examples on the website don't get out of >>> date when someone changes the API. >> >> I agree this is good. This sounds like excellent rationale for disabling this stuff (including kaleidoscope!) by default, but making sure the buildbots are building them. -Chris From gohman at apple.com Mon Nov 23 11:42:47 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 23 Nov 2009 17:42:47 -0000 Subject: [llvm-commits] [llvm] r89674 - in /llvm/trunk/lib/CodeGen/SelectionDAG: FastISel.cpp FunctionLoweringInfo.cpp FunctionLoweringInfo.h SelectionDAGBuild.cpp SelectionDAGBuild.h Message-ID: <200911231742.nANHglCB032651@zion.cs.uiuc.edu> Author: djg Date: Mon Nov 23 11:42:46 2009 New Revision: 89674 URL: http://llvm.org/viewvc/llvm-project?rev=89674&view=rev Log: Move some more code out of SelectionDAGBuild.cpp and into FunctionLoweringInfo.cpp. 
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp?rev=89674&r1=89673&r2=89674&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp Mon Nov 23 11:42:46 2009 @@ -55,6 +55,7 @@ #include "llvm/Target/TargetLowering.h" #include "llvm/Target/TargetMachine.h" #include "SelectionDAGBuild.h" +#include "FunctionLoweringInfo.h" using namespace llvm; unsigned FastISel::getRegForValue(Value *V) { Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp?rev=89674&r1=89673&r2=89674&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp Mon Nov 23 11:42:46 2009 @@ -276,3 +276,68 @@ } return FirstReg; } + +/// ExtractTypeInfo - Returns the type info, possibly bitcast, encoded in V. +GlobalVariable *llvm::ExtractTypeInfo(Value *V) { + V = V->stripPointerCasts(); + GlobalVariable *GV = dyn_cast(V); + assert ((GV || isa(V)) && + "TypeInfo must be a global variable or NULL"); + return GV; +} + +/// AddCatchInfo - Extract the personality and type infos from an eh.selector +/// call, and add them to the specified machine basic block. 
+void llvm::AddCatchInfo(CallInst &I, MachineModuleInfo *MMI, + MachineBasicBlock *MBB) { + // Inform the MachineModuleInfo of the personality for this landing pad. + ConstantExpr *CE = cast(I.getOperand(2)); + assert(CE->getOpcode() == Instruction::BitCast && + isa(CE->getOperand(0)) && + "Personality should be a function"); + MMI->addPersonality(MBB, cast(CE->getOperand(0))); + + // Gather all the type infos for this landing pad and pass them along to + // MachineModuleInfo. + std::vector TyInfo; + unsigned N = I.getNumOperands(); + + for (unsigned i = N - 1; i > 2; --i) { + if (ConstantInt *CI = dyn_cast(I.getOperand(i))) { + unsigned FilterLength = CI->getZExtValue(); + unsigned FirstCatch = i + FilterLength + !FilterLength; + assert (FirstCatch <= N && "Invalid filter length"); + + if (FirstCatch < N) { + TyInfo.reserve(N - FirstCatch); + for (unsigned j = FirstCatch; j < N; ++j) + TyInfo.push_back(ExtractTypeInfo(I.getOperand(j))); + MMI->addCatchTypeInfo(MBB, TyInfo); + TyInfo.clear(); + } + + if (!FilterLength) { + // Cleanup. + MMI->addCleanup(MBB); + } else { + // Filter. 
+ TyInfo.reserve(FilterLength - 1); + for (unsigned j = i + 1; j < FirstCatch; ++j) + TyInfo.push_back(ExtractTypeInfo(I.getOperand(j))); + MMI->addFilterTypeInfo(MBB, TyInfo); + TyInfo.clear(); + } + + N = i; + } + } + + if (N > 3) { + TyInfo.reserve(N - 3); + for (unsigned j = 3; j < N; ++j) + TyInfo.push_back(ExtractTypeInfo(I.getOperand(j))); + MMI->addCatchTypeInfo(MBB, TyInfo); + } +} + + Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h?rev=89674&r1=89673&r2=89674&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h Mon Nov 23 11:42:46 2009 @@ -27,10 +27,13 @@ class AllocaInst; class BasicBlock; +class CallInst; class Function; +class GlobalVariable; class Instruction; class MachineBasicBlock; class MachineFunction; +class MachineModuleInfo; class MachineRegisterInfo; class TargetLowering; class Value; @@ -132,6 +135,13 @@ SmallVectorImpl *Offsets = 0, uint64_t StartingOffset = 0); +/// ExtractTypeInfo - Returns the type info, possibly bitcast, encoded in V. +GlobalVariable *ExtractTypeInfo(Value *V); + +/// AddCatchInfo - Extract the personality and type infos from an eh.selector +/// call, and add them to the specified machine basic block. 
+void AddCatchInfo(CallInst &I, MachineModuleInfo *MMI, MachineBasicBlock *MBB); + } // end namespace llvm #endif Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp?rev=89674&r1=89673&r2=89674&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Mon Nov 23 11:42:46 2009 @@ -2790,73 +2790,6 @@ } } -/// ExtractTypeInfo - Returns the type info, possibly bitcast, encoded in V. -static GlobalVariable *ExtractTypeInfo(Value *V) { - V = V->stripPointerCasts(); - GlobalVariable *GV = dyn_cast(V); - assert ((GV || isa(V)) && - "TypeInfo must be a global variable or NULL"); - return GV; -} - -namespace llvm { - -/// AddCatchInfo - Extract the personality and type infos from an eh.selector -/// call, and add them to the specified machine basic block. -void AddCatchInfo(CallInst &I, MachineModuleInfo *MMI, - MachineBasicBlock *MBB) { - // Inform the MachineModuleInfo of the personality for this landing pad. - ConstantExpr *CE = cast(I.getOperand(2)); - assert(CE->getOpcode() == Instruction::BitCast && - isa(CE->getOperand(0)) && - "Personality should be a function"); - MMI->addPersonality(MBB, cast(CE->getOperand(0))); - - // Gather all the type infos for this landing pad and pass them along to - // MachineModuleInfo. 
- std::vector TyInfo; - unsigned N = I.getNumOperands(); - - for (unsigned i = N - 1; i > 2; --i) { - if (ConstantInt *CI = dyn_cast(I.getOperand(i))) { - unsigned FilterLength = CI->getZExtValue(); - unsigned FirstCatch = i + FilterLength + !FilterLength; - assert (FirstCatch <= N && "Invalid filter length"); - - if (FirstCatch < N) { - TyInfo.reserve(N - FirstCatch); - for (unsigned j = FirstCatch; j < N; ++j) - TyInfo.push_back(ExtractTypeInfo(I.getOperand(j))); - MMI->addCatchTypeInfo(MBB, TyInfo); - TyInfo.clear(); - } - - if (!FilterLength) { - // Cleanup. - MMI->addCleanup(MBB); - } else { - // Filter. - TyInfo.reserve(FilterLength - 1); - for (unsigned j = i + 1; j < FirstCatch; ++j) - TyInfo.push_back(ExtractTypeInfo(I.getOperand(j))); - MMI->addFilterTypeInfo(MBB, TyInfo); - TyInfo.clear(); - } - - N = i; - } - } - - if (N > 3) { - TyInfo.reserve(N - 3); - for (unsigned j = 3; j < N; ++j) - TyInfo.push_back(ExtractTypeInfo(I.getOperand(j))); - MMI->addCatchTypeInfo(MBB, TyInfo); - } -} - -} - /// GetSignificand - Get the significand and build it into a floating-point /// number with exponent of 1: /// Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h?rev=89674&r1=89673&r2=89674&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h Mon Nov 23 11:42:46 2009 @@ -25,7 +25,6 @@ #include "llvm/CodeGen/ValueTypes.h" #include "llvm/Support/CallSite.h" #include "llvm/Support/ErrorHandling.h" -#include "llvm/Target/TargetMachine.h" #include #include @@ -59,7 +58,6 @@ class MachineBasicBlock; class MachineFunction; class MachineInstr; -class MachineModuleInfo; class MachineRegisterInfo; class PHINode; class PtrToIntInst; @@ -484,11 +482,6 @@ const char *implVisitAluOverflow(CallInst &I, 
ISD::NodeType Op); }; -/// AddCatchInfo - Extract the personality and type infos from an eh.selector -/// call, and add them to the specified machine basic block. -void AddCatchInfo(CallInst &I, MachineModuleInfo *MMI, - MachineBasicBlock *MBB); - } // end namespace llvm #endif From clattner at apple.com Mon Nov 23 11:44:30 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 09:44:30 -0800 Subject: [llvm-commits] [llvm] r89403 - /llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp In-Reply-To: <6A655B82-5A9A-4854-A1C8-98E2DF7EE8E7@apple.com> References: <200911192310.nAJNASND022655@zion.cs.uiuc.edu> <6A655B82-5A9A-4854-A1C8-98E2DF7EE8E7@apple.com> Message-ID: <0D81A26C-67F8-4671-82F7-C55A21064706@apple.com> On Nov 22, 2009, at 10:28 AM, Jim Grosbach wrote: >> Are you sure that this is the right thing to do? GCC inline asm precisely specifies how the size of an inline asm is computed: >> http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Extended-Asm (6.39.1) >> >> If some inline asm isn't working with this algorithm, then the inline asm is wrong and should be fixed. > > > Yes, I believe this is the correct approach. > > The algorithm described is, "The estimate is formed by counting the number of statements in the pattern of the asm and multiplying that by the length of the longest instruction on that processor." This is exactly what we do, and gets us close, but it is still an estimate. To do better, we'll need to actually parse the assembly code to determine whether it's a 16 or a 32 bit instruction. In Thumb2, that's a non-trivial thing, and I suspect we don't want to go down that path, at least right now. No, this isn't true at all. This is a statement of how the language extension (in this case, gnu assembly) works. If the "estimate" is incorrect for a piece of code, then the asm is incorrect, not the compiler. 
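As a rough illustration of the estimate quoted above (statement count times longest instruction length), here is a minimal sketch. The function names, the choice of ';' as the statement separator, and the 4-byte maximum instruction length are assumptions made for the example. They are not taken from the ARM backend or from GCC's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: count the statements in an inline asm string by
// treating every newline or separator character as the end of a statement.
// The ';' default is an assumption; the real separator is target-specific.
static std::size_t countAsmStatements(const std::string &asmText,
                                      char separator = ';') {
  std::size_t count = 1; // even an empty string is counted as one statement
  for (char c : asmText)
    if (c == '\n' || c == separator)
      ++count;
  return count;
}

// Hypothetical helper: size estimate = statements * longest instruction.
// This is an upper bound by construction, never an exact size.
static std::size_t estimateAsmSize(const std::string &asmText,
                                   std::size_t maxInstrBytes) {
  assert(maxInstrBytes > 0 && "longest instruction must have nonzero size");
  return countAsmStatements(asmText) * maxInstrBytes;
}
```

With a 4-byte maximum, a single-statement asm such as "nop" is estimated at 4 bytes even when the assembler picks a 2-byte encoding, which is the kind of overestimate discussed in this thread.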
> What's happening here is that the inline asm is actually shorter than the estimate since it's a 16-bit instruction, and is also not located between the constant pool reference and the constant pool entry. Since it's different in size than what the estimate believes, our calculations for when alignment padding will be inserted are now off, leading to problems. Ah, this is a completely different problem from what I thought. I thought we were incorrectly under-estimating the size of an asm, not overestimating it. If the code is expecting to get the exact size of the asm, it is definitely "doomed to failure" :). However, if you just need alignment, can you just not insert "alignment padding" and instead insert a .align directive after the asm? > This patch adjusts those padding calculations to know that in the presence of inline assembly, we're dealing with an estimate and therefore can't make assumptions about alignment padding. Ah ok, so this just causes extra .align directives to be emitted, it doesn't make wildly conservative assumptions? If so, sounds great! :) Thanks Jim, -Chris From gohman at apple.com Mon Nov 23 11:46:23 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 23 Nov 2009 17:46:23 -0000 Subject: [llvm-commits] [llvm] r89675 - /llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Message-ID: <200911231746.nANHkNfP032765@zion.cs.uiuc.edu> Author: djg Date: Mon Nov 23 11:46:23 2009 New Revision: 89675 URL: http://llvm.org/viewvc/llvm-project?rev=89675&view=rev Log: Move RegsForValue to an anonymous namespace, since it is only used in this file. 
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp?rev=89675&r1=89674&r2=89675&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp Mon Nov 23 11:46:23 2009 @@ -68,7 +68,7 @@ cl::location(LimitFloatPrecision), cl::init(0)); -namespace llvm { +namespace { /// RegsForValue - This struct represents the registers (physical or virtual) /// that a particular set of values is assigned, and the type information about /// the value. The most common situation is to represent one value at a time, @@ -78,7 +78,7 @@ /// have legal types, so each value may require one or more registers of some /// legal type. /// - struct VISIBILITY_HIDDEN RegsForValue { + struct RegsForValue { /// TLI - The TargetLowering object. /// const TargetLowering *TLI; From johnny.chen at apple.com Mon Nov 23 11:48:17 2009 From: johnny.chen at apple.com (Johnny Chen) Date: Mon, 23 Nov 2009 17:48:17 -0000 Subject: [llvm-commits] [llvm] r89676 - /llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Message-ID: <200911231748.nANHmHtc000364@zion.cs.uiuc.edu> Author: johnny Date: Mon Nov 23 11:48:17 2009 New Revision: 89676 URL: http://llvm.org/viewvc/llvm-project?rev=89676&view=rev Log: Make it clear that the index bit(s) of Vector Get Lane and Vector Set Lane should be left unspecified now that Bob Wilson has fixed pr5470. 
Modified: llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Modified: llvm/trunk/lib/Target/ARM/ARMInstrNEON.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrNEON.td?rev=89676&r1=89675&r2=89676&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrNEON.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Mon Nov 23 11:48:17 2009 @@ -2518,27 +2518,27 @@ // VMOV : Vector Get Lane (move scalar to ARM core register) -def VGETLNs8 : NVGetLane<0b11100101, 0b1011, 0b00, +def VGETLNs8 : NVGetLane<{1,1,1,0,0,1,?,1}, 0b1011, {?,?}, (outs GPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVSI, "vmov", ".s8\t$dst, $src[$lane]", [(set GPR:$dst, (NEONvgetlanes (v8i8 DPR:$src), imm:$lane))]>; -def VGETLNs16 : NVGetLane<0b11100001, 0b1011, 0b01, +def VGETLNs16 : NVGetLane<{1,1,1,0,0,0,?,1}, 0b1011, {?,1}, (outs GPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVSI, "vmov", ".s16\t$dst, $src[$lane]", [(set GPR:$dst, (NEONvgetlanes (v4i16 DPR:$src), imm:$lane))]>; -def VGETLNu8 : NVGetLane<0b11101101, 0b1011, 0b00, +def VGETLNu8 : NVGetLane<{1,1,1,0,1,1,?,1}, 0b1011, {?,?}, (outs GPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVSI, "vmov", ".u8\t$dst, $src[$lane]", [(set GPR:$dst, (NEONvgetlaneu (v8i8 DPR:$src), imm:$lane))]>; -def VGETLNu16 : NVGetLane<0b11101001, 0b1011, 0b01, +def VGETLNu16 : NVGetLane<{1,1,1,0,1,0,?,1}, 0b1011, {?,1}, (outs GPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVSI, "vmov", ".u16\t$dst, $src[$lane]", [(set GPR:$dst, (NEONvgetlaneu (v4i16 DPR:$src), imm:$lane))]>; -def VGETLNi32 : NVGetLane<0b11100001, 0b1011, 0b00, +def VGETLNi32 : NVGetLane<{1,1,1,0,0,0,?,1}, 0b1011, 0b00, (outs GPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVSI, "vmov", ".32\t$dst, $src[$lane]", [(set GPR:$dst, (extractelt (v2i32 DPR:$src), @@ -2579,17 +2579,17 @@ // VMOV : Vector Set Lane (move ARM core register to scalar) let Constraints = "$src1 = $dst" in { -def 
VSETLNi8 : NVSetLane<0b11100100, 0b1011, 0b00, (outs DPR:$dst), +def VSETLNi8 : NVSetLane<{1,1,1,0,0,1,?,0}, 0b1011, {?,?}, (outs DPR:$dst), (ins DPR:$src1, GPR:$src2, nohash_imm:$lane), IIC_VMOVISL, "vmov", ".8\t$dst[$lane], $src2", [(set DPR:$dst, (vector_insert (v8i8 DPR:$src1), GPR:$src2, imm:$lane))]>; -def VSETLNi16 : NVSetLane<0b11100000, 0b1011, 0b01, (outs DPR:$dst), +def VSETLNi16 : NVSetLane<{1,1,1,0,0,0,?,0}, 0b1011, {?,1}, (outs DPR:$dst), (ins DPR:$src1, GPR:$src2, nohash_imm:$lane), IIC_VMOVISL, "vmov", ".16\t$dst[$lane], $src2", [(set DPR:$dst, (vector_insert (v4i16 DPR:$src1), GPR:$src2, imm:$lane))]>; -def VSETLNi32 : NVSetLane<0b11100000, 0b1011, 0b00, (outs DPR:$dst), +def VSETLNi32 : NVSetLane<{1,1,1,0,0,0,?,0}, 0b1011, 0b00, (outs DPR:$dst), (ins DPR:$src1, GPR:$src2, nohash_imm:$lane), IIC_VMOVISL, "vmov", ".32\t$dst[$lane], $src2", [(set DPR:$dst, (insertelt (v2i32 DPR:$src1), From clattner at apple.com Mon Nov 23 11:50:05 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 09:50:05 -0800 Subject: [llvm-commits] [llvm] r89639 - in /llvm/trunk: lib/Transforms/Scalar/InstructionCombining.cpp test/Transforms/InstCombine/compare-signs.ll In-Reply-To: <4B0A48CB.4020201@free.fr> References: <200911230317.nAN3HYvG017794@zion.cs.uiuc.edu> <4B0A446C.10104@free.fr> <4B0A4558.8070803@mxc.ca> <4B0A48CB.4020201@free.fr> Message-ID: On Nov 23, 2009, at 12:33 AM, Duncan Sands wrote: >>>> + if (KnownZeroLHS.countLeadingOnes() == BitWidth-1 && >>>> + KnownZeroRHS.countLeadingOnes() == BitWidth-1) { >>> >>> == -> >= :) >> >> Nope, look again! >> >> + APInt TypeMask(APInt::getHighBitsSet(BitWidth, BitWidth-1)); >> >> Thus, it will never return a knownzero with all bits set. :) > > Ha ha, you got me there! Please add a comment. Obviously it isn't clear what is going on here. 
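An illustrative aside on the exchange above: a minimal sketch of why `==` and `>=` agree in the check being discussed, assuming (as Nick's reply implies) that the known-zero result is always confined to TypeMask, i.e. the high BitWidth-1 bits. It uses plain 32-bit integers rather than APInt; highBitsMask, countLeadingOnes and neverExceedsWidthMinusOne are hypothetical stand-ins written for this example, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for APInt::getHighBitsSet(BitWidth, BitWidth - 1): every bit of a
// BitWidth-wide value is set except bit 0. Assumes 1 <= BitWidth <= 32.
inline uint32_t highBitsMask(unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 32 && "unsupported width");
  uint32_t All = BitWidth == 32 ? 0xFFFFFFFFu : ((1u << BitWidth) - 1u);
  return All & ~1u;
}

// Number of consecutive one bits, counted from the most significant bit of a
// BitWidth-wide value downwards.
inline unsigned countLeadingOnes(uint32_t V, unsigned BitWidth) {
  unsigned N = 0;
  for (unsigned Bit = BitWidth; Bit-- > 0;) {
    if (((V >> Bit) & 1u) == 0)
      break;
    ++N;
  }
  return N;
}

// Exhaustively checks, for a small width (assumed <= 16), that no value
// confined to the mask has more than BitWidth - 1 leading ones. When that
// holds, "== BitWidth - 1" and ">= BitWidth - 1" accept the same values.
inline bool neverExceedsWidthMinusOne(unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 16 && "exhaustive check is for small widths");
  const uint32_t Mask = highBitsMask(BitWidth);
  for (uint32_t V = 0; V < (1u << BitWidth); ++V) {
    uint32_t KnownZero = V & Mask; // assumption: results never leave the mask
    if (countLeadingOnes(KnownZero, BitWidth) > BitWidth - 1)
      return false;
  }
  return true;
}
```

Under these assumptions neverExceedsWidthMinusOne(8) holds, because bit 0 is never set in a masked result. An unmasked value such as 0xFF would give countLeadingOnes(0xFF, 8) == 8, which is the only case where `>=` would behave differently from `==`.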
From espindola at google.com Mon Nov 23 11:50:47 2009 From: espindola at google.com (Rafael Espindola) Date: Mon, 23 Nov 2009 12:50:47 -0500 Subject: [llvm-commits] [PATCH] LTO code generator options In-Reply-To: <6a8523d60911221008t431f5662v369a3ebd910a8696@mail.gmail.com> References: <04F6B1512E264B27AEE607542FCDD113@andreic6e7fe55> <38a0d8450911170809j6a32716ar840b8622cfad6f17@mail.gmail.com> <6AE1604EE3EC5F4296C096518C6B77EEFD4607C4@mail.accesssoftek.com> <38a0d8450911180722j5a463fa8hec81178154deaf09@mail.gmail.com> <41BA1AA405BC4D19BA9B4FAB6543D62F@andreic6e7fe55> <38a0d8450911190723g644ad4c7ife769ab35da9efb9@mail.gmail.com> <38a0d8450911200722i5efa690ci6ab671d71b5f40dc@mail.gmail.com> <6a8523d60911221008t431f5662v369a3ebd910a8696@mail.gmail.com> Message-ID: <38a0d8450911230950w13785ccdv4fc114271d00f7f8@mail.gmail.com> >> You don't need to. If Split is used elsewhere I agree it is better to >> use it in here too. > > Uh, why? We should kill off duplicate code. Just to be consistent. I like StringRef::split, I just think it is nicer to switch one module at a time :-) > - Daniel Cheers, -- Rafael Ávila de Espíndola From clattner at apple.com Mon Nov 23 11:52:05 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 09:52:05 -0800 Subject: [llvm-commits] [llvm] r89660 - /llvm/trunk/lib/VMCore/PassManager.cpp In-Reply-To: <200911231624.nANGOJCN029476@zion.cs.uiuc.edu> References: <200911231624.nANGOJCN029476@zion.cs.uiuc.edu> Message-ID: On Nov 23, 2009, at 8:24 AM, Dan Gohman wrote: > Author: djg > Date: Mon Nov 23 10:24:18 2009 > New Revision: 89660 > > URL: http://llvm.org/viewvc/llvm-project?rev=89660&view=rev > Log: > Move FunctionPassManagerImpl's dumpArguments and dumpPasses calls > out of its run function and into its doInitialization method, so > that it does the dump once instead of once per function. Thank you!!!
:) From dpatel at apple.com Mon Nov 23 11:56:06 2009 From: dpatel at apple.com (Devang Patel) Date: Mon, 23 Nov 2009 09:56:06 -0800 Subject: [llvm-commits] [llvm] r89487 - /llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp In-Reply-To: <56E0813A-560E-4810-9B9F-457E0853AB17@apple.com> References: <200911202105.nAKL5bJe020587@zion.cs.uiuc.edu> <56E0813A-560E-4810-9B9F-457E0853AB17@apple.com> Message-ID: On Nov 21, 2009, at 7:18 AM, Chris Lattner wrote: > > On Nov 20, 2009, at 1:05 PM, Devang Patel wrote: > >> Author: dpatel >> Date: Fri Nov 20 15:05:37 2009 >> New Revision: 89487 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=89487&view=rev >> Log: >> There is no need to emit source location info for >> DW_TAG_pointer_type. > > This seems like a strange special case. Why are pointers special > here? Usually, one pointer type is used many places and FEs do not preserve separate locations for each instance. GCC does not emit location info here. If we get precise location info from FE then it makes sense to include location info here. - Devang From gohman at apple.com Mon Nov 23 12:04:59 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 23 Nov 2009 18:04:59 -0000 Subject: [llvm-commits] [llvm] r89681 - in /llvm/trunk: include/llvm/CodeGen/SelectionDAGISel.h lib/CodeGen/SelectionDAG/CMakeLists.txt lib/CodeGen/SelectionDAG/FastISel.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuild.h lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Message-ID: <200911231804.nANI4xb4000989@zion.cs.uiuc.edu> Author: djg Date: Mon Nov 23 12:04:58 2009 New Revision: 89681 URL: http://llvm.org/viewvc/llvm-project?rev=89681&view=rev Log: Rename SelectionDAGLowering to SelectionDAGBuilder, and rename SelectionDAGBuild.cpp to SelectionDAGBuilder.cpp. 
Added: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp - copied, changed from r89675, llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h - copied, changed from r89674, llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h Removed: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAGISel.h llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAGISel.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAGISel.h?rev=89681&r1=89680&r2=89681&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/SelectionDAGISel.h (original) +++ llvm/trunk/include/llvm/CodeGen/SelectionDAGISel.h Mon Nov 23 12:04:58 2009 @@ -23,7 +23,7 @@ namespace llvm { class FastISel; - class SelectionDAGLowering; + class SelectionDAGBuilder; class SDValue; class MachineRegisterInfo; class MachineBasicBlock; @@ -48,7 +48,7 @@ MachineFunction *MF; MachineRegisterInfo *RegInfo; SelectionDAG *CurDAG; - SelectionDAGLowering *SDL; + SelectionDAGBuilder *SDB; MachineBasicBlock *BB; AliasAnalysis *AA; GCFunctionInfo *GFI; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt?rev=89681&r1=89680&r2=89681&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/CMakeLists.txt Mon Nov 23 12:04:58 2009 @@ -16,7 +16,7 @@ ScheduleDAGRRList.cpp ScheduleDAGSDNodes.cpp SelectionDAG.cpp - SelectionDAGBuild.cpp + SelectionDAGBuilder.cpp 
SelectionDAGISel.cpp SelectionDAGPrinter.cpp TargetLowering.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp?rev=89681&r1=89680&r2=89681&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FastISel.cpp Mon Nov 23 12:04:58 2009 @@ -54,7 +54,7 @@ #include "llvm/Target/TargetInstrInfo.h" #include "llvm/Target/TargetLowering.h" #include "llvm/Target/TargetMachine.h" -#include "SelectionDAGBuild.h" +#include "SelectionDAGBuilder.h" #include "FunctionLoweringInfo.h" using namespace llvm; Removed: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp?rev=89680&view=auto ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (removed) @@ -1,5821 +0,0 @@ -//===-- SelectionDAGBuild.cpp - Selection-DAG building --------------------===// -// -// The LLVM Compiler Infrastructure -// -// This file is distributed under the University of Illinois Open Source -// License. See LICENSE.TXT for details. -// -//===----------------------------------------------------------------------===// -// -// This implements routines for translating from LLVM IR into SelectionDAG IR. 
-// -//===----------------------------------------------------------------------===// - -#define DEBUG_TYPE "isel" -#include "SelectionDAGBuild.h" -#include "FunctionLoweringInfo.h" -#include "llvm/ADT/BitVector.h" -#include "llvm/ADT/SmallSet.h" -#include "llvm/Analysis/AliasAnalysis.h" -#include "llvm/Constants.h" -#include "llvm/CallingConv.h" -#include "llvm/DerivedTypes.h" -#include "llvm/Function.h" -#include "llvm/GlobalVariable.h" -#include "llvm/InlineAsm.h" -#include "llvm/Instructions.h" -#include "llvm/Intrinsics.h" -#include "llvm/IntrinsicInst.h" -#include "llvm/LLVMContext.h" -#include "llvm/Module.h" -#include "llvm/CodeGen/FastISel.h" -#include "llvm/CodeGen/GCStrategy.h" -#include "llvm/CodeGen/GCMetadata.h" -#include "llvm/CodeGen/MachineFunction.h" -#include "llvm/CodeGen/MachineFrameInfo.h" -#include "llvm/CodeGen/MachineInstrBuilder.h" -#include "llvm/CodeGen/MachineJumpTableInfo.h" -#include "llvm/CodeGen/MachineModuleInfo.h" -#include "llvm/CodeGen/MachineRegisterInfo.h" -#include "llvm/CodeGen/PseudoSourceValue.h" -#include "llvm/CodeGen/SelectionDAG.h" -#include "llvm/CodeGen/DwarfWriter.h" -#include "llvm/Analysis/DebugInfo.h" -#include "llvm/Target/TargetRegisterInfo.h" -#include "llvm/Target/TargetData.h" -#include "llvm/Target/TargetFrameInfo.h" -#include "llvm/Target/TargetInstrInfo.h" -#include "llvm/Target/TargetIntrinsicInfo.h" -#include "llvm/Target/TargetLowering.h" -#include "llvm/Target/TargetOptions.h" -#include "llvm/Support/Compiler.h" -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/Debug.h" -#include "llvm/Support/ErrorHandling.h" -#include "llvm/Support/MathExtras.h" -#include "llvm/Support/raw_ostream.h" -#include -using namespace llvm; - -/// LimitFloatPrecision - Generate low-precision inline sequences for -/// some float libcalls (6, 8 or 12 bits). 
-static unsigned LimitFloatPrecision; - -static cl::opt -LimitFPPrecision("limit-float-precision", - cl::desc("Generate low-precision inline sequences " - "for some float libcalls"), - cl::location(LimitFloatPrecision), - cl::init(0)); - -namespace { - /// RegsForValue - This struct represents the registers (physical or virtual) - /// that a particular set of values is assigned, and the type information about - /// the value. The most common situation is to represent one value at a time, - /// but struct or array values are handled element-wise as multiple values. - /// The splitting of aggregates is performed recursively, so that we never - /// have aggregate-typed registers. The values at this point do not necessarily - /// have legal types, so each value may require one or more registers of some - /// legal type. - /// - struct RegsForValue { - /// TLI - The TargetLowering object. - /// - const TargetLowering *TLI; - - /// ValueVTs - The value types of the values, which may not be legal, and - /// may need be promoted or synthesized from one or more registers. - /// - SmallVector ValueVTs; - - /// RegVTs - The value types of the registers. This is the same size as - /// ValueVTs and it records, for each value, what the type of the assigned - /// register or registers are. (Individual values are never synthesized - /// from more than one type of register.) - /// - /// With virtual registers, the contents of RegVTs is redundant with TLI's - /// getRegisterType member function, however when with physical registers - /// it is necessary to have a separate record of the types. - /// - SmallVector RegVTs; - - /// Regs - This list holds the registers assigned to the values. - /// Each legal or promoted value requires one register, and each - /// expanded value requires multiple registers. 
- /// - SmallVector Regs; - - RegsForValue() : TLI(0) {} - - RegsForValue(const TargetLowering &tli, - const SmallVector &regs, - EVT regvt, EVT valuevt) - : TLI(&tli), ValueVTs(1, valuevt), RegVTs(1, regvt), Regs(regs) {} - RegsForValue(const TargetLowering &tli, - const SmallVector &regs, - const SmallVector &regvts, - const SmallVector &valuevts) - : TLI(&tli), ValueVTs(valuevts), RegVTs(regvts), Regs(regs) {} - RegsForValue(LLVMContext &Context, const TargetLowering &tli, - unsigned Reg, const Type *Ty) : TLI(&tli) { - ComputeValueVTs(tli, Ty, ValueVTs); - - for (unsigned Value = 0, e = ValueVTs.size(); Value != e; ++Value) { - EVT ValueVT = ValueVTs[Value]; - unsigned NumRegs = TLI->getNumRegisters(Context, ValueVT); - EVT RegisterVT = TLI->getRegisterType(Context, ValueVT); - for (unsigned i = 0; i != NumRegs; ++i) - Regs.push_back(Reg + i); - RegVTs.push_back(RegisterVT); - Reg += NumRegs; - } - } - - /// append - Add the specified values to this one. - void append(const RegsForValue &RHS) { - TLI = RHS.TLI; - ValueVTs.append(RHS.ValueVTs.begin(), RHS.ValueVTs.end()); - RegVTs.append(RHS.RegVTs.begin(), RHS.RegVTs.end()); - Regs.append(RHS.Regs.begin(), RHS.Regs.end()); - } - - - /// getCopyFromRegs - Emit a series of CopyFromReg nodes that copies from - /// this value and returns the result as a ValueVTs value. This uses - /// Chain/Flag as the input and updates them for the output Chain/Flag. - /// If the Flag pointer is NULL, no flag is used. - SDValue getCopyFromRegs(SelectionDAG &DAG, DebugLoc dl, - SDValue &Chain, SDValue *Flag) const; - - /// getCopyToRegs - Emit a series of CopyToReg nodes that copies the - /// specified value into the registers specified by this object. This uses - /// Chain/Flag as the input and updates them for the output Chain/Flag. - /// If the Flag pointer is NULL, no flag is used.
- void getCopyToRegs(SDValue Val, SelectionDAG &DAG, DebugLoc dl, - SDValue &Chain, SDValue *Flag) const; - - /// AddInlineAsmOperands - Add this value to the specified inlineasm node - /// operand list. This adds the code marker, matching input operand index - /// (if applicable), and includes the number of values added into it. - void AddInlineAsmOperands(unsigned Code, - bool HasMatching, unsigned MatchingIdx, - SelectionDAG &DAG, std::vector &Ops) const; - }; -} - -/// getCopyFromParts - Create a value that contains the specified legal parts -/// combined into the value they represent. If the parts combine to a type -/// larger then ValueVT then AssertOp can be used to specify whether the extra -/// bits are known to be zero (ISD::AssertZext) or sign extended from ValueVT -/// (ISD::AssertSext). -static SDValue getCopyFromParts(SelectionDAG &DAG, DebugLoc dl, - const SDValue *Parts, - unsigned NumParts, EVT PartVT, EVT ValueVT, - ISD::NodeType AssertOp = ISD::DELETED_NODE) { - assert(NumParts > 0 && "No parts to assemble!"); - const TargetLowering &TLI = DAG.getTargetLoweringInfo(); - SDValue Val = Parts[0]; - - if (NumParts > 1) { - // Assemble the value from multiple parts. - if (!ValueVT.isVector() && ValueVT.isInteger()) { - unsigned PartBits = PartVT.getSizeInBits(); - unsigned ValueBits = ValueVT.getSizeInBits(); - - // Assemble the power of 2 part. - unsigned RoundParts = NumParts & (NumParts - 1) ? - 1 << Log2_32(NumParts) : NumParts; - unsigned RoundBits = PartBits * RoundParts; - EVT RoundVT = RoundBits == ValueBits ? 
- ValueVT : EVT::getIntegerVT(*DAG.getContext(), RoundBits); - SDValue Lo, Hi; - - EVT HalfVT = EVT::getIntegerVT(*DAG.getContext(), RoundBits/2); - - if (RoundParts > 2) { - Lo = getCopyFromParts(DAG, dl, Parts, RoundParts/2, PartVT, HalfVT); - Hi = getCopyFromParts(DAG, dl, Parts+RoundParts/2, RoundParts/2, - PartVT, HalfVT); - } else { - Lo = DAG.getNode(ISD::BIT_CONVERT, dl, HalfVT, Parts[0]); - Hi = DAG.getNode(ISD::BIT_CONVERT, dl, HalfVT, Parts[1]); - } - if (TLI.isBigEndian()) - std::swap(Lo, Hi); - Val = DAG.getNode(ISD::BUILD_PAIR, dl, RoundVT, Lo, Hi); - - if (RoundParts < NumParts) { - // Assemble the trailing non-power-of-2 part. - unsigned OddParts = NumParts - RoundParts; - EVT OddVT = EVT::getIntegerVT(*DAG.getContext(), OddParts * PartBits); - Hi = getCopyFromParts(DAG, dl, - Parts+RoundParts, OddParts, PartVT, OddVT); - - // Combine the round and odd parts. - Lo = Val; - if (TLI.isBigEndian()) - std::swap(Lo, Hi); - EVT TotalVT = EVT::getIntegerVT(*DAG.getContext(), NumParts * PartBits); - Hi = DAG.getNode(ISD::ANY_EXTEND, dl, TotalVT, Hi); - Hi = DAG.getNode(ISD::SHL, dl, TotalVT, Hi, - DAG.getConstant(Lo.getValueType().getSizeInBits(), - TLI.getPointerTy())); - Lo = DAG.getNode(ISD::ZERO_EXTEND, dl, TotalVT, Lo); - Val = DAG.getNode(ISD::OR, dl, TotalVT, Lo, Hi); - } - } else if (ValueVT.isVector()) { - // Handle a multi-element vector. - EVT IntermediateVT, RegisterVT; - unsigned NumIntermediates; - unsigned NumRegs = - TLI.getVectorTypeBreakdown(*DAG.getContext(), ValueVT, IntermediateVT, - NumIntermediates, RegisterVT); - assert(NumRegs == NumParts && "Part count doesn't match vector breakdown!"); - NumParts = NumRegs; // Silence a compiler warning. - assert(RegisterVT == PartVT && "Part type doesn't match vector breakdown!"); - assert(RegisterVT == Parts[0].getValueType() && - "Part type doesn't match part!"); - - // Assemble the parts into intermediate operands. 
- SmallVector Ops(NumIntermediates); - if (NumIntermediates == NumParts) { - // If the register was not expanded, truncate or copy the value, - // as appropriate. - for (unsigned i = 0; i != NumParts; ++i) - Ops[i] = getCopyFromParts(DAG, dl, &Parts[i], 1, - PartVT, IntermediateVT); - } else if (NumParts > 0) { - // If the intermediate type was expanded, build the intermediate operands - // from the parts. - assert(NumParts % NumIntermediates == 0 && - "Must expand into a divisible number of parts!"); - unsigned Factor = NumParts / NumIntermediates; - for (unsigned i = 0; i != NumIntermediates; ++i) - Ops[i] = getCopyFromParts(DAG, dl, &Parts[i * Factor], Factor, - PartVT, IntermediateVT); - } - - // Build a vector with BUILD_VECTOR or CONCAT_VECTORS from the intermediate - // operands. - Val = DAG.getNode(IntermediateVT.isVector() ? - ISD::CONCAT_VECTORS : ISD::BUILD_VECTOR, dl, - ValueVT, &Ops[0], NumIntermediates); - } else if (PartVT.isFloatingPoint()) { - // FP split into multiple FP parts (for ppcf128) - assert(ValueVT == EVT(MVT::ppcf128) && PartVT == EVT(MVT::f64) && - "Unexpected split"); - SDValue Lo, Hi; - Lo = DAG.getNode(ISD::BIT_CONVERT, dl, EVT(MVT::f64), Parts[0]); - Hi = DAG.getNode(ISD::BIT_CONVERT, dl, EVT(MVT::f64), Parts[1]); - if (TLI.isBigEndian()) - std::swap(Lo, Hi); - Val = DAG.getNode(ISD::BUILD_PAIR, dl, ValueVT, Lo, Hi); - } else { - // FP split into integer parts (soft fp) - assert(ValueVT.isFloatingPoint() && PartVT.isInteger() && - !PartVT.isVector() && "Unexpected split"); - EVT IntVT = EVT::getIntegerVT(*DAG.getContext(), ValueVT.getSizeInBits()); - Val = getCopyFromParts(DAG, dl, Parts, NumParts, PartVT, IntVT); - } - } - - // There is now one part, held in Val. Correct it to match ValueVT. 
- PartVT = Val.getValueType(); - - if (PartVT == ValueVT) - return Val; - - if (PartVT.isVector()) { - assert(ValueVT.isVector() && "Unknown vector conversion!"); - return DAG.getNode(ISD::BIT_CONVERT, dl, ValueVT, Val); - } - - if (ValueVT.isVector()) { - assert(ValueVT.getVectorElementType() == PartVT && - ValueVT.getVectorNumElements() == 1 && - "Only trivial scalar-to-vector conversions should get here!"); - return DAG.getNode(ISD::BUILD_VECTOR, dl, ValueVT, Val); - } - - if (PartVT.isInteger() && - ValueVT.isInteger()) { - if (ValueVT.bitsLT(PartVT)) { - // For a truncate, see if we have any information to - // indicate whether the truncated bits will always be - // zero or sign-extension. - if (AssertOp != ISD::DELETED_NODE) - Val = DAG.getNode(AssertOp, dl, PartVT, Val, - DAG.getValueType(ValueVT)); - return DAG.getNode(ISD::TRUNCATE, dl, ValueVT, Val); - } else { - return DAG.getNode(ISD::ANY_EXTEND, dl, ValueVT, Val); - } - } - - if (PartVT.isFloatingPoint() && ValueVT.isFloatingPoint()) { - if (ValueVT.bitsLT(Val.getValueType())) - // FP_ROUND's are always exact here. - return DAG.getNode(ISD::FP_ROUND, dl, ValueVT, Val, - DAG.getIntPtrConstant(1)); - return DAG.getNode(ISD::FP_EXTEND, dl, ValueVT, Val); - } - - if (PartVT.getSizeInBits() == ValueVT.getSizeInBits()) - return DAG.getNode(ISD::BIT_CONVERT, dl, ValueVT, Val); - - llvm_unreachable("Unknown mismatch!"); - return SDValue(); -} - -/// getCopyToParts - Create a series of nodes that contain the specified value -/// split into legal parts. If the parts contain more bits than Val, then, for -/// integers, ExtendKind can be used to specify how to generate the extra bits. 
-static void getCopyToParts(SelectionDAG &DAG, DebugLoc dl, SDValue Val, - SDValue *Parts, unsigned NumParts, EVT PartVT, - ISD::NodeType ExtendKind = ISD::ANY_EXTEND) { - const TargetLowering &TLI = DAG.getTargetLoweringInfo(); - EVT PtrVT = TLI.getPointerTy(); - EVT ValueVT = Val.getValueType(); - unsigned PartBits = PartVT.getSizeInBits(); - unsigned OrigNumParts = NumParts; - assert(TLI.isTypeLegal(PartVT) && "Copying to an illegal type!"); - - if (!NumParts) - return; - - if (!ValueVT.isVector()) { - if (PartVT == ValueVT) { - assert(NumParts == 1 && "No-op copy with multiple parts!"); - Parts[0] = Val; - return; - } - - if (NumParts * PartBits > ValueVT.getSizeInBits()) { - // If the parts cover more bits than the value has, promote the value. - if (PartVT.isFloatingPoint() && ValueVT.isFloatingPoint()) { - assert(NumParts == 1 && "Do not know what to promote to!"); - Val = DAG.getNode(ISD::FP_EXTEND, dl, PartVT, Val); - } else if (PartVT.isInteger() && ValueVT.isInteger()) { - ValueVT = EVT::getIntegerVT(*DAG.getContext(), NumParts * PartBits); - Val = DAG.getNode(ExtendKind, dl, ValueVT, Val); - } else { - llvm_unreachable("Unknown mismatch!"); - } - } else if (PartBits == ValueVT.getSizeInBits()) { - // Different types of the same size. - assert(NumParts == 1 && PartVT != ValueVT); - Val = DAG.getNode(ISD::BIT_CONVERT, dl, PartVT, Val); - } else if (NumParts * PartBits < ValueVT.getSizeInBits()) { - // If the parts cover less bits than value has, truncate the value. - if (PartVT.isInteger() && ValueVT.isInteger()) { - ValueVT = EVT::getIntegerVT(*DAG.getContext(), NumParts * PartBits); - Val = DAG.getNode(ISD::TRUNCATE, dl, ValueVT, Val); - } else { - llvm_unreachable("Unknown mismatch!"); - } - } - - // The value may have changed - recompute ValueVT. 
- ValueVT = Val.getValueType(); - assert(NumParts * PartBits == ValueVT.getSizeInBits() && - "Failed to tile the value with PartVT!"); - - if (NumParts == 1) { - assert(PartVT == ValueVT && "Type conversion failed!"); - Parts[0] = Val; - return; - } - - // Expand the value into multiple parts. - if (NumParts & (NumParts - 1)) { - // The number of parts is not a power of 2. Split off and copy the tail. - assert(PartVT.isInteger() && ValueVT.isInteger() && - "Do not know what to expand to!"); - unsigned RoundParts = 1 << Log2_32(NumParts); - unsigned RoundBits = RoundParts * PartBits; - unsigned OddParts = NumParts - RoundParts; - SDValue OddVal = DAG.getNode(ISD::SRL, dl, ValueVT, Val, - DAG.getConstant(RoundBits, - TLI.getPointerTy())); - getCopyToParts(DAG, dl, OddVal, Parts + RoundParts, OddParts, PartVT); - if (TLI.isBigEndian()) - // The odd parts were reversed by getCopyToParts - unreverse them. - std::reverse(Parts + RoundParts, Parts + NumParts); - NumParts = RoundParts; - ValueVT = EVT::getIntegerVT(*DAG.getContext(), NumParts * PartBits); - Val = DAG.getNode(ISD::TRUNCATE, dl, ValueVT, Val); - } - - // The number of parts is a power of 2. Repeatedly bisect the value using - // EXTRACT_ELEMENT. 
- Parts[0] = DAG.getNode(ISD::BIT_CONVERT, dl, - EVT::getIntegerVT(*DAG.getContext(), ValueVT.getSizeInBits()), - Val); - for (unsigned StepSize = NumParts; StepSize > 1; StepSize /= 2) { - for (unsigned i = 0; i < NumParts; i += StepSize) { - unsigned ThisBits = StepSize * PartBits / 2; - EVT ThisVT = EVT::getIntegerVT(*DAG.getContext(), ThisBits); - SDValue &Part0 = Parts[i]; - SDValue &Part1 = Parts[i+StepSize/2]; - - Part1 = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, - ThisVT, Part0, - DAG.getConstant(1, PtrVT)); - Part0 = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, - ThisVT, Part0, - DAG.getConstant(0, PtrVT)); - - if (ThisBits == PartBits && ThisVT != PartVT) { - Part0 = DAG.getNode(ISD::BIT_CONVERT, dl, - PartVT, Part0); - Part1 = DAG.getNode(ISD::BIT_CONVERT, dl, - PartVT, Part1); - } - } - } - - if (TLI.isBigEndian()) - std::reverse(Parts, Parts + OrigNumParts); - - return; - } - - // Vector ValueVT. - if (NumParts == 1) { - if (PartVT != ValueVT) { - if (PartVT.isVector()) { - Val = DAG.getNode(ISD::BIT_CONVERT, dl, PartVT, Val); - } else { - assert(ValueVT.getVectorElementType() == PartVT && - ValueVT.getVectorNumElements() == 1 && - "Only trivial vector-to-scalar conversions should get here!"); - Val = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, - PartVT, Val, - DAG.getConstant(0, PtrVT)); - } - } - - Parts[0] = Val; - return; - } - - // Handle a multi-element vector. - EVT IntermediateVT, RegisterVT; - unsigned NumIntermediates; - unsigned NumRegs = TLI.getVectorTypeBreakdown(*DAG.getContext(), ValueVT, - IntermediateVT, NumIntermediates, RegisterVT); - unsigned NumElements = ValueVT.getVectorNumElements(); - - assert(NumRegs == NumParts && "Part count doesn't match vector breakdown!"); - NumParts = NumRegs; // Silence a compiler warning. - assert(RegisterVT == PartVT && "Part type doesn't match vector breakdown!"); - - // Split the vector into intermediate operands. 
- SmallVector Ops(NumIntermediates); - for (unsigned i = 0; i != NumIntermediates; ++i) - if (IntermediateVT.isVector()) - Ops[i] = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, - IntermediateVT, Val, - DAG.getConstant(i * (NumElements / NumIntermediates), - PtrVT)); - else - Ops[i] = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, - IntermediateVT, Val, - DAG.getConstant(i, PtrVT)); - - // Split the intermediate operands into legal parts. - if (NumParts == NumIntermediates) { - // If the register was not expanded, promote or copy the value, - // as appropriate. - for (unsigned i = 0; i != NumParts; ++i) - getCopyToParts(DAG, dl, Ops[i], &Parts[i], 1, PartVT); - } else if (NumParts > 0) { - // If the intermediate type was expanded, split each the value into - // legal parts. - assert(NumParts % NumIntermediates == 0 && - "Must expand into a divisible number of parts!"); - unsigned Factor = NumParts / NumIntermediates; - for (unsigned i = 0; i != NumIntermediates; ++i) - getCopyToParts(DAG, dl, Ops[i], &Parts[i * Factor], Factor, PartVT); - } -} - - -void SelectionDAGLowering::init(GCFunctionInfo *gfi, AliasAnalysis &aa) { - AA = &aa; - GFI = gfi; - TD = DAG.getTarget().getTargetData(); -} - -/// clear - Clear out the curret SelectionDAG and the associated -/// state and prepare this SelectionDAGLowering object to be used -/// for a new block. This doesn't clear out information about -/// additional blocks that are needed to complete switch lowering -/// or PHI node updating; that information is cleared out as it is -/// consumed. -void SelectionDAGLowering::clear() { - NodeMap.clear(); - PendingLoads.clear(); - PendingExports.clear(); - EdgeMapping.clear(); - DAG.clear(); - CurDebugLoc = DebugLoc::getUnknownLoc(); - HasTailCall = false; -} - -/// getRoot - Return the current virtual root of the Selection DAG, -/// flushing any PendingLoad items. This must be done before emitting -/// a store or any other node that may need to be ordered after any -/// prior load instructions. 
-/// -SDValue SelectionDAGLowering::getRoot() { - if (PendingLoads.empty()) - return DAG.getRoot(); - - if (PendingLoads.size() == 1) { - SDValue Root = PendingLoads[0]; - DAG.setRoot(Root); - PendingLoads.clear(); - return Root; - } - - // Otherwise, we have to make a token factor node. - SDValue Root = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), MVT::Other, - &PendingLoads[0], PendingLoads.size()); - PendingLoads.clear(); - DAG.setRoot(Root); - return Root; -} - -/// getControlRoot - Similar to getRoot, but instead of flushing all the -/// PendingLoad items, flush all the PendingExports items. It is necessary -/// to do this before emitting a terminator instruction. -/// -SDValue SelectionDAGLowering::getControlRoot() { - SDValue Root = DAG.getRoot(); - - if (PendingExports.empty()) - return Root; - - // Turn all of the CopyToReg chains into one factored node. - if (Root.getOpcode() != ISD::EntryToken) { - unsigned i = 0, e = PendingExports.size(); - for (; i != e; ++i) { - assert(PendingExports[i].getNode()->getNumOperands() > 1); - if (PendingExports[i].getNode()->getOperand(0) == Root) - break; // Don't add the root if we already indirectly depend on it. - } - - if (i == e) - PendingExports.push_back(Root); - } - - Root = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), MVT::Other, - &PendingExports[0], - PendingExports.size()); - PendingExports.clear(); - DAG.setRoot(Root); - return Root; -} - -void SelectionDAGLowering::visit(Instruction &I) { - visit(I.getOpcode(), I); -} - -void SelectionDAGLowering::visit(unsigned Opcode, User &I) { - // Note: this doesn't use InstVisitor, because it has to work with - // ConstantExpr's in addition to instructions. - switch (Opcode) { - default: llvm_unreachable("Unknown instruction type encountered!"); - // Build the switch statement using the Instruction.def file. 
-#define HANDLE_INST(NUM, OPCODE, CLASS) \ - case Instruction::OPCODE:return visit##OPCODE((CLASS&)I); -#include "llvm/Instruction.def" - } -} - -SDValue SelectionDAGLowering::getValue(const Value *V) { - SDValue &N = NodeMap[V]; - if (N.getNode()) return N; - - if (Constant *C = const_cast(dyn_cast(V))) { - EVT VT = TLI.getValueType(V->getType(), true); - - if (ConstantInt *CI = dyn_cast(C)) - return N = DAG.getConstant(*CI, VT); - - if (GlobalValue *GV = dyn_cast(C)) - return N = DAG.getGlobalAddress(GV, VT); - - if (isa(C)) - return N = DAG.getConstant(0, TLI.getPointerTy()); - - if (ConstantFP *CFP = dyn_cast(C)) - return N = DAG.getConstantFP(*CFP, VT); - - if (isa(C) && !V->getType()->isAggregateType()) - return N = DAG.getUNDEF(VT); - - if (ConstantExpr *CE = dyn_cast(C)) { - visit(CE->getOpcode(), *CE); - SDValue N1 = NodeMap[V]; - assert(N1.getNode() && "visit didn't populate the ValueMap!"); - return N1; - } - - if (isa(C) || isa(C)) { - SmallVector Constants; - for (User::const_op_iterator OI = C->op_begin(), OE = C->op_end(); - OI != OE; ++OI) { - SDNode *Val = getValue(*OI).getNode(); - // If the operand is an empty aggregate, there are no values. - if (!Val) continue; - // Add each leaf value from the operand to the Constants list - // to form a flattened list of all the values. 
- for (unsigned i = 0, e = Val->getNumValues(); i != e; ++i) - Constants.push_back(SDValue(Val, i)); - } - return DAG.getMergeValues(&Constants[0], Constants.size(), - getCurDebugLoc()); - } - - if (isa(C->getType()) || isa(C->getType())) { - assert((isa(C) || isa(C)) && - "Unknown struct or array constant!"); - - SmallVector ValueVTs; - ComputeValueVTs(TLI, C->getType(), ValueVTs); - unsigned NumElts = ValueVTs.size(); - if (NumElts == 0) - return SDValue(); // empty struct - SmallVector Constants(NumElts); - for (unsigned i = 0; i != NumElts; ++i) { - EVT EltVT = ValueVTs[i]; - if (isa(C)) - Constants[i] = DAG.getUNDEF(EltVT); - else if (EltVT.isFloatingPoint()) - Constants[i] = DAG.getConstantFP(0, EltVT); - else - Constants[i] = DAG.getConstant(0, EltVT); - } - return DAG.getMergeValues(&Constants[0], NumElts, getCurDebugLoc()); - } - - if (BlockAddress *BA = dyn_cast(C)) - return DAG.getBlockAddress(BA, VT); - - const VectorType *VecTy = cast(V->getType()); - unsigned NumElements = VecTy->getNumElements(); - - // Now that we know the number and type of the elements, get that number of - // elements into the Ops array based on what kind of constant it is. - SmallVector Ops; - if (ConstantVector *CP = dyn_cast(C)) { - for (unsigned i = 0; i != NumElements; ++i) - Ops.push_back(getValue(CP->getOperand(i))); - } else { - assert(isa(C) && "Unknown vector constant!"); - EVT EltVT = TLI.getValueType(VecTy->getElementType()); - - SDValue Op; - if (EltVT.isFloatingPoint()) - Op = DAG.getConstantFP(0, EltVT); - else - Op = DAG.getConstant(0, EltVT); - Ops.assign(NumElements, Op); - } - - // Create a BUILD_VECTOR node. - return NodeMap[V] = DAG.getNode(ISD::BUILD_VECTOR, getCurDebugLoc(), - VT, &Ops[0], Ops.size()); - } - - // If this is a static alloca, generate it as the frameindex instead of - // computation. 
- if (const AllocaInst *AI = dyn_cast(V)) { - DenseMap::iterator SI = - FuncInfo.StaticAllocaMap.find(AI); - if (SI != FuncInfo.StaticAllocaMap.end()) - return DAG.getFrameIndex(SI->second, TLI.getPointerTy()); - } - - unsigned InReg = FuncInfo.ValueMap[V]; - assert(InReg && "Value not in map!"); - - RegsForValue RFV(*DAG.getContext(), TLI, InReg, V->getType()); - SDValue Chain = DAG.getEntryNode(); - return RFV.getCopyFromRegs(DAG, getCurDebugLoc(), Chain, NULL); -} - -/// Get the EVTs and ArgFlags collections that represent the return type -/// of the given function. This does not require a DAG or a return value, and -/// is suitable for use before any DAGs for the function are constructed. -static void getReturnInfo(const Type* ReturnType, - Attributes attr, SmallVectorImpl &OutVTs, - SmallVectorImpl &OutFlags, - TargetLowering &TLI, - SmallVectorImpl *Offsets = 0) { - SmallVector ValueVTs; - ComputeValueVTs(TLI, ReturnType, ValueVTs, Offsets); - unsigned NumValues = ValueVTs.size(); - if ( NumValues == 0 ) return; - - for (unsigned j = 0, f = NumValues; j != f; ++j) { - EVT VT = ValueVTs[j]; - ISD::NodeType ExtendKind = ISD::ANY_EXTEND; - - if (attr & Attribute::SExt) - ExtendKind = ISD::SIGN_EXTEND; - else if (attr & Attribute::ZExt) - ExtendKind = ISD::ZERO_EXTEND; - - // FIXME: C calling convention requires the return type to be promoted to - // at least 32-bit. But this is not necessary for non-C calling - // conventions. The frontend should mark functions whose return values - // require promoting with signext or zeroext attributes. 
- if (ExtendKind != ISD::ANY_EXTEND && VT.isInteger()) { - EVT MinVT = TLI.getRegisterType(ReturnType->getContext(), MVT::i32); - if (VT.bitsLT(MinVT)) - VT = MinVT; - } - - unsigned NumParts = TLI.getNumRegisters(ReturnType->getContext(), VT); - EVT PartVT = TLI.getRegisterType(ReturnType->getContext(), VT); - // 'inreg' on function refers to return value - ISD::ArgFlagsTy Flags = ISD::ArgFlagsTy(); - if (attr & Attribute::InReg) - Flags.setInReg(); - - // Propagate extension type if any - if (attr & Attribute::SExt) - Flags.setSExt(); - else if (attr & Attribute::ZExt) - Flags.setZExt(); - - for (unsigned i = 0; i < NumParts; ++i) { - OutVTs.push_back(PartVT); - OutFlags.push_back(Flags); - } - } -} - -void SelectionDAGLowering::visitRet(ReturnInst &I) { - SDValue Chain = getControlRoot(); - SmallVector Outs; - FunctionLoweringInfo &FLI = DAG.getFunctionLoweringInfo(); - - if (!FLI.CanLowerReturn) { - unsigned DemoteReg = FLI.DemoteRegister; - const Function *F = I.getParent()->getParent(); - - // Emit a store of the return value through the virtual register. - // Leave Outs empty so that LowerReturn won't try to load return - // registers the usual way. 
- SmallVector PtrValueVTs; - ComputeValueVTs(TLI, PointerType::getUnqual(F->getReturnType()), - PtrValueVTs); - - SDValue RetPtr = DAG.getRegister(DemoteReg, PtrValueVTs[0]); - SDValue RetOp = getValue(I.getOperand(0)); - - SmallVector ValueVTs; - SmallVector Offsets; - ComputeValueVTs(TLI, I.getOperand(0)->getType(), ValueVTs, &Offsets); - unsigned NumValues = ValueVTs.size(); - - SmallVector Chains(NumValues); - EVT PtrVT = PtrValueVTs[0]; - for (unsigned i = 0; i != NumValues; ++i) - Chains[i] = DAG.getStore(Chain, getCurDebugLoc(), - SDValue(RetOp.getNode(), RetOp.getResNo() + i), - DAG.getNode(ISD::ADD, getCurDebugLoc(), PtrVT, RetPtr, - DAG.getConstant(Offsets[i], PtrVT)), - NULL, Offsets[i], false, 0); - Chain = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), - MVT::Other, &Chains[0], NumValues); - } - else { - for (unsigned i = 0, e = I.getNumOperands(); i != e; ++i) { - SmallVector ValueVTs; - ComputeValueVTs(TLI, I.getOperand(i)->getType(), ValueVTs); - unsigned NumValues = ValueVTs.size(); - if (NumValues == 0) continue; - - SDValue RetOp = getValue(I.getOperand(i)); - for (unsigned j = 0, f = NumValues; j != f; ++j) { - EVT VT = ValueVTs[j]; - - ISD::NodeType ExtendKind = ISD::ANY_EXTEND; - - const Function *F = I.getParent()->getParent(); - if (F->paramHasAttr(0, Attribute::SExt)) - ExtendKind = ISD::SIGN_EXTEND; - else if (F->paramHasAttr(0, Attribute::ZExt)) - ExtendKind = ISD::ZERO_EXTEND; - - // FIXME: C calling convention requires the return type to be promoted to - // at least 32-bit. But this is not necessary for non-C calling - // conventions. The frontend should mark functions whose return values - // require promoting with signext or zeroext attributes. 
- if (ExtendKind != ISD::ANY_EXTEND && VT.isInteger()) { - EVT MinVT = TLI.getRegisterType(*DAG.getContext(), MVT::i32); - if (VT.bitsLT(MinVT)) - VT = MinVT; - } - - unsigned NumParts = TLI.getNumRegisters(*DAG.getContext(), VT); - EVT PartVT = TLI.getRegisterType(*DAG.getContext(), VT); - SmallVector Parts(NumParts); - getCopyToParts(DAG, getCurDebugLoc(), - SDValue(RetOp.getNode(), RetOp.getResNo() + j), - &Parts[0], NumParts, PartVT, ExtendKind); - - // 'inreg' on function refers to return value - ISD::ArgFlagsTy Flags = ISD::ArgFlagsTy(); - if (F->paramHasAttr(0, Attribute::InReg)) - Flags.setInReg(); - - // Propagate extension type if any - if (F->paramHasAttr(0, Attribute::SExt)) - Flags.setSExt(); - else if (F->paramHasAttr(0, Attribute::ZExt)) - Flags.setZExt(); - - for (unsigned i = 0; i < NumParts; ++i) - Outs.push_back(ISD::OutputArg(Flags, Parts[i], /*isfixed=*/true)); - } - } - } - - bool isVarArg = DAG.getMachineFunction().getFunction()->isVarArg(); - CallingConv::ID CallConv = - DAG.getMachineFunction().getFunction()->getCallingConv(); - Chain = TLI.LowerReturn(Chain, CallConv, isVarArg, - Outs, getCurDebugLoc(), DAG); - - // Verify that the target's LowerReturn behaved as expected. - assert(Chain.getNode() && Chain.getValueType() == MVT::Other && - "LowerReturn didn't return a valid chain!"); - - // Update the DAG with the new chain value resulting from return lowering. - DAG.setRoot(Chain); -} - -/// CopyToExportRegsIfNeeded - If the given value has virtual registers -/// created for it, emit nodes to copy the value into the virtual -/// registers. 
-void SelectionDAGLowering::CopyToExportRegsIfNeeded(Value *V) { - if (!V->use_empty()) { - DenseMap::iterator VMI = FuncInfo.ValueMap.find(V); - if (VMI != FuncInfo.ValueMap.end()) - CopyValueToVirtualRegister(V, VMI->second); - } -} - -/// ExportFromCurrentBlock - If this condition isn't known to be exported from -/// the current basic block, add it to ValueMap now so that we'll get a -/// CopyTo/FromReg. -void SelectionDAGLowering::ExportFromCurrentBlock(Value *V) { - // No need to export constants. - if (!isa(V) && !isa(V)) return; - - // Already exported? - if (FuncInfo.isExportedInst(V)) return; - - unsigned Reg = FuncInfo.InitializeRegForValue(V); - CopyValueToVirtualRegister(V, Reg); -} - -bool SelectionDAGLowering::isExportableFromCurrentBlock(Value *V, - const BasicBlock *FromBB) { - // The operands of the setcc have to be in this block. We don't know - // how to export them from some other block. - if (Instruction *VI = dyn_cast(V)) { - // Can export from current BB. - if (VI->getParent() == FromBB) - return true; - - // Is already exported, noop. - return FuncInfo.isExportedInst(V); - } - - // If this is an argument, we can export it if the BB is the entry block or - // if it is already exported. - if (isa(V)) { - if (FromBB == &FromBB->getParent()->getEntryBlock()) - return true; - - // Otherwise, can only export this if it is already exported. - return FuncInfo.isExportedInst(V); - } - - // Otherwise, constants can always be exported. - return true; -} - -static bool InBlock(const Value *V, const BasicBlock *BB) { - if (const Instruction *I = dyn_cast(V)) - return I->getParent() == BB; - return true; -} - -/// getFCmpCondCode - Return the ISD condition code corresponding to -/// the given LLVM IR floating-point condition code. This includes -/// consideration of global floating-point math flags. 
-/// -static ISD::CondCode getFCmpCondCode(FCmpInst::Predicate Pred) { - ISD::CondCode FPC, FOC; - switch (Pred) { - case FCmpInst::FCMP_FALSE: FOC = FPC = ISD::SETFALSE; break; - case FCmpInst::FCMP_OEQ: FOC = ISD::SETEQ; FPC = ISD::SETOEQ; break; - case FCmpInst::FCMP_OGT: FOC = ISD::SETGT; FPC = ISD::SETOGT; break; - case FCmpInst::FCMP_OGE: FOC = ISD::SETGE; FPC = ISD::SETOGE; break; - case FCmpInst::FCMP_OLT: FOC = ISD::SETLT; FPC = ISD::SETOLT; break; - case FCmpInst::FCMP_OLE: FOC = ISD::SETLE; FPC = ISD::SETOLE; break; - case FCmpInst::FCMP_ONE: FOC = ISD::SETNE; FPC = ISD::SETONE; break; - case FCmpInst::FCMP_ORD: FOC = FPC = ISD::SETO; break; - case FCmpInst::FCMP_UNO: FOC = FPC = ISD::SETUO; break; - case FCmpInst::FCMP_UEQ: FOC = ISD::SETEQ; FPC = ISD::SETUEQ; break; - case FCmpInst::FCMP_UGT: FOC = ISD::SETGT; FPC = ISD::SETUGT; break; - case FCmpInst::FCMP_UGE: FOC = ISD::SETGE; FPC = ISD::SETUGE; break; - case FCmpInst::FCMP_ULT: FOC = ISD::SETLT; FPC = ISD::SETULT; break; - case FCmpInst::FCMP_ULE: FOC = ISD::SETLE; FPC = ISD::SETULE; break; - case FCmpInst::FCMP_UNE: FOC = ISD::SETNE; FPC = ISD::SETUNE; break; - case FCmpInst::FCMP_TRUE: FOC = FPC = ISD::SETTRUE; break; - default: - llvm_unreachable("Invalid FCmp predicate opcode!"); - FOC = FPC = ISD::SETFALSE; - break; - } - if (FiniteOnlyFPMath()) - return FOC; - else - return FPC; -} - -/// getICmpCondCode - Return the ISD condition code corresponding to -/// the given LLVM IR integer condition code. 
-/// -static ISD::CondCode getICmpCondCode(ICmpInst::Predicate Pred) { - switch (Pred) { - case ICmpInst::ICMP_EQ: return ISD::SETEQ; - case ICmpInst::ICMP_NE: return ISD::SETNE; - case ICmpInst::ICMP_SLE: return ISD::SETLE; - case ICmpInst::ICMP_ULE: return ISD::SETULE; - case ICmpInst::ICMP_SGE: return ISD::SETGE; - case ICmpInst::ICMP_UGE: return ISD::SETUGE; - case ICmpInst::ICMP_SLT: return ISD::SETLT; - case ICmpInst::ICMP_ULT: return ISD::SETULT; - case ICmpInst::ICMP_SGT: return ISD::SETGT; - case ICmpInst::ICMP_UGT: return ISD::SETUGT; - default: - llvm_unreachable("Invalid ICmp predicate opcode!"); - return ISD::SETNE; - } -} - -/// EmitBranchForMergedCondition - Helper method for FindMergedConditions. -/// This function emits a branch and is used at the leaves of an OR or an -/// AND operator tree. -/// -void -SelectionDAGLowering::EmitBranchForMergedCondition(Value *Cond, - MachineBasicBlock *TBB, - MachineBasicBlock *FBB, - MachineBasicBlock *CurBB) { - const BasicBlock *BB = CurBB->getBasicBlock(); - - // If the leaf of the tree is a comparison, merge the condition into - // the caseblock. - if (CmpInst *BOp = dyn_cast(Cond)) { - // The operands of the cmp have to be in this block. We don't know - // how to export them from some other block. If this is the first block - // of the sequence, no exporting is needed. - if (CurBB == CurMBB || - (isExportableFromCurrentBlock(BOp->getOperand(0), BB) && - isExportableFromCurrentBlock(BOp->getOperand(1), BB))) { - ISD::CondCode Condition; - if (ICmpInst *IC = dyn_cast(Cond)) { - Condition = getICmpCondCode(IC->getPredicate()); - } else if (FCmpInst *FC = dyn_cast(Cond)) { - Condition = getFCmpCondCode(FC->getPredicate()); - } else { - Condition = ISD::SETEQ; // silence warning. 
- llvm_unreachable("Unknown compare instruction"); - } - - CaseBlock CB(Condition, BOp->getOperand(0), - BOp->getOperand(1), NULL, TBB, FBB, CurBB); - SwitchCases.push_back(CB); - return; - } - } - - // Create a CaseBlock record representing this branch. - CaseBlock CB(ISD::SETEQ, Cond, ConstantInt::getTrue(*DAG.getContext()), - NULL, TBB, FBB, CurBB); - SwitchCases.push_back(CB); -} - -/// FindMergedConditions - If Cond is an expression like -void SelectionDAGLowering::FindMergedConditions(Value *Cond, - MachineBasicBlock *TBB, - MachineBasicBlock *FBB, - MachineBasicBlock *CurBB, - unsigned Opc) { - // If this node is not part of the or/and tree, emit it as a branch. - Instruction *BOp = dyn_cast(Cond); - if (!BOp || !(isa(BOp) || isa(BOp)) || - (unsigned)BOp->getOpcode() != Opc || !BOp->hasOneUse() || - BOp->getParent() != CurBB->getBasicBlock() || - !InBlock(BOp->getOperand(0), CurBB->getBasicBlock()) || - !InBlock(BOp->getOperand(1), CurBB->getBasicBlock())) { - EmitBranchForMergedCondition(Cond, TBB, FBB, CurBB); - return; - } - - // Create TmpBB after CurBB. - MachineFunction::iterator BBI = CurBB; - MachineFunction &MF = DAG.getMachineFunction(); - MachineBasicBlock *TmpBB = MF.CreateMachineBasicBlock(CurBB->getBasicBlock()); - CurBB->getParent()->insert(++BBI, TmpBB); - - if (Opc == Instruction::Or) { - // Codegen X | Y as: - // jmp_if_X TBB - // jmp TmpBB - // TmpBB: - // jmp_if_Y TBB - // jmp FBB - // - - // Emit the LHS condition. - FindMergedConditions(BOp->getOperand(0), TBB, TmpBB, CurBB, Opc); - - // Emit the RHS condition into TmpBB. - FindMergedConditions(BOp->getOperand(1), TBB, FBB, TmpBB, Opc); - } else { - assert(Opc == Instruction::And && "Unknown merge op!"); - // Codegen X & Y as: - // jmp_if_X TmpBB - // jmp FBB - // TmpBB: - // jmp_if_Y TBB - // jmp FBB - // - // This requires creation of TmpBB after CurBB. - - // Emit the LHS condition. 
- FindMergedConditions(BOp->getOperand(0), TmpBB, FBB, CurBB, Opc); - - // Emit the RHS condition into TmpBB. - FindMergedConditions(BOp->getOperand(1), TBB, FBB, TmpBB, Opc); - } -} - -/// If the set of cases should be emitted as a series of branches, return true. -/// If we should emit this as a bunch of and/or'd together conditions, return -/// false. -bool -SelectionDAGLowering::ShouldEmitAsBranches(const std::vector &Cases){ - if (Cases.size() != 2) return true; - - // If this is two comparisons of the same values or'd or and'd together, they - // will get folded into a single comparison, so don't emit two blocks. - if ((Cases[0].CmpLHS == Cases[1].CmpLHS && - Cases[0].CmpRHS == Cases[1].CmpRHS) || - (Cases[0].CmpRHS == Cases[1].CmpLHS && - Cases[0].CmpLHS == Cases[1].CmpRHS)) { - return false; - } - - return true; -} - -void SelectionDAGLowering::visitBr(BranchInst &I) { - // Update machine-CFG edges. - MachineBasicBlock *Succ0MBB = FuncInfo.MBBMap[I.getSuccessor(0)]; - - // Figure out which block is immediately after the current one. - MachineBasicBlock *NextBlock = 0; - MachineFunction::iterator BBI = CurMBB; - if (++BBI != FuncInfo.MF->end()) - NextBlock = BBI; - - if (I.isUnconditional()) { - // Update machine-CFG edges. - CurMBB->addSuccessor(Succ0MBB); - - // If this is not a fall-through branch, emit the branch. - if (Succ0MBB != NextBlock) - DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(), - MVT::Other, getControlRoot(), - DAG.getBasicBlock(Succ0MBB))); - return; - } - - // If this condition is one of the special cases we handle, do special stuff - // now. - Value *CondVal = I.getCondition(); - MachineBasicBlock *Succ1MBB = FuncInfo.MBBMap[I.getSuccessor(1)]; - - // If this is a series of conditions that are or'd or and'd together, emit - // this as a sequence of branches instead of setcc's with and/or operations. 
- // For example, instead of something like: - // cmp A, B - // C = seteq - // cmp D, E - // F = setle - // or C, F - // jnz foo - // Emit: - // cmp A, B - // je foo - // cmp D, E - // jle foo - // - if (BinaryOperator *BOp = dyn_cast(CondVal)) { - if (BOp->hasOneUse() && - (BOp->getOpcode() == Instruction::And || - BOp->getOpcode() == Instruction::Or)) { - FindMergedConditions(BOp, Succ0MBB, Succ1MBB, CurMBB, BOp->getOpcode()); - // If the compares in later blocks need to use values not currently - // exported from this block, export them now. This block should always - // be the first entry. - assert(SwitchCases[0].ThisBB == CurMBB && "Unexpected lowering!"); - - // Allow some cases to be rejected. - if (ShouldEmitAsBranches(SwitchCases)) { - for (unsigned i = 1, e = SwitchCases.size(); i != e; ++i) { - ExportFromCurrentBlock(SwitchCases[i].CmpLHS); - ExportFromCurrentBlock(SwitchCases[i].CmpRHS); - } - - // Emit the branch for this block. - visitSwitchCase(SwitchCases[0]); - SwitchCases.erase(SwitchCases.begin()); - return; - } - - // Okay, we decided not to do this, remove any inserted MBB's and clear - // SwitchCases. - for (unsigned i = 1, e = SwitchCases.size(); i != e; ++i) - FuncInfo.MF->erase(SwitchCases[i].ThisBB); - - SwitchCases.clear(); - } - } - - // Create a CaseBlock record representing this branch. - CaseBlock CB(ISD::SETEQ, CondVal, ConstantInt::getTrue(*DAG.getContext()), - NULL, Succ0MBB, Succ1MBB, CurMBB); - // Use visitSwitchCase to actually insert the fast branch sequence for this - // cond branch. - visitSwitchCase(CB); -} - -/// visitSwitchCase - Emits the necessary code to represent a single node in -/// the binary search tree resulting from lowering a switch instruction. -void SelectionDAGLowering::visitSwitchCase(CaseBlock &CB) { - SDValue Cond; - SDValue CondLHS = getValue(CB.CmpLHS); - DebugLoc dl = getCurDebugLoc(); - - // Build the setcc now. 
- if (CB.CmpMHS == NULL) { - // Fold "(X == true)" to X and "(X == false)" to !X to - // handle common cases produced by branch lowering. - if (CB.CmpRHS == ConstantInt::getTrue(*DAG.getContext()) && - CB.CC == ISD::SETEQ) - Cond = CondLHS; - else if (CB.CmpRHS == ConstantInt::getFalse(*DAG.getContext()) && - CB.CC == ISD::SETEQ) { - SDValue True = DAG.getConstant(1, CondLHS.getValueType()); - Cond = DAG.getNode(ISD::XOR, dl, CondLHS.getValueType(), CondLHS, True); - } else - Cond = DAG.getSetCC(dl, MVT::i1, CondLHS, getValue(CB.CmpRHS), CB.CC); - } else { - assert(CB.CC == ISD::SETLE && "Can handle only LE ranges now"); - - const APInt& Low = cast(CB.CmpLHS)->getValue(); - const APInt& High = cast(CB.CmpRHS)->getValue(); - - SDValue CmpOp = getValue(CB.CmpMHS); - EVT VT = CmpOp.getValueType(); - - if (cast(CB.CmpLHS)->isMinValue(true)) { - Cond = DAG.getSetCC(dl, MVT::i1, CmpOp, DAG.getConstant(High, VT), - ISD::SETLE); - } else { - SDValue SUB = DAG.getNode(ISD::SUB, dl, - VT, CmpOp, DAG.getConstant(Low, VT)); - Cond = DAG.getSetCC(dl, MVT::i1, SUB, - DAG.getConstant(High-Low, VT), ISD::SETULE); - } - } - - // Update successor info - CurMBB->addSuccessor(CB.TrueBB); - CurMBB->addSuccessor(CB.FalseBB); - - // Set NextBlock to be the MBB immediately after the current one, if any. - // This is used to avoid emitting unnecessary branches to the next block. - MachineBasicBlock *NextBlock = 0; - MachineFunction::iterator BBI = CurMBB; - if (++BBI != FuncInfo.MF->end()) - NextBlock = BBI; - - // If the lhs block is the next block, invert the condition so that we can - // fall through to the lhs instead of the rhs block. 
- if (CB.TrueBB == NextBlock) { - std::swap(CB.TrueBB, CB.FalseBB); - SDValue True = DAG.getConstant(1, Cond.getValueType()); - Cond = DAG.getNode(ISD::XOR, dl, Cond.getValueType(), Cond, True); - } - SDValue BrCond = DAG.getNode(ISD::BRCOND, dl, - MVT::Other, getControlRoot(), Cond, - DAG.getBasicBlock(CB.TrueBB)); - - // If the branch was constant folded, fix up the CFG. - if (BrCond.getOpcode() == ISD::BR) { - CurMBB->removeSuccessor(CB.FalseBB); - DAG.setRoot(BrCond); - } else { - // Otherwise, go ahead and insert the false branch. - if (BrCond == getControlRoot()) - CurMBB->removeSuccessor(CB.TrueBB); - - if (CB.FalseBB == NextBlock) - DAG.setRoot(BrCond); - else - DAG.setRoot(DAG.getNode(ISD::BR, dl, MVT::Other, BrCond, - DAG.getBasicBlock(CB.FalseBB))); - } -} - -/// visitJumpTable - Emit JumpTable node in the current MBB -void SelectionDAGLowering::visitJumpTable(JumpTable &JT) { - // Emit the code for the jump table - assert(JT.Reg != -1U && "Should lower JT Header first!"); - EVT PTy = TLI.getPointerTy(); - SDValue Index = DAG.getCopyFromReg(getControlRoot(), getCurDebugLoc(), - JT.Reg, PTy); - SDValue Table = DAG.getJumpTable(JT.JTI, PTy); - DAG.setRoot(DAG.getNode(ISD::BR_JT, getCurDebugLoc(), - MVT::Other, Index.getValue(1), - Table, Index)); -} - -/// visitJumpTableHeader - This function emits necessary code to produce index -/// in the JumpTable from switch case. -void SelectionDAGLowering::visitJumpTableHeader(JumpTable &JT, - JumpTableHeader &JTH) { - // Subtract the lowest switch case value from the value being switched on and - // conditional branch to default mbb if the result is greater than the - // difference between smallest and largest cases. 
- SDValue SwitchOp = getValue(JTH.SValue); - EVT VT = SwitchOp.getValueType(); - SDValue SUB = DAG.getNode(ISD::SUB, getCurDebugLoc(), VT, SwitchOp, - DAG.getConstant(JTH.First, VT)); - - // The SDNode we just created, which holds the value being switched on minus - // the smallest case value, needs to be copied to a virtual register so it - // can be used as an index into the jump table in a subsequent basic block. - // This value may be smaller or larger than the target's pointer type, and - // therefore require extension or truncating. - SwitchOp = DAG.getZExtOrTrunc(SUB, getCurDebugLoc(), TLI.getPointerTy()); - - unsigned JumpTableReg = FuncInfo.MakeReg(TLI.getPointerTy()); - SDValue CopyTo = DAG.getCopyToReg(getControlRoot(), getCurDebugLoc(), - JumpTableReg, SwitchOp); - JT.Reg = JumpTableReg; - - // Emit the range check for the jump table, and branch to the default block - // for the switch statement if the value being switched on exceeds the largest - // case in the switch. - SDValue CMP = DAG.getSetCC(getCurDebugLoc(), - TLI.getSetCCResultType(SUB.getValueType()), SUB, - DAG.getConstant(JTH.Last-JTH.First,VT), - ISD::SETUGT); - - // Set NextBlock to be the MBB immediately after the current one, if any. - // This is used to avoid emitting unnecessary branches to the next block. 
- MachineBasicBlock *NextBlock = 0; - MachineFunction::iterator BBI = CurMBB; - if (++BBI != FuncInfo.MF->end()) - NextBlock = BBI; - - SDValue BrCond = DAG.getNode(ISD::BRCOND, getCurDebugLoc(), - MVT::Other, CopyTo, CMP, - DAG.getBasicBlock(JT.Default)); - - if (JT.MBB == NextBlock) - DAG.setRoot(BrCond); - else - DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(), MVT::Other, BrCond, - DAG.getBasicBlock(JT.MBB))); -} - -/// visitBitTestHeader - This function emits necessary code to produce value -/// suitable for "bit tests" -void SelectionDAGLowering::visitBitTestHeader(BitTestBlock &B) { - // Subtract the minimum value - SDValue SwitchOp = getValue(B.SValue); - EVT VT = SwitchOp.getValueType(); - SDValue SUB = DAG.getNode(ISD::SUB, getCurDebugLoc(), VT, SwitchOp, - DAG.getConstant(B.First, VT)); - - // Check range - SDValue RangeCmp = DAG.getSetCC(getCurDebugLoc(), - TLI.getSetCCResultType(SUB.getValueType()), - SUB, DAG.getConstant(B.Range, VT), - ISD::SETUGT); - - SDValue ShiftOp = DAG.getZExtOrTrunc(SUB, getCurDebugLoc(), TLI.getPointerTy()); - - B.Reg = FuncInfo.MakeReg(TLI.getPointerTy()); - SDValue CopyTo = DAG.getCopyToReg(getControlRoot(), getCurDebugLoc(), - B.Reg, ShiftOp); - - // Set NextBlock to be the MBB immediately after the current one, if any. - // This is used to avoid emitting unnecessary branches to the next block. 
- MachineBasicBlock *NextBlock = 0; - MachineFunction::iterator BBI = CurMBB; - if (++BBI != FuncInfo.MF->end()) - NextBlock = BBI; - - MachineBasicBlock* MBB = B.Cases[0].ThisBB; - - CurMBB->addSuccessor(B.Default); - CurMBB->addSuccessor(MBB); - - SDValue BrRange = DAG.getNode(ISD::BRCOND, getCurDebugLoc(), - MVT::Other, CopyTo, RangeCmp, - DAG.getBasicBlock(B.Default)); - - if (MBB == NextBlock) - DAG.setRoot(BrRange); - else - DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(), MVT::Other, CopyTo, - DAG.getBasicBlock(MBB))); -} - -/// visitBitTestCase - this function produces one "bit test" -void SelectionDAGLowering::visitBitTestCase(MachineBasicBlock* NextMBB, - unsigned Reg, - BitTestCase &B) { - // Make desired shift - SDValue ShiftOp = DAG.getCopyFromReg(getControlRoot(), getCurDebugLoc(), Reg, - TLI.getPointerTy()); - SDValue SwitchVal = DAG.getNode(ISD::SHL, getCurDebugLoc(), - TLI.getPointerTy(), - DAG.getConstant(1, TLI.getPointerTy()), - ShiftOp); - - // Emit bit tests and jumps - SDValue AndOp = DAG.getNode(ISD::AND, getCurDebugLoc(), - TLI.getPointerTy(), SwitchVal, - DAG.getConstant(B.Mask, TLI.getPointerTy())); - SDValue AndCmp = DAG.getSetCC(getCurDebugLoc(), - TLI.getSetCCResultType(AndOp.getValueType()), - AndOp, DAG.getConstant(0, TLI.getPointerTy()), - ISD::SETNE); - - CurMBB->addSuccessor(B.TargetBB); - CurMBB->addSuccessor(NextMBB); - - SDValue BrAnd = DAG.getNode(ISD::BRCOND, getCurDebugLoc(), - MVT::Other, getControlRoot(), - AndCmp, DAG.getBasicBlock(B.TargetBB)); - - // Set NextBlock to be the MBB immediately after the current one, if any. - // This is used to avoid emitting unnecessary branches to the next block. 
- MachineBasicBlock *NextBlock = 0; - MachineFunction::iterator BBI = CurMBB; - if (++BBI != FuncInfo.MF->end()) - NextBlock = BBI; - - if (NextMBB == NextBlock) - DAG.setRoot(BrAnd); - else - DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(), MVT::Other, BrAnd, - DAG.getBasicBlock(NextMBB))); -} - -void SelectionDAGLowering::visitInvoke(InvokeInst &I) { - // Retrieve successors. - MachineBasicBlock *Return = FuncInfo.MBBMap[I.getSuccessor(0)]; - MachineBasicBlock *LandingPad = FuncInfo.MBBMap[I.getSuccessor(1)]; - - const Value *Callee(I.getCalledValue()); - if (isa(Callee)) - visitInlineAsm(&I); - else - LowerCallTo(&I, getValue(Callee), false, LandingPad); - - // If the value of the invoke is used outside of its defining block, make it - // available as a virtual register. - CopyToExportRegsIfNeeded(&I); - - // Update successor info - CurMBB->addSuccessor(Return); - CurMBB->addSuccessor(LandingPad); - - // Drop into normal successor. - DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(), - MVT::Other, getControlRoot(), - DAG.getBasicBlock(Return))); -} - -void SelectionDAGLowering::visitUnwind(UnwindInst &I) { -} - -/// handleSmallSwitchCaseRange - Emit a series of specific tests (suitable for -/// small case ranges). -bool SelectionDAGLowering::handleSmallSwitchRange(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default) { - Case& BackCase = *(CR.Range.second-1); - - // Size is the number of Cases represented by this range. - size_t Size = CR.Range.second - CR.Range.first; - if (Size > 3) - return false; - - // Get the MachineFunction which holds the current MBB. This is used when - // inserting any additional MBBs necessary to represent the switch. - MachineFunction *CurMF = FuncInfo.MF; - - // Figure out which block is immediately after the current one. 
- MachineBasicBlock *NextBlock = 0; - MachineFunction::iterator BBI = CR.CaseBB; - - if (++BBI != FuncInfo.MF->end()) - NextBlock = BBI; - - // TODO: If any two of the cases have the same destination, and if one value - // is the same as the other, but has one bit unset that the other has set, - // use bit manipulation to do two compares at once. For example: - // "if (X == 6 || X == 4)" -> "if ((X|2) == 6)" - - // Rearrange the case blocks so that the last one falls through if possible. - if (NextBlock && Default != NextBlock && BackCase.BB != NextBlock) { - // The last case block won't fall through into 'NextBlock' if we emit the - // branches in this order. See if rearranging a case value would help. - for (CaseItr I = CR.Range.first, E = CR.Range.second-1; I != E; ++I) { - if (I->BB == NextBlock) { - std::swap(*I, BackCase); - break; - } - } - } - - // Create a CaseBlock record representing a conditional branch to - // the Case's target mbb if the value being switched on SV is equal - // to C. - MachineBasicBlock *CurBlock = CR.CaseBB; - for (CaseItr I = CR.Range.first, E = CR.Range.second; I != E; ++I) { - MachineBasicBlock *FallThrough; - if (I != E-1) { - FallThrough = CurMF->CreateMachineBasicBlock(CurBlock->getBasicBlock()); - CurMF->insert(BBI, FallThrough); - - // Put SV in a virtual register to make it available from the new blocks. - ExportFromCurrentBlock(SV); - } else { - // If the last case doesn't match, go to the default block. - FallThrough = Default; - } - - Value *RHS, *LHS, *MHS; - ISD::CondCode CC; - if (I->High == I->Low) { - // This is just a small case range :) containing exactly 1 case - CC = ISD::SETEQ; - LHS = SV; RHS = I->High; MHS = NULL; - } else { - CC = ISD::SETLE; - LHS = I->Low; MHS = SV; RHS = I->High; - } - CaseBlock CB(CC, LHS, RHS, MHS, I->BB, FallThrough, CurBlock); - - // If emitting the first comparison, just call visitSwitchCase to emit the - // code into the current block.
Otherwise, push the CaseBlock onto the - // vector to be later processed by SDISel, and insert the node's MBB - // before the next MBB. - if (CurBlock == CurMBB) - visitSwitchCase(CB); - else - SwitchCases.push_back(CB); - - CurBlock = FallThrough; - } - - return true; -} - -static inline bool areJTsAllowed(const TargetLowering &TLI) { - return !DisableJumpTables && - (TLI.isOperationLegalOrCustom(ISD::BR_JT, MVT::Other) || - TLI.isOperationLegalOrCustom(ISD::BRIND, MVT::Other)); -} - -static APInt ComputeRange(const APInt &First, const APInt &Last) { - APInt LastExt(Last), FirstExt(First); - uint32_t BitWidth = std::max(Last.getBitWidth(), First.getBitWidth()) + 1; - LastExt.sext(BitWidth); FirstExt.sext(BitWidth); - return (LastExt - FirstExt + 1ULL); -} - -/// handleJTSwitchCase - Emit jumptable for current switch case range -bool SelectionDAGLowering::handleJTSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default) { - Case& FrontCase = *CR.Range.first; - Case& BackCase = *(CR.Range.second-1); - - const APInt &First = cast(FrontCase.Low)->getValue(); - const APInt &Last = cast(BackCase.High)->getValue(); - - APInt TSize(First.getBitWidth(), 0); - for (CaseItr I = CR.Range.first, E = CR.Range.second; - I!=E; ++I) - TSize += I->size(); - - if (!areJTsAllowed(TLI) || TSize.ult(APInt(First.getBitWidth(), 4))) - return false; - - APInt Range = ComputeRange(First, Last); - double Density = TSize.roundToDouble() / Range.roundToDouble(); - if (Density < 0.4) - return false; - - DEBUG(errs() << "Lowering jump table\n" - << "First entry: " << First << ". Last entry: " << Last << '\n' - << "Range: " << Range - << "Size: " << TSize << ". Density: " << Density << "\n\n"); - - // Get the MachineFunction which holds the current MBB. This is used when - // inserting any additional MBBs necessary to represent the switch. - MachineFunction *CurMF = FuncInfo.MF; - - // Figure out which block is immediately after the current one. 
- MachineFunction::iterator BBI = CR.CaseBB; - ++BBI; - - const BasicBlock *LLVMBB = CR.CaseBB->getBasicBlock(); - - // Create a new basic block to hold the code for loading the address - // of the jump table, and jumping to it. Update successor information; - // we will either branch to the default case for the switch, or the jump - // table. - MachineBasicBlock *JumpTableBB = CurMF->CreateMachineBasicBlock(LLVMBB); - CurMF->insert(BBI, JumpTableBB); - CR.CaseBB->addSuccessor(Default); - CR.CaseBB->addSuccessor(JumpTableBB); - - // Build a vector of destination BBs, corresponding to each target - // of the jump table. If the value of the jump table slot corresponds to - // a case statement, push the case's BB onto the vector, otherwise, push - // the default BB. - std::vector DestBBs; - APInt TEI = First; - for (CaseItr I = CR.Range.first, E = CR.Range.second; I != E; ++TEI) { - const APInt& Low = cast(I->Low)->getValue(); - const APInt& High = cast(I->High)->getValue(); - - if (Low.sle(TEI) && TEI.sle(High)) { - DestBBs.push_back(I->BB); - if (TEI==High) - ++I; - } else { - DestBBs.push_back(Default); - } - } - - // Update successor info. Add one edge to each unique successor. - BitVector SuccsHandled(CR.CaseBB->getParent()->getNumBlockIDs()); - for (std::vector::iterator I = DestBBs.begin(), - E = DestBBs.end(); I != E; ++I) { - if (!SuccsHandled[(*I)->getNumber()]) { - SuccsHandled[(*I)->getNumber()] = true; - JumpTableBB->addSuccessor(*I); - } - } - - // Create a jump table index for this jump table, or return an existing - // one. 
- unsigned JTI = CurMF->getJumpTableInfo()->getJumpTableIndex(DestBBs); - - // Set the jump table information so that we can codegen it as a second - // MachineBasicBlock - JumpTable JT(-1U, JTI, JumpTableBB, Default); - JumpTableHeader JTH(First, Last, SV, CR.CaseBB, (CR.CaseBB == CurMBB)); - if (CR.CaseBB == CurMBB) - visitJumpTableHeader(JT, JTH); - - JTCases.push_back(JumpTableBlock(JTH, JT)); - - return true; -} - -/// handleBTSplitSwitchCase - emit comparison and split binary search tree into -/// 2 subtrees. -bool SelectionDAGLowering::handleBTSplitSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default) { - // Get the MachineFunction which holds the current MBB. This is used when - // inserting any additional MBBs necessary to represent the switch. - MachineFunction *CurMF = FuncInfo.MF; - - // Figure out which block is immediately after the current one. - MachineFunction::iterator BBI = CR.CaseBB; - ++BBI; - - Case& FrontCase = *CR.Range.first; - Case& BackCase = *(CR.Range.second-1); - const BasicBlock *LLVMBB = CR.CaseBB->getBasicBlock(); - - // Size is the number of Cases represented by this range. - unsigned Size = CR.Range.second - CR.Range.first; - - const APInt &First = cast(FrontCase.Low)->getValue(); - const APInt &Last = cast(BackCase.High)->getValue(); - double FMetric = 0; - CaseItr Pivot = CR.Range.first + Size/2; - - // Select optimal pivot, maximizing sum density of LHS and RHS. This will - // (heuristically) allow us to emit JumpTable's later. 
- APInt TSize(First.getBitWidth(), 0); - for (CaseItr I = CR.Range.first, E = CR.Range.second; - I!=E; ++I) - TSize += I->size(); - - APInt LSize = FrontCase.size(); - APInt RSize = TSize-LSize; - DEBUG(errs() << "Selecting best pivot: \n" - << "First: " << First << ", Last: " << Last <<'\n' - << "LSize: " << LSize << ", RSize: " << RSize << '\n'); - for (CaseItr I = CR.Range.first, J=I+1, E = CR.Range.second; - J!=E; ++I, ++J) { - const APInt &LEnd = cast(I->High)->getValue(); - const APInt &RBegin = cast(J->Low)->getValue(); - APInt Range = ComputeRange(LEnd, RBegin); - assert((Range - 2ULL).isNonNegative() && - "Invalid case distance"); - double LDensity = (double)LSize.roundToDouble() / - (LEnd - First + 1ULL).roundToDouble(); - double RDensity = (double)RSize.roundToDouble() / - (Last - RBegin + 1ULL).roundToDouble(); - double Metric = Range.logBase2()*(LDensity+RDensity); - // Should always split in some non-trivial place - DEBUG(errs() <<"=>Step\n" - << "LEnd: " << LEnd << ", RBegin: " << RBegin << '\n' - << "LDensity: " << LDensity - << ", RDensity: " << RDensity << '\n' - << "Metric: " << Metric << '\n'); - if (FMetric < Metric) { - Pivot = J; - FMetric = Metric; - DEBUG(errs() << "Current metric set to: " << FMetric << '\n'); - } - - LSize += J->size(); - RSize -= J->size(); - } - if (areJTsAllowed(TLI)) { - // If our case is dense we *really* should handle it earlier! - assert((FMetric > 0) && "Should handle dense range earlier!"); - } else { - Pivot = CR.Range.first + Size/2; - } - - CaseRange LHSR(CR.Range.first, Pivot); - CaseRange RHSR(Pivot, CR.Range.second); - Constant *C = Pivot->Low; - MachineBasicBlock *FalseBB = 0, *TrueBB = 0; - - // We know that we branch to the LHS if the Value being switched on is - // less than the Pivot value, C. 
We use this to optimize our binary - // tree a bit, by recognizing that if SV is greater than or equal to the - // LHS's Case Value, and that Case Value is exactly one less than the - // Pivot's Value, then we can branch directly to the LHS's Target, - // rather than creating a leaf node for it. - if ((LHSR.second - LHSR.first) == 1 && - LHSR.first->High == CR.GE && - cast<ConstantInt>(C)->getValue() == - (cast<ConstantInt>(CR.GE)->getValue() + 1LL)) { - TrueBB = LHSR.first->BB; - } else { - TrueBB = CurMF->CreateMachineBasicBlock(LLVMBB); - CurMF->insert(BBI, TrueBB); - WorkList.push_back(CaseRec(TrueBB, C, CR.GE, LHSR)); - - // Put SV in a virtual register to make it available from the new blocks. - ExportFromCurrentBlock(SV); - } - - // Similar to the optimization above, if the Value being switched on is - // known to be less than the Constant CR.LT, and the current Case Value - // is CR.LT - 1, then we can branch directly to the target block for - // the current Case Value, rather than emitting a RHS leaf node for it. - if ((RHSR.second - RHSR.first) == 1 && CR.LT && - cast<ConstantInt>(RHSR.first->Low)->getValue() == - (cast<ConstantInt>(CR.LT)->getValue() - 1LL)) { - FalseBB = RHSR.first->BB; - } else { - FalseBB = CurMF->CreateMachineBasicBlock(LLVMBB); - CurMF->insert(BBI, FalseBB); - WorkList.push_back(CaseRec(FalseBB,CR.LT,C,RHSR)); - - // Put SV in a virtual register to make it available from the new blocks. - ExportFromCurrentBlock(SV); - } - - // Create a CaseBlock record representing a conditional branch to - // the LHS node if the value being switched on SV is less than C. - // Otherwise, branch to LHS. - CaseBlock CB(ISD::SETLT, SV, C, NULL, TrueBB, FalseBB, CR.CaseBB); - - if (CR.CaseBB == CurMBB) - visitSwitchCase(CB); - else - SwitchCases.push_back(CB); - - return true; -} - -/// handleBitTestsSwitchCase - if current case range has few destination and -/// range span less, than machine word bitwidth, encode case range into series -/// of masks and emit bit tests with these masks.
-bool SelectionDAGLowering::handleBitTestsSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default){ - EVT PTy = TLI.getPointerTy(); - unsigned IntPtrBits = PTy.getSizeInBits(); - - Case& FrontCase = *CR.Range.first; - Case& BackCase = *(CR.Range.second-1); - - // Get the MachineFunction which holds the current MBB. This is used when - // inserting any additional MBBs necessary to represent the switch. - MachineFunction *CurMF = FuncInfo.MF; - - // If target does not have legal shift left, do not emit bit tests at all. - if (!TLI.isOperationLegal(ISD::SHL, TLI.getPointerTy())) - return false; - - size_t numCmps = 0; - for (CaseItr I = CR.Range.first, E = CR.Range.second; - I!=E; ++I) { - // Single case counts one, case range - two. - numCmps += (I->Low == I->High ? 1 : 2); - } - - // Count unique destinations - SmallSet Dests; - for (CaseItr I = CR.Range.first, E = CR.Range.second; I!=E; ++I) { - Dests.insert(I->BB); - if (Dests.size() > 3) - // Don't bother the code below, if there are too much unique destinations - return false; - } - DEBUG(errs() << "Total number of unique destinations: " << Dests.size() << '\n' - << "Total number of comparisons: " << numCmps << '\n'); - - // Compute span of values. - const APInt& minValue = cast(FrontCase.Low)->getValue(); - const APInt& maxValue = cast(BackCase.High)->getValue(); - APInt cmpRange = maxValue - minValue; - - DEBUG(errs() << "Compare range: " << cmpRange << '\n' - << "Low bound: " << minValue << '\n' - << "High bound: " << maxValue << '\n'); - - if (cmpRange.uge(APInt(cmpRange.getBitWidth(), IntPtrBits)) || - (!(Dests.size() == 1 && numCmps >= 3) && - !(Dests.size() == 2 && numCmps >= 5) && - !(Dests.size() >= 3 && numCmps >= 6))) - return false; - - DEBUG(errs() << "Emitting bit tests\n"); - APInt lowBound = APInt::getNullValue(cmpRange.getBitWidth()); - - // Optimize the case where all the case values fit in a - // word without having to subtract minValue. 
In this case, - // we can optimize away the subtraction. - if (minValue.isNonNegative() && - maxValue.slt(APInt(maxValue.getBitWidth(), IntPtrBits))) { - cmpRange = maxValue; - } else { - lowBound = minValue; - } - - CaseBitsVector CasesBits; - unsigned i, count = 0; - - for (CaseItr I = CR.Range.first, E = CR.Range.second; I!=E; ++I) { - MachineBasicBlock* Dest = I->BB; - for (i = 0; i < count; ++i) - if (Dest == CasesBits[i].BB) - break; - - if (i == count) { - assert((count < 3) && "Too much destinations to test!"); - CasesBits.push_back(CaseBits(0, Dest, 0)); - count++; - } - - const APInt& lowValue = cast(I->Low)->getValue(); - const APInt& highValue = cast(I->High)->getValue(); - - uint64_t lo = (lowValue - lowBound).getZExtValue(); - uint64_t hi = (highValue - lowBound).getZExtValue(); - - for (uint64_t j = lo; j <= hi; j++) { - CasesBits[i].Mask |= 1ULL << j; - CasesBits[i].Bits++; - } - - } - std::sort(CasesBits.begin(), CasesBits.end(), CaseBitsCmp()); - - BitTestInfo BTC; - - // Figure out which block is immediately after the current one. - MachineFunction::iterator BBI = CR.CaseBB; - ++BBI; - - const BasicBlock *LLVMBB = CR.CaseBB->getBasicBlock(); - - DEBUG(errs() << "Cases:\n"); - for (unsigned i = 0, e = CasesBits.size(); i!=e; ++i) { - DEBUG(errs() << "Mask: " << CasesBits[i].Mask - << ", Bits: " << CasesBits[i].Bits - << ", BB: " << CasesBits[i].BB << '\n'); - - MachineBasicBlock *CaseBB = CurMF->CreateMachineBasicBlock(LLVMBB); - CurMF->insert(BBI, CaseBB); - BTC.push_back(BitTestCase(CasesBits[i].Mask, - CaseBB, - CasesBits[i].BB)); - - // Put SV in a virtual register to make it available from the new blocks. 
- ExportFromCurrentBlock(SV); - } - - BitTestBlock BTB(lowBound, cmpRange, SV, - -1U, (CR.CaseBB == CurMBB), - CR.CaseBB, Default, BTC); - - if (CR.CaseBB == CurMBB) - visitBitTestHeader(BTB); - - BitTestCases.push_back(BTB); - - return true; -} - - -/// Clusterify - Transform simple list of Cases into list of CaseRange's -size_t SelectionDAGLowering::Clusterify(CaseVector& Cases, - const SwitchInst& SI) { - size_t numCmps = 0; - - // Start with "simple" cases - for (size_t i = 1; i < SI.getNumSuccessors(); ++i) { - MachineBasicBlock *SMBB = FuncInfo.MBBMap[SI.getSuccessor(i)]; - Cases.push_back(Case(SI.getSuccessorValue(i), - SI.getSuccessorValue(i), - SMBB)); - } - std::sort(Cases.begin(), Cases.end(), CaseCmp()); - - // Merge case into clusters - if (Cases.size() >= 2) - // Must recompute end() each iteration because it may be - // invalidated by erase if we hold on to it - for (CaseItr I = Cases.begin(), J = ++(Cases.begin()); J != Cases.end(); ) { - const APInt& nextValue = cast(J->Low)->getValue(); - const APInt& currentValue = cast(I->High)->getValue(); - MachineBasicBlock* nextBB = J->BB; - MachineBasicBlock* currentBB = I->BB; - - // If the two neighboring cases go to the same destination, merge them - // into a single case. - if ((nextValue - currentValue == 1) && (currentBB == nextBB)) { - I->High = J->High; - J = Cases.erase(J); - } else { - I = J++; - } - } - - for (CaseItr I=Cases.begin(), E=Cases.end(); I!=E; ++I, ++numCmps) { - if (I->Low != I->High) - // A range counts double, since it requires two compares. - ++numCmps; - } - - return numCmps; -} - -void SelectionDAGLowering::visitSwitch(SwitchInst &SI) { - // Figure out which block is immediately after the current one. - MachineBasicBlock *NextBlock = 0; - - MachineBasicBlock *Default = FuncInfo.MBBMap[SI.getDefaultDest()]; - - // If there is only the default destination, branch to it if it is not the - // next basic block. Otherwise, just fall through. 
- if (SI.getNumOperands() == 2) { - // Update machine-CFG edges. - - // If this is not a fall-through branch, emit the branch. - CurMBB->addSuccessor(Default); - if (Default != NextBlock) - DAG.setRoot(DAG.getNode(ISD::BR, getCurDebugLoc(), - MVT::Other, getControlRoot(), - DAG.getBasicBlock(Default))); - return; - } - - // If there are any non-default case statements, create a vector of Cases - // representing each one, and sort the vector so that we can efficiently - // create a binary search tree from them. - CaseVector Cases; - size_t numCmps = Clusterify(Cases, SI); - DEBUG(errs() << "Clusterify finished. Total clusters: " << Cases.size() - << ". Total compares: " << numCmps << '\n'); - numCmps = 0; - - // Get the Value to be switched on and default basic blocks, which will be - // inserted into CaseBlock records, representing basic blocks in the binary - // search tree. - Value *SV = SI.getOperand(0); - - // Push the initial CaseRec onto the worklist - CaseRecVector WorkList; - WorkList.push_back(CaseRec(CurMBB,0,0,CaseRange(Cases.begin(),Cases.end()))); - - while (!WorkList.empty()) { - // Grab a record representing a case range to process off the worklist - CaseRec CR = WorkList.back(); - WorkList.pop_back(); - - if (handleBitTestsSwitchCase(CR, WorkList, SV, Default)) - continue; - - // If the range has few cases (two or less) emit a series of specific - // tests. - if (handleSmallSwitchRange(CR, WorkList, SV, Default)) - continue; - - // If the switch has more than 5 blocks, and at least 40% dense, and the - // target supports indirect branches, then emit a jump table rather than - // lowering the switch to a binary tree of conditional branches. - if (handleJTSwitchCase(CR, WorkList, SV, Default)) - continue; - - // Emit binary tree. We need to pick a pivot, and push left and right ranges - // onto the worklist. Leafs are handled via handleSmallSwitchRange() call. 
- handleBTSplitSwitchCase(CR, WorkList, SV, Default); - } -} - -void SelectionDAGLowering::visitIndirectBr(IndirectBrInst &I) { - // Update machine-CFG edges. - for (unsigned i = 0, e = I.getNumSuccessors(); i != e; ++i) - CurMBB->addSuccessor(FuncInfo.MBBMap[I.getSuccessor(i)]); - - DAG.setRoot(DAG.getNode(ISD::BRIND, getCurDebugLoc(), - MVT::Other, getControlRoot(), - getValue(I.getAddress()))); -} - - -void SelectionDAGLowering::visitFSub(User &I) { - // -0.0 - X --> fneg - const Type *Ty = I.getType(); - if (isa<VectorType>(Ty)) { - if (ConstantVector *CV = dyn_cast<ConstantVector>(I.getOperand(0))) { - const VectorType *DestTy = cast<VectorType>(I.getType()); - const Type *ElTy = DestTy->getElementType(); - unsigned VL = DestTy->getNumElements(); - std::vector<Constant*> NZ(VL, ConstantFP::getNegativeZero(ElTy)); - Constant *CNZ = ConstantVector::get(&NZ[0], NZ.size()); - if (CV == CNZ) { - SDValue Op2 = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FNEG, getCurDebugLoc(), - Op2.getValueType(), Op2)); - return; - } - } - } - if (ConstantFP *CFP = dyn_cast<ConstantFP>(I.getOperand(0))) - if (CFP->isExactlyValue(ConstantFP::getNegativeZero(Ty)->getValueAPF())) { - SDValue Op2 = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FNEG, getCurDebugLoc(), - Op2.getValueType(), Op2)); - return; - } - - visitBinary(I, ISD::FSUB); -} - -void SelectionDAGLowering::visitBinary(User &I, unsigned OpCode) { - SDValue Op1 = getValue(I.getOperand(0)); - SDValue Op2 = getValue(I.getOperand(1)); - - setValue(&I, DAG.getNode(OpCode, getCurDebugLoc(), - Op1.getValueType(), Op1, Op2)); -} - -void SelectionDAGLowering::visitShift(User &I, unsigned Opcode) { - SDValue Op1 = getValue(I.getOperand(0)); - SDValue Op2 = getValue(I.getOperand(1)); - if (!isa<VectorType>(I.getType()) && - Op2.getValueType() != TLI.getShiftAmountTy()) { - // If the operand is smaller than the shift count type, promote it.
- EVT PTy = TLI.getPointerTy(); - EVT STy = TLI.getShiftAmountTy(); - if (STy.bitsGT(Op2.getValueType())) - Op2 = DAG.getNode(ISD::ANY_EXTEND, getCurDebugLoc(), - TLI.getShiftAmountTy(), Op2); - // If the operand is larger than the shift count type but the shift - // count type has enough bits to represent any shift value, truncate - // it now. This is a common case and it exposes the truncate to - // optimization early. - else if (STy.getSizeInBits() >= - Log2_32_Ceil(Op2.getValueType().getSizeInBits())) - Op2 = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), - TLI.getShiftAmountTy(), Op2); - // Otherwise we'll need to temporarily settle for some other - // convenient type; type legalization will make adjustments as - // needed. - else if (PTy.bitsLT(Op2.getValueType())) - Op2 = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), - TLI.getPointerTy(), Op2); - else if (PTy.bitsGT(Op2.getValueType())) - Op2 = DAG.getNode(ISD::ANY_EXTEND, getCurDebugLoc(), - TLI.getPointerTy(), Op2); - } - - setValue(&I, DAG.getNode(Opcode, getCurDebugLoc(), - Op1.getValueType(), Op1, Op2)); -} - -void SelectionDAGLowering::visitICmp(User &I) { - ICmpInst::Predicate predicate = ICmpInst::BAD_ICMP_PREDICATE; - if (ICmpInst *IC = dyn_cast<ICmpInst>(&I)) - predicate = IC->getPredicate(); - else if (ConstantExpr *IC = dyn_cast<ConstantExpr>(&I)) - predicate = ICmpInst::Predicate(IC->getPredicate()); - SDValue Op1 = getValue(I.getOperand(0)); - SDValue Op2 = getValue(I.getOperand(1)); - ISD::CondCode Opcode = getICmpCondCode(predicate); - - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getSetCC(getCurDebugLoc(), DestVT, Op1, Op2, Opcode)); -} - -void SelectionDAGLowering::visitFCmp(User &I) { - FCmpInst::Predicate predicate = FCmpInst::BAD_FCMP_PREDICATE; - if (FCmpInst *FC = dyn_cast<FCmpInst>(&I)) - predicate = FC->getPredicate(); - else if (ConstantExpr *FC = dyn_cast<ConstantExpr>(&I)) - predicate = FCmpInst::Predicate(FC->getPredicate()); - SDValue Op1 = getValue(I.getOperand(0)); - SDValue Op2 = getValue(I.getOperand(1));
- ISD::CondCode Condition = getFCmpCondCode(predicate); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getSetCC(getCurDebugLoc(), DestVT, Op1, Op2, Condition)); -} - -void SelectionDAGLowering::visitSelect(User &I) { - SmallVector ValueVTs; - ComputeValueVTs(TLI, I.getType(), ValueVTs); - unsigned NumValues = ValueVTs.size(); - if (NumValues != 0) { - SmallVector Values(NumValues); - SDValue Cond = getValue(I.getOperand(0)); - SDValue TrueVal = getValue(I.getOperand(1)); - SDValue FalseVal = getValue(I.getOperand(2)); - - for (unsigned i = 0; i != NumValues; ++i) - Values[i] = DAG.getNode(ISD::SELECT, getCurDebugLoc(), - TrueVal.getValueType(), Cond, - SDValue(TrueVal.getNode(), TrueVal.getResNo() + i), - SDValue(FalseVal.getNode(), FalseVal.getResNo() + i)); - - setValue(&I, DAG.getNode(ISD::MERGE_VALUES, getCurDebugLoc(), - DAG.getVTList(&ValueVTs[0], NumValues), - &Values[0], NumValues)); - } -} - - -void SelectionDAGLowering::visitTrunc(User &I) { - // TruncInst cannot be a no-op cast because sizeof(src) > sizeof(dest). - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitZExt(User &I) { - // ZExt cannot be a no-op cast because sizeof(src) < sizeof(dest). - // ZExt also can't be a cast to bool for same reason. So, nothing much to do - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::ZERO_EXTEND, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitSExt(User &I) { - // SExt cannot be a no-op cast because sizeof(src) < sizeof(dest). - // SExt also can't be a cast to bool for same reason. 
So, nothing much to do - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::SIGN_EXTEND, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitFPTrunc(User &I) { - // FPTrunc is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::FP_ROUND, getCurDebugLoc(), - DestVT, N, DAG.getIntPtrConstant(0))); -} - -void SelectionDAGLowering::visitFPExt(User &I){ - // FPTrunc is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::FP_EXTEND, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitFPToUI(User &I) { - // FPToUI is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::FP_TO_UINT, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitFPToSI(User &I) { - // FPToSI is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::FP_TO_SINT, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitUIToFP(User &I) { - // UIToFP is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::UINT_TO_FP, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitSIToFP(User &I){ - // SIToFP is never a no-op cast, no need to check - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getNode(ISD::SINT_TO_FP, getCurDebugLoc(), DestVT, N)); -} - -void SelectionDAGLowering::visitPtrToInt(User &I) { - // What to do depends on the size of the integer and the size of the pointer. 
- // We can either truncate, zero extend, or no-op, accordingly. - SDValue N = getValue(I.getOperand(0)); - EVT SrcVT = N.getValueType(); - EVT DestVT = TLI.getValueType(I.getType()); - SDValue Result = DAG.getZExtOrTrunc(N, getCurDebugLoc(), DestVT); - setValue(&I, Result); -} - -void SelectionDAGLowering::visitIntToPtr(User &I) { - // What to do depends on the size of the integer and the size of the pointer. - // We can either truncate, zero extend, or no-op, accordingly. - SDValue N = getValue(I.getOperand(0)); - EVT SrcVT = N.getValueType(); - EVT DestVT = TLI.getValueType(I.getType()); - setValue(&I, DAG.getZExtOrTrunc(N, getCurDebugLoc(), DestVT)); -} - -void SelectionDAGLowering::visitBitCast(User &I) { - SDValue N = getValue(I.getOperand(0)); - EVT DestVT = TLI.getValueType(I.getType()); - - // BitCast assures us that source and destination are the same size so this - // is either a BIT_CONVERT or a no-op. - if (DestVT != N.getValueType()) - setValue(&I, DAG.getNode(ISD::BIT_CONVERT, getCurDebugLoc(), - DestVT, N)); // convert types - else - setValue(&I, N); // noop cast. 
-} - -void SelectionDAGLowering::visitInsertElement(User &I) { - SDValue InVec = getValue(I.getOperand(0)); - SDValue InVal = getValue(I.getOperand(1)); - SDValue InIdx = DAG.getNode(ISD::ZERO_EXTEND, getCurDebugLoc(), - TLI.getPointerTy(), - getValue(I.getOperand(2))); - - setValue(&I, DAG.getNode(ISD::INSERT_VECTOR_ELT, getCurDebugLoc(), - TLI.getValueType(I.getType()), - InVec, InVal, InIdx)); -} - -void SelectionDAGLowering::visitExtractElement(User &I) { - SDValue InVec = getValue(I.getOperand(0)); - SDValue InIdx = DAG.getNode(ISD::ZERO_EXTEND, getCurDebugLoc(), - TLI.getPointerTy(), - getValue(I.getOperand(1))); - setValue(&I, DAG.getNode(ISD::EXTRACT_VECTOR_ELT, getCurDebugLoc(), - TLI.getValueType(I.getType()), InVec, InIdx)); -} - - -// Utility for visitShuffleVector - Returns true if the mask is mask starting -// from SIndx and increasing to the element length (undefs are allowed). -static bool SequentialMask(SmallVectorImpl &Mask, unsigned SIndx) { - unsigned MaskNumElts = Mask.size(); - for (unsigned i = 0; i != MaskNumElts; ++i) - if ((Mask[i] >= 0) && (Mask[i] != (int)(i + SIndx))) - return false; - return true; -} - -void SelectionDAGLowering::visitShuffleVector(User &I) { - SmallVector Mask; - SDValue Src1 = getValue(I.getOperand(0)); - SDValue Src2 = getValue(I.getOperand(1)); - - // Convert the ConstantVector mask operand into an array of ints, with -1 - // representing undef values. 
- SmallVector MaskElts; - cast(I.getOperand(2))->getVectorElements(*DAG.getContext(), - MaskElts); - unsigned MaskNumElts = MaskElts.size(); - for (unsigned i = 0; i != MaskNumElts; ++i) { - if (isa(MaskElts[i])) - Mask.push_back(-1); - else - Mask.push_back(cast(MaskElts[i])->getSExtValue()); - } - - EVT VT = TLI.getValueType(I.getType()); - EVT SrcVT = Src1.getValueType(); - unsigned SrcNumElts = SrcVT.getVectorNumElements(); - - if (SrcNumElts == MaskNumElts) { - setValue(&I, DAG.getVectorShuffle(VT, getCurDebugLoc(), Src1, Src2, - &Mask[0])); - return; - } - - // Normalize the shuffle vector since mask and vector length don't match. - if (SrcNumElts < MaskNumElts && MaskNumElts % SrcNumElts == 0) { - // Mask is longer than the source vectors and is a multiple of the source - // vectors. We can use concatenate vector to make the mask and vectors - // lengths match. - if (SrcNumElts*2 == MaskNumElts && SequentialMask(Mask, 0)) { - // The shuffle is concatenating two vectors together. - setValue(&I, DAG.getNode(ISD::CONCAT_VECTORS, getCurDebugLoc(), - VT, Src1, Src2)); - return; - } - - // Pad both vectors with undefs to make them the same length as the mask. - unsigned NumConcat = MaskNumElts / SrcNumElts; - bool Src1U = Src1.getOpcode() == ISD::UNDEF; - bool Src2U = Src2.getOpcode() == ISD::UNDEF; - SDValue UndefVal = DAG.getUNDEF(SrcVT); - - SmallVector MOps1(NumConcat, UndefVal); - SmallVector MOps2(NumConcat, UndefVal); - MOps1[0] = Src1; - MOps2[0] = Src2; - - Src1 = Src1U ? DAG.getUNDEF(VT) : DAG.getNode(ISD::CONCAT_VECTORS, - getCurDebugLoc(), VT, - &MOps1[0], NumConcat); - Src2 = Src2U ? DAG.getUNDEF(VT) : DAG.getNode(ISD::CONCAT_VECTORS, - getCurDebugLoc(), VT, - &MOps2[0], NumConcat); - - // Readjust mask for new input vector length. 
- SmallVector MappedOps; - for (unsigned i = 0; i != MaskNumElts; ++i) { - int Idx = Mask[i]; - if (Idx < (int)SrcNumElts) - MappedOps.push_back(Idx); - else - MappedOps.push_back(Idx + MaskNumElts - SrcNumElts); - } - setValue(&I, DAG.getVectorShuffle(VT, getCurDebugLoc(), Src1, Src2, - &MappedOps[0])); - return; - } - - if (SrcNumElts > MaskNumElts) { - // Analyze the access pattern of the vector to see if we can extract - // two subvectors and do the shuffle. The analysis is done by calculating - // the range of elements the mask access on both vectors. - int MinRange[2] = { SrcNumElts+1, SrcNumElts+1}; - int MaxRange[2] = {-1, -1}; - - for (unsigned i = 0; i != MaskNumElts; ++i) { - int Idx = Mask[i]; - int Input = 0; - if (Idx < 0) - continue; - - if (Idx >= (int)SrcNumElts) { - Input = 1; - Idx -= SrcNumElts; - } - if (Idx > MaxRange[Input]) - MaxRange[Input] = Idx; - if (Idx < MinRange[Input]) - MinRange[Input] = Idx; - } - - // Check if the access is smaller than the vector size and can we find - // a reasonable extract index. - int RangeUse[2] = { 2, 2 }; // 0 = Unused, 1 = Extract, 2 = Can not Extract. - int StartIdx[2]; // StartIdx to extract from - for (int Input=0; Input < 2; ++Input) { - if (MinRange[Input] == (int)(SrcNumElts+1) && MaxRange[Input] == -1) { - RangeUse[Input] = 0; // Unused - StartIdx[Input] = 0; - } else if (MaxRange[Input] - MinRange[Input] < (int)MaskNumElts) { - // Fits within range but we should see if we can find a good - // start index that is a multiple of the mask length. - if (MaxRange[Input] < (int)MaskNumElts) { - RangeUse[Input] = 1; // Extract from beginning of the vector - StartIdx[Input] = 0; - } else { - StartIdx[Input] = (MinRange[Input]/MaskNumElts)*MaskNumElts; - if (MaxRange[Input] - StartIdx[Input] < (int)MaskNumElts && - StartIdx[Input] + MaskNumElts < SrcNumElts) - RangeUse[Input] = 1; // Extract from a multiple of the mask length. 
- } - } - } - - if (RangeUse[0] == 0 && RangeUse[1] == 0) { - setValue(&I, DAG.getUNDEF(VT)); // Vectors are not used. - return; - } - else if (RangeUse[0] < 2 && RangeUse[1] < 2) { - // Extract appropriate subvector and generate a vector shuffle - for (int Input=0; Input < 2; ++Input) { - SDValue& Src = Input == 0 ? Src1 : Src2; - if (RangeUse[Input] == 0) { - Src = DAG.getUNDEF(VT); - } else { - Src = DAG.getNode(ISD::EXTRACT_SUBVECTOR, getCurDebugLoc(), VT, - Src, DAG.getIntPtrConstant(StartIdx[Input])); - } - } - // Calculate new mask. - SmallVector MappedOps; - for (unsigned i = 0; i != MaskNumElts; ++i) { - int Idx = Mask[i]; - if (Idx < 0) - MappedOps.push_back(Idx); - else if (Idx < (int)SrcNumElts) - MappedOps.push_back(Idx - StartIdx[0]); - else - MappedOps.push_back(Idx - SrcNumElts - StartIdx[1] + MaskNumElts); - } - setValue(&I, DAG.getVectorShuffle(VT, getCurDebugLoc(), Src1, Src2, - &MappedOps[0])); - return; - } - } - - // We can't use either concat vectors or extract subvectors so fall back to - // replacing the shuffle with extract and build vector. - // to insert and build vector. 
- EVT EltVT = VT.getVectorElementType(); - EVT PtrVT = TLI.getPointerTy(); - SmallVector Ops; - for (unsigned i = 0; i != MaskNumElts; ++i) { - if (Mask[i] < 0) { - Ops.push_back(DAG.getUNDEF(EltVT)); - } else { - int Idx = Mask[i]; - if (Idx < (int)SrcNumElts) - Ops.push_back(DAG.getNode(ISD::EXTRACT_VECTOR_ELT, getCurDebugLoc(), - EltVT, Src1, DAG.getConstant(Idx, PtrVT))); - else - Ops.push_back(DAG.getNode(ISD::EXTRACT_VECTOR_ELT, getCurDebugLoc(), - EltVT, Src2, - DAG.getConstant(Idx - SrcNumElts, PtrVT))); - } - } - setValue(&I, DAG.getNode(ISD::BUILD_VECTOR, getCurDebugLoc(), - VT, &Ops[0], Ops.size())); -} - -void SelectionDAGLowering::visitInsertValue(InsertValueInst &I) { - const Value *Op0 = I.getOperand(0); - const Value *Op1 = I.getOperand(1); - const Type *AggTy = I.getType(); - const Type *ValTy = Op1->getType(); - bool IntoUndef = isa(Op0); - bool FromUndef = isa(Op1); - - unsigned LinearIndex = ComputeLinearIndex(TLI, AggTy, - I.idx_begin(), I.idx_end()); - - SmallVector AggValueVTs; - ComputeValueVTs(TLI, AggTy, AggValueVTs); - SmallVector ValValueVTs; - ComputeValueVTs(TLI, ValTy, ValValueVTs); - - unsigned NumAggValues = AggValueVTs.size(); - unsigned NumValValues = ValValueVTs.size(); - SmallVector Values(NumAggValues); - - SDValue Agg = getValue(Op0); - SDValue Val = getValue(Op1); - unsigned i = 0; - // Copy the beginning value(s) from the original aggregate. - for (; i != LinearIndex; ++i) - Values[i] = IntoUndef ? DAG.getUNDEF(AggValueVTs[i]) : - SDValue(Agg.getNode(), Agg.getResNo() + i); - // Copy values from the inserted value(s). - for (; i != LinearIndex + NumValValues; ++i) - Values[i] = FromUndef ? DAG.getUNDEF(AggValueVTs[i]) : - SDValue(Val.getNode(), Val.getResNo() + i - LinearIndex); - // Copy remaining value(s) from the original aggregate. - for (; i != NumAggValues; ++i) - Values[i] = IntoUndef ? 
DAG.getUNDEF(AggValueVTs[i]) : - SDValue(Agg.getNode(), Agg.getResNo() + i); - - setValue(&I, DAG.getNode(ISD::MERGE_VALUES, getCurDebugLoc(), - DAG.getVTList(&AggValueVTs[0], NumAggValues), - &Values[0], NumAggValues)); -} - -void SelectionDAGLowering::visitExtractValue(ExtractValueInst &I) { - const Value *Op0 = I.getOperand(0); - const Type *AggTy = Op0->getType(); - const Type *ValTy = I.getType(); - bool OutOfUndef = isa(Op0); - - unsigned LinearIndex = ComputeLinearIndex(TLI, AggTy, - I.idx_begin(), I.idx_end()); - - SmallVector ValValueVTs; - ComputeValueVTs(TLI, ValTy, ValValueVTs); - - unsigned NumValValues = ValValueVTs.size(); - SmallVector Values(NumValValues); - - SDValue Agg = getValue(Op0); - // Copy out the selected value(s). - for (unsigned i = LinearIndex; i != LinearIndex + NumValValues; ++i) - Values[i - LinearIndex] = - OutOfUndef ? - DAG.getUNDEF(Agg.getNode()->getValueType(Agg.getResNo() + i)) : - SDValue(Agg.getNode(), Agg.getResNo() + i); - - setValue(&I, DAG.getNode(ISD::MERGE_VALUES, getCurDebugLoc(), - DAG.getVTList(&ValValueVTs[0], NumValValues), - &Values[0], NumValValues)); -} - - -void SelectionDAGLowering::visitGetElementPtr(User &I) { - SDValue N = getValue(I.getOperand(0)); - const Type *Ty = I.getOperand(0)->getType(); - - for (GetElementPtrInst::op_iterator OI = I.op_begin()+1, E = I.op_end(); - OI != E; ++OI) { - Value *Idx = *OI; - if (const StructType *StTy = dyn_cast(Ty)) { - unsigned Field = cast(Idx)->getZExtValue(); - if (Field) { - // N = N + Offset - uint64_t Offset = TD->getStructLayout(StTy)->getElementOffset(Field); - N = DAG.getNode(ISD::ADD, getCurDebugLoc(), N.getValueType(), N, - DAG.getIntPtrConstant(Offset)); - } - Ty = StTy->getElementType(Field); - } else { - Ty = cast(Ty)->getElementType(); - - // If this is a constant subscript, handle it quickly. 
- if (ConstantInt *CI = dyn_cast<ConstantInt>(Idx)) { - if (CI->getZExtValue() == 0) continue; - uint64_t Offs = - TD->getTypeAllocSize(Ty)*cast<ConstantInt>(CI)->getSExtValue(); - SDValue OffsVal; - EVT PTy = TLI.getPointerTy(); - unsigned PtrBits = PTy.getSizeInBits(); - if (PtrBits < 64) { - OffsVal = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), - TLI.getPointerTy(), - DAG.getConstant(Offs, MVT::i64)); - } else - OffsVal = DAG.getIntPtrConstant(Offs); - N = DAG.getNode(ISD::ADD, getCurDebugLoc(), N.getValueType(), N, - OffsVal); - continue; - } - - // N = N + Idx * ElementSize; - APInt ElementSize = APInt(TLI.getPointerTy().getSizeInBits(), - TD->getTypeAllocSize(Ty)); - SDValue IdxN = getValue(Idx); - - // If the index is smaller or larger than intptr_t, truncate or extend - // it. - IdxN = DAG.getSExtOrTrunc(IdxN, getCurDebugLoc(), N.getValueType()); - - // If this is a multiply by a power of two, turn it into a shl - // immediately. This is a very common case. - if (ElementSize != 1) { - if (ElementSize.isPowerOf2()) { - unsigned Amt = ElementSize.logBase2(); - IdxN = DAG.getNode(ISD::SHL, getCurDebugLoc(), - N.getValueType(), IdxN, - DAG.getConstant(Amt, TLI.getPointerTy())); - } else { - SDValue Scale = DAG.getConstant(ElementSize, TLI.getPointerTy()); - IdxN = DAG.getNode(ISD::MUL, getCurDebugLoc(), - N.getValueType(), IdxN, Scale); - } - } - - N = DAG.getNode(ISD::ADD, getCurDebugLoc(), - N.getValueType(), N, IdxN); - } - } - setValue(&I, N); -} - -void SelectionDAGLowering::visitAlloca(AllocaInst &I) { - // If this is a fixed sized alloca in the entry block of the function, - // allocate it statically on the stack. - if (FuncInfo.StaticAllocaMap.count(&I)) - return; // getValue will auto-populate this.
- - const Type *Ty = I.getAllocatedType(); - uint64_t TySize = TLI.getTargetData()->getTypeAllocSize(Ty); - unsigned Align = - std::max((unsigned)TLI.getTargetData()->getPrefTypeAlignment(Ty), - I.getAlignment()); - - SDValue AllocSize = getValue(I.getArraySize()); - - AllocSize = DAG.getNode(ISD::MUL, getCurDebugLoc(), AllocSize.getValueType(), - AllocSize, - DAG.getConstant(TySize, AllocSize.getValueType())); - - - - EVT IntPtr = TLI.getPointerTy(); - AllocSize = DAG.getZExtOrTrunc(AllocSize, getCurDebugLoc(), IntPtr); - - // Handle alignment. If the requested alignment is less than or equal to - // the stack alignment, ignore it. If the size is greater than or equal to - // the stack alignment, we note this in the DYNAMIC_STACKALLOC node. - unsigned StackAlign = - TLI.getTargetMachine().getFrameInfo()->getStackAlignment(); - if (Align <= StackAlign) - Align = 0; - - // Round the size of the allocation up to the stack alignment size - // by add SA-1 to the size. - AllocSize = DAG.getNode(ISD::ADD, getCurDebugLoc(), - AllocSize.getValueType(), AllocSize, - DAG.getIntPtrConstant(StackAlign-1)); - // Mask out the low bits for alignment purposes. - AllocSize = DAG.getNode(ISD::AND, getCurDebugLoc(), - AllocSize.getValueType(), AllocSize, - DAG.getIntPtrConstant(~(uint64_t)(StackAlign-1))); - - SDValue Ops[] = { getRoot(), AllocSize, DAG.getIntPtrConstant(Align) }; - SDVTList VTs = DAG.getVTList(AllocSize.getValueType(), MVT::Other); - SDValue DSA = DAG.getNode(ISD::DYNAMIC_STACKALLOC, getCurDebugLoc(), - VTs, Ops, 3); - setValue(&I, DSA); - DAG.setRoot(DSA.getValue(1)); - - // Inform the Frame Information that we have just allocated a variable-sized - // object. 
- FuncInfo.MF->getFrameInfo()->CreateVariableSizedObject(); -} - -void SelectionDAGLowering::visitLoad(LoadInst &I) { - const Value *SV = I.getOperand(0); - SDValue Ptr = getValue(SV); - - const Type *Ty = I.getType(); - bool isVolatile = I.isVolatile(); - unsigned Alignment = I.getAlignment(); - - SmallVector ValueVTs; - SmallVector Offsets; - ComputeValueVTs(TLI, Ty, ValueVTs, &Offsets); - unsigned NumValues = ValueVTs.size(); - if (NumValues == 0) - return; - - SDValue Root; - bool ConstantMemory = false; - if (I.isVolatile()) - // Serialize volatile loads with other side effects. - Root = getRoot(); - else if (AA->pointsToConstantMemory(SV)) { - // Do not serialize (non-volatile) loads of constant memory with anything. - Root = DAG.getEntryNode(); - ConstantMemory = true; - } else { - // Do not serialize non-volatile loads against each other. - Root = DAG.getRoot(); - } - - SmallVector Values(NumValues); - SmallVector Chains(NumValues); - EVT PtrVT = Ptr.getValueType(); - for (unsigned i = 0; i != NumValues; ++i) { - SDValue L = DAG.getLoad(ValueVTs[i], getCurDebugLoc(), Root, - DAG.getNode(ISD::ADD, getCurDebugLoc(), - PtrVT, Ptr, - DAG.getConstant(Offsets[i], PtrVT)), - SV, Offsets[i], isVolatile, Alignment); - Values[i] = L; - Chains[i] = L.getValue(1); - } - - if (!ConstantMemory) { - SDValue Chain = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), - MVT::Other, - &Chains[0], NumValues); - if (isVolatile) - DAG.setRoot(Chain); - else - PendingLoads.push_back(Chain); - } - - setValue(&I, DAG.getNode(ISD::MERGE_VALUES, getCurDebugLoc(), - DAG.getVTList(&ValueVTs[0], NumValues), - &Values[0], NumValues)); -} - - -void SelectionDAGLowering::visitStore(StoreInst &I) { - Value *SrcV = I.getOperand(0); - Value *PtrV = I.getOperand(1); - - SmallVector ValueVTs; - SmallVector Offsets; - ComputeValueVTs(TLI, SrcV->getType(), ValueVTs, &Offsets); - unsigned NumValues = ValueVTs.size(); - if (NumValues == 0) - return; - - // Get the lowered operands. 
Note that we do this after - // checking if NumResults is zero, because with zero results - // the operands won't have values in the map. - SDValue Src = getValue(SrcV); - SDValue Ptr = getValue(PtrV); - - SDValue Root = getRoot(); - SmallVector Chains(NumValues); - EVT PtrVT = Ptr.getValueType(); - bool isVolatile = I.isVolatile(); - unsigned Alignment = I.getAlignment(); - for (unsigned i = 0; i != NumValues; ++i) - Chains[i] = DAG.getStore(Root, getCurDebugLoc(), - SDValue(Src.getNode(), Src.getResNo() + i), - DAG.getNode(ISD::ADD, getCurDebugLoc(), - PtrVT, Ptr, - DAG.getConstant(Offsets[i], PtrVT)), - PtrV, Offsets[i], isVolatile, Alignment); - - DAG.setRoot(DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), - MVT::Other, &Chains[0], NumValues)); -} - -/// visitTargetIntrinsic - Lower a call of a target intrinsic to an INTRINSIC -/// node. -void SelectionDAGLowering::visitTargetIntrinsic(CallInst &I, - unsigned Intrinsic) { - bool HasChain = !I.doesNotAccessMemory(); - bool OnlyLoad = HasChain && I.onlyReadsMemory(); - - // Build the operand list. - SmallVector Ops; - if (HasChain) { // If this intrinsic has side-effects, chainify it. - if (OnlyLoad) { - // We don't need to serialize loads against other loads. - Ops.push_back(DAG.getRoot()); - } else { - Ops.push_back(getRoot()); - } - } - - // Info is set by getTgtMemInstrinsic - TargetLowering::IntrinsicInfo Info; - bool IsTgtIntrinsic = TLI.getTgtMemIntrinsic(Info, I, Intrinsic); - - // Add the intrinsic ID as an integer operand if it's not a target intrinsic. - if (!IsTgtIntrinsic) - Ops.push_back(DAG.getConstant(Intrinsic, TLI.getPointerTy())); - - // Add all operands of the call to the operand list. 
- for (unsigned i = 1, e = I.getNumOperands(); i != e; ++i) { - SDValue Op = getValue(I.getOperand(i)); - assert(TLI.isTypeLegal(Op.getValueType()) && - "Intrinsic uses a non-legal type?"); - Ops.push_back(Op); - } - - SmallVector ValueVTs; - ComputeValueVTs(TLI, I.getType(), ValueVTs); -#ifndef NDEBUG - for (unsigned Val = 0, E = ValueVTs.size(); Val != E; ++Val) { - assert(TLI.isTypeLegal(ValueVTs[Val]) && - "Intrinsic uses a non-legal type?"); - } -#endif // NDEBUG - if (HasChain) - ValueVTs.push_back(MVT::Other); - - SDVTList VTs = DAG.getVTList(ValueVTs.data(), ValueVTs.size()); - - // Create the node. - SDValue Result; - if (IsTgtIntrinsic) { - // This is target intrinsic that touches memory - Result = DAG.getMemIntrinsicNode(Info.opc, getCurDebugLoc(), - VTs, &Ops[0], Ops.size(), - Info.memVT, Info.ptrVal, Info.offset, - Info.align, Info.vol, - Info.readMem, Info.writeMem); - } - else if (!HasChain) - Result = DAG.getNode(ISD::INTRINSIC_WO_CHAIN, getCurDebugLoc(), - VTs, &Ops[0], Ops.size()); - else if (I.getType() != Type::getVoidTy(*DAG.getContext())) - Result = DAG.getNode(ISD::INTRINSIC_W_CHAIN, getCurDebugLoc(), - VTs, &Ops[0], Ops.size()); - else - Result = DAG.getNode(ISD::INTRINSIC_VOID, getCurDebugLoc(), - VTs, &Ops[0], Ops.size()); - - if (HasChain) { - SDValue Chain = Result.getValue(Result.getNode()->getNumValues()-1); - if (OnlyLoad) - PendingLoads.push_back(Chain); - else - DAG.setRoot(Chain); - } - if (I.getType() != Type::getVoidTy(*DAG.getContext())) { - if (const VectorType *PTy = dyn_cast(I.getType())) { - EVT VT = TLI.getValueType(PTy); - Result = DAG.getNode(ISD::BIT_CONVERT, getCurDebugLoc(), VT, Result); - } - setValue(&I, Result); - } -} - -/// GetSignificand - Get the significand and build it into a floating-point -/// number with exponent of 1: -/// -/// Op = (Op & 0x007fffff) | 0x3f800000; -/// -/// where Op is the hexidecimal representation of floating point value. 
-static SDValue -GetSignificand(SelectionDAG &DAG, SDValue Op, DebugLoc dl) { - SDValue t1 = DAG.getNode(ISD::AND, dl, MVT::i32, Op, - DAG.getConstant(0x007fffff, MVT::i32)); - SDValue t2 = DAG.getNode(ISD::OR, dl, MVT::i32, t1, - DAG.getConstant(0x3f800000, MVT::i32)); - return DAG.getNode(ISD::BIT_CONVERT, dl, MVT::f32, t2); -} - -/// GetExponent - Get the exponent: -/// -/// (float)(int)(((Op & 0x7f800000) >> 23) - 127); -/// -/// where Op is the hexidecimal representation of floating point value. -static SDValue -GetExponent(SelectionDAG &DAG, SDValue Op, const TargetLowering &TLI, - DebugLoc dl) { - SDValue t0 = DAG.getNode(ISD::AND, dl, MVT::i32, Op, - DAG.getConstant(0x7f800000, MVT::i32)); - SDValue t1 = DAG.getNode(ISD::SRL, dl, MVT::i32, t0, - DAG.getConstant(23, TLI.getPointerTy())); - SDValue t2 = DAG.getNode(ISD::SUB, dl, MVT::i32, t1, - DAG.getConstant(127, MVT::i32)); - return DAG.getNode(ISD::SINT_TO_FP, dl, MVT::f32, t2); -} - -/// getF32Constant - Get 32-bit floating point constant. -static SDValue -getF32Constant(SelectionDAG &DAG, unsigned Flt) { - return DAG.getConstantFP(APFloat(APInt(32, Flt)), MVT::f32); -} - -/// Inlined utility function to implement binary input atomic intrinsics for -/// visitIntrinsicCall: I is a call instruction -/// Op is the associated NodeType for I -const char * -SelectionDAGLowering::implVisitBinaryAtomic(CallInst& I, ISD::NodeType Op) { - SDValue Root = getRoot(); - SDValue L = - DAG.getAtomic(Op, getCurDebugLoc(), - getValue(I.getOperand(2)).getValueType().getSimpleVT(), - Root, - getValue(I.getOperand(1)), - getValue(I.getOperand(2)), - I.getOperand(1)); - setValue(&I, L); - DAG.setRoot(L.getValue(1)); - return 0; -} - -// implVisitAluOverflow - Lower arithmetic overflow instrinsics. 
-const char * -SelectionDAGLowering::implVisitAluOverflow(CallInst &I, ISD::NodeType Op) { - SDValue Op1 = getValue(I.getOperand(1)); - SDValue Op2 = getValue(I.getOperand(2)); - - SDVTList VTs = DAG.getVTList(Op1.getValueType(), MVT::i1); - SDValue Result = DAG.getNode(Op, getCurDebugLoc(), VTs, Op1, Op2); - - setValue(&I, Result); - return 0; -} - -/// visitExp - Lower an exp intrinsic. Handles the special sequences for -/// limited-precision mode. -void -SelectionDAGLowering::visitExp(CallInst &I) { - SDValue result; - DebugLoc dl = getCurDebugLoc(); - - if (getValue(I.getOperand(1)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(1)); - - // Put the exponent in the right bit position for later addition to the - // final result: - // - // #define LOG2OFe 1.4426950f - // IntegerPartOfX = ((int32_t)(X * LOG2OFe)); - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, Op, - getF32Constant(DAG, 0x3fb8aa3b)); - SDValue IntegerPartOfX = DAG.getNode(ISD::FP_TO_SINT, dl, MVT::i32, t0); - - // FractionalPartOfX = (X * LOG2OFe) - (float)IntegerPartOfX; - SDValue t1 = DAG.getNode(ISD::SINT_TO_FP, dl, MVT::f32, IntegerPartOfX); - SDValue X = DAG.getNode(ISD::FSUB, dl, MVT::f32, t0, t1); - - // IntegerPartOfX <<= 23; - IntegerPartOfX = DAG.getNode(ISD::SHL, dl, MVT::i32, IntegerPartOfX, - DAG.getConstant(23, TLI.getPointerTy())); - - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // TwoToFractionalPartOfX = - // 0.997535578f + - // (0.735607626f + 0.252464424f * x) * x; - // - // error 0.0144103317, which is 6 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3e814304)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f3c50c8)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f7f5e7e)); - SDValue TwoToFracPartOfX 
= DAG.getNode(ISD::BIT_CONVERT, dl,MVT::i32, t5); - - // Add the exponent into the result in integer domain. - SDValue t6 = DAG.getNode(ISD::ADD, dl, MVT::i32, - TwoToFracPartOfX, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::f32, t6); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // TwoToFractionalPartOfX = - // 0.999892986f + - // (0.696457318f + - // (0.224338339f + 0.792043434e-1f * x) * x) * x; - // - // 0.000107046256 error, which is 13 to 14 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3da235e3)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3e65b8f3)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f324b07)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3f7ff8fd)); - SDValue TwoToFracPartOfX = DAG.getNode(ISD::BIT_CONVERT, dl,MVT::i32, t7); - - // Add the exponent into the result in integer domain. 
- SDValue t8 = DAG.getNode(ISD::ADD, dl, MVT::i32, - TwoToFracPartOfX, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::f32, t8); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // TwoToFractionalPartOfX = - // 0.999999982f + - // (0.693148872f + - // (0.240227044f + - // (0.554906021e-1f + - // (0.961591928e-2f + - // (0.136028312e-2f + 0.157059148e-3f *x)*x)*x)*x)*x)*x; - // - // error 2.47208000*10^(-7), which is better than 18 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3924b03e)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3ab24b87)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3c1d8c17)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3d634a1d)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue t9 = DAG.getNode(ISD::FADD, dl, MVT::f32, t8, - getF32Constant(DAG, 0x3e75fe14)); - SDValue t10 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t9, X); - SDValue t11 = DAG.getNode(ISD::FADD, dl, MVT::f32, t10, - getF32Constant(DAG, 0x3f317234)); - SDValue t12 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t11, X); - SDValue t13 = DAG.getNode(ISD::FADD, dl, MVT::f32, t12, - getF32Constant(DAG, 0x3f800000)); - SDValue TwoToFracPartOfX = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::i32, t13); - - // Add the exponent into the result in integer domain. - SDValue t14 = DAG.getNode(ISD::ADD, dl, MVT::i32, - TwoToFracPartOfX, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::f32, t14); - } - } else { - // No special expansion. - result = DAG.getNode(ISD::FEXP, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1))); - } - - setValue(&I, result); -} - -/// visitLog - Lower a log intrinsic. 
Handles the special sequences for -/// limited-precision mode. -void -SelectionDAGLowering::visitLog(CallInst &I) { - SDValue result; - DebugLoc dl = getCurDebugLoc(); - - if (getValue(I.getOperand(1)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(1)); - SDValue Op1 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, Op); - - // Scale the exponent by log(2) [0.69314718f]. - SDValue Exp = GetExponent(DAG, Op1, TLI, dl); - SDValue LogOfExponent = DAG.getNode(ISD::FMUL, dl, MVT::f32, Exp, - getF32Constant(DAG, 0x3f317218)); - - // Get the significand and build it into a floating-point number with - // exponent of 1. - SDValue X = GetSignificand(DAG, Op1, dl); - - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // LogofMantissa = - // -1.1609546f + - // (1.4034025f - 0.23903021f * x) * x; - // - // error 0.0034276066, which is better than 8 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbe74c456)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3fb3a2b1)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue LogOfMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f949a29)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, LogOfMantissa); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // LogOfMantissa = - // -1.7417939f + - // (2.8212026f + - // (-1.4699568f + - // (0.44717955f - 0.56570851e-1f * x) * x) * x) * x; - // - // error 0.000061011436, which is 14 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbd67b6d6)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3ee4f4b8)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - 
getF32Constant(DAG, 0x3fbc278b)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x40348e95)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue LogOfMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3fdef31a)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, LogOfMantissa); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // LogOfMantissa = - // -2.1072184f + - // (4.2372794f + - // (-3.7029485f + - // (2.2781945f + - // (-0.87823314f + - // (0.19073739f - 0.17809712e-1f * x) * x) * x) * x) * x)*x; - // - // error 0.0000023660568, which is better than 18 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbc91e5ac)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3e4350aa)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f60d3e3)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x4011cdf0)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t6, - getF32Constant(DAG, 0x406cfd1c)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue t9 = DAG.getNode(ISD::FADD, dl, MVT::f32, t8, - getF32Constant(DAG, 0x408797cb)); - SDValue t10 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t9, X); - SDValue LogOfMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t10, - getF32Constant(DAG, 0x4006dcab)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, LogOfMantissa); - } - } else { - // No special expansion. 
- result = DAG.getNode(ISD::FLOG, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1))); - } - - setValue(&I, result); -} - -/// visitLog2 - Lower a log2 intrinsic. Handles the special sequences for -/// limited-precision mode. -void -SelectionDAGLowering::visitLog2(CallInst &I) { - SDValue result; - DebugLoc dl = getCurDebugLoc(); - - if (getValue(I.getOperand(1)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(1)); - SDValue Op1 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, Op); - - // Get the exponent. - SDValue LogOfExponent = GetExponent(DAG, Op1, TLI, dl); - - // Get the significand and build it into a floating-point number with - // exponent of 1. - SDValue X = GetSignificand(DAG, Op1, dl); - - // Different possible minimax approximations of significand in - // floating-point for various degrees of accuracy over [1,2]. - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // Log2ofMantissa = -1.6749035f + (2.0246817f - .34484768f * x) * x; - // - // error 0.0049451742, which is more than 7 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbeb08fe0)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x40019463)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue Log2ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3fd6633d)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, Log2ofMantissa); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // Log2ofMantissa = - // -2.51285454f + - // (4.07009056f + - // (-2.12067489f + - // (.645142248f - 0.816157886e-1f * x) * x) * x) * x; - // - // error 0.0000876136000, which is better than 13 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbda7262e)); - SDValue t1 = 
DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3f25280b)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x4007b923)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x40823e2f)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue Log2ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t6, - getF32Constant(DAG, 0x4020d29c)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, Log2ofMantissa); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // Log2ofMantissa = - // -3.0400495f + - // (6.1129976f + - // (-5.3420409f + - // (3.2865683f + - // (-1.2669343f + - // (0.27515199f - - // 0.25691327e-1f * x) * x) * x) * x) * x) * x; - // - // error 0.0000018516, which is better than 18 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbcd2769e)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3e8ce0b9)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3fa22ae7)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x40525723)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t6, - getF32Constant(DAG, 0x40aaf200)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue t9 = DAG.getNode(ISD::FADD, dl, MVT::f32, t8, - getF32Constant(DAG, 0x40c39dad)); - SDValue t10 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t9, X); - SDValue Log2ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t10, - getF32Constant(DAG, 0x4042902c)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, 
LogOfExponent, Log2ofMantissa); - } - } else { - // No special expansion. - result = DAG.getNode(ISD::FLOG2, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1))); - } - - setValue(&I, result); -} - -/// visitLog10 - Lower a log10 intrinsic. Handles the special sequences for -/// limited-precision mode. -void -SelectionDAGLowering::visitLog10(CallInst &I) { - SDValue result; - DebugLoc dl = getCurDebugLoc(); - - if (getValue(I.getOperand(1)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(1)); - SDValue Op1 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, Op); - - // Scale the exponent by log10(2) [0.30102999f]. - SDValue Exp = GetExponent(DAG, Op1, TLI, dl); - SDValue LogOfExponent = DAG.getNode(ISD::FMUL, dl, MVT::f32, Exp, - getF32Constant(DAG, 0x3e9a209a)); - - // Get the significand and build it into a floating-point number with - // exponent of 1. - SDValue X = GetSignificand(DAG, Op1, dl); - - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // Log10ofMantissa = - // -0.50419619f + - // (0.60948995f - 0.10380950f * x) * x; - // - // error 0.0014886165, which is 6 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0xbdd49a13)); - SDValue t1 = DAG.getNode(ISD::FADD, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3f1c0789)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue Log10ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f011300)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, Log10ofMantissa); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // Log10ofMantissa = - // -0.64831180f + - // (0.91751397f + - // (-0.31664806f + 0.47637168e-1f * x) * x) * x; - // - // error 0.00019228036, which is better than 12 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - 
getF32Constant(DAG, 0x3d431f31)); - SDValue t1 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3ea21fb2)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f6ae232)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue Log10ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f25f7c3)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, Log10ofMantissa); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // Log10ofMantissa = - // -0.84299375f + - // (1.5327582f + - // (-1.0688956f + - // (0.49102474f + - // (-0.12539807f + 0.13508273e-1f * x) * x) * x) * x) * x; - // - // error 0.0000037995730, which is better than 18 bits - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3c5d51ce)); - SDValue t1 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t0, - getF32Constant(DAG, 0x3e00685a)); - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t1, X); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3efb6798)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FSUB, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f88d192)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3fc4316c)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue Log10ofMantissa = DAG.getNode(ISD::FSUB, dl, MVT::f32, t8, - getF32Constant(DAG, 0x3f57ce70)); - - result = DAG.getNode(ISD::FADD, dl, - MVT::f32, LogOfExponent, Log10ofMantissa); - } - } else { - // No special expansion. - result = DAG.getNode(ISD::FLOG10, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1))); - } - - setValue(&I, result); -} - -/// visitExp2 - Lower an exp2 intrinsic. 
Handles the special sequences for -/// limited-precision mode. -void -SelectionDAGLowering::visitExp2(CallInst &I) { - SDValue result; - DebugLoc dl = getCurDebugLoc(); - - if (getValue(I.getOperand(1)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(1)); - - SDValue IntegerPartOfX = DAG.getNode(ISD::FP_TO_SINT, dl, MVT::i32, Op); - - // FractionalPartOfX = x - (float)IntegerPartOfX; - SDValue t1 = DAG.getNode(ISD::SINT_TO_FP, dl, MVT::f32, IntegerPartOfX); - SDValue X = DAG.getNode(ISD::FSUB, dl, MVT::f32, Op, t1); - - // IntegerPartOfX <<= 23; - IntegerPartOfX = DAG.getNode(ISD::SHL, dl, MVT::i32, IntegerPartOfX, - DAG.getConstant(23, TLI.getPointerTy())); - - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // TwoToFractionalPartOfX = - // 0.997535578f + - // (0.735607626f + 0.252464424f * x) * x; - // - // error 0.0144103317, which is 6 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3e814304)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f3c50c8)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f7f5e7e)); - SDValue t6 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t5); - SDValue TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t6, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // TwoToFractionalPartOfX = - // 0.999892986f + - // (0.696457318f + - // (0.224338339f + 0.792043434e-1f * x) * x) * x; - // - // error 0.000107046256, which is 13 to 14 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3da235e3)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - 
getF32Constant(DAG, 0x3e65b8f3)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f324b07)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3f7ff8fd)); - SDValue t8 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t7); - SDValue TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t8, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // TwoToFractionalPartOfX = - // 0.999999982f + - // (0.693148872f + - // (0.240227044f + - // (0.554906021e-1f + - // (0.961591928e-2f + - // (0.136028312e-2f + 0.157059148e-3f *x)*x)*x)*x)*x)*x; - // error 2.47208000*10^(-7), which is better than 18 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3924b03e)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3ab24b87)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3c1d8c17)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3d634a1d)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue t9 = DAG.getNode(ISD::FADD, dl, MVT::f32, t8, - getF32Constant(DAG, 0x3e75fe14)); - SDValue t10 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t9, X); - SDValue t11 = DAG.getNode(ISD::FADD, dl, MVT::f32, t10, - getF32Constant(DAG, 0x3f317234)); - SDValue t12 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t11, X); - SDValue t13 = DAG.getNode(ISD::FADD, dl, MVT::f32, t12, - getF32Constant(DAG, 0x3f800000)); - SDValue t14 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t13); - SDValue 
TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t14, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } - } else { - // No special expansion. - result = DAG.getNode(ISD::FEXP2, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1))); - } - - setValue(&I, result); -} - -/// visitPow - Lower a pow intrinsic. Handles the special sequences for -/// limited-precision mode with x == 10.0f. -void -SelectionDAGLowering::visitPow(CallInst &I) { - SDValue result; - Value *Val = I.getOperand(1); - DebugLoc dl = getCurDebugLoc(); - bool IsExp10 = false; - - if (getValue(Val).getValueType() == MVT::f32 && - getValue(I.getOperand(2)).getValueType() == MVT::f32 && - LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - if (Constant *C = const_cast(dyn_cast(Val))) { - if (ConstantFP *CFP = dyn_cast(C)) { - APFloat Ten(10.0f); - IsExp10 = CFP->getValueAPF().bitwiseIsEqual(Ten); - } - } - } - - if (IsExp10 && LimitFloatPrecision > 0 && LimitFloatPrecision <= 18) { - SDValue Op = getValue(I.getOperand(2)); - - // Put the exponent in the right bit position for later addition to the - // final result: - // - // #define LOG2OF10 3.3219281f - // IntegerPartOfX = (int32_t)(x * LOG2OF10); - SDValue t0 = DAG.getNode(ISD::FMUL, dl, MVT::f32, Op, - getF32Constant(DAG, 0x40549a78)); - SDValue IntegerPartOfX = DAG.getNode(ISD::FP_TO_SINT, dl, MVT::i32, t0); - - // FractionalPartOfX = x - (float)IntegerPartOfX; - SDValue t1 = DAG.getNode(ISD::SINT_TO_FP, dl, MVT::f32, IntegerPartOfX); - SDValue X = DAG.getNode(ISD::FSUB, dl, MVT::f32, t0, t1); - - // IntegerPartOfX <<= 23; - IntegerPartOfX = DAG.getNode(ISD::SHL, dl, MVT::i32, IntegerPartOfX, - DAG.getConstant(23, TLI.getPointerTy())); - - if (LimitFloatPrecision <= 6) { - // For floating-point precision of 6: - // - // twoToFractionalPartOfX = - // 0.997535578f + - // (0.735607626f + 0.252464424f * x) * x; - // - // error 0.0144103317, which is 6 bits 
- SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3e814304)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3f3c50c8)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f7f5e7e)); - SDValue t6 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t5); - SDValue TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t6, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } else if (LimitFloatPrecision > 6 && LimitFloatPrecision <= 12) { - // For floating-point precision of 12: - // - // TwoToFractionalPartOfX = - // 0.999892986f + - // (0.696457318f + - // (0.224338339f + 0.792043434e-1f * x) * x) * x; - // - // error 0.000107046256, which is 13 to 14 bits - SDValue t2 = DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3da235e3)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3e65b8f3)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3f324b07)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3f7ff8fd)); - SDValue t8 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t7); - SDValue TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t8, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } else { // LimitFloatPrecision > 12 && LimitFloatPrecision <= 18 - // For floating-point precision of 18: - // - // TwoToFractionalPartOfX = - // 0.999999982f + - // (0.693148872f + - // (0.240227044f + - // (0.554906021e-1f + - // (0.961591928e-2f + - // (0.136028312e-2f + 0.157059148e-3f *x)*x)*x)*x)*x)*x; - // error 2.47208000*10^(-7), which is better than 18 bits - SDValue t2 = 
DAG.getNode(ISD::FMUL, dl, MVT::f32, X, - getF32Constant(DAG, 0x3924b03e)); - SDValue t3 = DAG.getNode(ISD::FADD, dl, MVT::f32, t2, - getF32Constant(DAG, 0x3ab24b87)); - SDValue t4 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t3, X); - SDValue t5 = DAG.getNode(ISD::FADD, dl, MVT::f32, t4, - getF32Constant(DAG, 0x3c1d8c17)); - SDValue t6 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t5, X); - SDValue t7 = DAG.getNode(ISD::FADD, dl, MVT::f32, t6, - getF32Constant(DAG, 0x3d634a1d)); - SDValue t8 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t7, X); - SDValue t9 = DAG.getNode(ISD::FADD, dl, MVT::f32, t8, - getF32Constant(DAG, 0x3e75fe14)); - SDValue t10 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t9, X); - SDValue t11 = DAG.getNode(ISD::FADD, dl, MVT::f32, t10, - getF32Constant(DAG, 0x3f317234)); - SDValue t12 = DAG.getNode(ISD::FMUL, dl, MVT::f32, t11, X); - SDValue t13 = DAG.getNode(ISD::FADD, dl, MVT::f32, t12, - getF32Constant(DAG, 0x3f800000)); - SDValue t14 = DAG.getNode(ISD::BIT_CONVERT, dl, MVT::i32, t13); - SDValue TwoToFractionalPartOfX = - DAG.getNode(ISD::ADD, dl, MVT::i32, t14, IntegerPartOfX); - - result = DAG.getNode(ISD::BIT_CONVERT, dl, - MVT::f32, TwoToFractionalPartOfX); - } - } else { - // No special expansion. - result = DAG.getNode(ISD::FPOW, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)), - getValue(I.getOperand(2))); - } - - setValue(&I, result); -} - -/// visitIntrinsicCall - Lower the call to the specified intrinsic function. If -/// we want to emit this as a call to a named external function, return the name -/// otherwise lower it and return null. -const char * -SelectionDAGLowering::visitIntrinsicCall(CallInst &I, unsigned Intrinsic) { - DebugLoc dl = getCurDebugLoc(); - switch (Intrinsic) { - default: - // By default, turn this into a target intrinsic node. 
- visitTargetIntrinsic(I, Intrinsic); - return 0; - case Intrinsic::vastart: visitVAStart(I); return 0; - case Intrinsic::vaend: visitVAEnd(I); return 0; - case Intrinsic::vacopy: visitVACopy(I); return 0; - case Intrinsic::returnaddress: - setValue(&I, DAG.getNode(ISD::RETURNADDR, dl, TLI.getPointerTy(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::frameaddress: - setValue(&I, DAG.getNode(ISD::FRAMEADDR, dl, TLI.getPointerTy(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::setjmp: - return "_setjmp"+!TLI.usesUnderscoreSetJmp(); - break; - case Intrinsic::longjmp: - return "_longjmp"+!TLI.usesUnderscoreLongJmp(); - break; - case Intrinsic::memcpy: { - SDValue Op1 = getValue(I.getOperand(1)); - SDValue Op2 = getValue(I.getOperand(2)); - SDValue Op3 = getValue(I.getOperand(3)); - unsigned Align = cast(I.getOperand(4))->getZExtValue(); - DAG.setRoot(DAG.getMemcpy(getRoot(), dl, Op1, Op2, Op3, Align, false, - I.getOperand(1), 0, I.getOperand(2), 0)); - return 0; - } - case Intrinsic::memset: { - SDValue Op1 = getValue(I.getOperand(1)); - SDValue Op2 = getValue(I.getOperand(2)); - SDValue Op3 = getValue(I.getOperand(3)); - unsigned Align = cast(I.getOperand(4))->getZExtValue(); - DAG.setRoot(DAG.getMemset(getRoot(), dl, Op1, Op2, Op3, Align, - I.getOperand(1), 0)); - return 0; - } - case Intrinsic::memmove: { - SDValue Op1 = getValue(I.getOperand(1)); - SDValue Op2 = getValue(I.getOperand(2)); - SDValue Op3 = getValue(I.getOperand(3)); - unsigned Align = cast(I.getOperand(4))->getZExtValue(); - - // If the source and destination are known to not be aliases, we can - // lower memmove as memcpy. 
- uint64_t Size = -1ULL; - if (ConstantSDNode *C = dyn_cast(Op3)) - Size = C->getZExtValue(); - if (AA->alias(I.getOperand(1), Size, I.getOperand(2), Size) == - AliasAnalysis::NoAlias) { - DAG.setRoot(DAG.getMemcpy(getRoot(), dl, Op1, Op2, Op3, Align, false, - I.getOperand(1), 0, I.getOperand(2), 0)); - return 0; - } - - DAG.setRoot(DAG.getMemmove(getRoot(), dl, Op1, Op2, Op3, Align, - I.getOperand(1), 0, I.getOperand(2), 0)); - return 0; - } - case Intrinsic::dbg_stoppoint: - case Intrinsic::dbg_region_start: - case Intrinsic::dbg_region_end: - case Intrinsic::dbg_func_start: - // FIXME - Remove this instructions once the dust settles. - return 0; - case Intrinsic::dbg_declare: { - if (OptLevel != CodeGenOpt::None) - // FIXME: Variable debug info is not supported here. - return 0; - DwarfWriter *DW = DAG.getDwarfWriter(); - if (!DW) - return 0; - DbgDeclareInst &DI = cast(I); - if (!isValidDebugInfoIntrinsic(DI, CodeGenOpt::None)) - return 0; - - MDNode *Variable = DI.getVariable(); - Value *Address = DI.getAddress(); - if (BitCastInst *BCI = dyn_cast(Address)) - Address = BCI->getOperand(0); - AllocaInst *AI = dyn_cast(Address); - // Don't handle byval struct arguments or VLAs, for example. - if (!AI) - return 0; - DenseMap::iterator SI = - FuncInfo.StaticAllocaMap.find(AI); - if (SI == FuncInfo.StaticAllocaMap.end()) - return 0; // VLAs. - int FI = SI->second; - - MachineModuleInfo *MMI = DAG.getMachineModuleInfo(); - if (MMI) { - MetadataContext &TheMetadata = - DI.getParent()->getContext().getMetadata(); - unsigned MDDbgKind = TheMetadata.getMDKind("dbg"); - MDNode *Dbg = TheMetadata.getMD(MDDbgKind, &DI); - MMI->setVariableDbgInfo(Variable, FI, Dbg); - } - return 0; - } - case Intrinsic::eh_exception: { - // Insert the EXCEPTIONADDR instruction. 
- assert(CurMBB->isLandingPad() &&"Call to eh.exception not in landing pad!"); - SDVTList VTs = DAG.getVTList(TLI.getPointerTy(), MVT::Other); - SDValue Ops[1]; - Ops[0] = DAG.getRoot(); - SDValue Op = DAG.getNode(ISD::EXCEPTIONADDR, dl, VTs, Ops, 1); - setValue(&I, Op); - DAG.setRoot(Op.getValue(1)); - return 0; - } - - case Intrinsic::eh_selector: { - MachineModuleInfo *MMI = DAG.getMachineModuleInfo(); - - if (CurMBB->isLandingPad()) - AddCatchInfo(I, MMI, CurMBB); - else { -#ifndef NDEBUG - FuncInfo.CatchInfoLost.insert(&I); -#endif - // FIXME: Mark exception selector register as live in. Hack for PR1508. - unsigned Reg = TLI.getExceptionSelectorRegister(); - if (Reg) CurMBB->addLiveIn(Reg); - } - - // Insert the EHSELECTION instruction. - SDVTList VTs = DAG.getVTList(TLI.getPointerTy(), MVT::Other); - SDValue Ops[2]; - Ops[0] = getValue(I.getOperand(1)); - Ops[1] = getRoot(); - SDValue Op = DAG.getNode(ISD::EHSELECTION, dl, VTs, Ops, 2); - - DAG.setRoot(Op.getValue(1)); - - setValue(&I, DAG.getSExtOrTrunc(Op, dl, MVT::i32)); - return 0; - } - - case Intrinsic::eh_typeid_for: { - MachineModuleInfo *MMI = DAG.getMachineModuleInfo(); - - if (MMI) { - // Find the type id for the given typeinfo. - GlobalVariable *GV = ExtractTypeInfo(I.getOperand(1)); - - unsigned TypeID = MMI->getTypeIDFor(GV); - setValue(&I, DAG.getConstant(TypeID, MVT::i32)); - } else { - // Return something different to eh_selector. 
- setValue(&I, DAG.getConstant(1, MVT::i32)); - } - - return 0; - } - - case Intrinsic::eh_return_i32: - case Intrinsic::eh_return_i64: - if (MachineModuleInfo *MMI = DAG.getMachineModuleInfo()) { - MMI->setCallsEHReturn(true); - DAG.setRoot(DAG.getNode(ISD::EH_RETURN, dl, - MVT::Other, - getControlRoot(), - getValue(I.getOperand(1)), - getValue(I.getOperand(2)))); - } else { - setValue(&I, DAG.getConstant(0, TLI.getPointerTy())); - } - - return 0; - case Intrinsic::eh_unwind_init: - if (MachineModuleInfo *MMI = DAG.getMachineModuleInfo()) { - MMI->setCallsUnwindInit(true); - } - - return 0; - - case Intrinsic::eh_dwarf_cfa: { - EVT VT = getValue(I.getOperand(1)).getValueType(); - SDValue CfaArg = DAG.getSExtOrTrunc(getValue(I.getOperand(1)), dl, - TLI.getPointerTy()); - - SDValue Offset = DAG.getNode(ISD::ADD, dl, - TLI.getPointerTy(), - DAG.getNode(ISD::FRAME_TO_ARGS_OFFSET, dl, - TLI.getPointerTy()), - CfaArg); - setValue(&I, DAG.getNode(ISD::ADD, dl, - TLI.getPointerTy(), - DAG.getNode(ISD::FRAMEADDR, dl, - TLI.getPointerTy(), - DAG.getConstant(0, - TLI.getPointerTy())), - Offset)); - return 0; - } - case Intrinsic::convertff: - case Intrinsic::convertfsi: - case Intrinsic::convertfui: - case Intrinsic::convertsif: - case Intrinsic::convertuif: - case Intrinsic::convertss: - case Intrinsic::convertsu: - case Intrinsic::convertus: - case Intrinsic::convertuu: { - ISD::CvtCode Code = ISD::CVT_INVALID; - switch (Intrinsic) { - case Intrinsic::convertff: Code = ISD::CVT_FF; break; - case Intrinsic::convertfsi: Code = ISD::CVT_FS; break; - case Intrinsic::convertfui: Code = ISD::CVT_FU; break; - case Intrinsic::convertsif: Code = ISD::CVT_SF; break; - case Intrinsic::convertuif: Code = ISD::CVT_UF; break; - case Intrinsic::convertss: Code = ISD::CVT_SS; break; - case Intrinsic::convertsu: Code = ISD::CVT_SU; break; - case Intrinsic::convertus: Code = ISD::CVT_US; break; - case Intrinsic::convertuu: Code = ISD::CVT_UU; break; - } - EVT DestVT = 
TLI.getValueType(I.getType()); - Value* Op1 = I.getOperand(1); - setValue(&I, DAG.getConvertRndSat(DestVT, getCurDebugLoc(), getValue(Op1), - DAG.getValueType(DestVT), - DAG.getValueType(getValue(Op1).getValueType()), - getValue(I.getOperand(2)), - getValue(I.getOperand(3)), - Code)); - return 0; - } - - case Intrinsic::sqrt: - setValue(&I, DAG.getNode(ISD::FSQRT, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::powi: - setValue(&I, DAG.getNode(ISD::FPOWI, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)), - getValue(I.getOperand(2)))); - return 0; - case Intrinsic::sin: - setValue(&I, DAG.getNode(ISD::FSIN, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::cos: - setValue(&I, DAG.getNode(ISD::FCOS, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::log: - visitLog(I); - return 0; - case Intrinsic::log2: - visitLog2(I); - return 0; - case Intrinsic::log10: - visitLog10(I); - return 0; - case Intrinsic::exp: - visitExp(I); - return 0; - case Intrinsic::exp2: - visitExp2(I); - return 0; - case Intrinsic::pow: - visitPow(I); - return 0; - case Intrinsic::pcmarker: { - SDValue Tmp = getValue(I.getOperand(1)); - DAG.setRoot(DAG.getNode(ISD::PCMARKER, dl, MVT::Other, getRoot(), Tmp)); - return 0; - } - case Intrinsic::readcyclecounter: { - SDValue Op = getRoot(); - SDValue Tmp = DAG.getNode(ISD::READCYCLECOUNTER, dl, - DAG.getVTList(MVT::i64, MVT::Other), - &Op, 1); - setValue(&I, Tmp); - DAG.setRoot(Tmp.getValue(1)); - return 0; - } - case Intrinsic::bswap: - setValue(&I, DAG.getNode(ISD::BSWAP, dl, - getValue(I.getOperand(1)).getValueType(), - getValue(I.getOperand(1)))); - return 0; - case Intrinsic::cttz: { - SDValue Arg = getValue(I.getOperand(1)); - EVT Ty = Arg.getValueType(); - SDValue result = DAG.getNode(ISD::CTTZ, dl, Ty, Arg); - setValue(&I, result); - return 
0; - } - case Intrinsic::ctlz: { - SDValue Arg = getValue(I.getOperand(1)); - EVT Ty = Arg.getValueType(); - SDValue result = DAG.getNode(ISD::CTLZ, dl, Ty, Arg); - setValue(&I, result); - return 0; - } - case Intrinsic::ctpop: { - SDValue Arg = getValue(I.getOperand(1)); - EVT Ty = Arg.getValueType(); - SDValue result = DAG.getNode(ISD::CTPOP, dl, Ty, Arg); - setValue(&I, result); - return 0; - } - case Intrinsic::stacksave: { - SDValue Op = getRoot(); - SDValue Tmp = DAG.getNode(ISD::STACKSAVE, dl, - DAG.getVTList(TLI.getPointerTy(), MVT::Other), &Op, 1); - setValue(&I, Tmp); - DAG.setRoot(Tmp.getValue(1)); - return 0; - } - case Intrinsic::stackrestore: { - SDValue Tmp = getValue(I.getOperand(1)); - DAG.setRoot(DAG.getNode(ISD::STACKRESTORE, dl, MVT::Other, getRoot(), Tmp)); - return 0; - } - case Intrinsic::stackprotector: { - // Emit code into the DAG to store the stack guard onto the stack. - MachineFunction &MF = DAG.getMachineFunction(); - MachineFrameInfo *MFI = MF.getFrameInfo(); - EVT PtrTy = TLI.getPointerTy(); - - SDValue Src = getValue(I.getOperand(1)); // The guard's value. - AllocaInst *Slot = cast(I.getOperand(2)); - - int FI = FuncInfo.StaticAllocaMap[Slot]; - MFI->setStackProtectorIndex(FI); - - SDValue FIN = DAG.getFrameIndex(FI, PtrTy); - - // Store the stack protector onto the stack. - SDValue Result = DAG.getStore(getRoot(), getCurDebugLoc(), Src, FIN, - PseudoSourceValue::getFixedStack(FI), - 0, true); - setValue(&I, Result); - DAG.setRoot(Result); - return 0; - } - case Intrinsic::objectsize: { - // If we don't know by now, we're never going to know. 
- ConstantInt *CI = dyn_cast(I.getOperand(2)); - - assert(CI && "Non-constant type in __builtin_object_size?"); - - SDValue Arg = getValue(I.getOperand(0)); - EVT Ty = Arg.getValueType(); - - if (CI->getZExtValue() < 2) - setValue(&I, DAG.getConstant(-1ULL, Ty)); - else - setValue(&I, DAG.getConstant(0, Ty)); - return 0; - } - case Intrinsic::var_annotation: - // Discard annotate attributes - return 0; - - case Intrinsic::init_trampoline: { - const Function *F = cast(I.getOperand(2)->stripPointerCasts()); - - SDValue Ops[6]; - Ops[0] = getRoot(); - Ops[1] = getValue(I.getOperand(1)); - Ops[2] = getValue(I.getOperand(2)); - Ops[3] = getValue(I.getOperand(3)); - Ops[4] = DAG.getSrcValue(I.getOperand(1)); - Ops[5] = DAG.getSrcValue(F); - - SDValue Tmp = DAG.getNode(ISD::TRAMPOLINE, dl, - DAG.getVTList(TLI.getPointerTy(), MVT::Other), - Ops, 6); - - setValue(&I, Tmp); - DAG.setRoot(Tmp.getValue(1)); - return 0; - } - - case Intrinsic::gcroot: - if (GFI) { - Value *Alloca = I.getOperand(1); - Constant *TypeMap = cast(I.getOperand(2)); - - FrameIndexSDNode *FI = cast(getValue(Alloca).getNode()); - GFI->addStackRoot(FI->getIndex(), TypeMap); - } - return 0; - - case Intrinsic::gcread: - case Intrinsic::gcwrite: - llvm_unreachable("GC failed to lower gcread/gcwrite intrinsics!"); - return 0; - - case Intrinsic::flt_rounds: { - setValue(&I, DAG.getNode(ISD::FLT_ROUNDS_, dl, MVT::i32)); - return 0; - } - - case Intrinsic::trap: { - DAG.setRoot(DAG.getNode(ISD::TRAP, dl,MVT::Other, getRoot())); - return 0; - } - - case Intrinsic::uadd_with_overflow: - return implVisitAluOverflow(I, ISD::UADDO); - case Intrinsic::sadd_with_overflow: - return implVisitAluOverflow(I, ISD::SADDO); - case Intrinsic::usub_with_overflow: - return implVisitAluOverflow(I, ISD::USUBO); - case Intrinsic::ssub_with_overflow: - return implVisitAluOverflow(I, ISD::SSUBO); - case Intrinsic::umul_with_overflow: - return implVisitAluOverflow(I, ISD::UMULO); - case Intrinsic::smul_with_overflow: - return 
implVisitAluOverflow(I, ISD::SMULO); - - case Intrinsic::prefetch: { - SDValue Ops[4]; - Ops[0] = getRoot(); - Ops[1] = getValue(I.getOperand(1)); - Ops[2] = getValue(I.getOperand(2)); - Ops[3] = getValue(I.getOperand(3)); - DAG.setRoot(DAG.getNode(ISD::PREFETCH, dl, MVT::Other, &Ops[0], 4)); - return 0; - } - - case Intrinsic::memory_barrier: { - SDValue Ops[6]; - Ops[0] = getRoot(); - for (int x = 1; x < 6; ++x) - Ops[x] = getValue(I.getOperand(x)); - - DAG.setRoot(DAG.getNode(ISD::MEMBARRIER, dl, MVT::Other, &Ops[0], 6)); - return 0; - } - case Intrinsic::atomic_cmp_swap: { - SDValue Root = getRoot(); - SDValue L = - DAG.getAtomic(ISD::ATOMIC_CMP_SWAP, getCurDebugLoc(), - getValue(I.getOperand(2)).getValueType().getSimpleVT(), - Root, - getValue(I.getOperand(1)), - getValue(I.getOperand(2)), - getValue(I.getOperand(3)), - I.getOperand(1)); - setValue(&I, L); - DAG.setRoot(L.getValue(1)); - return 0; - } - case Intrinsic::atomic_load_add: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_ADD); - case Intrinsic::atomic_load_sub: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_SUB); - case Intrinsic::atomic_load_or: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_OR); - case Intrinsic::atomic_load_xor: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_XOR); - case Intrinsic::atomic_load_and: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_AND); - case Intrinsic::atomic_load_nand: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_NAND); - case Intrinsic::atomic_load_max: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MAX); - case Intrinsic::atomic_load_min: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_MIN); - case Intrinsic::atomic_load_umin: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMIN); - case Intrinsic::atomic_load_umax: - return implVisitBinaryAtomic(I, ISD::ATOMIC_LOAD_UMAX); - case Intrinsic::atomic_swap: - return implVisitBinaryAtomic(I, ISD::ATOMIC_SWAP); - - case Intrinsic::invariant_start: - case Intrinsic::lifetime_start: 
- // Discard region information. - setValue(&I, DAG.getUNDEF(TLI.getPointerTy())); - return 0; - case Intrinsic::invariant_end: - case Intrinsic::lifetime_end: - // Discard region information. - return 0; - } -} - -/// Test if the given instruction is in a position to be optimized -/// with a tail-call. This roughly means that it's in a block with -/// a return and there's nothing that needs to be scheduled -/// between it and the return. -/// -/// This function only tests target-independent requirements. -/// For target-dependent requirements, a target should override -/// TargetLowering::IsEligibleForTailCallOptimization. -/// -static bool -isInTailCallPosition(const Instruction *I, Attributes CalleeRetAttr, - const TargetLowering &TLI) { - const BasicBlock *ExitBB = I->getParent(); - const TerminatorInst *Term = ExitBB->getTerminator(); - const ReturnInst *Ret = dyn_cast(Term); - const Function *F = ExitBB->getParent(); - - // The block must end in a return statement or an unreachable. - if (!Ret && !isa(Term)) return false; - - // If I will have a chain, make sure no other instruction that will have a - // chain interposes between I and the return. - if (I->mayHaveSideEffects() || I->mayReadFromMemory() || - !I->isSafeToSpeculativelyExecute()) - for (BasicBlock::const_iterator BBI = prior(prior(ExitBB->end())); ; - --BBI) { - if (&*BBI == I) - break; - if (BBI->mayHaveSideEffects() || BBI->mayReadFromMemory() || - !BBI->isSafeToSpeculativelyExecute()) - return false; - } - - // If the block ends with a void return or unreachable, it doesn't matter - // what the call's return type is. - if (!Ret || Ret->getNumOperands() == 0) return true; - - // If the return value is undef, it doesn't matter what the call's - // return type is. - if (isa(Ret->getOperand(0))) return true; - - // Conservatively require the attributes of the call to match those of - // the return. Ignore noalias because it doesn't affect the call sequence. 
- unsigned CallerRetAttr = F->getAttributes().getRetAttributes(); - if ((CalleeRetAttr ^ CallerRetAttr) & ~Attribute::NoAlias) - return false; - - // Otherwise, make sure the unmodified return value of I is the return value. - for (const Instruction *U = dyn_cast(Ret->getOperand(0)); ; - U = dyn_cast(U->getOperand(0))) { - if (!U) - return false; - if (!U->hasOneUse()) - return false; - if (U == I) - break; - // Check for a truly no-op truncate. - if (isa(U) && - TLI.isTruncateFree(U->getOperand(0)->getType(), U->getType())) - continue; - // Check for a truly no-op bitcast. - if (isa(U) && - (U->getOperand(0)->getType() == U->getType() || - (isa(U->getOperand(0)->getType()) && - isa(U->getType())))) - continue; - // Otherwise it's not a true no-op. - return false; - } - - return true; -} - -void SelectionDAGLowering::LowerCallTo(CallSite CS, SDValue Callee, - bool isTailCall, - MachineBasicBlock *LandingPad) { - const PointerType *PT = cast(CS.getCalledValue()->getType()); - const FunctionType *FTy = cast(PT->getElementType()); - const Type *RetTy = FTy->getReturnType(); - MachineModuleInfo *MMI = DAG.getMachineModuleInfo(); - unsigned BeginLabel = 0, EndLabel = 0; - - TargetLowering::ArgListTy Args; - TargetLowering::ArgListEntry Entry; - Args.reserve(CS.arg_size()); - - // Check whether the function can return without sret-demotion. 
- SmallVector OutVTs; - SmallVector OutsFlags; - SmallVector Offsets; - getReturnInfo(RetTy, CS.getAttributes().getRetAttributes(), - OutVTs, OutsFlags, TLI, &Offsets); - - - bool CanLowerReturn = TLI.CanLowerReturn(CS.getCallingConv(), - FTy->isVarArg(), OutVTs, OutsFlags, DAG); - - SDValue DemoteStackSlot; - - if (!CanLowerReturn) { - uint64_t TySize = TLI.getTargetData()->getTypeAllocSize( - FTy->getReturnType()); - unsigned Align = TLI.getTargetData()->getPrefTypeAlignment( - FTy->getReturnType()); - MachineFunction &MF = DAG.getMachineFunction(); - int SSFI = MF.getFrameInfo()->CreateStackObject(TySize, Align, false); - const Type *StackSlotPtrType = PointerType::getUnqual(FTy->getReturnType()); - - DemoteStackSlot = DAG.getFrameIndex(SSFI, TLI.getPointerTy()); - Entry.Node = DemoteStackSlot; - Entry.Ty = StackSlotPtrType; - Entry.isSExt = false; - Entry.isZExt = false; - Entry.isInReg = false; - Entry.isSRet = true; - Entry.isNest = false; - Entry.isByVal = false; - Entry.Alignment = Align; - Args.push_back(Entry); - RetTy = Type::getVoidTy(FTy->getContext()); - } - - for (CallSite::arg_iterator i = CS.arg_begin(), e = CS.arg_end(); - i != e; ++i) { - SDValue ArgNode = getValue(*i); - Entry.Node = ArgNode; Entry.Ty = (*i)->getType(); - - unsigned attrInd = i - CS.arg_begin() + 1; - Entry.isSExt = CS.paramHasAttr(attrInd, Attribute::SExt); - Entry.isZExt = CS.paramHasAttr(attrInd, Attribute::ZExt); - Entry.isInReg = CS.paramHasAttr(attrInd, Attribute::InReg); - Entry.isSRet = CS.paramHasAttr(attrInd, Attribute::StructRet); - Entry.isNest = CS.paramHasAttr(attrInd, Attribute::Nest); - Entry.isByVal = CS.paramHasAttr(attrInd, Attribute::ByVal); - Entry.Alignment = CS.getParamAlignment(attrInd); - Args.push_back(Entry); - } - - if (LandingPad && MMI) { - // Insert a label before the invoke call to mark the try range. This can be - // used to detect deletion of the invoke via the MachineModuleInfo. 
- BeginLabel = MMI->NextLabelID(); - - // Both PendingLoads and PendingExports must be flushed here; - // this call might not return. - (void)getRoot(); - DAG.setRoot(DAG.getLabel(ISD::EH_LABEL, getCurDebugLoc(), - getControlRoot(), BeginLabel)); - } - - // Check if target-independent constraints permit a tail call here. - // Target-dependent constraints are checked within TLI.LowerCallTo. - if (isTailCall && - !isInTailCallPosition(CS.getInstruction(), - CS.getAttributes().getRetAttributes(), - TLI)) - isTailCall = false; - - std::pair Result = - TLI.LowerCallTo(getRoot(), RetTy, - CS.paramHasAttr(0, Attribute::SExt), - CS.paramHasAttr(0, Attribute::ZExt), FTy->isVarArg(), - CS.paramHasAttr(0, Attribute::InReg), FTy->getNumParams(), - CS.getCallingConv(), - isTailCall, - !CS.getInstruction()->use_empty(), - Callee, Args, DAG, getCurDebugLoc()); - assert((isTailCall || Result.second.getNode()) && - "Non-null chain expected with non-tail call!"); - assert((Result.second.getNode() || !Result.first.getNode()) && - "Null value expected with tail call!"); - if (Result.first.getNode()) - setValue(CS.getInstruction(), Result.first); - else if (!CanLowerReturn && Result.second.getNode()) { - // The instruction result is the result of loading from the - // hidden sret parameter. 
- SmallVector PVTs; - const Type *PtrRetTy = PointerType::getUnqual(FTy->getReturnType()); - - ComputeValueVTs(TLI, PtrRetTy, PVTs); - assert(PVTs.size() == 1 && "Pointers should fit in one register"); - EVT PtrVT = PVTs[0]; - unsigned NumValues = OutVTs.size(); - SmallVector Values(NumValues); - SmallVector Chains(NumValues); - - for (unsigned i = 0; i < NumValues; ++i) { - SDValue L = DAG.getLoad(OutVTs[i], getCurDebugLoc(), Result.second, - DAG.getNode(ISD::ADD, getCurDebugLoc(), PtrVT, DemoteStackSlot, - DAG.getConstant(Offsets[i], PtrVT)), - NULL, Offsets[i], false, 1); - Values[i] = L; - Chains[i] = L.getValue(1); - } - SDValue Chain = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), - MVT::Other, &Chains[0], NumValues); - PendingLoads.push_back(Chain); - - setValue(CS.getInstruction(), DAG.getNode(ISD::MERGE_VALUES, - getCurDebugLoc(), DAG.getVTList(&OutVTs[0], NumValues), - &Values[0], NumValues)); - } - // As a special case, a null chain means that a tail call has - // been emitted and the DAG root is already updated. - if (Result.second.getNode()) - DAG.setRoot(Result.second); - else - HasTailCall = true; - - if (LandingPad && MMI) { - // Insert a label at the end of the invoke call to mark the try range. This - // can be used to detect deletion of the invoke via the MachineModuleInfo. - EndLabel = MMI->NextLabelID(); - DAG.setRoot(DAG.getLabel(ISD::EH_LABEL, getCurDebugLoc(), - getRoot(), EndLabel)); - - // Inform MachineModuleInfo of range. 
- MMI->addInvoke(LandingPad, BeginLabel, EndLabel); - } -} - - -void SelectionDAGLowering::visitCall(CallInst &I) { - const char *RenameFn = 0; - if (Function *F = I.getCalledFunction()) { - if (F->isDeclaration()) { - const TargetIntrinsicInfo *II = TLI.getTargetMachine().getIntrinsicInfo(); - if (II) { - if (unsigned IID = II->getIntrinsicID(F)) { - RenameFn = visitIntrinsicCall(I, IID); - if (!RenameFn) - return; - } - } - if (unsigned IID = F->getIntrinsicID()) { - RenameFn = visitIntrinsicCall(I, IID); - if (!RenameFn) - return; - } - } - - // Check for well-known libc/libm calls. If the function is internal, it - // can't be a library call. - if (!F->hasLocalLinkage() && F->hasName()) { - StringRef Name = F->getName(); - if (Name == "copysign" || Name == "copysignf") { - if (I.getNumOperands() == 3 && // Basic sanity checks. - I.getOperand(1)->getType()->isFloatingPoint() && - I.getType() == I.getOperand(1)->getType() && - I.getType() == I.getOperand(2)->getType()) { - SDValue LHS = getValue(I.getOperand(1)); - SDValue RHS = getValue(I.getOperand(2)); - setValue(&I, DAG.getNode(ISD::FCOPYSIGN, getCurDebugLoc(), - LHS.getValueType(), LHS, RHS)); - return; - } - } else if (Name == "fabs" || Name == "fabsf" || Name == "fabsl") { - if (I.getNumOperands() == 2 && // Basic sanity checks. - I.getOperand(1)->getType()->isFloatingPoint() && - I.getType() == I.getOperand(1)->getType()) { - SDValue Tmp = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FABS, getCurDebugLoc(), - Tmp.getValueType(), Tmp)); - return; - } - } else if (Name == "sin" || Name == "sinf" || Name == "sinl") { - if (I.getNumOperands() == 2 && // Basic sanity checks. 
- I.getOperand(1)->getType()->isFloatingPoint() && - I.getType() == I.getOperand(1)->getType() && - I.onlyReadsMemory()) { - SDValue Tmp = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FSIN, getCurDebugLoc(), - Tmp.getValueType(), Tmp)); - return; - } - } else if (Name == "cos" || Name == "cosf" || Name == "cosl") { - if (I.getNumOperands() == 2 && // Basic sanity checks. - I.getOperand(1)->getType()->isFloatingPoint() && - I.getType() == I.getOperand(1)->getType() && - I.onlyReadsMemory()) { - SDValue Tmp = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FCOS, getCurDebugLoc(), - Tmp.getValueType(), Tmp)); - return; - } - } else if (Name == "sqrt" || Name == "sqrtf" || Name == "sqrtl") { - if (I.getNumOperands() == 2 && // Basic sanity checks. - I.getOperand(1)->getType()->isFloatingPoint() && - I.getType() == I.getOperand(1)->getType() && - I.onlyReadsMemory()) { - SDValue Tmp = getValue(I.getOperand(1)); - setValue(&I, DAG.getNode(ISD::FSQRT, getCurDebugLoc(), - Tmp.getValueType(), Tmp)); - return; - } - } - } - } else if (isa(I.getOperand(0))) { - visitInlineAsm(&I); - return; - } - - SDValue Callee; - if (!RenameFn) - Callee = getValue(I.getOperand(0)); - else - Callee = DAG.getExternalSymbol(RenameFn, TLI.getPointerTy()); - - // Check if we can potentially perform a tail call. More detailed - // checking is be done within LowerCallTo, after more information - // about the call is known. - bool isTailCall = PerformTailCallOpt && I.isTailCall(); - - LowerCallTo(&I, Callee, isTailCall); -} - - -/// getCopyFromRegs - Emit a series of CopyFromReg nodes that copies from -/// this value and returns the result as a ValueVT value. This uses -/// Chain/Flag as the input and updates them for the output Chain/Flag. -/// If the Flag pointer is NULL, no flag is used. -SDValue RegsForValue::getCopyFromRegs(SelectionDAG &DAG, DebugLoc dl, - SDValue &Chain, - SDValue *Flag) const { - // Assemble the legal parts into the final values. 
- SmallVector Values(ValueVTs.size()); - SmallVector Parts; - for (unsigned Value = 0, Part = 0, e = ValueVTs.size(); Value != e; ++Value) { - // Copy the legal parts from the registers. - EVT ValueVT = ValueVTs[Value]; - unsigned NumRegs = TLI->getNumRegisters(*DAG.getContext(), ValueVT); - EVT RegisterVT = RegVTs[Value]; - - Parts.resize(NumRegs); - for (unsigned i = 0; i != NumRegs; ++i) { - SDValue P; - if (Flag == 0) - P = DAG.getCopyFromReg(Chain, dl, Regs[Part+i], RegisterVT); - else { - P = DAG.getCopyFromReg(Chain, dl, Regs[Part+i], RegisterVT, *Flag); - *Flag = P.getValue(2); - } - Chain = P.getValue(1); - - // If the source register was virtual and if we know something about it, - // add an assert node. - if (TargetRegisterInfo::isVirtualRegister(Regs[Part+i]) && - RegisterVT.isInteger() && !RegisterVT.isVector()) { - unsigned SlotNo = Regs[Part+i]-TargetRegisterInfo::FirstVirtualRegister; - FunctionLoweringInfo &FLI = DAG.getFunctionLoweringInfo(); - if (FLI.LiveOutRegInfo.size() > SlotNo) { - FunctionLoweringInfo::LiveOutInfo &LOI = FLI.LiveOutRegInfo[SlotNo]; - - unsigned RegSize = RegisterVT.getSizeInBits(); - unsigned NumSignBits = LOI.NumSignBits; - unsigned NumZeroBits = LOI.KnownZero.countLeadingOnes(); - - // FIXME: We capture more information than the dag can represent. For - // now, just use the tightest assertzext/assertsext possible. 
- bool isSExt = true; - EVT FromVT(MVT::Other); - if (NumSignBits == RegSize) - isSExt = true, FromVT = MVT::i1; // ASSERT SEXT 1 - else if (NumZeroBits >= RegSize-1) - isSExt = false, FromVT = MVT::i1; // ASSERT ZEXT 1 - else if (NumSignBits > RegSize-8) - isSExt = true, FromVT = MVT::i8; // ASSERT SEXT 8 - else if (NumZeroBits >= RegSize-8) - isSExt = false, FromVT = MVT::i8; // ASSERT ZEXT 8 - else if (NumSignBits > RegSize-16) - isSExt = true, FromVT = MVT::i16; // ASSERT SEXT 16 - else if (NumZeroBits >= RegSize-16) - isSExt = false, FromVT = MVT::i16; // ASSERT ZEXT 16 - else if (NumSignBits > RegSize-32) - isSExt = true, FromVT = MVT::i32; // ASSERT SEXT 32 - else if (NumZeroBits >= RegSize-32) - isSExt = false, FromVT = MVT::i32; // ASSERT ZEXT 32 - - if (FromVT != MVT::Other) { - P = DAG.getNode(isSExt ? ISD::AssertSext : ISD::AssertZext, dl, - RegisterVT, P, DAG.getValueType(FromVT)); - - } - } - } - - Parts[i] = P; - } - - Values[Value] = getCopyFromParts(DAG, dl, Parts.begin(), - NumRegs, RegisterVT, ValueVT); - Part += NumRegs; - Parts.clear(); - } - - return DAG.getNode(ISD::MERGE_VALUES, dl, - DAG.getVTList(&ValueVTs[0], ValueVTs.size()), - &Values[0], ValueVTs.size()); -} - -/// getCopyToRegs - Emit a series of CopyToReg nodes that copies the -/// specified value into the registers specified by this object. This uses -/// Chain/Flag as the input and updates them for the output Chain/Flag. -/// If the Flag pointer is NULL, no flag is used. -void RegsForValue::getCopyToRegs(SDValue Val, SelectionDAG &DAG, DebugLoc dl, - SDValue &Chain, SDValue *Flag) const { - // Get the list of the values's legal parts. 
- unsigned NumRegs = Regs.size(); - SmallVector Parts(NumRegs); - for (unsigned Value = 0, Part = 0, e = ValueVTs.size(); Value != e; ++Value) { - EVT ValueVT = ValueVTs[Value]; - unsigned NumParts = TLI->getNumRegisters(*DAG.getContext(), ValueVT); - EVT RegisterVT = RegVTs[Value]; - - getCopyToParts(DAG, dl, Val.getValue(Val.getResNo() + Value), - &Parts[Part], NumParts, RegisterVT); - Part += NumParts; - } - - // Copy the parts into the registers. - SmallVector Chains(NumRegs); - for (unsigned i = 0; i != NumRegs; ++i) { - SDValue Part; - if (Flag == 0) - Part = DAG.getCopyToReg(Chain, dl, Regs[i], Parts[i]); - else { - Part = DAG.getCopyToReg(Chain, dl, Regs[i], Parts[i], *Flag); - *Flag = Part.getValue(1); - } - Chains[i] = Part.getValue(0); - } - - if (NumRegs == 1 || Flag) - // If NumRegs > 1 && Flag is used then the use of the last CopyToReg is - // flagged to it. That is the CopyToReg nodes and the user are considered - // a single scheduling unit. If we create a TokenFactor and return it as - // chain, then the TokenFactor is both a predecessor (operand) of the - // user as well as a successor (the TF operands are flagged to the user). - // c1, f1 = CopyToReg - // c2, f2 = CopyToReg - // c3 = TokenFactor c1, c2 - // ... - // = op c3, ..., f2 - Chain = Chains[NumRegs-1]; - else - Chain = DAG.getNode(ISD::TokenFactor, dl, MVT::Other, &Chains[0], NumRegs); -} - -/// AddInlineAsmOperands - Add this value to the specified inlineasm node -/// operand list. This adds the code marker and includes the number of -/// values added into it. 
-void RegsForValue::AddInlineAsmOperands(unsigned Code, - bool HasMatching,unsigned MatchingIdx, - SelectionDAG &DAG, - std::vector<SDValue> &Ops) const { - EVT IntPtrTy = DAG.getTargetLoweringInfo().getPointerTy(); - assert(Regs.size() < (1 << 13) && "Too many inline asm outputs!"); - unsigned Flag = Code | (Regs.size() << 3); - if (HasMatching) - Flag |= 0x80000000 | (MatchingIdx << 16); - Ops.push_back(DAG.getTargetConstant(Flag, IntPtrTy)); - for (unsigned Value = 0, Reg = 0, e = ValueVTs.size(); Value != e; ++Value) { - unsigned NumRegs = TLI->getNumRegisters(*DAG.getContext(), ValueVTs[Value]); - EVT RegisterVT = RegVTs[Value]; - for (unsigned i = 0; i != NumRegs; ++i) { - assert(Reg < Regs.size() && "Mismatch in # registers expected"); - Ops.push_back(DAG.getRegister(Regs[Reg++], RegisterVT)); - } - } -} - -/// isAllocatableRegister - If the specified register is safe to allocate, -/// i.e. it isn't a stack pointer or some other special register, return the -/// register class for the register. Otherwise, return null. -static const TargetRegisterClass * -isAllocatableRegister(unsigned Reg, MachineFunction &MF, - const TargetLowering &TLI, - const TargetRegisterInfo *TRI) { - EVT FoundVT = MVT::Other; - const TargetRegisterClass *FoundRC = 0; - for (TargetRegisterInfo::regclass_iterator RCI = TRI->regclass_begin(), - E = TRI->regclass_end(); RCI != E; ++RCI) { - EVT ThisVT = MVT::Other; - - const TargetRegisterClass *RC = *RCI; - // If none of the the value types for this register class are valid, we - // can't use it. For example, 64-bit reg classes on 32-bit targets. - for (TargetRegisterClass::vt_iterator I = RC->vt_begin(), E = RC->vt_end(); - I != E; ++I) { - if (TLI.isTypeLegal(*I)) { - // If we have already found this register in a different register class, - // choose the one with the largest VT specified. For example, on - // PowerPC, we favor f64 register classes over f32.
- if (FoundVT == MVT::Other || FoundVT.bitsLT(*I)) { - ThisVT = *I; - break; - } - } - } - - if (ThisVT == MVT::Other) continue; - - // NOTE: This isn't ideal. In particular, this might allocate the - // frame pointer in functions that need it (due to them not being taken - // out of allocation, because a variable sized allocation hasn't been seen - // yet). This is a slight code pessimization, but should still work. - for (TargetRegisterClass::iterator I = RC->allocation_order_begin(MF), - E = RC->allocation_order_end(MF); I != E; ++I) - if (*I == Reg) { - // We found a matching register class. Keep looking at others in case - // we find one with larger registers that this physreg is also in. - FoundRC = RC; - FoundVT = ThisVT; - break; - } - } - return FoundRC; -} - - -namespace llvm { -/// AsmOperandInfo - This contains information for each constraint that we are -/// lowering. -class VISIBILITY_HIDDEN SDISelAsmOperandInfo : - public TargetLowering::AsmOperandInfo { -public: - /// CallOperand - If this is the result output operand or a clobber - /// this is null, otherwise it is the incoming operand to the CallInst. - /// This gets modified as the asm is processed. - SDValue CallOperand; - - /// AssignedRegs - If this is a register or register class operand, this - /// contains the set of register corresponding to the operand. - RegsForValue AssignedRegs; - - explicit SDISelAsmOperandInfo(const InlineAsm::ConstraintInfo &info) - : TargetLowering::AsmOperandInfo(info), CallOperand(0,0) { - } - - /// MarkAllocatedRegs - Once AssignedRegs is set, mark the assigned registers - /// busy in OutputRegs/InputRegs. 
- void MarkAllocatedRegs(bool isOutReg, bool isInReg, - std::set<unsigned> &OutputRegs, - std::set<unsigned> &InputRegs, - const TargetRegisterInfo &TRI) const { - if (isOutReg) { - for (unsigned i = 0, e = AssignedRegs.Regs.size(); i != e; ++i) - MarkRegAndAliases(AssignedRegs.Regs[i], OutputRegs, TRI); - } - if (isInReg) { - for (unsigned i = 0, e = AssignedRegs.Regs.size(); i != e; ++i) - MarkRegAndAliases(AssignedRegs.Regs[i], InputRegs, TRI); - } - } - - /// getCallOperandValEVT - Return the EVT of the Value* that this operand - /// corresponds to. If there is no Value* for this operand, it returns - /// MVT::Other. - EVT getCallOperandValEVT(LLVMContext &Context, - const TargetLowering &TLI, - const TargetData *TD) const { - if (CallOperandVal == 0) return MVT::Other; - - if (isa<BasicBlock>(CallOperandVal)) - return TLI.getPointerTy(); - - const llvm::Type *OpTy = CallOperandVal->getType(); - - // If this is an indirect operand, the operand is a pointer to the - // accessed type. - if (isIndirect) - OpTy = cast<PointerType>(OpTy)->getElementType(); - - // If OpTy is not a single value, it may be a struct/union that we - // can tile with integers. - if (!OpTy->isSingleValueType() && OpTy->isSized()) { - unsigned BitSize = TD->getTypeSizeInBits(OpTy); - switch (BitSize) { - default: break; - case 1: - case 8: - case 16: - case 32: - case 64: - case 128: - OpTy = IntegerType::get(Context, BitSize); - break; - } - } - - return TLI.getValueType(OpTy, true); - } - -private: - /// MarkRegAndAliases - Mark the specified register and all aliases in the - /// specified set. - static void MarkRegAndAliases(unsigned Reg, std::set<unsigned> &Regs, - const TargetRegisterInfo &TRI) { - assert(TargetRegisterInfo::isPhysicalRegister(Reg) && "Isn't a physreg"); - Regs.insert(Reg); - if (const unsigned *Aliases = TRI.getAliasSet(Reg)) - for (; *Aliases; ++Aliases) - Regs.insert(*Aliases); - } -}; -} // end llvm namespace. - - -/// GetRegistersForValue - Assign registers (virtual or physical) for the -/// specified operand.
We prefer to assign virtual registers, to allow the -/// register allocator handle the assignment process. However, if the asm uses -/// features that we can't model on machineinstrs, we have SDISel do the -/// allocation. This produces generally horrible, but correct, code. -/// -/// OpInfo describes the operand. -/// Input and OutputRegs are the set of already allocated physical registers. -/// -void SelectionDAGLowering:: -GetRegistersForValue(SDISelAsmOperandInfo &OpInfo, - std::set &OutputRegs, - std::set &InputRegs) { - LLVMContext &Context = FuncInfo.Fn->getContext(); - - // Compute whether this value requires an input register, an output register, - // or both. - bool isOutReg = false; - bool isInReg = false; - switch (OpInfo.Type) { - case InlineAsm::isOutput: - isOutReg = true; - - // If there is an input constraint that matches this, we need to reserve - // the input register so no other inputs allocate to it. - isInReg = OpInfo.hasMatchingInput(); - break; - case InlineAsm::isInput: - isInReg = true; - isOutReg = false; - break; - case InlineAsm::isClobber: - isOutReg = true; - isInReg = true; - break; - } - - - MachineFunction &MF = DAG.getMachineFunction(); - SmallVector Regs; - - // If this is a constraint for a single physreg, or a constraint for a - // register class, find it. - std::pair PhysReg = - TLI.getRegForInlineAsmConstraint(OpInfo.ConstraintCode, - OpInfo.ConstraintVT); - - unsigned NumRegs = 1; - if (OpInfo.ConstraintVT != MVT::Other) { - // If this is a FP input in an integer register (or visa versa) insert a bit - // cast of the input value. More generally, handle any case where the input - // value disagrees with the register class we plan to stick this in. - if (OpInfo.Type == InlineAsm::isInput && - PhysReg.second && !PhysReg.second->hasType(OpInfo.ConstraintVT)) { - // Try to convert to the first EVT that the reg class contains. If the - // types are identical size, use a bitcast to convert (e.g. two differing - // vector types). 
- EVT RegVT = *PhysReg.second->vt_begin(); - if (RegVT.getSizeInBits() == OpInfo.ConstraintVT.getSizeInBits()) { - OpInfo.CallOperand = DAG.getNode(ISD::BIT_CONVERT, getCurDebugLoc(), - RegVT, OpInfo.CallOperand); - OpInfo.ConstraintVT = RegVT; - } else if (RegVT.isInteger() && OpInfo.ConstraintVT.isFloatingPoint()) { - // If the input is a FP value and we want it in FP registers, do a - // bitcast to the corresponding integer type. This turns an f64 value - // into i64, which can be passed with two i32 values on a 32-bit - // machine. - RegVT = EVT::getIntegerVT(Context, - OpInfo.ConstraintVT.getSizeInBits()); - OpInfo.CallOperand = DAG.getNode(ISD::BIT_CONVERT, getCurDebugLoc(), - RegVT, OpInfo.CallOperand); - OpInfo.ConstraintVT = RegVT; - } - } - - NumRegs = TLI.getNumRegisters(Context, OpInfo.ConstraintVT); - } - - EVT RegVT; - EVT ValueVT = OpInfo.ConstraintVT; - - // If this is a constraint for a specific physical register, like {r17}, - // assign it now. - if (unsigned AssignedReg = PhysReg.first) { - const TargetRegisterClass *RC = PhysReg.second; - if (OpInfo.ConstraintVT == MVT::Other) - ValueVT = *RC->vt_begin(); - - // Get the actual register value type. This is important, because the user - // may have asked for (e.g.) the AX register in i32 type. We need to - // remember that AX is actually i16 to get the right extension. - RegVT = *RC->vt_begin(); - - // This is a explicit reference to a physical register. - Regs.push_back(AssignedReg); - - // If this is an expanded reference, add the rest of the regs to Regs. - if (NumRegs != 1) { - TargetRegisterClass::iterator I = RC->begin(); - for (; *I != AssignedReg; ++I) - assert(I != RC->end() && "Didn't find reg!"); - - // Already added the first reg. 
- --NumRegs; ++I; - for (; NumRegs; --NumRegs, ++I) { - assert(I != RC->end() && "Ran out of registers to allocate!"); - Regs.push_back(*I); - } - } - OpInfo.AssignedRegs = RegsForValue(TLI, Regs, RegVT, ValueVT); - const TargetRegisterInfo *TRI = DAG.getTarget().getRegisterInfo(); - OpInfo.MarkAllocatedRegs(isOutReg, isInReg, OutputRegs, InputRegs, *TRI); - return; - } - - // Otherwise, if this was a reference to an LLVM register class, create vregs - // for this reference. - if (const TargetRegisterClass *RC = PhysReg.second) { - RegVT = *RC->vt_begin(); - if (OpInfo.ConstraintVT == MVT::Other) - ValueVT = RegVT; - - // Create the appropriate number of virtual registers. - MachineRegisterInfo &RegInfo = MF.getRegInfo(); - for (; NumRegs; --NumRegs) - Regs.push_back(RegInfo.createVirtualRegister(RC)); - - OpInfo.AssignedRegs = RegsForValue(TLI, Regs, RegVT, ValueVT); - return; - } - - // This is a reference to a register class that doesn't directly correspond - // to an LLVM register class. Allocate NumRegs consecutive, available, - // registers from the class. - std::vector RegClassRegs - = TLI.getRegClassForInlineAsmConstraint(OpInfo.ConstraintCode, - OpInfo.ConstraintVT); - - const TargetRegisterInfo *TRI = DAG.getTarget().getRegisterInfo(); - unsigned NumAllocated = 0; - for (unsigned i = 0, e = RegClassRegs.size(); i != e; ++i) { - unsigned Reg = RegClassRegs[i]; - // See if this register is available. - if ((isOutReg && OutputRegs.count(Reg)) || // Already used. - (isInReg && InputRegs.count(Reg))) { // Already used. - // Make sure we find consecutive registers. - NumAllocated = 0; - continue; - } - - // Check to see if this register is allocatable (i.e. don't give out the - // stack pointer). - const TargetRegisterClass *RC = isAllocatableRegister(Reg, MF, TLI, TRI); - if (!RC) { // Couldn't allocate this register. - // Reset NumAllocated to make sure we return consecutive registers. 
- NumAllocated = 0; - continue; - } - - // Okay, this register is good, we can use it. - ++NumAllocated; - - // If we allocated enough consecutive registers, succeed. - if (NumAllocated == NumRegs) { - unsigned RegStart = (i-NumAllocated)+1; - unsigned RegEnd = i+1; - // Mark all of the allocated registers used. - for (unsigned i = RegStart; i != RegEnd; ++i) - Regs.push_back(RegClassRegs[i]); - - OpInfo.AssignedRegs = RegsForValue(TLI, Regs, *RC->vt_begin(), - OpInfo.ConstraintVT); - OpInfo.MarkAllocatedRegs(isOutReg, isInReg, OutputRegs, InputRegs, *TRI); - return; - } - } - - // Otherwise, we couldn't allocate enough registers for this. -} - -/// hasInlineAsmMemConstraint - Return true if the inline asm instruction being -/// processed uses a memory 'm' constraint. -static bool -hasInlineAsmMemConstraint(std::vector<InlineAsm::ConstraintInfo> &CInfos, - const TargetLowering &TLI) { - for (unsigned i = 0, e = CInfos.size(); i != e; ++i) { - InlineAsm::ConstraintInfo &CI = CInfos[i]; - for (unsigned j = 0, ee = CI.Codes.size(); j != ee; ++j) { - TargetLowering::ConstraintType CType = TLI.getConstraintType(CI.Codes[j]); - if (CType == TargetLowering::C_Memory) - return true; - } - - // Indirect operand accesses access memory. - if (CI.isIndirect) - return true; - } - - return false; -} - -/// visitInlineAsm - Handle a call to an InlineAsm object. -/// -void SelectionDAGLowering::visitInlineAsm(CallSite CS) { - InlineAsm *IA = cast<InlineAsm>(CS.getCalledValue()); - - /// ConstraintOperands - Information about all of the constraints. - std::vector<SDISelAsmOperandInfo> ConstraintOperands; - - std::set<unsigned> OutputRegs, InputRegs; - - // Do a prepass over the constraints, canonicalizing them, and building up the - // ConstraintOperands list. - std::vector<InlineAsm::ConstraintInfo> - ConstraintInfos = IA->ParseConstraints(); - - bool hasMemory = hasInlineAsmMemConstraint(ConstraintInfos, TLI); - - SDValue Chain, Flag; - - // We won't need to flush pending loads if this asm doesn't touch - // memory and is nonvolatile.
- if (hasMemory || IA->hasSideEffects()) - Chain = getRoot(); - else - Chain = DAG.getRoot(); - - unsigned ArgNo = 0; // ArgNo - The argument of the CallInst. - unsigned ResNo = 0; // ResNo - The result number of the next output. - for (unsigned i = 0, e = ConstraintInfos.size(); i != e; ++i) { - ConstraintOperands.push_back(SDISelAsmOperandInfo(ConstraintInfos[i])); - SDISelAsmOperandInfo &OpInfo = ConstraintOperands.back(); - - EVT OpVT = MVT::Other; - - // Compute the value type for each operand. - switch (OpInfo.Type) { - case InlineAsm::isOutput: - // Indirect outputs just consume an argument. - if (OpInfo.isIndirect) { - OpInfo.CallOperandVal = CS.getArgument(ArgNo++); - break; - } - - // The return value of the call is this value. As such, there is no - // corresponding argument. - assert(CS.getType() != Type::getVoidTy(*DAG.getContext()) && - "Bad inline asm!"); - if (const StructType *STy = dyn_cast<StructType>(CS.getType())) { - OpVT = TLI.getValueType(STy->getElementType(ResNo)); - } else { - assert(ResNo == 0 && "Asm only has one result!"); - OpVT = TLI.getValueType(CS.getType()); - } - ++ResNo; - break; - case InlineAsm::isInput: - OpInfo.CallOperandVal = CS.getArgument(ArgNo++); - break; - case InlineAsm::isClobber: - // Nothing to do. - break; - } - - // If this is an input or an indirect output, process the call argument. - // BasicBlocks are labels, currently appearing only in asm's. - if (OpInfo.CallOperandVal) { - // Strip bitcasts, if any. This mostly comes up for functions.
- OpInfo.CallOperandVal = OpInfo.CallOperandVal->stripPointerCasts(); - - if (BasicBlock *BB = dyn_cast<BasicBlock>(OpInfo.CallOperandVal)) { - OpInfo.CallOperand = DAG.getBasicBlock(FuncInfo.MBBMap[BB]); - } else { - OpInfo.CallOperand = getValue(OpInfo.CallOperandVal); - } - - OpVT = OpInfo.getCallOperandValEVT(*DAG.getContext(), TLI, TD); - } - - OpInfo.ConstraintVT = OpVT; - } - - // Second pass over the constraints: compute which constraint option to use - // and assign registers to constraints that want a specific physreg. - for (unsigned i = 0, e = ConstraintInfos.size(); i != e; ++i) { - SDISelAsmOperandInfo &OpInfo = ConstraintOperands[i]; - - // If this is an output operand with a matching input operand, look up the - // matching input. If their types mismatch, e.g. one is an integer, the - // other is floating point, or their sizes are different, flag it as an - // error. - if (OpInfo.hasMatchingInput()) { - SDISelAsmOperandInfo &Input = ConstraintOperands[OpInfo.MatchingInput]; - if (OpInfo.ConstraintVT != Input.ConstraintVT) { - if ((OpInfo.ConstraintVT.isInteger() != - Input.ConstraintVT.isInteger()) || - (OpInfo.ConstraintVT.getSizeInBits() != - Input.ConstraintVT.getSizeInBits())) { - llvm_report_error("Unsupported asm: input constraint" - " with a matching output constraint of incompatible" - " type!"); - } - Input.ConstraintVT = OpInfo.ConstraintVT; - } - } - - // Compute the constraint code and ConstraintType to use. - TLI.ComputeConstraintToUse(OpInfo, OpInfo.CallOperand, hasMemory, &DAG); - - // If this is a memory input, and if the operand is not indirect, do what we - // need to to provide an address for the memory input. - if (OpInfo.ConstraintType == TargetLowering::C_Memory && - !OpInfo.isIndirect) { - assert(OpInfo.Type == InlineAsm::isInput && - "Can only indirectify direct input operands!"); - - // Memory operands really want the address of the value.
If we don't have - // an indirect input, put it in the constpool if we can, otherwise spill - // it to a stack slot. - - // If the operand is a float, integer, or vector constant, spill to a - // constant pool entry to get its address. - Value *OpVal = OpInfo.CallOperandVal; - if (isa<ConstantFP>(OpVal) || isa<ConstantInt>(OpVal) || - isa<ConstantVector>(OpVal)) { - OpInfo.CallOperand = DAG.getConstantPool(cast<Constant>(OpVal), - TLI.getPointerTy()); - } else { - // Otherwise, create a stack slot and emit a store to it before the - // asm. - const Type *Ty = OpVal->getType(); - uint64_t TySize = TLI.getTargetData()->getTypeAllocSize(Ty); - unsigned Align = TLI.getTargetData()->getPrefTypeAlignment(Ty); - MachineFunction &MF = DAG.getMachineFunction(); - int SSFI = MF.getFrameInfo()->CreateStackObject(TySize, Align, false); - SDValue StackSlot = DAG.getFrameIndex(SSFI, TLI.getPointerTy()); - Chain = DAG.getStore(Chain, getCurDebugLoc(), - OpInfo.CallOperand, StackSlot, NULL, 0); - OpInfo.CallOperand = StackSlot; - } - - // There is no longer a Value* corresponding to this operand. - OpInfo.CallOperandVal = 0; - // It is now an indirect operand. - OpInfo.isIndirect = true; - } - - // If this constraint is for a specific register, allocate it before - // anything else. - if (OpInfo.ConstraintType == TargetLowering::C_Register) - GetRegistersForValue(OpInfo, OutputRegs, InputRegs); - } - ConstraintInfos.clear(); - - - // Second pass - Loop over all of the operands, assigning virtual or physregs - // to register class operands. - for (unsigned i = 0, e = ConstraintOperands.size(); i != e; ++i) { - SDISelAsmOperandInfo &OpInfo = ConstraintOperands[i]; - - // C_Register operands have already been allocated, Other/Memory don't need - // to be. - if (OpInfo.ConstraintType == TargetLowering::C_RegisterClass) - GetRegistersForValue(OpInfo, OutputRegs, InputRegs); - } - - // AsmNodeOperands - The operands for the ISD::INLINEASM node.
- std::vector<SDValue> AsmNodeOperands; - AsmNodeOperands.push_back(SDValue()); // reserve space for input chain - AsmNodeOperands.push_back( - DAG.getTargetExternalSymbol(IA->getAsmString().c_str(), MVT::Other)); - - - // Loop over all of the inputs, copying the operand values into the - // appropriate registers and processing the output regs. - RegsForValue RetValRegs; - - // IndirectStoresToEmit - The set of stores to emit after the inline asm node. - std::vector<std::pair<RegsForValue, Value*> > IndirectStoresToEmit; - - for (unsigned i = 0, e = ConstraintOperands.size(); i != e; ++i) { - SDISelAsmOperandInfo &OpInfo = ConstraintOperands[i]; - - switch (OpInfo.Type) { - case InlineAsm::isOutput: { - if (OpInfo.ConstraintType != TargetLowering::C_RegisterClass && - OpInfo.ConstraintType != TargetLowering::C_Register) { - // Memory output, or 'other' output (e.g. 'X' constraint). - assert(OpInfo.isIndirect && "Memory output must be indirect operand"); - - // Add information to the INLINEASM node to know about this output. - unsigned ResOpType = 4/*MEM*/ | (1<<3); - AsmNodeOperands.push_back(DAG.getTargetConstant(ResOpType, - TLI.getPointerTy())); - AsmNodeOperands.push_back(OpInfo.CallOperand); - break; - } - - // Otherwise, this is a register or register class output. - - // Copy the output from the appropriate register. Find a register that - // we can use. - if (OpInfo.AssignedRegs.Regs.empty()) { - llvm_report_error("Couldn't allocate output reg for" - " constraint '" + OpInfo.ConstraintCode + "'!"); - } - - // If this is an indirect operand, store through the pointer after the - // asm. - if (OpInfo.isIndirect) { - IndirectStoresToEmit.push_back(std::make_pair(OpInfo.AssignedRegs, - OpInfo.CallOperandVal)); - } else { - // This is the result value of the call. - assert(CS.getType() != Type::getVoidTy(*DAG.getContext()) && - "Bad inline asm!"); - // Concatenate this output onto the outputs list.
- RetValRegs.append(OpInfo.AssignedRegs); - } - - // Add information to the INLINEASM node to know that this register is - // set. - OpInfo.AssignedRegs.AddInlineAsmOperands(OpInfo.isEarlyClobber ? - 6 /* EARLYCLOBBER REGDEF */ : - 2 /* REGDEF */ , - false, - 0, - DAG, AsmNodeOperands); - break; - } - case InlineAsm::isInput: { - SDValue InOperandVal = OpInfo.CallOperand; - - if (OpInfo.isMatchingInputConstraint()) { // Matching constraint? - // If this is required to match an output register we have already set, - // just use its register. - unsigned OperandNo = OpInfo.getMatchedOperand(); - - // Scan until we find the definition we already emitted of this operand. - // When we find it, create a RegsForValue operand. - unsigned CurOp = 2; // The first operand. - for (; OperandNo; --OperandNo) { - // Advance to the next operand. - unsigned OpFlag = - cast<ConstantSDNode>(AsmNodeOperands[CurOp])->getZExtValue(); - assert(((OpFlag & 7) == 2 /*REGDEF*/ || - (OpFlag & 7) == 6 /*EARLYCLOBBER REGDEF*/ || - (OpFlag & 7) == 4 /*MEM*/) && - "Skipped past definitions?"); - CurOp += InlineAsm::getNumOperandRegisters(OpFlag)+1; - } - - unsigned OpFlag = - cast<ConstantSDNode>(AsmNodeOperands[CurOp])->getZExtValue(); - if ((OpFlag & 7) == 2 /*REGDEF*/ - || (OpFlag & 7) == 6 /* EARLYCLOBBER REGDEF */) { - // Add (OpFlag&0xffff)>>3 registers to MatchedRegs. - if (OpInfo.isIndirect) { - llvm_report_error("Don't know how to handle tied indirect " - "register inputs yet!"); - } - RegsForValue MatchedRegs; - MatchedRegs.TLI = &TLI; - MatchedRegs.ValueVTs.push_back(InOperandVal.getValueType()); - EVT RegVT = AsmNodeOperands[CurOp+1].getValueType(); - MatchedRegs.RegVTs.push_back(RegVT); - MachineRegisterInfo &RegInfo = DAG.getMachineFunction().getRegInfo(); - for (unsigned i = 0, e = InlineAsm::getNumOperandRegisters(OpFlag); - i != e; ++i) - MatchedRegs.Regs.
- push_back(RegInfo.createVirtualRegister(TLI.getRegClassFor(RegVT))); - - // Use the produced MatchedRegs object to - MatchedRegs.getCopyToRegs(InOperandVal, DAG, getCurDebugLoc(), - Chain, &Flag); - MatchedRegs.AddInlineAsmOperands(1 /*REGUSE*/, - true, OpInfo.getMatchedOperand(), - DAG, AsmNodeOperands); - break; - } else { - assert(((OpFlag & 7) == 4) && "Unknown matching constraint!"); - assert((InlineAsm::getNumOperandRegisters(OpFlag)) == 1 && - "Unexpected number of operands"); - // Add information to the INLINEASM node to know about this input. - // See InlineAsm.h isUseOperandTiedToDef. - OpFlag |= 0x80000000 | (OpInfo.getMatchedOperand() << 16); - AsmNodeOperands.push_back(DAG.getTargetConstant(OpFlag, - TLI.getPointerTy())); - AsmNodeOperands.push_back(AsmNodeOperands[CurOp+1]); - break; - } - } - - if (OpInfo.ConstraintType == TargetLowering::C_Other) { - assert(!OpInfo.isIndirect && - "Don't know how to handle indirect other inputs yet!"); - - std::vector Ops; - TLI.LowerAsmOperandForConstraint(InOperandVal, OpInfo.ConstraintCode[0], - hasMemory, Ops, DAG); - if (Ops.empty()) { - llvm_report_error("Invalid operand for inline asm" - " constraint '" + OpInfo.ConstraintCode + "'!"); - } - - // Add information to the INLINEASM node to know about this input. - unsigned ResOpType = 3 /*IMM*/ | (Ops.size() << 3); - AsmNodeOperands.push_back(DAG.getTargetConstant(ResOpType, - TLI.getPointerTy())); - AsmNodeOperands.insert(AsmNodeOperands.end(), Ops.begin(), Ops.end()); - break; - } else if (OpInfo.ConstraintType == TargetLowering::C_Memory) { - assert(OpInfo.isIndirect && "Operand must be indirect to be a mem!"); - assert(InOperandVal.getValueType() == TLI.getPointerTy() && - "Memory operands expect pointer values"); - - // Add information to the INLINEASM node to know about this input. 
- unsigned ResOpType = 4/*MEM*/ | (1<<3); - AsmNodeOperands.push_back(DAG.getTargetConstant(ResOpType, - TLI.getPointerTy())); - AsmNodeOperands.push_back(InOperandVal); - break; - } - - assert((OpInfo.ConstraintType == TargetLowering::C_RegisterClass || - OpInfo.ConstraintType == TargetLowering::C_Register) && - "Unknown constraint type!"); - assert(!OpInfo.isIndirect && - "Don't know how to handle indirect register inputs yet!"); - - // Copy the input into the appropriate registers. - if (OpInfo.AssignedRegs.Regs.empty()) { - llvm_report_error("Couldn't allocate input reg for" - " constraint '"+ OpInfo.ConstraintCode +"'!"); - } - - OpInfo.AssignedRegs.getCopyToRegs(InOperandVal, DAG, getCurDebugLoc(), - Chain, &Flag); - - OpInfo.AssignedRegs.AddInlineAsmOperands(1/*REGUSE*/, false, 0, - DAG, AsmNodeOperands); - break; - } - case InlineAsm::isClobber: { - // Add the clobbered value to the operand list, so that the register - // allocator is aware that the physreg got clobbered. - if (!OpInfo.AssignedRegs.Regs.empty()) - OpInfo.AssignedRegs.AddInlineAsmOperands(6 /* EARLYCLOBBER REGDEF */, - false, 0, DAG,AsmNodeOperands); - break; - } - } - } - - // Finish up input operands. - AsmNodeOperands[0] = Chain; - if (Flag.getNode()) AsmNodeOperands.push_back(Flag); - - Chain = DAG.getNode(ISD::INLINEASM, getCurDebugLoc(), - DAG.getVTList(MVT::Other, MVT::Flag), - &AsmNodeOperands[0], AsmNodeOperands.size()); - Flag = Chain.getValue(1); - - // If this asm returns a register value, copy the result from that register - // and set it as the value of the call. - if (!RetValRegs.Regs.empty()) { - SDValue Val = RetValRegs.getCopyFromRegs(DAG, getCurDebugLoc(), - Chain, &Flag); - - // FIXME: Why don't we do this for inline asms with MRVs? - if (CS.getType()->isSingleValueType() && CS.getType()->isSized()) { - EVT ResultType = TLI.getValueType(CS.getType()); - - // If any of the results of the inline asm is a vector, it may have the - // wrong width/num elts. 
This can happen for register classes that can - // contain multiple different value types. The preg or vreg allocated may - // not have the same VT as was expected. Convert it to the right type - // with bit_convert. - if (ResultType != Val.getValueType() && Val.getValueType().isVector()) { - Val = DAG.getNode(ISD::BIT_CONVERT, getCurDebugLoc(), - ResultType, Val); - - } else if (ResultType != Val.getValueType() && - ResultType.isInteger() && Val.getValueType().isInteger()) { - // If a result value was tied to an input value, the computed result may - // have a wider width than the expected result. Extract the relevant - // portion. - Val = DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), ResultType, Val); - } - - assert(ResultType == Val.getValueType() && "Asm result value mismatch!"); - } - - setValue(CS.getInstruction(), Val); - // Don't need to use this as a chain in this case. - if (!IA->hasSideEffects() && !hasMemory && IndirectStoresToEmit.empty()) - return; - } - - std::vector<std::pair<SDValue, Value*> > StoresToEmit; - - // Process indirect outputs, first output all of the flagged copies out of - // physregs. - for (unsigned i = 0, e = IndirectStoresToEmit.size(); i != e; ++i) { - RegsForValue &OutRegs = IndirectStoresToEmit[i].first; - Value *Ptr = IndirectStoresToEmit[i].second; - SDValue OutVal = OutRegs.getCopyFromRegs(DAG, getCurDebugLoc(), - Chain, &Flag); - StoresToEmit.push_back(std::make_pair(OutVal, Ptr)); - - } - - // Emit the non-flagged stores from the physregs.
- SmallVector OutChains; - for (unsigned i = 0, e = StoresToEmit.size(); i != e; ++i) - OutChains.push_back(DAG.getStore(Chain, getCurDebugLoc(), - StoresToEmit[i].first, - getValue(StoresToEmit[i].second), - StoresToEmit[i].second, 0)); - if (!OutChains.empty()) - Chain = DAG.getNode(ISD::TokenFactor, getCurDebugLoc(), MVT::Other, - &OutChains[0], OutChains.size()); - DAG.setRoot(Chain); -} - -void SelectionDAGLowering::visitVAStart(CallInst &I) { - DAG.setRoot(DAG.getNode(ISD::VASTART, getCurDebugLoc(), - MVT::Other, getRoot(), - getValue(I.getOperand(1)), - DAG.getSrcValue(I.getOperand(1)))); -} - -void SelectionDAGLowering::visitVAArg(VAArgInst &I) { - SDValue V = DAG.getVAArg(TLI.getValueType(I.getType()), getCurDebugLoc(), - getRoot(), getValue(I.getOperand(0)), - DAG.getSrcValue(I.getOperand(0))); - setValue(&I, V); - DAG.setRoot(V.getValue(1)); -} - -void SelectionDAGLowering::visitVAEnd(CallInst &I) { - DAG.setRoot(DAG.getNode(ISD::VAEND, getCurDebugLoc(), - MVT::Other, getRoot(), - getValue(I.getOperand(1)), - DAG.getSrcValue(I.getOperand(1)))); -} - -void SelectionDAGLowering::visitVACopy(CallInst &I) { - DAG.setRoot(DAG.getNode(ISD::VACOPY, getCurDebugLoc(), - MVT::Other, getRoot(), - getValue(I.getOperand(1)), - getValue(I.getOperand(2)), - DAG.getSrcValue(I.getOperand(1)), - DAG.getSrcValue(I.getOperand(2)))); -} - -/// TargetLowering::LowerCallTo - This is the default LowerCallTo -/// implementation, which just calls LowerCall. -/// FIXME: When all targets are -/// migrated to using LowerCall, this hook should be integrated into SDISel. 
-std::pair -TargetLowering::LowerCallTo(SDValue Chain, const Type *RetTy, - bool RetSExt, bool RetZExt, bool isVarArg, - bool isInreg, unsigned NumFixedArgs, - CallingConv::ID CallConv, bool isTailCall, - bool isReturnValueUsed, - SDValue Callee, - ArgListTy &Args, SelectionDAG &DAG, DebugLoc dl) { - - assert((!isTailCall || PerformTailCallOpt) && - "isTailCall set when tail-call optimizations are disabled!"); - - // Handle all of the outgoing arguments. - SmallVector Outs; - for (unsigned i = 0, e = Args.size(); i != e; ++i) { - SmallVector ValueVTs; - ComputeValueVTs(*this, Args[i].Ty, ValueVTs); - for (unsigned Value = 0, NumValues = ValueVTs.size(); - Value != NumValues; ++Value) { - EVT VT = ValueVTs[Value]; - const Type *ArgTy = VT.getTypeForEVT(RetTy->getContext()); - SDValue Op = SDValue(Args[i].Node.getNode(), - Args[i].Node.getResNo() + Value); - ISD::ArgFlagsTy Flags; - unsigned OriginalAlignment = - getTargetData()->getABITypeAlignment(ArgTy); - - if (Args[i].isZExt) - Flags.setZExt(); - if (Args[i].isSExt) - Flags.setSExt(); - if (Args[i].isInReg) - Flags.setInReg(); - if (Args[i].isSRet) - Flags.setSRet(); - if (Args[i].isByVal) { - Flags.setByVal(); - const PointerType *Ty = cast(Args[i].Ty); - const Type *ElementTy = Ty->getElementType(); - unsigned FrameAlign = getByValTypeAlignment(ElementTy); - unsigned FrameSize = getTargetData()->getTypeAllocSize(ElementTy); - // For ByVal, alignment should come from FE. BE will guess if this - // info is not there but there are cases it cannot get right. 
- if (Args[i].Alignment) - FrameAlign = Args[i].Alignment; - Flags.setByValAlign(FrameAlign); - Flags.setByValSize(FrameSize); - } - if (Args[i].isNest) - Flags.setNest(); - Flags.setOrigAlign(OriginalAlignment); - - EVT PartVT = getRegisterType(RetTy->getContext(), VT); - unsigned NumParts = getNumRegisters(RetTy->getContext(), VT); - SmallVector Parts(NumParts); - ISD::NodeType ExtendKind = ISD::ANY_EXTEND; - - if (Args[i].isSExt) - ExtendKind = ISD::SIGN_EXTEND; - else if (Args[i].isZExt) - ExtendKind = ISD::ZERO_EXTEND; - - getCopyToParts(DAG, dl, Op, &Parts[0], NumParts, PartVT, ExtendKind); - - for (unsigned j = 0; j != NumParts; ++j) { - // if it isn't first piece, alignment must be 1 - ISD::OutputArg MyFlags(Flags, Parts[j], i < NumFixedArgs); - if (NumParts > 1 && j == 0) - MyFlags.Flags.setSplit(); - else if (j != 0) - MyFlags.Flags.setOrigAlign(1); - - Outs.push_back(MyFlags); - } - } - } - - // Handle the incoming return values from the call. - SmallVector Ins; - SmallVector RetTys; - ComputeValueVTs(*this, RetTy, RetTys); - for (unsigned I = 0, E = RetTys.size(); I != E; ++I) { - EVT VT = RetTys[I]; - EVT RegisterVT = getRegisterType(RetTy->getContext(), VT); - unsigned NumRegs = getNumRegisters(RetTy->getContext(), VT); - for (unsigned i = 0; i != NumRegs; ++i) { - ISD::InputArg MyFlags; - MyFlags.VT = RegisterVT; - MyFlags.Used = isReturnValueUsed; - if (RetSExt) - MyFlags.Flags.setSExt(); - if (RetZExt) - MyFlags.Flags.setZExt(); - if (isInreg) - MyFlags.Flags.setInReg(); - Ins.push_back(MyFlags); - } - } - - // Check if target-dependent constraints permit a tail call here. - // Target-independent constraints should be checked by the caller. - if (isTailCall && - !IsEligibleForTailCallOptimization(Callee, CallConv, isVarArg, Ins, DAG)) - isTailCall = false; - - SmallVector InVals; - Chain = LowerCall(Chain, Callee, CallConv, isVarArg, isTailCall, - Outs, Ins, dl, DAG, InVals); - - // Verify that the target's LowerCall behaved as expected. 
- assert(Chain.getNode() && Chain.getValueType() == MVT::Other && - "LowerCall didn't return a valid chain!"); - assert((!isTailCall || InVals.empty()) && - "LowerCall emitted a return value for a tail call!"); - assert((isTailCall || InVals.size() == Ins.size()) && - "LowerCall didn't emit the correct number of values!"); - DEBUG(for (unsigned i = 0, e = Ins.size(); i != e; ++i) { - assert(InVals[i].getNode() && - "LowerCall emitted a null value!"); - assert(Ins[i].VT == InVals[i].getValueType() && - "LowerCall emitted a value with the wrong type!"); - }); - - // For a tail call, the return value is merely live-out and there aren't - // any nodes in the DAG representing it. Return a special value to - // indicate that a tail call has been emitted and no more Instructions - // should be processed in the current block. - if (isTailCall) { - DAG.setRoot(Chain); - return std::make_pair(SDValue(), SDValue()); - } - - // Collect the legal value parts into potentially illegal values - // that correspond to the original function's return values. - ISD::NodeType AssertOp = ISD::DELETED_NODE; - if (RetSExt) - AssertOp = ISD::AssertSext; - else if (RetZExt) - AssertOp = ISD::AssertZext; - SmallVector ReturnValues; - unsigned CurReg = 0; - for (unsigned I = 0, E = RetTys.size(); I != E; ++I) { - EVT VT = RetTys[I]; - EVT RegisterVT = getRegisterType(RetTy->getContext(), VT); - unsigned NumRegs = getNumRegisters(RetTy->getContext(), VT); - - SDValue ReturnValue = - getCopyFromParts(DAG, dl, &InVals[CurReg], NumRegs, RegisterVT, VT, - AssertOp); - ReturnValues.push_back(ReturnValue); - CurReg += NumRegs; - } - - // For a function returning void, there is no return value. We can't create - // such a node, so we just return a null return value in that case. In - // that case, nothing will actualy look at the value. 
- if (ReturnValues.empty()) - return std::make_pair(SDValue(), Chain); - - SDValue Res = DAG.getNode(ISD::MERGE_VALUES, dl, - DAG.getVTList(&RetTys[0], RetTys.size()), - &ReturnValues[0], ReturnValues.size()); - - return std::make_pair(Res, Chain); -} - -void TargetLowering::LowerOperationWrapper(SDNode *N, - SmallVectorImpl &Results, - SelectionDAG &DAG) { - SDValue Res = LowerOperation(SDValue(N, 0), DAG); - if (Res.getNode()) - Results.push_back(Res); -} - -SDValue TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) { - llvm_unreachable("LowerOperation not implemented for this target!"); - return SDValue(); -} - - -void SelectionDAGLowering::CopyValueToVirtualRegister(Value *V, unsigned Reg) { - SDValue Op = getValue(V); - assert((Op.getOpcode() != ISD::CopyFromReg || - cast(Op.getOperand(1))->getReg() != Reg) && - "Copy from a reg to the same reg!"); - assert(!TargetRegisterInfo::isPhysicalRegister(Reg) && "Is a physreg"); - - RegsForValue RFV(V->getContext(), TLI, Reg, V->getType()); - SDValue Chain = DAG.getEntryNode(); - RFV.getCopyToRegs(Op, DAG, getCurDebugLoc(), Chain, 0); - PendingExports.push_back(Chain); -} - -#include "llvm/CodeGen/SelectionDAGISel.h" - -void SelectionDAGISel::LowerArguments(BasicBlock *LLVMBB) { - // If this is the entry block, emit arguments. - Function &F = *LLVMBB->getParent(); - SelectionDAG &DAG = SDL->DAG; - SDValue OldRoot = DAG.getRoot(); - DebugLoc dl = SDL->getCurDebugLoc(); - const TargetData *TD = TLI.getTargetData(); - SmallVector Ins; - - // Check whether the function can return without sret-demotion. - SmallVector OutVTs; - SmallVector OutsFlags; - getReturnInfo(F.getReturnType(), F.getAttributes().getRetAttributes(), - OutVTs, OutsFlags, TLI); - FunctionLoweringInfo &FLI = DAG.getFunctionLoweringInfo(); - - FLI.CanLowerReturn = TLI.CanLowerReturn(F.getCallingConv(), F.isVarArg(), - OutVTs, OutsFlags, DAG); - if (!FLI.CanLowerReturn) { - // Put in an sret pointer parameter before all the other parameters. 
- SmallVector ValueVTs; - ComputeValueVTs(TLI, PointerType::getUnqual(F.getReturnType()), ValueVTs); - - // NOTE: Assuming that a pointer will never break down to more than one VT - // or one register. - ISD::ArgFlagsTy Flags; - Flags.setSRet(); - EVT RegisterVT = TLI.getRegisterType(*CurDAG->getContext(), ValueVTs[0]); - ISD::InputArg RetArg(Flags, RegisterVT, true); - Ins.push_back(RetArg); - } - - // Set up the incoming argument description vector. - unsigned Idx = 1; - for (Function::arg_iterator I = F.arg_begin(), E = F.arg_end(); - I != E; ++I, ++Idx) { - SmallVector ValueVTs; - ComputeValueVTs(TLI, I->getType(), ValueVTs); - bool isArgValueUsed = !I->use_empty(); - for (unsigned Value = 0, NumValues = ValueVTs.size(); - Value != NumValues; ++Value) { - EVT VT = ValueVTs[Value]; - const Type *ArgTy = VT.getTypeForEVT(*DAG.getContext()); - ISD::ArgFlagsTy Flags; - unsigned OriginalAlignment = - TD->getABITypeAlignment(ArgTy); - - if (F.paramHasAttr(Idx, Attribute::ZExt)) - Flags.setZExt(); - if (F.paramHasAttr(Idx, Attribute::SExt)) - Flags.setSExt(); - if (F.paramHasAttr(Idx, Attribute::InReg)) - Flags.setInReg(); - if (F.paramHasAttr(Idx, Attribute::StructRet)) - Flags.setSRet(); - if (F.paramHasAttr(Idx, Attribute::ByVal)) { - Flags.setByVal(); - const PointerType *Ty = cast(I->getType()); - const Type *ElementTy = Ty->getElementType(); - unsigned FrameAlign = TLI.getByValTypeAlignment(ElementTy); - unsigned FrameSize = TD->getTypeAllocSize(ElementTy); - // For ByVal, alignment should be passed from FE. BE will guess if - // this info is not there but there are cases it cannot get right. 
- if (F.getParamAlignment(Idx)) - FrameAlign = F.getParamAlignment(Idx); - Flags.setByValAlign(FrameAlign); - Flags.setByValSize(FrameSize); - } - if (F.paramHasAttr(Idx, Attribute::Nest)) - Flags.setNest(); - Flags.setOrigAlign(OriginalAlignment); - - EVT RegisterVT = TLI.getRegisterType(*CurDAG->getContext(), VT); - unsigned NumRegs = TLI.getNumRegisters(*CurDAG->getContext(), VT); - for (unsigned i = 0; i != NumRegs; ++i) { - ISD::InputArg MyFlags(Flags, RegisterVT, isArgValueUsed); - if (NumRegs > 1 && i == 0) - MyFlags.Flags.setSplit(); - // if it isn't first piece, alignment must be 1 - else if (i > 0) - MyFlags.Flags.setOrigAlign(1); - Ins.push_back(MyFlags); - } - } - } - - // Call the target to set up the argument values. - SmallVector InVals; - SDValue NewRoot = TLI.LowerFormalArguments(DAG.getRoot(), F.getCallingConv(), - F.isVarArg(), Ins, - dl, DAG, InVals); - - // Verify that the target's LowerFormalArguments behaved as expected. - assert(NewRoot.getNode() && NewRoot.getValueType() == MVT::Other && - "LowerFormalArguments didn't return a valid chain!"); - assert(InVals.size() == Ins.size() && - "LowerFormalArguments didn't emit the correct number of values!"); - DEBUG(for (unsigned i = 0, e = Ins.size(); i != e; ++i) { - assert(InVals[i].getNode() && - "LowerFormalArguments emitted a null value!"); - assert(Ins[i].VT == InVals[i].getValueType() && - "LowerFormalArguments emitted a value with the wrong type!"); - }); - - // Update the DAG with the new chain value resulting from argument lowering. - DAG.setRoot(NewRoot); - - // Set up the argument values. - unsigned i = 0; - Idx = 1; - if (!FLI.CanLowerReturn) { - // Create a virtual register for the sret pointer, and put in a copy - // from the sret argument into it. 
- SmallVector ValueVTs; - ComputeValueVTs(TLI, PointerType::getUnqual(F.getReturnType()), ValueVTs); - EVT VT = ValueVTs[0]; - EVT RegVT = TLI.getRegisterType(*CurDAG->getContext(), VT); - ISD::NodeType AssertOp = ISD::DELETED_NODE; - SDValue ArgValue = getCopyFromParts(DAG, dl, &InVals[0], 1, RegVT, - VT, AssertOp); - - MachineFunction& MF = SDL->DAG.getMachineFunction(); - MachineRegisterInfo& RegInfo = MF.getRegInfo(); - unsigned SRetReg = RegInfo.createVirtualRegister(TLI.getRegClassFor(RegVT)); - FLI.DemoteRegister = SRetReg; - NewRoot = SDL->DAG.getCopyToReg(NewRoot, SDL->getCurDebugLoc(), SRetReg, ArgValue); - DAG.setRoot(NewRoot); - - // i indexes lowered arguments. Bump it past the hidden sret argument. - // Idx indexes LLVM arguments. Don't touch it. - ++i; - } - for (Function::arg_iterator I = F.arg_begin(), E = F.arg_end(); I != E; - ++I, ++Idx) { - SmallVector ArgValues; - SmallVector ValueVTs; - ComputeValueVTs(TLI, I->getType(), ValueVTs); - unsigned NumValues = ValueVTs.size(); - for (unsigned Value = 0; Value != NumValues; ++Value) { - EVT VT = ValueVTs[Value]; - EVT PartVT = TLI.getRegisterType(*CurDAG->getContext(), VT); - unsigned NumParts = TLI.getNumRegisters(*CurDAG->getContext(), VT); - - if (!I->use_empty()) { - ISD::NodeType AssertOp = ISD::DELETED_NODE; - if (F.paramHasAttr(Idx, Attribute::SExt)) - AssertOp = ISD::AssertSext; - else if (F.paramHasAttr(Idx, Attribute::ZExt)) - AssertOp = ISD::AssertZext; - - ArgValues.push_back(getCopyFromParts(DAG, dl, &InVals[i], NumParts, - PartVT, VT, AssertOp)); - } - i += NumParts; - } - if (!I->use_empty()) { - SDL->setValue(I, DAG.getMergeValues(&ArgValues[0], NumValues, - SDL->getCurDebugLoc())); - // If this argument is live outside of the entry block, insert a copy from - // whereever we got it to the vreg that other BB's will reference it as. 
- SDL->CopyToExportRegsIfNeeded(I); - } - } - assert(i == InVals.size() && "Argument register count mismatch!"); - - // Finally, if the target has anything special to do, allow it to do so. - // FIXME: this should insert code into the DAG! - EmitFunctionEntryCode(F, SDL->DAG.getMachineFunction()); -} - -/// Handle PHI nodes in successor blocks. Emit code into the SelectionDAG to -/// ensure constants are generated when needed. Remember the virtual registers -/// that need to be added to the Machine PHI nodes as input. We cannot just -/// directly add them, because expansion might result in multiple MBB's for one -/// BB. As such, the start of the BB might correspond to a different MBB than -/// the end. -/// -void -SelectionDAGISel::HandlePHINodesInSuccessorBlocks(BasicBlock *LLVMBB) { - TerminatorInst *TI = LLVMBB->getTerminator(); - - SmallPtrSet SuccsHandled; - - // Check successor nodes' PHI nodes that expect a constant to be available - // from this block. - for (unsigned succ = 0, e = TI->getNumSuccessors(); succ != e; ++succ) { - BasicBlock *SuccBB = TI->getSuccessor(succ); - if (!isa(SuccBB->begin())) continue; - MachineBasicBlock *SuccMBB = FuncInfo->MBBMap[SuccBB]; - - // If this terminator has multiple identical successors (common for - // switches), only handle each succ once. - if (!SuccsHandled.insert(SuccMBB)) continue; - - MachineBasicBlock::iterator MBBI = SuccMBB->begin(); - PHINode *PN; - - // At this point we know that there is a 1-1 correspondence between LLVM PHI - // nodes and Machine PHI nodes, but the incoming operands have not been - // emitted yet. - for (BasicBlock::iterator I = SuccBB->begin(); - (PN = dyn_cast(I)); ++I) { - // Ignore dead phi's. 
- if (PN->use_empty()) continue; - - unsigned Reg; - Value *PHIOp = PN->getIncomingValueForBlock(LLVMBB); - - if (Constant *C = dyn_cast(PHIOp)) { - unsigned &RegOut = SDL->ConstantsOut[C]; - if (RegOut == 0) { - RegOut = FuncInfo->CreateRegForValue(C); - SDL->CopyValueToVirtualRegister(C, RegOut); - } - Reg = RegOut; - } else { - Reg = FuncInfo->ValueMap[PHIOp]; - if (Reg == 0) { - assert(isa(PHIOp) && - FuncInfo->StaticAllocaMap.count(cast(PHIOp)) && - "Didn't codegen value into a register!??"); - Reg = FuncInfo->CreateRegForValue(PHIOp); - SDL->CopyValueToVirtualRegister(PHIOp, Reg); - } - } - - // Remember that this register needs to added to the machine PHI node as - // the input for this MBB. - SmallVector ValueVTs; - ComputeValueVTs(TLI, PN->getType(), ValueVTs); - for (unsigned vti = 0, vte = ValueVTs.size(); vti != vte; ++vti) { - EVT VT = ValueVTs[vti]; - unsigned NumRegisters = TLI.getNumRegisters(*CurDAG->getContext(), VT); - for (unsigned i = 0, e = NumRegisters; i != e; ++i) - SDL->PHINodesToUpdate.push_back(std::make_pair(MBBI++, Reg+i)); - Reg += NumRegisters; - } - } - } - SDL->ConstantsOut.clear(); -} - -/// This is the Fast-ISel version of HandlePHINodesInSuccessorBlocks. It only -/// supports legal types, and it emits MachineInstrs directly instead of -/// creating SelectionDAG nodes. -/// -bool -SelectionDAGISel::HandlePHINodesInSuccessorBlocksFast(BasicBlock *LLVMBB, - FastISel *F) { - TerminatorInst *TI = LLVMBB->getTerminator(); - - SmallPtrSet SuccsHandled; - unsigned OrigNumPHINodesToUpdate = SDL->PHINodesToUpdate.size(); - - // Check successor nodes' PHI nodes that expect a constant to be available - // from this block. 
- for (unsigned succ = 0, e = TI->getNumSuccessors(); succ != e; ++succ) { - BasicBlock *SuccBB = TI->getSuccessor(succ); - if (!isa(SuccBB->begin())) continue; - MachineBasicBlock *SuccMBB = FuncInfo->MBBMap[SuccBB]; - - // If this terminator has multiple identical successors (common for - // switches), only handle each succ once. - if (!SuccsHandled.insert(SuccMBB)) continue; - - MachineBasicBlock::iterator MBBI = SuccMBB->begin(); - PHINode *PN; - - // At this point we know that there is a 1-1 correspondence between LLVM PHI - // nodes and Machine PHI nodes, but the incoming operands have not been - // emitted yet. - for (BasicBlock::iterator I = SuccBB->begin(); - (PN = dyn_cast(I)); ++I) { - // Ignore dead phi's. - if (PN->use_empty()) continue; - - // Only handle legal types. Two interesting things to note here. First, - // by bailing out early, we may leave behind some dead instructions, - // since SelectionDAG's HandlePHINodesInSuccessorBlocks will insert its - // own moves. Second, this check is necessary becuase FastISel doesn't - // use CreateRegForValue to create registers, so it always creates - // exactly one register for each non-void instruction. - EVT VT = TLI.getValueType(PN->getType(), /*AllowUnknown=*/true); - if (VT == MVT::Other || !TLI.isTypeLegal(VT)) { - // Promote MVT::i1. 
- if (VT == MVT::i1) - VT = TLI.getTypeToTransformTo(*CurDAG->getContext(), VT); - else { - SDL->PHINodesToUpdate.resize(OrigNumPHINodesToUpdate); - return false; - } - } - - Value *PHIOp = PN->getIncomingValueForBlock(LLVMBB); - - unsigned Reg = F->getRegForValue(PHIOp); - if (Reg == 0) { - SDL->PHINodesToUpdate.resize(OrigNumPHINodesToUpdate); - return false; - } - SDL->PHINodesToUpdate.push_back(std::make_pair(MBBI++, Reg)); - } - } - - return true; -} Removed: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h?rev=89680&view=auto ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h (removed) @@ -1,487 +0,0 @@ -//===-- SelectionDAGBuild.h - Selection-DAG building ----------------------===// -// -// The LLVM Compiler Infrastructure -// -// This file is distributed under the University of Illinois Open Source -// License. See LICENSE.TXT for details. -// -//===----------------------------------------------------------------------===// -// -// This implements routines for translating from LLVM IR into SelectionDAG IR. 
-// -//===----------------------------------------------------------------------===// - -#ifndef SELECTIONDAGBUILD_H -#define SELECTIONDAGBUILD_H - -#include "llvm/Constants.h" -#include "llvm/CodeGen/SelectionDAG.h" -#include "llvm/ADT/APInt.h" -#include "llvm/ADT/DenseMap.h" -#ifndef NDEBUG -#include "llvm/ADT/SmallSet.h" -#endif -#include "llvm/CodeGen/SelectionDAGNodes.h" -#include "llvm/CodeGen/ValueTypes.h" -#include "llvm/Support/CallSite.h" -#include "llvm/Support/ErrorHandling.h" -#include -#include - -namespace llvm { - -class AliasAnalysis; -class AllocaInst; -class BasicBlock; -class BitCastInst; -class BranchInst; -class CallInst; -class ExtractElementInst; -class ExtractValueInst; -class FCmpInst; -class FPExtInst; -class FPToSIInst; -class FPToUIInst; -class FPTruncInst; -class Function; -class FunctionLoweringInfo; -class GetElementPtrInst; -class GCFunctionInfo; -class ICmpInst; -class IntToPtrInst; -class IndirectBrInst; -class InvokeInst; -class InsertElementInst; -class InsertValueInst; -class Instruction; -class LoadInst; -class MachineBasicBlock; -class MachineFunction; -class MachineInstr; -class MachineRegisterInfo; -class PHINode; -class PtrToIntInst; -class ReturnInst; -class SDISelAsmOperandInfo; -class SExtInst; -class SelectInst; -class ShuffleVectorInst; -class SIToFPInst; -class StoreInst; -class SwitchInst; -class TargetData; -class TargetLowering; -class TruncInst; -class UIToFPInst; -class UnreachableInst; -class UnwindInst; -class VAArgInst; -class ZExtInst; - -//===----------------------------------------------------------------------===// -/// SelectionDAGLowering - This is the common target-independent lowering -/// implementation that is parameterized by a TargetLowering object. -/// Also, targets can overload any lowering method. -/// -class SelectionDAGLowering { - MachineBasicBlock *CurMBB; - - /// CurDebugLoc - current file + line number. Changes as we build the DAG. 
- DebugLoc CurDebugLoc; - - DenseMap NodeMap; - - /// PendingLoads - Loads are not emitted to the program immediately. We bunch - /// them up and then emit token factor nodes when possible. This allows us to - /// get simple disambiguation between loads without worrying about alias - /// analysis. - SmallVector PendingLoads; - - /// PendingExports - CopyToReg nodes that copy values to virtual registers - /// for export to other blocks need to be emitted before any terminator - /// instruction, but they have no other ordering requirements. We bunch them - /// up and the emit a single tokenfactor for them just before terminator - /// instructions. - SmallVector PendingExports; - - /// Case - A struct to record the Value for a switch case, and the - /// case's target basic block. - struct Case { - Constant* Low; - Constant* High; - MachineBasicBlock* BB; - - Case() : Low(0), High(0), BB(0) { } - Case(Constant* low, Constant* high, MachineBasicBlock* bb) : - Low(low), High(high), BB(bb) { } - APInt size() const { - const APInt &rHigh = cast(High)->getValue(); - const APInt &rLow = cast(Low)->getValue(); - return (rHigh - rLow + 1ULL); - } - }; - - struct CaseBits { - uint64_t Mask; - MachineBasicBlock* BB; - unsigned Bits; - - CaseBits(uint64_t mask, MachineBasicBlock* bb, unsigned bits): - Mask(mask), BB(bb), Bits(bits) { } - }; - - typedef std::vector CaseVector; - typedef std::vector CaseBitsVector; - typedef CaseVector::iterator CaseItr; - typedef std::pair CaseRange; - - /// CaseRec - A struct with ctor used in lowering switches to a binary tree - /// of conditional branches. - struct CaseRec { - CaseRec(MachineBasicBlock *bb, Constant *lt, Constant *ge, CaseRange r) : - CaseBB(bb), LT(lt), GE(ge), Range(r) {} - - /// CaseBB - The MBB in which to emit the compare and branch - MachineBasicBlock *CaseBB; - /// LT, GE - If nonzero, we know the current case value must be less-than or - /// greater-than-or-equal-to these Constants. 
- Constant *LT; - Constant *GE; - /// Range - A pair of iterators representing the range of case values to be - /// processed at this point in the binary search tree. - CaseRange Range; - }; - - typedef std::vector CaseRecVector; - - /// The comparison function for sorting the switch case values in the vector. - /// WARNING: Case ranges should be disjoint! - struct CaseCmp { - bool operator () (const Case& C1, const Case& C2) { - assert(isa(C1.Low) && isa(C2.High)); - const ConstantInt* CI1 = cast(C1.Low); - const ConstantInt* CI2 = cast(C2.High); - return CI1->getValue().slt(CI2->getValue()); - } - }; - - struct CaseBitsCmp { - bool operator () (const CaseBits& C1, const CaseBits& C2) { - return C1.Bits > C2.Bits; - } - }; - - size_t Clusterify(CaseVector& Cases, const SwitchInst &SI); - - /// CaseBlock - This structure is used to communicate between SDLowering and - /// SDISel for the code generation of additional basic blocks needed by multi- - /// case switch statements. - struct CaseBlock { - CaseBlock(ISD::CondCode cc, Value *cmplhs, Value *cmprhs, Value *cmpmiddle, - MachineBasicBlock *truebb, MachineBasicBlock *falsebb, - MachineBasicBlock *me) - : CC(cc), CmpLHS(cmplhs), CmpMHS(cmpmiddle), CmpRHS(cmprhs), - TrueBB(truebb), FalseBB(falsebb), ThisBB(me) {} - // CC - the condition code to use for the case block's setcc node - ISD::CondCode CC; - // CmpLHS/CmpRHS/CmpMHS - The LHS/MHS/RHS of the comparison to emit. - // Emit by default LHS op RHS. MHS is used for range comparisons: - // If MHS is not null: (LHS <= MHS) and (MHS <= RHS). - Value *CmpLHS, *CmpMHS, *CmpRHS; - // TrueBB/FalseBB - the block to branch to if the setcc is true/false. 
- MachineBasicBlock *TrueBB, *FalseBB; - // ThisBB - the block into which to emit the code for the setcc and branches - MachineBasicBlock *ThisBB; - }; - struct JumpTable { - JumpTable(unsigned R, unsigned J, MachineBasicBlock *M, - MachineBasicBlock *D): Reg(R), JTI(J), MBB(M), Default(D) {} - - /// Reg - the virtual register containing the index of the jump table entry - //. to jump to. - unsigned Reg; - /// JTI - the JumpTableIndex for this jump table in the function. - unsigned JTI; - /// MBB - the MBB into which to emit the code for the indirect jump. - MachineBasicBlock *MBB; - /// Default - the MBB of the default bb, which is a successor of the range - /// check MBB. This is when updating PHI nodes in successors. - MachineBasicBlock *Default; - }; - struct JumpTableHeader { - JumpTableHeader(APInt F, APInt L, Value* SV, MachineBasicBlock* H, - bool E = false): - First(F), Last(L), SValue(SV), HeaderBB(H), Emitted(E) {} - APInt First; - APInt Last; - Value *SValue; - MachineBasicBlock *HeaderBB; - bool Emitted; - }; - typedef std::pair JumpTableBlock; - - struct BitTestCase { - BitTestCase(uint64_t M, MachineBasicBlock* T, MachineBasicBlock* Tr): - Mask(M), ThisBB(T), TargetBB(Tr) { } - uint64_t Mask; - MachineBasicBlock* ThisBB; - MachineBasicBlock* TargetBB; - }; - - typedef SmallVector BitTestInfo; - - struct BitTestBlock { - BitTestBlock(APInt F, APInt R, Value* SV, - unsigned Rg, bool E, - MachineBasicBlock* P, MachineBasicBlock* D, - const BitTestInfo& C): - First(F), Range(R), SValue(SV), Reg(Rg), Emitted(E), - Parent(P), Default(D), Cases(C) { } - APInt First; - APInt Range; - Value *SValue; - unsigned Reg; - bool Emitted; - MachineBasicBlock *Parent; - MachineBasicBlock *Default; - BitTestInfo Cases; - }; - -public: - // TLI - This is information that describes the available target features we - // need for lowering. This indicates when operations are unavailable, - // implemented with a libcall, etc. 
- TargetLowering &TLI; - SelectionDAG &DAG; - const TargetData *TD; - AliasAnalysis *AA; - - /// SwitchCases - Vector of CaseBlock structures used to communicate - /// SwitchInst code generation information. - std::vector SwitchCases; - /// JTCases - Vector of JumpTable structures used to communicate - /// SwitchInst code generation information. - std::vector JTCases; - /// BitTestCases - Vector of BitTestBlock structures used to communicate - /// SwitchInst code generation information. - std::vector BitTestCases; - - /// PHINodesToUpdate - A list of phi instructions whose operand list will - /// be updated after processing the current basic block. - std::vector > PHINodesToUpdate; - - /// EdgeMapping - If an edge from CurMBB to any MBB is changed (e.g. due to - /// scheduler custom lowering), track the change here. - DenseMap EdgeMapping; - - // Emit PHI-node-operand constants only once even if used by multiple - // PHI nodes. - DenseMap ConstantsOut; - - /// FuncInfo - Information about the function as a whole. - /// - FunctionLoweringInfo &FuncInfo; - - /// OptLevel - What optimization level we're generating code for. - /// - CodeGenOpt::Level OptLevel; - - /// GFI - Garbage collection metadata for the function. - GCFunctionInfo *GFI; - - /// HasTailCall - This is set to true if a call in the current - /// block has been translated as a tail call. In this case, - /// no subsequent DAG nodes should be created. - /// - bool HasTailCall; - - LLVMContext *Context; - - SelectionDAGLowering(SelectionDAG &dag, TargetLowering &tli, - FunctionLoweringInfo &funcinfo, - CodeGenOpt::Level ol) - : CurDebugLoc(DebugLoc::getUnknownLoc()), - TLI(tli), DAG(dag), FuncInfo(funcinfo), OptLevel(ol), - HasTailCall(false), - Context(dag.getContext()) { - } - - void init(GCFunctionInfo *gfi, AliasAnalysis &aa); - - /// clear - Clear out the curret SelectionDAG and the associated - /// state and prepare this SelectionDAGLowering object to be used - /// for a new block. 
This doesn't clear out information about - /// additional blocks that are needed to complete switch lowering - /// or PHI node updating; that information is cleared out as it is - /// consumed. - void clear(); - - /// getRoot - Return the current virtual root of the Selection DAG, - /// flushing any PendingLoad items. This must be done before emitting - /// a store or any other node that may need to be ordered after any - /// prior load instructions. - /// - SDValue getRoot(); - - /// getControlRoot - Similar to getRoot, but instead of flushing all the - /// PendingLoad items, flush all the PendingExports items. It is necessary - /// to do this before emitting a terminator instruction. - /// - SDValue getControlRoot(); - - DebugLoc getCurDebugLoc() const { return CurDebugLoc; } - void setCurDebugLoc(DebugLoc dl) { CurDebugLoc = dl; } - - void CopyValueToVirtualRegister(Value *V, unsigned Reg); - - void visit(Instruction &I); - - void visit(unsigned Opcode, User &I); - - void setCurrentBasicBlock(MachineBasicBlock *MBB) { CurMBB = MBB; } - - SDValue getValue(const Value *V); - - void setValue(const Value *V, SDValue NewN) { - SDValue &N = NodeMap[V]; - assert(N.getNode() == 0 && "Already set a value for this node!"); - N = NewN; - } - - void GetRegistersForValue(SDISelAsmOperandInfo &OpInfo, - std::set &OutputRegs, - std::set &InputRegs); - - void FindMergedConditions(Value *Cond, MachineBasicBlock *TBB, - MachineBasicBlock *FBB, MachineBasicBlock *CurBB, - unsigned Opc); - void EmitBranchForMergedCondition(Value *Cond, MachineBasicBlock *TBB, - MachineBasicBlock *FBB, - MachineBasicBlock *CurBB); - bool ShouldEmitAsBranches(const std::vector &Cases); - bool isExportableFromCurrentBlock(Value *V, const BasicBlock *FromBB); - void CopyToExportRegsIfNeeded(Value *V); - void ExportFromCurrentBlock(Value *V); - void LowerCallTo(CallSite CS, SDValue Callee, bool IsTailCall, - MachineBasicBlock *LandingPad = NULL); - -private: - // Terminator instructions. 
- void visitRet(ReturnInst &I); - void visitBr(BranchInst &I); - void visitSwitch(SwitchInst &I); - void visitIndirectBr(IndirectBrInst &I); - void visitUnreachable(UnreachableInst &I) { /* noop */ } - - // Helpers for visitSwitch - bool handleSmallSwitchRange(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default); - bool handleJTSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default); - bool handleBTSplitSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default); - bool handleBitTestsSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default); -public: - void visitSwitchCase(CaseBlock &CB); - void visitBitTestHeader(BitTestBlock &B); - void visitBitTestCase(MachineBasicBlock* NextMBB, - unsigned Reg, - BitTestCase &B); - void visitJumpTable(JumpTable &JT); - void visitJumpTableHeader(JumpTable &JT, JumpTableHeader &JTH); - -private: - // These all get lowered before this pass. 
- void visitInvoke(InvokeInst &I); - void visitUnwind(UnwindInst &I); - - void visitBinary(User &I, unsigned OpCode); - void visitShift(User &I, unsigned Opcode); - void visitAdd(User &I) { visitBinary(I, ISD::ADD); } - void visitFAdd(User &I) { visitBinary(I, ISD::FADD); } - void visitSub(User &I) { visitBinary(I, ISD::SUB); } - void visitFSub(User &I); - void visitMul(User &I) { visitBinary(I, ISD::MUL); } - void visitFMul(User &I) { visitBinary(I, ISD::FMUL); } - void visitURem(User &I) { visitBinary(I, ISD::UREM); } - void visitSRem(User &I) { visitBinary(I, ISD::SREM); } - void visitFRem(User &I) { visitBinary(I, ISD::FREM); } - void visitUDiv(User &I) { visitBinary(I, ISD::UDIV); } - void visitSDiv(User &I) { visitBinary(I, ISD::SDIV); } - void visitFDiv(User &I) { visitBinary(I, ISD::FDIV); } - void visitAnd (User &I) { visitBinary(I, ISD::AND); } - void visitOr (User &I) { visitBinary(I, ISD::OR); } - void visitXor (User &I) { visitBinary(I, ISD::XOR); } - void visitShl (User &I) { visitShift(I, ISD::SHL); } - void visitLShr(User &I) { visitShift(I, ISD::SRL); } - void visitAShr(User &I) { visitShift(I, ISD::SRA); } - void visitICmp(User &I); - void visitFCmp(User &I); - // Visit the conversion instructions - void visitTrunc(User &I); - void visitZExt(User &I); - void visitSExt(User &I); - void visitFPTrunc(User &I); - void visitFPExt(User &I); - void visitFPToUI(User &I); - void visitFPToSI(User &I); - void visitUIToFP(User &I); - void visitSIToFP(User &I); - void visitPtrToInt(User &I); - void visitIntToPtr(User &I); - void visitBitCast(User &I); - - void visitExtractElement(User &I); - void visitInsertElement(User &I); - void visitShuffleVector(User &I); - - void visitExtractValue(ExtractValueInst &I); - void visitInsertValue(InsertValueInst &I); - - void visitGetElementPtr(User &I); - void visitSelect(User &I); - - void visitAlloca(AllocaInst &I); - void visitLoad(LoadInst &I); - void visitStore(StoreInst &I); - void visitPHI(PHINode &I) { } // PHI 
nodes are handled specially. - void visitCall(CallInst &I); - void visitInlineAsm(CallSite CS); - const char *visitIntrinsicCall(CallInst &I, unsigned Intrinsic); - void visitTargetIntrinsic(CallInst &I, unsigned Intrinsic); - - void visitPow(CallInst &I); - void visitExp2(CallInst &I); - void visitExp(CallInst &I); - void visitLog(CallInst &I); - void visitLog2(CallInst &I); - void visitLog10(CallInst &I); - - void visitVAStart(CallInst &I); - void visitVAArg(VAArgInst &I); - void visitVAEnd(CallInst &I); - void visitVACopy(CallInst &I); - - void visitUserOp1(Instruction &I) { - llvm_unreachable("UserOp1 should not exist at instruction selection time!"); - } - void visitUserOp2(Instruction &I) { - llvm_unreachable("UserOp2 should not exist at instruction selection time!"); - } - - const char *implVisitBinaryAtomic(CallInst& I, ISD::NodeType Op); - const char *implVisitAluOverflow(CallInst &I, ISD::NodeType Op); -}; - -} // end namespace llvm - -#endif Copied: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (from r89675, llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp) URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp?p2=llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp&p1=llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp&r1=89675&r2=89681&rev=89681&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp Mon Nov 23 12:04:58 2009 @@ -1,4 +1,4 @@ -//===-- SelectionDAGBuild.cpp - Selection-DAG building --------------------===// +//===-- SelectionDAGBuilder.cpp - Selection-DAG building ------------------===// // // The LLVM Compiler Infrastructure // @@ -12,7 +12,7 @@ //===----------------------------------------------------------------------===// #define DEBUG_TYPE "isel" -#include "SelectionDAGBuild.h" 
+#include "SelectionDAGBuilder.h" #include "FunctionLoweringInfo.h" #include "llvm/ADT/BitVector.h" #include "llvm/ADT/SmallSet.h" @@ -501,19 +501,19 @@ } -void SelectionDAGLowering::init(GCFunctionInfo *gfi, AliasAnalysis &aa) { +void SelectionDAGBuilder::init(GCFunctionInfo *gfi, AliasAnalysis &aa) { AA = &aa; GFI = gfi; TD = DAG.getTarget().getTargetData(); } /// clear - Clear out the curret SelectionDAG and the associated -/// state and prepare this SelectionDAGLowering object to be used +/// state and prepare this SelectionDAGBuilder object to be used /// for a new block. This doesn't clear out information about /// additional blocks that are needed to complete switch lowering /// or PHI node updating; that information is cleared out as it is /// consumed. -void SelectionDAGLowering::clear() { +void SelectionDAGBuilder::clear() { NodeMap.clear(); PendingLoads.clear(); PendingExports.clear(); @@ -528,7 +528,7 @@ /// a store or any other node that may need to be ordered after any /// prior load instructions. /// -SDValue SelectionDAGLowering::getRoot() { +SDValue SelectionDAGBuilder::getRoot() { if (PendingLoads.empty()) return DAG.getRoot(); @@ -551,7 +551,7 @@ /// PendingLoad items, flush all the PendingExports items. It is necessary /// to do this before emitting a terminator instruction. /// -SDValue SelectionDAGLowering::getControlRoot() { +SDValue SelectionDAGBuilder::getControlRoot() { SDValue Root = DAG.getRoot(); if (PendingExports.empty()) @@ -578,11 +578,11 @@ return Root; } -void SelectionDAGLowering::visit(Instruction &I) { +void SelectionDAGBuilder::visit(Instruction &I) { visit(I.getOpcode(), I); } -void SelectionDAGLowering::visit(unsigned Opcode, User &I) { +void SelectionDAGBuilder::visit(unsigned Opcode, User &I) { // Note: this doesn't use InstVisitor, because it has to work with // ConstantExpr's in addition to instructions. 
switch (Opcode) { @@ -594,7 +594,7 @@ } } -SDValue SelectionDAGLowering::getValue(const Value *V) { +SDValue SelectionDAGBuilder::getValue(const Value *V) { SDValue &N = NodeMap[V]; if (N.getNode()) return N; @@ -759,7 +759,7 @@ } } -void SelectionDAGLowering::visitRet(ReturnInst &I) { +void SelectionDAGBuilder::visitRet(ReturnInst &I) { SDValue Chain = getControlRoot(); SmallVector Outs; FunctionLoweringInfo &FLI = DAG.getFunctionLoweringInfo(); @@ -864,7 +864,7 @@ /// CopyToExportRegsIfNeeded - If the given value has virtual registers /// created for it, emit nodes to copy the value into the virtual /// registers. -void SelectionDAGLowering::CopyToExportRegsIfNeeded(Value *V) { +void SelectionDAGBuilder::CopyToExportRegsIfNeeded(Value *V) { if (!V->use_empty()) { DenseMap::iterator VMI = FuncInfo.ValueMap.find(V); if (VMI != FuncInfo.ValueMap.end()) @@ -875,7 +875,7 @@ /// ExportFromCurrentBlock - If this condition isn't known to be exported from /// the current basic block, add it to ValueMap now so that we'll get a /// CopyTo/FromReg. -void SelectionDAGLowering::ExportFromCurrentBlock(Value *V) { +void SelectionDAGBuilder::ExportFromCurrentBlock(Value *V) { // No need to export constants. if (!isa(V) && !isa(V)) return; @@ -886,8 +886,8 @@ CopyValueToVirtualRegister(V, Reg); } -bool SelectionDAGLowering::isExportableFromCurrentBlock(Value *V, - const BasicBlock *FromBB) { +bool SelectionDAGBuilder::isExportableFromCurrentBlock(Value *V, + const BasicBlock *FromBB) { // The operands of the setcc have to be in this block. We don't know // how to export them from some other block. if (Instruction *VI = dyn_cast(V)) { @@ -979,10 +979,10 @@ /// AND operator tree. 
/// void -SelectionDAGLowering::EmitBranchForMergedCondition(Value *Cond, - MachineBasicBlock *TBB, - MachineBasicBlock *FBB, - MachineBasicBlock *CurBB) { +SelectionDAGBuilder::EmitBranchForMergedCondition(Value *Cond, + MachineBasicBlock *TBB, + MachineBasicBlock *FBB, + MachineBasicBlock *CurBB) { const BasicBlock *BB = CurBB->getBasicBlock(); // If the leaf of the tree is a comparison, merge the condition into @@ -1018,11 +1018,11 @@ } /// FindMergedConditions - If Cond is an expression like -void SelectionDAGLowering::FindMergedConditions(Value *Cond, - MachineBasicBlock *TBB, - MachineBasicBlock *FBB, - MachineBasicBlock *CurBB, - unsigned Opc) { +void SelectionDAGBuilder::FindMergedConditions(Value *Cond, + MachineBasicBlock *TBB, + MachineBasicBlock *FBB, + MachineBasicBlock *CurBB, + unsigned Opc) { // If this node is not part of the or/and tree, emit it as a branch. Instruction *BOp = dyn_cast(Cond); if (!BOp || !(isa(BOp) || isa(BOp)) || @@ -1077,7 +1077,7 @@ /// If we should emit this as a bunch of and/or'd together conditions, return /// false. bool -SelectionDAGLowering::ShouldEmitAsBranches(const std::vector &Cases){ +SelectionDAGBuilder::ShouldEmitAsBranches(const std::vector &Cases){ if (Cases.size() != 2) return true; // If this is two comparisons of the same values or'd or and'd together, they @@ -1092,7 +1092,7 @@ return true; } -void SelectionDAGLowering::visitBr(BranchInst &I) { +void SelectionDAGBuilder::visitBr(BranchInst &I) { // Update machine-CFG edges. MachineBasicBlock *Succ0MBB = FuncInfo.MBBMap[I.getSuccessor(0)]; @@ -1176,7 +1176,7 @@ /// visitSwitchCase - Emits the necessary code to represent a single node in /// the binary search tree resulting from lowering a switch instruction. 
-void SelectionDAGLowering::visitSwitchCase(CaseBlock &CB) { +void SelectionDAGBuilder::visitSwitchCase(CaseBlock &CB) { SDValue Cond; SDValue CondLHS = getValue(CB.CmpLHS); DebugLoc dl = getCurDebugLoc(); @@ -1254,7 +1254,7 @@ } /// visitJumpTable - Emit JumpTable node in the current MBB -void SelectionDAGLowering::visitJumpTable(JumpTable &JT) { +void SelectionDAGBuilder::visitJumpTable(JumpTable &JT) { // Emit the code for the jump table assert(JT.Reg != -1U && "Should lower JT Header first!"); EVT PTy = TLI.getPointerTy(); @@ -1268,8 +1268,8 @@ /// visitJumpTableHeader - This function emits necessary code to produce index /// in the JumpTable from switch case. -void SelectionDAGLowering::visitJumpTableHeader(JumpTable &JT, - JumpTableHeader &JTH) { +void SelectionDAGBuilder::visitJumpTableHeader(JumpTable &JT, + JumpTableHeader &JTH) { // Subtract the lowest switch case value from the value being switched on and // conditional branch to default mbb if the result is greater than the // difference between smallest and largest cases. 
@@ -1318,7 +1318,7 @@ /// visitBitTestHeader - This function emits necessary code to produce value /// suitable for "bit tests" -void SelectionDAGLowering::visitBitTestHeader(BitTestBlock &B) { +void SelectionDAGBuilder::visitBitTestHeader(BitTestBlock &B) { // Subtract the minimum value SDValue SwitchOp = getValue(B.SValue); EVT VT = SwitchOp.getValueType(); @@ -1361,9 +1361,9 @@ } /// visitBitTestCase - this function produces one "bit test" -void SelectionDAGLowering::visitBitTestCase(MachineBasicBlock* NextMBB, - unsigned Reg, - BitTestCase &B) { +void SelectionDAGBuilder::visitBitTestCase(MachineBasicBlock* NextMBB, + unsigned Reg, + BitTestCase &B) { // Make desired shift SDValue ShiftOp = DAG.getCopyFromReg(getControlRoot(), getCurDebugLoc(), Reg, TLI.getPointerTy()); @@ -1402,7 +1402,7 @@ DAG.getBasicBlock(NextMBB))); } -void SelectionDAGLowering::visitInvoke(InvokeInst &I) { +void SelectionDAGBuilder::visitInvoke(InvokeInst &I) { // Retrieve successors. MachineBasicBlock *Return = FuncInfo.MBBMap[I.getSuccessor(0)]; MachineBasicBlock *LandingPad = FuncInfo.MBBMap[I.getSuccessor(1)]; @@ -1427,15 +1427,15 @@ DAG.getBasicBlock(Return))); } -void SelectionDAGLowering::visitUnwind(UnwindInst &I) { +void SelectionDAGBuilder::visitUnwind(UnwindInst &I) { } /// handleSmallSwitchCaseRange - Emit a series of specific tests (suitable for /// small case ranges). -bool SelectionDAGLowering::handleSmallSwitchRange(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default) { +bool SelectionDAGBuilder::handleSmallSwitchRange(CaseRec& CR, + CaseRecVector& WorkList, + Value* SV, + MachineBasicBlock* Default) { Case& BackCase = *(CR.Range.second-1); // Size is the number of Cases represented by this range. 
@@ -1529,10 +1529,10 @@ } /// handleJTSwitchCase - Emit jumptable for current switch case range -bool SelectionDAGLowering::handleJTSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default) { +bool SelectionDAGBuilder::handleJTSwitchCase(CaseRec& CR, + CaseRecVector& WorkList, + Value* SV, + MachineBasicBlock* Default) { Case& FrontCase = *CR.Range.first; Case& BackCase = *(CR.Range.second-1); @@ -1623,10 +1623,10 @@ /// handleBTSplitSwitchCase - emit comparison and split binary search tree into /// 2 subtrees. -bool SelectionDAGLowering::handleBTSplitSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default) { +bool SelectionDAGBuilder::handleBTSplitSwitchCase(CaseRec& CR, + CaseRecVector& WorkList, + Value* SV, + MachineBasicBlock* Default) { // Get the MachineFunction which holds the current MBB. This is used when // inserting any additional MBBs necessary to represent the switch. MachineFunction *CurMF = FuncInfo.MF; @@ -1751,10 +1751,10 @@ /// handleBitTestsSwitchCase - if current case range has few destination and /// range span less, than machine word bitwidth, encode case range into series /// of masks and emit bit tests with these masks. 
-bool SelectionDAGLowering::handleBitTestsSwitchCase(CaseRec& CR, - CaseRecVector& WorkList, - Value* SV, - MachineBasicBlock* Default){ +bool SelectionDAGBuilder::handleBitTestsSwitchCase(CaseRec& CR, + CaseRecVector& WorkList, + Value* SV, + MachineBasicBlock* Default){ EVT PTy = TLI.getPointerTy(); unsigned IntPtrBits = PTy.getSizeInBits(); @@ -1882,8 +1882,8 @@ /// Clusterify - Transform simple list of Cases into list of CaseRange's -size_t SelectionDAGLowering::Clusterify(CaseVector& Cases, - const SwitchInst& SI) { +size_t SelectionDAGBuilder::Clusterify(CaseVector& Cases, + const SwitchInst& SI) { size_t numCmps = 0; // Start with "simple" cases @@ -1924,7 +1924,7 @@ return numCmps; } -void SelectionDAGLowering::visitSwitch(SwitchInst &SI) { +void SelectionDAGBuilder::visitSwitch(SwitchInst &SI) { // Figure out which block is immediately after the current one. MachineBasicBlock *NextBlock = 0; @@ -1987,7 +1987,7 @@ } } -void SelectionDAGLowering::visitIndirectBr(IndirectBrInst &I) { +void SelectionDAGBuilder::visitIndirectBr(IndirectBrInst &I) { // Update machine-CFG edges. 
for (unsigned i = 0, e = I.getNumSuccessors(); i != e; ++i) CurMBB->addSuccessor(FuncInfo.MBBMap[I.getSuccessor(i)]); @@ -1998,7 +1998,7 @@ } -void SelectionDAGLowering::visitFSub(User &I) { +void SelectionDAGBuilder::visitFSub(User &I) { // -0.0 - X --> fneg const Type *Ty = I.getType(); if (isa(Ty)) { @@ -2027,7 +2027,7 @@ visitBinary(I, ISD::FSUB); } -void SelectionDAGLowering::visitBinary(User &I, unsigned OpCode) { +void SelectionDAGBuilder::visitBinary(User &I, unsigned OpCode) { SDValue Op1 = getValue(I.getOperand(0)); SDValue Op2 = getValue(I.getOperand(1)); @@ -2035,7 +2035,7 @@ Op1.getValueType(), Op1, Op2)); } -void SelectionDAGLowering::visitShift(User &I, unsigned Opcode) { +void SelectionDAGBuilder::visitShift(User &I, unsigned Opcode) { SDValue Op1 = getValue(I.getOperand(0)); SDValue Op2 = getValue(I.getOperand(1)); if (!isa(I.getType()) && @@ -2069,7 +2069,7 @@ Op1.getValueType(), Op1, Op2)); } -void SelectionDAGLowering::visitICmp(User &I) { +void SelectionDAGBuilder::visitICmp(User &I) { ICmpInst::Predicate predicate = ICmpInst::BAD_ICMP_PREDICATE; if (ICmpInst *IC = dyn_cast(&I)) predicate = IC->getPredicate(); @@ -2083,7 +2083,7 @@ setValue(&I, DAG.getSetCC(getCurDebugLoc(), DestVT, Op1, Op2, Opcode)); } -void SelectionDAGLowering::visitFCmp(User &I) { +void SelectionDAGBuilder::visitFCmp(User &I) { FCmpInst::Predicate predicate = FCmpInst::BAD_FCMP_PREDICATE; if (FCmpInst *FC = dyn_cast(&I)) predicate = FC->getPredicate(); @@ -2096,7 +2096,7 @@ setValue(&I, DAG.getSetCC(getCurDebugLoc(), DestVT, Op1, Op2, Condition)); } -void SelectionDAGLowering::visitSelect(User &I) { +void SelectionDAGBuilder::visitSelect(User &I) { SmallVector ValueVTs; ComputeValueVTs(TLI, I.getType(), ValueVTs); unsigned NumValues = ValueVTs.size(); @@ -2119,14 +2119,14 @@ } -void SelectionDAGLowering::visitTrunc(User &I) { +void SelectionDAGBuilder::visitTrunc(User &I) { // TruncInst cannot be a no-op cast because sizeof(src) > sizeof(dest). 
SDValue N = getValue(I.getOperand(0)); EVT DestVT = TLI.getValueType(I.getType()); setValue(&I, DAG.getNode(ISD::TRUNCATE, getCurDebugLoc(), DestVT, N)); } -void SelectionDAGLowering::visitZExt(User &I) { +void SelectionDAGBuilder::visitZExt(User &I) { // ZExt cannot be a no-op cast because sizeof(src) < sizeof(dest). // ZExt also can't be a cast to bool for same reason. So, nothing much to do SDValue N = getValue(I.getOperand(0)); @@ -2134,7 +2134,7 @@ setValue(&I, DAG.getNode(ISD::ZERO_EXTEND, getCurDebugLoc(), DestVT, N)); } -void SelectionDAGLowering::visitSExt(User &I) { +void SelectionDAGBuilder::visitSExt(User &I) { // SExt cannot be a no-op cast because sizeof(src) < sizeof(dest). // SExt also can't be a cast to bool for same reason. So, nothing much to do SDValue N = getValue(I.getOperand(0)); @@ -2142,7 +2142,7 @@ setValue(&I, DAG.getNode(ISD::SIGN_EXTEND, getCurDebugLoc(), DestVT, N)); } -void SelectionDAGLowering::visitFPTrunc(User &I) { +void SelectionDAGBuilder::visitFPTrunc(User &I) { // FPTrunc is never a no-op cast, no need to check SDValue N = getValue(I.getOperand(0)); EVT DestVT = TLI.getValueType(I.getType()); @@ -2150,42 +2150,42 @@ DestVT, N, DAG.getIntPtrConstant(0))); } -void SelectionDAGLowering::visitFPExt(User &I){ +void SelectionDAGBuilder::visitFPExt(User &I){ // FPTrunc is never a no-op cast, no need to check SDValue N = getValue(I.getOperand(0)); EVT DestVT = TLI.getValueType(I.getType()); setValue(&I, DAG.getNode(ISD::FP_EXTEND, getCurDebugLoc(), DestVT, N)); } -void SelectionDAGLowering::visitFPToUI(User &I) { +void SelectionDAGBuilder::visitFPToUI(User &I) { // FPToUI is never a no-op cast, no need to check SDValue N = getValue(I.getOperand(0)); EVT DestVT = TLI.getValueType(I.getType()); setValue(&I, DAG.getNode(ISD::FP_TO_UINT, getCurDebugLoc(), DestVT, N)); } -void SelectionDAGLowering::visitFPToSI(User &I) { +void SelectionDAGBuilder::visitFPToSI(User &I) { // FPToSI is never a no-op cast, no need to check SDValue N = 
getValue(I.getOperand(0)); EVT DestVT = TLI.getValueType(I.getType()); setValue(&I, DAG.getNode(ISD::FP_TO_SINT, getCurDebugLoc(), DestVT, N)); } -void SelectionDAGLowering::visitUIToFP(User &I) { +void SelectionDAGBuilder::visitUIToFP(User &I) { // UIToFP is never a no-op cast, no need to check SDValue N = getValue(I.getOperand(0)); EVT DestVT = TLI.getValueType(I.getType()); setValue(&I, DAG.getNode(ISD::UINT_TO_FP, getCurDebugLoc(), DestVT, N)); } -void SelectionDAGLowering::visitSIToFP(User &I){ +void SelectionDAGBuilder::visitSIToFP(User &I){ // SIToFP is never a no-op cast, no need to check SDValue N = getValue(I.getOperand(0)); EVT DestVT = TLI.getValueType(I.getType()); setValue(&I, DAG.getNode(ISD::SINT_TO_FP, getCurDebugLoc(), DestVT, N)); } -void SelectionDAGLowering::visitPtrToInt(User &I) { +void SelectionDAGBuilder::visitPtrToInt(User &I) { // What to do depends on the size of the integer and the size of the pointer. // We can either truncate, zero extend, or no-op, accordingly. SDValue N = getValue(I.getOperand(0)); @@ -2195,7 +2195,7 @@ setValue(&I, Result); } -void SelectionDAGLowering::visitIntToPtr(User &I) { +void SelectionDAGBuilder::visitIntToPtr(User &I) { // What to do depends on the size of the integer and the size of the pointer. // We can either truncate, zero extend, or no-op, accordingly. SDValue N = getValue(I.getOperand(0)); @@ -2204,7 +2204,7 @@ setValue(&I, DAG.getZExtOrTrunc(N, getCurDebugLoc(), DestVT)); } -void SelectionDAGLowering::visitBitCast(User &I) { +void SelectionDAGBuilder::visitBitCast(User &I) { SDValue N = getValue(I.getOperand(0)); EVT DestVT = TLI.getValueType(I.getType()); @@ -2217,7 +2217,7 @@ setValue(&I, N); // noop cast. 
} -void SelectionDAGLowering::visitInsertElement(User &I) { +void SelectionDAGBuilder::visitInsertElement(User &I) { SDValue InVec = getValue(I.getOperand(0)); SDValue InVal = getValue(I.getOperand(1)); SDValue InIdx = DAG.getNode(ISD::ZERO_EXTEND, getCurDebugLoc(), @@ -2229,7 +2229,7 @@ InVec, InVal, InIdx)); } -void SelectionDAGLowering::visitExtractElement(User &I) { +void SelectionDAGBuilder::visitExtractElement(User &I) { SDValue InVec = getValue(I.getOperand(0)); SDValue InIdx = DAG.getNode(ISD::ZERO_EXTEND, getCurDebugLoc(), TLI.getPointerTy(), @@ -2249,7 +2249,7 @@ return true; } -void SelectionDAGLowering::visitShuffleVector(User &I) { +void SelectionDAGBuilder::visitShuffleVector(User &I) { SmallVector Mask; SDValue Src1 = getValue(I.getOperand(0)); SDValue Src2 = getValue(I.getOperand(1)); @@ -2423,7 +2423,7 @@ VT, &Ops[0], Ops.size())); } -void SelectionDAGLowering::visitInsertValue(InsertValueInst &I) { +void SelectionDAGBuilder::visitInsertValue(InsertValueInst &I) { const Value *Op0 = I.getOperand(0); const Value *Op1 = I.getOperand(1); const Type *AggTy = I.getType(); @@ -2464,7 +2464,7 @@ &Values[0], NumAggValues)); } -void SelectionDAGLowering::visitExtractValue(ExtractValueInst &I) { +void SelectionDAGBuilder::visitExtractValue(ExtractValueInst &I) { const Value *Op0 = I.getOperand(0); const Type *AggTy = Op0->getType(); const Type *ValTy = I.getType(); @@ -2493,7 +2493,7 @@ } -void SelectionDAGLowering::visitGetElementPtr(User &I) { +void SelectionDAGBuilder::visitGetElementPtr(User &I) { SDValue N = getValue(I.getOperand(0)); const Type *Ty = I.getOperand(0)->getType(); @@ -2562,7 +2562,7 @@ setValue(&I, N); } -void SelectionDAGLowering::visitAlloca(AllocaInst &I) { +void SelectionDAGBuilder::visitAlloca(AllocaInst &I) { // If this is a fixed sized alloca in the entry block of the function, // allocate it statically on the stack. 
if (FuncInfo.StaticAllocaMap.count(&I)) @@ -2615,7 +2615,7 @@ FuncInfo.MF->getFrameInfo()->CreateVariableSizedObject(); } -void SelectionDAGLowering::visitLoad(LoadInst &I) { +void SelectionDAGBuilder::visitLoad(LoadInst &I) { const Value *SV = I.getOperand(0); SDValue Ptr = getValue(SV); @@ -2673,7 +2673,7 @@ } -void SelectionDAGLowering::visitStore(StoreInst &I) { +void SelectionDAGBuilder::visitStore(StoreInst &I) { Value *SrcV = I.getOperand(0); Value *PtrV = I.getOperand(1); @@ -2709,8 +2709,8 @@ /// visitTargetIntrinsic - Lower a call of a target intrinsic to an INTRINSIC /// node. -void SelectionDAGLowering::visitTargetIntrinsic(CallInst &I, - unsigned Intrinsic) { +void SelectionDAGBuilder::visitTargetIntrinsic(CallInst &I, + unsigned Intrinsic) { bool HasChain = !I.doesNotAccessMemory(); bool OnlyLoad = HasChain && I.onlyReadsMemory(); @@ -2832,7 +2832,7 @@ /// visitIntrinsicCall: I is a call instruction /// Op is the associated NodeType for I const char * -SelectionDAGLowering::implVisitBinaryAtomic(CallInst& I, ISD::NodeType Op) { +SelectionDAGBuilder::implVisitBinaryAtomic(CallInst& I, ISD::NodeType Op) { SDValue Root = getRoot(); SDValue L = DAG.getAtomic(Op, getCurDebugLoc(), @@ -2848,7 +2848,7 @@ // implVisitAluOverflow - Lower arithmetic overflow instrinsics. const char * -SelectionDAGLowering::implVisitAluOverflow(CallInst &I, ISD::NodeType Op) { +SelectionDAGBuilder::implVisitAluOverflow(CallInst &I, ISD::NodeType Op) { SDValue Op1 = getValue(I.getOperand(1)); SDValue Op2 = getValue(I.getOperand(2)); @@ -2862,7 +2862,7 @@ /// visitExp - Lower an exp intrinsic. Handles the special sequences for /// limited-precision mode. void -SelectionDAGLowering::visitExp(CallInst &I) { +SelectionDAGBuilder::visitExp(CallInst &I) { SDValue result; DebugLoc dl = getCurDebugLoc(); @@ -2988,7 +2988,7 @@ /// visitLog - Lower a log intrinsic. Handles the special sequences for /// limited-precision mode. 
void -SelectionDAGLowering::visitLog(CallInst &I) { +SelectionDAGBuilder::visitLog(CallInst &I) { SDValue result; DebugLoc dl = getCurDebugLoc(); @@ -3098,7 +3098,7 @@ /// visitLog2 - Lower a log2 intrinsic. Handles the special sequences for /// limited-precision mode. void -SelectionDAGLowering::visitLog2(CallInst &I) { +SelectionDAGBuilder::visitLog2(CallInst &I) { SDValue result; DebugLoc dl = getCurDebugLoc(); @@ -3207,7 +3207,7 @@ /// visitLog10 - Lower a log10 intrinsic. Handles the special sequences for /// limited-precision mode. void -SelectionDAGLowering::visitLog10(CallInst &I) { +SelectionDAGBuilder::visitLog10(CallInst &I) { SDValue result; DebugLoc dl = getCurDebugLoc(); @@ -3309,7 +3309,7 @@ /// visitExp2 - Lower an exp2 intrinsic. Handles the special sequences for /// limited-precision mode. void -SelectionDAGLowering::visitExp2(CallInst &I) { +SelectionDAGBuilder::visitExp2(CallInst &I) { SDValue result; DebugLoc dl = getCurDebugLoc(); @@ -3423,7 +3423,7 @@ /// visitPow - Lower a pow intrinsic. Handles the special sequences for /// limited-precision mode with x == 10.0f. void -SelectionDAGLowering::visitPow(CallInst &I) { +SelectionDAGBuilder::visitPow(CallInst &I) { SDValue result; Value *Val = I.getOperand(1); DebugLoc dl = getCurDebugLoc(); @@ -3558,7 +3558,7 @@ /// we want to emit this as a call to a named external function, return the name /// otherwise lower it and return null. 
const char * -SelectionDAGLowering::visitIntrinsicCall(CallInst &I, unsigned Intrinsic) { +SelectionDAGBuilder::visitIntrinsicCall(CallInst &I, unsigned Intrinsic) { DebugLoc dl = getCurDebugLoc(); switch (Intrinsic) { default: @@ -4123,9 +4123,9 @@ return true; } -void SelectionDAGLowering::LowerCallTo(CallSite CS, SDValue Callee, - bool isTailCall, - MachineBasicBlock *LandingPad) { +void SelectionDAGBuilder::LowerCallTo(CallSite CS, SDValue Callee, + bool isTailCall, + MachineBasicBlock *LandingPad) { const PointerType *PT = cast(CS.getCalledValue()->getType()); const FunctionType *FTy = cast(PT->getElementType()); const Type *RetTy = FTy->getReturnType(); @@ -4272,7 +4272,7 @@ } -void SelectionDAGLowering::visitCall(CallInst &I) { +void SelectionDAGBuilder::visitCall(CallInst &I) { const char *RenameFn = 0; if (Function *F = I.getCalledFunction()) { if (F->isDeclaration()) { @@ -4667,7 +4667,7 @@ /// OpInfo describes the operand. /// Input and OutputRegs are the set of already allocated physical registers. /// -void SelectionDAGLowering:: +void SelectionDAGBuilder:: GetRegistersForValue(SDISelAsmOperandInfo &OpInfo, std::set &OutputRegs, std::set &InputRegs) { @@ -4861,7 +4861,7 @@ /// visitInlineAsm - Handle a call to an InlineAsm object. /// -void SelectionDAGLowering::visitInlineAsm(CallSite CS) { +void SelectionDAGBuilder::visitInlineAsm(CallSite CS) { InlineAsm *IA = cast(CS.getCalledValue()); /// ConstraintOperands - Information about all of the constraints. 
@@ -5283,14 +5283,14 @@ DAG.setRoot(Chain); } -void SelectionDAGLowering::visitVAStart(CallInst &I) { +void SelectionDAGBuilder::visitVAStart(CallInst &I) { DAG.setRoot(DAG.getNode(ISD::VASTART, getCurDebugLoc(), MVT::Other, getRoot(), getValue(I.getOperand(1)), DAG.getSrcValue(I.getOperand(1)))); } -void SelectionDAGLowering::visitVAArg(VAArgInst &I) { +void SelectionDAGBuilder::visitVAArg(VAArgInst &I) { SDValue V = DAG.getVAArg(TLI.getValueType(I.getType()), getCurDebugLoc(), getRoot(), getValue(I.getOperand(0)), DAG.getSrcValue(I.getOperand(0))); @@ -5298,14 +5298,14 @@ DAG.setRoot(V.getValue(1)); } -void SelectionDAGLowering::visitVAEnd(CallInst &I) { +void SelectionDAGBuilder::visitVAEnd(CallInst &I) { DAG.setRoot(DAG.getNode(ISD::VAEND, getCurDebugLoc(), MVT::Other, getRoot(), getValue(I.getOperand(1)), DAG.getSrcValue(I.getOperand(1)))); } -void SelectionDAGLowering::visitVACopy(CallInst &I) { +void SelectionDAGBuilder::visitVACopy(CallInst &I) { DAG.setRoot(DAG.getNode(ISD::VACOPY, getCurDebugLoc(), MVT::Other, getRoot(), getValue(I.getOperand(1)), @@ -5498,7 +5498,7 @@ } -void SelectionDAGLowering::CopyValueToVirtualRegister(Value *V, unsigned Reg) { +void SelectionDAGBuilder::CopyValueToVirtualRegister(Value *V, unsigned Reg) { SDValue Op = getValue(V); assert((Op.getOpcode() != ISD::CopyFromReg || cast(Op.getOperand(1))->getReg() != Reg) && @@ -5516,9 +5516,9 @@ void SelectionDAGISel::LowerArguments(BasicBlock *LLVMBB) { // If this is the entry block, emit arguments. 
Function &F = *LLVMBB->getParent(); - SelectionDAG &DAG = SDL->DAG; + SelectionDAG &DAG = SDB->DAG; SDValue OldRoot = DAG.getRoot(); - DebugLoc dl = SDL->getCurDebugLoc(); + DebugLoc dl = SDB->getCurDebugLoc(); const TargetData *TD = TLI.getTargetData(); SmallVector Ins; @@ -5634,11 +5634,11 @@ SDValue ArgValue = getCopyFromParts(DAG, dl, &InVals[0], 1, RegVT, VT, AssertOp); - MachineFunction& MF = SDL->DAG.getMachineFunction(); + MachineFunction& MF = SDB->DAG.getMachineFunction(); MachineRegisterInfo& RegInfo = MF.getRegInfo(); unsigned SRetReg = RegInfo.createVirtualRegister(TLI.getRegClassFor(RegVT)); FLI.DemoteRegister = SRetReg; - NewRoot = SDL->DAG.getCopyToReg(NewRoot, SDL->getCurDebugLoc(), SRetReg, ArgValue); + NewRoot = SDB->DAG.getCopyToReg(NewRoot, SDB->getCurDebugLoc(), SRetReg, ArgValue); DAG.setRoot(NewRoot); // i indexes lowered arguments. Bump it past the hidden sret argument. @@ -5669,18 +5669,18 @@ i += NumParts; } if (!I->use_empty()) { - SDL->setValue(I, DAG.getMergeValues(&ArgValues[0], NumValues, - SDL->getCurDebugLoc())); + SDB->setValue(I, DAG.getMergeValues(&ArgValues[0], NumValues, + SDB->getCurDebugLoc())); // If this argument is live outside of the entry block, insert a copy from // whereever we got it to the vreg that other BB's will reference it as. - SDL->CopyToExportRegsIfNeeded(I); + SDB->CopyToExportRegsIfNeeded(I); } } assert(i == InVals.size() && "Argument register count mismatch!"); // Finally, if the target has anything special to do, allow it to do so. // FIXME: this should insert code into the DAG! - EmitFunctionEntryCode(F, SDL->DAG.getMachineFunction()); + EmitFunctionEntryCode(F, SDB->DAG.getMachineFunction()); } /// Handle PHI nodes in successor blocks. 
Emit code into the SelectionDAG to @@ -5722,10 +5722,10 @@ Value *PHIOp = PN->getIncomingValueForBlock(LLVMBB); if (Constant *C = dyn_cast(PHIOp)) { - unsigned &RegOut = SDL->ConstantsOut[C]; + unsigned &RegOut = SDB->ConstantsOut[C]; if (RegOut == 0) { RegOut = FuncInfo->CreateRegForValue(C); - SDL->CopyValueToVirtualRegister(C, RegOut); + SDB->CopyValueToVirtualRegister(C, RegOut); } Reg = RegOut; } else { @@ -5735,7 +5735,7 @@ FuncInfo->StaticAllocaMap.count(cast(PHIOp)) && "Didn't codegen value into a register!??"); Reg = FuncInfo->CreateRegForValue(PHIOp); - SDL->CopyValueToVirtualRegister(PHIOp, Reg); + SDB->CopyValueToVirtualRegister(PHIOp, Reg); } } @@ -5747,12 +5747,12 @@ EVT VT = ValueVTs[vti]; unsigned NumRegisters = TLI.getNumRegisters(*CurDAG->getContext(), VT); for (unsigned i = 0, e = NumRegisters; i != e; ++i) - SDL->PHINodesToUpdate.push_back(std::make_pair(MBBI++, Reg+i)); + SDB->PHINodesToUpdate.push_back(std::make_pair(MBBI++, Reg+i)); Reg += NumRegisters; } } } - SDL->ConstantsOut.clear(); + SDB->ConstantsOut.clear(); } /// This is the Fast-ISel version of HandlePHINodesInSuccessorBlocks. It only @@ -5765,7 +5765,7 @@ TerminatorInst *TI = LLVMBB->getTerminator(); SmallPtrSet SuccsHandled; - unsigned OrigNumPHINodesToUpdate = SDL->PHINodesToUpdate.size(); + unsigned OrigNumPHINodesToUpdate = SDB->PHINodesToUpdate.size(); // Check successor nodes' PHI nodes that expect a constant to be available // from this block. 
@@ -5801,7 +5801,7 @@ if (VT == MVT::i1) VT = TLI.getTypeToTransformTo(*CurDAG->getContext(), VT); else { - SDL->PHINodesToUpdate.resize(OrigNumPHINodesToUpdate); + SDB->PHINodesToUpdate.resize(OrigNumPHINodesToUpdate); return false; } } @@ -5810,10 +5810,10 @@ unsigned Reg = F->getRegForValue(PHIOp); if (Reg == 0) { - SDL->PHINodesToUpdate.resize(OrigNumPHINodesToUpdate); + SDB->PHINodesToUpdate.resize(OrigNumPHINodesToUpdate); return false; } - SDL->PHINodesToUpdate.push_back(std::make_pair(MBBI++, Reg)); + SDB->PHINodesToUpdate.push_back(std::make_pair(MBBI++, Reg)); } } Copied: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h (from r89674, llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h) URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h?p2=llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h&p1=llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h&r1=89674&r2=89681&rev=89681&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuild.h (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h Mon Nov 23 12:04:58 2009 @@ -1,4 +1,4 @@ -//===-- SelectionDAGBuild.h - Selection-DAG building ----------------------===// +//===-- SelectionDAGBuilder.h - Selection-DAG building --------------------===// // // The LLVM Compiler Infrastructure // @@ -11,8 +11,8 @@ // //===----------------------------------------------------------------------===// -#ifndef SELECTIONDAGBUILD_H -#define SELECTIONDAGBUILD_H +#ifndef SELECTIONDAGBUILDER_H +#define SELECTIONDAGBUILDER_H #include "llvm/Constants.h" #include "llvm/CodeGen/SelectionDAG.h" @@ -79,11 +79,11 @@ class ZExtInst; //===----------------------------------------------------------------------===// -/// SelectionDAGLowering - This is the common target-independent lowering +/// SelectionDAGBuilder - This is the common target-independent lowering /// 
implementation that is parameterized by a TargetLowering object. /// Also, targets can overload any lowering method. /// -class SelectionDAGLowering { +class SelectionDAGBuilder { MachineBasicBlock *CurMBB; /// CurDebugLoc - current file + line number. Changes as we build the DAG. @@ -173,9 +173,9 @@ size_t Clusterify(CaseVector& Cases, const SwitchInst &SI); - /// CaseBlock - This structure is used to communicate between SDLowering and - /// SDISel for the code generation of additional basic blocks needed by multi- - /// case switch statements. + /// CaseBlock - This structure is used to communicate between + /// SelectionDAGBuilder and SDISel for the code generation of additional basic + /// blocks needed by multi-case switch statements. struct CaseBlock { CaseBlock(ISD::CondCode cc, Value *cmplhs, Value *cmprhs, Value *cmpmiddle, MachineBasicBlock *truebb, MachineBasicBlock *falsebb, @@ -297,9 +297,9 @@ LLVMContext *Context; - SelectionDAGLowering(SelectionDAG &dag, TargetLowering &tli, - FunctionLoweringInfo &funcinfo, - CodeGenOpt::Level ol) + SelectionDAGBuilder(SelectionDAG &dag, TargetLowering &tli, + FunctionLoweringInfo &funcinfo, + CodeGenOpt::Level ol) : CurDebugLoc(DebugLoc::getUnknownLoc()), TLI(tli), DAG(dag), FuncInfo(funcinfo), OptLevel(ol), HasTailCall(false), @@ -309,7 +309,7 @@ void init(GCFunctionInfo *gfi, AliasAnalysis &aa); /// clear - Clear out the curret SelectionDAG and the associated - /// state and prepare this SelectionDAGLowering object to be used + /// state and prepare this SelectionDAGBuilder object to be used /// for a new block. 
This doesn't clear out information about /// additional blocks that are needed to complete switch lowering /// or PHI node updating; that information is cleared out as it is Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp?rev=89681&r1=89680&r2=89681&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Mon Nov 23 12:04:58 2009 @@ -13,7 +13,7 @@ #define DEBUG_TYPE "isel" #include "ScheduleDAGSDNodes.h" -#include "SelectionDAGBuild.h" +#include "SelectionDAGBuilder.h" #include "FunctionLoweringInfo.h" #include "llvm/CodeGen/SelectionDAGISel.h" #include "llvm/Analysis/AliasAnalysis.h" @@ -280,14 +280,14 @@ MachineFunctionPass(&ID), TM(tm), TLI(*tm.getTargetLowering()), FuncInfo(new FunctionLoweringInfo(TLI)), CurDAG(new SelectionDAG(TLI, *FuncInfo)), - SDL(new SelectionDAGLowering(*CurDAG, TLI, *FuncInfo, OL)), + SDB(new SelectionDAGBuilder(*CurDAG, TLI, *FuncInfo, OL)), GFI(), OptLevel(OL), DAGSize(0) {} SelectionDAGISel::~SelectionDAGISel() { - delete SDL; + delete SDB; delete CurDAG; delete FuncInfo; } @@ -333,7 +333,7 @@ DwarfWriter *DW = getAnalysisIfAvailable(); CurDAG->init(*MF, MMI, DW); FuncInfo->set(Fn, *MF, EnableFastISel); - SDL->init(GFI, *AA); + SDB->init(GFI, *AA); for (Function::iterator I = Fn.begin(), E = Fn.end(); I != E; ++I) if (InvokeInst *Invoke = dyn_cast(I->getTerminator())) @@ -379,13 +379,13 @@ BasicBlock::iterator Begin, BasicBlock::iterator End, bool &HadTailCall) { - SDL->setCurrentBasicBlock(BB); + SDB->setCurrentBasicBlock(BB); MetadataContext &TheMetadata = LLVMBB->getParent()->getContext().getMetadata(); unsigned MDDbgKind = TheMetadata.getMDKind("dbg"); // Lower all of the non-terminator instructions. 
If a call is emitted // as a tail call, cease emitting nodes for this block. - for (BasicBlock::iterator I = Begin; I != End && !SDL->HasTailCall; ++I) { + for (BasicBlock::iterator I = Begin; I != End && !SDB->HasTailCall; ++I) { if (MDDbgKind) { // Update DebugLoc if debug information is attached with this // instruction. @@ -393,38 +393,38 @@ if (MDNode *Dbg = TheMetadata.getMD(MDDbgKind, I)) { DILocation DILoc(Dbg); DebugLoc Loc = ExtractDebugLocation(DILoc, MF->getDebugLocInfo()); - SDL->setCurDebugLoc(Loc); + SDB->setCurDebugLoc(Loc); if (MF->getDefaultDebugLoc().isUnknown()) MF->setDefaultDebugLoc(Loc); } } if (!isa(I)) - SDL->visit(*I); + SDB->visit(*I); } - if (!SDL->HasTailCall) { + if (!SDB->HasTailCall) { // Ensure that all instructions which are used outside of their defining // blocks are available as virtual registers. Invoke is handled elsewhere. for (BasicBlock::iterator I = Begin; I != End; ++I) if (!isa(I) && !isa(I)) - SDL->CopyToExportRegsIfNeeded(I); + SDB->CopyToExportRegsIfNeeded(I); // Handle PHI nodes in successor blocks. if (End == LLVMBB->end()) { HandlePHINodesInSuccessorBlocks(LLVMBB); // Lower the terminator after the copies are emitted. - SDL->visit(*LLVMBB->getTerminator()); + SDB->visit(*LLVMBB->getTerminator()); } } // Make sure the root of the DAG is up-to-date. - CurDAG->setRoot(SDL->getControlRoot()); + CurDAG->setRoot(SDB->getControlRoot()); // Final step, emit the lowered DAG as machine code. CodeGenAndEmitDAG(); - HadTailCall = SDL->HasTailCall; - SDL->clear(); + HadTailCall = SDB->HasTailCall; + SDB->clear(); } void SelectionDAGISel::ComputeLiveOutVRegInfo() { @@ -632,9 +632,9 @@ // inserted into. if (TimePassesIsEnabled) { NamedRegionTimer T("Instruction Creation", GroupName); - BB = Scheduler->EmitSchedule(&SDL->EdgeMapping); + BB = Scheduler->EmitSchedule(&SDB->EdgeMapping); } else { - BB = Scheduler->EmitSchedule(&SDL->EdgeMapping); + BB = Scheduler->EmitSchedule(&SDB->EdgeMapping); } // Free the scheduler state. 
@@ -704,7 +704,7 @@ unsigned LabelID = MMI->addLandingPad(BB); const TargetInstrDesc &II = TII.get(TargetInstrInfo::EH_LABEL); - BuildMI(BB, SDL->getCurDebugLoc(), II).addImm(LabelID); + BuildMI(BB, SDB->getCurDebugLoc(), II).addImm(LabelID); // Mark exception register as live in. unsigned Reg = TLI.getExceptionAddressRegister(); @@ -744,9 +744,9 @@ // Emit code for any incoming arguments. This must happen before // beginning FastISel on the entry block. if (LLVMBB == &Fn.getEntryBlock()) { - CurDAG->setRoot(SDL->getControlRoot()); + CurDAG->setRoot(SDB->getControlRoot()); CodeGenAndEmitDAG(); - SDL->clear(); + SDB->clear(); } FastIS->startNewBlock(BB); // Do FastISel on as many instructions as possible. @@ -799,7 +799,7 @@ R = FuncInfo->CreateRegForValue(BI); } - SDL->setCurDebugLoc(FastIS->getCurDebugLoc()); + SDB->setCurDebugLoc(FastIS->getCurDebugLoc()); bool HadTailCall = false; SelectBasicBlock(LLVMBB, BI, next(BI), HadTailCall); @@ -838,7 +838,7 @@ if (BI != End) { // If FastISel is run and it has known DebugLoc then use it. if (FastIS && !FastIS->getCurDebugLoc().isUnknown()) - SDL->setCurDebugLoc(FastIS->getCurDebugLoc()); + SDB->setCurDebugLoc(FastIS->getCurDebugLoc()); bool HadTailCall; SelectBasicBlock(LLVMBB, BI, End, HadTailCall); } @@ -856,150 +856,150 @@ DEBUG(BB->dump()); DEBUG(errs() << "Total amount of phi nodes to update: " - << SDL->PHINodesToUpdate.size() << "\n"); - DEBUG(for (unsigned i = 0, e = SDL->PHINodesToUpdate.size(); i != e; ++i) + << SDB->PHINodesToUpdate.size() << "\n"); + DEBUG(for (unsigned i = 0, e = SDB->PHINodesToUpdate.size(); i != e; ++i) errs() << "Node " << i << " : (" - << SDL->PHINodesToUpdate[i].first - << ", " << SDL->PHINodesToUpdate[i].second << ")\n"); + << SDB->PHINodesToUpdate[i].first + << ", " << SDB->PHINodesToUpdate[i].second << ")\n"); // Next, now that we know what the last MBB the LLVM BB expanded is, update // PHI nodes in successors. 
- if (SDL->SwitchCases.empty() && - SDL->JTCases.empty() && - SDL->BitTestCases.empty()) { - for (unsigned i = 0, e = SDL->PHINodesToUpdate.size(); i != e; ++i) { - MachineInstr *PHI = SDL->PHINodesToUpdate[i].first; + if (SDB->SwitchCases.empty() && + SDB->JTCases.empty() && + SDB->BitTestCases.empty()) { + for (unsigned i = 0, e = SDB->PHINodesToUpdate.size(); i != e; ++i) { + MachineInstr *PHI = SDB->PHINodesToUpdate[i].first; assert(PHI->getOpcode() == TargetInstrInfo::PHI && "This is not a machine PHI node that we are updating!"); - PHI->addOperand(MachineOperand::CreateReg(SDL->PHINodesToUpdate[i].second, + PHI->addOperand(MachineOperand::CreateReg(SDB->PHINodesToUpdate[i].second, false)); PHI->addOperand(MachineOperand::CreateMBB(BB)); } - SDL->PHINodesToUpdate.clear(); + SDB->PHINodesToUpdate.clear(); return; } - for (unsigned i = 0, e = SDL->BitTestCases.size(); i != e; ++i) { + for (unsigned i = 0, e = SDB->BitTestCases.size(); i != e; ++i) { // Lower header first, if it wasn't already lowered - if (!SDL->BitTestCases[i].Emitted) { + if (!SDB->BitTestCases[i].Emitted) { // Set the current basic block to the mbb we wish to insert the code into - BB = SDL->BitTestCases[i].Parent; - SDL->setCurrentBasicBlock(BB); + BB = SDB->BitTestCases[i].Parent; + SDB->setCurrentBasicBlock(BB); // Emit the code - SDL->visitBitTestHeader(SDL->BitTestCases[i]); - CurDAG->setRoot(SDL->getRoot()); + SDB->visitBitTestHeader(SDB->BitTestCases[i]); + CurDAG->setRoot(SDB->getRoot()); CodeGenAndEmitDAG(); - SDL->clear(); + SDB->clear(); } - for (unsigned j = 0, ej = SDL->BitTestCases[i].Cases.size(); j != ej; ++j) { + for (unsigned j = 0, ej = SDB->BitTestCases[i].Cases.size(); j != ej; ++j) { // Set the current basic block to the mbb we wish to insert the code into - BB = SDL->BitTestCases[i].Cases[j].ThisBB; - SDL->setCurrentBasicBlock(BB); + BB = SDB->BitTestCases[i].Cases[j].ThisBB; + SDB->setCurrentBasicBlock(BB); // Emit the code if (j+1 != ej) - 
SDL->visitBitTestCase(SDL->BitTestCases[i].Cases[j+1].ThisBB, - SDL->BitTestCases[i].Reg, - SDL->BitTestCases[i].Cases[j]); + SDB->visitBitTestCase(SDB->BitTestCases[i].Cases[j+1].ThisBB, + SDB->BitTestCases[i].Reg, + SDB->BitTestCases[i].Cases[j]); else - SDL->visitBitTestCase(SDL->BitTestCases[i].Default, - SDL->BitTestCases[i].Reg, - SDL->BitTestCases[i].Cases[j]); + SDB->visitBitTestCase(SDB->BitTestCases[i].Default, + SDB->BitTestCases[i].Reg, + SDB->BitTestCases[i].Cases[j]); - CurDAG->setRoot(SDL->getRoot()); + CurDAG->setRoot(SDB->getRoot()); CodeGenAndEmitDAG(); - SDL->clear(); + SDB->clear(); } // Update PHI Nodes - for (unsigned pi = 0, pe = SDL->PHINodesToUpdate.size(); pi != pe; ++pi) { - MachineInstr *PHI = SDL->PHINodesToUpdate[pi].first; + for (unsigned pi = 0, pe = SDB->PHINodesToUpdate.size(); pi != pe; ++pi) { + MachineInstr *PHI = SDB->PHINodesToUpdate[pi].first; MachineBasicBlock *PHIBB = PHI->getParent(); assert(PHI->getOpcode() == TargetInstrInfo::PHI && "This is not a machine PHI node that we are updating!"); // This is "default" BB. We have two jumps to it. From "header" BB and // from last "case" BB. - if (PHIBB == SDL->BitTestCases[i].Default) { - PHI->addOperand(MachineOperand::CreateReg(SDL->PHINodesToUpdate[pi].second, + if (PHIBB == SDB->BitTestCases[i].Default) { + PHI->addOperand(MachineOperand::CreateReg(SDB->PHINodesToUpdate[pi].second, false)); - PHI->addOperand(MachineOperand::CreateMBB(SDL->BitTestCases[i].Parent)); - PHI->addOperand(MachineOperand::CreateReg(SDL->PHINodesToUpdate[pi].second, + PHI->addOperand(MachineOperand::CreateMBB(SDB->BitTestCases[i].Parent)); + PHI->addOperand(MachineOperand::CreateReg(SDB->PHINodesToUpdate[pi].second, false)); - PHI->addOperand(MachineOperand::CreateMBB(SDL->BitTestCases[i].Cases. + PHI->addOperand(MachineOperand::CreateMBB(SDB->BitTestCases[i].Cases. back().ThisBB)); } // One of "cases" BB. 
- for (unsigned j = 0, ej = SDL->BitTestCases[i].Cases.size(); + for (unsigned j = 0, ej = SDB->BitTestCases[i].Cases.size(); j != ej; ++j) { - MachineBasicBlock* cBB = SDL->BitTestCases[i].Cases[j].ThisBB; + MachineBasicBlock* cBB = SDB->BitTestCases[i].Cases[j].ThisBB; if (cBB->succ_end() != std::find(cBB->succ_begin(),cBB->succ_end(), PHIBB)) { - PHI->addOperand(MachineOperand::CreateReg(SDL->PHINodesToUpdate[pi].second, + PHI->addOperand(MachineOperand::CreateReg(SDB->PHINodesToUpdate[pi].second, false)); PHI->addOperand(MachineOperand::CreateMBB(cBB)); } } } } - SDL->BitTestCases.clear(); + SDB->BitTestCases.clear(); // If the JumpTable record is filled in, then we need to emit a jump table. // Updating the PHI nodes is tricky in this case, since we need to determine // whether the PHI is a successor of the range check MBB or the jump table MBB - for (unsigned i = 0, e = SDL->JTCases.size(); i != e; ++i) { + for (unsigned i = 0, e = SDB->JTCases.size(); i != e; ++i) { // Lower header first, if it wasn't already lowered - if (!SDL->JTCases[i].first.Emitted) { + if (!SDB->JTCases[i].first.Emitted) { // Set the current basic block to the mbb we wish to insert the code into - BB = SDL->JTCases[i].first.HeaderBB; - SDL->setCurrentBasicBlock(BB); + BB = SDB->JTCases[i].first.HeaderBB; + SDB->setCurrentBasicBlock(BB); // Emit the code - SDL->visitJumpTableHeader(SDL->JTCases[i].second, SDL->JTCases[i].first); - CurDAG->setRoot(SDL->getRoot()); + SDB->visitJumpTableHeader(SDB->JTCases[i].second, SDB->JTCases[i].first); + CurDAG->setRoot(SDB->getRoot()); CodeGenAndEmitDAG(); - SDL->clear(); + SDB->clear(); } // Set the current basic block to the mbb we wish to insert the code into - BB = SDL->JTCases[i].second.MBB; - SDL->setCurrentBasicBlock(BB); + BB = SDB->JTCases[i].second.MBB; + SDB->setCurrentBasicBlock(BB); // Emit the code - SDL->visitJumpTable(SDL->JTCases[i].second); - CurDAG->setRoot(SDL->getRoot()); + SDB->visitJumpTable(SDB->JTCases[i].second); + 
CurDAG->setRoot(SDB->getRoot()); CodeGenAndEmitDAG(); - SDL->clear(); + SDB->clear(); // Update PHI Nodes - for (unsigned pi = 0, pe = SDL->PHINodesToUpdate.size(); pi != pe; ++pi) { - MachineInstr *PHI = SDL->PHINodesToUpdate[pi].first; + for (unsigned pi = 0, pe = SDB->PHINodesToUpdate.size(); pi != pe; ++pi) { + MachineInstr *PHI = SDB->PHINodesToUpdate[pi].first; MachineBasicBlock *PHIBB = PHI->getParent(); assert(PHI->getOpcode() == TargetInstrInfo::PHI && "This is not a machine PHI node that we are updating!"); // "default" BB. We can go there only from header BB. - if (PHIBB == SDL->JTCases[i].second.Default) { + if (PHIBB == SDB->JTCases[i].second.Default) { PHI->addOperand - (MachineOperand::CreateReg(SDL->PHINodesToUpdate[pi].second, false)); + (MachineOperand::CreateReg(SDB->PHINodesToUpdate[pi].second, false)); PHI->addOperand - (MachineOperand::CreateMBB(SDL->JTCases[i].first.HeaderBB)); + (MachineOperand::CreateMBB(SDB->JTCases[i].first.HeaderBB)); } // JT BB. Just iterate over successors here if (BB->succ_end() != std::find(BB->succ_begin(),BB->succ_end(), PHIBB)) { PHI->addOperand - (MachineOperand::CreateReg(SDL->PHINodesToUpdate[pi].second, false)); + (MachineOperand::CreateReg(SDB->PHINodesToUpdate[pi].second, false)); PHI->addOperand(MachineOperand::CreateMBB(BB)); } } } - SDL->JTCases.clear(); + SDB->JTCases.clear(); // If the switch block involved a branch to one of the actual successors, we // need to update PHI nodes in that block. 
- for (unsigned i = 0, e = SDL->PHINodesToUpdate.size(); i != e; ++i) { - MachineInstr *PHI = SDL->PHINodesToUpdate[i].first; + for (unsigned i = 0, e = SDB->PHINodesToUpdate.size(); i != e; ++i) { + MachineInstr *PHI = SDB->PHINodesToUpdate[i].first; assert(PHI->getOpcode() == TargetInstrInfo::PHI && "This is not a machine PHI node that we are updating!"); if (BB->isSuccessor(PHI->getParent())) { - PHI->addOperand(MachineOperand::CreateReg(SDL->PHINodesToUpdate[i].second, + PHI->addOperand(MachineOperand::CreateReg(SDB->PHINodesToUpdate[i].second, false)); PHI->addOperand(MachineOperand::CreateMBB(BB)); } @@ -1007,36 +1007,36 @@ // If we generated any switch lowering information, build and codegen any // additional DAGs necessary. - for (unsigned i = 0, e = SDL->SwitchCases.size(); i != e; ++i) { + for (unsigned i = 0, e = SDB->SwitchCases.size(); i != e; ++i) { // Set the current basic block to the mbb we wish to insert the code into - MachineBasicBlock *ThisBB = BB = SDL->SwitchCases[i].ThisBB; - SDL->setCurrentBasicBlock(BB); + MachineBasicBlock *ThisBB = BB = SDB->SwitchCases[i].ThisBB; + SDB->setCurrentBasicBlock(BB); // Emit the code - SDL->visitSwitchCase(SDL->SwitchCases[i]); - CurDAG->setRoot(SDL->getRoot()); + SDB->visitSwitchCase(SDB->SwitchCases[i]); + CurDAG->setRoot(SDB->getRoot()); CodeGenAndEmitDAG(); // Handle any PHI nodes in successors of this chunk, as if we were coming // from the original BB before switch expansion. Note that PHI nodes can // occur multiple times in PHINodesToUpdate. We have to be very careful to // handle them the right number of times. - while ((BB = SDL->SwitchCases[i].TrueBB)) { // Handle LHS and RHS. + while ((BB = SDB->SwitchCases[i].TrueBB)) { // Handle LHS and RHS. // If new BB's are created during scheduling, the edges may have been // updated. That is, the edge from ThisBB to BB may have been split and // BB's predecessor is now another block. 
DenseMap::iterator EI = - SDL->EdgeMapping.find(BB); - if (EI != SDL->EdgeMapping.end()) + SDB->EdgeMapping.find(BB); + if (EI != SDB->EdgeMapping.end()) ThisBB = EI->second; for (MachineBasicBlock::iterator Phi = BB->begin(); Phi != BB->end() && Phi->getOpcode() == TargetInstrInfo::PHI; ++Phi){ // This value for this PHI node is recorded in PHINodesToUpdate, get it. for (unsigned pn = 0; ; ++pn) { - assert(pn != SDL->PHINodesToUpdate.size() && + assert(pn != SDB->PHINodesToUpdate.size() && "Didn't find PHI entry!"); - if (SDL->PHINodesToUpdate[pn].first == Phi) { - Phi->addOperand(MachineOperand::CreateReg(SDL->PHINodesToUpdate[pn]. + if (SDB->PHINodesToUpdate[pn].first == Phi) { + Phi->addOperand(MachineOperand::CreateReg(SDB->PHINodesToUpdate[pn]. second, false)); Phi->addOperand(MachineOperand::CreateMBB(ThisBB)); break; @@ -1045,19 +1045,19 @@ } // Don't process RHS if same block as LHS. - if (BB == SDL->SwitchCases[i].FalseBB) - SDL->SwitchCases[i].FalseBB = 0; + if (BB == SDB->SwitchCases[i].FalseBB) + SDB->SwitchCases[i].FalseBB = 0; // If we haven't handled the RHS, do so now. Otherwise, we're done. 
- SDL->SwitchCases[i].TrueBB = SDL->SwitchCases[i].FalseBB; - SDL->SwitchCases[i].FalseBB = 0; + SDB->SwitchCases[i].TrueBB = SDB->SwitchCases[i].FalseBB; + SDB->SwitchCases[i].FalseBB = 0; } - assert(SDL->SwitchCases[i].TrueBB == 0 && SDL->SwitchCases[i].FalseBB == 0); - SDL->clear(); + assert(SDB->SwitchCases[i].TrueBB == 0 && SDB->SwitchCases[i].FalseBB == 0); + SDB->clear(); } - SDL->SwitchCases.clear(); + SDB->SwitchCases.clear(); - SDL->PHINodesToUpdate.clear(); + SDB->PHINodesToUpdate.clear(); } From clattner at apple.com Mon Nov 23 12:11:10 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 10:11:10 -0800 Subject: [llvm-commits] [llvm] r89487 - /llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp In-Reply-To: References: <200911202105.nAKL5bJe020587@zion.cs.uiuc.edu> <56E0813A-560E-4810-9B9F-457E0853AB17@apple.com> Message-ID: <52095503-5F8D-41F5-8001-DDAC8E8582E0@apple.com> On Nov 23, 2009, at 9:56 AM, Devang Patel wrote: > On Nov 21, 2009, at 7:18 AM, Chris Lattner wrote: > On Nov 20, 2009, at 1:05 PM, Devang Patel wrote: >> >>> Author: dpatel >>> Date: Fri Nov 20 15:05:37 2009 >>> New Revision: 89487 >>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=89487&view=rev >>> Log: >>> There is no need to emit source location info for DW_TAG_pointer_type. >> >> This seems like a strange special case. Why are pointers special here? > > Usually, one pointer type is used many places and FEs do not preserve separate locations for each instance. GCC does not emit location info here. If we get precise location info from FE then it makes sense to include location info here. Clang preserves this information and could emit it. However, this is not really relevant here: shouldn't the decision be left up to the front-end, not discarded arbitrarily by the code generator? It seems like the right fix is in llvm-gcc, not the llvm backend. 
-Chris From gohman at apple.com Mon Nov 23 12:12:11 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 23 Nov 2009 18:12:11 -0000 Subject: [llvm-commits] [llvm] r89683 - in /llvm/trunk/lib/CodeGen/SelectionDAG: FunctionLoweringInfo.cpp FunctionLoweringInfo.h SelectionDAGISel.cpp Message-ID: <200911231812.nANICBlK001271@zion.cs.uiuc.edu> Author: djg Date: Mon Nov 23 12:12:11 2009 New Revision: 89683 URL: http://llvm.org/viewvc/llvm-project?rev=89683&view=rev Log: Move CopyCatchInfo into FunctionLoweringInfo.cpp too, for consistency. Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp?rev=89683&r1=89682&r2=89683&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp Mon Nov 23 12:12:11 2009 @@ -18,6 +18,7 @@ #include "llvm/DerivedTypes.h" #include "llvm/Function.h" #include "llvm/Instructions.h" +#include "llvm/IntrinsicInst.h" #include "llvm/LLVMContext.h" #include "llvm/Module.h" #include "llvm/CodeGen/MachineFunction.h" @@ -340,4 +341,15 @@ } } - +void llvm::CopyCatchInfo(BasicBlock *SrcBB, BasicBlock *DestBB, + MachineModuleInfo *MMI, FunctionLoweringInfo &FLI) { + for (BasicBlock::iterator I = SrcBB->begin(), E = --SrcBB->end(); I != E; ++I) + if (EHSelectorInst *EHSel = dyn_cast(I)) { + // Apply the catch info to DestBB. 
+ AddCatchInfo(*EHSel, MMI, FLI.MBBMap[DestBB]); +#ifndef NDEBUG + if (!FLI.MBBMap[SrcBB]->isLandingPad()) + FLI.CatchInfoFound.insert(EHSel); +#endif + } +} Modified: llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h?rev=89683&r1=89682&r2=89683&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.h Mon Nov 23 12:12:11 2009 @@ -142,6 +142,10 @@ /// call, and add them to the specified machine basic block. void AddCatchInfo(CallInst &I, MachineModuleInfo *MMI, MachineBasicBlock *MBB); +/// CopyCatchInfo - Copy catch information from DestBB to SrcBB. +void CopyCatchInfo(BasicBlock *SrcBB, BasicBlock *DestBB, + MachineModuleInfo *MMI, FunctionLoweringInfo &FLI); + } // end namespace llvm #endif Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp?rev=89683&r1=89682&r2=89683&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Mon Nov 23 12:12:11 2009 @@ -362,19 +362,6 @@ return true; } -static void copyCatchInfo(BasicBlock *SrcBB, BasicBlock *DestBB, - MachineModuleInfo *MMI, FunctionLoweringInfo &FLI) { - for (BasicBlock::iterator I = SrcBB->begin(), E = --SrcBB->end(); I != E; ++I) - if (EHSelectorInst *EHSel = dyn_cast(I)) { - // Apply the catch info to DestBB. 
- AddCatchInfo(*EHSel, MMI, FLI.MBBMap[DestBB]); -#ifndef NDEBUG - if (!FLI.MBBMap[SrcBB]->isLandingPad()) - FLI.CatchInfoFound.insert(EHSel); -#endif - } -} - void SelectionDAGISel::SelectBasicBlock(BasicBlock *LLVMBB, BasicBlock::iterator Begin, BasicBlock::iterator End, @@ -735,7 +722,7 @@ if (I == E) // No catch info found - try to extract some from the successor. - copyCatchInfo(Br->getSuccessor(0), LLVMBB, MMI, *FuncInfo); + CopyCatchInfo(Br->getSuccessor(0), LLVMBB, MMI, *FuncInfo); } } From grosbach at apple.com Mon Nov 23 12:14:15 2009 From: grosbach at apple.com (Jim Grosbach) Date: Mon, 23 Nov 2009 10:14:15 -0800 Subject: [llvm-commits] [llvm] r89403 - /llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp In-Reply-To: <0D81A26C-67F8-4671-82F7-C55A21064706@apple.com> References: <200911192310.nAJNASND022655@zion.cs.uiuc.edu> <6A655B82-5A9A-4854-A1C8-98E2DF7EE8E7@apple.com> <0D81A26C-67F8-4671-82F7-C55A21064706@apple.com> Message-ID: On Nov 23, 2009, at 9:44 AM, Chris Lattner wrote: > On Nov 22, 2009, at 10:28 AM, Jim Grosbach wrote: >>> Are you sure that this is the right thing to do? GCC inline asm >>> precisely specifies how the size of an inline asm is computed: >>> http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Extended-Asm >>> (6.39.1) >>> >>> If some inline asm isn't working with this algorithm, then the >>> inline asm is wrong and should be fixed. >> >> >> Yes, I believe this is the correct approach. >> >> The algorithm described is, "The estimate is formed by counting the >> number of statements in the pattern of the asm and multiplying that >> by the length of the longest instruction on that processor." This >> is exactly what we do, and gets us close, but it is still an >> estimate. To do better, we'll need to actually parse the assembly >> code to determine whether it's a 16 or a 32 bit instruction. In >> Thumb2, that's a non-trivial thing, and I suspect we don't want to >> go down that path, at least right now. 
> > No, this isn't true at all. This is a statement of how the language > extension (in this case, gnu assembly) works. If the "estimate" is > incorrect for a piece of code, then the asm is incorrect, not the > compiler. > I think we're talking past one another here but get back on the same page in your next comment. >> What's happening here is that the inline asm is actually shorter >> than the estimate since it's a 16-bit instruction, and is also not >> located between the constant pool reference and the constant pool >> entry. Since it's different in size than what the estimate >> believes, our calculations for when alignment padding will be >> inserted are now off, leading to problems. > > Ah, this is a completely different problem from what I thought. I > thought we were incorrectly under-estimating the size of an asm, not > overestimating it. If the code is expecting to get the exact size > of the asm, it is definitely "doomed to failure" :). > Yes, that's exactly the problem: the code is expecting the exact size (in bytes) of the asm, is over-estimating it, and is falling over as a result. You're absolutely right that if we were underestimating the size of the inline asm, then the problem would be a flaw in that algorithm. > However, if you just need alignment, can you just not insert > "alignment padding" and instead insert a .align directive after the > asm? We already use an .align directive to actually insert the padding, which is good. Otherwise we'd likely be having far more failures of other sorts. The constant island pool pass thinks it knows for sure what that directive will do (that is, when it will and won't insert padding bytes, and how many), and takes that into account when calculating distances between instructions. 
In the presence of inline asm, it can get it wrong and mistakenly think padding has not been added when it actually has, resulting in the out of range error since the target constant pool entry is actually further away than the compiler thought. >> This patch adjusts those padding calculations to know that in the >> presence of inline assembly, we're dealing with an estimate and >> therefore can't make assumptions about alignment padding. > > Ah ok, so this just causes extra .align directives to be emitted, it > doesn't make wildly conservative assumptions? If so, sounds great! :) Close. We're not inserting any additional directives, but rather better understanding the ones we're already emitting. It causes us to correctly understand that when we have inline asm, we don't know for sure when an .align directive will cause padding to be inserted. We know when it may be inserted, and track that instead. Note that in the absence of inline asm, we continue to calculate precisely. I believe this is well on the side of prudently rather than wildly conservative. Thanks, Jim From johnny.chen at apple.com Mon Nov 23 12:16:16 2009 From: johnny.chen at apple.com (Johnny Chen) Date: Mon, 23 Nov 2009 18:16:16 -0000 Subject: [llvm-commits] [llvm] r89684 - in /llvm/trunk/lib/Target/ARM: ARMInstrFormats.td ARMInstrNEON.td Message-ID: <200911231816.nANIGGiG001437@zion.cs.uiuc.edu> Author: johnny Date: Mon Nov 23 12:16:16 2009 New Revision: 89684 URL: http://llvm.org/viewvc/llvm-project?rev=89684&view=rev Log: Partially revert r89377 by removing NLdStLN class definition from ARMInstrFormats.td and fixing VLD[234]LN* and VST[234]LN* to derive from NLdSt instead of NLdStLN. 
Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrFormats.td?rev=89684&r1=89683&r2=89684&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrFormats.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrFormats.td Mon Nov 23 12:16:16 2009 @@ -1248,17 +1248,6 @@ let Inst{7-4} = op7_4; } -// With selective bit(s) from op7_4 specified by subclasses. -class NLdStLN op21_20, bits<4> op11_8, - dag oops, dag iops, InstrItinClass itin, - string opc, string asm, string cstr, list pattern> - : NeonI { - let Inst{31-24} = 0b11110100; - let Inst{23} = op23; - let Inst{21-20} = op21_20; - let Inst{11-8} = op11_8; -} - class NDataI pattern> : NeonI op11_8, string OpcodeStr> - : NLdStLN<1,0b10,op11_8, (outs DPR:$dst1, DPR:$dst2), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), - IIC_VLD2, - OpcodeStr, "\t\\{$dst1[$lane],$dst2[$lane]\\}, $addr", - "$src1 = $dst1, $src2 = $dst2", []>; + : NLdSt<1,0b10,op11_8,{?,?,?,?}, (outs DPR:$dst1, DPR:$dst2), + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), + IIC_VLD2, + OpcodeStr, "\t\\{$dst1[$lane],$dst2[$lane]\\}, $addr", + "$src1 = $dst1, $src2 = $dst2", []>; // vld2 to single-spaced registers. 
def VLD2LNd8 : VLD2LN<0b0001, "vld2.8">; @@ -313,12 +313,12 @@ // VLD3LN : Vector Load (single 3-element structure to one lane) class VLD3LN op11_8, string OpcodeStr> - : NLdStLN<1,0b10,op11_8, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, - nohash_imm:$lane), IIC_VLD3, - OpcodeStr, - "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane]\\}, $addr", - "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3", []>; + : NLdSt<1,0b10,op11_8,{?,?,?,?}, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3), + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, + nohash_imm:$lane), IIC_VLD3, + OpcodeStr, + "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane]\\}, $addr", + "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3", []>; // vld3 to single-spaced registers. def VLD3LNd8 : VLD3LN<0b0010, "vld3.8"> { @@ -349,13 +349,13 @@ // VLD4LN : Vector Load (single 4-element structure to one lane) class VLD4LN op11_8, string OpcodeStr> - : NLdStLN<1,0b10,op11_8, - (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, - nohash_imm:$lane), IIC_VLD4, - OpcodeStr, - "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane],$dst4[$lane]\\}, $addr", - "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3, $src4 = $dst4", []>; + : NLdSt<1,0b10,op11_8,{?,?,?,?}, + (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, + nohash_imm:$lane), IIC_VLD4, + OpcodeStr, + "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane],$dst4[$lane]\\}, $addr", + "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3, $src4 = $dst4", []>; // vld4 to single-spaced registers. 
def VLD4LNd8 : VLD4LN<0b0011, "vld4.8">; @@ -504,11 +504,11 @@ // VST2LN : Vector Store (single 2-element structure from one lane) class VST2LN op11_8, string OpcodeStr> - : NLdStLN<1,0b00,op11_8, (outs), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), - IIC_VST, - OpcodeStr, "\t\\{$src1[$lane],$src2[$lane]\\}, $addr", - "", []>; + : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), + IIC_VST, + OpcodeStr, "\t\\{$src1[$lane],$src2[$lane]\\}, $addr", + "", []>; // vst2 to single-spaced registers. def VST2LNd8 : VST2LN<0b0001, "vst2.8">; @@ -537,11 +537,11 @@ // VST3LN : Vector Store (single 3-element structure from one lane) class VST3LN op11_8, string OpcodeStr> - : NLdStLN<1,0b00,op11_8, (outs), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, - nohash_imm:$lane), IIC_VST, - OpcodeStr, - "\t\\{$src1[$lane],$src2[$lane],$src3[$lane]\\}, $addr", "", []>; + : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, + nohash_imm:$lane), IIC_VST, + OpcodeStr, + "\t\\{$src1[$lane],$src2[$lane],$src3[$lane]\\}, $addr", "", []>; // vst3 to single-spaced registers. def VST3LNd8 : VST3LN<0b0010, "vst3.8"> { @@ -572,12 +572,12 @@ // VST4LN : Vector Store (single 4-element structure from one lane) class VST4LN op11_8, string OpcodeStr> - : NLdStLN<1,0b00,op11_8, (outs), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, - nohash_imm:$lane), IIC_VST, - OpcodeStr, - "\t\\{$src1[$lane],$src2[$lane],$src3[$lane],$src4[$lane]\\}, $addr", - "", []>; + : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, + nohash_imm:$lane), IIC_VST, + OpcodeStr, + "\t\\{$src1[$lane],$src2[$lane],$src3[$lane],$src4[$lane]\\}, $addr", + "", []>; // vst4 to single-spaced registers. 
def VST4LNd8 : VST4LN<0b0011, "vst4.8">; From clattner at apple.com Mon Nov 23 12:22:33 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 10:22:33 -0800 Subject: [llvm-commits] [llvm] r89667 - in /llvm/trunk/lib/CodeGen/SelectionDAG: FunctionLoweringInfo.cpp FunctionLoweringInfo.h SelectionDAGBuild.cpp SelectionDAGBuild.h SelectionDAGISel.cpp In-Reply-To: <200911231716.nANHGNlI031330@zion.cs.uiuc.edu> References: <200911231716.nANHGNlI031330@zion.cs.uiuc.edu> Message-ID: <5C9DE571-FF75-4AAE-94BC-49AEBB87DBF7@apple.com> On Nov 23, 2009, at 9:16 AM, Dan Gohman wrote: > Author: djg > Date: Mon Nov 23 11:16:22 2009 > New Revision: 89667 > > URL: http://llvm.org/viewvc/llvm-project?rev=89667&view=rev > Log: > Move the FunctionLoweringInfo class and some related utility functions out > of SelectionDAGBuild.h/cpp into its own files, to help separate > general lowering logic from SelectionDAG-specific lowering logic. Hi Dan, Would it make sense to split TargetLowering into TargetLowering and SDLowering classes? The former would live in libtarget and the latter in lib/codegen/selectiondag? Right now we have some clients (like LSR) that use TLI - but only inline methods and virtual methods - to avoid cyclic dependencies. This is gross :) -Chris 
From clattner at apple.com Mon Nov 23 12:23:46 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 10:23:46 -0800 Subject: [llvm-commits] [llvm] r89403 - /llvm/trunk/lib/Target/ARM/ARMConstantIslandPass.cpp In-Reply-To: References: <200911192310.nAJNASND022655@zion.cs.uiuc.edu> <6A655B82-5A9A-4854-A1C8-98E2DF7EE8E7@apple.com> <0D81A26C-67F8-4671-82F7-C55A21064706@apple.com> Message-ID: <8137DD7C-C825-49CE-A293-81BFB208622E@apple.com> On Nov 23, 2009, at 10:14 AM, Jim Grosbach wrote: >>> >>> This patch adjusts those padding calculations to know that in the presence of inline assembly, we're dealing with an estimate and therefore can't make assumptions about alignment padding. >> >> Ah ok, so this just causes extra .align directives to be emitted, it doesn't make wildly conservative assumptions? If so, sounds great! :) > > Close. We're not inserting any additional directives, but rather better understanding the ones we're already emitting. It causes us to correctly understand that when we have inline asm, we don't know for sure when an .align directive will cause padding to be inserted. We know when it may be inserted, and track that instead. Note that in the absence of inline asm, we continue to calculate precisely. I believe this is well on the side of prudently rather than wildly conservative. Ok, thanks Jim! 
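[Editorial sketch, not part of the thread.] The padding reasoning in the r89403 discussion above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions and is not the code in ARMConstantIslandPass.cpp: the function names, the Granule parameter (the smallest instruction size, e.g. 2 bytes for Thumb), and the OffsetIsEstimate flag are all hypothetical. The idea is only that a known offset lets the padding for an .align directive be computed exactly, while an offset that follows inline asm of estimated size forces the pass to budget for the padding that may be inserted.

```cpp
#include <cassert>
#include <cstdint>

// Exact number of padding bytes an alignment directive inserts when the
// current offset is known precisely. Align must be a power of two.
inline uint32_t exactPadding(uint32_t Offset, uint32_t Align) {
  return (Align - (Offset & (Align - 1))) & (Align - 1);
}

// Largest padding that may be inserted when the offset is only an estimate.
// Assumption: the real offset is still a multiple of Granule, so the worst
// case is Align - Granule bytes (or none if Align does not exceed Granule).
inline uint32_t worstCasePadding(uint32_t Align, uint32_t Granule) {
  return Align > Granule ? Align - Granule : 0;
}

// Padding to budget for when measuring distances: exact when the offset is
// trusted, worst case when inline asm made the offset an estimate
// (hypothetical OffsetIsEstimate flag).
inline uint32_t paddingToAssume(uint32_t Offset, uint32_t Align,
                                uint32_t Granule, bool OffsetIsEstimate) {
  return OffsetIsEstimate ? worstCasePadding(Align, Granule)
                          : exactPadding(Offset, Align);
}
```

With a 4-byte alignment and 2-byte granule, an offset believed to be 8 needs no padding if it is exact, but 2 bytes should be assumed if inline asm sits before it, since the true offset could be 6 or 10.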
-Chris From clattner at apple.com Mon Nov 23 12:28:56 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 10:28:56 -0800 Subject: [llvm-commits] [llvm] r89421 - /llvm/trunk/lib/Analysis/CaptureTracking.cpp In-Reply-To: <4B0A9A8C.9040905@free.fr> References: <200911200050.nAK0or7J026222@zion.cs.uiuc.edu> <4B068327.1070103@free.fr> <784C47FB-FB86-404E-B39F-8EAF9C4A98E0@apple.com> <4B080CEF.5060301@free.fr> <6A5CF08E-94EF-4893-81CC-5AEF0A19EB25@apple.com> <4B0A9A8C.9040905@free.fr> Message-ID: <006872F5-2CC8-400F-B995-22D5B52886BA@apple.com> On Nov 23, 2009, at 6:22 AM, Duncan Sands wrote: > Hi Chris, > >>>>> I think this is wrong, consider the following pseudocode example: >>>> While this example is "possible" I really don't think this is worth worrying about. It is not valid C code, is not likely to exist in practice, etc. Beyond that, comparison against null is really common and we really do want "nocapture" in these cases. >>> well, it's a slippery slope :) Dan later changed this to only allow >>> comparisons against malloc return values and other noalias function >>> results. >> Is such paranoia really worthwhile? > > I'm not against loosening the definition of nocapture as long as it is clearly > defined what nocapture means. I think the obvious thing to do is to say that > if the value of the pointer is reconstructed *entirely by control flow* then it > is not considered to be captured. That works for me, though I'm not sure exactly what it means. > Consider the following standard crazy example > of capturing a pointer P. Yes, I'm aware of things like that, I just don't care about them ;-) > > Using my proposed definition here we would say that P is *not* captured. What > do you think? That sort of thing should definitely not be supported. Can you propose wording to add to langref to codify how nocapture should work here? 
-Chris From clattner at apple.com Mon Nov 23 12:29:55 2009 From: clattner at apple.com (Chris Lattner) Date: Mon, 23 Nov 2009 10:29:55 -0800 Subject: [llvm-commits] [llvm] r89626 - in /llvm/trunk: docs/CommandGuide/FileCheck.pod utils/FileCheck/FileCheck.cpp In-Reply-To: <200911222207.nAMM7pis006331@zion.cs.uiuc.edu> References: <200911222207.nAMM7pis006331@zion.cs.uiuc.edu> Message-ID: On Nov 22, 2009, at 2:07 PM, Daniel Dunbar wrote: > Author: ddunbar > Date: Sun Nov 22 16:07:50 2009 > New Revision: 89626 > > URL: http://llvm.org/viewvc/llvm-project?rev=89626&view=rev > Log: > Allow '_' in FileCheck variable names, it is nice to have at least one > separate character. > - Chris, OK? Fine with me, thx. Please update the TestingGuide documentation. -Chris From devang.patel at gmail.com Mon Nov 23 12:27:07 2009 From: devang.patel at gmail.com (Devang Patel) Date: Mon, 23 Nov 2009 10:27:07 -0800 Subject: [llvm-commits] [llvm] r86914 - in /llvm/trunk: include/llvm/Analysis/DebugInfo.h include/llvm/Metadata.h lib/Analysis/DebugInfo.cpp lib/VMCore/Metadata.cpp In-Reply-To: <352a1fb20911230946m350703c8ndad58655c89b58e8@mail.gmail.com> References: <200911120050.nAC0owVT014507@zion.cs.uiuc.edu> <9DC587FF-9855-4F2E-BC34-5A40A9233CEA@apple.com> <6a8523d60911221006y2058abd4m977ec8d3e13cbea8@mail.gmail.com> <352a1fb20911230946m350703c8ndad58655c89b58e8@mail.gmail.com> Message-ID: <352a1fb20911231027g46e767far8ed88cad7a03a698@mail.gmail.com> I'll take another shot and try to update DebugInfo APIs to use StringRef. - Devang From dpatel at apple.com Mon Nov 23 12:43:38 2009 From: dpatel at apple.com (Devang Patel) Date: Mon, 23 Nov 2009 18:43:38 -0000 Subject: [llvm-commits] [llvm] r89686 - /llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Message-ID: <200911231843.nANIhc1M002405@zion.cs.uiuc.edu> Author: dpatel Date: Mon Nov 23 12:43:37 2009 New Revision: 89686 URL: http://llvm.org/viewvc/llvm-project?rev=89686&view=rev Log: Revert r89487. 
Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=89686&r1=89685&r2=89686&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Mon Nov 23 12:43:37 2009 @@ -812,7 +812,7 @@ addUInt(&Buffer, dwarf::DW_AT_byte_size, 0, Size); // Add source line info if available and TyDesc is not a forward declaration. - if (!DTy.isForwardDecl() && Tag != dwarf::DW_TAG_pointer_type) + if (!DTy.isForwardDecl()) addSourceLine(&Buffer, &DTy); } From devang.patel at gmail.com Mon Nov 23 11:46:27 2009 From: devang.patel at gmail.com (Devang Patel) Date: Mon, 23 Nov 2009 09:46:27 -0800 Subject: [llvm-commits] [llvm] r86914 - in /llvm/trunk: include/llvm/Analysis/DebugInfo.h include/llvm/Metadata.h lib/Analysis/DebugInfo.cpp lib/VMCore/Metadata.cpp In-Reply-To: <6a8523d60911221006y2058abd4m977ec8d3e13cbea8@mail.gmail.com> References: <200911120050.nAC0owVT014507@zion.cs.uiuc.edu> <9DC587FF-9855-4F2E-BC34-5A40A9233CEA@apple.com> <6a8523d60911221006y2058abd4m977ec8d3e13cbea8@mail.gmail.com> Message-ID: <352a1fb20911230946m350703c8ndad58655c89b58e8@mail.gmail.com> On Sun, Nov 22, 2009 at 10:06 AM, Daniel Dunbar wrote: > As Jeffrey points out, if debug info is already ignoring "" then it > would "just work" by using .empty(), on null stringrefs. Another wrinkle here is that one of the clients, llvm-gcc, does not use StringRef to access its tree nodes. That means llvm-gcc needs to check for NULL before passing IDENTIFIER_POINTER(DECL_ASSEMBLER_NAME(Node)) through the debug info API, and DwarfWriter will again have to check for a null or empty string. > StringRef() equals StringRef(0). IIRC, this is not true. StringRef(0) does not work. > StringRef is a "ref", so you get what > you put in.
(StringRef(p).data() == p). - Devang From devang.patel at gmail.com Mon Nov 23 12:29:45 2009 From: devang.patel at gmail.com (Devang Patel) Date: Mon, 23 Nov 2009 10:29:45 -0800 Subject: [llvm-commits] [llvm] r89487 - /llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp In-Reply-To: <52095503-5F8D-41F5-8001-DDAC8E8582E0@apple.com> References: <200911202105.nAKL5bJe020587@zion.cs.uiuc.edu> <56E0813A-560E-4810-9B9F-457E0853AB17@apple.com> <52095503-5F8D-41F5-8001-DDAC8E8582E0@apple.com> Message-ID: <352a1fb20911231029p6cea8fb2x46d1ff552f0b3654@mail.gmail.com> > Clang preserves this information and could emit it. However, this is not really relevant here: shouldn't the decision be left up to the front-end, not discarded arbitrarily by the code generator? It seems like the right fix is in llvm-gcc, not the llvm backend. OK. Makes sense. - Devang From dpatel at apple.com Mon Nov 23 13:11:20 2009 From: dpatel at apple.com (Devang Patel) Date: Mon, 23 Nov 2009 19:11:20 -0000 Subject: [llvm-commits] [llvm] r89689 - in /llvm/trunk: include/llvm/Analysis/DebugInfo.h lib/Analysis/DebugInfo.cpp Message-ID: <200911231911.nANJBKGA003563@zion.cs.uiuc.edu> Author: dpatel Date: Mon Nov 23 13:11:20 2009 New Revision: 89689 URL: http://llvm.org/viewvc/llvm-project?rev=89689&view=rev Log: Add CreateLocation variant that accepts MDNode (with a default value).
Modified: llvm/trunk/include/llvm/Analysis/DebugInfo.h llvm/trunk/lib/Analysis/DebugInfo.cpp Modified: llvm/trunk/include/llvm/Analysis/DebugInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/DebugInfo.h?rev=89689&r1=89688&r2=89689&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/DebugInfo.h (original) +++ llvm/trunk/include/llvm/Analysis/DebugInfo.h Mon Nov 23 13:11:20 2009 @@ -598,6 +598,10 @@ DILocation CreateLocation(unsigned LineNo, unsigned ColumnNo, DIScope S, DILocation OrigLoc); + /// CreateLocation - Creates a debug info location. + DILocation CreateLocation(unsigned LineNo, unsigned ColumnNo, + DIScope S, MDNode *OrigLoc = 0); + /// InsertDeclare - Insert a new llvm.dbg.declare intrinsic call. Instruction *InsertDeclare(llvm::Value *Storage, DIVariable D, BasicBlock *InsertAtEnd); Modified: llvm/trunk/lib/Analysis/DebugInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/DebugInfo.cpp?rev=89689&r1=89688&r2=89689&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/DebugInfo.cpp (original) +++ llvm/trunk/lib/Analysis/DebugInfo.cpp Mon Nov 23 13:11:20 2009 @@ -976,6 +976,17 @@ return DILocation(MDNode::get(VMContext, &Elts[0], 4)); } +/// CreateLocation - Creates a debug info location. 
+DILocation DIFactory::CreateLocation(unsigned LineNo, unsigned ColumnNo, + DIScope S, MDNode *OrigLoc) { + Value *Elts[] = { + ConstantInt::get(Type::getInt32Ty(VMContext), LineNo), + ConstantInt::get(Type::getInt32Ty(VMContext), ColumnNo), + S.getNode(), + OrigLoc + }; + return DILocation(MDNode::get(VMContext, &Elts[0], 4)); +} //===----------------------------------------------------------------------===// // DIFactory: Routines for inserting code into a function From evan.cheng at apple.com Mon Nov 23 13:33:50 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 23 Nov 2009 11:33:50 -0800 Subject: [llvm-commits] [llvm] r89500 - in /llvm/trunk: include/llvm/IntrinsicsX86.td lib/Target/X86/X86InstrSSE.td test/CodeGen/X86/palignr-2.ll In-Reply-To: <200911202228.nAKMShWY023966@zion.cs.uiuc.edu> References: <200911202228.nAKMShWY023966@zion.cs.uiuc.edu> Message-ID: A better fix would be to have llvm-gcc and Clang lower it into a shuffle instruction. Any interest in doing that? :-) Evan On Nov 20, 2009, at 2:28 PM, Sean Callanan wrote: > Author: spyffe > Date: Fri Nov 20 16:28:42 2009 > New Revision: 89500 > > URL: http://llvm.org/viewvc/llvm-project?rev=89500&view=rev > Log: > Recommitting PALIGNR shift width fixes. > Thanks to Daniel Dunbar for fixing clang intrinsics: > http://llvm.org/viewvc/llvm-project?view=rev&revision=89499 > > Modified: > llvm/trunk/include/llvm/IntrinsicsX86.td > llvm/trunk/lib/Target/X86/X86InstrSSE.td > llvm/trunk/test/CodeGen/X86/palignr-2.ll > > Modified: llvm/trunk/include/llvm/IntrinsicsX86.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IntrinsicsX86.td?rev=89500&r1=89499&r2=89500&view=diff > > ============================================================================== > --- llvm/trunk/include/llvm/IntrinsicsX86.td (original) > +++ llvm/trunk/include/llvm/IntrinsicsX86.td Fri Nov 20 16:28:42 2009 > @@ -673,10 +673,10 @@ > let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.".
> def int_x86_ssse3_palign_r : GCCBuiltin<"__builtin_ia32_palignr">, > Intrinsic<[llvm_v1i64_ty], [llvm_v1i64_ty, > - llvm_v1i64_ty, llvm_i16_ty], [IntrNoMem]>; > + llvm_v1i64_ty, llvm_i8_ty], [IntrNoMem]>; > def int_x86_ssse3_palign_r_128 : GCCBuiltin<"__builtin_ia32_palignr128">, > Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, > - llvm_v2i64_ty, llvm_i32_ty], [IntrNoMem]>; > + llvm_v2i64_ty, llvm_i8_ty], [IntrNoMem]>; > } > > //===----------------------------------------------------------------------===// > > Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=89500&r1=89499&r2=89500&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original) > +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Fri Nov 20 16:28:42 2009 > @@ -2820,40 +2820,40 @@ > > let Constraints = "$src1 = $dst" in { > def PALIGNR64rr : SS3AI<0x0F, MRMSrcReg, (outs VR64:$dst), > - (ins VR64:$src1, VR64:$src2, i16imm:$src3), > + (ins VR64:$src1, VR64:$src2, i8imm:$src3), > "palignr\t{$src3, $src2, $dst|$dst, $src2, $src3}", > []>; > def PALIGNR64rm : SS3AI<0x0F, MRMSrcMem, (outs VR64:$dst), > - (ins VR64:$src1, i64mem:$src2, i16imm:$src3), > + (ins VR64:$src1, i64mem:$src2, i8imm:$src3), > "palignr\t{$src3, $src2, $dst|$dst, $src2, $src3}", > []>; > > def PALIGNR128rr : SS3AI<0x0F, MRMSrcReg, (outs VR128:$dst), > - (ins VR128:$src1, VR128:$src2, i32imm:$src3), > + (ins VR128:$src1, VR128:$src2, i8imm:$src3), > "palignr\t{$src3, $src2, $dst|$dst, $src2, $src3}", > []>, OpSize; > def PALIGNR128rm : SS3AI<0x0F, MRMSrcMem, (outs VR128:$dst), > - (ins VR128:$src1, i128mem:$src2, i32imm:$src3), > + (ins VR128:$src1, i128mem:$src2, i8imm:$src3), > "palignr\t{$src3, $src2, $dst|$dst, $src2, $src3}", > []>, OpSize; > } > > // palignr patterns. 
> -def : Pat<(int_x86_ssse3_palign_r VR64:$src1, VR64:$src2, (i16 imm:$src3)), > +def : Pat<(int_x86_ssse3_palign_r VR64:$src1, VR64:$src2, (i8 imm:$src3)), > (PALIGNR64rr VR64:$src1, VR64:$src2, (BYTE_imm imm:$src3))>, > Requires<[HasSSSE3]>; > def : Pat<(int_x86_ssse3_palign_r VR64:$src1, > (memop64 addr:$src2), > - (i16 imm:$src3)), > + (i8 imm:$src3)), > (PALIGNR64rm VR64:$src1, addr:$src2, (BYTE_imm imm:$src3))>, > Requires<[HasSSSE3]>; > > -def : Pat<(int_x86_ssse3_palign_r_128 VR128:$src1, VR128:$src2, (i32 imm:$src3)), > +def : Pat<(int_x86_ssse3_palign_r_128 VR128:$src1, VR128:$src2, (i8 imm:$src3)), > (PALIGNR128rr VR128:$src1, VR128:$src2, (BYTE_imm imm:$src3))>, > Requires<[HasSSSE3]>; > def : Pat<(int_x86_ssse3_palign_r_128 VR128:$src1, > (memopv2i64 addr:$src2), > - (i32 imm:$src3)), > + (i8 imm:$src3)), > (PALIGNR128rm VR128:$src1, addr:$src2, (BYTE_imm imm:$src3))>, > Requires<[HasSSSE3]>; > > > Modified: llvm/trunk/test/CodeGen/X86/palignr-2.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/palignr-2.ll?rev=89500&r1=89499&r2=89500&view=diff > > ============================================================================== > --- llvm/trunk/test/CodeGen/X86/palignr-2.ll (original) > +++ llvm/trunk/test/CodeGen/X86/palignr-2.ll Fri Nov 20 16:28:42 2009 > @@ -9,12 +9,12 @@ > entry: > ; CHECK: t1: > ; palignr $3, %xmm1, %xmm0 > - %0 = tail call <2 x i64> @llvm.x86.ssse3.palign.r.128(<2 x i64> %a, <2 x i64> %b, i32 24) nounwind readnone > + %0 = tail call <2 x i64> @llvm.x86.ssse3.palign.r.128(<2 x i64> %a, <2 x i64> %b, i8 24) nounwind readnone > store <2 x i64> %0, <2 x i64>* bitcast ([4 x i32]* @c to <2 x i64>*), align 16 > ret void > } > > -declare <2 x i64> @llvm.x86.ssse3.palign.r.128(<2 x i64>, <2 x i64>, i32) nounwind readnone > +declare <2 x i64> @llvm.x86.ssse3.palign.r.128(<2 x i64>, <2 x i64>, i8) nounwind readnone > > define void @t2() nounwind ssp { > entry: > @@ -22,7 +22,7 @@ > ; palignr $4, _b, %xmm0 > %0 = load 
<2 x i64>* bitcast ([4 x i32]* @b to <2 x i64>*), align 16 ; <<2 x i64>> [#uses=1] > %1 = load <2 x i64>* bitcast ([4 x i32]* @a to <2 x i64>*), align 16 ; <<2 x i64>> [#uses=1] > - %2 = tail call <2 x i64> @llvm.x86.ssse3.palign.r.128(<2 x i64> %1, <2 x i64> %0, i32 32) nounwind readnone > + %2 = tail call <2 x i64> @llvm.x86.ssse3.palign.r.128(<2 x i64> %1, <2 x i64> %0, i8 32) nounwind readnone > store <2 x i64> %2, <2 x i64>* bitcast ([4 x i32]* @c to <2 x i64>*), align 16 > ret void > } > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From evan.cheng at apple.com Mon Nov 23 13:39:22 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 23 Nov 2009 11:39:22 -0800 Subject: [llvm-commits] [PATCH] More Spill Annotations In-Reply-To: <200911230901.47371.dag@cray.com> References: <200911201622.37994.dag@cray.com> <200911230901.47371.dag@cray.com> Message-ID: <9895559A-187A-447E-945A-FEFEA37EFF43@apple.com> David, this really is not a good idea. You are adding more target hooks purely for asm printing comments. These hooks return instruction properties that should be static so they should be moved to td files. Evan On Nov 23, 2009, at 7:01 AM, David Greene wrote: > On Friday 20 November 2009 16:22, you wrote: >> This patch adds information to spill/reload comments as to whether they are >> vector or scalar. This is helpful when doing static code analysis of >> performance issues and other things. It's only implemented for X86. >> Experts on other architectures will have to fill things in. >> >> Please review. Thanks! > > Ping! 
> > -Dave > >> Index: include/llvm/Target/TargetInstrInfo.h >> =================================================================== >> --- include/llvm/Target/TargetInstrInfo.h (revision 89484) >> +++ include/llvm/Target/TargetInstrInfo.h (working copy) >> @@ -142,6 +142,23 @@ >> return false; >> } >> >> + /// isVectorInstr - Return true if the instruction is a vector >> operation. + virtual bool isVectorInstr(const MachineInstr& MI) const { >> + return false; >> + } >> + >> + /// isVectorOperand - Return true if the operand is of vector type. >> + virtual bool isVectorOperand(const MachineInstr &MI, >> + const MachineOperand *MO) const { >> + return false; >> + } >> + >> + /// isVectorOperand - Return true if the mem operand is of vector type. >> + virtual bool isVectorOperand(const MachineInstr &MI, >> + const MachineMemOperand *MMO) const { >> + return false; >> + } >> + >> /// isIdentityCopy - Return true if the instruction is a copy (or >> /// extract_subreg, insert_subreg, subreg_to_reg) where the source and >> /// destination registers are the same. >> @@ -182,11 +199,13 @@ >> >> /// hasLoadFromStackSlot - If the specified machine instruction has >> /// a load from a stack slot, return true along with the FrameIndex >> - /// of the loaded stack slot. If not, return false. Unlike >> + /// of the loaded stack slot and the machine mem operand containing >> + /// the reference. If not, return false. Unlike >> /// isLoadFromStackSlot, this returns true for any instructions that >> /// loads from the stack. This is just a hint, as some cases may be >> /// missed. >> virtual bool hasLoadFromStackSlot(const MachineInstr *MI, >> + const MachineMemOperand *&MMO, >> int &FrameIndex) const { >> return 0; >> } >> @@ -205,17 +224,18 @@ >> /// stack locations as well. This uses a heuristic so it isn't >> /// reliable for correctness. 
>> virtual unsigned isStoreToStackSlotPostFE(const MachineInstr *MI, >> - int &FrameIndex) const { >> + int &FrameIndex) const { >> return 0; >> } >> >> /// hasStoreToStackSlot - If the specified machine instruction has a >> /// store to a stack slot, return true along with the FrameIndex of >> - /// the loaded stack slot. If not, return false. Unlike >> - /// isStoreToStackSlot, this returns true for any instructions that >> - /// loads from the stack. This is just a hint, as some cases may be >> - /// missed. >> + /// the loaded stack slot and the machine mem operand containing the >> + /// reference. If not, return false. Unlike isStoreToStackSlot, >> + /// this returns true for any instructions that loads from the >> + /// stack. This is just a hint, as some cases may be missed. >> virtual bool hasStoreToStackSlot(const MachineInstr *MI, >> + const MachineMemOperand *&MMO, >> int &FrameIndex) const { >> return 0; >> } >> Index: lib/CodeGen/AsmPrinter/AsmPrinter.cpp >> =================================================================== >> --- lib/CodeGen/AsmPrinter/AsmPrinter.cpp (revision 89484) >> +++ lib/CodeGen/AsmPrinter/AsmPrinter.cpp (working copy) >> @@ -1854,35 +1854,46 @@ >> >> // We assume a single instruction only has a spill or reload, not >> // both. >> + const MachineMemOperand *MMO; >> if (TM.getInstrInfo()->isLoadFromStackSlotPostFE(&MI, FI)) { >> if (FrameInfo->isSpillSlotObjectIndex(FI)) { >> + MMO = *MI.memoperands_begin(); >> + bool isVector = TM.getInstrInfo()->isVectorOperand(MI, MMO); >> if (Newline) O << '\n'; >> O.PadToColumn(MAI->getCommentColumn()); >> - O << MAI->getCommentString() << " Reload"; >> + O << MAI->getCommentString() << (isVector? 
" Vector" : " Scalar") >> + << " Reload"; >> Newline = true; >> } >> } >> - else if (TM.getInstrInfo()->hasLoadFromStackSlot(&MI, FI)) { >> + else if (TM.getInstrInfo()->hasLoadFromStackSlot(&MI, MMO, FI)) { >> if (FrameInfo->isSpillSlotObjectIndex(FI)) { >> + bool isVector = TM.getInstrInfo()->isVectorOperand(MI, MMO); >> if (Newline) O << '\n'; >> O.PadToColumn(MAI->getCommentColumn()); >> - O << MAI->getCommentString() << " Folded Reload"; >> + O << MAI->getCommentString() << (isVector? " Vector" : " Scalar") >> + << " Folded Reload"; >> Newline = true; >> } >> } >> else if (TM.getInstrInfo()->isStoreToStackSlotPostFE(&MI, FI)) { >> if (FrameInfo->isSpillSlotObjectIndex(FI)) { >> + MMO = *MI.memoperands_begin(); >> + bool isVector = TM.getInstrInfo()->isVectorOperand(MI, MMO); >> if (Newline) O << '\n'; >> O.PadToColumn(MAI->getCommentColumn()); >> - O << MAI->getCommentString() << " Spill"; >> + O << MAI->getCommentString() << (isVector? " Vector" : " Scalar") >> + << " Spill"; >> Newline = true; >> } >> } >> - else if (TM.getInstrInfo()->hasStoreToStackSlot(&MI, FI)) { >> + else if (TM.getInstrInfo()->hasStoreToStackSlot(&MI, MMO, FI)) { >> if (FrameInfo->isSpillSlotObjectIndex(FI)) { >> + bool isVector = TM.getInstrInfo()->isVectorOperand(MI, MMO); >> if (Newline) O << '\n'; >> O.PadToColumn(MAI->getCommentColumn()); >> - O << MAI->getCommentString() << " Folded Spill"; >> + O << MAI->getCommentString() << (isVector? " Vector" : " Scalar") >> + << " Folded Spill"; >> Newline = true; >> } >> } >> @@ -1892,9 +1903,11 @@ >> if (TM.getInstrInfo()->isMoveInstr(MI, SrcReg, DstReg, >> SrcSubIdx, DstSubIdx)) { >> if (MI.getAsmPrinterFlag(ReloadReuse)) { >> + bool isVector = TM.getInstrInfo()->isVectorInstr(MI); >> if (Newline) O << '\n'; >> O.PadToColumn(MAI->getCommentColumn()); >> - O << MAI->getCommentString() << " Reload Reuse"; >> + O << MAI->getCommentString() << (isVector? 
" Vector" : " Scalar") >> + << " Reload Reuse"; >> Newline = true; >> } >> } >> Index: lib/Target/X86/X86InstrInfo.cpp >> =================================================================== >> --- lib/Target/X86/X86InstrInfo.cpp (revision 89484) >> +++ lib/Target/X86/X86InstrInfo.cpp (working copy) >> @@ -34,6 +34,7 @@ >> #include "llvm/MC/MCAsmInfo.h" >> >> #include >> +#include >> >> using namespace llvm; >> >> @@ -711,6 +712,393 @@ >> } >> } >> >> +bool X86InstrInfo::isVectorInstr(const MachineInstr &MI) const{ >> + // Handle special cases here. >> + switch(MI.getOpcode()) { >> + case X86::MOVDDUPrr: >> + case X86::MOVDDUPrm: >> + case X86::MOVSHDUPrr: >> + case X86::MOVSHDUPrm: >> + case X86::MOVSLDUPrr: >> + case X86::MOVSLDUPrm: >> + case X86::MPSADBWrri: // "PS" is lucky. Be explicit. >> + case X86::MPSADBWrmi: >> + return true; >> + case X86::MMX_MOVQ2DQrr: >> + return false; >> + } >> + >> + // Look for the common cases. >> + const TargetInstrDesc &InstrDesc = get(MI.getOpcode()); >> + const char *Name = InstrDesc.getName(); >> + if (std::strstr(Name, "PS") != 0 // SSE packed single >> + || std::strstr(Name, "PD") != 0 // SSE packed double >> + || std::strstr(Name, "DQ") != 0 // SSE packed integer >> + || Name[0] == 'P' // MMX/SSE packed integer >> + || Name[0] == 'V' && Name[1] == 'P') // AVX packed integer >> + return true; >> + >> + return false; >> +} >> + >> +bool X86InstrInfo::isVectorOperand(const MachineInstr &MI, >> + const MachineOperand *MO) const { >> + // Handle special cases here. These are for mixed vector/scalar >> + // instructions. >> + if (MO->getType() != MachineOperand::MO_Register >> + && MO->getType() != MachineOperand::MO_FrameIndex >> + && MO->getType() != MachineOperand::MO_ExternalSymbol >> + && MO->getType() != MachineOperand::MO_GlobalAddress) >> + return false; >> + >> + // Operands that are part of memory addresses are never vector. 
>> + // Come Larrabee, we will need to handle vector address operands so >> + // this will get more complicated. >> + for (unsigned OpNum = 0; OpNum < MI.getNumOperands(); ++OpNum) { >> + if (&MI.getOperand(OpNum) == MO) { >> + switch(MI.getOpcode()) { >> + case X86::EXTRACTPSmr: >> + case X86::EXTRACTPSrr: >> + return OpNum == MI.getNumExplicitOperands() - 1; >> + case X86::INSERTPSrm: >> + case X86::INSERTPSrr: >> + return OpNum == 0; >> + case X86::MOVDDUPrm: >> + case X86::MOVDDUPrr: >> + return OpNum == 0; >> + case X86::MOVHPDmr: >> + return OpNum == MI.getNumExplicitOperands() - 1; >> + case X86::MOVHPDrm: >> + // Address operands are never vector. >> + return false; >> + case X86::MOVLPDmr: >> + return OpNum == MI.getNumExplicitOperands() - 1; >> + case X86::MOVLPDrr: >> + case X86::MOVLPDrm: >> + return OpNum == 0; >> + case X86::MOVMSKPDrr: >> + case X86::MOVMSKPSrr: >> + return OpNum == 1; >> + case X86::PBLENDVBrr0: >> + case X86::PBLENDVBrm0: >> + return !(MO->isReg() && MO->isImplicit()); >> + case X86::PCMPESTRIrr: >> + case X86::PCMPESTRIrm: >> + case X86::PCMPESTRIArr: >> + case X86::PCMPESTRIArm: >> + case X86::PCMPESTRICrr: >> + case X86::PCMPESTRICrm: >> + case X86::PCMPESTRIOrr: >> + case X86::PCMPESTRIOrm: >> + case X86::PCMPESTRISrr: >> + case X86::PCMPESTRISrm: >> + case X86::PCMPESTRIZrr: >> + case X86::PCMPESTRIZrm: >> + case X86::PCMPESTRM128MEM: >> + case X86::PCMPESTRM128REG: >> + case X86::PCMPESTRM128rr: >> + case X86::PCMPESTRM128rm: >> + case X86::PCMPISTRIrr: >> + case X86::PCMPISTRIrm: >> + case X86::PCMPISTRIArr: >> + case X86::PCMPISTRIArm: >> + case X86::PCMPISTRICrr: >> + case X86::PCMPISTRICrm: >> + case X86::PCMPISTRIOrr: >> + case X86::PCMPISTRIOrm: >> + case X86::PCMPISTRISrr: >> + case X86::PCMPISTRISrm: >> + case X86::PCMPISTRIZrr: >> + case X86::PCMPISTRIZrm: >> + case X86::PCMPISTRM128MEM: >> + case X86::PCMPISTRM128REG: >> + case X86::PCMPISTRM128rr: >> + case X86::PCMPISTRM128rm: >> + return !(MO->isReg() && 
MO->isImplicit()); >> + case X86::PEXTRBrr: >> + case X86::MMX_PEXTRWri: >> + case X86::PEXTRWri: >> + case X86::PEXTRDrr: >> + case X86::PEXTRQrr: >> + case X86::PEXTRBmr: >> + case X86::PEXTRWmr: >> + case X86::PEXTRDmr: >> + case X86::PEXTRQmr: >> + // Account for the immediate operand. >> + return OpNum == MI.getNumExplicitOperands() - 2; >> + case X86::PINSRBrr: >> + case X86::PINSRBrm: >> + case X86::MMX_PINSRWrri: >> + case X86::PINSRWrri: >> + case X86::MMX_PINSRWrmi: >> + case X86::PINSRWrmi: >> + case X86::PINSRDrr: >> + case X86::PINSRDrm: >> + case X86::PINSRQrr: >> + case X86::PINSRQrm: >> + return OpNum == 0; >> + case X86::PMOVMSKBrr: >> + return OpNum == 1; >> + case X86::MMX_PSLLWrr: >> + case X86::MMX_PSLLWri: >> + case X86::MMX_PSLLWrm: >> + case X86::PSLLWrr: >> + case X86::PSLLWri: >> + case X86::PSLLWrm: >> + case X86::MMX_PSLLDrr: >> + case X86::MMX_PSLLDri: >> + case X86::MMX_PSLLDrm: >> + case X86::PSLLDrr: >> + case X86::PSLLDri: >> + case X86::PSLLDrm: >> + case X86::MMX_PSLLQrr: >> + case X86::MMX_PSLLQri: >> + case X86::MMX_PSLLQrm: >> + case X86::PSLLQrr: >> + case X86::PSLLQri: >> + case X86::PSLLQrm: >> + case X86::MMX_PSRAWrr: >> + case X86::MMX_PSRAWri: >> + case X86::MMX_PSRAWrm: >> + case X86::PSRAWrr: >> + case X86::PSRAWri: >> + case X86::PSRAWrm: >> + case X86::MMX_PSRADrr: >> + case X86::MMX_PSRADri: >> + case X86::MMX_PSRADrm: >> + case X86::PSRADrr: >> + case X86::PSRADri: >> + case X86::PSRADrm: >> + case X86::MMX_PSRLWrr: >> + case X86::MMX_PSRLWri: >> + case X86::MMX_PSRLWrm: >> + case X86::PSRLWrr: >> + case X86::PSRLWri: >> + case X86::PSRLWrm: >> + case X86::MMX_PSRLDrr: >> + case X86::MMX_PSRLDri: >> + case X86::MMX_PSRLDrm: >> + case X86::PSRLDrr: >> + case X86::PSRLDri: >> + case X86::PSRLDrm: >> + case X86::MMX_PSRLQrr: >> + case X86::MMX_PSRLQri: >> + case X86::MMX_PSRLQrm: >> + case X86::PSRLQrr: >> + case X86::PSRLQri: >> + case X86::PSRLQrm: >> + return OpNum == 0; >> + case X86::PTESTrr: >> + case 
X86::PTESTrm: >> + return !(MO->isReg() && MO->isImplicit()); >> + case X86::UNPCKLPDrr: >> + case X86::UNPCKLPDrm: >> + return OpNum == 0; >> + } >> + return isVectorInstr(MI); >> + } >> + } >> + >> + assert(0 && "Did not find operand in instruction!"); >> + >> + return false; >> +} >> + >> +bool X86InstrInfo::isVectorOperand(const MachineInstr &MI, >> + const MachineMemOperand *MMO) const { >> + bool found = false; >> + for (MachineInstr::mmo_iterator m = MI.memoperands_begin(), >> + mend = MI.memoperands_end(); >> + m != mend; >> + ++m) { >> + if (*m == MMO) >> + found = true; >> + } >> + >> + if (!found) >> + assert(0 && "Wrong machine mem operands for instruction!"); >> + >> + // Handle special cases here. These are for mixed vector/scalar >> + // instructions. >> + switch(MI.getOpcode()) { >> + case X86::EXTRACTPSrr: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::EXTRACTPSmr: >> + assert(MMO->isStore() && "Wrong machine mem operand for >> instruction!"); + return false; >> + case X86::INSERTPSrr: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::INSERTPSrm: >> + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::MOVDDUPrr: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::MOVDDUPrm: >> + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::MOVHPDmr: >> + assert(MMO->isStore() && "Wrong machine mem operand for >> instruction!"); + return false; >> + case X86::MOVHPDrm: >> + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::MOVLPDmr: >> + assert(MMO->isStore() && "Wrong machine mem operand for >> instruction!"); + return false; >> + case X86::MOVLPDrr: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::MOVLPDrm: 
>> + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::MOVMSKPDrr: >> + case X86::MOVMSKPSrr: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::PBLENDVBrr0: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::PBLENDVBrm0: >> + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); >> + return true; >> + case X86::PCMPESTRIrm: >> + case X86::PCMPESTRIArm: >> + case X86::PCMPESTRICrm: >> + case X86::PCMPESTRIOrm: >> + case X86::PCMPESTRISrm: >> + case X86::PCMPESTRIZrm: >> + case X86::PCMPESTRM128MEM: >> + case X86::PCMPESTRM128rm: >> + case X86::PCMPISTRIrm: >> + case X86::PCMPISTRIArm: >> + case X86::PCMPISTRICrm: >> + case X86::PCMPISTRIOrm: >> + case X86::PCMPISTRISrm: >> + case X86::PCMPISTRIZrm: >> + case X86::PCMPISTRM128MEM: >> + case X86::PCMPISTRM128rm: >> + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::PCMPESTRIrr: >> + case X86::PCMPESTRIArr: >> + case X86::PCMPESTRICrr: >> + case X86::PCMPESTRIOrr: >> + case X86::PCMPESTRISrr: >> + case X86::PCMPESTRIZrr: >> + case X86::PCMPESTRM128REG: >> + case X86::PCMPESTRM128rr: >> + case X86::PCMPISTRIrr: >> + case X86::PCMPISTRIArr: >> + case X86::PCMPISTRICrr: >> + case X86::PCMPISTRIOrr: >> + case X86::PCMPISTRISrr: >> + case X86::PCMPISTRIZrr: >> + case X86::PCMPISTRM128REG: >> + case X86::PCMPISTRM128rr: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::PEXTRBrr: >> + case X86::MMX_PEXTRWri: >> + case X86::PEXTRWri: >> + case X86::PEXTRDrr: >> + case X86::PEXTRQrr: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::PEXTRBmr: >> + case X86::PEXTRWmr: >> + case X86::PEXTRDmr: >> + case X86::PEXTRQmr: >> + assert(MMO->isStore() && "Wrong machine mem operand for >> instruction!"); + return false; >> + 
case X86::PINSRBrr: >> + case X86::MMX_PINSRWrri: >> + case X86::PINSRWrri: >> + case X86::PINSRDrr: >> + case X86::PINSRQrr: >> + assert(MMO->isStore() && "Wrong machine mem operand for >> instruction!"); + return false; >> + case X86::PINSRBrm: >> + case X86::MMX_PINSRWrmi: >> + case X86::PINSRWrmi: >> + case X86::PINSRDrm: >> + case X86::PINSRQrm: >> + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::PMOVMSKBrr: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::MMX_PSLLWrm: >> + case X86::PSLLWrm: >> + case X86::MMX_PSLLDrm: >> + case X86::PSLLDrm: >> + case X86::MMX_PSLLQrm: >> + case X86::PSLLQrm: >> + case X86::MMX_PSRAWrm: >> + case X86::PSRAWrm: >> + case X86::MMX_PSRADrm: >> + case X86::PSRADrm: >> + case X86::MMX_PSRLWrm: >> + case X86::PSRLWrm: >> + case X86::MMX_PSRLDrm: >> + case X86::PSRLDrm: >> + case X86::MMX_PSRLQrm: >> + case X86::PSRLQrm: >> + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::MMX_PSLLWrr: >> + case X86::MMX_PSLLWri: >> + case X86::PSLLWrr: >> + case X86::PSLLWri: >> + case X86::MMX_PSLLDrr: >> + case X86::MMX_PSLLDri: >> + case X86::PSLLDrr: >> + case X86::PSLLDri: >> + case X86::MMX_PSLLQrr: >> + case X86::MMX_PSLLQri: >> + case X86::PSLLQrr: >> + case X86::PSLLQri: >> + case X86::MMX_PSRAWrr: >> + case X86::MMX_PSRAWri: >> + case X86::PSRAWrr: >> + case X86::PSRAWri: >> + case X86::MMX_PSRADrr: >> + case X86::MMX_PSRADri: >> + case X86::PSRADrr: >> + case X86::PSRADri: >> + case X86::MMX_PSRLWrr: >> + case X86::MMX_PSRLWri: >> + case X86::PSRLWrr: >> + case X86::PSRLWri: >> + case X86::MMX_PSRLDrr: >> + case X86::MMX_PSRLDri: >> + case X86::PSRLDrr: >> + case X86::PSRLDri: >> + case X86::MMX_PSRLQrr: >> + case X86::MMX_PSRLQri: >> + case X86::PSRLQrr: >> + case X86::PSRLQri: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case 
X86::PTESTrr: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::PTESTrm: >> + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::UNPCKLPDrr: >> + assert(0 && "Wrong machine mem operand for instruction!"); >> + return false; >> + case X86::UNPCKLPDrm: >> + assert(MMO->isLoad() && "Wrong machine mem operand for instruction!"); >> + return false; >> + } >> + >> + return isVectorInstr(MI); >> +} >> + >> /// isFrameOperand - Return true and the FrameIndex if the specified >> /// operand and follow operands form a reference to the stack frame. >> bool X86InstrInfo::isFrameOperand(const MachineInstr *MI, unsigned int Op, >> @@ -783,12 +1171,14 @@ >> if ((Reg = isLoadFromStackSlot(MI, FrameIndex))) >> return Reg; >> // Check for post-frame index elimination operations >> - return hasLoadFromStackSlot(MI, FrameIndex); >> + const MachineMemOperand *Dummy; >> + return hasLoadFromStackSlot(MI, Dummy, FrameIndex); >> } >> return 0; >> } >> >> bool X86InstrInfo::hasLoadFromStackSlot(const MachineInstr *MI, >> + const MachineMemOperand *&MMO, >> int &FrameIndex) const { >> for (MachineInstr::mmo_iterator o = MI->memoperands_begin(), >> oe = MI->memoperands_end(); >> @@ -798,6 +1188,7 @@ >> if (const FixedStackPseudoSourceValue *Value = >> dyn_cast((*o)->getValue())) { >> FrameIndex = Value->getFrameIndex(); >> + MMO = *o; >> return true; >> } >> } >> @@ -819,12 +1210,14 @@ >> if ((Reg = isStoreToStackSlot(MI, FrameIndex))) >> return Reg; >> // Check for post-frame index elimination operations >> - return hasStoreToStackSlot(MI, FrameIndex); >> + const MachineMemOperand *Dummy; >> + return hasStoreToStackSlot(MI, Dummy, FrameIndex); >> } >> return 0; >> } >> >> bool X86InstrInfo::hasStoreToStackSlot(const MachineInstr *MI, >> + const MachineMemOperand *&MMO, >> int &FrameIndex) const { >> for (MachineInstr::mmo_iterator o = MI->memoperands_begin(), >> oe = MI->memoperands_end(); 
>> @@ -834,6 +1227,7 @@ >> if (const FixedStackPseudoSourceValue *Value = >> dyn_cast((*o)->getValue())) { >> FrameIndex = Value->getFrameIndex(); >> + MMO = *o; >> return true; >> } >> } >> Index: lib/Target/X86/X86InstrInfo.h >> =================================================================== >> --- lib/Target/X86/X86InstrInfo.h (revision 89484) >> +++ lib/Target/X86/X86InstrInfo.h (working copy) >> @@ -448,6 +448,17 @@ >> unsigned &SrcReg, unsigned &DstReg, >> unsigned &SrcSubIdx, unsigned &DstSubIdx) >> const; >> >> + /// isVectorInstr - Return true if the instruction is a vector >> operation. + virtual bool isVectorInstr(const MachineInstr& MI) const; >> + >> + /// isVectorOperand - Return true if the operand is of vector type.. >> + virtual bool isVectorOperand(const MachineInstr& MI, >> + const MachineOperand *MO) const; >> + >> + /// isVectorOperand - Return true if the mem operand is of vector type.. >> + virtual bool isVectorOperand(const MachineInstr& MI, >> + const MachineMemOperand *MMO) const; >> + >> unsigned isLoadFromStackSlot(const MachineInstr *MI, int &FrameIndex) >> const; >> /// isLoadFromStackSlotPostFE - Check for post-frame ptr elimination >> /// stack locations as well. This uses a heuristic so it isn't >> @@ -457,11 +468,14 @@ >> >> /// hasLoadFromStackSlot - If the specified machine instruction has >> /// a load from a stack slot, return true along with the FrameIndex >> - /// of the loaded stack slot. If not, return false. Unlike >> + /// of the loaded stack slot and the machine mem operand containing >> + /// the reference. If not, return false. Unlike >> /// isLoadFromStackSlot, this returns true for any instructions that >> /// loads from the stack. This is a hint only and may not catch all >> /// cases. 
>> - bool hasLoadFromStackSlot(const MachineInstr *MI, int &FrameIndex) >> const; + bool hasLoadFromStackSlot(const MachineInstr *MI, >> + const MachineMemOperand *&MMO, >> + int &FrameIndex) const; >> >> unsigned isStoreToStackSlot(const MachineInstr *MI, int &FrameIndex) >> const; /// isStoreToStackSlotPostFE - Check for post-frame ptr elimination >> @@ -472,11 +486,13 @@ >> >> /// hasStoreToStackSlot - If the specified machine instruction has a >> /// store to a stack slot, return true along with the FrameIndex of >> - /// the loaded stack slot. If not, return false. Unlike >> - /// isStoreToStackSlot, this returns true for any instructions that >> - /// loads from the stack. This is a hint only and may not catch all >> - /// cases. >> - bool hasStoreToStackSlot(const MachineInstr *MI, int &FrameIndex) const; >> + /// the loaded stack slot and the machine mem operand containing the >> + /// reference. If not, return false. Unlike isStoreToStackSlot, >> + /// this returns true for any instructions that loads from the >> + /// stack. This is a hint only and may not catch all cases. 
>> + bool hasStoreToStackSlot(const MachineInstr *MI, >> + const MachineMemOperand *&MMO, >> + int &FrameIndex) const; >> >> bool isReallyTriviallyReMaterializable(const MachineInstr *MI, >> AliasAnalysis *AA) const; >> Index: test/CodeGen/X86/2009-11-20-VectorSpillComments.ll >> =================================================================== >> --- test/CodeGen/X86/2009-11-20-VectorSpillComments.ll (revision 0) >> +++ test/CodeGen/X86/2009-11-20-VectorSpillComments.ll (revision 0) >> @@ -0,0 +1,19 @@ >> +; RUN: llc < %s -march=x86-64 | FileCheck %s >> +; CHECK: Vector Spill >> +; CHECK: Vector Reload >> +; CHECK: Vector Folded Reload >> +; CHECK: Scalar Spill >> +; CHECK: Scalar Folded Reload >> + >> +define <8 x i32> @foo(<8 x i32> %t, <8 x i32> %u) { >> + %m = srem <8 x i32> %t, %u >> + ret <8 x i32> %m >> +} >> +define <8 x i32> @bar(<8 x i32> %t, <8 x i32> %u) { >> + %m = urem <8 x i32> %t, %u >> + ret <8 x i32> %m >> +} >> +define <8 x float> @qux(<8 x float> %t, <8 x float> %u) { >> + %m = frem <8 x float> %t, %u >> + ret <8 x float> %m >> +} > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From bob.wilson at apple.com Mon Nov 23 13:40:12 2009 From: bob.wilson at apple.com (Bob Wilson) Date: Mon, 23 Nov 2009 11:40:12 -0800 Subject: [llvm-commits] [llvm] r89684 - in /llvm/trunk/lib/Target/ARM: ARMInstrFormats.td ARMInstrNEON.td In-Reply-To: <200911231816.nANIGGiG001437@zion.cs.uiuc.edu> References: <200911231816.nANIGGiG001437@zion.cs.uiuc.edu> Message-ID: <23F04D9C-4DC5-49FC-8EC0-D7E3A508074D@apple.com> Very nice! Thanks for doing this. Can you do the same thing for r84572 (remove N3VImm) and r84730 (remove N2VDup and associated changes)? You might also consider changing the shift-operand and N2VImm portions of r84730, but I'll leave that to your discretion. 
On Nov 23, 2009, at 10:16 AM, Johnny Chen wrote: > Author: johnny > Date: Mon Nov 23 12:16:16 2009 > New Revision: 89684 > > URL: http://llvm.org/viewvc/llvm-project?rev=89684&view=rev > Log: > Partially revert r89377 by removing NLdStLN class definition from > ARMInstrFormats.td and fixing VLD[234]LN* and VST[234]LN* to derive from NLdSt > instead of NLdStLN. > > Modified: > llvm/trunk/lib/Target/ARM/ARMInstrFormats.td > llvm/trunk/lib/Target/ARM/ARMInstrNEON.td > > Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrFormats.td?rev=89684&r1=89683&r2=89684&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMInstrFormats.td (original) > +++ llvm/trunk/lib/Target/ARM/ARMInstrFormats.td Mon Nov 23 12:16:16 2009 > @@ -1248,17 +1248,6 @@ > let Inst{7-4} = op7_4; > } > > -// With selective bit(s) from op7_4 specified by subclasses. > -class NLdStLN op21_20, bits<4> op11_8, > - dag oops, dag iops, InstrItinClass itin, > - string opc, string asm, string cstr, list pattern> > - : NeonI { > - let Inst{31-24} = 0b11110100; > - let Inst{23} = op23; > - let Inst{21-20} = op21_20; > - let Inst{11-8} = op11_8; > -} > - > class NDataI string opc, string asm, string cstr, list pattern> > : NeonI > Modified: llvm/trunk/lib/Target/ARM/ARMInstrNEON.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrNEON.td?rev=89684&r1=89683&r2=89684&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMInstrNEON.td (original) > +++ llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Mon Nov 23 12:16:16 2009 > @@ -280,11 +280,11 @@ > > // VLD2LN : Vector Load (single 2-element structure to one lane) > class VLD2LN op11_8, string OpcodeStr> > - : NLdStLN<1,0b10,op11_8, (outs DPR:$dst1, DPR:$dst2), > - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, 
nohash_imm:$lane), > - IIC_VLD2, > - OpcodeStr, "\t\\{$dst1[$lane],$dst2[$lane]\\}, $addr", > - "$src1 = $dst1, $src2 = $dst2", []>; > + : NLdSt<1,0b10,op11_8,{?,?,?,?}, (outs DPR:$dst1, DPR:$dst2), > + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), > + IIC_VLD2, > + OpcodeStr, "\t\\{$dst1[$lane],$dst2[$lane]\\}, $addr", > + "$src1 = $dst1, $src2 = $dst2", []>; > > // vld2 to single-spaced registers. > def VLD2LNd8 : VLD2LN<0b0001, "vld2.8">; > @@ -313,12 +313,12 @@ > > // VLD3LN : Vector Load (single 3-element structure to one lane) > class VLD3LN op11_8, string OpcodeStr> > - : NLdStLN<1,0b10,op11_8, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3), > - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, > - nohash_imm:$lane), IIC_VLD3, > - OpcodeStr, > - "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane]\\}, $addr", > - "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3", []>; > + : NLdSt<1,0b10,op11_8,{?,?,?,?}, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3), > + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, > + nohash_imm:$lane), IIC_VLD3, > + OpcodeStr, > + "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane]\\}, $addr", > + "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3", []>; > > // vld3 to single-spaced registers. 
> def VLD3LNd8 : VLD3LN<0b0010, "vld3.8"> { > @@ -349,13 +349,13 @@ > > // VLD4LN : Vector Load (single 4-element structure to one lane) > class VLD4LN op11_8, string OpcodeStr> > - : NLdStLN<1,0b10,op11_8, > - (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), > - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, > - nohash_imm:$lane), IIC_VLD4, > - OpcodeStr, > - "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane],$dst4[$lane]\\}, $addr", > - "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3, $src4 = $dst4", []>; > + : NLdSt<1,0b10,op11_8,{?,?,?,?}, > + (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), > + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, > + nohash_imm:$lane), IIC_VLD4, > + OpcodeStr, > + "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane],$dst4[$lane]\\}, $addr", > + "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3, $src4 = $dst4", []>; > > // vld4 to single-spaced registers. > def VLD4LNd8 : VLD4LN<0b0011, "vld4.8">; > @@ -504,11 +504,11 @@ > > // VST2LN : Vector Store (single 2-element structure from one lane) > class VST2LN op11_8, string OpcodeStr> > - : NLdStLN<1,0b00,op11_8, (outs), > - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), > - IIC_VST, > - OpcodeStr, "\t\\{$src1[$lane],$src2[$lane]\\}, $addr", > - "", []>; > + : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), > + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), > + IIC_VST, > + OpcodeStr, "\t\\{$src1[$lane],$src2[$lane]\\}, $addr", > + "", []>; > > // vst2 to single-spaced registers. 
> def VST2LNd8 : VST2LN<0b0001, "vst2.8">; > @@ -537,11 +537,11 @@ > > // VST3LN : Vector Store (single 3-element structure from one lane) > class VST3LN op11_8, string OpcodeStr> > - : NLdStLN<1,0b00,op11_8, (outs), > - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, > - nohash_imm:$lane), IIC_VST, > - OpcodeStr, > - "\t\\{$src1[$lane],$src2[$lane],$src3[$lane]\\}, $addr", "", []>; > + : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), > + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, > + nohash_imm:$lane), IIC_VST, > + OpcodeStr, > + "\t\\{$src1[$lane],$src2[$lane],$src3[$lane]\\}, $addr", "", []>; > > // vst3 to single-spaced registers. > def VST3LNd8 : VST3LN<0b0010, "vst3.8"> { > @@ -572,12 +572,12 @@ > > // VST4LN : Vector Store (single 4-element structure from one lane) > class VST4LN op11_8, string OpcodeStr> > - : NLdStLN<1,0b00,op11_8, (outs), > - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, > - nohash_imm:$lane), IIC_VST, > - OpcodeStr, > - "\t\\{$src1[$lane],$src2[$lane],$src3[$lane],$src4[$lane]\\}, $addr", > - "", []>; > + : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), > + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, > + nohash_imm:$lane), IIC_VST, > + OpcodeStr, > + "\t\\{$src1[$lane],$src2[$lane],$src3[$lane],$src4[$lane]\\}, $addr", > + "", []>; > > // vst4 to single-spaced registers. 
> def VST4LNd8 : VST4LN<0b0011, "vst4.8">; > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From johnny.chen at apple.com Mon Nov 23 14:09:13 2009 From: johnny.chen at apple.com (Johnny Chen) Date: Mon, 23 Nov 2009 20:09:13 -0000 Subject: [llvm-commits] [llvm] r89693 - in /llvm/trunk/lib/Target/ARM: ARMInstrFormats.td ARMInstrNEON.td Message-ID: <200911232009.nANK9DaO005678@zion.cs.uiuc.edu> Author: johnny Date: Mon Nov 23 14:09:13 2009 New Revision: 89693 URL: http://llvm.org/viewvc/llvm-project?rev=89693&view=rev Log: Revert r84572 by removing N3VImm from ARMInstrFormats.td now that we can specify {?,?,?,?} as op11_8 for VEXTd and VEXTq. Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrFormats.td?rev=89693&r1=89692&r2=89693&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrFormats.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrFormats.td Mon Nov 23 14:09:13 2009 @@ -1324,20 +1324,6 @@ let Inst{4} = op4; } -// NEON 3 vector register with immediate. This is only used for VEXT where -// op11_8 represents the starting byte index of the extracted result in the -// concatenation of the operands and is left unspecified. -class N3VImm op21_20, bit op6, bit op4, - dag oops, dag iops, InstrItinClass itin, - string opc, string asm, string cstr, list pattern> - : NDataI { - let Inst{24} = op24; - let Inst{23} = op23; - let Inst{21-20} = op21_20; - let Inst{6} = op6; - let Inst{4} = op4; -} - // NEON VMOVs between scalar and core registers. 
class NVLaneOp opcod1, bits<4> opcod2, bits<2> opcod3, dag oops, dag iops, Format f, InstrItinClass itin, Modified: llvm/trunk/lib/Target/ARM/ARMInstrNEON.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrNEON.td?rev=89693&r1=89692&r2=89693&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrNEON.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Mon Nov 23 14:09:13 2009 @@ -2864,18 +2864,18 @@ // VEXT : Vector Extract class VEXTd - : N3VImm<0,1,0b11,0,0, (outs DPR:$dst), - (ins DPR:$lhs, DPR:$rhs, i32imm:$index), IIC_VEXTD, - OpcodeStr, "\t$dst, $lhs, $rhs, $index", "", - [(set DPR:$dst, (Ty (NEONvext (Ty DPR:$lhs), - (Ty DPR:$rhs), imm:$index)))]>; + : N3V<0,1,0b11,{?,?,?,?},0,0, (outs DPR:$dst), + (ins DPR:$lhs, DPR:$rhs, i32imm:$index), IIC_VEXTD, + OpcodeStr, "\t$dst, $lhs, $rhs, $index", "", + [(set DPR:$dst, (Ty (NEONvext (Ty DPR:$lhs), + (Ty DPR:$rhs), imm:$index)))]>; class VEXTq - : N3VImm<0,1,0b11,1,0, (outs QPR:$dst), - (ins QPR:$lhs, QPR:$rhs, i32imm:$index), IIC_VEXTQ, - OpcodeStr, "\t$dst, $lhs, $rhs, $index", "", - [(set QPR:$dst, (Ty (NEONvext (Ty QPR:$lhs), - (Ty QPR:$rhs), imm:$index)))]>; + : N3V<0,1,0b11,{?,?,?,?},1,0, (outs QPR:$dst), + (ins QPR:$lhs, QPR:$rhs, i32imm:$index), IIC_VEXTQ, + OpcodeStr, "\t$dst, $lhs, $rhs, $index", "", + [(set QPR:$dst, (Ty (NEONvext (Ty QPR:$lhs), + (Ty QPR:$rhs), imm:$index)))]>; def VEXTd8 : VEXTd<"vext.8", v8i8>; def VEXTd16 : VEXTd<"vext.16", v4i16>; From dag at cray.com Mon Nov 23 14:11:29 2009 From: dag at cray.com (David Greene) Date: Mon, 23 Nov 2009 14:11:29 -0600 Subject: [llvm-commits] [PATCH] More Spill Annotations In-Reply-To: <9895559A-187A-447E-945A-FEFEA37EFF43@apple.com> References: <200911201622.37994.dag@cray.com> <200911230901.47371.dag@cray.com> <9895559A-187A-447E-945A-FEFEA37EFF43@apple.com> Message-ID: <200911231411.29627.dag@cray.com> On Monday 23 November 2009 13:39, Evan 
Cheng wrote:

> David, this really is not a good idea. You are adding more target hooks
> purely for asm printing comments. These hooks return instruction properties
> that should be static so they should be moved to td files.

Ok, that's reasonable. There are a number of target hooks that should be processed by tblgen, then. For example, isMoveInstr, GetCondBranchFromCond and sizeOfImm.

To do isVectorInstr and isVectorOperandInstr will require some additional flags in the .td files, I think. Is that ok? I don't want to do a whole bunch of work to find out later that there's a better way.

-Dave

From vkutuzov at accesssoftek.com Mon Nov 23 14:20:04 2009
From: vkutuzov at accesssoftek.com (Viktor Kutuzov)
Date: Mon, 23 Nov 2009 12:20:04 -0800
Subject: [llvm-commits] [PATCH] LTO code generator options
References: <04F6B1512E264B27AEE607542FCDD113@andreic6e7fe55> <38a0d8450911141921r7deec8b3k91e06950263f0485@mail.gmail.com> <6AE1604EE3EC5F4296C096518C6B77EEFD4607BF@mail.accesssoftek.com> <38a0d8450911170809j6a32716ar840b8622cfad6f17@mail.gmail.com> <6AE1604EE3EC5F4296C096518C6B77EEFD4607C4@mail.accesssoftek.com> <38a0d8450911180722j5a463fa8hec81178154deaf09@mail.gmail.com> <41BA1AA405BC4D19BA9B4FAB6543D62F@andreic6e7fe55> <38a0d8450911190723g644ad4c7ife769ab35da9efb9@mail.gmail.com> <38a0d8450911200722i5efa690ci6ab671d71b5f40dc@mail.gmail.com>
Message-ID:

Hello Rafael,

I have removed the hasFeature method for now since it triggers those kinds of questions. Later I may add it along with the clear use case, but for now the result of doing AddFeature("foo", true); AddFeature("foo", false) remains unchanged and is up to the TargetMachine implementation (the last one wins).

Thanks a lot for reviewing the patch. It is committed as http://llvm.org/viewvc/llvm-project?rev=89516&view=rev

Now everything is ready for the target triple overriding. Please find the patch attached.
This patch allows setting target triple (-mtriple) as a command line argument as well as -mcpu, and -mattr. Best regards, Viktor ----- Original Message ----- From: "Rafael Espindola" To: "Viktor Kutuzov" Cc: "Commit Messages and Patches for LLVM" Sent: Friday, November 20, 2009 7:22 AM Subject: Re: [llvm-commits] [PATCH] LTO code generator options >> StringRef::split would do. >> The issue here is: SubtargetFeatures class uses std::strings and has a >> helper method Split which is used in few other places. >> I'm trying to follow this. I don't like the idea to use StringRef::split >> only in this particular place. >> However, I wouldn't mind to re-factor the SubtargetFeatures class to use >> StringRef instead of std::strings and prepare another patch. Shell I do so? > > You don't need to. If Split is used elsewhere I agree it is better to > use it in here too. > >>> You say features are normalized, why does hasFeature needs to convert >>> to lowercase and strip flags? >> >> Features are normalized inside the SubtargetFeatures class. hasFeature gets >> a string which also should be normalized to compare with the already >> normalized features. >> We need to strip the flag because we do not care was this feature set on or >> off, we simply want to know if it was set or not. >> But if you have the question, I'll better add a comment there to clarify >> this. How about something like this: "Normalize the given string to compare >> with the normalized features, and strip the flag since we are checking was >> this feature set at all without checking was it set on or off."? > > I think I am still missing something. First on the flag, what is the > expected result of doing AddFeature("foo", true); AddFeature("foo", > false). The patch implements a "first one wins". Is that correct? 
> Assuming that is the desired behavior I understand the calls to > StripFlag, but the call to LowercaseString still looks redundant: > > *) hasFeature is called only from AddFeature > *) the string passed to it was generated with > PrependFlag(LowercaseString(String), IsEnabled) > > A comment explaining this should do. > >>> The comment about "after all explicit feature settings" is a future >>> reference, right? I don't see any feature being set :-) >> >> You are right. This is a future reference. Next patch will add explicit >> settings of cpu and attributes. We just need to get there. :) >> >>> The method SubtargetFeatures::AddFeatures(const cl::list >>> &List) is not being used, is it? >> >> I didn't find a way to add the usage in this patch without adding a lot of >> other changes to get it work. This will be in the next patch and the usage >> will look like this in the LTOCodeGenerator.cpp file: >> >> ... >> +static cl::list MAttrs("mattr", >> + cl::CommaSeparated, >> + cl::desc("Target specific attributes (see -mattr=help for details)"), >> + cl::value_desc("a1,+a2,-a3,...")); >> + >> ... >> >> bool LTOCodeGenerator::determineTarget(std::string& errMsg) >> ... >> + // Prepare subtarget feature set for the given command line >> options. >> + SubtargetFeatures features; >> ... >> + if (!MAttrs.empty()) >> + features.AddFeatures(MAttrs); >> + >> + // Set the rest of features by default. >> + // Note: Please keep this after all explict feature settings to >> make sure >> + // defaults will not override explicitly set options. >> + >> features.AddFeatures(SubtargetFeatures::getDefaultSubTargetTripleFeatures(_targetTriple)); >> ... >> >> Which also fill that "keep this after all explict feature settings" comment >> with meaning. >> >> Is it Ok to ssubmit this patch? > > OK with the comment explaining the addFeature, hasFeature behavior. 
>
>> Cheers,
>> Viktor
>
>
> Cheers,
> --
> Rafael Ávila de Espíndola
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-lto-codegen-target_triple_override.diff
Type: application/octet-stream
Size: 9382 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091123/3cf04ae3/attachment.obj

From vkutuzov at accesssoftek.com Mon Nov 23 14:26:03 2009
From: vkutuzov at accesssoftek.com (Viktor Kutuzov)
Date: Mon, 23 Nov 2009 12:26:03 -0800
Subject: [llvm-commits] [PATCH] LTO code generator options
References: <04F6B1512E264B27AEE607542FCDD113@andreic6e7fe55> <6AE1604EE3EC5F4296C096518C6B77EEFD4607BF@mail.accesssoftek.com> <38a0d8450911170809j6a32716ar840b8622cfad6f17@mail.gmail.com> <6AE1604EE3EC5F4296C096518C6B77EEFD4607C4@mail.accesssoftek.com> <38a0d8450911180722j5a463fa8hec81178154deaf09@mail.gmail.com> <41BA1AA405BC4D19BA9B4FAB6543D62F@andreic6e7fe55> <38a0d8450911190723g644ad4c7ife769ab35da9efb9@mail.gmail.com> <38a0d8450911200722i5efa690ci6ab671d71b5f40dc@mail.gmail.com> <6a8523d60911221008t431f5662v369a3ebd910a8696@mail.gmail.com>
Message-ID: <9E5250E11500450492F0683916275B7D@andreic6e7fe55>

> Please!

Will do. Right after the LTO code generator options are finished. There are a few more patches to go (one has just been submitted for review and 2 are pending).

-Viktor

----- Original Message -----
From: "Daniel Dunbar"
To: "Rafael Espindola"
Cc: "Viktor Kutuzov" ; "Commit Messages and Patches for LLVM"
Sent: Sunday, November 22, 2009 10:08 AM
Subject: Re: [llvm-commits] [PATCH] LTO code generator options

On Fri, Nov 20, 2009 at 7:22 AM, Rafael Espindola wrote:
>> StringRef::split would do.
>> The issue here is: SubtargetFeatures class uses std::strings and has a
>> helper method Split which is used in few other places.
>> I'm trying to follow this. I don't like the idea to use StringRef::split
>> only in this particular place.
>> However, I wouldn't mind to re-factor the SubtargetFeatures class to use >> StringRef instead of std::strings and prepare another patch. Shell I do so? Please! > You don't need to. If Split is used elsewhere I agree it is better to > use it in here too. Uh, why? We should kill off duplicate code. - Daniel >>> You say features are normalized, why does hasFeature needs to convert >>> to lowercase and strip flags? >> >> Features are normalized inside the SubtargetFeatures class. hasFeature gets >> a string which also should be normalized to compare with the already >> normalized features. >> We need to strip the flag because we do not care was this feature set on or >> off, we simply want to know if it was set or not. >> But if you have the question, I'll better add a comment there to clarify >> this. How about something like this: "Normalize the given string to compare >> with the normalized features, and strip the flag since we are checking was >> this feature set at all without checking was it set on or off."? > > I think I am still missing something. First on the flag, what is the > expected result of doing AddFeature("foo", true); AddFeature("foo", > false). The patch implements a "first one wins". Is that correct? > Assuming that is the desired behavior I understand the calls to > StripFlag, but the call to LowercaseString still looks redundant: > > *) hasFeature is called only from AddFeature > *) the string passed to it was generated with > PrependFlag(LowercaseString(String), IsEnabled) > > A comment explaining this should do. > >>> The comment about "after all explicit feature settings" is a future >>> reference, right? I don't see any feature being set :-) >> >> You are right. This is a future reference. Next patch will add explicit >> settings of cpu and attributes. We just need to get there. :) >> >>> The method SubtargetFeatures::AddFeatures(const cl::list >>> &List) is not being used, is it? 
>> >> I didn't find a way to add the usage in this patch without adding a lot of >> other changes to get it work. This will be in the next patch and the usage >> will look like this in the LTOCodeGenerator.cpp file: >> >> ... >> +static cl::list MAttrs("mattr", >> + cl::CommaSeparated, >> + cl::desc("Target specific attributes (see -mattr=help for details)"), >> + cl::value_desc("a1,+a2,-a3,...")); >> + >> ... >> >> bool LTOCodeGenerator::determineTarget(std::string& errMsg) >> ... >> + // Prepare subtarget feature set for the given command line >> options. >> + SubtargetFeatures features; >> ... >> + if (!MAttrs.empty()) >> + features.AddFeatures(MAttrs); >> + >> + // Set the rest of features by default. >> + // Note: Please keep this after all explict feature settings to >> make sure >> + // defaults will not override explicitly set options. >> + >> features.AddFeatures(SubtargetFeatures::getDefaultSubTargetTripleFeatures(_targetTriple)); >> ... >> >> Which also fill that "keep this after all explict feature settings" comment >> with meaning. >> >> Is it Ok to ssubmit this patch? > > OK with the comment explaining the addFeature, hasFeature behavior. > >> Cheers, >> Viktor > > > Cheers, > -- > Rafael ?vila de Esp?ndola > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From grosbach at apple.com Mon Nov 23 14:35:56 2009 From: grosbach at apple.com (Jim Grosbach) Date: Mon, 23 Nov 2009 20:35:56 -0000 Subject: [llvm-commits] [llvm] r89694 - in /llvm/trunk/lib/Target/ARM: ARMInstrInfo.td ARMInstrThumb2.td Message-ID: <200911232035.nANKZuM2007007@zion.cs.uiuc.edu> Author: grosbach Date: Mon Nov 23 14:35:53 2009 New Revision: 89694 URL: http://llvm.org/viewvc/llvm-project?rev=89694&view=rev Log: fold immediate of a + Const into the user as a subtract if it can fit as a negated two-part immediate. 
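For readers following the r89694 log message above, here is a rough standalone sketch of the check it describes: an add of a constant can become two subtracts when the negated constant splits into two ARM-mode shifter immediates (an 8-bit value rotated right by an even amount). This is not the ARM_AM::isSOImmTwoPartVal code from the tree; the helper names below are hypothetical, the search is a plain brute force, and it does not model the Thumb2 immediate rules.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not the LLVM implementation.

// Rotate a 32-bit value left by n bits, n in [0, 31].
static uint32_t rotl32(uint32_t v, unsigned n) {
  return n == 0 ? v : (v << n) | (v >> (32 - n));
}

// Rotate a 32-bit value right by n bits, n in [0, 31].
static uint32_t rotr32(uint32_t v, unsigned n) {
  return n == 0 ? v : (v >> n) | (v << (32 - n));
}

// True if v is an 8-bit value rotated right by an even amount,
// i.e. encodable as a single ARM-mode shifter immediate.
static bool isSingleImm(uint32_t v) {
  for (unsigned rot = 0; rot < 32; rot += 2)
    if (rotl32(v, rot) <= 0xFFu)
      return true;
  return false;
}

// True if v needs exactly two such immediates: it is not encodable as one,
// but some even-rotated 8-bit window covers part of it and the remaining
// bits are themselves a single immediate.
static bool isTwoPartImm(uint32_t v) {
  if (isSingleImm(v))
    return false;
  for (unsigned rot = 0; rot < 32; rot += 2) {
    uint32_t window = rotr32(0xFFu, rot);
    if ((v & window) != 0 && isSingleImm(v & ~window))
      return true;
  }
  return false;
}

// Assumed reading of the new pattern: (add x, C) can be emitted as two
// subtracts when -C is a two-part immediate.
static bool canFoldAddAsTwoSubs(int32_t c) {
  return isTwoPartImm(0u - static_cast<uint32_t>(c));
}
```

Under these assumptions, adding -0x10001 qualifies, since the negated value 0x00010001 splits into 0x00010000 and 0x00000001, while adding +0x10001 does not go through this path because its negation 0xFFFEFFFF has too many set bits.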
Modified: llvm/trunk/lib/Target/ARM/ARMInstrInfo.td llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td Modified: llvm/trunk/lib/Target/ARM/ARMInstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrInfo.td?rev=89694&r1=89693&r2=89694&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrInfo.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrInfo.td Mon Nov 23 14:35:53 2009 @@ -284,6 +284,22 @@ return CurDAG->getTargetConstant(V, MVT::i32); }]>; +def so_neg_imm2part : Operand, PatLeaf<(imm), [{ + return ARM_AM::isSOImmTwoPartVal(-(int)N->getZExtValue()); + }]> { + let PrintMethod = "printSOImm2PartOperand"; +} + +def so_neg_imm2part_1 : SDNodeXFormgetZExtValue()); + return CurDAG->getTargetConstant(V, MVT::i32); +}]>; + +def so_neg_imm2part_2 : SDNodeXFormgetZExtValue()); + return CurDAG->getTargetConstant(V, MVT::i32); +}]>; + /// imm0_31 predicate - True if the 32-bit immediate is in the range [0,31]. def imm0_31 : Operand, PatLeaf<(imm), [{ return (int32_t)N->getZExtValue() < 32; @@ -1618,9 +1634,9 @@ def : ARMPat<(add GPR:$LHS, so_imm2part:$RHS), (ADDri (ADDri GPR:$LHS, (so_imm2part_1 imm:$RHS)), (so_imm2part_2 imm:$RHS))>; -def : ARMPat<(sub GPR:$LHS, so_imm2part:$RHS), - (SUBri (SUBri GPR:$LHS, (so_imm2part_1 imm:$RHS)), - (so_imm2part_2 imm:$RHS))>; +def : ARMPat<(add GPR:$LHS, so_neg_imm2part:$RHS), + (SUBri (SUBri GPR:$LHS, (so_neg_imm2part_1 imm:$RHS)), + (so_neg_imm2part_2 imm:$RHS))>; // 32-bit immediate using movw + movt. 
// This is a single pseudo instruction, the benefit is that it can be remat'd Modified: llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td?rev=89694&r1=89693&r2=89694&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td Mon Nov 23 14:35:53 2009 @@ -88,6 +88,21 @@ return CurDAG->getTargetConstant(V, MVT::i32); }]>; +def t2_so_neg_imm2part : Operand, PatLeaf<(imm), [{ + return ARM_AM::isT2SOImmTwoPartVal(-(int)N->getZExtValue()); + }]> { +} + +def t2_so_neg_imm2part_1 : SDNodeXFormgetZExtValue()); + return CurDAG->getTargetConstant(V, MVT::i32); +}]>; + +def t2_so_neg_imm2part_2 : SDNodeXFormgetZExtValue()); + return CurDAG->getTargetConstant(V, MVT::i32); +}]>; + /// imm1_31 predicate - True if the 32-bit immediate is in the range [1,31]. def imm1_31 : PatLeaf<(i32 imm), [{ return (int32_t)N->getZExtValue() >= 1 && (int32_t)N->getZExtValue() < 32; @@ -1162,9 +1177,9 @@ def : T2Pat<(add GPR:$LHS, t2_so_imm2part:$RHS), (t2ADDri (t2ADDri GPR:$LHS, (t2_so_imm2part_1 imm:$RHS)), (t2_so_imm2part_2 imm:$RHS))>; -def : T2Pat<(sub GPR:$LHS, t2_so_imm2part:$RHS), - (t2SUBri (t2SUBri GPR:$LHS, (t2_so_imm2part_1 imm:$RHS)), - (t2_so_imm2part_2 imm:$RHS))>; +def : T2Pat<(add GPR:$LHS, t2_so_neg_imm2part:$RHS), + (t2SUBri (t2SUBri GPR:$LHS, (t2_so_neg_imm2part_1 imm:$RHS)), + (t2_so_neg_imm2part_2 imm:$RHS))>; // ConstantPool, GlobalAddress, and JumpTable def : T2Pat<(ARMWrapper tglobaladdr :$dst), (t2LEApcrel tglobaladdr :$dst)>; From grosbach at apple.com Mon Nov 23 14:39:53 2009 From: grosbach at apple.com (Jim Grosbach) Date: Mon, 23 Nov 2009 20:39:53 -0000 Subject: [llvm-commits] [llvm] r89695 - /llvm/trunk/test/CodeGen/Thumb2/thumb2-select_xform.ll Message-ID: <200911232039.nANKdslU007380@zion.cs.uiuc.edu> Author: grosbach Date: Mon Nov 23 14:39:53 2009 New 
Revision: 89695 URL: http://llvm.org/viewvc/llvm-project?rev=89695&view=rev Log: update test for 89694 Modified: llvm/trunk/test/CodeGen/Thumb2/thumb2-select_xform.ll Modified: llvm/trunk/test/CodeGen/Thumb2/thumb2-select_xform.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb2/thumb2-select_xform.ll?rev=89695&r1=89694&r2=89695&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Thumb2/thumb2-select_xform.ll (original) +++ llvm/trunk/test/CodeGen/Thumb2/thumb2-select_xform.ll Mon Nov 23 14:39:53 2009 @@ -2,9 +2,9 @@ define i32 @t1(i32 %a, i32 %b, i32 %c) nounwind { ; CHECK: t1 -; CHECK: mvn r0, #-2147483648 +; CHECK: sub.w r0, r1, #-2147483648 ; CHECK: cmp r2, #10 -; CHECK: add.w r0, r1, r0 +; CHECK: sub.w r0, r0, #1 ; CHECK: it gt ; CHECK: movgt r0, r1 %tmp1 = icmp sgt i32 %c, 10 From johnny.chen at apple.com Mon Nov 23 15:00:43 2009 From: johnny.chen at apple.com (Johnny Chen) Date: Mon, 23 Nov 2009 21:00:43 -0000 Subject: [llvm-commits] [llvm] r89699 - in /llvm/trunk/lib/Target/ARM: ARMInstrFormats.td ARMInstrNEON.td Message-ID: <200911232100.nANL0hCs008201@zion.cs.uiuc.edu> Author: johnny Date: Mon Nov 23 15:00:43 2009 New Revision: 89699 URL: http://llvm.org/viewvc/llvm-project?rev=89699&view=rev Log: Partially revert r84730 by removing N2VDup from ARMInstrFormats.td and modifying VDUPLND and VDUPLNQ to derive from N2V instead of N2VDup. VDUPLND and VDUPLNQ now expect op19_18 and op17_16 as the first two args. 
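As a side note on the op19_18 / op17_16 arguments mentioned in the log above, the following is a small sketch of how I understand Inst{19-16} to be laid out for VDUP (scalar): the low bits mark the element size (xxx1 for 8-bit, xx10 for 16-bit, x100 for 32-bit) and the remaining `?` bits hold the lane index. The function name is hypothetical and nothing like it appears in the patch; treat the lane placement as my reading of the ARM encoding rather than something the commit states.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only.
// Returns the 4-bit value for Inst{19-16} of VDUP (scalar) given the element
// size in bits and the lane of the 64-bit source register, or -1 if the
// combination is not valid.
static int encodeVdupLaneImm4(unsigned elemBits, unsigned lane) {
  switch (elemBits) {
  case 8:   // pattern xxx1: 8 lanes, lane index in bits 3..1
    return lane < 8 ? static_cast<int>((lane << 1) | 0x1u) : -1;
  case 16:  // pattern xx10: 4 lanes, lane index in bits 3..2
    return lane < 4 ? static_cast<int>((lane << 2) | 0x2u) : -1;
  case 32:  // pattern x100: 2 lanes, lane index in bit 3
    return lane < 2 ? static_cast<int>((lane << 3) | 0x4u) : -1;
  default:  // no other element size is encodable here
    return -1;
  }
}
```

With this layout, the fixed bits line up with the {?,?},{?,1} / {?,?},{1,0} / {?,1},{0,0} arguments in the diff, where op19_18 is the upper pair and op17_16 the lower pair.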
Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrFormats.td?rev=89699&r1=89698&r2=89699&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrFormats.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrFormats.td Mon Nov 23 15:00:43 2009 @@ -1285,19 +1285,6 @@ let Inst{4} = op4; } -// NEON Vector Duplicate (scalar). -// Inst{19-16} is specified by subclasses. -class N2VDup op24_23, bits<2> op21_20, bits<5> op11_7, bit op6, bit op4, - dag oops, dag iops, InstrItinClass itin, - string opc, string asm, string cstr, list pattern> - : NDataI { - let Inst{24-23} = op24_23; - let Inst{21-20} = op21_20; - let Inst{11-7} = op11_7; - let Inst{6} = op6; - let Inst{4} = op4; -} - // NEON 2 vector register with immediate. class N2VImm op11_8, bit op7, bit op6, bit op4, dag oops, dag iops, InstrItinClass itin, Modified: llvm/trunk/lib/Target/ARM/ARMInstrNEON.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrNEON.td?rev=89699&r1=89698&r2=89699&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrNEON.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Mon Nov 23 15:00:43 2009 @@ -2682,28 +2682,29 @@ // VDUP : Vector Duplicate Lane (from scalar to all elements) -class VDUPLND - : N2VDup<0b11, 0b11, 0b11000, 0, 0, +class VDUPLND op19_18, bits<2> op17_16, string OpcodeStr, ValueType Ty> + : N2V<0b11, 0b11, op19_18, op17_16, 0b11000, 0, 0, (outs DPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVD, OpcodeStr, "\t$dst, $src[$lane]", "", [(set DPR:$dst, (Ty (NEONvduplane (Ty DPR:$src), imm:$lane)))]>; -class VDUPLNQ - : N2VDup<0b11, 0b11, 0b11000, 1, 0, +class VDUPLNQ op19_18, bits<2> op17_16, string OpcodeStr, + ValueType ResTy, 
ValueType OpTy> + : N2V<0b11, 0b11, op19_18, op17_16, 0b11000, 1, 0, (outs QPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVD, OpcodeStr, "\t$dst, $src[$lane]", "", [(set QPR:$dst, (ResTy (NEONvduplane (OpTy DPR:$src), imm:$lane)))]>; // Inst{19-16} is partially specified depending on the element size. -def VDUPLN8d : VDUPLND<"vdup.8", v8i8> { let Inst{16} = 1; } -def VDUPLN16d : VDUPLND<"vdup.16", v4i16> { let Inst{17-16} = 0b10; } -def VDUPLN32d : VDUPLND<"vdup.32", v2i32> { let Inst{18-16} = 0b100; } -def VDUPLNfd : VDUPLND<"vdup.32", v2f32> { let Inst{18-16} = 0b100; } -def VDUPLN8q : VDUPLNQ<"vdup.8", v16i8, v8i8> { let Inst{16} = 1; } -def VDUPLN16q : VDUPLNQ<"vdup.16", v8i16, v4i16> { let Inst{17-16} = 0b10; } -def VDUPLN32q : VDUPLNQ<"vdup.32", v4i32, v2i32> { let Inst{18-16} = 0b100; } -def VDUPLNfq : VDUPLNQ<"vdup.32", v4f32, v2f32> { let Inst{18-16} = 0b100; } +def VDUPLN8d : VDUPLND<{?,?}, {?,1}, "vdup.8", v8i8>; +def VDUPLN16d : VDUPLND<{?,?}, {1,0}, "vdup.16", v4i16>; +def VDUPLN32d : VDUPLND<{?,1}, {0,0}, "vdup.32", v2i32>; +def VDUPLNfd : VDUPLND<{?,1}, {0,0}, "vdup.32", v2f32>; +def VDUPLN8q : VDUPLNQ<{?,?}, {?,1}, "vdup.8", v16i8, v8i8>; +def VDUPLN16q : VDUPLNQ<{?,?}, {1,0}, "vdup.16", v8i16, v4i16>; +def VDUPLN32q : VDUPLNQ<{?,1}, {0,0}, "vdup.32", v4i32, v2i32>; +def VDUPLNfq : VDUPLNQ<{?,1}, {0,0}, "vdup.32", v4f32, v2f32>; def : Pat<(v16i8 (NEONvduplane (v16i8 QPR:$src), imm:$lane)), (v16i8 (VDUPLN8q (v8i8 (EXTRACT_SUBREG QPR:$src, @@ -2722,19 +2723,15 @@ (DSubReg_i32_reg imm:$lane))), (SubReg_i32_lane imm:$lane)))>; -def VDUPfdf : N2VDup<0b11, 0b11, 0b11000, 0, 0, - (outs DPR:$dst), (ins SPR:$src), - IIC_VMOVD, "vdup.32", "\t$dst, ${src:lane}", "", - [(set DPR:$dst, (v2f32 (NEONvdup (f32 SPR:$src))))]> { - let Inst{18-16} = 0b100; -} - -def VDUPfqf : N2VDup<0b11, 0b11, 0b11000, 1, 0, - (outs QPR:$dst), (ins SPR:$src), - IIC_VMOVD, "vdup.32", "\t$dst, ${src:lane}", "", - [(set QPR:$dst, (v4f32 (NEONvdup (f32 SPR:$src))))]> { - let 
Inst{18-16} = 0b100; -} +def VDUPfdf : N2V<0b11, 0b11, {?,1}, {0,0}, 0b11000, 0, 0, + (outs DPR:$dst), (ins SPR:$src), + IIC_VMOVD, "vdup.32", "\t$dst, ${src:lane}", "", + [(set DPR:$dst, (v2f32 (NEONvdup (f32 SPR:$src))))]>; + +def VDUPfqf : N2V<0b11, 0b11, {?,1}, {0,0}, 0b11000, 1, 0, + (outs QPR:$dst), (ins SPR:$src), + IIC_VMOVD, "vdup.32", "\t$dst, ${src:lane}", "", + [(set QPR:$dst, (v4f32 (NEONvdup (f32 SPR:$src))))]>; def : Pat<(v2i64 (NEONvduplane (v2i64 QPR:$src), imm:$lane)), (INSERT_SUBREG QPR:$src, From baldrick at free.fr Mon Nov 23 15:02:22 2009 From: baldrick at free.fr (Duncan Sands) Date: Mon, 23 Nov 2009 22:02:22 +0100 Subject: [llvm-commits] [llvm] r89663 - /llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp In-Reply-To: <200911231644.nANGiiSl030197@zion.cs.uiuc.edu> References: <200911231644.nANGiiSl030197@zion.cs.uiuc.edu> Message-ID: <4B0AF85E.4090709@free.fr> Hi Chris, I tried to understand what you are doing here but failed, feel like explaining some more? I don't see how capture/nocapture is relevant here... Ciao, Duncan. > + // then the call can not mod/ref the pointer unless the call takes the pointer > + // as an argument, and itself doesn't capture it. > if (isNonEscapingLocalObject(Object) && CS.getInstruction() != Object) { > - bool passedAsArg = false; > - // TODO: Eventually only check 'nocapture' arguments. > + bool PassedAsArg = false; > + unsigned ArgNo = 0; > for (CallSite::arg_iterator CI = CS.arg_begin(), CE = CS.arg_end(); > - CI != CE; ++CI) > - if (isa((*CI)->getType()) && > - alias(cast(CI), ~0U, P, ~0U) != NoAlias) > - passedAsArg = true; > + CI != CE; ++CI, ++ArgNo) { > + // Only look at the no-capture pointer arguments. > + if (!isa((*CI)->getType()) || > + !CS.paramHasAttr(ArgNo+1, Attribute::NoCapture)) > + continue; > + > + // If this is a no-capture pointer argument, see if we can tell that it > + // is impossible to alias the pointer we're checking. 
If not, we have to > + // assume that the call could touch the pointer, even though it doesn't > + // escape. > + if (alias(cast(CI), ~0U, P, ~0U) != NoAlias) { > + PassedAsArg = true; > + break; > + } > + } > > - if (!passedAsArg) > + if (!PassedAsArg) > return NoModRef; From bob.wilson at apple.com Mon Nov 23 15:01:53 2009 From: bob.wilson at apple.com (Bob Wilson) Date: Mon, 23 Nov 2009 13:01:53 -0800 Subject: [llvm-commits] [llvm] r89699 - in /llvm/trunk/lib/Target/ARM: ARMInstrFormats.td ARMInstrNEON.td In-Reply-To: <200911232100.nANL0hCs008201@zion.cs.uiuc.edu> References: <200911232100.nANL0hCs008201@zion.cs.uiuc.edu> Message-ID: Thanks! On Nov 23, 2009, at 1:00 PM, Johnny Chen wrote: > Author: johnny > Date: Mon Nov 23 15:00:43 2009 > New Revision: 89699 > > URL: http://llvm.org/viewvc/llvm-project?rev=89699&view=rev > Log: > Partially revert r84730 by removing N2VDup from ARMInstrFormats.td and modifying > VDUPLND and VDUPLNQ to derive from N2V instead of N2VDup. VDUPLND and VDUPLNQ > now expect op19_18 and op17_16 as the first two args. > > Modified: > llvm/trunk/lib/Target/ARM/ARMInstrFormats.td > llvm/trunk/lib/Target/ARM/ARMInstrNEON.td > > Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrFormats.td?rev=89699&r1=89698&r2=89699&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMInstrFormats.td (original) > +++ llvm/trunk/lib/Target/ARM/ARMInstrFormats.td Mon Nov 23 15:00:43 2009 > @@ -1285,19 +1285,6 @@ > let Inst{4} = op4; > } > > -// NEON Vector Duplicate (scalar). > -// Inst{19-16} is specified by subclasses. 
> -class N2VDup op24_23, bits<2> op21_20, bits<5> op11_7, bit op6, bit op4, > - dag oops, dag iops, InstrItinClass itin, > - string opc, string asm, string cstr, list pattern> > - : NDataI { > - let Inst{24-23} = op24_23; > - let Inst{21-20} = op21_20; > - let Inst{11-7} = op11_7; > - let Inst{6} = op6; > - let Inst{4} = op4; > -} > - > // NEON 2 vector register with immediate. > class N2VImm op11_8, bit op7, bit op6, bit op4, > dag oops, dag iops, InstrItinClass itin, > > Modified: llvm/trunk/lib/Target/ARM/ARMInstrNEON.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrNEON.td?rev=89699&r1=89698&r2=89699&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMInstrNEON.td (original) > +++ llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Mon Nov 23 15:00:43 2009 > @@ -2682,28 +2682,29 @@ > > // VDUP : Vector Duplicate Lane (from scalar to all elements) > > -class VDUPLND > - : N2VDup<0b11, 0b11, 0b11000, 0, 0, > +class VDUPLND op19_18, bits<2> op17_16, string OpcodeStr, ValueType Ty> > + : N2V<0b11, 0b11, op19_18, op17_16, 0b11000, 0, 0, > (outs DPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVD, > OpcodeStr, "\t$dst, $src[$lane]", "", > [(set DPR:$dst, (Ty (NEONvduplane (Ty DPR:$src), imm:$lane)))]>; > > -class VDUPLNQ > - : N2VDup<0b11, 0b11, 0b11000, 1, 0, > +class VDUPLNQ op19_18, bits<2> op17_16, string OpcodeStr, > + ValueType ResTy, ValueType OpTy> > + : N2V<0b11, 0b11, op19_18, op17_16, 0b11000, 1, 0, > (outs QPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVD, > OpcodeStr, "\t$dst, $src[$lane]", "", > [(set QPR:$dst, (ResTy (NEONvduplane (OpTy DPR:$src), imm:$lane)))]>; > > // Inst{19-16} is partially specified depending on the element size. 
> > -def VDUPLN8d : VDUPLND<"vdup.8", v8i8> { let Inst{16} = 1; } > -def VDUPLN16d : VDUPLND<"vdup.16", v4i16> { let Inst{17-16} = 0b10; } > -def VDUPLN32d : VDUPLND<"vdup.32", v2i32> { let Inst{18-16} = 0b100; } > -def VDUPLNfd : VDUPLND<"vdup.32", v2f32> { let Inst{18-16} = 0b100; } > -def VDUPLN8q : VDUPLNQ<"vdup.8", v16i8, v8i8> { let Inst{16} = 1; } > -def VDUPLN16q : VDUPLNQ<"vdup.16", v8i16, v4i16> { let Inst{17-16} = 0b10; } > -def VDUPLN32q : VDUPLNQ<"vdup.32", v4i32, v2i32> { let Inst{18-16} = 0b100; } > -def VDUPLNfq : VDUPLNQ<"vdup.32", v4f32, v2f32> { let Inst{18-16} = 0b100; } > +def VDUPLN8d : VDUPLND<{?,?}, {?,1}, "vdup.8", v8i8>; > +def VDUPLN16d : VDUPLND<{?,?}, {1,0}, "vdup.16", v4i16>; > +def VDUPLN32d : VDUPLND<{?,1}, {0,0}, "vdup.32", v2i32>; > +def VDUPLNfd : VDUPLND<{?,1}, {0,0}, "vdup.32", v2f32>; > +def VDUPLN8q : VDUPLNQ<{?,?}, {?,1}, "vdup.8", v16i8, v8i8>; > +def VDUPLN16q : VDUPLNQ<{?,?}, {1,0}, "vdup.16", v8i16, v4i16>; > +def VDUPLN32q : VDUPLNQ<{?,1}, {0,0}, "vdup.32", v4i32, v2i32>; > +def VDUPLNfq : VDUPLNQ<{?,1}, {0,0}, "vdup.32", v4f32, v2f32>; > > def : Pat<(v16i8 (NEONvduplane (v16i8 QPR:$src), imm:$lane)), > (v16i8 (VDUPLN8q (v8i8 (EXTRACT_SUBREG QPR:$src, > @@ -2722,19 +2723,15 @@ > (DSubReg_i32_reg imm:$lane))), > (SubReg_i32_lane imm:$lane)))>; > > -def VDUPfdf : N2VDup<0b11, 0b11, 0b11000, 0, 0, > - (outs DPR:$dst), (ins SPR:$src), > - IIC_VMOVD, "vdup.32", "\t$dst, ${src:lane}", "", > - [(set DPR:$dst, (v2f32 (NEONvdup (f32 SPR:$src))))]> { > - let Inst{18-16} = 0b100; > -} > - > -def VDUPfqf : N2VDup<0b11, 0b11, 0b11000, 1, 0, > - (outs QPR:$dst), (ins SPR:$src), > - IIC_VMOVD, "vdup.32", "\t$dst, ${src:lane}", "", > - [(set QPR:$dst, (v4f32 (NEONvdup (f32 SPR:$src))))]> { > - let Inst{18-16} = 0b100; > -} > +def VDUPfdf : N2V<0b11, 0b11, {?,1}, {0,0}, 0b11000, 0, 0, > + (outs DPR:$dst), (ins SPR:$src), > + IIC_VMOVD, "vdup.32", "\t$dst, ${src:lane}", "", > + [(set DPR:$dst, (v2f32 (NEONvdup (f32 SPR:$src))))]>; > + > 
+def VDUPfqf : N2V<0b11, 0b11, {?,1}, {0,0}, 0b11000, 1, 0, > + (outs QPR:$dst), (ins SPR:$src), > + IIC_VMOVD, "vdup.32", "\t$dst, ${src:lane}", "", > + [(set QPR:$dst, (v4f32 (NEONvdup (f32 SPR:$src))))]>; > > def : Pat<(v2i64 (NEONvduplane (v2i64 QPR:$src), imm:$lane)), > (INSERT_SUBREG QPR:$src, > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From johnny.chen at apple.com Mon Nov 23 15:03:11 2009 From: johnny.chen at apple.com (Johnny Chen) Date: Mon, 23 Nov 2009 13:03:11 -0800 Subject: [llvm-commits] [llvm] r89684 - in /llvm/trunk/lib/Target/ARM: ARMInstrFormats.td ARMInstrNEON.td In-Reply-To: <23F04D9C-4DC5-49FC-8EC0-D7E3A508074D@apple.com> References: <200911231816.nANIGGiG001437@zion.cs.uiuc.edu> <23F04D9C-4DC5-49FC-8EC0-D7E3A508074D@apple.com> Message-ID: <68A9EC68-B35B-4541-9433-8B67F6C0DC2E@apple.com> Hi Bob, Removal of N3VImm and N2VDup is done. The N2VImm thing is more involved and the current specification looks fine, so I dare not touch it. :-) On Nov 23, 2009, at 11:40 AM, Bob Wilson wrote: > Very nice! Thanks for doing this. > > Can you do the same thing for r84572 (remove N3VImm) and r84730 (remove N2VDup and associated changes)? You might also consider changing the shift-operand and N2VImm portions of r84730, but I'll leave that to your discretion. > > On Nov 23, 2009, at 10:16 AM, Johnny Chen wrote: > >> Author: johnny >> Date: Mon Nov 23 12:16:16 2009 >> New Revision: 89684 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=89684&view=rev >> Log: >> Partially revert r89377 by removing NLdStLN class definition from >> ARMInstrFormats.td and fixing VLD[234]LN* and VST[234]LN* to derive from NLdSt >> instead of NLdStLN. 
>> >> Modified: >> llvm/trunk/lib/Target/ARM/ARMInstrFormats.td >> llvm/trunk/lib/Target/ARM/ARMInstrNEON.td >> >> Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrFormats.td?rev=89684&r1=89683&r2=89684&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/Target/ARM/ARMInstrFormats.td (original) >> +++ llvm/trunk/lib/Target/ARM/ARMInstrFormats.td Mon Nov 23 12:16:16 2009 >> @@ -1248,17 +1248,6 @@ >> let Inst{7-4} = op7_4; >> } >> >> -// With selective bit(s) from op7_4 specified by subclasses. >> -class NLdStLN op21_20, bits<4> op11_8, >> - dag oops, dag iops, InstrItinClass itin, >> - string opc, string asm, string cstr, list pattern> >> - : NeonI { >> - let Inst{31-24} = 0b11110100; >> - let Inst{23} = op23; >> - let Inst{21-20} = op21_20; >> - let Inst{11-8} = op11_8; >> -} >> - >> class NDataI> string opc, string asm, string cstr, list pattern> >> : NeonI> >> Modified: llvm/trunk/lib/Target/ARM/ARMInstrNEON.td >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrNEON.td?rev=89684&r1=89683&r2=89684&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/Target/ARM/ARMInstrNEON.td (original) >> +++ llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Mon Nov 23 12:16:16 2009 >> @@ -280,11 +280,11 @@ >> >> // VLD2LN : Vector Load (single 2-element structure to one lane) >> class VLD2LN op11_8, string OpcodeStr> >> - : NLdStLN<1,0b10,op11_8, (outs DPR:$dst1, DPR:$dst2), >> - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), >> - IIC_VLD2, >> - OpcodeStr, "\t\\{$dst1[$lane],$dst2[$lane]\\}, $addr", >> - "$src1 = $dst1, $src2 = $dst2", []>; >> + : NLdSt<1,0b10,op11_8,{?,?,?,?}, (outs DPR:$dst1, DPR:$dst2), >> + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), >> + IIC_VLD2, >> + OpcodeStr, 
"\t\\{$dst1[$lane],$dst2[$lane]\\}, $addr", >> + "$src1 = $dst1, $src2 = $dst2", []>; >> >> // vld2 to single-spaced registers. >> def VLD2LNd8 : VLD2LN<0b0001, "vld2.8">; >> @@ -313,12 +313,12 @@ >> >> // VLD3LN : Vector Load (single 3-element structure to one lane) >> class VLD3LN op11_8, string OpcodeStr> >> - : NLdStLN<1,0b10,op11_8, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3), >> - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, >> - nohash_imm:$lane), IIC_VLD3, >> - OpcodeStr, >> - "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane]\\}, $addr", >> - "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3", []>; >> + : NLdSt<1,0b10,op11_8,{?,?,?,?}, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3), >> + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, >> + nohash_imm:$lane), IIC_VLD3, >> + OpcodeStr, >> + "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane]\\}, $addr", >> + "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3", []>; >> >> // vld3 to single-spaced registers. >> def VLD3LNd8 : VLD3LN<0b0010, "vld3.8"> { >> @@ -349,13 +349,13 @@ >> >> // VLD4LN : Vector Load (single 4-element structure to one lane) >> class VLD4LN op11_8, string OpcodeStr> >> - : NLdStLN<1,0b10,op11_8, >> - (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), >> - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, >> - nohash_imm:$lane), IIC_VLD4, >> - OpcodeStr, >> - "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane],$dst4[$lane]\\}, $addr", >> - "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3, $src4 = $dst4", []>; >> + : NLdSt<1,0b10,op11_8,{?,?,?,?}, >> + (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), >> + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, >> + nohash_imm:$lane), IIC_VLD4, >> + OpcodeStr, >> + "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane],$dst4[$lane]\\}, $addr", >> + "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3, $src4 = $dst4", []>; >> >> // vld4 to single-spaced registers. 
>> def VLD4LNd8 : VLD4LN<0b0011, "vld4.8">; >> @@ -504,11 +504,11 @@ >> >> // VST2LN : Vector Store (single 2-element structure from one lane) >> class VST2LN op11_8, string OpcodeStr> >> - : NLdStLN<1,0b00,op11_8, (outs), >> - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), >> - IIC_VST, >> - OpcodeStr, "\t\\{$src1[$lane],$src2[$lane]\\}, $addr", >> - "", []>; >> + : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), >> + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), >> + IIC_VST, >> + OpcodeStr, "\t\\{$src1[$lane],$src2[$lane]\\}, $addr", >> + "", []>; >> >> // vst2 to single-spaced registers. >> def VST2LNd8 : VST2LN<0b0001, "vst2.8">; >> @@ -537,11 +537,11 @@ >> >> // VST3LN : Vector Store (single 3-element structure from one lane) >> class VST3LN op11_8, string OpcodeStr> >> - : NLdStLN<1,0b00,op11_8, (outs), >> - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, >> - nohash_imm:$lane), IIC_VST, >> - OpcodeStr, >> - "\t\\{$src1[$lane],$src2[$lane],$src3[$lane]\\}, $addr", "", []>; >> + : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), >> + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, >> + nohash_imm:$lane), IIC_VST, >> + OpcodeStr, >> + "\t\\{$src1[$lane],$src2[$lane],$src3[$lane]\\}, $addr", "", []>; >> >> // vst3 to single-spaced registers. 
>> def VST3LNd8 : VST3LN<0b0010, "vst3.8"> { >> @@ -572,12 +572,12 @@ >> >> // VST4LN : Vector Store (single 4-element structure from one lane) >> class VST4LN op11_8, string OpcodeStr> >> - : NLdStLN<1,0b00,op11_8, (outs), >> - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, >> - nohash_imm:$lane), IIC_VST, >> - OpcodeStr, >> - "\t\\{$src1[$lane],$src2[$lane],$src3[$lane],$src4[$lane]\\}, $addr", >> - "", []>; >> + : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), >> + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, >> + nohash_imm:$lane), IIC_VST, >> + OpcodeStr, >> + "\t\\{$src1[$lane],$src2[$lane],$src3[$lane],$src4[$lane]\\}, $addr", >> + "", []>; >> >> // vst4 to single-spaced registers. >> def VST4LNd8 : VST4LN<0b0011, "vst4.8">; >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From grosbach at apple.com Mon Nov 23 15:08:27 2009 From: grosbach at apple.com (Jim Grosbach) Date: Mon, 23 Nov 2009 21:08:27 -0000 Subject: [llvm-commits] [llvm] r89700 - in /llvm/trunk: lib/Target/ARM/ARMInstrVFP.td lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll test/CodeGen/ARM/fpconsts.ll test/CodeGen/Thumb2/cross-rc-coalescing-2.ll Message-ID: <200911232108.nANL8R6i008496@zion.cs.uiuc.edu> Author: grosbach Date: Mon Nov 23 15:08:25 2009 New Revision: 89700 URL: http://llvm.org/viewvc/llvm-project?rev=89700&view=rev Log: move fconst[sd] to UAL. 
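Editorial aside on r89700: the test updates in this commit swap raw operand values such as #112, #16, #8, #170 and #184 for floating-point literals. Those old numbers look consistent with the 8-bit VFPv3 "modified immediate" encoding (sign bit, a 3-bit exponent selector, and a 4-bit fraction). The sketch below decodes that encoding under that assumption; decodeVFPImm8 is a hypothetical helper written for illustration and is not an LLVM function.

```cpp
// Hypothetical helper (not part of LLVM): decode an 8-bit VFPv3 modified
// immediate, laid out as a:b:c:d:e:f:g:h, into the value it represents.
//   a    = sign
//   b    = selects the exponent range
//   cd   = low exponent bits
//   efgh = top four fraction bits (implicit leading 1)
// Assumption: this mirrors the architectural expansion rule, in which the
// exponent field is NOT(b) followed by copies of b and then cd.
constexpr double decodeVFPImm8(unsigned Imm8) {
  const bool Negative = (Imm8 >> 7) & 1;
  const bool B = (Imm8 >> 6) & 1;
  const int CD = (Imm8 >> 4) & 3;
  const int Frac = Imm8 & 0xF;
  // Unbiased exponent: b == 1 gives -3..0, b == 0 gives 1..4.
  const int Exp = B ? CD - 3 : CD + 1;
  const double Mantissa = (16 + Frac) / 16.0; // 1.0000 .. 1.9375
  const double Scale = Exp >= 0 ? double(1 << Exp) : 1.0 / double(1 << -Exp);
  return (Negative ? -1.0 : 1.0) * Mantissa * Scale;
}

// The old/new operand pairs from the CHECK lines changed in this commit.
static_assert(decodeVFPImm8(112) == 1.0, "fconsts #112 vs vmov.f32 #1.0");
static_assert(decodeVFPImm8(16) == 4.0, "fconsts #16 vs vmov.f32 #4.0");
static_assert(decodeVFPImm8(8) == 3.0, "fconstd #8 vs vmov.f64 #3.0");
static_assert(decodeVFPImm8(170) == -13.0, "fconstd #170 vs vmov.f64 #-13.0");
static_assert(decodeVFPImm8(184) == -24.0, "fconsts #184 vs vmov.f32 #-24.0");
```

The function is constexpr so the five pairs are checked at compile time (C++14 or later). Only values of the form (16..31)/16 * 2^n with n in -3..4 are representable this way, which is why the instruction selection still has to verify that a constant is encodable.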
Modified: llvm/trunk/lib/Target/ARM/ARMInstrVFP.td llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll llvm/trunk/test/CodeGen/ARM/fpconsts.ll llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll Modified: llvm/trunk/lib/Target/ARM/ARMInstrVFP.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrVFP.td?rev=89700&r1=89699&r2=89700&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrVFP.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrVFP.td Mon Nov 23 15:08:25 2009 @@ -437,7 +437,7 @@ let isReMaterializable = 1 in { def FCONSTD : VFPAI<(outs DPR:$dst), (ins vfp_f64imm:$imm), VFPMiscFrm, IIC_VMOVImm, - "fconstd", "\t$dst, $imm", + "vmov.f64", "\t$dst, $imm", [(set DPR:$dst, vfp_f64imm:$imm)]>, Requires<[HasVFP3]> { let Inst{27-23} = 0b11101; let Inst{21-20} = 0b11; @@ -448,7 +448,7 @@ def FCONSTS : VFPAI<(outs SPR:$dst), (ins vfp_f32imm:$imm), VFPMiscFrm, IIC_VMOVImm, - "fconsts", "\t$dst, $imm", + "vmov.f32", "\t$dst, $imm", [(set SPR:$dst, vfp_f32imm:$imm)]>, Requires<[HasVFP3]> { let Inst{27-23} = 0b11101; let Inst{21-20} = 0b11; Modified: llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp?rev=89700&r1=89699&r2=89700&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp (original) +++ llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp Mon Nov 23 15:08:25 2009 @@ -998,7 +998,7 @@ void ARMAsmPrinter::printVFPf32ImmOperand(const MachineInstr *MI, int OpNum) { const ConstantFP *FP = MI->getOperand(OpNum).getFPImm(); - O << '#' << ARM::getVFPf32Imm(FP->getValueAPF()); + O << '#' << FP->getValueAPF().convertToFloat(); if (VerboseAsm) { O.PadToColumn(MAI->getCommentColumn()); O << MAI->getCommentString() << ' 
'; @@ -1008,7 +1008,7 @@ void ARMAsmPrinter::printVFPf64ImmOperand(const MachineInstr *MI, int OpNum) { const ConstantFP *FP = MI->getOperand(OpNum).getFPImm(); - O << '#' << ARM::getVFPf64Imm(FP->getValueAPF()); + O << '#' << FP->getValueAPF().convertToDouble(); if (VerboseAsm) { O.PadToColumn(MAI->getCommentColumn()); O << MAI->getCommentString() << ' '; Modified: llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll?rev=89700&r1=89699&r2=89700&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll (original) +++ llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll Mon Nov 23 15:08:25 2009 @@ -13,7 +13,7 @@ %4 = fadd float 0.000000e+00, %3 ; [#uses=1] %5 = fsub float 1.000000e+00, %4 ; [#uses=1] ; CHECK: foo: -; CHECK: fconsts s{{[0-9]+}}, #112 +; CHECK: vmov.f32 s{{[0-9]+}}, #1.000000e+00 %6 = fsub float 1.000000e+00, undef ; [#uses=2] %7 = fsub float %2, undef ; [#uses=1] %8 = fsub float 0.000000e+00, undef ; [#uses=3] Modified: llvm/trunk/test/CodeGen/ARM/fpconsts.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/fpconsts.ll?rev=89700&r1=89699&r2=89700&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/ARM/fpconsts.ll (original) +++ llvm/trunk/test/CodeGen/ARM/fpconsts.ll Mon Nov 23 15:08:25 2009 @@ -3,7 +3,7 @@ define arm_apcscc float @t1(float %x) nounwind readnone optsize { entry: ; CHECK: t1: -; CHECK: fconsts s1, #16 +; CHECK: vmov.f32 s1, #4.000000e+00 %0 = fadd float %x, 4.000000e+00 ret float %0 } @@ -11,7 +11,7 @@ define arm_apcscc double @t2(double %x) nounwind readnone optsize { entry: ; CHECK: t2: -; CHECK: fconstd d1, #8 +; CHECK: vmov.f64 d1, #3.000000e+00 %0 = fadd double %x, 3.000000e+00 ret double %0 } @@ -19,7 +19,7 @@ define arm_apcscc double 
@t3(double %x) nounwind readnone optsize { entry: ; CHECK: t3: -; CHECK: fconstd d1, #170 +; CHECK: vmov.f64 d1, #-1.300000e+01 %0 = fmul double %x, -1.300000e+01 ret double %0 } @@ -27,7 +27,7 @@ define arm_apcscc float @t4(float %x) nounwind readnone optsize { entry: ; CHECK: t4: -; CHECK: fconsts s1, #184 +; CHECK: vmov.f32 s1, #-2.400000e+01 %0 = fmul float %x, -2.400000e+01 ret float %0 } Modified: llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll?rev=89700&r1=89699&r2=89700&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll (original) +++ llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll Mon Nov 23 15:08:25 2009 @@ -1,4 +1,4 @@ -; RUN: llc < %s -mtriple=thumbv7-apple-darwin9 -mcpu=cortex-a8 | grep vmov.f32 | count 6 +; RUN: llc < %s -mtriple=thumbv7-apple-darwin9 -mcpu=cortex-a8 | grep vmov.f32 | count 7 define arm_apcscc void @fht(float* nocapture %fz, i16 signext %n) nounwind { entry: From espindola at google.com Mon Nov 23 15:09:26 2009 From: espindola at google.com (Rafael Espindola) Date: Mon, 23 Nov 2009 16:09:26 -0500 Subject: [llvm-commits] [PATCH] LTO code generator options In-Reply-To: References: <04F6B1512E264B27AEE607542FCDD113@andreic6e7fe55> <38a0d8450911170809j6a32716ar840b8622cfad6f17@mail.gmail.com> <6AE1604EE3EC5F4296C096518C6B77EEFD4607C4@mail.accesssoftek.com> <38a0d8450911180722j5a463fa8hec81178154deaf09@mail.gmail.com> <41BA1AA405BC4D19BA9B4FAB6543D62F@andreic6e7fe55> <38a0d8450911190723g644ad4c7ife769ab35da9efb9@mail.gmail.com> <38a0d8450911200722i5efa690ci6ab671d71b5f40dc@mail.gmail.com> Message-ID: <38a0d8450911231309t6f37e2a0ga7c9eaa50d495c60@mail.gmail.com> > Thanks a lot for reviewing the patch. 
> It is committed as http://llvm.org/viewvc/llvm-project?rev=89516&view=rev > > Now everything is ready for the target triple overriding. I should be able to take a look at it tomorrow, but it would help if you could first implement Daniel's and Chris' comments. > > Best regards, > Viktor Cheers, -- Rafael Ávila de Espíndola From gohman at apple.com Mon Nov 23 15:29:08 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 23 Nov 2009 21:29:08 -0000 Subject: [llvm-commits] [llvm] r89701 - /llvm/trunk/lib/CodeGen/MachineInstr.cpp Message-ID: <200911232129.nANLT8Vi009464@zion.cs.uiuc.edu> Author: djg Date: Mon Nov 23 15:29:08 2009 New Revision: 89701 URL: http://llvm.org/viewvc/llvm-project?rev=89701&view=rev Log: Print the debug info line and column in MachineInstr::print even when there's no filename. This situation is apparently fairly common right now. Modified: llvm/trunk/lib/CodeGen/MachineInstr.cpp Modified: llvm/trunk/lib/CodeGen/MachineInstr.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineInstr.cpp?rev=89701&r1=89700&r2=89701&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/MachineInstr.cpp (original) +++ llvm/trunk/lib/CodeGen/MachineInstr.cpp Mon Nov 23 15:29:08 2009 @@ -1149,9 +1149,10 @@ DebugLocTuple DLT = MF->getDebugLocTuple(debugLoc); DICompileUnit CU(DLT.Scope); + OS << " dbg:"; if (!CU.isNull()) - OS << " dbg:" << CU.getDirectory() << '/' << CU.getFilename() << ":" - << DLT.Line << ":" << DLT.Col; + OS << CU.getDirectory() << '/' << CU.getFilename() << ":"; + OS << DLT.Line << ":" << DLT.Col; } OS << "\n"; From evan.cheng at apple.com Mon Nov 23 15:30:31 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 23 Nov 2009 13:30:31 -0800 Subject: [llvm-commits] [llvm] r89700 - in /llvm/trunk: lib/Target/ARM/ARMInstrVFP.td lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll test/CodeGen/ARM/fpconsts.ll 
test/CodeGen/Thumb2/cross-rc-coalescing-2.ll In-Reply-To: <200911232108.nANL8R6i008496@zion.cs.uiuc.edu> References: <200911232108.nANL8R6i008496@zion.cs.uiuc.edu> Message-ID: Thanks. But rather than converting to FP at asm printing time, could we just select it to a FP imm node? Evan On Nov 23, 2009, at 1:08 PM, Jim Grosbach wrote: > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp (original) > +++ llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp Mon Nov 23 15:08:25 2009 > @@ -998,7 +998,7 @@ > > void ARMAsmPrinter::printVFPf32ImmOperand(const MachineInstr *MI, int OpNum) { > const ConstantFP *FP = MI->getOperand(OpNum).getFPImm(); > - O << '#' << ARM::getVFPf32Imm(FP->getValueAPF()); > + O << '#' << FP->getValueAPF().convertToFloat(); > if (VerboseAsm) { > O.PadToColumn(MAI->getCommentColumn()); > O << MAI->getCommentString() << ' '; > @@ -1008,7 +1008,7 @@ > > void ARMAsmPrinter::printVFPf64ImmOperand(const MachineInstr *MI, int OpNum) { > const ConstantFP *FP = MI->getOperand(OpNum).getFPImm(); > - O << '#' << ARM::getVFPf64Imm(FP->getValueAPF()); > + O << '#' << FP->getValueAPF().convertToDouble(); > if (VerboseAsm) { > O.PadToColumn(MAI->getCommentColumn()); > O << MAI->getCommentString() << ' '; > > Modified: llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll?rev=89700&r1=89699&r2=89700&view=diff > > ============================================================================== > --- llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll (original) > +++ llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll Mon Nov 23 15:08:25 2009 > @@ -13,7 +13,7 @@ > %4 = fadd float 0.000000e+00, %3 ; [#uses=1] > %5 = fsub float 1.000000e+00, %4 ; [#uses=1] > ; CHECK: foo: > -; CHECK: fconsts s{{[0-9]+}}, #112 > +; CHECK: vmov.f32 s{{[0-9]+}}, 
#1.000000e+00 > %6 = fsub float 1.000000e+00, undef ; [#uses=2] > %7 = fsub float %2, undef ; [#uses=1] > %8 = fsub float 0.000000e+00, undef ; [#uses=3] > > Modified: llvm/trunk/test/CodeGen/ARM/fpconsts.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/fpconsts.ll?rev=89700&r1=89699&r2=89700&view=diff > > ============================================================================== > --- llvm/trunk/test/CodeGen/ARM/fpconsts.ll (original) > +++ llvm/trunk/test/CodeGen/ARM/fpconsts.ll Mon Nov 23 15:08:25 2009 > @@ -3,7 +3,7 @@ > define arm_apcscc float @t1(float %x) nounwind readnone optsize { > entry: > ; CHECK: t1: > -; CHECK: fconsts s1, #16 > +; CHECK: vmov.f32 s1, #4.000000e+00 > %0 = fadd float %x, 4.000000e+00 > ret float %0 > } > @@ -11,7 +11,7 @@ > define arm_apcscc double @t2(double %x) nounwind readnone optsize { > entry: > ; CHECK: t2: > -; CHECK: fconstd d1, #8 > +; CHECK: vmov.f64 d1, #3.000000e+00 > %0 = fadd double %x, 3.000000e+00 > ret double %0 > } > @@ -19,7 +19,7 @@ > define arm_apcscc double @t3(double %x) nounwind readnone optsize { > entry: > ; CHECK: t3: > -; CHECK: fconstd d1, #170 > +; CHECK: vmov.f64 d1, #-1.300000e+01 > %0 = fmul double %x, -1.300000e+01 > ret double %0 > } > @@ -27,7 +27,7 @@ > define arm_apcscc float @t4(float %x) nounwind readnone optsize { > entry: > ; CHECK: t4: > -; CHECK: fconsts s1, #184 > +; CHECK: vmov.f32 s1, #-2.400000e+01 > %0 = fmul float %x, -2.400000e+01 > ret float %0 > } > > Modified: llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll?rev=89700&r1=89699&r2=89700&view=diff > > ============================================================================== > --- llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll (original) > +++ llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll Mon Nov 23 15:08:25 2009 > @@ -1,4 +1,4 @@ > -; RUN: llc < %s 
-mtriple=thumbv7-apple-darwin9 -mcpu=cortex-a8 | grep vmov.f32 | count 6 > +; RUN: llc < %s -mtriple=thumbv7-apple-darwin9 -mcpu=cortex-a8 | grep vmov.f32 | count 7 > > define arm_apcscc void @fht(float* nocapture %fz, i16 signext %n) nounwind { > entry: > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From gohman at apple.com Mon Nov 23 15:30:55 2009 From: gohman at apple.com (Dan Gohman) Date: Mon, 23 Nov 2009 21:30:55 -0000 Subject: [llvm-commits] [llvm] r89702 - /llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Message-ID: <200911232130.nANLUttt009621@zion.cs.uiuc.edu> Author: djg Date: Mon Nov 23 15:30:55 2009 New Revision: 89702 URL: http://llvm.org/viewvc/llvm-project?rev=89702&view=rev Log: Simplify this code. Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=89702&r1=89701&r2=89702&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Mon Nov 23 15:30:55 2009 @@ -1854,7 +1854,7 @@ InsnToDbgScopeMapTy::iterator I = DbgScopeBeginMap.find(MI); if (I == DbgScopeBeginMap.end()) return; - ScopeVector &SD = DbgScopeBeginMap[MI]; + ScopeVector &SD = I->second; for (ScopeVector::iterator SDI = SD.begin(), SDE = SD.end(); SDI != SDE; ++SDI) (*SDI)->setStartLabelID(Label); From grosbach at apple.com Mon Nov 23 15:51:40 2009 From: grosbach at apple.com (Jim Grosbach) Date: Mon, 23 Nov 2009 13:51:40 -0800 Subject: [llvm-commits] [llvm] r89700 - in /llvm/trunk: lib/Target/ARM/ARMInstrVFP.td lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll test/CodeGen/ARM/fpconsts.ll 
test/CodeGen/Thumb2/cross-rc-coalescing-2.ll In-Reply-To: References: <200911232108.nANL8R6i008496@zion.cs.uiuc.edu> Message-ID: <61C2FFFB-F064-4D4D-9421-F296AB66BD9E@apple.com> Well, we still need to check to make sure the immediate is encodable. The legal range of constant values is the same as it was for the fconst [ds] mnemonics. I kept the rest the same as before to make sure that stayed true. What's a better way to do that and also enable simplifying the code in the asm printer? On Nov 23, 2009, at 1:30 PM, Evan Cheng wrote: > Thanks. But rather than converting to FP at asm printing time, could > we just select it to a FP imm node? > > Evan > > On Nov 23, 2009, at 1:08 PM, Jim Grosbach wrote: > >> >> = >> = >> = >> = >> = >> = >> = >> = >> = >> ===================================================================== >> --- llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp (original) >> +++ llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp Mon Nov >> 23 15:08:25 2009 >> @@ -998,7 +998,7 @@ >> >> void ARMAsmPrinter::printVFPf32ImmOperand(const MachineInstr *MI, >> int OpNum) { >> const ConstantFP *FP = MI->getOperand(OpNum).getFPImm(); >> - O << '#' << ARM::getVFPf32Imm(FP->getValueAPF()); >> + O << '#' << FP->getValueAPF().convertToFloat(); >> if (VerboseAsm) { >> O.PadToColumn(MAI->getCommentColumn()); >> O << MAI->getCommentString() << ' '; >> @@ -1008,7 +1008,7 @@ >> >> void ARMAsmPrinter::printVFPf64ImmOperand(const MachineInstr *MI, >> int OpNum) { >> const ConstantFP *FP = MI->getOperand(OpNum).getFPImm(); >> - O << '#' << ARM::getVFPf64Imm(FP->getValueAPF()); >> + O << '#' << FP->getValueAPF().convertToDouble(); >> if (VerboseAsm) { >> O.PadToColumn(MAI->getCommentColumn()); >> O << MAI->getCommentString() << ' '; >> >> Modified: llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll?rev=89700&r1=89699&r2=89700&view=diff >> >> = >> = >> = 
>> ============================================================================== >> --- llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll (original) >> +++ llvm/trunk/test/CodeGen/ARM/2009-11-07-SubRegAsmPrinting.ll Mon Nov 23 15:08:25 2009 >> @@ -13,7 +13,7 @@ >> %4 = fadd float 0.000000e+00, %3 ; [#uses=1] >> %5 = fsub float 1.000000e+00, %4 ; [#uses=1] >> ; CHECK: foo: >> -; CHECK: fconsts s{{[0-9]+}}, #112 >> +; CHECK: vmov.f32 s{{[0-9]+}}, #1.000000e+00 >> %6 = fsub float 1.000000e+00, undef ; [#uses=2] >> %7 = fsub float %2, undef ; [#uses=1] >> %8 = fsub float 0.000000e+00, undef ; [#uses=3] >> >> Modified: llvm/trunk/test/CodeGen/ARM/fpconsts.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/fpconsts.ll?rev=89700&r1=89699&r2=89700&view=diff >> >> ============================================================================== >> --- llvm/trunk/test/CodeGen/ARM/fpconsts.ll (original) >> +++ llvm/trunk/test/CodeGen/ARM/fpconsts.ll Mon Nov 23 15:08:25 2009 >> @@ -3,7 +3,7 @@ >> define arm_apcscc float @t1(float %x) nounwind readnone optsize { >> entry: >> ; CHECK: t1: >> -; CHECK: fconsts s1, #16 >> +; CHECK: vmov.f32 s1, #4.000000e+00 >> %0 = fadd float %x, 4.000000e+00 >> ret float %0 >> } >> @@ -11,7 +11,7 @@ >> define arm_apcscc double @t2(double %x) nounwind readnone optsize { >> entry: >> ; CHECK: t2: >> -; CHECK: fconstd d1, #8 >> +; CHECK: vmov.f64 d1, #3.000000e+00 >> %0 = fadd double %x, 3.000000e+00 >> ret double %0 >> } >> @@ -19,7 +19,7 @@ >> define arm_apcscc double @t3(double %x) nounwind readnone optsize { >> entry: >> ; CHECK: t3: >> -; CHECK: fconstd d1, #170 >> +; CHECK: vmov.f64 d1, #-1.300000e+01 >> %0 = fmul double %x, -1.300000e+01 >> ret double %0 >> } >> @@ -27,7 +27,7 @@ >> define arm_apcscc float @t4(float %x) nounwind readnone optsize { >> entry: >> ; CHECK: t4: >> -; CHECK: fconsts s1, #184 >> +; CHECK: vmov.f32 s1, #-2.400000e+01 >> %0 
= fmul float %x, -2.400000e+01 >> ret float %0 >> } >> >> Modified: llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll?rev=89700&r1=89699&r2=89700&view=diff >> >> ============================================================================== >> --- llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll (original) >> +++ llvm/trunk/test/CodeGen/Thumb2/cross-rc-coalescing-2.ll Mon Nov 23 15:08:25 2009 >> @@ -1,4 +1,4 @@ >> -; RUN: llc < %s -mtriple=thumbv7-apple-darwin9 -mcpu=cortex-a8 | grep vmov.f32 | count 6 >> +; RUN: llc < %s -mtriple=thumbv7-apple-darwin9 -mcpu=cortex-a8 | grep vmov.f32 | count 7 >> >> define arm_apcscc void @fht(float* nocapture %fz, i16 signext %n) nounwind { >> entry: >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From evan.cheng at apple.com Mon Nov 23 15:57:23 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 23 Nov 2009 21:57:23 -0000 Subject: [llvm-commits] [llvm] r89706 - in /llvm/trunk/lib/Target/ARM: ARMInstrFormats.td ARMInstrNEON.td Message-ID: <200911232157.nANLvNjt010887@zion.cs.uiuc.edu> Author: evancheng Date: Mon Nov 23 15:57:23 2009 New Revision: 89706 URL: http://llvm.org/viewvc/llvm-project?rev=89706&view=rev Log: Massive refactoring of NEON instructions. Separate opcode from data size specifier suffix, move \t up stream to instruction format, and fix more 80 column violations. This fixes the NEON asm printing so the "predicate" field is printed between the opcode and the data type suffix. 
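[Editorial sketch, not part of the original message.] The ordering problem the log message describes can be illustrated with plain string concatenation. The helper names `oldLayout` and `newLayout` below are hypothetical and are not LLVM APIs; they only mimic, under that assumption, how the assembly string appears to be assembled before and after r89706, with the predicate already expanded to a condition code such as "eq".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not LLVM code): the layout before r89706, where the
// data type suffix was baked into the opcode string (e.g. "vld1.8") and the
// operand string carried its own leading tab.
std::string oldLayout(const std::string &opcWithSuffix,
                      const std::string &pred,
                      const std::string &tabAndOperands) {
  assert(!opcWithSuffix.empty() && "opcode must not be empty");
  // The predicate lands after the suffix: "vld1.8" + "eq" -> "vld1.8eq".
  return opcWithSuffix + pred + tabAndOperands;
}

// Hypothetical helper (not LLVM code): the layout after r89706, where the
// opcode and the data type are separate strings and the instruction format
// supplies the "." and the tab itself.
std::string newLayout(const std::string &opc, const std::string &pred,
                      const std::string &dt, const std::string &operands) {
  assert(!opc.empty() && "opcode must not be empty");
  // The predicate lands between opcode and suffix: "vld1" + "eq" + ".8".
  return opc + pred + "." + dt + "\t" + operands;
}
```

For example, `oldLayout("vld1.8", "eq", "\t{d0}, [r0]")` yields `vld1.8eq` followed by the operands, while `newLayout("vld1", "eq", "8", "{d0}, [r0]")` yields `vld1eq.8`. The register and address operands here are made up for illustration.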
Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Modified: llvm/trunk/lib/Target/ARM/ARMInstrFormats.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrFormats.td?rev=89706&r1=89705&r2=89706&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrFormats.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrFormats.td Mon Nov 23 15:57:23 2009 @@ -1217,30 +1217,45 @@ // class NeonI pattern> + : InstARM { + let OutOperandList = oops; + let InOperandList = !con(iops, (ops pred:$p)); + let AsmString = !strconcat( + !strconcat(!strconcat(opc, "${p}"), !strconcat(".", dt)), + !strconcat("\t", asm)); + let Pattern = pattern; + list Predicates = [HasNEON]; +} + +// Same as NeonI except it does not have a "data type" specifier. +class NeonXI pattern> : InstARM { let OutOperandList = oops; let InOperandList = !con(iops, (ops pred:$p)); - let AsmString = !strconcat(opc, !strconcat("${p}", asm)); + let AsmString = !strconcat(!strconcat(opc, "${p}"), !strconcat("\t", asm)); let Pattern = pattern; list Predicates = [HasNEON]; } class NI pattern> - : NeonI { } -class NI4 pattern> - : NeonI { +class NI4 pattern> + : NeonXI { } class NLdSt op21_20, bits<4> op11_8, bits<4> op7_4, dag oops, dag iops, InstrItinClass itin, - string opc, string asm, string cstr, list pattern> - : NeonI { + string opc, string dt, string asm, string cstr, list pattern> + : NeonI { let Inst{31-24} = 0b11110100; let Inst{23} = op23; let Inst{21-20} = op21_20; @@ -1249,8 +1264,15 @@ } class NDataI pattern> + : NeonI { + let Inst{31-25} = 0b1111001; +} + +class NDataXI pattern> - : NeonI { let Inst{31-25} = 0b1111001; } @@ -1259,8 +1281,8 @@ class N1ModImm op21_19, bits<4> op11_8, bit op7, bit op6, bit op5, bit op4, dag oops, dag iops, InstrItinClass itin, - string opc, string asm, string cstr, list pattern> - : NDataI { + string opc, string dt, string asm, string 
cstr, list pattern> + : NDataI { let Inst{23} = op23; let Inst{21-19} = op21_19; let Inst{11-8} = op11_8; @@ -1274,8 +1296,23 @@ class N2V op24_23, bits<2> op21_20, bits<2> op19_18, bits<2> op17_16, bits<5> op11_7, bit op6, bit op4, dag oops, dag iops, InstrItinClass itin, + string opc, string dt, string asm, string cstr, list pattern> + : NDataI { + let Inst{24-23} = op24_23; + let Inst{21-20} = op21_20; + let Inst{19-18} = op19_18; + let Inst{17-16} = op17_16; + let Inst{11-7} = op11_7; + let Inst{6} = op6; + let Inst{4} = op4; +} + +// Same as N2V except it doesn't have a datatype suffix. +class N2VX op24_23, bits<2> op21_20, bits<2> op19_18, bits<2> op17_16, + bits<5> op11_7, bit op6, bit op4, + dag oops, dag iops, InstrItinClass itin, string opc, string asm, string cstr, list pattern> - : NDataI { + : NDataXI { let Inst{24-23} = op24_23; let Inst{21-20} = op21_20; let Inst{19-18} = op19_18; @@ -1288,8 +1325,8 @@ // NEON 2 vector register with immediate. class N2VImm op11_8, bit op7, bit op6, bit op4, dag oops, dag iops, InstrItinClass itin, - string opc, string asm, string cstr, list pattern> - : NDataI { + string opc, string dt, string asm, string cstr, list pattern> + : NDataI { let Inst{24} = op24; let Inst{23} = op23; let Inst{11-8} = op11_8; @@ -1301,8 +1338,21 @@ // NEON 3 vector register format. class N3V op21_20, bits<4> op11_8, bit op6, bit op4, dag oops, dag iops, InstrItinClass itin, + string opc, string dt, string asm, string cstr, list pattern> + : NDataI { + let Inst{24} = op24; + let Inst{23} = op23; + let Inst{21-20} = op21_20; + let Inst{11-8} = op11_8; + let Inst{6} = op6; + let Inst{4} = op4; +} + +// Same as N3VX except it doesn't have a data type suffix. 
+class N3VX op21_20, bits<4> op11_8, bit op6, bit op4, + dag oops, dag iops, InstrItinClass itin, string opc, string asm, string cstr, list pattern> - : NDataI { + : NDataXI { let Inst{24} = op24; let Inst{23} = op23; let Inst{21-20} = op21_20; @@ -1314,29 +1364,37 @@ // NEON VMOVs between scalar and core registers. class NVLaneOp opcod1, bits<4> opcod2, bits<2> opcod3, dag oops, dag iops, Format f, InstrItinClass itin, - string opc, string asm, list pattern> - : AI { + string opc, string dt, string asm, list pattern> + : InstARM { let Inst{27-20} = opcod1; let Inst{11-8} = opcod2; let Inst{6-5} = opcod3; let Inst{4} = 1; + + let OutOperandList = oops; + let InOperandList = !con(iops, (ops pred:$p)); + let AsmString = !strconcat( + !strconcat(!strconcat(opc, "${p}"), !strconcat(".", dt)), + !strconcat("\t", asm)); + let Pattern = pattern; list Predicates = [HasNEON]; } class NVGetLane opcod1, bits<4> opcod2, bits<2> opcod3, dag oops, dag iops, InstrItinClass itin, - string opc, string asm, list pattern> + string opc, string dt, string asm, list pattern> : NVLaneOp; + opc, dt, asm, pattern>; class NVSetLane opcod1, bits<4> opcod2, bits<2> opcod3, dag oops, dag iops, InstrItinClass itin, - string opc, string asm, list pattern> + string opc, string dt, string asm, list pattern> : NVLaneOp; + opc, dt, asm, pattern>; class NVDup opcod1, bits<4> opcod2, bits<2> opcod3, dag oops, dag iops, InstrItinClass itin, - string opc, string asm, list pattern> + string opc, string dt, string asm, list pattern> : NVLaneOp; + opc, dt, asm, pattern>; // NEONFPPat - Same as Pat<>, but requires that the compiler be using NEON // for single-precision FP. 
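[Editorial sketch, not part of the original message.] The ARMInstrFormats.td hunks above introduce two string templates: one with a data type specifier (NeonI) and one without (NeonXI). The C++ functions below are a hypothetical mirror of those two `!strconcat` chains, plus an assumed stand-in for the step that later replaces the `${p}` placeholder. None of these names exist in LLVM; the real substitution is done by the generated asm printer.

```cpp
#include <string>

// Mirrors the AsmString built by the NeonI class in the diff:
//   opc, "${p}", ".", dt, "\t", asm
std::string neonITemplate(const std::string &opc, const std::string &dt,
                          const std::string &operands) {
  return opc + "${p}" + "." + dt + "\t" + operands;
}

// Mirrors the AsmString built by the NeonXI class in the diff, which has
// no data type specifier:
//   opc, "${p}", "\t", asm
std::string neonXITemplate(const std::string &opc,
                           const std::string &operands) {
  return opc + "${p}" + "\t" + operands;
}

// Hypothetical stand-in for predicate printing: replaces the first "${p}"
// with a condition code (pass an empty string for "always").
std::string expandPredicate(std::string tmpl, const std::string &cond) {
  const std::string placeholder = "${p}";
  const std::string::size_type pos = tmpl.find(placeholder);
  if (pos != std::string::npos)
    tmpl.replace(pos, placeholder.size(), cond);
  return tmpl;
}
```

Under these assumptions, `neonXITemplate("vldmia", "$addr, ${dst:dregpair}")` gives the template used by an NI4-style instruction such as VLDRQ, and `expandPredicate` on it with "ne" turns `vldmia${p}` into `vldmiane`. The operand placeholders are left untouched because only `${p}` is matched.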
Modified: llvm/trunk/lib/Target/ARM/ARMInstrNEON.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrNEON.td?rev=89706&r1=89705&r2=89706&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrNEON.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrNEON.td Mon Nov 23 15:57:23 2009 @@ -146,7 +146,7 @@ // Use vldmia to load a Q register as a D register pair. def VLDRQ : NI4<(outs QPR:$dst), (ins addrmode4:$addr), IIC_fpLoadm, - "vldmia", "\t$addr, ${dst:dregpair}", + "vldmia", "$addr, ${dst:dregpair}", [(set QPR:$dst, (v2f64 (load addrmode4:$addr)))]> { let Inst{27-25} = 0b110; let Inst{24} = 0; // P bit @@ -158,7 +158,7 @@ // Use vstmia to store a Q register as a D register pair. def VSTRQ : NI4<(outs), (ins QPR:$src, addrmode4:$addr), IIC_fpStorem, - "vstmia", "\t$addr, ${src:dregpair}", + "vstmia", "$addr, ${src:dregpair}", [(store (v2f64 QPR:$src), addrmode4:$addr)]> { let Inst{27-25} = 0b110; let Inst{24} = 0; // P bit @@ -168,217 +168,219 @@ } // VLD1 : Vector Load (multiple single elements) -class VLD1D op7_4, string OpcodeStr, ValueType Ty, Intrinsic IntOp> +class VLD1D op7_4, string OpcodeStr, string Dt, + ValueType Ty, Intrinsic IntOp> : NLdSt<0,0b10,0b0111,op7_4, (outs DPR:$dst), (ins addrmode6:$addr), IIC_VLD1, - OpcodeStr, "\t\\{$dst\\}, $addr", "", + OpcodeStr, Dt, "\\{$dst\\}, $addr", "", [(set DPR:$dst, (Ty (IntOp addrmode6:$addr)))]>; -class VLD1Q op7_4, string OpcodeStr, ValueType Ty, Intrinsic IntOp> +class VLD1Q op7_4, string OpcodeStr, string Dt, + ValueType Ty, Intrinsic IntOp> : NLdSt<0,0b10,0b1010,op7_4, (outs QPR:$dst), (ins addrmode6:$addr), IIC_VLD1, - OpcodeStr, "\t${dst:dregpair}, $addr", "", + OpcodeStr, Dt, "${dst:dregpair}, $addr", "", [(set QPR:$dst, (Ty (IntOp addrmode6:$addr)))]>; -def VLD1d8 : VLD1D<0b0000, "vld1.8", v8i8, int_arm_neon_vld1>; -def VLD1d16 : VLD1D<0b0100, "vld1.16", v4i16, int_arm_neon_vld1>; -def VLD1d32 : 
VLD1D<0b1000, "vld1.32", v2i32, int_arm_neon_vld1>; -def VLD1df : VLD1D<0b1000, "vld1.32", v2f32, int_arm_neon_vld1>; -def VLD1d64 : VLD1D<0b1100, "vld1.64", v1i64, int_arm_neon_vld1>; - -def VLD1q8 : VLD1Q<0b0000, "vld1.8", v16i8, int_arm_neon_vld1>; -def VLD1q16 : VLD1Q<0b0100, "vld1.16", v8i16, int_arm_neon_vld1>; -def VLD1q32 : VLD1Q<0b1000, "vld1.32", v4i32, int_arm_neon_vld1>; -def VLD1qf : VLD1Q<0b1000, "vld1.32", v4f32, int_arm_neon_vld1>; -def VLD1q64 : VLD1Q<0b1100, "vld1.64", v2i64, int_arm_neon_vld1>; +def VLD1d8 : VLD1D<0b0000, "vld1", "8", v8i8, int_arm_neon_vld1>; +def VLD1d16 : VLD1D<0b0100, "vld1", "16", v4i16, int_arm_neon_vld1>; +def VLD1d32 : VLD1D<0b1000, "vld1", "32", v2i32, int_arm_neon_vld1>; +def VLD1df : VLD1D<0b1000, "vld1", "32", v2f32, int_arm_neon_vld1>; +def VLD1d64 : VLD1D<0b1100, "vld1", "64", v1i64, int_arm_neon_vld1>; + +def VLD1q8 : VLD1Q<0b0000, "vld1", "8", v16i8, int_arm_neon_vld1>; +def VLD1q16 : VLD1Q<0b0100, "vld1", "16", v8i16, int_arm_neon_vld1>; +def VLD1q32 : VLD1Q<0b1000, "vld1", "32", v4i32, int_arm_neon_vld1>; +def VLD1qf : VLD1Q<0b1000, "vld1", "32", v4f32, int_arm_neon_vld1>; +def VLD1q64 : VLD1Q<0b1100, "vld1", "64", v2i64, int_arm_neon_vld1>; let mayLoad = 1, hasExtraDefRegAllocReq = 1 in { // VLD2 : Vector Load (multiple 2-element structures) -class VLD2D op7_4, string OpcodeStr> +class VLD2D op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b10,0b1000,op7_4, (outs DPR:$dst1, DPR:$dst2), (ins addrmode6:$addr), IIC_VLD2, - OpcodeStr, "\t\\{$dst1,$dst2\\}, $addr", "", []>; -class VLD2Q op7_4, string OpcodeStr> + OpcodeStr, Dt, "\\{$dst1,$dst2\\}, $addr", "", []>; +class VLD2Q op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b10,0b0011,op7_4, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), (ins addrmode6:$addr), IIC_VLD2, - OpcodeStr, "\t\\{$dst1,$dst2,$dst3,$dst4\\}, $addr", + OpcodeStr, Dt, "\\{$dst1,$dst2,$dst3,$dst4\\}, $addr", "", []>; -def VLD2d8 : VLD2D<0b0000, "vld2.8">; -def VLD2d16 : VLD2D<0b0100, 
"vld2.16">; -def VLD2d32 : VLD2D<0b1000, "vld2.32">; +def VLD2d8 : VLD2D<0b0000, "vld2", "8">; +def VLD2d16 : VLD2D<0b0100, "vld2", "16">; +def VLD2d32 : VLD2D<0b1000, "vld2", "32">; def VLD2d64 : NLdSt<0,0b10,0b1010,0b1100, (outs DPR:$dst1, DPR:$dst2), (ins addrmode6:$addr), IIC_VLD1, - "vld1.64", "\t\\{$dst1,$dst2\\}, $addr", "", []>; + "vld1", "64", "\\{$dst1,$dst2\\}, $addr", "", []>; -def VLD2q8 : VLD2Q<0b0000, "vld2.8">; -def VLD2q16 : VLD2Q<0b0100, "vld2.16">; -def VLD2q32 : VLD2Q<0b1000, "vld2.32">; +def VLD2q8 : VLD2Q<0b0000, "vld2", "8">; +def VLD2q16 : VLD2Q<0b0100, "vld2", "16">; +def VLD2q32 : VLD2Q<0b1000, "vld2", "32">; // VLD3 : Vector Load (multiple 3-element structures) -class VLD3D op7_4, string OpcodeStr> +class VLD3D op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b10,0b0100,op7_4, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3), (ins addrmode6:$addr), IIC_VLD3, - OpcodeStr, "\t\\{$dst1,$dst2,$dst3\\}, $addr", "", []>; -class VLD3WB op7_4, string OpcodeStr> + OpcodeStr, Dt, "\\{$dst1,$dst2,$dst3\\}, $addr", "", []>; +class VLD3WB op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b10,0b0101,op7_4, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, GPR:$wb), (ins addrmode6:$addr), IIC_VLD3, - OpcodeStr, "\t\\{$dst1,$dst2,$dst3\\}, $addr", + OpcodeStr, Dt, "\\{$dst1,$dst2,$dst3\\}, $addr", "$addr.addr = $wb", []>; -def VLD3d8 : VLD3D<0b0000, "vld3.8">; -def VLD3d16 : VLD3D<0b0100, "vld3.16">; -def VLD3d32 : VLD3D<0b1000, "vld3.32">; +def VLD3d8 : VLD3D<0b0000, "vld3", "8">; +def VLD3d16 : VLD3D<0b0100, "vld3", "16">; +def VLD3d32 : VLD3D<0b1000, "vld3", "32">; def VLD3d64 : NLdSt<0,0b10,0b0110,0b1100, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3), (ins addrmode6:$addr), IIC_VLD1, - "vld1.64", "\t\\{$dst1,$dst2,$dst3\\}, $addr", "", []>; + "vld1", "64", "\\{$dst1,$dst2,$dst3\\}, $addr", "", []>; // vld3 to double-spaced even registers. 
-def VLD3q8a : VLD3WB<0b0000, "vld3.8">; -def VLD3q16a : VLD3WB<0b0100, "vld3.16">; -def VLD3q32a : VLD3WB<0b1000, "vld3.32">; +def VLD3q8a : VLD3WB<0b0000, "vld3", "8">; +def VLD3q16a : VLD3WB<0b0100, "vld3", "16">; +def VLD3q32a : VLD3WB<0b1000, "vld3", "32">; // vld3 to double-spaced odd registers. -def VLD3q8b : VLD3WB<0b0000, "vld3.8">; -def VLD3q16b : VLD3WB<0b0100, "vld3.16">; -def VLD3q32b : VLD3WB<0b1000, "vld3.32">; +def VLD3q8b : VLD3WB<0b0000, "vld3", "8">; +def VLD3q16b : VLD3WB<0b0100, "vld3", "16">; +def VLD3q32b : VLD3WB<0b1000, "vld3", "32">; // VLD4 : Vector Load (multiple 4-element structures) -class VLD4D op7_4, string OpcodeStr> +class VLD4D op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b10,0b0000,op7_4, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), (ins addrmode6:$addr), IIC_VLD4, - OpcodeStr, "\t\\{$dst1,$dst2,$dst3,$dst4\\}, $addr", + OpcodeStr, Dt, "\\{$dst1,$dst2,$dst3,$dst4\\}, $addr", "", []>; -class VLD4WB op7_4, string OpcodeStr> +class VLD4WB op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b10,0b0001,op7_4, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4, GPR:$wb), (ins addrmode6:$addr), IIC_VLD4, - OpcodeStr, "\t\\{$dst1,$dst2,$dst3,$dst4\\}, $addr", + OpcodeStr, Dt, "\\{$dst1,$dst2,$dst3,$dst4\\}, $addr", "$addr.addr = $wb", []>; -def VLD4d8 : VLD4D<0b0000, "vld4.8">; -def VLD4d16 : VLD4D<0b0100, "vld4.16">; -def VLD4d32 : VLD4D<0b1000, "vld4.32">; +def VLD4d8 : VLD4D<0b0000, "vld4", "8">; +def VLD4d16 : VLD4D<0b0100, "vld4", "16">; +def VLD4d32 : VLD4D<0b1000, "vld4", "32">; def VLD4d64 : NLdSt<0,0b10,0b0010,0b1100, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), (ins addrmode6:$addr), IIC_VLD1, - "vld1.64", "\t\\{$dst1,$dst2,$dst3,$dst4\\}, $addr", "", []>; + "vld1", "64", "\\{$dst1,$dst2,$dst3,$dst4\\}, $addr", "", []>; // vld4 to double-spaced even registers. 
-def VLD4q8a : VLD4WB<0b0000, "vld4.8">; -def VLD4q16a : VLD4WB<0b0100, "vld4.16">; -def VLD4q32a : VLD4WB<0b1000, "vld4.32">; +def VLD4q8a : VLD4WB<0b0000, "vld4", "8">; +def VLD4q16a : VLD4WB<0b0100, "vld4", "16">; +def VLD4q32a : VLD4WB<0b1000, "vld4", "32">; // vld4 to double-spaced odd registers. -def VLD4q8b : VLD4WB<0b0000, "vld4.8">; -def VLD4q16b : VLD4WB<0b0100, "vld4.16">; -def VLD4q32b : VLD4WB<0b1000, "vld4.32">; +def VLD4q8b : VLD4WB<0b0000, "vld4", "8">; +def VLD4q16b : VLD4WB<0b0100, "vld4", "16">; +def VLD4q32b : VLD4WB<0b1000, "vld4", "32">; // VLD1LN : Vector Load (single element to one lane) // FIXME: Not yet implemented. // VLD2LN : Vector Load (single 2-element structure to one lane) -class VLD2LN op11_8, string OpcodeStr> +class VLD2LN op11_8, string OpcodeStr, string Dt> : NLdSt<1,0b10,op11_8,{?,?,?,?}, (outs DPR:$dst1, DPR:$dst2), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), - IIC_VLD2, - OpcodeStr, "\t\\{$dst1[$lane],$dst2[$lane]\\}, $addr", - "$src1 = $dst1, $src2 = $dst2", []>; + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), + IIC_VLD2, + OpcodeStr, Dt, "\\{$dst1[$lane],$dst2[$lane]\\}, $addr", + "$src1 = $dst1, $src2 = $dst2", []>; // vld2 to single-spaced registers. -def VLD2LNd8 : VLD2LN<0b0001, "vld2.8">; -def VLD2LNd16 : VLD2LN<0b0101, "vld2.16"> { +def VLD2LNd8 : VLD2LN<0b0001, "vld2", "8">; +def VLD2LNd16 : VLD2LN<0b0101, "vld2", "16"> { let Inst{5} = 0; } -def VLD2LNd32 : VLD2LN<0b1001, "vld2.32"> { +def VLD2LNd32 : VLD2LN<0b1001, "vld2", "32"> { let Inst{6} = 0; } // vld2 to double-spaced even registers. -def VLD2LNq16a: VLD2LN<0b0101, "vld2.16"> { +def VLD2LNq16a: VLD2LN<0b0101, "vld2", "16"> { let Inst{5} = 1; } -def VLD2LNq32a: VLD2LN<0b1001, "vld2.32"> { +def VLD2LNq32a: VLD2LN<0b1001, "vld2", "32"> { let Inst{6} = 1; } // vld2 to double-spaced odd registers. 
-def VLD2LNq16b: VLD2LN<0b0101, "vld2.16"> { +def VLD2LNq16b: VLD2LN<0b0101, "vld2", "16"> { let Inst{5} = 1; } -def VLD2LNq32b: VLD2LN<0b1001, "vld2.32"> { +def VLD2LNq32b: VLD2LN<0b1001, "vld2", "32"> { let Inst{6} = 1; } // VLD3LN : Vector Load (single 3-element structure to one lane) -class VLD3LN op11_8, string OpcodeStr> +class VLD3LN op11_8, string OpcodeStr, string Dt> : NLdSt<1,0b10,op11_8,{?,?,?,?}, (outs DPR:$dst1, DPR:$dst2, DPR:$dst3), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, - nohash_imm:$lane), IIC_VLD3, - OpcodeStr, - "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane]\\}, $addr", - "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3", []>; + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, + nohash_imm:$lane), IIC_VLD3, + OpcodeStr, Dt, + "\\{$dst1[$lane],$dst2[$lane],$dst3[$lane]\\}, $addr", + "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3", []>; // vld3 to single-spaced registers. -def VLD3LNd8 : VLD3LN<0b0010, "vld3.8"> { +def VLD3LNd8 : VLD3LN<0b0010, "vld3", "8"> { let Inst{4} = 0; } -def VLD3LNd16 : VLD3LN<0b0110, "vld3.16"> { +def VLD3LNd16 : VLD3LN<0b0110, "vld3", "16"> { let Inst{5-4} = 0b00; } -def VLD3LNd32 : VLD3LN<0b1010, "vld3.32"> { +def VLD3LNd32 : VLD3LN<0b1010, "vld3", "32"> { let Inst{6-4} = 0b000; } // vld3 to double-spaced even registers. -def VLD3LNq16a: VLD3LN<0b0110, "vld3.16"> { +def VLD3LNq16a: VLD3LN<0b0110, "vld3", "16"> { let Inst{5-4} = 0b10; } -def VLD3LNq32a: VLD3LN<0b1010, "vld3.32"> { +def VLD3LNq32a: VLD3LN<0b1010, "vld3", "32"> { let Inst{6-4} = 0b100; } // vld3 to double-spaced odd registers. 
-def VLD3LNq16b: VLD3LN<0b0110, "vld3.16"> { +def VLD3LNq16b: VLD3LN<0b0110, "vld3", "16"> { let Inst{5-4} = 0b10; } -def VLD3LNq32b: VLD3LN<0b1010, "vld3.32"> { +def VLD3LNq32b: VLD3LN<0b1010, "vld3", "32"> { let Inst{6-4} = 0b100; } // VLD4LN : Vector Load (single 4-element structure to one lane) -class VLD4LN op11_8, string OpcodeStr> +class VLD4LN op11_8, string OpcodeStr, string Dt> : NLdSt<1,0b10,op11_8,{?,?,?,?}, - (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, - nohash_imm:$lane), IIC_VLD4, - OpcodeStr, - "\t\\{$dst1[$lane],$dst2[$lane],$dst3[$lane],$dst4[$lane]\\}, $addr", - "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3, $src4 = $dst4", []>; + (outs DPR:$dst1, DPR:$dst2, DPR:$dst3, DPR:$dst4), + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4, + nohash_imm:$lane), IIC_VLD4, + OpcodeStr, Dt, + "\\{$dst1[$lane],$dst2[$lane],$dst3[$lane],$dst4[$lane]\\}, $addr", + "$src1 = $dst1, $src2 = $dst2, $src3 = $dst3, $src4 = $dst4", []>; // vld4 to single-spaced registers. -def VLD4LNd8 : VLD4LN<0b0011, "vld4.8">; -def VLD4LNd16 : VLD4LN<0b0111, "vld4.16"> { +def VLD4LNd8 : VLD4LN<0b0011, "vld4", "8">; +def VLD4LNd16 : VLD4LN<0b0111, "vld4", "16"> { let Inst{5} = 0; } -def VLD4LNd32 : VLD4LN<0b1011, "vld4.32"> { +def VLD4LNd32 : VLD4LN<0b1011, "vld4", "32"> { let Inst{6} = 0; } // vld4 to double-spaced even registers. -def VLD4LNq16a: VLD4LN<0b0111, "vld4.16"> { +def VLD4LNq16a: VLD4LN<0b0111, "vld4", "16"> { let Inst{5} = 1; } -def VLD4LNq32a: VLD4LN<0b1011, "vld4.32"> { +def VLD4LNq32a: VLD4LN<0b1011, "vld4", "32"> { let Inst{6} = 1; } // vld4 to double-spaced odd registers. 
-def VLD4LNq16b: VLD4LN<0b0111, "vld4.16"> { +def VLD4LNq16b: VLD4LN<0b0111, "vld4", "16"> { let Inst{5} = 1; } -def VLD4LNq32b: VLD4LN<0b1011, "vld4.32"> { +def VLD4LNq32b: VLD4LN<0b1011, "vld4", "32"> { let Inst{6} = 1; } @@ -390,217 +392,219 @@ } // mayLoad = 1, hasExtraDefRegAllocReq = 1 // VST1 : Vector Store (multiple single elements) -class VST1D op7_4, string OpcodeStr, ValueType Ty, Intrinsic IntOp> +class VST1D op7_4, string OpcodeStr, string Dt, + ValueType Ty, Intrinsic IntOp> : NLdSt<0,0b00,0b0111,op7_4, (outs), (ins addrmode6:$addr, DPR:$src), IIC_VST, - OpcodeStr, "\t\\{$src\\}, $addr", "", + OpcodeStr, Dt, "\\{$src\\}, $addr", "", [(IntOp addrmode6:$addr, (Ty DPR:$src))]>; -class VST1Q op7_4, string OpcodeStr, ValueType Ty, Intrinsic IntOp> +class VST1Q op7_4, string OpcodeStr, string Dt, + ValueType Ty, Intrinsic IntOp> : NLdSt<0,0b00,0b1010,op7_4, (outs), (ins addrmode6:$addr, QPR:$src), IIC_VST, - OpcodeStr, "\t${src:dregpair}, $addr", "", + OpcodeStr, Dt, "${src:dregpair}, $addr", "", [(IntOp addrmode6:$addr, (Ty QPR:$src))]>; let hasExtraSrcRegAllocReq = 1 in { -def VST1d8 : VST1D<0b0000, "vst1.8", v8i8, int_arm_neon_vst1>; -def VST1d16 : VST1D<0b0100, "vst1.16", v4i16, int_arm_neon_vst1>; -def VST1d32 : VST1D<0b1000, "vst1.32", v2i32, int_arm_neon_vst1>; -def VST1df : VST1D<0b1000, "vst1.32", v2f32, int_arm_neon_vst1>; -def VST1d64 : VST1D<0b1100, "vst1.64", v1i64, int_arm_neon_vst1>; - -def VST1q8 : VST1Q<0b0000, "vst1.8", v16i8, int_arm_neon_vst1>; -def VST1q16 : VST1Q<0b0100, "vst1.16", v8i16, int_arm_neon_vst1>; -def VST1q32 : VST1Q<0b1000, "vst1.32", v4i32, int_arm_neon_vst1>; -def VST1qf : VST1Q<0b1000, "vst1.32", v4f32, int_arm_neon_vst1>; -def VST1q64 : VST1Q<0b1100, "vst1.64", v2i64, int_arm_neon_vst1>; +def VST1d8 : VST1D<0b0000, "vst1", "8", v8i8, int_arm_neon_vst1>; +def VST1d16 : VST1D<0b0100, "vst1", "16", v4i16, int_arm_neon_vst1>; +def VST1d32 : VST1D<0b1000, "vst1", "32", v2i32, int_arm_neon_vst1>; +def VST1df : VST1D<0b1000, 
"vst1", "32", v2f32, int_arm_neon_vst1>; +def VST1d64 : VST1D<0b1100, "vst1", "64", v1i64, int_arm_neon_vst1>; + +def VST1q8 : VST1Q<0b0000, "vst1", "8", v16i8, int_arm_neon_vst1>; +def VST1q16 : VST1Q<0b0100, "vst1", "16", v8i16, int_arm_neon_vst1>; +def VST1q32 : VST1Q<0b1000, "vst1", "32", v4i32, int_arm_neon_vst1>; +def VST1qf : VST1Q<0b1000, "vst1", "32", v4f32, int_arm_neon_vst1>; +def VST1q64 : VST1Q<0b1100, "vst1", "64", v2i64, int_arm_neon_vst1>; } // hasExtraSrcRegAllocReq let mayStore = 1, hasExtraSrcRegAllocReq = 1 in { // VST2 : Vector Store (multiple 2-element structures) -class VST2D op7_4, string OpcodeStr> +class VST2D op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b00,0b1000,op7_4, (outs), (ins addrmode6:$addr, DPR:$src1, DPR:$src2), IIC_VST, - OpcodeStr, "\t\\{$src1,$src2\\}, $addr", "", []>; -class VST2Q op7_4, string OpcodeStr> + OpcodeStr, Dt, "\\{$src1,$src2\\}, $addr", "", []>; +class VST2Q op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b00,0b0011,op7_4, (outs), (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4), IIC_VST, - OpcodeStr, "\t\\{$src1,$src2,$src3,$src4\\}, $addr", + OpcodeStr, Dt, "\\{$src1,$src2,$src3,$src4\\}, $addr", "", []>; -def VST2d8 : VST2D<0b0000, "vst2.8">; -def VST2d16 : VST2D<0b0100, "vst2.16">; -def VST2d32 : VST2D<0b1000, "vst2.32">; +def VST2d8 : VST2D<0b0000, "vst2", "8">; +def VST2d16 : VST2D<0b0100, "vst2", "16">; +def VST2d32 : VST2D<0b1000, "vst2", "32">; def VST2d64 : NLdSt<0,0b00,0b1010,0b1100, (outs), (ins addrmode6:$addr, DPR:$src1, DPR:$src2), IIC_VST, - "vst1.64", "\t\\{$src1,$src2\\}, $addr", "", []>; + "vst1", "64", "\\{$src1,$src2\\}, $addr", "", []>; -def VST2q8 : VST2Q<0b0000, "vst2.8">; -def VST2q16 : VST2Q<0b0100, "vst2.16">; -def VST2q32 : VST2Q<0b1000, "vst2.32">; +def VST2q8 : VST2Q<0b0000, "vst2", "8">; +def VST2q16 : VST2Q<0b0100, "vst2", "16">; +def VST2q32 : VST2Q<0b1000, "vst2", "32">; // VST3 : Vector Store (multiple 3-element structures) -class VST3D op7_4, string 
OpcodeStr> +class VST3D op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b00,0b0100,op7_4, (outs), (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3), IIC_VST, - OpcodeStr, "\t\\{$src1,$src2,$src3\\}, $addr", "", []>; -class VST3WB op7_4, string OpcodeStr> + OpcodeStr, Dt, "\\{$src1,$src2,$src3\\}, $addr", "", []>; +class VST3WB op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b00,0b0101,op7_4, (outs GPR:$wb), (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3), IIC_VST, - OpcodeStr, "\t\\{$src1,$src2,$src3\\}, $addr", + OpcodeStr, Dt, "\\{$src1,$src2,$src3\\}, $addr", "$addr.addr = $wb", []>; -def VST3d8 : VST3D<0b0000, "vst3.8">; -def VST3d16 : VST3D<0b0100, "vst3.16">; -def VST3d32 : VST3D<0b1000, "vst3.32">; +def VST3d8 : VST3D<0b0000, "vst3", "8">; +def VST3d16 : VST3D<0b0100, "vst3", "16">; +def VST3d32 : VST3D<0b1000, "vst3", "32">; def VST3d64 : NLdSt<0,0b00,0b0110,0b1100, (outs), (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3), IIC_VST, - "vst1.64", "\t\\{$src1,$src2,$src3\\}, $addr", "", []>; + "vst1", "64", "\\{$src1,$src2,$src3\\}, $addr", "", []>; // vst3 to double-spaced even registers. -def VST3q8a : VST3WB<0b0000, "vst3.8">; -def VST3q16a : VST3WB<0b0100, "vst3.16">; -def VST3q32a : VST3WB<0b1000, "vst3.32">; +def VST3q8a : VST3WB<0b0000, "vst3", "8">; +def VST3q16a : VST3WB<0b0100, "vst3", "16">; +def VST3q32a : VST3WB<0b1000, "vst3", "32">; // vst3 to double-spaced odd registers. 
-def VST3q8b : VST3WB<0b0000, "vst3.8">; -def VST3q16b : VST3WB<0b0100, "vst3.16">; -def VST3q32b : VST3WB<0b1000, "vst3.32">; +def VST3q8b : VST3WB<0b0000, "vst3", "8">; +def VST3q16b : VST3WB<0b0100, "vst3", "16">; +def VST3q32b : VST3WB<0b1000, "vst3", "32">; // VST4 : Vector Store (multiple 4-element structures) -class VST4D op7_4, string OpcodeStr> +class VST4D op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b00,0b0000,op7_4, (outs), (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4), IIC_VST, - OpcodeStr, "\t\\{$src1,$src2,$src3,$src4\\}, $addr", + OpcodeStr, Dt, "\\{$src1,$src2,$src3,$src4\\}, $addr", "", []>; -class VST4WB op7_4, string OpcodeStr> +class VST4WB op7_4, string OpcodeStr, string Dt> : NLdSt<0,0b00,0b0001,op7_4, (outs GPR:$wb), (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4), IIC_VST, - OpcodeStr, "\t\\{$src1,$src2,$src3,$src4\\}, $addr", + OpcodeStr, Dt, "\\{$src1,$src2,$src3,$src4\\}, $addr", "$addr.addr = $wb", []>; -def VST4d8 : VST4D<0b0000, "vst4.8">; -def VST4d16 : VST4D<0b0100, "vst4.16">; -def VST4d32 : VST4D<0b1000, "vst4.32">; +def VST4d8 : VST4D<0b0000, "vst4", "8">; +def VST4d16 : VST4D<0b0100, "vst4", "16">; +def VST4d32 : VST4D<0b1000, "vst4", "32">; def VST4d64 : NLdSt<0,0b00,0b0010,0b1100, (outs), (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4), IIC_VST, - "vst1.64", "\t\\{$src1,$src2,$src3,$src4\\}, $addr", "", []>; + "vst1", "64", "\\{$src1,$src2,$src3,$src4\\}, $addr", "", []>; // vst4 to double-spaced even registers. -def VST4q8a : VST4WB<0b0000, "vst4.8">; -def VST4q16a : VST4WB<0b0100, "vst4.16">; -def VST4q32a : VST4WB<0b1000, "vst4.32">; +def VST4q8a : VST4WB<0b0000, "vst4", "8">; +def VST4q16a : VST4WB<0b0100, "vst4", "16">; +def VST4q32a : VST4WB<0b1000, "vst4", "32">; // vst4 to double-spaced odd registers. 
-def VST4q8b : VST4WB<0b0000, "vst4.8">; -def VST4q16b : VST4WB<0b0100, "vst4.16">; -def VST4q32b : VST4WB<0b1000, "vst4.32">; +def VST4q8b : VST4WB<0b0000, "vst4", "8">; +def VST4q16b : VST4WB<0b0100, "vst4", "16">; +def VST4q32b : VST4WB<0b1000, "vst4", "32">; // VST1LN : Vector Store (single element from one lane) // FIXME: Not yet implemented. // VST2LN : Vector Store (single 2-element structure from one lane) -class VST2LN op11_8, string OpcodeStr> +class VST2LN op11_8, string OpcodeStr, string Dt> : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), - IIC_VST, - OpcodeStr, "\t\\{$src1[$lane],$src2[$lane]\\}, $addr", - "", []>; + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, nohash_imm:$lane), + IIC_VST, + OpcodeStr, Dt, "\\{$src1[$lane],$src2[$lane]\\}, $addr", + "", []>; // vst2 to single-spaced registers. -def VST2LNd8 : VST2LN<0b0001, "vst2.8">; -def VST2LNd16 : VST2LN<0b0101, "vst2.16"> { +def VST2LNd8 : VST2LN<0b0001, "vst2", "8">; +def VST2LNd16 : VST2LN<0b0101, "vst2", "16"> { let Inst{5} = 0; } -def VST2LNd32 : VST2LN<0b1001, "vst2.32"> { +def VST2LNd32 : VST2LN<0b1001, "vst2", "32"> { let Inst{6} = 0; } // vst2 to double-spaced even registers. -def VST2LNq16a: VST2LN<0b0101, "vst2.16"> { +def VST2LNq16a: VST2LN<0b0101, "vst2", "16"> { let Inst{5} = 1; } -def VST2LNq32a: VST2LN<0b1001, "vst2.32"> { +def VST2LNq32a: VST2LN<0b1001, "vst2", "32"> { let Inst{6} = 1; } // vst2 to double-spaced odd registers. 
-def VST2LNq16b: VST2LN<0b0101, "vst2.16"> { +def VST2LNq16b: VST2LN<0b0101, "vst2", "16"> { let Inst{5} = 1; } -def VST2LNq32b: VST2LN<0b1001, "vst2.32"> { +def VST2LNq32b: VST2LN<0b1001, "vst2", "32"> { let Inst{6} = 1; } // VST3LN : Vector Store (single 3-element structure from one lane) -class VST3LN op11_8, string OpcodeStr> +class VST3LN op11_8, string OpcodeStr, string Dt> : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs), - (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, - nohash_imm:$lane), IIC_VST, - OpcodeStr, - "\t\\{$src1[$lane],$src2[$lane],$src3[$lane]\\}, $addr", "", []>; + (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, + nohash_imm:$lane), IIC_VST, + OpcodeStr, Dt, + "\\{$src1[$lane],$src2[$lane],$src3[$lane]\\}, $addr", "", []>; // vst3 to single-spaced registers. -def VST3LNd8 : VST3LN<0b0010, "vst3.8"> { +def VST3LNd8 : VST3LN<0b0010, "vst3", "8"> { let Inst{4} = 0; } -def VST3LNd16 : VST3LN<0b0110, "vst3.16"> { +def VST3LNd16 : VST3LN<0b0110, "vst3", "16"> { let Inst{5-4} = 0b00; } -def VST3LNd32 : VST3LN<0b1010, "vst3.32"> { +def VST3LNd32 : VST3LN<0b1010, "vst3", "32"> { let Inst{6-4} = 0b000; } // vst3 to double-spaced even registers. -def VST3LNq16a: VST3LN<0b0110, "vst3.16"> { +def VST3LNq16a: VST3LN<0b0110, "vst3", "16"> { let Inst{5-4} = 0b10; } -def VST3LNq32a: VST3LN<0b1010, "vst3.32"> { +def VST3LNq32a: VST3LN<0b1010, "vst3", "32"> { let Inst{6-4} = 0b100; } // vst3 to double-spaced odd registers. 
-def VST3LNq16b: VST3LN<0b0110, "vst3.16"> {
+def VST3LNq16b: VST3LN<0b0110, "vst3", "16"> {
   let Inst{5-4} = 0b10;
 }
-def VST3LNq32b: VST3LN<0b1010, "vst3.32"> {
+def VST3LNq32b: VST3LN<0b1010, "vst3", "32"> {
   let Inst{6-4} = 0b100;
 }
 // VST4LN : Vector Store (single 4-element structure from one lane)
-class VST4LN op11_8, string OpcodeStr>
+class VST4LN op11_8, string OpcodeStr, string Dt>
   : NLdSt<1,0b00,op11_8,{?,?,?,?}, (outs),
-    (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4,
-    nohash_imm:$lane), IIC_VST,
-    OpcodeStr,
-    "\t\\{$src1[$lane],$src2[$lane],$src3[$lane],$src4[$lane]\\}, $addr",
-    "", []>;
+    (ins addrmode6:$addr, DPR:$src1, DPR:$src2, DPR:$src3, DPR:$src4,
+    nohash_imm:$lane), IIC_VST,
+    OpcodeStr, Dt,
+    "\\{$src1[$lane],$src2[$lane],$src3[$lane],$src4[$lane]\\}, $addr",
+    "", []>;
 // vst4 to single-spaced registers.
-def VST4LNd8 : VST4LN<0b0011, "vst4.8">;
-def VST4LNd16 : VST4LN<0b0111, "vst4.16"> {
+def VST4LNd8 : VST4LN<0b0011, "vst4", "8">;
+def VST4LNd16 : VST4LN<0b0111, "vst4", "16"> {
   let Inst{5} = 0;
 }
-def VST4LNd32 : VST4LN<0b1011, "vst4.32"> {
+def VST4LNd32 : VST4LN<0b1011, "vst4", "32"> {
   let Inst{6} = 0;
 }
 // vst4 to double-spaced even registers.
-def VST4LNq16a: VST4LN<0b0111, "vst4.16"> {
+def VST4LNq16a: VST4LN<0b0111, "vst4", "16"> {
   let Inst{5} = 1;
 }
-def VST4LNq32a: VST4LN<0b1011, "vst4.32"> {
+def VST4LNq32a: VST4LN<0b1011, "vst4", "32"> {
   let Inst{6} = 1;
 }
 // vst4 to double-spaced odd registers.
-def VST4LNq16b: VST4LN<0b0111, "vst4.16"> {
+def VST4LNq16b: VST4LN<0b0111, "vst4", "16"> {
   let Inst{5} = 1;
 }
-def VST4LNq32b: VST4LN<0b1011, "vst4.32"> {
+def VST4LNq32b: VST4LN<0b1011, "vst4", "32"> {
   let Inst{6} = 1;
 }
@@ -652,25 +656,25 @@
 // Basic 2-register operations, both double- and quad-register.
class N2VD op24_23, bits<2> op21_20, bits<2> op19_18, - bits<2> op17_16, bits<5> op11_7, bit op4, string OpcodeStr, + bits<2> op17_16, bits<5> op11_7, bit op4, string OpcodeStr,string Dt, ValueType ResTy, ValueType OpTy, SDNode OpNode> : N2V; class N2VQ op24_23, bits<2> op21_20, bits<2> op19_18, - bits<2> op17_16, bits<5> op11_7, bit op4, string OpcodeStr, + bits<2> op17_16, bits<5> op11_7, bit op4, string OpcodeStr,string Dt, ValueType ResTy, ValueType OpTy, SDNode OpNode> : N2V; // Basic 2-register operations, scalar single-precision. class N2VDs op24_23, bits<2> op21_20, bits<2> op19_18, - bits<2> op17_16, bits<5> op11_7, bit op4, string OpcodeStr, + bits<2> op17_16, bits<5> op11_7, bit op4, string OpcodeStr,string Dt, ValueType ResTy, ValueType OpTy, SDNode OpNode> : N2V; + IIC_VUNAD, OpcodeStr, Dt, "$dst, $src", "", []>; class N2VDsPat : NEONFPPat<(ResTy (OpNode SPR:$a)), @@ -681,27 +685,27 @@ // Basic 2-register intrinsics, both double- and quad-register. class N2VDInt op24_23, bits<2> op21_20, bits<2> op19_18, bits<2> op17_16, bits<5> op11_7, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N2V; class N2VQInt op24_23, bits<2> op21_20, bits<2> op19_18, bits<2> op17_16, bits<5> op11_7, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N2V; // Basic 2-register intrinsics, scalar single-precision class N2VDInts op24_23, bits<2> op21_20, bits<2> op19_18, bits<2> op17_16, bits<5> op11_7, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N2V; + OpcodeStr, Dt, "$dst, $src", "", []>; class N2VDIntsPat : NEONFPPat<(f32 (OpNode SPR:$a)), @@ -712,49 +716,62 @@ // Narrow 2-register intrinsics. 
class N2VNInt op24_23, bits<2> op21_20, bits<2> op19_18, bits<2> op17_16, bits<5> op11_7, bit op6, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType TyD, ValueType TyQ, Intrinsic IntOp> : N2V; // Long 2-register intrinsics (currently only used for VMOVL). class N2VLInt op24_23, bits<2> op21_20, bits<2> op19_18, bits<2> op17_16, bits<5> op11_7, bit op6, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType TyQ, ValueType TyD, Intrinsic IntOp> : N2V; // 2-register shuffles (VTRN/VZIP/VUZP), both double- and quad-register. -class N2VDShuffle op19_18, bits<5> op11_7, string OpcodeStr> +class N2VDShuffle op19_18, bits<5> op11_7, string OpcodeStr, string Dt> : N2V<0b11, 0b11, op19_18, 0b10, op11_7, 0, 0, (outs DPR:$dst1, DPR:$dst2), (ins DPR:$src1, DPR:$src2), IIC_VPERMD, - OpcodeStr, "\t$dst1, $dst2", + OpcodeStr, Dt, "$dst1, $dst2", "$src1 = $dst1, $src2 = $dst2", []>; class N2VQShuffle op19_18, bits<5> op11_7, - InstrItinClass itin, string OpcodeStr> + InstrItinClass itin, string OpcodeStr, string Dt> : N2V<0b11, 0b11, op19_18, 0b10, op11_7, 1, 0, (outs QPR:$dst1, QPR:$dst2), (ins QPR:$src1, QPR:$src2), itin, - OpcodeStr, "\t$dst1, $dst2", + OpcodeStr, Dt, "$dst1, $dst2", "$src1 = $dst1, $src2 = $dst2", []>; // Basic 3-register operations, both double- and quad-register. class N3VD op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, ValueType ResTy, ValueType OpTy, + InstrItinClass itin, string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, SDNode OpNode, bit Commutable> : N3V { + let isCommutable = Commutable; +} +// Same as N3VD but no data type. 
+class N3VDX op21_20, bits<4> op11_8, bit op4, + InstrItinClass itin, string OpcodeStr, + ValueType ResTy, ValueType OpTy, + SDNode OpNode, bit Commutable> + : N3VX { let isCommutable = Commutable; } class N3VDSL op21_20, bits<4> op11_8, - InstrItinClass itin, string OpcodeStr, ValueType Ty, SDNode ShOp> + InstrItinClass itin, string OpcodeStr, string Dt, + ValueType Ty, SDNode ShOp> : N3V<0, 1, op21_20, op11_8, 1, 0, (outs DPR:$dst), (ins DPR:$src1, DPR_VFP2:$src2, nohash_imm:$lane), - itin, OpcodeStr, "\t$dst, $src1, $src2[$lane]", "", + itin, OpcodeStr, Dt, "$dst, $src1, $src2[$lane]", "", [(set (Ty DPR:$dst), (Ty (ShOp (Ty DPR:$src1), (Ty (NEONvduplane (Ty DPR_VFP2:$src2), @@ -762,11 +779,11 @@ let isCommutable = 0; } class N3VDSL16 op21_20, bits<4> op11_8, - string OpcodeStr, ValueType Ty, SDNode ShOp> + string OpcodeStr, string Dt, ValueType Ty, SDNode ShOp> : N3V<0, 1, op21_20, op11_8, 1, 0, (outs DPR:$dst), (ins DPR:$src1, DPR_8:$src2, nohash_imm:$lane), IIC_VMULi16D, - OpcodeStr, "\t$dst, $src1, $src2[$lane]", "", + OpcodeStr, Dt, "$dst, $src1, $src2[$lane]", "", [(set (Ty DPR:$dst), (Ty (ShOp (Ty DPR:$src1), (Ty (NEONvduplane (Ty DPR_8:$src2), @@ -775,20 +792,31 @@ } class N3VQ op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, ValueType ResTy, ValueType OpTy, + InstrItinClass itin, string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, SDNode OpNode, bit Commutable> : N3V { + let isCommutable = Commutable; +} +class N3VQX op21_20, bits<4> op11_8, bit op4, + InstrItinClass itin, string OpcodeStr, + ValueType ResTy, ValueType OpTy, + SDNode OpNode, bit Commutable> + : N3VX { let isCommutable = Commutable; } class N3VQSL op21_20, bits<4> op11_8, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, SDNode ShOp> : N3V<1, 1, op21_20, op11_8, 1, 0, (outs QPR:$dst), (ins QPR:$src1, DPR_VFP2:$src2, nohash_imm:$lane), - itin, OpcodeStr, "\t$dst, $src1, 
$src2[$lane]", "", + itin, OpcodeStr, Dt, "$dst, $src1, $src2[$lane]", "", [(set (ResTy QPR:$dst), (ResTy (ShOp (ResTy QPR:$src1), (ResTy (NEONvduplane (OpTy DPR_VFP2:$src2), @@ -796,11 +824,12 @@ let isCommutable = 0; } class N3VQSL16 op21_20, bits<4> op11_8, - string OpcodeStr, ValueType ResTy, ValueType OpTy, SDNode ShOp> + string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, SDNode ShOp> : N3V<1, 1, op21_20, op11_8, 1, 0, (outs QPR:$dst), (ins QPR:$src1, DPR_8:$src2, nohash_imm:$lane), IIC_VMULi16Q, - OpcodeStr, "\t$dst, $src1, $src2[$lane]", "", + OpcodeStr, Dt, "$dst, $src1, $src2[$lane]", "", [(set (ResTy QPR:$dst), (ResTy (ShOp (ResTy QPR:$src1), (ResTy (NEONvduplane (OpTy DPR_8:$src2), @@ -810,11 +839,11 @@ // Basic 3-register operations, scalar single-precision class N3VDs op21_20, bits<4> op11_8, bit op4, - string OpcodeStr, ValueType ResTy, ValueType OpTy, + string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, SDNode OpNode, bit Commutable> : N3V { + OpcodeStr, Dt, "$dst, $src1, $src2", "", []> { let isCommutable = Commutable; } class N3VDsPat @@ -826,19 +855,20 @@ // Basic 3-register intrinsics, both double- and quad-register. 
class N3VDInt op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, ValueType ResTy, ValueType OpTy, + InstrItinClass itin, string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, Intrinsic IntOp, bit Commutable> : N3V { let isCommutable = Commutable; } class N3VDIntSL op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType Ty, Intrinsic IntOp> + string OpcodeStr, string Dt, ValueType Ty, Intrinsic IntOp> : N3V<0, 1, op21_20, op11_8, 1, 0, (outs DPR:$dst), (ins DPR:$src1, DPR_VFP2:$src2, nohash_imm:$lane), - itin, OpcodeStr, "\t$dst, $src1, $src2[$lane]", "", + itin, OpcodeStr, Dt, "$dst, $src1, $src2[$lane]", "", [(set (Ty DPR:$dst), (Ty (IntOp (Ty DPR:$src1), (Ty (NEONvduplane (Ty DPR_VFP2:$src2), @@ -846,10 +876,10 @@ let isCommutable = 0; } class N3VDIntSL16 op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType Ty, Intrinsic IntOp> + string OpcodeStr, string Dt, ValueType Ty, Intrinsic IntOp> : N3V<0, 1, op21_20, op11_8, 1, 0, (outs DPR:$dst), (ins DPR:$src1, DPR_8:$src2, nohash_imm:$lane), - itin, OpcodeStr, "\t$dst, $src1, $src2[$lane]", "", + itin, OpcodeStr, Dt, "$dst, $src1, $src2[$lane]", "", [(set (Ty DPR:$dst), (Ty (IntOp (Ty DPR:$src1), (Ty (NEONvduplane (Ty DPR_8:$src2), @@ -858,19 +888,21 @@ } class N3VQInt op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, ValueType ResTy, ValueType OpTy, + InstrItinClass itin, string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, Intrinsic IntOp, bit Commutable> : N3V { let isCommutable = Commutable; } class N3VQIntSL op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> + string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N3V<1, 1, op21_20, op11_8, 1, 0, (outs QPR:$dst), (ins QPR:$src1, DPR_VFP2:$src2, nohash_imm:$lane), - itin, OpcodeStr, "\t$dst, $src1, $src2[$lane]", "", + itin, OpcodeStr, Dt, "$dst, $src1, 
$src2[$lane]", "", [(set (ResTy QPR:$dst), (ResTy (IntOp (ResTy QPR:$src1), (ResTy (NEONvduplane (OpTy DPR_VFP2:$src2), @@ -878,10 +910,11 @@ let isCommutable = 0; } class N3VQIntSL16 op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> + string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N3V<1, 1, op21_20, op11_8, 1, 0, (outs QPR:$dst), (ins QPR:$src1, DPR_8:$src2, nohash_imm:$lane), - itin, OpcodeStr, "\t$dst, $src1, $src2[$lane]", "", + itin, OpcodeStr, Dt, "$dst, $src1, $src2[$lane]", "", [(set (ResTy QPR:$dst), (ResTy (IntOp (ResTy QPR:$src1), (ResTy (NEONvduplane (OpTy DPR_8:$src2), @@ -891,30 +924,32 @@ // Multiply-Add/Sub operations, both double- and quad-register. class N3VDMulOp op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType Ty, SDNode MulOp, SDNode OpNode> : N3V; class N3VDMulOpSL op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType Ty, SDNode MulOp, SDNode ShOp> + string OpcodeStr, string Dt, + ValueType Ty, SDNode MulOp, SDNode ShOp> : N3V<0, 1, op21_20, op11_8, 1, 0, (outs DPR:$dst), (ins DPR:$src1, DPR:$src2, DPR_VFP2:$src3, nohash_imm:$lane), itin, - OpcodeStr, "\t$dst, $src2, $src3[$lane]", "$src1 = $dst", + OpcodeStr, Dt, "$dst, $src2, $src3[$lane]", "$src1 = $dst", [(set (Ty DPR:$dst), (Ty (ShOp (Ty DPR:$src1), (Ty (MulOp DPR:$src2, (Ty (NEONvduplane (Ty DPR_VFP2:$src3), imm:$lane)))))))]>; class N3VDMulOpSL16 op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType Ty, SDNode MulOp, SDNode ShOp> + string OpcodeStr, string Dt, + ValueType Ty, SDNode MulOp, SDNode ShOp> : N3V<0, 1, op21_20, op11_8, 1, 0, (outs DPR:$dst), (ins DPR:$src1, DPR:$src2, DPR_8:$src3, nohash_imm:$lane), itin, - OpcodeStr, "\t$dst, $src2, $src3[$lane]", "$src1 = $dst", + OpcodeStr, Dt, "$dst, $src2, $src3[$lane]", "$src1 = $dst", [(set (Ty DPR:$dst), 
(Ty (ShOp (Ty DPR:$src1), (Ty (MulOp DPR:$src2, @@ -922,32 +957,33 @@ imm:$lane)))))))]>; class N3VQMulOp op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, ValueType Ty, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType Ty, SDNode MulOp, SDNode OpNode> : N3V; class N3VQMulOpSL op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType ResTy, ValueType OpTy, + string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, SDNode MulOp, SDNode ShOp> : N3V<1, 1, op21_20, op11_8, 1, 0, (outs QPR:$dst), (ins QPR:$src1, QPR:$src2, DPR_VFP2:$src3, nohash_imm:$lane), itin, - OpcodeStr, "\t$dst, $src2, $src3[$lane]", "$src1 = $dst", + OpcodeStr, Dt, "$dst, $src2, $src3[$lane]", "$src1 = $dst", [(set (ResTy QPR:$dst), (ResTy (ShOp (ResTy QPR:$src1), (ResTy (MulOp QPR:$src2, (ResTy (NEONvduplane (OpTy DPR_VFP2:$src3), imm:$lane)))))))]>; class N3VQMulOpSL16 op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType ResTy, ValueType OpTy, + string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, SDNode MulOp, SDNode ShOp> : N3V<1, 1, op21_20, op11_8, 1, 0, (outs QPR:$dst), (ins QPR:$src1, QPR:$src2, DPR_8:$src3, nohash_imm:$lane), itin, - OpcodeStr, "\t$dst, $src2, $src3[$lane]", "$src1 = $dst", + OpcodeStr, Dt, "$dst, $src2, $src3[$lane]", "$src1 = $dst", [(set (ResTy QPR:$dst), (ResTy (ShOp (ResTy QPR:$src1), (ResTy (MulOp QPR:$src2, @@ -956,12 +992,12 @@ // Multiply-Add/Sub operations, scalar single-precision class N3VDMulOps op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType Ty, SDNode MulOp, SDNode OpNode> : N3V; + OpcodeStr, Dt, "$dst, $src2, $src3", "$src1 = $dst", []>; class N3VDMulOpsPat : NEONFPPat<(f32 (OpNode SPR:$acc, (f32 (MulNode SPR:$a, SPR:$b)))), @@ -974,50 +1010,51 @@ // Neon 3-argument intrinsics, both double- and quad-register. 
// The destination register is also used as the first source operand register. class N3VDInt3 op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N3V; class N3VQInt3 op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N3V; // Neon Long 3-argument intrinsic. The destination register is // a quad-register and is also used as the first source operand register. class N3VLInt3 op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType TyQ, ValueType TyD, Intrinsic IntOp> : N3V; class N3VLInt3SL op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> + string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N3V; class N3VLInt3SL16 op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType ResTy, ValueType OpTy, + string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N3V op21_20, bits<4> op11_8, bit op4, - string OpcodeStr, ValueType TyD, ValueType TyQ, + string OpcodeStr, string Dt, ValueType TyD, ValueType TyQ, Intrinsic IntOp, bit Commutable> : N3V { let isCommutable = Commutable; } // Long 3-register intrinsics. 
class N3VLInt op21_20, bits<4> op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, ValueType TyQ, ValueType TyD, - Intrinsic IntOp, bit Commutable> + InstrItinClass itin, string OpcodeStr, string Dt, + ValueType TyQ, ValueType TyD, Intrinsic IntOp, bit Commutable> : N3V { let isCommutable = Commutable; } class N3VLIntSL op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> + string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N3V; class N3VLIntSL16 op21_20, bits<4> op11_8, InstrItinClass itin, - string OpcodeStr, ValueType ResTy, ValueType OpTy, + string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N3V op21_20, bits<4> op11_8, bit op4, - string OpcodeStr, ValueType TyQ, ValueType TyD, + string OpcodeStr, string Dt, ValueType TyQ, ValueType TyD, Intrinsic IntOp, bit Commutable> : N3V { let isCommutable = Commutable; } // Pairwise long 2-register intrinsics, both double- and quad-register. class N2VDPLInt op24_23, bits<2> op21_20, bits<2> op19_18, - bits<2> op17_16, bits<5> op11_7, bit op4, string OpcodeStr, + bits<2> op17_16, bits<5> op11_7, bit op4, + string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N2V; class N2VQPLInt op24_23, bits<2> op21_20, bits<2> op19_18, - bits<2> op17_16, bits<5> op11_7, bit op4, string OpcodeStr, + bits<2> op17_16, bits<5> op11_7, bit op4, + string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N2V; // Pairwise long 2-register accumulate intrinsics, // both double- and quad-register. // The destination register is also used as the first source operand register. 
class N2VDPLInt2 op24_23, bits<2> op21_20, bits<2> op19_18, - bits<2> op17_16, bits<5> op11_7, bit op4, string OpcodeStr, + bits<2> op17_16, bits<5> op11_7, bit op4, + string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N2V; class N2VQPLInt2 op24_23, bits<2> op21_20, bits<2> op19_18, - bits<2> op17_16, bits<5> op11_7, bit op4, string OpcodeStr, + bits<2> op17_16, bits<5> op11_7, bit op4, + string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N2V; // Shift by immediate, // both double- and quad-register. class N2VDSh op11_8, bit op7, bit op4, - InstrItinClass itin, string OpcodeStr, ValueType Ty, SDNode OpNode> + InstrItinClass itin, string OpcodeStr, string Dt, + ValueType Ty, SDNode OpNode> : N2VImm; class N2VQSh op11_8, bit op7, bit op4, - InstrItinClass itin, string OpcodeStr, ValueType Ty, SDNode OpNode> + InstrItinClass itin, string OpcodeStr, string Dt, + ValueType Ty, SDNode OpNode> : N2VImm; // Long shift by immediate. class N2VLSh op11_8, bit op7, bit op6, bit op4, - string OpcodeStr, ValueType ResTy, ValueType OpTy, SDNode OpNode> + string OpcodeStr, string Dt, + ValueType ResTy, ValueType OpTy, SDNode OpNode> : N2VImm; // Narrow shift by immediate. class N2VNSh op11_8, bit op7, bit op6, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, SDNode OpNode> : N2VImm; // Shift right by immediate and accumulate, // both double- and quad-register. class N2VDShAdd op11_8, bit op7, bit op4, - string OpcodeStr, ValueType Ty, SDNode ShOp> + string OpcodeStr, string Dt, ValueType Ty, SDNode ShOp> : N2VImm; class N2VQShAdd op11_8, bit op7, bit op4, - string OpcodeStr, ValueType Ty, SDNode ShOp> + string OpcodeStr, string Dt, ValueType Ty, SDNode ShOp> : N2VImm; // Shift by immediate and insert, // both double- and quad-register. 
class N2VDShIns op11_8, bit op7, bit op4, - string OpcodeStr, ValueType Ty, SDNode ShOp> + string OpcodeStr, string Dt, ValueType Ty, SDNode ShOp> : N2VImm; class N2VQShIns op11_8, bit op7, bit op4, - string OpcodeStr, ValueType Ty, SDNode ShOp> + string OpcodeStr, string Dt, ValueType Ty, SDNode ShOp> : N2VImm; // Convert, with fractional bits immediate, // both double- and quad-register. class N2VCvtD op11_8, bit op7, bit op4, - string OpcodeStr, ValueType ResTy, ValueType OpTy, + string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N2VImm; class N2VCvtQ op11_8, bit op7, bit op4, - string OpcodeStr, ValueType ResTy, ValueType OpTy, + string OpcodeStr, string Dt, ValueType ResTy, ValueType OpTy, Intrinsic IntOp> : N2VImm; //===----------------------------------------------------------------------===// @@ -1208,44 +1253,55 @@ multiclass N3V_QHS op11_8, bit op4, InstrItinClass itinD16, InstrItinClass itinD32, InstrItinClass itinQ16, InstrItinClass itinQ32, - string OpcodeStr, SDNode OpNode, bit Commutable = 0> { + string OpcodeStr, string Dt, + SDNode OpNode, bit Commutable = 0> { // 64-bit vector types. def v8i8 : N3VD; + OpcodeStr, !strconcat(Dt, "8"), + v8i8, v8i8, OpNode, Commutable>; def v4i16 : N3VD; + OpcodeStr, !strconcat(Dt, "16"), + v4i16, v4i16, OpNode, Commutable>; def v2i32 : N3VD; + OpcodeStr, !strconcat(Dt, "32"), + v2i32, v2i32, OpNode, Commutable>; // 128-bit vector types. 
def v16i8 : N3VQ; + OpcodeStr, !strconcat(Dt, "8"), + v16i8, v16i8, OpNode, Commutable>; def v8i16 : N3VQ; + OpcodeStr, !strconcat(Dt, "16"), + v8i16, v8i16, OpNode, Commutable>; def v4i32 : N3VQ; + OpcodeStr, !strconcat(Dt, "32"), + v4i32, v4i32, OpNode, Commutable>; } -multiclass N3VSL_HS op11_8, string OpcodeStr, SDNode ShOp> { - def v4i16 : N3VDSL16<0b01, op11_8, !strconcat(OpcodeStr, "16"), v4i16, ShOp>; - def v2i32 : N3VDSL<0b10, op11_8, IIC_VMULi32D, !strconcat(OpcodeStr, "32"), +multiclass N3VSL_HS op11_8, string OpcodeStr, string Dt, SDNode ShOp> { + def v4i16 : N3VDSL16<0b01, op11_8, OpcodeStr, !strconcat(Dt, "16"), + v4i16, ShOp>; + def v2i32 : N3VDSL<0b10, op11_8, IIC_VMULi32D, OpcodeStr, !strconcat(Dt,"32"), v2i32, ShOp>; - def v8i16 : N3VQSL16<0b01, op11_8, !strconcat(OpcodeStr, "16"), + def v8i16 : N3VQSL16<0b01, op11_8, OpcodeStr, !strconcat(Dt, "16"), v8i16, v4i16, ShOp>; - def v4i32 : N3VQSL<0b10, op11_8, IIC_VMULi32Q, !strconcat(OpcodeStr, "32"), + def v4i32 : N3VQSL<0b10, op11_8, IIC_VMULi32Q, OpcodeStr, !strconcat(Dt,"32"), v4i32, v2i32, ShOp>; } // ....then also with element size 64 bits: multiclass N3V_QHSD op11_8, bit op4, InstrItinClass itinD, InstrItinClass itinQ, - string OpcodeStr, SDNode OpNode, bit Commutable = 0> + string OpcodeStr, string Dt, + SDNode OpNode, bit Commutable = 0> : N3V_QHS { + OpcodeStr, Dt, OpNode, Commutable> { def v1i64 : N3VD; + OpcodeStr, !strconcat(Dt, "64"), + v1i64, v1i64, OpNode, Commutable>; def v2i64 : N3VQ; + OpcodeStr, !strconcat(Dt, "64"), + v2i64, v2i64, OpNode, Commutable>; } @@ -1253,27 +1309,30 @@ // source operand element sizes of 16, 32 and 64 bits: multiclass N2VNInt_HSD op24_23, bits<2> op21_20, bits<2> op17_16, bits<5> op11_7, bit op6, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, Intrinsic IntOp> { def v8i8 : N2VNInt; + itin, OpcodeStr, !strconcat(Dt, "16"), + v8i8, v8i16, IntOp>; def v4i16 : N2VNInt; + itin, OpcodeStr, !strconcat(Dt, 
"32"), + v4i16, v4i32, IntOp>; def v2i32 : N2VNInt; + itin, OpcodeStr, !strconcat(Dt, "64"), + v2i32, v2i64, IntOp>; } // Neon Lengthening 2-register vector intrinsic (currently specific to VMOVL). // source operand element sizes of 16, 32 and 64 bits: multiclass N2VLInt_QHS op24_23, bits<5> op11_7, bit op6, bit op4, - string OpcodeStr, Intrinsic IntOp> { + string OpcodeStr, string Dt, Intrinsic IntOp> { def v8i16 : N2VLInt; + OpcodeStr, !strconcat(Dt, "8"), v8i16, v8i8, IntOp>; def v4i32 : N2VLInt; + OpcodeStr, !strconcat(Dt, "16"), v4i32, v4i16, IntOp>; def v2i64 : N2VLInt; + OpcodeStr, !strconcat(Dt, "32"), v2i64, v2i32, IntOp>; } @@ -1283,74 +1342,85 @@ multiclass N3VInt_HS op11_8, bit op4, InstrItinClass itinD16, InstrItinClass itinD32, InstrItinClass itinQ16, InstrItinClass itinQ32, - string OpcodeStr, Intrinsic IntOp, bit Commutable = 0> { + string OpcodeStr, string Dt, + Intrinsic IntOp, bit Commutable = 0> { // 64-bit vector types. def v4i16 : N3VDInt; def v2i32 : N3VDInt; // 128-bit vector types. 
def v8i16 : N3VQInt; def v4i32 : N3VQInt; } multiclass N3VIntSL_HS op11_8, InstrItinClass itinD16, InstrItinClass itinD32, InstrItinClass itinQ16, InstrItinClass itinQ32, - string OpcodeStr, Intrinsic IntOp> { + string OpcodeStr, string Dt, Intrinsic IntOp> { def v4i16 : N3VDIntSL16<0b01, op11_8, itinD16, - !strconcat(OpcodeStr, "16"), v4i16, IntOp>; + OpcodeStr, !strconcat(Dt, "16"), v4i16, IntOp>; def v2i32 : N3VDIntSL<0b10, op11_8, itinD32, - !strconcat(OpcodeStr, "32"), v2i32, IntOp>; + OpcodeStr, !strconcat(Dt, "32"), v2i32, IntOp>; def v8i16 : N3VQIntSL16<0b01, op11_8, itinQ16, - !strconcat(OpcodeStr, "16"), v8i16, v4i16, IntOp>; + OpcodeStr, !strconcat(Dt, "16"), v8i16, v4i16, IntOp>; def v4i32 : N3VQIntSL<0b10, op11_8, itinQ32, - !strconcat(OpcodeStr, "32"), v4i32, v2i32, IntOp>; + OpcodeStr, !strconcat(Dt, "32"), v4i32, v2i32, IntOp>; } // ....then also with element size of 8 bits: multiclass N3VInt_QHS op11_8, bit op4, InstrItinClass itinD16, InstrItinClass itinD32, InstrItinClass itinQ16, InstrItinClass itinQ32, - string OpcodeStr, Intrinsic IntOp, bit Commutable = 0> + string OpcodeStr, string Dt, + Intrinsic IntOp, bit Commutable = 0> : N3VInt_HS { + OpcodeStr, Dt, IntOp, Commutable> { def v8i8 : N3VDInt; + OpcodeStr, !strconcat(Dt, "8"), + v8i8, v8i8, IntOp, Commutable>; def v16i8 : N3VQInt; + OpcodeStr, !strconcat(Dt, "8"), + v16i8, v16i8, IntOp, Commutable>; } // ....then also with element size of 64 bits: multiclass N3VInt_QHSD op11_8, bit op4, InstrItinClass itinD16, InstrItinClass itinD32, InstrItinClass itinQ16, InstrItinClass itinQ32, - string OpcodeStr, Intrinsic IntOp, bit Commutable = 0> + string OpcodeStr, string Dt, + Intrinsic IntOp, bit Commutable = 0> : N3VInt_QHS { + OpcodeStr, Dt, IntOp, Commutable> { def v1i64 : N3VDInt; + OpcodeStr, !strconcat(Dt, "64"), + v1i64, v1i64, IntOp, Commutable>; def v2i64 : N3VQInt; + OpcodeStr, !strconcat(Dt, "64"), + v2i64, v2i64, IntOp, Commutable>; } // Neon Narrowing 3-register vector intrinsics, // 
source operand element sizes of 16, 32 and 64 bits: multiclass N3VNInt_HSD op11_8, bit op4, - string OpcodeStr, Intrinsic IntOp, bit Commutable = 0> { - def v8i8 : N3VNInt { + def v8i8 : N3VNInt; - def v4i16 : N3VNInt; - def v2i32 : N3VNInt; } @@ -1359,41 +1429,50 @@ // First with only element sizes of 16 and 32 bits: multiclass N3VLInt_HS op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, Intrinsic IntOp, bit Commutable = 0> { def v4i32 : N3VLInt; + OpcodeStr, !strconcat(Dt, "16"), + v4i32, v4i16, IntOp, Commutable>; def v2i64 : N3VLInt; + OpcodeStr, !strconcat(Dt, "32"), + v2i64, v2i32, IntOp, Commutable>; } multiclass N3VLIntSL_HS op11_8, - InstrItinClass itin, string OpcodeStr, Intrinsic IntOp> { + InstrItinClass itin, string OpcodeStr, string Dt, + Intrinsic IntOp> { def v4i16 : N3VLIntSL16; + OpcodeStr, !strconcat(Dt, "16"), v4i32, v4i16, IntOp>; def v2i32 : N3VLIntSL; + OpcodeStr, !strconcat(Dt, "32"), v2i64, v2i32, IntOp>; } // ....then also with element size of 8 bits: multiclass N3VLInt_QHS op11_8, bit op4, - InstrItinClass itin, string OpcodeStr, + InstrItinClass itin, string OpcodeStr, string Dt, Intrinsic IntOp, bit Commutable = 0> - : N3VLInt_HS { + : N3VLInt_HS { def v8i16 : N3VLInt; + OpcodeStr, !strconcat(Dt, "8"), + v8i16, v8i8, IntOp, Commutable>; } // Neon Wide 3-register vector intrinsics, // source operand element sizes of 8, 16 and 32 bits: multiclass N3VWInt_QHS op11_8, bit op4, - string OpcodeStr, Intrinsic IntOp, bit Commutable = 0> { - def v8i16 : N3VWInt { + def v8i16 : N3VWInt; - def v4i32 : N3VWInt; - def v2i64 : N3VWInt; } @@ -1403,57 +1482,57 @@ multiclass N3VMulOp_QHS op11_8, bit op4, InstrItinClass itinD16, InstrItinClass itinD32, InstrItinClass itinQ16, InstrItinClass itinQ32, - string OpcodeStr, SDNode OpNode> { + string OpcodeStr, string Dt, SDNode OpNode> { // 64-bit vector types. 
def v8i8 : N3VDMulOp; + OpcodeStr, !strconcat(Dt, "8"), v8i8, mul, OpNode>; def v4i16 : N3VDMulOp; + OpcodeStr, !strconcat(Dt, "16"), v4i16, mul, OpNode>; def v2i32 : N3VDMulOp; + OpcodeStr, !strconcat(Dt, "32"), v2i32, mul, OpNode>; // 128-bit vector types. def v16i8 : N3VQMulOp; + OpcodeStr, !strconcat(Dt, "8"), v16i8, mul, OpNode>; def v8i16 : N3VQMulOp; + OpcodeStr, !strconcat(Dt, "16"), v8i16, mul, OpNode>; def v4i32 : N3VQMulOp; + OpcodeStr, !strconcat(Dt, "32"), v4i32, mul, OpNode>; } multiclass N3VMulOpSL_HS op11_8, InstrItinClass itinD16, InstrItinClass itinD32, InstrItinClass itinQ16, InstrItinClass itinQ32, - string OpcodeStr, SDNode ShOp> { + string OpcodeStr, string Dt, SDNode ShOp> { def v4i16 : N3VDMulOpSL16<0b01, op11_8, itinD16, - !strconcat(OpcodeStr, "16"), v4i16, mul, ShOp>; + OpcodeStr, !strconcat(Dt, "16"), v4i16, mul, ShOp>; def v2i32 : N3VDMulOpSL<0b10, op11_8, itinD32, - !strconcat(OpcodeStr, "32"), v2i32, mul, ShOp>; + OpcodeStr, !strconcat(Dt, "32"), v2i32, mul, ShOp>; def v8i16 : N3VQMulOpSL16<0b01, op11_8, itinQ16, - !strconcat(OpcodeStr, "16"), v8i16, v4i16, mul, ShOp>; + OpcodeStr, !strconcat(Dt, "16"), v8i16, v4i16, mul, ShOp>; def v4i32 : N3VQMulOpSL<0b10, op11_8, itinQ32, - !strconcat(OpcodeStr, "32"), v4i32, v2i32, mul, ShOp>; + OpcodeStr, !strconcat(Dt, "32"), v4i32, v2i32, mul, ShOp>; } // Neon 3-argument intrinsics, // element sizes of 8, 16 and 32 bits: multiclass N3VInt3_QHS op11_8, bit op4, - string OpcodeStr, Intrinsic IntOp> { + string OpcodeStr, string Dt, Intrinsic IntOp> { // 64-bit vector types. def v8i8 : N3VDInt3; + OpcodeStr, !strconcat(Dt, "8"), v8i8, v8i8, IntOp>; def v4i16 : N3VDInt3; + OpcodeStr, !strconcat(Dt, "16"), v4i16, v4i16, IntOp>; def v2i32 : N3VDInt3; + OpcodeStr, !strconcat(Dt, "32"), v2i32, v2i32, IntOp>; // 128-bit vector types. 
def v16i8 : N3VQInt3; + OpcodeStr, !strconcat(Dt, "8"), v16i8, v16i8, IntOp>; def v8i16 : N3VQInt3; + OpcodeStr, !strconcat(Dt, "16"), v8i16, v8i16, IntOp>; def v4i32 : N3VQInt3; + OpcodeStr, !strconcat(Dt, "32"), v4i32, v4i32, IntOp>; } @@ -1461,27 +1540,27 @@ // First with only element sizes of 16 and 32 bits: multiclass N3VLInt3_HS op11_8, bit op4, - string OpcodeStr, Intrinsic IntOp> { + string OpcodeStr, string Dt, Intrinsic IntOp> { def v4i32 : N3VLInt3; + OpcodeStr, !strconcat(Dt, "16"), v4i32, v4i16, IntOp>; def v2i64 : N3VLInt3; + OpcodeStr, !strconcat(Dt, "32"), v2i64, v2i32, IntOp>; } multiclass N3VLInt3SL_HS op11_8, - string OpcodeStr, Intrinsic IntOp> { + string OpcodeStr, string Dt, Intrinsic IntOp> { def v4i16 : N3VLInt3SL16; + OpcodeStr, !strconcat(Dt,"16"), v4i32, v4i16, IntOp>; def v2i32 : N3VLInt3SL; + OpcodeStr, !strconcat(Dt, "32"), v2i64, v2i32, IntOp>; } // ....then also with element size of 8 bits: multiclass N3VLInt3_QHS op11_8, bit op4, - string OpcodeStr, Intrinsic IntOp> - : N3VLInt3_HS { + string OpcodeStr, string Dt, Intrinsic IntOp> + : N3VLInt3_HS { def v8i16 : N3VLInt3; + OpcodeStr, !strconcat(Dt, "8"), v8i16, v8i8, IntOp>; } @@ -1490,22 +1569,22 @@ multiclass N2VInt_QHS op24_23, bits<2> op21_20, bits<2> op17_16, bits<5> op11_7, bit op4, InstrItinClass itinD, InstrItinClass itinQ, - string OpcodeStr, Intrinsic IntOp> { + string OpcodeStr, string Dt, Intrinsic IntOp> { // 64-bit vector types. def v8i8 : N2VDInt; + itinD, OpcodeStr, !strconcat(Dt, "8"), v8i8, v8i8, IntOp>; def v4i16 : N2VDInt; + itinD, OpcodeStr, !strconcat(Dt, "16"), v4i16, v4i16, IntOp>; def v2i32 : N2VDInt; + itinD, OpcodeStr, !strconcat(Dt, "32"), v2i32, v2i32, IntOp>; // 128-bit vector types. 
def v16i8 : N2VQInt; + itinQ, OpcodeStr, !strconcat(Dt, "8"), v16i8, v16i8, IntOp>; def v8i16 : N2VQInt; + itinQ, OpcodeStr, !strconcat(Dt, "16"), v8i16, v8i16, IntOp>; def v4i32 : N2VQInt; + itinQ, OpcodeStr, !strconcat(Dt, "32"), v4i32, v4i32, IntOp>; } @@ -1513,22 +1592,22 @@ // element sizes of 8, 16 and 32 bits: multiclass N2VPLInt_QHS op24_23, bits<2> op21_20, bits<2> op17_16, bits<5> op11_7, bit op4, - string OpcodeStr, Intrinsic IntOp> { + string OpcodeStr, string Dt, Intrinsic IntOp> { // 64-bit vector types. def v8i8 : N2VDPLInt; + OpcodeStr, !strconcat(Dt, "8"), v4i16, v8i8, IntOp>; def v4i16 : N2VDPLInt; + OpcodeStr, !strconcat(Dt, "16"), v2i32, v4i16, IntOp>; def v2i32 : N2VDPLInt; + OpcodeStr, !strconcat(Dt, "32"), v1i64, v2i32, IntOp>; // 128-bit vector types. def v16i8 : N2VQPLInt; + OpcodeStr, !strconcat(Dt, "8"), v8i16, v16i8, IntOp>; def v8i16 : N2VQPLInt; + OpcodeStr, !strconcat(Dt, "16"), v4i32, v8i16, IntOp>; def v4i32 : N2VQPLInt; + OpcodeStr, !strconcat(Dt, "32"), v2i64, v4i32, IntOp>; } @@ -1536,61 +1615,62 @@ // element sizes of 8, 16 and 32 bits: multiclass N2VPLInt2_QHS op24_23, bits<2> op21_20, bits<2> op17_16, bits<5> op11_7, bit op4, - string OpcodeStr, Intrinsic IntOp> { + string OpcodeStr, string Dt, Intrinsic IntOp> { // 64-bit vector types. def v8i8 : N2VDPLInt2; + OpcodeStr, !strconcat(Dt, "8"), v4i16, v8i8, IntOp>; def v4i16 : N2VDPLInt2; + OpcodeStr, !strconcat(Dt, "16"), v2i32, v4i16, IntOp>; def v2i32 : N2VDPLInt2; + OpcodeStr, !strconcat(Dt, "32"), v1i64, v2i32, IntOp>; // 128-bit vector types. 
   def v16i8 : N2VQPLInt2;
+                 OpcodeStr, !strconcat(Dt, "8"), v8i16, v16i8, IntOp>;
   def v8i16 : N2VQPLInt2;
+                 OpcodeStr, !strconcat(Dt, "16"), v4i32, v8i16, IntOp>;
   def v4i32 : N2VQPLInt2;
+                 OpcodeStr, !strconcat(Dt, "32"), v2i64, v4i32, IntOp>;
 }
 // Neon 2-register vector shift by immediate,
 // element sizes of 8, 16, 32 and 64 bits:
 multiclass N2VSh_QHSD op11_8, bit op4,
-                      InstrItinClass itin, string OpcodeStr, SDNode OpNode> {
+                      InstrItinClass itin, string OpcodeStr, string Dt,
+                      SDNode OpNode> {
   // 64-bit vector types.
   def v8i8 : N2VDSh {
+                 OpcodeStr, !strconcat(Dt, "8"), v8i8, OpNode> {
     let Inst{21-19} = 0b001; // imm6 = 001xxx
   }
   def v4i16 : N2VDSh {
+                 OpcodeStr, !strconcat(Dt, "16"), v4i16, OpNode> {
     let Inst{21-20} = 0b01; // imm6 = 01xxxx
   }
   def v2i32 : N2VDSh {
+                 OpcodeStr, !strconcat(Dt, "32"), v2i32, OpNode> {
     let Inst{21} = 0b1; // imm6 = 1xxxxx
   }
   def v1i64 : N2VDSh;
+                 OpcodeStr, !strconcat(Dt, "64"), v1i64, OpNode>;
                  // imm6 = xxxxxx
   // 128-bit vector types.
   def v16i8 : N2VQSh {
+                 OpcodeStr, !strconcat(Dt, "8"), v16i8, OpNode> {
     let Inst{21-19} = 0b001; // imm6 = 001xxx
   }
   def v8i16 : N2VQSh {
+                 OpcodeStr, !strconcat(Dt, "16"), v8i16, OpNode> {
     let Inst{21-20} = 0b01; // imm6 = 01xxxx
   }
   def v4i32 : N2VQSh {
+                 OpcodeStr, !strconcat(Dt, "32"), v4i32, OpNode> {
     let Inst{21} = 0b1; // imm6 = 1xxxxx
   }
   def v2i64 : N2VQSh;
+                 OpcodeStr, !strconcat(Dt, "64"), v2i64, OpNode>;
                  // imm6 = xxxxxx
 }
@@ -1598,39 +1678,39 @@
 // Neon Shift-Accumulate vector operations,
 // element sizes of 8, 16, 32 and 64 bits:
 multiclass N2VShAdd_QHSD op11_8, bit op4,
-                         string OpcodeStr, SDNode ShOp> {
+                         string OpcodeStr, string Dt, SDNode ShOp> {
   // 64-bit vector types.
   def v8i8 : N2VDShAdd {
+                 OpcodeStr, !strconcat(Dt, "8"), v8i8, ShOp> {
     let Inst{21-19} = 0b001; // imm6 = 001xxx
   }
   def v4i16 : N2VDShAdd {
+                 OpcodeStr, !strconcat(Dt, "16"), v4i16, ShOp> {
     let Inst{21-20} = 0b01; // imm6 = 01xxxx
   }
   def v2i32 : N2VDShAdd {
+                 OpcodeStr, !strconcat(Dt, "32"), v2i32, ShOp> {
     let Inst{21} = 0b1; // imm6 = 1xxxxx
   }
   def v1i64 : N2VDShAdd;
+                 OpcodeStr, !strconcat(Dt, "64"), v1i64, ShOp>;
                  // imm6 = xxxxxx
   // 128-bit vector types.
   def v16i8 : N2VQShAdd {
+                 OpcodeStr, !strconcat(Dt, "8"), v16i8, ShOp> {
     let Inst{21-19} = 0b001; // imm6 = 001xxx
   }
   def v8i16 : N2VQShAdd {
+                 OpcodeStr, !strconcat(Dt, "16"), v8i16, ShOp> {
     let Inst{21-20} = 0b01; // imm6 = 01xxxx
   }
   def v4i32 : N2VQShAdd {
+                 OpcodeStr, !strconcat(Dt, "32"), v4i32, ShOp> {
     let Inst{21} = 0b1; // imm6 = 1xxxxx
   }
   def v2i64 : N2VQShAdd;
+                 OpcodeStr, !strconcat(Dt, "64"), v2i64, ShOp>;
                  // imm6 = xxxxxx
 }
@@ -1641,53 +1721,53 @@
                          string OpcodeStr, SDNode ShOp> {
   // 64-bit vector types.
   def v8i8 : N2VDShIns {
+                 OpcodeStr, "8", v8i8, ShOp> {
     let Inst{21-19} = 0b001; // imm6 = 001xxx
   }
   def v4i16 : N2VDShIns {
+                 OpcodeStr, "16", v4i16, ShOp> {
     let Inst{21-20} = 0b01; // imm6 = 01xxxx
   }
   def v2i32 : N2VDShIns {
+                 OpcodeStr, "32", v2i32, ShOp> {
     let Inst{21} = 0b1; // imm6 = 1xxxxx
   }
   def v1i64 : N2VDShIns;
+                 OpcodeStr, "64", v1i64, ShOp>;
                  // imm6 = xxxxxx
   // 128-bit vector types.
   def v16i8 : N2VQShIns {
+                 OpcodeStr, "8", v16i8, ShOp> {
     let Inst{21-19} = 0b001; // imm6 = 001xxx
   }
   def v8i16 : N2VQShIns {
+                 OpcodeStr, "16", v8i16, ShOp> {
     let Inst{21-20} = 0b01; // imm6 = 01xxxx
   }
   def v4i32 : N2VQShIns {
+                 OpcodeStr, "32", v4i32, ShOp> {
     let Inst{21} = 0b1; // imm6 = 1xxxxx
   }
   def v2i64 : N2VQShIns;
+                 OpcodeStr, "64", v2i64, ShOp>;
                  // imm6 = xxxxxx
 }
 // Neon Shift Long operations,
 // element sizes of 8, 16, 32 bits:
 multiclass N2VLSh_QHS op11_8, bit op7, bit op6,
-                      bit op4, string OpcodeStr, SDNode OpNode> {
+                      bit op4, string OpcodeStr, string Dt, SDNode OpNode> {
   def v8i16 : N2VLSh {
+                 OpcodeStr, !strconcat(Dt, "8"), v8i16, v8i8, OpNode> {
     let Inst{21-19} = 0b001; // imm6 = 001xxx
   }
   def v4i32 : N2VLSh {
+                 OpcodeStr, !strconcat(Dt, "16"), v4i32, v4i16, OpNode> {
     let Inst{21-20} = 0b01; // imm6 = 01xxxx
   }
   def v2i64 : N2VLSh {
+                 OpcodeStr, !strconcat(Dt, "32"), v2i64, v2i32, OpNode> {
     let Inst{21} = 0b1; // imm6 = 1xxxxx
   }
 }
@@ -1695,18 +1775,18 @@
 // Neon Shift Narrow operations,
 // element sizes of 16, 32, 64 bits:
 multiclass N2VNSh_HSD op11_8, bit op7, bit op6,
-                      bit op4, InstrItinClass itin, string OpcodeStr,
+                      bit op4, InstrItinClass itin, string OpcodeStr, string Dt,
                       SDNode OpNode> {
   def v8i8 : N2VNSh {
+                 OpcodeStr, !strconcat(Dt, "16"), v8i8, v8i16, OpNode> {
     let Inst{21-19} = 0b001; // imm6 = 001xxx
   }
   def v4i16 : N2VNSh {
+                 OpcodeStr, !strconcat(Dt, "32"), v4i16, v4i32, OpNode> {
     let Inst{21-20} = 0b01; // imm6 = 01xxxx
   }
   def v2i32 : N2VNSh {
+                 OpcodeStr, !strconcat(Dt, "64"), v2i32, v2i64, OpNode> {
     let Inst{21} = 0b1; // imm6 = 1xxxxx
   }
 }
@@ -1718,56 +1798,58 @@
 // Vector Add Operations.
 // VADD : Vector Add (integer and floating-point)
-defm VADD : N3V_QHSD<0, 0, 0b1000, 0, IIC_VBINiD, IIC_VBINiQ, "vadd.i",
+defm VADD : N3V_QHSD<0, 0, 0b1000, 0, IIC_VBINiD, IIC_VBINiQ, "vadd", "i",
                      add, 1>;
-def VADDfd : N3VD<0, 0, 0b00, 0b1101, 0, IIC_VBIND, "vadd.f32",
+def VADDfd : N3VD<0, 0, 0b00, 0b1101, 0, IIC_VBIND, "vadd", "f32",
                   v2f32, v2f32, fadd, 1>;
-def VADDfq : N3VQ<0, 0, 0b00, 0b1101, 0, IIC_VBINQ, "vadd.f32",
+def VADDfq : N3VQ<0, 0, 0b00, 0b1101, 0, IIC_VBINQ, "vadd", "f32",
                   v4f32, v4f32, fadd, 1>;
 // VADDL : Vector Add Long (Q = D + D)
-defm VADDLs : N3VLInt_QHS<0,1,0b0000,0, IIC_VSHLiD, "vaddl.s",
+defm VADDLs : N3VLInt_QHS<0,1,0b0000,0, IIC_VSHLiD, "vaddl", "s",
                           int_arm_neon_vaddls, 1>;
-defm VADDLu : N3VLInt_QHS<1,1,0b0000,0, IIC_VSHLiD, "vaddl.u",
+defm VADDLu : N3VLInt_QHS<1,1,0b0000,0, IIC_VSHLiD, "vaddl", "u",
                           int_arm_neon_vaddlu, 1>;
 // VADDW : Vector Add Wide (Q = Q + D)
-defm VADDWs : N3VWInt_QHS<0,1,0b0001,0, "vaddw.s", int_arm_neon_vaddws, 0>;
-defm VADDWu : N3VWInt_QHS<1,1,0b0001,0, "vaddw.u", int_arm_neon_vaddwu, 0>;
+defm VADDWs : N3VWInt_QHS<0,1,0b0001,0, "vaddw", "s", int_arm_neon_vaddws, 0>;
+defm VADDWu : N3VWInt_QHS<1,1,0b0001,0, "vaddw", "u", int_arm_neon_vaddwu, 0>;
 // VHADD : Vector Halving Add
 defm VHADDs : N3VInt_QHS<0,0,0b0000,0, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                         IIC_VBINi4Q, "vhadd.s", int_arm_neon_vhadds, 1>;
+                         IIC_VBINi4Q, "vhadd", "s", int_arm_neon_vhadds, 1>;
 defm VHADDu : N3VInt_QHS<1,0,0b0000,0, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                         IIC_VBINi4Q, "vhadd.u", int_arm_neon_vhaddu, 1>;
+                         IIC_VBINi4Q, "vhadd", "u", int_arm_neon_vhaddu, 1>;
 // VRHADD : Vector Rounding Halving Add
 defm VRHADDs : N3VInt_QHS<0,0,0b0001,0, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                          IIC_VBINi4Q, "vrhadd.s", int_arm_neon_vrhadds, 1>;
+                          IIC_VBINi4Q, "vrhadd", "s", int_arm_neon_vrhadds, 1>;
 defm VRHADDu : N3VInt_QHS<1,0,0b0001,0, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                          IIC_VBINi4Q, "vrhadd.u", int_arm_neon_vrhaddu, 1>;
+                          IIC_VBINi4Q, "vrhadd", "u", int_arm_neon_vrhaddu, 1>;
 // VQADD : Vector Saturating Add
 defm VQADDs : N3VInt_QHSD<0,0,0b0000,1, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                          IIC_VBINi4Q, "vqadd.s", int_arm_neon_vqadds, 1>;
+                          IIC_VBINi4Q, "vqadd", "s", int_arm_neon_vqadds, 1>;
 defm VQADDu : N3VInt_QHSD<1,0,0b0000,1, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                          IIC_VBINi4Q, "vqadd.u", int_arm_neon_vqaddu, 1>;
+                          IIC_VBINi4Q, "vqadd", "u", int_arm_neon_vqaddu, 1>;
 // VADDHN : Vector Add and Narrow Returning High Half (D = Q + Q)
-defm VADDHN : N3VNInt_HSD<0,1,0b0100,0, "vaddhn.i", int_arm_neon_vaddhn, 1>;
+defm VADDHN : N3VNInt_HSD<0,1,0b0100,0, "vaddhn", "i",
+                          int_arm_neon_vaddhn, 1>;
 // VRADDHN : Vector Rounding Add and Narrow Returning High Half (D = Q + Q)
-defm VRADDHN : N3VNInt_HSD<1,1,0b0100,0, "vraddhn.i", int_arm_neon_vraddhn, 1>;
+defm VRADDHN : N3VNInt_HSD<1,1,0b0100,0, "vraddhn", "i",
+                           int_arm_neon_vraddhn, 1>;
 // Vector Multiply Operations.
 // VMUL : Vector Multiply (integer, polynomial and floating-point)
 defm VMUL : N3V_QHS<0, 0, 0b1001, 1, IIC_VMULi16D, IIC_VMULi32D,
-                    IIC_VMULi16Q, IIC_VMULi32Q, "vmul.i", mul, 1>;
-def VMULpd : N3VDInt<1, 0, 0b00, 0b1001, 1, IIC_VMULi16D, "vmul.p8",
+                    IIC_VMULi16Q, IIC_VMULi32Q, "vmul", "i", mul, 1>;
+def VMULpd : N3VDInt<1, 0, 0b00, 0b1001, 1, IIC_VMULi16D, "vmul", "p8",
                      v8i8, v8i8, int_arm_neon_vmulp, 1>;
-def VMULpq : N3VQInt<1, 0, 0b00, 0b1001, 1, IIC_VMULi16Q, "vmul.p8",
+def VMULpq : N3VQInt<1, 0, 0b00, 0b1001, 1, IIC_VMULi16Q, "vmul", "p8",
                      v16i8, v16i8, int_arm_neon_vmulp, 1>;
-def VMULfd : N3VD<1, 0, 0b00, 0b1101, 1, IIC_VBIND, "vmul.f32",
+def VMULfd : N3VD<1, 0, 0b00, 0b1101, 1, IIC_VBIND, "vmul", "f32",
                   v2f32, v2f32, fmul, 1>;
-def VMULfq : N3VQ<1, 0, 0b00, 0b1101, 1, IIC_VBINQ, "vmul.f32",
+def VMULfq : N3VQ<1, 0, 0b00, 0b1101, 1, IIC_VBINQ, "vmul", "f32",
                   v4f32, v4f32, fmul, 1>;
-defm VMULsl : N3VSL_HS<0b1000, "vmul.i", mul>;
-def VMULslfd : N3VDSL<0b10, 0b1001, IIC_VBIND, "vmul.f32", v2f32, fmul>;
-def VMULslfq : N3VQSL<0b10, 0b1001, IIC_VBINQ, "vmul.f32", v4f32, v2f32, fmul>;
+defm VMULsl : N3VSL_HS<0b1000, "vmul", "i", mul>;
+def VMULslfd : N3VDSL<0b10, 0b1001, IIC_VBIND, "vmul", "f32", v2f32, fmul>;
+def VMULslfq : N3VQSL<0b10, 0b1001, IIC_VBINQ, "vmul", "f32", v4f32, v2f32, fmul>;
 def : Pat<(v8i16 (mul (v8i16 QPR:$src1),
                       (v8i16 (NEONvduplane (v8i16 QPR:$src2), imm:$lane)))),
           (v8i16 (VMULslv8i16 (v8i16 QPR:$src1),
@@ -1790,10 +1872,10 @@
 // VQDMULH : Vector Saturating Doubling Multiply Returning High Half
 defm VQDMULH : N3VInt_HS<0, 0, 0b1011, 0, IIC_VMULi16D, IIC_VMULi32D,
                          IIC_VMULi16Q, IIC_VMULi32Q,
-                         "vqdmulh.s", int_arm_neon_vqdmulh, 1>;
+                         "vqdmulh", "s", int_arm_neon_vqdmulh, 1>;
 defm VQDMULHsl: N3VIntSL_HS<0b1100, IIC_VMULi16D, IIC_VMULi32D,
                             IIC_VMULi16Q, IIC_VMULi32Q,
-                            "vqdmulh.s", int_arm_neon_vqdmulh>;
+                            "vqdmulh", "s", int_arm_neon_vqdmulh>;
 def : Pat<(v8i16 (int_arm_neon_vqdmulh (v8i16 QPR:$src1),
                                        (v8i16 (NEONvduplane (v8i16 QPR:$src2), imm:$lane)))),
@@ -1812,10 +1894,10 @@
 // VQRDMULH : Vector Rounding Saturating Doubling Multiply Returning High Half
 defm VQRDMULH : N3VInt_HS<1, 0, 0b1011, 0, IIC_VMULi16D, IIC_VMULi32D,
                           IIC_VMULi16Q, IIC_VMULi32Q,
-                          "vqrdmulh.s", int_arm_neon_vqrdmulh, 1>;
+                          "vqrdmulh", "s", int_arm_neon_vqrdmulh, 1>;
 defm VQRDMULHsl : N3VIntSL_HS<0b1101, IIC_VMULi16D, IIC_VMULi32D,
                               IIC_VMULi16Q, IIC_VMULi32Q,
-                              "vqrdmulh.s", int_arm_neon_vqrdmulh>;
+                              "vqrdmulh", "s", int_arm_neon_vqrdmulh>;
 def : Pat<(v8i16 (int_arm_neon_vqrdmulh (v8i16 QPR:$src1),
                                         (v8i16 (NEONvduplane (v8i16 QPR:$src2), imm:$lane)))),
@@ -1832,37 +1914,37 @@
           (SubReg_i32_lane imm:$lane)))>;
 // VMULL : Vector Multiply Long (integer and polynomial) (Q = D * D)
-defm VMULLs : N3VLInt_QHS<0,1,0b1100,0, IIC_VMULi16D, "vmull.s",
+defm VMULLs : N3VLInt_QHS<0,1,0b1100,0, IIC_VMULi16D, "vmull", "s",
                           int_arm_neon_vmulls, 1>;
-defm VMULLu : N3VLInt_QHS<1,1,0b1100,0, IIC_VMULi16D, "vmull.u",
+defm VMULLu : N3VLInt_QHS<1,1,0b1100,0, IIC_VMULi16D, "vmull", "u",
                           int_arm_neon_vmullu, 1>;
-def VMULLp : N3VLInt<0, 1, 0b00, 0b1110, 0, IIC_VMULi16D, "vmull.p8",
+def VMULLp : N3VLInt<0, 1, 0b00, 0b1110, 0, IIC_VMULi16D, "vmull", "p8",
                      v8i16, v8i8, int_arm_neon_vmullp, 1>;
-defm VMULLsls : N3VLIntSL_HS<0, 0b1010, IIC_VMULi16D, "vmull.s",
+defm VMULLsls : N3VLIntSL_HS<0, 0b1010, IIC_VMULi16D, "vmull", "s",
                              int_arm_neon_vmulls>;
-defm VMULLslu : N3VLIntSL_HS<1, 0b1010, IIC_VMULi16D, "vmull.u",
+defm VMULLslu : N3VLIntSL_HS<1, 0b1010, IIC_VMULi16D, "vmull", "u",
                              int_arm_neon_vmullu>;
 // VQDMULL : Vector Saturating Doubling Multiply Long (Q = D * D)
-defm VQDMULL : N3VLInt_HS<0,1,0b1101,0, IIC_VMULi16D, "vqdmull.s",
+defm VQDMULL : N3VLInt_HS<0,1,0b1101,0, IIC_VMULi16D, "vqdmull", "s",
                           int_arm_neon_vqdmull, 1>;
-defm VQDMULLsl: N3VLIntSL_HS<0, 0b1011, IIC_VMULi16D, "vqdmull.s",
+defm VQDMULLsl: N3VLIntSL_HS<0, 0b1011, IIC_VMULi16D, "vqdmull", "s",
                              int_arm_neon_vqdmull>;
 // Vector Multiply-Accumulate and Multiply-Subtract Operations.
 // VMLA : Vector Multiply Accumulate (integer and floating-point)
 defm VMLA : N3VMulOp_QHS<0, 0, 0b1001, 0, IIC_VMACi16D, IIC_VMACi32D,
-                         IIC_VMACi16Q, IIC_VMACi32Q, "vmla.i", add>;
-def VMLAfd : N3VDMulOp<0, 0, 0b00, 0b1101, 1, IIC_VMACD, "vmla.f32",
+                         IIC_VMACi16Q, IIC_VMACi32Q, "vmla", "i", add>;
+def VMLAfd : N3VDMulOp<0, 0, 0b00, 0b1101, 1, IIC_VMACD, "vmla", "f32",
                        v2f32, fmul, fadd>;
-def VMLAfq : N3VQMulOp<0, 0, 0b00, 0b1101, 1, IIC_VMACQ, "vmla.f32",
+def VMLAfq : N3VQMulOp<0, 0, 0b00, 0b1101, 1, IIC_VMACQ, "vmla", "f32",
                        v4f32, fmul, fadd>;
 defm VMLAsl : N3VMulOpSL_HS<0b0000, IIC_VMACi16D, IIC_VMACi32D,
-                            IIC_VMACi16Q, IIC_VMACi32Q, "vmla.i", add>;
-def VMLAslfd : N3VDMulOpSL<0b10, 0b0001, IIC_VMACD, "vmla.f32",
+                            IIC_VMACi16Q, IIC_VMACi32Q, "vmla", "i", add>;
+def VMLAslfd : N3VDMulOpSL<0b10, 0b0001, IIC_VMACD, "vmla", "f32",
                            v2f32, fmul, fadd>;
-def VMLAslfq : N3VQMulOpSL<0b10, 0b0001, IIC_VMACQ, "vmla.f32",
+def VMLAslfq : N3VQMulOpSL<0b10, 0b0001, IIC_VMACQ, "vmla", "f32",
                            v4f32, v2f32, fmul, fadd>;
 def : Pat<(v8i16 (add (v8i16 QPR:$src1),
@@ -1893,28 +1975,29 @@
           (SubReg_i32_lane imm:$lane)))>;
 // VMLAL : Vector Multiply Accumulate Long (Q += D * D)
-defm VMLALs : N3VLInt3_QHS<0,1,0b1000,0, "vmlal.s", int_arm_neon_vmlals>;
-defm VMLALu : N3VLInt3_QHS<1,1,0b1000,0, "vmlal.u", int_arm_neon_vmlalu>;
+defm VMLALs : N3VLInt3_QHS<0,1,0b1000,0, "vmlal", "s", int_arm_neon_vmlals>;
+defm VMLALu : N3VLInt3_QHS<1,1,0b1000,0, "vmlal", "u", int_arm_neon_vmlalu>;
-defm VMLALsls : N3VLInt3SL_HS<0, 0b0010, "vmlal.s", int_arm_neon_vmlals>;
-defm VMLALslu : N3VLInt3SL_HS<1, 0b0010, "vmlal.u", int_arm_neon_vmlalu>;
+defm VMLALsls : N3VLInt3SL_HS<0, 0b0010, "vmlal", "s", int_arm_neon_vmlals>;
+defm VMLALslu : N3VLInt3SL_HS<1, 0b0010, "vmlal", "u", int_arm_neon_vmlalu>;
 // VQDMLAL : Vector Saturating Doubling Multiply Accumulate Long (Q += D * D)
-defm VQDMLAL : N3VLInt3_HS<0, 1, 0b1001, 0, "vqdmlal.s", int_arm_neon_vqdmlal>;
-defm VQDMLALsl: N3VLInt3SL_HS<0, 0b0011, "vqdmlal.s", int_arm_neon_vqdmlal>;
+defm VQDMLAL : N3VLInt3_HS<0, 1, 0b1001, 0, "vqdmlal", "s",
+                           int_arm_neon_vqdmlal>;
+defm VQDMLALsl: N3VLInt3SL_HS<0, 0b0011, "vqdmlal", "s", int_arm_neon_vqdmlal>;
 // VMLS : Vector Multiply Subtract (integer and floating-point)
 defm VMLS : N3VMulOp_QHS<1, 0, 0b1001, 0, IIC_VMACi16D, IIC_VMACi32D,
-                         IIC_VMACi16Q, IIC_VMACi32Q, "vmls.i", sub>;
-def VMLSfd : N3VDMulOp<0, 0, 0b10, 0b1101, 1, IIC_VMACD, "vmls.f32",
+                         IIC_VMACi16Q, IIC_VMACi32Q, "vmls", "i", sub>;
+def VMLSfd : N3VDMulOp<0, 0, 0b10, 0b1101, 1, IIC_VMACD, "vmls", "f32",
                        v2f32, fmul, fsub>;
-def VMLSfq : N3VQMulOp<0, 0, 0b10, 0b1101, 1, IIC_VMACQ, "vmls.f32",
+def VMLSfq : N3VQMulOp<0, 0, 0b10, 0b1101, 1, IIC_VMACQ, "vmls", "f32",
                        v4f32, fmul, fsub>;
 defm VMLSsl : N3VMulOpSL_HS<0b0100, IIC_VMACi16D, IIC_VMACi32D,
-                            IIC_VMACi16Q, IIC_VMACi32Q, "vmls.i", sub>;
-def VMLSslfd : N3VDMulOpSL<0b10, 0b0101, IIC_VMACD, "vmls.f32",
+                            IIC_VMACi16Q, IIC_VMACi32Q, "vmls", "i", sub>;
+def VMLSslfd : N3VDMulOpSL<0b10, 0b0101, IIC_VMACD, "vmls", "f32",
                            v2f32, fmul, fsub>;
-def VMLSslfq : N3VQMulOpSL<0b10, 0b0101, IIC_VMACQ, "vmls.f32",
+def VMLSslfq : N3VQMulOpSL<0b10, 0b0101, IIC_VMACQ, "vmls", "f32",
                            v4f32, v2f32, fmul, fsub>;
 def : Pat<(v8i16 (sub (v8i16 QPR:$src1),
@@ -1945,167 +2028,170 @@
           (SubReg_i32_lane imm:$lane)))>;
 // VMLSL : Vector Multiply Subtract Long (Q -= D * D)
-defm VMLSLs : N3VLInt3_QHS<0,1,0b1010,0, "vmlsl.s", int_arm_neon_vmlsls>;
-defm VMLSLu : N3VLInt3_QHS<1,1,0b1010,0, "vmlsl.u", int_arm_neon_vmlslu>;
+defm VMLSLs : N3VLInt3_QHS<0,1,0b1010,0, "vmlsl", "s", int_arm_neon_vmlsls>;
+defm VMLSLu : N3VLInt3_QHS<1,1,0b1010,0, "vmlsl", "u", int_arm_neon_vmlslu>;
-defm VMLSLsls : N3VLInt3SL_HS<0, 0b0110, "vmlsl.s", int_arm_neon_vmlsls>;
-defm VMLSLslu : N3VLInt3SL_HS<1, 0b0110, "vmlsl.u", int_arm_neon_vmlslu>;
+defm VMLSLsls : N3VLInt3SL_HS<0, 0b0110, "vmlsl", "s", int_arm_neon_vmlsls>;
+defm VMLSLslu : N3VLInt3SL_HS<1, 0b0110, "vmlsl", "u", int_arm_neon_vmlslu>;
 // VQDMLSL : Vector Saturating Doubling Multiply Subtract Long (Q -= D * D)
-defm VQDMLSL : N3VLInt3_HS<0, 1, 0b1011, 0, "vqdmlsl.s", int_arm_neon_vqdmlsl>;
-defm VQDMLSLsl: N3VLInt3SL_HS<0, 0b111, "vqdmlsl.s", int_arm_neon_vqdmlsl>;
+defm VQDMLSL : N3VLInt3_HS<0, 1, 0b1011, 0, "vqdmlsl", "s",
+                           int_arm_neon_vqdmlsl>;
+defm VQDMLSLsl: N3VLInt3SL_HS<0, 0b111, "vqdmlsl", "s", int_arm_neon_vqdmlsl>;
 // Vector Subtract Operations.
 // VSUB : Vector Subtract (integer and floating-point)
 defm VSUB : N3V_QHSD<1, 0, 0b1000, 0, IIC_VSUBiD, IIC_VSUBiQ,
-                     "vsub.i", sub, 0>;
-def VSUBfd : N3VD<0, 0, 0b10, 0b1101, 0, IIC_VBIND, "vsub.f32",
+                     "vsub", "i", sub, 0>;
+def VSUBfd : N3VD<0, 0, 0b10, 0b1101, 0, IIC_VBIND, "vsub", "f32",
                   v2f32, v2f32, fsub, 0>;
-def VSUBfq : N3VQ<0, 0, 0b10, 0b1101, 0, IIC_VBINQ, "vsub.f32",
+def VSUBfq : N3VQ<0, 0, 0b10, 0b1101, 0, IIC_VBINQ, "vsub", "f32",
                   v4f32, v4f32, fsub, 0>;
 // VSUBL : Vector Subtract Long (Q = D - D)
-defm VSUBLs : N3VLInt_QHS<0,1,0b0010,0, IIC_VSHLiD, "vsubl.s",
+defm VSUBLs : N3VLInt_QHS<0,1,0b0010,0, IIC_VSHLiD, "vsubl", "s",
                           int_arm_neon_vsubls, 1>;
-defm VSUBLu : N3VLInt_QHS<1,1,0b0010,0, IIC_VSHLiD, "vsubl.u",
+defm VSUBLu : N3VLInt_QHS<1,1,0b0010,0, IIC_VSHLiD, "vsubl", "u",
                           int_arm_neon_vsublu, 1>;
 // VSUBW : Vector Subtract Wide (Q = Q - D)
-defm VSUBWs : N3VWInt_QHS<0,1,0b0011,0, "vsubw.s", int_arm_neon_vsubws, 0>;
-defm VSUBWu : N3VWInt_QHS<1,1,0b0011,0, "vsubw.u", int_arm_neon_vsubwu, 0>;
+defm VSUBWs : N3VWInt_QHS<0,1,0b0011,0, "vsubw", "s", int_arm_neon_vsubws, 0>;
+defm VSUBWu : N3VWInt_QHS<1,1,0b0011,0, "vsubw", "u", int_arm_neon_vsubwu, 0>;
 // VHSUB : Vector Halving Subtract
 defm VHSUBs : N3VInt_QHS<0, 0, 0b0010, 0, IIC_VBINi4D, IIC_VBINi4D,
                          IIC_VBINi4Q, IIC_VBINi4Q,
-                         "vhsub.s", int_arm_neon_vhsubs, 0>;
+                         "vhsub", "s", int_arm_neon_vhsubs, 0>;
 defm VHSUBu : N3VInt_QHS<1, 0, 0b0010, 0, IIC_VBINi4D, IIC_VBINi4D,
                          IIC_VBINi4Q, IIC_VBINi4Q,
-                         "vhsub.u", int_arm_neon_vhsubu, 0>;
+                         "vhsub", "u", int_arm_neon_vhsubu, 0>;
 // VQSUB : Vector Saturing Subtract
 defm VQSUBs : N3VInt_QHSD<0, 0, 0b0010, 1, IIC_VBINi4D, IIC_VBINi4D,
                           IIC_VBINi4Q, IIC_VBINi4Q,
-                          "vqsub.s", int_arm_neon_vqsubs, 0>;
+                          "vqsub", "s", int_arm_neon_vqsubs, 0>;
 defm VQSUBu : N3VInt_QHSD<1, 0, 0b0010, 1, IIC_VBINi4D, IIC_VBINi4D,
                           IIC_VBINi4Q, IIC_VBINi4Q,
-                          "vqsub.u", int_arm_neon_vqsubu, 0>;
+                          "vqsub", "u", int_arm_neon_vqsubu, 0>;
 // VSUBHN : Vector Subtract and Narrow Returning High Half (D = Q - Q)
-defm VSUBHN : N3VNInt_HSD<0,1,0b0110,0, "vsubhn.i", int_arm_neon_vsubhn, 0>;
+defm VSUBHN : N3VNInt_HSD<0,1,0b0110,0, "vsubhn", "i",
+                          int_arm_neon_vsubhn, 0>;
 // VRSUBHN : Vector Rounding Subtract and Narrow Returning High Half (D=Q-Q)
-defm VRSUBHN : N3VNInt_HSD<1,1,0b0110,0, "vrsubhn.i", int_arm_neon_vrsubhn, 0>;
+defm VRSUBHN : N3VNInt_HSD<1,1,0b0110,0, "vrsubhn", "i",
+                           int_arm_neon_vrsubhn, 0>;
 // Vector Comparisons.
 // VCEQ : Vector Compare Equal
 defm VCEQ : N3V_QHS<1, 0, 0b1000, 1, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                    IIC_VBINi4Q, "vceq.i", NEONvceq, 1>;
-def VCEQfd : N3VD<0,0,0b00,0b1110,0, IIC_VBIND, "vceq.f32", v2i32, v2f32,
+                    IIC_VBINi4Q, "vceq", "i", NEONvceq, 1>;
+def VCEQfd : N3VD<0,0,0b00,0b1110,0, IIC_VBIND, "vceq", "f32", v2i32, v2f32,
                   NEONvceq, 1>;
-def VCEQfq : N3VQ<0,0,0b00,0b1110,0, IIC_VBINQ, "vceq.f32", v4i32, v4f32,
+def VCEQfq : N3VQ<0,0,0b00,0b1110,0, IIC_VBINQ, "vceq", "f32", v4i32, v4f32,
                   NEONvceq, 1>;
 // VCGE : Vector Compare Greater Than or Equal
 defm VCGEs : N3V_QHS<0, 0, 0b0011, 1, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                     IIC_VBINi4Q, "vcge.s", NEONvcge, 0>;
+                     IIC_VBINi4Q, "vcge", "s", NEONvcge, 0>;
 defm VCGEu : N3V_QHS<1, 0, 0b0011, 1, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                     IIC_VBINi4Q, "vcge.u", NEONvcgeu, 0>;
-def VCGEfd : N3VD<1,0,0b00,0b1110,0, IIC_VBIND, "vcge.f32",
+                     IIC_VBINi4Q, "vcge", "u", NEONvcgeu, 0>;
+def VCGEfd : N3VD<1,0,0b00,0b1110,0, IIC_VBIND, "vcge", "f32",
                   v2i32, v2f32, NEONvcge, 0>;
-def VCGEfq : N3VQ<1,0,0b00,0b1110,0, IIC_VBINQ, "vcge.f32", v4i32, v4f32,
+def VCGEfq : N3VQ<1,0,0b00,0b1110,0, IIC_VBINQ, "vcge", "f32", v4i32, v4f32,
                   NEONvcge, 0>;
 // VCGT : Vector Compare Greater Than
 defm VCGTs : N3V_QHS<0, 0, 0b0011, 0, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                     IIC_VBINi4Q, "vcgt.s", NEONvcgt, 0>;
+                     IIC_VBINi4Q, "vcgt", "s", NEONvcgt, 0>;
 defm VCGTu : N3V_QHS<1, 0, 0b0011, 0, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                     IIC_VBINi4Q, "vcgt.u", NEONvcgtu, 0>;
-def VCGTfd : N3VD<1,0,0b10,0b1110,0, IIC_VBIND, "vcgt.f32", v2i32, v2f32,
+                     IIC_VBINi4Q, "vcgt", "u", NEONvcgtu, 0>;
+def VCGTfd : N3VD<1,0,0b10,0b1110,0, IIC_VBIND, "vcgt", "f32", v2i32, v2f32,
                   NEONvcgt, 0>;
-def VCGTfq : N3VQ<1,0,0b10,0b1110,0, IIC_VBINQ, "vcgt.f32", v4i32, v4f32,
+def VCGTfq : N3VQ<1,0,0b10,0b1110,0, IIC_VBINQ, "vcgt", "f32", v4i32, v4f32,
                   NEONvcgt, 0>;
 // VACGE : Vector Absolute Compare Greater Than or Equal (aka VCAGE)
-def VACGEd : N3VDInt<1, 0, 0b00, 0b1110, 1, IIC_VBIND, "vacge.f32",
+def VACGEd : N3VDInt<1, 0, 0b00, 0b1110, 1, IIC_VBIND, "vacge", "f32",
                      v2i32, v2f32, int_arm_neon_vacged, 0>;
-def VACGEq : N3VQInt<1, 0, 0b00, 0b1110, 1, IIC_VBINQ, "vacge.f32",
+def VACGEq : N3VQInt<1, 0, 0b00, 0b1110, 1, IIC_VBINQ, "vacge", "f32",
                      v4i32, v4f32, int_arm_neon_vacgeq, 0>;
 // VACGT : Vector Absolute Compare Greater Than (aka VCAGT)
-def VACGTd : N3VDInt<1, 0, 0b10, 0b1110, 1, IIC_VBIND, "vacgt.f32",
+def VACGTd : N3VDInt<1, 0, 0b10, 0b1110, 1, IIC_VBIND, "vacgt", "f32",
                      v2i32, v2f32, int_arm_neon_vacgtd, 0>;
-def VACGTq : N3VQInt<1, 0, 0b10, 0b1110, 1, IIC_VBINQ, "vacgt.f32",
+def VACGTq : N3VQInt<1, 0, 0b10, 0b1110, 1, IIC_VBINQ, "vacgt", "f32",
                      v4i32, v4f32, int_arm_neon_vacgtq, 0>;
 // VTST : Vector Test Bits
 defm VTST : N3V_QHS<0, 0, 0b1000, 1, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                    IIC_VBINi4Q, "vtst.i", NEONvtst, 1>;
+                    IIC_VBINi4Q, "vtst", "i", NEONvtst, 1>;
 // Vector Bitwise Operations.
 // VAND : Vector Bitwise AND
-def VANDd : N3VD<0, 0, 0b00, 0b0001, 1, IIC_VBINiD, "vand",
-                 v2i32, v2i32, and, 1>;
-def VANDq : N3VQ<0, 0, 0b00, 0b0001, 1, IIC_VBINiQ, "vand",
-                 v4i32, v4i32, and, 1>;
+def VANDd : N3VDX<0, 0, 0b00, 0b0001, 1, IIC_VBINiD, "vand",
+                  v2i32, v2i32, and, 1>;
+def VANDq : N3VQX<0, 0, 0b00, 0b0001, 1, IIC_VBINiQ, "vand",
+                  v4i32, v4i32, and, 1>;
 // VEOR : Vector Bitwise Exclusive OR
-def VEORd : N3VD<1, 0, 0b00, 0b0001, 1, IIC_VBINiD, "veor",
-                 v2i32, v2i32, xor, 1>;
-def VEORq : N3VQ<1, 0, 0b00, 0b0001, 1, IIC_VBINiQ, "veor",
-                 v4i32, v4i32, xor, 1>;
+def VEORd : N3VDX<1, 0, 0b00, 0b0001, 1, IIC_VBINiD, "veor",
+                  v2i32, v2i32, xor, 1>;
+def VEORq : N3VQX<1, 0, 0b00, 0b0001, 1, IIC_VBINiQ, "veor",
+                  v4i32, v4i32, xor, 1>;
 // VORR : Vector Bitwise OR
-def VORRd : N3VD<0, 0, 0b10, 0b0001, 1, IIC_VBINiD, "vorr",
-                 v2i32, v2i32, or, 1>;
-def VORRq : N3VQ<0, 0, 0b10, 0b0001, 1, IIC_VBINiQ, "vorr",
-                 v4i32, v4i32, or, 1>;
+def VORRd : N3VDX<0, 0, 0b10, 0b0001, 1, IIC_VBINiD, "vorr",
+                  v2i32, v2i32, or, 1>;
+def VORRq : N3VQX<0, 0, 0b10, 0b0001, 1, IIC_VBINiQ, "vorr",
+                  v4i32, v4i32, or, 1>;
 // VBIC : Vector Bitwise Bit Clear (AND NOT)
-def VBICd : N3V<0, 0, 0b01, 0b0001, 0, 1, (outs DPR:$dst),
+def VBICd : N3VX<0, 0, 0b01, 0b0001, 0, 1, (outs DPR:$dst),
                 (ins DPR:$src1, DPR:$src2), IIC_VBINiD,
-                "vbic", "\t$dst, $src1, $src2", "",
+                "vbic", "$dst, $src1, $src2", "",
                 [(set DPR:$dst, (v2i32 (and DPR:$src1,
                                  (vnot_conv DPR:$src2))))]>;
-def VBICq : N3V<0, 0, 0b01, 0b0001, 1, 1, (outs QPR:$dst),
+def VBICq : N3VX<0, 0, 0b01, 0b0001, 1, 1, (outs QPR:$dst),
                 (ins QPR:$src1, QPR:$src2), IIC_VBINiQ,
-                "vbic", "\t$dst, $src1, $src2", "",
+                "vbic", "$dst, $src1, $src2", "",
                 [(set QPR:$dst, (v4i32 (and QPR:$src1,
                                  (vnot_conv QPR:$src2))))]>;
 // VORN : Vector Bitwise OR NOT
-def VORNd : N3V<0, 0, 0b11, 0b0001, 0, 1, (outs DPR:$dst),
+def VORNd : N3VX<0, 0, 0b11, 0b0001, 0, 1, (outs DPR:$dst),
                 (ins DPR:$src1, DPR:$src2), IIC_VBINiD,
-                "vorn", "\t$dst, $src1, $src2", "",
+                "vorn", "$dst, $src1, $src2", "",
                 [(set DPR:$dst, (v2i32 (or DPR:$src1,
                                  (vnot_conv DPR:$src2))))]>;
-def VORNq : N3V<0, 0, 0b11, 0b0001, 1, 1, (outs QPR:$dst),
+def VORNq : N3VX<0, 0, 0b11, 0b0001, 1, 1, (outs QPR:$dst),
                 (ins QPR:$src1, QPR:$src2), IIC_VBINiQ,
-                "vorn", "\t$dst, $src1, $src2", "",
+                "vorn", "$dst, $src1, $src2", "",
                 [(set QPR:$dst, (v4i32 (or QPR:$src1,
                                  (vnot_conv QPR:$src2))))]>;
 // VMVN : Vector Bitwise NOT
-def VMVNd : N2V<0b11, 0b11, 0b00, 0b00, 0b01011, 0, 0,
+def VMVNd : N2VX<0b11, 0b11, 0b00, 0b00, 0b01011, 0, 0,
                 (outs DPR:$dst), (ins DPR:$src), IIC_VSHLiD,
-                "vmvn", "\t$dst, $src", "",
+                "vmvn", "$dst, $src", "",
                 [(set DPR:$dst, (v2i32 (vnot DPR:$src)))]>;
-def VMVNq : N2V<0b11, 0b11, 0b00, 0b00, 0b01011, 1, 0,
+def VMVNq : N2VX<0b11, 0b11, 0b00, 0b00, 0b01011, 1, 0,
                 (outs QPR:$dst), (ins QPR:$src), IIC_VSHLiD,
-                "vmvn", "\t$dst, $src", "",
+                "vmvn", "$dst, $src", "",
                 [(set QPR:$dst, (v4i32 (vnot QPR:$src)))]>;
 def : Pat<(v2i32 (vnot_conv DPR:$src)), (VMVNd DPR:$src)>;
 def : Pat<(v4i32 (vnot_conv QPR:$src)), (VMVNq QPR:$src)>;
 // VBSL : Vector Bitwise Select
-def VBSLd : N3V<1, 0, 0b01, 0b0001, 0, 1, (outs DPR:$dst),
+def VBSLd : N3VX<1, 0, 0b01, 0b0001, 0, 1, (outs DPR:$dst),
                 (ins DPR:$src1, DPR:$src2, DPR:$src3), IIC_VCNTiD,
-                "vbsl", "\t$dst, $src2, $src3", "$src1 = $dst",
+                "vbsl", "$dst, $src2, $src3", "$src1 = $dst",
                 [(set DPR:$dst,
                   (v2i32 (or (and DPR:$src2, DPR:$src1),
                              (and DPR:$src3, (vnot_conv DPR:$src1)))))]>;
-def VBSLq : N3V<1, 0, 0b01, 0b0001, 1, 1, (outs QPR:$dst),
+def VBSLq : N3VX<1, 0, 0b01, 0b0001, 1, 1, (outs QPR:$dst),
                 (ins QPR:$src1, QPR:$src2, QPR:$src3), IIC_VCNTiQ,
-                "vbsl", "\t$dst, $src2, $src3", "$src1 = $dst",
+                "vbsl", "$dst, $src2, $src3", "$src1 = $dst",
                 [(set QPR:$dst,
                   (v4i32 (or (and QPR:$src2, QPR:$src1),
                              (and QPR:$src3, (vnot_conv QPR:$src1)))))]>;
 // VBIF : Vector Bitwise Insert if False
-// like VBSL but with: "vbif\t$dst, $src3, $src1", "$src2 = $dst",
+// like VBSL but with: "vbif $dst, $src3, $src1", "$src2 = $dst",
 // VBIT : Vector Bitwise Insert if True
-// like VBSL but with: "vbit\t$dst, $src2, $src1", "$src3 = $dst",
+// like VBSL but with: "vbit $dst, $src2, $src1", "$src3 = $dst",
 // These are not yet implemented. The TwoAddress pass will not go looking
 // for equivalent operations with different register constraints; it just
 // inserts copies.
@@ -2115,261 +2201,268 @@
 // VABD : Vector Absolute Difference
 defm VABDs : N3VInt_QHS<0, 0, 0b0111, 0, IIC_VBINi4D, IIC_VBINi4D,
                         IIC_VBINi4Q, IIC_VBINi4Q,
-                        "vabd.s", int_arm_neon_vabds, 0>;
+                        "vabd", "s", int_arm_neon_vabds, 0>;
 defm VABDu : N3VInt_QHS<1, 0, 0b0111, 0, IIC_VBINi4D, IIC_VBINi4D,
                         IIC_VBINi4Q, IIC_VBINi4Q,
-                        "vabd.u", int_arm_neon_vabdu, 0>;
+                        "vabd", "u", int_arm_neon_vabdu, 0>;
 def VABDfd : N3VDInt<1, 0, 0b10, 0b1101, 0, IIC_VBIND,
-                     "vabd.f32", v2f32, v2f32, int_arm_neon_vabds, 0>;
+                     "vabd", "f32", v2f32, v2f32, int_arm_neon_vabds, 0>;
 def VABDfq : N3VQInt<1, 0, 0b10, 0b1101, 0, IIC_VBINQ,
-                     "vabd.f32", v4f32, v4f32, int_arm_neon_vabds, 0>;
+                     "vabd", "f32", v4f32, v4f32, int_arm_neon_vabds, 0>;
 // VABDL : Vector Absolute Difference Long (Q = | D - D |)
 defm VABDLs : N3VLInt_QHS<0,1,0b0111,0, IIC_VBINi4Q,
-                          "vabdl.s", int_arm_neon_vabdls, 0>;
+                          "vabdl", "s", int_arm_neon_vabdls, 0>;
 defm VABDLu : N3VLInt_QHS<1,1,0b0111,0, IIC_VBINi4Q,
-                          "vabdl.u", int_arm_neon_vabdlu, 0>;
+                          "vabdl", "u", int_arm_neon_vabdlu, 0>;
 // VABA : Vector Absolute Difference and Accumulate
-defm VABAs : N3VInt3_QHS<0,0,0b0111,1, "vaba.s", int_arm_neon_vabas>;
-defm VABAu : N3VInt3_QHS<1,0,0b0111,1, "vaba.u", int_arm_neon_vabau>;
+defm VABAs : N3VInt3_QHS<0,0,0b0111,1, "vaba", "s", int_arm_neon_vabas>;
+defm VABAu : N3VInt3_QHS<1,0,0b0111,1, "vaba", "u", int_arm_neon_vabau>;
 // VABAL : Vector Absolute Difference and Accumulate Long (Q += | D - D |)
-defm VABALs : N3VLInt3_QHS<0,1,0b0101,0, "vabal.s", int_arm_neon_vabals>;
-defm VABALu : N3VLInt3_QHS<1,1,0b0101,0, "vabal.u", int_arm_neon_vabalu>;
+defm VABALs : N3VLInt3_QHS<0,1,0b0101,0, "vabal", "s", int_arm_neon_vabals>;
+defm VABALu : N3VLInt3_QHS<1,1,0b0101,0, "vabal", "u", int_arm_neon_vabalu>;
 // Vector Maximum and Minimum.
 // VMAX : Vector Maximum
 defm VMAXs : N3VInt_QHS<0, 0, 0b0110, 0, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                        IIC_VBINi4Q, "vmax.s", int_arm_neon_vmaxs, 1>;
+                        IIC_VBINi4Q, "vmax", "s", int_arm_neon_vmaxs, 1>;
 defm VMAXu : N3VInt_QHS<1, 0, 0b0110, 0, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                        IIC_VBINi4Q, "vmax.u", int_arm_neon_vmaxu, 1>;
-def VMAXfd : N3VDInt<0, 0, 0b00, 0b1111, 0, IIC_VBIND, "vmax.f32", v2f32, v2f32,
-                     int_arm_neon_vmaxs, 1>;
-def VMAXfq : N3VQInt<0, 0, 0b00, 0b1111, 0, IIC_VBINQ, "vmax.f32", v4f32, v4f32,
-                     int_arm_neon_vmaxs, 1>;
+                        IIC_VBINi4Q, "vmax", "u", int_arm_neon_vmaxu, 1>;
+def VMAXfd : N3VDInt<0, 0, 0b00, 0b1111, 0, IIC_VBIND, "vmax", "f32",
+                     v2f32, v2f32, int_arm_neon_vmaxs, 1>;
+def VMAXfq : N3VQInt<0, 0, 0b00, 0b1111, 0, IIC_VBINQ, "vmax", "f32",
+                     v4f32, v4f32, int_arm_neon_vmaxs, 1>;
 // VMIN : Vector Minimum
 defm VMINs : N3VInt_QHS<0, 0, 0b0110, 1, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                        IIC_VBINi4Q, "vmin.s", int_arm_neon_vmins, 1>;
+                        IIC_VBINi4Q, "vmin", "s", int_arm_neon_vmins, 1>;
 defm VMINu : N3VInt_QHS<1, 0, 0b0110, 1, IIC_VBINi4D, IIC_VBINi4D, IIC_VBINi4Q,
-                        IIC_VBINi4Q, "vmin.u", int_arm_neon_vminu, 1>;
-def VMINfd : N3VDInt<0, 0, 0b10, 0b1111, 0, IIC_VBIND, "vmin.f32", v2f32, v2f32,
-                     int_arm_neon_vmins, 1>;
-def VMINfq : N3VQInt<0, 0, 0b10, 0b1111, 0, IIC_VBINQ, "vmin.f32", v4f32, v4f32,
-                     int_arm_neon_vmins, 1>;
+                        IIC_VBINi4Q, "vmin", "u", int_arm_neon_vminu, 1>;
+def VMINfd : N3VDInt<0, 0, 0b10, 0b1111, 0, IIC_VBIND, "vmin", "f32",
+                     v2f32, v2f32, int_arm_neon_vmins, 1>;
+def VMINfq : N3VQInt<0, 0, 0b10, 0b1111, 0, IIC_VBINQ, "vmin", "f32",
+                     v4f32, v4f32, int_arm_neon_vmins, 1>;
 // Vector Pairwise Operations.
 // VPADD : Vector Pairwise Add
-def VPADDi8 : N3VDInt<0, 0, 0b00, 0b1011, 1, IIC_VBINiD, "vpadd.i8", v8i8, v8i8,
-                      int_arm_neon_vpadd, 0>;
-def VPADDi16 : N3VDInt<0, 0, 0b01, 0b1011, 1, IIC_VBINiD, "vpadd.i16", v4i16, v4i16,
-                       int_arm_neon_vpadd, 0>;
-def VPADDi32 : N3VDInt<0, 0, 0b10, 0b1011, 1, IIC_VBINiD, "vpadd.i32", v2i32, v2i32,
-                       int_arm_neon_vpadd, 0>;
-def VPADDf : N3VDInt<1, 0, 0b00, 0b1101, 0, IIC_VBIND, "vpadd.f32", v2f32, v2f32,
-                     int_arm_neon_vpadd, 0>;
+def VPADDi8 : N3VDInt<0, 0, 0b00, 0b1011, 1, IIC_VBINiD, "vpadd", "i8",
+                      v8i8, v8i8, int_arm_neon_vpadd, 0>;
+def VPADDi16 : N3VDInt<0, 0, 0b01, 0b1011, 1, IIC_VBINiD, "vpadd", "i16",
+                       v4i16, v4i16, int_arm_neon_vpadd, 0>;
+def VPADDi32 : N3VDInt<0, 0, 0b10, 0b1011, 1, IIC_VBINiD, "vpadd", "i32",
+                       v2i32, v2i32, int_arm_neon_vpadd, 0>;
+def VPADDf : N3VDInt<1, 0, 0b00, 0b1101, 0, IIC_VBIND, "vpadd", "f32",
+                     v2f32, v2f32, int_arm_neon_vpadd, 0>;
 // VPADDL : Vector Pairwise Add Long
-defm VPADDLs : N2VPLInt_QHS<0b11, 0b11, 0b00, 0b00100, 0, "vpaddl.s",
+defm VPADDLs : N2VPLInt_QHS<0b11, 0b11, 0b00, 0b00100, 0, "vpaddl", "s",
                             int_arm_neon_vpaddls>;
-defm VPADDLu : N2VPLInt_QHS<0b11, 0b11, 0b00, 0b00101, 0, "vpaddl.u",
+defm VPADDLu : N2VPLInt_QHS<0b11, 0b11, 0b00, 0b00101, 0, "vpaddl", "u",
                             int_arm_neon_vpaddlu>;
 // VPADAL : Vector Pairwise Add and Accumulate Long
-defm VPADALs : N2VPLInt2_QHS<0b11, 0b11, 0b00, 0b01100, 0, "vpadal.s",
+defm VPADALs : N2VPLInt2_QHS<0b11, 0b11, 0b00, 0b01100, 0, "vpadal", "s",
                              int_arm_neon_vpadals>;
-defm VPADALu : N2VPLInt2_QHS<0b11, 0b11, 0b00, 0b01101, 0, "vpadal.u",
+defm VPADALu : N2VPLInt2_QHS<0b11, 0b11, 0b00, 0b01101, 0, "vpadal", "u",
                              int_arm_neon_vpadalu>;
 // VPMAX : Vector Pairwise Maximum
-def VPMAXs8 : N3VDInt<0, 0, 0b00, 0b1010, 0, IIC_VBINi4D, "vpmax.s8", v8i8, v8i8,
-                      int_arm_neon_vpmaxs, 0>;
-def VPMAXs16 : N3VDInt<0, 0, 0b01, 0b1010, 0, IIC_VBINi4D, "vpmax.s16", v4i16, v4i16,
-                       int_arm_neon_vpmaxs, 0>;
-def VPMAXs32 : N3VDInt<0, 0, 0b10, 0b1010, 0, IIC_VBINi4D, "vpmax.s32", v2i32, v2i32,
-                       int_arm_neon_vpmaxs, 0>;
-def VPMAXu8 : N3VDInt<1, 0, 0b00, 0b1010, 0, IIC_VBINi4D, "vpmax.u8", v8i8, v8i8,
-                      int_arm_neon_vpmaxu, 0>;
-def VPMAXu16 : N3VDInt<1, 0, 0b01, 0b1010, 0, IIC_VBINi4D, "vpmax.u16", v4i16, v4i16,
-                       int_arm_neon_vpmaxu, 0>;
-def VPMAXu32 : N3VDInt<1, 0, 0b10, 0b1010, 0, IIC_VBINi4D, "vpmax.u32", v2i32, v2i32,
-                       int_arm_neon_vpmaxu, 0>;
-def VPMAXf : N3VDInt<1, 0, 0b00, 0b1111, 0, IIC_VBINi4D, "vpmax.f32", v2f32, v2f32,
-                     int_arm_neon_vpmaxs, 0>;
+def VPMAXs8 : N3VDInt<0, 0, 0b00, 0b1010, 0, IIC_VBINi4D, "vpmax", "s8",
+                      v8i8, v8i8, int_arm_neon_vpmaxs, 0>;
+def VPMAXs16 : N3VDInt<0, 0, 0b01, 0b1010, 0, IIC_VBINi4D, "vpmax", "s16",
+                       v4i16, v4i16, int_arm_neon_vpmaxs, 0>;
+def VPMAXs32 : N3VDInt<0, 0, 0b10, 0b1010, 0, IIC_VBINi4D, "vpmax", "s32",
+                       v2i32, v2i32, int_arm_neon_vpmaxs, 0>;
+def VPMAXu8 : N3VDInt<1, 0, 0b00, 0b1010, 0, IIC_VBINi4D, "vpmax", "u8",
+                      v8i8, v8i8, int_arm_neon_vpmaxu, 0>;
+def VPMAXu16 : N3VDInt<1, 0, 0b01, 0b1010, 0, IIC_VBINi4D, "vpmax", "u16",
+                       v4i16, v4i16, int_arm_neon_vpmaxu, 0>;
+def VPMAXu32 : N3VDInt<1, 0, 0b10, 0b1010, 0, IIC_VBINi4D, "vpmax", "u32",
+                       v2i32, v2i32, int_arm_neon_vpmaxu, 0>;
+def VPMAXf : N3VDInt<1, 0, 0b00, 0b1111, 0, IIC_VBINi4D, "vpmax", "f32",
+                     v2f32, v2f32, int_arm_neon_vpmaxs, 0>;
 // VPMIN : Vector Pairwise Minimum
-def VPMINs8 : N3VDInt<0, 0, 0b00, 0b1010, 1, IIC_VBINi4D, "vpmin.s8", v8i8, v8i8,
-                      int_arm_neon_vpmins, 0>;
-def VPMINs16 : N3VDInt<0, 0, 0b01, 0b1010, 1, IIC_VBINi4D, "vpmin.s16", v4i16, v4i16,
-                       int_arm_neon_vpmins, 0>;
-def VPMINs32 : N3VDInt<0, 0, 0b10, 0b1010, 1, IIC_VBINi4D, "vpmin.s32", v2i32, v2i32,
-                       int_arm_neon_vpmins, 0>;
-def VPMINu8 : N3VDInt<1, 0, 0b00, 0b1010, 1, IIC_VBINi4D, "vpmin.u8", v8i8, v8i8,
-                      int_arm_neon_vpminu, 0>;
-def VPMINu16 : N3VDInt<1, 0, 0b01, 0b1010, 1, IIC_VBINi4D, "vpmin.u16", v4i16, v4i16,
-                       int_arm_neon_vpminu, 0>;
-def VPMINu32 : N3VDInt<1, 0, 0b10, 0b1010, 1, IIC_VBINi4D, "vpmin.u32", v2i32, v2i32,
-                       int_arm_neon_vpminu, 0>;
-def VPMINf : N3VDInt<1, 0, 0b10, 0b1111, 0, IIC_VBINi4D, "vpmin.f32", v2f32, v2f32,
-                     int_arm_neon_vpmins, 0>;
+def VPMINs8 : N3VDInt<0, 0, 0b00, 0b1010, 1, IIC_VBINi4D, "vpmin", "s8",
+                      v8i8, v8i8, int_arm_neon_vpmins, 0>;
+def VPMINs16 : N3VDInt<0, 0, 0b01, 0b1010, 1, IIC_VBINi4D, "vpmin", "s16",
+                       v4i16, v4i16, int_arm_neon_vpmins, 0>;
+def VPMINs32 : N3VDInt<0, 0, 0b10, 0b1010, 1, IIC_VBINi4D, "vpmin", "s32",
+                       v2i32, v2i32, int_arm_neon_vpmins, 0>;
+def VPMINu8 : N3VDInt<1, 0, 0b00, 0b1010, 1, IIC_VBINi4D, "vpmin", "u8",
+                      v8i8, v8i8, int_arm_neon_vpminu, 0>;
+def VPMINu16 : N3VDInt<1, 0, 0b01, 0b1010, 1, IIC_VBINi4D, "vpmin", "u16",
+                       v4i16, v4i16, int_arm_neon_vpminu, 0>;
+def VPMINu32 : N3VDInt<1, 0, 0b10, 0b1010, 1, IIC_VBINi4D, "vpmin", "u32",
+                       v2i32, v2i32, int_arm_neon_vpminu, 0>;
+def VPMINf : N3VDInt<1, 0, 0b10, 0b1111, 0, IIC_VBINi4D, "vpmin", "f32",
+                     v2f32, v2f32, int_arm_neon_vpmins, 0>;
 // Vector Reciprocal and Reciprocal Square Root Estimate and Step.
 // VRECPE : Vector Reciprocal Estimate
 def VRECPEd : N2VDInt<0b11, 0b11, 0b10, 0b11, 0b01000, 0,
-                      IIC_VUNAD, "vrecpe.u32",
+                      IIC_VUNAD, "vrecpe", "u32",
                       v2i32, v2i32, int_arm_neon_vrecpe>;
 def VRECPEq : N2VQInt<0b11, 0b11, 0b10, 0b11, 0b01000, 0,
-                      IIC_VUNAQ, "vrecpe.u32",
+                      IIC_VUNAQ, "vrecpe", "u32",
                       v4i32, v4i32, int_arm_neon_vrecpe>;
 def VRECPEfd : N2VDInt<0b11, 0b11, 0b10, 0b11, 0b01010, 0,
-                      IIC_VUNAD, "vrecpe.f32",
+                      IIC_VUNAD, "vrecpe", "f32",
                       v2f32, v2f32, int_arm_neon_vrecpe>;
 def VRECPEfq : N2VQInt<0b11, 0b11, 0b10, 0b11, 0b01010, 0,
-                      IIC_VUNAQ, "vrecpe.f32",
+                      IIC_VUNAQ, "vrecpe", "f32",
                       v4f32, v4f32, int_arm_neon_vrecpe>;
 // VRECPS : Vector Reciprocal Step
-def VRECPSfd : N3VDInt<0, 0, 0b00, 0b1111, 1, IIC_VRECSD, "vrecps.f32", v2f32, v2f32,
-                      int_arm_neon_vrecps, 1>;
-def VRECPSfq : N3VQInt<0, 0, 0b00, 0b1111, 1, IIC_VRECSQ, "vrecps.f32", v4f32, v4f32,
-                      int_arm_neon_vrecps, 1>;
+def VRECPSfd : N3VDInt<0, 0, 0b00, 0b1111, 1,
+                      IIC_VRECSD, "vrecps", "f32",
+                      v2f32, v2f32, int_arm_neon_vrecps, 1>;
+def VRECPSfq : N3VQInt<0, 0, 0b00, 0b1111, 1,
+                      IIC_VRECSQ, "vrecps", "f32",
+                      v4f32, v4f32, int_arm_neon_vrecps, 1>;
 // VRSQRTE : Vector Reciprocal Square Root Estimate
 def VRSQRTEd : N2VDInt<0b11, 0b11, 0b10, 0b11, 0b01001, 0,
-                      IIC_VUNAD, "vrsqrte.u32",
+                      IIC_VUNAD, "vrsqrte", "u32",
                       v2i32, v2i32, int_arm_neon_vrsqrte>;
 def VRSQRTEq : N2VQInt<0b11, 0b11, 0b10, 0b11, 0b01001, 0,
-                      IIC_VUNAQ, "vrsqrte.u32",
+                      IIC_VUNAQ, "vrsqrte", "u32",
                       v4i32, v4i32, int_arm_neon_vrsqrte>;
 def VRSQRTEfd : N2VDInt<0b11, 0b11, 0b10, 0b11, 0b01011, 0,
-                      IIC_VUNAD, "vrsqrte.f32",
+                      IIC_VUNAD, "vrsqrte", "f32",
                       v2f32, v2f32, int_arm_neon_vrsqrte>;
 def VRSQRTEfq : N2VQInt<0b11, 0b11, 0b10, 0b11, 0b01011, 0,
-                      IIC_VUNAQ, "vrsqrte.f32",
+                      IIC_VUNAQ, "vrsqrte", "f32",
                       v4f32, v4f32, int_arm_neon_vrsqrte>;
 // VRSQRTS : Vector Reciprocal Square Root Step
-def VRSQRTSfd : N3VDInt<0, 0, 0b10, 0b1111, 1, IIC_VRECSD, "vrsqrts.f32", v2f32, v2f32,
-                      int_arm_neon_vrsqrts, 1>;
-def VRSQRTSfq : N3VQInt<0, 0, 0b10, 0b1111, 1, IIC_VRECSQ, "vrsqrts.f32", v4f32, v4f32,
-                      int_arm_neon_vrsqrts, 1>;
+def VRSQRTSfd : N3VDInt<0, 0, 0b10, 0b1111, 1,
+                      IIC_VRECSD, "vrsqrts", "f32",
+                      v2f32, v2f32, int_arm_neon_vrsqrts, 1>;
+def VRSQRTSfq : N3VQInt<0, 0, 0b10, 0b1111, 1,
+                      IIC_VRECSQ, "vrsqrts", "f32",
+                      v4f32, v4f32, int_arm_neon_vrsqrts, 1>;
 // Vector Shifts.
 // VSHL : Vector Shift
 defm VSHLs : N3VInt_QHSD<0, 0, 0b0100, 0, IIC_VSHLiD, IIC_VSHLiD, IIC_VSHLiQ,
-                      IIC_VSHLiQ, "vshl.s", int_arm_neon_vshifts, 0>;
+                      IIC_VSHLiQ, "vshl", "s", int_arm_neon_vshifts, 0>;
 defm VSHLu : N3VInt_QHSD<1, 0, 0b0100, 0, IIC_VSHLiD, IIC_VSHLiD, IIC_VSHLiQ,
-                      IIC_VSHLiQ, "vshl.u", int_arm_neon_vshiftu, 0>;
+                      IIC_VSHLiQ, "vshl", "u", int_arm_neon_vshiftu, 0>;
 // VSHL : Vector Shift Left (Immediate)
-defm VSHLi : N2VSh_QHSD<0, 1, 0b0101, 1, IIC_VSHLiD, "vshl.i", NEONvshl>;
+defm VSHLi : N2VSh_QHSD<0, 1, 0b0101, 1, IIC_VSHLiD, "vshl", "i", NEONvshl>;
 // VSHR : Vector Shift Right (Immediate)
-defm VSHRs : N2VSh_QHSD<0, 1, 0b0000, 1, IIC_VSHLiD, "vshr.s", NEONvshrs>;
-defm VSHRu : N2VSh_QHSD<1, 1, 0b0000, 1, IIC_VSHLiD, "vshr.u", NEONvshru>;
+defm VSHRs : N2VSh_QHSD<0, 1, 0b0000, 1, IIC_VSHLiD, "vshr", "s", NEONvshrs>;
+defm VSHRu : N2VSh_QHSD<1, 1, 0b0000, 1, IIC_VSHLiD, "vshr", "u", NEONvshru>;
 // VSHLL : Vector Shift Left Long
-defm VSHLLs : N2VLSh_QHS<0, 1, 0b1010, 0, 0, 1, "vshll.s", NEONvshlls>;
-defm VSHLLu : N2VLSh_QHS<1, 1, 0b1010, 0, 0, 1, "vshll.u", NEONvshllu>;
+defm VSHLLs : N2VLSh_QHS<0, 1, 0b1010, 0, 0, 1, "vshll", "s", NEONvshlls>;
+defm VSHLLu : N2VLSh_QHS<1, 1, 0b1010, 0, 0, 1, "vshll", "u", NEONvshllu>;
 // VSHLL : Vector Shift Left Long (with maximum shift count)
 class N2VLShMax op21_16, bits<4> op11_8, bit op7,
-                bit op6, bit op4, string OpcodeStr, ValueType ResTy,
+                bit op6, bit op4, string OpcodeStr, string Dt, ValueType ResTy,
                 ValueType OpTy, SDNode OpNode>
-  : N2VLSh {
+  : N2VLSh {
   let Inst{21-16} = op21_16;
 }
-def VSHLLi8 : N2VLShMax<1, 1, 0b110010, 0b0011, 0, 0, 0, "vshll.i8",
+def VSHLLi8 : N2VLShMax<1, 1, 0b110010, 0b0011, 0, 0, 0, "vshll", "i8",
                       v8i16, v8i8, NEONvshlli>;
-def VSHLLi16 : N2VLShMax<1, 1, 0b110110, 0b0011, 0, 0, 0, "vshll.i16",
+def VSHLLi16 : N2VLShMax<1, 1, 0b110110, 0b0011, 0, 0, 0, "vshll", "i16",
                       v4i32, v4i16, NEONvshlli>;
-def VSHLLi32 : N2VLShMax<1, 1, 0b111010, 0b0011, 0, 0, 0, "vshll.i32",
+def VSHLLi32 : N2VLShMax<1, 1, 0b111010, 0b0011, 0, 0, 0, "vshll", "i32",
                       v2i64, v2i32, NEONvshlli>;
 // VSHRN : Vector Shift Right and Narrow
-defm VSHRN : N2VNSh_HSD<0,1,0b1000,0,0,1, IIC_VSHLiD, "vshrn.i", NEONvshrn>;
+defm VSHRN : N2VNSh_HSD<0,1,0b1000,0,0,1, IIC_VSHLiD, "vshrn", "i", NEONvshrn>;
 // VRSHL : Vector Rounding Shift
 defm VRSHLs : N3VInt_QHSD<0,0,0b0101,0, IIC_VSHLi4D, IIC_VSHLi4D, IIC_VSHLi4Q,
-                      IIC_VSHLi4Q, "vrshl.s", int_arm_neon_vrshifts, 0>;
+                      IIC_VSHLi4Q, "vrshl", "s", int_arm_neon_vrshifts, 0>;
 defm VRSHLu : N3VInt_QHSD<1,0,0b0101,0, IIC_VSHLi4D, IIC_VSHLi4D, IIC_VSHLi4Q,
-                      IIC_VSHLi4Q, "vrshl.u", int_arm_neon_vrshiftu, 0>;
+                      IIC_VSHLi4Q, "vrshl", "u", int_arm_neon_vrshiftu, 0>;
 // VRSHR : Vector Rounding Shift Right
-defm VRSHRs : N2VSh_QHSD<0, 1, 0b0010, 1, IIC_VSHLi4D, "vrshr.s", NEONvrshrs>;
-defm VRSHRu : N2VSh_QHSD<1, 1, 0b0010, 1, IIC_VSHLi4D, "vrshr.u", NEONvrshru>;
+defm VRSHRs : N2VSh_QHSD<0, 1, 0b0010, 1, IIC_VSHLi4D, "vrshr", "s", NEONvrshrs>;
+defm VRSHRu : N2VSh_QHSD<1, 1, 0b0010, 1, IIC_VSHLi4D, "vrshr", "u", NEONvrshru>;
 // VRSHRN : Vector Rounding Shift Right and Narrow
-defm VRSHRN : N2VNSh_HSD<0, 1, 0b1000, 0, 1, 1, IIC_VSHLi4D, "vrshrn.i",
+defm VRSHRN : N2VNSh_HSD<0, 1, 0b1000, 0, 1, 1, IIC_VSHLi4D, "vrshrn", "i",
                       NEONvrshrn>;
 // VQSHL : Vector Saturating Shift
 defm VQSHLs : N3VInt_QHSD<0,0,0b0100,1, IIC_VSHLi4D, IIC_VSHLi4D, IIC_VSHLi4Q,
-                      IIC_VSHLi4Q, "vqshl.s", int_arm_neon_vqshifts, 0>;
+                      IIC_VSHLi4Q, "vqshl", "s", int_arm_neon_vqshifts, 0>;
 defm VQSHLu : N3VInt_QHSD<1,0,0b0100,1, IIC_VSHLi4D, IIC_VSHLi4D, IIC_VSHLi4Q,
-                      IIC_VSHLi4Q, "vqshl.u", int_arm_neon_vqshiftu, 0>;
+                      IIC_VSHLi4Q, "vqshl", "u", int_arm_neon_vqshiftu, 0>;
 // VQSHL : Vector Saturating Shift Left (Immediate)
-defm VQSHLsi : N2VSh_QHSD<0, 1, 0b0111, 1, IIC_VSHLi4D, "vqshl.s", NEONvqshls>;
-defm VQSHLui : N2VSh_QHSD<1, 1, 0b0111, 1, IIC_VSHLi4D, "vqshl.u", NEONvqshlu>;
+defm VQSHLsi : N2VSh_QHSD<0, 1, 0b0111, 1, IIC_VSHLi4D, "vqshl", "s", NEONvqshls>;
+defm VQSHLui : N2VSh_QHSD<1, 1, 0b0111, 1, IIC_VSHLi4D, "vqshl", "u", NEONvqshlu>;
 // VQSHLU : Vector Saturating Shift Left (Immediate, Unsigned)
-defm VQSHLsu : N2VSh_QHSD<1, 1, 0b0110, 1, IIC_VSHLi4D, "vqshlu.s", NEONvqshlsu>;
+defm VQSHLsu : N2VSh_QHSD<1, 1, 0b0110, 1, IIC_VSHLi4D, "vqshlu", "s", NEONvqshlsu>;
 // VQSHRN : Vector Saturating Shift Right and Narrow
-defm VQSHRNs : N2VNSh_HSD<0, 1, 0b1001, 0, 0, 1, IIC_VSHLi4D, "vqshrn.s",
+defm VQSHRNs : N2VNSh_HSD<0, 1, 0b1001, 0, 0, 1, IIC_VSHLi4D, "vqshrn", "s",
                       NEONvqshrns>;
-defm VQSHRNu : N2VNSh_HSD<1, 1, 0b1001, 0, 0, 1, IIC_VSHLi4D, "vqshrn.u",
+defm VQSHRNu : N2VNSh_HSD<1, 1, 0b1001, 0, 0, 1, IIC_VSHLi4D, "vqshrn", "u",
                       NEONvqshrnu>;
 // VQSHRUN : Vector Saturating Shift Right and Narrow (Unsigned)
-defm VQSHRUN : N2VNSh_HSD<1, 1, 0b1000, 0, 0, 1, IIC_VSHLi4D, "vqshrun.s",
+defm VQSHRUN : N2VNSh_HSD<1, 1, 0b1000, 0, 0, 1, IIC_VSHLi4D, "vqshrun", "s",
                       NEONvqshrnsu>;
 // VQRSHL : Vector Saturating Rounding Shift
 defm VQRSHLs : N3VInt_QHSD<0, 0, 0b0101, 1, IIC_VSHLi4D, IIC_VSHLi4D, IIC_VSHLi4Q,
-                      IIC_VSHLi4Q, "vqrshl.s", int_arm_neon_vqrshifts, 0>;
+                      IIC_VSHLi4Q, "vqrshl", "s",
+                      int_arm_neon_vqrshifts, 0>;
 defm VQRSHLu : N3VInt_QHSD<1, 0, 0b0101, 1, IIC_VSHLi4D, IIC_VSHLi4D, IIC_VSHLi4Q,
-                      IIC_VSHLi4Q, "vqrshl.u", int_arm_neon_vqrshiftu, 0>;
+                      IIC_VSHLi4Q, "vqrshl", "u",
+                      int_arm_neon_vqrshiftu, 0>;
 // VQRSHRN : Vector Saturating Rounding Shift Right and Narrow
-defm VQRSHRNs : N2VNSh_HSD<0, 1, 0b1001, 0, 1, 1, IIC_VSHLi4D, "vqrshrn.s",
+defm VQRSHRNs : N2VNSh_HSD<0, 1, 0b1001, 0, 1, 1, IIC_VSHLi4D, "vqrshrn", "s",
                       NEONvqrshrns>;
-defm VQRSHRNu : N2VNSh_HSD<1, 1, 0b1001, 0, 1, 1, IIC_VSHLi4D, "vqrshrn.u",
+defm VQRSHRNu : N2VNSh_HSD<1, 1, 0b1001, 0, 1, 1, IIC_VSHLi4D, "vqrshrn", "u",
                       NEONvqrshrnu>;
 // VQRSHRUN : Vector Saturating Rounding Shift Right and Narrow (Unsigned)
-defm VQRSHRUN : N2VNSh_HSD<1, 1, 0b1000, 0, 1, 1, IIC_VSHLi4D, "vqrshrun.s",
+defm VQRSHRUN : N2VNSh_HSD<1, 1, 0b1000, 0, 1, 1, IIC_VSHLi4D, "vqrshrun", "s",
                       NEONvqrshrnsu>;
 // VSRA : Vector Shift Right and Accumulate
-defm VSRAs : N2VShAdd_QHSD<0, 1, 0b0001, 1, "vsra.s", NEONvshrs>;
-defm VSRAu : N2VShAdd_QHSD<1, 1, 0b0001, 1, "vsra.u", NEONvshru>;
+defm VSRAs : N2VShAdd_QHSD<0, 1, 0b0001, 1, "vsra", "s", NEONvshrs>;
+defm VSRAu : N2VShAdd_QHSD<1, 1, 0b0001, 1, "vsra", "u", NEONvshru>;
 // VRSRA : Vector Rounding Shift Right and Accumulate
-defm VRSRAs : N2VShAdd_QHSD<0, 1, 0b0011, 1, "vrsra.s", NEONvrshrs>;
-defm VRSRAu : N2VShAdd_QHSD<1, 1, 0b0011, 1, "vrsra.u", NEONvrshru>;
+defm VRSRAs : N2VShAdd_QHSD<0, 1, 0b0011, 1, "vrsra", "s", NEONvrshrs>;
+defm VRSRAu : N2VShAdd_QHSD<1, 1, 0b0011, 1, "vrsra", "u", NEONvrshru>;
 // VSLI : Vector Shift Left and Insert
-defm VSLI : N2VShIns_QHSD<1, 1, 0b0101, 1, "vsli.", NEONvsli>;
+defm VSLI : N2VShIns_QHSD<1, 1, 0b0101, 1, "vsli", NEONvsli>;
 // VSRI : Vector Shift Right and Insert
-defm VSRI : N2VShIns_QHSD<1, 1, 0b0100, 1, "vsri.", NEONvsri>;
+defm VSRI : N2VShIns_QHSD<1, 1, 0b0100, 1, "vsri", NEONvsri>;
 // Vector Absolute and Saturating Absolute.
 // VABS : Vector Absolute Value
 defm VABS : N2VInt_QHS<0b11, 0b11, 0b01, 0b00110, 0,
-                      IIC_VUNAiD, IIC_VUNAiQ, "vabs.s",
+                      IIC_VUNAiD, IIC_VUNAiQ, "vabs", "s",
                       int_arm_neon_vabs>;
 def VABSfd : N2VDInt<0b11, 0b11, 0b10, 0b01, 0b01110, 0,
-                      IIC_VUNAD, "vabs.f32",
+                      IIC_VUNAD, "vabs", "f32",
                       v2f32, v2f32, int_arm_neon_vabs>;
 def VABSfq : N2VQInt<0b11, 0b11, 0b10, 0b01, 0b01110, 0,
-                      IIC_VUNAQ, "vabs.f32",
+                      IIC_VUNAQ, "vabs", "f32",
                       v4f32, v4f32, int_arm_neon_vabs>;
 // VQABS : Vector Saturating Absolute Value
 defm VQABS : N2VInt_QHS<0b11, 0b11, 0b00, 0b01110, 0,
-                      IIC_VQUNAiD, IIC_VQUNAiQ, "vqabs.s",
+                      IIC_VQUNAiD, IIC_VQUNAiQ, "vqabs", "s",
                       int_arm_neon_vqabs>;
 // Vector Negate.
@@ -2377,31 +2470,31 @@
 def vneg : PatFrag<(ops node:$in), (sub immAllZerosV, node:$in)>;
 def vneg_conv : PatFrag<(ops node:$in), (sub immAllZerosV_bc, node:$in)>;
-class VNEGD size, string OpcodeStr, ValueType Ty>
+class VNEGD size, string OpcodeStr, string Dt, ValueType Ty>
   : N2V<0b11, 0b11, size, 0b01, 0b00111, 0, 0, (outs DPR:$dst), (ins DPR:$src),
-        IIC_VSHLiD, OpcodeStr, "\t$dst, $src", "",
+        IIC_VSHLiD, OpcodeStr, Dt, "$dst, $src", "",
         [(set DPR:$dst, (Ty (vneg DPR:$src)))]>;
-class VNEGQ size, string OpcodeStr, ValueType Ty>
+class VNEGQ size, string OpcodeStr, string Dt, ValueType Ty>
   : N2V<0b11, 0b11, size, 0b01, 0b00111, 1, 0, (outs QPR:$dst), (ins QPR:$src),
-        IIC_VSHLiD, OpcodeStr, "\t$dst, $src", "",
+        IIC_VSHLiD, OpcodeStr, Dt, "$dst, $src", "",
         [(set QPR:$dst, (Ty (vneg QPR:$src)))]>;
 // VNEG : Vector Negate
-def VNEGs8d : VNEGD<0b00, "vneg.s8", v8i8>;
-def VNEGs16d : VNEGD<0b01, "vneg.s16", v4i16>;
-def VNEGs32d : VNEGD<0b10, "vneg.s32", v2i32>;
-def VNEGs8q : VNEGQ<0b00, "vneg.s8", v16i8>;
-def VNEGs16q : VNEGQ<0b01, "vneg.s16", v8i16>;
-def VNEGs32q : VNEGQ<0b10, "vneg.s32", v4i32>;
+def VNEGs8d : VNEGD<0b00, "vneg", "s8", v8i8>;
+def VNEGs16d : VNEGD<0b01, "vneg", "s16", v4i16>;
+def VNEGs32d : VNEGD<0b10, "vneg", "s32", v2i32>;
+def VNEGs8q : VNEGQ<0b00, "vneg", "s8", v16i8>;
+def VNEGs16q : VNEGQ<0b01, "vneg", "s16", v8i16>;
+def VNEGs32q : VNEGQ<0b10, "vneg", "s32", v4i32>;
 // VNEG : Vector Negate (floating-point)
 def VNEGf32d : N2V<0b11, 0b11, 0b10, 0b01, 0b01111, 0, 0,
                    (outs DPR:$dst), (ins DPR:$src), IIC_VUNAD,
-                   "vneg.f32", "\t$dst, $src", "",
+                   "vneg", "f32", "$dst, $src", "",
                    [(set DPR:$dst, (v2f32 (fneg DPR:$src)))]>;
 def VNEGf32q : N2V<0b11, 0b11, 0b10, 0b01, 0b01111, 1, 0,
                    (outs QPR:$dst), (ins QPR:$src), IIC_VUNAQ,
-                   "vneg.f32", "\t$dst, $src", "",
+                   "vneg", "f32", "$dst, $src", "",
                    [(set QPR:$dst, (v4f32 (fneg QPR:$src)))]>;
 def : Pat<(v8i8 (vneg_conv DPR:$src)), (VNEGs8d DPR:$src)>;
@@ -2413,35 +2506,35 @@
 // VQNEG : Vector Saturating Negate
 defm VQNEG : N2VInt_QHS<0b11, 0b11, 0b00, 0b01111, 0,
-                      IIC_VQUNAiD, IIC_VQUNAiQ, "vqneg.s",
+                      IIC_VQUNAiD, IIC_VQUNAiQ, "vqneg", "s",
                       int_arm_neon_vqneg>;
 // Vector Bit Counting Operations.
 // VCLS : Vector Count Leading Sign Bits
 defm VCLS : N2VInt_QHS<0b11, 0b11, 0b00, 0b01000, 0,
-                      IIC_VCNTiD, IIC_VCNTiQ, "vcls.s",
+                      IIC_VCNTiD, IIC_VCNTiQ, "vcls", "s",
                       int_arm_neon_vcls>;
 // VCLZ : Vector Count Leading Zeros
 defm VCLZ : N2VInt_QHS<0b11, 0b11, 0b00, 0b01001, 0,
-                      IIC_VCNTiD, IIC_VCNTiQ, "vclz.i",
+                      IIC_VCNTiD, IIC_VCNTiQ, "vclz", "i",
                       int_arm_neon_vclz>;
 // VCNT : Vector Count One Bits
 def VCNTd : N2VDInt<0b11, 0b11, 0b00, 0b00, 0b01010, 0,
-                      IIC_VCNTiD, "vcnt.8",
+                      IIC_VCNTiD, "vcnt", "8",
                       v8i8, v8i8, int_arm_neon_vcnt>;
 def VCNTq : N2VQInt<0b11, 0b11, 0b00, 0b00, 0b01010, 0,
-                      IIC_VCNTiQ, "vcnt.8",
+                      IIC_VCNTiQ, "vcnt", "8",
                       v16i8, v16i8, int_arm_neon_vcnt>;
 // Vector Move Operations.
 // VMOV : Vector Move (Register)
-def VMOVDneon: N3V<0, 0, 0b10, 0b0001, 0, 1, (outs DPR:$dst), (ins DPR:$src),
-                   IIC_VMOVD, "vmov", "\t$dst, $src", "", []>;
-def VMOVQ : N3V<0, 0, 0b10, 0b0001, 1, 1, (outs QPR:$dst), (ins QPR:$src),
-                   IIC_VMOVD, "vmov", "\t$dst, $src", "", []>;
+def VMOVDneon: N3VX<0, 0, 0b10, 0b0001, 0, 1, (outs DPR:$dst), (ins DPR:$src),
+                   IIC_VMOVD, "vmov", "$dst, $src", "", []>;
+def VMOVQ : N3VX<0, 0, 0b10, 0b0001, 1, 1, (outs QPR:$dst), (ins QPR:$src),
+                   IIC_VMOVD, "vmov", "$dst, $src", "", []>;
 // VMOV : Vector Move (Immediate)
@@ -2482,65 +2575,65 @@
 def VMOVv8i8 : N1ModImm<1, 0b000, 0b1110, 0, 0, 0, 1, (outs DPR:$dst),
                         (ins h8imm:$SIMM), IIC_VMOVImm,
-                        "vmov.i8", "\t$dst, $SIMM", "",
+                        "vmov", "i8", "$dst, $SIMM", "",
                         [(set DPR:$dst, (v8i8 vmovImm8:$SIMM))]>;
 def VMOVv16i8 : N1ModImm<1, 0b000, 0b1110, 0, 1, 0, 1, (outs QPR:$dst),
                         (ins h8imm:$SIMM), IIC_VMOVImm,
-                        "vmov.i8", "\t$dst, $SIMM", "",
+                        "vmov", "i8", "$dst, $SIMM", "",
                         [(set QPR:$dst, (v16i8 vmovImm8:$SIMM))]>;
 def VMOVv4i16 : N1ModImm<1, 0b000, 0b1000, 0, 0, 0, 1, (outs DPR:$dst),
                         (ins h16imm:$SIMM), IIC_VMOVImm,
-                        "vmov.i16", "\t$dst, $SIMM", "",
+                        "vmov", "i16", "$dst, $SIMM", "",
                         [(set DPR:$dst, (v4i16 vmovImm16:$SIMM))]>;
 def VMOVv8i16 : N1ModImm<1, 0b000, 0b1000, 0, 1, 0, 1, (outs QPR:$dst),
                         (ins h16imm:$SIMM), IIC_VMOVImm,
-                        "vmov.i16", "\t$dst, $SIMM", "",
+                        "vmov", "i16", "$dst, $SIMM", "",
                         [(set QPR:$dst, (v8i16 vmovImm16:$SIMM))]>;
 def VMOVv2i32 : N1ModImm<1, 0b000, 0b0000, 0, 0, 0, 1, (outs DPR:$dst),
                         (ins h32imm:$SIMM), IIC_VMOVImm,
-                        "vmov.i32", "\t$dst, $SIMM", "",
+                        "vmov", "i32", "$dst, $SIMM", "",
                         [(set DPR:$dst, (v2i32 vmovImm32:$SIMM))]>;
 def VMOVv4i32 : N1ModImm<1, 0b000, 0b0000, 0, 1, 0, 1, (outs QPR:$dst),
                         (ins h32imm:$SIMM), IIC_VMOVImm,
-                        "vmov.i32", "\t$dst, $SIMM", "",
+                        "vmov", "i32", "$dst, $SIMM", "",
                         [(set QPR:$dst, (v4i32 vmovImm32:$SIMM))]>;
 def VMOVv1i64 : N1ModImm<1, 0b000, 0b1110, 0, 0, 1, 1, (outs DPR:$dst),
                         (ins h64imm:$SIMM), IIC_VMOVImm,
-                        "vmov.i64", "\t$dst, $SIMM", "",
+                        "vmov", "i64", "$dst, $SIMM", "",
                         [(set DPR:$dst, (v1i64 vmovImm64:$SIMM))]>;
 def VMOVv2i64 : N1ModImm<1, 0b000, 0b1110, 0, 1, 1, 1, (outs QPR:$dst),
                         (ins h64imm:$SIMM), IIC_VMOVImm,
-                        "vmov.i64", "\t$dst, $SIMM", "",
+                        "vmov", "i64", "$dst, $SIMM", "",
                         [(set QPR:$dst, (v2i64 vmovImm64:$SIMM))]>;
 // VMOV : Vector Get Lane (move scalar to ARM core register)
 def VGETLNs8 : NVGetLane<{1,1,1,0,0,1,?,1}, 0b1011, {?,?},
                          (outs GPR:$dst), (ins DPR:$src, nohash_imm:$lane),
-                         IIC_VMOVSI, "vmov", ".s8\t$dst, $src[$lane]",
+                         IIC_VMOVSI, "vmov", "s8", "$dst, $src[$lane]",
                          [(set GPR:$dst, (NEONvgetlanes (v8i8 DPR:$src),
                                           imm:$lane))]>;
 def VGETLNs16 : NVGetLane<{1,1,1,0,0,0,?,1}, 0b1011, {?,1},
                          (outs GPR:$dst), (ins DPR:$src, nohash_imm:$lane),
-                         IIC_VMOVSI, "vmov", ".s16\t$dst, $src[$lane]",
+                         IIC_VMOVSI, "vmov", "s16", "$dst, $src[$lane]",
                          [(set GPR:$dst, (NEONvgetlanes (v4i16 DPR:$src),
                                           imm:$lane))]>;
 def VGETLNu8 : NVGetLane<{1,1,1,0,1,1,?,1}, 0b1011, {?,?},
                          (outs GPR:$dst), (ins DPR:$src, nohash_imm:$lane),
-                         IIC_VMOVSI, "vmov", ".u8\t$dst, $src[$lane]",
+                         IIC_VMOVSI, "vmov", "u8", "$dst, $src[$lane]",
                          [(set GPR:$dst, (NEONvgetlaneu (v8i8 DPR:$src),
                                           imm:$lane))]>;
 def VGETLNu16 : NVGetLane<{1,1,1,0,1,0,?,1}, 0b1011, {?,1},
                          (outs GPR:$dst), (ins DPR:$src, nohash_imm:$lane),
-                         IIC_VMOVSI, "vmov", ".u16\t$dst, $src[$lane]",
+                         IIC_VMOVSI, "vmov", "u16", "$dst, $src[$lane]",
                          [(set GPR:$dst, (NEONvgetlaneu (v4i16 DPR:$src),
                                           imm:$lane))]>;
 def VGETLNi32 : NVGetLane<{1,1,1,0,0,0,?,1}, 0b1011, 0b00,
                          (outs GPR:$dst), (ins DPR:$src, nohash_imm:$lane),
-                         IIC_VMOVSI, "vmov", ".32\t$dst, $src[$lane]",
+                         IIC_VMOVSI, "vmov", "32", "$dst, $src[$lane]",
                          [(set GPR:$dst, (extractelt (v2i32 DPR:$src),
                                           imm:$lane))]>;
 // def VGETLNf32: see FMRDH and FMRDL in ARMInstrVFP.td
@@ -2581,17 +2674,17 @@
 let Constraints = "$src1 = $dst" in {
 def VSETLNi8 : NVSetLane<{1,1,1,0,0,1,?,0}, 0b1011, {?,?},
                          (outs DPR:$dst), (ins DPR:$src1, GPR:$src2, nohash_imm:$lane),
-                         IIC_VMOVISL, "vmov", ".8\t$dst[$lane], $src2",
+                         IIC_VMOVISL, "vmov", "8", "$dst[$lane], $src2",
                          [(set DPR:$dst, (vector_insert (v8i8 DPR:$src1),
                                           GPR:$src2, imm:$lane))]>;
 def VSETLNi16 : NVSetLane<{1,1,1,0,0,0,?,0}, 0b1011, {?,1},
                          (outs DPR:$dst), (ins DPR:$src1, GPR:$src2, nohash_imm:$lane),
-                         IIC_VMOVISL, "vmov", ".16\t$dst[$lane], $src2",
+                         IIC_VMOVISL, "vmov", "16", "$dst[$lane], $src2",
                          [(set DPR:$dst, (vector_insert (v4i16 DPR:$src1),
                                           GPR:$src2, imm:$lane))]>;
 def VSETLNi32 : NVSetLane<{1,1,1,0,0,0,?,0}, 0b1011, 0b00,
                          (outs DPR:$dst), (ins DPR:$src1, GPR:$src2, nohash_imm:$lane),
-                         IIC_VMOVISL, "vmov", ".32\t$dst[$lane], $src2",
+                         IIC_VMOVISL, "vmov", "32", "$dst[$lane], $src2",
                          [(set DPR:$dst, (insertelt (v2i32 DPR:$src1),
                                           GPR:$src2, imm:$lane))]>;
 }
@@ -2655,56 +2748,57 @@
 // VDUP : Vector Duplicate (from ARM core register to all elements)
-class VDUPD opcod1, bits<2> opcod3, string asmSize, ValueType Ty>
+class VDUPD opcod1, bits<2> opcod3, string Dt, ValueType Ty>
   : NVDup;
-class VDUPQ opcod1, bits<2> opcod3, string asmSize, ValueType Ty>
+class VDUPQ opcod1, bits<2> opcod3, string Dt, ValueType Ty>
   : NVDup;
-def VDUP8d : VDUPD<0b11101100, 0b00, ".8", v8i8>;
-def VDUP16d : VDUPD<0b11101000, 0b01, ".16", v4i16>;
-def VDUP32d : VDUPD<0b11101000, 0b00, ".32", v2i32>;
-def VDUP8q : VDUPQ<0b11101110, 0b00, ".8", v16i8>;
-def VDUP16q : VDUPQ<0b11101010, 0b01, ".16", v8i16>;
-def VDUP32q : VDUPQ<0b11101010, 0b00, ".32", v4i32>;
+def VDUP8d : VDUPD<0b11101100, 0b00, "8", v8i8>;
+def VDUP16d : VDUPD<0b11101000, 0b01, "16", v4i16>;
+def VDUP32d : VDUPD<0b11101000, 0b00, "32", v2i32>;
+def VDUP8q : VDUPQ<0b11101110, 0b00, "8", v16i8>;
+def VDUP16q : VDUPQ<0b11101010, 0b01, "16", v8i16>;
+def VDUP32q : VDUPQ<0b11101010, 0b00, "32", v4i32>;
 def VDUPfd : NVDup<0b11101000, 0b1011, 0b00, (outs DPR:$dst), (ins GPR:$src),
-                   IIC_VMOVIS, "vdup", ".32\t$dst, $src",
+                   IIC_VMOVIS, "vdup", "32", "$dst, $src",
                    [(set DPR:$dst, (v2f32 (NEONvdup (f32 (bitconvert GPR:$src)))))]>;
 def VDUPfq : NVDup<0b11101010, 0b1011, 0b00, (outs QPR:$dst), (ins GPR:$src),
-                   IIC_VMOVIS, "vdup", ".32\t$dst, $src",
+                   IIC_VMOVIS, "vdup", "32", "$dst, $src",
                    [(set QPR:$dst, (v4f32 (NEONvdup (f32 (bitconvert GPR:$src)))))]>;
 // VDUP : Vector Duplicate Lane (from scalar to all elements)
-class VDUPLND op19_18, bits<2> op17_16, string OpcodeStr, ValueType Ty>
+class VDUPLND op19_18, bits<2> op17_16,
+              string OpcodeStr, string Dt, ValueType Ty>
   : N2V<0b11, 0b11, op19_18, op17_16, 0b11000, 0, 0,
         (outs DPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVD,
-        OpcodeStr, "\t$dst, $src[$lane]", "",
+        OpcodeStr, Dt, "$dst, $src[$lane]", "",
         [(set DPR:$dst, (Ty (NEONvduplane (Ty DPR:$src), imm:$lane)))]>;
-class VDUPLNQ op19_18, bits<2> op17_16, string OpcodeStr,
+class VDUPLNQ op19_18, bits<2> op17_16, string OpcodeStr, string Dt,
               ValueType ResTy, ValueType OpTy>
   : N2V<0b11, 0b11, op19_18, op17_16, 0b11000, 1, 0,
         (outs QPR:$dst), (ins DPR:$src, nohash_imm:$lane), IIC_VMOVD,
-        OpcodeStr, "\t$dst, $src[$lane]", "",
+        OpcodeStr, Dt, "$dst, $src[$lane]", "",
         [(set QPR:$dst, (ResTy (NEONvduplane (OpTy DPR:$src), imm:$lane)))]>;
 // Inst{19-16} is partially specified depending on the element size.
-def VDUPLN8d : VDUPLND<{?,?}, {?,1}, "vdup.8", v8i8>;
-def VDUPLN16d : VDUPLND<{?,?}, {1,0}, "vdup.16", v4i16>;
-def VDUPLN32d : VDUPLND<{?,1}, {0,0}, "vdup.32", v2i32>;
-def VDUPLNfd : VDUPLND<{?,1}, {0,0}, "vdup.32", v2f32>;
-def VDUPLN8q : VDUPLNQ<{?,?}, {?,1}, "vdup.8", v16i8, v8i8>;
-def VDUPLN16q : VDUPLNQ<{?,?}, {1,0}, "vdup.16", v8i16, v4i16>;
-def VDUPLN32q : VDUPLNQ<{?,1}, {0,0}, "vdup.32", v4i32, v2i32>;
-def VDUPLNfq : VDUPLNQ<{?,1}, {0,0}, "vdup.32", v4f32, v2f32>;
+def VDUPLN8d : VDUPLND<{?,?}, {?,1}, "vdup", "8", v8i8>;
+def VDUPLN16d : VDUPLND<{?,?}, {1,0}, "vdup", "16", v4i16>;
+def VDUPLN32d : VDUPLND<{?,1}, {0,0}, "vdup", "32", v2i32>;
+def VDUPLNfd : VDUPLND<{?,1}, {0,0}, "vdup", "32", v2f32>;
+def VDUPLN8q : VDUPLNQ<{?,?}, {?,1}, "vdup", "8", v16i8, v8i8>;
+def VDUPLN16q : VDUPLNQ<{?,?}, {1,0}, "vdup", "16", v8i16, v4i16>;
+def VDUPLN32q : VDUPLNQ<{?,1}, {0,0}, "vdup", "32", v4i32, v2i32>;
+def VDUPLNfq : VDUPLNQ<{?,1}, {0,0}, "vdup", "32", v4f32, v2f32>;
 def : Pat<(v16i8 (NEONvduplane (v16i8 QPR:$src), imm:$lane)),
           (v16i8 (VDUPLN8q (v8i8 (EXTRACT_SUBREG QPR:$src,
@@ -2725,12 +2819,12 @@
 def VDUPfdf : N2V<0b11, 0b11, {?,1}, {0,0}, 0b11000, 0, 0,
                   (outs DPR:$dst), (ins SPR:$src),
-                  IIC_VMOVD, "vdup.32", "\t$dst, ${src:lane}", "",
+                  IIC_VMOVD, "vdup", "32", "$dst, ${src:lane}", "",
                   [(set DPR:$dst, (v2f32 (NEONvdup (f32 SPR:$src))))]>;
 def VDUPfqf : N2V<0b11, 0b11, {?,1}, {0,0}, 0b11000, 1, 0,
                   (outs QPR:$dst), (ins SPR:$src),
-                  IIC_VMOVD, "vdup.32", "\t$dst, ${src:lane}", "",
+                  IIC_VMOVD, "vdup", "32", "$dst, ${src:lane}", "",
                   [(set QPR:$dst, (v4f32 (NEONvdup (f32 SPR:$src))))]>;
 def : Pat<(v2i64 (NEONvduplane (v2i64 QPR:$src), imm:$lane)),
@@ -2743,176 +2837,178 @@
           (DSubReg_f64_other_reg imm:$lane))>;
 // VMOVN : Vector Narrowing Move
-defm VMOVN : N2VNInt_HSD<0b11,0b11,0b10,0b00100,0,0, IIC_VMOVD, "vmovn.i",
-                         int_arm_neon_vmovn>;
+defm VMOVN : N2VNInt_HSD<0b11,0b11,0b10,0b00100,0,0, IIC_VMOVD,
+                         "vmovn", "i", int_arm_neon_vmovn>;
 // VQMOVN : Vector Saturating Narrowing Move
-defm VQMOVNs : N2VNInt_HSD<0b11,0b11,0b10,0b00101,0,0, IIC_VQUNAiD, "vqmovn.s",
-                         int_arm_neon_vqmovns>;
-defm VQMOVNu : N2VNInt_HSD<0b11,0b11,0b10,0b00101,1,0, IIC_VQUNAiD, "vqmovn.u",
-                         int_arm_neon_vqmovnu>;
-defm VQMOVNsu : N2VNInt_HSD<0b11,0b11,0b10,0b00100,1,0, IIC_VQUNAiD, "vqmovun.s",
-                         int_arm_neon_vqmovnsu>;
+defm VQMOVNs : N2VNInt_HSD<0b11,0b11,0b10,0b00101,0,0, IIC_VQUNAiD,
+                         "vqmovn", "s", int_arm_neon_vqmovns>;
+defm VQMOVNu : N2VNInt_HSD<0b11,0b11,0b10,0b00101,1,0, IIC_VQUNAiD,
+                         "vqmovn", "u", int_arm_neon_vqmovnu>;
+defm VQMOVNsu : N2VNInt_HSD<0b11,0b11,0b10,0b00100,1,0, IIC_VQUNAiD,
+                         "vqmovun", "s", int_arm_neon_vqmovnsu>;
 // VMOVL : Vector Lengthening Move
-defm VMOVLs : N2VLInt_QHS<0b01,0b10100,0,1, "vmovl.s", int_arm_neon_vmovls>;
-defm VMOVLu : N2VLInt_QHS<0b11,0b10100,0,1, "vmovl.u", int_arm_neon_vmovlu>;
+defm VMOVLs : N2VLInt_QHS<0b01,0b10100,0,1, "vmovl", "s",
+                         int_arm_neon_vmovls>;
+defm VMOVLu : N2VLInt_QHS<0b11,0b10100,0,1, "vmovl", "u",
+                         int_arm_neon_vmovlu>;
 // Vector Conversions.
 // VCVT : Vector Convert Between Floating-Point and Integers
-def VCVTf2sd : N2VD<0b11, 0b11, 0b10, 0b11, 0b01110, 0, "vcvt.s32.f32",
+def VCVTf2sd : N2VD<0b11, 0b11, 0b10, 0b11, 0b01110, 0, "vcvt", "s32.f32",
                     v2i32, v2f32, fp_to_sint>;
-def VCVTf2ud : N2VD<0b11, 0b11, 0b10, 0b11, 0b01111, 0, "vcvt.u32.f32",
+def VCVTf2ud : N2VD<0b11, 0b11, 0b10, 0b11, 0b01111, 0, "vcvt", "u32.f32",
                     v2i32, v2f32, fp_to_uint>;
-def VCVTs2fd : N2VD<0b11, 0b11, 0b10, 0b11, 0b01100, 0, "vcvt.f32.s32",
+def VCVTs2fd : N2VD<0b11, 0b11, 0b10, 0b11, 0b01100, 0, "vcvt", "f32.s32",
                     v2f32, v2i32, sint_to_fp>;
-def VCVTu2fd : N2VD<0b11, 0b11, 0b10, 0b11, 0b01101, 0, "vcvt.f32.u32",
+def VCVTu2fd : N2VD<0b11, 0b11, 0b10, 0b11, 0b01101, 0, "vcvt", "f32.u32",
                     v2f32, v2i32, uint_to_fp>;
-def VCVTf2sq : N2VQ<0b11, 0b11, 0b10, 0b11, 0b01110, 0, "vcvt.s32.f32",
+def VCVTf2sq : N2VQ<0b11, 0b11, 0b10, 0b11, 0b01110, 0, "vcvt", "s32.f32",
                     v4i32, v4f32, fp_to_sint>;
-def VCVTf2uq : N2VQ<0b11, 0b11, 0b10, 0b11, 0b01111, 0, "vcvt.u32.f32",
+def VCVTf2uq : N2VQ<0b11, 0b11, 0b10, 0b11, 0b01111, 0, "vcvt", "u32.f32",
                     v4i32, v4f32, fp_to_uint>;
-def VCVTs2fq : N2VQ<0b11, 0b11, 0b10, 0b11, 0b01100, 0, "vcvt.f32.s32",
+def VCVTs2fq : N2VQ<0b11, 0b11, 0b10, 0b11, 0b01100, 0, "vcvt", "f32.s32",
                     v4f32, v4i32, sint_to_fp>;
-def VCVTu2fq : N2VQ<0b11, 0b11, 0b10, 0b11, 0b01101, 0, "vcvt.f32.u32",
+def VCVTu2fq : N2VQ<0b11, 0b11, 0b10, 0b11, 0b01101, 0, "vcvt", "f32.u32",
                     v4f32, v4i32, uint_to_fp>;
 // VCVT : Vector Convert Between Floating-Point and Fixed-Point.
-def VCVTf2xsd : N2VCvtD<0, 1, 0b1111, 0, 1, "vcvt.s32.f32",
+def VCVTf2xsd : N2VCvtD<0, 1, 0b1111, 0, 1, "vcvt", "s32.f32",
                         v2i32, v2f32, int_arm_neon_vcvtfp2fxs>;
-def VCVTf2xud : N2VCvtD<1, 1, 0b1111, 0, 1, "vcvt.u32.f32",
+def VCVTf2xud : N2VCvtD<1, 1, 0b1111, 0, 1, "vcvt", "u32.f32",
                         v2i32, v2f32, int_arm_neon_vcvtfp2fxu>;
-def VCVTxs2fd : N2VCvtD<0, 1, 0b1110, 0, 1, "vcvt.f32.s32",
+def VCVTxs2fd : N2VCvtD<0, 1, 0b1110, 0, 1, "vcvt", "f32.s32",
                         v2f32, v2i32, int_arm_neon_vcvtfxs2fp>;
-def VCVTxu2fd : N2VCvtD<1, 1, 0b1110, 0, 1, "vcvt.f32.u32",
+def VCVTxu2fd : N2VCvtD<1, 1, 0b1110, 0, 1, "vcvt", "f32.u32",
                         v2f32, v2i32, int_arm_neon_vcvtfxu2fp>;
-def VCVTf2xsq : N2VCvtQ<0, 1, 0b1111, 0, 1, "vcvt.s32.f32",
+def VCVTf2xsq : N2VCvtQ<0, 1, 0b1111, 0, 1, "vcvt", "s32.f32",
                         v4i32, v4f32, int_arm_neon_vcvtfp2fxs>;
-def VCVTf2xuq : N2VCvtQ<1, 1, 0b1111, 0, 1, "vcvt.u32.f32",
+def VCVTf2xuq : N2VCvtQ<1, 1, 0b1111, 0, 1, "vcvt", "u32.f32",
                         v4i32, v4f32, int_arm_neon_vcvtfp2fxu>;
-def VCVTxs2fq : N2VCvtQ<0, 1, 0b1110, 0, 1, "vcvt.f32.s32",
+def VCVTxs2fq : N2VCvtQ<0, 1, 0b1110, 0, 1, "vcvt", "f32.s32",
                         v4f32, v4i32, int_arm_neon_vcvtfxs2fp>;
-def VCVTxu2fq : N2VCvtQ<1, 1, 0b1110, 0, 1, "vcvt.f32.u32",
+def VCVTxu2fq : N2VCvtQ<1, 1, 0b1110, 0, 1, "vcvt", "f32.u32",
                         v4f32, v4i32, int_arm_neon_vcvtfxu2fp>;
 // Vector Reverse.
 // VREV64 : Vector Reverse elements within 64-bit doublewords
-class VREV64D op19_18, string OpcodeStr, ValueType Ty>
+class VREV64D op19_18, string OpcodeStr, string Dt, ValueType Ty>
   : N2V<0b11, 0b11, op19_18, 0b00, 0b00000, 0, 0, (outs DPR:$dst),
         (ins DPR:$src), IIC_VMOVD,
-        OpcodeStr, "\t$dst, $src", "",
+        OpcodeStr, Dt, "$dst, $src", "",
         [(set DPR:$dst, (Ty (NEONvrev64 (Ty DPR:$src))))]>;
-class VREV64Q op19_18, string OpcodeStr, ValueType Ty>
+class VREV64Q op19_18, string OpcodeStr, string Dt, ValueType Ty>
   : N2V<0b11, 0b11, op19_18, 0b00, 0b00000, 1, 0, (outs QPR:$dst),
         (ins QPR:$src), IIC_VMOVD,
-        OpcodeStr, "\t$dst, $src", "",
+        OpcodeStr, Dt, "$dst, $src", "",
         [(set QPR:$dst, (Ty (NEONvrev64 (Ty QPR:$src))))]>;
-def VREV64d8 : VREV64D<0b00, "vrev64.8", v8i8>;
-def VREV64d16 : VREV64D<0b01, "vrev64.16", v4i16>;
-def VREV64d32 : VREV64D<0b10, "vrev64.32", v2i32>;
-def VREV64df : VREV64D<0b10, "vrev64.32", v2f32>;
-
-def VREV64q8 : VREV64Q<0b00, "vrev64.8", v16i8>;
-def VREV64q16 : VREV64Q<0b01, "vrev64.16", v8i16>;
-def VREV64q32 : VREV64Q<0b10, "vrev64.32", v4i32>;
-def VREV64qf : VREV64Q<0b10, "vrev64.32", v4f32>;
+def VREV64d8 : VREV64D<0b00, "vrev64", "8", v8i8>;
+def VREV64d16 : VREV64D<0b01, "vrev64", "16", v4i16>;
+def VREV64d32 : VREV64D<0b10, "vrev64", "32", v2i32>;
+def VREV64df : VREV64D<0b10, "vrev64", "32", v2f32>;
+
+def VREV64q8 : VREV64Q<0b00, "vrev64", "8", v16i8>;
+def VREV64q16 : VREV64Q<0b01, "vrev64", "16", v8i16>;
+def VREV64q32 : VREV64Q<0b10, "vrev64", "32", v4i32>;
+def VREV64qf : VREV64Q<0b10, "vrev64", "32", v4f32>;
 // VREV32 : Vector Reverse elements within 32-bit words
-class VREV32D op19_18, string OpcodeStr, ValueType Ty>
+class VREV32D op19_18, string OpcodeStr, string Dt, ValueType Ty>
   : N2V<0b11, 0b11, op19_18, 0b00, 0b00001, 0, 0, (outs DPR:$dst),
         (ins DPR:$src), IIC_VMOVD,
-        OpcodeStr, "\t$dst, $src", "",
+        OpcodeStr, Dt, "$dst, $src", "",
         [(set DPR:$dst, (Ty (NEONvrev32 (Ty DPR:$src))))]>;
-class VREV32Q op19_18, string OpcodeStr, ValueType Ty>
+class VREV32Q op19_18, string OpcodeStr, string Dt, ValueType Ty>
   : N2V<0b11, 0b11, op19_18, 0b00, 0b00001, 1, 0, (outs QPR:$dst),
         (ins QPR:$src), IIC_VMOVD,
-        OpcodeStr, "\t$dst, $src", "",
+        OpcodeStr, Dt, "$dst, $src", "",
         [(set QPR:$dst, (Ty (NEONvrev32 (Ty QPR:$src))))]>;
-def VREV32d8 : VREV32D<0b00, "vrev32.8", v8i8>;
-def VREV32d16 : VREV32D<0b01, "vrev32.16", v4i16>;
+def VREV32d8 : VREV32D<0b00, "vrev32", "8", v8i8>;
+def VREV32d16 : VREV32D<0b01, "vrev32", "16", v4i16>;
-def VREV32q8 : VREV32Q<0b00, "vrev32.8", v16i8>;
-def VREV32q16 : VREV32Q<0b01, "vrev32.16", v8i16>;
+def VREV32q8 : VREV32Q<0b00, "vrev32", "8", v16i8>;
+def VREV32q16 : VREV32Q<0b01, "vrev32", "16", v8i16>;
 // VREV16 : Vector Reverse elements within 16-bit halfwords
-class VREV16D op19_18, string OpcodeStr, ValueType Ty>
+class VREV16D op19_18, string OpcodeStr, string Dt, ValueType Ty>
   : N2V<0b11, 0b11, op19_18, 0b00, 0b00010, 0, 0, (outs DPR:$dst),
         (ins DPR:$src), IIC_VMOVD,
-        OpcodeStr, "\t$dst, $src", "",
+        OpcodeStr, Dt, "$dst, $src", "",
         [(set DPR:$dst, (Ty (NEONvrev16 (Ty DPR:$src))))]>;
-class VREV16Q op19_18, string OpcodeStr, ValueType Ty>
+class VREV16Q op19_18, string OpcodeStr, string Dt, ValueType Ty>
   : N2V<0b11, 0b11, op19_18, 0b00, 0b00010, 1, 0, (outs QPR:$dst),
         (ins QPR:$src), IIC_VMOVD,
-        OpcodeStr, "\t$dst, $src", "",
+        OpcodeStr, Dt, "$dst, $src", "",
         [(set QPR:$dst, (Ty (NEONvrev16 (Ty QPR:$src))))]>;
-def VREV16d8 : VREV16D<0b00, "vrev16.8", v8i8>;
-def VREV16q8 : VREV16Q<0b00, "vrev16.8", v16i8>;
+def VREV16d8 : VREV16D<0b00, "vrev16", "8", v8i8>;
+def VREV16q8 : VREV16Q<0b00, "vrev16", "8", v16i8>;
 // Other Vector Shuffles.
 // VEXT : Vector Extract
-class VEXTd
+class VEXTd
   : N3V<0,1,0b11,{?,?,?,?},0,0, (outs DPR:$dst),
         (ins DPR:$lhs, DPR:$rhs, i32imm:$index), IIC_VEXTD,
-        OpcodeStr, "\t$dst, $lhs, $rhs, $index", "",
+        OpcodeStr, Dt, "$dst, $lhs, $rhs, $index", "",
         [(set DPR:$dst, (Ty (NEONvext (Ty DPR:$lhs),
                                       (Ty DPR:$rhs), imm:$index)))]>;
-class VEXTq
+class VEXTq
   : N3V<0,1,0b11,{?,?,?,?},1,0, (outs QPR:$dst),
         (ins QPR:$lhs, QPR:$rhs, i32imm:$index), IIC_VEXTQ,
-        OpcodeStr, "\t$dst, $lhs, $rhs, $index", "",
+        OpcodeStr, Dt, "$dst, $lhs, $rhs, $index", "",
         [(set QPR:$dst, (Ty (NEONvext (Ty QPR:$lhs),
                                       (Ty QPR:$rhs), imm:$index)))]>;
-def VEXTd8 : VEXTd<"vext.8", v8i8>;
-def VEXTd16 : VEXTd<"vext.16", v4i16>;
-def VEXTd32 : VEXTd<"vext.32", v2i32>;
-def VEXTdf : VEXTd<"vext.32", v2f32>;
-
-def VEXTq8 : VEXTq<"vext.8", v16i8>;
-def VEXTq16 : VEXTq<"vext.16", v8i16>;
-def VEXTq32 : VEXTq<"vext.32", v4i32>;
-def VEXTqf : VEXTq<"vext.32", v4f32>;
+def VEXTd8 : VEXTd<"vext", "8", v8i8>;
+def VEXTd16 : VEXTd<"vext", "16", v4i16>;
+def VEXTd32 : VEXTd<"vext", "32", v2i32>;
+def VEXTdf : VEXTd<"vext", "32", v2f32>;
+
+def VEXTq8 : VEXTq<"vext", "8", v16i8>;
+def VEXTq16 : VEXTq<"vext", "16", v8i16>;
+def VEXTq32 : VEXTq<"vext", "32", v4i32>;
+def VEXTqf : VEXTq<"vext", "32", v4f32>;
 // VTRN : Vector Transpose
-def VTRNd8 : N2VDShuffle<0b00, 0b00001, "vtrn.8">;
-def VTRNd16 : N2VDShuffle<0b01, 0b00001, "vtrn.16">;
-def VTRNd32 : N2VDShuffle<0b10, 0b00001, "vtrn.32">;
-
-def VTRNq8 : N2VQShuffle<0b00, 0b00001, IIC_VPERMQ, "vtrn.8">;
-def VTRNq16 : N2VQShuffle<0b01, 0b00001, IIC_VPERMQ, "vtrn.16">;
-def VTRNq32 : N2VQShuffle<0b10, 0b00001, IIC_VPERMQ, "vtrn.32">;
+def VTRNd8 : N2VDShuffle<0b00, 0b00001, "vtrn", "8">;
+def VTRNd16 : N2VDShuffle<0b01, 0b00001, "vtrn", "16">;
+def VTRNd32 : N2VDShuffle<0b10, 0b00001, "vtrn", "32">;
+
+def VTRNq8 : N2VQShuffle<0b00, 0b00001, IIC_VPERMQ, "vtrn", "8">;
+def VTRNq16 : N2VQShuffle<0b01, 0b00001, IIC_VPERMQ, "vtrn", "16">;
+def VTRNq32 : N2VQShuffle<0b10, 0b00001, IIC_VPERMQ, "vtrn", "32">;
 // VUZP : Vector Unzip (Deinterleave)
-def VUZPd8 : N2VDShuffle<0b00, 0b00010, "vuzp.8">;
-def VUZPd16 : N2VDShuffle<0b01, 0b00010, "vuzp.16">;
-def VUZPd32 : N2VDShuffle<0b10, 0b00010, "vuzp.32">;
-
-def VUZPq8 : N2VQShuffle<0b00, 0b00010, IIC_VPERMQ3, "vuzp.8">;
-def VUZPq16 : N2VQShuffle<0b01, 0b00010, IIC_VPERMQ3, "vuzp.16">;
-def VUZPq32 : N2VQShuffle<0b10, 0b00010, IIC_VPERMQ3, "vuzp.32">;
+def VUZPd8 : N2VDShuffle<0b00, 0b00010, "vuzp", "8">;
+def VUZPd16 : N2VDShuffle<0b01, 0b00010, "vuzp", "16">;
+def VUZPd32 : N2VDShuffle<0b10, 0b00010, "vuzp", "32">;
+
+def VUZPq8 : N2VQShuffle<0b00, 0b00010, IIC_VPERMQ3, "vuzp", "8">;
+def VUZPq16 : N2VQShuffle<0b01, 0b00010, IIC_VPERMQ3, "vuzp", "16">;
+def VUZPq32 : N2VQShuffle<0b10, 0b00010, IIC_VPERMQ3, "vuzp", "32">;
 // VZIP : Vector Zip (Interleave)
-def VZIPd8 : N2VDShuffle<0b00, 0b00011, "vzip.8">;
-def VZIPd16 : N2VDShuffle<0b01, 0b00011, "vzip.16">;
-def VZIPd32 : N2VDShuffle<0b10, 0b00011, "vzip.32">;
-
-def VZIPq8 : N2VQShuffle<0b00, 0b00011, IIC_VPERMQ3, "vzip.8">;
-def VZIPq16 : N2VQShuffle<0b01, 0b00011, IIC_VPERMQ3, "vzip.16">;
-def VZIPq32 : N2VQShuffle<0b10, 0b00011, IIC_VPERMQ3, "vzip.32">;
+def VZIPd8 : N2VDShuffle<0b00, 0b00011, "vzip", "8">;
+def VZIPd16 : N2VDShuffle<0b01, 0b00011, "vzip", "16">;
+def VZIPd32 : N2VDShuffle<0b10, 0b00011, "vzip", "32">;
+
+def VZIPq8 : N2VQShuffle<0b00, 0b00011, IIC_VPERMQ3, "vzip", "8">;
+def VZIPq16 : N2VQShuffle<0b01, 0b00011, IIC_VPERMQ3, "vzip", "16">;
+def VZIPq32 : N2VQShuffle<0b10, 0b00011, IIC_VPERMQ3, "vzip", "32">;
 // Vector Table Lookup and Table Extension.
@@ -2920,25 +3016,25 @@ def VTBL1 : N3V<1,1,0b11,0b1000,0,0, (outs DPR:$dst), (ins DPR:$tbl1, DPR:$src), IIC_VTB1, - "vtbl.8", "\t$dst, \\{$tbl1\\}, $src", "", + "vtbl", "8", "$dst, \\{$tbl1\\}, $src", "", [(set DPR:$dst, (v8i8 (int_arm_neon_vtbl1 DPR:$tbl1, DPR:$src)))]>; let hasExtraSrcRegAllocReq = 1 in { def VTBL2 : N3V<1,1,0b11,0b1001,0,0, (outs DPR:$dst), (ins DPR:$tbl1, DPR:$tbl2, DPR:$src), IIC_VTB2, - "vtbl.8", "\t$dst, \\{$tbl1,$tbl2\\}, $src", "", + "vtbl", "8", "$dst, \\{$tbl1,$tbl2\\}, $src", "", [(set DPR:$dst, (v8i8 (int_arm_neon_vtbl2 DPR:$tbl1, DPR:$tbl2, DPR:$src)))]>; def VTBL3 : N3V<1,1,0b11,0b1010,0,0, (outs DPR:$dst), (ins DPR:$tbl1, DPR:$tbl2, DPR:$tbl3, DPR:$src), IIC_VTB3, - "vtbl.8", "\t$dst, \\{$tbl1,$tbl2,$tbl3\\}, $src", "", + "vtbl", "8", "$dst, \\{$tbl1,$tbl2,$tbl3\\}, $src", "", [(set DPR:$dst, (v8i8 (int_arm_neon_vtbl3 DPR:$tbl1, DPR:$tbl2, DPR:$tbl3, DPR:$src)))]>; def VTBL4 : N3V<1,1,0b11,0b1011,0,0, (outs DPR:$dst), (ins DPR:$tbl1, DPR:$tbl2, DPR:$tbl3, DPR:$tbl4, DPR:$src), IIC_VTB4, - "vtbl.8", "\t$dst, \\{$tbl1,$tbl2,$tbl3,$tbl4\\}, $src", "", + "vtbl", "8", "$dst, \\{$tbl1,$tbl2,$tbl3,$tbl4\\}, $src", "", [(set DPR:$dst, (v8i8 (int_arm_neon_vtbl4 DPR:$tbl1, DPR:$tbl2, DPR:$tbl3, DPR:$tbl4, DPR:$src)))]>; } // hasExtraSrcRegAllocReq = 1 @@ -2947,26 +3043,26 @@ def VTBX1 : N3V<1,1,0b11,0b1000,1,0, (outs DPR:$dst), (ins DPR:$orig, DPR:$tbl1, DPR:$src), IIC_VTBX1, - "vtbx.8", "\t$dst, \\{$tbl1\\}, $src", "$orig = $dst", + "vtbx", "8", "$dst, \\{$tbl1\\}, $src", "$orig = $dst", [(set DPR:$dst, (v8i8 (int_arm_neon_vtbx1 DPR:$orig, DPR:$tbl1, DPR:$src)))]>; let hasExtraSrcRegAllocReq = 1 in { def VTBX2 : N3V<1,1,0b11,0b1001,1,0, (outs DPR:$dst), (ins DPR:$orig, DPR:$tbl1, DPR:$tbl2, DPR:$src), IIC_VTBX2, - "vtbx.8", "\t$dst, \\{$tbl1,$tbl2\\}, $src", "$orig = $dst", + "vtbx", "8", "$dst, \\{$tbl1,$tbl2\\}, $src", "$orig = $dst", [(set DPR:$dst, (v8i8 (int_arm_neon_vtbx2 DPR:$orig, DPR:$tbl1, DPR:$tbl2, DPR:$src)))]>; def VTBX3 : 
N3V<1,1,0b11,0b1010,1,0, (outs DPR:$dst), (ins DPR:$orig, DPR:$tbl1, DPR:$tbl2, DPR:$tbl3, DPR:$src), IIC_VTBX3, - "vtbx.8", "\t$dst, \\{$tbl1,$tbl2,$tbl3\\}, $src", "$orig = $dst", + "vtbx", "8", "$dst, \\{$tbl1,$tbl2,$tbl3\\}, $src", "$orig = $dst", [(set DPR:$dst, (v8i8 (int_arm_neon_vtbx3 DPR:$orig, DPR:$tbl1, DPR:$tbl2, DPR:$tbl3, DPR:$src)))]>; def VTBX4 : N3V<1,1,0b11,0b1011,1,0, (outs DPR:$dst), (ins DPR:$orig, DPR:$tbl1, DPR:$tbl2, DPR:$tbl3, DPR:$tbl4, DPR:$src), IIC_VTBX4, - "vtbx.8", "\t$dst, \\{$tbl1,$tbl2,$tbl3,$tbl4\\}, $src", "$orig = $dst", + "vtbx", "8", "$dst, \\{$tbl1,$tbl2,$tbl3,$tbl4\\}, $src", "$orig = $dst", [(set DPR:$dst, (v8i8 (int_arm_neon_vtbx4 DPR:$orig, DPR:$tbl1, DPR:$tbl2, DPR:$tbl3, DPR:$tbl4, DPR:$src)))]>; } // hasExtraSrcRegAllocReq = 1 @@ -2980,17 +3076,17 @@ // Vector Add Operations used for single-precision FP let neverHasSideEffects = 1 in -def VADDfd_sfp : N3VDs<0, 0, 0b00, 0b1101, 0, "vadd.f32", v2f32, v2f32, fadd,1>; +def VADDfd_sfp : N3VDs<0, 0, 0b00, 0b1101, 0, "vadd", "f32", v2f32, v2f32, fadd,1>; def : N3VDsPat; // Vector Sub Operations used for single-precision FP let neverHasSideEffects = 1 in -def VSUBfd_sfp : N3VDs<0, 0, 0b10, 0b1101, 0, "vsub.f32", v2f32, v2f32, fsub,0>; +def VSUBfd_sfp : N3VDs<0, 0, 0b10, 0b1101, 0, "vsub", "f32", v2f32, v2f32, fsub,0>; def : N3VDsPat; // Vector Multiply Operations used for single-precision FP let neverHasSideEffects = 1 in -def VMULfd_sfp : N3VDs<1, 0, 0b00, 0b1101, 1, "vmul.f32", v2f32, v2f32, fmul,1>; +def VMULfd_sfp : N3VDs<1, 0, 0b00, 0b1101, 1, "vmul", "f32", v2f32, v2f32, fmul,1>; def : N3VDsPat; // Vector Multiply-Accumulate/Subtract used for single-precision FP @@ -2998,17 +3094,17 @@ // we want to avoid them for now. e.g., alternating vmla/vadd instructions. 
//let neverHasSideEffects = 1 in -//def VMLAfd_sfp : N3VDMulOps<0, 0, 0b00, 0b1101, 1, IIC_VMACD, "vmla.f32", v2f32,fmul,fadd>; +//def VMLAfd_sfp : N3VDMulOps<0, 0, 0b00, 0b1101, 1, IIC_VMACD, "vmla", "f32", v2f32,fmul,fadd>; //def : N3VDMulOpsPat; //let neverHasSideEffects = 1 in -//def VMLSfd_sfp : N3VDMulOps<0, 0, 0b10, 0b1101, 1, IIC_VMACD, "vmls.f32", v2f32,fmul,fsub>; +//def VMLSfd_sfp : N3VDMulOps<0, 0, 0b10, 0b1101, 1, IIC_VMACD, "vmls", "f32", v2f32,fmul,fsub>; //def : N3VDMulOpsPat; // Vector Absolute used for single-precision FP let neverHasSideEffects = 1 in def VABSfd_sfp : N2VDInts<0b11, 0b11, 0b10, 0b01, 0b01110, 0, - IIC_VUNAD, "vabs.f32", + IIC_VUNAD, "vabs", "f32", v2f32, v2f32, int_arm_neon_vabs>; def : N2VDIntsPat; @@ -3016,27 +3112,27 @@ let neverHasSideEffects = 1 in def VNEGf32d_sfp : N2V<0b11, 0b11, 0b10, 0b01, 0b01111, 0, 0, (outs DPR_VFP2:$dst), (ins DPR_VFP2:$src), IIC_VUNAD, - "vneg.f32", "\t$dst, $src", "", []>; + "vneg", "f32", "$dst, $src", "", []>; def : N2VDIntsPat; // Vector Convert between single-precision FP and integer let neverHasSideEffects = 1 in -def VCVTf2sd_sfp : N2VDs<0b11, 0b11, 0b10, 0b11, 0b01110, 0, "vcvt.s32.f32", +def VCVTf2sd_sfp : N2VDs<0b11, 0b11, 0b10, 0b11, 0b01110, 0, "vcvt", "s32.f32", v2i32, v2f32, fp_to_sint>; def : N2VDsPat; let neverHasSideEffects = 1 in -def VCVTf2ud_sfp : N2VDs<0b11, 0b11, 0b10, 0b11, 0b01111, 0, "vcvt.u32.f32", +def VCVTf2ud_sfp : N2VDs<0b11, 0b11, 0b10, 0b11, 0b01111, 0, "vcvt", "u32.f32", v2i32, v2f32, fp_to_uint>; def : N2VDsPat; let neverHasSideEffects = 1 in -def VCVTs2fd_sfp : N2VDs<0b11, 0b11, 0b10, 0b11, 0b01100, 0, "vcvt.f32.s32", +def VCVTs2fd_sfp : N2VDs<0b11, 0b11, 0b10, 0b11, 0b01100, 0, "vcvt", "f32.s32", v2f32, v2i32, sint_to_fp>; def : N2VDsPat; let neverHasSideEffects = 1 in -def VCVTu2fd_sfp : N2VDs<0b11, 0b11, 0b10, 0b11, 0b01101, 0, "vcvt.f32.u32", +def VCVTu2fd_sfp : N2VDs<0b11, 0b11, 0b10, 0b11, 0b01101, 0, "vcvt", "f32.u32", v2f32, v2i32, uint_to_fp>; def : 
N2VDsPat;

From jyasskin at google.com Mon Nov 23 16:49:01 2009
From: jyasskin at google.com (Jeffrey Yasskin)
Date: Mon, 23 Nov 2009 22:49:01 -0000
Subject: [llvm-commits] [llvm] r89708 - in /llvm/trunk: include/llvm/CodeGen/JITCodeEmitter.h include/llvm/CodeGen/MachineCodeEmitter.h lib/ExecutionEngine/JIT/JITEmitter.cpp lib/Target/ARM/ARMJITInfo.cpp lib/Target/Alpha/AlphaJITInfo.cpp lib/Target/PowerPC/PPCJITInfo.cpp lib/Target/X86/X86JITInfo.cpp
Message-ID: <200911232249.nANMn2d1013209@zion.cs.uiuc.edu>

Author: jyasskin
Date: Mon Nov 23 16:49:00 2009
New Revision: 89708

URL: http://llvm.org/viewvc/llvm-project?rev=89708&view=rev
Log:
Allow more than one stub to be being generated at the same time. It's probably
better in the long run to replace the indirect-GlobalVariable system. That'll
be done after a subsequent patch.

Modified:
    llvm/trunk/include/llvm/CodeGen/JITCodeEmitter.h
    llvm/trunk/include/llvm/CodeGen/MachineCodeEmitter.h
    llvm/trunk/lib/ExecutionEngine/JIT/JITEmitter.cpp
    llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp
    llvm/trunk/lib/Target/Alpha/AlphaJITInfo.cpp
    llvm/trunk/lib/Target/PowerPC/PPCJITInfo.cpp
    llvm/trunk/lib/Target/X86/X86JITInfo.cpp

Modified: llvm/trunk/include/llvm/CodeGen/JITCodeEmitter.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/JITCodeEmitter.h?rev=89708&r1=89707&r2=89708&view=diff
==============================================================================
--- llvm/trunk/include/llvm/CodeGen/JITCodeEmitter.h (original)
+++ llvm/trunk/include/llvm/CodeGen/JITCodeEmitter.h Mon Nov 23 16:49:00 2009
@@ -68,23 +68,29 @@
   ///
   virtual bool finishFunction(MachineFunction &F) = 0;

-  /// startGVStub - This callback is invoked when the JIT needs the
-  /// address of a GV (e.g. function) that has not been code generated yet.
-  /// The StubSize specifies the total size required by the stub.
+  /// startGVStub - This callback is invoked when the JIT needs the address of a
+  /// GV (e.g.
function) that has not been code generated yet. The StubSize
+  /// specifies the total size required by the stub. The BufferState must be
+  /// passed to finishGVStub, and start/finish pairs with the same BufferState
+  /// must be properly nested.
   ///
-  virtual void startGVStub(const GlobalValue* GV, unsigned StubSize,
-                           unsigned Alignment = 1) = 0;
+  virtual void startGVStub(BufferState &BS, const GlobalValue* GV,
+                           unsigned StubSize, unsigned Alignment = 1) = 0;

-  /// startGVStub - This callback is invoked when the JIT needs the address of a
+  /// startGVStub - This callback is invoked when the JIT needs the address of a
   /// GV (e.g. function) that has not been code generated yet. Buffer points to
-  /// memory already allocated for this stub.
+  /// memory already allocated for this stub. The BufferState must be passed to
+  /// finishGVStub, and start/finish pairs with the same BufferState must be
+  /// properly nested.
   ///
-  virtual void startGVStub(const GlobalValue* GV, void *Buffer,
+  virtual void startGVStub(BufferState &BS, void *Buffer,
                            unsigned StubSize) = 0;
-
-  /// finishGVStub - This callback is invoked to terminate a GV stub.
+
+  /// finishGVStub - This callback is invoked to terminate a GV stub and returns
+  /// the start address of the stub. The BufferState must first have been
+  /// passed to startGVStub.
   ///
-  virtual void *finishGVStub(const GlobalValue* F) = 0;
+  virtual void *finishGVStub(BufferState &BS) = 0;

   /// emitByte - This callback is invoked when a byte needs to be written to the
   /// output stream.
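
[Illustrative note, not part of r89708.] The doc comments above require that start/finish pairs sharing a BufferState be properly nested. A minimal sketch of why that makes overlapping stubs safe follows. ToyEmitter, startStub, finishStub and nestedStubsRoundTrip are hypothetical stand-ins invented for this note; they are not the LLVM classes, and the real emitter's sizing (it uses StubSize+1 for BufferEnd) and memory-manager calls are left out.

```cpp
// Toy model of saving and restoring emitter buffer pointers around stubs.
// All names here are hypothetical; this is not the LLVM API.
#include <cassert>
#include <cstdint>

class ToyEmitter {
public:
  // Snapshot of the three buffer pointers, playing the role of BufferState.
  struct BufferState {
    uint8_t *Begin = nullptr, *End = nullptr, *Cur = nullptr;
  };

  ToyEmitter(uint8_t *Buf, unsigned Size)
      : Begin(Buf), End(Buf + Size), Cur(Buf) {}

  // Save the current buffer into BS, then redirect emission to the stub.
  void startStub(BufferState &BS, uint8_t *Stub, unsigned StubSize) {
    assert(BS.Begin == nullptr && "BufferState already holds a saved buffer");
    BS.Begin = Begin;
    BS.End = End;
    BS.Cur = Cur;
    Begin = Cur = Stub;
    End = Stub + StubSize;
  }

  // Restore the buffer saved in BS and return the start of the finished stub.
  uint8_t *finishStub(BufferState &BS) {
    uint8_t *Result = Begin;
    Begin = BS.Begin;
    End = BS.End;
    Cur = BS.Cur;
    return Result;
  }

  // Write one byte; requests past the end of the buffer are ignored.
  void emitByte(uint8_t B) {
    if (Cur < End)
      *Cur++ = B;
  }

  uint8_t *current() const { return Cur; }

private:
  uint8_t *Begin, *End, *Cur;
};

// Starts an outer stub, starts an inner stub before the outer one is done,
// and checks that each finish call resumes emission where its start left off.
bool nestedStubsRoundTrip() {
  uint8_t Main[8] = {0}, Outer[4] = {0}, Inner[4] = {0};
  ToyEmitter E(Main, sizeof(Main));
  E.emitByte(0x01);

  ToyEmitter::BufferState OuterBS, InnerBS;
  E.startStub(OuterBS, Outer, sizeof(Outer));
  E.emitByte(0xAA);
  E.startStub(InnerBS, Inner, sizeof(Inner));
  E.emitByte(0xBB);
  uint8_t *InnerStart = E.finishStub(InnerBS); // emission resumes in Outer
  E.emitByte(0xCC);
  uint8_t *OuterStart = E.finishStub(OuterBS); // emission resumes in Main
  E.emitByte(0x02);

  return InnerStart == Inner && OuterStart == Outer &&
         Outer[0] == 0xAA && Outer[1] == 0xCC && Inner[0] == 0xBB &&
         Main[0] == 0x01 && Main[1] == 0x02 && E.current() == Main + 2;
}
```

With a single set of saved pointers in the emitter, the inner start would overwrite what the outer start had saved. Giving each caller its own state object means the saved buffers form a stack, which is presumably why the comments insist on proper nesting.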
Modified: llvm/trunk/include/llvm/CodeGen/MachineCodeEmitter.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineCodeEmitter.h?rev=89708&r1=89707&r2=89708&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/MachineCodeEmitter.h (original) +++ llvm/trunk/include/llvm/CodeGen/MachineCodeEmitter.h Mon Nov 23 16:49:00 2009 @@ -48,17 +48,41 @@ /// occurred, more memory is allocated, and we reemit the code into it. /// class MachineCodeEmitter { +public: + class BufferState { + friend class MachineCodeEmitter; + /// BufferBegin/BufferEnd - Pointers to the start and end of the memory + /// allocated for this code buffer. + uint8_t *BufferBegin, *BufferEnd; + + /// CurBufferPtr - Pointer to the next byte of memory to fill when emitting + /// code. This is guranteed to be in the range [BufferBegin,BufferEnd]. If + /// this pointer is at BufferEnd, it will never move due to code emission, + /// and all code emission requests will be ignored (this is the buffer + /// overflow condition). + uint8_t *CurBufferPtr; + public: + BufferState() : BufferBegin(NULL), BufferEnd(NULL), CurBufferPtr(NULL) {} + }; + protected: - /// BufferBegin/BufferEnd - Pointers to the start and end of the memory - /// allocated for this code buffer. - uint8_t *BufferBegin, *BufferEnd; - - /// CurBufferPtr - Pointer to the next byte of memory to fill when emitting - /// code. This is guranteed to be in the range [BufferBegin,BufferEnd]. If - /// this pointer is at BufferEnd, it will never move due to code emission, and - /// all code emission requests will be ignored (this is the buffer overflow - /// condition). - uint8_t *CurBufferPtr; + /// These have the same meanings as the fields in BufferState + uint8_t *BufferBegin, *BufferEnd, *CurBufferPtr; + + /// Save or restore the current buffer state. The BufferState objects must be + /// used as a stack. 
+ void SaveStateTo(BufferState &BS) { + assert(BS.BufferBegin == NULL && + "Can't save state into the same BufferState twice."); + BS.BufferBegin = BufferBegin; + BS.BufferEnd = BufferEnd; + BS.CurBufferPtr = CurBufferPtr; + } + void RestoreStateFrom(BufferState &BS) { + BufferBegin = BS.BufferBegin; + BufferEnd = BS.BufferEnd; + CurBufferPtr = BS.CurBufferPtr; + } public: virtual ~MachineCodeEmitter() {} Modified: llvm/trunk/lib/ExecutionEngine/JIT/JITEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/JIT/JITEmitter.cpp?rev=89708&r1=89707&r2=89708&view=diff ============================================================================== --- llvm/trunk/lib/ExecutionEngine/JIT/JITEmitter.cpp (original) +++ llvm/trunk/lib/ExecutionEngine/JIT/JITEmitter.cpp Mon Nov 23 16:49:00 2009 @@ -268,10 +268,6 @@ class JITEmitter : public JITCodeEmitter { JITMemoryManager *MemMgr; - // When outputting a function stub in the context of some other function, we - // save BufferBegin/BufferEnd/CurBufferPtr here. - uint8_t *SavedBufferBegin, *SavedBufferEnd, *SavedCurBufferPtr; - // When reattempting to JIT a function after running out of space, we store // the estimated size of the function we're trying to JIT here, so we can // ask the memory manager for at least this much space. 
When we @@ -397,11 +393,11 @@ void initJumpTableInfo(MachineJumpTableInfo *MJTI); void emitJumpTableInfo(MachineJumpTableInfo *MJTI); - virtual void startGVStub(const GlobalValue* GV, unsigned StubSize, - unsigned Alignment = 1); - virtual void startGVStub(const GlobalValue* GV, void *Buffer, + virtual void startGVStub(BufferState &BS, const GlobalValue* GV, + unsigned StubSize, unsigned Alignment = 1); + virtual void startGVStub(BufferState &BS, void *Buffer, unsigned StubSize); - virtual void* finishGVStub(const GlobalValue *GV); + virtual void* finishGVStub(BufferState &BS); /// allocateSpace - Reserves space in the current block if any, or /// allocate a new one of the given size. @@ -1207,9 +1203,8 @@ if (DwarfExceptionHandling || JITEmitDebugInfo) { uintptr_t ActualSize = 0; - SavedBufferBegin = BufferBegin; - SavedBufferEnd = BufferEnd; - SavedCurBufferPtr = CurBufferPtr; + BufferState BS; + SaveStateTo(BS); if (MemMgr->NeedsExactSize()) { ActualSize = DE->GetDwarfTableSizeInBytes(F, *this, FnStart, FnEnd); @@ -1225,9 +1220,7 @@ MemMgr->endExceptionTable(F.getFunction(), BufferBegin, CurBufferPtr, FrameRegister); uint8_t *EhEnd = CurBufferPtr; - BufferBegin = SavedBufferBegin; - BufferEnd = SavedBufferEnd; - CurBufferPtr = SavedCurBufferPtr; + RestoreStateFrom(BS); if (DwarfExceptionHandling) { TheJIT->RegisterTable(FrameRegister); @@ -1433,32 +1426,26 @@ } } -void JITEmitter::startGVStub(const GlobalValue* GV, unsigned StubSize, - unsigned Alignment) { - SavedBufferBegin = BufferBegin; - SavedBufferEnd = BufferEnd; - SavedCurBufferPtr = CurBufferPtr; +void JITEmitter::startGVStub(BufferState &BS, const GlobalValue* GV, + unsigned StubSize, unsigned Alignment) { + SaveStateTo(BS); BufferBegin = CurBufferPtr = MemMgr->allocateStub(GV, StubSize, Alignment); BufferEnd = BufferBegin+StubSize+1; } -void JITEmitter::startGVStub(const GlobalValue* GV, void *Buffer, - unsigned StubSize) { - SavedBufferBegin = BufferBegin; - SavedBufferEnd = BufferEnd; - 
SavedCurBufferPtr = CurBufferPtr; +void JITEmitter::startGVStub(BufferState &BS, void *Buffer, unsigned StubSize) { + SaveStateTo(BS); BufferBegin = CurBufferPtr = (uint8_t *)Buffer; BufferEnd = BufferBegin+StubSize+1; } -void *JITEmitter::finishGVStub(const GlobalValue* GV) { +void *JITEmitter::finishGVStub(BufferState &BS) { NumBytes += getCurrentPCOffset(); - std::swap(SavedBufferBegin, BufferBegin); - BufferEnd = SavedBufferEnd; - CurBufferPtr = SavedCurBufferPtr; - return SavedBufferBegin; + void *Result = BufferBegin; + RestoreStateFrom(BS); + return Result; } // getConstantPoolEntryAddress - Return the address of the 'ConstantNum' entry Modified: llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp?rev=89708&r1=89707&r2=89708&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp Mon Nov 23 16:49:00 2009 @@ -139,7 +139,8 @@ void *ARMJITInfo::emitGlobalValueIndirectSym(const GlobalValue *GV, void *Ptr, JITCodeEmitter &JCE) { - JCE.startGVStub(GV, 4, 4); + MachineCodeEmitter::BufferState BS; + JCE.startGVStub(BS, GV, 4, 4); intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); if (!sys::Memory::setRangeWritable((void*)Addr, 4)) { llvm_unreachable("ERROR: Unable to mark indirect symbol writable"); @@ -148,13 +149,14 @@ if (!sys::Memory::setRangeExecutable((void*)Addr, 4)) { llvm_unreachable("ERROR: Unable to mark indirect symbol executable"); } - void *PtrAddr = JCE.finishGVStub(GV); + void *PtrAddr = JCE.finishGVStub(BS); addIndirectSymAddr(Ptr, (intptr_t)PtrAddr); return PtrAddr; } void *ARMJITInfo::emitFunctionStub(const Function* F, void *Fn, JITCodeEmitter &JCE) { + MachineCodeEmitter::BufferState BS; // If this is just a call to an external function, emit a branch instead of a // call. The code is the same except for one bit of the last instruction. 
if (Fn != (void*)(intptr_t)ARMCompilationCallback) { @@ -172,7 +174,7 @@ errs() << "JIT: Stub emitted at [" << LazyPtr << "] for external function at '" << Fn << "'\n"); } - JCE.startGVStub(F, 16, 4); + JCE.startGVStub(BS, F, 16, 4); intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); if (!sys::Memory::setRangeWritable((void*)Addr, 16)) { llvm_unreachable("ERROR: Unable to mark stub writable"); @@ -187,7 +189,7 @@ } } else { // The stub is 8-byte size and 4-aligned. - JCE.startGVStub(F, 8, 4); + JCE.startGVStub(BS, F, 8, 4); intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); if (!sys::Memory::setRangeWritable((void*)Addr, 8)) { llvm_unreachable("ERROR: Unable to mark stub writable"); @@ -207,7 +209,7 @@ // // Branch and link to the compilation callback. // The stub is 16-byte size and 4-byte aligned. - JCE.startGVStub(F, 16, 4); + JCE.startGVStub(BS, F, 16, 4); intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); if (!sys::Memory::setRangeWritable((void*)Addr, 16)) { llvm_unreachable("ERROR: Unable to mark stub writable"); @@ -228,7 +230,7 @@ } } - return JCE.finishGVStub(F); + return JCE.finishGVStub(BS); } intptr_t ARMJITInfo::resolveRelocDestAddr(MachineRelocation *MR) const { Modified: llvm/trunk/lib/Target/Alpha/AlphaJITInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Alpha/AlphaJITInfo.cpp?rev=89708&r1=89707&r2=89708&view=diff ============================================================================== --- llvm/trunk/lib/Target/Alpha/AlphaJITInfo.cpp (original) +++ llvm/trunk/lib/Target/Alpha/AlphaJITInfo.cpp Mon Nov 23 16:49:00 2009 @@ -192,15 +192,16 @@ void *AlphaJITInfo::emitFunctionStub(const Function* F, void *Fn, JITCodeEmitter &JCE) { + MachineCodeEmitter::BufferState BS; //assert(Fn == AlphaCompilationCallback && "Where are you going?\n"); //Do things in a stupid slow way! 
- JCE.startGVStub(F, 19*4); + JCE.startGVStub(BS, F, 19*4); void* Addr = (void*)(intptr_t)JCE.getCurrentPCValue(); for (int x = 0; x < 19; ++ x) JCE.emitWordLE(0); EmitBranchToAt(Addr, Fn); DEBUG(errs() << "Emitting Stub to " << Fn << " at [" << Addr << "]\n"); - return JCE.finishGVStub(F); + return JCE.finishGVStub(BS); } TargetJITInfo::LazyResolverFn Modified: llvm/trunk/lib/Target/PowerPC/PPCJITInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCJITInfo.cpp?rev=89708&r1=89707&r2=89708&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCJITInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCJITInfo.cpp Mon Nov 23 16:49:00 2009 @@ -330,11 +330,12 @@ void *PPCJITInfo::emitFunctionStub(const Function* F, void *Fn, JITCodeEmitter &JCE) { + MachineCodeEmitter::BufferState BS; // If this is just a call to an external function, emit a branch instead of a // call. The code is the same except for one bit of the last instruction. 
if (Fn != (void*)(intptr_t)PPC32CompilationCallback && Fn != (void*)(intptr_t)PPC64CompilationCallback) { - JCE.startGVStub(F, 7*4); + JCE.startGVStub(BS, F, 7*4); intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); JCE.emitWordBE(0); JCE.emitWordBE(0); @@ -345,10 +346,10 @@ JCE.emitWordBE(0); EmitBranchToAt(Addr, (intptr_t)Fn, false, is64Bit); sys::Memory::InvalidateInstructionCache((void*)Addr, 7*4); - return JCE.finishGVStub(F); + return JCE.finishGVStub(BS); } - JCE.startGVStub(F, 10*4); + JCE.startGVStub(BS, F, 10*4); intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); if (is64Bit) { JCE.emitWordBE(0xf821ffb1); // stdu r1,-80(r1) @@ -373,7 +374,7 @@ JCE.emitWordBE(0); EmitBranchToAt(BranchAddr, (intptr_t)Fn, true, is64Bit); sys::Memory::InvalidateInstructionCache((void*)Addr, 10*4); - return JCE.finishGVStub(F); + return JCE.finishGVStub(BS); } Modified: llvm/trunk/lib/Target/X86/X86JITInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86JITInfo.cpp?rev=89708&r1=89707&r2=89708&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86JITInfo.cpp (original) +++ llvm/trunk/lib/Target/X86/X86JITInfo.cpp Mon Nov 23 16:49:00 2009 @@ -426,19 +426,21 @@ void *X86JITInfo::emitGlobalValueIndirectSym(const GlobalValue* GV, void *ptr, JITCodeEmitter &JCE) { + MachineCodeEmitter::BufferState BS; #if defined (X86_64_JIT) - JCE.startGVStub(GV, 8, 8); + JCE.startGVStub(BS, GV, 8, 8); JCE.emitWordLE((unsigned)(intptr_t)ptr); JCE.emitWordLE((unsigned)(((intptr_t)ptr) >> 32)); #else - JCE.startGVStub(GV, 4, 4); + JCE.startGVStub(BS, GV, 4, 4); JCE.emitWordLE((intptr_t)ptr); #endif - return JCE.finishGVStub(GV); + return JCE.finishGVStub(BS); } void *X86JITInfo::emitFunctionStub(const Function* F, void *Fn, JITCodeEmitter &JCE) { + MachineCodeEmitter::BufferState BS; // Note, we cast to intptr_t here to silence a -pedantic warning that // complains about casting a function pointer to a normal 
pointer. #if defined (X86_32_JIT) && !defined (_MSC_VER) @@ -449,7 +451,7 @@ #endif if (NotCC) { #if defined (X86_64_JIT) - JCE.startGVStub(F, 13, 4); + JCE.startGVStub(BS, F, 13, 4); JCE.emitByte(0x49); // REX prefix JCE.emitByte(0xB8+2); // movabsq r10 JCE.emitWordLE((unsigned)(intptr_t)Fn); @@ -458,15 +460,15 @@ JCE.emitByte(0xFF); // jmpq *r10 JCE.emitByte(2 | (4 << 3) | (3 << 6)); #else - JCE.startGVStub(F, 5, 4); + JCE.startGVStub(BS, F, 5, 4); JCE.emitByte(0xE9); JCE.emitWordLE((intptr_t)Fn-JCE.getCurrentPCValue()-4); #endif - return JCE.finishGVStub(F); + return JCE.finishGVStub(BS); } #if defined (X86_64_JIT) - JCE.startGVStub(F, 14, 4); + JCE.startGVStub(BS, F, 14, 4); JCE.emitByte(0x49); // REX prefix JCE.emitByte(0xB8+2); // movabsq r10 JCE.emitWordLE((unsigned)(intptr_t)Fn); @@ -475,7 +477,7 @@ JCE.emitByte(0xFF); // callq *r10 JCE.emitByte(2 | (2 << 3) | (3 << 6)); #else - JCE.startGVStub(F, 6, 4); + JCE.startGVStub(BS, F, 6, 4); JCE.emitByte(0xE8); // Call with 32 bit pc-rel destination... JCE.emitWordLE((intptr_t)Fn-JCE.getCurrentPCValue()-4); @@ -485,14 +487,15 @@ // initialize the buffer with garbage, which means it may follow a // noreturn function call, confusing X86CompilationCallback2. PR 4929. JCE.emitByte(0xCE); // Interrupt - Just a marker identifying the stub! - return JCE.finishGVStub(F); + return JCE.finishGVStub(BS); } void X86JITInfo::emitFunctionStubAtAddr(const Function* F, void *Fn, void *Stub, JITCodeEmitter &JCE) { + MachineCodeEmitter::BufferState BS; // Note, we cast to intptr_t here to silence a -pedantic warning that // complains about casting a function pointer to a normal pointer. 
-  JCE.startGVStub(F, Stub, 5);
+  JCE.startGVStub(BS, Stub, 5);
   JCE.emitByte(0xE9);
 #if defined (X86_64_JIT) && !defined (NDEBUG)
   // Yes, we need both of these casts, or some broken versions of GCC (4.2.4)
@@ -502,7 +505,7 @@
          && "PIC displacement does not fit in displacement field!");
 #endif
   JCE.emitWordLE((intptr_t)Fn-JCE.getCurrentPCValue()-4);
-  JCE.finishGVStub(F);
+  JCE.finishGVStub(BS);
 }

 /// getPICJumpTableEntry - Returns the value of the jumptable entry for the

From gohman at apple.com Mon Nov 23 17:20:52 2009
From: gohman at apple.com (Dan Gohman)
Date: Mon, 23 Nov 2009 23:20:52 -0000
Subject: [llvm-commits] [llvm] r89711 - in /llvm/trunk: include/llvm/CodeGen/ include/llvm/Target/ lib/CodeGen/SelectionDAG/ lib/Target/ARM/ lib/Target/Alpha/ lib/Target/Blackfin/ lib/Target/CellSPU/ lib/Target/Mips/ lib/Target/PowerPC/ lib/Target/Sparc/ lib/Target/X86/ lib/Target/XCore/ utils/TableGen/
Message-ID: <200911232320.nANNKrMQ014385@zion.cs.uiuc.edu>

Author: djg
Date: Mon Nov 23 17:20:51 2009
New Revision: 89711

URL: http://llvm.org/viewvc/llvm-project?rev=89711&view=rev
Log:
Remove ISD::DEBUG_LOC and ISD::DBG_LABEL, which are no longer used.
Note that "hasDotLocAndDotFile"-style debug info was already broken;
people wanting this functionality should implement it in the
AsmPrinter/DwarfWriter code.

Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h llvm/trunk/include/llvm/Target/TargetSelectionDAG.td llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp llvm/trunk/lib/Target/ARM/ARMInstrInfo.td llvm/trunk/lib/Target/Alpha/AlphaISelLowering.cpp llvm/trunk/lib/Target/Blackfin/BlackfinISelLowering.cpp llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td llvm/trunk/lib/Target/Sparc/SparcISelLowering.cpp llvm/trunk/lib/Target/X86/X86CodeEmitter.cpp llvm/trunk/lib/Target/X86/X86ISelLowering.cpp llvm/trunk/lib/Target/X86/X86InstrInfo.cpp llvm/trunk/lib/Target/X86/X86InstrInfo.td llvm/trunk/lib/Target/XCore/XCoreISelLowering.cpp llvm/trunk/utils/TableGen/DAGISelEmitter.cpp Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h (original) +++ llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h Mon Nov 23 17:20:51 2009 @@ -494,10 +494,9 @@ // Operand #last: Optional, an incoming flag. INLINEASM, - // DBG_LABEL, EH_LABEL - Represents a label in mid basic block used to track + // EH_LABEL - Represents a label in mid basic block used to track // locations needed for debug and exception handling tables. These nodes // take a chain as input and return a chain. - DBG_LABEL, EH_LABEL, // STACKSAVE - STACKSAVE has one operand, an input chain. 
It produces a @@ -546,12 +545,6 @@ // HANDLENODE node - Used as a handle for various purposes. HANDLENODE, - // DEBUG_LOC - This node is used to represent source line information - // embedded in the code. It takes a token chain as input, then a line - // number, then a column then a file id (provided by MachineModuleInfo.) It - // produces a token chain as output. - DEBUG_LOC, - // TRAMPOLINE - This corresponds to the init_trampoline intrinsic. // It takes as input a token chain, the pointer to the trampoline, // the pointer to the nested function, the pointer to pass for the @@ -630,10 +623,6 @@ /// element is not an undef. bool isScalarToVector(const SDNode *N); - /// isDebugLabel - Return true if the specified node represents a debug - /// label (i.e. ISD::DBG_LABEL or TargetInstrInfo::DBG_LABEL node). - bool isDebugLabel(const SDNode *N); - //===--------------------------------------------------------------------===// /// MemIndexedMode enum - This enum defines the load / store indexed /// addressing modes. @@ -2031,8 +2020,7 @@ static bool classof(const LabelSDNode *) { return true; } static bool classof(const SDNode *N) { - return N->getOpcode() == ISD::DBG_LABEL || - N->getOpcode() == ISD::EH_LABEL; + return N->getOpcode() == ISD::EH_LABEL; } }; Modified: llvm/trunk/include/llvm/Target/TargetSelectionDAG.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetSelectionDAG.td?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/TargetSelectionDAG.td (original) +++ llvm/trunk/include/llvm/Target/TargetSelectionDAG.td Mon Nov 23 17:20:51 2009 @@ -864,10 +864,3 @@ list Properties = props; list Attributes = attrs; } - -//===----------------------------------------------------------------------===// -// Dwarf support. 
-// -def SDT_dwarf_loc : SDTypeProfile<0, 3, - [SDTCisInt<0>, SDTCisInt<1>, SDTCisInt<2>]>; -def dwarf_loc : SDNode<"ISD::DEBUG_LOC", SDT_dwarf_loc,[SDNPHasChain]>; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Mon Nov 23 17:20:51 2009 @@ -2243,7 +2243,6 @@ Results.push_back(DAG.getConstant(1, Node->getValueType(0))); break; case ISD::EH_RETURN: - case ISD::DBG_LABEL: case ISD::EH_LABEL: case ISD::PREFETCH: case ISD::MEMBARRIER: Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Mon Nov 23 17:20:51 2009 @@ -200,19 +200,6 @@ return true; } - -/// isDebugLabel - Return true if the specified node represents a debug -/// label (i.e. ISD::DBG_LABEL or TargetInstrInfo::DBG_LABEL node). -bool ISD::isDebugLabel(const SDNode *N) { - SDValue Zero; - if (N->getOpcode() == ISD::DBG_LABEL) - return true; - if (N->isMachineOpcode() && - N->getMachineOpcode() == TargetInstrInfo::DBG_LABEL) - return true; - return false; -} - /// getSetCCSwappedOperands - Return the operation corresponding to (Y op X) /// when given the operation for (X op Y). ISD::CondCode ISD::getSetCCSwappedOperands(ISD::CondCode Operation) { @@ -503,7 +490,6 @@ switch (N->getOpcode()) { default: break; case ISD::HANDLENODE: - case ISD::DBG_LABEL: case ISD::EH_LABEL: return true; // Never CSE these nodes. 
} @@ -5438,7 +5424,6 @@ case ISD::UNDEF: return "undef"; case ISD::MERGE_VALUES: return "merge_values"; case ISD::INLINEASM: return "inlineasm"; - case ISD::DBG_LABEL: return "dbg_label"; case ISD::EH_LABEL: return "eh_label"; case ISD::HANDLENODE: return "handlenode"; @@ -5572,9 +5557,6 @@ case ISD::CTTZ: return "cttz"; case ISD::CTLZ: return "ctlz"; - // Debug info - case ISD::DEBUG_LOC: return "debug_loc"; - // Trampolines case ISD::TRAMPOLINE: return "trampoline"; Modified: llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp Mon Nov 23 17:20:51 2009 @@ -532,11 +532,6 @@ InitLibcallNames(LibcallRoutineNames); InitCmpLibcallCCs(CmpLibcallCCs); InitLibcallCallingConvs(LibcallCallingConvs); - - // Tell Legalize whether the assembler supports DEBUG_LOC. - const MCAsmInfo *TASM = TM.getMCAsmInfo(); - if (!TASM || !TASM->hasDotLocAndDotFile()) - setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand); } TargetLowering::~TargetLowering() { Modified: llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp Mon Nov 23 17:20:51 2009 @@ -613,7 +613,6 @@ break; case TargetInstrInfo::IMPLICIT_DEF: case TargetInstrInfo::KILL: - case ARM::DWARF_LOC: // Do nothing. 
break; case ARM::CONSTPOOL_ENTRY: Modified: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp Mon Nov 23 17:20:51 2009 @@ -355,9 +355,6 @@ setOperationAction(ISD::SDIVREM, MVT::i32, Expand); setOperationAction(ISD::UDIVREM, MVT::i32, Expand); - // Support label based line numbers. - setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand); - setOperationAction(ISD::GlobalAddress, MVT::i32, Custom); setOperationAction(ISD::ConstantPool, MVT::i32, Custom); setOperationAction(ISD::GLOBAL_OFFSET_TABLE, MVT::i32, Custom); Modified: llvm/trunk/lib/Target/ARM/ARMInstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrInfo.td?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrInfo.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrInfo.td Mon Nov 23 17:20:51 2009 @@ -584,12 +584,6 @@ [(ARMcallseq_start timm:$amt)]>; } -def DWARF_LOC : -PseudoInst<(outs), (ins i32imm:$line, i32imm:$col, i32imm:$file), NoItinerary, - ".loc $file, $line, $col", - [(dwarf_loc (i32 imm:$line), (i32 imm:$col), (i32 imm:$file))]>; - - // Address computation and loads and stores in PIC mode. 
let isNotDuplicable = 1 in { def PICADD : AXI1<0b0100, (outs GPR:$dst), (ins GPR:$a, pclabel:$cp, pred:$p), Modified: llvm/trunk/lib/Target/Alpha/AlphaISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Alpha/AlphaISelLowering.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/Alpha/AlphaISelLowering.cpp (original) +++ llvm/trunk/lib/Target/Alpha/AlphaISelLowering.cpp Mon Nov 23 17:20:51 2009 @@ -127,9 +127,6 @@ setOperationAction(ISD::BIT_CONVERT, MVT::f32, Promote); - // We don't have line number support yet. - setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand); - setOperationAction(ISD::DBG_LABEL, MVT::Other, Expand); setOperationAction(ISD::EH_LABEL, MVT::Other, Expand); // Not implemented yet. Modified: llvm/trunk/lib/Target/Blackfin/BlackfinISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Blackfin/BlackfinISelLowering.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/Blackfin/BlackfinISelLowering.cpp (original) +++ llvm/trunk/lib/Target/Blackfin/BlackfinISelLowering.cpp Mon Nov 23 17:20:51 2009 @@ -114,9 +114,6 @@ // READCYCLECOUNTER needs special type legalization. setOperationAction(ISD::READCYCLECOUNTER, MVT::i64, Custom); - // We don't have line number support yet. - setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand); - setOperationAction(ISD::DBG_LABEL, MVT::Other, Expand); setOperationAction(ISD::EH_LABEL, MVT::Other, Expand); // Use the default implementation. 
Modified: llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp (original) +++ llvm/trunk/lib/Target/CellSPU/SPUISelLowering.cpp Mon Nov 23 17:20:51 2009 @@ -387,9 +387,6 @@ // We cannot sextinreg(i1). Expand to shifts. setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1, Expand); - // Support label based line numbers. - setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand); - // We want to legalize GlobalAddress and ConstantPool nodes into the // appropriate instructions to materialize the address. for (unsigned sctype = (unsigned) MVT::i8; sctype < (unsigned) MVT::f128; Modified: llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td (original) +++ llvm/trunk/lib/Target/CellSPU/SPUInstrInfo.td Mon Nov 23 17:20:51 2009 @@ -31,14 +31,6 @@ } //===----------------------------------------------------------------------===// -// DWARF debugging Pseudo Instructions -//===----------------------------------------------------------------------===// - -def DWARF_LOC : Pseudo<(outs), (ins i32imm:$line, i32imm:$col, i32imm:$file), - ".loc $file, $line, $col", - [(dwarf_loc (i32 imm:$line), (i32 imm:$col), (i32 imm:$file))]>; - -//===----------------------------------------------------------------------===// // Loads: // NB: The ordering is actually important, since the instruction selection // will try each of the instructions in sequence, i.e., the D-form first with Modified: llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp Mon Nov 23 17:20:51 2009 @@ -132,9 +132,6 @@ setOperationAction(ISD::FLOG10, MVT::f32, Expand); setOperationAction(ISD::FEXP, MVT::f32, Expand); - // We don't have line number support yet. - setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand); - setOperationAction(ISD::DBG_LABEL, MVT::Other, Expand); setOperationAction(ISD::EH_LABEL, MVT::Other, Expand); // Use the default for now Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Mon Nov 23 17:20:51 2009 @@ -182,9 +182,6 @@ // We cannot sextinreg(i1). Expand to shifts. setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1, Expand); - // Support label based line numbers. 
- setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand); - setOperationAction(ISD::EXCEPTIONADDR, MVT::i64, Expand); setOperationAction(ISD::EHSELECTION, MVT::i64, Expand); setOperationAction(ISD::EXCEPTIONADDR, MVT::i32, Expand); Modified: llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td (original) +++ llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td Mon Nov 23 17:20:51 2009 @@ -1358,15 +1358,6 @@ //===----------------------------------------------------------------------===// -// DWARF Pseudo Instructions -// - -def DWARF_LOC : Pseudo<(outs), (ins i32imm:$line, i32imm:$col, i32imm:$file), - "${:comment} .loc $file, $line, $col", - [(dwarf_loc (i32 imm:$line), (i32 imm:$col), - (i32 imm:$file))]>; - -//===----------------------------------------------------------------------===// // PowerPC Instruction Patterns // Modified: llvm/trunk/lib/Target/Sparc/SparcISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Sparc/SparcISelLowering.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/Sparc/SparcISelLowering.cpp (original) +++ llvm/trunk/lib/Target/Sparc/SparcISelLowering.cpp Mon Nov 23 17:20:51 2009 @@ -644,9 +644,6 @@ setOperationAction(ISD::UMUL_LOHI, MVT::i32, Expand); setOperationAction(ISD::SMUL_LOHI, MVT::i32, Expand); - // We don't have line number support yet. - setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand); - setOperationAction(ISD::DBG_LABEL, MVT::Other, Expand); setOperationAction(ISD::EH_LABEL, MVT::Other, Expand); // VASTART needs to be custom lowered to use the VarArgsFrameIndex. 
@@ -662,7 +659,6 @@ setOperationAction(ISD::DYNAMIC_STACKALLOC, MVT::i32 , Custom); // No debug info support yet. - setOperationAction(ISD::DBG_LABEL, MVT::Other, Expand); setOperationAction(ISD::EH_LABEL, MVT::Other, Expand); setStackPointerRegisterToSaveRestore(SP::O6); Modified: llvm/trunk/lib/Target/X86/X86CodeEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86CodeEmitter.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86CodeEmitter.cpp (original) +++ llvm/trunk/lib/Target/X86/X86CodeEmitter.cpp Mon Nov 23 17:20:51 2009 @@ -595,7 +595,6 @@ break; case TargetInstrInfo::IMPLICIT_DEF: case TargetInstrInfo::KILL: - case X86::DWARF_LOC: case X86::FP_REG_KILL: break; case X86::MOVPC32r: { Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Nov 23 17:20:51 2009 @@ -377,7 +377,6 @@ if (!Subtarget->isTargetDarwin() && !Subtarget->isTargetELF() && !Subtarget->isTargetCygMing()) { - setOperationAction(ISD::DBG_LABEL, MVT::Other, Expand); setOperationAction(ISD::EH_LABEL, MVT::Other, Expand); } Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original) +++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Mon Nov 23 17:20:51 2009 @@ -3133,7 +3133,6 @@ break; case TargetInstrInfo::IMPLICIT_DEF: case TargetInstrInfo::KILL: - case X86::DWARF_LOC: case X86::FP_REG_KILL: break; case 
X86::MOVPC32r: { Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.td?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrInfo.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrInfo.td Mon Nov 23 17:20:51 2009 @@ -3506,16 +3506,6 @@ [(set GR32:$dst, (fsload addr:$src))]>, SegFS; //===----------------------------------------------------------------------===// -// DWARF Pseudo Instructions -// - -def DWARF_LOC : I<0, Pseudo, (outs), - (ins i32imm:$line, i32imm:$col, i32imm:$file), - ".loc\t$file $line $col", - [(dwarf_loc (i32 imm:$line), (i32 imm:$col), - (i32 imm:$file))]>; - -//===----------------------------------------------------------------------===// // EH Pseudo Instructions // let isTerminator = 1, isReturn = 1, isBarrier = 1, Modified: llvm/trunk/lib/Target/XCore/XCoreISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/XCore/XCoreISelLowering.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/lib/Target/XCore/XCoreISelLowering.cpp (original) +++ llvm/trunk/lib/Target/XCore/XCoreISelLowering.cpp Mon Nov 23 17:20:51 2009 @@ -142,9 +142,6 @@ setOperationAction(ISD::STACKRESTORE, MVT::Other, Expand); setOperationAction(ISD::DYNAMIC_STACKALLOC, MVT::i32, Expand); - // Debug - setOperationAction(ISD::DEBUG_LOC, MVT::Other, Expand); - maxStoresPerMemset = 4; maxStoresPerMemmove = maxStoresPerMemcpy = 2; Modified: llvm/trunk/utils/TableGen/DAGISelEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/DAGISelEmitter.cpp?rev=89711&r1=89710&r2=89711&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/DAGISelEmitter.cpp (original) +++ llvm/trunk/utils/TableGen/DAGISelEmitter.cpp Mon 
Nov 23 17:20:51 2009 @@ -1947,7 +1947,6 @@ << " return NULL;\n" << " }\n" << " case ISD::INLINEASM: return Select_INLINEASM(N);\n" - << " case ISD::DBG_LABEL: return Select_DBG_LABEL(N);\n" << " case ISD::EH_LABEL: return Select_EH_LABEL(N);\n" << " case ISD::UNDEF: return Select_UNDEF(N);\n"; From grosbach at apple.com Mon Nov 23 17:25:55 2009 From: grosbach at apple.com (Jim Grosbach) Date: Mon, 23 Nov 2009 23:25:55 -0000 Subject: [llvm-commits] [llvm] r89713 - /llvm/trunk/lib/Analysis/IVUsers.cpp Message-ID: <200911232325.nANNPtol014596@zion.cs.uiuc.edu> Author: grosbach Date: Mon Nov 23 17:25:54 2009 New Revision: 89713 URL: http://llvm.org/viewvc/llvm-project?rev=89713&view=rev Log: enable iv-users simplification by default Modified: llvm/trunk/lib/Analysis/IVUsers.cpp Modified: llvm/trunk/lib/Analysis/IVUsers.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/IVUsers.cpp?rev=89713&r1=89712&r2=89713&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/IVUsers.cpp (original) +++ llvm/trunk/lib/Analysis/IVUsers.cpp Mon Nov 23 17:25:54 2009 @@ -24,7 +24,6 @@ #include "llvm/ADT/STLExtras.h" #include "llvm/Support/Debug.h" #include "llvm/Support/raw_ostream.h" -#include "llvm/Support/CommandLine.h" #include using namespace llvm; @@ -32,10 +31,6 @@ static RegisterPass X("iv-users", "Induction Variable Users", false, true); -static cl::opt -SimplifyIVUsers("simplify-iv-users", cl::Hidden, cl::init(false), - cl::desc("Restrict IV Users to loop-invariant strides")); - Pass *llvm::createIVUsersPass() { return new IVUsers(); } @@ -214,8 +209,7 @@ return false; // Non-reducible symbolic expression, bail out. // Keep things simple. Don't touch loop-variant strides. 
- if (SimplifyIVUsers && !Stride->isLoopInvariant(L) - && L->contains(I->getParent())) + if (!Stride->isLoopInvariant(L) && L->contains(I->getParent())) return false; SmallPtrSet UniqueUsers; From devang.patel at gmail.com Mon Nov 23 17:26:22 2009 From: devang.patel at gmail.com (Devang Patel) Date: Mon, 23 Nov 2009 15:26:22 -0800 Subject: [llvm-commits] [llvm] r89701 - /llvm/trunk/lib/CodeGen/MachineInstr.cpp In-Reply-To: <200911232129.nANLT8Vi009464@zion.cs.uiuc.edu> References: <200911232129.nANLT8Vi009464@zion.cs.uiuc.edu> Message-ID: <352a1fb20911231526n63420d2q2c8b12f1a72a55a3@mail.gmail.com> Dan, On Mon, Nov 23, 2009 at 1:29 PM, Dan Gohman wrote: > Author: djg > Date: Mon Nov 23 15:29:08 2009 > New Revision: 89701 > > URL: http://llvm.org/viewvc/llvm-project?rev=89701&view=rev > Log: > Print the debug info line and column in MachineInstr::print even when there's > no filename. This situation is apparently fairly common right now. > > Modified: >    llvm/trunk/lib/CodeGen/MachineInstr.cpp > > Modified: llvm/trunk/lib/CodeGen/MachineInstr.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineInstr.cpp?rev=89701&r1=89700&r2=89701&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/MachineInstr.cpp (original) > +++ llvm/trunk/lib/CodeGen/MachineInstr.cpp Mon Nov 23 15:29:08 2009 > @@ -1149,9 +1149,10 @@ > >     DebugLocTuple DLT = MF->getDebugLocTuple(debugLoc); >     DICompileUnit CU(DLT.Scope); > +    OS << " dbg:"; >     if (!CU.isNull()) This will always be null, because DLT.Scope is not DICompileUnit. It is a DIScope, which may be a DILexicalScope or a DISubprogram. You want to use DIScope::getFilename() and DIScope::getDirectory() here. 
- Devang From jyasskin at google.com Mon Nov 23 17:35:20 2009 From: jyasskin at google.com (Jeffrey Yasskin) Date: Mon, 23 Nov 2009 23:35:20 -0000 Subject: [llvm-commits] [llvm] r89715 - in /llvm/trunk: include/llvm/Target/TargetJITInfo.h lib/ExecutionEngine/JIT/JITEmitter.cpp lib/Target/ARM/ARMJITInfo.cpp lib/Target/ARM/ARMJITInfo.h lib/Target/Alpha/AlphaJITInfo.cpp lib/Target/Alpha/AlphaJITInfo.h lib/Target/PowerPC/PPCJITInfo.cpp lib/Target/PowerPC/PPCJITInfo.h lib/Target/X86/X86JITInfo.cpp lib/Target/X86/X86JITInfo.h unittests/ExecutionEngine/JIT/JITTest.cpp Message-ID: <200911232335.nANNZK3v014939@zion.cs.uiuc.edu> Author: jyasskin Date: Mon Nov 23 17:35:19 2009 New Revision: 89715 URL: http://llvm.org/viewvc/llvm-project?rev=89715&view=rev Log: * Move stub allocation inside the JITEmitter, instead of exposing a way for each TargetJITInfo subclass to allocate its own stubs. This means stubs aren't as exactly-sized anymore, but it lets us get rid of TargetJITInfo::emitFunctionStubAtAddr(), which lets ARM and PPC support the eager JIT, fixing http://llvm.org/PR4816. * Rename the JITEmitter's stub creation functions to describe the kind of stub they create. So far, all of them create lazy-compilation stubs, but they sometimes get used when far-call stubs are needed. Fixing http://llvm.org/PR5201 will involve fixing this. 
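The division of labour described in the log above (the emitter reserves storage using a layout reported by the target, and the target only writes the stub bytes) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code from r89715: `ToyEmitter`, `ToyTarget`, `Target`, `emitStub`, `startStub`, `finishStub`, `currentOffset` and `byteAt` are hypothetical stand-ins, offsets into a byte pool replace real addresses, and the overflow check sits in `emitWordLE` rather than in `finishGVStub`. Only the `StubLayout` field names and the ARM word `0xe51ff004` are taken from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the Size/Alignment pair introduced in the diff.
struct StubLayout {
  std::size_t Size;
  std::size_t Alignment;
};

// Hypothetical emitter: owns the stub pool and hands out aligned windows.
class ToyEmitter {
  std::vector<std::uint8_t> Pool;
  std::size_t Cur = 0; // next byte to write (offset into Pool)
  std::size_t End = 0; // end of the window reserved for the current stub

public:
  explicit ToyEmitter(std::size_t Bytes) : Pool(Bytes, 0) {}

  // Reserve a window of SL.Size bytes starting at the next aligned offset.
  std::size_t startStub(const StubLayout &SL) {
    Cur = (Cur + SL.Alignment - 1) / SL.Alignment * SL.Alignment;
    End = Cur + SL.Size;
    assert(End <= Pool.size() && "Stub pool exhausted.");
    return Cur;
  }

  // Write one 32-bit word in little-endian byte order.
  void emitWordLE(std::uint32_t W) {
    assert(Cur + 4 <= End && "Stub overflowed allocated space.");
    for (int I = 0; I < 4; ++I)
      Pool[Cur++] = static_cast<std::uint8_t>((W >> (8 * I)) & 0xff);
  }

  // Close the window. The whole reserved size is consumed even if the stub
  // was shorter, which is why stubs are no longer exactly sized.
  void finishStub() { Cur = End; }

  std::size_t currentOffset() const { return Cur; }
  std::uint8_t byteAt(std::size_t I) const { return Pool[I]; }
};

// Hypothetical target interface: reports the largest stub it may emit and
// writes a stub into storage that the caller has already reserved.
struct Target {
  virtual ~Target() = default;
  virtual StubLayout getStubLayout() const = 0;
  virtual std::size_t emitFunctionStub(std::uint32_t Callee,
                                       ToyEmitter &E) const = 0;
};

struct ToyTarget : Target {
  // Worst case is 16 bytes at 4-byte alignment, as in the ARM hunk.
  StubLayout getStubLayout() const override { return {16, 4}; }

  // Emits the 8-byte non-PIC form: "ldr pc, [pc, #-4]" plus the address.
  std::size_t emitFunctionStub(std::uint32_t Callee,
                               ToyEmitter &E) const override {
    std::size_t Addr = E.currentOffset();
    E.emitWordLE(0xe51ff004);
    E.emitWordLE(Callee);
    return Addr;
  }
};

// Caller-side sequence: reserve, let the target emit, then close the window.
std::size_t emitStub(const Target &T, ToyEmitter &E, std::uint32_t Callee) {
  E.startStub(T.getStubLayout());
  std::size_t Addr = T.emitFunctionStub(Callee, E);
  E.finishStub();
  return Addr;
}
```

Calling `emitStub` twice on a fresh `ToyEmitter` places the second stub one full `StubLayout::Size` after the first, even though each stub here writes only 8 bytes. That is the trade-off the log mentions: some slack per stub in exchange for a single allocation path that every target can share.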
Modified: llvm/trunk/include/llvm/Target/TargetJITInfo.h llvm/trunk/lib/ExecutionEngine/JIT/JITEmitter.cpp llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp llvm/trunk/lib/Target/ARM/ARMJITInfo.h llvm/trunk/lib/Target/Alpha/AlphaJITInfo.cpp llvm/trunk/lib/Target/Alpha/AlphaJITInfo.h llvm/trunk/lib/Target/PowerPC/PPCJITInfo.cpp llvm/trunk/lib/Target/PowerPC/PPCJITInfo.h llvm/trunk/lib/Target/X86/X86JITInfo.cpp llvm/trunk/lib/Target/X86/X86JITInfo.h llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp Modified: llvm/trunk/include/llvm/Target/TargetJITInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetJITInfo.h?rev=89715&r1=89714&r2=89715&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/TargetJITInfo.h (original) +++ llvm/trunk/include/llvm/Target/TargetJITInfo.h Mon Nov 23 17:35:19 2009 @@ -18,6 +18,7 @@ #define LLVM_TARGET_TARGETJITINFO_H #include +#include "llvm/Support/ErrorHandling.h" #include "llvm/System/DataTypes.h" namespace llvm { @@ -48,22 +49,28 @@ return 0; } + /// Records the required size and alignment for a call stub in bytes. + struct StubLayout { + size_t Size; + size_t Alignment; + }; + /// Returns the maximum size and alignment for a call stub on this target. + virtual StubLayout getStubLayout() { + llvm_unreachable("This target doesn't implement getStubLayout!"); + StubLayout Result = {0, 0}; + return Result; + } + /// emitFunctionStub - Use the specified JITCodeEmitter object to emit a /// small native function that simply calls the function at the specified - /// address. Return the address of the resultant function. - virtual void *emitFunctionStub(const Function* F, void *Fn, + /// address. The JITCodeEmitter must already have storage allocated for the + /// stub. Return the address of the resultant function, which may have been + /// aligned from the address the JCE was set up to emit at. 
+ virtual void *emitFunctionStub(const Function* F, void *Target, JITCodeEmitter &JCE) { assert(0 && "This target doesn't implement emitFunctionStub!"); return 0; } - - /// emitFunctionStubAtAddr - Use the specified JITCodeEmitter object to - /// emit a small native function that simply calls Fn. Emit the stub into - /// the supplied buffer. - virtual void emitFunctionStubAtAddr(const Function* F, void *Fn, - void *Buffer, JITCodeEmitter &JCE) { - assert(0 && "This target doesn't implement emitFunctionStubAtAddr!"); - } /// getPICJumpTableEntry - Returns the value of the jumptable entry for the /// specific basic block. Modified: llvm/trunk/lib/ExecutionEngine/JIT/JITEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/JIT/JITEmitter.cpp?rev=89715&r1=89714&r2=89715&view=diff ============================================================================== --- llvm/trunk/lib/ExecutionEngine/JIT/JITEmitter.cpp (original) +++ llvm/trunk/lib/ExecutionEngine/JIT/JITEmitter.cpp Mon Nov 23 17:35:19 2009 @@ -83,15 +83,15 @@ class JITResolverState { public: typedef ValueMap > - FunctionToStubMapTy; + FunctionToLazyStubMapTy; typedef std::map > CallSiteToFunctionMapTy; typedef ValueMap, CallSiteValueMapConfig> FunctionToCallSitesMapTy; typedef std::map, void*> GlobalToIndirectSymMapTy; private: - /// FunctionToStubMap - Keep track of the stub created for a particular - /// function so that we can reuse them if necessary. - FunctionToStubMapTy FunctionToStubMap; + /// FunctionToLazyStubMap - Keep track of the lazy stub created for a + /// particular function so that we can reuse them if necessary. + FunctionToLazyStubMapTy FunctionToLazyStubMap; /// CallSiteToFunctionMap - Keep track of the function that each lazy call /// site corresponds to, and vice versa. 
@@ -103,12 +103,13 @@ GlobalToIndirectSymMapTy GlobalToIndirectSymMap; public: - JITResolverState() : FunctionToStubMap(this), + JITResolverState() : FunctionToLazyStubMap(this), FunctionToCallSitesMap(this) {} - FunctionToStubMapTy& getFunctionToStubMap(const MutexGuard& locked) { + FunctionToLazyStubMapTy& getFunctionToLazyStubMap( + const MutexGuard& locked) { assert(locked.holds(TheJIT->lock)); - return FunctionToStubMap; + return FunctionToLazyStubMap; } GlobalToIndirectSymMapTy& getGlobalToIndirectSymMap(const MutexGuard& locked) { @@ -154,11 +155,11 @@ Function *const F = C2F_I->second; #ifndef NDEBUG - void *RealStub = FunctionToStubMap.lookup(F); + void *RealStub = FunctionToLazyStubMap.lookup(F); assert(RealStub == Stub && "Call-site that wasn't a stub pass in to EraseStub"); #endif - FunctionToStubMap.erase(F); + FunctionToLazyStubMap.erase(F); CallSiteToFunctionMap.erase(C2F_I); // Remove the stub from the function->call-sites map, and remove the whole @@ -196,7 +197,7 @@ /// JITResolver - Keep track of, and resolve, call sites for functions that /// have not yet been compiled. class JITResolver { - typedef JITResolverState::FunctionToStubMapTy FunctionToStubMapTy; + typedef JITResolverState::FunctionToLazyStubMapTy FunctionToLazyStubMapTy; typedef JITResolverState::CallSiteToFunctionMapTy CallSiteToFunctionMapTy; typedef JITResolverState::GlobalToIndirectSymMapTy GlobalToIndirectSymMapTy; @@ -206,8 +207,11 @@ JITResolverState state; - /// ExternalFnToStubMap - This is the equivalent of FunctionToStubMap for - /// external functions. + /// ExternalFnToStubMap - This is the equivalent of FunctionToLazyStubMap + /// for external functions. TODO: Of course, external functions don't need + /// a lazy stub. It's actually here to make it more likely that far calls + /// succeed, but no single stub can guarantee that. I'll remove this in a + /// subsequent checkin when I actually fix far calls. 
std::map ExternalFnToStubMap; /// revGOTMap - map addresses to indexes in the GOT @@ -230,14 +234,13 @@ TheJITResolver = 0; } - /// getFunctionStubIfAvailable - This returns a pointer to a function stub - /// if it has already been created. - void *getFunctionStubIfAvailable(Function *F); - - /// getFunctionStub - This returns a pointer to a function stub, creating - /// one on demand as needed. If empty is true, create a function stub - /// pointing at address 0, to be filled in later. - void *getFunctionStub(Function *F); + /// getLazyFunctionStubIfAvailable - This returns a pointer to a function's + /// lazy-compilation stub if it has already been created. + void *getLazyFunctionStubIfAvailable(Function *F); + + /// getLazyFunctionStub - This returns a pointer to a function's + /// lazy-compilation stub, creating one on demand as needed. + void *getLazyFunctionStub(Function *F); /// getExternalFunctionStub - Return a stub for the function at the /// specified address, created lazily on demand. @@ -485,22 +488,22 @@ JRS->EraseAllCallSitesPrelocked(F); } -/// getFunctionStubIfAvailable - This returns a pointer to a function stub +/// getLazyFunctionStubIfAvailable - This returns a pointer to a function stub /// if it has already been created. -void *JITResolver::getFunctionStubIfAvailable(Function *F) { +void *JITResolver::getLazyFunctionStubIfAvailable(Function *F) { MutexGuard locked(TheJIT->lock); // If we already have a stub for this function, recycle it. - return state.getFunctionToStubMap(locked).lookup(F); + return state.getFunctionToLazyStubMap(locked).lookup(F); } /// getFunctionStub - This returns a pointer to a function stub, creating /// one on demand as needed. -void *JITResolver::getFunctionStub(Function *F) { +void *JITResolver::getLazyFunctionStub(Function *F) { MutexGuard locked(TheJIT->lock); - // If we already have a stub for this function, recycle it. 
- void *&Stub = state.getFunctionToStubMap(locked)[F]; + // If we already have a lazy stub for this function, recycle it. + void *&Stub = state.getFunctionToLazyStubMap(locked)[F]; if (Stub) return Stub; // Call the lazy resolver function if we are JIT'ing lazily. Otherwise we @@ -518,9 +521,13 @@ if (!Actual) return 0; } + MachineCodeEmitter::BufferState BS; + TargetJITInfo::StubLayout SL = TheJIT->getJITInfo().getStubLayout(); + JE.startGVStub(BS, F, SL.Size, SL.Alignment); // Codegen a new stub, calling the lazy resolver or the actual address of the // external function, if it was resolved. Stub = TheJIT->getJITInfo().emitFunctionStub(F, Actual, JE); + JE.finishGVStub(BS); if (Actual != (void*)(intptr_t)LazyResolverFn) { // If we are getting the stub for an external function, we really want the @@ -529,7 +536,7 @@ TheJIT->updateGlobalMapping(F, Stub); } - DEBUG(errs() << "JIT: Stub emitted at [" << Stub << "] for function '" + DEBUG(errs() << "JIT: Lazy stub emitted at [" << Stub << "] for function '" << F->getName() << "'\n"); // Finally, keep track of the stub-to-Function mapping so that the @@ -572,7 +579,11 @@ void *&Stub = ExternalFnToStubMap[FnAddr]; if (Stub) return Stub; + MachineCodeEmitter::BufferState BS; + TargetJITInfo::StubLayout SL = TheJIT->getJITInfo().getStubLayout(); + JE.startGVStub(BS, 0, SL.Size, SL.Alignment); Stub = TheJIT->getJITInfo().emitFunctionStub(0, FnAddr, JE); + JE.finishGVStub(BS); DEBUG(errs() << "JIT: Stub emitted at [" << Stub << "] for external function at '" << FnAddr << "'\n"); @@ -594,10 +605,10 @@ SmallVectorImpl &Ptrs) { MutexGuard locked(TheJIT->lock); - const FunctionToStubMapTy &FM = state.getFunctionToStubMap(locked); + const FunctionToLazyStubMapTy &FM = state.getFunctionToLazyStubMap(locked); GlobalToIndirectSymMapTy &GM = state.getGlobalToIndirectSymMap(locked); - for (FunctionToStubMapTy::const_iterator i = FM.begin(), e = FM.end(); + for (FunctionToLazyStubMapTy::const_iterator i = FM.begin(), e = FM.end(); i 
!= e; ++i){ Function *F = i->first; if (F->isDeclaration() && F->hasExternalLinkage()) { @@ -723,11 +734,12 @@ // If we have already compiled the function, return a pointer to its body. Function *F = cast(V); - void *FnStub = Resolver.getFunctionStubIfAvailable(F); + void *FnStub = Resolver.getLazyFunctionStubIfAvailable(F); if (FnStub) { - // Return the function stub if it's already created. We do this first - // so that we're returning the same address for the function as any - // previous call. + // Return the function stub if it's already created. We do this first so + // that we're returning the same address for the function as any previous + // call. TODO: Yes, this is wrong. The lazy stub isn't guaranteed to be + // close enough to call. AddStubToCurrentFunction(FnStub); return FnStub; } @@ -747,12 +759,12 @@ // Otherwise, we may need a to emit a stub, and, conservatively, we // always do so. - void *StubAddr = Resolver.getFunctionStub(F); + void *StubAddr = Resolver.getLazyFunctionStub(F); // Add the stub to the current function's list of referenced stubs, so we can // deallocate them if the current function is ever freed. It's possible to - // return null from getFunctionStub in the case of a weak extern that fails - // to resolve. + // return null from getLazyFunctionStub in the case of a weak extern that + // fails to resolve. if (StubAddr) AddStubToCurrentFunction(StubAddr); @@ -1442,6 +1454,7 @@ } void *JITEmitter::finishGVStub(BufferState &BS) { + assert(CurBufferPtr != BufferEnd && "Stub overflowed allocated space."); NumBytes += getCurrentPCOffset(); void *Result = BufferBegin; RestoreStateFrom(BS); @@ -1521,19 +1534,23 @@ // Get a stub if the target supports it. assert(isa(JCE) && "Unexpected MCE?"); JITEmitter *JE = cast(getCodeEmitter()); - return JE->getJITResolver().getFunctionStub(F); + return JE->getJITResolver().getLazyFunctionStub(F); } void JIT::updateFunctionStub(Function *F) { // Get the empty stub we generated earlier. 
assert(isa(JCE) && "Unexpected MCE?"); JITEmitter *JE = cast(getCodeEmitter()); - void *Stub = JE->getJITResolver().getFunctionStub(F); + void *Stub = JE->getJITResolver().getLazyFunctionStub(F); + void *Addr = getPointerToGlobalIfAvailable(F); // Tell the target jit info to rewrite the stub at the specified address, // rather than creating a new one. - void *Addr = getPointerToGlobalIfAvailable(F); - getJITInfo().emitFunctionStubAtAddr(F, Addr, Stub, *getCodeEmitter()); + MachineCodeEmitter::BufferState BS; + TargetJITInfo::StubLayout layout = getJITInfo().getStubLayout(); + JE->startGVStub(BS, Stub, layout.Size); + getJITInfo().emitFunctionStub(F, Addr, *getCodeEmitter()); + JE->finishGVStub(BS); } /// freeMachineCodeForFunction - release machine code memory for given Function. Modified: llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp?rev=89715&r1=89714&r2=89715&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp Mon Nov 23 17:35:19 2009 @@ -154,15 +154,22 @@ return PtrAddr; } +TargetJITInfo::StubLayout ARMJITInfo::getStubLayout() { + // The stub contains up to 3 4-byte instructions, aligned at 4 bytes, and a + // 4-byte address. See emitFunctionStub for details. + StubLayout Result = {16, 4}; + return Result; +} + void *ARMJITInfo::emitFunctionStub(const Function* F, void *Fn, JITCodeEmitter &JCE) { - MachineCodeEmitter::BufferState BS; + void *Addr; // If this is just a call to an external function, emit a branch instead of a // call. The code is the same except for one bit of the last instruction. if (Fn != (void*)(intptr_t)ARMCompilationCallback) { // Branch to the corresponding function addr. if (IsPIC) { - // The stub is 8-byte size and 4-aligned. + // The stub is 16-byte size and 4-aligned. 
intptr_t LazyPtr = getIndirectSymAddr(Fn); if (!LazyPtr) { // In PIC mode, the function stub is loading a lazy-ptr. @@ -174,30 +181,30 @@ errs() << "JIT: Stub emitted at [" << LazyPtr << "] for external function at '" << Fn << "'\n"); } - JCE.startGVStub(BS, F, 16, 4); - intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); - if (!sys::Memory::setRangeWritable((void*)Addr, 16)) { + JCE.emitAlignment(4); + Addr = (void*)JCE.getCurrentPCValue(); + if (!sys::Memory::setRangeWritable(Addr, 16)) { llvm_unreachable("ERROR: Unable to mark stub writable"); } JCE.emitWordLE(0xe59fc004); // ldr ip, [pc, #+4] JCE.emitWordLE(0xe08fc00c); // L_func$scv: add ip, pc, ip JCE.emitWordLE(0xe59cf000); // ldr pc, [ip] - JCE.emitWordLE(LazyPtr - (Addr+4+8)); // func - (L_func$scv+8) - sys::Memory::InvalidateInstructionCache((void*)Addr, 16); - if (!sys::Memory::setRangeExecutable((void*)Addr, 16)) { + JCE.emitWordLE(LazyPtr - (intptr_t(Addr)+4+8)); // func - (L_func$scv+8) + sys::Memory::InvalidateInstructionCache(Addr, 16); + if (!sys::Memory::setRangeExecutable(Addr, 16)) { llvm_unreachable("ERROR: Unable to mark stub executable"); } } else { // The stub is 8-byte size and 4-aligned. - JCE.startGVStub(BS, F, 8, 4); - intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); - if (!sys::Memory::setRangeWritable((void*)Addr, 8)) { + JCE.emitAlignment(4); + Addr = (void*)JCE.getCurrentPCValue(); + if (!sys::Memory::setRangeWritable(Addr, 8)) { llvm_unreachable("ERROR: Unable to mark stub writable"); } JCE.emitWordLE(0xe51ff004); // ldr pc, [pc, #-4] JCE.emitWordLE((intptr_t)Fn); // addr of function - sys::Memory::InvalidateInstructionCache((void*)Addr, 8); - if (!sys::Memory::setRangeExecutable((void*)Addr, 8)) { + sys::Memory::InvalidateInstructionCache(Addr, 8); + if (!sys::Memory::setRangeExecutable(Addr, 8)) { llvm_unreachable("ERROR: Unable to mark stub executable"); } } @@ -209,9 +216,9 @@ // // Branch and link to the compilation callback. // The stub is 16-byte size and 4-byte aligned. 
- JCE.startGVStub(BS, F, 16, 4); - intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); - if (!sys::Memory::setRangeWritable((void*)Addr, 16)) { + JCE.emitAlignment(4); + Addr = (void*)JCE.getCurrentPCValue(); + if (!sys::Memory::setRangeWritable(Addr, 16)) { llvm_unreachable("ERROR: Unable to mark stub writable"); } // Save LR so the callback can determine which stub called it. @@ -224,13 +231,13 @@ JCE.emitWordLE(0xe51ff004); // ldr pc, [pc, #-4] // The address of the compilation callback. JCE.emitWordLE((intptr_t)ARMCompilationCallback); - sys::Memory::InvalidateInstructionCache((void*)Addr, 16); - if (!sys::Memory::setRangeExecutable((void*)Addr, 16)) { + sys::Memory::InvalidateInstructionCache(Addr, 16); + if (!sys::Memory::setRangeExecutable(Addr, 16)) { llvm_unreachable("ERROR: Unable to mark stub executable"); } } - return JCE.finishGVStub(BS); + return Addr; } intptr_t ARMJITInfo::resolveRelocDestAddr(MachineRelocation *MR) const { Modified: llvm/trunk/lib/Target/ARM/ARMJITInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMJITInfo.h?rev=89715&r1=89714&r2=89715&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMJITInfo.h (original) +++ llvm/trunk/lib/Target/ARM/ARMJITInfo.h Mon Nov 23 17:35:19 2009 @@ -61,6 +61,10 @@ virtual void *emitGlobalValueIndirectSym(const GlobalValue* GV, void *ptr, JITCodeEmitter &JCE); + // getStubLayout - Returns the size and alignment of the largest call stub + // on ARM. + virtual StubLayout getStubLayout(); + /// emitFunctionStub - Use the specified JITCodeEmitter object to emit a /// small native function that simply calls the function at the specified /// address. 
Modified: llvm/trunk/lib/Target/Alpha/AlphaJITInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Alpha/AlphaJITInfo.cpp?rev=89715&r1=89714&r2=89715&view=diff ============================================================================== --- llvm/trunk/lib/Target/Alpha/AlphaJITInfo.cpp (original) +++ llvm/trunk/lib/Target/Alpha/AlphaJITInfo.cpp Mon Nov 23 17:35:19 2009 @@ -190,18 +190,27 @@ #endif } +TargetJITInfo::StubLayout AlphaJITInfo::getStubLayout() { + // The stub contains 19 4-byte instructions, aligned at 4 bytes: + // R0 = R27 + // 8 x "R27 <<= 8; R27 |= 8-bits-of-Target" == 16 instructions + // JMP R27 + // Magic number so the compilation callback can recognize the stub. + StubLayout Result = {19 * 4, 4}; + return Result; +} + void *AlphaJITInfo::emitFunctionStub(const Function* F, void *Fn, JITCodeEmitter &JCE) { MachineCodeEmitter::BufferState BS; //assert(Fn == AlphaCompilationCallback && "Where are you going?\n"); //Do things in a stupid slow way! - JCE.startGVStub(BS, F, 19*4); void* Addr = (void*)(intptr_t)JCE.getCurrentPCValue(); for (int x = 0; x < 19; ++ x) JCE.emitWordLE(0); EmitBranchToAt(Addr, Fn); DEBUG(errs() << "Emitting Stub to " << Fn << " at [" << Addr << "]\n"); - return JCE.finishGVStub(BS); + return Addr; } TargetJITInfo::LazyResolverFn Modified: llvm/trunk/lib/Target/Alpha/AlphaJITInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Alpha/AlphaJITInfo.h?rev=89715&r1=89714&r2=89715&view=diff ============================================================================== --- llvm/trunk/lib/Target/Alpha/AlphaJITInfo.h (original) +++ llvm/trunk/lib/Target/Alpha/AlphaJITInfo.h Mon Nov 23 17:35:19 2009 @@ -31,6 +31,7 @@ explicit AlphaJITInfo(TargetMachine &tm) : TM(tm) { useGOT = true; } + virtual StubLayout getStubLayout(); virtual void *emitFunctionStub(const Function* F, void *Fn, JITCodeEmitter &JCE); virtual LazyResolverFn getLazyResolverFunction(JITCompilerFn); Modified: 
llvm/trunk/lib/Target/PowerPC/PPCJITInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCJITInfo.cpp?rev=89715&r1=89714&r2=89715&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCJITInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCJITInfo.cpp Mon Nov 23 17:35:19 2009 @@ -323,6 +323,15 @@ return is64Bit ? PPC64CompilationCallback : PPC32CompilationCallback; } +TargetJITInfo::StubLayout PPCJITInfo::getStubLayout() { + // The stub contains up to 10 4-byte instructions, aligned at 4 bytes: 3 + // instructions to save the caller's address if this is a lazy-compilation + // stub, plus a 1-, 4-, or 7-instruction sequence to load an arbitrary address + // into a register and jump through it. + StubLayout Result = {10*4, 4}; + return Result; +} + #if (defined(__POWERPC__) || defined (__ppc__) || defined(_POWER)) && \ defined(__APPLE__) extern "C" void sys_icache_invalidate(const void *Addr, size_t len); @@ -335,8 +344,7 @@ // call. The code is the same except for one bit of the last instruction. 
if (Fn != (void*)(intptr_t)PPC32CompilationCallback && Fn != (void*)(intptr_t)PPC64CompilationCallback) { - JCE.startGVStub(BS, F, 7*4); - intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); + void *Addr = (void*)JCE.getCurrentPCValue(); JCE.emitWordBE(0); JCE.emitWordBE(0); JCE.emitWordBE(0); @@ -344,13 +352,12 @@ JCE.emitWordBE(0); JCE.emitWordBE(0); JCE.emitWordBE(0); - EmitBranchToAt(Addr, (intptr_t)Fn, false, is64Bit); - sys::Memory::InvalidateInstructionCache((void*)Addr, 7*4); - return JCE.finishGVStub(BS); + EmitBranchToAt((intptr_t)Addr, (intptr_t)Fn, false, is64Bit); + sys::Memory::InvalidateInstructionCache(Addr, 7*4); + return Addr; } - JCE.startGVStub(BS, F, 10*4); - intptr_t Addr = (intptr_t)JCE.getCurrentPCValue(); + void *Addr = (void*)JCE.getCurrentPCValue(); if (is64Bit) { JCE.emitWordBE(0xf821ffb1); // stdu r1,-80(r1) JCE.emitWordBE(0x7d6802a6); // mflr r11 @@ -373,8 +380,8 @@ JCE.emitWordBE(0); JCE.emitWordBE(0); EmitBranchToAt(BranchAddr, (intptr_t)Fn, true, is64Bit); - sys::Memory::InvalidateInstructionCache((void*)Addr, 10*4); - return JCE.finishGVStub(BS); + sys::Memory::InvalidateInstructionCache(Addr, 10*4); + return Addr; } Modified: llvm/trunk/lib/Target/PowerPC/PPCJITInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCJITInfo.h?rev=89715&r1=89714&r2=89715&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCJITInfo.h (original) +++ llvm/trunk/lib/Target/PowerPC/PPCJITInfo.h Mon Nov 23 17:35:19 2009 @@ -30,6 +30,7 @@ is64Bit = tmIs64Bit; } + virtual StubLayout getStubLayout(); virtual void *emitFunctionStub(const Function* F, void *Fn, JITCodeEmitter &JCE); virtual LazyResolverFn getLazyResolverFunction(JITCompilerFn); Modified: llvm/trunk/lib/Target/X86/X86JITInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86JITInfo.cpp?rev=89715&r1=89714&r2=89715&view=diff 
============================================================================== --- llvm/trunk/lib/Target/X86/X86JITInfo.cpp (original) +++ llvm/trunk/lib/Target/X86/X86JITInfo.cpp Mon Nov 23 17:35:19 2009 @@ -438,74 +438,65 @@ return JCE.finishGVStub(BS); } -void *X86JITInfo::emitFunctionStub(const Function* F, void *Fn, +TargetJITInfo::StubLayout X86JITInfo::getStubLayout() { + // The 64-bit stub contains: + // movabs r10 <- 8-byte-target-address # 10 bytes + // call|jmp *r10 # 3 bytes + // The 32-bit stub contains a 5-byte call|jmp. + // If the stub is a call to the compilation callback, an extra byte is added + // to mark it as a stub. + StubLayout Result = {14, 4}; + return Result; +} + +void *X86JITInfo::emitFunctionStub(const Function* F, void *Target, JITCodeEmitter &JCE) { MachineCodeEmitter::BufferState BS; // Note, we cast to intptr_t here to silence a -pedantic warning that // complains about casting a function pointer to a normal pointer. #if defined (X86_32_JIT) && !defined (_MSC_VER) - bool NotCC = (Fn != (void*)(intptr_t)X86CompilationCallback && - Fn != (void*)(intptr_t)X86CompilationCallback_SSE); + bool NotCC = (Target != (void*)(intptr_t)X86CompilationCallback && + Target != (void*)(intptr_t)X86CompilationCallback_SSE); #else - bool NotCC = Fn != (void*)(intptr_t)X86CompilationCallback; + bool NotCC = Target != (void*)(intptr_t)X86CompilationCallback; #endif + JCE.emitAlignment(4); + void *Result = (void*)JCE.getCurrentPCValue(); if (NotCC) { #if defined (X86_64_JIT) - JCE.startGVStub(BS, F, 13, 4); JCE.emitByte(0x49); // REX prefix JCE.emitByte(0xB8+2); // movabsq r10 - JCE.emitWordLE((unsigned)(intptr_t)Fn); - JCE.emitWordLE((unsigned)(((intptr_t)Fn) >> 32)); + JCE.emitWordLE((unsigned)(intptr_t)Target); + JCE.emitWordLE((unsigned)(((intptr_t)Target) >> 32)); JCE.emitByte(0x41); // REX prefix JCE.emitByte(0xFF); // jmpq *r10 JCE.emitByte(2 | (4 << 3) | (3 << 6)); #else - JCE.startGVStub(BS, F, 5, 4); JCE.emitByte(0xE9); - 
JCE.emitWordLE((intptr_t)Fn-JCE.getCurrentPCValue()-4); + JCE.emitWordLE((intptr_t)Target-JCE.getCurrentPCValue()-4); #endif - return JCE.finishGVStub(BS); + return Result; } #if defined (X86_64_JIT) - JCE.startGVStub(BS, F, 14, 4); JCE.emitByte(0x49); // REX prefix JCE.emitByte(0xB8+2); // movabsq r10 - JCE.emitWordLE((unsigned)(intptr_t)Fn); - JCE.emitWordLE((unsigned)(((intptr_t)Fn) >> 32)); + JCE.emitWordLE((unsigned)(intptr_t)Target); + JCE.emitWordLE((unsigned)(((intptr_t)Target) >> 32)); JCE.emitByte(0x41); // REX prefix JCE.emitByte(0xFF); // callq *r10 JCE.emitByte(2 | (2 << 3) | (3 << 6)); #else - JCE.startGVStub(BS, F, 6, 4); JCE.emitByte(0xE8); // Call with 32 bit pc-rel destination... - JCE.emitWordLE((intptr_t)Fn-JCE.getCurrentPCValue()-4); + JCE.emitWordLE((intptr_t)Target-JCE.getCurrentPCValue()-4); #endif // This used to use 0xCD, but that value is used by JITMemoryManager to // initialize the buffer with garbage, which means it may follow a // noreturn function call, confusing X86CompilationCallback2. PR 4929. JCE.emitByte(0xCE); // Interrupt - Just a marker identifying the stub! - return JCE.finishGVStub(BS); -} - -void X86JITInfo::emitFunctionStubAtAddr(const Function* F, void *Fn, void *Stub, - JITCodeEmitter &JCE) { - MachineCodeEmitter::BufferState BS; - // Note, we cast to intptr_t here to silence a -pedantic warning that - // complains about casting a function pointer to a normal pointer. - JCE.startGVStub(BS, Stub, 5); - JCE.emitByte(0xE9); -#if defined (X86_64_JIT) && !defined (NDEBUG) - // Yes, we need both of these casts, or some broken versions of GCC (4.2.4) - // get the signed-ness of the expression wrong. Go figure. 
- intptr_t Displacement = (intptr_t)Fn - (intptr_t)JCE.getCurrentPCValue() - 5; - assert(((Displacement << 32) >> 32) == Displacement - && "PIC displacement does not fit in displacement field!"); -#endif - JCE.emitWordLE((intptr_t)Fn-JCE.getCurrentPCValue()-4); - JCE.finishGVStub(BS); + return Result; } /// getPICJumpTableEntry - Returns the value of the jumptable entry for the Modified: llvm/trunk/lib/Target/X86/X86JITInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86JITInfo.h?rev=89715&r1=89714&r2=89715&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86JITInfo.h (original) +++ llvm/trunk/lib/Target/X86/X86JITInfo.h Mon Nov 23 17:35:19 2009 @@ -43,18 +43,16 @@ virtual void *emitGlobalValueIndirectSym(const GlobalValue* GV, void *ptr, JITCodeEmitter &JCE); + // getStubLayout - Returns the size and alignment of the largest call stub + // on X86. + virtual StubLayout getStubLayout(); + /// emitFunctionStub - Use the specified JITCodeEmitter object to emit a /// small native function that simply calls the function at the specified /// address. - virtual void *emitFunctionStub(const Function* F, void *Fn, + virtual void *emitFunctionStub(const Function* F, void *Target, JITCodeEmitter &JCE); - /// emitFunctionStubAtAddr - Use the specified JITCodeEmitter object to - /// emit a small native function that simply calls Fn. Emit the stub into - /// the supplied buffer. - virtual void emitFunctionStubAtAddr(const Function* F, void *Fn, - void *Buffer, JITCodeEmitter &JCE); - /// getPICJumpTableEntry - Returns the value of the jumptable entry for the /// specific basic block. 
virtual uintptr_t getPICJumpTableEntry(uintptr_t BB, uintptr_t JTBase); Modified: llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp?rev=89715&r1=89714&r2=89715&view=diff ============================================================================== --- llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp (original) +++ llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp Mon Nov 23 17:35:19 2009 @@ -183,6 +183,7 @@ M = new Module("
", Context); MP = new ExistingModuleProvider(M); RJMM = new RecordingJITMemoryManager; + RJMM->setPoisonMemory(true); std::string Error; TheJIT.reset(EngineBuilder(MP).setEngineKind(EngineKind::JIT) .setJITMemoryManager(RJMM) @@ -311,7 +312,6 @@ EXPECT_EQ(8, TestFunctionPtr()); } -#if !defined(__arm__) && !defined(__powerpc__) && !defined(__ppc__) // Test a function C which calls A and B which call each other. TEST_F(JITTest, NonLazyCompilationStillNeedsStubs) { TheJIT->DisableLazyCompilation(true); @@ -407,7 +407,6 @@ EXPECT_EQ(Func2->getNumUses(), 0u); Func2->eraseFromParent(); } -#endif TEST_F(JITTest, ModuleDeletion) { TheJIT->DisableLazyCompilation(false); @@ -458,7 +457,6 @@ NumTablesDeallocated); } -#if !defined(__arm__) && !defined(__powerpc__) && !defined(__ppc__) typedef int (*FooPtr) (); TEST_F(JITTest, NoStubs) { @@ -496,7 +494,40 @@ ASSERT_EQ(stubsBefore, RJMM->stubsAllocated); } + +TEST_F(JITTest, FunctionPointersOutliveTheirCreator) { + TheJIT->DisableLazyCompilation(true); + LoadAssembly("define i8()* @get_foo_addr() { " + " ret i8()* @foo " + "} " + " " + "define i8 @foo() { " + " ret i8 42 " + "} "); + Function *F_get_foo_addr = M->getFunction("get_foo_addr"); + + typedef char(*fooT)(); + fooT (*get_foo_addr)() = reinterpret_cast( + (intptr_t)TheJIT->getPointerToFunction(F_get_foo_addr)); + fooT foo_addr = get_foo_addr(); + + // Now free get_foo_addr. This should not free the machine code for foo or + // any call stub returned as foo's canonical address. + TheJIT->freeMachineCodeForFunction(F_get_foo_addr); + + // Check by calling the reported address of foo. + EXPECT_EQ(42, foo_addr()); + + // The reported address should also be the same as the result of a subsequent + // getPointerToFunction(foo). 
+#if 0 + // Fails until PR5126 is fixed: + Function *F_foo = M->getFunction("foo"); + fooT foo = reinterpret_cast( + (intptr_t)TheJIT->getPointerToFunction(F_foo)); + EXPECT_EQ((intptr_t)foo, (intptr_t)foo_addr); #endif +} // This code is copied from JITEventListenerTest, but it only runs once for all // the tests in this directory. Everything seems fine, but that's strange From grosbach at apple.com Mon Nov 23 18:20:27 2009 From: grosbach at apple.com (Jim Grosbach) Date: Tue, 24 Nov 2009 00:20:27 -0000 Subject: [llvm-commits] [llvm] r89718 - /llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td Message-ID: <200911240020.nAO0KSrx016524@zion.cs.uiuc.edu> Author: grosbach Date: Mon Nov 23 18:20:27 2009 New Revision: 89718 URL: http://llvm.org/viewvc/llvm-project?rev=89718&view=rev Log: 80 column violations Modified: llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td Modified: llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td?rev=89718&r1=89717&r2=89718&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td Mon Nov 23 18:20:27 2009 @@ -49,8 +49,8 @@ // 8-bit immediate rotated by an arbitrary number of bits, or an 8-bit // immediate splatted into multiple bytes of the word. t2_so_imm values are // represented in the imm field in the same 12-bit form that they are encoded -// into t2_so_imm instructions: the 8-bit immediate is the least significant bits -// [bits 0-7], the 4-bit shift/splat amount is the next 4 bits [bits 8-11]. +// into t2_so_imm instructions: the 8-bit immediate is the least significant +// bits [bits 0-7], the 4-bit shift/splat amount is the next 4 bits [bits 8-11]. 
def t2_so_imm : Operand, PatLeaf<(imm), [{ return ARM_AM::getT2SOImmVal((uint32_t)N->getZExtValue()) != -1; @@ -267,9 +267,9 @@ [(set GPR:$dst, (opnode GPR:$lhs, t2_so_reg:$rhs))]>; } -/// T2I_adde_sube_irs - Defines a set of (op reg, {so_imm|r|so_reg}) patterns for a -/// binary operation that produces a value and use and define the carry bit. -/// It's not predicable. +/// T2I_adde_sube_irs - Defines a set of (op reg, {so_imm|r|so_reg}) patterns +/// for a binary operation that produces a value and use and define the carry +/// bit. It's not predicable. let Uses = [CPSR] in { multiclass T2I_adde_sube_irs { // shifted imm @@ -630,7 +630,7 @@ AddrModeT2_i8, IndexModePost, IIC_iStoreiu, "str", "\t$src, [$base], $offset", "$base = $base_wb", [(set GPR:$base_wb, - (post_store GPR:$src, GPR:$base, t2am_imm8_offset:$offset))]>; + (post_store GPR:$src, GPR:$base, t2am_imm8_offset:$offset))]>; def t2STRH_PRE : T2Iidxldst<(outs GPR:$base_wb), (ins GPR:$src, GPR:$base, t2am_imm8_offset:$offset), @@ -733,9 +733,9 @@ (t2UXTB16r_rot GPR:$Src, 8)>; defm t2UXTAB : T2I_bin_rrot<"uxtab", - BinOpFrag<(add node:$LHS, (and node:$RHS, 0x00FF))>>; + BinOpFrag<(add node:$LHS, (and node:$RHS, 0x00FF))>>; defm t2UXTAH : T2I_bin_rrot<"uxtah", - BinOpFrag<(add node:$LHS, (and node:$RHS, 0xFFFF))>>; + BinOpFrag<(add node:$LHS, (and node:$RHS, 0xFFFF))>>; } //===----------------------------------------------------------------------===// From asl at math.spbu.ru Mon Nov 23 18:44:37 2009 From: asl at math.spbu.ru (Anton Korobeynikov) Date: Tue, 24 Nov 2009 00:44:37 -0000 Subject: [llvm-commits] [llvm] r89720 - in /llvm/trunk: lib/Target/ARM/ARMBaseInstrInfo.h lib/Target/ARM/ARMExpandPseudoInsts.cpp lib/Target/ARM/ARMISelDAGToDAG.cpp lib/Target/ARM/ARMISelLowering.cpp lib/Target/ARM/ARMInstrInfo.td lib/Target/ARM/ARMInstrThumb2.td lib/Target/ARM/ARMSubtarget.cpp lib/Target/ARM/ARMSubtarget.h lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp lib/Target/ARM/Thumb2SizeReduction.cpp 
test/CodeGen/ARM/movt-movw-global.ll Message-ID: <200911240044.nAO0ic27017388@zion.cs.uiuc.edu> Author: asl Date: Mon Nov 23 18:44:37 2009 New Revision: 89720 URL: http://llvm.org/viewvc/llvm-project?rev=89720&view=rev Log: Materialize global addresses via movt/movw pair, this is always better than doing the same via constpool: 1. Load from constpool costs 3 cycles on A9, movt/movw pair - just 2. 2. Load from constpool might stall up to 300 cycles due to cache miss. 3. Movt/movw does not use load/store unit. 4. Less constpool entries => better compiler performance. This is only enabled on ELF systems, since darwin does not have needed relocations (yet). Added: llvm/trunk/test/CodeGen/ARM/movt-movw-global.ll Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp llvm/trunk/lib/Target/ARM/ARMInstrInfo.td llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp llvm/trunk/lib/Target/ARM/ARMSubtarget.h llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp llvm/trunk/lib/Target/ARM/Thumb2SizeReduction.cpp Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h?rev=89720&r1=89719&r2=89720&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h (original) +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h Mon Nov 23 18:44:37 2009 @@ -162,6 +162,22 @@ I_BitShift = 25, CondShift = 28 }; + + /// Target Operand Flag enum. + enum TOF { + //===------------------------------------------------------------------===// + // ARM Specific MachineOperand flags. + + MO_NO_FLAG, + + /// MO_LO16 - On a symbol operand, this represents a relocation containing + /// lower 16 bit of the address. Used only via movw instruction. 
+ MO_LO16, + + /// MO_HI16 - On a symbol operand, this represents a relocation containing + /// higher 16 bit of the address. Used only via movt instruction. + MO_HI16 + }; } class ARMBaseInstrInfo : public TargetInstrInfoImpl { Modified: llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp?rev=89720&r1=89719&r2=89720&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp Mon Nov 23 18:44:37 2009 @@ -75,17 +75,30 @@ } case ARM::t2MOVi32imm: { unsigned DstReg = MI.getOperand(0).getReg(); - unsigned Imm = MI.getOperand(1).getImm(); - unsigned Lo16 = Imm & 0xffff; - unsigned Hi16 = (Imm >> 16) & 0xffff; if (!MI.getOperand(0).isDead()) { - AddDefaultPred(BuildMI(MBB, MBBI, MI.getDebugLoc(), - TII->get(ARM::t2MOVi16), DstReg) - .addImm(Lo16)); - AddDefaultPred(BuildMI(MBB, MBBI, MI.getDebugLoc(), - TII->get(ARM::t2MOVTi16)) - .addReg(DstReg, getDefRegState(true)) - .addReg(DstReg).addImm(Hi16)); + const MachineOperand &MO = MI.getOperand(1); + MachineInstrBuilder LO16, HI16; + + LO16 = BuildMI(MBB, MBBI, MI.getDebugLoc(), TII->get(ARM::t2MOVi16), + DstReg); + HI16 = BuildMI(MBB, MBBI, MI.getDebugLoc(), TII->get(ARM::t2MOVTi16)) + .addReg(DstReg, getDefRegState(true)).addReg(DstReg); + + if (MO.isImm()) { + unsigned Imm = MO.getImm(); + unsigned Lo16 = Imm & 0xffff; + unsigned Hi16 = (Imm >> 16) & 0xffff; + LO16 = LO16.addImm(Lo16); + HI16 = HI16.addImm(Hi16); + } else { + GlobalValue *GV = MO.getGlobal(); + unsigned TF = MO.getTargetFlags(); + LO16 = LO16.addGlobalAddress(GV, MO.getOffset(), TF | ARMII::MO_LO16); + HI16 = HI16.addGlobalAddress(GV, MO.getOffset(), TF | ARMII::MO_HI16); + // FIXME: What's about memoperands? 
+ } + AddDefaultPred(LO16); + AddDefaultPred(HI16); } MI.eraseFromParent(); Modified = true; Modified: llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp?rev=89720&r1=89719&r2=89720&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp Mon Nov 23 18:44:37 2009 @@ -261,7 +261,9 @@ if (N.getOpcode() == ISD::FrameIndex) { int FI = cast(N)->getIndex(); Base = CurDAG->getTargetFrameIndex(FI, TLI.getPointerTy()); - } else if (N.getOpcode() == ARMISD::Wrapper) { + } else if (N.getOpcode() == ARMISD::Wrapper && + !(Subtarget->useMovt() && + N.getOperand(0).getOpcode() == ISD::TargetGlobalAddress)) { Base = N.getOperand(0); } Offset = CurDAG->getRegister(0, MVT::i32); @@ -463,7 +465,9 @@ if (N.getOpcode() == ISD::FrameIndex) { int FI = cast(N)->getIndex(); Base = CurDAG->getTargetFrameIndex(FI, TLI.getPointerTy()); - } else if (N.getOpcode() == ARMISD::Wrapper) { + } else if (N.getOpcode() == ARMISD::Wrapper && + !(Subtarget->useMovt() && + N.getOperand(0).getOpcode() == ISD::TargetGlobalAddress)) { Base = N.getOperand(0); } Offset = CurDAG->getTargetConstant(ARM_AM::getAM5Opc(ARM_AM::add, 0), @@ -558,7 +562,13 @@ } if (N.getOpcode() != ISD::ADD) { - Base = (N.getOpcode() == ARMISD::Wrapper) ? 
N.getOperand(0) : N; + if (N.getOpcode() == ARMISD::Wrapper && + !(Subtarget->useMovt() && + N.getOperand(0).getOpcode() == ISD::TargetGlobalAddress)) { + Base = N.getOperand(0); + } else + Base = N; + Offset = CurDAG->getRegister(0, MVT::i32); OffImm = CurDAG->getTargetConstant(0, MVT::i32); return true; @@ -681,7 +691,9 @@ Base = CurDAG->getTargetFrameIndex(FI, TLI.getPointerTy()); OffImm = CurDAG->getTargetConstant(0, MVT::i32); return true; - } else if (N.getOpcode() == ARMISD::Wrapper) { + } else if (N.getOpcode() == ARMISD::Wrapper && + !(Subtarget->useMovt() && + N.getOperand(0).getOpcode() == ISD::TargetGlobalAddress)) { Base = N.getOperand(0); if (Base.getOpcode() == ISD::TargetConstantPool) return false; // We want to select t2LDRpci instead. Modified: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp?rev=89720&r1=89719&r2=89720&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp Mon Nov 23 18:44:37 2009 @@ -39,6 +39,7 @@ #include "llvm/CodeGen/SelectionDAG.h" #include "llvm/Target/TargetOptions.h" #include "llvm/ADT/VectorExtras.h" +#include "llvm/Support/CommandLine.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/MathExtras.h" #include @@ -1356,10 +1357,17 @@ PseudoSourceValue::getGOT(), 0); return Result; } else { - SDValue CPAddr = DAG.getTargetConstantPool(GV, PtrVT, 4); - CPAddr = DAG.getNode(ARMISD::Wrapper, dl, MVT::i32, CPAddr); - return DAG.getLoad(PtrVT, dl, DAG.getEntryNode(), CPAddr, - PseudoSourceValue::getConstantPool(), 0); + // If we have T2 ops, we can materialize the address directly via movt/movw + // pair. This is always cheaper. 
+ if (Subtarget->useMovt()) { + return DAG.getNode(ARMISD::Wrapper, dl, PtrVT, + DAG.getTargetGlobalAddress(GV, PtrVT)); + } else { + SDValue CPAddr = DAG.getTargetConstantPool(GV, PtrVT, 4); + CPAddr = DAG.getNode(ARMISD::Wrapper, dl, MVT::i32, CPAddr); + return DAG.getLoad(PtrVT, dl, DAG.getEntryNode(), CPAddr, + PseudoSourceValue::getConstantPool(), 0); + } } } Modified: llvm/trunk/lib/Target/ARM/ARMInstrInfo.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrInfo.td?rev=89720&r1=89719&r2=89720&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrInfo.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrInfo.td Mon Nov 23 18:44:37 2009 @@ -116,6 +116,10 @@ def CarryDefIsUnused : Predicate<"!N.getNode()->hasAnyUseOfValue(1)">; def CarryDefIsUsed : Predicate<"N.getNode()->hasAnyUseOfValue(1)">; +// FIXME: Eventually this will be just "hasV6T2Ops". +def UseMovt : Predicate<"Subtarget->useMovt()">; +def DontUseMovt : Predicate<"!Subtarget->useMovt()">; + //===----------------------------------------------------------------------===// // ARM Flag Definitions. @@ -204,7 +208,7 @@ def lo16AllZero : PatLeaf<(i32 imm), [{ // Returns true if all low 16-bits are 0. return (((uint32_t)N->getZExtValue()) & 0xFFFFUL) == 0; - }], hi16>; +}], hi16>; /// imm0_65535 predicate - True if the 32-bit immediate is in the range /// [0.65535]. 
@@ -1002,7 +1006,7 @@ let Constraints = "$src = $dst" in def MOVTi16 : AI1<0b1010, (outs GPR:$dst), (ins GPR:$src, i32imm:$imm), DPFrm, IIC_iMOVi, - "movt", "\t$dst, $imm", + "movt", "\t$dst, $imm", [(set GPR:$dst, (or (and GPR:$src, 0xffff), lo16AllZero:$imm))]>, UnaryDP, @@ -1603,12 +1607,6 @@ // Non-Instruction Patterns // -// ConstantPool, GlobalAddress, and JumpTable -def : ARMPat<(ARMWrapper tglobaladdr :$dst), (LEApcrel tglobaladdr :$dst)>; -def : ARMPat<(ARMWrapper tconstpool :$dst), (LEApcrel tconstpool :$dst)>; -def : ARMPat<(ARMWrapperJT tjumptable:$dst, imm:$id), - (LEApcrelJT tjumptable:$dst, imm:$id)>; - // Large immediate handling. // Two piece so_imms. @@ -1638,10 +1636,19 @@ // FIXME: Remove this when we can do generalized remat. let isReMaterializable = 1 in def MOVi32imm : AI1x2<(outs GPR:$dst), (ins i32imm:$src), Pseudo, IIC_iMOVi, - "movw", "\t$dst, ${src:lo16}\n\tmovt${p} $dst, ${src:hi16}", + "movw", "\t$dst, ${src:lo16}\n\tmovt${p}\t$dst, ${src:hi16}", [(set GPR:$dst, (i32 imm:$src))]>, Requires<[IsARM, HasV6T2]>; +// ConstantPool, GlobalAddress, and JumpTable +def : ARMPat<(ARMWrapper tglobaladdr :$dst), (LEApcrel tglobaladdr :$dst)>, + Requires<[IsARM, DontUseMovt]>; +def : ARMPat<(ARMWrapper tconstpool :$dst), (LEApcrel tconstpool :$dst)>; +def : ARMPat<(ARMWrapper tglobaladdr :$dst), (MOVi32imm tglobaladdr :$dst)>, + Requires<[IsARM, UseMovt]>; +def : ARMPat<(ARMWrapperJT tjumptable:$dst, imm:$id), + (LEApcrelJT tjumptable:$dst, imm:$id)>; + // TODO: add,sub,and, 3-instr forms? 
Modified: llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td?rev=89720&r1=89719&r2=89720&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td Mon Nov 23 18:44:37 2009 @@ -1181,12 +1181,6 @@ (t2SUBri (t2SUBri GPR:$LHS, (t2_so_neg_imm2part_1 imm:$RHS)), (t2_so_neg_imm2part_2 imm:$RHS))>; -// ConstantPool, GlobalAddress, and JumpTable -def : T2Pat<(ARMWrapper tglobaladdr :$dst), (t2LEApcrel tglobaladdr :$dst)>; -def : T2Pat<(ARMWrapper tconstpool :$dst), (t2LEApcrel tconstpool :$dst)>; -def : T2Pat<(ARMWrapperJT tjumptable:$dst, imm:$id), - (t2LEApcrelJT tjumptable:$dst, imm:$id)>; - // 32-bit immediate using movw + movt. // This is a single pseudo instruction to make it re-materializable. Remove // when we can do generalized remat. @@ -1195,6 +1189,16 @@ "movw", "\t$dst, ${src:lo16}\n\tmovt${p}\t$dst, ${src:hi16}", [(set GPR:$dst, (i32 imm:$src))]>; +// ConstantPool, GlobalAddress, and JumpTable +def : T2Pat<(ARMWrapper tglobaladdr :$dst), (t2LEApcrel tglobaladdr :$dst)>, + Requires<[IsThumb2, DontUseMovt]>; +def : T2Pat<(ARMWrapper tconstpool :$dst), (t2LEApcrel tconstpool :$dst)>; +def : T2Pat<(ARMWrapper tglobaladdr :$dst), (t2MOVi32imm tglobaladdr :$dst)>, + Requires<[IsThumb2, UseMovt]>; + +def : T2Pat<(ARMWrapperJT tjumptable:$dst, imm:$id), + (t2LEApcrelJT tjumptable:$dst, imm:$id)>; + // Pseudo instruction that combines ldr from constpool and add pc. This should // be expanded into two instructions late to allow if-conversion and // scheduling. 
Modified: llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp?rev=89720&r1=89719&r2=89720&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp Mon Nov 23 18:44:37 2009 @@ -27,6 +27,10 @@ cl::desc("Use NEON for single-precision FP"), cl::init(false), cl::Hidden); +static cl::opt +UseMOVT("arm-use-movt", + cl::init(true), cl::Hidden); + ARMSubtarget::ARMSubtarget(const std::string &TT, const std::string &FS, bool isT) : ARMArchVersion(V4T) @@ -36,6 +40,7 @@ , ThumbMode(Thumb1) , PostRAScheduler(false) , IsR9Reserved(ReserveR9) + , UseMovt(UseMOVT) , stackAlignment(4) , CPUString("generic") , TargetType(isELF) // Default to ELF unless otherwise specified. Modified: llvm/trunk/lib/Target/ARM/ARMSubtarget.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMSubtarget.h?rev=89720&r1=89719&r2=89720&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMSubtarget.h (original) +++ llvm/trunk/lib/Target/ARM/ARMSubtarget.h Mon Nov 23 18:44:37 2009 @@ -65,6 +65,10 @@ /// IsR9Reserved - True if R9 is a not available as general purpose register. bool IsR9Reserved; + /// UseMovt - True if MOVT / MOVW pairs are used for materialization of 32-bit + /// imms (including global addresses). + bool UseMovt; + /// stackAlignment - The minimum alignment known to hold of the stack frame on /// entry to the function and which must be maintained by every function. unsigned stackAlignment; @@ -130,8 +134,10 @@ bool isR9Reserved() const { return IsR9Reserved; } + bool useMovt() const { return UseMovt && hasV6T2Ops(); } + const std::string & getCPUString() const { return CPUString; } - + /// enablePostRAScheduler - True at 'More' optimization. 
bool enablePostRAScheduler(CodeGenOpt::Level OptLevel, TargetSubtarget::AntiDepBreakMode& Mode, Modified: llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp?rev=89720&r1=89719&r2=89720&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp (original) +++ llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp Mon Nov 23 18:44:37 2009 @@ -330,6 +330,8 @@ void ARMAsmPrinter::printOperand(const MachineInstr *MI, int OpNum, const char *Modifier) { const MachineOperand &MO = MI->getOperand(OpNum); + unsigned TF = MO.getTargetFlags(); + switch (MO.getType()) { default: assert(0 && ""); @@ -356,12 +358,12 @@ case MachineOperand::MO_Immediate: { int64_t Imm = MO.getImm(); O << '#'; - if (Modifier) { - if (strcmp(Modifier, "lo16") == 0) - O << ":lower16:"; - else if (strcmp(Modifier, "hi16") == 0) - O << ":upper16:"; - } + if ((Modifier && strcmp(Modifier, "lo16") == 0) || + (TF & ARMII::MO_LO16)) + O << ":lower16:"; + else if ((Modifier && strcmp(Modifier, "hi16") == 0) || + (TF & ARMII::MO_HI16)) + O << ":upper16:"; O << Imm; break; } @@ -371,6 +373,13 @@ case MachineOperand::MO_GlobalAddress: { bool isCallOp = Modifier && !strcmp(Modifier, "call"); GlobalValue *GV = MO.getGlobal(); + + if ((Modifier && strcmp(Modifier, "lo16") == 0) || + (TF & ARMII::MO_LO16)) + O << ":lower16:"; + else if ((Modifier && strcmp(Modifier, "hi16") == 0) || + (TF & ARMII::MO_HI16)) + O << ":upper16:"; O << Mang->getMangledName(GV); printOffset(MO.getOffset()); Modified: llvm/trunk/lib/Target/ARM/Thumb2SizeReduction.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/Thumb2SizeReduction.cpp?rev=89720&r1=89719&r2=89720&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/Thumb2SizeReduction.cpp (original) +++ 
llvm/trunk/lib/Target/ARM/Thumb2SizeReduction.cpp Mon Nov 23 18:44:37 2009 @@ -78,7 +78,7 @@ { ARM::t2LSRri, ARM::tLSRri, 0, 5, 0, 1, 0, 0,0, 0 }, { ARM::t2LSRrr, 0, ARM::tLSRrr, 0, 0, 0, 1, 0,0, 0 }, { ARM::t2MOVi, ARM::tMOVi8, 0, 8, 0, 1, 0, 0,0, 0 }, - { ARM::t2MOVi16,ARM::tMOVi8, 0, 8, 0, 1, 0, 0,0, 0 }, + { ARM::t2MOVi16,ARM::tMOVi8, 0, 8, 0, 1, 0, 0,0, 1 }, // FIXME: Do we need the 16-bit 'S' variant? { ARM::t2MOVr,ARM::tMOVgpr2gpr,0, 0, 0, 0, 0, 1,0, 0 }, { ARM::t2MOVCCr,0, ARM::tMOVCCr, 0, 0, 0, 0, 0,1, 0 }, @@ -413,6 +413,12 @@ if (MI->getOperand(2).getImm() == 0) return ReduceToNarrow(MBB, MI, Entry, LiveCPSR); break; + case ARM::t2MOVi16: + // Can convert only 'pure' immediate operands, not immediates obtained as + // globals' addresses. + if (MI->getOperand(1).isImm()) + return ReduceToNarrow(MBB, MI, Entry, LiveCPSR); + break; } return false; } Added: llvm/trunk/test/CodeGen/ARM/movt-movw-global.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/movt-movw-global.ll?rev=89720&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/ARM/movt-movw-global.ll (added) +++ llvm/trunk/test/CodeGen/ARM/movt-movw-global.ll Mon Nov 23 18:44:37 2009 @@ -0,0 +1,20 @@ +; RUN: llc < %s | FileCheck %s +target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64" +target triple = "armv7-eabi" + + at foo = common global i32 0 ; [#uses=1] + +define arm_aapcs_vfpcc i32* @bar1() nounwind readnone { +entry: +; CHECK: movw r0, :lower16:foo +; CHECK-NEXT: movt r0, :upper16:foo + ret i32* @foo +} + +define arm_aapcs_vfpcc void @bar2(i32 %baz) nounwind { +entry: +; CHECK: movw r1, :lower16:foo +; CHECK-NEXT: movt r1, :upper16:foo + store i32 %baz, i32* @foo, align 4 + ret void +} From vkutuzov at accesssoftek.com Mon Nov 23 18:45:03 2009 From: vkutuzov at accesssoftek.com (Viktor Kutuzov) Date: Mon, 23 Nov 2009 16:45:03 -0800 
Subject: [llvm-commits] [llvm] r89516 - in /llvm/trunk: include/llvm/Target/SubtargetFeature.h lib/Target/SubtargetFeature.cpp tools/lto/LTOCodeGenerator.cpp References: <200911210000.nAL003q6027547@zion.cs.uiuc.edu> <6a8523d60911220941u6e633e0dhd4ee9da39dedb2ce@mail.gmail.com> <6E59EC20-D700-4C76-9E8B-A4B12F2617F5@apple.com> Message-ID: <17738F1106824A719376B071BC8C5F04@andreic6e7fe55> Please find the patch attached. Is it OK to submit? However, the SubtargetFeature constructor accepts a string as an argument and parses it as a comma-separated list. There is also the SubtargetFeature::setString method, which does the same. While I'm changing the SubtargetFeature, would you like me to change the SubtargetFeature constructor to accept an array of StringRefs and remove the setString method? -Viktor ----- Original Message ----- From: "Chris Lattner" To: "Daniel Dunbar" Cc: "Viktor Kutuzov" ; Sent: Monday, November 23, 2009 9:37 AM Subject: Re: [llvm-commits] [llvm] r89516 - in /llvm/trunk: include/llvm/Target/SubtargetFeature.h lib/Target/SubtargetFeature.cpp tools/lto/LTOCodeGenerator.cpp On Nov 22, 2009, at 9:41 AM, Daniel Dunbar wrote: > On Sun, Nov 22, 2009 at 6:03 AM, Chris Lattner wrote: >> >> On Nov 20, 2009, at 4:00 PM, Viktor Kutuzov wrote: >> >>> Author: vkutuzov >>> Date: Fri Nov 20 18:00:02 2009 >>> New Revision: 89516 >>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=89516&view=rev >>> Log: >>> Added two SubtargetFeatures::AddFeatures methods, which accept a comma-separated string or already parsed command line >>> parameters as input, and some code re-factoring to use these new methods. > > I don't think this code belongs in SubtargetFeatures at all. All it is > doing is calling AddFeature on each string, the client is perfectly > capable of doing this, which obviates thinking about how best to pass > the vector. Yes, I agree. Viktor, please remove this part of the patch, pushing the logic into the LTO client.
-Chris > > Similarly, AddFeatures shouldn't impose some kind of discipline like > comma separate strings, clients should handle this if it is what they > have (and StringRef::split makes it easy for them to split the > string). > > - Daniel > >> Ok, a couple comments below: >> >>> +++ llvm/trunk/include/llvm/Target/SubtargetFeature.h Fri Nov 20 18:00:02 2009 >>> @@ -22,6 +22,7 @@ >>> #include >>> #include >>> #include "llvm/ADT/Triple.h" >>> +#include "llvm/Support/CommandLine.h" >>> #include "llvm/System/DataTypes.h" >> >> Please drop this #include. >> >>> @@ -93,6 +94,12 @@ >>> /// Adding Features. >>> void AddFeature(const std::string &String, bool IsEnabled = true); >>> >>> + /// Add a set of features from the comma-separated string. >>> + void AddFeatures(const std::string &String); >> >> This should take a StringRef instead of std::string. >> >>> + >>> + /// Add a set of features from the parsed command line parameters. >>> + void AddFeatures(const cl::list &List); >> >> cl::list inherits from std::vector, so you should be able to pass in a std::vector directly. However, it would be >> much much better to expose this as taking an array of StringRef's and require the caller to do the unpacking. >> >> -Chris >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> -------------- next part -------------- A non-text attachment was scrubbed... 
Name: llvm-lto-codegen-subtargetfeature-remove_ext_functionality.diff Type: application/octet-stream Size: 2172 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091123/7edba59e/attachment.obj From jyasskin at google.com Mon Nov 23 18:53:23 2009 From: jyasskin at google.com (Jeffrey Yasskin) Date: Mon, 23 Nov 2009 16:53:23 -0800 Subject: [llvm-commits] [llvm] r81422 - in /llvm/trunk/lib/Target/ARM: ARMCodeEmitter.cpp ARMJITInfo.cpp In-Reply-To: <200909100123.n8A1Nred001304@zion.cs.uiuc.edu> References: <200909100123.n8A1Nred001304@zion.cs.uiuc.edu> Message-ID: Hi Evan. Why does ARM need to mark the indirect symbol executable? It's loading from that address not executing it. Is there a test that the change to ARMJITInfo.cpp fixes? On Wed, Sep 9, 2009 at 5:23 PM, Evan Cheng wrote: > Author: evancheng > Date: Wed Sep 9 20:23:53 2009 > New Revision: 81422 > > URL: http://llvm.org/viewvc/llvm-project?rev=81422&view=rev > Log: > Proper support of non-lazy indirect symbols. > > Modified: > llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp > llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp > > Modified: llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp?rev=81422&r1=81421&r2=81422&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp (original) > +++ llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp Wed Sep 9 20:23:53 2009 > @@ -60,6 +60,7 @@ > ARMJITInfo *JTI; > const ARMInstrInfo *II; > const TargetData *TD; > + const ARMSubtarget *Subtarget; > TargetMachine &TM; > CodeEmitter &MCE; > const std::vector *MCPEs; > @@ -163,7 +164,7 @@ > /// Routines that handle operands which add machine relocations which are > /// fixed up by the relocation stage. >
void emitGlobalAddress(GlobalValue *GV, unsigned Reloc, > - bool NeedStub, intptr_t ACPV = 0); > + bool NeedStub, bool Indirect, intptr_t ACPV = 0); > void emitExternalSymbolAddress(const char *ES, unsigned Reloc); > void emitConstPoolAddress(unsigned CPI, unsigned Reloc); > void emitJumpTableAddress(unsigned JTIndex, unsigned Reloc); > @@ -195,9 +196,10 @@ > assert((MF.getTarget().getRelocationModel() != Reloc::Default || > MF.getTarget().getRelocationModel() != Reloc::Static) && > "JIT relocation model must be set to static or default!"); > + JTI = ((ARMTargetMachine&)MF.getTarget()).getJITInfo(); > II = ((ARMTargetMachine&)MF.getTarget()).getInstrInfo(); > TD = ((ARMTargetMachine&)MF.getTarget()).getTargetData(); > - JTI = ((ARMTargetMachine&)MF.getTarget()).getJITInfo(); > + Subtarget = &TM.getSubtarget(); > MCPEs = &MF.getConstantPool()->getConstants(); > MJTEs = &MF.getJumpTableInfo()->getJumpTables(); > IsPIC = TM.getRelocationModel() == Reloc::PIC_; > @@ -244,7 +246,7 @@ > else if (MO.isImm()) > return static_cast(MO.getImm()); > else if (MO.isGlobal()) > - emitGlobalAddress(MO.getGlobal(), ARM::reloc_arm_branch, true); > + emitGlobalAddress(MO.getGlobal(), ARM::reloc_arm_branch, true, false); > else if (MO.isSymbol()) > emitExternalSymbolAddress(MO.getSymbolName(), ARM::reloc_arm_branch); > else if (MO.isCPI()) { > @@ -270,9 +272,14 @@ > /// > template > void Emitter::emitGlobalAddress(GlobalValue *GV, unsigned Reloc, > - bool NeedStub, intptr_t ACPV) { > - MCE.addRelocation(MachineRelocation::getGV(MCE.getCurrentPCOffset(), Reloc, > - GV, ACPV, NeedStub)); > + bool NeedStub, bool Indirect, > + intptr_t ACPV) { > + MachineRelocation MR = Indirect > + ?
MachineRelocation::getIndirectSymbol(MCE.getCurrentPCOffset(), Reloc, > + GV, ACPV, NeedStub) > + : MachineRelocation::getGV(MCE.getCurrentPCOffset(), Reloc, > + GV, ACPV, NeedStub); > + MCE.addRelocation(MR); > } > > /// emitExternalSymbolAddress - Arrange for the address of an external symbol to > @@ -417,8 +424,11 @@ > > GlobalValue *GV = ACPV->getGV(); > if (GV) { > + Reloc::Model RelocM = TM.getRelocationModel(); > emitGlobalAddress(GV, ARM::reloc_arm_machine_cp_entry, > - isa(GV), (intptr_t)ACPV); > + isa(GV), > + Subtarget->GVIsIndirectSymbol(GV, RelocM), > + (intptr_t)ACPV); > } else { > emitExternalSymbolAddress(ACPV->getSymbol(), ARM::reloc_arm_absolute); > } > @@ -437,7 +447,7 @@ > }); > > if (GlobalValue *GV = dyn_cast(CV)) { > - emitGlobalAddress(GV, ARM::reloc_arm_absolute, isa(GV)); > + emitGlobalAddress(GV, ARM::reloc_arm_absolute, isa(GV), false); > emitWordLE(0); > } else if (const ConstantInt *CI = dyn_cast(CV)) { > uint32_t Val = *(uint32_t*)CI->getValue().getRawData(); > > Modified: llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp?rev=81422&r1=81421&r2=81422&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp (original) > +++ llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp Wed Sep 9 20:23:53 2009 > @@ -145,6 +145,9 @@ > llvm_unreachable("ERROR: Unable to mark indirect symbol writable"); > } > JCE.emitWordLE((intptr_t)Ptr); > + if (!sys::Memory::setRangeExecutable((void*)Addr, 4)) { > + llvm_unreachable("ERROR: Unable to mark indirect symbol executable"); > + } > void *PtrAddr = JCE.finishGVStub(GV); >
addIndirectSymAddr(Ptr, (intptr_t)PtrAddr); > return PtrAddr; > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From david_goodwin at apple.com Mon Nov 23 18:59:08 2009 From: david_goodwin at apple.com (David Goodwin) Date: Tue, 24 Nov 2009 00:59:08 -0000 Subject: [llvm-commits] [llvm] r89722 - /llvm/trunk/lib/CodeGen/AggressiveAntiDepBreaker.cpp Message-ID: <200911240059.nAO0x8gZ017867@zion.cs.uiuc.edu> Author: david_goodwin Date: Mon Nov 23 18:59:08 2009 New Revision: 89722 URL: http://llvm.org/viewvc/llvm-project?rev=89722&view=rev Log: . Allow multiple registers to be renamed together (super and sub) if necessary to break an anti-dependence. Modified: llvm/trunk/lib/CodeGen/AggressiveAntiDepBreaker.cpp Modified: llvm/trunk/lib/CodeGen/AggressiveAntiDepBreaker.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AggressiveAntiDepBreaker.cpp?rev=89722&r1=89721&r2=89722&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AggressiveAntiDepBreaker.cpp (original) +++ llvm/trunk/lib/CodeGen/AggressiveAntiDepBreaker.cpp Mon Nov 23 18:59:08 2009 @@ -581,12 +581,6 @@ return false; } - // FIXME: for now just handle single register in group case...
- if (Regs.size() > 1) { - DEBUG(errs() << "\tMultiple rename registers in group\n"); - return false; - } - #ifndef NDEBUG // If DebugDiv > 0 then only rename (renamecnt % DebugDiv) == DebugMod if (DebugDiv > 0) { From evan.cheng at apple.com Mon Nov 23 19:05:24 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Tue, 24 Nov 2009 01:05:24 -0000 Subject: [llvm-commits] [llvm] r89723 - /llvm/trunk/lib/Target/ARM/ARMInstrVFP.td Message-ID: <200911240105.nAO15OEP018111@zion.cs.uiuc.edu> Author: evancheng Date: Mon Nov 23 19:05:23 2009 New Revision: 89723 URL: http://llvm.org/viewvc/llvm-project?rev=89723&view=rev Log: Data type suffix must come after predicate. Modified: llvm/trunk/lib/Target/ARM/ARMInstrVFP.td Modified: llvm/trunk/lib/Target/ARM/ARMInstrVFP.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrVFP.td?rev=89723&r1=89722&r2=89723&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMInstrVFP.td (original) +++ llvm/trunk/lib/Target/ARM/ARMInstrVFP.td Mon Nov 23 19:05:23 2009 @@ -437,7 +437,7 @@ let isReMaterializable = 1 in { def FCONSTD : VFPAI<(outs DPR:$dst), (ins vfp_f64imm:$imm), VFPMiscFrm, IIC_VMOVImm, - "vmov.f64", "\t$dst, $imm", + "vmov", ".f64\t$dst, $imm", [(set DPR:$dst, vfp_f64imm:$imm)]>, Requires<[HasVFP3]> { let Inst{27-23} = 0b11101; let Inst{21-20} = 0b11; @@ -448,7 +448,7 @@ def FCONSTS : VFPAI<(outs SPR:$dst), (ins vfp_f32imm:$imm), VFPMiscFrm, IIC_VMOVImm, - "vmov.f32", "\t$dst, $imm", + "vmov", ".f32\t$dst, $imm", [(set SPR:$dst, vfp_f32imm:$imm)]>, Requires<[HasVFP3]> { let Inst{27-23} = 0b11101; let Inst{21-20} = 0b11; From dalej at apple.com Mon Nov 23 19:09:07 2009 From: dalej at apple.com (Dale Johannesen) Date: Tue, 24 Nov 2009 01:09:07 -0000 Subject: [llvm-commits] [llvm] r89724 - in /llvm/trunk/lib/Target/PowerPC: PPCISelDAGToDAG.cpp PPCISelLowering.cpp PPCRegisterInfo.cpp Message-ID: <200911240109.nAO197KL018253@zion.cs.uiuc.edu> 
Author: johannes Date: Mon Nov 23 19:09:07 2009 New Revision: 89724 URL: http://llvm.org/viewvc/llvm-project?rev=89724&view=rev Log: Make capitalization of names starting "is" more consistent. No functional change. Modified: llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp Modified: llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp?rev=89724&r1=89723&r2=89724&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp Mon Nov 23 19:09:07 2009 @@ -86,7 +86,7 @@ /// isRotateAndMask - Returns true if Mask and Shift can be folded into a /// rotate and mask opcode and mask operation. - static bool isRotateAndMask(SDNode *N, unsigned Mask, bool IsShiftMask, + static bool isRotateAndMask(SDNode *N, unsigned Mask, bool isShiftMask, unsigned &SH, unsigned &MB, unsigned &ME); /// getGlobalBaseReg - insert code into the entry mbb to materialize the PIC @@ -358,7 +358,7 @@ } bool PPCDAGToDAGISel::isRotateAndMask(SDNode *N, unsigned Mask, - bool IsShiftMask, unsigned &SH, + bool isShiftMask, unsigned &SH, unsigned &MB, unsigned &ME) { // Don't even go down this path for i64, since different logic will be // necessary for rldicl/rldicr/rldimi. 
@@ -374,12 +374,12 @@ if (Opcode == ISD::SHL) { // apply shift left to mask if it comes first - if (IsShiftMask) Mask = Mask << Shift; + if (isShiftMask) Mask = Mask << Shift; // determine which bits are made indeterminant by shift Indeterminant = ~(0xFFFFFFFFu << Shift); } else if (Opcode == ISD::SRL) { // apply shift right to mask if it comes first - if (IsShiftMask) Mask = Mask >> Shift; + if (isShiftMask) Mask = Mask >> Shift; // determine which bits are made indeterminant by shift Indeterminant = ~(0xFFFFFFFFu >> Shift); // adjust for the left rotate Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=89724&r1=89723&r2=89724&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Mon Nov 23 19:09:07 2009 @@ -2173,10 +2173,10 @@ /// CalculateTailCallSPDiff - Get the amount the stack pointer has to be /// adjusted to accomodate the arguments for the tailcall. -static int CalculateTailCallSPDiff(SelectionDAG& DAG, bool IsTailCall, +static int CalculateTailCallSPDiff(SelectionDAG& DAG, bool isTailCall, unsigned ParamSize) { - if (!IsTailCall) return 0; + if (!isTailCall) return 0; PPCFunctionInfo *FI = DAG.getMachineFunction().getInfo(); unsigned CallerMinReservedArea = FI->getMinReservedArea(); @@ -3186,8 +3186,8 @@ EVT PtrVT = DAG.getTargetLoweringInfo().getPointerTy(); // Construct the stack pointer operand. - bool IsPPC64 = Subtarget.isPPC64(); - unsigned SP = IsPPC64 ? PPC::X1 : PPC::R1; + bool isPPC64 = Subtarget.isPPC64(); + unsigned SP = isPPC64 ? PPC::X1 : PPC::R1; SDValue StackPtr = DAG.getRegister(SP, PtrVT); // Get the operands for the STACKRESTORE. 
@@ -3209,7 +3209,7 @@ SDValue PPCTargetLowering::getReturnAddrFrameIndex(SelectionDAG & DAG) const { MachineFunction &MF = DAG.getMachineFunction(); - bool IsPPC64 = PPCSubTarget.isPPC64(); + bool isPPC64 = PPCSubTarget.isPPC64(); bool isDarwinABI = PPCSubTarget.isDarwinABI(); EVT PtrVT = DAG.getTargetLoweringInfo().getPointerTy(); @@ -3221,9 +3221,9 @@ // If the frame pointer save index hasn't been defined yet. if (!RASI) { // Find out what the fix offset of the frame pointer save area. - int LROffset = PPCFrameInfo::getReturnSaveOffset(IsPPC64, isDarwinABI); + int LROffset = PPCFrameInfo::getReturnSaveOffset(isPPC64, isDarwinABI); // Allocate the frame index for frame pointer save area. - RASI = MF.getFrameInfo()->CreateFixedObject(IsPPC64? 8 : 4, LROffset, + RASI = MF.getFrameInfo()->CreateFixedObject(isPPC64? 8 : 4, LROffset, true, false); // Save the result. FI->setReturnAddrSaveIndex(RASI); @@ -3234,7 +3234,7 @@ SDValue PPCTargetLowering::getFramePointerFrameIndex(SelectionDAG & DAG) const { MachineFunction &MF = DAG.getMachineFunction(); - bool IsPPC64 = PPCSubTarget.isPPC64(); + bool isPPC64 = PPCSubTarget.isPPC64(); bool isDarwinABI = PPCSubTarget.isDarwinABI(); EVT PtrVT = DAG.getTargetLoweringInfo().getPointerTy(); @@ -3246,11 +3246,11 @@ // If the frame pointer save index hasn't been defined yet. if (!FPSI) { // Find out what the fix offset of the frame pointer save area. - int FPOffset = PPCFrameInfo::getFramePointerSaveOffset(IsPPC64, + int FPOffset = PPCFrameInfo::getFramePointerSaveOffset(isPPC64, isDarwinABI); // Allocate the frame index for frame pointer save area. - FPSI = MF.getFrameInfo()->CreateFixedObject(IsPPC64? 8 : 4, FPOffset, + FPSI = MF.getFrameInfo()->CreateFixedObject(isPPC64? 8 : 4, FPOffset, true, false); // Save the result. 
FI->setFramePointerSaveIndex(FPSI); Modified: llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp?rev=89724&r1=89723&r2=89724&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp Mon Nov 23 19:09:07 2009 @@ -1032,18 +1032,18 @@ // Save R31 if necessary int FPSI = FI->getFramePointerSaveIndex(); - bool IsPPC64 = Subtarget.isPPC64(); - bool IsSVR4ABI = Subtarget.isSVR4ABI(); + bool isPPC64 = Subtarget.isPPC64(); + bool isSVR4ABI = Subtarget.isSVR4ABI(); bool isDarwinABI = Subtarget.isDarwinABI(); MachineFrameInfo *MFI = MF.getFrameInfo(); // If the frame pointer save index hasn't been defined yet. - if (!FPSI && needsFP(MF) && IsSVR4ABI) { + if (!FPSI && needsFP(MF) && isSVR4ABI) { // Find out what the fix offset of the frame pointer save area. - int FPOffset = PPCFrameInfo::getFramePointerSaveOffset(IsPPC64, + int FPOffset = PPCFrameInfo::getFramePointerSaveOffset(isPPC64, isDarwinABI); // Allocate the frame index for frame pointer save area. - FPSI = MF.getFrameInfo()->CreateFixedObject(IsPPC64? 8 : 4, FPOffset, + FPSI = MF.getFrameInfo()->CreateFixedObject(isPPC64? 8 : 4, FPOffset, true, false); // Save the result. FI->setFramePointerSaveIndex(FPSI); @@ -1067,7 +1067,7 @@ if (needsFP(MF) || spillsCR(MF)) { const TargetRegisterClass *GPRC = &PPC::GPRCRegClass; const TargetRegisterClass *G8RC = &PPC::G8RCRegClass; - const TargetRegisterClass *RC = IsPPC64 ? G8RC : GPRC; + const TargetRegisterClass *RC = isPPC64 ? G8RC : GPRC; RS->setScavengingFrameIndex(MFI->CreateStackObject(RC->getSize(), RC->getAlignment(), false)); @@ -1297,7 +1297,7 @@ int NegFrameSize = -FrameSize; // Get processor type. 
- bool IsPPC64 = Subtarget.isPPC64(); + bool isPPC64 = Subtarget.isPPC64(); // Get operating system bool isDarwinABI = Subtarget.isDarwinABI(); // Check if the link register (LR) must be saved. @@ -1306,7 +1306,7 @@ // Do we have a frame pointer for this function? bool HasFP = hasFP(MF) && FrameSize; - int LROffset = PPCFrameInfo::getReturnSaveOffset(IsPPC64, isDarwinABI); + int LROffset = PPCFrameInfo::getReturnSaveOffset(isPPC64, isDarwinABI); int FPOffset = 0; if (HasFP) { @@ -1316,11 +1316,11 @@ assert(FPIndex && "No Frame Pointer Save Slot!"); FPOffset = FFI->getObjectOffset(FPIndex); } else { - FPOffset = PPCFrameInfo::getFramePointerSaveOffset(IsPPC64, isDarwinABI); + FPOffset = PPCFrameInfo::getFramePointerSaveOffset(isPPC64, isDarwinABI); } } - if (IsPPC64) { + if (isPPC64) { if (MustSaveLR) BuildMI(MBB, MBBI, dl, TII.get(PPC::MFLR8), PPC::X0); @@ -1361,7 +1361,7 @@ // Adjust stack pointer: r1 += NegFrameSize. // If there is a preferred stack alignment, align R1 now - if (!IsPPC64) { + if (!isPPC64) { // PPC32. if (ALIGN_STACK && MaxAlign > TargetAlign) { assert(isPowerOf2_32(MaxAlign)&&isInt16(MaxAlign)&&"Invalid alignment!"); @@ -1444,19 +1444,19 @@ MachineLocation SPSrc(MachineLocation::VirtualFP, NegFrameSize); Moves.push_back(MachineMove(FrameLabelId, SPDst, SPSrc)); } else { - MachineLocation SP(IsPPC64 ? PPC::X31 : PPC::R31); + MachineLocation SP(isPPC64 ? PPC::X31 : PPC::R31); Moves.push_back(MachineMove(FrameLabelId, SP, SP)); } if (HasFP) { MachineLocation FPDst(MachineLocation::VirtualFP, FPOffset); - MachineLocation FPSrc(IsPPC64 ? PPC::X31 : PPC::R31); + MachineLocation FPSrc(isPPC64 ? PPC::X31 : PPC::R31); Moves.push_back(MachineMove(FrameLabelId, FPDst, FPSrc)); } if (MustSaveLR) { MachineLocation LRDst(MachineLocation::VirtualFP, LROffset); - MachineLocation LRSrc(IsPPC64 ? PPC::LR8 : PPC::LR); + MachineLocation LRSrc(isPPC64 ? 
PPC::LR8 : PPC::LR); Moves.push_back(MachineMove(FrameLabelId, LRDst, LRSrc)); } } @@ -1465,7 +1465,7 @@ // If there is a frame pointer, copy R1 into R31 if (HasFP) { - if (!IsPPC64) { + if (!isPPC64) { BuildMI(MBB, MBBI, dl, TII.get(PPC::OR), PPC::R31) .addReg(PPC::R1) .addReg(PPC::R1); @@ -1481,8 +1481,8 @@ // Mark effective beginning of when frame pointer is ready. BuildMI(MBB, MBBI, dl, TII.get(PPC::DBG_LABEL)).addImm(ReadyLabelId); - MachineLocation FPDst(HasFP ? (IsPPC64 ? PPC::X31 : PPC::R31) : - (IsPPC64 ? PPC::X1 : PPC::R1)); + MachineLocation FPDst(HasFP ? (isPPC64 ? PPC::X31 : PPC::R31) : + (isPPC64 ? PPC::X1 : PPC::R1)); MachineLocation FPSrc(MachineLocation::VirtualFP); Moves.push_back(MachineMove(ReadyLabelId, FPDst, FPSrc)); } @@ -1528,7 +1528,7 @@ int FrameSize = MFI->getStackSize(); // Get processor type. - bool IsPPC64 = Subtarget.isPPC64(); + bool isPPC64 = Subtarget.isPPC64(); // Get operating system bool isDarwinABI = Subtarget.isDarwinABI(); // Check if the link register (LR) has been saved. @@ -1537,7 +1537,7 @@ // Do we have a frame pointer for this function? bool HasFP = hasFP(MF) && FrameSize; - int LROffset = PPCFrameInfo::getReturnSaveOffset(IsPPC64, isDarwinABI); + int LROffset = PPCFrameInfo::getReturnSaveOffset(isPPC64, isDarwinABI); int FPOffset = 0; if (HasFP) { @@ -1547,7 +1547,7 @@ assert(FPIndex && "No Frame Pointer Save Slot!"); FPOffset = FFI->getObjectOffset(FPIndex); } else { - FPOffset = PPCFrameInfo::getFramePointerSaveOffset(IsPPC64, isDarwinABI); + FPOffset = PPCFrameInfo::getFramePointerSaveOffset(isPPC64, isDarwinABI); } } @@ -1575,7 +1575,7 @@ if (FrameSize) { // The loaded (or persistent) stack pointer value is offset by the 'stwu' // on entry to the function. Add this offset back now. 
- if (!IsPPC64) { + if (!isPPC64) { // If this function contained a fastcc call and PerformTailCallOpt is // enabled (=> hasFastCall()==true) the fastcc call might contain a tail // call which invalidates the stack pointer value in SP(0). So we use the @@ -1629,7 +1629,7 @@ } } - if (IsPPC64) { + if (isPPC64) { if (MustSaveLR) BuildMI(MBB, MBBI, dl, TII.get(PPC::LD), PPC::X0) .addImm(LROffset/4).addReg(PPC::X1); @@ -1659,13 +1659,13 @@ MF.getFunction()->getCallingConv() == CallingConv::Fast) { PPCFunctionInfo *FI = MF.getInfo(); unsigned CallerAllocatedAmt = FI->getMinReservedArea(); - unsigned StackReg = IsPPC64 ? PPC::X1 : PPC::R1; - unsigned FPReg = IsPPC64 ? PPC::X31 : PPC::R31; - unsigned TmpReg = IsPPC64 ? PPC::X0 : PPC::R0; - unsigned ADDIInstr = IsPPC64 ? PPC::ADDI8 : PPC::ADDI; - unsigned ADDInstr = IsPPC64 ? PPC::ADD8 : PPC::ADD4; - unsigned LISInstr = IsPPC64 ? PPC::LIS8 : PPC::LIS; - unsigned ORIInstr = IsPPC64 ? PPC::ORI8 : PPC::ORI; + unsigned StackReg = isPPC64 ? PPC::X1 : PPC::R1; + unsigned FPReg = isPPC64 ? PPC::X31 : PPC::R31; + unsigned TmpReg = isPPC64 ? PPC::X0 : PPC::R0; + unsigned ADDIInstr = isPPC64 ? PPC::ADDI8 : PPC::ADDI; + unsigned ADDInstr = isPPC64 ? PPC::ADD8 : PPC::ADD4; + unsigned LISInstr = isPPC64 ? PPC::LIS8 : PPC::LIS; + unsigned ORIInstr = isPPC64 ? PPC::ORI8 : PPC::ORI; if (CallerAllocatedAmt && isInt16(CallerAllocatedAmt)) { BuildMI(MBB, MBBI, dl, TII.get(ADDIInstr), StackReg) From dpatel at apple.com Mon Nov 23 19:14:22 2009 From: dpatel at apple.com (Devang Patel) Date: Tue, 24 Nov 2009 01:14:22 -0000 Subject: [llvm-commits] [llvm] r89725 - in /llvm/trunk: include/llvm/Analysis/DebugInfo.h lib/Analysis/DebugInfo.cpp lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/AsmPrinter/DwarfDebug.h Message-ID: <200911240114.nAO1EN89018437@zion.cs.uiuc.edu> Author: dpatel Date: Mon Nov 23 19:14:22 2009 New Revision: 89725 URL: http://llvm.org/viewvc/llvm-project?rev=89725&view=rev Log: Emit pubtypes. 
Modified: llvm/trunk/include/llvm/Analysis/DebugInfo.h llvm/trunk/lib/Analysis/DebugInfo.cpp llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h Modified: llvm/trunk/include/llvm/Analysis/DebugInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/DebugInfo.h?rev=89725&r1=89724&r2=89725&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/DebugInfo.h (original) +++ llvm/trunk/include/llvm/Analysis/DebugInfo.h Mon Nov 23 19:14:22 2009 @@ -673,6 +673,12 @@ DebugLoc ExtractDebugLocation(DbgFuncStartInst &FSI, DebugLocTracker &DebugLocInfo); + /// getDISubprogram - Find subprogram that is enclosing this scope. + DISubprogram getDISubprogram(MDNode *Scope); + + /// getDICompositeType - Find underlying composite type. + DICompositeType getDICompositeType(DIType T); + class DebugInfoFinder { public: Modified: llvm/trunk/lib/Analysis/DebugInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/DebugInfo.cpp?rev=89725&r1=89724&r2=89725&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/DebugInfo.cpp (original) +++ llvm/trunk/lib/Analysis/DebugInfo.cpp Mon Nov 23 19:14:22 2009 @@ -1409,4 +1409,36 @@ return DebugLoc::get(Id); } + + /// getDISubprogram - Find subprogram that is enclosing this scope. + DISubprogram getDISubprogram(MDNode *Scope) { + DIDescriptor D(Scope); + if (D.isNull()) + return DISubprogram(); + + if (D.isCompileUnit()) + return DISubprogram(); + + if (D.isSubprogram()) + return DISubprogram(Scope); + + if (D.isLexicalBlock()) + return getDISubprogram(DILexicalBlock(Scope).getContext().getNode()); + + return DISubprogram(); + } + + /// getDICompositeType - Find underlying composite type. 
+  DICompositeType getDICompositeType(DIType T) {
+    if (T.isNull())
+      return DICompositeType();
+
+    if (T.isCompositeType())
+      return DICompositeType(T.getNode());
+
+    if (T.isDerivedType())
+      return getDICompositeType(DIDerivedType(T.getNode()).getTypeDerivedFrom());
+
+    return DICompositeType();
+  }
 }

Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=89725&r1=89724&r2=89725&view=diff

==============================================================================
--- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original)
+++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Mon Nov 23 19:14:22 2009
@@ -72,15 +72,20 @@
   ///
   StringMap<DIE*> Globals;
 
+  /// GlobalTypes - A map of globally visible types for this unit.
+  ///
+  StringMap<DIE*> GlobalTypes;
+
 public:
   CompileUnit(unsigned I, DIE *D)
     : ID(I), CUDie(D), IndexTyDie(0) {}
   ~CompileUnit() { delete CUDie; delete IndexTyDie; }
 
   // Accessors.
-  unsigned getID() const { return ID; }
-  DIE* getCUDie() const { return CUDie; }
-  StringMap<DIE*> &getGlobals() { return Globals; }
+  unsigned getID() const { return ID; }
+  DIE* getCUDie() const { return CUDie; }
+  const StringMap<DIE*> &getGlobals() const { return Globals; }
+  const StringMap<DIE*> &getGlobalTypes() const { return GlobalTypes; }
 
   /// hasContent - Return true if this compile unit has something to write out.
   ///
@@ -90,6 +95,12 @@
   ///
   void addGlobal(const std::string &Name, DIE *Die) { Globals[Name] = Die; }
 
+  /// addGlobalType - Add a new global type to the compile unit.
+  ///
+  void addGlobalType(const std::string &Name, DIE *Die) {
+    GlobalTypes[Name] = Die;
+  }
+
   /// getDIE - Returns the debug information entry map slot for the
   /// specified debug variable.
   DIE *getDIE(MDNode *N) { return GVToDieMap.lookup(N); }
@@ -1277,24 +1288,6 @@
   return AScope;
 }
 
-static DISubprogram getDISubprogram(MDNode *N) {
-
-  DIDescriptor D(N);
-  if (D.isNull())
-    return DISubprogram();
-
-  if (D.isCompileUnit())
-    return DISubprogram();
-
-  if (D.isSubprogram())
-    return DISubprogram(N);
-
-  if (D.isLexicalBlock())
-    return getDISubprogram(DILexicalBlock(N).getContext().getNode());
-
-  llvm_unreachable("Unexpected Descriptor!");
-}
-
 /// updateSubprogramScopeDIE - Find DIE for the given subprogram and
 /// attach appropriate DW_AT_low_pc and DW_AT_high_pc attributes.
 /// If there are global variables in this scope then create and insert
@@ -1325,6 +1318,7 @@
       SPDie->addChild(ScopedGVDie);
     }
   }
+
   return SPDie;
 }
 
@@ -1482,6 +1476,28 @@
 
 }
 
+void DwarfDebug::addPubTypes(DISubprogram SP) {
+  DICompositeType SPTy = SP.getType();
+  unsigned SPTag = SPTy.getTag();
+  if (SPTag != dwarf::DW_TAG_subroutine_type)
+    return;
+
+  DIArray Args = SPTy.getTypeArray();
+  if (Args.isNull())
+    return;
+
+  for (unsigned i = 0, e = Args.getNumElements(); i != e; ++i) {
+    DIType ATy(Args.getElement(i).getNode());
+    if (ATy.isNull())
+      continue;
+    DICompositeType CATy = getDICompositeType(ATy);
+    if (!CATy.isNull() && CATy.getName()) {
+      if (DIEEntry *Entry = ModuleCU->getDIEEntry(CATy.getNode()))
+        ModuleCU->addGlobalType(CATy.getName(), Entry->getEntry());
+    }
+  }
+}
+
 /// constructScopeDIE - Construct a DIE for this scope.
 DIE *DwarfDebug::constructScopeDIE(DbgScope *Scope) {
   if (!Scope)
@@ -1520,7 +1536,11 @@
     if (NestedDIE)
       ScopeDIE->addChild(NestedDIE);
   }
-  return ScopeDIE;
+
+  if (DS.isSubprogram())
+    addPubTypes(DISubprogram(DS.getNode()));
+
+  return ScopeDIE;
 }
 
 /// GetOrCreateSourceID - Look up the source id with the given directory and
@@ -1622,6 +1642,13 @@
 
   // Expose as global. FIXME - need to check external flag.
   ModuleCU->addGlobal(DI_GV.getName(), VariableDie);
+
+  DIType GTy = DI_GV.getType();
+  if (GTy.isCompositeType() && GTy.getName()) {
+    DIEEntry *Entry = ModuleCU->getDIEEntry(GTy.getNode());
+    assert (Entry && "Missing global type!");
+    ModuleCU->addGlobalType(GTy.getName(), Entry->getEntry());
+  }
   return;
 }
 
@@ -1647,6 +1674,7 @@
 
   // Expose as global.
   ModuleCU->addGlobal(SP.getName(), SubprogramDie);
+
   return;
 }
 
@@ -1778,6 +1806,9 @@
   // Emit info into a debug pubnames section.
   emitDebugPubNames();
 
+  // Emit info into a debug pubtypes section.
+  emitDebugPubTypes();
+
   // Emit info into a debug str section.
   emitDebugStr();
 
@@ -2233,6 +2264,8 @@
   EmitLabel("section_loc", 0);
   Asm->OutStreamer.SwitchSection(TLOF.getDwarfPubNamesSection());
   EmitLabel("section_pubnames", 0);
+  Asm->OutStreamer.SwitchSection(TLOF.getDwarfPubTypesSection());
+  EmitLabel("section_pubtypes", 0);
   Asm->OutStreamer.SwitchSection(TLOF.getDwarfStrSection());
   EmitLabel("section_str", 0);
   Asm->OutStreamer.SwitchSection(TLOF.getDwarfRangesSection());
@@ -2657,7 +2690,7 @@
                  true);
   Asm->EOL("Compilation Unit Length");
 
-  StringMap<DIE*> &Globals = Unit->getGlobals();
+  const StringMap<DIE*> &Globals = Unit->getGlobals();
   for (StringMap<DIE*>::const_iterator
          GI = Globals.begin(), GE = Globals.end(); GI != GE; ++GI) {
     const char *Name = GI->getKeyData();
@@ -2683,6 +2716,39 @@
   emitDebugPubNamesPerCU(ModuleCU);
 }
 
+void DwarfDebug::emitDebugPubTypes() {
+  EmitDifference("pubtypes_end", ModuleCU->getID(),
+                 "pubtypes_begin", ModuleCU->getID(), true);
+  Asm->EOL("Length of Public Types Info");
+
+  EmitLabel("pubtypes_begin", ModuleCU->getID());
+
+  Asm->EmitInt16(dwarf::DWARF_VERSION); Asm->EOL("DWARF Version");
+
+  EmitSectionOffset("info_begin", "section_info",
+                    ModuleCU->getID(), 0, true, false);
+  Asm->EOL("Offset of Compilation ModuleCU Info");
+
+  EmitDifference("info_end", ModuleCU->getID(), "info_begin", ModuleCU->getID(),
+                 true);
+  Asm->EOL("Compilation ModuleCU Length");
+
+  const StringMap<DIE*> &Globals = ModuleCU->getGlobalTypes();
+  for (StringMap<DIE*>::const_iterator
+         GI = Globals.begin(), GE = Globals.end(); GI != GE; ++GI) {
+    const char *Name = GI->getKeyData();
+    DIE * Entity = GI->second;
+
+    Asm->EmitInt32(Entity->getOffset()); Asm->EOL("DIE offset");
+    Asm->EmitString(Name, strlen(Name)); Asm->EOL("External Name");
+  }
+
+  Asm->EmitInt32(0); Asm->EOL("End Mark");
+  EmitLabel("pubtypes_end", ModuleCU->getID());
+
+  Asm->EOL();
+}
+
 /// emitDebugStr - Emit visible names into a debug str section.
 ///
 void DwarfDebug::emitDebugStr() {

Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h?rev=89725&r1=89724&r2=89725&view=diff

==============================================================================
--- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h (original)
+++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h Mon Nov 23 19:14:22 2009
@@ -310,6 +310,8 @@
   /// addType - Add a new type attribute to the specified entity.
   void addType(CompileUnit *DW_Unit, DIE *Entity, DIType Ty);
 
+  void addPubTypes(DISubprogram SP);
+
   /// constructTypeDIE - Construct basic type die from DIBasicType.
   void constructTypeDIE(CompileUnit *DW_Unit, DIE &Buffer,
                         DIBasicType BTy);
@@ -436,6 +438,10 @@
   ///
   void emitDebugPubNames();
 
+  /// emitDebugPubTypes - Emit visible types into a debug pubtypes section.
+  ///
+  void emitDebugPubTypes();
+
   /// emitDebugStr - Emit visible names into a debug str section.
   ///
   void emitDebugStr();

From gohman at apple.com Mon Nov 23 19:48:15 2009
From: gohman at apple.com (Dan Gohman)
Date: Tue, 24 Nov 2009 01:48:15 -0000
Subject: [llvm-commits] [llvm] r89729 - in /llvm/trunk/utils/TableGen: Record.cpp Record.h
Message-ID: <200911240148.nAO1mFue019599@zion.cs.uiuc.edu>

Author: djg
Date: Mon Nov 23 19:48:15 2009
New Revision: 89729

URL: http://llvm.org/viewvc/llvm-project?rev=89729&view=rev
Log:
Delete some dead and non-obvious code.

Modified:
    llvm/trunk/utils/TableGen/Record.cpp
    llvm/trunk/utils/TableGen/Record.h

Modified: llvm/trunk/utils/TableGen/Record.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/Record.cpp?rev=89729&r1=89728&r2=89729&view=diff

==============================================================================
--- llvm/trunk/utils/TableGen/Record.cpp (original)
+++ llvm/trunk/utils/TableGen/Record.cpp Mon Nov 23 19:48:15 2009
@@ -346,10 +346,6 @@
 }
 
 std::string BitsInit::getAsString() const {
-  //if (!printInHex(OS)) return;
-  //if (!printAsVariable(OS)) return;
-  //if (!printAsUnset(OS)) return;
-
   std::string Result = "{ ";
   for (unsigned i = 0, e = getNumBits(); i != e; ++i) {
     if (i) Result += ", ";
@@ -361,51 +357,6 @@
   return Result + " }";
 }
 
-bool BitsInit::printInHex(raw_ostream &OS) const {
-  // First, attempt to convert the value into an integer value...
-  int64_t Result = 0;
-  for (unsigned i = 0, e = getNumBits(); i != e; ++i)
-    if (BitInit *Bit = dynamic_cast<BitInit*>(getBit(i))) {
-      Result |= Bit->getValue() << i;
-    } else {
-      return true;
-    }
-
-  OS << format("0x%x", Result);
-  return false;
-}
-
-bool BitsInit::printAsVariable(raw_ostream &OS) const {
-  // Get the variable that we may be set equal to...
-  assert(getNumBits() != 0);
-  VarBitInit *FirstBit = dynamic_cast<VarBitInit*>(getBit(0));
-  if (FirstBit == 0) return true;
-  TypedInit *Var = FirstBit->getVariable();
-
-  // Check to make sure the types are compatible.
-  BitsRecTy *Ty = dynamic_cast<BitsRecTy*>(FirstBit->getVariable()->getType());
-  if (Ty == 0) return true;
-  if (Ty->getNumBits() != getNumBits()) return true; // Incompatible types!
-
-  // Check to make sure all bits are referring to the right bits in the variable
-  for (unsigned i = 0, e = getNumBits(); i != e; ++i) {
-    VarBitInit *Bit = dynamic_cast<VarBitInit*>(getBit(i));
-    if (Bit == 0 || Bit->getVariable() != Var || Bit->getBitNum() != i)
-      return true;
-  }
-
-  Var->print(OS);
-  return false;
-}
-
-bool BitsInit::printAsUnset(raw_ostream &OS) const {
-  for (unsigned i = 0, e = getNumBits(); i != e; ++i)
-    if (!dynamic_cast<UnsetInit*>(getBit(i)))
-      return true;
-  OS << "?";
-  return false;
-}
-
 // resolveReferences - If there are any field references that refer to fields
 // that have been filled in, we can propagate the values now.
 //

Modified: llvm/trunk/utils/TableGen/Record.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/Record.h?rev=89729&r1=89728&r2=89729&view=diff

==============================================================================
--- llvm/trunk/utils/TableGen/Record.h (original)
+++ llvm/trunk/utils/TableGen/Record.h Mon Nov 23 19:48:15 2009
@@ -611,12 +611,6 @@
   virtual std::string getAsString() const;
 
   virtual Init *resolveReferences(Record &R, const RecordVal *RV);
-
-  // printXX - Print this bitstream with the specified format, returning true if
-  // it is not possible.
-  bool printInHex(raw_ostream &OS) const;
-  bool printAsVariable(raw_ostream &OS) const;
-  bool printAsUnset(raw_ostream &OS) const;
 };

From evan.cheng at apple.com Mon Nov 23 20:04:23 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Mon, 23 Nov 2009 18:04:23 -0800
Subject: [llvm-commits] [llvm] r81422 - in /llvm/trunk/lib/Target/ARM: ARMCodeEmitter.cpp ARMJITInfo.cpp
In-Reply-To:
References: <200909100123.n8A1Nred001304@zion.cs.uiuc.edu>
Message-ID:

It doesn't. Looks like a copy+paste bug.

Evan

On Nov 23, 2009, at 4:53 PM, Jeffrey Yasskin wrote:

> Hi Evan. Why does ARM need to mark the indirect symbol executable?
> It's loading from that address not executing it.
>
> Is there a test that the change to ARMJITInfo.cpp fixes?
> > On Wed, Sep 9, 2009 at 5:23 PM, Evan Cheng wrote: >> Author: evancheng >> Date: Wed Sep 9 20:23:53 2009 >> New Revision: 81422 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=81422&view=rev >> Log: >> Proper support of non-lazy indirect symbols. >> >> Modified: >> llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp >> llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp >> >> Modified: llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp?rev=81422&r1=81421&r2=81422&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp (original) >> +++ llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp Wed Sep 9 20:23:53 2009 >> @@ -60,6 +60,7 @@ >> ARMJITInfo *JTI; >> const ARMInstrInfo *II; >> const TargetData *TD; >> + const ARMSubtarget *Subtarget; >> TargetMachine &TM; >> CodeEmitter &MCE; >> const std::vector *MCPEs; >> @@ -163,7 +164,7 @@ >> /// Routines that handle operands which add machine relocations which are >> /// fixed up by the relocation stage. 
>> void emitGlobalAddress(GlobalValue *GV, unsigned Reloc, >> - bool NeedStub, intptr_t ACPV = 0); >> + bool NeedStub, bool Indirect, intptr_t ACPV = 0); >> void emitExternalSymbolAddress(const char *ES, unsigned Reloc); >> void emitConstPoolAddress(unsigned CPI, unsigned Reloc); >> void emitJumpTableAddress(unsigned JTIndex, unsigned Reloc); >> @@ -195,9 +196,10 @@ >> assert((MF.getTarget().getRelocationModel() != Reloc::Default || >> MF.getTarget().getRelocationModel() != Reloc::Static) && >> "JIT relocation model must be set to static or default!"); >> + JTI = ((ARMTargetMachine&)MF.getTarget()).getJITInfo(); >> II = ((ARMTargetMachine&)MF.getTarget()).getInstrInfo(); >> TD = ((ARMTargetMachine&)MF.getTarget()).getTargetData(); >> - JTI = ((ARMTargetMachine&)MF.getTarget()).getJITInfo(); >> + Subtarget = &TM.getSubtarget(); >> MCPEs = &MF.getConstantPool()->getConstants(); >> MJTEs = &MF.getJumpTableInfo()->getJumpTables(); >> IsPIC = TM.getRelocationModel() == Reloc::PIC_; >> @@ -244,7 +246,7 @@ >> else if (MO.isImm()) >> return static_cast(MO.getImm()); >> else if (MO.isGlobal()) >> - emitGlobalAddress(MO.getGlobal(), ARM::reloc_arm_branch, true); >> + emitGlobalAddress(MO.getGlobal(), ARM::reloc_arm_branch, true, false); >> else if (MO.isSymbol()) >> emitExternalSymbolAddress(MO.getSymbolName(), ARM::reloc_arm_branch); >> else if (MO.isCPI()) { >> @@ -270,9 +272,14 @@ >> /// >> template >> void Emitter::emitGlobalAddress(GlobalValue *GV, unsigned Reloc, >> - bool NeedStub, intptr_t ACPV) { >> - MCE.addRelocation(MachineRelocation::getGV(MCE.getCurrentPCOffset(), Reloc, >> - GV, ACPV, NeedStub)); >> + bool NeedStub, bool Indirect, >> + intptr_t ACPV) { >> + MachineRelocation MR = Indirect >> + ? 
MachineRelocation::getIndirectSymbol(MCE.getCurrentPCOffset(), Reloc, >> + GV, ACPV, NeedStub) >> + : MachineRelocation::getGV(MCE.getCurrentPCOffset(), Reloc, >> + GV, ACPV, NeedStub); >> + MCE.addRelocation(MR); >> } >> >> /// emitExternalSymbolAddress - Arrange for the address of an external symbol to >> @@ -417,8 +424,11 @@ >> >> GlobalValue *GV = ACPV->getGV(); >> if (GV) { >> + Reloc::Model RelocM = TM.getRelocationModel(); >> emitGlobalAddress(GV, ARM::reloc_arm_machine_cp_entry, >> - isa(GV), (intptr_t)ACPV); >> + isa(GV), >> + Subtarget->GVIsIndirectSymbol(GV, RelocM), >> + (intptr_t)ACPV); >> } else { >> emitExternalSymbolAddress(ACPV->getSymbol(), ARM::reloc_arm_absolute); >> } >> @@ -437,7 +447,7 @@ >> }); >> >> if (GlobalValue *GV = dyn_cast(CV)) { >> - emitGlobalAddress(GV, ARM::reloc_arm_absolute, isa(GV)); >> + emitGlobalAddress(GV, ARM::reloc_arm_absolute, isa(GV), false); >> emitWordLE(0); >> } else if (const ConstantInt *CI = dyn_cast(CV)) { >> uint32_t Val = *(uint32_t*)CI->getValue().getRawData(); >> >> Modified: llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp?rev=81422&r1=81421&r2=81422&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp (original) >> +++ llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp Wed Sep 9 20:23:53 2009 >> @@ -145,6 +145,9 @@ >> llvm_unreachable("ERROR: Unable to mark indirect symbol writable"); >> } >> JCE.emitWordLE((intptr_t)Ptr); >> + if (!sys::Memory::setRangeExecutable((void*)Addr, 4)) { >> + llvm_unreachable("ERROR: Unable to mark indirect symbol executable"); >> + } >> void *PtrAddr = JCE.finishGVStub(GV); >> addIndirectSymAddr(Ptr, (intptr_t)PtrAddr); >> return PtrAddr; >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> From 
evan.cheng at apple.com Mon Nov 23 20:07:11 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Mon, 23 Nov 2009 18:07:11 -0800
Subject: [llvm-commits] [llvm] r81422 - in /llvm/trunk/lib/Target/ARM: ARMCodeEmitter.cpp ARMJITInfo.cpp
In-Reply-To:
References: <200909100123.n8A1Nred001304@zion.cs.uiuc.edu>
Message-ID:

Ah actually, it looks like we are not distinguishing data indirect symbol from function stub (which is bad). In emitFunctionStub(), I see:

  if (!LazyPtr) {
    // In PIC mode, the function stub is loading a lazy-ptr.
    LazyPtr= (intptr_t)emitGlobalValueIndirectSym((GlobalValue*)F, Fn, JCE);
    DEBUG(if (F)
            errs() << "JIT: Indirect symbol emitted at [" << LazyPtr
                   << "] for GV '" << F->getName() << "'\n";
          else
            errs() << "JIT: Stub emitted at [" << LazyPtr
                   << "] for external function at '" << Fn << "'\n");
  }

The function stub does need to be marked executable.

Evan

On Nov 23, 2009, at 6:04 PM, Evan Cheng wrote:

> It doesn't. Looks like a copy+paste bug.
>
> Evan
>
> On Nov 23, 2009, at 4:53 PM, Jeffrey Yasskin wrote:
>
>> Hi Evan. Why does ARM need to mark the indirect symbol executable?
>> It's loading from that address not executing it.
>>
>> Is there a test that the change to ARMJITInfo.cpp fixes?
>>
>> On Wed, Sep 9, 2009 at 5:23 PM, Evan Cheng wrote:
>>> Author: evancheng
>>> Date: Wed Sep 9 20:23:53 2009
>>> New Revision: 81422
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=81422&view=rev
>>> Log:
>>> Proper support of non-lazy indirect symbols.
>>>
>>> Modified:
>>> llvm/trunk/lib/Target/ARM/ARMCodeEmitter.cpp
>>> llvm/trunk/lib/Target/ARM/ARMJITInfo.cpp

From jyasskin at google.com Mon Nov 23 20:11:14 2009
From: jyasskin at google.com (Jeffrey Yasskin)
Date: Tue, 24 Nov 2009 02:11:14 -0000
Subject: [llvm-commits] [llvm] r89733 - /llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp
Message-ID: <200911240211.nAO2BExH020516@zion.cs.uiuc.edu>

Author: jyasskin
Date: Mon Nov 23 20:11:14 2009
New Revision: 89733

URL: http://llvm.org/viewvc/llvm-project?rev=89733&view=rev
Log:
Oops. Re-disable JITTest.NoStubs on ARM and PPC since they still use
stubs to make far calls work.

Modified:
    llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp

Modified: llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp?rev=89733&r1=89732&r2=89733&view=diff

==============================================================================
--- llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp (original)
+++ llvm/trunk/unittests/ExecutionEngine/JIT/JITTest.cpp Mon Nov 23 20:11:14 2009
@@ -457,6 +457,10 @@
             NumTablesDeallocated);
 }
 
+// ARM and PPC still emit stubs for calls since the target may be too far away
+// to call directly. This #if can probably be removed when
+// http://llvm.org/PR5201 is fixed.
+#if !defined(__arm__) && !defined(__powerpc__) && !defined(__ppc__)
 typedef int (*FooPtr) ();
 
 TEST_F(JITTest, NoStubs) {
@@ -494,6 +498,7 @@
 
   ASSERT_EQ(stubsBefore, RJMM->stubsAllocated);
 }
+#endif // !ARM && !PPC
 
 TEST_F(JITTest, FunctionPointersOutliveTheirCreator) {
   TheJIT->DisableLazyCompilation(true);

From evan.cheng at apple.com Mon Nov 23 20:47:28 2009
From: evan.cheng at apple.com (Evan Cheng)
Date: Mon, 23 Nov 2009 18:47:28 -0800
Subject: [llvm-commits] [llvm] r89720 - in /llvm/trunk: lib/Target/ARM/ARMBaseInstrInfo.h lib/Target/ARM/ARMExpandPseudoInsts.cpp lib/Target/ARM/ARMISelDAGToDAG.cpp lib/Target/ARM/ARMISelLowering.cpp lib/Target/ARM/ARMInstrInfo.td lib/Target/ARM/ARMInstrThumb2.td lib/Target/ARM/ARMSubtarget.cpp lib/Target/ARM/ARMSubtarget.h lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp lib/Target/ARM/Thumb2SizeReduction.cpp test/CodeGen/ARM/movt-movw-global.ll
In-Reply-To: <200911240044.nAO0ic27017388@zion.cs.uiuc.edu>
References: <200911240044.nAO0ic27017388@zion.cs.uiuc.edu>
Message-ID: <019CB8B6-C967-4560-B4C7-BDE0F446C026@apple.com>

Hi Anton,

Can you remove the "UseMovt" subtarget feature and just check for V6T2?
Evan On Nov 23, 2009, at 4:44 PM, Anton Korobeynikov wrote: > Author: asl > Date: Mon Nov 23 18:44:37 2009 > New Revision: 89720 > > URL: http://llvm.org/viewvc/llvm-project?rev=89720&view=rev > Log: > Materialize global addresses via movt/movw pair, this is always better > than doing the same via constpool: > 1. Load from constpool costs 3 cycles on A9, movt/movw pair - just 2. > 2. Load from constpool might stall up to 300 cycles due to cache miss. > 3. Movt/movw does not use load/store unit. > 4. Less constpool entries => better compiler performance. > > This is only enabled on ELF systems, since darwin does not have needed > relocations (yet). > > Added: > llvm/trunk/test/CodeGen/ARM/movt-movw-global.ll > Modified: > llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h > llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp > llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp > llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp > llvm/trunk/lib/Target/ARM/ARMInstrInfo.td > llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td > llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp > llvm/trunk/lib/Target/ARM/ARMSubtarget.h > llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp > llvm/trunk/lib/Target/ARM/Thumb2SizeReduction.cpp > > Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h?rev=89720&r1=89719&r2=89720&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h (original) > +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h Mon Nov 23 18:44:37 2009 > @@ -162,6 +162,22 @@ > I_BitShift = 25, > CondShift = 28 > }; > + > + /// Target Operand Flag enum. > + enum TOF { > + //===------------------------------------------------------------------===// > + // ARM Specific MachineOperand flags. > + > + MO_NO_FLAG, > + > + /// MO_LO16 - On a symbol operand, this represents a relocation containing > + /// lower 16 bit of the address. 
Used only via movw instruction. > + MO_LO16, > + > + /// MO_HI16 - On a symbol operand, this represents a relocation containing > + /// higher 16 bit of the address. Used only via movt instruction. > + MO_HI16 > + }; > } > > class ARMBaseInstrInfo : public TargetInstrInfoImpl { > > Modified: llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp?rev=89720&r1=89719&r2=89720&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp (original) > +++ llvm/trunk/lib/Target/ARM/ARMExpandPseudoInsts.cpp Mon Nov 23 18:44:37 2009 > @@ -75,17 +75,30 @@ > } > case ARM::t2MOVi32imm: { > unsigned DstReg = MI.getOperand(0).getReg(); > - unsigned Imm = MI.getOperand(1).getImm(); > - unsigned Lo16 = Imm & 0xffff; > - unsigned Hi16 = (Imm >> 16) & 0xffff; > if (!MI.getOperand(0).isDead()) { > - AddDefaultPred(BuildMI(MBB, MBBI, MI.getDebugLoc(), > - TII->get(ARM::t2MOVi16), DstReg) > - .addImm(Lo16)); > - AddDefaultPred(BuildMI(MBB, MBBI, MI.getDebugLoc(), > - TII->get(ARM::t2MOVTi16)) > - .addReg(DstReg, getDefRegState(true)) > - .addReg(DstReg).addImm(Hi16)); > + const MachineOperand &MO = MI.getOperand(1); > + MachineInstrBuilder LO16, HI16; > + > + LO16 = BuildMI(MBB, MBBI, MI.getDebugLoc(), TII->get(ARM::t2MOVi16), > + DstReg); > + HI16 = BuildMI(MBB, MBBI, MI.getDebugLoc(), TII->get(ARM::t2MOVTi16)) > + .addReg(DstReg, getDefRegState(true)).addReg(DstReg); > + > + if (MO.isImm()) { > + unsigned Imm = MO.getImm(); > + unsigned Lo16 = Imm & 0xffff; > + unsigned Hi16 = (Imm >> 16) & 0xffff; > + LO16 = LO16.addImm(Lo16); > + HI16 = HI16.addImm(Hi16); > + } else { > + GlobalValue *GV = MO.getGlobal(); > + unsigned TF = MO.getTargetFlags(); > + LO16 = LO16.addGlobalAddress(GV, MO.getOffset(), TF | ARMII::MO_LO16); > + HI16 = HI16.addGlobalAddress(GV, MO.getOffset(), TF | ARMII::MO_HI16); > + // FIXME: 
What's about memoperands? > + } > + AddDefaultPred(LO16); > + AddDefaultPred(HI16); > } > MI.eraseFromParent(); > Modified = true; > > Modified: llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp?rev=89720&r1=89719&r2=89720&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp (original) > +++ llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp Mon Nov 23 18:44:37 2009 > @@ -261,7 +261,9 @@ > if (N.getOpcode() == ISD::FrameIndex) { > int FI = cast(N)->getIndex(); > Base = CurDAG->getTargetFrameIndex(FI, TLI.getPointerTy()); > - } else if (N.getOpcode() == ARMISD::Wrapper) { > + } else if (N.getOpcode() == ARMISD::Wrapper && > + !(Subtarget->useMovt() && > + N.getOperand(0).getOpcode() == ISD::TargetGlobalAddress)) { > Base = N.getOperand(0); > } > Offset = CurDAG->getRegister(0, MVT::i32); > @@ -463,7 +465,9 @@ > if (N.getOpcode() == ISD::FrameIndex) { > int FI = cast(N)->getIndex(); > Base = CurDAG->getTargetFrameIndex(FI, TLI.getPointerTy()); > - } else if (N.getOpcode() == ARMISD::Wrapper) { > + } else if (N.getOpcode() == ARMISD::Wrapper && > + !(Subtarget->useMovt() && > + N.getOperand(0).getOpcode() == ISD::TargetGlobalAddress)) { > Base = N.getOperand(0); > } > Offset = CurDAG->getTargetConstant(ARM_AM::getAM5Opc(ARM_AM::add, 0), > @@ -558,7 +562,13 @@ > } > > if (N.getOpcode() != ISD::ADD) { > - Base = (N.getOpcode() == ARMISD::Wrapper) ? 
N.getOperand(0) : N; > + if (N.getOpcode() == ARMISD::Wrapper && > + !(Subtarget->useMovt() && > + N.getOperand(0).getOpcode() == ISD::TargetGlobalAddress)) { > + Base = N.getOperand(0); > + } else > + Base = N; > + > Offset = CurDAG->getRegister(0, MVT::i32); > OffImm = CurDAG->getTargetConstant(0, MVT::i32); > return true; > @@ -681,7 +691,9 @@ > Base = CurDAG->getTargetFrameIndex(FI, TLI.getPointerTy()); > OffImm = CurDAG->getTargetConstant(0, MVT::i32); > return true; > - } else if (N.getOpcode() == ARMISD::Wrapper) { > + } else if (N.getOpcode() == ARMISD::Wrapper && > + !(Subtarget->useMovt() && > + N.getOperand(0).getOpcode() == ISD::TargetGlobalAddress)) { > Base = N.getOperand(0); > if (Base.getOpcode() == ISD::TargetConstantPool) > return false; // We want to select t2LDRpci instead. > > Modified: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp?rev=89720&r1=89719&r2=89720&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp (original) > +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp Mon Nov 23 18:44:37 2009 > @@ -39,6 +39,7 @@ > #include "llvm/CodeGen/SelectionDAG.h" > #include "llvm/Target/TargetOptions.h" > #include "llvm/ADT/VectorExtras.h" > +#include "llvm/Support/CommandLine.h" > #include "llvm/Support/ErrorHandling.h" > #include "llvm/Support/MathExtras.h" > #include > @@ -1356,10 +1357,17 @@ > PseudoSourceValue::getGOT(), 0); > return Result; > } else { > - SDValue CPAddr = DAG.getTargetConstantPool(GV, PtrVT, 4); > - CPAddr = DAG.getNode(ARMISD::Wrapper, dl, MVT::i32, CPAddr); > - return DAG.getLoad(PtrVT, dl, DAG.getEntryNode(), CPAddr, > - PseudoSourceValue::getConstantPool(), 0); > + // If we have T2 ops, we can materialize the address directly via movt/movw > + // pair. This is always cheaper. 
> + if (Subtarget->useMovt()) { > + return DAG.getNode(ARMISD::Wrapper, dl, PtrVT, > + DAG.getTargetGlobalAddress(GV, PtrVT)); > + } else { > + SDValue CPAddr = DAG.getTargetConstantPool(GV, PtrVT, 4); > + CPAddr = DAG.getNode(ARMISD::Wrapper, dl, MVT::i32, CPAddr); > + return DAG.getLoad(PtrVT, dl, DAG.getEntryNode(), CPAddr, > + PseudoSourceValue::getConstantPool(), 0); > + } > } > } > > > Modified: llvm/trunk/lib/Target/ARM/ARMInstrInfo.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrInfo.td?rev=89720&r1=89719&r2=89720&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMInstrInfo.td (original) > +++ llvm/trunk/lib/Target/ARM/ARMInstrInfo.td Mon Nov 23 18:44:37 2009 > @@ -116,6 +116,10 @@ > def CarryDefIsUnused : Predicate<"!N.getNode()->hasAnyUseOfValue(1)">; > def CarryDefIsUsed : Predicate<"N.getNode()->hasAnyUseOfValue(1)">; > > +// FIXME: Eventually this will be just "hasV6T2Ops". > +def UseMovt : Predicate<"Subtarget->useMovt()">; > +def DontUseMovt : Predicate<"!Subtarget->useMovt()">; > + > //===----------------------------------------------------------------------===// > // ARM Flag Definitions. > > @@ -204,7 +208,7 @@ > def lo16AllZero : PatLeaf<(i32 imm), [{ > // Returns true if all low 16-bits are 0. > return (((uint32_t)N->getZExtValue()) & 0xFFFFUL) == 0; > - }], hi16>; > +}], hi16>; > > /// imm0_65535 predicate - True if the 32-bit immediate is in the range > /// [0.65535]. 
> @@ -1002,7 +1006,7 @@ > let Constraints = "$src = $dst" in > def MOVTi16 : AI1<0b1010, (outs GPR:$dst), (ins GPR:$src, i32imm:$imm), > DPFrm, IIC_iMOVi, > - "movt", "\t$dst, $imm", > + "movt", "\t$dst, $imm", > [(set GPR:$dst, > (or (and GPR:$src, 0xffff), > lo16AllZero:$imm))]>, UnaryDP, > @@ -1603,12 +1607,6 @@ > // Non-Instruction Patterns > // > > -// ConstantPool, GlobalAddress, and JumpTable > -def : ARMPat<(ARMWrapper tglobaladdr :$dst), (LEApcrel tglobaladdr :$dst)>; > -def : ARMPat<(ARMWrapper tconstpool :$dst), (LEApcrel tconstpool :$dst)>; > -def : ARMPat<(ARMWrapperJT tjumptable:$dst, imm:$id), > - (LEApcrelJT tjumptable:$dst, imm:$id)>; > - > // Large immediate handling. > > // Two piece so_imms. > @@ -1638,10 +1636,19 @@ > // FIXME: Remove this when we can do generalized remat. > let isReMaterializable = 1 in > def MOVi32imm : AI1x2<(outs GPR:$dst), (ins i32imm:$src), Pseudo, IIC_iMOVi, > - "movw", "\t$dst, ${src:lo16}\n\tmovt${p} $dst, ${src:hi16}", > + "movw", "\t$dst, ${src:lo16}\n\tmovt${p}\t$dst, ${src:hi16}", > [(set GPR:$dst, (i32 imm:$src))]>, > Requires<[IsARM, HasV6T2]>; > > +// ConstantPool, GlobalAddress, and JumpTable > +def : ARMPat<(ARMWrapper tglobaladdr :$dst), (LEApcrel tglobaladdr :$dst)>, > + Requires<[IsARM, DontUseMovt]>; > +def : ARMPat<(ARMWrapper tconstpool :$dst), (LEApcrel tconstpool :$dst)>; > +def : ARMPat<(ARMWrapper tglobaladdr :$dst), (MOVi32imm tglobaladdr :$dst)>, > + Requires<[IsARM, UseMovt]>; > +def : ARMPat<(ARMWrapperJT tjumptable:$dst, imm:$id), > + (LEApcrelJT tjumptable:$dst, imm:$id)>; > + > // TODO: add,sub,and, 3-instr forms? 
> > > > Modified: llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td?rev=89720&r1=89719&r2=89720&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td (original) > +++ llvm/trunk/lib/Target/ARM/ARMInstrThumb2.td Mon Nov 23 18:44:37 2009 > @@ -1181,12 +1181,6 @@ > (t2SUBri (t2SUBri GPR:$LHS, (t2_so_neg_imm2part_1 imm:$RHS)), > (t2_so_neg_imm2part_2 imm:$RHS))>; > > -// ConstantPool, GlobalAddress, and JumpTable > -def : T2Pat<(ARMWrapper tglobaladdr :$dst), (t2LEApcrel tglobaladdr :$dst)>; > -def : T2Pat<(ARMWrapper tconstpool :$dst), (t2LEApcrel tconstpool :$dst)>; > -def : T2Pat<(ARMWrapperJT tjumptable:$dst, imm:$id), > - (t2LEApcrelJT tjumptable:$dst, imm:$id)>; > - > // 32-bit immediate using movw + movt. > // This is a single pseudo instruction to make it re-materializable. Remove > // when we can do generalized remat. > @@ -1195,6 +1189,16 @@ > "movw", "\t$dst, ${src:lo16}\n\tmovt${p}\t$dst, ${src:hi16}", > [(set GPR:$dst, (i32 imm:$src))]>; > > +// ConstantPool, GlobalAddress, and JumpTable > +def : T2Pat<(ARMWrapper tglobaladdr :$dst), (t2LEApcrel tglobaladdr :$dst)>, > + Requires<[IsThumb2, DontUseMovt]>; > +def : T2Pat<(ARMWrapper tconstpool :$dst), (t2LEApcrel tconstpool :$dst)>; > +def : T2Pat<(ARMWrapper tglobaladdr :$dst), (t2MOVi32imm tglobaladdr :$dst)>, > + Requires<[IsThumb2, UseMovt]>; > + > +def : T2Pat<(ARMWrapperJT tjumptable:$dst, imm:$id), > + (t2LEApcrelJT tjumptable:$dst, imm:$id)>; > + > // Pseudo instruction that combines ldr from constpool and add pc. This should > // be expanded into two instructions late to allow if-conversion and > // scheduling. 
> > Modified: llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp?rev=89720&r1=89719&r2=89720&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp (original) > +++ llvm/trunk/lib/Target/ARM/ARMSubtarget.cpp Mon Nov 23 18:44:37 2009 > @@ -27,6 +27,10 @@ > cl::desc("Use NEON for single-precision FP"), > cl::init(false), cl::Hidden); > > +static cl::opt > +UseMOVT("arm-use-movt", > + cl::init(true), cl::Hidden); > + > ARMSubtarget::ARMSubtarget(const std::string &TT, const std::string &FS, > bool isT) > : ARMArchVersion(V4T) > @@ -36,6 +40,7 @@ > , ThumbMode(Thumb1) > , PostRAScheduler(false) > , IsR9Reserved(ReserveR9) > + , UseMovt(UseMOVT) > , stackAlignment(4) > , CPUString("generic") > , TargetType(isELF) // Default to ELF unless otherwise specified. > > Modified: llvm/trunk/lib/Target/ARM/ARMSubtarget.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMSubtarget.h?rev=89720&r1=89719&r2=89720&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMSubtarget.h (original) > +++ llvm/trunk/lib/Target/ARM/ARMSubtarget.h Mon Nov 23 18:44:37 2009 > @@ -65,6 +65,10 @@ > /// IsR9Reserved - True if R9 is a not available as general purpose register. > bool IsR9Reserved; > > + /// UseMovt - True if MOVT / MOVW pairs are used for materialization of 32-bit > + /// imms (including global addresses). > + bool UseMovt; > + > /// stackAlignment - The minimum alignment known to hold of the stack frame on > /// entry to the function and which must be maintained by every function. 
> unsigned stackAlignment; > @@ -130,8 +134,10 @@ > > bool isR9Reserved() const { return IsR9Reserved; } > > + bool useMovt() const { return UseMovt && hasV6T2Ops(); } > + > const std::string & getCPUString() const { return CPUString; } > - > + > /// enablePostRAScheduler - True at 'More' optimization. > bool enablePostRAScheduler(CodeGenOpt::Level OptLevel, > TargetSubtarget::AntiDepBreakMode& Mode, > > Modified: llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp?rev=89720&r1=89719&r2=89720&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp (original) > +++ llvm/trunk/lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp Mon Nov 23 18:44:37 2009 > @@ -330,6 +330,8 @@ > void ARMAsmPrinter::printOperand(const MachineInstr *MI, int OpNum, > const char *Modifier) { > const MachineOperand &MO = MI->getOperand(OpNum); > + unsigned TF = MO.getTargetFlags(); > + > switch (MO.getType()) { > default: > assert(0 && ""); > @@ -356,12 +358,12 @@ > case MachineOperand::MO_Immediate: { > int64_t Imm = MO.getImm(); > O << '#'; > - if (Modifier) { > - if (strcmp(Modifier, "lo16") == 0) > - O << ":lower16:"; > - else if (strcmp(Modifier, "hi16") == 0) > - O << ":upper16:"; > - } > + if ((Modifier && strcmp(Modifier, "lo16") == 0) || > + (TF & ARMII::MO_LO16)) > + O << ":lower16:"; > + else if ((Modifier && strcmp(Modifier, "hi16") == 0) || > + (TF & ARMII::MO_HI16)) > + O << ":upper16:"; > O << Imm; > break; > } > @@ -371,6 +373,13 @@ > case MachineOperand::MO_GlobalAddress: { > bool isCallOp = Modifier && !strcmp(Modifier, "call"); > GlobalValue *GV = MO.getGlobal(); > + > + if ((Modifier && strcmp(Modifier, "lo16") == 0) || > + (TF & ARMII::MO_LO16)) > + O << ":lower16:"; > + else if ((Modifier && strcmp(Modifier, "hi16") == 0) || > + (TF & ARMII::MO_HI16)) > + O << ":upper16:"; > O << 
Mang->getMangledName(GV); > > printOffset(MO.getOffset()); > > Modified: llvm/trunk/lib/Target/ARM/Thumb2SizeReduction.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/Thumb2SizeReduction.cpp?rev=89720&r1=89719&r2=89720&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/Thumb2SizeReduction.cpp (original) > +++ llvm/trunk/lib/Target/ARM/Thumb2SizeReduction.cpp Mon Nov 23 18:44:37 2009 > @@ -78,7 +78,7 @@ > { ARM::t2LSRri, ARM::tLSRri, 0, 5, 0, 1, 0, 0,0, 0 }, > { ARM::t2LSRrr, 0, ARM::tLSRrr, 0, 0, 0, 1, 0,0, 0 }, > { ARM::t2MOVi, ARM::tMOVi8, 0, 8, 0, 1, 0, 0,0, 0 }, > - { ARM::t2MOVi16,ARM::tMOVi8, 0, 8, 0, 1, 0, 0,0, 0 }, > + { ARM::t2MOVi16,ARM::tMOVi8, 0, 8, 0, 1, 0, 0,0, 1 }, > // FIXME: Do we need the 16-bit 'S' variant? > { ARM::t2MOVr,ARM::tMOVgpr2gpr,0, 0, 0, 0, 0, 1,0, 0 }, > { ARM::t2MOVCCr,0, ARM::tMOVCCr, 0, 0, 0, 0, 0,1, 0 }, > @@ -413,6 +413,12 @@ > if (MI->getOperand(2).getImm() == 0) > return ReduceToNarrow(MBB, MI, Entry, LiveCPSR); > break; > + case ARM::t2MOVi16: > + // Can convert only 'pure' immediate operands, not immediates obtained as > + // globals' addresses. 
> + if (MI->getOperand(1).isImm()) > + return ReduceToNarrow(MBB, MI, Entry, LiveCPSR); > + break; > } > return false; > } > > Added: llvm/trunk/test/CodeGen/ARM/movt-movw-global.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/ARM/movt-movw-global.ll?rev=89720&view=auto > > ============================================================================== > --- llvm/trunk/test/CodeGen/ARM/movt-movw-global.ll (added) > +++ llvm/trunk/test/CodeGen/ARM/movt-movw-global.ll Mon Nov 23 18:44:37 2009 > @@ -0,0 +1,20 @@ > +; RUN: llc < %s | FileCheck %s > +target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64" > +target triple = "armv7-eabi" > + > + at foo = common global i32 0 ; [#uses=1] > + > +define arm_aapcs_vfpcc i32* @bar1() nounwind readnone { > +entry: > +; CHECK: movw r0, :lower16:foo > +; CHECK-NEXT: movt r0, :upper16:foo > + ret i32* @foo > +} > + > +define arm_aapcs_vfpcc void @bar2(i32 %baz) nounwind { > +entry: > +; CHECK: movw r1, :lower16:foo > +; CHECK-NEXT: movt r1, :upper16:foo > + store i32 %baz, i32* @foo, align 4 > + ret void > +} > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From evan.cheng at apple.com Mon Nov 23 21:04:44 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 23 Nov 2009 19:04:44 -0800 Subject: [llvm-commits] [PATCH] More Spill Annotations In-Reply-To: <200911231411.29627.dag@cray.com> References: <200911201622.37994.dag@cray.com> <200911230901.47371.dag@cray.com> <9895559A-187A-447E-945A-FEFEA37EFF43@apple.com> <200911231411.29627.dag@cray.com> Message-ID: On Nov 23, 2009, at 12:11 PM, David Greene wrote: > On Monday 23 November 2009 13:39, Evan Cheng wrote: >> David, this really is not a good idea. You are adding more target hooks >> purely for asm printing comments. 
These hooks return instruction properties
>> that should be static so they should be moved to td files.
>
> Ok, that's reasonable. There are a number of target hooks that should
> be processed by tblgen, then. For example, isMoveInstr, GetCondBranchFromCond
> and sizeOfImm.
>
> To do isVectorInstr and isVectorOperandInstr will require some additional
> flags in the .td files, I think. Is that ok? I don't want to do a whole
> bunch of work to find out later that there's a better way.

What kind of flags? These are fairly target specific information so I don't think we want to add anything target independent. Can you enhance asm printer so targets can inject target specific comments?

Evan

>
> -Dave

From vkutuzov at accesssoftek.com Mon Nov 23 21:38:31 2009
From: vkutuzov at accesssoftek.com (Viktor Kutuzov)
Date: Mon, 23 Nov 2009 19:38:31 -0800
Subject: [llvm-commits] [PATCH] LTO code generator options
References: <04F6B1512E264B27AEE607542FCDD113@andreic6e7fe55> <38a0d8450911170809j6a32716ar840b8622cfad6f17@mail.gmail.com> <6AE1604EE3EC5F4296C096518C6B77EEFD4607C4@mail.accesssoftek.com> <38a0d8450911180722j5a463fa8hec81178154deaf09@mail.gmail.com> <41BA1AA405BC4D19BA9B4FAB6543D62F@andreic6e7fe55> <38a0d8450911190723g644ad4c7ife769ab35da9efb9@mail.gmail.com> <38a0d8450911200722i5efa690ci6ab671d71b5f40dc@mail.gmail.com> <38a0d8450911231309t6f37e2a0ga7c9eaa50d495c60@mail.gmail.com>
Message-ID: <2E9E5BD4D91C4B32850CD83726EFE19B@andreic6e7fe55>

The updated patch is attached. It reflects the changes Chris and Daniel have requested.

Best regards,
Viktor

----- Original Message -----
From: "Rafael Espindola"
To: "Viktor Kutuzov"
Cc: "Commit Messages and Patches for LLVM"
Sent: Monday, November 23, 2009 1:09 PM
Subject: Re: [llvm-commits] [PATCH] LTO code generator options

>> Thanks a lot for reviewing the patch.
>> It is committed as http://llvm.org/viewvc/llvm-project?rev=89516&view=rev
>>
>> Now everything is ready for the target triple overriding.
>
> I should be able to take a look at it tomorrow, but it would help if
> you could first implement Daniel's and Chris' comments.
>
>>
>> Best regards,
>> Viktor
>
> Cheers,
> --
> Rafael Ávila de Espíndola
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-lto-codegen-target_triple_override.diff
Type: application/octet-stream
Size: 9693 bytes
Desc: not available
Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091123/e5a72773/attachment.obj

From nicholas at mxc.ca Mon Nov 23 23:35:38 2009
From: nicholas at mxc.ca (Nick Lewycky)
Date: Mon, 23 Nov 2009 21:35:38 -0800
Subject: [llvm-commits] [llvm] r88910 - /llvm/trunk/lib/VMCore/Core.cpp
In-Reply-To: <4B0A676B.6010104@free.fr>
References: <200911161315.nAGDFTMi017583@zion.cs.uiuc.edu> <4B09ADBC.3010807@mxc.ca> <4B0A676B.6010104@free.fr>
Message-ID: <4B0B70AA.9090403@mxc.ca>

Duncan Sands wrote:
> Hi Nick,
>
>> Also, this is the C API, so you can't fix this by changing the
>> signature on the C function, if it's ever been through a release.
>
> why not?

It started with libLTO. We wrote C bindings even though both LLVM and the linker were written in C++ specifically to allow a mix and match of versions. At that time, Chris declared that the C API would be immutable. When we got more C API, we kept the same rule.

I don't know if any real-world user is relying on that API yet, but the theory is that they could download LLVM 2.6, write using the C API, and we promise it will work for the 2.x series (at least).
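
As a rough illustration of that rule (a sketch only; `demo_count_items` and `demo_count_items_ex` are hypothetical names and are not part of the LLVM C API): once a C entry point has shipped, its signature is left alone, and a changed interface is exposed through a new symbol that the old one forwards to.

```cpp
#include <cstddef>

// Hypothetical C-style API, for illustration only; none of these names
// exist in LLVM.
extern "C" {

// Shipped in an earlier release: this signature must never change.
unsigned demo_count_items(const int *items, unsigned n);

// Added later: a wider size/result type gets a *new* symbol instead of a
// changed signature on the old function.
std::size_t demo_count_items_ex(const int *items, std::size_t n);

}

// Counts the non-zero entries in items[0..n).
std::size_t demo_count_items_ex(const int *items, std::size_t n) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < n; ++i)
    if (items[i] != 0)
      ++count;
  return count;
}

// The old entry point stays in place and forwards to the new one, so code
// written against the earlier release keeps compiling and linking.
unsigned demo_count_items(const int *items, unsigned n) {
  return static_cast<unsigned>(demo_count_items_ex(items, n));
}
```

Existing callers of `demo_count_items` are unaffected, while new code can opt in to `demo_count_items_ex`.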
Nick From asl at math.spbu.ru Tue Nov 24 01:22:14 2009 From: asl at math.spbu.ru (Anton Korobeynikov) Date: Tue, 24 Nov 2009 10:22:14 +0300 Subject: [llvm-commits] [llvm] r89720 - in /llvm/trunk: lib/Target/ARM/ARMBaseInstrInfo.h lib/Target/ARM/ARMExpandPseudoInsts.cpp lib/Target/ARM/ARMISelDAGToDAG.cpp lib/Target/ARM/ARMISelLowering.cpp lib/Target/ARM/ARMInstrInfo.td lib/Target/ARM/ARMInstrT Message-ID: Hello, Evan > Can you remove the "UseMovt" subtarget feature and just check for V6T2? I will do (after nightlytests) -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From evan.cheng at apple.com Tue Nov 24 01:47:44 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Mon, 23 Nov 2009 23:47:44 -0800 Subject: [llvm-commits] [llvm] r89713 - /llvm/trunk/lib/Analysis/IVUsers.cpp In-Reply-To: <200911232325.nANNPtol014596@zion.cs.uiuc.edu> References: <200911232325.nANNPtol014596@zion.cs.uiuc.edu> Message-ID: <5EEF2375-1B67-4689-98A2-C9E26C8B2365@apple.com> Don't forget to update llvm-test llcbeta. 
Evan On Nov 23, 2009, at 3:25 PM, Jim Grosbach wrote: > Author: grosbach > Date: Mon Nov 23 17:25:54 2009 > New Revision: 89713 > > URL: http://llvm.org/viewvc/llvm-project?rev=89713&view=rev > Log: > enable iv-users simplification by default > > Modified: > llvm/trunk/lib/Analysis/IVUsers.cpp > > Modified: llvm/trunk/lib/Analysis/IVUsers.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/IVUsers.cpp?rev=89713&r1=89712&r2=89713&view=diff > > ============================================================================== > --- llvm/trunk/lib/Analysis/IVUsers.cpp (original) > +++ llvm/trunk/lib/Analysis/IVUsers.cpp Mon Nov 23 17:25:54 2009 > @@ -24,7 +24,6 @@ > #include "llvm/ADT/STLExtras.h" > #include "llvm/Support/Debug.h" > #include "llvm/Support/raw_ostream.h" > -#include "llvm/Support/CommandLine.h" > #include > using namespace llvm; > > @@ -32,10 +31,6 @@ > static RegisterPass > X("iv-users", "Induction Variable Users", false, true); > > -static cl::opt > -SimplifyIVUsers("simplify-iv-users", cl::Hidden, cl::init(false), > - cl::desc("Restrict IV Users to loop-invariant strides")); > - > Pass *llvm::createIVUsersPass() { > return new IVUsers(); > } > @@ -214,8 +209,7 @@ > return false; // Non-reducible symbolic expression, bail out. > > // Keep things simple. Don't touch loop-variant strides. 
> - if (SimplifyIVUsers && !Stride->isLoopInvariant(L) > - && L->contains(I->getParent())) > + if (!Stride->isLoopInvariant(L) && L->contains(I->getParent())) > return false; > > SmallPtrSet UniqueUsers; > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From evan.cheng at apple.com Tue Nov 24 01:50:02 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Tue, 24 Nov 2009 07:50:02 -0000 Subject: [llvm-commits] [test-suite] r89747 - /test-suite/trunk/Makefile.programs Message-ID: <200911240750.nAO7o2pY031090@zion.cs.uiuc.edu> Author: evancheng Date: Tue Nov 24 01:50:01 2009 New Revision: 89747 URL: http://llvm.org/viewvc/llvm-project?rev=89747&view=rev Log: -simplify-iv-users is no more. Test -split-gep-gvn. Modified: test-suite/trunk/Makefile.programs Modified: test-suite/trunk/Makefile.programs URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/Makefile.programs?rev=89747&r1=89746&r2=89747&view=diff ============================================================================== --- test-suite/trunk/Makefile.programs (original) +++ test-suite/trunk/Makefile.programs Tue Nov 24 01:50:01 2009 @@ -223,13 +223,13 @@ LLCBETAOPTION := -sched=simple endif ifeq ($(ARCH),x86_64) -LLCBETAOPTION := -simplify-iv-users +LLCBETAOPTION := -split-gep-gvn #-combiner-alias-analysis #-pre-alloc-split #-remat-pic-stub-load endif ifeq ($(ARCH),x86) -LLCBETAOPTION := -simplify-iv-users +LLCBETAOPTION := -split-gep-gvn #-combiner-alias-analysis #-pre-alloc-split #-remat-pic-stub-load @@ -246,11 +246,11 @@ LLCBETAOPTION := -enable-sparc-v9-insts endif ifeq ($(ARCH),ARM) -LLCBETAOPTION := -simplify-iv-users +LLCBETAOPTION := -split-gep-gvn #-schedule-livein-copies endif ifeq ($(ARCH),THUMB) -LLCBETAOPTION := -simplify-iv-users +LLCBETAOPTION := -split-gep-gvn #-combiner-alias-analysis #-enable-thumb-reg-scavenging endif From evan.cheng at apple.com Tue Nov 24 02:06:16 
2009 From: evan.cheng at apple.com (Evan Cheng) Date: Tue, 24 Nov 2009 08:06:16 -0000 Subject: [llvm-commits] [llvm] r89748 - in /llvm/trunk: lib/Target/ARM/ARMBaseInstrInfo.cpp test/CodeGen/Thumb2/ifcvt-neon.ll Message-ID: <200911240806.nAO86GHS031608@zion.cs.uiuc.edu> Author: evancheng Date: Tue Nov 24 02:06:15 2009 New Revision: 89748 URL: http://llvm.org/viewvc/llvm-project?rev=89748&view=rev Log: Enable predication of NEON instructions in Thumb2 mode. Added: llvm/trunk/test/CodeGen/Thumb2/ifcvt-neon.ll Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp?rev=89748&r1=89747&r2=89748&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp Tue Nov 24 02:06:15 2009 @@ -39,10 +39,6 @@ EnableARM3Addr("enable-arm-3-addr-conv", cl::Hidden, cl::desc("Enable ARM 2-addr to 3-addr conv")); -static cl::opt -PredicateNEON("predicate-neon", cl::Hidden, - cl::desc("Allow NEON instructions to be predicated")); - ARMBaseInstrInfo::ARMBaseInstrInfo(const ARMSubtarget& STI) : TargetInstrInfoImpl(ARMInsts, array_lengthof(ARMInsts)), Subtarget(STI) { @@ -417,7 +413,7 @@ if ((TID.TSFlags & ARMII::DomainMask) == ARMII::DomainNEON) { ARMFunctionInfo *AFI = MI->getParent()->getParent()->getInfo(); - return PredicateNEON && AFI->isThumb2Function(); + return AFI->isThumb2Function(); } return true; } Added: llvm/trunk/test/CodeGen/Thumb2/ifcvt-neon.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Thumb2/ifcvt-neon.ll?rev=89748&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/Thumb2/ifcvt-neon.ll (added) +++ llvm/trunk/test/CodeGen/Thumb2/ifcvt-neon.ll Tue Nov 24 02:06:15 2009 @@ -0,0 +1,29 @@ +; RUN: llc < %s 
-march=thumb -mcpu=cortex-a8 | FileCheck %s +; rdar://7368193 + + at a = common global float 0.000000e+00 ; [#uses=2] + at b = common global float 0.000000e+00 ; [#uses=1] + +define arm_apcscc float @t(i32 %c) nounwind { +entry: + %0 = icmp sgt i32 %c, 1 ; [#uses=1] + %1 = load float* @a, align 4 ; [#uses=2] + %2 = load float* @b, align 4 ; [#uses=2] + br i1 %0, label %bb, label %bb1 + +bb: ; preds = %entry +; CHECK: ite lt +; CHECK: vsublt.f32 +; CHECK-NEXT: vaddge.f32 + %3 = fadd float %1, %2 ; [#uses=1] + br label %bb2 + +bb1: ; preds = %entry + %4 = fsub float %1, %2 ; [#uses=1] + br label %bb2 + +bb2: ; preds = %bb1, %bb + %storemerge = phi float [ %4, %bb1 ], [ %3, %bb ] ; [#uses=2] + store float %storemerge, float* @a + ret float %storemerge +} From eocallaghan at auroraux.org Tue Nov 24 05:51:52 2009 From: eocallaghan at auroraux.org (Edward O'Callaghan) Date: Tue, 24 Nov 2009 11:51:52 -0000 Subject: [llvm-commits] [llvm] r89758 - in /llvm/trunk: lib/Transforms/Scalar/LoopUnswitch.cpp test/Transforms/LoopUnswitch/5373.ll Message-ID: <200911241151.nAOBpq1f025375@zion.cs.uiuc.edu> Author: evocallaghan Date: Tue Nov 24 05:51:52 2009 New Revision: 89758 URL: http://llvm.org/viewvc/llvm-project?rev=89758&view=rev Log: Fix for PR5373, Credit to Jakub Staszak. Added: llvm/trunk/test/Transforms/LoopUnswitch/5373.ll Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp?rev=89758&r1=89757&r2=89758&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp Tue Nov 24 05:51:52 2009 @@ -296,7 +296,6 @@ // first exit. if (ExitBB != 0) return false; ExitBB = BB; - return true; } // Otherwise, this is an unvisited intra-loop node. Check all successors. 
Added: llvm/trunk/test/Transforms/LoopUnswitch/5373.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnswitch/5373.ll?rev=89758&view=auto ============================================================================== --- llvm/trunk/test/Transforms/LoopUnswitch/5373.ll (added) +++ llvm/trunk/test/Transforms/LoopUnswitch/5373.ll Tue Nov 24 05:51:52 2009 @@ -0,0 +1,24 @@ +; RUN: opt < %s -loop-unswitch -stats -disable-output |& grep "3 loop-unswitch - Number of branches unswitched" + +define noalias i32* @func_16(i32** %p_18, i32* %p_20) noreturn nounwind ssp { +entry: + %lnot = icmp eq i32** %p_18, null ; [#uses=1] + %lnot6 = icmp eq i32* %p_20, null ; [#uses=1] + br label %for.body + +for.body: ; preds = %cond.end, %entry + br i1 %lnot, label %cond.end, label %cond.true + +cond.true: ; preds = %for.body + tail call void @f() + unreachable + +cond.end: ; preds = %for.body + br i1 %lnot6, label %for.body, label %cond.true10 + +cond.true10: ; preds = %cond.end + tail call void @f() + unreachable +} + +declare void @f() noreturn From anton at korobeynikov.info Tue Nov 24 06:33:08 2009 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Tue, 24 Nov 2009 15:33:08 +0300 Subject: [llvm-commits] [llvm] r89758 - in /llvm/trunk: lib/Transforms/Scalar/LoopUnswitch.cpp test/Transforms/LoopUnswitch/5373.ll In-Reply-To: <200911241151.nAOBpq1f025375@zion.cs.uiuc.edu> References: <200911241151.nAOBpq1f025375@zion.cs.uiuc.edu> Message-ID: Hello, Edward > Fix for PR5373, Credit to Jakub Staszak. Who reviewed the patch? 
--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

From dag at cray.com Tue Nov 24 08:29:15 2009
From: dag at cray.com (David Greene)
Date: Tue, 24 Nov 2009 08:29:15 -0600
Subject: [llvm-commits] [PATCH] More Spill Annotations
In-Reply-To: References: <200911201622.37994.dag@cray.com> <200911231411.29627.dag@cray.com>
Message-ID: <200911240829.16459.dag@cray.com>

On Monday 23 November 2009 21:04, Evan Cheng wrote:

> > To do isVectorInstr and isVectorOperandInstr will require some additional
> > flags in the .td files, I think. Is that ok? I don't want to do a whole
> > bunch of work to find out later that there's a better way.
>
> What kind of flags? These are fairly target specific information so I don't
> think we want to add anything target independent. Can you enhance asm
> printer so targets can inject target specific comments?

I was thinking of doing this by having TableGen infer isVector flags on instructions and operands from the type. If an operand has a vector type, set the isVector flag on the operand. If an instruction has any operands with isVector set, set isVector on the instruction. The user can then override these assumptions in the .td file by setting "let scalar=1" or "let vector=1."

I'm not sure how target-specific comments would work. Do you have an example in mind? I want to avoid having to mark every vector instruction as the information can be inferred for 90% of the cases.
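
The inference rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `OperandDesc`, `InstrDesc`, and `Override` are hypothetical stand-ins, not TableGen's real record classes, and the override field merely models the "let scalar=1" / "let vector=1" settings.

```cpp
#include <vector>

// Hypothetical records for illustration; these are not TableGen classes.
enum class Override { Infer, ForceScalar, ForceVector };

struct OperandDesc {
  bool hasVectorType;  // derived from the operand's declared type
};

struct InstrDesc {
  std::vector<OperandDesc> operands;
  Override overrideKind;  // models "let scalar=1" / "let vector=1"
};

// An operand is a vector operand exactly when its type is a vector type.
bool isVectorOperand(const OperandDesc &op) { return op.hasVectorType; }

// An explicit override wins; otherwise the instruction is a vector
// instruction if any of its operands is a vector operand.
bool isVectorInstr(const InstrDesc &instr) {
  if (instr.overrideKind == Override::ForceVector)
    return true;
  if (instr.overrideKind == Override::ForceScalar)
    return false;
  for (const OperandDesc &op : instr.operands)
    if (isVectorOperand(op))
      return true;
  return false;
}
```

With this shape, only the instructions whose operand types give the wrong answer would need an explicit override.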
-Dave From eocallaghan at auroraux.org Tue Nov 24 09:19:11 2009 From: eocallaghan at auroraux.org (Edward O'Callaghan) Date: Tue, 24 Nov 2009 15:19:11 -0000 Subject: [llvm-commits] [llvm] r89765 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc Message-ID: <200911241519.nAOFJBHs032617@zion.cs.uiuc.edu> Author: evocallaghan Date: Tue Nov 24 09:19:10 2009 New Revision: 89765 URL: http://llvm.org/viewvc/llvm-project?rev=89765&view=rev Log: Provide Path::isSpecialFile interface for PR5568. Modified: llvm/trunk/include/llvm/System/Path.h llvm/trunk/lib/System/Unix/Path.inc llvm/trunk/lib/System/Win32/Path.inc Modified: llvm/trunk/include/llvm/System/Path.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/System/Path.h?rev=89765&r1=89764&r2=89765&view=diff ============================================================================== --- llvm/trunk/include/llvm/System/Path.h (original) +++ llvm/trunk/include/llvm/System/Path.h Tue Nov 24 09:19:10 2009 @@ -380,6 +380,11 @@ /// in the file system. bool canWrite() const; + /// This function checks that what we're trying to work only on a regular file or Dir. + /// Check for things like /dev/null, any block special file, + /// or other things that aren't "regular" files. + bool isSpecialFile() const; + /// This function determines if the path name references an executable /// file in the file system. This function checks for the existence and /// executability (by the current program) of the file. 
Modified: llvm/trunk/lib/System/Unix/Path.inc URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Unix/Path.inc?rev=89765&r1=89764&r2=89765&view=diff ============================================================================== --- llvm/trunk/lib/System/Unix/Path.inc (original) +++ llvm/trunk/lib/System/Unix/Path.inc Tue Nov 24 09:19:10 2009 @@ -335,7 +335,7 @@ free(pv); return (NULL); } -#endif +#endif // __FreeBSD__ /// GetMainExecutable - Return the path to the main executable, given the /// value of argv[0] from program startup. @@ -454,6 +454,24 @@ } bool +Path::isSpecialFile() const { + // Get the status so we can determine if its a file or directory + struct stat buf; + std::string *ErrStr; + + if (0 != stat(path.c_str(), &buf)) { + MakeErrMsg(ErrStr, path + ": can't get status of file"); + return true; + } + + if (S_ISDIR(buf.st_mode) || S_ISREG(buf.st_mode)) { + return false; + } + + return true; +} + +bool Path::canExecute() const { if (0 != access(path.c_str(), R_OK | X_OK )) return false; @@ -723,7 +741,7 @@ bool Path::eraseFromDisk(bool remove_contents, std::string *ErrStr) const { - // Get the status so we can determin if its a file or directory + // Get the status so we can determine if its a file or directory struct stat buf; if (0 != stat(path.c_str(), &buf)) { MakeErrMsg(ErrStr, path + ": can't get status of file"); Modified: llvm/trunk/lib/System/Win32/Path.inc URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Win32/Path.inc?rev=89765&r1=89764&r2=89765&view=diff ============================================================================== --- llvm/trunk/lib/System/Win32/Path.inc (original) +++ llvm/trunk/lib/System/Win32/Path.inc Tue Nov 24 09:19:10 2009 @@ -357,6 +357,11 @@ return attr != INVALID_FILE_ATTRIBUTES; } +bool +Path::isSpecialFile() const { + return false; +} + std::string Path::getLast() const { // Find the last slash From eocallaghan at auroraux.org Tue Nov 24 09:32:36 2009 From: eocallaghan at auroraux.org 
(Edward O'Callaghan) Date: Tue, 24 Nov 2009 15:32:36 +0000 Subject: [llvm-commits] [llvm] r89758 - in /llvm/trunk: lib/Transforms/Scalar/LoopUnswitch.cpp test/Transforms/LoopUnswitch/5373.ll In-Reply-To: References: <200911241151.nAOBpq1f025375@zion.cs.uiuc.edu> Message-ID: <521640720911240732m4641b7c0i5fbd1ad468273591@mail.gmail.com> G'Day Anton, Me, I tested it on my machine and everything seemed fine. Why? Thanks for the post-review, Edward. 2009/11/24 Anton Korobeynikov : > Hello, Edward > >> Fix for PR5373, Credit to Jakub Staszak. > Who reviewed the patch? > > -- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University > -- -- Edward O'Callaghan http://www.auroraux.org/ eocallaghan at auroraux dot org --- () ascii ribbon campaign - against html e-mail /\ - against microsoft attachments From anton at korobeynikov.info Tue Nov 24 09:42:31 2009 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Tue, 24 Nov 2009 18:42:31 +0300 Subject: [llvm-commits] [llvm] r89758 - in /llvm/trunk: lib/Transforms/Scalar/LoopUnswitch.cpp test/Transforms/LoopUnswitch/5373.ll In-Reply-To: <521640720911240732m4641b7c0i5fbd1ad468273591@mail.gmail.com> References: <200911241151.nAOBpq1f025375@zion.cs.uiuc.edu> <521640720911240732m4641b7c0i5fbd1ad468273591@mail.gmail.com> Message-ID: Hello, Edward > Me, I tested it on my machine and everything seemed fine. How have you tested? Have you run llvm-gcc bootstrap and/or nightly tests, etc.? Fix of the testcase does not imply that change itself is correct > Thanks for the post-review, Sorry, I cannot review this patch. I don't feel competent enough to review commits for loop unswitch code. 
-- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From eocallaghan at auroraux.org Tue Nov 24 09:47:02 2009 From: eocallaghan at auroraux.org (Edward O'Callaghan) Date: Tue, 24 Nov 2009 15:47:02 +0000 Subject: [llvm-commits] [llvm] r89758 - in /llvm/trunk: lib/Transforms/Scalar/LoopUnswitch.cpp test/Transforms/LoopUnswitch/5373.ll In-Reply-To: References: <200911241151.nAOBpq1f025375@zion.cs.uiuc.edu> <521640720911240732m4641b7c0i5fbd1ad468273591@mail.gmail.com> Message-ID: <521640720911240747n5955c307pa2e7feb1c89cb231@mail.gmail.com> G'day, 2009/11/24 Anton Korobeynikov : > Hello, Edward > >> Me, I tested it on my machine and everything seemed fine. > How have you tested? Have you run llvm-gcc bootstrap and/or nightly tests, etc.? > Fix of the testcase does not imply that change itself is correct > I did a fresh build of LLVM on auroraux and solaris with and without the patch and ran the test case on both, when not patched llvm crashes for me. >> Thanks for the post-review, > Sorry, I cannot review this patch. I don't feel competent enough to > review commits for loop unswitch code. > > -- > With best regards, Anton Korobeynikov > Faculty of Mathematics and Mechanics, Saint Petersburg State University > Cheers, Edward. -- -- Edward O'Callaghan http://www.auroraux.org/ eocallaghan at auroraux dot org --- () ascii ribbon campaign - against html e-mail /\ - against microsoft attachments From baldrick at free.fr Tue Nov 24 10:17:19 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 24 Nov 2009 17:17:19 +0100 Subject: [llvm-commits] [llvm] r89765 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc In-Reply-To: <200911241519.nAOFJBHs032617@zion.cs.uiuc.edu> References: <200911241519.nAOFJBHs032617@zion.cs.uiuc.edu> Message-ID: <4B0C070F.2020907@free.fr> Hi, > + /// This function checks that what we're trying to work only on a regular file or Dir. 
Dir -> directory > + /// Check for things like /dev/null, any block special file, > + /// or other things that aren't "regular" files. files -> regular files or directories > +bool > +Path::isSpecialFile() const { > + return false; > +} Windows does have special files AFAIK, for example opening "nul" gives the effect of /dev/null on unix systems. Ciao, Duncan. From eocallaghan at auroraux.org Tue Nov 24 10:29:23 2009 From: eocallaghan at auroraux.org (Edward O'Callaghan) Date: Tue, 24 Nov 2009 16:29:23 -0000 Subject: [llvm-commits] [llvm] r89770 - /llvm/trunk/include/llvm/System/Path.h Message-ID: <200911241629.nAOGTNWU002576@zion.cs.uiuc.edu> Author: evocallaghan Date: Tue Nov 24 10:29:23 2009 New Revision: 89770 URL: http://llvm.org/viewvc/llvm-project?rev=89770&view=rev Log: Fix comments as pre-post review for rev.89765. Modified: llvm/trunk/include/llvm/System/Path.h Modified: llvm/trunk/include/llvm/System/Path.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/System/Path.h?rev=89770&r1=89769&r2=89770&view=diff ============================================================================== --- llvm/trunk/include/llvm/System/Path.h (original) +++ llvm/trunk/include/llvm/System/Path.h Tue Nov 24 10:29:23 2009 @@ -380,9 +380,9 @@ /// in the file system. bool canWrite() const; - /// This function checks that what we're trying to work only on a regular file or Dir. + /// This function checks that what we're trying to work only on a regular file or directory. /// Check for things like /dev/null, any block special file, - /// or other things that aren't "regular" files. + /// or other things that aren't "regular" regular files or directories. 
       bool isSpecialFile() const;

       /// This function determines if the path name references an executable

From baldrick at free.fr Tue Nov 24 10:32:36 2009
From: baldrick at free.fr (Duncan Sands)
Date: Tue, 24 Nov 2009 17:32:36 +0100
Subject: [llvm-commits] [llvm] r89770 - /llvm/trunk/include/llvm/System/Path.h
In-Reply-To: <200911241629.nAOGTNWU002576@zion.cs.uiuc.edu>
References: <200911241629.nAOGTNWU002576@zion.cs.uiuc.edu>
Message-ID: <4B0C0AA4.8020308@free.fr>

> + /// This function checks that what we're trying to work only on a regular file or directory.

Is this line too long (80 columns)?

Ciao,

Duncan.

From eocallaghan at auroraux.org Tue Nov 24 10:34:27 2009
From: eocallaghan at auroraux.org (Edward O'Callaghan)
Date: Tue, 24 Nov 2009 16:34:27 +0000
Subject: [llvm-commits] [llvm] r89765 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc
In-Reply-To: <4B0C070F.2020907@free.fr>
References: <200911241519.nAOFJBHs032617@zion.cs.uiuc.edu> <4B0C070F.2020907@free.fr>
Message-ID: <521640720911240834x6fe24e50ma1ced5fb6d1f279e@mail.gmail.com>

G'Day Duncan,

2009/11/24 Duncan Sands :
> Hi,
>
>> +      /// This function checks that what we're trying to work only on a
>> regular file or Dir.
>
> Dir -> directory
>
>> +      /// Check for things like /dev/null, any block special file,
>> +      /// or other things that aren't "regular" files.
>
> files -> regular files or directories
>

Fix committed in revision 89770, Cheers!

>> +bool
>> +Path::isSpecialFile() const {
>> +  return false;
>> +}
>
> Windows does have special files AFAIK, for example opening "nul"
> gives the effect of /dev/null on unix systems.

OK, then I don't know the details of this nor do I have a windows machine to expand on this hook. Are you able to provide some more detail, is this bug a problem on windows as well?

>
> Ciao,
>
> Duncan.
>

Many thanks for your review,
Edward.
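
The distinction under discussion (regular file or directory versus anything else) can be sketched on the Unix side with `stat` and the `S_ISREG` / `S_ISDIR` macros. This is a minimal POSIX-only sketch, not the committed `Path::isSpecialFile`: `classifyPath` and `PathKind` are hypothetical names, and a failed `stat` is reported as its own case here rather than being folded into "special".

```cpp
#include <string>
#include <sys/stat.h>

// Hypothetical helper for illustration; POSIX only.
enum class PathKind { Regular, Directory, Special, Unknown };

PathKind classifyPath(const std::string &path) {
  struct stat buf;
  // stat() follows symlinks, so a link is classified by its target.
  if (stat(path.c_str(), &buf) != 0)
    return PathKind::Unknown;    // path could not be queried at all
  if (S_ISREG(buf.st_mode))
    return PathKind::Regular;
  if (S_ISDIR(buf.st_mode))
    return PathKind::Directory;
  return PathKind::Special;      // device node, FIFO, socket, ...
}
```

A caller that wants only the "special" test can compare the result against `PathKind::Special`; a Windows counterpart would need a different mechanism and is not attempted here.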
-- -- Edward O'Callaghan http://www.auroraux.org/ eocallaghan at auroraux dot org --- () ascii ribbon campaign - against html e-mail /\ - against microsoft attachments From baldrick at free.fr Tue Nov 24 10:42:36 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 24 Nov 2009 17:42:36 +0100 Subject: [llvm-commits] [llvm] r89765 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc In-Reply-To: <521640720911240834x6fe24e50ma1ced5fb6d1f279e@mail.gmail.com> References: <200911241519.nAOFJBHs032617@zion.cs.uiuc.edu> <4B0C070F.2020907@free.fr> <521640720911240834x6fe24e50ma1ced5fb6d1f279e@mail.gmail.com> Message-ID: <4B0C0CFC.3010709@free.fr> Hi Edward, >>> +bool >>> +Path::isSpecialFile() const { >>> + return false; >>> +} >> Windows does have special files AFAIK, for example opening "nul" >> gives the effect of /dev/null on unix systems. > > OK, then I don't know the details of this nor do I have a windows > machine to expand on this hook. > Are you able to provide some more detail, is this bug a problem on > windows as well? I can't help with this, sorry. I just don't know enough about windows. Ciao, Duncan. From daniel at zuster.org Tue Nov 24 11:30:39 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 24 Nov 2009 09:30:39 -0800 Subject: [llvm-commits] [llvm] r89765 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc In-Reply-To: <200911241519.nAOFJBHs032617@zion.cs.uiuc.edu> References: <200911241519.nAOFJBHs032617@zion.cs.uiuc.edu> Message-ID: <6a8523d60911240930i24c82138va1f712abaa1656c3@mail.gmail.com> Hi Edward, On Tue, Nov 24, 2009 at 7:19 AM, Edward O'Callaghan wrote: > Author: evocallaghan > Date: Tue Nov 24 09:19:10 2009 > New Revision: 89765 > > URL: http://llvm.org/viewvc/llvm-project?rev=89765&view=rev > Log: > Provide Path::isSpecialFile interface for PR5568. > > Modified: > llvm/trunk/include/llvm/System/Path.h >
llvm/trunk/lib/System/Unix/Path.inc > llvm/trunk/lib/System/Win32/Path.inc > > Modified: llvm/trunk/include/llvm/System/Path.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/System/Path.h?rev=89765&r1=89764&r2=89765&view=diff > > ============================================================================== > --- llvm/trunk/include/llvm/System/Path.h (original) > +++ llvm/trunk/include/llvm/System/Path.h Tue Nov 24 09:19:10 2009 > @@ -380,6 +380,11 @@ > /// in the file system. > bool canWrite() const; > > + /// This function checks that what we're trying to work only on a regular file or Dir. > + /// Check for things like /dev/null, any block special file, > + /// or other things that aren't "regular" files. > + bool isSpecialFile() const; > + > /// This function determines if the path name references an executable > /// file in the file system. This function checks for the existence and > /// executability (by the current program) of the file. I would prefer this be Path::isRegularFile, that corresponds to a well known Unixism (S_ISREG) and avoids creating new terminology. Similarly, I don't think it should do anything else -- checking that the path is a directory is something clients can do. - Daniel > Modified: llvm/trunk/lib/System/Unix/Path.inc > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Unix/Path.inc?rev=89765&r1=89764&r2=89765&view=diff > > ============================================================================== > --- llvm/trunk/lib/System/Unix/Path.inc (original) > +++ llvm/trunk/lib/System/Unix/Path.inc Tue Nov 24 09:19:10 2009 > @@ -335,7 +335,7 @@ > free(pv); > return (NULL); > } > -#endif > +#endif // __FreeBSD__ > > /// GetMainExecutable - Return the path to the main executable, given the > /// value of argv[0] from program startup.
> @@ -454,6 +454,24 @@ > } > > bool > +Path::isSpecialFile() const { > + // Get the status so we can determine if its a file or directory > + struct stat buf; > + std::string *ErrStr; > + > + if (0 != stat(path.c_str(), &buf)) { > + MakeErrMsg(ErrStr, path + ": can't get status of file"); > + return true; > + } > + > + if (S_ISDIR(buf.st_mode) || S_ISREG(buf.st_mode)) { > + return false; > + } > + > + return true; > +} > + > +bool > Path::canExecute() const { > if (0 != access(path.c_str(), R_OK | X_OK )) > return false; > @@ -723,7 +741,7 @@ > > bool > Path::eraseFromDisk(bool remove_contents, std::string *ErrStr) const { > - // Get the status so we can determin if its a file or directory > + // Get the status so we can determine if its a file or directory > struct stat buf; > if (0 != stat(path.c_str(), &buf)) { > MakeErrMsg(ErrStr, path + ": can't get status of file"); > > Modified: llvm/trunk/lib/System/Win32/Path.inc > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Win32/Path.inc?rev=89765&r1=89764&r2=89765&view=diff > > ============================================================================== > --- llvm/trunk/lib/System/Win32/Path.inc (original) > +++ llvm/trunk/lib/System/Win32/Path.inc Tue Nov 24 09:19:10 2009 > @@ -357,6 +357,11 @@ > return attr != INVALID_FILE_ATTRIBUTES; > } > > +bool > +Path::isSpecialFile() const { > + return false; > +} > > std::string > Path::getLast() const { >
// Find the last slash > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From daniel at zuster.org Tue Nov 24 11:31:04 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 24 Nov 2009 09:31:04 -0800 Subject: [llvm-commits] [llvm] r89765 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc In-Reply-To: <521640720911240834x6fe24e50ma1ced5fb6d1f279e@mail.gmail.com> References: <200911241519.nAOFJBHs032617@zion.cs.uiuc.edu> <4B0C070F.2020907@free.fr> <521640720911240834x6fe24e50ma1ced5fb6d1f279e@mail.gmail.com> Message-ID: <6a8523d60911240931n408c105aofc25b48621f343c8@mail.gmail.com> On Tue, Nov 24, 2009 at 8:34 AM, Edward O'Callaghan wrote: > G'Day Duncan, > > 2009/11/24 Duncan Sands : >> Hi, >> >>> + /// This function checks that what we're trying to work only on a >>> regular file or Dir. >> >> Dir -> directory >> >>> + /// Check for things like /dev/null, any block special file, >>> + /// or other things that aren't "regular" files. >> >> files -> regular files or directories >> > > Fix committed in revision 89770, Cheers! > >>> +bool >>> +Path::isSpecialFile() const { >>> + return false; >>> +} >> >> Windows does have special files AFAIK, for example opening "nul" >> gives the effect of /dev/null on unix systems. > > OK, then I don't know the details of this nor do I have a windows > machine to expand on this hook. > Are you able to provide some more detail, is this bug a problem on > windows as well? Don't worry about it, if someone on Windows cares we can fix it then. - Daniel >> >> Ciao, >> >> Duncan. >> > > Many thanks for your review, > Edward. > > > -- > -- > Edward O'Callaghan > http://www.auroraux.org/ > eocallaghan at auroraux dot org > --- > () ascii ribbon campaign - against html e-mail > /\
- against microsoft attachments > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From daniel at zuster.org Tue Nov 24 12:26:19 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 24 Nov 2009 18:26:19 -0000 Subject: [llvm-commits] [zorg] r89779 - /zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Message-ID: <200911241826.nAOIQJqm006446@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 12:26:19 2009 New Revision: 89779 URL: http://llvm.org/viewvc/llvm-project?rev=89779&view=rev Log: Tweak llvm-gcc build factory to allow setting --{build,host,target} triple arguments independently. Modified: zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Modified: zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py?rev=89779&r1=89778&r2=89779&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Tue Nov 24 12:26:19 2009 @@ -31,9 +31,18 @@ return args def getLLVMGCCBuildFactory(jobs=1, update=True, clean=True, - gxxincludedir=None, triple=None, + gxxincludedir=None, + triple=None, build=None, host=None, target=None, useTwoStage=True, stage1_config='Release', stage2_config='Release'): + if build or host or target: + if not build or not host or not target: + raise ValueError,"Must specify all of 'build', 'host', 'target' if used." + if triple: + raise ValueError,"Cannot specify 'triple' and 'build', 'host', 'target' options." + elif triple: + build = host = target = triple + f = buildbot.process.factory.BuildFactory() # Determine the build directory. @@ -66,10 +75,10 @@ # Configure llvm (stage 1).
base_llvm_configure_args = [WithProperties("%(builddir)s/llvm.src/configure")] - if triple: - base_llvm_configure_args.append('--build=' + triple) - base_llvm_configure_args.append('--host=' + triple) - base_llvm_configure_args.append('--target=' + triple) + if build: + base_llvm_configure_args.append('--build=' + build) + base_llvm_configure_args.append('--host=' + host) + base_llvm_configure_args.append('--target=' + target) stage_configure_args = getConfigArgs(stage1_config) f.addStep(Configure(name='configure.llvm.stage1', command=base_llvm_configure_args + @@ -113,10 +122,10 @@ "--enable-languages=c,c++"] if gxxincludedir: base_llvmgcc_configure_args.append('--with-gxx-include-dir=' + gxxincludedir) - if triple: - base_llvmgcc_configure_args.append('--build=' + triple) - base_llvmgcc_configure_args.append('--host=' + triple) - base_llvmgcc_configure_args.append('--target=' + triple) + if build: + base_llvmgcc_configure_args.append('--build=' + build) + base_llvmgcc_configure_args.append('--host=' + host) + base_llvmgcc_configure_args.append('--target=' + target) f.addStep(Configure(name='configure.llvm-gcc.stage1', command=(base_llvmgcc_configure_args + ["--program-prefix=llvm-", From daniel at zuster.org Tue Nov 24 12:27:24 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 24 Nov 2009 18:27:24 -0000 Subject: [llvm-commits] [zorg] r89780 - in /zorg/trunk/zorg/buildbot/builders: ClangBuilder.py LLVMBuilder.py LLVMGCCBuilder.py Message-ID: <200911241827.nAOIROV4006493@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 12:27:23 2009 New Revision: 89780 URL: http://llvm.org/viewvc/llvm-project?rev=89780&view=rev Log: Allow clients to specify "make" command to use. 
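A minimal sketch of the idea behind this change, not the zorg code itself: building the command as an argument list keeps the make program name as a separate element, so a client can substitute, say, gmake without editing a shell string. The helper name `build_make_command` and its `nice_level` parameter are assumptions made purely for illustration.

```python
def build_make_command(make='make', jobs=1, targets=(), nice_level=10):
    """Return an argv-style list for a niced, parallel make invocation.

    Hypothetical helper for illustration; it is not part of zorg or buildbot.
    """
    # Every argument is its own list element, so no shell quoting is involved
    # and the make program can be replaced without touching the other pieces.
    command = ['nice', '-n', str(nice_level), make, '-j%d' % jobs]
    command.extend(targets)
    return command


if __name__ == '__main__':
    print(build_make_command(make='gmake', jobs=4,
                             targets=['check-lit', 'VERBOSE=1']))
```

The same list shape appears in the diff that follows, where the hard-coded "make" string is replaced by the `make` parameter.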
Modified: zorg/trunk/zorg/buildbot/builders/ClangBuilder.py zorg/trunk/zorg/buildbot/builders/LLVMBuilder.py zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Modified: zorg/trunk/zorg/buildbot/builders/ClangBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/ClangBuilder.py?rev=89780&r1=89779&r2=89780&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/ClangBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/ClangBuilder.py Tue Nov 24 12:27:23 2009 @@ -12,7 +12,8 @@ from zorg.buildbot.commands.BatchFileDownload import BatchFileDownload def getClangBuildFactory(triple=None, clean=True, test=True, - expensive_checks=False, run_cxx_tests=False, valgrind=False): + expensive_checks=False, run_cxx_tests=False, valgrind=False, + make='make'): f = buildbot.process.factory.BuildFactory() # Determine the build directory. @@ -48,13 +49,14 @@ descriptionDone=['configure',config_name])) if clean: f.addStep(WarningCountingShellCommand(name="clean-llvm", - command="make clean", + command=[make, "clean"], haltOnFailure=True, description="cleaning llvm", descriptionDone="clean llvm", workdir='llvm')) f.addStep(WarningCountingShellCommand(name="compile", - command=WithProperties("nice -n 10 make -j%(jobs)d"), + command=['nice', '-n', '10', + make, WithProperties("-j%s" % jobs)], haltOnFailure=True, description="compiling llvm & clang", descriptionDone="compile llvm & clang", @@ -69,12 +71,12 @@ extraTestDirs += '%(builddir)s/llvm/tools/clang/utils/C++Tests' if test: f.addStep(ClangTestCommand(name='test-llvm', - command=["make", "check-lit", "VERBOSE=1"], + command=[make, "check-lit", "VERBOSE=1"], description=["testing", "llvm"], descriptionDone=["test", "llvm"], workdir='llvm')) f.addStep(ClangTestCommand(name='test-clang', - command=['make', 'test', WithProperties('TESTARGS=%s' % clangTestArgs), + command=[make, 'test', WithProperties('TESTARGS=%s' % 
clangTestArgs), WithProperties('EXTRA_TESTDIRS=%s' % extraTestDirs)], workdir='llvm/tools/clang')) return f Modified: zorg/trunk/zorg/buildbot/builders/LLVMBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/LLVMBuilder.py?rev=89780&r1=89779&r2=89780&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/LLVMBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/LLVMBuilder.py Tue Nov 24 12:27:23 2009 @@ -11,7 +11,7 @@ def getLLVMBuildFactory(triple=None, clean=True, test=True, expensive_checks=False, - jobs=1, timeout=20): + jobs=1, timeout=20, make='make'): f = buildbot.process.factory.BuildFactory() # Determine the build directory. @@ -43,13 +43,14 @@ descriptionDone=['configure',config_name])) if clean: f.addStep(WarningCountingShellCommand(name="clean-llvm", - command="make clean", + command=[make, 'clean'], haltOnFailure=True, description="cleaning llvm", descriptionDone="clean llvm", workdir='llvm')) f.addStep(WarningCountingShellCommand(name="compile", - command=WithProperties("nice -n 10 make -j%s" % jobs), + command=['nice', '-n', '10', + make, WithProperties("-j%s" % jobs)], haltOnFailure=True, description="compiling llvm", descriptionDone="compile llvm", @@ -57,7 +58,7 @@ timeout=timeout*60)) if test: f.addStep(ClangTestCommand(name='test-llvm', - command=["make", "check-lit", "VERBOSE=1"], + command=[make, "check-lit", "VERBOSE=1"], description=["testing", "llvm"], descriptionDone=["test", "llvm"], workdir='llvm')) Modified: zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py?rev=89780&r1=89779&r2=89780&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Tue Nov 24 12:27:23 2009 @@ -34,7 +34,7 
@@ gxxincludedir=None, triple=None, build=None, host=None, target=None, useTwoStage=True, stage1_config='Release', - stage2_config='Release'): + stage2_config='Release', make='make'): if build or host or target: if not build or not host or not target: raise ValueError,"Must specify all of 'build', 'host', 'target' if used." @@ -93,7 +93,8 @@ # Build llvm (stage 1). f.addStep(WarningCountingShellCommand(name = "compile.llvm.stage1", - command = WithProperties("nice -n 10 make -j%s" % jobs), + command=['nice', '-n', '10', + make, WithProperties("-j%s" % jobs)], haltOnFailure = True, description=["compile", "llvm", @@ -103,7 +104,7 @@ # Run LLVM tests (stage 1). f.addStep(ClangTestCommand(name = 'test.llvm.stage1', - command = ["make", "check-lit", "VERBOSE=1"], + command = [make, "check-lit", "VERBOSE=1"], description = ["testing", "llvm"], descriptionDone = ["test", "llvm"], workdir = 'llvm.obj')) @@ -139,7 +140,8 @@ # Build llvm-gcc. f.addStep(WarningCountingShellCommand(name="compile.llvm-gcc.stage1", - command = WithProperties("nice -n 10 make -j%s" % jobs), + command=['nice', '-n', '10', + make, WithProperties("-j%s" % jobs)], haltOnFailure=True, description=["compile", "llvm-gcc"], @@ -156,7 +158,8 @@ # Install llvm-gcc. f.addStep(WarningCountingShellCommand(name="install.llvm-gcc.stage1", - command="nice -n 10 make install", + command=['nice', '-n', '10', + make, 'install'], haltOnFailure=True, description=["install", "llvm-gcc"], @@ -194,7 +197,8 @@ # Build LLVM (stage 2). f.addStep(WarningCountingShellCommand(name = "compile.llvm.stage2", - command = WithProperties("nice -n 10 make -j%s" % jobs), + command = ['nice', '-n', '10', + make, WithProperties("-j%s" % jobs)], haltOnFailure = True, description=["compile", "llvm", @@ -204,7 +208,7 @@ # Run LLVM tests (stage 2). 
f.addStep(ClangTestCommand(name = 'test.llvm.stage2', - command = ["make", "check-lit", "VERBOSE=1"], + command = [make, "check-lit", "VERBOSE=1"], description = ["testing", "llvm", "(stage 2)"], descriptionDone = ["test", "llvm", "(stage 2)"], workdir = 'llvm.obj.2')) @@ -235,7 +239,8 @@ # Build llvm-gcc (stage 2). f.addStep(WarningCountingShellCommand(name="compile.llvm-gcc.stage2", - command=WithProperties("nice -n 10 make -j%s" % jobs), + command=['nice', '-n', '10', + make, WithProperties("-j%s" % jobs)], haltOnFailure=True, description=["compile", "llvm-gcc", @@ -254,6 +259,8 @@ # Install llvm-gcc. f.addStep(WarningCountingShellCommand(name="install.llvm-gcc.stage2", + command = ['nice', '-n', '10', + make, 'install'], command="nice -n 10 make", haltOnFailure=True, description=["install", From evan.cheng at apple.com Tue Nov 24 12:52:41 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Tue, 24 Nov 2009 10:52:41 -0800 Subject: [llvm-commits] [llvm] r89758 - in /llvm/trunk: lib/Transforms/Scalar/LoopUnswitch.cpp test/Transforms/LoopUnswitch/5373.ll In-Reply-To: <521640720911240747n5955c307pa2e7feb1c89cb231@mail.gmail.com> References: <200911241151.nAOBpq1f025375@zion.cs.uiuc.edu> <521640720911240732m4641b7c0i5fbd1ad468273591@mail.gmail.com> <521640720911240747n5955c307pa2e7feb1c89cb231@mail.gmail.com> Message-ID: <39549D7E-3F30-4AA0-8D56-3A282B966C57@apple.com> The patch is good. Thanks. Some comments would have been better than "fixing prxxxx". :-) Evan On Nov 24, 2009, at 7:47 AM, Edward O'Callaghan wrote: > G'day, > > 2009/11/24 Anton Korobeynikov : >> Hello, Edward >> >>> Me, I tested it on my machine and everything seemed fine. >> How have you tested? Have you run llvm-gcc bootstrap and/or nightly tests, etc.? >> Fix of the testcase does not imply that change itself is correct >> > > I did a fresh build of LLVM on auroraux and solaris with and without > the patch and ran the test case on both, when not patched llvm crashes > for me. 
> >>> Thanks for the post-review, >> Sorry, I cannot review this patch. I don't feel competent enough to >> review commits for loop unswitch code. >> >> -- >> With best regards, Anton Korobeynikov >> Faculty of Mathematics and Mechanics, Saint Petersburg State University >> > Cheers, > Edward. > > > > -- > -- > Edward O'Callaghan > http://www.auroraux.org/ > eocallaghan at auroraux dot org > --- > () ascii ribbon campaign - against html e-mail > /\ - against microsoft attachments > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From daniel at zuster.org Tue Nov 24 13:03:33 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 24 Nov 2009 19:03:33 -0000 Subject: [llvm-commits] [llvm] r89786 - /llvm/trunk/lib/System/Unix/Path.inc Message-ID: <200911241903.nAOJ3Xg4007740@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 13:03:33 2009 New Revision: 89786 URL: http://llvm.org/viewvc/llvm-project?rev=89786&view=rev Log: Remove bogus error handling code. 
Modified: llvm/trunk/lib/System/Unix/Path.inc Modified: llvm/trunk/lib/System/Unix/Path.inc URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Unix/Path.inc?rev=89786&r1=89785&r2=89786&view=diff ============================================================================== --- llvm/trunk/lib/System/Unix/Path.inc (original) +++ llvm/trunk/lib/System/Unix/Path.inc Tue Nov 24 13:03:33 2009 @@ -457,16 +457,12 @@ Path::isSpecialFile() const { // Get the status so we can determine if its a file or directory struct stat buf; - std::string *ErrStr; - if (0 != stat(path.c_str(), &buf)) { - MakeErrMsg(ErrStr, path + ": can't get status of file"); + if (0 != stat(path.c_str(), &buf)) return true; - } - if (S_ISDIR(buf.st_mode) || S_ISREG(buf.st_mode)) { + if (S_ISDIR(buf.st_mode) || S_ISREG(buf.st_mode)) return false; - } return true; } From dpatel at apple.com Tue Nov 24 13:18:41 2009 From: dpatel at apple.com (Devang Patel) Date: Tue, 24 Nov 2009 19:18:41 -0000 Subject: [llvm-commits] [llvm] r89787 - /llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Message-ID: <200911241918.nAOJIfKi008315@zion.cs.uiuc.edu> Author: dpatel Date: Tue Nov 24 13:18:41 2009 New Revision: 89787 URL: http://llvm.org/viewvc/llvm-project?rev=89787&view=rev Log: Swith to pubtypes section before emitting pub types. Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=89787&r1=89786&r2=89787&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Tue Nov 24 13:18:41 2009 @@ -2717,6 +2717,9 @@ } void DwarfDebug::emitDebugPubTypes() { + // Start the dwarf pubnames section. 
+ Asm->OutStreamer.SwitchSection( + Asm->getObjFileLowering().getDwarfPubTypesSection()); EmitDifference("pubtypes_end", ModuleCU->getID(), "pubtypes_begin", ModuleCU->getID(), true); Asm->EOL("Length of Public Types Info"); From jyasskin at gmail.com Tue Nov 24 13:28:10 2009 From: jyasskin at gmail.com (jyasskin at gmail.com) Date: Tue, 24 Nov 2009 19:28:10 +0000 Subject: [llvm-commits] Change indirect-globals to use a dedicated allocIndirectGV (issue157145) Message-ID: <0016e68dec164ab6f9047922f14f@google.com> Reviewers: evan.cheng_apple.com, Message: Hi Evan, could you take a look? http://codereview.appspot.com/download/issue157145_7.diff I think this is an ok change for arm (and it passes tests on nlewycky's box) because even though the indirect global is generated as part of a call stub, it's not actually executed. Description: This lets us remove start/finishGVStub and the BufferState helper class from the MachineCodeEmitter interface. It has the side-effect of not setting the indirect global writable and then executable on ARM, but that shouldn't be necessary. Please review this at http://codereview.appspot.com/157145 Affected files: M include/llvm/CodeGen/JITCodeEmitter.h M include/llvm/CodeGen/MachineCodeEmitter.h M lib/ExecutionEngine/JIT/JITEmitter.cpp M lib/Target/ARM/ARMJITInfo.cpp M lib/Target/Alpha/AlphaJITInfo.cpp M lib/Target/PowerPC/PPCJITInfo.cpp M lib/Target/X86/X86JITInfo.cpp From dpatel at apple.com Tue Nov 24 13:37:07 2009 From: dpatel at apple.com (Devang Patel) Date: Tue, 24 Nov 2009 19:37:07 -0000 Subject: [llvm-commits] [llvm] r89790 - in /llvm/trunk: include/llvm/CodeGen/MachineModuleInfo.h include/llvm/CodeGen/Passes.h lib/CodeGen/LLVMTargetMachine.cpp lib/CodeGen/MachineModuleInfo.cpp Message-ID: <200911241937.nAOJb71D008910@zion.cs.uiuc.edu> Author: dpatel Date: Tue Nov 24 13:37:07 2009 New Revision: 89790 URL: http://llvm.org/viewvc/llvm-project?rev=89790&view=rev Log: Remove DebugLabelFolder pass. 
It is not used by dwarf writer anymore. Modified: llvm/trunk/include/llvm/CodeGen/MachineModuleInfo.h llvm/trunk/include/llvm/CodeGen/Passes.h llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp llvm/trunk/lib/CodeGen/MachineModuleInfo.cpp Modified: llvm/trunk/include/llvm/CodeGen/MachineModuleInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineModuleInfo.h?rev=89790&r1=89789&r2=89790&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/MachineModuleInfo.h (original) +++ llvm/trunk/include/llvm/CodeGen/MachineModuleInfo.h Tue Nov 24 13:37:07 2009 @@ -135,9 +135,6 @@ /// llvm.compiler.used. SmallPtrSet UsedFunctions; - /// UsedDbgLabels - labels are used by debug info entries. - SmallSet UsedDbgLabels; - bool CallsEHReturn; bool CallsUnwindInit; @@ -232,19 +229,6 @@ return LabelID ? LabelIDList[LabelID - 1] : 0; } - /// isDbgLabelUsed - Return true if label with LabelID is used by - /// DwarfWriter. - bool isDbgLabelUsed(unsigned LabelID) { - return UsedDbgLabels.count(LabelID); - } - - /// RecordUsedDbgLabel - Mark label with LabelID as used. This is used - /// by DwarfWriter to inform DebugLabelFolder that certain labels are - /// not to be deleted. - void RecordUsedDbgLabel(unsigned LabelID) { - UsedDbgLabels.insert(LabelID); - } - /// getFrameMoves - Returns a reference to a list of moves done in the current /// function's prologue. Used to construct frame maps for debug and exception /// handling comsumers. Modified: llvm/trunk/include/llvm/CodeGen/Passes.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/Passes.h?rev=89790&r1=89789&r2=89790&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/Passes.h (original) +++ llvm/trunk/include/llvm/CodeGen/Passes.h Tue Nov 24 13:37:07 2009 @@ -136,11 +136,6 @@ /// headers to target specific alignment boundary. 
FunctionPass *createCodePlacementOptPass(); - /// DebugLabelFoldingPass - This pass prunes out redundant debug labels. This - /// allows a debug emitter to determine if the range of two labels is empty, - /// by seeing if the labels map to the same reduced label. - FunctionPass *createDebugLabelFoldingPass(); - /// getRegisterAllocator - This creates an instance of the register allocator /// for the Sparc. FunctionPass *getRegisterAllocator(TargetMachine &T); Modified: llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp?rev=89790&r1=89789&r2=89790&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp (original) +++ llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp Tue Nov 24 13:37:07 2009 @@ -349,10 +349,6 @@ if (PrintGCInfo) PM.add(createGCInfoPrinter(errs())); - // Fold redundant debug labels. - PM.add(createDebugLabelFoldingPass()); - printAndVerify(PM, "After DebugLabelFolding"); - if (OptLevel != CodeGenOpt::None && !DisableCodePlace) { PM.add(createCodePlacementOptPass()); printAndVerify(PM, "After CodePlacementOpt"); Modified: llvm/trunk/lib/CodeGen/MachineModuleInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineModuleInfo.cpp?rev=89790&r1=89789&r2=89790&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/MachineModuleInfo.cpp (original) +++ llvm/trunk/lib/CodeGen/MachineModuleInfo.cpp Tue Nov 24 13:37:07 2009 @@ -293,75 +293,3 @@ return 0; } -//===----------------------------------------------------------------------===// -/// DebugLabelFolding pass - This pass prunes out redundant labels. This allows -/// a info consumer to determine if the range of two labels is empty, by seeing -/// if the labels map to the same reduced label. 
- -namespace llvm { - -struct DebugLabelFolder : public MachineFunctionPass { - static char ID; - DebugLabelFolder() : MachineFunctionPass(&ID) {} - - virtual void getAnalysisUsage(AnalysisUsage &AU) const { - AU.setPreservesCFG(); - AU.addPreservedID(MachineLoopInfoID); - AU.addPreservedID(MachineDominatorsID); - MachineFunctionPass::getAnalysisUsage(AU); - } - - virtual bool runOnMachineFunction(MachineFunction &MF); - virtual const char *getPassName() const { return "Label Folder"; } -}; - -char DebugLabelFolder::ID = 0; - -bool DebugLabelFolder::runOnMachineFunction(MachineFunction &MF) { - // Get machine module info. - MachineModuleInfo *MMI = getAnalysisIfAvailable(); - if (!MMI) return false; - - // Track if change is made. - bool MadeChange = false; - // No prior label to begin. - unsigned PriorLabel = 0; - - // Iterate through basic blocks. - for (MachineFunction::iterator BB = MF.begin(), E = MF.end(); - BB != E; ++BB) { - // Iterate through instructions. - for (MachineBasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ) { - // Is it a label. - if (I->isDebugLabel() && !MMI->isDbgLabelUsed(I->getOperand(0).getImm())){ - // The label ID # is always operand #0, an immediate. - unsigned NextLabel = I->getOperand(0).getImm(); - - // If there was an immediate prior label. - if (PriorLabel) { - // Remap the current label to prior label. - MMI->RemapLabel(NextLabel, PriorLabel); - // Delete the current label. - I = BB->erase(I); - // Indicate a change has been made. - MadeChange = true; - continue; - } else { - // Start a new round. - PriorLabel = NextLabel; - } - } else { - // No consecutive labels. 
- PriorLabel = 0; - } - - ++I; - } - } - - return MadeChange; -} - -FunctionPass *createDebugLabelFoldingPass() { return new DebugLabelFolder(); } - -} From eocallaghan at auroraux.org Tue Nov 24 13:39:04 2009 From: eocallaghan at auroraux.org (Edward O'Callaghan) Date: Tue, 24 Nov 2009 19:39:04 +0000 Subject: [llvm-commits] [llvm] r89765 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc In-Reply-To: <6a8523d60911240930i24c82138va1f712abaa1656c3@mail.gmail.com> References: <200911241519.nAOFJBHs032617@zion.cs.uiuc.edu> <6a8523d60911240930i24c82138va1f712abaa1656c3@mail.gmail.com> Message-ID: <521640720911241139w73006b1m38ebb8d151dfcac5@mail.gmail.com> G'Day Daniel, > I would prefer this be Path::isRegularFile, that corresponds to a well > known Unixism (S_ISREG) and avoids creating new terminology. > Similarly, I don't think it should do anything else -- checking that > the path is a directory is something clients can do. You know that's about the forth time someone has told me to change the name. :| Also, isRegularFile would be wrong because it returns false for being a regular file, that's not we want to be checking for, you got it the wrong way around. I'm not sure what you mean about not checking a directory, if its not a reg file or a directory then its special. Please see how its being used on the clang side of this fix. This is a two part fix, one side on LLVM and the other on Clang. Cheers, Edward. 2009/11/24 Daniel Dunbar : > Hi Edward, > > > On Tue, Nov 24, 2009 at 7:19 AM, Edward O'Callaghan > wrote: >> Author: evocallaghan >> Date: Tue Nov 24 09:19:10 2009 >> New Revision: 89765 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=89765&view=rev >> Log: >> Provide Path::isSpecialFile interface for PR5568. >> >> Modified: >> ? ?llvm/trunk/include/llvm/System/Path.h >> ? ?llvm/trunk/lib/System/Unix/Path.inc >> ? 
?llvm/trunk/lib/System/Win32/Path.inc >> >> Modified: llvm/trunk/include/llvm/System/Path.h >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/System/Path.h?rev=89765&r1=89764&r2=89765&view=diff >> >> ============================================================================== >> --- llvm/trunk/include/llvm/System/Path.h (original) >> +++ llvm/trunk/include/llvm/System/Path.h Tue Nov 24 09:19:10 2009 >> @@ -380,6 +380,11 @@ >> ? ? ? /// in the file system. >> ? ? ? bool canWrite() const; >> >> + ? ? ?/// This function checks that what we're trying to work only on a regular file or Dir. >> + ? ? ?/// Check for things like /dev/null, any block special file, >> + ? ? ?/// or other things that aren't "regular" files. >> + ? ? ?bool isSpecialFile() const; >> + >> ? ? ? /// This function determines if the path name references an executable >> ? ? ? /// file in the file system. This function checks for the existence and >> ? ? ? /// executability (by the current program) of the file. > > > I would prefer this be Path::isRegularFile, that corresponds to a well > known Unixism (S_ISREG) and avoids creating new terminology. > Similarly, I don't think it should do anything else -- checking that > the path is a directory is something clients can do. > > ?- Daniel > >> Modified: llvm/trunk/lib/System/Unix/Path.inc >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Unix/Path.inc?rev=89765&r1=89764&r2=89765&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/System/Unix/Path.inc (original) >> +++ llvm/trunk/lib/System/Unix/Path.inc Tue Nov 24 09:19:10 2009 >> @@ -335,7 +335,7 @@ >> ? free(pv); >> ? return (NULL); >> ?} >> -#endif >> +#endif // __FreeBSD__ >> >> ?/// GetMainExecutable - Return the path to the main executable, given the >> ?/// value of argv[0] from program startup. 
>> @@ -454,6 +454,24 @@ >> } >> >> bool >> +Path::isSpecialFile() const { >> + // Get the status so we can determine if its a file or directory >> + struct stat buf; >> + std::string *ErrStr; >> + >> + if (0 != stat(path.c_str(), &buf)) { >> + MakeErrMsg(ErrStr, path + ": can't get status of file"); >> + return true; >> + } >> + >> + if (S_ISDIR(buf.st_mode) || S_ISREG(buf.st_mode)) { >> + return false; >> + } >> + >> + return true; >> +} >> + >> +bool >> Path::canExecute() const { >> if (0 != access(path.c_str(), R_OK | X_OK )) >> return false; >> @@ -723,7 +741,7 @@ >> >> bool >> Path::eraseFromDisk(bool remove_contents, std::string *ErrStr) const { >> - // Get the status so we can determin if its a file or directory >> + // Get the status so we can determine if its a file or directory >> struct stat buf; >> if (0 != stat(path.c_str(), &buf)) { >> MakeErrMsg(ErrStr, path + ": can't get status of file"); >> >> Modified: llvm/trunk/lib/System/Win32/Path.inc >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Win32/Path.inc?rev=89765&r1=89764&r2=89765&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/System/Win32/Path.inc (original) >> +++ llvm/trunk/lib/System/Win32/Path.inc Tue Nov 24 09:19:10 2009 >> @@ -357,6 +357,11 @@ >> return attr != INVALID_FILE_ATTRIBUTES; >> } >> >> +bool >> +Path::isSpecialFile() const { >> + return false; >> +} >> >> std::string >> Path::getLast() const { >>
// Find the last slash >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> > -- -- Edward O'Callaghan http://www.auroraux.org/ eocallaghan at auroraux dot org --- () ascii ribbon campaign - against html e-mail /\ - against microsoft attachments From dpatel at apple.com Tue Nov 24 13:42:17 2009 From: dpatel at apple.com (Devang Patel) Date: Tue, 24 Nov 2009 19:42:17 -0000 Subject: [llvm-commits] [llvm] r89793 - in /llvm/trunk: include/llvm/CodeGen/AsmPrinter.h lib/CodeGen/AsmPrinter/AsmPrinter.cpp lib/CodeGen/AsmPrinter/DIE.h lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/AsmPrinter/DwarfDebug.h Message-ID: <200911241942.nAOJgHfU009130@zion.cs.uiuc.edu> Author: dpatel Date: Tue Nov 24 13:42:17 2009 New Revision: 89793 URL: http://llvm.org/viewvc/llvm-project?rev=89793&view=rev Log: Use StringRef instead of std::string in DIEString. Modified: llvm/trunk/include/llvm/CodeGen/AsmPrinter.h llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h Modified: llvm/trunk/include/llvm/CodeGen/AsmPrinter.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/AsmPrinter.h?rev=89793&r1=89792&r2=89793&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/AsmPrinter.h (original) +++ llvm/trunk/include/llvm/CodeGen/AsmPrinter.h Tue Nov 24 13:42:17 2009 @@ -297,7 +297,7 @@ /// EmitString - Emit a string with quotes and a null terminator. /// Special characters are emitted properly. /// @verbatim (Eg. '\t') @endverbatim - void EmitString(const std::string &String) const; + void EmitString(const StringRef String) const; void EmitString(const char *String, unsigned Size) const; /// EmitFile - Emit a .file directive. 
Modified: llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp?rev=89793&r1=89792&r2=89793&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Tue Nov 24 13:42:17 2009 @@ -728,7 +728,7 @@ /// EmitString - Emit a string with quotes and a null terminator. /// Special characters are emitted properly. /// \literal (Eg. '\t') \endliteral -void AsmPrinter::EmitString(const std::string &String) const { +void AsmPrinter::EmitString(const StringRef String) const { EmitString(String.data(), String.size()); } Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h?rev=89793&r1=89792&r2=89793&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h Tue Nov 24 13:42:17 2009 @@ -277,9 +277,9 @@ /// DIEString - A string value DIE. /// class DIEString : public DIEValue { - const std::string Str; + const StringRef Str; public: - explicit DIEString(const std::string &S) : DIEValue(isString), Str(S) {} + explicit DIEString(const StringRef S) : DIEValue(isString), Str(S) {} /// EmitValue - Emit string value. /// Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=89793&r1=89792&r2=89793&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Tue Nov 24 13:42:17 2009 @@ -333,7 +333,7 @@ /// addString - Add a string attribute data and value. 
/// void DwarfDebug::addString(DIE *Die, unsigned Attribute, unsigned Form, - const std::string &String) { + const StringRef String) { DIEValue *Value = new DIEString(String); DIEValues.push_back(Value); Die->addValue(Attribute, Form, Value); Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h?rev=89793&r1=89792&r2=89793&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h Tue Nov 24 13:42:17 2009 @@ -244,7 +244,7 @@ /// addString - Add a string attribute data and value. /// void addString(DIE *Die, unsigned Attribute, unsigned Form, - const std::string &String); + const StringRef Str); /// addLabel - Add a Dwarf label attribute data and value. /// From evan.cheng at apple.com Tue Nov 24 13:43:43 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Tue, 24 Nov 2009 11:43:43 -0800 Subject: [llvm-commits] [PATCH] More Spill Annotations In-Reply-To: <200911240829.16459.dag@cray.com> References: <200911201622.37994.dag@cray.com> <200911231411.29627.dag@cray.com> <200911240829.16459.dag@cray.com> Message-ID: <688E3BAC-7BCD-472A-9746-DD3F1247F6C4@apple.com> On Nov 24, 2009, at 6:29 AM, David Greene wrote: > On Monday 23 November 2009 21:04, Evan Cheng wrote: > >>> To do isVectorInstr and isVectorOperandInstr will require some additional >>> flags in the .td files, I think. Is that ok? I don't want to to a whole >>> bunch of work to find out later that there's a better way. >> >> What kind of flags? These are fairly target specific information so I don't >> think we want to add anything target independent. Can you enhance asm >> printer so targets can inject target specific comments? > > I was thinking of doing this by having TableGen infer isVector flags on > instructions and operands from the type. 
If an operand has a vector type, > set the isVector flag on the operand. If an instruction has any operands > with isVector set, set isVector on the instruction. The user can then > override these assumptions in the .td file by setting "let scalar=1" or > "let vector=1." That's reasonable. But I think we should let tablegen infer a lot more information. For every instruction with matching pattern it should save the SDNode opcode. So for ADD64*, ADDPS*, etc. we know they are all ISD::ADD instructions. Similarly we can save the ValueType information and infer properties such as isVector. Such information will let codegen do all kinds of interesting optimizations and give us some truly rich asm comments. > > I'm not sure how target-specific comments would work. Do you have an > example in mind? I want to avoid having to mark every vector instruction as > the information can be inferred for 90% of the cases. I am not sure either. It seems to me you are adding asm comments that many targets might not care about. Perhaps tablegen can infer some information and attach it as asm comment string to each TargetInstrDesc entry? Evan > > -Dave From dag at cray.com Tue Nov 24 13:48:43 2009 From: dag at cray.com (David Greene) Date: Tue, 24 Nov 2009 13:48:43 -0600 Subject: [llvm-commits] [PATCH] More Spill Annotations In-Reply-To: <688E3BAC-7BCD-472A-9746-DD3F1247F6C4@apple.com> References: <200911201622.37994.dag@cray.com> <200911240829.16459.dag@cray.com> <688E3BAC-7BCD-472A-9746-DD3F1247F6C4@apple.com> Message-ID: <200911241348.43974.dag@cray.com> On Tuesday 24 November 2009 13:43, Evan Cheng wrote: > > I was thinking of doing this by having TableGen infer isVector flags on > > instructions and operands from the type. If an operand has a vector > > type, set the isVector flag on the operand. If an instruction has any > > operands with isVector set, set isVector on the instruction. 
The user > > can then override these assumptions in the .td file by setting "let > > scalar=1" or "let vector=1." > > That's reasonable. But I think we should let tablegen infer a lot more > information. For every instruction with matching pattern it should save the > SDNode opcode. So for ADD64*, ADDPS*, etc. we know they are all ISD::ADD > instructions. Similarly we can save the ValueType information and infer > properties such as isVector. > > Such information will let codegen do all kinds of interesting optimizations > and give us some truly rich asm comments. Yes, that would be cool. I have wanted type information at the MachineInstr level for some time now. Do you imagine extending the Instruction and Operand classes in Target.td and then having TableGen fill in those bits? > > I'm not sure how target-specific comments would work. Do you have an > > example in mind? I want to avoid having to mark every vector instruction > > as the information can be inferred for 90% of the cases. > > I am not sure either. It seems to me you are adding asm comments that many > targets might not care about. Perhaps tablegen can infer some information > and attach it as asm comment string to each TargetInstrDesc entry? That could be useful. Something as generic as "Vector Spill" or "Scalar Spill" isn't terribly target-specific but I could certainly imagine a time where we'll want very target-specific asm comments. 
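The inference rule quoted at the top of this message could be sketched roughly as follows. This is only an illustration under stated assumptions: the `OperandInfo`, `InstrInfo` and `Override` types and the `inferIsVector` function are hypothetical stand-ins, not existing TableGen or TargetInstrDesc structures.

```cpp
#include <vector>

// Hypothetical per-instruction override, mirroring the proposed
// "let scalar=1" / "let vector=1" settings in a .td file.
enum class Override { None, ForceScalar, ForceVector };

// Hypothetical operand record: only whether its type is a vector type.
struct OperandInfo {
  bool HasVectorType;
};

// Hypothetical instruction record.
struct InstrInfo {
  std::vector<OperandInfo> Operands;
  Override UserOverride = Override::None;
};

// An explicit override wins; otherwise the instruction is treated as a
// vector instruction if any of its operands has a vector type.
bool inferIsVector(const InstrInfo &I) {
  if (I.UserOverride == Override::ForceScalar)
    return false;
  if (I.UserOverride == Override::ForceVector)
    return true;
  for (const OperandInfo &Op : I.Operands)
    if (Op.HasVectorType)
      return true;
  return false;
}
```

With this shape, an instruction that reads a vector register but produces a scalar result would be marked with the scalar override rather than relying on the operand-based default.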
-Dave From evan.cheng at apple.com Tue Nov 24 13:59:24 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Tue, 24 Nov 2009 11:59:24 -0800 Subject: [llvm-commits] [PATCH] More Spill Annotations In-Reply-To: <200911241348.43974.dag@cray.com> References: <200911201622.37994.dag@cray.com> <200911240829.16459.dag@cray.com> <688E3BAC-7BCD-472A-9746-DD3F1247F6C4@apple.com> <200911241348.43974.dag@cray.com> Message-ID: <5120DADD-37E1-4296-B4CF-59AE9316EDBB@apple.com> On Nov 24, 2009, at 11:48 AM, David Greene wrote: > On Tuesday 24 November 2009 13:43, Evan Cheng wrote: > >>> I was thinking of doing this by having TableGen infer isVector flags on >>> instructions and operands from the type. If an operand has a vector >>> type, set the isVector flag on the operand. If an instruction has any >>> operands with isVector set, set isVector on the instruction. The user >>> can then override these assumptions in the .td file by setting "let >>> scalar=1" or "let vector=1." >> >> That's reasonable. But I think we should let tablegen infer a lot more >> information. For every instruction with matching pattern it should save the >> SDNode opcode. So for ADD64*, ADDPS*, etc. we know they are all ISD::ADD >> instructions. Similarly we can save the ValueType information and infer >> properties such as isVector. >> >> Such information will let codegen do all kinds of interesting optimizations >> and give us some truly rich asm comments. > > Yes, that would be cool. I have wanted type information at the MachineInstr > level for some time now. Do you imagine extending the Instruction and Operand > classes in Target.td and then having TableGen fill in those bits? Only on instruction level, not at the operand level. Also note the information is imprecise. Now that I think about it, I am not sure adding ValueType to TargetInstrDesc makes sense since it would be a many (types) to one mapping. But perhaps properties like bitwidth, isVector make sense. 
I don't think we need to add anything to Target.td. We just need to enhance TargetInstrDesc and have InstrInfoEmitter fill in the information. Evan > >>> I'm not sure how target-specific comments would work. Do you have an >>> example in mind? I want to avoid having to mark every vector instruction >>> as the information can be inferred for 90% of the cases. >> >> I am not sure either. It seems to me you are adding asm comments that many >> targets might not care about. Perhaps tablegen can infer some information >> and attach it as asm comment string to each TargetInstrDesc entry? > > That could be useful. Something as generic as "Vector Spill" or "Scalar > Spill" isn't terribly target-specific but I could certainly imagine a time > where we'll want very target-specific asm comments. > > -Dave From dag at cray.com Tue Nov 24 14:03:14 2009 From: dag at cray.com (David Greene) Date: Tue, 24 Nov 2009 14:03:14 -0600 Subject: [llvm-commits] [PATCH] More Spill Annotations In-Reply-To: <5120DADD-37E1-4296-B4CF-59AE9316EDBB@apple.com> References: <200911201622.37994.dag@cray.com> <200911241348.43974.dag@cray.com> <5120DADD-37E1-4296-B4CF-59AE9316EDBB@apple.com> Message-ID: <200911241403.15274.dag@cray.com> On Tuesday 24 November 2009 13:59, Evan Cheng wrote: > > Yes, that would be cool. I have wanted type information at the > > MachineInstr level for some time now. Do you imagine extending the > > Instruction and Operand classes in Target.td and then having TableGen > > fill in those bits? > > Only on instruction level, not at the operand level. Also note the > information is imprecise. Now that I think about it, I am not sure adding > ValueType to TargetInstrDesc makes sense since it would be a many (types) > to one mapping. But perhaps properties like bitwidth, isVector make sense. > > I don't think we need to add anything to Target.td. We just need to enhance > TargetInstrDesc and have InstrInfoEmitter fill in the information. 
I think we need to add a flag to Target.td to override TableGen's inference of "isVector." For example: let isVector = 0 in def Int_CVTSS2SIrr : SSI<0x2D, MRMSrcReg, (outs GR32:$dst), (ins VR128:$src), "cvtss2si\t{$src, $dst|$dst, $src}", [(set GR32:$dst, (int_x86_sse_cvtss2si VR128$src))]>; -Dave From daniel at zuster.org Tue Nov 24 14:44:38 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 24 Nov 2009 20:44:38 -0000 Subject: [llvm-commits] [zorg] r89797 - in /zorg/trunk: test/ test/buildbot/ test/buildbot/builders/ test/buildbot/builders/Import.py test/lit.cfg zorg/buildbot/builders/LLVMGCCBuilder.py Message-ID: <200911242044.nAOKictl011486@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 14:44:38 2009 New Revision: 89797 URL: http://llvm.org/viewvc/llvm-project?rev=89797&view=rev Log: Fix a refacto in the LLVMGCCBuilder, and start a test suite to prevent future embarrassments. - I heart lit. Added: zorg/trunk/test/ zorg/trunk/test/buildbot/ zorg/trunk/test/buildbot/builders/ zorg/trunk/test/buildbot/builders/Import.py zorg/trunk/test/lit.cfg Modified: zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Added: zorg/trunk/test/buildbot/builders/Import.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/test/buildbot/builders/Import.py?rev=89797&view=auto ============================================================================== --- zorg/trunk/test/buildbot/builders/Import.py (added) +++ zorg/trunk/test/buildbot/builders/Import.py Tue Nov 24 14:44:38 2009 @@ -0,0 +1,5 @@ +# RUN: python %s + +import zorg +from zorg.buildbot.builders import ClangBuilder, LLVMBuilder, LLVMGCCBuilder + Added: zorg/trunk/test/lit.cfg URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/test/lit.cfg?rev=89797&view=auto ============================================================================== --- zorg/trunk/test/lit.cfg (added) +++ zorg/trunk/test/lit.cfg Tue Nov 24 14:44:38 2009 @@ -0,0 +1,28 @@ +# -*- Python -*- + +import os +import platform + +# Configuration file 
for the 'lit' test runner. + +# name: The name of this test suite. +config.name = 'Zorg' + +# testFormat: The test format to use to interpret tests. +# +# For now we require '&&' between commands, until they get globally killed and +# the test runner updated. +execute_external = platform.system() != 'Windows' +config.test_format = lit.formats.ShTest(execute_external) + +# suffixes: A list of file extensions to treat as test files. +config.suffixes = ['.py'] + +# test_source_root: The root path where tests are located. +config.test_source_root = os.path.dirname(__file__) +config.test_exec_root = config.test_source_root + +config.target_triple = None + +config.environment['PYTHONPATH'] = os.path.join(config.test_source_root, '..') + Modified: zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py?rev=89797&r1=89796&r2=89797&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Tue Nov 24 14:44:38 2009 @@ -261,7 +261,6 @@ f.addStep(WarningCountingShellCommand(name="install.llvm-gcc.stage2", command = ['nice', '-n', '10', make, 'install'], - command="nice -n 10 make", haltOnFailure=True, description=["install", "llvm-gcc", From daniel at zuster.org Tue Nov 24 14:52:30 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 24 Nov 2009 20:52:30 -0000 Subject: [llvm-commits] [zorg] r89798 - in /zorg/trunk: test/buildbot/builders/Import.py zorg/buildbot/builders/ClangBuilder.py Message-ID: <200911242052.nAOKqUrQ011748@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 14:52:30 2009 New Revision: 89798 URL: http://llvm.org/viewvc/llvm-project?rev=89798&view=rev Log: Improve builder test to actually instantiate the factories, and fix the equally embarrasing bug it found. 
Modified: zorg/trunk/test/buildbot/builders/Import.py zorg/trunk/zorg/buildbot/builders/ClangBuilder.py Modified: zorg/trunk/test/buildbot/builders/Import.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/test/buildbot/builders/Import.py?rev=89798&r1=89797&r2=89798&view=diff ============================================================================== --- zorg/trunk/test/buildbot/builders/Import.py (original) +++ zorg/trunk/test/buildbot/builders/Import.py Tue Nov 24 14:52:30 2009 @@ -3,3 +3,10 @@ import zorg from zorg.buildbot.builders import ClangBuilder, LLVMBuilder, LLVMGCCBuilder +# Just check that we can instaniate the build factors, what else can we do? + +print ClangBuilder.getClangBuildFactory() + +print LLVMBuilder.getLLVMBuildFactory() + +print LLVMGCCBuilder.getLLVMGCCBuildFactory() Modified: zorg/trunk/zorg/buildbot/builders/ClangBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/ClangBuilder.py?rev=89798&r1=89797&r2=89798&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/ClangBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/ClangBuilder.py Tue Nov 24 14:52:30 2009 @@ -13,7 +13,7 @@ def getClangBuildFactory(triple=None, clean=True, test=True, expensive_checks=False, run_cxx_tests=False, valgrind=False, - make='make'): + make='make', jobs="%(jobs)s"): f = buildbot.process.factory.BuildFactory() # Determine the build directory. From baldrick at free.fr Tue Nov 24 14:55:17 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 24 Nov 2009 21:55:17 +0100 Subject: [llvm-commits] [zorg] r89798 - in /zorg/trunk: test/buildbot/builders/Import.py zorg/buildbot/builders/ClangBuilder.py In-Reply-To: <200911242052.nAOKqUrQ011748@zion.cs.uiuc.edu> References: <200911242052.nAOKqUrQ011748@zion.cs.uiuc.edu> Message-ID: <4B0C4835.2090800@free.fr> Hi Daniel, > +# Just check that we can instaniate the build factors, what else can we do? 
instaniate -> instantiate Ciao, Duncan. From dpatel at apple.com Tue Nov 24 15:38:55 2009 From: dpatel at apple.com (Devang Patel) Date: Tue, 24 Nov 2009 21:38:55 -0000 Subject: [llvm-commits] [llvm] r89803 - /llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp Message-ID: <200911242138.nAOLctAB013385@zion.cs.uiuc.edu> Author: dpatel Date: Tue Nov 24 15:38:54 2009 New Revision: 89803 URL: http://llvm.org/viewvc/llvm-project?rev=89803&view=rev Log: Enable debug info for ppc-darwin. Modified: llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp Modified: llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp?rev=89803&r1=89802&r2=89803&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp Tue Nov 24 15:38:54 2009 @@ -22,6 +22,9 @@ if (!is64Bit) Data64bitsDirective = 0; // We can't emit a 64-bit unit in PPC32 mode. AssemblerDialect = 1; // New-Style mnemonics. + + // Debug Information + SupportsDebugInformation = true; } PPCLinuxMCAsmInfo::PPCLinuxMCAsmInfo(bool is64Bit) { From baldrick at free.fr Tue Nov 24 15:45:53 2009 From: baldrick at free.fr (Duncan Sands) Date: Tue, 24 Nov 2009 22:45:53 +0100 Subject: [llvm-commits] [llvm] r89421 - /llvm/trunk/lib/Analysis/CaptureTracking.cpp In-Reply-To: <006872F5-2CC8-400F-B995-22D5B52886BA@apple.com> References: <200911200050.nAK0or7J026222@zion.cs.uiuc.edu> <4B068327.1070103@free.fr> <784C47FB-FB86-404E-B39F-8EAF9C4A98E0@apple.com> <4B080CEF.5060301@free.fr> <6A5CF08E-94EF-4893-81CC-5AEF0A19EB25@apple.com> <4B0A9A8C.9040905@free.fr> <006872F5-2CC8-400F-B995-22D5B52886BA@apple.com> Message-ID: <4B0C5411.5090800@free.fr> Hi Chris, >> I'm not against loosening the definition of nocapture as long as it is clearly >> defined what nocapture means. 
I think the obvious thing to do is to say that >> if the value of the pointer is reconstructed *entirely by control flow* then it >> is not considered to be captured. > > That works for me, though I'm not sure exactly what it means. the basic idea is that you recursively visit all uses of the original value, and if none of them is returned and none stored somewhere, then you decree that the value was not captured. This is wrong in the sense that there are a bunch of crazy examples for which this definition gives "not captured" but in fact the original pointer value managed to escape. All of these crazy examples (I gave one in my previous email) are based on "reconstruction via control flow", though perhaps "correlated expressions" is a better name: you inspect the original value and branch based on the values seen, for example you inspect each bit and branch depending on whether it is 0 or 1; in the branched to basic blocks you build up a new value, unconnected to the original by any def-use chain, for example by or'ing in 0 or 1 depending which block was branched to. In this way you can reconstruct the original value in some other register without having any connection via uses between the registers. Anyway, I whipped up a quick implementation of this, see attached patch. It is a bit more complicated than what I said above because loading the pointer and returning the loaded value is not considered to capture it. Also, to please Dan, it allows stores to alloca's and tracks what happens to the alloca. It is much more liberal than what we had before, however it does do less well in two cases: previously if the pointer was a phi node operand or the operand of a select, then we would not consider returning a load of the phi/select pointer to be a capture, but now we do. That's because it's not immediately clear to me how to handle these cases better in this context. The results are disappointing. 
Running on MultiSource results in a 0.7% increase in the number of function parameters considered nocapture (18523 compared to 18393). For many programs the number of nocapture parameters went down, presumably due to the select/phi issue I mentioned above. Ciao, Duncan. -------------- next part -------------- A non-text attachment was scrubbed... Name: capture.diff Type: text/x-patch Size: 8693 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091124/53b2a372/attachment.bin From rjmccall at apple.com Tue Nov 24 16:20:43 2009 From: rjmccall at apple.com (John McCall) Date: Tue, 24 Nov 2009 14:20:43 -0800 Subject: [llvm-commits] [llvm] r89421 - /llvm/trunk/lib/Analysis/CaptureTracking.cpp In-Reply-To: <4B0C5411.5090800@free.fr> References: <200911200050.nAK0or7J026222@zion.cs.uiuc.edu> <4B068327.1070103@free.fr> <784C47FB-FB86-404E-B39F-8EAF9C4A98E0@apple.com> <4B080CEF.5060301@free.fr> <6A5CF08E-94EF-4893-81CC-5AEF0A19EB25@apple.com> <4B0A9A8C.9040905@free.fr> <006872F5-2CC8-400F-B995-22D5B52886BA@apple.com> <4B0C5411.5090800@free.fr> Message-ID: <4B0C5C3B.5010903@apple.com> Duncan Sands wrote: > Anyway, I whipped up a quick implementation of this, see attached > patch. It > is a bit more complicated than what I said above because loading the > pointer > and returning the loaded value is not considered to capture it. Also, > to please > Dan, it allows stores to alloca's and tracks what happens to the > alloca. It is > much more liberal than what we had before, however it does do less > well in two > cases: previously if the pointer was a phi node operand or the operand > of a > select, then we would not consider returning a load of the phi/select > pointer to > be a capture, but now we do. That's because it's not immediately > clear to me > how to handle these cases better in this context. If we're not going to consider "correlated" captures, shouldn't the model just be taint-checking?
The original parameter is tainted, GEPs are tainted, phis to which it's an input are tainted, possibly some other things. The parameter is nocapture only if there's no "capturing" use of a tainted value. That's about as good as you're going to get without a lot of (expensive) sophistication. John. From ofv at wanadoo.es Tue Nov 24 16:32:49 2009 From: ofv at wanadoo.es (Óscar Fuentes) Date: Tue, 24 Nov 2009 23:32:49 +0100 Subject: [llvm-commits] [llvm] r89765 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc References: <200911241519.nAOFJBHs032617@zion.cs.uiuc.edu> <4B0C070F.2020907@free.fr> <521640720911240834x6fe24e50ma1ced5fb6d1f279e@mail.gmail.com> Message-ID: <87ws1fft9q.fsf@telefonica.net> "Edward O'Callaghan" writes: [snip] >> Windows does have special files AFAIK, for example opening "nul" >> gives the effect of /dev/null on unix systems. > > OK, then I don't know the details of this nor do I have a windows > machine to expand on this hook. > Are you able to provide some more detail, is this bug a problem on > windows as well? On Windows you can't delete NUL or any other device file name, even when you are the Administrator user. A different issue is whether clang barfs when it tries to delete the output file and the operation fails. There are lots of special file names on Windows: NUL, COM1, COM2..., LPT1, LPT2..., PRN, CON, AUX etc. A file name that contains one of those names as the basename may be considered special too (NUL.cpp, for instance). There is a GetFileType Win32 API, but it takes a file handle, not a file name, and obtaining the file handle in a correct way is tricky. For instance, if the file can't be opened, maybe it is because it does not exist, but another possibility is that it may be blocked or is a device in use.
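A name-based check along the lines described above could look roughly like the sketch below. It is not the Win32 Path.inc implementation and does not touch the file system; the helper name `isReservedWindowsDeviceName` is hypothetical, and limiting COM and LPT to the digits 1 through 9 is an assumption that goes beyond the names listed above.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not part of llvm::sys::Path: returns true when the
// last path component of Name, ignoring any extension, is one of the
// reserved device names (NUL, CON, PRN, AUX, COMn, LPTn).
bool isReservedWindowsDeviceName(const std::string &Name) {
  // Keep only the last path component; accept both separators.
  std::string::size_type Slash = Name.find_last_of("/\\");
  std::string Base =
      (Slash == std::string::npos) ? Name : Name.substr(Slash + 1);

  // Drop everything from the first '.', so "NUL.cpp" becomes "NUL".
  Base = Base.substr(0, Base.find('.'));

  // Device names are compared case-insensitively.
  std::transform(Base.begin(), Base.end(), Base.begin(), [](unsigned char C) {
    return static_cast<char>(std::toupper(C));
  });

  if (Base == "NUL" || Base == "CON" || Base == "PRN" || Base == "AUX")
    return true;

  // Assumption: COM1..COM9 and LPT1..LPT9 are the numbered device names.
  if (Base.size() == 4 &&
      (Base.compare(0, 3, "COM") == 0 || Base.compare(0, 3, "LPT") == 0))
    return Base[3] >= '1' && Base[3] <= '9';

  return false;
}
```

A purely lexical test like this sidesteps the problem of obtaining a file handle for GetFileType, at the cost of not detecting devices that are reachable under other names.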
-- Óscar From dalej at apple.com Tue Nov 24 16:59:02 2009 From: dalej at apple.com (Dale Johannesen) Date: Tue, 24 Nov 2009 22:59:02 -0000 Subject: [llvm-commits] [llvm] r89811 - in /llvm/trunk: lib/Target/PowerPC/PPCFrameInfo.h lib/Target/PowerPC/PPCRegisterInfo.cpp test/CodeGen/PowerPC/Frames-alloca.ll test/CodeGen/PowerPC/Frames-large.ll test/CodeGen/PowerPC/Frames-small.ll test/CodeGen/PowerPC/ppc-prologue.ll Message-ID: <200911242259.nAOMx2c2016262@zion.cs.uiuc.edu> Author: johannes Date: Tue Nov 24 16:59:02 2009 New Revision: 89811 URL: http://llvm.org/viewvc/llvm-project?rev=89811&view=rev Log: Do not store R31 into the caller's link area on PPC. This violates the ABI (that area is "reserved"), and while it is safe if all code is generated with current compilers, there is some very old code around that uses that slot for something else, and breaks if it is stored into. Adjust testcases looking for current behavior. I've verified that the stack frame size is right in all testcases, whether it changed or not. 7311323. Modified: llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp llvm/trunk/test/CodeGen/PowerPC/Frames-alloca.ll llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll llvm/trunk/test/CodeGen/PowerPC/Frames-small.ll llvm/trunk/test/CodeGen/PowerPC/ppc-prologue.ll Modified: llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h?rev=89811&r1=89810&r2=89811&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h (original) +++ llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h Tue Nov 24 16:59:02 2009 @@ -42,11 +42,12 @@ /// frame pointer. static unsigned getFramePointerSaveOffset(bool isPPC64, bool isDarwinABI) { // For the Darwin ABI: - // Use the TOC save slot in the PowerPC linkage area for saving the frame - // pointer (if needed.)
LLVM does not generate code that uses the TOC (R2 - // is treated as a caller saved register.) + // We cannot use the TOC save slot (offset +20) in the PowerPC linkage area + // for saving the frame pointer (if needed.) While the published ABI has + // not used this slot since at least MacOSX 10.2, there is older code + // around that does use it, and that needs to continue to work. if (isDarwinABI) - return isPPC64 ? 40 : 20; + return isPPC64 ? -8U : -4U; // SVR4 ABI: First slot in the general register save area. return -4U; @@ -90,6 +91,17 @@ // With the SVR4 ABI, callee-saved registers have fixed offsets on the stack. const SpillSlot * getCalleeSavedSpillSlots(unsigned &NumEntries) const { + if (TM.getSubtarget().isDarwinABI()) { + NumEntries = 1; + if (TM.getSubtarget().isPPC64()) { + static const SpillSlot darwin64Offsets[] = {PPC::X31, -8}; + return darwin64Offsets; + } else { + static const SpillSlot darwinOffsets[] = {PPC::R31, -4}; + return darwinOffsets; + } + } + // Early exit if not using the SVR4 ABI. if (!TM.getSubtarget().isSVR4ABI()) { NumEntries = 0; Modified: llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp?rev=89811&r1=89810&r2=89811&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp Tue Nov 24 16:59:02 2009 @@ -1033,12 +1033,11 @@ // Save R31 if necessary int FPSI = FI->getFramePointerSaveIndex(); bool isPPC64 = Subtarget.isPPC64(); - bool isSVR4ABI = Subtarget.isSVR4ABI(); bool isDarwinABI = Subtarget.isDarwinABI(); MachineFrameInfo *MFI = MF.getFrameInfo(); // If the frame pointer save index hasn't been defined yet. - if (!FPSI && needsFP(MF) && isSVR4ABI) { + if (!FPSI && needsFP(MF)) { // Find out what the fix offset of the frame pointer save area. 
int FPOffset = PPCFrameInfo::getFramePointerSaveOffset(isPPC64, isDarwinABI); Modified: llvm/trunk/test/CodeGen/PowerPC/Frames-alloca.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/Frames-alloca.ll?rev=89811&r1=89810&r2=89811&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/PowerPC/Frames-alloca.ll (original) +++ llvm/trunk/test/CodeGen/PowerPC/Frames-alloca.ll Tue Nov 24 16:59:02 2009 @@ -6,23 +6,23 @@ ; RUN: llc < %s -march=ppc32 -mtriple=powerpc-apple-darwin8 -enable-ppc32-regscavenger | FileCheck %s -check-prefix=PPC32-RS ; RUN: llc < %s -march=ppc32 -mtriple=powerpc-apple-darwin8 -disable-fp-elim -enable-ppc32-regscavenger | FileCheck %s -check-prefix=PPC32-RS-NOFP -; CHECK-PPC32: stw r31, 20(r1) +; CHECK-PPC32: stw r31, -4(r1) ; CHECK-PPC32: lwz r1, 0(r1) -; CHECK-PPC32: lwz r31, 20(r1) -; CHECK-PPC32-NOFP: stw r31, 20(r1) +; CHECK-PPC32: lwz r31, -4(r1) +; CHECK-PPC32-NOFP: stw r31, -4(r1) ; CHECK-PPC32-NOFP: lwz r1, 0(r1) -; CHECK-PPC32-NOFP: lwz r31, 20(r1) +; CHECK-PPC32-NOFP: lwz r31, -4(r1) ; CHECK-PPC32-RS: stwu r1, -80(r1) ; CHECK-PPC32-RS-NOFP: stwu r1, -80(r1) -; CHECK-PPC64: std r31, 40(r1) -; CHECK-PPC64: stdu r1, -112(r1) +; CHECK-PPC64: std r31, -8(r1) +; CHECK-PPC64: stdu r1, -128(r1) ; CHECK-PPC64: ld r1, 0(r1) -; CHECK-PPC64: ld r31, 40(r1) -; CHECK-PPC64-NOFP: std r31, 40(r1) -; CHECK-PPC64-NOFP: stdu r1, -112(r1) +; CHECK-PPC64: ld r31, -8(r1) +; CHECK-PPC64-NOFP: std r31, -8(r1) +; CHECK-PPC64-NOFP: stdu r1, -128(r1) ; CHECK-PPC64-NOFP: ld r1, 0(r1) -; CHECK-PPC64-NOFP: ld r31, 40(r1) +; CHECK-PPC64-NOFP: ld r31, -8(r1) define i32* @f1(i32 %n) { %tmp = alloca i32, i32 %n ; [#uses=1] Modified: llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll?rev=89811&r1=89810&r2=89811&view=diff ============================================================================== 
--- llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll (original) +++ llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll Tue Nov 24 16:59:02 2009 @@ -22,13 +22,13 @@ ; PPC32-NOFP: blr ; PPC32-FP: _f1: -; PPC32-FP: stw r31, 20(r1) +; PPC32-FP: stw r31, -4(r1) ; PPC32-FP: lis r0, -1 ; PPC32-FP: ori r0, r0, 32704 ; PPC32-FP: stwux r1, r1, r0 ; ... ; PPC32-FP: lwz r1, 0(r1) -; PPC32-FP: lwz r31, 20(r1) +; PPC32-FP: lwz r31, -4(r1) ; PPC32-FP: blr @@ -42,11 +42,11 @@ ; PPC64-FP: _f1: -; PPC64-FP: std r31, 40(r1) +; PPC64-FP: std r31, -8(r1) ; PPC64-FP: lis r0, -1 -; PPC64-FP: ori r0, r0, 32656 +; PPC64-FP: ori r0, r0, 32640 ; PPC64-FP: stdux r1, r1, r0 ; ... ; PPC64-FP: ld r1, 0(r1) -; PPC64-FP: ld r31, 40(r1) +; PPC64-FP: ld r31, -8(r1) ; PPC64-FP: blr Modified: llvm/trunk/test/CodeGen/PowerPC/Frames-small.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/Frames-small.ll?rev=89811&r1=89810&r2=89811&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/PowerPC/Frames-small.ll (original) +++ llvm/trunk/test/CodeGen/PowerPC/Frames-small.ll Tue Nov 24 16:59:02 2009 @@ -1,26 +1,26 @@ ; RUN: llc < %s -march=ppc32 -mtriple=powerpc-apple-darwin8 -o %t1 -; RUN not grep {stw r31, 20(r1)} %t1 +; RUN not grep {stw r31, -4(r1)} %t1 ; RUN: grep {stwu r1, -16448(r1)} %t1 ; RUN: grep {addi r1, r1, 16448} %t1 ; RUN: llc < %s -march=ppc32 | \ -; RUN: not grep {lwz r31, 20(r1)} +; RUN: not grep {lwz r31, -4(r1)} ; RUN: llc < %s -march=ppc32 -mtriple=powerpc-apple-darwin8 -disable-fp-elim \ ; RUN: -o %t2 -; RUN: grep {stw r31, 20(r1)} %t2 +; RUN: grep {stw r31, -4(r1)} %t2 ; RUN: grep {stwu r1, -16448(r1)} %t2 ; RUN: grep {addi r1, r1, 16448} %t2 -; RUN: grep {lwz r31, 20(r1)} %t2 +; RUN: grep {lwz r31, -4(r1)} %t2 ; RUN: llc < %s -march=ppc64 -mtriple=powerpc-apple-darwin8 -o %t3 -; RUN: not grep {std r31, 40(r1)} %t3 +; RUN: not grep {std r31, -8(r1)} %t3 ; RUN: grep {stdu r1, -16496(r1)} %t3 ; RUN: grep 
{addi r1, r1, 16496} %t3 -; RUN: not grep {ld r31, 40(r1)} %t3 +; RUN: not grep {ld r31, -8(r1)} %t3 ; RUN: llc < %s -march=ppc64 -mtriple=powerpc-apple-darwin8 -disable-fp-elim \ ; RUN: -o %t4 -; RUN: grep {std r31, 40(r1)} %t4 -; RUN: grep {stdu r1, -16496(r1)} %t4 -; RUN: grep {addi r1, r1, 16496} %t4 -; RUN: grep {ld r31, 40(r1)} %t4 +; RUN: grep {std r31, -8(r1)} %t4 +; RUN: grep {stdu r1, -16512(r1)} %t4 +; RUN: grep {addi r1, r1, 16512} %t4 +; RUN: grep {ld r31, -8(r1)} %t4 define i32* @f1() { %tmp = alloca i32, i32 4095 ; [#uses=1] Modified: llvm/trunk/test/CodeGen/PowerPC/ppc-prologue.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/ppc-prologue.ll?rev=89811&r1=89810&r2=89811&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/PowerPC/ppc-prologue.ll (original) +++ llvm/trunk/test/CodeGen/PowerPC/ppc-prologue.ll Tue Nov 24 16:59:02 2009 @@ -2,7 +2,7 @@ define i32 @_Z4funci(i32 %a) ssp { ; CHECK: mflr r0 -; CHECK-NEXT: stw r31, 20(r1) +; CHECK-NEXT: stw r31, -4(r1) ; CHECK-NEXT: stw r0, 8(r1) ; CHECK-NEXT: stwu r1, -80(r1) ; CHECK-NEXT: Llabel1: From daniel at zuster.org Tue Nov 24 17:12:53 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 24 Nov 2009 15:12:53 -0800 Subject: [llvm-commits] [zorg] r89798 - in /zorg/trunk: test/buildbot/builders/Import.py zorg/buildbot/builders/ClangBuilder.py In-Reply-To: <4B0C4835.2090800@free.fr> References: <200911242052.nAOKqUrQ011748@zion.cs.uiuc.edu> <4B0C4835.2090800@free.fr> Message-ID: <6a8523d60911241512k66a838c5sbbd9b131e7f80c12@mail.gmail.com> It's an embarrassing day, I guess. :) - Daniel On Tue, Nov 24, 2009 at 12:55 PM, Duncan Sands wrote: > Hi Daniel, > >> +# Just check that we can instaniate the build factors, what else can we >> do? > > instaniate -> instantiate > > Ciao, > > Duncan. 
> From daniel at zuster.org Tue Nov 24 17:13:17 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 24 Nov 2009 23:13:17 -0000 Subject: [llvm-commits] [zorg] r89812 - /zorg/trunk/test/buildbot/builders/Import.py Message-ID: <200911242313.nAONDIAU016771@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 17:13:17 2009 New Revision: 89812 URL: http://llvm.org/viewvc/llvm-project?rev=89812&view=rev Log: Fix typo. Modified: zorg/trunk/test/buildbot/builders/Import.py Modified: zorg/trunk/test/buildbot/builders/Import.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/test/buildbot/builders/Import.py?rev=89812&r1=89811&r2=89812&view=diff ============================================================================== --- zorg/trunk/test/buildbot/builders/Import.py (original) +++ zorg/trunk/test/buildbot/builders/Import.py Tue Nov 24 17:13:17 2009 @@ -3,7 +3,7 @@ import zorg from zorg.buildbot.builders import ClangBuilder, LLVMBuilder, LLVMGCCBuilder -# Just check that we can instaniate the build factors, what else can we do? +# Just check that we can instantiate the build factors, what else can we do? print ClangBuilder.getClangBuildFactory() From bob.wilson at apple.com Tue Nov 24 17:35:49 2009 From: bob.wilson at apple.com (Bob Wilson) Date: Tue, 24 Nov 2009 23:35:49 -0000 Subject: [llvm-commits] [llvm] r89814 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/BranchFolding.cpp lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h Message-ID: <200911242335.nAONZntR017555@zion.cs.uiuc.edu> Author: bwilson Date: Tue Nov 24 17:35:49 2009 New Revision: 89814 URL: http://llvm.org/viewvc/llvm-project?rev=89814&view=rev Log: Refactor target hook for tail duplication as requested by Chris. Make tail duplication of indirect branches much more aggressive (for targets that indicate that it is profitable), based on further experience with this transformation. 
I compiled 3 large applications with and without this more aggressive tail duplication and measured minimal changes in code size. ("size" on Darwin seems to round the text size up to the nearest page boundary, so I can only say that any code size increase was less than one 4k page.) Radar 7421267. Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h llvm/trunk/lib/CodeGen/BranchFolding.cpp llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=89814&r1=89813&r2=89814&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original) +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Tue Nov 24 17:35:49 2009 @@ -544,12 +544,9 @@ virtual unsigned getInlineAsmLength(const char *Str, const MCAsmInfo &MAI) const; - /// TailDuplicationLimit - Returns the limit on the number of instructions - /// in basic block MBB beyond which it will not be tail-duplicated. - virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB, - unsigned DefaultLimit) const { - return DefaultLimit; - } + /// isProfitableToDuplicateIndirectBranch - Returns true if tail duplication + /// is especially profitable for indirect branches. 
+ virtual bool isProfitableToDuplicateIndirectBranch() const { return false; } }; /// TargetInstrInfoImpl - This is the default implementation of Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.cpp?rev=89814&r1=89813&r2=89814&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original) +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Tue Nov 24 17:35:49 2009 @@ -1043,9 +1043,18 @@ // of one less than the tail-merge threshold. When optimizing for size, // duplicate only one, because one branch instruction can be eliminated to // compensate for the duplication. - unsigned MaxDuplicateCount = - MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize) ? - 1 : TII->TailDuplicationLimit(*TailBB, TailMergeSize - 1); + unsigned MaxDuplicateCount; + if (MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize)) + MaxDuplicateCount = 1; + else if (TII->isProfitableToDuplicateIndirectBranch() && + !TailBB->empty() && TailBB->back().getDesc().isIndirectBranch()) + // If the target has hardware branch prediction that can handle indirect + // branches, duplicating them can often make them predictable when there + // are common paths through the code. The limit needs to be high enough + // to allow undoing the effects of tail merging. + MaxDuplicateCount = 20; + else + MaxDuplicateCount = TailMergeSize - 1; // Check the instructions in the block to determine whether tail-duplication // is invalid or unlikely to be profitable. 
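The limit selection in the BranchFolding.cpp hunk above amounts to a three-way choice. A minimal standalone sketch of that choice follows, assuming plain parameters in place of the real function-attribute, target-hook, and last-instruction queries; the function name and its arguments are hypothetical and are not LLVM API:

```cpp
// Hypothetical standalone model of the limit selection in the hunk above.
// The booleans stand in for the queries the real code makes on the function
// attributes, the target hook, and the last instruction of the tail block.
// Assumes tailMergeSize >= 1, since the fallback subtracts one from it.
unsigned maxDuplicateCount(bool optimizeForSize,
                           bool targetLikesIndirectBranchDup,
                           bool tailEndsInIndirectBranch,
                           unsigned tailMergeSize) {
  if (optimizeForSize)
    return 1;   // one duplicated instruction is paid for by the removed branch
  if (targetLikesIndirectBranchDup && tailEndsInIndirectBranch)
    return 20;  // high enough to undo the effects of tail merging
  return tailMergeSize - 1;  // default: one less than the tail-merge threshold
}
```

Note that the size check comes first, so a function marked optimize-for-size keeps the limit of 1 even on a target that reports the hook as profitable.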
Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp?rev=89814&r1=89813&r2=89814&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp (original) +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp Tue Nov 24 17:35:49 2009 @@ -1027,14 +1027,10 @@ return TargetInstrInfoImpl::isIdentical(MI0, MI1, MRI); } -unsigned ARMBaseInstrInfo::TailDuplicationLimit(const MachineBasicBlock &MBB, - unsigned DefaultLimit) const { +bool ARMBaseInstrInfo::isProfitableToDuplicateIndirectBranch() const { // If the target processor can predict indirect branches, it is highly // desirable to duplicate them, since it can often make them predictable. - if (!MBB.empty() && isIndirectBranchOpcode(MBB.back().getOpcode()) && - getSubtarget().hasBranchTargetBuffer()) - return DefaultLimit + 2; - return DefaultLimit; + return getSubtarget().hasBranchTargetBuffer(); } /// getInstrPredicate - If instruction is predicated, returns its predicate Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h?rev=89814&r1=89813&r2=89814&view=diff ============================================================================== --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h (original) +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h Tue Nov 24 17:35:49 2009 @@ -291,8 +291,7 @@ virtual bool isIdentical(const MachineInstr *MI, const MachineInstr *Other, const MachineRegisterInfo *MRI) const; - virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB, - unsigned DefaultLimit) const; + virtual bool isProfitableToDuplicateIndirectBranch() const; }; static inline From evan.cheng at apple.com Tue Nov 24 17:40:22 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Tue, 24 Nov 2009 15:40:22 -0800 Subject: [llvm-commits] [llvm] r89814 - in 
/llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/BranchFolding.cpp lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h In-Reply-To: <200911242335.nAONZntR017555@zion.cs.uiuc.edu> References: <200911242335.nAONZntR017555@zion.cs.uiuc.edu> Message-ID: You can look at the output of llc -stats. That tells you the number of instructions, which is a better indication of code size change. Evan On Nov 24, 2009, at 3:35 PM, Bob Wilson wrote: > Author: bwilson > Date: Tue Nov 24 17:35:49 2009 > New Revision: 89814 > > URL: http://llvm.org/viewvc/llvm-project?rev=89814&view=rev > Log: > Refactor target hook for tail duplication as requested by Chris. > Make tail duplication of indirect branches much more aggressive (for targets > that indicate that it is profitable), based on further experience with > this transformation. I compiled 3 large applications with and without > this more aggressive tail duplication and measured minimal changes in code > size. ("size" on Darwin seems to round the text size up to the nearest > page boundary, so I can only say that any code size increase was less than > one 4k page.) Radar 7421267. 
> > Modified: > llvm/trunk/include/llvm/Target/TargetInstrInfo.h > llvm/trunk/lib/CodeGen/BranchFolding.cpp > llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp > llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h > > Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=89814&r1=89813&r2=89814&view=diff > > ============================================================================== > --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original) > +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Tue Nov 24 17:35:49 2009 > @@ -544,12 +544,9 @@ > virtual unsigned getInlineAsmLength(const char *Str, > const MCAsmInfo &MAI) const; > > - /// TailDuplicationLimit - Returns the limit on the number of instructions > - /// in basic block MBB beyond which it will not be tail-duplicated. > - virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB, > - unsigned DefaultLimit) const { > - return DefaultLimit; > - } > + /// isProfitableToDuplicateIndirectBranch - Returns true if tail duplication > + /// is especially profitable for indirect branches. > + virtual bool isProfitableToDuplicateIndirectBranch() const { return false; } > }; > > /// TargetInstrInfoImpl - This is the default implementation of > > Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.cpp?rev=89814&r1=89813&r2=89814&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original) > +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Tue Nov 24 17:35:49 2009 > @@ -1043,9 +1043,18 @@ > // of one less than the tail-merge threshold. When optimizing for size, > // duplicate only one, because one branch instruction can be eliminated to > // compensate for the duplication. 
> - unsigned MaxDuplicateCount = > - MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize) ? > - 1 : TII->TailDuplicationLimit(*TailBB, TailMergeSize - 1); > + unsigned MaxDuplicateCount; > + if (MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize)) > + MaxDuplicateCount = 1; > + else if (TII->isProfitableToDuplicateIndirectBranch() && > + !TailBB->empty() && TailBB->back().getDesc().isIndirectBranch()) > + // If the target has hardware branch prediction that can handle indirect > + // branches, duplicating them can often make them predictable when there > + // are common paths through the code. The limit needs to be high enough > + // to allow undoing the effects of tail merging. > + MaxDuplicateCount = 20; > + else > + MaxDuplicateCount = TailMergeSize - 1; > > // Check the instructions in the block to determine whether tail-duplication > // is invalid or unlikely to be profitable. > > Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp?rev=89814&r1=89813&r2=89814&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp (original) > +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp Tue Nov 24 17:35:49 2009 > @@ -1027,14 +1027,10 @@ > return TargetInstrInfoImpl::isIdentical(MI0, MI1, MRI); > } > > -unsigned ARMBaseInstrInfo::TailDuplicationLimit(const MachineBasicBlock &MBB, > - unsigned DefaultLimit) const { > +bool ARMBaseInstrInfo::isProfitableToDuplicateIndirectBranch() const { > // If the target processor can predict indirect branches, it is highly > // desirable to duplicate them, since it can often make them predictable. 
> - if (!MBB.empty() && isIndirectBranchOpcode(MBB.back().getOpcode()) && > - getSubtarget().hasBranchTargetBuffer()) > - return DefaultLimit + 2; > - return DefaultLimit; > + return getSubtarget().hasBranchTargetBuffer(); > } > > /// getInstrPredicate - If instruction is predicated, returns its predicate > > Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h?rev=89814&r1=89813&r2=89814&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h (original) > +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h Tue Nov 24 17:35:49 2009 > @@ -291,8 +291,7 @@ > virtual bool isIdentical(const MachineInstr *MI, const MachineInstr *Other, > const MachineRegisterInfo *MRI) const; > > - virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB, > - unsigned DefaultLimit) const; > + virtual bool isProfitableToDuplicateIndirectBranch() const; > }; > > static inline > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From daniel at zuster.org Tue Nov 24 17:41:56 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Tue, 24 Nov 2009 15:41:56 -0800 Subject: [llvm-commits] [PATCH] LTO code generator options In-Reply-To: <2E9E5BD4D91C4B32850CD83726EFE19B@andreic6e7fe55> References: <04F6B1512E264B27AEE607542FCDD113@andreic6e7fe55> <38a0d8450911180722j5a463fa8hec81178154deaf09@mail.gmail.com> <41BA1AA405BC4D19BA9B4FAB6543D62F@andreic6e7fe55> <38a0d8450911190723g644ad4c7ife769ab35da9efb9@mail.gmail.com> <38a0d8450911200722i5efa690ci6ab671d71b5f40dc@mail.gmail.com> <38a0d8450911231309t6f37e2a0ga7c9eaa50d495c60@mail.gmail.com> <2E9E5BD4D91C4B32850CD83726EFE19B@andreic6e7fe55> Message-ID: <6a8523d60911241541u23f73e93s6402ec3153e0da5f@mail.gmail.com> Hi Viktor, -- > Index: 
include/llvm/ADT/Triple.h > =================================================================== > --- include/llvm/ADT/Triple.h (revision 89702) > +++ include/llvm/ADT/Triple.h (working copy) > @@ -131,6 +131,18 @@ > } > > /// @} > + /// @name Operators > + /// @{ > + > + /// operator =(StringRef) - Set all components to the new triple > + /// from the given string representation. Used in the cl::opt parser. > + Triple& operator =(const StringRef Str) { > + Data = Str; > + Arch = InvalidArch; > + return *this; > + } I don't think this should be necessary, I would rather have an explicit Triple call, or if that is too painful just make the Triple() constructor (which takes a StringRef) implicit. Note that the 'const' on Str isn't needed. > + > + /// @} > /// @name Typed Component Access > /// @{ > > Index: tools/lto/LTOCodeGenerator.h > =================================================================== > --- tools/lto/LTOCodeGenerator.h (revision 89702) > +++ tools/lto/LTOCodeGenerator.h (working copy) > @@ -19,6 +19,7 @@ > #include "llvm/LLVMContext.h" > #include "llvm/ADT/StringMap.h" > #include "llvm/ADT/SmallVector.h" > +#include "llvm/ADT/Triple.h" > > #include > > @@ -50,12 +51,14 @@ > const std::string& objPath, std::string& errMsg); > void applyScopeRestrictions(); > bool determineTarget(std::string& errMsg); > + void parseCommandLineOptions(); > > typedef llvm::StringMap StringSet; > > llvm::LLVMContext& _context; > llvm::Linker _linker; > llvm::TargetMachine* _target; > + llvm::Triple _targetTriple; I don't think _targetTriple needs to be added here, I think it can be local to the function that uses it. If it does need to be added to the class, I would prefer having a function like getTriple() which computed, cached, and returned the triple, instead of having the member variable mutated inside some other function. 
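A rough sketch of the "compute, cache, and return" accessor suggested above. Everything here is an assumption for illustration: the class name, the three string inputs, and the use of std::string instead of llvm::Triple are hypothetical, and the fallback order (explicit option, then module, then host) is borrowed from the determineTarget hunk of the patch under review:

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the code generator; not the real LTOCodeGenerator.
class TripleHolder {
public:
  TripleHolder(std::string optionTriple, std::string moduleTriple,
               std::string hostTriple)
      : option_(std::move(optionTriple)), module_(std::move(moduleTriple)),
        host_(std::move(hostTriple)) {}

  // Computes the triple on first use and caches it, so no other member
  // function has to mutate the cached value as a side effect.
  const std::string &getTriple() {
    if (cached_.empty()) {
      if (!option_.empty())
        cached_ = option_;   // explicitly requested triple wins
      else if (!module_.empty())
        cached_ = module_;   // otherwise use the triple from the module
      else
        cached_ = host_;     // last resort: the host triple
    }
    return cached_;
  }

private:
  std::string option_, module_, host_;
  std::string cached_;  // empty until the first getTriple() call
};
```

Callers would then go through getTriple() everywhere, which keeps the fallback logic in one place.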
> bool _emitDwarfDebugInfo; > bool _scopeRestrictionsDone; > lto_codegen_model _codeModel; > Index: tools/lto/LTOCodeGenerator.cpp > =================================================================== > --- tools/lto/LTOCodeGenerator.cpp (revision 89702) > +++ tools/lto/LTOCodeGenerator.cpp (working copy) > @@ -59,7 +59,29 @@ > static cl::opt DisableInline("disable-inlining", > cl::desc("Do not run the inliner pass")); > > +static cl::opt > TargetTriple("mtriple", > + cl::desc("Override target triple for module (see -version for available targets)"), > + cl::value_desc("--[-]")); > + > +static cl::opt MCPU("mcpu", > + cl::desc("Override a target specific cpu type (see -mattr=help for available CPUs)"), > + cl::value_desc("cpu-name"), > + cl::init("")); > > +// Ignore -mtune option which could be defined by gcc. > +// We don't use it, just don't want to get unrecognized parameter error. > +static cl::opt MTune("mtune", > + cl::desc("Override a target specific cpu type (see -mattr=help for available CPUs)"), > + cl::value_desc("cpu-name"), > + cl::init(""), cl::Hidden); > + > +static cl::list MAttrs("mattr", > + cl::CommaSeparated, > + cl::desc("Target specific attributes (see -mattr=help for details)"), > + cl::value_desc("a1,+a2,-a3,...")); > + > +static bool opt_parsed = false; > + While it isn't new in this patch, the use of llvm::cl here is really not good, and I'm nervous about increasing our dependency on it. Among other things, this is making it harder to use it in a threaded context. > const char* LTOCodeGenerator::getVersionString() > { > #ifdef LLVM_VERSION_INFO > @@ -140,6 +162,12 @@ > > bool LTOCodeGenerator::writeMergedModules(const char *path, > std::string &errMsg) { > + // This method is exposed by the LTO API, it starts code generation > + // process and could be called from the outside. > + // So we need to parse the code generator command line options here. 
> + parseCommandLineOptions(); > + // Determine and create a target platform object for > + // the processing data. > if (determineTarget(errMsg)) > return true; Newline before new //? > > @@ -171,6 +199,15 @@ > > const void* LTOCodeGenerator::compile(size_t* length, std::string& errMsg) > { > + // This method is exposed by the LTO API, it starts code generation > + // process and could be called from the outside. > + // So we need to parse the code generator command line options here. > + parseCommandLineOptions(); > + // Determine and create a target platform object for > + // the processing data. > + if (determineTarget(errMsg)) > + return NULL; Newline before new //? > + > // make unique temp .s file to put generated assembly code > sys::Path uniqueAsmPath("lto-llvm.s"); > if ( uniqueAsmPath.createTemporaryFileOnDisk(true, &errMsg) ) > @@ -227,6 +264,9 @@ > bool LTOCodeGenerator::assemble(const std::string& asmPath, > const std::string& objPath, std::string& errMsg) > { > + // Target must be set at this point > + assert(_target); > + Please use: -- assert(_target && "Target must be set at this point!"); -- instead, which is in keeping with LLVM style. Actually, what would be better would be to just add a function -- TargetMachine &getTarget() -- which returns and caches the target, and don't use _target directly. This reduces dependencies in the code. 
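The two suggestions above (an assert that carries its reason, and a caching getTarget() accessor) could look roughly like this. This is only a sketch under stated assumptions: TargetMachine here is a placeholder struct, TargetHolder and timesCreated() are hypothetical names, and the creation step stands in for whatever determineTarget does in the real code:

```cpp
#include <cassert>
#include <memory>

// Placeholder type for illustration; not llvm::TargetMachine.
struct TargetMachine {
  int id;
};

// Hypothetical stand-in for the code generator.
class TargetHolder {
public:
  // Creates the target on first use, caches it, and returns a reference,
  // so callers never touch the member pointer directly.
  TargetMachine &getTarget() {
    if (!target_)
      target_.reset(new TargetMachine{++created_});
    // The reason lives inside the assert, so it shows up in the failure text.
    assert(target_ && "Target must be set at this point!");
    return *target_;
  }

  int timesCreated() const { return created_; }

private:
  std::unique_ptr<TargetMachine> target_;
  int created_ = 0;
};
```

With an accessor like this, assemble() and generateAssemblyCode() would not need their own null checks on the member.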
> sys::Path tool; > bool needsCompilerOptions = true; > if ( _assemblerPath ) { > @@ -281,12 +321,20 @@ > bool LTOCodeGenerator::determineTarget(std::string& errMsg) > { > if ( _target == NULL ) { > - std::string Triple = _linker.getModule()->getTargetTriple(); > - if (Triple.empty()) > - Triple = sys::getHostTriple(); > + // Use the tripple if the option was explicitly set tripple -> triple > + _targetTriple = TargetTriple; > + // Or take it from the bitcode module if the option wasn't set explicity > + if (_targetTriple.getTriple().empty()) > + _targetTriple = llvm::Triple(_linker.getModule()->getTargetTriple().c_str()) ; > + // Still empty? Use the host triple as the last choise. choise -> choice and more newlines before comments. > + if (_targetTriple.getTriple().empty()) > + _targetTriple = Triple(sys::getHostTriple()); > > + // At this point triple string should not be empty. > + assert(!_targetTriple.getTriple().empty()); Again, please put the assert "reason" inside the assert instead of in a comment. > + > // create target machine from info for merged modules > - const Target *march = TargetRegistry::lookupTarget(Triple, errMsg); > + const Target *march = TargetRegistry::lookupTarget(_targetTriple.getTriple(), errMsg); > if ( march == NULL ) > return true; > > @@ -308,13 +356,27 @@ > SubtargetFeatures features; > > // Set the rest of features by default. > - // Note: Please keep this after all explict feature settings to make sure > - // defaults will not override explicitly set options. > - features.AddFeatures( > - SubtargetFeatures::getDefaultSubtargetFeatures(llvm::Triple(Triple))); > + // Note: Please keep this before all explict feature settings to make sure > + // defaults will be overrided by explicitly set options. Please rephrase to just be an explanation, something like: // Initialize the target features starting from the default and applying the // explicitly set options which may override the defaults. is more readable IMHO. 
> + std::pair pair = > + StringRef(SubtargetFeatures::getDefaultSubtargetFeatures(_targetTriple)). > + split(","); > + do { > + features.AddFeature(pair.first); > + pair = pair.second.split(","); > + } while (pair.first.size() > 0); This assumes getDefaultSubtargetFeatures returns a non-empty string. > > + if (!MCPU.empty()) > + features.setCPU(MCPU); > + > + if (!MAttrs.empty()) { > + for (unsigned i = 0; i != MAttrs.size(); ++i) > + features.AddFeature(MAttrs[i]); > + } > + > // construct LTModule, hand over ownership of module and target > - _target = march->createTargetMachine(Triple, features.getString()); > + _target = march->createTargetMachine(_targetTriple.getTriple(), > + features.getString()); > } > return false; > } > @@ -358,8 +420,8 @@ > bool LTOCodeGenerator::generateAssemblyCode(formatted_raw_ostream& out, > std::string& errMsg) > { > - if ( this->determineTarget(errMsg) ) > - return true; > + // Target must be set at this point. > + assert(_target); See prev comments. > // mark which symbols can not be internalized > this->applyScopeRestrictions(); > @@ -380,11 +442,6 @@ > assert (0 && "Unknown exception handling model!"); > } > > - // if options were requested, set them > - if ( !_codegenOptions.empty() ) > - cl::ParseCommandLineOptions(_codegenOptions.size(), > - (char**)&_codegenOptions[0]); > - > // Instantiate the pass manager to organize the passes. > PassManager passes; > > @@ -459,3 +516,16 @@ > _codegenOptions.push_back(strdup(o.c_str())); > } > } > + > +/// Parse the command line and set the code generator options. > +void LTOCodeGenerator::parseCommandLineOptions() > +{ > + // if options were requested, set them once. 
> + if ( !opt_parsed && !_codegenOptions.empty() ) { > + cl::ParseCommandLineOptions(_codegenOptions.size(), > + (char**)&_codegenOptions[0]); > + opt_parsed = true; > + } > +} > + > + > Index: tools/lto/LTOModule.cpp > =================================================================== > --- tools/lto/LTOModule.cpp (revision 89702) > +++ tools/lto/LTOModule.cpp (working copy) > @@ -131,19 +131,23 @@ > if (!m) > return NULL; > > - std::string Triple = m->getTargetTriple(); > - if (Triple.empty()) > - Triple = sys::getHostTriple(); > + // Target overriding for a single module is not supported. > + // Use host triple as default if target is not defined in the module. > + Triple targetTriple(m->getTargetTriple()); > + if (targetTriple.getTriple().empty()) > + targetTriple.setTriple(sys::getHostTriple()); > > // find machine architecture for this module > - const Target* march = TargetRegistry::lookupTarget(Triple, errMsg); > + const Target* march = > + TargetRegistry::lookupTarget(targetTriple.getTriple(), errMsg); > if (!march) > return NULL; > > // construct LTModule, hand over ownership of module and target > - const std::string FeatureStr = > - SubtargetFeatures::getDefaultSubtargetFeatures(llvm::Triple(Triple)); > - TargetMachine* target = march->createTargetMachine(Triple, FeatureStr); > + const std::string featureStr = > + SubtargetFeatures::getDefaultSubtargetFeatures(targetTriple); > + TargetMachine* target = > + march->createTargetMachine(targetTriple.getTriple(), featureStr); > return new LTOModule(m.take(), target); > } > -- - Daniel On Mon, Nov 23, 2009 at 7:38 PM, Viktor Kutuzov wrote: > The updated patch is attahced. It reflects the changess Chris and Daniel has > requested. 
>
> Best regards,
> Viktor
>
> ----- Original Message ----- From: "Rafael Espindola"
> To: "Viktor Kutuzov"
> Cc: "Commit Messages and Patches for LLVM"
> Sent: Monday, November 23, 2009 1:09 PM
> Subject: Re: [llvm-commits] [PATCH] LTO code generator options
>
>
>>> Thanks a lot for reviewing the patch.
>>> It is committed as http://llvm.org/viewvc/llvm-project?rev=89516&view=rev
>>>
>>> Now everything is ready for the target triple overriding.
>>
>> I should be able to take a look at it tomorrow, but it would help if
>> you could first implement Daniel's and Chris' comments.
>>
>>>
>>> Best regards,
>>> Viktor
>>
>> Cheers,
>> --
>> Rafael Ávila de Espíndola
>>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>

From dalej at apple.com Tue Nov 24 17:46:59 2009
From: dalej at apple.com (Dale Johannesen)
Date: Tue, 24 Nov 2009 15:46:59 -0800
Subject: [llvm-commits] [llvm] r89814 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/BranchFolding.cpp lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h
In-Reply-To:
References: <200911242335.nAONZntR017555@zion.cs.uiuc.edu>
Message-ID: <6C8D4F4B-4F73-4CFF-81C9-9E2428E7932E@apple.com>

On Nov 24, 2009, at 3:40 PM PST, Evan Cheng wrote:

> You can look at the output of llc -stats. That tells you the number
> of instructions, which is a better indication of code size change.

That corresponds better to what llvm's keeping track of internally and
is useful for determining whether it's doing what we think it's doing.
It's not too good for actual size on targets with variable-length
instructions, though. size -m is what he wants (on Darwin).

> Evan
>
> On Nov 24, 2009, at 3:35 PM, Bob Wilson wrote:
>
>> Author: bwilson
>> Date: Tue Nov 24 17:35:49 2009
>> New Revision: 89814
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=89814&view=rev
>> Log:
>> Refactor target hook for tail duplication as requested by Chris.
>> Make tail duplication of indirect branches much more aggressive (for targets
>> that indicate that it is profitable), based on further experience with
>> this transformation. I compiled 3 large applications with and without
>> this more aggressive tail duplication and measured minimal changes in code
>> size. ("size" on Darwin seems to round the text size up to the nearest
>> page boundary, so I can only say that any code size increase was less than
>> one 4k page.) Radar 7421267.
>>
>> Modified:
>>     llvm/trunk/include/llvm/Target/TargetInstrInfo.h
>>     llvm/trunk/lib/CodeGen/BranchFolding.cpp
>>     llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp
>>     llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h
>>
>> Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=89814&r1=89813&r2=89814&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original)
>> +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Tue Nov 24 17:35:49 2009
>> @@ -544,12 +544,9 @@
>>    virtual unsigned getInlineAsmLength(const char *Str,
>>                                        const MCAsmInfo &MAI) const;
>>
>> -  /// TailDuplicationLimit - Returns the limit on the number of instructions
>> -  /// in basic block MBB beyond which it will not be tail-duplicated.
>> - virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB, >> - unsigned DefaultLimit) const { >> - return DefaultLimit; >> - } >> + /// isProfitableToDuplicateIndirectBranch - Returns true if tail duplication >> + /// is especially profitable for indirect branches. >> + virtual bool isProfitableToDuplicateIndirectBranch() const { return false; } >> }; >> >> /// TargetInstrInfoImpl - This is the default implementation of >> >> Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.cpp?rev=89814&r1=89813&r2=89814&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original) >> +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Tue Nov 24 17:35:49 2009 >> @@ -1043,9 +1043,18 @@ >> // of one less than the tail-merge threshold. When optimizing for size, >> // duplicate only one, because one branch instruction can be eliminated to >> // compensate for the duplication. >> - unsigned MaxDuplicateCount = >> - MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize) ? >> - 1 : TII->TailDuplicationLimit(*TailBB, TailMergeSize - 1); >> + unsigned MaxDuplicateCount; >> + if (MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize)) >> + MaxDuplicateCount = 1; >> + else if (TII->isProfitableToDuplicateIndirectBranch() && >> + !TailBB->empty() && TailBB->back().getDesc().isIndirectBranch()) >> + // If the target has hardware branch prediction that can handle indirect >> + // branches, duplicating them can often make them predictable when there >> + // are common paths through the code. The limit needs to be high enough >> + // to allow undoing the effects of tail merging. 
>> + MaxDuplicateCount = 20; >> + else >> + MaxDuplicateCount = TailMergeSize - 1; >> >> // Check the instructions in the block to determine whether tail-duplication >> // is invalid or unlikely to be profitable. >> >> Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp?rev=89814&r1=89813&r2=89814&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp (original) >> +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp Tue Nov 24 17:35:49 2009 >> @@ -1027,14 +1027,10 @@ >> return TargetInstrInfoImpl::isIdentical(MI0, MI1, MRI); >> } >> >> -unsigned ARMBaseInstrInfo::TailDuplicationLimit(const MachineBasicBlock &MBB, >> - unsigned DefaultLimit) const { >> +bool ARMBaseInstrInfo::isProfitableToDuplicateIndirectBranch() const { >> // If the target processor can predict indirect branches, it is highly >> // desirable to duplicate them, since it can often make them predictable. 
>> - if (!MBB.empty() && isIndirectBranchOpcode(MBB.back().getOpcode()) && >> - getSubtarget().hasBranchTargetBuffer()) >> - return DefaultLimit + 2; >> - return DefaultLimit; >> + return getSubtarget().hasBranchTargetBuffer(); >> } >> >> /// getInstrPredicate - If instruction is predicated, returns its predicate >> >> Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h?rev=89814&r1=89813&r2=89814&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h (original) >> +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h Tue Nov 24 17:35:49 2009 >> @@ -291,8 +291,7 @@ >> virtual bool isIdentical(const MachineInstr *MI, const MachineInstr *Other, >> const MachineRegisterInfo *MRI) const; >> >> - virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB, >> - unsigned DefaultLimit) const; >> + virtual bool isProfitableToDuplicateIndirectBranch() const; >> }; >> >> static inline >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From devang.patel at gmail.com Tue Nov 24 17:58:27 2009 From: devang.patel at gmail.com (Devang Patel) Date: Tue, 24 Nov 2009 15:58:27 -0800 Subject: [llvm-commits] [PATCH] LTO code generator options In-Reply-To: <2E9E5BD4D91C4B32850CD83726EFE19B@andreic6e7fe55> References: <04F6B1512E264B27AEE607542FCDD113@andreic6e7fe55> <38a0d8450911180722j5a463fa8hec81178154deaf09@mail.gmail.com> <41BA1AA405BC4D19BA9B4FAB6543D62F@andreic6e7fe55> <38a0d8450911190723g644ad4c7ife769ab35da9efb9@mail.gmail.com> 
<38a0d8450911200722i5efa690ci6ab671d71b5f40dc@mail.gmail.com> <38a0d8450911231309t6f37e2a0ga7c9eaa50d495c60@mail.gmail.com> <2E9E5BD4D91C4B32850CD83726EFE19B@andreic6e7fe55> Message-ID: <352a1fb20911241558o4131950di5e3bac9db3a31e30@mail.gmail.com> Hi All, I am sorry to join the party late, but wouldn't it make sense to encode subtarget features into the llvm bitcode itself (function attributes)? If the info is encoded in llvm bitcode files directly, then it would be useful in other situations also. - Devang From bob.wilson at apple.com Tue Nov 24 18:00:39 2009 From: bob.wilson at apple.com (Bob Wilson) Date: Tue, 24 Nov 2009 16:00:39 -0800 Subject: [llvm-commits] [llvm] r89814 - in /llvm/trunk: include/llvm/Target/TargetInstrInfo.h lib/CodeGen/BranchFolding.cpp lib/Target/ARM/ARMBaseInstrInfo.cpp lib/Target/ARM/ARMBaseInstrInfo.h In-Reply-To: References: <200911242335.nAONZntR017555@zion.cs.uiuc.edu> Message-ID: <542421EB-B122-4DD4-ABE9-EDA00D73B054@apple.com> I measured a few individual files with "llc -stats" but I wanted to check a much wider range of code. I remeasured everything using "size -m". Two of the applications had no change whatsoever. The third increased by 0.23%, and the total was still smaller than what I measured from a version of llvm from late last week. I don't think there's anything to worry about here. On Nov 24, 2009, at 3:40 PM, Evan Cheng wrote: > You can look at the output of llc -stats. That tells you the number of instructions, which is a better indication of code size change. > > Evan > > On Nov 24, 2009, at 3:35 PM, Bob Wilson wrote: > >> Author: bwilson >> Date: Tue Nov 24 17:35:49 2009 >> New Revision: 89814 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=89814&view=rev >> Log: >> Refactor target hook for tail duplication as requested by Chris. >> Make tail duplication of indirect branches much more aggressive (for targets >> that indicate that it is profitable), based on further experience with >> this transformation. 
I compiled 3 large applications with and without >> this more aggressive tail duplication and measured minimal changes in code >> size. ("size" on Darwin seems to round the text size up to the nearest >> page boundary, so I can only say that any code size increase was less than >> one 4k page.) Radar 7421267. >> >> Modified: >> llvm/trunk/include/llvm/Target/TargetInstrInfo.h >> llvm/trunk/lib/CodeGen/BranchFolding.cpp >> llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp >> llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h >> >> Modified: llvm/trunk/include/llvm/Target/TargetInstrInfo.h >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=89814&r1=89813&r2=89814&view=diff >> >> ============================================================================== >> --- llvm/trunk/include/llvm/Target/TargetInstrInfo.h (original) >> +++ llvm/trunk/include/llvm/Target/TargetInstrInfo.h Tue Nov 24 17:35:49 2009 >> @@ -544,12 +544,9 @@ >> virtual unsigned getInlineAsmLength(const char *Str, >> const MCAsmInfo &MAI) const; >> >> - /// TailDuplicationLimit - Returns the limit on the number of instructions >> - /// in basic block MBB beyond which it will not be tail-duplicated. >> - virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB, >> - unsigned DefaultLimit) const { >> - return DefaultLimit; >> - } >> + /// isProfitableToDuplicateIndirectBranch - Returns true if tail duplication >> + /// is especially profitable for indirect branches. 
>> + virtual bool isProfitableToDuplicateIndirectBranch() const { return false; } >> }; >> >> /// TargetInstrInfoImpl - This is the default implementation of >> >> Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.cpp?rev=89814&r1=89813&r2=89814&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original) >> +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Tue Nov 24 17:35:49 2009 >> @@ -1043,9 +1043,18 @@ >> // of one less than the tail-merge threshold. When optimizing for size, >> // duplicate only one, because one branch instruction can be eliminated to >> // compensate for the duplication. >> - unsigned MaxDuplicateCount = >> - MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize) ? >> - 1 : TII->TailDuplicationLimit(*TailBB, TailMergeSize - 1); >> + unsigned MaxDuplicateCount; >> + if (MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize)) >> + MaxDuplicateCount = 1; >> + else if (TII->isProfitableToDuplicateIndirectBranch() && >> + !TailBB->empty() && TailBB->back().getDesc().isIndirectBranch()) >> + // If the target has hardware branch prediction that can handle indirect >> + // branches, duplicating them can often make them predictable when there >> + // are common paths through the code. The limit needs to be high enough >> + // to allow undoing the effects of tail merging. >> + MaxDuplicateCount = 20; >> + else >> + MaxDuplicateCount = TailMergeSize - 1; >> >> // Check the instructions in the block to determine whether tail-duplication >> // is invalid or unlikely to be profitable. 
>> >> Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp?rev=89814&r1=89813&r2=89814&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp (original) >> +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp Tue Nov 24 17:35:49 2009 >> @@ -1027,14 +1027,10 @@ >> return TargetInstrInfoImpl::isIdentical(MI0, MI1, MRI); >> } >> >> -unsigned ARMBaseInstrInfo::TailDuplicationLimit(const MachineBasicBlock &MBB, >> - unsigned DefaultLimit) const { >> +bool ARMBaseInstrInfo::isProfitableToDuplicateIndirectBranch() const { >> // If the target processor can predict indirect branches, it is highly >> // desirable to duplicate them, since it can often make them predictable. >> - if (!MBB.empty() && isIndirectBranchOpcode(MBB.back().getOpcode()) && >> - getSubtarget().hasBranchTargetBuffer()) >> - return DefaultLimit + 2; >> - return DefaultLimit; >> + return getSubtarget().hasBranchTargetBuffer(); >> } >> >> /// getInstrPredicate - If instruction is predicated, returns its predicate >> >> Modified: llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h?rev=89814&r1=89813&r2=89814&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h (original) >> +++ llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.h Tue Nov 24 17:35:49 2009 >> @@ -291,8 +291,7 @@ >> virtual bool isIdentical(const MachineInstr *MI, const MachineInstr *Other, >> const MachineRegisterInfo *MRI) const; >> >> - virtual unsigned TailDuplicationLimit(const MachineBasicBlock &MBB, >> - unsigned DefaultLimit) const; >> + virtual bool isProfitableToDuplicateIndirectBranch() const; >> }; >> >> static inline >> >> >> _______________________________________________ >> 
llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From dpatel at apple.com Tue Nov 24 18:31:14 2009 From: dpatel at apple.com (Devang Patel) Date: Wed, 25 Nov 2009 00:31:14 -0000 Subject: [llvm-commits] [llvm] r89819 - /llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp Message-ID: <200911250031.nAP0VELf019513@zion.cs.uiuc.edu> Author: dpatel Date: Tue Nov 24 18:31:13 2009 New Revision: 89819 URL: http://llvm.org/viewvc/llvm-project?rev=89819&view=rev Log: Revert r89803. Modified: llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp Modified: llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp?rev=89819&r1=89818&r2=89819&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp (original) +++ llvm/trunk/lib/Target/PowerPC/PPCMCAsmInfo.cpp Tue Nov 24 18:31:13 2009 @@ -22,9 +22,6 @@ if (!is64Bit) Data64bitsDirective = 0; // We can't emit a 64-bit unit in PPC32 mode. AssemblerDialect = 1; // New-Style mnemonics. - - // Debug Information - SupportsDebugInformation = true; } PPCLinuxMCAsmInfo::PPCLinuxMCAsmInfo(bool is64Bit) { From bruno.cardoso at gmail.com Tue Nov 24 18:36:01 2009 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Wed, 25 Nov 2009 00:36:01 -0000 Subject: [llvm-commits] [llvm] r89821 - /llvm/trunk/lib/Target/Mips/MipsInstrInfo.cpp Message-ID: <200911250036.nAP0a1Z6019722@zion.cs.uiuc.edu> Author: bruno Date: Tue Nov 24 18:36:00 2009 New Revision: 89821 URL: http://llvm.org/viewvc/llvm-project?rev=89821&view=rev Log: Add proper emission of load/store double to stack slots for mips1 targets! 
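As a reading aid for the diff that follows, here is a minimal standalone sketch of the selection the commit makes: a 64-bit FP register uses one SDC1 when the instruction exists, and two SWC1 stores at offsets 0 and 4 on Mips1. The opcode names, register-class names and offsets are taken from the diff; `RegClass`, `MemOp` and `planStore` are hypothetical illustration names, not LLVM API, and the real code builds machine instructions with BuildMI rather than returning a list.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the three register classes handled in the diff.
enum class RegClass { CPURegs, FGR32, AFGR64 };

// One planned memory operation: an opcode name and an immediate offset
// relative to the stack slot.
struct MemOp {
  std::string Opcode;
  int Offset;
};

// Returns the store sequence for spilling a register of class RC.
std::vector<MemOp> planStore(RegClass RC, bool IsMips1) {
  switch (RC) {
  case RegClass::CPURegs:
    return {MemOp{"SW", 0}};
  case RegClass::FGR32:
    return {MemOp{"SWC1", 0}};
  case RegClass::AFGR64:
    if (!IsMips1)
      return {MemOp{"SDC1", 0}};
    // Mips1 has no 64-bit FP store: store the two 32-bit halves separately.
    return {MemOp{"SWC1", 0}, MemOp{"SWC1", 4}};
  }
  return {}; // unhandled class; the commit calls llvm_unreachable here
}
```

The load path in the same commit mirrors this with LW, LWC1 and LDC1.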
Modified: llvm/trunk/lib/Target/Mips/MipsInstrInfo.cpp Modified: llvm/trunk/lib/Target/Mips/MipsInstrInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsInstrInfo.cpp?rev=89821&r1=89820&r2=89821&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsInstrInfo.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsInstrInfo.cpp Tue Nov 24 18:36:00 2009 @@ -200,22 +200,33 @@ storeRegToStackSlot(MachineBasicBlock &MBB, MachineBasicBlock::iterator I, unsigned SrcReg, bool isKill, int FI, const TargetRegisterClass *RC) const { - unsigned Opc; - DebugLoc DL = DebugLoc::getUnknownLoc(); if (I != MBB.end()) DL = I->getDebugLoc(); if (RC == Mips::CPURegsRegisterClass) - Opc = Mips::SW; + BuildMI(MBB, I, DL, get(Mips::SW)).addReg(SrcReg, getKillRegState(isKill)) + .addImm(0).addFrameIndex(FI); else if (RC == Mips::FGR32RegisterClass) - Opc = Mips::SWC1; - else { - assert(RC == Mips::AFGR64RegisterClass); - Opc = Mips::SDC1; - } - - BuildMI(MBB, I, DL, get(Opc)).addReg(SrcReg, getKillRegState(isKill)) + BuildMI(MBB, I, DL, get(Mips::SWC1)).addReg(SrcReg, getKillRegState(isKill)) .addImm(0).addFrameIndex(FI); + else if (RC == Mips::AFGR64RegisterClass) { + if (!TM.getSubtarget().isMips1()) { + BuildMI(MBB, I, DL, get(Mips::SDC1)) + .addReg(SrcReg, getKillRegState(isKill)) + .addImm(0).addFrameIndex(FI); + } else { + const TargetRegisterInfo *TRI = + MBB.getParent()->getTarget().getRegisterInfo(); + const unsigned *SubSet = TRI->getSubRegisters(SrcReg); + BuildMI(MBB, I, DL, get(Mips::SWC1)) + .addReg(SubSet[0], getKillRegState(isKill)) + .addImm(0).addFrameIndex(FI); + BuildMI(MBB, I, DL, get(Mips::SWC1)) + .addReg(SubSet[1], getKillRegState(isKill)) + .addImm(4).addFrameIndex(FI); + } + } else + llvm_unreachable("Register class not handled!"); } void MipsInstrInfo:: @@ -223,19 +234,27 @@ unsigned DestReg, int FI, const TargetRegisterClass *RC) const { - unsigned Opc; - if (RC == 
Mips::CPURegsRegisterClass) - Opc = Mips::LW; - else if (RC == Mips::FGR32RegisterClass) - Opc = Mips::LWC1; - else { - assert(RC == Mips::AFGR64RegisterClass); - Opc = Mips::LDC1; - } - DebugLoc DL = DebugLoc::getUnknownLoc(); if (I != MBB.end()) DL = I->getDebugLoc(); - BuildMI(MBB, I, DL, get(Opc), DestReg).addImm(0).addFrameIndex(FI); + + if (RC == Mips::CPURegsRegisterClass) + BuildMI(MBB, I, DL, get(Mips::LW), DestReg).addImm(0).addFrameIndex(FI); + else if (RC == Mips::FGR32RegisterClass) + BuildMI(MBB, I, DL, get(Mips::LWC1), DestReg).addImm(0).addFrameIndex(FI); + else if (RC == Mips::AFGR64RegisterClass) { + if (!TM.getSubtarget().isMips1()) { + BuildMI(MBB, I, DL, get(Mips::LDC1), DestReg).addImm(0).addFrameIndex(FI); + } else { + const TargetRegisterInfo *TRI = + MBB.getParent()->getTarget().getRegisterInfo(); + const unsigned *SubSet = TRI->getSubRegisters(DestReg); + BuildMI(MBB, I, DL, get(Mips::LWC1), SubSet[0]) + .addImm(0).addFrameIndex(FI); + BuildMI(MBB, I, DL, get(Mips::LWC1), SubSet[1]) + .addImm(4).addFrameIndex(FI); + } + } else + llvm_unreachable("Register class not handled!"); } MachineInstr *MipsInstrInfo:: @@ -278,11 +297,14 @@ const TargetRegisterClass *RC = RI.getRegClass(MI->getOperand(0).getReg()); unsigned StoreOpc, LoadOpc; + bool IsMips1 = TM.getSubtarget().isMips1(); if (RC == Mips::FGR32RegisterClass) { LoadOpc = Mips::LWC1; StoreOpc = Mips::SWC1; } else { assert(RC == Mips::AFGR64RegisterClass); + // Mips1 doesn't have ldc/sdc instructions. 
+ if (IsMips1) break; LoadOpc = Mips::LDC1; StoreOpc = Mips::SDC1; } From vkutuzov at accesssoftek.com Tue Nov 24 18:39:59 2009 From: vkutuzov at accesssoftek.com (Viktor Kutuzov) Date: Tue, 24 Nov 2009 16:39:59 -0800 Subject: [llvm-commits] [PATCH] LTO code generator options References: <04F6B1512E264B27AEE607542FCDD113@andreic6e7fe55> <38a0d8450911180722j5a463fa8hec81178154deaf09@mail.gmail.com> <41BA1AA405BC4D19BA9B4FAB6543D62F@andreic6e7fe55> <38a0d8450911190723g644ad4c7ife769ab35da9efb9@mail.gmail.com> <38a0d8450911200722i5efa690ci6ab671d71b5f40dc@mail.gmail.com> <38a0d8450911231309t6f37e2a0ga7c9eaa50d495c60@mail.gmail.com> <2E9E5BD4D91C4B32850CD83726EFE19B@andreic6e7fe55> <352a1fb20911241558o4131950di5e3bac9db3a31e30@mail.gmail.com> Message-ID: <3780A335BCB5498984B6E0408C2EDB0D@andreic6e7fe55> Hi Devang, Unless I'm missing something, bitcode does not depend on subtarget features, does it? Features come into play when someone compiles for a specific target. Encoding subtarget features in the llvm bitcode itself would be mixing together two unrelated things with different scopes. What use cases are you thinking of? Viktor ----- Original Message ----- From: "Devang Patel" To: "Viktor Kutuzov" Cc: "Rafael Espindola" ; "Commit Messages and Patches for LLVM" Sent: Tuesday, November 24, 2009 3:58 PM Subject: Re: [llvm-commits] [PATCH] LTO code generator options Hi All, I am sorry to join the party late, but wouldn't it make sense to encode subtarget features into the llvm bitcode itself (function attributes)? If the info is encoded in llvm bitcode files directly, then it would be useful in other situations also. 
- Devang From bruno.cardoso at gmail.com Tue Nov 24 18:47:43 2009 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Wed, 25 Nov 2009 00:47:43 -0000 Subject: [llvm-commits] [llvm] r89823 - /llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp Message-ID: <200911250047.nAP0liXv020143@zion.cs.uiuc.edu> Author: bruno Date: Tue Nov 24 18:47:43 2009 New Revision: 89823 URL: http://llvm.org/viewvc/llvm-project?rev=89823&view=rev Log: Only include in the callee saved regs the sub registers to avoid unnecessary save/restore. Modified: llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp Modified: llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp?rev=89823&r1=89822&r2=89823&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsRegisterInfo.cpp Tue Nov 24 18:47:43 2009 @@ -107,8 +107,7 @@ static const unsigned BitMode32CalleeSavedRegs[] = { Mips::S0, Mips::S1, Mips::S2, Mips::S3, Mips::S4, Mips::S5, Mips::S6, Mips::S7, - Mips::F20, Mips::F22, Mips::F24, Mips::F26, Mips::F28, Mips::F30, - Mips::D10, Mips::D11, Mips::D12, Mips::D13, Mips::D14, Mips::D15,0 + Mips::F20, Mips::F22, Mips::F24, Mips::F26, Mips::F28, Mips::F30, 0 }; if (Subtarget.isSingleFloat()) @@ -136,9 +135,7 @@ &Mips::CPURegsRegClass, &Mips::CPURegsRegClass, &Mips::CPURegsRegClass, &Mips::CPURegsRegClass, &Mips::CPURegsRegClass, &Mips::FGR32RegClass, &Mips::FGR32RegClass, &Mips::FGR32RegClass, - &Mips::FGR32RegClass, &Mips::FGR32RegClass, &Mips::FGR32RegClass, - &Mips::AFGR64RegClass, &Mips::AFGR64RegClass, &Mips::AFGR64RegClass, - &Mips::AFGR64RegClass, &Mips::AFGR64RegClass, &Mips::AFGR64RegClass, 0 + &Mips::FGR32RegClass, &Mips::FGR32RegClass, &Mips::FGR32RegClass, 0 }; if (Subtarget.isSingleFloat()) From evan.cheng at apple.com Tue Nov 24 18:49:29 2009 From: evan.cheng at apple.com (Evan 
Cheng) Date: Tue, 24 Nov 2009 16:49:29 -0800 Subject: [llvm-commits] [llvm] r89811 - in /llvm/trunk: lib/Target/PowerPC/PPCFrameInfo.h lib/Target/PowerPC/PPCRegisterInfo.cpp test/CodeGen/PowerPC/Frames-alloca.ll test/CodeGen/PowerPC/Frames-large.ll test/CodeGen/PowerPC/Frames-small.ll test/CodeGen/PowerPC/ppc-prologue.ll In-Reply-To: <200911242259.nAOMx2c2016262@zion.cs.uiuc.edu> References: <200911242259.nAOMx2c2016262@zion.cs.uiuc.edu> Message-ID: <6E979870-9CBA-4510-BE1C-914A29849999@apple.com> I see compilation warnings: In file included from PPCRegisterInfo.cpp:20: PPCFrameInfo.h: In member function 'virtual const llvm::TargetFrameInfo::SpillSlot* llvm::PPCFrameInfo::getCalleeSavedSpillSlots(unsigned int&) const': PPCFrameInfo.h:97: warning: missing braces around initializer for 'const llvm::TargetFrameInfo::SpillSlot' PPCFrameInfo.h:100: warning: missing braces around initializer for 'const llvm::TargetFrameInfo::SpillSlot' Evan On Nov 24, 2009, at 2:59 PM, Dale Johannesen wrote: > Author: johannes > Date: Tue Nov 24 16:59:02 2009 > New Revision: 89811 > > URL: http://llvm.org/viewvc/llvm-project?rev=89811&view=rev > Log: > Do not store R31 into the caller's link area on PPC. > This violates the ABI (that area is "reserved"), and > while it is safe if all code is generated with current > compilers, there is some very old code around that uses > that slot for something else, and breaks if it is stored > into. Adjust testcases looking for current behavior. > I've verified that the stack frame size is right in all > testcases, whether it changed or not. 7311323. 
> > > Modified: > llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h > llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp > llvm/trunk/test/CodeGen/PowerPC/Frames-alloca.ll > llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll > llvm/trunk/test/CodeGen/PowerPC/Frames-small.ll > llvm/trunk/test/CodeGen/PowerPC/ppc-prologue.ll > > Modified: llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h?rev=89811&r1=89810&r2=89811&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h (original) > +++ llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h Tue Nov 24 16:59:02 2009 > @@ -42,11 +42,12 @@ > /// frame pointer. > static unsigned getFramePointerSaveOffset(bool isPPC64, bool isDarwinABI) { > // For the Darwin ABI: > - // Use the TOC save slot in the PowerPC linkage area for saving the frame > - // pointer (if needed.) LLVM does not generate code that uses the TOC (R2 > - // is treated as a caller saved register.) > + // We cannot use the TOC save slot (offset +20) in the PowerPC linkage area > + // for saving the frame pointer (if needed.) While the published ABI has > + // not used this slot since at least MacOSX 10.2, there is older code > + // around that does use it, and that needs to continue to work. > if (isDarwinABI) > - return isPPC64 ? 40 : 20; > + return isPPC64 ? -8U : -4U; > > // SVR4 ABI: First slot in the general register save area. > return -4U; > @@ -90,6 +91,17 @@ > // With the SVR4 ABI, callee-saved registers have fixed offsets on the stack. 
> const SpillSlot * > getCalleeSavedSpillSlots(unsigned &NumEntries) const { > + if (TM.getSubtarget().isDarwinABI()) { > + NumEntries = 1; > + if (TM.getSubtarget().isPPC64()) { > + static const SpillSlot darwin64Offsets[] = {PPC::X31, -8}; > + return darwin64Offsets; > + } else { > + static const SpillSlot darwinOffsets[] = {PPC::R31, -4}; > + return darwinOffsets; > + } > + } > + > // Early exit if not using the SVR4 ABI. > if (!TM.getSubtarget().isSVR4ABI()) { > NumEntries = 0; > > Modified: llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp?rev=89811&r1=89810&r2=89811&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp (original) > +++ llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp Tue Nov 24 16:59:02 2009 > @@ -1033,12 +1033,11 @@ > // Save R31 if necessary > int FPSI = FI->getFramePointerSaveIndex(); > bool isPPC64 = Subtarget.isPPC64(); > - bool isSVR4ABI = Subtarget.isSVR4ABI(); > bool isDarwinABI = Subtarget.isDarwinABI(); > MachineFrameInfo *MFI = MF.getFrameInfo(); > > // If the frame pointer save index hasn't been defined yet. > - if (!FPSI && needsFP(MF) && isSVR4ABI) { > + if (!FPSI && needsFP(MF)) { > // Find out what the fix offset of the frame pointer save area. 
> int FPOffset = PPCFrameInfo::getFramePointerSaveOffset(isPPC64, > isDarwinABI); > > Modified: llvm/trunk/test/CodeGen/PowerPC/Frames-alloca.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/Frames-alloca.ll?rev=89811&r1=89810&r2=89811&view=diff > > ============================================================================== > --- llvm/trunk/test/CodeGen/PowerPC/Frames-alloca.ll (original) > +++ llvm/trunk/test/CodeGen/PowerPC/Frames-alloca.ll Tue Nov 24 16:59:02 2009 > @@ -6,23 +6,23 @@ > ; RUN: llc < %s -march=ppc32 -mtriple=powerpc-apple-darwin8 -enable-ppc32-regscavenger | FileCheck %s -check-prefix=PPC32-RS > ; RUN: llc < %s -march=ppc32 -mtriple=powerpc-apple-darwin8 -disable-fp-elim -enable-ppc32-regscavenger | FileCheck %s -check-prefix=PPC32-RS-NOFP > > -; CHECK-PPC32: stw r31, 20(r1) > +; CHECK-PPC32: stw r31, -4(r1) > ; CHECK-PPC32: lwz r1, 0(r1) > -; CHECK-PPC32: lwz r31, 20(r1) > -; CHECK-PPC32-NOFP: stw r31, 20(r1) > +; CHECK-PPC32: lwz r31, -4(r1) > +; CHECK-PPC32-NOFP: stw r31, -4(r1) > ; CHECK-PPC32-NOFP: lwz r1, 0(r1) > -; CHECK-PPC32-NOFP: lwz r31, 20(r1) > +; CHECK-PPC32-NOFP: lwz r31, -4(r1) > ; CHECK-PPC32-RS: stwu r1, -80(r1) > ; CHECK-PPC32-RS-NOFP: stwu r1, -80(r1) > > -; CHECK-PPC64: std r31, 40(r1) > -; CHECK-PPC64: stdu r1, -112(r1) > +; CHECK-PPC64: std r31, -8(r1) > +; CHECK-PPC64: stdu r1, -128(r1) > ; CHECK-PPC64: ld r1, 0(r1) > -; CHECK-PPC64: ld r31, 40(r1) > -; CHECK-PPC64-NOFP: std r31, 40(r1) > -; CHECK-PPC64-NOFP: stdu r1, -112(r1) > +; CHECK-PPC64: ld r31, -8(r1) > +; CHECK-PPC64-NOFP: std r31, -8(r1) > +; CHECK-PPC64-NOFP: stdu r1, -128(r1) > ; CHECK-PPC64-NOFP: ld r1, 0(r1) > -; CHECK-PPC64-NOFP: ld r31, 40(r1) > +; CHECK-PPC64-NOFP: ld r31, -8(r1) > > define i32* @f1(i32 %n) { > %tmp = alloca i32, i32 %n ; [#uses=1] > > Modified: llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll > URL: 
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll?rev=89811&r1=89810&r2=89811&view=diff > > ============================================================================== > --- llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll (original) > +++ llvm/trunk/test/CodeGen/PowerPC/Frames-large.ll Tue Nov 24 16:59:02 2009 > @@ -22,13 +22,13 @@ > ; PPC32-NOFP: blr > > ; PPC32-FP: _f1: > -; PPC32-FP: stw r31, 20(r1) > +; PPC32-FP: stw r31, -4(r1) > ; PPC32-FP: lis r0, -1 > ; PPC32-FP: ori r0, r0, 32704 > ; PPC32-FP: stwux r1, r1, r0 > ; ... > ; PPC32-FP: lwz r1, 0(r1) > -; PPC32-FP: lwz r31, 20(r1) > +; PPC32-FP: lwz r31, -4(r1) > ; PPC32-FP: blr > > > @@ -42,11 +42,11 @@ > > > ; PPC64-FP: _f1: > -; PPC64-FP: std r31, 40(r1) > +; PPC64-FP: std r31, -8(r1) > ; PPC64-FP: lis r0, -1 > -; PPC64-FP: ori r0, r0, 32656 > +; PPC64-FP: ori r0, r0, 32640 > ; PPC64-FP: stdux r1, r1, r0 > ; ... > ; PPC64-FP: ld r1, 0(r1) > -; PPC64-FP: ld r31, 40(r1) > +; PPC64-FP: ld r31, -8(r1) > ; PPC64-FP: blr > > Modified: llvm/trunk/test/CodeGen/PowerPC/Frames-small.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/Frames-small.ll?rev=89811&r1=89810&r2=89811&view=diff > > ============================================================================== > --- llvm/trunk/test/CodeGen/PowerPC/Frames-small.ll (original) > +++ llvm/trunk/test/CodeGen/PowerPC/Frames-small.ll Tue Nov 24 16:59:02 2009 > @@ -1,26 +1,26 @@ > ; RUN: llc < %s -march=ppc32 -mtriple=powerpc-apple-darwin8 -o %t1 > -; RUN not grep {stw r31, 20(r1)} %t1 > +; RUN not grep {stw r31, -4(r1)} %t1 > ; RUN: grep {stwu r1, -16448(r1)} %t1 > ; RUN: grep {addi r1, r1, 16448} %t1 > ; RUN: llc < %s -march=ppc32 | \ > -; RUN: not grep {lwz r31, 20(r1)} > +; RUN: not grep {lwz r31, -4(r1)} > ; RUN: llc < %s -march=ppc32 -mtriple=powerpc-apple-darwin8 -disable-fp-elim \ > ; RUN: -o %t2 > -; RUN: grep {stw r31, 20(r1)} %t2 > +; RUN: grep {stw r31, -4(r1)} %t2 > ; RUN: grep {stwu r1, 
-16448(r1)} %t2 > ; RUN: grep {addi r1, r1, 16448} %t2 > -; RUN: grep {lwz r31, 20(r1)} %t2 > +; RUN: grep {lwz r31, -4(r1)} %t2 > ; RUN: llc < %s -march=ppc64 -mtriple=powerpc-apple-darwin8 -o %t3 > -; RUN: not grep {std r31, 40(r1)} %t3 > +; RUN: not grep {std r31, -8(r1)} %t3 > ; RUN: grep {stdu r1, -16496(r1)} %t3 > ; RUN: grep {addi r1, r1, 16496} %t3 > -; RUN: not grep {ld r31, 40(r1)} %t3 > +; RUN: not grep {ld r31, -8(r1)} %t3 > ; RUN: llc < %s -march=ppc64 -mtriple=powerpc-apple-darwin8 -disable-fp-elim \ > ; RUN: -o %t4 > -; RUN: grep {std r31, 40(r1)} %t4 > -; RUN: grep {stdu r1, -16496(r1)} %t4 > -; RUN: grep {addi r1, r1, 16496} %t4 > -; RUN: grep {ld r31, 40(r1)} %t4 > +; RUN: grep {std r31, -8(r1)} %t4 > +; RUN: grep {stdu r1, -16512(r1)} %t4 > +; RUN: grep {addi r1, r1, 16512} %t4 > +; RUN: grep {ld r31, -8(r1)} %t4 > > define i32* @f1() { > %tmp = alloca i32, i32 4095 ; [#uses=1] > > Modified: llvm/trunk/test/CodeGen/PowerPC/ppc-prologue.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/ppc-prologue.ll?rev=89811&r1=89810&r2=89811&view=diff > > ============================================================================== > --- llvm/trunk/test/CodeGen/PowerPC/ppc-prologue.ll (original) > +++ llvm/trunk/test/CodeGen/PowerPC/ppc-prologue.ll Tue Nov 24 16:59:02 2009 > @@ -2,7 +2,7 @@ > > define i32 @_Z4funci(i32 %a) ssp { > ; CHECK: mflr r0 > -; CHECK-NEXT: stw r31, 20(r1) > +; CHECK-NEXT: stw r31, -4(r1) > ; CHECK-NEXT: stw r0, 8(r1) > ; CHECK-NEXT: stwu r1, -80(r1) > ; CHECK-NEXT: Llabel1: > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From dalej at apple.com Tue Nov 24 18:58:21 2009 From: dalej at apple.com (Dale Johannesen) Date: Wed, 25 Nov 2009 00:58:21 -0000 Subject: [llvm-commits] [llvm] r89824 - /llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h Message-ID: 
<200911250058.nAP0wLWo020440@zion.cs.uiuc.edu> Author: johannes Date: Tue Nov 24 18:58:21 2009 New Revision: 89824 URL: http://llvm.org/viewvc/llvm-project?rev=89824&view=rev Log: Fix compiler warnings. Modified: llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h Modified: llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h?rev=89824&r1=89823&r2=89824&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h (original) +++ llvm/trunk/lib/Target/PowerPC/PPCFrameInfo.h Tue Nov 24 18:58:21 2009 @@ -94,11 +94,11 @@ if (TM.getSubtarget().isDarwinABI()) { NumEntries = 1; if (TM.getSubtarget().isPPC64()) { - static const SpillSlot darwin64Offsets[] = {PPC::X31, -8}; - return darwin64Offsets; + static const SpillSlot darwin64Offsets = {PPC::X31, -8}; + return &darwin64Offsets; } else { - static const SpillSlot darwinOffsets[] = {PPC::R31, -4}; - return darwinOffsets; + static const SpillSlot darwinOffsets = {PPC::R31, -4}; + return &darwinOffsets; } } From bruno.cardoso at gmail.com Tue Nov 24 19:05:25 2009 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Wed, 25 Nov 2009 01:05:25 -0000 Subject: [llvm-commits] [llvm] r89826 - /llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp Message-ID: <200911250105.nAP15Puq020652@zion.cs.uiuc.edu> Author: bruno Date: Tue Nov 24 19:05:25 2009 New Revision: 89826 URL: http://llvm.org/viewvc/llvm-project?rev=89826&view=rev Log: Use endianess dependent offsets for load/store of doubles when using two swc/lwc instead of sdc/ldc. 
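A small sketch of the idea in this log message, under stated assumptions: the diff below only shows `std::swap(Offset0, Offset1)` guarded by an `isBigEndian()` check, so the starting values 0 and 4 are an assumption here (they match the offsets used for the two swc1/lwc1 halves in r89821). `HalfOffsets` and `halfOffsets` are hypothetical names, not LLVM API.

```cpp
#include <utility>

// Byte offsets, relative to the base address of a 64-bit double, at which
// the first and second 32-bit sub-register are stored or loaded.
struct HalfOffsets {
  int First;
  int Second;
};

HalfOffsets halfOffsets(bool IsBigEndian) {
  int Offset0 = 0; // assumed default for the first sub-register
  int Offset1 = 4; // assumed default for the second sub-register
  // On a big-endian target the two words sit in the opposite order in
  // memory, so the offsets are exchanged.
  if (IsBigEndian)
    std::swap(Offset0, Offset1);
  return {Offset0, Offset1};
}
```

With this, a little-endian target keeps the pair (0, 4) and a big-endian target uses (4, 0) for the same two sub-registers.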
Modified: llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp Modified: llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp?rev=89826&r1=89825&r2=89826&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp Tue Nov 24 19:05:25 2009 @@ -235,6 +235,10 @@ else return NULL; + // Choose the offsets depending on the endianess + if (TM.getTargetData()->isBigEndian()) + std::swap(Offset0, Offset1); + // Instead of: // ldc $f0, X($3) // Generate: @@ -296,6 +300,10 @@ else return NULL; + // Choose the offsets depending on the endianess + if (TM.getTargetData()->isBigEndian()) + std::swap(Offset0, Offset1); + // Instead of: // sdc $f0, X($3) // Generate: From dalej at apple.com Tue Nov 24 19:09:12 2009 From: dalej at apple.com (Dale Johannesen) Date: Tue, 24 Nov 2009 17:09:12 -0800 Subject: [llvm-commits] [llvm] r89811 - in /llvm/trunk: lib/Target/PowerPC/PPCFrameInfo.h lib/Target/PowerPC/PPCRegisterInfo.cpp test/CodeGen/PowerPC/Frames-alloca.ll test/CodeGen/PowerPC/Frames-large.ll test/CodeGen/PowerPC/Frames-small.ll test/CodeGen/PowerPC/ppc-prologue.ll In-Reply-To: <6E979870-9CBA-4510-BE1C-914A29849999@apple.com> References: <200911242259.nAOMx2c2016262@zion.cs.uiuc.edu> <6E979870-9CBA-4510-BE1C-914A29849999@apple.com> Message-ID: <42344334-0630-47C6-BBC9-DB54034D38DA@apple.com> On Nov 24, 2009, at 4:49 PMPST, Evan Cheng wrote: > I see compilation warnings: > > In file included from PPCRegisterInfo.cpp:20: > PPCFrameInfo.h: In member function 'virtual const > llvm::TargetFrameInfo::SpillSlot* > llvm::PPCFrameInfo::getCalleeSavedSpillSlots(unsigned int&) const': > PPCFrameInfo.h:97: warning: missing braces around initializer for > 'const llvm::TargetFrameInfo::SpillSlot' > PPCFrameInfo.h:100: warning: missing braces around initializer for > 'const 
llvm::TargetFrameInfo::SpillSlot' Fixed. I should set up my local builds to warn in the ways that matter, but when I configure with --with-extra-options=-Wall I get hundreds of them. What are you using? From daniel at zuster.org Tue Nov 24 20:13:23 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Wed, 25 Nov 2009 02:13:23 -0000 Subject: [llvm-commits] [llvm] r89833 - in /llvm/trunk/utils/TableGen: CMakeLists.txt DisassemblerEmitter.cpp DisassemblerEmitter.h TableGen.cpp Message-ID: <200911250213.nAP2DN4m023100@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 20:13:23 2009 New Revision: 89833 URL: http://llvm.org/viewvc/llvm-project?rev=89833&view=rev Log: Sketch TableGen disassembler emitter, based on patch by Sean Callanan. Added: llvm/trunk/utils/TableGen/DisassemblerEmitter.cpp llvm/trunk/utils/TableGen/DisassemblerEmitter.h Modified: llvm/trunk/utils/TableGen/CMakeLists.txt llvm/trunk/utils/TableGen/TableGen.cpp Modified: llvm/trunk/utils/TableGen/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CMakeLists.txt?rev=89833&r1=89832&r2=89833&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/CMakeLists.txt (original) +++ llvm/trunk/utils/TableGen/CMakeLists.txt Tue Nov 24 20:13:23 2009 @@ -8,6 +8,7 @@ CodeGenInstruction.cpp CodeGenTarget.cpp DAGISelEmitter.cpp + DisassemblerEmitter.cpp FastISelEmitter.cpp InstrEnumEmitter.cpp InstrInfoEmitter.cpp Added: llvm/trunk/utils/TableGen/DisassemblerEmitter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/DisassemblerEmitter.cpp?rev=89833&view=auto ============================================================================== --- llvm/trunk/utils/TableGen/DisassemblerEmitter.cpp (added) +++ llvm/trunk/utils/TableGen/DisassemblerEmitter.cpp Tue Nov 24 20:13:23 2009 @@ -0,0 +1,30 @@ +//===- DisassemblerEmitter.cpp - Generate a disassembler ------------------===// +// +// The LLVM Compiler 
Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// + +#include "DisassemblerEmitter.h" +#include "CodeGenTarget.h" +#include "Record.h" +using namespace llvm; + +void DisassemblerEmitter::run(raw_ostream &OS) { + CodeGenTarget Target; + + OS << "/*===- TableGen'erated file " + << "---------------------------------------*- C -*-===*\n" + << " *\n" + << " * " << Target.getName() << " Disassembler\n" + << " *\n" + << " * Automatically generated file, do not edit!\n" + << " *\n" + << " *===---------------------------------------------------------------" + << "-------===*/\n"; + + throw TGError(Target.getTargetRecord()->getLoc(), + "Unable to generate disassembler for this target"); +} Added: llvm/trunk/utils/TableGen/DisassemblerEmitter.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/DisassemblerEmitter.h?rev=89833&view=auto ============================================================================== --- llvm/trunk/utils/TableGen/DisassemblerEmitter.h (added) +++ llvm/trunk/utils/TableGen/DisassemblerEmitter.h Tue Nov 24 20:13:23 2009 @@ -0,0 +1,28 @@ +//===- DisassemblerEmitter.h - Disassembler Generator -----------*- C++ -*-===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// + +#ifndef DISASSEMBLEREMITTER_H +#define DISASSEMBLEREMITTER_H + +#include "TableGenBackend.h" + +namespace llvm { + + class DisassemblerEmitter : public TableGenBackend { + RecordKeeper &Records; + public: + DisassemblerEmitter(RecordKeeper &R) : Records(R) {} + + /// run - Output the disassembler. 
+ void run(raw_ostream &o); + }; + +} // end llvm namespace + +#endif Modified: llvm/trunk/utils/TableGen/TableGen.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/TableGen.cpp?rev=89833&r1=89832&r2=89833&view=diff ============================================================================== --- llvm/trunk/utils/TableGen/TableGen.cpp (original) +++ llvm/trunk/utils/TableGen/TableGen.cpp Tue Nov 24 20:13:23 2009 @@ -21,6 +21,7 @@ #include "ClangDiagnosticsEmitter.h" #include "CodeEmitterGen.h" #include "DAGISelEmitter.h" +#include "DisassemblerEmitter.h" #include "FastISelEmitter.h" #include "InstrEnumEmitter.h" #include "InstrInfoEmitter.h" @@ -46,6 +47,7 @@ GenEmitter, GenRegisterEnums, GenRegister, GenRegisterHeader, GenInstrEnums, GenInstrs, GenAsmWriter, GenAsmMatcher, + GenDisassembler, GenCallingConv, GenClangDiagsDefs, GenClangDiagGroups, @@ -80,6 +82,8 @@ "Generate calling convention descriptions"), clEnumValN(GenAsmWriter, "gen-asm-writer", "Generate assembly writer"), + clEnumValN(GenDisassembler, "gen-disassembler", + "Generate disassembler"), clEnumValN(GenAsmMatcher, "gen-asm-matcher", "Generate assembly instruction matcher"), clEnumValN(GenDAGISel, "gen-dag-isel", @@ -228,6 +232,9 @@ case GenClangDiagGroups: ClangDiagGroupsEmitter(Records).run(*Out); break; + case GenDisassembler: + DisassemblerEmitter(Records).run(*Out); + break; case GenOptParserDefs: OptParserEmitter(Records, true).run(*Out); break; From daniel at zuster.org Tue Nov 24 22:27:32 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Wed, 25 Nov 2009 04:27:32 -0000 Subject: [llvm-commits] [zorg] r89838 - in /zorg/trunk/zorg/buildbot/builders: ClangBuilder.py LLVMGCCBuilder.py Util.py Message-ID: <200911250427.nAP4RXZQ027694@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 22:27:32 2009 New Revision: 89838 URL: http://llvm.org/viewvc/llvm-project?rev=89838&view=rev Log: Add support for two stage clang builds. 
Added: zorg/trunk/zorg/buildbot/builders/Util.py Modified: zorg/trunk/zorg/buildbot/builders/ClangBuilder.py zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Modified: zorg/trunk/zorg/buildbot/builders/ClangBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/ClangBuilder.py?rev=89838&r1=89837&r2=89838&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/ClangBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/ClangBuilder.py Tue Nov 24 22:27:32 2009 @@ -11,9 +11,25 @@ from zorg.buildbot.commands.ClangTestCommand import ClangTestCommand from zorg.buildbot.commands.BatchFileDownload import BatchFileDownload -def getClangBuildFactory(triple=None, clean=True, test=True, - expensive_checks=False, run_cxx_tests=False, valgrind=False, - make='make', jobs="%(jobs)s"): +from Util import getConfigArgs + +def getClangBuildFactory(triple=None, clean=True, test=True, run_cxx_tests=False, + valgrind=False, useTwoStage=False, + make='make', jobs="%(jobs)s", + stage1_config='Debug', stage2_config='Release', + extra_configure_args=[]): + # Don't use in-dir builds with a two stage build process. + inDir = not useTwoStage + if inDir: + llvm_srcdir = "llvm" + llvm_1_objdir = "llvm" + llvm_1_installdir = None + else: + llvm_srcdir = "llvm.src" + llvm_1_objdir = "llvm.obj" + llvm_1_installdir = "llvm.install.1" + llvm_2_objdir = "llvm.obj.2" + f = buildbot.process.factory.BuildFactory() # Determine the build directory. @@ -27,40 +43,52 @@ f.addStep(SVN(name='svn-llvm', mode='update', baseURL='http://llvm.org/svn/llvm-project/llvm/', defaultBranch='trunk', - workdir='llvm')) + workdir=llvm_srcdir)) f.addStep(SVN(name='svn-clang', mode='update', baseURL='http://llvm.org/svn/llvm-project/cfe/', defaultBranch='trunk', - workdir='llvm/tools/clang')) + workdir='%s/tools/clang' % llvm_srcdir)) + # Clean up llvm (stage 1); unless in-dir. 
+ if clean and llvm_srcdir != llvm_1_objdir: + f.addStep(ShellCommand(name="rm-llvm.obj.stage1", + command=["rm", "-rf", llvm_1_objdir], + haltOnFailure=True, + description=["rm build dir", "llvm"], + workdir=".")) + # Force without llvm-gcc so we don't run afoul of Frontend test failures. - configure_args = ["./configure", "--without-llvmgcc", "--without-llvmgxx"] - config_name = 'Debug' - if expensive_checks: - configure_args.append('--enable-expensive-checks') - config_name += '+Checks' + base_configure_args = [WithProperties("%%(builddir)s/%s/configure" % llvm_srcdir), + WithProperties("--prefix=%%(builddir)s/%s" % llvm_1_installdir), + '--disable-bindings'] + base_configure_args += extra_configure_args if triple: - configure_args += ['--build=%s' % triple, - '--host=%s' % triple, - '--target=%s' % triple] - f.addStep(Configure(command=configure_args, - workdir='llvm', - description=['configuring',config_name], - descriptionDone=['configure',config_name])) - if clean: + base_configure_args += ['--build=%s' % triple, + '--host=%s' % triple, + '--target=%s' % triple] + args = base_configure_args + ["--without-llvmgcc", "--without-llvmgxx"] + args += getConfigArgs(stage1_config) + f.addStep(Configure(command=args, + workdir=llvm_1_objdir, + description=['configuring',stage1_config], + descriptionDone=['configure',stage1_config])) + + # Make clean if using in-dir builds. 
+ if clean and llvm_srcdir == llvm_1_objdir: f.addStep(WarningCountingShellCommand(name="clean-llvm", command=[make, "clean"], haltOnFailure=True, description="cleaning llvm", descriptionDone="clean llvm", - workdir='llvm')) + workdir=llvm_1_objdir)) + f.addStep(WarningCountingShellCommand(name="compile", command=['nice', '-n', '10', make, WithProperties("-j%s" % jobs)], haltOnFailure=True, - description="compiling llvm & clang", - descriptionDone="compile llvm & clang", - workdir='llvm')) + description=["compiling", stage1_config], + descriptionDone=["compile", stage1_config], + workdir=llvm_1_objdir)) clangTestArgs = '-v' if valgrind: clangTestArgs += ' --vg ' @@ -74,11 +102,67 @@ command=[make, "check-lit", "VERBOSE=1"], description=["testing", "llvm"], descriptionDone=["test", "llvm"], - workdir='llvm')) + workdir=llvm_1_objdir)) + f.addStep(ClangTestCommand(name='test-clang', + command=[make, 'test', WithProperties('TESTARGS=%s' % clangTestArgs), + WithProperties('EXTRA_TESTDIRS=%s' % extraTestDirs)], + workdir='%s/tools/clang' % llvm_1_objdir)) + + # Install llvm and clang. + if llvm_1_installdir: + f.addStep(WarningCountingShellCommand(name="install.llvm.1", + command=['nice', '-n', '10', + make, WithProperties("-j%s" % jobs), + 'install'], + haltOnFailure=True, + description=["install", "llvm & clang"], + workdir=llvm_1_objdir)) + + if not useTwoStage: + return f + + # Clean up llvm (stage 2). + if clean: + f.addStep(ShellCommand(name="rm-llvm.obj.stage2", + command=["rm", "-rf", llvm_2_objdir], + haltOnFailure=True, + description=["rm build dir", "llvm", "(stage 2)"], + workdir=".")) + + # Configure llvm (stage 2). 
+ args = base_configure_args + ["--without-llvmgcc", "--without-llvmgxx"] + args += getConfigArgs(stage2_config) + f.addStep(Configure(name="configure.llvm.stage2", + command=args, + env={'CC' : WithProperties("%%(builddir)s/%s/bin/clang" % llvm_1_installdir), + 'CXX' : WithProperties("%%(builddir)s/%s/bin/clang++" % llvm_1_installdir),}, + haltOnFailure=True, + workdir=llvm_2_objdir, + description=["configure", "llvm", "(stage 2)", + stage2_config])) + + # Build llvm (stage 2). + f.addStep(WarningCountingShellCommand(name="compile.llvm.stage2", + command=['nice', '-n', '10', + make, WithProperties("-j%s" % jobs)], + haltOnFailure=True, + description=["compiling", "(stage 2)", + stage2_config], + descriptionDone=["compile", "(stage 2)", + stage2_config], + workdir=llvm_2_objdir)) + + if test: + f.addStep(ClangTestCommand(name='test-llvm', + command=[make, "check-lit", "VERBOSE=1"], + description=["testing", "llvm"], + descriptionDone=["test", "llvm"], + workdir=llvm_2_objdir)) f.addStep(ClangTestCommand(name='test-clang', command=[make, 'test', WithProperties('TESTARGS=%s' % clangTestArgs), WithProperties('EXTRA_TESTDIRS=%s' % extraTestDirs)], - workdir='llvm/tools/clang')) + workdir='%s/tools/clang' % llvm_2_objdir)) + return f def getClangMSVCBuildFactory(update=True, clean=True, vcDrive='c', jobs=1): Modified: zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py?rev=89838&r1=89837&r2=89838&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Tue Nov 24 22:27:32 2009 @@ -6,29 +6,7 @@ from zorg.buildbot.commands.ClangTestCommand import ClangTestCommand -def getConfigArgs(origname): - name = origname - args = [] - if name.startswith('Release'): - name = name[len('Release'):] - args.append('--enable-optimized') - elif 
name.startswith('Debug'): - name = name[len('Debug'):] - else: - raise ValueError,'Unknown config name: %r' % origname - - if name.startswith('-Asserts'): - name = name[len('-Asserts'):] - args.append('--disable-assertions') - - if name.startswith('+Checks'): - name = name[len('+Checks'):] - args.append('--enable-expensive-checks') - - if name: - raise ValueError,'Unknown config name: %r' % origname - - return args +from Util import getConfigArgs def getLLVMGCCBuildFactory(jobs=1, update=True, clean=True, gxxincludedir=None, Added: zorg/trunk/zorg/buildbot/builders/Util.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/Util.py?rev=89838&view=auto ============================================================================== --- zorg/trunk/zorg/buildbot/builders/Util.py (added) +++ zorg/trunk/zorg/buildbot/builders/Util.py Tue Nov 24 22:27:32 2009 @@ -0,0 +1,23 @@ +def getConfigArgs(origname): + name = origname + args = [] + if name.startswith('Release'): + name = name[len('Release'):] + args.append('--enable-optimized') + elif name.startswith('Debug'): + name = name[len('Debug'):] + else: + raise ValueError,'Unknown config name: %r' % origname + + if name.startswith('-Asserts'): + name = name[len('-Asserts'):] + args.append('--disable-assertions') + + if name.startswith('+Checks'): + name = name[len('+Checks'):] + args.append('--enable-expensive-checks') + + if name: + raise ValueError,'Unknown config name: %r' % origname + + return args From daniel at zuster.org Tue Nov 24 22:30:13 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Wed, 25 Nov 2009 04:30:13 -0000 Subject: [llvm-commits] [llvm] r89839 - in /llvm/trunk: CMakeLists.txt autoconf/configure.ac include/llvm/Config/Disassemblers.def.in Message-ID: <200911250430.nAP4UDl7027808@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 22:30:13 2009 New Revision: 89839 URL: http://llvm.org/viewvc/llvm-project?rev=89839&view=rev Log: Add CMake and configure logic to create 
llvm/Config/Disassemblers.defs. Added: llvm/trunk/include/llvm/Config/Disassemblers.def.in Modified: llvm/trunk/CMakeLists.txt llvm/trunk/autoconf/configure.ac Modified: llvm/trunk/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/CMakeLists.txt?rev=89839&r1=89838&r2=89839&view=diff ============================================================================== --- llvm/trunk/CMakeLists.txt (original) +++ llvm/trunk/CMakeLists.txt Tue Nov 24 22:30:13 2009 @@ -280,6 +280,7 @@ set(LLVM_ENUM_ASM_PRINTERS "") set(LLVM_ENUM_ASM_PARSERS "") +set(LLVM_ENUM_DISASSEMBLERS "") foreach(t ${LLVM_TARGETS_TO_BUILD}) message(STATUS "Targeting ${t}") add_subdirectory(lib/Target/${t}) @@ -294,6 +295,11 @@ set(LLVM_ENUM_ASM_PARSERS "${LLVM_ENUM_ASM_PARSERS}LLVM_ASM_PARSER(${t})\n") endif( EXISTS ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/AsmParser/CMakeLists.txt ) + if( EXISTS ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/Disassembler/CMakeLists.txt ) + add_subdirectory(lib/Target/${t}/Disassembler) + set(LLVM_ENUM_DISASSEMBLERS + "${LLVM_ENUM_DISASSEMBLERS}LLVM_DISASSEMBLER(${t})\n") + endif( EXISTS ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/Disassembler/CMakeLists.txt ) set(CURRENT_LLVM_TARGET) endforeach(t) @@ -309,6 +315,12 @@ ${LLVM_BINARY_DIR}/include/llvm/Config/AsmParsers.def ) +# Produce llvm/Config/Disassemblers.def +configure_file( + ${LLVM_MAIN_INCLUDE_DIR}/llvm/Config/Disassemblers.def.in + ${LLVM_BINARY_DIR}/include/llvm/Config/Disassemblers.def + ) + add_subdirectory(lib/ExecutionEngine) add_subdirectory(lib/ExecutionEngine/Interpreter) add_subdirectory(lib/ExecutionEngine/JIT) Modified: llvm/trunk/autoconf/configure.ac URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/autoconf/configure.ac?rev=89839&r1=89838&r2=89839&view=diff ============================================================================== --- llvm/trunk/autoconf/configure.ac (original) +++ llvm/trunk/autoconf/configure.ac Tue Nov 24 22:30:13 2009 @@ -533,11 +533,12 @@ fi done -# Build the LLVM_TARGET and 
LLVM_ASM_PRINTER macro uses for -# Targets.def, AsmPrinters.def, and AsmParsers.def. +# Build the LLVM_TARGET and LLVM_... macros for Targets.def and the individual +# target feature def files. LLVM_ENUM_TARGETS="" LLVM_ENUM_ASM_PRINTERS="" LLVM_ENUM_ASM_PARSERS="" +LLVM_ENUM_DISASSEMBLERS="" for target_to_build in $TARGETS_TO_BUILD; do LLVM_ENUM_TARGETS="LLVM_TARGET($target_to_build) $LLVM_ENUM_TARGETS" if test -f ${srcdir}/lib/Target/${target_to_build}/AsmPrinter/Makefile ; then @@ -546,10 +547,14 @@ if test -f ${srcdir}/lib/Target/${target_to_build}/AsmParser/Makefile ; then LLVM_ENUM_ASM_PARSERS="LLVM_ASM_PARSER($target_to_build) $LLVM_ENUM_ASM_PARSERS"; fi + if test -f ${srcdir}/lib/Target/${target_to_build}/Disassembler/Makefile ; then + LLVM_ENUM_DISASSEMBLERS="LLVM_DISASSEMBLER($target_to_build) $LLVM_ENUM_DISASSEMBLERS"; + fi done AC_SUBST(LLVM_ENUM_TARGETS) AC_SUBST(LLVM_ENUM_ASM_PRINTERS) AC_SUBST(LLVM_ENUM_ASM_PARSERS) +AC_SUBST(LLVM_ENUM_DISASSEMBLERS) dnl Prevent the CBackend from using printf("%a") for floating point so older dnl C compilers that cannot deal with the 0x0p+0 hex floating point format @@ -1407,6 +1412,7 @@ AC_CONFIG_FILES([include/llvm/Config/Targets.def]) AC_CONFIG_FILES([include/llvm/Config/AsmPrinters.def]) AC_CONFIG_FILES([include/llvm/Config/AsmParsers.def]) +AC_CONFIG_FILES([include/llvm/Config/Disassemblers.def]) AC_CONFIG_HEADERS([include/llvm/System/DataTypes.h]) dnl Configure the makefile's configuration data Added: llvm/trunk/include/llvm/Config/Disassemblers.def.in URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Config/Disassemblers.def.in?rev=89839&view=auto ============================================================================== --- llvm/trunk/include/llvm/Config/Disassemblers.def.in (added) +++ llvm/trunk/include/llvm/Config/Disassemblers.def.in Tue Nov 24 22:30:13 2009 @@ -0,0 +1,29 @@ +//===- llvm/Config/Disassemblers.def - LLVM Assembly Parsers ----*- C++ -*-===// +// +// The LLVM Compiler 
Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This file enumerates all of the assembly-language parsers +// supported by this build of LLVM. Clients of this file should define +// the LLVM_ASM_PARSER macro to be a function-like macro with a +// single parameter (the name of the target whose assembly can be +// generated); including this file will then enumerate all of the +// targets with assembly parsers. +// +// The set of targets supported by LLVM is generated at configuration +// time, at which point this header is generated. Do not modify this +// header directly. +// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_DISASSEMBLER +# error Please define the macro LLVM_DISASSEMBLER(TargetName) +#endif + +@LLVM_ENUM_DISASSEMBLERS@ + +#undef LLVM_DISASSEMBLER From daniel at zuster.org Tue Nov 24 22:37:28 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Wed, 25 Nov 2009 04:37:28 -0000 Subject: [llvm-commits] [llvm] r89840 - /llvm/trunk/configure Message-ID: <200911250437.nAP4bSiT028037@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 22:37:28 2009 New Revision: 89840 URL: http://llvm.org/viewvc/llvm-project?rev=89840&view=rev Log: Regenerate configure Modified: llvm/trunk/configure Modified: llvm/trunk/configure URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/configure?rev=89840&r1=89839&r2=89840&view=diff ============================================================================== --- llvm/trunk/configure (original) +++ llvm/trunk/configure Tue Nov 24 22:37:28 2009 @@ -847,6 +847,7 @@ LLVM_ENUM_TARGETS LLVM_ENUM_ASM_PRINTERS LLVM_ENUM_ASM_PARSERS +LLVM_ENUM_DISASSEMBLERS ENABLE_CBE_PRINTF_A OPTIMIZE_OPTION EXTRA_OPTIONS @@ -5108,11 +5109,12 @@ fi done -# Build the LLVM_TARGET and LLVM_ASM_PRINTER macro uses for -# 
Targets.def, AsmPrinters.def, and AsmParsers.def. +# Build the LLVM_TARGET and LLVM_... macros for Targets.def and the individual +# target feature def files. LLVM_ENUM_TARGETS="" LLVM_ENUM_ASM_PRINTERS="" LLVM_ENUM_ASM_PARSERS="" +LLVM_ENUM_DISASSEMBLERS="" for target_to_build in $TARGETS_TO_BUILD; do LLVM_ENUM_TARGETS="LLVM_TARGET($target_to_build) $LLVM_ENUM_TARGETS" if test -f ${srcdir}/lib/Target/${target_to_build}/AsmPrinter/Makefile ; then @@ -5121,11 +5123,15 @@ if test -f ${srcdir}/lib/Target/${target_to_build}/AsmParser/Makefile ; then LLVM_ENUM_ASM_PARSERS="LLVM_ASM_PARSER($target_to_build) $LLVM_ENUM_ASM_PARSERS"; fi + if test -f ${srcdir}/lib/Target/${target_to_build}/Disassembler/Makefile ; then + LLVM_ENUM_DISASSEMBLERS="LLVM_DISASSEMBLER($target_to_build) $LLVM_ENUM_DISASSEMBLERS"; + fi done + # Check whether --enable-cbe-printf-a was given. if test "${enable_cbe_printf_a+set}" = set; then enableval=$enable_cbe_printf_a; @@ -11114,7 +11120,7 @@ lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext < conftest.$ac_ext + echo '#line 13267 "configure"' > conftest.$ac_ext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? @@ -14976,11 +14982,11 @@ -e 's:.*FLAGS}\{0,1\} :&$lt_compiler_flag :; t' \ -e 's: [^ ]*conftest\.: $lt_compiler_flag&:; t' \ -e 's:$: $lt_compiler_flag:'` - (eval echo "\"\$as_me:14979: $lt_compile\"" >&5) + (eval echo "\"\$as_me:14985: $lt_compile\"" >&5) (eval "$lt_compile" 2>conftest.err) ac_status=$? cat conftest.err >&5 - echo "$as_me:14983: \$? = $ac_status" >&5 + echo "$as_me:14989: \$? = $ac_status" >&5 if (exit $ac_status) && test -s "$ac_outfile"; then # The compiler can only warn and ignore the option if not recognized # So say no if there are warnings other than the usual output. 
@@ -15244,11 +15250,11 @@ -e 's:.*FLAGS}\{0,1\} :&$lt_compiler_flag :; t' \ -e 's: [^ ]*conftest\.: $lt_compiler_flag&:; t' \ -e 's:$: $lt_compiler_flag:'` - (eval echo "\"\$as_me:15247: $lt_compile\"" >&5) + (eval echo "\"\$as_me:15253: $lt_compile\"" >&5) (eval "$lt_compile" 2>conftest.err) ac_status=$? cat conftest.err >&5 - echo "$as_me:15251: \$? = $ac_status" >&5 + echo "$as_me:15257: \$? = $ac_status" >&5 if (exit $ac_status) && test -s "$ac_outfile"; then # The compiler can only warn and ignore the option if not recognized # So say no if there are warnings other than the usual output. @@ -15348,11 +15354,11 @@ -e 's:.*FLAGS}\{0,1\} :&$lt_compiler_flag :; t' \ -e 's: [^ ]*conftest\.: $lt_compiler_flag&:; t' \ -e 's:$: $lt_compiler_flag:'` - (eval echo "\"\$as_me:15351: $lt_compile\"" >&5) + (eval echo "\"\$as_me:15357: $lt_compile\"" >&5) (eval "$lt_compile" 2>out/conftest.err) ac_status=$? cat out/conftest.err >&5 - echo "$as_me:15355: \$? = $ac_status" >&5 + echo "$as_me:15361: \$? = $ac_status" >&5 if (exit $ac_status) && test -s out/conftest2.$ac_objext then # The compiler can only warn and ignore the option if not recognized @@ -17800,7 +17806,7 @@ lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext < conftest.$ac_ext <&5) + (eval echo "\"\$as_me:20277: $lt_compile\"" >&5) (eval "$lt_compile" 2>conftest.err) ac_status=$? cat conftest.err >&5 - echo "$as_me:20275: \$? = $ac_status" >&5 + echo "$as_me:20281: \$? = $ac_status" >&5 if (exit $ac_status) && test -s "$ac_outfile"; then # The compiler can only warn and ignore the option if not recognized # So say no if there are warnings other than the usual output. 
@@ -20372,11 +20378,11 @@ -e 's:.*FLAGS}\{0,1\} :&$lt_compiler_flag :; t' \ -e 's: [^ ]*conftest\.: $lt_compiler_flag&:; t' \ -e 's:$: $lt_compiler_flag:'` - (eval echo "\"\$as_me:20375: $lt_compile\"" >&5) + (eval echo "\"\$as_me:20381: $lt_compile\"" >&5) (eval "$lt_compile" 2>out/conftest.err) ac_status=$? cat out/conftest.err >&5 - echo "$as_me:20379: \$? = $ac_status" >&5 + echo "$as_me:20385: \$? = $ac_status" >&5 if (exit $ac_status) && test -s out/conftest2.$ac_objext then # The compiler can only warn and ignore the option if not recognized @@ -21942,11 +21948,11 @@ -e 's:.*FLAGS}\{0,1\} :&$lt_compiler_flag :; t' \ -e 's: [^ ]*conftest\.: $lt_compiler_flag&:; t' \ -e 's:$: $lt_compiler_flag:'` - (eval echo "\"\$as_me:21945: $lt_compile\"" >&5) + (eval echo "\"\$as_me:21951: $lt_compile\"" >&5) (eval "$lt_compile" 2>conftest.err) ac_status=$? cat conftest.err >&5 - echo "$as_me:21949: \$? = $ac_status" >&5 + echo "$as_me:21955: \$? = $ac_status" >&5 if (exit $ac_status) && test -s "$ac_outfile"; then # The compiler can only warn and ignore the option if not recognized # So say no if there are warnings other than the usual output. @@ -22046,11 +22052,11 @@ -e 's:.*FLAGS}\{0,1\} :&$lt_compiler_flag :; t' \ -e 's: [^ ]*conftest\.: $lt_compiler_flag&:; t' \ -e 's:$: $lt_compiler_flag:'` - (eval echo "\"\$as_me:22049: $lt_compile\"" >&5) + (eval echo "\"\$as_me:22055: $lt_compile\"" >&5) (eval "$lt_compile" 2>out/conftest.err) ac_status=$? cat out/conftest.err >&5 - echo "$as_me:22053: \$? = $ac_status" >&5 + echo "$as_me:22059: \$? 
= $ac_status" >&5 if (exit $ac_status) && test -s out/conftest2.$ac_objext then # The compiler can only warn and ignore the option if not recognized @@ -24281,11 +24287,11 @@ -e 's:.*FLAGS}\{0,1\} :&$lt_compiler_flag :; t' \ -e 's: [^ ]*conftest\.: $lt_compiler_flag&:; t' \ -e 's:$: $lt_compiler_flag:'` - (eval echo "\"\$as_me:24284: $lt_compile\"" >&5) + (eval echo "\"\$as_me:24290: $lt_compile\"" >&5) (eval "$lt_compile" 2>conftest.err) ac_status=$? cat conftest.err >&5 - echo "$as_me:24288: \$? = $ac_status" >&5 + echo "$as_me:24294: \$? = $ac_status" >&5 if (exit $ac_status) && test -s "$ac_outfile"; then # The compiler can only warn and ignore the option if not recognized # So say no if there are warnings other than the usual output. @@ -24549,11 +24555,11 @@ -e 's:.*FLAGS}\{0,1\} :&$lt_compiler_flag :; t' \ -e 's: [^ ]*conftest\.: $lt_compiler_flag&:; t' \ -e 's:$: $lt_compiler_flag:'` - (eval echo "\"\$as_me:24552: $lt_compile\"" >&5) + (eval echo "\"\$as_me:24558: $lt_compile\"" >&5) (eval "$lt_compile" 2>conftest.err) ac_status=$? cat conftest.err >&5 - echo "$as_me:24556: \$? = $ac_status" >&5 + echo "$as_me:24562: \$? = $ac_status" >&5 if (exit $ac_status) && test -s "$ac_outfile"; then # The compiler can only warn and ignore the option if not recognized # So say no if there are warnings other than the usual output. @@ -24653,11 +24659,11 @@ -e 's:.*FLAGS}\{0,1\} :&$lt_compiler_flag :; t' \ -e 's: [^ ]*conftest\.: $lt_compiler_flag&:; t' \ -e 's:$: $lt_compiler_flag:'` - (eval echo "\"\$as_me:24656: $lt_compile\"" >&5) + (eval echo "\"\$as_me:24662: $lt_compile\"" >&5) (eval "$lt_compile" 2>out/conftest.err) ac_status=$? cat out/conftest.err >&5 - echo "$as_me:24660: \$? = $ac_status" >&5 + echo "$as_me:24666: \$? 
= $ac_status" >&5 if (exit $ac_status) && test -s out/conftest2.$ac_objext then # The compiler can only warn and ignore the option if not recognized @@ -35375,6 +35381,8 @@ ac_config_files="$ac_config_files include/llvm/Config/AsmParsers.def" +ac_config_files="$ac_config_files include/llvm/Config/Disassemblers.def" + ac_config_headers="$ac_config_headers include/llvm/System/DataTypes.h" @@ -36002,6 +36010,7 @@ "include/llvm/Config/Targets.def") CONFIG_FILES="$CONFIG_FILES include/llvm/Config/Targets.def" ;; "include/llvm/Config/AsmPrinters.def") CONFIG_FILES="$CONFIG_FILES include/llvm/Config/AsmPrinters.def" ;; "include/llvm/Config/AsmParsers.def") CONFIG_FILES="$CONFIG_FILES include/llvm/Config/AsmParsers.def" ;; + "include/llvm/Config/Disassemblers.def") CONFIG_FILES="$CONFIG_FILES include/llvm/Config/Disassemblers.def" ;; "include/llvm/System/DataTypes.h") CONFIG_HEADERS="$CONFIG_HEADERS include/llvm/System/DataTypes.h" ;; "Makefile.config") CONFIG_FILES="$CONFIG_FILES Makefile.config" ;; "llvm.spec") CONFIG_FILES="$CONFIG_FILES llvm.spec" ;; @@ -36175,12 +36184,12 @@ LLVM_ENUM_TARGETS!$LLVM_ENUM_TARGETS$ac_delim LLVM_ENUM_ASM_PRINTERS!$LLVM_ENUM_ASM_PRINTERS$ac_delim LLVM_ENUM_ASM_PARSERS!$LLVM_ENUM_ASM_PARSERS$ac_delim +LLVM_ENUM_DISASSEMBLERS!$LLVM_ENUM_DISASSEMBLERS$ac_delim ENABLE_CBE_PRINTF_A!$ENABLE_CBE_PRINTF_A$ac_delim OPTIMIZE_OPTION!$OPTIMIZE_OPTION$ac_delim EXTRA_OPTIONS!$EXTRA_OPTIONS$ac_delim BINUTILS_INCDIR!$BINUTILS_INCDIR$ac_delim ENABLE_LLVMC_DYNAMIC!$ENABLE_LLVMC_DYNAMIC$ac_delim -ENABLE_LLVMC_DYNAMIC_PLUGINS!$ENABLE_LLVMC_DYNAMIC_PLUGINS$ac_delim _ACEOF if test `sed -n "s/.*$ac_delim\$/X/p" conf$$subs.sed | grep -c X` = 97; then @@ -36222,6 +36231,7 @@ ac_delim='%!_!# ' for ac_last_try in false false false false false :; do cat >conf$$subs.sed <<_ACEOF +ENABLE_LLVMC_DYNAMIC_PLUGINS!$ENABLE_LLVMC_DYNAMIC_PLUGINS$ac_delim CXX!$CXX$ac_delim CXXFLAGS!$CXXFLAGS$ac_delim ac_ct_CXX!$ac_ct_CXX$ac_delim @@ -36319,7 +36329,7 @@ 
LTLIBOBJS!$LTLIBOBJS$ac_delim _ACEOF - if test `sed -n "s/.*$ac_delim\$/X/p" conf$$subs.sed | grep -c X` = 95; then + if test `sed -n "s/.*$ac_delim\$/X/p" conf$$subs.sed | grep -c X` = 96; then break elif $ac_last_try; then { { echo "$as_me:$LINENO: error: could not make $CONFIG_STATUS" >&5 @@ -36338,7 +36348,7 @@ cat >>$CONFIG_STATUS <<_ACEOF cat >"\$tmp/subs-2.sed" <<\CEOF$ac_eof -/@[a-zA-Z_][a-zA-Z_0-9]*@/!b end +/@[a-zA-Z_][a-zA-Z_0-9]*@/!b _ACEOF sed ' s/[,\\&]/\\&/g; s/@/@|#_!!_#|/g @@ -36351,8 +36361,6 @@ ' >>$CONFIG_STATUS >$CONFIG_STATUS <<_ACEOF -:end -s/|#_!!_#|//g CEOF$ac_eof _ACEOF @@ -36600,7 +36608,7 @@ s&@abs_top_builddir@&$ac_abs_top_builddir&;t t s&@INSTALL@&$ac_INSTALL&;t t $ac_datarootdir_hack -" $ac_file_inputs | sed -f "$tmp/subs-1.sed" | sed -f "$tmp/subs-2.sed" >$tmp/out +" $ac_file_inputs | sed -f "$tmp/subs-1.sed" | sed -f "$tmp/subs-2.sed" | sed 's/|#_!!_#|//g' >$tmp/out test -z "$ac_datarootdir_hack$ac_datarootdir_seen" && { ac_out=`sed -n '/\${datarootdir}/p' "$tmp/out"`; test -n "$ac_out"; } && From daniel at zuster.org Tue Nov 24 22:46:59 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Wed, 25 Nov 2009 04:46:59 -0000 Subject: [llvm-commits] [llvm] r89841 - in /llvm/trunk: Makefile Makefile.rules cmake/modules/LLVMConfig.cmake include/llvm/Target/TargetSelect.h tools/llvm-config/llvm-config.in.in Message-ID: <200911250447.nAP4l0tB028467@zion.cs.uiuc.edu> Author: ddunbar Date: Tue Nov 24 22:46:58 2009 New Revision: 89841 URL: http://llvm.org/viewvc/llvm-project?rev=89841&view=rev Log: Add the rest of the build system logic for optional target disassemblers Modified: llvm/trunk/Makefile llvm/trunk/Makefile.rules llvm/trunk/cmake/modules/LLVMConfig.cmake llvm/trunk/include/llvm/Target/TargetSelect.h llvm/trunk/tools/llvm-config/llvm-config.in.in Modified: llvm/trunk/Makefile URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/Makefile?rev=89841&r1=89840&r2=89841&view=diff 
============================================================================== --- llvm/trunk/Makefile (original) +++ llvm/trunk/Makefile Tue Nov 24 22:46:58 2009 @@ -155,9 +155,11 @@ FilesToConfig := \ include/llvm/Config/config.h \ include/llvm/Config/Targets.def \ - include/llvm/Config/AsmPrinters.def \ + include/llvm/Config/AsmPrinters.def \ + include/llvm/Config/AsmParsers.def \ + include/llvm/Config/Disassemblers.def \ include/llvm/System/DataTypes.h \ - tools/llvmc/plugins/Base/Base.td + tools/llvmc/plugins/Base/Base.td FilesToConfigPATH := $(addprefix $(LLVM_OBJ_ROOT)/,$(FilesToConfig)) all-local:: $(FilesToConfigPATH) Modified: llvm/trunk/Makefile.rules URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/Makefile.rules?rev=89841&r1=89840&r2=89841&view=diff ============================================================================== --- llvm/trunk/Makefile.rules (original) +++ llvm/trunk/Makefile.rules Tue Nov 24 22:46:58 2009 @@ -1565,6 +1565,11 @@ $(Echo) "Building $( Author: evocallaghan Date: Tue Nov 24 23:38:41 2009 New Revision: 89844 URL: http://llvm.org/viewvc/llvm-project?rev=89844&view=rev Log: Reverting patch in revision 89758, initial attempt at fixing PR5373 has proven to be bogus. Removed: llvm/trunk/test/Transforms/LoopUnswitch/5373.ll Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp?rev=89844&r1=89843&r2=89844&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/LoopUnswitch.cpp Tue Nov 24 23:38:41 2009 @@ -296,6 +296,7 @@ // first exit. if (ExitBB != 0) return false; ExitBB = BB; + return true; } // Otherwise, this is an unvisited intra-loop node. Check all successors. 
Removed: llvm/trunk/test/Transforms/LoopUnswitch/5373.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopUnswitch/5373.ll?rev=89843&view=auto ============================================================================== --- llvm/trunk/test/Transforms/LoopUnswitch/5373.ll (original) +++ llvm/trunk/test/Transforms/LoopUnswitch/5373.ll (removed) @@ -1,24 +0,0 @@ -; RUN: opt < %s -loop-unswitch -stats -disable-output |& grep "3 loop-unswitch - Number of branches unswitched" - -define noalias i32* @func_16(i32** %p_18, i32* %p_20) noreturn nounwind ssp { -entry: - %lnot = icmp eq i32** %p_18, null ; [#uses=1] - %lnot6 = icmp eq i32* %p_20, null ; [#uses=1] - br label %for.body - -for.body: ; preds = %cond.end, %entry - br i1 %lnot, label %cond.end, label %cond.true - -cond.true: ; preds = %for.body - tail call void @f() - unreachable - -cond.end: ; preds = %for.body - br i1 %lnot6, label %for.body, label %cond.true10 - -cond.true10: ; preds = %cond.end - tail call void @f() - unreachable -} - -declare void @f() noreturn From dgregor at apple.com Wed Nov 25 00:04:19 2009 From: dgregor at apple.com (Douglas Gregor) Date: Wed, 25 Nov 2009 06:04:19 -0000 Subject: [llvm-commits] [llvm] r89846 - /llvm/trunk/lib/Support/CommandLine.cpp Message-ID: <200911250604.nAP64J44031081@zion.cs.uiuc.edu> Author: dgregor Date: Wed Nov 25 00:04:18 2009 New Revision: 89846 URL: http://llvm.org/viewvc/llvm-project?rev=89846&view=rev Log: Perform explicit instantiations in the proper namespace, since Clang diagnoses this ill-formity. 
Modified: llvm/trunk/lib/Support/CommandLine.cpp Modified: llvm/trunk/lib/Support/CommandLine.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Support/CommandLine.cpp?rev=89846&r1=89845&r2=89846&view=diff ============================================================================== --- llvm/trunk/lib/Support/CommandLine.cpp (original) +++ llvm/trunk/lib/Support/CommandLine.cpp Wed Nov 25 00:04:18 2009 @@ -39,6 +39,7 @@ //===----------------------------------------------------------------------===// // Template instantiations and anchors. // +namespace llvm { namespace cl { TEMPLATE_INSTANTIATION(class basic_parser); TEMPLATE_INSTANTIATION(class basic_parser); TEMPLATE_INSTANTIATION(class basic_parser); @@ -53,6 +54,7 @@ TEMPLATE_INSTANTIATION(class opt); TEMPLATE_INSTANTIATION(class opt); TEMPLATE_INSTANTIATION(class opt); +} } // end namespace llvm::cl void Option::anchor() {} void basic_parser_impl::anchor() {} From eocallaghan at auroraux.org Wed Nov 25 00:32:20 2009 From: eocallaghan at auroraux.org (Edward O'Callaghan) Date: Wed, 25 Nov 2009 06:32:20 -0000 Subject: [llvm-commits] [llvm] r89848 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc Message-ID: <200911250632.nAP6WKOF032121@zion.cs.uiuc.edu> Author: evocallaghan Date: Wed Nov 25 00:32:19 2009 New Revision: 89848 URL: http://llvm.org/viewvc/llvm-project?rev=89848&view=rev Log: API change Path::isSpecialFile to Path::isRegularFile, improve semantics in regards to comments from 89765 post review. 
Modified: llvm/trunk/include/llvm/System/Path.h llvm/trunk/lib/System/Unix/Path.inc llvm/trunk/lib/System/Win32/Path.inc Modified: llvm/trunk/include/llvm/System/Path.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/System/Path.h?rev=89848&r1=89847&r2=89848&view=diff ============================================================================== --- llvm/trunk/include/llvm/System/Path.h (original) +++ llvm/trunk/include/llvm/System/Path.h Wed Nov 25 00:32:19 2009 @@ -380,10 +380,12 @@ /// in the file system. bool canWrite() const; - /// This function checks that what we're trying to work only on a regular file or directory. - /// Check for things like /dev/null, any block special file, + /// This function checks that what we're trying to work only on a regular file + /// or directory. Check for things like /dev/null, any block special file, /// or other things that aren't "regular" regular files or directories. - bool isSpecialFile() const; + /// @returns true if the file is S_ISREG. + /// @brief Determines if the file is a regular file + bool isRegularFile() const; /// This function determines if the path name references an executable /// file in the file system. 
This function checks for the existence and Modified: llvm/trunk/lib/System/Unix/Path.inc URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Unix/Path.inc?rev=89848&r1=89847&r2=89848&view=diff ============================================================================== --- llvm/trunk/lib/System/Unix/Path.inc (original) +++ llvm/trunk/lib/System/Unix/Path.inc Wed Nov 25 00:32:19 2009 @@ -454,17 +454,17 @@ } bool -Path::isSpecialFile() const { +Path::isRegularFile() const { // Get the status so we can determine if its a file or directory struct stat buf; if (0 != stat(path.c_str(), &buf)) - return true; - - if (S_ISDIR(buf.st_mode) || S_ISREG(buf.st_mode)) return false; - return true; + if (S_ISREG(buf.st_mode)) + return true; + + return false; } bool Modified: llvm/trunk/lib/System/Win32/Path.inc URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Win32/Path.inc?rev=89848&r1=89847&r2=89848&view=diff ============================================================================== --- llvm/trunk/lib/System/Win32/Path.inc (original) +++ llvm/trunk/lib/System/Win32/Path.inc Wed Nov 25 00:32:19 2009 @@ -358,8 +358,10 @@ } bool -Path::isSpecialFile() const { - return false; +Path::isRegularFile() const { + if (isDirectory()) + return false; + return true; } std::string From daniel at zuster.org Wed Nov 25 00:53:09 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Wed, 25 Nov 2009 06:53:09 -0000 Subject: [llvm-commits] [llvm] r89850 - in /llvm/trunk/lib/Target/X86: Disassembler/ Disassembler/CMakeLists.txt Disassembler/Makefile Disassembler/X86Disassembler.cpp Makefile Message-ID: <200911250653.nAP6r9cv000413@zion.cs.uiuc.edu> Author: ddunbar Date: Wed Nov 25 00:53:08 2009 New Revision: 89850 URL: http://llvm.org/viewvc/llvm-project?rev=89850&view=rev Log: Sketch structure for X86 disassembler. 
Added: llvm/trunk/lib/Target/X86/Disassembler/ llvm/trunk/lib/Target/X86/Disassembler/CMakeLists.txt llvm/trunk/lib/Target/X86/Disassembler/Makefile llvm/trunk/lib/Target/X86/Disassembler/X86Disassembler.cpp Modified: llvm/trunk/lib/Target/X86/Makefile Added: llvm/trunk/lib/Target/X86/Disassembler/CMakeLists.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/Disassembler/CMakeLists.txt?rev=89850&view=auto ============================================================================== --- llvm/trunk/lib/Target/X86/Disassembler/CMakeLists.txt (added) +++ llvm/trunk/lib/Target/X86/Disassembler/CMakeLists.txt Wed Nov 25 00:53:08 2009 @@ -0,0 +1,6 @@ +include_directories( ${CMAKE_CURRENT_BINARY_DIR}/.. ${CMAKE_CURRENT_SOURCE_DIR}/.. ) + +add_llvm_library(LLVMX86Disassembler + X86Disassembler.cpp + ) +add_dependencies(LLVMX86Disassembler X86CodeGenTable_gen) Added: llvm/trunk/lib/Target/X86/Disassembler/Makefile URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/Disassembler/Makefile?rev=89850&view=auto ============================================================================== --- llvm/trunk/lib/Target/X86/Disassembler/Makefile (added) +++ llvm/trunk/lib/Target/X86/Disassembler/Makefile Wed Nov 25 00:53:08 2009 @@ -0,0 +1,16 @@ +##===- lib/Target/X86/Disassembler/Makefile ----------------*- Makefile -*-===## +# +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. +# +##===----------------------------------------------------------------------===## + +LEVEL = ../../../.. +LIBRARYNAME = LLVMX86Disassembler + +# Hack: we need to include 'main' x86 target directory to grab private headers +CPPFLAGS = -I$(PROJ_OBJ_DIR)/.. -I$(PROJ_SRC_DIR)/.. 
+ +include $(LEVEL)/Makefile.common Added: llvm/trunk/lib/Target/X86/Disassembler/X86Disassembler.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/Disassembler/X86Disassembler.cpp?rev=89850&view=auto ============================================================================== --- llvm/trunk/lib/Target/X86/Disassembler/X86Disassembler.cpp (added) +++ llvm/trunk/lib/Target/X86/Disassembler/X86Disassembler.cpp Wed Nov 25 00:53:08 2009 @@ -0,0 +1,29 @@ +//===- X86Disassembler.cpp - Disassembler for x86 and x86_64 ----*- C++ -*-===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// + +#include "llvm/MC/MCDisassembler.h" +#include "llvm/Target/TargetRegistry.h" +#include "X86.h" +using namespace llvm; + +static const MCDisassembler *createX86_32Disassembler(const Target &T) { + return 0; +} + +static const MCDisassembler *createX86_64Disassembler(const Target &T) { + return 0; +} + +extern "C" void LLVMInitializeX86Disassembler() { + // Register the disassembler. 
+ TargetRegistry::RegisterMCDisassembler(TheX86_32Target, + createX86_32Disassembler); + TargetRegistry::RegisterMCDisassembler(TheX86_64Target, + createX86_64Disassembler); +} Modified: llvm/trunk/lib/Target/X86/Makefile URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/Makefile?rev=89850&r1=89849&r2=89850&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/Makefile (original) +++ llvm/trunk/lib/Target/X86/Makefile Wed Nov 25 00:53:08 2009 @@ -18,6 +18,6 @@ X86GenFastISel.inc \ X86GenCallingConv.inc X86GenSubtarget.inc -DIRS = AsmPrinter AsmParser TargetInfo +DIRS = AsmPrinter AsmParser Disassembler TargetInfo include $(LEVEL)/Makefile.common From baldrick at free.fr Wed Nov 25 03:16:55 2009 From: baldrick at free.fr (Duncan Sands) Date: Wed, 25 Nov 2009 10:16:55 +0100 Subject: [llvm-commits] [llvm] r89848 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc In-Reply-To: <200911250632.nAP6WKOF032121@zion.cs.uiuc.edu> References: <200911250632.nAP6WKOF032121@zion.cs.uiuc.edu> Message-ID: <4B0CF607.6060601@free.fr> Hi Edward, > + /// This function checks that what we're trying to work only on a regular file > + /// or directory. Check for things like /dev/null, any block special file, "or directory" is no longer true. Ciao, Duncan. 
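For reference, the Unix-side semantics that r89848 settles on (only S_ISREG counts; a directory, a device node, or a path that cannot be stat'ed does not) can be sketched as a free function. This is a minimal sketch, not the LLVM source: `isRegularFilePath` is a hypothetical stand-in for `Path::isRegularFile()`, and it assumes a POSIX `stat`.

```cpp
#include <string>
#include <sys/stat.h>

// Hypothetical stand-in for Path::isRegularFile() on a POSIX system.
// Returns true only when the path can be stat'ed and is a regular file.
static bool isRegularFilePath(const std::string &path) {
  struct stat buf;
  // A path we cannot stat is treated as "not a regular file".
  if (0 != stat(path.c_str(), &buf))
    return false;
  // Directories, /dev/null, block special files, etc. all fail this check.
  return S_ISREG(buf.st_mode) != 0;
}
```

Under this sketch a directory yields false, which is why the "or directory" wording in the header comment no longer matches the behaviour.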
From baldrick at free.fr Wed Nov 25 03:24:33 2009 From: baldrick at free.fr (Duncan Sands) Date: Wed, 25 Nov 2009 10:24:33 +0100 Subject: [llvm-commits] [llvm] r89421 - /llvm/trunk/lib/Analysis/CaptureTracking.cpp In-Reply-To: <4B0C5C3B.5010903@apple.com> References: <200911200050.nAK0or7J026222@zion.cs.uiuc.edu> <4B068327.1070103@free.fr> <784C47FB-FB86-404E-B39F-8EAF9C4A98E0@apple.com> <4B080CEF.5060301@free.fr> <6A5CF08E-94EF-4893-81CC-5AEF0A19EB25@apple.com> <4B0A9A8C.9040905@free.fr> <006872F5-2CC8-400F-B995-22D5B52886BA@apple.com> <4B0C5411.5090800@free.fr> <4B0C5C3B.5010903@apple.com> Message-ID: <4B0CF7D1.4040003@free.fr> Hi John, > If we're not going to consider "correlated" captures, shouldn't the > model just be taint-checking? The original parameter is tainted, GEPs > are tainted, phis to which it's an input are tainted, possibly some > other things. The parameter is nocapture only if there's no "capturing" > use of a tainted value. That's about as good as you're going to get > without a lot of (expensive) sophistication. the current implementation of nocapture is very strict, and does not allow "correlated" captures or any other kind of capture. This seems stricter than is needed, since "capture" exists for the benefit of alias analysis, and most languages allow you assume that pointers don't alias in certain circumstances, and these circumstances seem to say that we can ignore "correlated" capture - though this isn't completely clear. So I've started experimenting with a more liberal notion of capture, essentially the notion you describe above. If this doesn't give any practical improvement over the current strict definition, then I guess we will stay with the current version. Ciao, Duncan. 
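The taint-checking model John describes can be sketched on a toy value graph. This is only an illustration under stated assumptions, not the code in CaptureTracking.cpp: `Kind`, `Node`, and `isNoCapture` are hypothetical names, taint flows only through GEP and phi nodes, and the only "capturing" uses modelled are storing the pointer value and returning it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction kinds; hypothetical, not LLVM's IR classes.
// StoreValue models a store whose *stored value* is the operand.
enum class Kind { Param, GEP, Phi, Load, StoreValue, Return };

struct Node {
  Kind kind;
  std::vector<std::size_t> operands; // indices of the values this node uses
};

// Returns true if no tainted value reaches a capturing use.
// Taint starts at `param` and is propagated to a fixed point.
static bool isNoCapture(const std::vector<Node> &nodes, std::size_t param) {
  assert(param < nodes.size());
  std::vector<bool> tainted(nodes.size(), false);
  tainted[param] = true;
  bool changed = true;
  while (changed) { // repeat so that phi cycles converge
    changed = false;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
      bool usesTainted = false;
      for (std::size_t op : nodes[i].operands)
        if (tainted[op])
          usesTainted = true;
      if (!usesTainted)
        continue;
      switch (nodes[i].kind) {
      case Kind::GEP:
      case Kind::Phi:
        // Derived pointers carry the taint forward.
        if (!tainted[i]) {
          tainted[i] = true;
          changed = true;
        }
        break;
      case Kind::StoreValue:
      case Kind::Return:
        return false; // a tainted pointer escapes: captured
      default:
        break; // e.g. a load through the pointer is not a capture here
      }
    }
  }
  return true;
}
```

In this model a parameter that is only indexed and loaded from stays nocapture, while one that reaches a return through a phi does not.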
From eocallaghan at auroraux.org Wed Nov 25 06:00:34 2009 From: eocallaghan at auroraux.org (Edward O'Callaghan) Date: Wed, 25 Nov 2009 12:00:34 -0000 Subject: [llvm-commits] [llvm] r89862 - /llvm/trunk/include/llvm/System/Path.h Message-ID: <200911251200.nAPC0Ype026465@zion.cs.uiuc.edu> Author: evocallaghan Date: Wed Nov 25 06:00:34 2009 New Revision: 89862 URL: http://llvm.org/viewvc/llvm-project?rev=89862&view=rev Log: Adjust comments to new semantics. Modified: llvm/trunk/include/llvm/System/Path.h Modified: llvm/trunk/include/llvm/System/Path.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/System/Path.h?rev=89862&r1=89861&r2=89862&view=diff ============================================================================== --- llvm/trunk/include/llvm/System/Path.h (original) +++ llvm/trunk/include/llvm/System/Path.h Wed Nov 25 06:00:34 2009 @@ -380,9 +380,9 @@ /// in the file system. bool canWrite() const; - /// This function checks that what we're trying to work only on a regular file - /// or directory. Check for things like /dev/null, any block special file, - /// or other things that aren't "regular" regular files or directories. + /// This function checks that what we're trying to work only on a regular file. + /// Check for things like /dev/null, any block special file, + /// or other things that aren't "regular" regular files. /// @returns true if the file is S_ISREG. 
/// @brief Determines if the file is a regular file bool isRegularFile() const; From eocallaghan at auroraux.org Wed Nov 25 06:01:51 2009 From: eocallaghan at auroraux.org (Edward O'Callaghan) Date: Wed, 25 Nov 2009 12:01:51 +0000 Subject: [llvm-commits] [llvm] r89848 - in /llvm/trunk: include/llvm/System/Path.h lib/System/Unix/Path.inc lib/System/Win32/Path.inc In-Reply-To: <4B0CF607.6060601@free.fr> References: <200911250632.nAP6WKOF032121@zion.cs.uiuc.edu> <4B0CF607.6060601@free.fr> Message-ID: <521640720911250401x7d34d65k432ebb6d19c4721f@mail.gmail.com> Fixed in revision 89862, nice spot. Cheers, Edward. 2009/11/25 Duncan Sands : > Hi Edward, > >> + /// This function checks that what we're trying to work only on a >> regular file >> + /// or directory. Check for things like /dev/null, any block >> special file, > > "or directory" is no longer true. > > Ciao, > > Duncan. > -- -- Edward O'Callaghan http://www.auroraux.org/ eocallaghan at auroraux dot org --- () ascii ribbon campaign - against html e-mail /\ - against microsoft attachments From bruno.cardoso at gmail.com Wed Nov 25 06:17:59 2009 From: bruno.cardoso at gmail.com (Bruno Cardoso Lopes) Date: Wed, 25 Nov 2009 12:17:59 -0000 Subject: [llvm-commits] [llvm] r89863 - in /llvm/trunk: lib/Target/Mips/MipsISelDAGToDAG.cpp lib/Target/Mips/MipsISelLowering.cpp test/CodeGen/Mips/2009-11-16-CstPoolLoad.ll Message-ID: <200911251217.nAPCHxvW027007@zion.cs.uiuc.edu> Author: bruno Date: Wed Nov 25 06:17:58 2009 New Revision: 89863 URL: http://llvm.org/viewvc/llvm-project?rev=89863&view=rev Log: Support PIC loading of constant pool entries Modified: llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp llvm/trunk/test/CodeGen/Mips/2009-11-16-CstPoolLoad.ll Modified: llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp?rev=89863&r1=89862&r2=89863&view=diff
============================================================================== --- llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsISelDAGToDAG.cpp Wed Nov 25 06:17:58 2009 @@ -144,6 +144,7 @@ // on PIC code Load GA if (TM.getRelocationModel() == Reloc::PIC_) { if ((Addr.getOpcode() == ISD::TargetGlobalAddress) || + (Addr.getOpcode() == ISD::TargetConstantPool) || (Addr.getOpcode() == ISD::TargetJumpTable)){ Base = CurDAG->getRegister(Mips::GP, MVT::i32); Offset = Addr; @@ -174,23 +175,21 @@ } // When loading from constant pools, load the lower address part in - // the instruction itself. Instead of: + // the instruction itself. Example, instead of: // lui $2, %hi($CPI1_0) // addiu $2, $2, %lo($CPI1_0) // lwc1 $f0, 0($2) // Generate: // lui $2, %hi($CPI1_0) // lwc1 $f0, %lo($CPI1_0)($2) - if (Addr.getOperand(0).getOpcode() == MipsISD::Hi && + if ((Addr.getOperand(0).getOpcode() == MipsISD::Hi || + Addr.getOperand(0).getOpcode() == ISD::LOAD) && Addr.getOperand(1).getOpcode() == MipsISD::Lo) { SDValue LoVal = Addr.getOperand(1); - if (ConstantPoolSDNode *CP = dyn_cast( - LoVal.getOperand(0))) { - if (!CP->getOffset()) { - Base = Addr.getOperand(0); - Offset = LoVal.getOperand(0); - return true; - } + if (dyn_cast(LoVal.getOperand(0))) { + Base = Addr.getOperand(0); + Offset = LoVal.getOperand(0); + return true; } } } Modified: llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp?rev=89863&r1=89862&r2=89863&view=diff ============================================================================== --- llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp (original) +++ llvm/trunk/lib/Target/Mips/MipsISelLowering.cpp Wed Nov 25 06:17:58 2009 @@ -563,8 +563,6 @@ SDValue ResNode; ConstantPoolSDNode *N = cast(Op); Constant *C = N->getConstVal(); - SDValue CP = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(), - N->getOffset(), 
MipsII::MO_ABS_HILO); // FIXME there isn't actually debug info here DebugLoc dl = Op.getDebugLoc(); @@ -577,11 +575,21 @@ // SDValue GPRelNode = DAG.getNode(MipsISD::GPRel, MVT::i32, CP); // SDValue GOT = DAG.getGLOBAL_OFFSET_TABLE(MVT::i32); // ResNode = DAG.getNode(ISD::ADD, MVT::i32, GOT, GPRelNode); - //} else { // %hi/%lo relocation + + if (getTargetMachine().getRelocationModel() != Reloc::PIC_) { + SDValue CP = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(), + N->getOffset(), MipsII::MO_ABS_HILO); SDValue HiPart = DAG.getNode(MipsISD::Hi, dl, MVT::i32, CP); SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, CP); ResNode = DAG.getNode(ISD::ADD, dl, MVT::i32, HiPart, Lo); - //} + } else { + SDValue CP = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(), + N->getOffset(), MipsII::MO_GOT); + SDValue Load = DAG.getLoad(MVT::i32, dl, DAG.getEntryNode(), + CP, NULL, 0); + SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, CP); + ResNode = DAG.getNode(ISD::ADD, dl, MVT::i32, Load, Lo); + } return ResNode; } Modified: llvm/trunk/test/CodeGen/Mips/2009-11-16-CstPoolLoad.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Mips/2009-11-16-CstPoolLoad.ll?rev=89863&r1=89862&r2=89863&view=diff ============================================================================== --- llvm/trunk/test/CodeGen/Mips/2009-11-16-CstPoolLoad.ll (original) +++ llvm/trunk/test/CodeGen/Mips/2009-11-16-CstPoolLoad.ll Wed Nov 25 06:17:58 2009 @@ -4,7 +4,7 @@ define float @h() nounwind readnone { entry: -; CHECK: lui $2, %hi($CPI1_0) +; CHECK: lw $2, %got($CPI1_0)($gp) ; CHECK: lwc1 $f0, %lo($CPI1_0)($2) ret float 0x400B333340000000 } From gohman at apple.com Wed Nov 25 07:27:36 2009 From: gohman at apple.com (Dan Gohman) Date: Wed, 25 Nov 2009 05:27:36 -0800 Subject: [llvm-commits] [PATCH] More Spill Annotations In-Reply-To: <200911241403.15274.dag@cray.com> References: <200911201622.37994.dag@cray.com> <200911241348.43974.dag@cray.com> 
<5120DADD-37E1-4296-B4CF-59AE9316EDBB@apple.com> <200911241403.15274.dag@cray.com> Message-ID: On Nov 24, 2009, at 12:03 PM, David Greene wrote: > On Tuesday 24 November 2009 13:59, Evan Cheng wrote: > >>> Yes, that would be cool. I have wanted type information at the >>> MachineInstr level for some time now. Do you imagine extending the >>> Instruction and Operand classes in Target.td and then having >>> TableGen >>> fill in those bits? >> >> Only on instruction level, not at the operand level. Also note the >> information is imprecise. Now that I think about it, I am not sure >> adding >> ValueType to TargetInstrDesc makes sense since it would be a many >> (types) >> to one mapping. But perhaps properties like bitwidth, isVector make >> sense. >> >> I don't think we need to add anything to Target.td. We just need to >> enhance >> TargetInstrDesc and have InstrInfoEmitter fill in the information. > > I think we need to add a flag to Target.td to override TableGen's > inference > of "isVector." For example: > > let isVector = 0 in > def Int_CVTSS2SIrr : SSI<0x2D, MRMSrcReg, (outs GR32:$dst), (ins > VR128:$src), > "cvtss2si\t{$src, $dst|$dst, $src}", > [(set GR32:$dst, (int_x86_sse_cvtss2si > VR128$src))]>; Hi Dave, This raises the question of what you're actually aiming at here. Does it really make sense to impose the Vector and Scalar dichotomy on an architecture like x86? Dan From dag at cray.com Wed Nov 25 09:08:34 2009 From: dag at cray.com (David Greene) Date: Wed, 25 Nov 2009 09:08:34 -0600 Subject: [llvm-commits] [PATCH] More Spill Annotations In-Reply-To: References: <200911201622.37994.dag@cray.com> <200911241403.15274.dag@cray.com> Message-ID: <200911250908.34818.dag@cray.com> On Wednesday 25 November 2009 07:27, Dan Gohman wrote: > This raises the question of what you're actually aiming at here. Does > it really make sense to impose the Vector and Scalar dichotomy on an > architecture like x86? Yeah, I think so. 
There are vector instructions and there are scalar instructions. We want to know when certain more expensive (i.e. vector) things happen. In this case, we want to have some idea of how much data is being transferred with a spill. Of course cache lines come into play here but the vector/scalar categorization gives us an idea of how much of that cache line we're actually touching. What are you thinking about this? What makes you nervous? -Dave From devang.patel at gmail.com Wed Nov 25 11:25:49 2009 From: devang.patel at gmail.com (Devang Patel) Date: Wed, 25 Nov 2009 09:25:49 -0800 Subject: [llvm-commits] [PATCH] LTO code generator options In-Reply-To: <3780A335BCB5498984B6E0408C2EDB0D@andreic6e7fe55> References: <04F6B1512E264B27AEE607542FCDD113@andreic6e7fe55> <41BA1AA405BC4D19BA9B4FAB6543D62F@andreic6e7fe55> <38a0d8450911190723g644ad4c7ife769ab35da9efb9@mail.gmail.com> <38a0d8450911200722i5efa690ci6ab671d71b5f40dc@mail.gmail.com> <38a0d8450911231309t6f37e2a0ga7c9eaa50d495c60@mail.gmail.com> <2E9E5BD4D91C4B32850CD83726EFE19B@andreic6e7fe55> <352a1fb20911241558o4131950di5e3bac9db3a31e30@mail.gmail.com> <3780A335BCB5498984B6E0408C2EDB0D@andreic6e7fe55> Message-ID: <352a1fb20911250925q4d4b8b9fwae1ac454f4c0341c@mail.gmail.com> On Tue, Nov 24, 2009 at 4:39 PM, Viktor Kutuzov wrote: > Hi Devang, > > What use cases you are thinking of? Several. 1 - LTO. The approach of having libLTO accept command line options requires the linker to accept these command line options. This may require a) a linker change and b) build system changes to supply additional linker options while using LTO. This goes against ease of use, that is, transparent LTO use that does not require any changes other than adding -O4 or -flto on the compiler command line. If subtarget features are encoded in bitcode files then LTO use does not require any linker changes. 2 - One of the big strengths of the llvm toolset is the bitcode representation, which can be saved on disk and later used to generate code.
If I do $ llvm-gcc foo.c -o foo.bc -emit-llvm; $ llc foo.bc -o foo.s then it should match $ llvm-gcc -S foo.c -o foo.s If I have to add/introduce additional llc command line options then bitcode users may be surprised, which may in turn cost resources and may require extra work from the user. Again, this can be avoided if subtarget features are encoded in the bitcode file. 3 - It is quite reasonable for some to put two copies of a function, one for SSE3 machines, one for non-SSE machines, in one bitcode file and let the code generator generate appropriate code for each function so that the user can select the desired function at run time. At Apple, we supported similar requirements for Altivec vs. non-Altivec code. This can be achieved if subtarget features like SSE3 are encoded in bitcode files. 4 - This was one of the biggest driving points behind the current function attribute (originally proposed as function notes, if you search the archive) infrastructure we have in llvm. - Devang From bob.wilson at apple.com Wed Nov 25 11:27:53 2009 From: bob.wilson at apple.com (Bob Wilson) Date: Wed, 25 Nov 2009 17:27:53 -0000 Subject: [llvm-commits] [llvm] r89865 - /llvm/trunk/lib/Target/X86/X86InstrInfo.h Message-ID: <200911251727.nAPHRrl3006122@zion.cs.uiuc.edu> Author: bwilson Date: Wed Nov 25 11:27:53 2009 New Revision: 89865 URL: http://llvm.org/viewvc/llvm-project?rev=89865&view=rev Log: Based on the testcase for pr3120, running on my MacPro with Xeon processors, it is definitely profitable to tail duplicate indirect branches for x86. This is likely to be true to various degrees for all modern x86 processors.
Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.h Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.h?rev=89865&r1=89864&r2=89865&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrInfo.h (original) +++ llvm/trunk/lib/Target/X86/X86InstrInfo.h Wed Nov 25 11:27:53 2009 @@ -632,6 +632,8 @@ /// unsigned getGlobalBaseReg(MachineFunction *MF) const; + virtual bool isProfitableToDuplicateIndirectBranch() const { return true; } + private: MachineInstr* foldMemoryOperandImpl(MachineFunction &MF, MachineInstr* MI, From dpatel at apple.com Wed Nov 25 11:36:50 2009 From: dpatel at apple.com (Devang Patel) Date: Wed, 25 Nov 2009 17:36:50 -0000 Subject: [llvm-commits] [llvm] r89866 - in /llvm/trunk: include/llvm/Analysis/DebugInfo.h lib/Analysis/DebugInfo.cpp lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/AsmPrinter/DwarfDebug.h lib/Target/PIC16/PIC16DebugInfo.cpp Message-ID: <200911251736.nAPHaox8006395@zion.cs.uiuc.edu> Author: dpatel Date: Wed Nov 25 11:36:49 2009 New Revision: 89866 URL: http://llvm.org/viewvc/llvm-project?rev=89866&view=rev Log: Use StringRef (again) in DebugInfo interface. Modified: llvm/trunk/include/llvm/Analysis/DebugInfo.h llvm/trunk/lib/Analysis/DebugInfo.cpp llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h llvm/trunk/lib/Target/PIC16/PIC16DebugInfo.cpp Modified: llvm/trunk/include/llvm/Analysis/DebugInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/DebugInfo.h?rev=89866&r1=89865&r2=89866&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/DebugInfo.h (original) +++ llvm/trunk/include/llvm/Analysis/DebugInfo.h Wed Nov 25 11:36:49 2009 @@ -55,7 +55,7 @@ /// not, the debug info is corrupt and we ignore it. 
DIDescriptor(MDNode *N, unsigned RequiredTag); - const char *getStringField(unsigned Elt) const; + StringRef getStringField(unsigned Elt) const; unsigned getUnsignedField(unsigned Elt) const { return (unsigned)getUInt64Field(Elt); } @@ -137,8 +137,8 @@ } virtual ~DIScope() {} - const char *getFilename() const; - const char *getDirectory() const; + StringRef getFilename() const; + StringRef getDirectory() const; }; /// DICompileUnit - A wrapper for a compile unit. @@ -150,9 +150,9 @@ } unsigned getLanguage() const { return getUnsignedField(2); } - const char *getFilename() const { return getStringField(3); } - const char *getDirectory() const { return getStringField(4); } - const char *getProducer() const { return getStringField(5); } + StringRef getFilename() const { return getStringField(3); } + StringRef getDirectory() const { return getStringField(4); } + StringRef getProducer() const { return getStringField(5); } /// isMain - Each input file is encoded as a separate compile unit in LLVM /// debugging information output. However, many target specific tool chains @@ -165,7 +165,7 @@ bool isMain() const { return getUnsignedField(6); } bool isOptimized() const { return getUnsignedField(7); } - const char *getFlags() const { return getStringField(8); } + StringRef getFlags() const { return getStringField(8); } unsigned getRunTimeVersion() const { return getUnsignedField(9); } /// Verify - Verify that a compile unit is well formed. 
@@ -183,7 +183,7 @@ explicit DIEnumerator(MDNode *N = 0) : DIDescriptor(N, dwarf::DW_TAG_enumerator) {} - const char *getName() const { return getStringField(1); } + StringRef getName() const { return getStringField(1); } uint64_t getEnumValue() const { return getUInt64Field(2); } }; @@ -217,7 +217,7 @@ virtual ~DIType() {} DIDescriptor getContext() const { return getDescriptorField(1); } - const char *getName() const { return getStringField(2); } + StringRef getName() const { return getStringField(2); } DICompileUnit getCompileUnit() const{ return getFieldAs(3); } unsigned getLineNumber() const { return getUnsignedField(4); } uint64_t getSizeInBits() const { return getUInt64Field(5); } @@ -317,9 +317,9 @@ virtual ~DIGlobal() {} DIDescriptor getContext() const { return getDescriptorField(2); } - const char *getName() const { return getStringField(3); } - const char *getDisplayName() const { return getStringField(4); } - const char *getLinkageName() const { return getStringField(5); } + StringRef getName() const { return getStringField(3); } + StringRef getDisplayName() const { return getStringField(4); } + StringRef getLinkageName() const { return getStringField(5); } DICompileUnit getCompileUnit() const{ return getFieldAs(6); } unsigned getLineNumber() const { return getUnsignedField(7); } DIType getType() const { return getFieldAs(8); } @@ -342,16 +342,16 @@ } DIDescriptor getContext() const { return getDescriptorField(2); } - const char *getName() const { return getStringField(3); } - const char *getDisplayName() const { return getStringField(4); } - const char *getLinkageName() const { return getStringField(5); } + StringRef getName() const { return getStringField(3); } + StringRef getDisplayName() const { return getStringField(4); } + StringRef getLinkageName() const { return getStringField(5); } DICompileUnit getCompileUnit() const{ return getFieldAs(6); } unsigned getLineNumber() const { return getUnsignedField(7); } DICompositeType getType() const { return 
getFieldAs(8); } /// getReturnTypeName - Subprogram return types are encoded either as /// DIType or as DICompositeType. - const char *getReturnTypeName() const { + StringRef getReturnTypeName() const { DICompositeType DCT(getFieldAs(8)); if (!DCT.isNull()) { DIArray A = DCT.getTypeArray(); @@ -366,8 +366,8 @@ /// compile unit, like 'static' in C. unsigned isLocalToUnit() const { return getUnsignedField(9); } unsigned isDefinition() const { return getUnsignedField(10); } - const char *getFilename() const { return getCompileUnit().getFilename();} - const char *getDirectory() const { return getCompileUnit().getDirectory();} + StringRef getFilename() const { return getCompileUnit().getFilename();} + StringRef getDirectory() const { return getCompileUnit().getDirectory();} /// Verify - Verify that a subprogram descriptor is well formed. bool Verify() const; @@ -406,7 +406,7 @@ } DIDescriptor getContext() const { return getDescriptorField(1); } - const char *getName() const { return getStringField(2); } + StringRef getName() const { return getStringField(2); } DICompileUnit getCompileUnit() const{ return getFieldAs(3); } unsigned getLineNumber() const { return getUnsignedField(4); } DIType getType() const { return getFieldAs(5); } @@ -444,8 +444,8 @@ DbgNode = 0; } DIScope getContext() const { return getFieldAs(1); } - const char *getDirectory() const { return getContext().getDirectory(); } - const char *getFilename() const { return getContext().getFilename(); } + StringRef getDirectory() const { return getContext().getDirectory(); } + StringRef getFilename() const { return getContext().getFilename(); } }; /// DILocation - This object holds location information. 
This object @@ -458,8 +458,8 @@ unsigned getColumnNumber() const { return getUnsignedField(1); } DIScope getScope() const { return getFieldAs(2); } DILocation getOrigLocation() const { return getFieldAs(3); } - const char *getFilename() const { return getScope().getFilename(); } - const char *getDirectory() const { return getScope().getDirectory(); } + StringRef getFilename() const { return getScope().getFilename(); } + StringRef getDirectory() const { return getScope().getDirectory(); } }; /// DIFactory - This object assists with the construction of the various @@ -489,26 +489,26 @@ /// CreateCompileUnit - Create a new descriptor for the specified compile /// unit. DICompileUnit CreateCompileUnit(unsigned LangID, - const char * Filename, - const char * Directory, - const char * Producer, + StringRef Filename, + StringRef Directory, + StringRef Producer, bool isMain = false, bool isOptimized = false, - const char *Flags = "", + StringRef Flags = "", unsigned RunTimeVer = 0); /// CreateEnumerator - Create a single enumerator value. - DIEnumerator CreateEnumerator(const char * Name, uint64_t Val); + DIEnumerator CreateEnumerator(StringRef Name, uint64_t Val); /// CreateBasicType - Create a basic type like int, float, etc. - DIBasicType CreateBasicType(DIDescriptor Context, const char * Name, + DIBasicType CreateBasicType(DIDescriptor Context, StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, uint64_t SizeInBits, uint64_t AlignInBits, uint64_t OffsetInBits, unsigned Flags, unsigned Encoding); /// CreateBasicType - Create a basic type like int, float, etc. - DIBasicType CreateBasicTypeEx(DIDescriptor Context, const char * Name, + DIBasicType CreateBasicTypeEx(DIDescriptor Context, StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, Constant *SizeInBits, Constant *AlignInBits, Constant *OffsetInBits, unsigned Flags, @@ -517,7 +517,7 @@ /// CreateDerivedType - Create a derived type like const qualified type, /// pointer, typedef, etc. 
DIDerivedType CreateDerivedType(unsigned Tag, DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, uint64_t SizeInBits, uint64_t AlignInBits, @@ -527,7 +527,7 @@ /// CreateDerivedType - Create a derived type like const qualified type, /// pointer, typedef, etc. DIDerivedType CreateDerivedTypeEx(unsigned Tag, DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, Constant *SizeInBits, Constant *AlignInBits, @@ -536,7 +536,7 @@ /// CreateCompositeType - Create a composite type like array, struct, etc. DICompositeType CreateCompositeType(unsigned Tag, DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, uint64_t SizeInBits, @@ -548,7 +548,7 @@ /// CreateCompositeType - Create a composite type like array, struct, etc. DICompositeType CreateCompositeTypeEx(unsigned Tag, DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, Constant *SizeInBits, @@ -560,25 +560,25 @@ /// CreateSubprogram - Create a new descriptor for the specified subprogram. /// See comments in DISubprogram for descriptions of these fields. - DISubprogram CreateSubprogram(DIDescriptor Context, const char * Name, - const char * DisplayName, - const char * LinkageName, + DISubprogram CreateSubprogram(DIDescriptor Context, StringRef Name, + StringRef DisplayName, + StringRef LinkageName, DICompileUnit CompileUnit, unsigned LineNo, DIType Type, bool isLocalToUnit, bool isDefinition); /// CreateGlobalVariable - Create a new descriptor for the specified global. 
DIGlobalVariable - CreateGlobalVariable(DIDescriptor Context, const char * Name, - const char * DisplayName, - const char * LinkageName, + CreateGlobalVariable(DIDescriptor Context, StringRef Name, + StringRef DisplayName, + StringRef LinkageName, DICompileUnit CompileUnit, unsigned LineNo, DIType Type, bool isLocalToUnit, bool isDefinition, llvm::GlobalVariable *GV); /// CreateVariable - Create a new descriptor for the specified variable. DIVariable CreateVariable(unsigned Tag, DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNo, DIType Type); Modified: llvm/trunk/lib/Analysis/DebugInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/DebugInfo.cpp?rev=89866&r1=89865&r2=89866&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/DebugInfo.cpp (original) +++ llvm/trunk/lib/Analysis/DebugInfo.cpp Wed Nov 25 11:36:49 2009 @@ -78,19 +78,16 @@ } } -const char * +StringRef DIDescriptor::getStringField(unsigned Elt) const { if (DbgNode == 0) - return NULL; + return StringRef(); if (Elt < DbgNode->getNumElements()) - if (MDString *MDS = dyn_cast_or_null(DbgNode->getElement(Elt))) { - if (MDS->getLength() == 0) - return NULL; - return MDS->getString().data(); - } + if (MDString *MDS = dyn_cast_or_null(DbgNode->getElement(Elt))) + return MDS->getString(); - return NULL; + return StringRef(); } uint64_t DIDescriptor::getUInt64Field(unsigned Elt) const { @@ -310,8 +307,8 @@ bool DICompileUnit::Verify() const { if (isNull()) return false; - const char *N = getFilename(); - if (!N) + StringRef N = getFilename(); + if (N.empty()) return false; // It is possible that directory and produce string is empty. return true; @@ -366,7 +363,7 @@ if (isNull()) return false; - if (!getDisplayName()) + if (getDisplayName().empty()) return false; if (getContext().isNull()) @@ -426,15 +423,15 @@ /// information for the function F. 
bool DISubprogram::describes(const Function *F) { assert (F && "Invalid function"); - const char *Name = getLinkageName(); - if (!Name) + StringRef Name = getLinkageName(); + if (Name.empty()) Name = getName(); - if (strcmp(F->getName().data(), Name) == 0) + if (F->getName() == Name) return true; return false; } -const char *DIScope::getFilename() const { +StringRef DIScope::getFilename() const { if (isLexicalBlock()) return DILexicalBlock(DbgNode).getFilename(); else if (isSubprogram()) @@ -443,10 +440,10 @@ return DICompileUnit(DbgNode).getFilename(); else assert (0 && "Invalid DIScope!"); - return NULL; + return StringRef(); } -const char *DIScope::getDirectory() const { +StringRef DIScope::getDirectory() const { if (isLexicalBlock()) return DILexicalBlock(DbgNode).getDirectory(); else if (isSubprogram()) @@ -455,7 +452,7 @@ return DICompileUnit(DbgNode).getDirectory(); else assert (0 && "Invalid DIScope!"); - return NULL; + return StringRef(); } //===----------------------------------------------------------------------===// @@ -481,7 +478,8 @@ void DIType::dump() const { if (isNull()) return; - if (const char *Res = getName()) + StringRef Res = getName(); + if (!Res.empty()) errs() << " [" << Res << "] "; unsigned Tag = getTag(); @@ -538,7 +536,8 @@ /// dump - Print global. void DIGlobal::dump() const { - if (const char *Res = getName()) + StringRef Res = getName(); + if (!Res.empty()) errs() << " [" << Res << "] "; unsigned Tag = getTag(); @@ -562,7 +561,8 @@ /// dump - Print subprogram. void DISubprogram::dump() const { - if (const char *Res = getName()) + StringRef Res = getName(); + if (!Res.empty()) errs() << " [" << Res << "] "; unsigned Tag = getTag(); @@ -590,7 +590,8 @@ /// dump - Print variable. 
void DIVariable::dump() const { - if (const char *Res = getName()) + StringRef Res = getName(); + if (!Res.empty()) errs() << " [" << Res << "] "; getCompileUnit().dump(); @@ -651,12 +652,12 @@ /// CreateCompileUnit - Create a new descriptor for the specified compile /// unit. Note that this does not unique compile units within the module. DICompileUnit DIFactory::CreateCompileUnit(unsigned LangID, - const char * Filename, - const char * Directory, - const char * Producer, + StringRef Filename, + StringRef Directory, + StringRef Producer, bool isMain, bool isOptimized, - const char *Flags, + StringRef Flags, unsigned RunTimeVer) { Value *Elts[] = { GetTagConstant(dwarf::DW_TAG_compile_unit), @@ -675,7 +676,7 @@ } /// CreateEnumerator - Create a single enumerator value. -DIEnumerator DIFactory::CreateEnumerator(const char * Name, uint64_t Val){ +DIEnumerator DIFactory::CreateEnumerator(StringRef Name, uint64_t Val){ Value *Elts[] = { GetTagConstant(dwarf::DW_TAG_enumerator), MDString::get(VMContext, Name), @@ -687,7 +688,7 @@ /// CreateBasicType - Create a basic type like int, float, etc. DIBasicType DIFactory::CreateBasicType(DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, uint64_t SizeInBits, @@ -712,7 +713,7 @@ /// CreateBasicType - Create a basic type like int, float, etc. DIBasicType DIFactory::CreateBasicTypeEx(DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, Constant *SizeInBits, @@ -739,7 +740,7 @@ /// pointer, typedef, etc. DIDerivedType DIFactory::CreateDerivedType(unsigned Tag, DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, uint64_t SizeInBits, @@ -767,7 +768,7 @@ /// pointer, typedef, etc. 
DIDerivedType DIFactory::CreateDerivedTypeEx(unsigned Tag, DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, Constant *SizeInBits, @@ -794,7 +795,7 @@ /// CreateCompositeType - Create a composite type like array, struct, etc. DICompositeType DIFactory::CreateCompositeType(unsigned Tag, DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, uint64_t SizeInBits, @@ -826,7 +827,7 @@ /// CreateCompositeType - Create a composite type like array, struct, etc. DICompositeType DIFactory::CreateCompositeTypeEx(unsigned Tag, DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNumber, Constant *SizeInBits, @@ -859,9 +860,9 @@ /// See comments in DISubprogram for descriptions of these fields. This /// method does not unique the generated descriptors. DISubprogram DIFactory::CreateSubprogram(DIDescriptor Context, - const char * Name, - const char * DisplayName, - const char * LinkageName, + StringRef Name, + StringRef DisplayName, + StringRef LinkageName, DICompileUnit CompileUnit, unsigned LineNo, DIType Type, bool isLocalToUnit, @@ -885,9 +886,9 @@ /// CreateGlobalVariable - Create a new descriptor for the specified global. DIGlobalVariable -DIFactory::CreateGlobalVariable(DIDescriptor Context, const char * Name, - const char * DisplayName, - const char * LinkageName, +DIFactory::CreateGlobalVariable(DIDescriptor Context, StringRef Name, + StringRef DisplayName, + StringRef LinkageName, DICompileUnit CompileUnit, unsigned LineNo, DIType Type,bool isLocalToUnit, bool isDefinition, llvm::GlobalVariable *Val) { @@ -919,7 +920,7 @@ /// CreateVariable - Create a new descriptor for the specified variable. 
DIVariable DIFactory::CreateVariable(unsigned Tag, DIDescriptor Context, - const char * Name, + StringRef Name, DICompileUnit CompileUnit, unsigned LineNo, DIType Type) { Value *Elts[] = { @@ -1274,7 +1275,8 @@ if (!DIGV) return false; DIGlobalVariable Var(cast(DIGV)); - if (const char *D = Var.getDisplayName()) + StringRef D = Var.getDisplayName(); + if (!D.empty()) DisplayName = D; LineNo = Var.getLineNumber(); Unit = Var.getCompileUnit(); @@ -1284,18 +1286,22 @@ if (!DDI) return false; DIVariable Var(cast(DDI->getVariable())); - if (const char *D = Var.getName()) + StringRef D = Var.getName(); + if (!D.empty()) DisplayName = D; LineNo = Var.getLineNumber(); Unit = Var.getCompileUnit(); TypeD = Var.getType(); } - if (const char *T = TypeD.getName()) + StringRef T = TypeD.getName(); + if (!T.empty()) Type = T; - if (const char *F = Unit.getFilename()) + StringRef F = Unit.getFilename(); + if (!F.empty()) File = F; - if (const char *D = Unit.getDirectory()) + StringRef D = Unit.getDirectory(); + if (!D.empty()) Dir = D; return true; } Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=89866&r1=89865&r2=89866&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Wed Nov 25 11:36:49 2009 @@ -494,7 +494,7 @@ for (unsigned i = 0, N = Elements.getNumElements(); i < N; ++i) { DIDescriptor Element = Elements.getElement(i); DIDerivedType DT = DIDerivedType(Element.getNode()); - if (strcmp(Name.c_str(), DT.getName()) == 0) + if (Name == DT.getName()) return (DT.getTypeDerivedFrom()); } @@ -620,7 +620,7 @@ unsigned Tag = Ty.getTag(); bool isPointer = false; - const char *varName = VD.getName(); + StringRef varName = VD.getName(); if (Tag == dwarf::DW_TAG_pointer_type) { DIDerivedType DTy = DIDerivedType(Ty.getNode()); @@ 
-640,10 +640,10 @@ for (unsigned i = 0, N = Fields.getNumElements(); i < N; ++i) { DIDescriptor Element = Fields.getElement(i); DIDerivedType DT = DIDerivedType(Element.getNode()); - const char *fieldName = DT.getName(); - if (strcmp(fieldName, "__forwarding") == 0) + StringRef fieldName = DT.getName(); + if (fieldName == "__forwarding") forwardingField = Element; - else if (strcmp(fieldName, varName) == 0) + else if (fieldName == varName) varField = Element; } @@ -785,13 +785,13 @@ void DwarfDebug::constructTypeDIE(CompileUnit *DW_Unit, DIE &Buffer, DIBasicType BTy) { // Get core information. - const char *Name = BTy.getName(); + StringRef Name = BTy.getName(); Buffer.setTag(dwarf::DW_TAG_base_type); addUInt(&Buffer, dwarf::DW_AT_encoding, dwarf::DW_FORM_data1, BTy.getEncoding()); // Add name if not anonymous or intermediate type. - if (Name) + if (!Name.empty()) addString(&Buffer, dwarf::DW_AT_name, dwarf::DW_FORM_string, Name); uint64_t Size = BTy.getSizeInBits() >> 3; addUInt(&Buffer, dwarf::DW_AT_byte_size, 0, Size); @@ -801,7 +801,7 @@ void DwarfDebug::constructTypeDIE(CompileUnit *DW_Unit, DIE &Buffer, DIDerivedType DTy) { // Get core information. - const char *Name = DTy.getName(); + StringRef Name = DTy.getName(); uint64_t Size = DTy.getSizeInBits() >> 3; unsigned Tag = DTy.getTag(); @@ -815,7 +815,7 @@ addType(DW_Unit, &Buffer, FromTy); // Add name if not anonymous or intermediate type. - if (Name && Tag != dwarf::DW_TAG_pointer_type) + if (!Name.empty() && Tag != dwarf::DW_TAG_pointer_type) addString(&Buffer, dwarf::DW_AT_name, dwarf::DW_FORM_string, Name); // Add size if non-zero (derived types might be zero-sized.) @@ -831,7 +831,7 @@ void DwarfDebug::constructTypeDIE(CompileUnit *DW_Unit, DIE &Buffer, DICompositeType CTy) { // Get core information. 
- const char *Name = CTy.getName(); + StringRef Name = CTy.getName(); uint64_t Size = CTy.getSizeInBits() >> 3; unsigned Tag = CTy.getTag(); @@ -913,7 +913,7 @@ } // Add name if not anonymous or intermediate type. - if (Name) + if (!Name.empty()) addString(&Buffer, dwarf::DW_AT_name, dwarf::DW_FORM_string, Name); if (Tag == dwarf::DW_TAG_enumeration_type || @@ -984,7 +984,7 @@ /// constructEnumTypeDIE - Construct enum type DIE from DIEnumerator. DIE *DwarfDebug::constructEnumTypeDIE(CompileUnit *DW_Unit, DIEnumerator *ETy) { DIE *Enumerator = new DIE(dwarf::DW_TAG_enumerator); - const char *Name = ETy->getName(); + StringRef Name = ETy->getName(); addString(Enumerator, dwarf::DW_AT_name, dwarf::DW_FORM_string, Name); int64_t Value = ETy->getEnumValue(); addSInt(Enumerator, dwarf::DW_AT_const_value, dwarf::DW_FORM_sdata, Value); @@ -997,20 +997,20 @@ // If the global variable was optmized out then no need to create debug info // entry. if (!GV.getGlobal()) return NULL; - if (!GV.getDisplayName()) return NULL; + if (GV.getDisplayName().empty()) return NULL; DIE *GVDie = new DIE(dwarf::DW_TAG_variable); addString(GVDie, dwarf::DW_AT_name, dwarf::DW_FORM_string, GV.getDisplayName()); - const char *LinkageName = GV.getLinkageName(); - if (LinkageName) { + StringRef LinkageName = GV.getLinkageName(); + if (!LinkageName.empty()) { // Skip special LLVM prefix that is used to inform the asm printer to not // emit usual symbol prefix before the symbol name. This happens for // Objective-C symbol names and symbol whose name is replaced using GCC's // __asm__ attribute. if (LinkageName[0] == 1) - LinkageName = &LinkageName[1]; + LinkageName = LinkageName.data() + 1; addString(GVDie, dwarf::DW_AT_MIPS_linkage_name, dwarf::DW_FORM_string, LinkageName); } @@ -1032,9 +1032,10 @@ /// createMemberDIE - Create new member DIE. 
DIE *DwarfDebug::createMemberDIE(CompileUnit *DW_Unit, const DIDerivedType &DT){ DIE *MemberDie = new DIE(DT.getTag()); - if (const char *Name = DT.getName()) + StringRef Name = DT.getName(); + if (!Name.empty()) addString(MemberDie, dwarf::DW_AT_name, dwarf::DW_FORM_string, Name); - + addType(DW_Unit, MemberDie, DT.getTypeDerivedFrom()); addSourceLine(MemberDie, &DT); @@ -1087,18 +1088,16 @@ bool IsConstructor, bool IsInlined) { DIE *SPDie = new DIE(dwarf::DW_TAG_subprogram); + addString(SPDie, dwarf::DW_AT_name, dwarf::DW_FORM_string, SP.getName()); - const char * Name = SP.getName(); - addString(SPDie, dwarf::DW_AT_name, dwarf::DW_FORM_string, Name); - - const char *LinkageName = SP.getLinkageName(); - if (LinkageName) { + StringRef LinkageName = SP.getLinkageName(); + if (!LinkageName.empty()) { // Skip special LLVM prefix that is used to inform the asm printer to not // emit usual symbol prefix before the symbol name. This happens for // Objective-C symbol names and symbol whose name is replaced using GCC's // __asm__ attribute. if (LinkageName[0] == 1) - LinkageName = &LinkageName[1]; + LinkageName = LinkageName.data() + 1; addString(SPDie, dwarf::DW_AT_MIPS_linkage_name, dwarf::DW_FORM_string, LinkageName); } @@ -1155,8 +1154,8 @@ DIE *DwarfDebug::createDbgScopeVariable(DbgVariable *DV, CompileUnit *Unit) { // Get the descriptor. const DIVariable &VD = DV->getVariable(); - const char *Name = VD.getName(); - if (!Name) + StringRef Name = VD.getName(); + if (Name.empty()) return NULL; // Translate tag to proper Dwarf tag. The result variable is dropped for @@ -1406,8 +1405,8 @@ DbgScope *Scope, CompileUnit *Unit) { // Get the descriptor. const DIVariable &VD = DV->getVariable(); - const char *Name = VD.getName(); - if (!Name) + StringRef Name = VD.getName(); + if (Name.empty()) return NULL; // Translate tag to proper Dwarf tag. 
The result variable is dropped for @@ -1491,7 +1490,7 @@ if (ATy.isNull()) continue; DICompositeType CATy = getDICompositeType(ATy); - if (!CATy.isNull() && CATy.getName()) { + if (!CATy.isNull() && !CATy.getName().empty()) { if (DIEEntry *Entry = ModuleCU->getDIEEntry(CATy.getNode())) ModuleCU->addGlobalType(CATy.getName(), Entry->getEntry()); } @@ -1547,8 +1546,7 @@ /// source file names. If none currently exists, create a new id and insert it /// in the SourceIds map. This can update DirectoryNames and SourceFileNames /// maps as well. -unsigned DwarfDebug::GetOrCreateSourceID(const char *DirName, - const char *FileName) { +unsigned DwarfDebug::GetOrCreateSourceID(StringRef DirName, StringRef FileName) { unsigned DId; StringMap::iterator DI = DirectoryIdMap.find(DirName); if (DI != DirectoryIdMap.end()) { @@ -1583,8 +1581,8 @@ void DwarfDebug::constructCompileUnit(MDNode *N) { DICompileUnit DIUnit(N); - const char *FN = DIUnit.getFilename(); - const char *Dir = DIUnit.getDirectory(); + StringRef FN = DIUnit.getFilename(); + StringRef Dir = DIUnit.getDirectory(); unsigned ID = GetOrCreateSourceID(Dir, FN); DIE *Die = new DIE(dwarf::DW_TAG_compile_unit); @@ -1597,12 +1595,13 @@ DIUnit.getLanguage()); addString(Die, dwarf::DW_AT_name, dwarf::DW_FORM_string, FN); - if (Dir) + if (!Dir.empty()) addString(Die, dwarf::DW_AT_comp_dir, dwarf::DW_FORM_string, Dir); if (DIUnit.isOptimized()) addUInt(Die, dwarf::DW_AT_APPLE_optimized, dwarf::DW_FORM_flag, 1); - if (const char *Flags = DIUnit.getFlags()) + StringRef Flags = DIUnit.getFlags(); + if (!Flags.empty()) addString(Die, dwarf::DW_AT_APPLE_flags, dwarf::DW_FORM_string, Flags); unsigned RVer = DIUnit.getRunTimeVersion(); @@ -1644,7 +1643,7 @@ ModuleCU->addGlobal(DI_GV.getName(), VariableDie); DIType GTy = DI_GV.getType(); - if (GTy.isCompositeType() && GTy.getName()) { + if (GTy.isCompositeType() && !GTy.getName().empty()) { DIEEntry *Entry = ModuleCU->getDIEEntry(GTy.getNode()); assert (Entry && "Missing global 
type!"); ModuleCU->addGlobalType(GTy.getName(), Entry->getEntry()); @@ -2119,8 +2118,8 @@ if (TimePassesIsEnabled) DebugTimer->startTimer(); - const char *Dir = NULL; - const char *Fn = NULL; + StringRef Dir; + StringRef Fn; DIDescriptor Scope(S); if (Scope.isCompileUnit()) { @@ -2889,10 +2888,10 @@ = InlineInfo.find(Node); SmallVector &Labels = II->second; DISubprogram SP(Node); - const char *LName = SP.getLinkageName(); - const char *Name = SP.getName(); + StringRef LName = SP.getLinkageName(); + StringRef Name = SP.getName(); - if (!LName) + if (LName.empty()) Asm->EmitString(Name); else { // Skip special LLVM prefix that is used to inform the asm printer to not @@ -2900,7 +2899,7 @@ // Objective-C symbol names and symbol whose name is replaced using GCC's // __asm__ attribute. if (LName[0] == 1) - LName = &LName[1]; + LName = LName.data() + 1; // Asm->EmitString(LName); EmitSectionOffset("string", "section_str", StringPool.idFor(LName), false, true); Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h?rev=89866&r1=89865&r2=89866&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h Wed Nov 25 11:36:49 2009 @@ -486,8 +486,7 @@ /// source file names. If none currently exists, create a new id and insert it /// in the SourceIds map. This can update DirectoryNames and SourceFileNames maps /// as well. 
- unsigned GetOrCreateSourceID(const char *DirName, - const char *FileName); + unsigned GetOrCreateSourceID(StringRef DirName, StringRef FileName); void constructCompileUnit(MDNode *N); Modified: llvm/trunk/lib/Target/PIC16/PIC16DebugInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PIC16/PIC16DebugInfo.cpp?rev=89866&r1=89865&r2=89866&view=diff ============================================================================== --- llvm/trunk/lib/Target/PIC16/PIC16DebugInfo.cpp (original) +++ llvm/trunk/lib/Target/PIC16/PIC16DebugInfo.cpp Wed Nov 25 11:36:49 2009 @@ -306,10 +306,10 @@ int ElementAux[PIC16Dbg::AuxSize] = { 0 }; std::string TagName = ""; DIDerivedType DITy(Element.getNode()); - const char *ElementName = DITy.getName(); + StringRef ElementName = DITy.getName(); unsigned short ElementSize = DITy.getSizeInBits()/8; // Get mangleddd name for this structure/union element. - std::string MangMemName = ElementName + SuffixNo; + std::string MangMemName = ElementName.data() + SuffixNo; PopulateDebugInfo(DITy, TypeNo, HasAux, ElementAux, TagName); short Class = 0; if( CTy.getTag() == dwarf::DW_TAG_union_type) @@ -337,12 +337,12 @@ continue; if (CTy.getTag() == dwarf::DW_TAG_union_type || CTy.getTag() == dwarf::DW_TAG_structure_type ) { - const char *Name = CTy.getName(); + StringRef Name = CTy.getName(); // Get the number after llvm.dbg.composite and make UniqueSuffix from // it. std::string DIVar = CTy.getNode()->getNameStr(); std::string UniqueSuffix = "." + DIVar.substr(18); - std::string MangledCTyName = Name + UniqueSuffix; + std::string MangledCTyName = Name.data() + UniqueSuffix; unsigned short size = CTy.getSizeInBits()/8; int Aux[PIC16Dbg::AuxSize] = {0}; // 7th and 8th byte represent size of structure/union. 
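Editorial sketch of the convention change in the r89866 diff above: callers used to test the returned `const char *` against NULL and compare with `strcmp`; after the change they test `empty()` and compare with `==`. Since the old `getStringField` already returned NULL for a zero-length MDString, "NULL" and "empty" denote the same case. The `Ref` type below is a hypothetical stand-in for `llvm::StringRef` so the example builds without LLVM, and `describesOld` / `describesNew` are illustrative names modelled on `DISubprogram::describes`, not code from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for llvm::StringRef: a non-owning (pointer, length)
// pair. Only the members used below are modelled.
struct Ref {
  const char *Ptr;
  std::size_t Len;

  Ref() : Ptr(""), Len(0) {}
  Ref(const char *S) : Ptr(S), Len(0) {
    assert(S && "this stand-in does not accept a null pointer");
    Len = std::strlen(S);
  }

  bool empty() const { return Len == 0; }
  std::string str() const { return std::string(Ptr, Len); }
  bool operator==(const Ref &RHS) const {
    return Len == RHS.Len && std::memcmp(Ptr, RHS.Ptr, Len) == 0;
  }
};

// Old convention: a NULL pointer means "no linkage name".
bool describesOld(const char *FnName, const char *Linkage, const char *Name) {
  const char *Picked = Linkage ? Linkage : Name;
  return std::strcmp(FnName, Picked) == 0;
}

// New convention: an empty Ref means "no linkage name".
bool describesNew(Ref FnName, Ref Linkage, Ref Name) {
  Ref Picked = Linkage.empty() ? Name : Linkage;
  return FnName == Picked;
}
```

With this shape, `describesNew("foo", Ref(), "foo")` falls back to the plain name exactly as `describesOld("foo", nullptr, "foo")` does, and the comparison no longer depends on a terminating NUL.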
From dpatel at apple.com Wed Nov 25 11:38:44 2009 From: dpatel at apple.com (Devang Patel) Date: Wed, 25 Nov 2009 17:38:44 -0000 Subject: [llvm-commits] [llvm-gcc-4.2] r89868 - /llvm-gcc-4.2/trunk/gcc/llvm-debug.cpp Message-ID: <200911251738.nAPHcife006473@zion.cs.uiuc.edu> Author: dpatel Date: Wed Nov 25 11:38:43 2009 New Revision: 89868 URL: http://llvm.org/viewvc/llvm-project?rev=89868&view=rev Log: Use StringRef (again) in DebugInfo interface. Modified: llvm-gcc-4.2/trunk/gcc/llvm-debug.cpp Modified: llvm-gcc-4.2/trunk/gcc/llvm-debug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/llvm-debug.cpp?rev=89868&r1=89867&r2=89868&view=diff ============================================================================== --- llvm-gcc-4.2/trunk/gcc/llvm-debug.cpp (original) +++ llvm-gcc-4.2/trunk/gcc/llvm-debug.cpp Wed Nov 25 11:38:43 2009 @@ -122,7 +122,7 @@ /// GetNodeName - Returns the name stored in a node regardless of whether the /// node is a TYPE or DECL. -static const char *GetNodeName(tree Node) { +static StringRef GetNodeName(tree Node) { tree Name = NULL; if (DECL_P(Node)) { @@ -136,11 +136,11 @@ return IDENTIFIER_POINTER(Name); } else if (TREE_CODE(Name) == TYPE_DECL && DECL_NAME(Name) && !DECL_IGNORED_P(Name)) { - return IDENTIFIER_POINTER(DECL_NAME(Name)); + return StringRef(IDENTIFIER_POINTER(DECL_NAME(Name))); } } - return NULL; + return StringRef(); } /// GetNodeLocation - Returns the location stored in a node regardless of @@ -181,12 +181,12 @@ return Location; } -static const char *getLinkageName(tree Node) { +static StringRef getLinkageName(tree Node) { // Use llvm value name as linkage name if it is available. 
if (DECL_LLVM_SET_P(Node)) { Value *V = DECL_LLVM(Node); - return V->getName().data(); + return V->getName(); } tree decl_name = DECL_NAME(Node); @@ -194,10 +194,10 @@ if (TREE_PUBLIC(Node) && DECL_ASSEMBLER_NAME(Node) != DECL_NAME(Node) && !DECL_ABSTRACT(Node)) { - return IDENTIFIER_POINTER(DECL_ASSEMBLER_NAME(Node)); + return StringRef(IDENTIFIER_POINTER(DECL_ASSEMBLER_NAME(Node))); } } - return NULL; + return StringRef(); } DebugInfo::DebugInfo(Module *m) @@ -234,7 +234,7 @@ BasicBlock *CurBB) { // Gather location information. expanded_location Loc = GetNodeLocation(FnDecl, false); - const char *LinkageName = getLinkageName(FnDecl); + StringRef LinkageName = getLinkageName(FnDecl); unsigned lineno = CurLineNo; if (isCopyOrDestroyHelper(FnDecl)) @@ -396,8 +396,7 @@ // Gather location information. expanded_location Loc = expand_location(DECL_SOURCE_LOCATION(decl)); DIType TyD = getOrCreateType(TREE_TYPE(decl)); - std::string DispNameStr = GV->getNameStr(); - const char *DispName = DispNameStr.c_str(); + StringRef DispName = GV->getName(); if (DECL_NAME(decl)) { if (IDENTIFIER_POINTER(DECL_NAME(decl))) DispName = IDENTIFIER_POINTER(DECL_NAME(decl)); @@ -414,7 +413,7 @@ /// createBasicType - Create BasicType. 
DIType DebugInfo::createBasicType(tree type) { - const char *TypeName = GetNodeName(type); + StringRef TypeName = GetNodeName(type); uint64_t Size = NodeSizeInBits(type); uint64_t Align = NodeAlignInBits(type); @@ -479,7 +478,7 @@ DebugFactory.GetOrCreateArray(EltTys.data(), EltTys.size()); return DebugFactory.CreateCompositeType(llvm::dwarf::DW_TAG_subroutine_type, - findRegion(type), NULL, + findRegion(type), StringRef(), getOrCreateCompileUnit(NULL), 0, 0, 0, 0, 0, llvm::DIType(), EltTypeArray); @@ -500,7 +499,7 @@ Flags |= llvm::DIType::FlagBlockByrefStruct; expanded_location Loc = GetNodeLocation(type); - const char *PName = FromTy.getName(); + StringRef PName = FromTy.getName(); return DebugFactory.CreateDerivedType(Tag, findRegion(type), PName, getOrCreateCompileUnit(NULL), 0 /*line no*/, @@ -557,7 +556,7 @@ DebugFactory.GetOrCreateArray(Subscripts.data(), Subscripts.size()); expanded_location Loc = GetNodeLocation(type); return DebugFactory.CreateCompositeType(llvm::dwarf::DW_TAG_array_type, - findRegion(type), NULL, + findRegion(type), StringRef(), getOrCreateCompileUnit(Loc.file), 0, NodeSizeInBits(type), NodeAlignInBits(type), 0, 0, @@ -638,13 +637,13 @@ /// also while creating FwdDecl for now. std::string FwdName; if (TYPE_CONTEXT(type)) { - const char *TypeContextName = GetNodeName(TYPE_CONTEXT(type)); - if (TypeContextName) + StringRef TypeContextName = GetNodeName(TYPE_CONTEXT(type)); + if (!TypeContextName.empty()) FwdName = TypeContextName; } - const char *TypeName = GetNodeName(type); - if (TypeName) - FwdName = FwdName + TypeName; + StringRef TypeName = GetNodeName(type); + if (!TypeName.empty()) + FwdName = FwdName + TypeName.data(); unsigned Flags = llvm::DIType::FlagFwdDecl; if (TYPE_BLOCK_IMPL_STRUCT(type)) Flags |= llvm::DIType::FlagAppleBlock; @@ -680,7 +679,7 @@ // FIXME : name, size, align etc... 
DIType DTy = DebugFactory.CreateDerivedType(DW_TAG_inheritance, - findRegion(type), NULL, + findRegion(type), StringRef(), llvm::DICompileUnit(), 0,0,0, getINTEGER_CSTVal(BINFO_OFFSET(BInfo)), 0, BaseClass); @@ -711,7 +710,7 @@ // Field type is the declared type of the field. tree FieldNodeType = FieldType(Member); DIType MemberType = getOrCreateType(FieldNodeType); - const char *MemberName = GetNodeName(Member); + StringRef MemberName = GetNodeName(Member); unsigned Flags = 0; if (TREE_PROTECTED(Member)) Flags = llvm::DIType::FlagProtected; @@ -744,7 +743,7 @@ expanded_location MemLoc = GetNodeLocation(Member, false); const char *MemberName = lang_hooks.dwarf_name(Member, 0); - const char *LinkageName = getLinkageName(Member); + StringRef LinkageName = getLinkageName(Member); DIType SPTy = getOrCreateType(TREE_TYPE(Member)); DISubprogram SP = DebugFactory.CreateSubprogram(findRegion(Member), MemberName, MemberName, @@ -799,7 +798,7 @@ if (TYPE_VOLATILE(type)) { Ty = DebugFactory.CreateDerivedType(DW_TAG_volatile_type, - findRegion(type), NULL, + findRegion(type), StringRef(), getOrCreateCompileUnit(NULL), 0 /*line no*/, NodeSizeInBits(type), @@ -812,7 +811,7 @@ if (TYPE_READONLY(type)) Ty = DebugFactory.CreateDerivedType(DW_TAG_const_type, - findRegion(type), NULL, + findRegion(type), StringRef(), getOrCreateCompileUnit(NULL), 0 /*line no*/, NodeSizeInBits(type), @@ -979,7 +978,7 @@ else LangTag = DW_LANG_C89; - const char *Flags = NULL; + StringRef Flags; // Do this only when RC_DEBUG_OPTIONS environment variable is set to // a nonempty string. This is intended only for internal Apple use. 
char * debugopt = getenv("RC_DEBUG_OPTIONS"); From benny.kra at googlemail.com Wed Nov 25 12:26:09 2009 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Wed, 25 Nov 2009 18:26:09 -0000 Subject: [llvm-commits] [llvm] r89873 - in /llvm/trunk/lib: CodeGen/AsmPrinter/DwarfDebug.cpp Target/PIC16/PIC16DebugInfo.cpp Message-ID: <200911251826.nAPIQ9IK008155@zion.cs.uiuc.edu> Author: d0k Date: Wed Nov 25 12:26:09 2009 New Revision: 89873 URL: http://llvm.org/viewvc/llvm-project?rev=89873&view=rev Log: Avoid some possibly unsafe uses of StringRef::data(). Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp llvm/trunk/lib/Target/PIC16/PIC16DebugInfo.cpp Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=89873&r1=89872&r2=89873&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original) +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Wed Nov 25 12:26:09 2009 @@ -1010,7 +1010,7 @@ // Objective-C symbol names and symbol whose name is replaced using GCC's // __asm__ attribute. if (LinkageName[0] == 1) - LinkageName = LinkageName.data() + 1; + LinkageName = LinkageName.substr(1); addString(GVDie, dwarf::DW_AT_MIPS_linkage_name, dwarf::DW_FORM_string, LinkageName); } @@ -1097,7 +1097,7 @@ // Objective-C symbol names and symbol whose name is replaced using GCC's // __asm__ attribute. if (LinkageName[0] == 1) - LinkageName = LinkageName.data() + 1; + LinkageName = LinkageName.substr(1); addString(SPDie, dwarf::DW_AT_MIPS_linkage_name, dwarf::DW_FORM_string, LinkageName); } @@ -2899,7 +2899,7 @@ // Objective-C symbol names and symbol whose name is replaced using GCC's // __asm__ attribute. 
if (LName[0] == 1) - LName = LName.data() + 1; + LName = LName.substr(1); // Asm->EmitString(LName); EmitSectionOffset("string", "section_str", StringPool.idFor(LName), false, true); Modified: llvm/trunk/lib/Target/PIC16/PIC16DebugInfo.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PIC16/PIC16DebugInfo.cpp?rev=89873&r1=89872&r2=89873&view=diff ============================================================================== --- llvm/trunk/lib/Target/PIC16/PIC16DebugInfo.cpp (original) +++ llvm/trunk/lib/Target/PIC16/PIC16DebugInfo.cpp Wed Nov 25 12:26:09 2009 @@ -306,10 +306,9 @@ int ElementAux[PIC16Dbg::AuxSize] = { 0 }; std::string TagName = ""; DIDerivedType DITy(Element.getNode()); - StringRef ElementName = DITy.getName(); unsigned short ElementSize = DITy.getSizeInBits()/8; // Get mangleddd name for this structure/union element. - std::string MangMemName = ElementName.data() + SuffixNo; + std::string MangMemName = DITy.getName().str() + SuffixNo; PopulateDebugInfo(DITy, TypeNo, HasAux, ElementAux, TagName); short Class = 0; if( CTy.getTag() == dwarf::DW_TAG_union_type) @@ -337,12 +336,11 @@ continue; if (CTy.getTag() == dwarf::DW_TAG_union_type || CTy.getTag() == dwarf::DW_TAG_structure_type ) { - StringRef Name = CTy.getName(); // Get the number after llvm.dbg.composite and make UniqueSuffix from // it. std::string DIVar = CTy.getNode()->getNameStr(); std::string UniqueSuffix = "." + DIVar.substr(18); - std::string MangledCTyName = Name.data() + UniqueSuffix; + std::string MangledCTyName = CTy.getName().str() + UniqueSuffix; unsigned short size = CTy.getSizeInBits()/8; int Aux[PIC16Dbg::AuxSize] = {0}; // 7th and 8th byte represent size of structure/union. 
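[Editorial sketch, not part of r89873.] The hazard this commit removes can be illustrated outside LLVM. A StringRef carries a pointer and a length, and data() is not guaranteed to be null-terminated, so rebuilding a string from `data() + 1` falls back on strlen and loses the original length. The example below is a minimal sketch that assumes std::string_view (C++17) as a stand-in for llvm::StringRef; the helper names stripMarkerSafe, stripMarkerViaData and checkExample are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: drop a leading '\1' marker while keeping the view's
// own length, mirroring LinkageName.substr(1) in the commit.
std::string_view stripMarkerSafe(std::string_view Name) {
  if (!Name.empty() && Name[0] == '\1')
    return Name.substr(1);
  return Name;
}

// Hypothetical helper: the older pattern. Constructing a view from a bare
// const char* measures the string with strlen, so the original length is lost.
std::string_view stripMarkerViaData(std::string_view Name) {
  if (!Name.empty() && Name[0] == '\1')
    return std::string_view(Name.data() + 1);
  return Name;
}

// Usage: a view over the first five bytes of a larger, null-terminated buffer.
void checkExample() {
  static const char Buf[] = "\001nameTRAILING";
  std::string_view Name(Buf, 5); // refers to "\001name" only
  assert(stripMarkerSafe(Name) == "name");
  // The strlen-based version runs on until the buffer's terminator.
  assert(stripMarkerViaData(Name) == "nameTRAILING");
}
```

Here the buffer happens to be null-terminated further along, so the old pattern merely returns too much; with a buffer that has no terminator at all the strlen call would read out of bounds. The PIC16 hunk appears to follow the same reasoning: str() copies exactly the referenced characters before the suffix is appended.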
From baldrick at free.fr Wed Nov 25 12:41:37 2009 From: baldrick at free.fr (Duncan Sands) Date: Wed, 25 Nov 2009 19:41:37 +0100 Subject: [llvm-commits] [llvm] r89421 - /llvm/trunk/lib/Analysis/CaptureTracking.cpp In-Reply-To: <4B0C5C3B.5010903@apple.com> References: <200911200050.nAK0or7J026222@zion.cs.uiuc.edu> <4B068327.1070103@free.fr> <784C47FB-FB86-404E-B39F-8EAF9C4A98E0@apple.com> <4B080CEF.5060301@free.fr> <6A5CF08E-94EF-4893-81CC-5AEF0A19EB25@apple.com> <4B0A9A8C.9040905@free.fr> <006872F5-2CC8-400F-B995-22D5B52886BA@apple.com> <4B0C5411.5090800@free.fr> <4B0C5C3B.5010903@apple.com> Message-ID: <4B0D7A61.9010109@free.fr> PS: Here's my latest version which is no longer obstructed by phi nodes or selects. I didn't test it much, but it increases the number of parameters marked nocapture by more than 3% in the testsuite. -------------- next part -------------- A non-text attachment was scrubbed... Name: capture.diff Type: text/x-patch Size: 10377 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091125/88929225/attachment.bin From daniel at zuster.org Wed Nov 25 13:43:18 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Wed, 25 Nov 2009 11:43:18 -0800 Subject: [llvm-commits] [llvm] r89793 - in /llvm/trunk: include/llvm/CodeGen/AsmPrinter.h lib/CodeGen/AsmPrinter/AsmPrinter.cpp lib/CodeGen/AsmPrinter/DIE.h lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/AsmPrinter/DwarfDebug.h In-Reply-To: <200911241942.nAOJgHfU009130@zion.cs.uiuc.edu> References: <200911241942.nAOJgHfU009130@zion.cs.uiuc.edu> Message-ID: <6a8523d60911251143p5caafc85ga6a8014cec36af48@mail.gmail.com> Hi Devang, On Tue, Nov 24, 2009 at 11:42 AM, Devang Patel wrote: > Author: dpatel > Date: Tue Nov 24 13:42:17 2009 > New Revision: 89793 > > URL: http://llvm.org/viewvc/llvm-project?rev=89793&view=rev > Log: > Use StringRef instead of std::string in DIEString. Thanks! > Modified: > llvm/trunk/include/llvm/CodeGen/AsmPrinter.h >
llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp > llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h > llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp > llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h > > Modified: llvm/trunk/include/llvm/CodeGen/AsmPrinter.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/AsmPrinter.h?rev=89793&r1=89792&r2=89793&view=diff > > ============================================================================== > --- llvm/trunk/include/llvm/CodeGen/AsmPrinter.h (original) > +++ llvm/trunk/include/llvm/CodeGen/AsmPrinter.h Tue Nov 24 13:42:17 2009 > @@ -297,7 +297,7 @@ > /// EmitString - Emit a string with quotes and a null terminator. > /// Special characters are emitted properly. > /// @verbatim (Eg. '\t') @endverbatim > - void EmitString(const std::string &String) const; > + void EmitString(const StringRef String) const; > void EmitString(const char *String, unsigned Size) const; > > /// EmitFile - Emit a .file directive. > > Modified: llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp?rev=89793&r1=89792&r2=89793&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (original) > +++ llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp Tue Nov 24 13:42:17 2009 > @@ -728,7 +728,7 @@ > /// EmitString - Emit a string with quotes and a null terminator. > /// Special characters are emitted properly. > /// \literal (Eg. '\t') \endliteral > -void AsmPrinter::EmitString(const std::string &String) const { > +void AsmPrinter::EmitString(const StringRef String) const { >
EmitString(String.data(), String.size()); > } > > > Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h?rev=89793&r1=89792&r2=89793&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h (original) > +++ llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h Tue Nov 24 13:42:17 2009 > @@ -277,9 +277,9 @@ > /// DIEString - A string value DIE. > /// > class DIEString : public DIEValue { > - const std::string Str; > + const StringRef Str; This particular change might not be a good idea. A StringRef is just a reference, so whether this is safe or not depends on whether clients are disallowed from passing in strings whose lifetime will not extend past that of the DIEString. I suspect this should remain std::string. For example, this will fail: -- addString(die, attr, form, std::string("foo") + "bar"); -- - Daniel > public: > - explicit DIEString(const std::string &S) : DIEValue(isString), Str(S) {} > + explicit DIEString(const StringRef S) : DIEValue(isString), Str(S) {} > > /// EmitValue - Emit string value. > /// > > Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp?rev=89793&r1=89792&r2=89793&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (original) > +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Tue Nov 24 13:42:17 2009 > @@ -333,7 +333,7 @@ > /// addString - Add a string attribute data and value. > /// > void DwarfDebug::addString(DIE *Die, unsigned Attribute, unsigned Form, > - const std::string &String) { > + const StringRef String) { > DIEValue *Value = new DIEString(String); > DIEValues.push_back(Value); >
Die->addValue(Attribute, Form, Value); > > Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h?rev=89793&r1=89792&r2=89793&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h (original) > +++ llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h Tue Nov 24 13:42:17 2009 > @@ -244,7 +244,7 @@ > /// addString - Add a string attribute data and value. > /// > void addString(DIE *Die, unsigned Attribute, unsigned Form, > - const std::string &String); > + const StringRef Str); > > /// addLabel - Add a Dwarf label attribute data and value. > /// > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > From devang.patel at gmail.com Wed Nov 25 13:53:51 2009 From: devang.patel at gmail.com (Devang Patel) Date: Wed, 25 Nov 2009 11:53:51 -0800 Subject: [llvm-commits] [llvm] r89793 - in /llvm/trunk: include/llvm/CodeGen/AsmPrinter.h lib/CodeGen/AsmPrinter/AsmPrinter.cpp lib/CodeGen/AsmPrinter/DIE.h lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/AsmPrinter/DwarfDebug.h In-Reply-To: <6a8523d60911251143p5caafc85ga6a8014cec36af48@mail.gmail.com> References: <200911241942.nAOJgHfU009130@zion.cs.uiuc.edu> <6a8523d60911251143p5caafc85ga6a8014cec36af48@mail.gmail.com> Message-ID: <352a1fb20911251153n16d6d05fxe8720681ba504e71@mail.gmail.com> On Wed, Nov 25, 2009 at 11:43 AM, Daniel Dunbar wrote: > > This particular change might not be a good idea. A StringRef is just a > reference, so whether this is safe or not depends on whether clients > are disallowed from passing in strings whose lifetime will not extend > past that of the DIEString. I suspect this should remain std::string.
> For example, this will fail: > -- > addString(die, attr, form, std::string("foo") + "bar"); > -- I checked all uses of addString() and they are safe. If I missed one let me know. BTW, this is why I left std::string left alone DIEObjectLabel - Devang From daniel at zuster.org Wed Nov 25 13:54:16 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Wed, 25 Nov 2009 11:54:16 -0800 Subject: [llvm-commits] [llvm] r89626 - in /llvm/trunk: docs/CommandGuide/FileCheck.pod utils/FileCheck/FileCheck.cpp In-Reply-To: References: <200911222207.nAMM7pis006331@zion.cs.uiuc.edu> Message-ID: <6a8523d60911251154t6f3c8908i6b9bccce1a8d5410@mail.gmail.com> On Mon, Nov 23, 2009 at 10:29 AM, Chris Lattner wrote: > > On Nov 22, 2009, at 2:07 PM, Daniel Dunbar wrote: > >> Author: ddunbar >> Date: Sun Nov 22 16:07:50 2009 >> New Revision: 89626 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=89626&view=rev >> Log: >> Allow '_' in FileCheck variable names, it is nice to have at least one >> separate character. >> - Chris, OK? > > Fine with me, thx. Please update the TestingGuide documentation. Why do we have two copies of the docs? Can't the TestingGuide just point at the FileCheck man page? - Daniel From bob.wilson at apple.com Wed Nov 25 13:57:14 2009 From: bob.wilson at apple.com (Bob Wilson) Date: Wed, 25 Nov 2009 19:57:14 -0000 Subject: [llvm-commits] [llvm] r89877 - /llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h Message-ID: <200911251957.nAPJvEma011366@zion.cs.uiuc.edu> Author: bwilson Date: Wed Nov 25 13:57:14 2009 New Revision: 89877 URL: http://llvm.org/viewvc/llvm-project?rev=89877&view=rev Log: Tail duplicate indirect branches for PowerPC, too. With the testcase for pr3120, the "threaded interpreter" runtime decreases from 1788 to 1413 with this change.
Modified: llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h Modified: llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h?rev=89877&r1=89876&r2=89877&view=diff ============================================================================== --- llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h (original) +++ llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h Wed Nov 25 13:57:14 2009 @@ -151,6 +151,8 @@ /// instruction may be. This returns the maximum number of bytes. /// virtual unsigned GetInstSizeInBytes(const MachineInstr *MI) const; + + virtual bool isProfitableToDuplicateIndirectBranch() const { return true; } }; } From dalej at apple.com Wed Nov 25 13:57:24 2009 From: dalej at apple.com (Dale Johannesen) Date: Wed, 25 Nov 2009 11:57:24 -0800 Subject: [llvm-commits] [llvm] r89793 - in /llvm/trunk: include/llvm/CodeGen/AsmPrinter.h lib/CodeGen/AsmPrinter/AsmPrinter.cpp lib/CodeGen/AsmPrinter/DIE.h lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/AsmPrinter/DwarfDebug.h In-Reply-To: <352a1fb20911251153n16d6d05fxe8720681ba504e71@mail.gmail.com> References: <200911241942.nAOJgHfU009130@zion.cs.uiuc.edu> <6a8523d60911251143p5caafc85ga6a8014cec36af48@mail.gmail.com> <352a1fb20911251153n16d6d05fxe8720681ba504e71@mail.gmail.com> Message-ID: On Nov 25, 2009, at 11:53 AM, Devang Patel wrote: > On Wed, Nov 25, 2009 at 11:43 AM, Daniel Dunbar > wrote: >> >> This particular change might not be a good idea. A StringRef is >> just a >> reference, so whether this is safe or not depends on whether clients >> are disallowed from passing in strings whose lifetime will not extend >> past that of the DIEString. I suspect this should remain std::string. >> For example, this will fail: >> -- >> addString(die, attr, form, std::string("foo") + "bar"); >> -- > > I checked all uses of addString() and they are safe. Future uses could also be a problem.
If there are restrictions on the use of addString please make sure this is documented (commented). > If I missed one > let me know. BTW, this is why I left std::string left alone > DIEObjectLabel > - > Devang > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From daniel at zuster.org Wed Nov 25 14:38:37 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Wed, 25 Nov 2009 12:38:37 -0800 Subject: [llvm-commits] [llvm] r89211 - /llvm/trunk/CMakeLists.txt In-Reply-To: <33B0309B-C3D1-48AD-8574-917639E27C6B@apple.com> References: <200911181742.nAIHgMXe005175@zion.cs.uiuc.edu> <87lji25i81.fsf@telefonica.net> <6a8523d60911221021t75d68290j30060188af1734ca@mail.gmail.com> <4B098818.4000506@mxc.ca> <6a8523d60911221409o474ae2atb51155eab3f56890@mail.gmail.com> <4B09BAEA.2030207@mxc.ca> <33B0309B-C3D1-48AD-8574-917639E27C6B@apple.com> Message-ID: <6a8523d60911251238i2030be7duf114125fe233eebd@mail.gmail.com> On Mon, Nov 23, 2009 at 9:40 AM, Chris Lattner wrote: > > On Nov 22, 2009, at 2:27 PM, Nick Lewycky wrote: > >>>>> >>>>> I will update the doc. >>>> >>>> The rationale for copying each Kaleidoscope chapter into the examples/ >>>> directory was to make sure that our examples on the website don't get out of >>>> date when someone changes the API. >>> >>> I agree this is good. > > This sounds like excellent rationale for disabling this stuff (including kaleidoscope!) by default, but making sure the buildbots are building them. This is done, our fastest machine (clang-x86_64-linux) builds them for LLVM and Clang. 
- Daniel From evan.cheng at apple.com Wed Nov 25 15:01:14 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Wed, 25 Nov 2009 13:01:14 -0800 Subject: [llvm-commits] [llvm] r89713 - /llvm/trunk/lib/Analysis/IVUsers.cpp In-Reply-To: <200911232325.nANNPtol014596@zion.cs.uiuc.edu> References: <200911232325.nANNPtol014596@zion.cs.uiuc.edu> Message-ID: <76078986-1518-4F0D-84B5-425BB12783BF@apple.com> Test case? Evan On Nov 23, 2009, at 3:25 PM, Jim Grosbach wrote: > Author: grosbach > Date: Mon Nov 23 17:25:54 2009 > New Revision: 89713 > > URL: http://llvm.org/viewvc/llvm-project?rev=89713&view=rev > Log: > enable iv-users simplification by default > > Modified: > llvm/trunk/lib/Analysis/IVUsers.cpp > > Modified: llvm/trunk/lib/Analysis/IVUsers.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/IVUsers.cpp?rev=89713&r1=89712&r2=89713&view=diff > > ============================================================================== > --- llvm/trunk/lib/Analysis/IVUsers.cpp (original) > +++ llvm/trunk/lib/Analysis/IVUsers.cpp Mon Nov 23 17:25:54 2009 > @@ -24,7 +24,6 @@ > #include "llvm/ADT/STLExtras.h" > #include "llvm/Support/Debug.h" > #include "llvm/Support/raw_ostream.h" > -#include "llvm/Support/CommandLine.h" > #include > using namespace llvm; > > @@ -32,10 +31,6 @@ > static RegisterPass > X("iv-users", "Induction Variable Users", false, true); > > -static cl::opt > -SimplifyIVUsers("simplify-iv-users", cl::Hidden, cl::init(false), > - cl::desc("Restrict IV Users to loop-invariant strides")); > - > Pass *llvm::createIVUsersPass() { > return new IVUsers(); > } > @@ -214,8 +209,7 @@ > return false; // Non-reducible symbolic expression, bail out. > > // Keep things simple. Don't touch loop-variant strides. 
> - if (SimplifyIVUsers && !Stride->isLoopInvariant(L) > - && L->contains(I->getParent())) > + if (!Stride->isLoopInvariant(L) && L->contains(I->getParent())) > return false; > > SmallPtrSet UniqueUsers; > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From daniel at zuster.org Wed Nov 25 15:11:08 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Wed, 25 Nov 2009 21:11:08 -0000 Subject: [llvm-commits] [zorg] r89879 - in /zorg/trunk/zorg/buildbot/builders: ClangBuilder.py LLVMBuilder.py Message-ID: <200911252111.nAPLB8pm013799@zion.cs.uiuc.edu> Author: ddunbar Date: Wed Nov 25 15:11:08 2009 New Revision: 89879 URL: http://llvm.org/viewvc/llvm-project?rev=89879&view=rev Log: Add optional argument to LLVM and Clang builders to indicate examples should be built. Modified: zorg/trunk/zorg/buildbot/builders/ClangBuilder.py zorg/trunk/zorg/buildbot/builders/LLVMBuilder.py Modified: zorg/trunk/zorg/buildbot/builders/ClangBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/ClangBuilder.py?rev=89879&r1=89878&r2=89879&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/ClangBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/ClangBuilder.py Wed Nov 25 15:11:08 2009 @@ -14,7 +14,7 @@ from Util import getConfigArgs def getClangBuildFactory(triple=None, clean=True, test=True, run_cxx_tests=False, - valgrind=False, useTwoStage=False, + examples=False, valgrind=False, useTwoStage=False, make='make', jobs="%(jobs)s", stage1_config='Debug', stage2_config='Release', extra_configure_args=[]): @@ -89,6 +89,17 @@ description=["compiling", stage1_config], descriptionDone=["compile", stage1_config], workdir=llvm_1_objdir)) + + if examples: + f.addStep(WarningCountingShellCommand(name="compile.examples", + command=['nice', '-n', '10', + make, 
WithProperties("-j%s" % jobs), + "BUILD_EXAMPLES=1"], + haltOnFailure=True, + description=["compilinge", stage1_config, "examples"], + descriptionDone=["compile", stage1_config, "examples"], + workdir=llvm_1_objdir)) + clangTestArgs = '-v' if valgrind: clangTestArgs += ' --vg ' Modified: zorg/trunk/zorg/buildbot/builders/LLVMBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/LLVMBuilder.py?rev=89879&r1=89878&r2=89879&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/LLVMBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/LLVMBuilder.py Wed Nov 25 15:11:08 2009 @@ -10,7 +10,7 @@ from zorg.buildbot.commands.ClangTestCommand import ClangTestCommand def getLLVMBuildFactory(triple=None, clean=True, test=True, - expensive_checks=False, + expensive_checks=False, examples=False, jobs=1, timeout=20, make='make'): f = buildbot.process.factory.BuildFactory() @@ -56,6 +56,16 @@ descriptionDone="compile llvm", workdir='llvm', timeout=timeout*60)) + if examples: + f.addStep(WarningCountingShellCommand(name="compile.examples", + command=['nice', '-n', '10', + make, WithProperties("-j%s" % jobs), + 'BUILD_EXAMPLES=1'], + haltOnFailure=True, + description=["compiling", "llvm", "examples"], + descriptionDone=["compile", "llvm", "examples"], + workdir='llvm', + timeout=timeout*60)) if test: f.addStep(ClangTestCommand(name='test-llvm', command=[make, "check-lit", "VERBOSE=1"], From evan.cheng at apple.com Wed Nov 25 15:13:39 2009 From: evan.cheng at apple.com (Evan Cheng) Date: Wed, 25 Nov 2009 21:13:39 -0000 Subject: [llvm-commits] [llvm] r89880 - in /llvm/trunk: lib/CodeGen/ProcessImplicitDefs.cpp test/CodeGen/PowerPC/2009-11-25-ImpDefBug.ll Message-ID: <200911252113.nAPLDdwh013899@zion.cs.uiuc.edu> Author: evancheng Date: Wed Nov 25 15:13:39 2009 New Revision: 89880 URL: http://llvm.org/viewvc/llvm-project?rev=89880&view=rev Log: ProcessImplicitDefs should watch 
out for invalidated iterator and extra implicit operands on copies. Added: llvm/trunk/test/CodeGen/PowerPC/2009-11-25-ImpDefBug.ll Modified: llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp Modified: llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp?rev=89880&r1=89879&r2=89880&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp (original) +++ llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp Wed Nov 25 15:13:39 2009 @@ -75,10 +75,11 @@ SmallSet ImpDefRegs; SmallVector ImpDefMIs; - MachineBasicBlock *Entry = fn.begin(); + SmallVector RUses; SmallPtrSet Visited; SmallPtrSet ModInsts; + MachineBasicBlock *Entry = fn.begin(); for (df_ext_iterator > DFI = df_ext_begin(Entry, Visited), E = df_ext_end(Entry, Visited); DFI != E; ++DFI) { @@ -197,38 +198,68 @@ MI->eraseFromParent(); Changed = true; + // Process each use instruction once. for (MachineRegisterInfo::use_iterator UI = mri_->use_begin(Reg), - UE = mri_->use_end(); UI != UE; ) { - MachineOperand &RMO = UI.getOperand(); + UE = mri_->use_end(); UI != UE; ++UI) { MachineInstr *RMI = &*UI; - ++UI; - if (ModInsts.count(RMI)) - continue; MachineBasicBlock *RMBB = RMI->getParent(); if (RMBB == MBB) continue; + if (ModInsts.insert(RMI)) + RUses.push_back(RMI); + } + + for (unsigned i = 0, e = RUses.size(); i != e; ++i) { + MachineInstr *RMI = RUses[i]; // Turn a copy use into an implicit_def. 
unsigned SrcReg, DstReg, SrcSubReg, DstSubReg; if (tii_->isMoveInstr(*RMI, SrcReg, DstReg, SrcSubReg, DstSubReg) && Reg == SrcReg) { - if (RMO.isKill()) { + RMI->setDesc(tii_->get(TargetInstrInfo::IMPLICIT_DEF)); + + bool isKill = false; + SmallVector Ops; + for (unsigned j = 0, ee = RMI->getNumOperands(); j != ee; ++j) { + MachineOperand &RRMO = RMI->getOperand(j); + if (RRMO.isReg() && RRMO.getReg() == Reg) { + Ops.push_back(j); + if (RRMO.isKill()) + isKill = true; + } + } + // Leave the other operands along. + for (unsigned j = 0, ee = Ops.size(); j != ee; ++j) { + unsigned OpIdx = Ops[j]; + RMI->RemoveOperand(OpIdx-j); + } + + // Update LiveVariables varinfo if the instruction is a kill. + if (isKill) { LiveVariables::VarInfo& vi = lv_->getVarInfo(Reg); vi.removeKill(RMI); } - RMI->setDesc(tii_->get(TargetInstrInfo::IMPLICIT_DEF)); - for (int j = RMI->getNumOperands() - 1, ee = 0; j > ee; --j) - RMI->RemoveOperand(j); - ModInsts.insert(RMI); continue; } + // Replace Reg with a new vreg that's marked implicit. const TargetRegisterClass* RC = mri_->getRegClass(Reg); unsigned NewVReg = mri_->createVirtualRegister(RC); - RMO.setReg(NewVReg); - RMO.setIsUndef(); - RMO.setIsKill(); + bool isKill = true; + for (unsigned j = 0, ee = RMI->getNumOperands(); j != ee; ++j) { + MachineOperand &RRMO = RMI->getOperand(j); + if (RRMO.isReg() && RRMO.getReg() == Reg) { + RRMO.setReg(NewVReg); + RRMO.setIsUndef(); + if (isKill) { + // Only the first operand of NewVReg is marked kill. 
+ RRMO.setIsKill(); + isKill = false; + } + } + } } + RUses.clear(); } ModInsts.clear(); ImpDefRegs.clear(); Added: llvm/trunk/test/CodeGen/PowerPC/2009-11-25-ImpDefBug.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/2009-11-25-ImpDefBug.ll?rev=89880&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/PowerPC/2009-11-25-ImpDefBug.ll (added) +++ llvm/trunk/test/CodeGen/PowerPC/2009-11-25-ImpDefBug.ll Wed Nov 25 15:13:39 2009 @@ -0,0 +1,56 @@ +; RUN: llc < %s -mtriple=powerpc-apple-darwin9.5 -mcpu=g5 +; rdar://7422268 + +%struct..0EdgeT = type { i32, i32, float, float, i32, i32, i32, float, i32, i32 } + +define void @smooth_color_z_triangle(i32 %v0, i32 %v1, i32 %v2, i32 %pv) nounwind { +entry: + br i1 undef, label %return, label %bb14 + +bb14: ; preds = %entry + br i1 undef, label %bb15, label %return + +bb15: ; preds = %bb14 + br i1 undef, label %bb16, label %bb17 + +bb16: ; preds = %bb15 + br label %bb17 + +bb17: ; preds = %bb16, %bb15 + %0 = fcmp olt float undef, 0.000000e+00 ; [#uses=2] + %eTop.eMaj = select i1 %0, %struct..0EdgeT* undef, %struct..0EdgeT* null ; <%struct..0EdgeT*> [#uses=1] + br label %bb69 + +bb24: ; preds = %bb69 + br i1 undef, label %bb25, label %bb28 + +bb25: ; preds = %bb24 + br label %bb33 + +bb28: ; preds = %bb24 + br i1 undef, label %return, label %bb32 + +bb32: ; preds = %bb28 + br i1 %0, label %bb38, label %bb33 + +bb33: ; preds = %bb32, %bb25 + br i1 undef, label %bb34, label %bb38 + +bb34: ; preds = %bb33 + br label %bb38 + +bb38: ; preds = %bb34, %bb33, %bb32 + %eRight.08 = phi %struct..0EdgeT* [ %eTop.eMaj, %bb32 ], [ undef, %bb34 ], [ undef, %bb33 ] ; <%struct..0EdgeT*> [#uses=0] + %fdgOuter.0 = phi i32 [ %fdgOuter.1, %bb32 ], [ undef, %bb34 ], [ %fdgOuter.1, %bb33 ] ; [#uses=1] + %fz.3 = phi i32 [ %fz.2, %bb32 ], [ 2147483647, %bb34 ], [ %fz.2, %bb33 ] ; [#uses=1] + %1 = add i32 undef, 1 ; [#uses=0] + br label %bb69 + +bb69: ; preds = 
%bb38, %bb17 + %fdgOuter.1 = phi i32 [ undef, %bb17 ], [ %fdgOuter.0, %bb38 ] ; [#uses=2] + %fz.2 = phi i32 [ undef, %bb17 ], [ %fz.3, %bb38 ] ; [#uses=2] + br i1 undef, label %bb24, label %return + +return: ; preds = %bb69, %bb28, %bb14, %entry + ret void +} From vkutuzov at accesssoftek.com Wed Nov 25 16:44:18 2009 From: vkutuzov at accesssoftek.com (Viktor Kutuzov) Date: Wed, 25 Nov 2009 22:44:18 -0000 Subject: [llvm-commits] [llvm] r89893 - in /llvm/trunk: include/llvm/Target/SubtargetFeature.h lib/Target/SubtargetFeature.cpp tools/lto/LTOCodeGenerator.cpp Message-ID: <200911252244.nAPMiIUG016965@zion.cs.uiuc.edu> Author: vkutuzov Date: Wed Nov 25 16:44:18 2009 New Revision: 89893 URL: http://llvm.org/viewvc/llvm-project?rev=89893&view=rev Log: Rollback changes r89516: Added two SubtargetFeatures::AddFeatures methods, which accept a comma-separated string or already parsed command line parameters as input, and some code re-factoring to use these new methods. Modified: llvm/trunk/include/llvm/Target/SubtargetFeature.h llvm/trunk/lib/Target/SubtargetFeature.cpp llvm/trunk/tools/lto/LTOCodeGenerator.cpp Modified: llvm/trunk/include/llvm/Target/SubtargetFeature.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/SubtargetFeature.h?rev=89893&r1=89892&r2=89893&view=diff ============================================================================== --- llvm/trunk/include/llvm/Target/SubtargetFeature.h (original) +++ llvm/trunk/include/llvm/Target/SubtargetFeature.h Wed Nov 25 16:44:18 2009 @@ -22,7 +22,6 @@ #include #include #include "llvm/ADT/Triple.h" -#include "llvm/Support/CommandLine.h" #include "llvm/System/DataTypes.h" namespace llvm { @@ -94,12 +93,6 @@ /// Adding Features. void AddFeature(const std::string &String, bool IsEnabled = true); - /// Add a set of features from the comma-separated string. - void AddFeatures(const std::string &String); - - /// Add a set of features from the parsed command line parameters. 
- void AddFeatures(const cl::list &List); - /// Get feature bits. uint32_t getBits(const SubtargetFeatureKV *CPUTable, size_t CPUTableSize, Modified: llvm/trunk/lib/Target/SubtargetFeature.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/SubtargetFeature.cpp?rev=89893&r1=89892&r2=89893&view=diff ============================================================================== --- llvm/trunk/lib/Target/SubtargetFeature.cpp (original) +++ llvm/trunk/lib/Target/SubtargetFeature.cpp Wed Nov 25 16:44:18 2009 @@ -110,33 +110,6 @@ } } -/// Add a set of features from the comma-separated string. -void SubtargetFeatures::AddFeatures(const std::string &String) -{ - std::vector _Features; - - Split(_Features, String); - // Nothing is specified. - if (_Features.size() == 0) - return; - - for (std::vector::iterator it = _Features.begin(), - end = _Features.end(); it != end; ++it) { - // AddFeature will take care of feature string normalization. - AddFeature(*it); - } -} - -/// Add a set of features from the parsed command line parameters. -void SubtargetFeatures::AddFeatures(const cl::list &List) -{ - for (cl::list::const_iterator it = List.begin(), - end = List.end(); it != end; ++it) { - // AddFeature will take care of feature string normalization. - AddFeature(*it); - } -} - /// Find KV in array using binary search. template const T *Find(const std::string &S, const T *A, size_t L) { // Make the lower bound element we're looking for Modified: llvm/trunk/tools/lto/LTOCodeGenerator.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/lto/LTOCodeGenerator.cpp?rev=89893&r1=89892&r2=89893&view=diff ============================================================================== --- llvm/trunk/tools/lto/LTOCodeGenerator.cpp (original) +++ llvm/trunk/tools/lto/LTOCodeGenerator.cpp Wed Nov 25 16:44:18 2009 @@ -304,17 +304,10 @@ break; } - // Prepare subtarget feature set for the given command line options. 
- SubtargetFeatures features; - - // Set the rest of features by default. - // Note: Please keep this after all explict feature settings to make sure - // defaults will not override explicitly set options. - features.AddFeatures( - SubtargetFeatures::getDefaultSubtargetFeatures(llvm::Triple(Triple))); - // construct LTModule, hand over ownership of module and target - _target = march->createTargetMachine(Triple, features.getString()); + const std::string FeatureStr = + SubtargetFeatures::getDefaultSubtargetFeatures(llvm::Triple(Triple)); + _target = march->createTargetMachine(Triple, FeatureStr); } return false; } From dag at cray.com Wed Nov 25 17:09:34 2009 From: dag at cray.com (David Greene) Date: Wed, 25 Nov 2009 17:09:34 -0600 Subject: [llvm-commits] [PATCH] More Spill Annotations In-Reply-To: <200911250908.34818.dag@cray.com> References: <200911201622.37994.dag@cray.com> <200911250908.34818.dag@cray.com> Message-ID: <200911251709.35410.dag@cray.com> On Wednesday 25 November 2009 09:08, David Greene wrote: > On Wednesday 25 November 2009 07:27, Dan Gohman wrote: > > This raises the question of what you're actually aiming at here. Does > > it really make sense to impose the Vector and Scalar dichotomy on an > > architecture like x86? > > Yeah, I think so. There are vector instructions and there are scalar > instructions. We want to know when certain more expensive (i.e. vector) > things happen. In this case, we want to have some idea of how much data > is being transferred with a spill. Of course cache lines come into play > here but the vector/scalar categorization gives us an idea of how much of > that cache line we're actually touching. > > What are you thinking about this. What makes you nervous? Dan and I talked about this some and while I still think it's useful to mark instructions as vector and scalar, I don't actually need that for the spill comments.
As Dan pointed out, what would be more useful is more general microarchitectural properties like "pipeline class" or some such thing, whatever you want to call it, that means "this instruction can be executed by these pipelines." I don't have the time to put all of that kind of infrastructure in right now, so I'm going to finish off this vector classification stuff but not actually print out comments yet. I will print more useful comments like "4-byte spill" and so on. I'll send new patches when I have something reasonable. This has been a helpful discussion, thanks! -Dave From dpatel at apple.com Wed Nov 25 17:28:01 2009 From: dpatel at apple.com (Devang Patel) Date: Wed, 25 Nov 2009 23:28:01 -0000 Subject: [llvm-commits] [llvm] r89896 - /llvm/trunk/docs/SourceLevelDebugging.html Message-ID: <200911252328.nAPNS1YY018373@zion.cs.uiuc.edu> Author: dpatel Date: Wed Nov 25 17:28:01 2009 New Revision: 89896 URL: http://llvm.org/viewvc/llvm-project?rev=89896&view=rev Log: Update to reflect recent debugging information encoding changes. Modified: llvm/trunk/docs/SourceLevelDebugging.html Modified: llvm/trunk/docs/SourceLevelDebugging.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/SourceLevelDebugging.html?rev=89896&r1=89895&r2=89896&view=diff ============================================================================== --- llvm/trunk/docs/SourceLevelDebugging.html (original) +++ llvm/trunk/docs/SourceLevelDebugging.html Wed Nov 25 17:28:01 2009 @@ -37,15 +37,10 @@
  • Debugger intrinsic functions
  • -
  • Representing stopping points in the - source program
  • +
  • Object lifetimes and scoping
  • C/C++ front-end specific debug information
    1. C/C++ source file information
    2. @@ -763,92 +758,6 @@ - -
      -
      -  void %llvm.dbg.stoppoint( uint, uint, metadata)
      -
      - -

      This intrinsic is used to provide correspondence between the source file and - the generated code. The first argument is the line number (base 1), second - argument is the column number (0 if unknown) and the third argument the - source %llvm.dbg.compile_unit. - Code following a call to this intrinsic will - have been defined in close proximity of the line, column and file. This - information holds until the next call - to %lvm.dbg.stoppoint.

      - -
      - - - - -
      -
      -  void %llvm.dbg.func.start( metadata )
      -
      - -

      This intrinsic is used to link the debug information - in %llvm.dbg.subprogram to the - function. It defines the beginning of the function's declarative region - (scope). It also implies a call to - %llvm.dbg.stoppoint which - defines a source line "stop point". The intrinsic should be called early in - the function after the all the alloca instructions. It should be paired off - with a closing - %llvm.dbg.region.end. - The function's single argument is - the %llvm.dbg.subprogram.type.

      - -
      - - - - -
      -
      -  void %llvm.dbg.region.start( metadata )
      -
      - -

      This intrinsic is used to define the beginning of a declarative scope (ex. - block) for local language elements. It should be paired off with a closing - %llvm.dbg.region.end. The - function's single argument is - the %llvm.dbg.block which is - starting.

      - - -
      - - - - -
      -
      -  void %llvm.dbg.region.end( metadata )
      -
      - -

      This intrinsic is used to define the end of a declarative scope (ex. block) - for local language elements. It should be paired off with an - opening %llvm.dbg.region.start - or %llvm.dbg.func.start. - The function's single argument is either - the %llvm.dbg.block or - the %llvm.dbg.subprogram.type - which is ending.

      - -
      - - - @@ -867,40 +776,6 @@ - -
      - -

      LLVM debugger "stop points" are a key part of the debugging representation - that allows the LLVM to maintain simple semantics - for debugging optimized code. The basic idea is that - the front-end inserts calls to - the %llvm.dbg.stoppoint - intrinsic function at every point in the program where a debugger should be - able to inspect the program (these correspond to places a debugger stops when - you "step" through it). The front-end can choose to place these as - fine-grained as it would like (for example, before every subexpression - evaluated), but it is recommended to only put them after every source - statement that includes executable code.

      - -

      Using calls to this intrinsic function to demark legal points for the - debugger to inspect the program automatically disables any optimizations that - could potentially confuse debugging information. To - non-debug-information-aware transformations, these calls simply look like - calls to an external function, which they must assume to do anything - (including reading or writing to any part of reachable memory). On the other - hand, it does not impact many optimizations, such as code motion of - non-trapping instructions, nor does it impact optimization of subexpressions, - code duplication transformations, or basic-block reordering - transformations.

      - -
      - - - @@ -914,21 +789,20 @@ scoping in this sense, and does not want to be tied to a language's scoping rules.

      -

      In order to handle this, the LLVM debug format uses the notion of "regions" - of a function, delineated by calls to intrinsic functions. These intrinsic - functions define new regions of the program and indicate when the region - lifetime expires. Consider the following C fragment, for example:

      +

      In order to handle this, the LLVM debug format uses the metadata attached + with llvm instructions to encode line number and scoping information. + Consider the following C fragment, for example:

       1.  void foo() {
      -2.    int X = ...;
      -3.    int Y = ...;
      +2.    int X = 21;
      +3.    int Y = 22;
       4.    {
      -5.      int Z = ...;
      -6.      ...
      +5.      int Z = 23;
      +6.      Z = X;
       7.    }
      -8.    ...
      +8.    X = Y;
       9.  }
       
      @@ -937,99 +811,124 @@
      -void %foo() {
      +nounwind ssp {
       entry:
      -    %X = alloca int
      -    %Y = alloca int
      -    %Z = alloca int
      -    
      -    ...
      -    
      -    call void @llvm.dbg.func.start( metadata !0)
      -    
      -    call void @llvm.dbg.stoppoint( uint 2, uint 2, metadata !1)
      -    
      -    call void @llvm.dbg.declare({}* %X, ...)
      -    call void @llvm.dbg.declare({}* %Y, ...)
      -    
      -    ;; Evaluate expression on line 2, assigning to X.
      -    
      -    call void @llvm.dbg.stoppoint( uint 3, uint 2, metadata !1)
      -    
      -    ;; Evaluate expression on line 3, assigning to Y.
      -    
      -    call void @llvm.region.start()
      -    call void @llvm.dbg.stoppoint( uint 5, uint 4, metadata !1)
      -    call void @llvm.dbg.declare({}* %X, ...)
      -    
      -    ;; Evaluate expression on line 5, assigning to Z.
      -    
      -    call void @llvm.dbg.stoppoint( uint 7, uint 2, metadata !1)
      -    call void @llvm.region.end()
      -    
      -    call void @llvm.dbg.stoppoint( uint 9, uint 2, metadata !1)
      -    
      -    call void @llvm.region.end()
      -    
      -    ret void
      -}
      +  %X = alloca i32, align 4                        ; <i32*> [#uses=4]
      +  %Y = alloca i32, align 4                        ; <i32*> [#uses=4]
      +  %Z = alloca i32, align 4                        ; <i32*> [#uses=3]
      +  %0 = bitcast i32* %X to { }*                    ; <{ }*> [#uses=1]
      +  call void @llvm.dbg.declare({ }* %0, metadata !0), !dbg !7
      +  store i32 21, i32* %X, !dbg !8
      +  %1 = bitcast i32* %Y to { }*                    ; <{ }*> [#uses=1]
      +  call void @llvm.dbg.declare({ }* %1, metadata !9), !dbg !10
      +  store i32 22, i32* %Y, !dbg !11
      +  %2 = bitcast i32* %Z to { }*                    ; <{ }*> [#uses=1]
      +  call void @llvm.dbg.declare({ }* %2, metadata !12), !dbg !14
      +  store i32 23, i32* %Z, !dbg !15
      +  %tmp = load i32* %X, !dbg !16                   ; <i32> [#uses=1]
      +  %tmp1 = load i32* %Y, !dbg !16                  ; <i32> [#uses=1]
      +  %add = add nsw i32 %tmp, %tmp1, !dbg !16        ; <i32> [#uses=1]
      +  store i32 %add, i32* %Z, !dbg !16
      +  %tmp2 = load i32* %Y, !dbg !17                  ; <i32> [#uses=1]
      +  store i32 %tmp2, i32* %X, !dbg !17
      +  ret void, !dbg !18
      +}
      +
      +declare void @llvm.dbg.declare({ }*, metadata) nounwind readnone
      +
      +!0 = metadata !{i32 459008, metadata !1, metadata !"X", 
      +                metadata !3, i32 2, metadata !6}; [ DW_TAG_auto_variable ]
      +!1 = metadata !{i32 458763, metadata !2}; [DW_TAG_lexical_block ]
      +!2 = metadata !{i32 458798, i32 0, metadata !3, metadata !"foo", metadata !"foo", 
      +               metadata !"foo", metadata !3, i32 1, metadata !4, 
      +               i1 false, i1 true}; [DW_TAG_subprogram ]
      +!3 = metadata !{i32 458769, i32 0, i32 12, metadata !"foo.c", 
      +                metadata !"/private/tmp", metadata !"clang 1.1", i1 true, 
      +                i1 false, metadata !"", i32 0}; [DW_TAG_compile_unit ]
      +!4 = metadata !{i32 458773, metadata !3, metadata !"", null, i32 0, i64 0, i64 0, 
      +                i64 0, i32 0, null, metadata !5, i32 0}; [DW_TAG_subroutine_type ]
      +!5 = metadata !{null}
      +!6 = metadata !{i32 458788, metadata !3, metadata !"int", metadata !3, i32 0, 
      +                i64 32, i64 32, i64 0, i32 0, i32 5}; [DW_TAG_base_type ]
      +!7 = metadata !{i32 2, i32 7, metadata !1, null}
      +!8 = metadata !{i32 2, i32 3, metadata !1, null}
      +!9 = metadata !{i32 459008, metadata !1, metadata !"Y", metadata !3, i32 3, 
      +                metadata !6}; [ DW_TAG_auto_variable ]
      +!10 = metadata !{i32 3, i32 7, metadata !1, null}
      +!11 = metadata !{i32 3, i32 3, metadata !1, null}
      +!12 = metadata !{i32 459008, metadata !13, metadata !"Z", metadata !3, i32 5, 
      +                 metadata !6}; [ DW_TAG_auto_variable ]
      +!13 = metadata !{i32 458763, metadata !1}; [DW_TAG_lexical_block ]
      +!14 = metadata !{i32 5, i32 9, metadata !13, null}
      +!15 = metadata !{i32 5, i32 5, metadata !13, null}
      +!16 = metadata !{i32 6, i32 5, metadata !13, null}
      +!17 = metadata !{i32 8, i32 3, metadata !1, null}
      +!18 = metadata !{i32 9, i32 1, metadata !2, null}
       

      This example illustrates a few important details about the LLVM debugging - information. In particular, it shows how the various intrinsics are applied + information. In particular, it shows how the llvm.dbg.declare intrinsic + and location information, attached with an instruction, are applied together to allow a debugger to analyze the relationship between statements, variable definitions, and the code used to implement the function.

      -

      The first - intrinsic %llvm.dbg.func.start - provides a link with the subprogram - descriptor containing the details of this function. This call also - defines the beginning of the function region, bounded by - the %llvm.region.end at the - end of the function. This region is used to bracket the lifetime of - variables declared within. For a function, this outer region defines a new - stack frame whose lifetime ends when the region is ended.

      - -

      It is possible to define inner regions for short term variables by using the - %llvm.region.start - and %llvm.region.end to - bound a region. The inner region in this example would be for the block - containing the declaration of Z.

      - -

      Using regions to represent the boundaries of source-level functions allow - LLVM interprocedural optimizations to arbitrarily modify LLVM functions - without having to worry about breaking mapping information between the LLVM - code and the and source-level program. In particular, the inliner requires - no modification to support inlining with debugging information: there is no - explicit correlation drawn between LLVM functions and their source-level - counterparts (note however, that if the inliner inlines all instances of a - non-strong-linkage function into its caller that it will not be possible for - the user to manually invoke the inlined function from a debugger).

      - -

      Once the function has been defined, - the stopping point - corresponding to line #2 (column #2) of the function is encountered. At this - point in the function, no local variables are live. As lines 2 and 3 - of the example are executed, their variable definitions are introduced into - the program using - %llvm.dbg.declare, without the - need to specify a new region. These variables do not require new regions to - be introduced because they go out of scope at the same point in the program: - line 9.

      - -

      In contrast, the Z variable goes out of scope at a different time, - on line 7. For this reason, it is defined within the inner region, which - kills the availability of Z before the code for line 8 is executed. - In this way, regions can support arbitrary source-language scoping rules, as - long as they can only be nested (ie, one scope cannot partially overlap with - a part of another scope).

      - -

      It is worth noting that this scoping mechanism is used to control scoping of - all declarations, not just variable declarations. For example, the scope of - a C++ using declaration is controlled with this and could change how name - lookup is performed.

      +
      +
       
      +     call void @llvm.dbg.declare({ }* %0, metadata !0), !dbg !7   
      +   
      +
      +

      This first intrinsic + %llvm.dbg.declare + encodes debugging information for variable X. The metadata, + !dbg !7 attached with the intrinsic provides scope information for + the variable X.

      +
      +
      +     !7 = metadata !{i32 2, i32 7, metadata !1, null}
      +     !1 = metadata !{i32 458763, metadata !2}; [DW_TAG_lexical_block ]
      +     !2 = metadata !{i32 458798, i32 0, metadata !3, metadata !"foo", 
      +                     metadata !"foo", metadata !"foo", metadata !3, i32 1, 
      +                     metadata !4, i1 false, i1 true}; [DW_TAG_subprogram ]   
      +   
      +
      + +

      Here !7 is a metadata providing location information. It has four + fields: line number, column number, scope and original scope. The original + scope represents inline location if this instruction is inlined inside + a caller. It is null otherwise. In this example scope is encoded by + !1. !1 represents a lexical block inside the scope + !2, where !2 is a + subprogram descriptor. + This way the location information attached with the intrinsics indicates + that the variable X is declared at line number 2 at a function level + scope in function foo.

      + +

      Now let's take another example.

      + +
      +
       
      +     call void @llvm.dbg.declare({ }* %2, metadata !12), !dbg !14
      +   
      +
      +

      This intrinsic + %llvm.dbg.declare + encodes debugging information for variable Z. The metadata, + !dbg !14 attached with the intrinsic provides scope information for + the variable Z.

      +
      +
      +     !13 = metadata !{i32 458763, metadata !1}; [DW_TAG_lexical_block ]
      +     !14 = metadata !{i32 5, i32 9, metadata !13, null}
      +   
      +
      + +

      Here !14 indicates that Z is declared at line number 5, + column number 9 inside a lexical scope !13. This lexical scope + itself resides inside lexical scope !1 described above.

      +

      The scope information attached with each instruction provides a straight + forward way to find instructions covered by a scope.
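
As a rough illustration of that last point (not part of the commit), here is a minimal C++ sketch of finding the instructions covered by a scope. It assumes hypothetical `Scope` and `Location` structs that mirror the scope chain (!13 inside !1 inside !2) and the location tuples from the example. These names, along with `parentOf`, `isWithin`, `coveredBy`, `exampleScopes` and `exampleLocations`, are stand-ins made up for this sketch and are not LLVM classes or APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the metadata nodes in the example above.
// They are illustrative only and are not LLVM classes.
struct Scope {
  int Id;       // metadata node number, e.g. 2, 1, 13
  int ParentId; // enclosing scope's node number, or -1 if there is none
};

struct Location {
  int Line;
  int Column;
  int ScopeId; // third field of the location tuple
};

// Look up the enclosing scope of `Id`; returns -1 if `Id` is unknown or has
// no parent.
int parentOf(const std::vector<Scope> &Scopes, int Id) {
  for (const Scope &S : Scopes)
    if (S.Id == Id)
      return S.ParentId;
  return -1;
}

// Return true if scope `Inner` is `Outer` itself or is nested inside it.
bool isWithin(const std::vector<Scope> &Scopes, int Inner, int Outer) {
  std::size_t Steps = 0;
  for (int Cur = Inner; Cur != -1; Cur = parentOf(Scopes, Cur)) {
    if (Cur == Outer)
      return true;
    ++Steps;
    assert(Steps <= Scopes.size() && "scope chain must not contain a cycle");
  }
  return false;
}

// Collect the indices of all locations covered by scope `ScopeId`.
std::vector<std::size_t> coveredBy(const std::vector<Scope> &Scopes,
                                   const std::vector<Location> &Locs,
                                   int ScopeId) {
  std::vector<std::size_t> Result;
  for (std::size_t I = 0; I < Locs.size(); ++I)
    if (isWithin(Scopes, Locs[I].ScopeId, ScopeId))
      Result.push_back(I);
  return Result;
}

// Scope chain from the example: !13 is inside !1, which is inside !2.
std::vector<Scope> exampleScopes() {
  return {{2, -1}, {1, 2}, {13, 1}};
}

// Location tuples !7, !8, !10, !11, !14, !15, !16, !17, !18 from the example.
std::vector<Location> exampleLocations() {
  return {{2, 7, 1}, {2, 3, 1}, {3, 7, 1}, {3, 3, 1}, {5, 9, 13},
          {5, 5, 13}, {6, 5, 13}, {8, 3, 1}, {9, 1, 2}};
}
```

With the example data, `coveredBy(exampleScopes(), exampleLocations(), 13)` selects only the three locations on lines 5 and 6, while scope 1 additionally covers the locations on lines 2, 3 and 8. Walking the parent chain is what lets a query on an outer scope pick up everything in its nested blocks.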

      From dalej at apple.com Wed Nov 25 17:49:09 2009 From: dalej at apple.com (Dale Johannesen) Date: Wed, 25 Nov 2009 23:49:09 -0000 Subject: [llvm-commits] [llvm-gcc-4.2] r89898 - in /llvm-gcc-4.2/trunk/gcc/config/i386: i386.c i386.h llvm-i386.cpp Message-ID: <200911252349.nAPNn9OS019074@zion.cs.uiuc.edu> Author: johannes Date: Wed Nov 25 17:49:09 2009 New Revision: 89898 URL: http://llvm.org/viewvc/llvm-project?rev=89898&view=rev Log: Add X86_64_POINTER_CLASS to x86-64 parameter passing logic, which represents (surprise) a pointer. This has the same semantics as INTEGER except that it generates i8* in llvm IR rather than i64. 7375899. Modified: llvm-gcc-4.2/trunk/gcc/config/i386/i386.c llvm-gcc-4.2/trunk/gcc/config/i386/i386.h llvm-gcc-4.2/trunk/gcc/config/i386/llvm-i386.cpp Modified: llvm-gcc-4.2/trunk/gcc/config/i386/i386.c URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/config/i386/i386.c?rev=89898&r1=89897&r2=89898&view=diff ============================================================================== --- llvm-gcc-4.2/trunk/gcc/config/i386/i386.c (original) +++ llvm-gcc-4.2/trunk/gcc/config/i386/i386.c Wed Nov 25 17:49:09 2009 @@ -1294,13 +1294,15 @@ X86_64_X87_CLASS, X86_64_X87UP_CLASS, X86_64_COMPLEX_X87_CLASS, - X86_64_MEMORY_CLASS + X86_64_MEMORY_CLASS, + X86_64_POINTER_CLASS }; #endif /* !ENABLE_LLVM */ /* LLVM LOCAL end */ static const char * const x86_64_reg_class_name[] = { "no", "integer", "integerSI", "sse", "sseSF", "sseDF", - "sseup", "x87", "x87up", "cplx87", "no" + /* LLVM LOCAL */ + "sseup", "x87", "x87up", "cplx87", "no", "ptr" }; #define MAX_CLASSES 4 @@ -3303,8 +3305,12 @@ if ((class1 == X86_64_INTEGERSI_CLASS && class2 == X86_64_SSESF_CLASS) || (class2 == X86_64_INTEGERSI_CLASS && class1 == X86_64_SSESF_CLASS)) return X86_64_INTEGERSI_CLASS; + /* LLVM LOCAL begin */ if (class1 == X86_64_INTEGER_CLASS || class1 == X86_64_INTEGERSI_CLASS - || class2 == X86_64_INTEGER_CLASS || class2 == X86_64_INTEGERSI_CLASS) + || class1 
== X86_64_POINTER_CLASS + || class2 == X86_64_INTEGER_CLASS || class2 == X86_64_INTEGERSI_CLASS + || class2 == X86_64_POINTER_CLASS) + /* LLVM LOCAL end */ return X86_64_INTEGER_CLASS; /* Rule #5: If one of the classes is X87, X87UP, or COMPLEX_X87 class, @@ -3551,6 +3557,14 @@ classes[1] = X86_64_SSEUP_CLASS; return 2; case DImode: + /* LLVM LOCAL begin */ + if (POINTER_TYPE_P(type)) + { + classes[0] = X86_64_POINTER_CLASS; + return 1; + } + /* fall through */ + /* LLVM LOCAL end */ case SImode: case HImode: case QImode: @@ -3651,6 +3665,8 @@ { case X86_64_INTEGER_CLASS: case X86_64_INTEGERSI_CLASS: + /* LLVM LOCAL */ + case X86_64_POINTER_CLASS: (*int_nregs)++; break; case X86_64_SSE_CLASS: @@ -3764,6 +3780,8 @@ { case X86_64_INTEGER_CLASS: case X86_64_INTEGERSI_CLASS: + /* LLVM LOCAL */ + case X86_64_POINTER_CLASS: return gen_rtx_REG (mode, intreg[0]); case X86_64_SSE_CLASS: case X86_64_SSESF_CLASS: @@ -3799,6 +3817,8 @@ break; case X86_64_INTEGER_CLASS: case X86_64_INTEGERSI_CLASS: + /* LLVM LOCAL */ + case X86_64_POINTER_CLASS: /* Merge TImodes on aligned occasions here too. 
*/ if (i * 8 + 8 > bytes) tmpmode = mode_for_size ((bytes - i * 8) * BITS_PER_UNIT, MODE_INT, 0); Modified: llvm-gcc-4.2/trunk/gcc/config/i386/i386.h URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/config/i386/i386.h?rev=89898&r1=89897&r2=89898&view=diff ============================================================================== --- llvm-gcc-4.2/trunk/gcc/config/i386/i386.h (original) +++ llvm-gcc-4.2/trunk/gcc/config/i386/i386.h Wed Nov 25 17:49:09 2009 @@ -60,7 +60,8 @@ X86_64_X87_CLASS, X86_64_X87UP_CLASS, X86_64_COMPLEX_X87_CLASS, - X86_64_MEMORY_CLASS + X86_64_MEMORY_CLASS, + X86_64_POINTER_CLASS }; #endif /* ENABLE_LLVM */ Modified: llvm-gcc-4.2/trunk/gcc/config/i386/llvm-i386.cpp URL: http://llvm.org/viewvc/llvm-project/llvm-gcc-4.2/trunk/gcc/config/i386/llvm-i386.cpp?rev=89898&r1=89897&r2=89898&view=diff ============================================================================== --- llvm-gcc-4.2/trunk/gcc/config/i386/llvm-i386.cpp (original) +++ llvm-gcc-4.2/trunk/gcc/config/i386/llvm-i386.cpp Wed Nov 25 17:49:09 2009 @@ -838,6 +838,11 @@ totallyEmpty = false; Bytes -= 8; break; + case X86_64_POINTER_CLASS: + Elts.push_back(Type::getInt8PtrTy(Context)); + totallyEmpty = false; + Bytes -= 8; + break; case X86_64_SSE_CLASS: totallyEmpty = false; // If it's a SSE class argument, then one of the followings are possible: @@ -902,6 +907,9 @@ } else if (Class[i+1] == X86_64_INTEGER_CLASS) { Elts.push_back(VectorType::get(Type::getFloatTy(Context), 2)); Elts.push_back(Type::getInt64Ty(Context)); + } else if (Class[i+1] == X86_64_POINTER_CLASS) { + Elts.push_back(VectorType::get(Type::getFloatTy(Context), 2)); + Elts.push_back(Type::getInt8PtrTy(Context)); } else if (Class[i+1] == X86_64_NO_CLASS) { // padding bytes, don't pass Elts.push_back(Type::getDoubleTy(Context)); @@ -1069,7 +1077,8 @@ return false; if (NumClasses == 1 && - (Class[0] == X86_64_INTEGERSI_CLASS || Class[0] == X86_64_INTEGER_CLASS)) + (Class[0] == X86_64_INTEGERSI_CLASS || 
Class[0] == X86_64_INTEGER_CLASS || + Class[0] == X86_64_POINTER_CLASS)) // This will fit in one i64 register. return false; @@ -1113,11 +1122,14 @@ if (NumClasses == 1) { if (Class[0] == X86_64_INTEGERSI_CLASS || - Class[0] == X86_64_INTEGER_CLASS) { + Class[0] == X86_64_INTEGER_CLASS || + Class[0] == X86_64_POINTER_CLASS) { // one int register HOST_WIDE_INT Bytes = (Mode == BLKmode) ? int_size_in_bytes(type) : (int) GET_MODE_SIZE(Mode); + if (Bytes==8 && Class[0] == X86_64_POINTER_CLASS) + return Type::getInt8PtrTy(Context); if (Bytes>4) return Type::getInt64Ty(Context); else if (Bytes>2) @@ -1135,6 +1147,8 @@ Class[0] == X86_64_NO_CLASS || Class[0] == X86_64_INTEGERSI_CLASS) return Type::getInt64Ty(Context); + else if (Class[0] == X86_64_POINTER_CLASS) + return Type::getInt8PtrTy(Context); else if (Class[0] == X86_64_SSE_CLASS || Class[0] == X86_64_SSEDF_CLASS) return Type::getDoubleTy(Context); else if (Class[0] == X86_64_SSESF_CLASS) @@ -1146,6 +1160,8 @@ if (Class[1] == X86_64_INTEGERSI_CLASS || Class[1] == X86_64_INTEGER_CLASS) return Type::getInt64Ty(Context); + else if (Class[1] == X86_64_POINTER_CLASS) + return Type::getInt8PtrTy(Context); else if (Class[1] == X86_64_SSE_CLASS || Class[1] == X86_64_SSEDF_CLASS) return Type::getDoubleTy(Context); else if (Class[1] == X86_64_SSESF_CLASS) @@ -1189,6 +1205,9 @@ if (NumClasses == 1 && Class[0] == X86_64_INTEGER_CLASS) assert(0 && "This type does not need multiple return registers!"); + if (NumClasses == 1 && Class[0] == X86_64_POINTER_CLASS) + assert(0 && "This type does not need multiple return registers!"); + // classify_argument uses a single X86_64_NO_CLASS as a special case for // empty structs. Recognize it and don't add any return values in that // case. 
@@ -1202,6 +1221,10 @@ Elts.push_back(Type::getInt64Ty(Context)); Bytes -= 8; break; + case X86_64_POINTER_CLASS: + Elts.push_back(Type::getInt8PtrTy(Context)); + Bytes -= 8; + break; case X86_64_SSE_CLASS: // If it's a SSE class argument, then one of the followings are possible: // 1. 1 x SSE, size is 8: 1 x Double. @@ -1264,6 +1287,9 @@ } else if (Class[i+1] == X86_64_INTEGER_CLASS) { Elts.push_back(VectorType::get(Type::getFloatTy(Context), 2)); Elts.push_back(Type::getInt64Ty(Context)); + } else if (Class[i+1] == X86_64_POINTER_CLASS) { + Elts.push_back(VectorType::get(Type::getFloatTy(Context), 2)); + Elts.push_back(Type::getInt8PtrTy(Context)); } else if (Class[i+1] == X86_64_NO_CLASS) { Elts.push_back(Type::getDoubleTy(Context)); Bytes -= 16; @@ -1467,7 +1493,8 @@ int NumClasses = ix86_ClassifyArgument(Mode, type, Class, 0); *DontCheckAlignment= true; if (NumClasses == 1 && (Class[0] == X86_64_INTEGER_CLASS || - Class[0] == X86_64_INTEGERSI_CLASS)) { + Class[0] == X86_64_INTEGERSI_CLASS || + Class[0] == X86_64_POINTER_CLASS)) { // one int register HOST_WIDE_INT Bytes = (Mode == BLKmode) ? int_size_in_bytes(type) : (int) GET_MODE_SIZE(Mode); @@ -1480,8 +1507,10 @@ return true; } if (NumClasses == 2 && (Class[0] == X86_64_INTEGERSI_CLASS || - Class[0] == X86_64_INTEGER_CLASS)) { - if (Class[1] == X86_64_INTEGER_CLASS) { + Class[0] == X86_64_INTEGER_CLASS || + Class[0] == X86_64_POINTER_CLASS)) { + if (Class[1] == X86_64_INTEGER_CLASS || + Class[1] == X86_64_POINTER_CLASS) { // 16 byte object, 2 int registers *size = 16; return true; From dalej at apple.com Wed Nov 25 17:50:09 2009 From: dalej at apple.com (Dale Johannesen) Date: Wed, 25 Nov 2009 23:50:09 -0000 Subject: [llvm-commits] [llvm] r89899 - /llvm/trunk/test/FrontendC++/m64-ptr.cpp Message-ID: <200911252350.nAPNo942019114@zion.cs.uiuc.edu> Author: johannes Date: Wed Nov 25 17:50:09 2009 New Revision: 89899 URL: http://llvm.org/viewvc/llvm-project?rev=89899&view=rev Log: Test for llvm-gcc checkin 89898. 
Added: llvm/trunk/test/FrontendC++/m64-ptr.cpp Added: llvm/trunk/test/FrontendC++/m64-ptr.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/FrontendC%2B%2B/m64-ptr.cpp?rev=89899&view=auto ============================================================================== --- llvm/trunk/test/FrontendC++/m64-ptr.cpp (added) +++ llvm/trunk/test/FrontendC++/m64-ptr.cpp Wed Nov 25 17:50:09 2009 @@ -0,0 +1,17 @@ +// RUN: %llvmgxx %s -S -o - | FileCheck %s +// Make sure pointers are passed as pointers, not converted to int. +// The first load should be of type i8** in either 32 or 64 bit mode. +// This formerly happened on x86-64, 7375899. + +class StringRef { +public: + const char *Data; + long Len; +}; +void foo(StringRef X); +void bar(StringRef &A) { +// CHECK: @_Z3barR9StringRef +// CHECK: load i8** + foo(A); +// CHECK: ret void +} From clattner at apple.com Wed Nov 25 18:17:30 2009 From: clattner at apple.com (Chris Lattner) Date: Wed, 25 Nov 2009 16:17:30 -0800 Subject: [llvm-commits] [llvm] r89626 - in /llvm/trunk: docs/CommandGuide/FileCheck.pod utils/FileCheck/FileCheck.cpp In-Reply-To: <6a8523d60911251154t6f3c8908i6b9bccce1a8d5410@mail.gmail.com> References: <200911222207.nAMM7pis006331@zion.cs.uiuc.edu> <6a8523d60911251154t6f3c8908i6b9bccce1a8d5410@mail.gmail.com> Message-ID: <5497D6F4-8C2C-4DAA-992E-A307B5E02C09@apple.com> On Nov 25, 2009, at 11:54 AM, Daniel Dunbar wrote: > On Mon, Nov 23, 2009 at 10:29 AM, Chris Lattner wrote: >> >> On Nov 22, 2009, at 2:07 PM, Daniel Dunbar wrote: >> >>> Author: ddunbar >>> Date: Sun Nov 22 16:07:50 2009 >>> New Revision: 89626 >>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=89626&view=rev >>> Log: >>> Allow '_' in FileCheck variable names, it is nice to have at least one >>> separate character. >>> - Chris, OK? >> >> Fine with me, thx. Please update the TestingGuide documentation. > > Why do we have two copies of the docs? Can't the TestingGuide just > point at the FileCheck man page? 
Yes, but the TestingGuide.html version is formatted nicer (e.g. bolding important pieces of the
       text).  I couldn't figure out how to get the pod to not be ugly.
      
      -Chris
      
      
      From clattner at apple.com  Wed Nov 25 18:18:10 2009
      From: clattner at apple.com (Chris Lattner)
      Date: Wed, 25 Nov 2009 16:18:10 -0800
      Subject: [llvm-commits] [llvm] r89877 -
      	/llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h
      In-Reply-To: <200911251957.nAPJvEma011366@zion.cs.uiuc.edu>
      References: <200911251957.nAPJvEma011366@zion.cs.uiuc.edu>
      Message-ID: <6CB2B61C-657A-4260-9717-1267F605009E@apple.com>
      
      
      On Nov 25, 2009, at 11:57 AM, Bob Wilson wrote:
      
      > Author: bwilson
      > Date: Wed Nov 25 13:57:14 2009
      > New Revision: 89877
      > 
      > URL: http://llvm.org/viewvc/llvm-project?rev=89877&view=rev
      > Log:
      > Tail duplicate indirect branches for PowerPC, too.
      > With the testcase for pr3120, the "threaded interpreter" runtime decreases
      > from 1788 to 1413 with this change.
      
      Very nice Bob!  Silly question: should isProfitableToDuplicateIndirectBranch default to true?  That would let targets opt out if it is not beneficial and would save a small bit of code size.
      
      -Chris
      
      > 
      > Modified:
      >    llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h
      > 
      > Modified: llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h
      > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h?rev=89877&r1=89876&r2=89877&view=diff
      > 
      > ==============================================================================
      > --- llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h (original)
      > +++ llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h Wed Nov 25 13:57:14 2009
      > @@ -151,6 +151,8 @@
      >   /// instruction may be.  This returns the maximum number of bytes.
      >   ///
      >   virtual unsigned GetInstSizeInBytes(const MachineInstr *MI) const;
      > +
      > +  virtual bool isProfitableToDuplicateIndirectBranch() const { return true; }
      > };
      > 
      > }
      > 
      > 
      > _______________________________________________
      > llvm-commits mailing list
      > llvm-commits at cs.uiuc.edu
      > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
      
      
      
      
      From bob.wilson at apple.com  Wed Nov 25 18:32:22 2009
      From: bob.wilson at apple.com (Bob Wilson)
      Date: Thu, 26 Nov 2009 00:32:22 -0000
      Subject: [llvm-commits] [llvm] r89904 - in /llvm/trunk:
       include/llvm/CodeGen/MachineBasicBlock.h include/llvm/CodeGen/Passes.h
       lib/CodeGen/BranchFolding.cpp lib/CodeGen/BranchFolding.h
       lib/CodeGen/CMakeLists.txt lib/CodeGen/LLVMTargetMachine.cpp
       lib/CodeGen/MachineBasicBlock.cpp lib/CodeGen/TailDuplication.cpp
      Message-ID: <200911260032.nAQ0WMAl020719@zion.cs.uiuc.edu>
      
      Author: bwilson
      Date: Wed Nov 25 18:32:21 2009
      New Revision: 89904
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89904&view=rev
      Log:
      Split tail duplication into a separate pass.  This is needed to avoid
      running tail duplication when doing branch folding for if-conversion, and
      we also want to be able to run tail duplication earlier to fix some
      reg alloc problems.  Move the CanFallThrough function from BranchFolding
      to MachineBasicBlock so that it can be shared by TailDuplication.
      
      Added:
          llvm/trunk/lib/CodeGen/TailDuplication.cpp
      Modified:
          llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h
          llvm/trunk/include/llvm/CodeGen/Passes.h
          llvm/trunk/lib/CodeGen/BranchFolding.cpp
          llvm/trunk/lib/CodeGen/BranchFolding.h
          llvm/trunk/lib/CodeGen/CMakeLists.txt
          llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp
          llvm/trunk/lib/CodeGen/MachineBasicBlock.cpp
      
      Modified: llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h?rev=89904&r1=89903&r2=89904&view=diff
      
      ==============================================================================
      --- llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h (original)
      +++ llvm/trunk/include/llvm/CodeGen/MachineBasicBlock.h Wed Nov 25 18:32:21 2009
      @@ -271,6 +271,12 @@
         /// ends with an unconditional branch to some other block.
         bool isLayoutSuccessor(const MachineBasicBlock *MBB) const;
       
      +  /// canFallThrough - Return true if the block can implicitly transfer
      +  /// control to the block after it by falling off the end of it.  This should
      +  /// return false if it can reach the block after it, but it uses an explicit
      +  /// branch to do so (e.g., a table jump).  True is a conservative answer.
      +  bool canFallThrough();
      +
         /// getFirstTerminator - returns an iterator to the first terminator
         /// instruction of this basic block. If a terminator does not exist,
         /// it returns end()
      
      Modified: llvm/trunk/include/llvm/CodeGen/Passes.h
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/Passes.h?rev=89904&r1=89903&r2=89904&view=diff
      
      ==============================================================================
      --- llvm/trunk/include/llvm/CodeGen/Passes.h (original)
      +++ llvm/trunk/include/llvm/CodeGen/Passes.h Wed Nov 25 18:32:21 2009
      @@ -129,6 +129,10 @@
         /// branches.
         FunctionPass *createBranchFoldingPass(bool DefaultEnableTailMerge);
       
      +  /// TailDuplication Pass - Duplicate blocks with unconditional branches
      +  /// into tails of their predecessors.
      +  FunctionPass *createTailDuplicationPass();
      +
         /// IfConverter Pass - This pass performs machine code if conversion.
         FunctionPass *createIfConverterPass();
       
      
      Modified: llvm/trunk/lib/CodeGen/BranchFolding.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.cpp?rev=89904&r1=89903&r2=89904&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/CodeGen/BranchFolding.cpp (original)
      +++ llvm/trunk/lib/CodeGen/BranchFolding.cpp Wed Nov 25 18:32:21 2009
      @@ -41,8 +41,6 @@
       STATISTIC(NumDeadBlocks, "Number of dead blocks removed");
       STATISTIC(NumBranchOpts, "Number of branches optimized");
       STATISTIC(NumTailMerge , "Number of block tails merged");
      -STATISTIC(NumTailDups  , "Number of tail duplicated blocks");
      -STATISTIC(NumInstrDups , "Additional instructions due to tail duplication");
       
       static cl::opt<cl::boolOrDefault> FlagEnableTailMerge("enable-tail-merge",
                                     cl::init(cl::BOU_UNSET), cl::Hidden);
      @@ -205,16 +203,6 @@
           MadeChange |= MadeChangeThisIteration;
         }
       
      -  // Do tail duplication after tail merging is done.  Otherwise it is
      -  // tough to avoid situations where tail duplication and tail merging undo
      -  // each other's transformations ad infinitum.
      -  MadeChangeThisIteration = true;
      -  while (MadeChangeThisIteration) {
      -    MadeChangeThisIteration = false;
      -    MadeChangeThisIteration |= TailDuplicateBlocks(MF);
      -    MadeChange |= MadeChangeThisIteration;
      -  }
      -
         // See if any jump tables have become mergable or dead as the code generator
         // did its thing.
         MachineJumpTableInfo *JTI = MF.getJumpTableInfo();
      @@ -918,71 +906,6 @@
       }
       
       
      -/// CanFallThrough - Return true if the specified block (with the specified
      -/// branch condition) can implicitly transfer control to the block after it by
      -/// falling off the end of it.  This should return false if it can reach the
      -/// block after it, but it uses an explicit branch to do so (e.g. a table jump).
      -///
      -/// True is a conservative answer.
      -///
      -bool BranchFolder::CanFallThrough(MachineBasicBlock *CurBB,
      -                                  bool BranchUnAnalyzable,
      -                                  MachineBasicBlock *TBB,
      -                                  MachineBasicBlock *FBB,
      -                                  const SmallVectorImpl<MachineOperand> &Cond) {
      -  MachineFunction::iterator Fallthrough = CurBB;
      -  ++Fallthrough;
      -  // If FallthroughBlock is off the end of the function, it can't fall through.
      -  if (Fallthrough == CurBB->getParent()->end())
      -    return false;
      -
      -  // If FallthroughBlock isn't a successor of CurBB, no fallthrough is possible.
      -  if (!CurBB->isSuccessor(Fallthrough))
      -    return false;
      -
      -  // If we couldn't analyze the branch, examine the last instruction.
      -  // If the block doesn't end in a known control barrier, assume fallthrough
      -  // is possible. The isPredicable check is needed because this code can be
      -  // called during IfConversion, where an instruction which is normally a
      -  // Barrier is predicated and thus no longer an actual control barrier. This
      -  // is over-conservative though, because if an instruction isn't actually
      -  // predicated we could still treat it like a barrier.
      -  if (BranchUnAnalyzable)
      -    return CurBB->empty() || !CurBB->back().getDesc().isBarrier() ||
      -           CurBB->back().getDesc().isPredicable();
      -
      -  // If there is no branch, control always falls through.
      -  if (TBB == 0) return true;
      -
      -  // If there is some explicit branch to the fallthrough block, it can obviously
      -  // reach, even though the branch should get folded to fall through implicitly.
      -  if (MachineFunction::iterator(TBB) == Fallthrough ||
      -      MachineFunction::iterator(FBB) == Fallthrough)
      -    return true;
      -
      -  // If it's an unconditional branch to some block not the fall through, it
      -  // doesn't fall through.
      -  if (Cond.empty()) return false;
      -
      -  // Otherwise, if it is conditional and has no explicit false block, it falls
      -  // through.
      -  return FBB == 0;
      -}
      -
      -/// CanFallThrough - Return true if the specified can implicitly transfer
      -/// control to the block after it by falling off the end of it.  This should
      -/// return false if it can reach the block after it, but it uses an explicit
      -/// branch to do so (e.g. a table jump).
      -///
      -/// True is a conservative answer.
      -///
      -bool BranchFolder::CanFallThrough(MachineBasicBlock *CurBB) {
      -  MachineBasicBlock *TBB = 0, *FBB = 0;
      -  SmallVector<MachineOperand, 4> Cond;
      -  bool CurUnAnalyzable = TII->AnalyzeBranch(*CurBB, TBB, FBB, Cond, true);
      -  return CanFallThrough(CurBB, CurUnAnalyzable, TBB, FBB, Cond);
      -}
      -
       /// IsBetterFallthrough - Return true if it would be clearly better to
       /// fall-through to MBB1 than to fall through into MBB2.  This has to return
       /// a strict ordering, returning true for both (MBB1,MBB2) and (MBB2,MBB1) will
      @@ -1005,152 +928,6 @@
         return MBB2I->getDesc().isCall() && !MBB1I->getDesc().isCall();
       }
       
      -/// TailDuplicateBlocks - Look for small blocks that are unconditionally
      -/// branched to and do not fall through. Tail-duplicate their instructions
      -/// into their predecessors to eliminate (dynamic) branches.
      -bool BranchFolder::TailDuplicateBlocks(MachineFunction &MF) {
      -  bool MadeChange = false;
      -
      -  for (MachineFunction::iterator I = ++MF.begin(), E = MF.end(); I != E; ) {
      -    MachineBasicBlock *MBB = I++;
      -
      -    // Only duplicate blocks that end with unconditional branches.
      -    if (CanFallThrough(MBB))
      -      continue;
      -
      -    MadeChange |= TailDuplicate(MBB, MF);
      -
      -    // If it is dead, remove it.
      -    if (MBB->pred_empty()) {
      -      NumInstrDups -= MBB->size();
      -      RemoveDeadBlock(MBB);
      -      MadeChange = true;
      -      ++NumDeadBlocks;
      -    }
      -  }
      -  return MadeChange;
      -}
      -
      -/// TailDuplicate - If it is profitable, duplicate TailBB's contents in each
      -/// of its predecessors.
      -bool BranchFolder::TailDuplicate(MachineBasicBlock *TailBB,
      -                                 MachineFunction &MF) {
      -  // Don't try to tail-duplicate single-block loops.
      -  if (TailBB->isSuccessor(TailBB))
      -    return false;
      -
      -  // Set the limit on the number of instructions to duplicate, with a default
      -  // of one less than the tail-merge threshold. When optimizing for size,
      -  // duplicate only one, because one branch instruction can be eliminated to
      -  // compensate for the duplication.
      -  unsigned MaxDuplicateCount;
      -  if (MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize))
      -    MaxDuplicateCount = 1;
      -  else if (TII->isProfitableToDuplicateIndirectBranch() &&
      -           !TailBB->empty() && TailBB->back().getDesc().isIndirectBranch())
      -    // If the target has hardware branch prediction that can handle indirect
      -    // branches, duplicating them can often make them predictable when there
      -    // are common paths through the code.  The limit needs to be high enough
      -    // to allow undoing the effects of tail merging.
      -    MaxDuplicateCount = 20;
      -  else
      -    MaxDuplicateCount = TailMergeSize - 1;
      -
      -  // Check the instructions in the block to determine whether tail-duplication
      -  // is invalid or unlikely to be profitable.
      -  unsigned i = 0;
      -  bool HasCall = false;
      -  for (MachineBasicBlock::iterator I = TailBB->begin();
      -       I != TailBB->end(); ++I, ++i) {
      -    // Non-duplicable things shouldn't be tail-duplicated.
      -    if (I->getDesc().isNotDuplicable()) return false;
      -    // Don't duplicate more than the threshold.
      -    if (i == MaxDuplicateCount) return false;
      -    // Remember if we saw a call.
      -    if (I->getDesc().isCall()) HasCall = true;
      -  }
      -  // Heuristically, don't tail-duplicate calls if it would expand code size,
      -  // as it's less likely to be worth the extra cost.
      -  if (i > 1 && HasCall)
      -    return false;
      -
      -  // Iterate through all the unique predecessors and tail-duplicate this
      -  // block into them, if possible. Copying the list ahead of time also
      -  // avoids trouble with the predecessor list reallocating.
      -  bool Changed = false;
      -  SmallSetVector<MachineBasicBlock *, 8> Preds(TailBB->pred_begin(),
      -                                               TailBB->pred_end());
      -  for (SmallSetVector<MachineBasicBlock *, 8>::iterator PI = Preds.begin(),
      -       PE = Preds.end(); PI != PE; ++PI) {
      -    MachineBasicBlock *PredBB = *PI;
      -
      -    assert(TailBB != PredBB &&
      -           "Single-block loop should have been rejected earlier!");
      -    if (PredBB->succ_size() > 1) continue;
      -
      -    MachineBasicBlock *PredTBB, *PredFBB;
      -    SmallVector<MachineOperand, 4> PredCond;
      -    if (TII->AnalyzeBranch(*PredBB, PredTBB, PredFBB, PredCond, true))
      -      continue;
      -    if (!PredCond.empty())
      -      continue;
      -    // EH edges are ignored by AnalyzeBranch.
      -    if (PredBB->succ_size() != 1)
      -      continue;
      -    // Don't duplicate into a fall-through predecessor (at least for now).
      -    if (PredBB->isLayoutSuccessor(TailBB) && CanFallThrough(PredBB))
      -      continue;
      -
      -    DEBUG(errs() << "\nTail-duplicating into PredBB: " << *PredBB
      -                 << "From Succ: " << *TailBB);
      -
      -    // Remove PredBB's unconditional branch.
      -    TII->RemoveBranch(*PredBB);
      -    // Clone the contents of TailBB into PredBB.
      -    for (MachineBasicBlock::iterator I = TailBB->begin(), E = TailBB->end();
      -         I != E; ++I) {
      -      MachineInstr *NewMI = MF.CloneMachineInstr(I);
      -      PredBB->insert(PredBB->end(), NewMI);
      -    }
      -    NumInstrDups += TailBB->size() - 1; // subtract one for removed branch
      -
      -    // Update the CFG.
      -    PredBB->removeSuccessor(PredBB->succ_begin());
      -    assert(PredBB->succ_empty() &&
      -           "TailDuplicate called on block with multiple successors!");
      -    for (MachineBasicBlock::succ_iterator I = TailBB->succ_begin(),
      -         E = TailBB->succ_end(); I != E; ++I)
      -       PredBB->addSuccessor(*I);
      -
      -    Changed = true;
      -    ++NumTailDups;
      -  }
      -
      -  // If TailBB was duplicated into all its predecessors except for the prior
      -  // block, which falls through unconditionally, move the contents of this
      -  // block into the prior block.
      -  MachineBasicBlock &PrevBB = *prior(MachineFunction::iterator(TailBB));
      -  MachineBasicBlock *PriorTBB = 0, *PriorFBB = 0;
      -  SmallVector<MachineOperand, 4> PriorCond;
      -  bool PriorUnAnalyzable =
      -    TII->AnalyzeBranch(PrevBB, PriorTBB, PriorFBB, PriorCond, true);
      -  // This has to check PrevBB->succ_size() because EH edges are ignored by
      -  // AnalyzeBranch.
      -  if (!PriorUnAnalyzable && PriorCond.empty() && !PriorTBB &&
      -      TailBB->pred_size() == 1 && PrevBB.succ_size() == 1 &&
      -      !TailBB->hasAddressTaken()) {
      -    DEBUG(errs() << "\nMerging into block: " << PrevBB
      -          << "From MBB: " << *TailBB);
      -    PrevBB.splice(PrevBB.end(), TailBB, TailBB->begin(), TailBB->end());
      -    PrevBB.removeSuccessor(PrevBB.succ_begin());;
      -    assert(PrevBB.succ_empty());
      -    PrevBB.transferSuccessors(TailBB);
      -    Changed = true;
      -  }
      -
      -  return Changed;
      -}
      -
       /// OptimizeBlock - Analyze and optimize control flow related to the specified
       /// block.  This is never called on the entry block.
       bool BranchFolder::OptimizeBlock(MachineBasicBlock *MBB) {
      @@ -1275,7 +1052,7 @@
           // the assert condition out of the loop body.
           if (MBB->succ_empty() && !PriorCond.empty() && PriorFBB == 0 &&
               MachineFunction::iterator(PriorTBB) == FallThrough &&
      -        !CanFallThrough(MBB)) {
      +        !MBB->canFallThrough()) {
             bool DoTransform = true;
       
             // We have to be careful that the succs of PredBB aren't both no-successor
      @@ -1299,7 +1076,7 @@
             // In this case, we could actually be moving the return block *into* a
             // loop!
             if (DoTransform && !MBB->succ_empty() &&
      -          (!CanFallThrough(PriorTBB) || PriorTBB->empty()))
      +          (!PriorTBB->canFallThrough() || PriorTBB->empty()))
               DoTransform = false;
       
       
      @@ -1431,13 +1208,11 @@
         // If the prior block doesn't fall through into this block, and if this
         // block doesn't fall through into some other block, see if we can find a
         // place to move this block where a fall-through will happen.
      -  if (!CanFallThrough(&PrevBB, PriorUnAnalyzable,
      -                      PriorTBB, PriorFBB, PriorCond)) {
      +  if (!PrevBB.canFallThrough()) {
       
           // Now we know that there was no fall-through into this block, check to
           // see if it has a fall-through into its successor.
      -    bool CurFallsThru = CanFallThrough(MBB, CurUnAnalyzable, CurTBB, CurFBB,
      -                                       CurCond);
      +    bool CurFallsThru = MBB->canFallThrough();
       
           if (!MBB->isLandingPad()) {
             // Check all the predecessors of this block.  If one of them has no fall
      @@ -1449,7 +1224,7 @@
               MachineFunction::iterator PredFallthrough = PredBB; ++PredFallthrough;
               MachineBasicBlock *PredTBB, *PredFBB;
               SmallVector<MachineOperand, 4> PredCond;
      -        if (PredBB != MBB && !CanFallThrough(PredBB) &&
      +        if (PredBB != MBB && !PredBB->canFallThrough() &&
                   !TII->AnalyzeBranch(*PredBB, PredTBB, PredFBB, PredCond, true)
                   && (!CurFallsThru || !CurTBB || !CurFBB)
                   && (!CurFallsThru || MBB->getNumber() >= PredBB->getNumber())) {
      @@ -1488,7 +1263,7 @@
               // and if the successor isn't an EH destination, we can arrange for the
               // fallthrough to happen.
               if (SuccBB != MBB && &*SuccPrev != MBB &&
      -            !CanFallThrough(SuccPrev) && !CurUnAnalyzable &&
      +            !SuccPrev->canFallThrough() && !CurUnAnalyzable &&
                   !SuccBB->isLandingPad()) {
                 MBB->moveBefore(SuccBB);
                 MadeChange = true;
      
      Modified: llvm/trunk/lib/CodeGen/BranchFolding.h
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/BranchFolding.h?rev=89904&r1=89903&r2=89904&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/CodeGen/BranchFolding.h (original)
      +++ llvm/trunk/lib/CodeGen/BranchFolding.h Wed Nov 25 18:32:21 2009
      @@ -105,18 +105,10 @@
           unsigned CreateCommonTailOnlyBlock(MachineBasicBlock *&PredBB,
                                              unsigned maxCommonTailLength);
       
      -    bool TailDuplicateBlocks(MachineFunction &MF);
      -    bool TailDuplicate(MachineBasicBlock *TailBB, MachineFunction &MF);
      -    
           bool OptimizeBranches(MachineFunction &MF);
           bool OptimizeBlock(MachineBasicBlock *MBB);
           void RemoveDeadBlock(MachineBasicBlock *MBB);
           bool OptimizeImpDefsBlock(MachineBasicBlock *MBB);
      -    
      -    bool CanFallThrough(MachineBasicBlock *CurBB);
      -    bool CanFallThrough(MachineBasicBlock *CurBB, bool BranchUnAnalyzable,
      -                        MachineBasicBlock *TBB, MachineBasicBlock *FBB,
      -                        const SmallVectorImpl<MachineOperand> &Cond);
         };
       }
       
      
      Modified: llvm/trunk/lib/CodeGen/CMakeLists.txt
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CMakeLists.txt?rev=89904&r1=89903&r2=89904&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/CodeGen/CMakeLists.txt (original)
      +++ llvm/trunk/lib/CodeGen/CMakeLists.txt Wed Nov 25 18:32:21 2009
      @@ -63,6 +63,7 @@
         StackProtector.cpp
         StackSlotColoring.cpp
         StrongPHIElimination.cpp
      +  TailDuplication.cpp
         TargetInstrInfoImpl.cpp
         TwoAddressInstructionPass.cpp
         UnreachableBlockElim.cpp
      
      Modified: llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp?rev=89904&r1=89903&r2=89904&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp (original)
      +++ llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp Wed Nov 25 18:32:21 2009
      @@ -35,6 +35,8 @@
           cl::desc("Disable Post Regalloc"));
       static cl::opt<bool> DisableBranchFold("disable-branch-fold", cl::Hidden,
           cl::desc("Disable branch folding"));
      +static cl::opt<bool> DisableTailDuplicate("disable-tail-duplicate", cl::Hidden,
      +    cl::desc("Disable tail duplication"));
       static cl::opt<bool> DisableCodePlace("disable-code-place", cl::Hidden,
           cl::desc("Disable code placement"));
       static cl::opt<bool> DisableSSC("disable-ssc", cl::Hidden,
      @@ -344,6 +346,12 @@
           printAndVerify(PM, "After BranchFolding");
         }
       
      +  // Tail duplication.
      +  if (OptLevel != CodeGenOpt::None && !DisableTailDuplicate) {
      +    PM.add(createTailDuplicationPass());
      +    printAndVerify(PM, "After TailDuplication");
      +  }
      +
         PM.add(createGCMachineCodeAnalysisPass());
       
         if (PrintGCInfo)
      
      Modified: llvm/trunk/lib/CodeGen/MachineBasicBlock.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineBasicBlock.cpp?rev=89904&r1=89903&r2=89904&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/CodeGen/MachineBasicBlock.cpp (original)
      +++ llvm/trunk/lib/CodeGen/MachineBasicBlock.cpp Wed Nov 25 18:32:21 2009
      @@ -362,6 +362,51 @@
         return next(I) == MachineFunction::const_iterator(MBB);
       }
       
      +bool MachineBasicBlock::canFallThrough() {
      +  MachineBasicBlock *TBB = 0, *FBB = 0;
      +  SmallVector<MachineOperand, 4> Cond;
      +  const TargetInstrInfo *TII = getParent()->getTarget().getInstrInfo();
      +  bool BranchUnAnalyzable = TII->AnalyzeBranch(*this, TBB, FBB, Cond, true);
      +
      +  MachineFunction::iterator Fallthrough = this;
      +  ++Fallthrough;
      +  // If FallthroughBlock is off the end of the function, it can't fall through.
      +  if (Fallthrough == getParent()->end())
      +    return false;
      +
      +  // If FallthroughBlock isn't a successor, no fallthrough is possible.
      +  if (!isSuccessor(Fallthrough))
      +    return false;
      +
      +  // If we couldn't analyze the branch, examine the last instruction.
      +  // If the block doesn't end in a known control barrier, assume fallthrough
      +  // is possible. The isPredicable check is needed because this code can be
      +  // called during IfConversion, where an instruction which is normally a
      +  // Barrier is predicated and thus no longer an actual control barrier. This
      +  // is over-conservative though, because if an instruction isn't actually
      +  // predicated we could still treat it like a barrier.
      +  if (BranchUnAnalyzable)
      +    return empty() || !back().getDesc().isBarrier() ||
      +           back().getDesc().isPredicable();
      +
      +  // If there is no branch, control always falls through.
      +  if (TBB == 0) return true;
      +
      +  // If there is some explicit branch to the fallthrough block, it can obviously
      +  // reach, even though the branch should get folded to fall through implicitly.
      +  if (MachineFunction::iterator(TBB) == Fallthrough ||
      +      MachineFunction::iterator(FBB) == Fallthrough)
      +    return true;
      +
      +  // If it's an unconditional branch to some block not the fall through, it
      +  // doesn't fall through.
      +  if (Cond.empty()) return false;
      +
      +  // Otherwise, if it is conditional and has no explicit false block, it falls
      +  // through.
      +  return FBB == 0;
      +}
      +
       /// removeFromParent - This method unlinks 'this' from the containing function,
       /// and returns it, but does not delete it.
       MachineBasicBlock *MachineBasicBlock::removeFromParent() {
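
The order of checks in MachineBasicBlock::canFallThrough() above can be followed without the rest of LLVM by reducing its inputs to plain booleans. The sketch below is only an illustration of that decision order: BranchFacts and canFallThroughModel are hypothetical names that do not exist in LLVM, and each field stands in for what AnalyzeBranch, isSuccessor, and the instruction descriptor report in the real code.

```cpp
#include <cassert>

// Hypothetical summary of the facts the real function queries.
// None of these names exist in LLVM; they are assumptions for illustration.
struct BranchFacts {
  bool HasLayoutSuccessor;        // block is not the last one in the function
  bool LayoutSuccIsCFGSucc;       // isSuccessor(next block in layout)
  bool Unanalyzable;              // AnalyzeBranch returned true
  bool EndsInUnpredicatedBarrier; // non-empty, last instr isBarrier && !isPredicable
  bool HasTBB;                    // AnalyzeBranch found a taken target
  bool HasFBB;                    // AnalyzeBranch found an explicit false target
  bool TBBIsLayoutSucc;           // taken target is the next block in layout
  bool FBBIsLayoutSucc;           // false target is the next block in layout
  bool CondEmpty;                 // branch condition list is empty (unconditional)
};

// Mirrors the order of checks in the patch; "true" is the conservative answer.
bool canFallThroughModel(const BranchFacts &F) {
  // Nothing follows the block, or the next block is not a successor.
  if (!F.HasLayoutSuccessor)
    return false;
  if (!F.LayoutSuccIsCFGSucc)
    return false;

  // Unanalyzable branch: assume fallthrough unless the block ends in a
  // barrier that cannot have been predicated.
  if (F.Unanalyzable)
    return !F.EndsInUnpredicatedBarrier;

  // No branch at all: control always falls through.
  if (!F.HasTBB)
    return true;

  // An explicit branch to the layout successor still reaches it.
  if (F.TBBIsLayoutSucc || F.FBBIsLayoutSucc)
    return true;

  // Unconditional branch to some other block: no fallthrough.
  if (F.CondEmpty)
    return false;

  // Conditional branch: falls through only when there is no false block.
  return !F.HasFBB;
}
```

Keeping "true" as the answer for the unanalyzable case matches the comment in the header: callers such as tail duplication skip blocks that might fall through, so overstating fallthrough costs an optimization but not correctness.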
      
      Added: llvm/trunk/lib/CodeGen/TailDuplication.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TailDuplication.cpp?rev=89904&view=auto
      
      ==============================================================================
      --- llvm/trunk/lib/CodeGen/TailDuplication.cpp (added)
      +++ llvm/trunk/lib/CodeGen/TailDuplication.cpp Wed Nov 25 18:32:21 2009
      @@ -0,0 +1,250 @@
      +//===-- TailDuplication.cpp - Duplicate blocks into predecessors' tails ---===//
      +//
      +//                     The LLVM Compiler Infrastructure
      +//
      +// This file is distributed under the University of Illinois Open Source
      +// License. See LICENSE.TXT for details.
      +//
      +//===----------------------------------------------------------------------===//
      +//
      +// This pass duplicates basic blocks ending in unconditional branches into
      +// the tails of their predecessors.
      +//
      +//===----------------------------------------------------------------------===//
      +
      +#define DEBUG_TYPE "tailduplication"
      +#include "llvm/Function.h"
      +#include "llvm/CodeGen/Passes.h"
      +#include "llvm/CodeGen/MachineModuleInfo.h"
      +#include "llvm/CodeGen/MachineFunctionPass.h"
      +#include "llvm/Target/TargetInstrInfo.h"
      +#include "llvm/Support/CommandLine.h"
      +#include "llvm/Support/Debug.h"
      +#include "llvm/Support/raw_ostream.h"
      +#include "llvm/ADT/SmallSet.h"
      +#include "llvm/ADT/SetVector.h"
      +#include "llvm/ADT/Statistic.h"
      +using namespace llvm;
      +
      +STATISTIC(NumTailDups  , "Number of tail duplicated blocks");
      +STATISTIC(NumInstrDups , "Additional instructions due to tail duplication");
      +STATISTIC(NumDeadBlocks, "Number of dead blocks removed");
      +
      +// Heuristic for tail duplication.
      +static cl::opt<unsigned>
      +TailDuplicateSize("tail-dup-size",
      +                  cl::desc("Maximum instructions to consider tail duplicating"),
      +                  cl::init(2), cl::Hidden);
      +
      +namespace {
      +  /// TailDuplicationPass - Perform tail duplication.
      +  class TailDuplicationPass : public MachineFunctionPass {
      +    const TargetInstrInfo *TII;
      +    MachineModuleInfo *MMI;
      +
      +  public:
      +    static char ID;
      +    explicit TailDuplicationPass() : MachineFunctionPass(&ID) {}
      +
      +    virtual bool runOnMachineFunction(MachineFunction &MF);
      +    virtual const char *getPassName() const { return "Tail Duplication"; }
      +
      +  private:
      +    bool TailDuplicateBlocks(MachineFunction &MF);
      +    bool TailDuplicate(MachineBasicBlock *TailBB, MachineFunction &MF);
      +    void RemoveDeadBlock(MachineBasicBlock *MBB);
      +  };
      +
      +  char TailDuplicationPass::ID = 0;
      +}
      +
      +FunctionPass *llvm::createTailDuplicationPass() {
      +  return new TailDuplicationPass();
      +}
      +
      +bool TailDuplicationPass::runOnMachineFunction(MachineFunction &MF) {
      +  TII = MF.getTarget().getInstrInfo();
      +  MMI = getAnalysisIfAvailable<MachineModuleInfo>();
      +
      +  bool MadeChange = false;
      +  bool MadeChangeThisIteration = true;
      +  while (MadeChangeThisIteration) {
      +    MadeChangeThisIteration = false;
      +    MadeChangeThisIteration |= TailDuplicateBlocks(MF);
      +    MadeChange |= MadeChangeThisIteration;
      +  }
      +
      +  return MadeChange;
      +}
      +
      +/// TailDuplicateBlocks - Look for small blocks that are unconditionally
      +/// branched to and do not fall through. Tail-duplicate their instructions
      +/// into their predecessors to eliminate (dynamic) branches.
      +bool TailDuplicationPass::TailDuplicateBlocks(MachineFunction &MF) {
      +  bool MadeChange = false;
      +
      +  for (MachineFunction::iterator I = ++MF.begin(), E = MF.end(); I != E; ) {
      +    MachineBasicBlock *MBB = I++;
      +
      +    // Only duplicate blocks that end with unconditional branches.
      +    if (MBB->canFallThrough())
      +      continue;
      +
      +    MadeChange |= TailDuplicate(MBB, MF);
      +
      +    // If it is dead, remove it.
      +    if (MBB->pred_empty()) {
      +      NumInstrDups -= MBB->size();
      +      RemoveDeadBlock(MBB);
      +      MadeChange = true;
      +      ++NumDeadBlocks;
      +    }
      +  }
      +  return MadeChange;
      +}
      +
      +/// TailDuplicate - If it is profitable, duplicate TailBB's contents in each
      +/// of its predecessors.
      +bool TailDuplicationPass::TailDuplicate(MachineBasicBlock *TailBB,
      +                                        MachineFunction &MF) {
      +  // Don't try to tail-duplicate single-block loops.
      +  if (TailBB->isSuccessor(TailBB))
      +    return false;
      +
      +  // Set the limit on the number of instructions to duplicate, with a default
      +  // of one less than the tail-merge threshold. When optimizing for size,
      +  // duplicate only one, because one branch instruction can be eliminated to
      +  // compensate for the duplication.
      +  unsigned MaxDuplicateCount;
      +  if (MF.getFunction()->hasFnAttr(Attribute::OptimizeForSize))
      +    MaxDuplicateCount = 1;
      +  else if (TII->isProfitableToDuplicateIndirectBranch() &&
      +           !TailBB->empty() && TailBB->back().getDesc().isIndirectBranch())
      +    // If the target has hardware branch prediction that can handle indirect
      +    // branches, duplicating them can often make them predictable when there
      +    // are common paths through the code.  The limit needs to be high enough
      +    // to allow undoing the effects of tail merging.
      +    MaxDuplicateCount = 20;
      +  else
      +    MaxDuplicateCount = TailDuplicateSize;
      +
      +  // Check the instructions in the block to determine whether tail-duplication
      +  // is invalid or unlikely to be profitable.
      +  unsigned i = 0;
      +  bool HasCall = false;
      +  for (MachineBasicBlock::iterator I = TailBB->begin();
      +       I != TailBB->end(); ++I, ++i) {
      +    // Non-duplicable things shouldn't be tail-duplicated.
      +    if (I->getDesc().isNotDuplicable()) return false;
      +    // Don't duplicate more than the threshold.
      +    if (i == MaxDuplicateCount) return false;
      +    // Remember if we saw a call.
      +    if (I->getDesc().isCall()) HasCall = true;
      +  }
      +  // Heuristically, don't tail-duplicate calls if it would expand code size,
      +  // as it's less likely to be worth the extra cost.
      +  if (i > 1 && HasCall)
      +    return false;
      +
      +  // Iterate through all the unique predecessors and tail-duplicate this
      +  // block into them, if possible. Copying the list ahead of time also
      +  // avoids trouble with the predecessor list reallocating.
      +  bool Changed = false;
      +  SmallSetVector<MachineBasicBlock *, 8> Preds(TailBB->pred_begin(),
      +                                               TailBB->pred_end());
      +  for (SmallSetVector<MachineBasicBlock *, 8>::iterator PI = Preds.begin(),
      +       PE = Preds.end(); PI != PE; ++PI) {
      +    MachineBasicBlock *PredBB = *PI;
      +
      +    assert(TailBB != PredBB &&
      +           "Single-block loop should have been rejected earlier!");
      +    if (PredBB->succ_size() > 1) continue;
      +
      +    MachineBasicBlock *PredTBB, *PredFBB;
      +    SmallVector<MachineOperand, 4> PredCond;
      +    if (TII->AnalyzeBranch(*PredBB, PredTBB, PredFBB, PredCond, true))
      +      continue;
      +    if (!PredCond.empty())
      +      continue;
      +    // EH edges are ignored by AnalyzeBranch.
      +    if (PredBB->succ_size() != 1)
      +      continue;
      +    // Don't duplicate into a fall-through predecessor (at least for now).
      +    if (PredBB->isLayoutSuccessor(TailBB) && PredBB->canFallThrough())
      +      continue;
      +
      +    DEBUG(errs() << "\nTail-duplicating into PredBB: " << *PredBB
      +                 << "From Succ: " << *TailBB);
      +
      +    // Remove PredBB's unconditional branch.
      +    TII->RemoveBranch(*PredBB);
      +    // Clone the contents of TailBB into PredBB.
      +    for (MachineBasicBlock::iterator I = TailBB->begin(), E = TailBB->end();
      +         I != E; ++I) {
      +      MachineInstr *NewMI = MF.CloneMachineInstr(I);
      +      PredBB->insert(PredBB->end(), NewMI);
      +    }
      +    NumInstrDups += TailBB->size() - 1; // subtract one for removed branch
      +
      +    // Update the CFG.
      +    PredBB->removeSuccessor(PredBB->succ_begin());
      +    assert(PredBB->succ_empty() &&
      +           "TailDuplicate called on block with multiple successors!");
      +    for (MachineBasicBlock::succ_iterator I = TailBB->succ_begin(),
      +         E = TailBB->succ_end(); I != E; ++I)
      +       PredBB->addSuccessor(*I);
      +
      +    Changed = true;
      +    ++NumTailDups;
      +  }
      +
      +  // If TailBB was duplicated into all its predecessors except for the prior
      +  // block, which falls through unconditionally, move the contents of this
      +  // block into the prior block.
      +  MachineBasicBlock &PrevBB = *prior(MachineFunction::iterator(TailBB));
      +  MachineBasicBlock *PriorTBB = 0, *PriorFBB = 0;
      +  SmallVector<MachineOperand, 4> PriorCond;
      +  bool PriorUnAnalyzable =
      +    TII->AnalyzeBranch(PrevBB, PriorTBB, PriorFBB, PriorCond, true);
      +  // This has to check PrevBB->succ_size() because EH edges are ignored by
      +  // AnalyzeBranch.
      +  if (!PriorUnAnalyzable && PriorCond.empty() && !PriorTBB &&
      +      TailBB->pred_size() == 1 && PrevBB.succ_size() == 1 &&
      +      !TailBB->hasAddressTaken()) {
      +    DEBUG(errs() << "\nMerging into block: " << PrevBB
      +          << "From MBB: " << *TailBB);
      +    PrevBB.splice(PrevBB.end(), TailBB, TailBB->begin(), TailBB->end());
      +    PrevBB.removeSuccessor(PrevBB.succ_begin());;
      +    assert(PrevBB.succ_empty());
      +    PrevBB.transferSuccessors(TailBB);
      +    Changed = true;
      +  }
      +
      +  return Changed;
      +}
      +
      +/// RemoveDeadBlock - Remove the specified dead machine basic block from the
      +/// function, updating the CFG.
      +void TailDuplicationPass::RemoveDeadBlock(MachineBasicBlock *MBB) {
      +  assert(MBB->pred_empty() && "MBB must be dead!");
      +  DEBUG(errs() << "\nRemoving MBB: " << *MBB);
      +
      +  // Remove all successors.
      +  while (!MBB->succ_empty())
      +    MBB->removeSuccessor(MBB->succ_end()-1);
      +
      +  // If there are any labels in the basic block, unregister them from
      +  // MachineModuleInfo.
      +  if (MMI && !MBB->empty()) {
      +    for (MachineBasicBlock::iterator I = MBB->begin(), E = MBB->end();
      +         I != E; ++I) {
      +      if (I->isLabel())
      +        // The label ID # is always operand #0, an immediate.
      +        MMI->InvalidateLabel(I->getOperand(0).getImm());
      +    }
      +  }
      +
      +  // Remove the block.
      +  MBB->eraseFromParent();
      +}
      +
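
The size limit and instruction scan at the top of TailDuplicate() in the new file can be modelled in isolation. The following is a minimal sketch under stated assumptions, not the pass itself: InstrFacts, pickMaxDuplicateCount, and passesInstructionScan are hypothetical names, the default of 2 mirrors the cl::init(2) on tail-dup-size in the patch, and the single-block-loop and predecessor checks are left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-instruction facts; in the pass these come from
// I->getDesc().isNotDuplicable() and I->getDesc().isCall().
struct InstrFacts {
  bool NotDuplicable;
  bool IsCall;
};

// Chooses the duplication limit in the same order as the patch:
// optimize-for-size first, then the indirect-branch special case,
// then the tail-dup-size option (assumed default: 2).
unsigned pickMaxDuplicateCount(bool OptForSize, bool TargetLikesIndirectDup,
                               bool EndsInIndirectBranch,
                               unsigned TailDupSize = 2) {
  if (OptForSize)
    return 1;
  if (TargetLikesIndirectDup && EndsInIndirectBranch)
    return 20;
  return TailDupSize;
}

// Returns true when the block survives the scan: no non-duplicable
// instruction, at most Max instructions, and no call unless the block
// is a single instruction.
bool passesInstructionScan(const std::vector<InstrFacts> &Block, unsigned Max) {
  unsigned Count = 0;
  bool HasCall = false;
  for (const InstrFacts &I : Block) {
    if (I.NotDuplicable)
      return false;
    if (Count == Max) // this would be instruction number Max + 1
      return false;
    if (I.IsCall)
      HasCall = true;
    ++Count;
  }
  // Duplicating a call only pays off when it does not grow the code.
  return !(Count > 1 && HasCall);
}
```

With the default limit of 2, a two-instruction block without calls passes, a three-instruction block is rejected, and a lone call passes because the removed branch offsets it.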
      
      
      
      
      From evan.cheng at apple.com  Wed Nov 25 18:32:36 2009
      From: evan.cheng at apple.com (Evan Cheng)
      Date: Thu, 26 Nov 2009 00:32:36 -0000
      Subject: [llvm-commits] [llvm] r89905 -
      	/llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp
      Message-ID: <200911260032.nAQ0WaRx020738@zion.cs.uiuc.edu>
      
      Author: evancheng
      Date: Wed Nov 25 18:32:36 2009
      New Revision: 89905
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89905&view=rev
      Log:
      When all defs of a vr are implicit_def, delete all of the defs.
      
      Modified:
          llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp
      
      Modified: llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp?rev=89905&r1=89904&r2=89905&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp (original)
      +++ llvm/trunk/lib/CodeGen/ProcessImplicitDefs.cpp Wed Nov 25 18:32:36 2009
      @@ -183,19 +183,23 @@
             // is not an implicit_def, do not insert implicit_def's before the
             // uses.
             bool Skip = false;
      +      SmallVector DeadImpDefs;
             for (MachineRegisterInfo::def_iterator DI = mri_->def_begin(Reg),
                    DE = mri_->def_end(); DI != DE; ++DI) {
      -        if (DI->getOpcode() != TargetInstrInfo::IMPLICIT_DEF) {
      +        MachineInstr *DeadImpDef = &*DI;
      +        if (DeadImpDef->getOpcode() != TargetInstrInfo::IMPLICIT_DEF) {
                 Skip = true;
                 break;
               }
      +        DeadImpDefs.push_back(DeadImpDef);
             }
             if (Skip)
               continue;
       
             // The only implicit_def which we want to keep are those that are live
             // out of its block.
      -      MI->eraseFromParent();
      +      for (unsigned j = 0, ee = DeadImpDefs.size(); j != ee; ++j)
      +        DeadImpDefs[j]->eraseFromParent();
             Changed = true;
       
             // Process each use instruction once.
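
The change above first collects every implicit_def of the register in a vector and erases them only once the def walk has finished. A minimal standalone sketch of that collect-then-erase pattern follows, using std::list in place of the machine instruction list. The `Instr` struct and the `eraseIfAllImplicitDefs` name are hypothetical stand-ins for illustration, not LLVM API.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical stand-in for a machine instruction; not an LLVM type.
struct Instr {
  int Reg;            // register defined by this instruction
  bool IsImplicitDef; // true if this models an IMPLICIT_DEF
};

// If every def of Reg is an implicit def, erase all of them.
// Returns true if anything was erased; otherwise the list is left untouched.
bool eraseIfAllImplicitDefs(std::list<Instr> &Block, int Reg) {
  std::vector<std::list<Instr>::iterator> Dead;
  for (auto I = Block.begin(), E = Block.end(); I != E; ++I) {
    if (I->Reg != Reg)
      continue;
    if (!I->IsImplicitDef)
      return false; // a real def exists, so keep everything
    Dead.push_back(I);
  }
  // Erase only after the walk is complete, so the traversal never steps
  // through an instruction that has already been removed.
  for (auto It : Dead)
    Block.erase(It);
  return !Dead.empty();
}
```

Deferring the erase keeps the "is any def a real def?" decision separate from the mutation, which is the same split the patch makes with DeadImpDefs.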
      
      
      
      
      From evan.cheng at apple.com  Wed Nov 25 18:35:01 2009
      From: evan.cheng at apple.com (Evan Cheng)
      Date: Thu, 26 Nov 2009 00:35:01 -0000
      Subject: [llvm-commits] [llvm] r89906 -
      	/llvm/trunk/test/CodeGen/X86/2009-11-25-ImpDefBug.ll
      Message-ID: <200911260035.nAQ0Z1mo020843@zion.cs.uiuc.edu>
      
      Author: evancheng
      Date: Wed Nov 25 18:35:01 2009
      New Revision: 89906
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89906&view=rev
      Log:
      Test for 89905.
      
      Added:
          llvm/trunk/test/CodeGen/X86/2009-11-25-ImpDefBug.ll
      
      Added: llvm/trunk/test/CodeGen/X86/2009-11-25-ImpDefBug.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2009-11-25-ImpDefBug.ll?rev=89906&view=auto
      
      ==============================================================================
      --- llvm/trunk/test/CodeGen/X86/2009-11-25-ImpDefBug.ll (added)
      +++ llvm/trunk/test/CodeGen/X86/2009-11-25-ImpDefBug.ll Wed Nov 25 18:35:01 2009
      @@ -0,0 +1,116 @@
      +; RUN: llc < %s -mtriple=x86_64-unknown-linux-gnu
      +; pr5600
      +
      +%struct..0__pthread_mutex_s = type { i32, i32, i32, i32, i32, i32, %struct.__pthread_list_t }
      +%struct.ASN1ObjHeader = type { i8, %"struct.__gmp_expr<__mpz_struct [1],__mpz_struct [1]>", i64, i32, i32, i32 }
      +%struct.ASN1Object = type { i32 (...)**, i32, i32, i64 }
      +%struct.ASN1Unit = type { [4 x i32 (%struct.ASN1ObjHeader*, %struct.ASN1Object**)*], %"struct.std::ASN1ObjList" }
      +%"struct.__gmp_expr<__mpz_struct [1],__mpz_struct [1]>" = type { [1 x %struct.__mpz_struct] }
      +%struct.__mpz_struct = type { i32, i32, i64* }
      +%struct.__pthread_list_t = type { %struct.__pthread_list_t*, %struct.__pthread_list_t* }
      +%struct.pthread_attr_t = type { i64, [48 x i8] }
      +%struct.pthread_mutex_t = type { %struct..0__pthread_mutex_s }
      +%struct.pthread_mutexattr_t = type { i32 }
      +%"struct.std::ASN1ObjList" = type { %"struct.std::_Vector_base >" }
      +%"struct.std::_Vector_base >" = type { %"struct.std::_Vector_base >::_Vector_impl" }
      +%"struct.std::_Vector_base >::_Vector_impl" = type { %struct.ASN1Object**, %struct.ASN1Object**, %struct.ASN1Object** }
      +%struct.xmstream = type { i8*, i64, i64, i64, i8 }
      +
      +declare void @_ZNSt6vectorIP10ASN1ObjectSaIS1_EE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPS1_S3_EERKS1_(%"struct.std::ASN1ObjList"* nocapture, i64, %struct.ASN1Object** nocapture)
      +
      +declare i32 @_Z17LoadObjectFromBERR8xmstreamPP10ASN1ObjectPPF10ASN1StatusP13ASN1ObjHeaderS3_E(%struct.xmstream*, %struct.ASN1Object**, i32 (%struct.ASN1ObjHeader*, %struct.ASN1Object**)**)
      +
      +define i32 @_ZN8ASN1Unit4loadER8xmstreamjm18ASN1LengthEncoding(%struct.ASN1Unit* %this, %struct.xmstream* nocapture %stream, i32 %numObjects, i64 %size, i32 %lEncoding) {
      +entry:
      +  br label %meshBB85
      +
      +bb5:                                              ; preds = %bb13.fragment.cl135, %bb13.fragment.cl, %bb.i.i.bbcl.disp, %bb13.fragment
      +  %0 = invoke i32 @_Z17LoadObjectFromBERR8xmstreamPP10ASN1ObjectPPF10ASN1StatusP13ASN1ObjHeaderS3_E(%struct.xmstream* undef, %struct.ASN1Object** undef, i32 (%struct.ASN1ObjHeader*, %struct.ASN1Object**)** undef)
      +          to label %meshBB81.bbcl.disp unwind label %lpad ;  [#uses=0]
      +
      +bb10.fragment:                                    ; preds = %bb13.fragment.bbcl.disp
      +  br i1 undef, label %bb1.i.fragment.bbcl.disp, label %bb.i.i.bbcl.disp
      +
      +bb1.i.fragment:                                   ; preds = %bb1.i.fragment.bbcl.disp
      +  invoke void @_ZNSt6vectorIP10ASN1ObjectSaIS1_EE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPS1_S3_EERKS1_(%"struct.std::ASN1ObjList"* undef, i64 undef, %struct.ASN1Object** undef)
      +          to label %meshBB81.bbcl.disp unwind label %lpad
      +
      +bb13.fragment:                                    ; preds = %bb13.fragment.bbcl.disp
      +  br i1 undef, label %meshBB81.bbcl.disp, label %bb5
      +
      +bb.i4:                                            ; preds = %bb.i4.bbcl.disp, %bb1.i.fragment.bbcl.disp
      +  ret i32 undef
      +
      +bb1.i5:                                           ; preds = %bb.i1
      +  ret i32 undef
      +
      +lpad:                                             ; preds = %bb1.i.fragment.cl, %bb1.i.fragment, %bb5
      +  %.SV10.phi807 = phi i8* [ undef, %bb1.i.fragment.cl ], [ undef, %bb1.i.fragment ], [ undef, %bb5 ] ;  [#uses=1]
      +  %1 = load i8* %.SV10.phi807, align 8            ;  [#uses=0]
      +  br i1 undef, label %meshBB81.bbcl.disp, label %bb13.fragment.bbcl.disp
      +
      +bb.i1:                                            ; preds = %bb.i.i.bbcl.disp
      +  br i1 undef, label %meshBB81.bbcl.disp, label %bb1.i5
      +
      +meshBB81:                                         ; preds = %meshBB81.bbcl.disp, %bb.i.i.bbcl.disp
      +  br i1 undef, label %meshBB81.bbcl.disp, label %bb.i4.bbcl.disp
      +
      +meshBB85:                                         ; preds = %meshBB81.bbcl.disp, %bb.i4.bbcl.disp, %bb1.i.fragment.bbcl.disp, %bb.i.i.bbcl.disp, %entry
      +  br i1 undef, label %meshBB81.bbcl.disp, label %bb13.fragment.bbcl.disp
      +
      +bb.i.i.bbcl.disp:                                 ; preds = %bb10.fragment
      +  switch i8 undef, label %meshBB85 [
      +    i8 123, label %bb.i1
      +    i8 97, label %bb5
      +    i8 44, label %meshBB81
      +    i8 1, label %meshBB81.cl
      +    i8 51, label %meshBB81.cl141
      +  ]
      +
      +bb1.i.fragment.cl:                                ; preds = %bb1.i.fragment.bbcl.disp
      +  invoke void @_ZNSt6vectorIP10ASN1ObjectSaIS1_EE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPS1_S3_EERKS1_(%"struct.std::ASN1ObjList"* undef, i64 undef, %struct.ASN1Object** undef)
      +          to label %meshBB81.bbcl.disp unwind label %lpad
      +
      +bb1.i.fragment.bbcl.disp:                         ; preds = %bb10.fragment
      +  switch i8 undef, label %bb.i4 [
      +    i8 97, label %bb1.i.fragment
      +    i8 7, label %bb1.i.fragment.cl
      +    i8 35, label %bb.i4.cl
      +    i8 77, label %meshBB85
      +  ]
      +
      +bb13.fragment.cl:                                 ; preds = %bb13.fragment.bbcl.disp
      +  br i1 undef, label %meshBB81.bbcl.disp, label %bb5
      +
      +bb13.fragment.cl135:                              ; preds = %bb13.fragment.bbcl.disp
      +  br i1 undef, label %meshBB81.bbcl.disp, label %bb5
      +
      +bb13.fragment.bbcl.disp:                          ; preds = %meshBB85, %lpad
      +  switch i8 undef, label %bb10.fragment [
      +    i8 67, label %bb13.fragment.cl
      +    i8 108, label %bb13.fragment
      +    i8 58, label %bb13.fragment.cl135
      +  ]
      +
      +bb.i4.cl:                                         ; preds = %bb.i4.bbcl.disp, %bb1.i.fragment.bbcl.disp
      +  ret i32 undef
      +
      +bb.i4.bbcl.disp:                                  ; preds = %meshBB81.cl141, %meshBB81.cl, %meshBB81
      +  switch i8 undef, label %bb.i4 [
      +    i8 35, label %bb.i4.cl
      +    i8 77, label %meshBB85
      +  ]
      +
      +meshBB81.cl:                                      ; preds = %meshBB81.bbcl.disp, %bb.i.i.bbcl.disp
      +  br i1 undef, label %meshBB81.bbcl.disp, label %bb.i4.bbcl.disp
      +
      +meshBB81.cl141:                                   ; preds = %meshBB81.bbcl.disp, %bb.i.i.bbcl.disp
      +  br i1 undef, label %meshBB81.bbcl.disp, label %bb.i4.bbcl.disp
      +
      +meshBB81.bbcl.disp:                               ; preds = %meshBB81.cl141, %meshBB81.cl, %bb13.fragment.cl135, %bb13.fragment.cl, %bb1.i.fragment.cl, %meshBB85, %meshBB81, %bb.i1, %lpad, %bb13.fragment, %bb1.i.fragment, %bb5
      +  switch i8 undef, label %meshBB85 [
      +    i8 44, label %meshBB81
      +    i8 1, label %meshBB81.cl
      +    i8 51, label %meshBB81.cl141
      +  ]
      +}
      
      
      
      
      From bob.wilson at apple.com  Wed Nov 25 18:45:26 2009
      From: bob.wilson at apple.com (Bob Wilson)
      Date: Wed, 25 Nov 2009 16:45:26 -0800
      Subject: [llvm-commits] [llvm] r89877 -
      	/llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h
      In-Reply-To: <6CB2B61C-657A-4260-9717-1267F605009E@apple.com>
      References: <200911251957.nAPJvEma011366@zion.cs.uiuc.edu>
      	<6CB2B61C-657A-4260-9717-1267F605009E@apple.com>
      Message-ID: <20AB4AAE-93CF-46FB-8788-F53C1107E490@apple.com>
      
      
      On Nov 25, 2009, at 4:18 PM, Chris Lattner wrote:
      
      > 
      > On Nov 25, 2009, at 11:57 AM, Bob Wilson wrote:
      > 
      >> Author: bwilson
      >> Date: Wed Nov 25 13:57:14 2009
      >> New Revision: 89877
      >> 
      >> URL: http://llvm.org/viewvc/llvm-project?rev=89877&view=rev
      >> Log:
      >> Tail duplicate indirect branches for PowerPC, too.
      >> With the testcase for pr3120, the "threaded interpreter" runtime decreases
      >> from 1788 to 1413 with this change.
      > 
      > Very nice Bob!  Silly question: should isProfitableToDuplicateIndirectBranch default to true?  That would let targets opt out if it is not beneficial and would save a small bit of code size.
      
      Not silly at all.... I had the same thought.
      
      Having spent more time looking at the effects of this transformation, I have a different proposal that I think you might like.  Let's just get rid of that target hook altogether.  ;-)
      
      This special treatment of indirect branches for tail duplication just doesn't kick in very often.  I was being cautious about blowing up code size, but it doesn't happen often enough to matter.  I also wasn't sure if this would matter on anything besides ARM Cortex processors but now we know that it does.
      
      On a related note, Evan suggested that we be more aggressive about this with -Os, and I think that's also a good idea.
      
      Unless someone objects, I'll go ahead with those changes sometime soon.
      
      
      From sabre at nondot.org  Wed Nov 25 19:50:12 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 01:50:12 -0000
      Subject: [llvm-commits] [llvm] r89912 -
      	/llvm/trunk/lib/Analysis/ValueTracking.cpp
      Message-ID: <200911260150.nAQ1oCXa023320@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Wed Nov 25 19:50:12 2009
      New Revision: 89912
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89912&view=rev
      Log:
      remove some redundant braces
      
      Modified:
          llvm/trunk/lib/Analysis/ValueTracking.cpp
      
      Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=89912&r1=89911&r2=89912&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original)
      +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Wed Nov 25 19:50:12 2009
      @@ -833,14 +833,12 @@
       
         switch (I->getOpcode()) {
         default: break;
      -  case Instruction::SExt: {
      +  case Instruction::SExt:
           if (!LookThroughSExt) return false;
           // otherwise fall through to ZExt
      -  }
      -  case Instruction::ZExt: {
      +  case Instruction::ZExt:
           return ComputeMultiple(I->getOperand(0), Base, Multiple,
                                  LookThroughSExt, Depth+1);
      -  }
         case Instruction::Shl:
         case Instruction::Mul: {
           Value *Op0 = I->getOperand(0);
      
      
      
      
      From sabre at nondot.org  Wed Nov 25 19:51:19 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 01:51:19 -0000
      Subject: [llvm-commits] [llvm] r89913 - /llvm/trunk/lib/Target/README.txt
      Message-ID: <200911260151.nAQ1pJP7023358@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Wed Nov 25 19:51:18 2009
      New Revision: 89913
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89913&view=rev
      Log:
      update some notes slightly
      
      Modified:
          llvm/trunk/lib/Target/README.txt
      
      Modified: llvm/trunk/lib/Target/README.txt
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/README.txt?rev=89913&r1=89912&r2=89913&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Target/README.txt (original)
      +++ llvm/trunk/lib/Target/README.txt Wed Nov 25 19:51:18 2009
      @@ -220,7 +220,7 @@
       ... which would only do one 32-bit XOR per loop iteration instead of two.
       
       It would also be nice to recognize the reg->size doesn't alias reg->node[i], but
      -alas.
      +this requires TBAA.
       
       //===---------------------------------------------------------------------===//
       
      @@ -280,6 +280,9 @@
         return count;
       }
       
      +This is a form of idiom recognition for loops, the same thing that could be
      +useful for recognizing memset/memcpy.
      +
       //===---------------------------------------------------------------------===//
       
       These should turn into single 16-bit (unaligned?) loads on little/big endian
      @@ -343,7 +346,7 @@
       
       //===---------------------------------------------------------------------===//
       
      -LSR should know what GPR types a target has.  This code:
      +LSR should know what GPR types a target has from TargetData.  This code:
       
       volatile short X, Y; // globals
       
      @@ -369,7 +372,6 @@
       
       LSR should reuse the "+" IV for the exit test.
       
      -
       //===---------------------------------------------------------------------===//
       
       Tail call elim should be more aggressive, checking to see if the call is
      
      
      
      
      From sabre at nondot.org  Wed Nov 25 20:11:09 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 02:11:09 -0000
      Subject: [llvm-commits] [llvm] r89914 -
      	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      Message-ID: <200911260211.nAQ2B9FU024092@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Wed Nov 25 20:11:08 2009
      New Revision: 89914
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89914&view=rev
      Log:
      Use GEPOperator more pervasively to simplify code.
      
      
      Modified:
          llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      
      Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89914&r1=89913&r2=89914&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Wed Nov 25 20:11:08 2009
      @@ -38,24 +38,24 @@
       // Useful predicates
       //===----------------------------------------------------------------------===//
       
      -static const Value *GetGEPOperands(const Value *V, 
      +static const Value *GetGEPOperands(const GEPOperator *V, 
                                          SmallVector &GEPOps) {
         assert(GEPOps.empty() && "Expect empty list to populate!");
      -  GEPOps.insert(GEPOps.end(), cast<User>(V)->op_begin()+1,
      -                cast<User>(V)->op_end());
      +  GEPOps.insert(GEPOps.end(), V->op_begin()+1, V->op_end());
       
      -  // Accumulate all of the chained indexes into the operand array
      -  V = cast<User>(V)->getOperand(0);
      -
      -  while (const GEPOperator *G = dyn_cast<GEPOperator>(V)) {
      -    if (!isa<Constant>(GEPOps[0]) || isa<GlobalValue>(GEPOps[0]) ||
      -        !cast<Constant>(GEPOps[0])->isNullValue())
      -      break;  // Don't handle folding arbitrary pointer offsets yet...
      +  // Accumulate all of the chained indexes into the operand array.
      +  Value *BasePtr = V->getOperand(0);
      +  while (1) {
      +    V = dyn_cast<GEPOperator>(BasePtr);
      +    if (V == 0) return BasePtr;
      +    
      +    // Don't handle folding arbitrary pointer offsets yet.
      +    if (!isa<Constant>(GEPOps[0]) || !cast<Constant>(GEPOps[0])->isNullValue())
      +      return BasePtr;
      +    
           GEPOps.erase(GEPOps.begin());   // Drop the zero index
      -    GEPOps.insert(GEPOps.begin(), G->op_begin()+1, G->op_end());
      -    V = G->getOperand(0);
      +    GEPOps.insert(GEPOps.begin(), V->op_begin()+1, V->op_end());
         }
      -  return V;
       }
       
       /// isKnownNonNull - Return true if we know that the specified value is never
      @@ -219,7 +219,7 @@
       
           // aliasGEP - Provide a bunch of ad-hoc rules to disambiguate a GEP
           // instruction against another.
      -    AliasResult aliasGEP(const Value *V1, unsigned V1Size,
      +    AliasResult aliasGEP(const GEPOperator *V1, unsigned V1Size,
                                const Value *V2, unsigned V2Size);
       
           // aliasPHI - Provide a bunch of ad-hoc rules to disambiguate a PHI
      @@ -405,21 +405,19 @@
         return NoAA::getModRefInfo(CS1, CS2);
       }
       
      -// aliasGEP - Provide a bunch of ad-hoc rules to disambiguate a GEP instruction
      -// against another.
      -//
      +/// aliasGEP - Provide a bunch of ad-hoc rules to disambiguate a GEP instruction
      +/// against another pointer.  We know that V1 is a GEP, but we don't know
      +/// anything about V2.
      +///
       AliasAnalysis::AliasResult
      -BasicAliasAnalysis::aliasGEP(const Value *V1, unsigned V1Size,
      +BasicAliasAnalysis::aliasGEP(const GEPOperator *GEP1, unsigned V1Size,
                                    const Value *V2, unsigned V2Size) {
         // If we have two gep instructions with must-alias'ing base pointers, figure
         // out if the indexes to the GEP tell us anything about the derived pointer.
         // Note that we also handle chains of getelementptr instructions as well as
         // constant expression getelementptrs here.
         //
      -  if (isa<GEPOperator>(V1) && isa<GEPOperator>(V2)) {
      -    const User *GEP1 = cast<User>(V1);
      -    const User *GEP2 = cast<User>(V2);
      -    
      +  if (const GEPOperator *GEP2 = dyn_cast<GEPOperator>(V2)) {
           // If V1 and V2 are identical GEPs, just recurse down on both of them.
           // This allows us to analyze things like:
           //   P = gep A, 0, i, 1
      @@ -438,13 +436,13 @@
           while (isa(GEP1->getOperand(0)) &&
                  GEP1->getOperand(1) ==
                  Constant::getNullValue(GEP1->getOperand(1)->getType()))
      -      GEP1 = cast<User>(GEP1->getOperand(0));
      +      GEP1 = cast<GEPOperator>(GEP1->getOperand(0));
           const Value *BasePtr1 = GEP1->getOperand(0);
       
           while (isa(GEP2->getOperand(0)) &&
                  GEP2->getOperand(1) ==
                  Constant::getNullValue(GEP2->getOperand(1)->getType()))
      -      GEP2 = cast<User>(GEP2->getOperand(0));
      +      GEP2 = cast<GEPOperator>(GEP2->getOperand(0));
           const Value *BasePtr2 = GEP2->getOperand(0);
       
           // Do the base pointers alias?
      @@ -457,8 +455,8 @@
       
             // Collect all of the chained GEP operands together into one simple place
             SmallVector GEP1Ops, GEP2Ops;
      -      BasePtr1 = GetGEPOperands(V1, GEP1Ops);
      -      BasePtr2 = GetGEPOperands(V2, GEP2Ops);
      +      BasePtr1 = GetGEPOperands(GEP1, GEP1Ops);
      +      BasePtr2 = GetGEPOperands(GEP2, GEP2Ops);
       
             // If GetGEPOperands were able to fold to the same must-aliased pointer,
             // do the comparison.
      @@ -482,7 +480,7 @@
           return MayAlias;
       
         SmallVector GEPOperands;
      -  const Value *BasePtr = GetGEPOperands(V1, GEPOperands);
      +  const Value *BasePtr = GetGEPOperands(GEP1, GEPOperands);
       
         AliasResult R = aliasCheck(BasePtr, ~0U, V2, V2Size);
         if (R != MustAlias)
      @@ -719,8 +717,8 @@
           std::swap(V1, V2);
           std::swap(V1Size, V2Size);
         }
      -  if (isa<GEPOperator>(V1))
      -    return aliasGEP(V1, V1Size, V2, V2Size);
      +  if (const GEPOperator *GV1 = dyn_cast<GEPOperator>(V1))
      +    return aliasGEP(GV1, V1Size, V2, V2Size);
       
         if (isa(V2) && !isa(V1)) {
           std::swap(V1, V2);
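
GetGEPOperands in the patch above flattens a chain of GEPs into one index list, folding an inner GEP into the outer one only while the outer list starts with a zero index. A rough standalone sketch of that folding follows, under the assumption that indices are plain integers. The `Node` struct and `collectIndices` name are hypothetical and not part of BasicAliasAnalysis. The sketch advances the base pointer on every iteration so that the walk terminates.

```cpp
#include <cassert>
#include <vector>

// Hypothetical pointer model: a node with Base == nullptr is a plain pointer,
// anything else models a GEP over Base with a non-empty index list.
struct Node {
  const Node *Base;
  std::vector<long> Idx;
};

// Collect the indices of G and of any inner GEPs that can be folded into it.
// Returns the pointer the collected indices are relative to.
const Node *collectIndices(const Node *G, std::vector<long> &Ops) {
  assert(Ops.empty() && "Expect empty list to populate!");
  Ops.insert(Ops.end(), G->Idx.begin(), G->Idx.end());

  const Node *BasePtr = G->Base;
  while (true) {
    // Stop when the base is not itself a GEP.
    if (BasePtr->Base == nullptr)
      return BasePtr;
    // Only fold when the leading index is zero; arbitrary pointer offsets
    // are not handled.
    if (Ops[0] != 0)
      return BasePtr;
    Ops.erase(Ops.begin()); // drop the zero index
    Ops.insert(Ops.begin(), BasePtr->Idx.begin(), BasePtr->Idx.end());
    BasePtr = BasePtr->Base; // step to the next pointer in the chain
  }
}
```

For example, with `G1 = gep P, 0, 3` and `G2 = gep G1, 0, 5`, collecting from G2 yields the index list 0, 3, 5 relative to P.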
      
      
      
      
      From sabre at nondot.org  Wed Nov 25 20:13:03 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 02:13:03 -0000
      Subject: [llvm-commits] [llvm] r89915 -
      	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      Message-ID: <200911260213.nAQ2D3Q0024164@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Wed Nov 25 20:13:03 2009
      New Revision: 89915
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89915&view=rev
      Log:
      Implement a new DecomposeGEPExpression method, which decomposes a GEP into a list of scaled offsets.  Use this to eliminate some previous ad-hoc code which was subtly broken (it assumed all Constant*'s were non-zero, but strange constant expressions could be zero).
      
      Modified:
          llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      
      Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89915&r1=89914&r2=89915&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Wed Nov 25 20:13:03 2009
      @@ -405,6 +405,97 @@
         return NoAA::getModRefInfo(CS1, CS2);
       }
       
      +/// DecomposeGEPExpression - If V is a symbolic pointer expression, decompose it
      +/// into a base pointer with a constant offset and a number of scaled symbolic
      +/// offsets.
      +static const Value *DecomposeGEPExpression(const Value *V, int64_t &BaseOffs,
      +               SmallVectorImpl<std::pair<const Value*, uint64_t> > &VarIndices,
      +                                           const TargetData *TD) {
      +  const Value *OrigPtr = V;
      +  BaseOffs = 0;
      +  while (1) {
      +    // See if this is a bitcast or GEP.
      +    const Operator *Op = dyn_cast<Operator>(V);
      +    if (Op == 0) return V;
      +    
      +    if (Op->getOpcode() == Instruction::BitCast) {
      +      V = Op->getOperand(0);
      +      continue;
      +    }
      +    
      +    if (Op->getOpcode() != Instruction::GetElementPtr)
      +      return V;
      +    
      +    // Don't attempt to analyze GEPs over unsized objects.
      +    if (!cast<PointerType>(Op->getOperand(0)->getType())
      +          ->getElementType()->isSized())
      +      return V;
      +    
      +    // Walk the indices of the GEP, accumulating them into BaseOff/VarIndices.
      +    gep_type_iterator GTI = gep_type_begin(Op);
      +    for (User::const_op_iterator I = next(Op->op_begin()), E = Op->op_end();
      +         I != E; ++I) {
      +      Value *Index = *I;
      +      // Compute the (potentially symbolic) offset in bytes for this index.
      +      if (const StructType *STy = dyn_cast<StructType>(*GTI++)) {
      +        // For a struct, add the member offset.
      +        unsigned FieldNo = cast<ConstantInt>(Index)->getZExtValue();
      +        if (FieldNo == 0) continue;
      +        if (TD == 0) goto FailNoTD;
      +        
      +        BaseOffs += TD->getStructLayout(STy)->getElementOffset(FieldNo);
      +        continue;
      +      }
      +      
      +      // For an array/pointer, add the element offset, explicitly scaled.
      +      if (ConstantInt *CIdx = dyn_cast<ConstantInt>(Index)) {
      +        if (CIdx->isZero()) continue;
      +        if (TD == 0) goto FailNoTD;
      +        
      +        BaseOffs += TD->getTypeAllocSize(*GTI)*CIdx->getSExtValue();
      +        continue;
      +      }
      +      
      +      if (TD == 0) goto FailNoTD;
      +      
      +      // TODO: Could handle linear expressions here like A[X+1], also A[X*4|1].
      +      uint64_t Scale = TD->getTypeAllocSize(*GTI);
      +      
      +      // If we already had an occurrence of this index variable, merge this
      +      // scale into it.  For example, we want to handle:
      +      //   A[x][x] -> x*16 + x*4 -> x*20
      +      for (unsigned i = 0, e = VarIndices.size(); i != e; ++i) {
      +        if (VarIndices[i].first == Index) {
      +          Scale += VarIndices[i].second;
      +          VarIndices.erase(VarIndices.begin()+i);
      +          break;
      +        }
      +      }
      +      
      +      // Make sure that we have a scale that makes sense for this target's
      +      // pointer size.
      +      if (unsigned ShiftBits = 64-TD->getPointerSizeInBits()) {
      +        Scale <<= ShiftBits;
      +        Scale >>= ShiftBits;
      +      }
      +      
      +      if (Scale)
      +        VarIndices.push_back(std::make_pair(Index, Scale));
      +    }
      +    
      +    // Analyze the base pointer next.
      +    V = Op->getOperand(0);
      +  }
      +  
      +  // If we don't have TD around, we can't analyze this index, remove all
      +  // information we've found.
      +FailNoTD:
      +  VarIndices.clear();
      +  BaseOffs = 0;
      +  return OrigPtr;
      +}
      +
      +
       /// aliasGEP - Provide a bunch of ad-hoc rules to disambiguate a GEP instruction
       /// against another pointer.  We know that V1 is a GEP, but we don't know
       /// anything about V2.
      @@ -479,10 +570,12 @@
         if (V1Size == ~0U || V2Size == ~0U)
           return MayAlias;
       
      -  SmallVector GEPOperands;
      -  const Value *BasePtr = GetGEPOperands(GEP1, GEPOperands);
      -
      -  AliasResult R = aliasCheck(BasePtr, ~0U, V2, V2Size);
      +  int64_t GEP1BaseOffset;
      +  SmallVector<std::pair<const Value*, uint64_t>, 4> VariableIndices;
      +  const Value *GEP1BasePtr =
      +    DecomposeGEPExpression(GEP1, GEP1BaseOffset, VariableIndices, TD);
      +    
      +  AliasResult R = aliasCheck(GEP1BasePtr, ~0U, V2, V2Size);
         if (R != MustAlias)
           // If V2 may alias GEP base pointer, conservatively returns MayAlias.
           // If V2 is known not to alias GEP base pointer, then the two values
      @@ -491,48 +584,34 @@
           // with the first operand of the getelementptr".
           return R;
       
      -  // If there is at least one non-zero constant index, we know they cannot
      -  // alias.
      -  bool ConstantFound = false;
      -  bool AllZerosFound = true;
      -  for (unsigned i = 0, e = GEPOperands.size(); i != e; ++i)
      -    if (const Constant *C = dyn_cast<Constant>(GEPOperands[i])) {
      -      if (!C->isNullValue()) {
      -        ConstantFound = true;
      -        AllZerosFound = false;
      -        break;
      -      }
      -    } else {
      -      AllZerosFound = false;
      -    }
      -
         // If we have getelementptr <ptr>, 0, 0, 0, 0, ... and V2 must aliases
         // the ptr, the end result is a must alias also.
      -  if (AllZerosFound)
      +  if (GEP1BaseOffset == 0 && VariableIndices.empty())
           return MustAlias;
       
      -  if (ConstantFound) {
      -    if (V2Size <= 1 && V1Size <= 1)  // Just pointer check?
      +  // If we have a known constant offset, see if this offset is larger than the
      +  // access size being queried.  If so, and if no variable indices can remove
      +  // pieces of this constant, then we know we have a no-alias.  For example,
      +  //   &A[100] != &A.
      +  
      +  // In order to handle cases like &A[100][i] where i is an out of range
      +  // subscript, we have to ignore all constant offset pieces that are a multiple
      +  // of a scaled index.  Do this by removing constant offsets that are a
      +  // multiple of any of our variable indices.  This allows us to transform
      +  // things like &A[i][1] because i has a stride of (e.g.) 8 bytes but the 1
      +  // provides an offset of 4 bytes (assuming a <= 4 byte access).
      +  for (unsigned i = 0, e = VariableIndices.size(); i != e && GEP1BaseOffset;++i)
      +    if (int64_t RemovedOffset = GEP1BaseOffset/VariableIndices[i].second)
      +      GEP1BaseOffset -= RemovedOffset*VariableIndices[i].second;
      +  
      +  // If our known offset is bigger than the access size, we know we don't have
      +  // an alias.
      +  if (GEP1BaseOffset) {
      +    if (GEP1BaseOffset >= (int64_t)V2Size ||
      +        GEP1BaseOffset <= -(int64_t)V1Size)
             return NoAlias;
      -
      -    // Otherwise we have to check to see that the distance is more than
      -    // the size of the argument... build an index vector that is equal to
      -    // the arguments provided, except substitute 0's for any variable
      -    // indexes we find...
      -    if (TD &&
      -        cast<PointerType>(BasePtr->getType())->getElementType()->isSized()) {
      -      for (unsigned i = 0; i != GEPOperands.size(); ++i)
      -        if (!isa<ConstantInt>(GEPOperands[i]))
      -          GEPOperands[i] = Constant::getNullValue(GEPOperands[i]->getType());
      -      int64_t Offset = TD->getIndexedOffset(BasePtr->getType(),
      -                                            &GEPOperands[0],
      -                                            GEPOperands.size());
      -
      -      if (Offset >= (int64_t)V2Size || Offset <= -(int64_t)V1Size)
      -        return NoAlias;
      -    }
         }
      -
      +  
         return MayAlias;
       }
       
      @@ -713,6 +792,8 @@
             return NoAlias;
         }
       
      +  // FIXME: This isn't aggressively handling alias(GEP, PHI) for example: if the
      +  // GEP can't simplify, we don't even look at the PHI cases.
         if (!isa<GEPOperator>(V1) && isa<GEPOperator>(V2)) {
           std::swap(V1, V2);
           std::swap(V1Size, V2Size);
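
A note on the offset check in the aliasGEP hunk above: the loop first strips from the constant offset any whole multiple of a variable index's stride, then compares what is left against the two access sizes. The following is a minimal sketch of that arithmetic only, not the BasicAA code itself. ToyAlias, checkOffset and the plain vector of strides are hypothetical stand-ins for AliasResult and the VariableIndices list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical two-valued result; the real AliasResult also has MustAlias.
enum class ToyAlias { NoAlias, MayAlias };

// baseOffset: constant byte offset between the two pointers.
// strides:    byte scale of each variable index (assumed non-zero).
// v1Size/v2Size: access sizes in bytes.
ToyAlias checkOffset(int64_t baseOffset, const std::vector<int64_t> &strides,
                     uint64_t v1Size, uint64_t v2Size) {
  // Remove the part of the constant offset that some value of a variable
  // index could reproduce, i.e. whole multiples of that index's stride.
  for (std::size_t i = 0; i != strides.size() && baseOffset; ++i) {
    assert(strides[i] != 0 && "a zero stride would divide by zero");
    if (int64_t removed = baseOffset / strides[i])
      baseOffset -= removed * strides[i];
  }

  // If the remaining offset is at least as large as the access on the
  // other side, the two accesses cannot overlap.
  if (baseOffset) {
    if (baseOffset >= static_cast<int64_t>(v2Size) ||
        baseOffset <= -static_cast<int64_t>(v1Size))
      return ToyAlias::NoAlias;
  }
  return ToyAlias::MayAlias;
}
```

With a stride of 8 and a constant offset of 4, a 4-byte access is reported as NoAlias, which is the &A[i][1] case from the comment. An offset of 16 with the same stride is removed entirely and the answer falls back to MayAlias.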
      
      
      
      
      From sabre at nondot.org  Wed Nov 25 20:14:59 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 02:14:59 -0000
      Subject: [llvm-commits] [llvm] r89920 -
      	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      Message-ID: <200911260214.nAQ2Ex4w024292@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Wed Nov 25 20:14:59 2009
      New Revision: 89920
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89920&view=rev
      Log:
      Generalize DecomposeGEPExpression to exactly handle what Value::getUnderlyingObject does (when TD is around).  This allows us to avoid calling DecomposeGEPExpression unless the ultimate alias check we care about passes, speeding up BasicAA a bit.
      
      
      Modified:
          llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      
      Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89920&r1=89919&r2=89920&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Wed Nov 25 20:14:59 2009
      @@ -20,6 +20,7 @@
       #include "llvm/Constants.h"
       #include "llvm/DerivedTypes.h"
       #include "llvm/Function.h"
      +#include "llvm/GlobalAlias.h"
       #include "llvm/GlobalVariable.h"
       #include "llvm/Instructions.h"
       #include "llvm/IntrinsicInst.h"
      @@ -220,7 +221,8 @@
           // aliasGEP - Provide a bunch of ad-hoc rules to disambiguate a GEP
           // instruction against another.
           AliasResult aliasGEP(const GEPOperator *V1, unsigned V1Size,
      -                         const Value *V2, unsigned V2Size);
      +                         const Value *V2, unsigned V2Size,
      +                         const Value *UnderlyingV1, const Value *UnderlyingV2);
       
           // aliasPHI - Provide a bunch of ad-hoc rules to disambiguate a PHI
           // instruction against another.
      @@ -408,40 +410,66 @@
       /// DecomposeGEPExpression - If V is a symbolic pointer expression, decompose it
       /// into a base pointer with a constant offset and a number of scaled symbolic
       /// offsets.
      +///
      +/// When TargetData is around, this function is capable of analyzing everything
      +/// that Value::getUnderlyingObject() can look through.  When not, it just looks
      +/// through pointer casts.
      +///
      +/// FIXME: Move this out to ValueTracking.cpp
      +///
       static const Value *DecomposeGEPExpression(const Value *V, int64_t &BaseOffs,
                      SmallVectorImpl > &VarIndices,
                                                  const TargetData *TD) {
      -  const Value *OrigPtr = V;
      +  // FIXME: Should limit depth like getUnderlyingObject?
         BaseOffs = 0;
         while (1) {
           // See if this is a bitcast or GEP.
           const Operator *Op = dyn_cast<Operator>(V);
      -    if (Op == 0) return V;
      +    if (Op == 0) {
      +      // The only non-operator case we can handle are GlobalAliases.
      +      if (const GlobalAlias *GA = dyn_cast<GlobalAlias>(V)) {
      +        if (!GA->mayBeOverridden()) {
      +          V = GA->getAliasee();
      +          continue;
      +        }
      +      }
      +      return V;
      +    }
           
           if (Op->getOpcode() == Instruction::BitCast) {
             V = Op->getOperand(0);
             continue;
           }
           
      -    if (Op->getOpcode() != Instruction::GetElementPtr)
      +    const GEPOperator *GEPOp = dyn_cast<GEPOperator>(Op);
      +    if (GEPOp == 0)
             return V;
           
           // Don't attempt to analyze GEPs over unsized objects.
      -    if (!cast<PointerType>(Op->getOperand(0)->getType())
      +    if (!cast<PointerType>(GEPOp->getOperand(0)->getType())
                 ->getElementType()->isSized())
             return V;
      +
      +    // If we are lacking TargetData information, we can't compute the offsets of
      +    // elements computed by GEPs.  However, we can handle bitcast equivalent
      +    // GEPs.
      +    if (!TD) {
      +      if (!GEPOp->hasAllZeroIndices())
      +        return V;
      +      V = GEPOp->getOperand(0);
      +      continue;
      +    }
           
           // Walk the indices of the GEP, accumulating them into BaseOff/VarIndices.
      -    gep_type_iterator GTI = gep_type_begin(Op);
      -    for (User::const_op_iterator I = next(Op->op_begin()), E = Op->op_end();
      -         I != E; ++I) {
      +    gep_type_iterator GTI = gep_type_begin(GEPOp);
      +    for (User::const_op_iterator I = next(GEPOp->op_begin()),
      +         E = GEPOp->op_end(); I != E; ++I) {
             Value *Index = *I;
             // Compute the (potentially symbolic) offset in bytes for this index.
             if (const StructType *STy = dyn_cast<StructType>(*GTI++)) {
               // For a struct, add the member offset.
               unsigned FieldNo = cast<ConstantInt>(Index)->getZExtValue();
               if (FieldNo == 0) continue;
      -        if (TD == 0) goto FailNoTD;
               
               BaseOffs += TD->getStructLayout(STy)->getElementOffset(FieldNo);
               continue;
      @@ -450,14 +478,10 @@
             // For an array/pointer, add the element offset, explicitly scaled.
             if (ConstantInt *CIdx = dyn_cast<ConstantInt>(Index)) {
               if (CIdx->isZero()) continue;
      -        if (TD == 0) goto FailNoTD;
      -        
               BaseOffs += TD->getTypeAllocSize(*GTI)*CIdx->getSExtValue();
               continue;
             }
             
      -      if (TD == 0) goto FailNoTD;
      -      
             // TODO: Could handle linear expressions here like A[X+1], also A[X*4|1].
             uint64_t Scale = TD->getTypeAllocSize(*GTI);
             
      @@ -484,25 +508,21 @@
           }
           
           // Analyze the base pointer next.
      -    V = Op->getOperand(0);
      +    V = GEPOp->getOperand(0);
         }
      -  
      -  // If we don't have TD around, we can't analyze this index, remove all
      -  // information we've found.
      -FailNoTD:
      -  VarIndices.clear();
      -  BaseOffs = 0;
      -  return OrigPtr;
       }
       
       
       /// aliasGEP - Provide a bunch of ad-hoc rules to disambiguate a GEP instruction
       /// against another pointer.  We know that V1 is a GEP, but we don't know
      -/// anything about V2.
      +/// anything about V2.  UnderlyingV1 is GEP1->getUnderlyingObject(),
      +/// UnderlyingV2 is the same for V2.
       ///
       AliasAnalysis::AliasResult
       BasicAliasAnalysis::aliasGEP(const GEPOperator *GEP1, unsigned V1Size,
      -                             const Value *V2, unsigned V2Size) {
      +                             const Value *V2, unsigned V2Size,
      +                             const Value *UnderlyingV1,
      +                             const Value *UnderlyingV2) {
         // If we have two gep instructions with must-alias'ing base pointers, figure
         // out if the indexes to the GEP tell us anything about the derived pointer.
         // Note that we also handle chains of getelementptr instructions as well as
      @@ -567,15 +587,12 @@
         // instruction.  If one pointer is a GEP with a non-zero index of the other
         // pointer, we know they cannot alias.
         //
      +  // FIXME: The check below only looks at the size of one of the pointers, not
      +  // both, this may cause us to miss things.
         if (V1Size == ~0U || V2Size == ~0U)
           return MayAlias;
       
      -  int64_t GEP1BaseOffset;
      -  SmallVector, 4> VariableIndices;
      -  const Value *GEP1BasePtr =
      -    DecomposeGEPExpression(GEP1, GEP1BaseOffset, VariableIndices, TD);
      -    
      -  AliasResult R = aliasCheck(GEP1BasePtr, ~0U, V2, V2Size);
      +  AliasResult R = aliasCheck(UnderlyingV1, ~0U, V2, V2Size);
         if (R != MustAlias)
           // If V2 may alias GEP base pointer, conservatively returns MayAlias.
           // If V2 is known not to alias GEP base pointer, then the two values
      @@ -584,6 +601,20 @@
           // with the first operand of the getelementptr".
           return R;
       
      +  int64_t GEP1BaseOffset;
      +  SmallVector, 4> VariableIndices;
      +  const Value *GEP1BasePtr =
      +    DecomposeGEPExpression(GEP1, GEP1BaseOffset, VariableIndices, TD);
      +  
      +  // If DecomposeGEPExpression isn't able to look all the way through the
      +  // addressing operation, we must not have TD and this is too complex for us
      +  // to handle without it.
      +  if (GEP1BasePtr != UnderlyingV1) {
      +    assert(TD == 0 &&
      +           "DecomposeGEPExpression and getUnderlyingObject disagree!");
      +    return MayAlias;
      +  }
      +  
         // If we have getelementptr <ptr>, 0, 0, 0, 0, ... and V2 must aliases
         // the ptr, the end result is a must alias also.
         if (GEP1BaseOffset == 0 && VariableIndices.empty())
      @@ -797,9 +828,10 @@
         if (!isa<GEPOperator>(V1) && isa<GEPOperator>(V2)) {
           std::swap(V1, V2);
           std::swap(V1Size, V2Size);
      +    std::swap(O1, O2);
         }
         if (const GEPOperator *GV1 = dyn_cast<GEPOperator>(V1))
      -    return aliasGEP(GV1, V1Size, V2, V2Size);
      +    return aliasGEP(GV1, V1Size, V2, V2Size, O1, O2);
       
         if (isa<PHINode>(V2) && !isa<PHINode>(V1)) {
           std::swap(V1, V2);
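
For readers following the r89920 change, the walk that DecomposeGEPExpression performs can be pictured on a toy pointer-expression model. This is only a sketch under stated assumptions: ToyPtr, ToyIndex, constIdx, varIdx and decompose are hypothetical names, variables are identified by strings instead of Value pointers, every index carries its byte scale directly instead of asking TargetData, and the global-alias case is left out. The haveLayout flag plays the role of TD being non-null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// One GEP index: either a constant (var is empty) or a named variable.
struct ToyIndex {
  std::string var;   // empty for a constant index
  int64_t constant;  // value of a constant index, ignored for variables
  int64_t scale;     // byte size of the element this index steps over
};

ToyIndex constIdx(int64_t value, int64_t scale) {
  return ToyIndex{"", value, scale};
}
ToyIndex varIdx(const std::string &name, int64_t scale) {
  return ToyIndex{name, 0, scale};
}

// A pointer expression: a base object, a bitcast, or a GEP of its operand.
struct ToyPtr {
  enum Kind { Base, BitCast, GEP };
  Kind kind;
  const ToyPtr *operand;          // null for Base
  std::vector<ToyIndex> indices;  // only meaningful for GEP
};

using VarIndexList = std::vector<std::pair<std::string, int64_t>>;

static bool allZeroIndices(const ToyPtr &p) {
  for (const ToyIndex &idx : p.indices)
    if (!idx.var.empty() || idx.constant != 0)
      return false;
  return true;
}

// Returns the pointer at which the walk stopped, accumulating the constant
// byte offset in baseOffs and one (variable, scale) entry per variable.
const ToyPtr *decompose(const ToyPtr *v, bool haveLayout, int64_t &baseOffs,
                        VarIndexList &varIndices) {
  assert(v && "expected a pointer expression");
  baseOffs = 0;
  while (true) {
    if (v->kind == ToyPtr::Base)
      return v;
    if (v->kind == ToyPtr::BitCast) {
      v = v->operand;
      continue;
    }
    // Without layout information only bitcast-equivalent GEPs (all indices
    // zero) can be looked through.
    if (!haveLayout) {
      if (!allZeroIndices(*v))
        return v;
      v = v->operand;
      continue;
    }
    for (const ToyIndex &idx : v->indices) {
      if (idx.var.empty()) {
        baseOffs += idx.constant * idx.scale;
        continue;
      }
      // Merge with an earlier entry for the same variable, so that each
      // variable appears in the list at most once.
      int64_t scale = idx.scale;
      for (std::size_t i = 0; i != varIndices.size(); ++i) {
        if (varIndices[i].first == idx.var) {
          scale += varIndices[i].second;
          varIndices.erase(varIndices.begin() + i);
          break;
        }
      }
      if (scale)
        varIndices.push_back({idx.var, scale});
    }
    v = v->operand;
  }
}
```

In this model a GEP with indices x (scale 16) and x (scale 4) yields the single entry x*20, matching the A[x][x] example in the source comment. When haveLayout is false and the GEP has a non-zero index, the walk stops at the GEP itself, which is the situation the new "GEP1BasePtr != UnderlyingV1" check in aliasGEP guards against.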
      
      
      
      
      From sabre at nondot.org  Wed Nov 25 20:16:28 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 02:16:28 -0000
      Subject: [llvm-commits] [llvm] r89921 -
      	/llvm/trunk/test/Analysis/BasicAA/modref.ll
      Message-ID: <200911260216.nAQ2GSXi024344@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Wed Nov 25 20:16:28 2009
      New Revision: 89921
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89921&view=rev
      Log:
      add a new random feature test
      
      Modified:
          llvm/trunk/test/Analysis/BasicAA/modref.ll
      
      Modified: llvm/trunk/test/Analysis/BasicAA/modref.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/BasicAA/modref.ll?rev=89921&r1=89920&r2=89921&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Analysis/BasicAA/modref.ll (original)
      +++ llvm/trunk/test/Analysis/BasicAA/modref.ll Wed Nov 25 20:16:28 2009
      @@ -4,6 +4,7 @@
       declare void @llvm.memset.i32(i8*, i8, i32, i32)
       declare void @llvm.memset.i8(i8*, i8, i8, i32)
       declare void @llvm.memcpy.i8(i8*, i8*, i8, i32)
      +declare void @llvm.memcpy.i32(i8*, i8*, i32, i32)
       declare void @llvm.lifetime.end(i64, i8* nocapture)
       
       declare void @external(i32*) 
      @@ -94,7 +95,7 @@
       @G1 = external global i32
       @G2 = external global [4000 x i32]
       
      -define i32 @test4(i8* %P, i8 %X) {
      +define i32 @test4(i8* %P) {
         %tmp = load i32* @G1
         call void @llvm.memset.i32(i8* bitcast ([4000 x i32]* @G2 to i8*), i8 0, i32 4000, i32 1)
         %tmp2 = load i32* @G1
      @@ -107,3 +108,18 @@
       ; CHECK: sub i32 %tmp, %tmp
       }
       
      +; Verify that basicaa is handling variable length memcpy, knowing it doesn't
      +; write to G1.
      +define i32 @test5(i8* %P, i32 %Len) {
      +  %tmp = load i32* @G1
      +  call void @llvm.memcpy.i32(i8* bitcast ([4000 x i32]* @G2 to i8*), i8* bitcast (i32* @G1 to i8*), i32 %Len, i32 1)
      +  %tmp2 = load i32* @G1
      +  %sub = sub i32 %tmp2, %tmp
      +  ret i32 %sub
      +; CHECK: @test5
      +; CHECK: load i32* @G
      +; CHECK: memcpy.i32
      +; CHECK-NOT: load
      +; CHECK: sub i32 %tmp, %tmp
      +}
      +
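
The reasoning @test5 relies on can be written down as a toy mod/ref query. This is a sketch under assumptions, not how BasicAA is implemented: objects are small integer ids standing in for distinct globals such as @G1 and @G2, distinct ids are assumed never to overlap, and ToyModRef and memcpyModRef are hypothetical names.

```cpp
#include <cassert>

// Hypothetical bit-flag result, in the spirit of a mod/ref answer.
enum ToyModRef { ToyNoModRef = 0, ToyRef = 1, ToyMod = 2, ToyModRefBoth = 3 };

// A memcpy reads only its source object and writes only its destination
// object.  Because distinct objects never overlap in this model, the answer
// does not depend on the copy length, so a variable length is no obstacle.
ToyModRef memcpyModRef(int destObj, int srcObj, int queriedObj) {
  assert(destObj != srcObj && "toy model: operands are distinct objects");
  int result = ToyNoModRef;
  if (queriedObj == srcObj)
    result |= ToyRef;
  if (queriedObj == destObj)
    result |= ToyMod;
  return static_cast<ToyModRef>(result);
}
```

With @G2 as the destination and @G1 as the source, a query about @G1 comes back as a read only, so the second load of @G1 can reuse the first and the subtraction becomes sub i32 %tmp, %tmp, as the CHECK lines expect.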
      
      
      
      
      From sabre at nondot.org  Wed Nov 25 20:17:34 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 02:17:34 -0000
      Subject: [llvm-commits] [llvm] r89922 - in /llvm/trunk:
       lib/Analysis/BasicAliasAnalysis.cpp
       test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll
      Message-ID: <200911260217.nAQ2HYJo024394@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Wed Nov 25 20:17:34 2009
      New Revision: 89922
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89922&view=rev
      Log:
      Change the other half of aliasGEP (which handles GEP differencing) to use DecomposeGEPExpression.  This dramatically simplifies and shrinks the code by eliminating the horrible CheckGEPInstructions method, fixes a miscompilation (@test3) and makes the code more aggressive.  In particular, we now handle the @test4 case, which is reduced from the SmallPtrSet constructor.  Missing this caused us to emit a variable length memset instead of a fixed size one.
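
The GEP differencing mentioned in this log can be sketched as a subtraction of two decomposed index lists. The block below is a hedged illustration of that idea, not the committed function: subtractIndices and VarIndexList are hypothetical names, and variables are identified by strings where the patch uses Value pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using VarIndexList = std::vector<std::pair<std::string, int64_t>>;

// Computes dest := dest - src, where each list maps a variable to its byte
// scale and each variable appears at most once per list.
void subtractIndices(VarIndexList &dest, const VarIndexList &src) {
  assert(&dest != &src && "dest and src must be distinct lists");
  for (const auto &entry : src) {
    const std::string &var = entry.first;
    int64_t scale = entry.second;

    // Quadratic search, acceptable because these lists are tiny in practice.
    for (std::size_t j = 0; j != dest.size(); ++j) {
      if (dest[j].first != var)
        continue;
      // Subtract the scale; drop the entry if the variable cancels out.
      if (dest[j].second != scale)
        dest[j].second -= scale;
      else
        dest.erase(dest.begin() + j);
      scale = 0;
      break;
    }

    // A variable that only appears in src contributes with negated scale.
    if (scale)
      dest.push_back({var, -scale});
  }
}
```

For example, subtracting {i*8, k*4} from {i*8, j*4} leaves {j*4, k*-4}. If both lists and both constant offsets cancel completely, the two GEPs compute the same address, which is the MustAlias case the patch describes for lexically identical GEPs.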
      
      
      Modified:
          llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
          llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll
      
      Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89922&r1=89921&r2=89922&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Wed Nov 25 20:17:34 2009
      @@ -39,26 +39,6 @@
       // Useful predicates
       //===----------------------------------------------------------------------===//
       
      -static const Value *GetGEPOperands(const GEPOperator *V, 
      -                                   SmallVector &GEPOps) {
      -  assert(GEPOps.empty() && "Expect empty list to populate!");
      -  GEPOps.insert(GEPOps.end(), V->op_begin()+1, V->op_end());
      -
      -  // Accumulate all of the chained indexes into the operand array.
      -  Value *BasePtr = V->getOperand(0);
      -  while (1) {
      -    V = dyn_cast<GEPOperator>(BasePtr);
      -    if (V == 0) return BasePtr;
      -    
      -    // Don't handle folding arbitrary pointer offsets yet.
      -    if (!isa<Constant>(GEPOps[0]) || !cast<Constant>(GEPOps[0])->isNullValue())
      -      return BasePtr;
      -    
      -    GEPOps.erase(GEPOps.begin());   // Drop the zero index
      -    GEPOps.insert(GEPOps.begin(), V->op_begin()+1, V->op_end());
      -  }
      -}
      -
       /// isKnownNonNull - Return true if we know that the specified value is never
       /// null.
       static bool isKnownNonNull(const Value *V) {
      @@ -235,15 +215,6 @@
       
           AliasResult aliasCheck(const Value *V1, unsigned V1Size,
                                  const Value *V2, unsigned V2Size);
      -
      -    // CheckGEPInstructions - Check two GEP instructions with known
      -    // must-aliasing base pointers.  This checks to see if the index expressions
      -    // preclude the pointers from aliasing.
      -    AliasResult
      -    CheckGEPInstructions(const Type* BasePtr1Ty,
      -                         Value **GEP1Ops, unsigned NumGEP1Ops, unsigned G1Size,
      -                         const Type *BasePtr2Ty,
      -                         Value **GEP2Ops, unsigned NumGEP2Ops, unsigned G2Size);
         };
       }  // End of anonymous namespace
       
      @@ -418,7 +389,7 @@
       /// FIXME: Move this out to ValueTracking.cpp
       ///
       static const Value *DecomposeGEPExpression(const Value *V, int64_t &BaseOffs,
      -               SmallVectorImpl > &VarIndices,
      +                 SmallVectorImpl > &VarIndices,
                                                  const TargetData *TD) {
         // FIXME: Should limit depth like getUnderlyingObject?
         BaseOffs = 0;
      @@ -488,6 +459,7 @@
             // If we already had an occurrence of this index variable, merge this
             // scale into it.  For example, we want to handle:
             //   A[x][x] -> x*16 + x*4 -> x*20
      +      // This also ensures that 'x' only appears in the index list once.
             for (unsigned i = 0, e = VarIndices.size(); i != e; ++i) {
               if (VarIndices[i].first == Index) {
                 Scale += VarIndices[i].second;
      @@ -512,6 +484,39 @@
         }
       }
       
      +/// GetIndiceDifference - Dest and Src are the variable indices from two
      +/// decomposed GetElementPtr instructions GEP1 and GEP2 which have common base
      +/// pointers.  Subtract the GEP2 indices from GEP1 to find the symbolic
      +/// difference between the two pointers. 
      +static void GetIndiceDifference(
      +                      SmallVectorImpl > &Dest,
      +                const SmallVectorImpl > &Src) {
      +  if (Src.empty()) return;
      +
      +  for (unsigned i = 0, e = Src.size(); i != e; ++i) {
      +    const Value *V = Src[i].first;
      +    int64_t Scale = Src[i].second;
      +    
      +    // Find V in Dest.  This is N^2, but pointer indices almost never have more
      +    // than a few variable indexes.
      +    for (unsigned j = 0, e = Dest.size(); j != e; ++j) {
      +      if (Dest[j].first != V) continue;
      +      
      +      // If we found it, subtract off Scale V's from the entry in Dest.  If it
      +      // goes to zero, remove the entry.
      +      if (Dest[j].second != Scale)
      +        Dest[j].second -= Scale;
      +      else
      +        Dest.erase(Dest.begin()+j);
      +      Scale = 0;
      +      break;
      +    }
      +    
      +    // If we didn't consume this entry, add it to the end of the Dest list.
      +    if (Scale)
      +      Dest.push_back(std::make_pair(V, -Scale));
      +  }
      +}
       
       /// aliasGEP - Provide a bunch of ad-hoc rules to disambiguate a GEP instruction
       /// against another pointer.  We know that V1 is a GEP, but we don't know
      @@ -523,101 +528,83 @@
                                    const Value *V2, unsigned V2Size,
                                    const Value *UnderlyingV1,
                                    const Value *UnderlyingV2) {
      +  int64_t GEP1BaseOffset;
      +  SmallVector, 4> GEP1VariableIndices;
      +
         // If we have two gep instructions with must-alias'ing base pointers, figure
         // out if the indexes to the GEP tell us anything about the derived pointer.
      -  // Note that we also handle chains of getelementptr instructions as well as
      -  // constant expression getelementptrs here.
      -  //
         if (const GEPOperator *GEP2 = dyn_cast<GEPOperator>(V2)) {
      -    // If V1 and V2 are identical GEPs, just recurse down on both of them.
      -    // This allows us to analyze things like:
      -    //   P = gep A, 0, i, 1
      -    //   Q = gep B, 0, i, 1
      -    // by just analyzing A and B.  This is even safe for variable indices.
      -    if (GEP1->getType() == GEP2->getType() &&
      -        GEP1->getNumOperands() == GEP2->getNumOperands() &&
      -        GEP1->getOperand(0)->getType() == GEP2->getOperand(0)->getType() &&
      -        // All operands are the same, ignoring the base.
      -        std::equal(GEP1->op_begin()+1, GEP1->op_end(), GEP2->op_begin()+1))
      -      return aliasCheck(GEP1->getOperand(0), V1Size,
      -                        GEP2->getOperand(0), V2Size);
      -    
      -    // Drill down into the first non-gep value, to test for must-aliasing of
      -    // the base pointers.
      -    while (isa<GEPOperator>(GEP1->getOperand(0)) &&
      -           GEP1->getOperand(1) ==
      -           Constant::getNullValue(GEP1->getOperand(1)->getType()))
      -      GEP1 = cast<GEPOperator>(GEP1->getOperand(0));
      -    const Value *BasePtr1 = GEP1->getOperand(0);
      -
      -    while (isa<GEPOperator>(GEP2->getOperand(0)) &&
      -           GEP2->getOperand(1) ==
      -           Constant::getNullValue(GEP2->getOperand(1)->getType()))
      -      GEP2 = cast<GEPOperator>(GEP2->getOperand(0));
      -    const Value *BasePtr2 = GEP2->getOperand(0);
      -
           // Do the base pointers alias?
      -    AliasResult BaseAlias = aliasCheck(BasePtr1, ~0U, BasePtr2, ~0U);
      -    if (BaseAlias == NoAlias) return NoAlias;
      -    if (BaseAlias == MustAlias) {
      -      // If the base pointers alias each other exactly, check to see if we can
      -      // figure out anything about the resultant pointers, to try to prove
      -      // non-aliasing.
      -
      -      // Collect all of the chained GEP operands together into one simple place
      -      SmallVector GEP1Ops, GEP2Ops;
      -      BasePtr1 = GetGEPOperands(GEP1, GEP1Ops);
      -      BasePtr2 = GetGEPOperands(GEP2, GEP2Ops);
      -
      -      // If GetGEPOperands were able to fold to the same must-aliased pointer,
      -      // do the comparison.
      -      if (BasePtr1 == BasePtr2) {
      -        AliasResult GAlias =
      -          CheckGEPInstructions(BasePtr1->getType(),
      -                               &GEP1Ops[0], GEP1Ops.size(), V1Size,
      -                               BasePtr2->getType(),
      -                               &GEP2Ops[0], GEP2Ops.size(), V2Size);
      -        if (GAlias != MayAlias)
      -          return GAlias;
      -      }
      +    AliasResult BaseAlias = aliasCheck(UnderlyingV1, ~0U, UnderlyingV2, ~0U);
      +    
      +    // If we get a No or May, then return it immediately, no amount of analysis
      +    // will improve this situation.
      +    if (BaseAlias != MustAlias) return BaseAlias;
      +    
      +    // Otherwise, we have a MustAlias.  Since the base pointers alias each other
      +    // exactly, see if the computed offset from the common pointer tells us
      +    // about the relation of the resulting pointer.
      +    const Value *GEP1BasePtr =
      +      DecomposeGEPExpression(GEP1, GEP1BaseOffset, GEP1VariableIndices, TD);
      +    
      +    int64_t GEP2BaseOffset;
      +    SmallVector, 4> GEP2VariableIndices;
      +    const Value *GEP2BasePtr =
      +      DecomposeGEPExpression(GEP2, GEP2BaseOffset, GEP2VariableIndices, TD);
      +    
      +    // If DecomposeGEPExpression isn't able to look all the way through the
      +    // addressing operation, we must not have TD and this is too complex for us
      +    // to handle without it.
      +    if (GEP1BasePtr != UnderlyingV1 || GEP2BasePtr != UnderlyingV2) {
      +      assert(TD == 0 &&
      +             "DecomposeGEPExpression and getUnderlyingObject disagree!");
      +      return MayAlias;
           }
      -  }
      -
      -  // Check to see if these two pointers are related by a getelementptr
      -  // instruction.  If one pointer is a GEP with a non-zero index of the other
      -  // pointer, we know they cannot alias.
      -  //
      -  // FIXME: The check below only looks at the size of one of the pointers, not
      -  // both, this may cause us to miss things.
      -  if (V1Size == ~0U || V2Size == ~0U)
      -    return MayAlias;
      -
      -  AliasResult R = aliasCheck(UnderlyingV1, ~0U, V2, V2Size);
      -  if (R != MustAlias)
      -    // If V2 may alias GEP base pointer, conservatively returns MayAlias.
      -    // If V2 is known not to alias GEP base pointer, then the two values
      -    // cannot alias per GEP semantics: "A pointer value formed from a
      -    // getelementptr instruction is associated with the addresses associated
      -    // with the first operand of the getelementptr".
      -    return R;
      +    
      +    // Subtract the GEP2 pointer from the GEP1 pointer to find out their
      +    // symbolic difference.
      +    GEP1BaseOffset -= GEP2BaseOffset;
      +    GetIndiceDifference(GEP1VariableIndices, GEP2VariableIndices);
      +    
      +  } else {
      +    // Check to see if these two pointers are related by the getelementptr
      +    // instruction.  If one pointer is a GEP with a non-zero index of the other
      +    // pointer, we know they cannot alias.
      +    //
      +    // FIXME: The check below only looks at the size of one of the pointers, not
      +    // both, this may cause us to miss things.
      +    if (V1Size == ~0U || V2Size == ~0U)
      +      return MayAlias;
       
      -  int64_t GEP1BaseOffset;
      -  SmallVector, 4> VariableIndices;
      -  const Value *GEP1BasePtr =
      -    DecomposeGEPExpression(GEP1, GEP1BaseOffset, VariableIndices, TD);
      -  
      -  // If DecomposeGEPExpression isn't able to look all the way through the
      -  // addressing operation, we must not have TD and this is too complex for us
      -  // to handle without it.
      -  if (GEP1BasePtr != UnderlyingV1) {
      -    assert(TD == 0 &&
      -           "DecomposeGEPExpression and getUnderlyingObject disagree!");
      -    return MayAlias;
      +    AliasResult R = aliasCheck(UnderlyingV1, ~0U, V2, V2Size);
      +    if (R != MustAlias)
      +      // If V2 may alias GEP base pointer, conservatively returns MayAlias.
      +      // If V2 is known not to alias GEP base pointer, then the two values
      +      // cannot alias per GEP semantics: "A pointer value formed from a
      +      // getelementptr instruction is associated with the addresses associated
      +      // with the first operand of the getelementptr".
      +      return R;
      +
      +    const Value *GEP1BasePtr =
      +      DecomposeGEPExpression(GEP1, GEP1BaseOffset, GEP1VariableIndices, TD);
      +    
      +    // If DecomposeGEPExpression isn't able to look all the way through the
      +    // addressing operation, we must not have TD and this is too complex for us
      +    // to handle without it.
      +    if (GEP1BasePtr != UnderlyingV1) {
      +      assert(TD == 0 &&
      +             "DecomposeGEPExpression and getUnderlyingObject disagree!");
      +      return MayAlias;
      +    }
         }
         
      -  // If we have getelementptr <ptr>, 0, 0, 0, 0, ... and V2 must aliases
      -  // the ptr, the end result is a must alias also.
      -  if (GEP1BaseOffset == 0 && VariableIndices.empty())
      +  // In the two GEP Case, if there is no difference in the offsets of the
      +  // computed pointers, the resultant pointers are a must alias.  This
      +  // happens when we have two lexically identical GEP's (for example).
      +  //
      +  // In the other case, if we have getelementptr <ptr>, 0, 0, 0, 0, ... and V2
      +  // must aliases the GEP, the end result is a must alias also.
      +  if (GEP1BaseOffset == 0 && GEP1VariableIndices.empty())
           return MustAlias;
       
         // If we have a known constant offset, see if this offset is larger than the
      @@ -631,9 +618,10 @@
         // multiple of any of our variable indices.  This allows us to transform
         // things like &A[i][1] because i has a stride of (e.g.) 8 bytes but the 1
         // provides an offset of 4 bytes (assuming a <= 4 byte access).
      -  for (unsigned i = 0, e = VariableIndices.size(); i != e && GEP1BaseOffset;++i)
      -    if (int64_t RemovedOffset = GEP1BaseOffset/VariableIndices[i].second)
      -      GEP1BaseOffset -= RemovedOffset*VariableIndices[i].second;
      +  for (unsigned i = 0, e = GEP1VariableIndices.size();
      +       i != e && GEP1BaseOffset;++i)
      +    if (int64_t RemovedOffset = GEP1BaseOffset/GEP1VariableIndices[i].second)
      +      GEP1BaseOffset -= RemovedOffset*GEP1VariableIndices[i].second;
         
         // If our known offset is bigger than the access size, we know we don't have
         // an alias.
      @@ -850,351 +838,5 @@
         return MayAlias;
       }
       
      -// This function is used to determine if the indices of two GEP instructions are
      -// equal. V1 and V2 are the indices.
      -static bool IndexOperandsEqual(Value *V1, Value *V2) {
      -  if (V1->getType() == V2->getType())
      -    return V1 == V2;
      -  if (Constant *C1 = dyn_cast<Constant>(V1))
      -    if (Constant *C2 = dyn_cast<Constant>(V2)) {
      -      // Sign extend the constants to long types, if necessary
      -      if (C1->getType() != Type::getInt64Ty(C1->getContext()))
      -        C1 = ConstantExpr::getSExt(C1, Type::getInt64Ty(C1->getContext()));
      -      if (C2->getType() != Type::getInt64Ty(C1->getContext())) 
      -        C2 = ConstantExpr::getSExt(C2, Type::getInt64Ty(C1->getContext()));
      -      return C1 == C2;
      -    }
      -  return false;
      -}
      -
      -/// CheckGEPInstructions - Check two GEP instructions with known must-aliasing
      -/// base pointers.  This checks to see if the index expressions preclude the
      -/// pointers from aliasing.
      -AliasAnalysis::AliasResult 
      -BasicAliasAnalysis::CheckGEPInstructions(
      -  const Type* BasePtr1Ty, Value **GEP1Ops, unsigned NumGEP1Ops, unsigned G1S,
      -  const Type *BasePtr2Ty, Value **GEP2Ops, unsigned NumGEP2Ops, unsigned G2S) {
      -  // We currently can't handle the case when the base pointers have different
      -  // primitive types.  Since this is uncommon anyway, we are happy being
      -  // extremely conservative.
      -  if (BasePtr1Ty != BasePtr2Ty)
      -    return MayAlias;
      -
      -  const PointerType *GEPPointerTy = cast<PointerType>(BasePtr1Ty);
      -
      -  // Find the (possibly empty) initial sequence of equal values... which are not
      -  // necessarily constants.
      -  unsigned NumGEP1Operands = NumGEP1Ops, NumGEP2Operands = NumGEP2Ops;
      -  unsigned MinOperands = std::min(NumGEP1Operands, NumGEP2Operands);
      -  unsigned MaxOperands = std::max(NumGEP1Operands, NumGEP2Operands);
      -  unsigned UnequalOper = 0;
      -  while (UnequalOper != MinOperands &&
      -         IndexOperandsEqual(GEP1Ops[UnequalOper], GEP2Ops[UnequalOper])) {
      -    // Advance through the type as we go...
      -    ++UnequalOper;
      -    if (const CompositeType *CT = dyn_cast<CompositeType>(BasePtr1Ty))
      -      BasePtr1Ty = CT->getTypeAtIndex(GEP1Ops[UnequalOper-1]);
      -    else {
      -      // If all operands equal each other, then the derived pointers must
      -      // alias each other...
      -      BasePtr1Ty = 0;
      -      assert(UnequalOper == NumGEP1Operands && UnequalOper == NumGEP2Operands &&
      -             "Ran out of type nesting, but not out of operands?");
      -      return MustAlias;
      -    }
      -  }
      -
      -  // If we have seen all constant operands, and run out of indexes on one of the
      -  // getelementptrs, check to see if the tail of the leftover one is all zeros.
      -  // If so, return mustalias.
      -  if (UnequalOper == MinOperands) {
      -    if (NumGEP1Ops < NumGEP2Ops) {
      -      std::swap(GEP1Ops, GEP2Ops);
      -      std::swap(NumGEP1Ops, NumGEP2Ops);
      -    }
      -
      -    bool AllAreZeros = true;
      -    for (unsigned i = UnequalOper; i != MaxOperands; ++i)
      -      if (!isa<Constant>(GEP1Ops[i]) ||
      -          !cast<Constant>(GEP1Ops[i])->isNullValue()) {
      -        AllAreZeros = false;
      -        break;
      -      }
      -    if (AllAreZeros) return MustAlias;
      -  }
      -
      -
      -  // So now we know that the indexes derived from the base pointers,
      -  // which are known to alias, are different.  We can still determine a
      -  // no-alias result if there are differing constant pairs in the index
      -  // chain.  For example:
      -  //        A[i][0] != A[j][1] iff (&A[0][1]-&A[0][0] >= std::max(G1S, G2S))
      -  //
      -  // We have to be careful here about array accesses.  In particular, consider:
      -  //        A[1][0] vs A[0][i]
      -  // In this case, we don't *know* that the array will be accessed in bounds:
      -  // the index could even be negative.  Because of this, we have to
      -  // conservatively *give up* and return may alias.  We disregard differing
      -  // array subscripts that are followed by a variable index without going
      -  // through a struct.
      -  //
      -  unsigned SizeMax = std::max(G1S, G2S);
      -  if (SizeMax == ~0U) return MayAlias; // Avoid frivolous work.
      -
      -  // Scan for the first operand that is constant and unequal in the
      -  // two getelementptrs...
      -  unsigned FirstConstantOper = UnequalOper;
      -  for (; FirstConstantOper != MinOperands; ++FirstConstantOper) {
      -    const Value *G1Oper = GEP1Ops[FirstConstantOper];
      -    const Value *G2Oper = GEP2Ops[FirstConstantOper];
      -
      -    if (G1Oper != G2Oper)   // Found non-equal constant indexes...
      -      if (Constant *G1OC = dyn_cast(const_cast(G1Oper)))
      -        if (Constant *G2OC = dyn_cast(const_cast(G2Oper))){
      -          if (G1OC->getType() != G2OC->getType()) {
      -            // Sign extend both operands to long.
      -            const Type *Int64Ty = Type::getInt64Ty(G1OC->getContext());
      -            if (G1OC->getType() != Int64Ty)
      -              G1OC = ConstantExpr::getSExt(G1OC, Int64Ty);
      -            if (G2OC->getType() != Int64Ty) 
      -              G2OC = ConstantExpr::getSExt(G2OC, Int64Ty);
      -            GEP1Ops[FirstConstantOper] = G1OC;
      -            GEP2Ops[FirstConstantOper] = G2OC;
      -          }
      -          
      -          if (G1OC != G2OC) {
      -            // Handle the "be careful" case above: if this is an array/vector
      -            // subscript, scan for a subsequent variable array index.
      -            if (const SequentialType *STy =
      -                  dyn_cast<SequentialType>(BasePtr1Ty)) {
      -              const Type *NextTy = STy;
      -              bool isBadCase = false;
      -              
      -              for (unsigned Idx = FirstConstantOper;
      -                   Idx != MinOperands && isa(NextTy); ++Idx) {
      -                const Value *V1 = GEP1Ops[Idx], *V2 = GEP2Ops[Idx];
      -                if (!isa(V1) || !isa(V2)) {
      -                  isBadCase = true;
      -                  break;
      -                }
      -                // If the array is indexed beyond the bounds of the static type
      -                // at this level, it will also fall into the "be careful" case.
      -                // It would theoretically be possible to analyze these cases,
      -                // but for now just be conservatively correct.
      -                if (const ArrayType *ATy = dyn_cast<ArrayType>(STy))
      -                  if (cast<ConstantInt>(G1OC)->getZExtValue() >=
      -                        ATy->getNumElements() ||
      -                      cast<ConstantInt>(G2OC)->getZExtValue() >=
      -                        ATy->getNumElements()) {
      -                    isBadCase = true;
      -                    break;
      -                  }
      -                if (const VectorType *VTy = dyn_cast<VectorType>(STy))
      -                  if (cast<ConstantInt>(G1OC)->getZExtValue() >=
      -                        VTy->getNumElements() ||
      -                      cast<ConstantInt>(G2OC)->getZExtValue() >=
      -                        VTy->getNumElements()) {
      -                    isBadCase = true;
      -                    break;
      -                  }
      -                STy = cast<SequentialType>(NextTy);
      -                NextTy = cast<SequentialType>(NextTy)->getElementType();
      -              }
      -              
      -              if (isBadCase) G1OC = 0;
      -            }
      -
      -            // Make sure they are comparable (ie, not constant expressions), and
      -            // make sure the GEP with the smaller leading constant is GEP1.
      -            if (G1OC) {
      -              Constant *Compare = ConstantExpr::getICmp(ICmpInst::ICMP_SGT, 
      -                                                        G1OC, G2OC);
      -              if (ConstantInt *CV = dyn_cast<ConstantInt>(Compare)) {
      -                if (CV->getZExtValue()) {  // If they are comparable and G2 > G1
      -                  std::swap(GEP1Ops, GEP2Ops);  // Make GEP1 < GEP2
      -                  std::swap(NumGEP1Ops, NumGEP2Ops);
      -                }
      -                break;
      -              }
      -            }
      -          }
      -        }
      -    BasePtr1Ty = cast<CompositeType>(BasePtr1Ty)->getTypeAtIndex(G1Oper);
      -  }
      -
      -  // No shared constant operands, and we ran out of common operands.  At this
      -  // point, the GEP instructions have run through all of their operands, and we
      -  // haven't found evidence that there are any deltas between the GEP's.
      -  // However, one GEP may have more operands than the other.  If this is the
      -  // case, there may still be hope.  Check this now.
      -  if (FirstConstantOper == MinOperands) {
      -    // Without TargetData, we won't know what the offsets are.
      -    if (!TD)
      -      return MayAlias;
      -
      -    // Make GEP1Ops be the longer one if there is a longer one.
      -    if (NumGEP1Ops < NumGEP2Ops) {
      -      std::swap(GEP1Ops, GEP2Ops);
      -      std::swap(NumGEP1Ops, NumGEP2Ops);
      -    }
      -
      -    // Is there anything to check?
      -    if (NumGEP1Ops > MinOperands) {
      -      for (unsigned i = FirstConstantOper; i != MaxOperands; ++i)
      -        if (isa<ConstantInt>(GEP1Ops[i]) && 
      -            !cast<ConstantInt>(GEP1Ops[i])->isZero()) {
      -          // Yup, there's a constant in the tail.  Set all variables to
      -          // constants in the GEP instruction to make it suitable for
      -          // TargetData::getIndexedOffset.
      -          for (i = 0; i != MaxOperands; ++i)
      -            if (!isa(GEP1Ops[i]))
      -              GEP1Ops[i] = Constant::getNullValue(GEP1Ops[i]->getType());
      -          // Okay, now get the offset.  This is the relative offset for the full
      -          // instruction.
      -          int64_t Offset1 = TD->getIndexedOffset(GEPPointerTy, GEP1Ops,
      -                                                 NumGEP1Ops);
      -
      -          // Now check without any constants at the end.
      -          int64_t Offset2 = TD->getIndexedOffset(GEPPointerTy, GEP1Ops,
      -                                                 MinOperands);
      -
      -          // Make sure we compare the absolute difference.
      -          if (Offset1 > Offset2)
      -            std::swap(Offset1, Offset2);
      -
      -          // If the tail provided a bit enough offset, return noalias!
      -          if ((uint64_t)(Offset2-Offset1) >= SizeMax)
      -            return NoAlias;
      -          // Otherwise break - we don't look for another constant in the tail.
      -          break;
      -        }
      -    }
      -
      -    // Couldn't find anything useful.
      -    return MayAlias;
      -  }
      -
      -  // If there are non-equal constants arguments, then we can figure
      -  // out a minimum known delta between the two index expressions... at
      -  // this point we know that the first constant index of GEP1 is less
      -  // than the first constant index of GEP2.
      -
      -  // Advance BasePtr[12]Ty over this first differing constant operand.
      -  BasePtr2Ty = cast<CompositeType>(BasePtr1Ty)->
      -      getTypeAtIndex(GEP2Ops[FirstConstantOper]);
      -  BasePtr1Ty = cast<CompositeType>(BasePtr1Ty)->
      -      getTypeAtIndex(GEP1Ops[FirstConstantOper]);
      -
      -  // We are going to be using TargetData::getIndexedOffset to determine the
      -  // offset that each of the GEP's is reaching.  To do this, we have to convert
      -  // all variable references to constant references.  To do this, we convert the
      -  // initial sequence of array subscripts into constant zeros to start with.
      -  const Type *ZeroIdxTy = GEPPointerTy;
      -  for (unsigned i = 0; i != FirstConstantOper; ++i) {
      -    if (!isa(ZeroIdxTy))
      -      GEP1Ops[i] = GEP2Ops[i] = 
      -              Constant::getNullValue(Type::getInt32Ty(ZeroIdxTy->getContext()));
      -
      -    if (const CompositeType *CT = dyn_cast<CompositeType>(ZeroIdxTy))
      -      ZeroIdxTy = CT->getTypeAtIndex(GEP1Ops[i]);
      -  }
      -
      -  // We know that GEP1Ops[FirstConstantOper] & GEP2Ops[FirstConstantOper] are ok
      -
      -  // Loop over the rest of the operands...
      -  for (unsigned i = FirstConstantOper+1; i != MaxOperands; ++i) {
      -    const Value *Op1 = i < NumGEP1Ops ? GEP1Ops[i] : 0;
      -    const Value *Op2 = i < NumGEP2Ops ? GEP2Ops[i] : 0;
      -    // If they are equal, use a zero index...
      -    if (Op1 == Op2 && BasePtr1Ty == BasePtr2Ty) {
      -      if (!isa(Op1))
      -        GEP1Ops[i] = GEP2Ops[i] = Constant::getNullValue(Op1->getType());
      -      // Otherwise, just keep the constants we have.
      -    } else {
      -      if (Op1) {
      -        if (const ConstantInt *Op1C = dyn_cast<ConstantInt>(Op1)) {
      -          // If this is an array index, make sure the array element is in range.
      -          if (const ArrayType *AT = dyn_cast<ArrayType>(BasePtr1Ty)) {
      -            if (Op1C->getZExtValue() >= AT->getNumElements())
      -              return MayAlias;  // Be conservative with out-of-range accesses
      -          } else if (const VectorType *VT = dyn_cast<VectorType>(BasePtr1Ty)) {
      -            if (Op1C->getZExtValue() >= VT->getNumElements())
      -              return MayAlias;  // Be conservative with out-of-range accesses
      -          }
      -          
      -        } else {
      -          // GEP1 is known to produce a value less than GEP2.  To be
      -          // conservatively correct, we must assume the largest possible
      -          // constant is used in this position.  This cannot be the initial
      -          // index to the GEP instructions (because we know we have at least one
      -          // element before this one with the different constant arguments), so
      -          // we know that the current index must be into either a struct or
      -          // array.  Because we know it's not constant, this cannot be a
      -          // structure index.  Because of this, we can calculate the maximum
      -          // value possible.
      -          //
      -          if (const ArrayType *AT = dyn_cast<ArrayType>(BasePtr1Ty))
      -            GEP1Ops[i] =
      -                  ConstantInt::get(Type::getInt64Ty(AT->getContext()),
      -                                   AT->getNumElements()-1);
      -          else if (const VectorType *VT = dyn_cast<VectorType>(BasePtr1Ty))
      -            GEP1Ops[i] = 
      -                  ConstantInt::get(Type::getInt64Ty(VT->getContext()),
      -                                   VT->getNumElements()-1);
      -        }
      -      }
      -
      -      if (Op2) {
      -        if (const ConstantInt *Op2C = dyn_cast<ConstantInt>(Op2)) {
      -          // If this is an array index, make sure the array element is in range.
      -          if (const ArrayType *AT = dyn_cast<ArrayType>(BasePtr2Ty)) {
      -            if (Op2C->getZExtValue() >= AT->getNumElements())
      -              return MayAlias;  // Be conservative with out-of-range accesses
      -          } else if (const VectorType *VT = dyn_cast<VectorType>(BasePtr2Ty)) {
      -            if (Op2C->getZExtValue() >= VT->getNumElements())
      -              return MayAlias;  // Be conservative with out-of-range accesses
      -          }
      -        } else {  // Conservatively assume the minimum value for this index
      -          GEP2Ops[i] = Constant::getNullValue(Op2->getType());
      -        }
      -      }
      -    }
      -
      -    if (BasePtr1Ty && Op1) {
      -      if (const CompositeType *CT = dyn_cast<CompositeType>(BasePtr1Ty))
      -        BasePtr1Ty = CT->getTypeAtIndex(GEP1Ops[i]);
      -      else
      -        BasePtr1Ty = 0;
      -    }
      -
      -    if (BasePtr2Ty && Op2) {
      -      if (const CompositeType *CT = dyn_cast<CompositeType>(BasePtr2Ty))
      -        BasePtr2Ty = CT->getTypeAtIndex(GEP2Ops[i]);
      -      else
      -        BasePtr2Ty = 0;
      -    }
      -  }
      -
      -  if (TD && GEPPointerTy->getElementType()->isSized()) {
      -    int64_t Offset1 =
      -      TD->getIndexedOffset(GEPPointerTy, GEP1Ops, NumGEP1Ops);
      -    int64_t Offset2 = 
      -      TD->getIndexedOffset(GEPPointerTy, GEP2Ops, NumGEP2Ops);
      -    assert(Offset1 != Offset2 &&
      -           "There is at least one different constant here!");
      -    
      -    // Make sure we compare the absolute difference.
      -    if (Offset1 > Offset2)
      -      std::swap(Offset1, Offset2);
      -    
      -    if ((uint64_t)(Offset2-Offset1) >= SizeMax) {
      -      //cerr << "Determined that these two GEP's don't alias ["
      -      //     << SizeMax << " bytes]: \n" << *GEP1 << *GEP2;
      -      return NoAlias;
      -    }
      -  }
      -  return MayAlias;
      -}
      -
       // Make sure that anything that uses AliasAnalysis pulls in this file.
       DEFINING_FILE_FOR(BasicAliasAnalysis)
      
      Modified: llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll?rev=89922&r1=89921&r2=89922&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll (original)
      +++ llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll Wed Nov 25 20:17:34 2009
      @@ -1,7 +1,9 @@
      -; RUN: opt < %s -aa-eval -print-all-alias-modref-info -disable-output |& grep {MustAlias:.*%R,.*%r}
      +; RUN: opt < %s -gvn -instcombine -S |& FileCheck %s
       ; Make sure that basicaa thinks R and r are must aliases.
       
      -define i32 @test(i8 * %P) {
      +target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
      +
      +define i32 @test1(i8 * %P) {
       entry:
       	%Q = bitcast i8* %P to {i32, i32}*
       	%R = getelementptr {i32, i32}* %Q, i32 0, i32 1
      @@ -13,4 +15,59 @@
       
       	%t = sub i32 %S, %s
       	ret i32 %t
      +; CHECK: @test1
      +; CHECK: ret i32 0
      +}
      +
      +define i32 @test2(i8 * %P) {
      +entry:
      +	%Q = bitcast i8* %P to {i32, i32, i32}*
      +	%R = getelementptr {i32, i32, i32}* %Q, i32 0, i32 1
      +	%S = load i32* %R
      +
      +	%r = getelementptr {i32, i32, i32}* %Q, i32 0, i32 2
      +  store i32 42, i32* %r
      +
      +	%s = load i32* %R
      +
      +	%t = sub i32 %S, %s
      +	ret i32 %t
      +; CHECK: @test2
      +; CHECK: ret i32 0
      +}
      +
      +
      +; This was a miscompilation.
      +define i32 @test3({float, {i32, i32, i32}}* %P) {
      +entry:
      +  %P2 = getelementptr {float, {i32, i32, i32}}* %P, i32 0, i32 1
      +	%R = getelementptr {i32, i32, i32}* %P2, i32 0, i32 1
      +	%S = load i32* %R
      +
      +	%r = getelementptr {i32, i32, i32}* %P2, i32 0, i32 2
      +  store i32 42, i32* %r
      +
      +	%s = load i32* %R
      +
      +	%t = sub i32 %S, %s
      +	ret i32 %t
      +; CHECK: @test3
      +; CHECK: ret i32 0
      +}
      +
      +
      +;; This is reduced from the SmallPtrSet constructor.
      +%SmallPtrSetImpl = type { i8**, i32, i32, i32, [1 x i8*] }
      +%SmallPtrSet64 = type { %SmallPtrSetImpl, [64 x i8*] }
      +
      +define i32 @test4(%SmallPtrSet64* %P) {
      +entry:
      +  %tmp2 = getelementptr inbounds %SmallPtrSet64* %P, i64 0, i32 0, i32 1
      +  store i32 64, i32* %tmp2, align 8
      +  %tmp3 = getelementptr inbounds %SmallPtrSet64* %P, i64 0, i32 0, i32 4, i64 64
      +  store i8* null, i8** %tmp3, align 8
      +  %tmp4 = load i32* %tmp2, align 8
      +	ret i32 %tmp4
      +; CHECK: @test4
      +; CHECK: ret i32 64
       }
      
      
      
      
      From clattner at apple.com  Wed Nov 25 20:21:04 2009
      From: clattner at apple.com (Chris Lattner)
      Date: Wed, 25 Nov 2009 18:21:04 -0800
      Subject: [llvm-commits] [llvm] r89663
      	-	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      In-Reply-To: <4B0AF85E.4090709@free.fr>
      References: <200911231644.nANGiiSl030197@zion.cs.uiuc.edu>
      	<4B0AF85E.4090709@free.fr>
      Message-ID: <0D454083-C03A-4B5D-9D2B-055F3EA610E0@apple.com>
      
      
      On Nov 23, 2009, at 1:02 PM, Duncan Sands wrote:
      
      > Hi Chris, I tried to understand what you are doing here but failed,
      > feel like explaining some more?  I don't see how capture/nocapture
      > is relevant here...
      
      The basic idea is that this is trying to decide whether a call could mod/ref a non-escaping alloca.  If the alloca is passed as an argument to a call, then it will be escaping, unless the call argument is marked nocapture.  Because of this, we only need to check nocapture arguments.
      
      -Chris
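
A rough sketch of the argument above, for readers following along. This is not the code from r89663: `CallArg` and `callMayAccessLocal` are hypothetical names, and pointer identity is modelled with a plain integer id where the real analysis would ask whether the argument may alias the alloca. It assumes the caller has already established that the local object does not escape.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one pointer argument at a call site.
struct CallArg {
  int objectId;    // which underlying object the argument points at (toy model)
  bool noCapture;  // true if the parameter is marked nocapture
};

// Precondition (assumed, not checked here): `localId` names a non-escaping
// alloca.  Such an object can only reach the callee through a nocapture
// argument, since any other use as an argument would have made it escape.
// So only nocapture arguments need to be inspected.
bool callMayAccessLocal(const std::vector<CallArg> &args, int localId) {
  for (const CallArg &a : args)
    if (a.noCapture && a.objectId == localId)
      return true;   // the callee can read or write the local through this arg
  return false;      // the call cannot see the local at all
}
```

With arguments `{1, nocapture}` and `{2, captured}`, a query for local 1 answers "may access", while a query for an unrelated non-escaping local 3 answers "no access".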
      
      
      
      
      From clattner at apple.com  Wed Nov 25 20:33:27 2009
      From: clattner at apple.com (Chris Lattner)
      Date: Wed, 25 Nov 2009 18:33:27 -0800
      Subject: [llvm-commits] [llvm] r89877 -
      	/llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.h
      In-Reply-To: <20AB4AAE-93CF-46FB-8788-F53C1107E490@apple.com>
      References: <200911251957.nAPJvEma011366@zion.cs.uiuc.edu>
      	<6CB2B61C-657A-4260-9717-1267F605009E@apple.com>
      	<20AB4AAE-93CF-46FB-8788-F53C1107E490@apple.com>
      Message-ID: <06AC226D-4505-40A4-93EA-18A39047A994@apple.com>
      
      
      On Nov 25, 2009, at 4:45 PM, Bob Wilson wrote:
      
      > 
      > On Nov 25, 2009, at 4:18 PM, Chris Lattner wrote:
      > 
      >> 
      >> On Nov 25, 2009, at 11:57 AM, Bob Wilson wrote:
      >> 
      >>> Author: bwilson
      >>> Date: Wed Nov 25 13:57:14 2009
      >>> New Revision: 89877
      >>> 
      >>> URL: http://llvm.org/viewvc/llvm-project?rev=89877&view=rev
      >>> Log:
      >>> Tail duplicate indirect branches for PowerPC, too.
      >>> With the testcase for pr3120, the "threaded interpreter" runtime decreases
      >>> from 1788 to 1413 with this change.
      >> 
      >> Very nice Bob!  Silly question: should isProfitableToDuplicateIndirectBranch default to true?  That would let targets opt out if it is not beneficial and would save a small bit of code size.
      > 
      > Not silly at all.... I had the same thought.
      > 
      > Having spent more time looking at the effects of this transformation, I have a different proposal that I think you might like.  Let's just get rid of that target hook altogether.  ;-)
      
      You are very persuasive, I'm convinced!!
      
      > This special treatment of indirect branches for tail duplication just doesn't kick in very often.  I was being cautious about blowing up code size, but it doesn't happen often enough to matter.  I also wasn't sure if this would matter on anything besides ARM Cortex processors but now we know that it does.
      
      Right.  Indirect gotos just aren't that common.
      
      > On a related note, Evan suggested that we be more aggressive about this with -Os, and I think that's also a good idea.
      > 
      > Unless someone objects, I'll go ahead with those changes sometime soon.
      
      Sounds great, thanks Bob!
      
      -Chris
       
      
      
      From espindola at google.com  Wed Nov 25 22:18:11 2009
      From: espindola at google.com (Rafael Espindola)
      Date: Wed, 25 Nov 2009 23:18:11 -0500
      Subject: [llvm-commits] [llvm-gcc-4.2] r86892 -
      	/llvm-gcc-4.2/trunk/gcc/llvm-abi.h
      In-Reply-To: <35038F04-ED7A-4D60-ABD5-0CADD56536D2@apple.com>
      References: <200911112305.nABN5jd1010544@zion.cs.uiuc.edu>
      	<860AC9D2-EA76-476F-9B1D-DA0E7BB0A651@apple.com>
      	<35038F04-ED7A-4D60-ABD5-0CADD56536D2@apple.com>
      Message-ID: <38a0d8450911252018p3930f82cg43ebe9f6ded326c0@mail.gmail.com>
      
      2009/11/12 Bob Wilson:
      > It's breaking something on ARM, too.  I'll revert it for now.
      >
      > I did run gcc's compat tests, but not on PPC.  I guess I'll need to do
      > that.
      
      Can you try the patch I attached to the bug?
      
      Is there an easy way to get just a pass/fail list from the test-suite?
      I found the testcase in the bug report by doing
      
      gmake TEST=nightly report report.html
      
      in the SingleSource directory both with and without r86892 and
      searching for FAIL in the output.
      
      Cheers,
      -- 
      Rafael Ávila de Espíndola
      
      
      
      From baldrick at free.fr  Thu Nov 26 03:03:48 2009
      From: baldrick at free.fr (Duncan Sands)
      Date: Thu, 26 Nov 2009 10:03:48 +0100
      Subject: [llvm-commits] [llvm] r89663
      	-	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      In-Reply-To: <0D454083-C03A-4B5D-9D2B-055F3EA610E0@apple.com>
      References: <200911231644.nANGiiSl030197@zion.cs.uiuc.edu>
      	<4B0AF85E.4090709@free.fr>
      	<0D454083-C03A-4B5D-9D2B-055F3EA610E0@apple.com>
      Message-ID: <4B0E4474.5030209@free.fr>
      
      Hi Chris,
      
      > The basic idea is that this is trying to decide whether a call could mod/ref a non-escaping alloca.  If the alloca is passed as an argument to a call, then it will be escaping, unless the call argument is marked nocapture.  Because of this, we only need to check nocapture arguments.
      
      I see, that makes sense.  I noticed that there is no logic about readonly calls
      in this routine - is this handled elsewhere?
      
      Ciao,
      
      Duncan.
      
      
      From clattner at apple.com  Thu Nov 26 09:06:29 2009
      From: clattner at apple.com (Chris Lattner)
      Date: Thu, 26 Nov 2009 07:06:29 -0800
      Subject: [llvm-commits] [llvm] r89663 -
      	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      In-Reply-To: <4B0E4474.5030209@free.fr>
      References: <200911231644.nANGiiSl030197@zion.cs.uiuc.edu>
      	<4B0AF85E.4090709@free.fr>
      	<0D454083-C03A-4B5D-9D2B-055F3EA610E0@apple.com>
      	<4B0E4474.5030209@free.fr>
      Message-ID: <63DB33A4-C113-40BB-8EFA-08BCC0D4439D@apple.com>
      
      
      On Nov 26, 2009, at 1:03 AM, Duncan Sands wrote:
      
      > Hi Chris,
      > 
      >> The basic idea is that this is trying to decide whether a call could mod/ref a non-escaping alloca.  If the alloca is passed as an argument to a call, then it will be escaping, unless the call argument is marked nocapture.  Because of this, we only need to check nocapture arguments.
      > 
      > I see, that makes sense.  I noticed that there is no logic about readonly calls
      > in this routine - is this handled elsewhere?
      
      Yep, the code falls back to AliasAnalysis::getModRefInfo if it can't do any better.  That code handles "accesses arguments", readonly, etc.
      
      -Chris
      
      
      From sabre at nondot.org  Thu Nov 26 10:08:42 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 16:08:42 -0000
      Subject: [llvm-commits] [llvm] r89950 - in
       /llvm/trunk/test/Analysis/BasicAA: 2008-12-09-GEP-IndicesAlias.ll
       gep-alias.ll
      Message-ID: <200911261608.nAQG8gXV002962@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 10:08:41 2009
      New Revision: 89950
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89950&view=rev
      Log:
      rename test
      
      Added:
          llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
            - copied unchanged from r89948, llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll
      Removed:
          llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll
      
      Removed: llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll?rev=89949&view=auto
      
      ==============================================================================
      --- llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll (original)
      +++ llvm/trunk/test/Analysis/BasicAA/2008-12-09-GEP-IndicesAlias.ll (removed)
      @@ -1,73 +0,0 @@
      -; RUN: opt < %s -gvn -instcombine -S |& FileCheck %s
      -; Make sure that basicaa thinks R and r are must aliases.
      -
      -target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
      -
      -define i32 @test1(i8 * %P) {
      -entry:
      -	%Q = bitcast i8* %P to {i32, i32}*
      -	%R = getelementptr {i32, i32}* %Q, i32 0, i32 1
      -	%S = load i32* %R
      -
      -	%q = bitcast i8* %P to {i32, i32}*
      -	%r = getelementptr {i32, i32}* %q, i32 0, i32 1
      -	%s = load i32* %r
      -
      -	%t = sub i32 %S, %s
      -	ret i32 %t
      -; CHECK: @test1
      -; CHECK: ret i32 0
      -}
      -
      -define i32 @test2(i8 * %P) {
      -entry:
      -	%Q = bitcast i8* %P to {i32, i32, i32}*
      -	%R = getelementptr {i32, i32, i32}* %Q, i32 0, i32 1
      -	%S = load i32* %R
      -
      -	%r = getelementptr {i32, i32, i32}* %Q, i32 0, i32 2
      -  store i32 42, i32* %r
      -
      -	%s = load i32* %R
      -
      -	%t = sub i32 %S, %s
      -	ret i32 %t
      -; CHECK: @test2
      -; CHECK: ret i32 0
      -}
      -
      -
      -; This was a miscompilation.
      -define i32 @test3({float, {i32, i32, i32}}* %P) {
      -entry:
      -  %P2 = getelementptr {float, {i32, i32, i32}}* %P, i32 0, i32 1
      -	%R = getelementptr {i32, i32, i32}* %P2, i32 0, i32 1
      -	%S = load i32* %R
      -
      -	%r = getelementptr {i32, i32, i32}* %P2, i32 0, i32 2
      -  store i32 42, i32* %r
      -
      -	%s = load i32* %R
      -
      -	%t = sub i32 %S, %s
      -	ret i32 %t
      -; CHECK: @test3
      -; CHECK: ret i32 0
      -}
      -
      -
      -;; This is reduced from the SmallPtrSet constructor.
      -%SmallPtrSetImpl = type { i8**, i32, i32, i32, [1 x i8*] }
      -%SmallPtrSet64 = type { %SmallPtrSetImpl, [64 x i8*] }
      -
      -define i32 @test4(%SmallPtrSet64* %P) {
      -entry:
      -  %tmp2 = getelementptr inbounds %SmallPtrSet64* %P, i64 0, i32 0, i32 1
      -  store i32 64, i32* %tmp2, align 8
      -  %tmp3 = getelementptr inbounds %SmallPtrSet64* %P, i64 0, i32 0, i32 4, i64 64
      -  store i8* null, i8** %tmp3, align 8
      -  %tmp4 = load i32* %tmp2, align 8
      -	ret i32 %tmp4
      -; CHECK: @test4
      -; CHECK: ret i32 64
      -}
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 10:18:10 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 16:18:10 -0000
      Subject: [llvm-commits] [llvm] r89951 - in /llvm/trunk:
       lib/Analysis/BasicAliasAnalysis.cpp lib/Target/README.txt
       test/Analysis/BasicAA/gep-alias.ll
      Message-ID: <200911261618.nAQGIBxe003273@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 10:18:10 2009
      New Revision: 89951
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89951&view=rev
      Log:
      teach basicaa that A[i] != A[i+1].
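
The idea behind this change can be sketched outside of LLVM roughly as follows. This is a toy model, not the patch itself: `Expr`, `Linear`, `getLinear` and `provablyDisjoint` are hypothetical names, only "variable + constant" is peeled (mirroring the Add case in the patch), and integer wraparound is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Toy index expression: an opaque variable, or "lhs + constant".
// Hypothetical stand-in for an llvm::Value; not LLVM's API.
struct Expr {
  enum Kind { Var, AddConst } kind;
  int id;           // identity of the variable (Var only)
  const Expr *lhs;  // operand (AddConst only)
  int64_t rhs;      // constant addend (AddConst only)
};

// An index viewed as scale*base + offset.
struct Linear {
  const Expr *base;
  int64_t scale;
  int64_t offset;
};

// Peel constant adds off `e`, accumulating them into the offset.
Linear getLinear(const Expr *e) {
  if (e->kind == Expr::AddConst) {
    Linear l = getLinear(e->lhs);
    l.offset += e->rhs;
    return l;
  }
  return {e, 1, 0};  // anything else is treated as 1*e + 0
}

// True if accesses of `accessSize` bytes at p[a] and p[b] cannot overlap,
// where each element of p occupies `elemSize` bytes.
bool provablyDisjoint(const Expr *a, const Expr *b,
                      int64_t elemSize, int64_t accessSize) {
  Linear la = getLinear(a), lb = getLinear(b);
  if (la.base != lb.base || la.scale != lb.scale)
    return false;  // different symbolic parts: nothing can be concluded
  int64_t delta = (la.offset - lb.offset) * la.scale * elemSize;
  if (delta < 0) delta = -delta;
  return delta >= accessSize;  // byte distance covers the access size
}
```

For `i` and `i+1` indexing an array of 4-byte elements, the symbolic parts cancel and the remaining distance is 4 bytes, so two 4-byte accesses are reported as disjoint; comparing `i` against itself gives distance 0 and no such conclusion.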
      
      
      Modified:
          llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
          llvm/trunk/lib/Target/README.txt
          llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      
      Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89951&r1=89950&r2=89951&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Thu Nov 26 10:18:10 2009
      @@ -378,6 +378,30 @@
         return NoAA::getModRefInfo(CS1, CS2);
       }
       
      +/// GetLinearExpression - Analyze the specified value as a linear expression:
      +/// "A*V + B".  Return the scale and offset values as APInts and return V as a
      +/// Value*.  The incoming Value is known to be a scalar integer.
      +static Value *GetLinearExpression(Value *V, APInt &Scale, APInt &Offset) {
      +  assert(isa<IntegerType>(V->getType()) && "Not an integer value");
      +  
      +  if (BinaryOperator *BOp = dyn_cast<BinaryOperator>(V)) {
      +    if (ConstantInt *RHSC = dyn_cast<ConstantInt>(BOp->getOperand(1))) {
      +      switch (BOp->getOpcode()) {
      +      default: break;
      +      case Instruction::Add:
      +        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset);
      +        Offset += RHSC->getValue();
      +        return V;
      +      // TODO: SHL, MUL, OR.
      +      }
      +    }
      +  }
      +
      +  Scale = 1;
      +  Offset = 0;
      +  return V;
      +}
      +
       /// DecomposeGEPExpression - If V is a symbolic pointer expression, decompose it
       /// into a base pointer with a constant offset and a number of scaled symbolic
       /// offsets.
      @@ -456,6 +480,14 @@
             // TODO: Could handle linear expressions here like A[X+1], also A[X*4|1].
             uint64_t Scale = TD->getTypeAllocSize(*GTI);
             
      +      unsigned Width = cast<IntegerType>(Index->getType())->getBitWidth();
      +      APInt IndexScale(Width, 0), IndexOffset(Width, 0);
      +      Index = GetLinearExpression(Index, IndexScale, IndexOffset);
      +      
      +      Scale *= IndexScale.getZExtValue();
      +      BaseOffs += IndexOffset.getZExtValue()*Scale;
      +      
      +      
             // If we already had an occurrance of this index variable, merge this
             // scale into it.  For example, we want to handle:
             //   A[x][x] -> x*16 + x*4 -> x*20
      
      Modified: llvm/trunk/lib/Target/README.txt
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/README.txt?rev=89951&r1=89950&r2=89951&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Target/README.txt (original)
      +++ llvm/trunk/lib/Target/README.txt Thu Nov 26 10:18:10 2009
      @@ -458,8 +458,6 @@
         }
       }
       
      -BasicAA also doesn't do this for add.  It needs to know that &A[i+1] != &A[i].
      -
       //===---------------------------------------------------------------------===//
       
       We should investigate an instruction sinking pass.  Consider this silly
      
      Modified: llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/BasicAA/gep-alias.ll?rev=89951&r1=89950&r2=89951&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Analysis/BasicAA/gep-alias.ll (original)
      +++ llvm/trunk/test/Analysis/BasicAA/gep-alias.ll Thu Nov 26 10:18:10 2009
      @@ -71,3 +71,19 @@
       ; CHECK: @test4
       ; CHECK: ret i32 64
       }
      +
      +; P[i] != p[i+1]
      +define i32 @test5(i32* %p, i64 %i) {
      +  %pi = getelementptr i32* %p, i64 %i
      +  %i.next = add i64 %i, 1
      +  %pi.next = getelementptr i32* %p, i64 %i.next
      +  %x = load i32* %pi
      +  store i32 42, i32* %pi.next
      +  %y = load i32* %pi
      +  %z = sub i32 %x, %y
      +  ret i32 %z
      +; CHECK: @test5
      +; CHECK: ret i32 0
      +}
      +
      +
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 10:26:43 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 16:26:43 -0000
      Subject: [llvm-commits] [llvm] r89952 - in /llvm/trunk:
       lib/Analysis/BasicAliasAnalysis.cpp lib/Target/README.txt
       test/Analysis/BasicAA/gep-alias.ll
      Message-ID: <200911261626.nAQGQiEV003805@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 10:26:43 2009
      New Revision: 89952
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89952&view=rev
      Log:
      Teach basicaa that x|c == x+c when the c bits of x are clear.  This
      allows us to compile the example in readme.txt into:
      
      LBB1_1:                                                     ## %bb
      	movl	4(%rdx,%rax), %ecx
      	movl	%ecx, %esi
      	imull	(%rdx,%rax), %esi
      	imull	%esi, %ecx
      	movl	%esi, 8(%rdx,%rax)
      	imull	%ecx, %esi
      	movl	%ecx, 12(%rdx,%rax)
      	movl	%esi, 16(%rdx,%rax)
      	imull	%ecx, %esi
      	movl	%esi, 20(%rdx,%rax)
      	addq	$16, %rax
      	cmpq	$4000, %rax
      	jne	LBB1_1
      
      instead of:
      
      LBB1_1: 
      	movl	(%rdx,%rax), %ecx
      	imull	4(%rdx,%rax), %ecx
      	movl	%ecx, 8(%rdx,%rax)
      	imull	4(%rdx,%rax), %ecx
      	movl	%ecx, 12(%rdx,%rax)
      	imull	8(%rdx,%rax), %ecx
      	movl	%ecx, 16(%rdx,%rax)
      	imull	12(%rdx,%rax), %ecx
      	movl	%ecx, 20(%rdx,%rax)
      	addq	$16, %rax
      	cmpq	$4000, %rax
      	jne	LBB1_1
      
      GCC (4.2) doesn't seem to be able to eliminate the loads in this
      testcase either; it generates:
      
      L2:
      	movl	(%rdx), %eax
      	imull	4(%rdx), %eax
      	movl	%eax, 8(%rdx)
      	imull	4(%rdx), %eax
      	movl	%eax, 12(%rdx)
      	imull	8(%rdx), %eax
      	movl	%eax, 16(%rdx)
      	imull	12(%rdx), %eax
      	movl	%eax, 20(%rdx)
      	addl	$4, %ecx
      	addq	$16, %rdx
      	cmpl	$1002, %ecx
      	jne	L2
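
The identity the log message relies on can be checked with a small standalone sketch: when x and c share no set bits, there are no carries, so x|c and x+c agree. The helper names below are hypothetical and are not part of LLVM. The real check in the patch is MaskedValueIsZero, which reasons about known bits rather than concrete values.

```cpp
#include <cassert>
#include <cstdint>

// True when x and c share no set bits, i.e. "the c bits of x are clear".
bool bitsDisjoint(uint64_t x, uint64_t c) { return (x & c) == 0; }

// Hypothetical check: for every i below limit, x = i << shift must have the
// c bits clear and must satisfy x|c == x+c.  Returns false on the first
// value where either condition fails.  Assumes shift < 64.
bool identityHoldsForShiftedIndex(uint64_t limit, unsigned shift, uint64_t c) {
  for (uint64_t i = 0; i < limit; ++i) {
    uint64_t x = i << shift;
    if (!bitsDisjoint(x, c)) return false;
    if ((x | c) != (x + c)) return false;
  }
  return true;
}

// Usage sketch for the README example: x*4 has its low two bits clear, so
// x*4 | 1 behaves like x*4 + 1.
void demoOrAsAdd() {
  assert(identityHoldsForShiftedIndex(1000, 2, 1));
}
```

When the bits overlap the identity fails: with x = 1 and c = 1, x|c is 1 but x+c is 2, which is why the patch gives up on the "or" unless the masked bits are known to be zero.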
      
      
      
      Modified:
          llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
          llvm/trunk/lib/Target/README.txt
          llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      
      Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89952&r1=89951&r2=89952&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Thu Nov 26 10:26:43 2009
      @@ -14,8 +14,6 @@
       //===----------------------------------------------------------------------===//
       
       #include "llvm/Analysis/AliasAnalysis.h"
      -#include "llvm/Analysis/CaptureTracking.h"
      -#include "llvm/Analysis/MemoryBuiltins.h"
       #include "llvm/Analysis/Passes.h"
       #include "llvm/Constants.h"
       #include "llvm/DerivedTypes.h"
      @@ -26,6 +24,9 @@
       #include "llvm/IntrinsicInst.h"
       #include "llvm/Operator.h"
       #include "llvm/Pass.h"
      +#include "llvm/Analysis/CaptureTracking.h"
      +#include "llvm/Analysis/MemoryBuiltins.h"
      +#include "llvm/Analysis/ValueTracking.h"
       #include "llvm/Target/TargetData.h"
       #include "llvm/ADT/SmallSet.h"
       #include "llvm/ADT/SmallVector.h"
      @@ -381,15 +382,22 @@
       /// GetLinearExpression - Analyze the specified value as a linear expression:
       /// "A*V + B".  Return the scale and offset values as APInts and return V as a
       /// Value*.  The incoming Value is known to be a scalar integer.
      -static Value *GetLinearExpression(Value *V, APInt &Scale, APInt &Offset) {
      +static Value *GetLinearExpression(Value *V, APInt &Scale, APInt &Offset,
      +                                  const TargetData *TD) {
         assert(isa<IntegerType>(V->getType()) && "Not an integer value");
         
         if (BinaryOperator *BOp = dyn_cast<BinaryOperator>(V)) {
           if (ConstantInt *RHSC = dyn_cast<ConstantInt>(BOp->getOperand(1))) {
             switch (BOp->getOpcode()) {
             default: break;
      +      case Instruction::Or:
      +        // X|C == X+C if all the bits in C are unset in X.  Otherwise we can't
      +        // analyze it.
      +        if (!MaskedValueIsZero(BOp->getOperand(0), RHSC->getValue(), TD))
      +          break;
      +        // FALL THROUGH.
             case Instruction::Add:
      -        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset);
      +        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
               Offset += RHSC->getValue();
               return V;
             // TODO: SHL, MUL, OR.
      @@ -482,7 +490,7 @@
             
             unsigned Width = cast<IntegerType>(Index->getType())->getBitWidth();
             APInt IndexScale(Width, 0), IndexOffset(Width, 0);
      -      Index = GetLinearExpression(Index, IndexScale, IndexOffset);
      +      Index = GetLinearExpression(Index, IndexScale, IndexOffset, TD);
             
             Scale *= IndexScale.getZExtValue();
             BaseOffs += IndexOffset.getZExtValue()*Scale;
      
      Modified: llvm/trunk/lib/Target/README.txt
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/README.txt?rev=89952&r1=89951&r2=89952&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Target/README.txt (original)
      +++ llvm/trunk/lib/Target/README.txt Thu Nov 26 10:26:43 2009
      @@ -443,23 +443,6 @@
       
       //===---------------------------------------------------------------------===//
       
      -"basicaa" should know how to look through "or" instructions that act like add
      -instructions.  For example in this code, the x*4+1 is turned into x*4 | 1, and
      -basicaa can't analyze the array subscript, leading to duplicated loads in the
      -generated code:
      -
      -void test(int X, int Y, int a[]) {
      -int i;
      -  for (i=2; i<1000; i+=4) {
      -  a[i+0] = a[i-1+0]*a[i-2+0];
      -  a[i+1] = a[i-1+1]*a[i-2+1];
      -  a[i+2] = a[i-1+2]*a[i-2+2];
      -  a[i+3] = a[i-1+3]*a[i-2+3];
      -  }
      -}
      -
      -//===---------------------------------------------------------------------===//
      -
       We should investigate an instruction sinking pass.  Consider this silly
       example in pic mode:
       
      
      Modified: llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/BasicAA/gep-alias.ll?rev=89952&r1=89951&r2=89952&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Analysis/BasicAA/gep-alias.ll (original)
      +++ llvm/trunk/test/Analysis/BasicAA/gep-alias.ll Thu Nov 26 10:26:43 2009
      @@ -86,4 +86,18 @@
       ; CHECK: ret i32 0
       }
       
      +; P[i] != p[(i*4)|1]
      +define i32 @test6(i32* %p, i64 %i1) {
      +  %i = shl i64 %i1, 2
      +  %pi = getelementptr i32* %p, i64 %i
      +  %i.next = or i64 %i, 1
      +  %pi.next = getelementptr i32* %p, i64 %i.next
      +  %x = load i32* %pi
      +  store i32 42, i32* %pi.next
      +  %y = load i32* %pi
      +  %z = sub i32 %x, %y
      +  ret i32 %z
      +; CHECK: @test6
      +; CHECK: ret i32 0
      +}
       
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 10:42:00 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 16:42:00 -0000
      Subject: [llvm-commits] [llvm] r89953 -
      	/llvm/trunk/test/Analysis/ScalarEvolution/scev-aa.ll
      Message-ID: <200911261642.nAQGg04R004287@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 10:42:00 2009
      New Revision: 89953
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89953&view=rev
      Log:
      update status of this.  basicaa is much improved now,
      only missing the one form (in this testcase).  Dan, do you
      consider this example to be important?
      
      
      Modified:
          llvm/trunk/test/Analysis/ScalarEvolution/scev-aa.ll
      
      Modified: llvm/trunk/test/Analysis/ScalarEvolution/scev-aa.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/ScalarEvolution/scev-aa.ll?rev=89953&r1=89952&r2=89953&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Analysis/ScalarEvolution/scev-aa.ll (original)
      +++ llvm/trunk/test/Analysis/ScalarEvolution/scev-aa.ll Thu Nov 26 10:42:00 2009
      @@ -1,8 +1,8 @@
       ; RUN: opt < %s -scev-aa -aa-eval -print-all-alias-modref-info \
       ; RUN:   |& FileCheck %s
       
      -; At the time of this writing, all of these CHECK lines are cases that
      -; plain -basicaa misses.
      +; At the time of this writing, -basicaa only misses the example of the form
      +; A[i+(j+1)] != A[i+j].  However, it does get A[(i+j)+1] != A[i+j].
       
       target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64"
       
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 10:52:32 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 16:52:32 -0000
      Subject: [llvm-commits] [llvm] r89954 -
      	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      Message-ID: <200911261652.nAQGqWwB004772@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 10:52:32 2009
      New Revision: 89954
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89954&view=rev
      Log:
      resolve a fixme.  I haven't figured out how to write a testcase
      to exercise this though.
      
      Modified:
          llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      
      Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89954&r1=89953&r2=89954&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Thu Nov 26 10:52:32 2009
      @@ -400,7 +400,7 @@
               V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
               Offset += RHSC->getValue();
               return V;
      -      // TODO: SHL, MUL, OR.
      +      // TODO: SHL, MUL.
             }
           }
         }
      @@ -610,10 +610,9 @@
           // Check to see if these two pointers are related by the getelementptr
           // instruction.  If one pointer is a GEP with a non-zero index of the other
           // pointer, we know they cannot alias.
      -    //
      -    // FIXME: The check below only looks at the size of one of the pointers, not
      -    // both, this may cause us to miss things.
      -    if (V1Size == ~0U || V2Size == ~0U)
      +
      +    // If both accesses are unknown size, we can't do anything useful here.
      +    if (V1Size == ~0U && V2Size == ~0U)
             return MayAlias;
       
           AliasResult R = aliasCheck(UnderlyingV1, ~0U, V2, V2Size);
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 11:00:01 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 17:00:01 -0000
      Subject: [llvm-commits] [llvm] r89955 - in /llvm/trunk:
       lib/Analysis/BasicAliasAnalysis.cpp test/Analysis/BasicAA/gep-alias.ll
      Message-ID: <200911261700.nAQH01H0005061@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 11:00:01 2009
      New Revision: 89955
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89955&view=rev
      Log:
      teach GetLinearExpression to be a bit more aggressive.
      
      Modified:
          llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
          llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      
      Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89955&r1=89954&r2=89955&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Thu Nov 26 11:00:01 2009
      @@ -400,7 +400,16 @@
               V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
               Offset += RHSC->getValue();
               return V;
      -      // TODO: SHL, MUL.
      +      case Instruction::Mul:
      +        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      +        Offset *= RHSC->getValue();
      +        Scale *= RHSC->getValue();
      +        return V;
      +      case Instruction::Shl:
      +        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      +        Offset <<= RHSC->getValue().getLimitedValue();
      +        Scale <<= RHSC->getValue().getLimitedValue();
      +        return V;
             }
           }
         }
      
      Modified: llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/BasicAA/gep-alias.ll?rev=89955&r1=89954&r2=89955&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Analysis/BasicAA/gep-alias.ll (original)
      +++ llvm/trunk/test/Analysis/BasicAA/gep-alias.ll Thu Nov 26 11:00:01 2009
      @@ -101,3 +101,18 @@
       ; CHECK: ret i32 0
       }
       
      +; P[1] != P[i*4]
      +define i32 @test7(i32* %p, i64 %i) {
      +  %pi = getelementptr i32* %p, i64 1
      +  %i.next = shl i64 %i, 2
      +  %pi.next = getelementptr i32* %p, i64 %i.next
      +  %x = load i32* %pi
      +  store i32 42, i32* %pi.next
      +  %y = load i32* %pi
      +  %z = sub i32 %x, %y
      +  ret i32 %z
      +; CHECK: @test7
      +; CHECK: ret i32 0
      +}
      +
      +
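
To make the new Mul and Shl cases concrete, here is a minimal standalone sketch of the same recursion over a toy expression type. It is not the LLVM code: `Expr`, `Op`, `Linear` and `getLinear` are hypothetical names, plain 64-bit wraparound arithmetic stands in for APInt, and shift amounts are assumed to be below 64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature IR: a leaf variable, or a binary operation whose
// right-hand side is a constant (the only shape the patch analyzes).
enum class Op { Var, Add, Mul, Shl };

struct Expr {
  Op op;
  const Expr *lhs;  // null for Var
  uint64_t rhs;     // constant operand; unused for Var
};

// Result of the analysis: the expression equals scale*var + offset.
struct Linear {
  const Expr *var;
  uint64_t scale;
  uint64_t offset;
};

Linear getLinear(const Expr *e) {
  if (e->op == Op::Var) return {e, 1, 0};
  Linear l = getLinear(e->lhs);
  switch (e->op) {
  case Op::Add:  // (A*V + B) + C == A*V + (B + C)
    l.offset += e->rhs;
    break;
  case Op::Mul:  // (A*V + B) * C == (A*C)*V + (B*C)
    l.scale *= e->rhs;
    l.offset *= e->rhs;
    break;
  case Op::Shl:  // shifting left by C multiplies both terms by 2^C
    l.scale <<= e->rhs;
    l.offset <<= e->rhs;
    break;
  case Op::Var:  // handled above
    break;
  }
  return l;
}

// Usage sketch: ((x + 1) * 4) << 1 == 8*x + 8.
void demoLinear() {
  Expr x{Op::Var, nullptr, 0};
  Expr a{Op::Add, &x, 1};
  Expr m{Op::Mul, &a, 4};
  Expr s{Op::Shl, &m, 1};
  Linear l = getLinear(&s);
  assert(l.var == &x && l.scale == 8 && l.offset == 8);
}
```

The point of scaling the offset along with the scale is that a multiply or shift applies to the whole inner expression, including any constant already accumulated from an inner add.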
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 11:12:51 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 17:12:51 -0000
      Subject: [llvm-commits] [llvm] r89956 - in /llvm/trunk:
       include/llvm/Analysis/ValueTracking.h lib/Analysis/BasicAliasAnalysis.cpp
       lib/Analysis/ValueTracking.cpp
      Message-ID: <200911261712.nAQHCpkH005459@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 11:12:50 2009
      New Revision: 89956
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89956&view=rev
      Log:
      move DecomposeGEPExpression out into ValueTracking.cpp
      
      Modified:
          llvm/trunk/include/llvm/Analysis/ValueTracking.h
          llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
          llvm/trunk/lib/Analysis/ValueTracking.cpp
      
      Modified: llvm/trunk/include/llvm/Analysis/ValueTracking.h
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/ValueTracking.h?rev=89956&r1=89955&r2=89956&view=diff
      
      ==============================================================================
      --- llvm/trunk/include/llvm/Analysis/ValueTracking.h (original)
      +++ llvm/trunk/include/llvm/Analysis/ValueTracking.h Thu Nov 26 11:12:50 2009
      @@ -19,6 +19,7 @@
       #include 
       
       namespace llvm {
      +  template <typename T> class SmallVectorImpl;
         class Value;
         class Instruction;
         class APInt;
      @@ -77,6 +78,20 @@
         ///
         bool CannotBeNegativeZero(const Value *V, unsigned Depth = 0);
       
      +  /// DecomposeGEPExpression - If V is a symbolic pointer expression, decompose
      +  /// it into a base pointer with a constant offset and a number of scaled
      +  /// symbolic offsets.
      +  ///
      +  /// When TargetData is around, this function is capable of analyzing
      +  /// everything that Value::getUnderlyingObject() can look through.  When not,
      +  /// it just looks through pointer casts.
      +  ///
      +  const Value *DecomposeGEPExpression(const Value *V, int64_t &BaseOffs,
      +                 SmallVectorImpl<std::pair<const Value*, int64_t> > &VarIndices,
      +                                      const TargetData *TD);
      +    
      +  
      +  
         /// FindScalarValue - Given an aggregrate and an sequence of indices, see if
         /// the scalar value indexed is already around as a register, for example if
         /// it were inserted directly into the aggregrate.
      
      Modified: llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?rev=89956&r1=89955&r2=89956&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp Thu Nov 26 11:12:50 2009
      @@ -18,7 +18,6 @@
       #include "llvm/Constants.h"
       #include "llvm/DerivedTypes.h"
       #include "llvm/Function.h"
      -#include "llvm/GlobalAlias.h"
       #include "llvm/GlobalVariable.h"
       #include "llvm/Instructions.h"
       #include "llvm/IntrinsicInst.h"
      @@ -28,11 +27,9 @@
       #include "llvm/Analysis/MemoryBuiltins.h"
       #include "llvm/Analysis/ValueTracking.h"
       #include "llvm/Target/TargetData.h"
      -#include "llvm/ADT/SmallSet.h"
      +#include "llvm/ADT/SmallPtrSet.h"
       #include "llvm/ADT/SmallVector.h"
      -#include "llvm/ADT/STLExtras.h"
       #include "llvm/Support/ErrorHandling.h"
      -#include "llvm/Support/GetElementPtrTypeIterator.h"
       #include 
       using namespace llvm;
       
      @@ -379,160 +376,6 @@
         return NoAA::getModRefInfo(CS1, CS2);
       }
       
      -/// GetLinearExpression - Analyze the specified value as a linear expression:
      -/// "A*V + B".  Return the scale and offset values as APInts and return V as a
      -/// Value*.  The incoming Value is known to be a scalar integer.
      -static Value *GetLinearExpression(Value *V, APInt &Scale, APInt &Offset,
      -                                  const TargetData *TD) {
      -  assert(isa<IntegerType>(V->getType()) && "Not an integer value");
      -  
      -  if (BinaryOperator *BOp = dyn_cast<BinaryOperator>(V)) {
      -    if (ConstantInt *RHSC = dyn_cast<ConstantInt>(BOp->getOperand(1))) {
      -      switch (BOp->getOpcode()) {
      -      default: break;
      -      case Instruction::Or:
      -        // X|C == X+C if all the bits in C are unset in X.  Otherwise we can't
      -        // analyze it.
      -        if (!MaskedValueIsZero(BOp->getOperand(0), RHSC->getValue(), TD))
      -          break;
      -        // FALL THROUGH.
      -      case Instruction::Add:
      -        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      -        Offset += RHSC->getValue();
      -        return V;
      -      case Instruction::Mul:
      -        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      -        Offset *= RHSC->getValue();
      -        Scale *= RHSC->getValue();
      -        return V;
      -      case Instruction::Shl:
      -        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      -        Offset <<= RHSC->getValue().getLimitedValue();
      -        Scale <<= RHSC->getValue().getLimitedValue();
      -        return V;
      -      }
      -    }
      -  }
      -
      -  Scale = 1;
      -  Offset = 0;
      -  return V;
      -}
      -
      -/// DecomposeGEPExpression - If V is a symbolic pointer expression, decompose it
      -/// into a base pointer with a constant offset and a number of scaled symbolic
      -/// offsets.
      -///
      -/// When TargetData is around, this function is capable of analyzing everything
      -/// that Value::getUnderlyingObject() can look through.  When not, it just looks
      -/// through pointer casts.
      -///
      -/// FIXME: Move this out to ValueTracking.cpp
      -///
      -static const Value *DecomposeGEPExpression(const Value *V, int64_t &BaseOffs,
      -                 SmallVectorImpl<std::pair<const Value*, int64_t> > &VarIndices,
      -                                           const TargetData *TD) {
      -  // FIXME: Should limit depth like getUnderlyingObject?
      -  BaseOffs = 0;
      -  while (1) {
      -    // See if this is a bitcast or GEP.
      -    const Operator *Op = dyn_cast<Operator>(V);
      -    if (Op == 0) {
      -      // The only non-operator case we can handle are GlobalAliases.
      -      if (const GlobalAlias *GA = dyn_cast<GlobalAlias>(V)) {
      -        if (!GA->mayBeOverridden()) {
      -          V = GA->getAliasee();
      -          continue;
      -        }
      -      }
      -      return V;
      -    }
      -    
      -    if (Op->getOpcode() == Instruction::BitCast) {
      -      V = Op->getOperand(0);
      -      continue;
      -    }
      -    
      -    const GEPOperator *GEPOp = dyn_cast<GEPOperator>(Op);
      -    if (GEPOp == 0)
      -      return V;
      -    
      -    // Don't attempt to analyze GEPs over unsized objects.
      -    if (!cast<PointerType>(GEPOp->getOperand(0)->getType())
      -          ->getElementType()->isSized())
      -      return V;
      -
      -    // If we are lacking TargetData information, we can't compute the offets of
      -    // elements computed by GEPs.  However, we can handle bitcast equivalent
      -    // GEPs.
      -    if (!TD) {
      -      if (!GEPOp->hasAllZeroIndices())
      -        return V;
      -      V = GEPOp->getOperand(0);
      -      continue;
      -    }
      -    
      -    // Walk the indices of the GEP, accumulating them into BaseOff/VarIndices.
      -    gep_type_iterator GTI = gep_type_begin(GEPOp);
      -    for (User::const_op_iterator I = next(GEPOp->op_begin()),
      -         E = GEPOp->op_end(); I != E; ++I) {
      -      Value *Index = *I;
      -      // Compute the (potentially symbolic) offset in bytes for this index.
      -      if (const StructType *STy = dyn_cast<StructType>(*GTI++)) {
      -        // For a struct, add the member offset.
      -        unsigned FieldNo = cast<ConstantInt>(Index)->getZExtValue();
      -        if (FieldNo == 0) continue;
      -        
      -        BaseOffs += TD->getStructLayout(STy)->getElementOffset(FieldNo);
      -        continue;
      -      }
      -      
      -      // For an array/pointer, add the element offset, explicitly scaled.
      -      if (ConstantInt *CIdx = dyn_cast<ConstantInt>(Index)) {
      -        if (CIdx->isZero()) continue;
      -        BaseOffs += TD->getTypeAllocSize(*GTI)*CIdx->getSExtValue();
      -        continue;
      -      }
      -      
      -      // TODO: Could handle linear expressions here like A[X+1], also A[X*4|1].
      -      uint64_t Scale = TD->getTypeAllocSize(*GTI);
      -      
      -      unsigned Width = cast<IntegerType>(Index->getType())->getBitWidth();
      -      APInt IndexScale(Width, 0), IndexOffset(Width, 0);
      -      Index = GetLinearExpression(Index, IndexScale, IndexOffset, TD);
      -      
      -      Scale *= IndexScale.getZExtValue();
      -      BaseOffs += IndexOffset.getZExtValue()*Scale;
      -      
      -      
      -      // If we already had an occurrance of this index variable, merge this
      -      // scale into it.  For example, we want to handle:
      -      //   A[x][x] -> x*16 + x*4 -> x*20
      -      // This also ensures that 'x' only appears in the index list once.
      -      for (unsigned i = 0, e = VarIndices.size(); i != e; ++i) {
      -        if (VarIndices[i].first == Index) {
      -          Scale += VarIndices[i].second;
      -          VarIndices.erase(VarIndices.begin()+i);
      -          break;
      -        }
      -      }
      -      
      -      // Make sure that we have a scale that makes sense for this target's
      -      // pointer size.
      -      if (unsigned ShiftBits = 64-TD->getPointerSizeInBits()) {
      -        Scale <<= ShiftBits;
      -        Scale >>= ShiftBits;
      -      }
      -      
      -      if (Scale)
      -        VarIndices.push_back(std::make_pair(Index, Scale));
      -    }
      -    
      -    // Analyze the base pointer next.
      -    V = GEPOp->getOperand(0);
      -  }
      -}
      -
       /// GetIndiceDifference - Dest and Src are the variable indices from two
       /// decomposed GetElementPtr instructions GEP1 and GEP2 which have common base
       /// pointers.  Subtract the GEP2 indices from GEP1 to find the symbolic
      
      Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=89956&r1=89955&r2=89956&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original)
      +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Thu Nov 26 11:12:50 2009
      @@ -948,6 +948,160 @@
         return false;
       }
       
      +
      +/// GetLinearExpression - Analyze the specified value as a linear expression:
      +/// "A*V + B".  Return the scale and offset values as APInts and return V as a
      +/// Value*.  The incoming Value is known to be a scalar integer.
      +static Value *GetLinearExpression(Value *V, APInt &Scale, APInt &Offset,
      +                                  const TargetData *TD) {
      +  assert(isa<IntegerType>(V->getType()) && "Not an integer value");
      +  
      +  if (BinaryOperator *BOp = dyn_cast<BinaryOperator>(V)) {
      +    if (ConstantInt *RHSC = dyn_cast<ConstantInt>(BOp->getOperand(1))) {
      +      switch (BOp->getOpcode()) {
      +      default: break;
      +      case Instruction::Or:
      +        // X|C == X+C if all the bits in C are unset in X.  Otherwise we can't
      +        // analyze it.
      +        if (!MaskedValueIsZero(BOp->getOperand(0), RHSC->getValue(), TD))
      +          break;
      +        // FALL THROUGH.
      +      case Instruction::Add:
      +        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      +        Offset += RHSC->getValue();
      +        return V;
      +      case Instruction::Mul:
      +        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      +        Offset *= RHSC->getValue();
      +        Scale *= RHSC->getValue();
      +        return V;
      +      case Instruction::Shl:
      +        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      +        Offset <<= RHSC->getValue().getLimitedValue();
      +        Scale <<= RHSC->getValue().getLimitedValue();
      +        return V;
      +      }
      +    }
      +  }
      +  
      +  Scale = 1;
      +  Offset = 0;
      +  return V;
      +}
      +
      +/// DecomposeGEPExpression - If V is a symbolic pointer expression, decompose it
      +/// into a base pointer with a constant offset and a number of scaled symbolic
      +/// offsets.
      +///
      +/// When TargetData is around, this function is capable of analyzing everything
      +/// that Value::getUnderlyingObject() can look through.  When not, it just looks
      +/// through pointer casts.
      +///
      +const Value *llvm::DecomposeGEPExpression(const Value *V, int64_t &BaseOffs,
      +                 SmallVectorImpl<std::pair<const Value*, int64_t> > &VarIndices,
      +                                          const TargetData *TD) {
      +  // FIXME: Should limit depth like getUnderlyingObject?
      +  BaseOffs = 0;
      +  while (1) {
      +    // See if this is a bitcast or GEP.
      +    const Operator *Op = dyn_cast<Operator>(V);
      +    if (Op == 0) {
      +      // The only non-operator case we can handle are GlobalAliases.
      +      if (const GlobalAlias *GA = dyn_cast<GlobalAlias>(V)) {
      +        if (!GA->mayBeOverridden()) {
      +          V = GA->getAliasee();
      +          continue;
      +        }
      +      }
      +      return V;
      +    }
      +    
      +    if (Op->getOpcode() == Instruction::BitCast) {
      +      V = Op->getOperand(0);
      +      continue;
      +    }
      +    
      +    const GEPOperator *GEPOp = dyn_cast<GEPOperator>(Op);
      +    if (GEPOp == 0)
      +      return V;
      +    
      +    // Don't attempt to analyze GEPs over unsized objects.
      +    if (!cast<PointerType>(GEPOp->getOperand(0)->getType())
      +        ->getElementType()->isSized())
      +      return V;
      +    
      +    // If we are lacking TargetData information, we can't compute the offets of
      +    // elements computed by GEPs.  However, we can handle bitcast equivalent
      +    // GEPs.
      +    if (!TD) {
      +      if (!GEPOp->hasAllZeroIndices())
      +        return V;
      +      V = GEPOp->getOperand(0);
      +      continue;
      +    }
      +    
      +    // Walk the indices of the GEP, accumulating them into BaseOff/VarIndices.
      +    gep_type_iterator GTI = gep_type_begin(GEPOp);
      +    for (User::const_op_iterator I = GEPOp->op_begin()+1,
      +         E = GEPOp->op_end(); I != E; ++I) {
      +      Value *Index = *I;
      +      // Compute the (potentially symbolic) offset in bytes for this index.
      +      if (const StructType *STy = dyn_cast<StructType>(*GTI++)) {
      +        // For a struct, add the member offset.
      +        unsigned FieldNo = cast<ConstantInt>(Index)->getZExtValue();
      +        if (FieldNo == 0) continue;
      +        
      +        BaseOffs += TD->getStructLayout(STy)->getElementOffset(FieldNo);
      +        continue;
      +      }
      +      
      +      // For an array/pointer, add the element offset, explicitly scaled.
      +      if (ConstantInt *CIdx = dyn_cast<ConstantInt>(Index)) {
      +        if (CIdx->isZero()) continue;
      +        BaseOffs += TD->getTypeAllocSize(*GTI)*CIdx->getSExtValue();
      +        continue;
      +      }
      +      
      +      // TODO: Could handle linear expressions here like A[X+1], also A[X*4|1].
      +      uint64_t Scale = TD->getTypeAllocSize(*GTI);
      +      
      +      unsigned Width = cast<IntegerType>(Index->getType())->getBitWidth();
      +      APInt IndexScale(Width, 0), IndexOffset(Width, 0);
      +      Index = GetLinearExpression(Index, IndexScale, IndexOffset, TD);
      +      
      +      Scale *= IndexScale.getZExtValue();
      +      BaseOffs += IndexOffset.getZExtValue()*Scale;
      +      
      +      
      +      // If we already had an occurrance of this index variable, merge this
      +      // scale into it.  For example, we want to handle:
      +      //   A[x][x] -> x*16 + x*4 -> x*20
      +      // This also ensures that 'x' only appears in the index list once.
      +      for (unsigned i = 0, e = VarIndices.size(); i != e; ++i) {
      +        if (VarIndices[i].first == Index) {
      +          Scale += VarIndices[i].second;
      +          VarIndices.erase(VarIndices.begin()+i);
      +          break;
      +        }
      +      }
      +      
      +      // Make sure that we have a scale that makes sense for this target's
      +      // pointer size.
      +      if (unsigned ShiftBits = 64-TD->getPointerSizeInBits()) {
      +        Scale <<= ShiftBits;
      +        Scale >>= ShiftBits;
      +      }
      +      
      +      if (Scale)
      +        VarIndices.push_back(std::make_pair(Index, Scale));
      +    }
      +    
      +    // Analyze the base pointer next.
      +    V = GEPOp->getOperand(0);
      +  }
      +}
      +
      +
       // This is the recursive version of BuildSubAggregate. It takes a few different
       // arguments. Idxs is the index within the nested struct From that we are
       // looking at now (which is of type IndexedType). IdxSkip is the number of
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 11:14:10 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 17:14:10 -0000
      Subject: [llvm-commits] [llvm] r89957 -
      	/llvm/trunk/lib/Analysis/ValueTracking.cpp
      Message-ID: <200911261714.nAQHEANM005500@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 11:14:10 2009
      New Revision: 89957
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89957&view=rev
      Log:
      this todo is resolved.
      
      Modified:
          llvm/trunk/lib/Analysis/ValueTracking.cpp
      
      Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=89957&r1=89956&r2=89957&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original)
      +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Thu Nov 26 11:14:10 2009
      @@ -1062,7 +1062,6 @@
               continue;
             }
             
      -      // TODO: Could handle linear expressions here like A[X+1], also A[X*4|1].
             uint64_t Scale = TD->getTypeAllocSize(*GTI);
             
             unsigned Width = cast<IntegerType>(Index->getType())->getBitWidth();
      
      
      
      
      From baldrick at free.fr  Thu Nov 26 11:29:30 2009
      From: baldrick at free.fr (Duncan Sands)
      Date: Thu, 26 Nov 2009 18:29:30 +0100
      Subject: [llvm-commits] [llvm] r89951 - in /llvm/trunk:
       lib/Analysis/BasicAliasAnalysis.cpp lib/Target/README.txt
       test/Analysis/BasicAA/gep-alias.ll
      In-Reply-To: <200911261618.nAQGIBxe003273@zion.cs.uiuc.edu>
      References: <200911261618.nAQGIBxe003273@zion.cs.uiuc.edu>
      Message-ID: <4B0EBAFA.2000107@free.fr>
      
      Hi Chris,
      
      > +      Scale *= IndexScale.getZExtValue();
      > +      BaseOffs += IndexOffset.getZExtValue()*Scale;
      
      shouldn't these two lines be swapped?  I think in the BaseOffs line
      you should be multiplying by the value of Scale before IndexScale was
      factored into it, not afterwards...
      
      Ciao,
      
      Duncan.
      
      
      From baldrick at free.fr  Thu Nov 26 11:46:28 2009
      From: baldrick at free.fr (Duncan Sands)
      Date: Thu, 26 Nov 2009 18:46:28 +0100
      Subject: [llvm-commits] [llvm] r89915
      	-	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      In-Reply-To: <200911260213.nAQ2D3Q0024164@zion.cs.uiuc.edu>
      References: <200911260213.nAQ2D3Q0024164@zion.cs.uiuc.edu>
      Message-ID: <4B0EBEF4.4020507@free.fr>
      
      Hi Chris,
      
      > +  // If our known offset is bigger than the access size, we know we don't have
      > +  // an alias.
      > +  if (GEP1BaseOffset) {
      > +    if (GEP1BaseOffset >= (int64_t)V2Size ||
      > +        GEP1BaseOffset <= -(int64_t)V1Size)
      >        return NoAlias;
      
      what if this wraps around the entire address space and lands back on top of
      the original pointer?
      
      Ciao,
      
      Duncan.
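
The range test quoted above can be read as a plain interval check. The following is a minimal sketch under stated assumptions, not the BasicAliasAnalysis code: `knownDisjoint` is a hypothetical name, both accesses are assumed to share a base pointer, and sizes are assumed to fit in `int64_t`. It deliberately ignores the address-space wrap-around raised in the question.

```cpp
#include <cassert>
#include <cstdint>

// Access 2 covers bytes [0, size2) relative to a common base pointer;
// access 1 covers [offset, offset + size1). Returns true when the two
// byte ranges cannot overlap.
bool knownDisjoint(int64_t offset, uint64_t size1, uint64_t size2) {
  if (offset == 0)
    return false; // same start address, nothing can be concluded here
  return offset >= static_cast<int64_t>(size2) ||  // 1 starts at/after end of 2
         offset <= -static_cast<int64_t>(size1);   // 1 ends at/before start of 2
}
```

With two 4-byte accesses, offsets of 4 and -4 are disjoint, while offsets of 3 and -3 still overlap by one byte.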
      
      
      From sabre at nondot.org  Thu Nov 26 12:29:40 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 10:29:40 -0800
      Subject: [llvm-commits] [llvm] r89915
      	-	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      In-Reply-To: <4B0EBEF4.4020507@free.fr>
      References: <200911260213.nAQ2D3Q0024164@zion.cs.uiuc.edu>
      	<4B0EBEF4.4020507@free.fr>
      Message-ID: <294E5597-8435-4C69-AE83-272E57FFB474@nondot.org>
      
      
      On Nov 26, 2009, at 9:46 AM, Duncan Sands wrote:
      
      > Hi Chris,
      > 
      >> +  // If our known offset is bigger than the access size, we know we don't have
      >> +  // an alias.
      >> +  if (GEP1BaseOffset) {
      >> +    if (GEP1BaseOffset >= (int64_t)V2Size ||
      >> +        GEP1BaseOffset <= -(int64_t)V1Size)
      >>       return NoAlias;
      > 
      > what if this wraps around the entire address space and lands back on top of
      > the original pointer?
      
      Then that's undefined behavior.
      
      -Chris
      
      
      From sabre at nondot.org  Thu Nov 26 12:31:46 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 10:31:46 -0800
      Subject: [llvm-commits] [llvm] r89951 - in /llvm/trunk:
      	lib/Analysis/BasicAliasAnalysis.cpp lib/Target/README.txt
      	test/Analysis/BasicAA/gep-alias.ll
      In-Reply-To: <4B0EBAFA.2000107@free.fr>
      References: <200911261618.nAQGIBxe003273@zion.cs.uiuc.edu>
      	<4B0EBAFA.2000107@free.fr>
      Message-ID: <922C1681-40B0-4777-98E7-5D41468D633E@nondot.org>
      
      On Nov 26, 2009, at 9:29 AM, Duncan Sands wrote:
      
      > Hi Chris,
      > 
      >> +      Scale *= IndexScale.getZExtValue();
      >> +      BaseOffs += IndexOffset.getZExtValue()*Scale;
      > 
      > shouldn't these two lines be swapped?  I think in the BaseOffs line
      > you should be multiplying by the value of Scale before IndexScale was
      > factored into it, not afterwards...
      
      Nice catch, you're completely right, will fix!
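
To make the ordering issue concrete, here is a small standalone sketch (not the ValueTracking code; `Folded`, `foldFixed` and `foldTransposed` are hypothetical names). The index is assumed to have the form C1*V + C2 and the element size is `elemSize`, so the byte offset is (C1*V + C2)*elemSize = (C1*elemSize)*V + C2*elemSize.

```cpp
#include <cassert>
#include <cstdint>

// Result of folding one GEP index into the running decomposition.
struct Folded {
  int64_t baseOffset; // constant byte offset contributed by the index
  int64_t scale;      // byte multiplier applied to the variable V
};

// Constant part is multiplied by the element size only, then the scale
// picks up the C1 factor.
Folded foldFixed(int64_t c1, int64_t c2, int64_t elemSize) {
  int64_t scale = elemSize;
  int64_t base = c2 * scale; // C2*elemSize
  scale *= c1;               // C1*elemSize
  return {base, scale};
}

// Transposed order: the scale already contains C1 when the constant is
// multiplied, so the base offset picks up one factor too many.
Folded foldTransposed(int64_t c1, int64_t c2, int64_t elemSize) {
  int64_t scale = elemSize;
  scale *= c1;               // C1*elemSize
  int64_t base = c2 * scale; // C2*C1*elemSize
  return {base, scale};
}
```

For an i8 element (size 1) indexed by 4*i + 1, the first function yields offset 1 and scale 4, while the transposed order yields offset 4 and scale 4.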
      
      
      From baldrick at free.fr  Thu Nov 26 12:33:02 2009
      From: baldrick at free.fr (Duncan Sands)
      Date: Thu, 26 Nov 2009 19:33:02 +0100
      Subject: [llvm-commits] [llvm] r89915
      	-	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      In-Reply-To: <294E5597-8435-4C69-AE83-272E57FFB474@nondot.org>
      References: <200911260213.nAQ2D3Q0024164@zion.cs.uiuc.edu>
      	<4B0EBEF4.4020507@free.fr>
      	<294E5597-8435-4C69-AE83-272E57FFB474@nondot.org>
      Message-ID: <4B0EC9DE.9060404@free.fr>
      
      >> what if this wraps around the entire address space and lands back on top of
      >> the original pointer?
      > 
      > Then that's undefined behavior.
      
      I thought it was only undefined for inbounds GEP?  I'm a bit worried about this
      because I have a testcase in which the optimizers changed increments by -4 into
      increments by 2^32 - 4, which would mean that they are creating undefined
      behaviour where there was none before (PR5282).
      
      Ciao,
      
      Duncan.
      
      
      From sabre at nondot.org  Thu Nov 26 12:35:46 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 18:35:46 -0000
      Subject: [llvm-commits] [llvm] r89958 -
      	/llvm/trunk/lib/Analysis/ValueTracking.cpp
      Message-ID: <200911261835.nAQIZk5f008207@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 12:35:46 2009
      New Revision: 89958
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89958&view=rev
      Log:
      fix two transposed lines duncan caught and add an explanatory comment.
      
      Modified:
          llvm/trunk/lib/Analysis/ValueTracking.cpp
      
      Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=89958&r1=89957&r2=89958&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original)
      +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Thu Nov 26 12:35:46 2009
      @@ -1064,12 +1064,15 @@
             
             uint64_t Scale = TD->getTypeAllocSize(*GTI);
             
      +      // Use GetLinearExpression to decompose the index into a C1*V+C2 form.
             unsigned Width = cast<IntegerType>(Index->getType())->getBitWidth();
             APInt IndexScale(Width, 0), IndexOffset(Width, 0);
             Index = GetLinearExpression(Index, IndexScale, IndexOffset, TD);
             
      -      Scale *= IndexScale.getZExtValue();
      +      // The GEP index scale ("Scale") scales C1*V+C2, yielding (C1*V+C2)*Scale.
      +      // This gives us an aggregate computation of (C1*Scale)*V + C2*Scale.
             BaseOffs += IndexOffset.getZExtValue()*Scale;
      +      Scale *= IndexScale.getZExtValue();
             
             
             // If we already had an occurrance of this index variable, merge this
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 12:53:34 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 18:53:34 -0000
      Subject: [llvm-commits] [llvm] r89959 - in /llvm/trunk:
       include/llvm/Analysis/ValueTracking.h lib/Analysis/ValueTracking.cpp
       test/Analysis/BasicAA/gep-alias.ll
      Message-ID: <200911261853.nAQIrYs3008772@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 12:53:33 2009
      New Revision: 89959
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89959&view=rev
      Log:
      Implement PR1143 (at -m64) by making basicaa look through extensions.  We
      previously already handled it at -m32 because there were no i32->i64 
      extensions for addressing.
      
      Modified:
          llvm/trunk/include/llvm/Analysis/ValueTracking.h
          llvm/trunk/lib/Analysis/ValueTracking.cpp
          llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      
      Modified: llvm/trunk/include/llvm/Analysis/ValueTracking.h
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/ValueTracking.h?rev=89959&r1=89958&r2=89959&view=diff
      
      ==============================================================================
      --- llvm/trunk/include/llvm/Analysis/ValueTracking.h (original)
      +++ llvm/trunk/include/llvm/Analysis/ValueTracking.h Thu Nov 26 12:53:33 2009
      @@ -82,6 +82,12 @@
         /// it into a base pointer with a constant offset and a number of scaled
         /// symbolic offsets.
         ///
      +  /// The scaled symbolic offsets (represented by pairs of a Value* and a scale
      +  /// in the VarIndices vector) are Value*'s that are known to be scaled by the
      +  /// specified amount, but which may have other unrepresented high bits. As
      +  /// such, the gep cannot necessarily be reconstructed from its decomposed
      +  /// form.
      +  ///
         /// When TargetData is around, this function is capable of analyzing
         /// everything that Value::getUnderlyingObject() can look through.  When not,
         /// it just looks through pointer casts.
      
      Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=89959&r1=89958&r2=89959&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original)
      +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Thu Nov 26 12:53:33 2009
      @@ -950,8 +950,10 @@
       
       
       /// GetLinearExpression - Analyze the specified value as a linear expression:
      -/// "A*V + B".  Return the scale and offset values as APInts and return V as a
      -/// Value*.  The incoming Value is known to be a scalar integer.
      +/// "A*V + B", where A and B are constant integers.  Return the scale and offset
      +/// values as APInts and return V as a Value*.  The incoming Value is known to
      +/// have IntegerType.  Note that this looks through extends, so the high bits
      +/// may not be represented in the result.
       static Value *GetLinearExpression(Value *V, APInt &Scale, APInt &Offset,
                                         const TargetData *TD) {
         assert(isa<IntegerType>(V->getType()) && "Not an integer value");
      @@ -984,6 +986,20 @@
           }
         }
         
      +  // Since clients don't care about the high bits of the value, just scales and
      +  // offsets, we can look through extensions.
      +  if (isa<SExtInst>(V) || isa<ZExtInst>(V)) {
      +    Value *CastOp = cast<CastInst>(V)->getOperand(0);
      +    unsigned OldWidth = Scale.getBitWidth();
      +    unsigned SmallWidth = CastOp->getType()->getPrimitiveSizeInBits();
      +    Scale.trunc(SmallWidth);
      +    Offset.trunc(SmallWidth);
      +    Value *Result = GetLinearExpression(CastOp, Scale, Offset, TD);
      +    Scale.zext(OldWidth);
      +    Offset.zext(OldWidth);
      +    return Result;
      +  }
      +  
         Scale = 1;
         Offset = 0;
         return V;
      @@ -993,6 +1009,11 @@
       /// into a base pointer with a constant offset and a number of scaled symbolic
       /// offsets.
       ///
      +/// The scaled symbolic offsets (represented by pairs of a Value* and a scale in
      +/// the VarIndices vector) are Value*'s that are known to be scaled by the
      +/// specified amount, but which may have other unrepresented high bits. As such,
      +/// the gep cannot necessarily be reconstructed from its decomposed form.
      +///
       /// When TargetData is around, this function is capable of analyzing everything
       /// that Value::getUnderlyingObject() can look through.  When not, it just looks
       /// through pointer casts.
      
      Modified: llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/BasicAA/gep-alias.ll?rev=89959&r1=89958&r2=89959&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Analysis/BasicAA/gep-alias.ll (original)
      +++ llvm/trunk/test/Analysis/BasicAA/gep-alias.ll Thu Nov 26 12:53:33 2009
      @@ -115,4 +115,19 @@
       ; CHECK: ret i32 0
       }
       
      -
      +; P[zext(i)] != p[zext(i+1)]
      +; PR1143
      +define i32 @test8(i32* %p, i32 %i) {
      +  %i1 = zext i32 %i to i64
      +  %pi = getelementptr i32* %p, i64 %i1
      +  %i.next = add i32 %i, 1
      +  %i.next2 = zext i32 %i.next to i64
      +  %pi.next = getelementptr i32* %p, i64 %i.next2
      +  %x = load i32* %pi
      +  store i32 42, i32* %pi.next
      +  %y = load i32* %pi
      +  %z = sub i32 %x, %y
      +  ret i32 %z
      +; CHECK: @test8
      +; CHECK: ret i32 0
      +}
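
The idea of decomposing an index into scale*V + offset while looking through extensions can be sketched on a toy expression tree. This is an illustration under stated assumptions, not GetLinearExpression itself: the `Expr` type and `linearExpression` are hypothetical, bit widths are not modelled at all, and the high bits hidden behind an extension are simply ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mini expression tree standing in for LLVM Values.
struct Expr {
  enum Kind { Var, AddConst, MulConst, ShlConst, ZExt } kind;
  const Expr *operand; // null for Var
  uint64_t constant;   // right-hand constant for AddConst/MulConst/ShlConst
};

// Rewrite `e` as scale*V + offset and return V.
const Expr *linearExpression(const Expr *e, uint64_t &scale, uint64_t &offset) {
  switch (e->kind) {
  case Expr::AddConst: {
    const Expr *v = linearExpression(e->operand, scale, offset);
    offset += e->constant;
    return v;
  }
  case Expr::MulConst: {
    const Expr *v = linearExpression(e->operand, scale, offset);
    scale *= e->constant;
    offset *= e->constant;
    return v;
  }
  case Expr::ShlConst: {
    assert(e->constant < 64);
    const Expr *v = linearExpression(e->operand, scale, offset);
    scale <<= e->constant;
    offset <<= e->constant;
    return v;
  }
  case Expr::ZExt:
    // Look through the extension; only scale and offset are tracked.
    return linearExpression(e->operand, scale, offset);
  case Expr::Var:
    break;
  }
  // Anything that is not understood is treated as 1*V + 0.
  scale = 1;
  offset = 0;
  return e;
}
```

For zext(i + 1) this returns the variable i with scale 1 and offset 1, which is what lets the two pointers in @test8 be compared on the same variable.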
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 13:25:46 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 19:25:46 -0000
      Subject: [llvm-commits] [llvm] r89960 -
      	/llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      Message-ID: <200911261925.nAQJPkU4009844@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 13:25:46 2009
      New Revision: 89960
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89960&view=rev
      Log:
      @test9 is a testcase for r89958.  Before 89958, we misanalyzed the
      first expression as P+4+4*i which we considered to possibly alias
      P+4*j.  Now we correctly analyze the former one as P+1+4*i.
      
      @test10 is a sanity test that verifies that we know that P+4+4*i != P+4*i.
      
      
      Modified:
          llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      
      Modified: llvm/trunk/test/Analysis/BasicAA/gep-alias.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Analysis/BasicAA/gep-alias.ll?rev=89960&r1=89959&r2=89960&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Analysis/BasicAA/gep-alias.ll (original)
      +++ llvm/trunk/test/Analysis/BasicAA/gep-alias.ll Thu Nov 26 13:25:46 2009
      @@ -1,8 +1,8 @@
       ; RUN: opt < %s -gvn -instcombine -S |& FileCheck %s
      -; Make sure that basicaa thinks R and r are must aliases.
       
       target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
       
      +; Make sure that basicaa thinks R and r are must aliases.
       define i32 @test1(i8 * %P) {
       entry:
       	%Q = bitcast i8* %P to {i32, i32}*
      @@ -131,3 +131,41 @@
       ; CHECK: @test8
       ; CHECK: ret i32 0
       }
      +
      +define i8 @test9([4 x i8] *%P, i32 %i, i32 %j) {
      +  %i2 = shl i32 %i, 2
      +  %i3 = add i32 %i2, 1
      +  ; P2 = P + 1 + 4*i
      +  %P2 = getelementptr [4 x i8] *%P, i32 0, i32 %i3
      +
      +  %j2 = shl i32 %j, 2
      +  
      +  ; P4 = P + 4*j
      +  %P4 = getelementptr [4 x i8]* %P, i32 0, i32 %j2
      +
      +  %x = load i8* %P2
      +  store i8 42, i8* %P4
      +  %y = load i8* %P2
      +  %z = sub i8 %x, %y
      +  ret i8 %z
      +; CHECK: @test9
      +; CHECK: ret i8 0
      +}
      +
      +define i8 @test10([4 x i8] *%P, i32 %i) {
      +  %i2 = shl i32 %i, 2
      +  %i3 = add i32 %i2, 4
      +  ; P2 = P + 4 + 4*i
      +  %P2 = getelementptr [4 x i8] *%P, i32 0, i32 %i3
      +  
      +  ; P4 = P + 4*i
      +  %P4 = getelementptr [4 x i8]* %P, i32 0, i32 %i2
      +
      +  %x = load i8* %P2
      +  store i8 42, i8* %P4
      +  %y = load i8* %P2
      +  %z = sub i8 %x, %y
      +  ret i8 %z
      +; CHECK: @test10
      +; CHECK: ret i8 0
      +}
      
      
      
      
      From baldrick at free.fr  Thu Nov 26 13:27:08 2009
      From: baldrick at free.fr (Duncan Sands)
      Date: Thu, 26 Nov 2009 20:27:08 +0100
      Subject: [llvm-commits] [llvm] r89959 - in /llvm/trunk:
       include/llvm/Analysis/ValueTracking.h lib/Analysis/ValueTracking.cpp
       test/Analysis/BasicAA/gep-alias.ll
      In-Reply-To: <200911261853.nAQIrYs3008772@zion.cs.uiuc.edu>
      References: <200911261853.nAQIrYs3008772@zion.cs.uiuc.edu>
      Message-ID: <4B0ED68C.4080409@free.fr>
      
      Hi Chris,
      
      > +  /// The scaled symbolic offsets (represented by pairs of a Value* and a scale
      > +  /// in the VarIndices vector) are Value*'s that are known to be scaled by the
      > +  /// specified amount, but which may have other unrepresented high bits. As
      > +  /// such, the gep cannot necessarily be reconstructed from its decomposed
      > +  /// form.
      
      it looks like there is no control on how many bits are "unrepresented high
      bits".  For example, if you had an extension from i1 somewhere, presumably
      all bits except the first may contain some rubbish values.  If you now do
      a test like "is this value bigger than 4", how can you answer?
      
      Ciao,
      
      Duncan.
      
      
      From sabre at nondot.org  Thu Nov 26 14:11:31 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 12:11:31 -0800
      Subject: [llvm-commits] [llvm] r89959 - in /llvm/trunk:
      	include/llvm/Analysis/ValueTracking.h lib/Analysis/ValueTracking.cpp
      	test/Analysis/BasicAA/gep-alias.ll
      In-Reply-To: <4B0ED68C.4080409@free.fr>
      References: <200911261853.nAQIrYs3008772@zion.cs.uiuc.edu>
      	<4B0ED68C.4080409@free.fr>
      Message-ID: 
      
      
      On Nov 26, 2009, at 11:27 AM, Duncan Sands wrote:
      
      > Hi Chris,
      > 
      >> +  /// The scaled symbolic offsets (represented by pairs of a Value* and a scale
      >> +  /// in the VarIndices vector) are Value*'s that are known to be scaled by the
      >> +  /// specified amount, but which may have other unrepresented high bits. As
      >> +  /// such, the gep cannot necessarily be reconstructed from its decomposed
      >> +  /// form.
      > 
      > it looks like there is no control on how many bits are "unrepresented high
      > bits".  For example, if you had an extension from i1 somewhere, presumably
      > all bits except the first may contain some rubbish values.  If you now do
      > a test like "is this value bigger than 4", how can you answer?
      
      This code only cares about scales and offsets.  Because of this, an i1 won't have a scale.  It doesn't matter if it is 0 or 1. :)
      
      -Chris
      
      
      From sabre at nondot.org  Thu Nov 26 14:18:46 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 12:18:46 -0800
      Subject: [llvm-commits] [llvm] r89915
      	-	/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp
      In-Reply-To: <4B0EC9DE.9060404@free.fr>
      References: <200911260213.nAQ2D3Q0024164@zion.cs.uiuc.edu>
      	<4B0EBEF4.4020507@free.fr>
      	<294E5597-8435-4C69-AE83-272E57FFB474@nondot.org>
      	<4B0EC9DE.9060404@free.fr>
      Message-ID: <18BEBF85-FCC7-4246-9BF8-26F41C6D7D37@nondot.org>
      
      
      On Nov 26, 2009, at 10:33 AM, Duncan Sands wrote:
      
      >>> what if this wraps around the entire address space and lands back on top of
      >>> the original pointer?
      >> Then that's undefined behavior.
      > 
      > I thought it was only undefined for inbounds GEP?  I'm a bit worried about this
      > because I have a testcase in which the optimizers changed increments by -4 into
      > increments by 2^32 - 4, which would mean that they are creating undefined
      > behaviour where there was none before (PR5282).
      
      This isn't related, it is looking at the access size in this code, not the stride.  The scales are explicitly truncated to pointer size here:
      
            // Make sure that we have a scale that makes sense for this target's
            // pointer size.
            if (unsigned ShiftBits = 64-TD->getPointerSizeInBits()) {
              Scale <<= ShiftBits;
              Scale >>= ShiftBits;
            }
      
      so it will be handled as -4, not "something crazy".
      
      -Chris
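
The effect of the shift pair quoted above depends on the signedness of the value being shifted. The sketch below shows both variants side by side; the helper names are hypothetical, and it makes no claim about which type `Scale` has in the LLVM source at this revision. The signed variant assumes two's-complement conversion and an arithmetic right shift, which is guaranteed only from C++20 on but is what mainstream compilers do.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned shifts zero-fill, so this keeps only the low `ptrBits` bits.
uint64_t truncateToPointerWidth(uint64_t scale, unsigned ptrBits) {
  assert(ptrBits >= 1 && ptrBits <= 64);
  if (unsigned shiftBits = 64 - ptrBits) {
    scale <<= shiftBits;
    scale >>= shiftBits;
  }
  return scale;
}

// Same left shift, but the right shift is done on a signed value, so the
// top bit of the pointer-width value is replicated into the high bits.
int64_t signExtendFromPointerWidth(uint64_t scale, unsigned ptrBits) {
  assert(ptrBits >= 1 && ptrBits <= 64);
  unsigned shiftBits = 64 - ptrBits;
  return static_cast<int64_t>(scale << shiftBits) >> shiftBits;
}
```

With 32-bit pointers, a stride of 2^32 - 4 stays 0xFFFFFFFC in the unsigned variant and becomes -4 in the signed one.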
      
      From anton at korobeynikov.info  Thu Nov 26 14:25:43 2009
      From: anton at korobeynikov.info (Anton Korobeynikov)
      Date: Thu, 26 Nov 2009 23:25:43 +0300
      Subject: [llvm-commits] Invalid instcombine transformation
      Message-ID: <1259267143.21630.143.camel@aslstation>
      
      Hello, Everyone
      
      Currently instcombine turns small memcpy's (say, of 1/2/4/8 bytes)
      into loads / stores. This seems to be invalid if target does not allow
      unaligned access (and even if it allows, it's slower in general).
      Consider ARM, there we have:
       
      1. Byte loads / stores => no alignment requirement
      2. 2-byte loads / stores => 16 bit alignment requirement
      3. 4-byte loads / stores => 32 bit alignment requirement
       
      A misaligned access in cases 2 and 3 is either "fixed up" (if the
      processor is configured to do so) or traps (if it is not configured
      that way).
      
      memcpy allows arbitrary alignment. In that situation, turning e.g.
      memcpy(i8*, i8*, 2, 1) and memcpy(i8*, i8*, 4, 1) into loads / stores
      is invalid, so the transform should be disabled. The same applies to
      memset.
      
      Attached patch allows memcpy => load/store transformation only if memcpy
      alignment is not lower than ABI alignment of the load/store type (when
      TargetData is available).
      
      This fixed several bugs in real code running on ARM.
      
      Ok to commit?
      
      -- 
      With best regards, Anton Korobeynikov.
      
      Faculty of Mathematics & Mechanics, Saint Petersburg State University.
      -------------- next part --------------
      A non-text attachment was scrubbed...
      Name: instcombine-unaligned.diff
      Type: text/x-patch
      Size: 4257 bytes
      Desc: not available
      Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091126/5d02154c/attachment.bin 
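
The condition described in this message can be written as a small predicate. The scrubbed patch is not available here, so this is only a sketch of the stated rule: `canLowerToLoadStore` is a hypothetical name, and the ABI alignment is passed in by the caller rather than queried from TargetData.

```cpp
#include <cassert>
#include <cstdint>

// May a memcpy of `sizeBytes` with known alignment `memcpyAlign` be rewritten
// as a single integer load/store pair? `abiAlign` is the ABI alignment of the
// integer type of that size on the target.
bool canLowerToLoadStore(uint64_t sizeBytes, unsigned memcpyAlign,
                         unsigned abiAlign) {
  bool sizeOk =
      sizeBytes == 1 || sizeBytes == 2 || sizeBytes == 4 || sizeBytes == 8;
  // Only transform when the copy is at least as aligned as the access needs.
  return sizeOk && memcpyAlign >= abiAlign;
}
```

Under the ARM rules listed above, memcpy(i8*, i8*, 2, 1) and memcpy(i8*, i8*, 4, 1) are both rejected, while a 4-byte copy with alignment 4 is accepted.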
      
      From nicholas at mxc.ca  Mon Nov 23 23:49:22 2009
      From: nicholas at mxc.ca (Nick Lewycky)
      Date: Mon, 23 Nov 2009 21:49:22 -0800
      Subject: [llvm-commits] [llvm] r89639 - in /llvm/trunk:
       lib/Transforms/Scalar/InstructionCombining.cpp
       test/Transforms/InstCombine/compare-signs.ll
      In-Reply-To: 
      References: <200911230317.nAN3HYvG017794@zion.cs.uiuc.edu>
      	<4B0A446C.10104@free.fr> <4B0A4558.8070803@mxc.ca>
      	<4B0A48CB.4020201@free.fr>
      	
      Message-ID: <4B0B73E2.6010009@mxc.ca>
      
      Chris Lattner wrote:
      >
      > On Nov 23, 2009, at 12:33 AM, Duncan Sands wrote:
      >
      >>>>> + if (KnownZeroLHS.countLeadingOnes() == BitWidth-1&&
      >>>>> + KnownZeroRHS.countLeadingOnes() == BitWidth-1) {
      >>>>
      >>>> == ->  >= :)
      >>>
      >>> Nope, look again!
      >>>
      >>> +        APInt TypeMask(APInt::getHighBitsSet(BitWidth, BitWidth-1));
      >>>
      >>> Thus, it will never return a knownzero with all bits set. :)
      >>
      >> Ha ha, you got me there!
      >
      > Please add a comment.  Obviously it isn't clear what is going on here.
      
      Sorry Chris, but I don't think a comment would do anything but confuse 
      people further. A reader who wasn't thinking of the earlier version of 
      the patch that behaved differently wouldn't be asking this question.
      
      Also, the issue Duncan raised is one of possible missed optimization, 
      not correctness. It addressed handling the case where the comparison is 
      with a constant, but those should be getting optimized away elsewhere 
      anyways.
      
      Nick
      
      PS. Of course, if you were to tell me to stop arguing and just add a 
      damned comment, I will.
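
The point about the mask can be checked with plain 32-bit integers. This is a sketch of the reasoning only, using hypothetical free functions named after the APInt operations they imitate: if the known-zero result is restricted to a mask of the high BitWidth-1 bits, its lowest bit is never set, so the leading-ones count cannot exceed BitWidth-1 and `==` behaves the same as `>=`.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kBitWidth = 32;

// Mask with the top `n` of 32 bits set.
uint32_t highBitsSet(unsigned n) {
  assert(n <= kBitWidth);
  return n == 0 ? 0u : ~uint32_t{0} << (kBitWidth - n);
}

// Number of consecutive 1 bits starting from the most significant bit.
unsigned countLeadingOnes(uint32_t v) {
  unsigned count = 0;
  for (uint32_t bit = uint32_t{1} << (kBitWidth - 1); bit != 0 && (v & bit);
       bit >>= 1)
    ++count;
  return count;
}
```

Even a value whose bits are all known zero reports 31 leading ones once it is intersected with `highBitsSet(31)`.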
      
      
      From baldrick at free.fr  Thu Nov 26 14:43:00 2009
      From: baldrick at free.fr (Duncan Sands)
      Date: Thu, 26 Nov 2009 21:43:00 +0100
      Subject: [llvm-commits] [llvm] r89639 - in /llvm/trunk:
       lib/Transforms/Scalar/InstructionCombining.cpp
       test/Transforms/InstCombine/compare-signs.ll
      In-Reply-To: <4B0B73E2.6010009@mxc.ca>
      References: <200911230317.nAN3HYvG017794@zion.cs.uiuc.edu>
      	<4B0A446C.10104@free.fr> <4B0A4558.8070803@mxc.ca>
      	<4B0A48CB.4020201@free.fr>
      	
      	<4B0B73E2.6010009@mxc.ca>
      Message-ID: <4B0EE854.80802@free.fr>
      
      > Sorry Chris, but I don't think a comment would do anything but confuse 
      > people further. A reader who wasn't thinking of the earlier version of 
      > the patch that behaved differently wouldn't be asking this question.
      
      Yeah, it was basically an "in joke" :)
      
      Ciao,
      
      Duncan.
      
      
      From clattner at apple.com  Thu Nov 26 15:20:46 2009
      From: clattner at apple.com (Chris Lattner)
      Date: Thu, 26 Nov 2009 13:20:46 -0800
      Subject: [llvm-commits] [llvm] r89639 - in /llvm/trunk:
      	lib/Transforms/Scalar/InstructionCombining.cpp
      	test/Transforms/InstCombine/compare-signs.ll
      In-Reply-To: <4B0EE854.80802@free.fr>
      References: <200911230317.nAN3HYvG017794@zion.cs.uiuc.edu>
      	<4B0A446C.10104@free.fr> <4B0A4558.8070803@mxc.ca>
      	<4B0A48CB.4020201@free.fr>
      	
      	<4B0B73E2.6010009@mxc.ca> <4B0EE854.80802@free.fr>
      Message-ID: 
      
      ok
      
      On Nov 26, 2009, at 12:43 PM, Duncan Sands wrote:
      
      >> Sorry Chris, but I don't think a comment would do anything but confuse people further. A reader who wasn't thinking of the earlier version of the patch that behaved differently wouldn't be asking this question.
      > 
      > Yeah, it was basically an "in joke" :)
      > 
      > Ciao,
      > 
      > Duncan.
      
      
      
      
      From clattner at apple.com  Thu Nov 26 15:24:06 2009
      From: clattner at apple.com (Chris Lattner)
      Date: Thu, 26 Nov 2009 13:24:06 -0800
      Subject: [llvm-commits] Invalid instcombine transformation
      In-Reply-To: <1259267143.21630.143.camel@aslstation>
      References: <1259267143.21630.143.camel@aslstation>
      Message-ID: <6634FC61-E6AE-4FA4-BCF2-267F390C1735@apple.com>
      
      On Nov 26, 2009, at 12:25 PM, Anton Korobeynikov wrote:
      > Hello, Everyone
      > 
      > Currently instcombine turns small memcpy's (say, of 1/2/4/8 bytes)
      > into loads / stores. 
      
      Yes, this is very bad.  instcombine should not 'lower' into 2/4/8 byte stores (1-byte stores are ok though).  This is for a couple of reasons, not the least of which is that it introduces type-unsafe code that is harder to analyze than the original IR.
      
      > This seems to be invalid if target does not allow
      > unaligned access (and even if it allows, it's slower in general).
      
      The target *has* to support unaligned accesses.  Worst case, the code generator should lower to byte accesses.  That said, this is still bad :)
      
      > Attached patch allows memcpy => load/store transformation only if memcpy
      > alignment is not lower than ABI alignment of the load/store type (when
      > TargetData is available).
      > 
      > This fixed several bugs in real code running on ARM.
      
      I'd prefer to completely remove this from instcombine.  To do this, please verify that we are already doing the equivalent lowering in codegen (e.g. on x86 a 2 and 4-byte memset should turn into an unaligned store).  Then verify that disabling the instcombine xform doesn't cause any significant changes in codegen on multisource (or something else significant).
      
      -Chris
      
      
      From anton at korobeynikov.info  Thu Nov 26 15:30:16 2009
      From: anton at korobeynikov.info (Anton Korobeynikov)
      Date: Fri, 27 Nov 2009 00:30:16 +0300
      Subject: [llvm-commits] Invalid instcombine transformation
      In-Reply-To: <6634FC61-E6AE-4FA4-BCF2-267F390C1735@apple.com>
      References: <1259267143.21630.143.camel@aslstation>
      	<6634FC61-E6AE-4FA4-BCF2-267F390C1735@apple.com>
      Message-ID: 
      
      Hi, Chris
      
      > The target *has* to support unaligned accesses.
      Sorry, this is definitely unreal :) Especially in embedded world. And
      not only there (think about sparc/solaris).
      
      > I'd prefer to completely remove this from instcombine.  To do this, please verify that we are already doing the equivalent lowering in codegen (e.g. on x86 a 2 and 4-byte memset should turn into an unaligned store).  Then verify that disabling the instcombine xform doesn't cause any significant changes in codegen on multisource (or something else significant).
      Ok.
      
      -- 
      With best regards, Anton Korobeynikov
      Faculty of Mathematics and Mechanics, Saint Petersburg State University
      
      
      
      From bob.wilson at apple.com  Thu Nov 26 15:38:42 2009
      From: bob.wilson at apple.com (Bob Wilson)
      Date: Thu, 26 Nov 2009 21:38:42 -0000
      Subject: [llvm-commits] [llvm] r89968 - in /llvm/trunk:
       include/llvm/CodeGen/Passes.h lib/CodeGen/LLVMTargetMachine.cpp
       lib/CodeGen/TailDuplication.cpp
      Message-ID: <200911262138.nAQLcgII014775@zion.cs.uiuc.edu>
      
      Author: bwilson
      Date: Thu Nov 26 15:38:41 2009
      New Revision: 89968
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89968&view=rev
      Log:
      Rename new TailDuplicationPass to avoid name conflict with the old one.
      
      Modified:
          llvm/trunk/include/llvm/CodeGen/Passes.h
          llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp
          llvm/trunk/lib/CodeGen/TailDuplication.cpp
      
      Modified: llvm/trunk/include/llvm/CodeGen/Passes.h
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/Passes.h?rev=89968&r1=89967&r2=89968&view=diff
      
      ==============================================================================
      --- llvm/trunk/include/llvm/CodeGen/Passes.h (original)
      +++ llvm/trunk/include/llvm/CodeGen/Passes.h Thu Nov 26 15:38:41 2009
      @@ -129,9 +129,9 @@
         /// branches.
         FunctionPass *createBranchFoldingPass(bool DefaultEnableTailMerge);
       
      -  /// TailDuplication Pass - Duplicate blocks with unconditional branches
      +  /// TailDuplicate Pass - Duplicate blocks with unconditional branches
         /// into tails of their predecessors.
      -  FunctionPass *createTailDuplicationPass();
      +  FunctionPass *createTailDuplicatePass();
       
         /// IfConverter Pass - This pass performs machine code if conversion.
         FunctionPass *createIfConverterPass();
      
      Modified: llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp?rev=89968&r1=89967&r2=89968&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp (original)
      +++ llvm/trunk/lib/CodeGen/LLVMTargetMachine.cpp Thu Nov 26 15:38:41 2009
      @@ -348,8 +348,8 @@
       
         // Tail duplication.
         if (OptLevel != CodeGenOpt::None && !DisableTailDuplicate) {
      -    PM.add(createTailDuplicationPass());
      -    printAndVerify(PM, "After TailDuplication");
      +    PM.add(createTailDuplicatePass());
      +    printAndVerify(PM, "After TailDuplicate");
         }
       
         PM.add(createGCMachineCodeAnalysisPass());
      
      Modified: llvm/trunk/lib/CodeGen/TailDuplication.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TailDuplication.cpp?rev=89968&r1=89967&r2=89968&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/CodeGen/TailDuplication.cpp (original)
      +++ llvm/trunk/lib/CodeGen/TailDuplication.cpp Thu Nov 26 15:38:41 2009
      @@ -37,14 +37,14 @@
                         cl::init(2), cl::Hidden);
       
       namespace {
      -  /// TailDuplicationPass - Perform tail duplication.
      -  class TailDuplicationPass : public MachineFunctionPass {
      +  /// TailDuplicatePass - Perform tail duplication.
      +  class TailDuplicatePass : public MachineFunctionPass {
           const TargetInstrInfo *TII;
           MachineModuleInfo *MMI;
       
         public:
           static char ID;
      -    explicit TailDuplicationPass() : MachineFunctionPass(&ID) {}
      +    explicit TailDuplicatePass() : MachineFunctionPass(&ID) {}
       
           virtual bool runOnMachineFunction(MachineFunction &MF);
           virtual const char *getPassName() const { return "Tail Duplication"; }
      @@ -55,14 +55,14 @@
           void RemoveDeadBlock(MachineBasicBlock *MBB);
         };
       
      -  char TailDuplicationPass::ID = 0;
      +  char TailDuplicatePass::ID = 0;
       }
       
      -FunctionPass *llvm::createTailDuplicationPass() {
      -  return new TailDuplicationPass();
      +FunctionPass *llvm::createTailDuplicatePass() {
      +  return new TailDuplicatePass();
       }
       
      -bool TailDuplicationPass::runOnMachineFunction(MachineFunction &MF) {
      +bool TailDuplicatePass::runOnMachineFunction(MachineFunction &MF) {
         TII = MF.getTarget().getInstrInfo();
         MMI = getAnalysisIfAvailable<MachineModuleInfo>();
       
      @@ -80,7 +80,7 @@
       /// TailDuplicateBlocks - Look for small blocks that are unconditionally
       /// branched to and do not fall through. Tail-duplicate their instructions
       /// into their predecessors to eliminate (dynamic) branches.
      -bool TailDuplicationPass::TailDuplicateBlocks(MachineFunction &MF) {
      +bool TailDuplicatePass::TailDuplicateBlocks(MachineFunction &MF) {
         bool MadeChange = false;
       
         for (MachineFunction::iterator I = ++MF.begin(), E = MF.end(); I != E; ) {
      @@ -105,7 +105,7 @@
       
       /// TailDuplicate - If it is profitable, duplicate TailBB's contents in each
       /// of its predecessors.
      -bool TailDuplicationPass::TailDuplicate(MachineBasicBlock *TailBB,
      +bool TailDuplicatePass::TailDuplicate(MachineBasicBlock *TailBB,
                                               MachineFunction &MF) {
         // Don't try to tail-duplicate single-block loops.
         if (TailBB->isSuccessor(TailBB))
      @@ -225,7 +225,7 @@
       
       /// RemoveDeadBlock - Remove the specified dead machine basic block from the
       /// function, updating the CFG.
      -void TailDuplicationPass::RemoveDeadBlock(MachineBasicBlock *MBB) {
      +void TailDuplicatePass::RemoveDeadBlock(MachineBasicBlock *MBB) {
         assert(MBB->pred_empty() && "MBB must be dead!");
         DEBUG(errs() << "\nRemoving MBB: " << *MBB);
       
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 15:42:48 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 21:42:48 -0000
      Subject: [llvm-commits] [llvm] r89970 -
      	/llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp
      Message-ID: <200911262142.nAQLgmlS014915@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 15:42:47 2009
      New Revision: 89970
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89970&view=rev
      Log:
      implement a bunch of xforms for overflow intrinsics, based on a patch
      by Alastair Lynn.
      
      
      Modified:
          llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp
      
      Modified: llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp?rev=89970&r1=89969&r2=89970&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp (original)
      +++ llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp Thu Nov 26 15:42:47 2009
      @@ -9872,6 +9872,120 @@
             if (Operand->getIntrinsicID() == Intrinsic::bswap)
               return ReplaceInstUsesWith(CI, Operand->getOperand(1));
           break;
      +  case Intrinsic::uadd_with_overflow: {
      +    Value *LHS = II->getOperand(1), *RHS = II->getOperand(2);
      +    const IntegerType *IT = cast<IntegerType>(II->getOperand(1)->getType());
      +    uint32_t BitWidth = IT->getBitWidth();
      +    APInt Mask = APInt::getSignBit(BitWidth);
      +    APInt LHSKnownZero, LHSKnownOne, RHSKnownZero, RHSKnownOne;
      +    ComputeMaskedBits(LHS, Mask, LHSKnownZero, LHSKnownOne);
      +    bool LHSKnownNegative = LHSKnownOne[BitWidth - 1];
      +    bool LHSKnownPositive = LHSKnownZero[BitWidth - 1];
      +
      +    if (LHSKnownNegative || LHSKnownPositive) {
      +      ComputeMaskedBits(RHS, Mask, RHSKnownZero, RHSKnownOne);
      +      bool RHSKnownNegative = RHSKnownOne[BitWidth - 1];
      +      bool RHSKnownPositive = RHSKnownZero[BitWidth - 1];
      +      if (LHSKnownNegative && RHSKnownNegative) {
      +        // The sign bit is set in both cases: this MUST overflow.
      +        // Create a simple add instruction, and insert it into the struct.
      +        Instruction *Add = BinaryOperator::CreateAdd(LHS, RHS, "", &CI);
      +        Worklist.Add(Add);
      +        Constant *V[2];
      +        V[0] = UndefValue::get(LHS->getType());
      +        V[1] = ConstantInt::getTrue(*Context);
      +        Constant *Struct = ConstantStruct::get(*Context, V, 2, false);
      +        return InsertValueInst::Create(Struct, Add, 0);
      +      }
      +      
      +      if (LHSKnownPositive && RHSKnownPositive) {
      +        // The sign bit is clear in both cases: this CANNOT overflow.
      +        // Create a simple add instruction, and insert it into the struct.
      +        Instruction *Add = BinaryOperator::CreateNUWAdd(LHS, RHS, "", &CI);
      +        Worklist.Add(Add);
      +        Constant *V[2];
      +        V[0] = UndefValue::get(LHS->getType());
      +        V[1] = ConstantInt::getFalse(*Context);
      +        Constant *Struct = ConstantStruct::get(*Context, V, 2, false);
      +        return InsertValueInst::Create(Struct, Add, 0);
      +      }
      +    }
      +  }
      +  // FALL THROUGH uadd into sadd
      +  case Intrinsic::sadd_with_overflow:
      +    // Canonicalize constants into the RHS.
      +    if (isa<Constant>(II->getOperand(1)) &&
      +        !isa<Constant>(II->getOperand(2))) {
      +      Value *LHS = II->getOperand(1);
      +      II->setOperand(1, II->getOperand(2));
      +      II->setOperand(2, LHS);
      +      return II;
      +    }
      +
      +    // X + undef -> undef
      +    if (isa<UndefValue>(II->getOperand(2)))
      +      return ReplaceInstUsesWith(CI, UndefValue::get(II->getType()));
      +      
      +    if (ConstantInt *RHS = dyn_cast<ConstantInt>(II->getOperand(2))) {
      +      // X + 0 -> {X, false}
      +      if (RHS->isZero()) {
      +        Constant *V[] = {
      +          UndefValue::get(II->getType()), ConstantInt::getFalse(*Context)
      +        };
      +        Constant *Struct = ConstantStruct::get(*Context, V, 2, false);
      +        return InsertValueInst::Create(Struct, II->getOperand(1), 0);
      +      }
      +    }
      +    break;
      +  case Intrinsic::usub_with_overflow:
      +  case Intrinsic::ssub_with_overflow:
      +    // undef - X -> undef
      +    // X - undef -> undef
      +    if (isa<UndefValue>(II->getOperand(1)) ||
      +        isa<UndefValue>(II->getOperand(2)))
      +      return ReplaceInstUsesWith(CI, UndefValue::get(II->getType()));
      +      
      +    if (ConstantInt *RHS = dyn_cast<ConstantInt>(II->getOperand(2))) {
      +      // X - 0 -> {X, false}
      +      if (RHS->isZero()) {
      +        Constant *V[] = {
      +          UndefValue::get(II->getType()), ConstantInt::getFalse(*Context)
      +        };
      +        Constant *Struct = ConstantStruct::get(*Context, V, 2, false);
      +        return InsertValueInst::Create(Struct, II->getOperand(1), 0);
      +      }
      +    }
      +    break;
      +  case Intrinsic::umul_with_overflow:
      +  case Intrinsic::smul_with_overflow:
      +    // Canonicalize constants into the RHS.
      +    if (isa<Constant>(II->getOperand(1)) &&
      +        !isa<Constant>(II->getOperand(2))) {
      +      Value *LHS = II->getOperand(1);
      +      II->setOperand(1, II->getOperand(2));
      +      II->setOperand(2, LHS);
      +      return II;
      +    }
      +
      +    // X * undef -> undef
      +    if (isa<UndefValue>(II->getOperand(2)))
      +      return ReplaceInstUsesWith(CI, UndefValue::get(II->getType()));
      +      
      +    if (ConstantInt *RHSI = dyn_cast<ConstantInt>(II->getOperand(2))) {
      +      // X*0 -> {0, false}
      +      if (RHSI->isZero())
      +        return ReplaceInstUsesWith(CI, Constant::getNullValue(II->getType()));
      +      
      +      // X * 1 -> {X, false}
      +      if (RHSI->equalsInt(1)) {
      +        Constant *V[2];
      +        V[0] = UndefValue::get(II->getType());
      +        V[1] = ConstantInt::getFalse(*Context);
      +        Constant *Struct = ConstantStruct::get(*Context, V, 2, false);
      +        return InsertValueInst::Create(Struct, II->getOperand(1), 1);
      +      }
      +    }
      +    break;
         case Intrinsic::ppc_altivec_lvx:
         case Intrinsic::ppc_altivec_lvxl:
         case Intrinsic::x86_sse_loadu_ps:
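
The sign-bit reasoning behind the uadd_with_overflow cases above can be checked independently of LLVM. The following is a minimal sketch, assuming 8-bit operands; `UAddResult`, `uaddWithOverflow`, `signBitSet` and `signBitRulesHold` are hypothetical names introduced only for this illustration and are not part of the commit.

```cpp
#include <cstdint>

// Result of an unsigned add plus an overflow flag, loosely mirroring the
// {result, overflow} pair produced by the uadd-with-overflow intrinsic.
// (Hypothetical type, for illustration only.)
struct UAddResult {
  uint8_t Sum;
  bool Overflow;
};

// Reference semantics for an 8-bit unsigned add with overflow detection:
// do the add in a wider type and see whether it fits in 8 bits.
inline UAddResult uaddWithOverflow(uint8_t LHS, uint8_t RHS) {
  unsigned Wide = unsigned(LHS) + unsigned(RHS);
  return {static_cast<uint8_t>(Wide), Wide > 0xFFu};
}

// True if the top (sign) bit of an 8-bit value is set.
inline bool signBitSet(uint8_t V) { return (V & 0x80u) != 0; }

// Exhaustively check both rules over all 8-bit operand pairs:
//   both sign bits set   -> the add always overflows
//   both sign bits clear -> the add never overflows
inline bool signBitRulesHold() {
  for (unsigned L = 0; L < 256; ++L)
    for (unsigned R = 0; R < 256; ++R) {
      UAddResult Res = uaddWithOverflow(uint8_t(L), uint8_t(R));
      bool BothSet = signBitSet(uint8_t(L)) && signBitSet(uint8_t(R));
      bool BothClear = !signBitSet(uint8_t(L)) && !signBitSet(uint8_t(R));
      if (BothSet && !Res.Overflow) return false;
      if (BothClear && Res.Overflow) return false;
    }
  return true;
}
```

When the sign bits differ, neither rule applies and the overflow flag depends on the remaining bits, which is why the transform only fires in the two known cases.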
      
      
      
      
      From clattner at apple.com  Thu Nov 26 15:43:25 2009
      From: clattner at apple.com (Chris Lattner)
      Date: Thu, 26 Nov 2009 13:43:25 -0800
      Subject: [llvm-commits] Overflow intrinsics patch
      In-Reply-To: <17C1443D-34E2-4FE9-BBF9-13BECDA37A51@gmail.com>
      References: <17C1443D-34E2-4FE9-BBF9-13BECDA37A51@gmail.com>
      Message-ID: 
      
      
      On Nov 11, 2009, at 7:23 PM, Alastair Lynn wrote:
      
      > Hello-
      > 
      > Third time lucky, here is the absolutely bug-free patch.
      
      Thanks Alastair, sorry for the delay reviewing this.
      
      I committed a simplified version of the patch as r89970.  Please send in a testcase that covers these xforms.
      
      -Chris
      
      
      From clattner at apple.com  Thu Nov 26 15:44:34 2009
      From: clattner at apple.com (Chris Lattner)
      Date: Thu, 26 Nov 2009 13:44:34 -0800
      Subject: [llvm-commits] Invalid instcombine transformation
      In-Reply-To: 
      References: <1259267143.21630.143.camel@aslstation>
      	<6634FC61-E6AE-4FA4-BCF2-267F390C1735@apple.com>
      	
      Message-ID: <71FB32CC-9F96-4FC0-BDBE-55741E5BBA5B@apple.com>
      
      
      On Nov 26, 2009, at 1:30 PM, Anton Korobeynikov wrote:
      
      > Hi, Chris
      > 
      >> The target *has* to support unaligned accesses.
      > Sorry, this is definitely unreal :) Especially in embedded world. And
      > not only there (think about sparc/solaris).
      
      What do you mean?  Legalize can always 'fix' this for a target.  Unaligned load/store should always be legal for any target.
      
      -Chris
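
As an illustration of what "fixing" an unaligned load can mean conceptually, here is a minimal sketch that assembles a 32-bit little-endian value from four byte loads, which have no alignment requirement. Whether the legalizer emits exactly this sequence for a given target is not stated in this thread; `loadUnaligned32LE` is a hypothetical helper name.

```cpp
#include <cstdint>

// Load a 32-bit little-endian value from a possibly unaligned address by
// combining four single-byte loads with shifts and ORs.
// (Hypothetical helper, for illustration only.)
inline uint32_t loadUnaligned32LE(const unsigned char *P) {
  return uint32_t(P[0]) | (uint32_t(P[1]) << 8) | (uint32_t(P[2]) << 16) |
         (uint32_t(P[3]) << 24);
}
```

The cost is four narrow loads and three shift/OR pairs instead of one wide load, so a target with native unaligned access would normally prefer the single instruction.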
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 16:04:42 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 22:04:42 -0000
      Subject: [llvm-commits] [llvm] r89971 - in /llvm/trunk:
       lib/Transforms/Scalar/InstructionCombining.cpp
       test/Transforms/InstCombine/crash.ll test/Transforms/InstCombine/store.ll
      Message-ID: <200911262204.nAQM4gWY015836@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 16:04:42 2009
      New Revision: 89971
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89971&view=rev
      Log:
      Fix PR5471 by removing an instcombine xform.  Some pieces of the code
      generate a store to undef and some generate a store to null as the idiom
      for undefined behavior.  Since simplifycfg zaps both, don't remove the
      undefined behavior in instcombine.
      
      Modified:
          llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp
          llvm/trunk/test/Transforms/InstCombine/crash.ll
          llvm/trunk/test/Transforms/InstCombine/store.ll
      
      Modified: llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp?rev=89971&r1=89970&r2=89971&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp (original)
      +++ llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp Thu Nov 26 16:04:42 2009
      @@ -12096,12 +12096,6 @@
         Value *Val = SI.getOperand(0);
         Value *Ptr = SI.getOperand(1);
       
      -  if (isa<UndefValue>(Ptr)) {     // store X, undef -> noop (even if volatile)
      -    EraseInstFromFunction(SI);
      -    ++NumCombined;
      -    return 0;
      -  }
      -  
         // If the RHS is an alloca with a single use, zapify the store, making the
         // alloca dead.
         // If the RHS is an alloca with a two uses, the other one being a 
      
      Modified: llvm/trunk/test/Transforms/InstCombine/crash.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/crash.ll?rev=89971&r1=89970&r2=89971&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/InstCombine/crash.ll (original)
      +++ llvm/trunk/test/Transforms/InstCombine/crash.ll Thu Nov 26 16:04:42 2009
      @@ -125,3 +125,15 @@
         %v11 = select i1 %v5_, i64 0, i64 %v6
         ret i64 %v11
       }
      +
      +; PR5471
      +define arm_apcscc i32 @test5a() {
      +       ret i32 0
      +}
      +
      +define arm_apcscc void @test5() {
      +       store i1 true, i1* undef
      +       %1 = invoke i32 @test5a() to label %exit unwind label %exit
      +exit:
      +       ret void
      +}
      
      Modified: llvm/trunk/test/Transforms/InstCombine/store.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/store.ll?rev=89971&r1=89970&r2=89971&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/InstCombine/store.ll (original)
      +++ llvm/trunk/test/Transforms/InstCombine/store.ll Thu Nov 26 16:04:42 2009
      @@ -6,6 +6,7 @@
               store i32 124, i32* null
               ret void
       ; CHECK: @test1(
      +; CHECK-NEXT: store i32 123, i32* undef
       ; CHECK-NEXT: store i32 undef, i32* null
       ; CHECK-NEXT: ret void
       }
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 16:08:06 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 22:08:06 -0000
      Subject: [llvm-commits] [llvm] r89972 -
      	/llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp
      Message-ID: <200911262208.nAQM8636015959@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 16:08:06 2009
      New Revision: 89972
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89972&view=rev
      Log:
      fix crash on Transforms/InstCombine/intrinsics.ll introduced by r89970 
      
      Modified:
          llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp
      
      Modified: llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp?rev=89972&r1=89971&r2=89972&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp (original)
      +++ llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp Thu Nov 26 16:08:06 2009
      @@ -9877,12 +9877,15 @@
           const IntegerType *IT = cast<IntegerType>(II->getOperand(1)->getType());
           uint32_t BitWidth = IT->getBitWidth();
           APInt Mask = APInt::getSignBit(BitWidth);
      -    APInt LHSKnownZero, LHSKnownOne, RHSKnownZero, RHSKnownOne;
      +    APInt LHSKnownZero(BitWidth, 0);
      +    APInt LHSKnownOne(BitWidth, 0);
           ComputeMaskedBits(LHS, Mask, LHSKnownZero, LHSKnownOne);
           bool LHSKnownNegative = LHSKnownOne[BitWidth - 1];
           bool LHSKnownPositive = LHSKnownZero[BitWidth - 1];
       
           if (LHSKnownNegative || LHSKnownPositive) {
      +      APInt RHSKnownZero(BitWidth, 0);
      +      APInt RHSKnownOne(BitWidth, 0);
             ComputeMaskedBits(RHS, Mask, RHSKnownZero, RHSKnownOne);
             bool RHSKnownNegative = RHSKnownOne[BitWidth - 1];
             bool RHSKnownPositive = RHSKnownZero[BitWidth - 1];
      
      
      
      
      From nicholas at mxc.ca  Thu Nov 26 16:15:58 2009
      From: nicholas at mxc.ca (Nick Lewycky)
      Date: Thu, 26 Nov 2009 14:15:58 -0800
      Subject: [llvm-commits] [PATCH] LTO code generator options
      In-Reply-To: <352a1fb20911250925q4d4b8b9fwae1ac454f4c0341c@mail.gmail.com>
      References: <04F6B1512E264B27AEE607542FCDD113@andreic6e7fe55>	<41BA1AA405BC4D19BA9B4FAB6543D62F@andreic6e7fe55>	<38a0d8450911190723g644ad4c7ife769ab35da9efb9@mail.gmail.com>		<38a0d8450911200722i5efa690ci6ab671d71b5f40dc@mail.gmail.com>		<38a0d8450911231309t6f37e2a0ga7c9eaa50d495c60@mail.gmail.com>	<2E9E5BD4D91C4B32850CD83726EFE19B@andreic6e7fe55>	<352a1fb20911241558o4131950di5e3bac9db3a31e30@mail.gmail.com>	<3780A335BCB5498984B6E0408C2EDB0D@andreic6e7fe55>
      	<352a1fb20911250925q4d4b8b9fwae1ac454f4c0341c@mail.gmail.com>
      Message-ID: <4B0EFE1E.3000000@mxc.ca>
      
      Devang Patel wrote:
      > 3 - It is quite reasonable for some to put two copies of a function,
      > one for SSE3 machines, one for non-SSE machines, in one bitcode file
      > and let code generator generate appropriate code for each functions so
      > that the user can select desired function at run time. At Apple, we
      > supported similar requirements for Altivec vs. non-Altivec code. This
      > can be achieved if subtarget features like SSE3 are encoded in bitcode
      > files.
      
      Now that's an interesting idea. There's one obvious major issue with it, 
      which is that we would end up with two different function definitions 
      with the same name. Assuming we can overcome that, we then run into 
      issues with (for example) the inliner being unable to inline a function 
      because it doesn't know what the subtarget is.
      
      Otherwise, this 'fat llvm ir' sounds like it would work, if that's what 
      you really wanted to do-- which I don't think it is. I have trouble 
      creating an argument for why it's bad, but I note that this is contrary 
      to the way LLVM has approached portability thus far.
      
      Nick
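
For reference, the run-time selection Devang describes can be sketched in plain C++ with a function pointer. This is only an illustration under stated assumptions: `sumGeneric`, `sumWide`, `selectSum` and the `HasWideUnit` flag are hypothetical, and no real CPU feature detection is performed. Note that the two variants need distinct names here, which is the same naming problem raised above for two definitions in one bitcode file.

```cpp
#include <cstddef>

// Signature shared by both variants of the routine.
using SumFn = int (*)(const int *, std::size_t);

// Baseline variant: one element per iteration.
inline int sumGeneric(const int *D, std::size_t N) {
  int Acc = 0;
  for (std::size_t I = 0; I < N; ++I)
    Acc += D[I];
  return Acc;
}

// Stand-in for a variant tuned for a wider execution unit: two independent
// accumulators, two elements per iteration, plus a tail element if N is odd.
inline int sumWide(const int *D, std::size_t N) {
  int Acc0 = 0, Acc1 = 0;
  std::size_t I = 0;
  for (; I + 1 < N; I += 2) {
    Acc0 += D[I];
    Acc1 += D[I + 1];
  }
  if (I < N)
    Acc0 += D[I];
  return Acc0 + Acc1;
}

// Pick a variant once, based on a feature flag supplied by the caller.
inline SumFn selectSum(bool HasWideUnit) {
  return HasWideUnit ? sumWide : sumGeneric;
}
```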
      
      
      From anton at korobeynikov.info  Thu Nov 26 16:20:51 2009
      From: anton at korobeynikov.info (Anton Korobeynikov)
      Date: Fri, 27 Nov 2009 01:20:51 +0300
      Subject: [llvm-commits] Invalid instcombine transformation
      In-Reply-To: <71FB32CC-9F96-4FC0-BDBE-55741E5BBA5B@apple.com>
      References: <1259267143.21630.143.camel@aslstation>
      	<6634FC61-E6AE-4FA4-BCF2-267F390C1735@apple.com>
      	
      	<71FB32CC-9F96-4FC0-BDBE-55741E5BBA5B@apple.com>
      Message-ID: <1259274051.21630.147.camel@aslstation>
      
      Hi, Chris
      
      > What do you mean?  Legalize can always 'fix' this for a target.  Unaligned load/store should always be legal for any target.
      Looks like it didn't in this case... Maybe we've just caught a
      legalize bug as well. I will investigate.
      
      -- 
      With best regards, Anton Korobeynikov.
      
      Faculty of Mathematics & Mechanics, Saint Petersburg State University.
      
      
      
      From clattner at apple.com  Thu Nov 26 16:36:37 2009
      From: clattner at apple.com (Chris Lattner)
      Date: Thu, 26 Nov 2009 14:36:37 -0800
      Subject: [llvm-commits] Invalid instcombine transformation
      In-Reply-To: <1259274051.21630.147.camel@aslstation>
      References: <1259267143.21630.143.camel@aslstation>
      	<6634FC61-E6AE-4FA4-BCF2-267F390C1735@apple.com>
      	
      	<71FB32CC-9F96-4FC0-BDBE-55741E5BBA5B@apple.com>
      	<1259274051.21630.147.camel@aslstation>
      Message-ID: 
      
      
      On Nov 26, 2009, at 2:20 PM, Anton Korobeynikov wrote:
      
      > Hi, Chris
      > 
      >> What do you mean?  Legalize can always 'fix' this for a target.  Unaligned load/store should always be legal for any target.
      > Looks like it didn't in this case... Maybe we've just caught a
      > legalize bug as well. I will investigate.
      
      Thanks Anton!
      
      -Chris
      
      
      From sabre at nondot.org  Thu Nov 26 16:48:24 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 22:48:24 -0000
      Subject: [llvm-commits] [llvm] r89973 -
      	/llvm/trunk/lib/AsmParser/LLParser.cpp
      Message-ID: <200911262248.nAQMmOB2017485@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 16:48:23 2009
      New Revision: 89973
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89973&view=rev
      Log:
      Add a hack for PR5601, a crash on obsolete syntax that we plan to
      remove in LLVM 3.0
      
      Modified:
          llvm/trunk/lib/AsmParser/LLParser.cpp
      
      Modified: llvm/trunk/lib/AsmParser/LLParser.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/AsmParser/LLParser.cpp?rev=89973&r1=89972&r2=89973&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/AsmParser/LLParser.cpp (original)
      +++ llvm/trunk/lib/AsmParser/LLParser.cpp Thu Nov 26 16:48:23 2009
      @@ -2701,6 +2701,10 @@
         // Add all of the arguments we parsed to the function.
         Function::arg_iterator ArgIt = Fn->arg_begin();
         for (unsigned i = 0, e = ArgList.size(); i != e; ++i, ++ArgIt) {
      +    // If we run out of arguments in the Function prototype, exit early.
      +    // FIXME: REMOVE THIS IN LLVM 3.0, this is just for the mismatch case above.
      +    if (ArgIt == Fn->arg_end()) break;
      +    
           // If the argument has a name, insert it into the argument symbol table.
           if (ArgList[i].Name.empty()) continue;
       
      
      
      
      
      From nicholas at mxc.ca  Thu Nov 26 16:54:26 2009
      From: nicholas at mxc.ca (Nick Lewycky)
      Date: Thu, 26 Nov 2009 22:54:26 -0000
      Subject: [llvm-commits] [llvm] r89974 - /llvm/trunk/lib/VMCore/Metadata.cpp
      Message-ID: <200911262254.nAQMsQ5X017677@zion.cs.uiuc.edu>
      
      Author: nicholas
      Date: Thu Nov 26 16:54:26 2009
      New Revision: 89974
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89974&view=rev
      Log:
      Clean up file, no functionality change.
      
      Modified:
          llvm/trunk/lib/VMCore/Metadata.cpp
      
      Modified: llvm/trunk/lib/VMCore/Metadata.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Metadata.cpp?rev=89974&r1=89973&r2=89974&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/VMCore/Metadata.cpp (original)
      +++ llvm/trunk/lib/VMCore/Metadata.cpp Thu Nov 26 16:54:26 2009
      @@ -33,10 +33,8 @@
         StringMapEntry<MDString *> &Entry =
           pImpl->MDStringCache.GetOrCreateValue(Str);
         MDString *&S = Entry.getValue();
      -  if (S) return S;
      -  
      -  return S = 
      -    new MDString(Context, Entry.getKey());
      +  if (!S) S = new MDString(Context, Entry.getKey());
      +  return S;
       }
       
       MDString *MDString::get(LLVMContext &Context, const char *Str) {
      @@ -44,10 +42,8 @@
         StringMapEntry<MDString *> &Entry =
           pImpl->MDStringCache.GetOrCreateValue(Str ? StringRef(Str) : StringRef());
         MDString *&S = Entry.getValue();
      -  if (S) return S;
      -  
      -  return S = 
      -    new MDString(Context, Entry.getKey());
      +  if (!S) S = new MDString(Context, Entry.getKey());
      +  return S;
       }
       
       //===----------------------------------------------------------------------===//
      @@ -74,28 +70,19 @@
           ID.AddPointer(Vals[i]);
       
         void *InsertPoint;
      -  MDNode *N;
      -  {
      -    N = pImpl->MDNodeSet.FindNodeOrInsertPos(ID, InsertPoint);
      -  }  
      -  if (N) return N;
      -  
      -  N = pImpl->MDNodeSet.FindNodeOrInsertPos(ID, InsertPoint);
      +  MDNode *N = pImpl->MDNodeSet.FindNodeOrInsertPos(ID, InsertPoint);
         if (!N) {
           // InsertPoint will have been set by the FindNodeOrInsertPos call.
           N = new MDNode(Context, Vals, NumVals);
           pImpl->MDNodeSet.InsertNode(N, InsertPoint);
         }
      -
         return N;
       }
       
       /// ~MDNode - Destroy MDNode.
       MDNode::~MDNode() {
      -  {
      -    LLVMContextImpl *pImpl = getType()->getContext().pImpl;
      -    pImpl->MDNodeSet.RemoveNode(this);
      -  }
      +  LLVMContextImpl *pImpl = getType()->getContext().pImpl;
      +  pImpl->MDNodeSet.RemoveNode(this);
         delete [] Node;
         Node = NULL;
       }
      @@ -241,7 +228,7 @@
         /// the same metadata to In2.
         void copyMD(Instruction *In1, Instruction *In2);
       
      -  /// getHandlerNames - Populate client supplied smallvector using custome
      +  /// getHandlerNames - Populate client-supplied smallvector using custom
         /// metadata name and ID.
         void getHandlerNames(SmallVectorImpl >&) const;
       
      @@ -317,7 +304,7 @@
           }
         }
       }
      -  
      +
       /// removeAllMetadata - Remove all metadata attached with an instruction.
       void MetadataContextImpl::removeAllMetadata(Instruction *Inst) {
         MetadataStore.erase(Inst);
      @@ -454,12 +441,12 @@
       void MetadataContext::addMD(unsigned Kind, MDNode *Node, Instruction *Inst) {
         pImpl->addMD(Kind, Node, Inst);
       }
      -  
      +
       /// removeMD - Remove metadata of given kind attached with an instuction.
       void MetadataContext::removeMD(unsigned Kind, Instruction *Inst) {
         pImpl->removeMD(Kind, Inst);
       }
      -  
      +
       /// removeAllMetadata - Remove all metadata attached with an instruction.
       void MetadataContext::removeAllMetadata(Instruction *Inst) {
         pImpl->removeAllMetadata(Inst);
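
The MDString::get cleanup above relies on a get-or-create idiom: take a reference to the cache slot, fill it if it is empty, and return it. A minimal sketch of the same idiom with standard containers, assuming nothing about LLVM's StringMap (`Interned` and `InternTable` are hypothetical names):

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical uniqued object, standing in for a cached string node.
struct Interned {
  std::string Text;
};

// Hands out one shared Interned object per distinct string.
class InternTable {
  std::unordered_map<std::string, std::unique_ptr<Interned>> Cache;

public:
  const Interned *get(const std::string &Str) {
    // operator[] creates an empty slot on first use; Slot refers to it.
    std::unique_ptr<Interned> &Slot = Cache[Str];
    // The new object must be stored through the reference. Creating it
    // without assigning would leave the slot empty and return null.
    if (!Slot)
      Slot.reset(new Interned{Str});
    return Slot.get();
  }

  std::size_t size() const { return Cache.size(); }
};
```

Because the slot is bound by reference, a single lookup serves both the hit and the miss path.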
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 17:18:49 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 23:18:49 -0000
      Subject: [llvm-commits] [llvm] r89975 -
      	/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      Message-ID: <200911262318.nAQNInQO018510@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 17:18:49 2009
      New Revision: 89975
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89975&view=rev
      Log:
      factor some code out into some helper functions.
      
      Modified:
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=89975&r1=89974&r2=89975&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Thu Nov 26 17:18:49 2009
      @@ -684,6 +684,33 @@
         }
       }
       
      +/// isPHITranslatable - Return true if the specified computation is derived from
      +/// a PHI node in the current block and if it is simple enough for us to handle.
      +static bool isPHITranslatable(Instruction *Inst) {
      +  if (isa<PHINode>(Inst))
      +    return true;
      +  
      +  // TODO: BITCAST, GEP.
      +
      +  // ...
      +  
      +  //   cerr << "MEMDEP: Could not PHI translate: " << *Pointer;
      +  //   if (isa(PtrInst) || isa(PtrInst))
      +  //     cerr << "OP:\t\t\t\t" << *PtrInst->getOperand(0);
      +  
      +  return false;
      +}
      +
      +/// PHITranslateForPred - Given a computation that satisfied the
      +/// isPHITranslatable predicate, see if we can translate the computation into
      +/// the specified predecessor block.  If so, return that value.
      +static Value *PHITranslateForPred(Instruction *Inst, BasicBlock *Pred) {
      +  if (PHINode *PN = dyn_cast<PHINode>(Inst))
      +    return PN->getIncomingValueForBlock(Pred);
      +  
      +  return 0;
      +}
      +
       
       /// getNonLocalPointerDepFromBB - Perform a dependency query based on
       /// pointer/pointeesize starting at the end of StartBB.  Add any clobber/def
      @@ -827,14 +854,18 @@
             NumSortedEntries = Cache->size();
           }
           
      -    // If this is directly a PHI node, just use the incoming values for each
      -    // pred as the phi translated version.
      -    if (PHINode *PtrPHI = dyn_cast<PHINode>(PtrInst)) {
      +    // If this is a computation derived from a PHI node, use the suitably
      +    // translated incoming values for each pred as the phi translated version.
      +    if (isPHITranslatable(PtrInst)) {
             Cache = 0;
             
             for (BasicBlock **PI = PredCache->GetPreds(BB); *PI; ++PI) {
               BasicBlock *Pred = *PI;
      -        Value *PredPtr = PtrPHI->getIncomingValueForBlock(Pred);
      +        Value *PredPtr = PHITranslateForPred(PtrInst, Pred);
      +        
      +        // If PHI translation fails, bail out.
      +        if (PredPtr == 0)
      +          goto PredTranslationFailure;
               
               // Check to see if we have already visited this pred block with another
               // pointer.  If so, we can't do this lookup.  This failure can occur
      @@ -881,12 +912,7 @@
             SkipFirstBlock = false;
             continue;
           }
      -    
      -    // TODO: BITCAST, GEP.
      -    
      -    //   cerr << "MEMDEP: Could not PHI translate: " << *Pointer;
      -    //   if (isa(PtrInst) || isa(PtrInst))
      -    //     cerr << "OP:\t\t\t\t" << *PtrInst->getOperand(0);
      +
         PredTranslationFailure:
           
           if (Cache == 0) {
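
The split into a predicate and a per-predecessor translation step can be sketched outside LLVM with toy types. This is an illustration only: `Block`, `Value`, and the lower-case `phiTranslateForPred` below are hypothetical stand-ins and do not reproduce the real classes.

```cpp
#include <map>
#include <string>

// Toy basic block, identified by address.
struct Block {
  std::string Name;
};

// Toy value: either an ordinary value, or a PHI carrying one incoming value
// per predecessor block.
struct Value {
  bool IsPHI = false;
  std::map<const Block *, const Value *> Incoming;
};

// Only PHIs are handled in this sketch, matching the current state of the
// helper in the patch (bitcasts and GEPs are still a TODO there).
inline bool isPHITranslatable(const Value &V) { return V.IsPHI; }

// Return the value V takes when control arrives from Pred, or null if the
// translation is not possible. Callers must handle the null case.
inline const Value *phiTranslateForPred(const Value &V, const Block *Pred) {
  if (!V.IsPHI)
    return nullptr;
  auto It = V.Incoming.find(Pred);
  return It == V.Incoming.end() ? nullptr : It->second;
}
```

Returning null on failure is what lets the caller bail out to a common failure path, as the new goto in the hunk above does.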
      
      
      
      
      From nicholas at mxc.ca  Thu Nov 26 17:19:05 2009
      From: nicholas at mxc.ca (Nick Lewycky)
      Date: Thu, 26 Nov 2009 23:19:05 -0000
      Subject: [llvm-commits] [llvm] r89976 - /llvm/trunk/lib/VMCore/Metadata.cpp
      Message-ID: <200911262319.nAQNJ67v018528@zion.cs.uiuc.edu>
      
      Author: nicholas
      Date: Thu Nov 26 17:19:05 2009
      New Revision: 89976
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89976&view=rev
      Log:
      Fix typo spotted by Gabor Greif.
      
      Modified:
          llvm/trunk/lib/VMCore/Metadata.cpp
      
      Modified: llvm/trunk/lib/VMCore/Metadata.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Metadata.cpp?rev=89976&r1=89975&r2=89976&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/VMCore/Metadata.cpp (original)
      +++ llvm/trunk/lib/VMCore/Metadata.cpp Thu Nov 26 17:19:05 2009
      @@ -218,7 +218,7 @@
         /// addMD - Attach the metadata of given kind to an Instruction.
         void addMD(unsigned Kind, MDNode *Node, Instruction *Inst);
         
      -  /// removeMD - Remove metadata of given kind attached with an instuction.
      +  /// removeMD - Remove metadata of given kind attached with an instruction.
         void removeMD(unsigned Kind, Instruction *Inst);
         
         /// removeAllMetadata - Remove all metadata attached with an instruction.
      @@ -289,7 +289,7 @@
         Info.push_back(std::make_pair(MDKind, Node));
       }
       
      -/// removeMD - Remove metadata of given kind attached with an instuction.
      +/// removeMD - Remove metadata of given kind attached with an instruction.
       void MetadataContextImpl::removeMD(unsigned Kind, Instruction *Inst) {
         MDStoreTy::iterator I = MetadataStore.find(Inst);
         if (I == MetadataStore.end())
      @@ -442,7 +442,7 @@
         pImpl->addMD(Kind, Node, Inst);
       }
       
      -/// removeMD - Remove metadata of given kind attached with an instuction.
      +/// removeMD - Remove metadata of given kind attached with an instruction.
       void MetadataContext::removeMD(unsigned Kind, Instruction *Inst) {
         pImpl->removeMD(Kind, Inst);
       }
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 17:33:00 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 23:33:00 -0000
      Subject: [llvm-commits] [llvm] r89977 -
      	/llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      Message-ID: <200911262333.nAQNX0Qv018969@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 17:32:59 2009
      New Revision: 89977
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89977&view=rev
      Log:
      convert to filecheck
      
      Modified:
          llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      
      Modified: llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll?rev=89977&r1=89976&r2=89977&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll Thu Nov 26 17:32:59 2009
      @@ -1,32 +1,36 @@
      -; RUN: opt < %s -gvn -S | grep {%cv = phi i32}
      -; RUN: opt < %s -gvn -S | grep {%bv = phi i32}
      +; RUN: opt < %s -gvn -S | FileCheck %s
      +
       target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128"
       target triple = "i386-apple-darwin7"
       
      -define i32 @g(i32* %b, i32* %c) nounwind {
      +define i32 @test1(i32* %b, i32* %c) nounwind {
       entry:
      -	%g = alloca i32		;  [#uses=4]
      -	%t1 = icmp eq i32* %b, null		;  [#uses=1]
      +	%g = alloca i32
      +	%t1 = icmp eq i32* %b, null
       	br i1 %t1, label %bb, label %bb1
       
      -bb:		; preds = %entry
      -	%t2 = load i32* %c, align 4		;  [#uses=1]
      -	%t3 = add i32 %t2, 1		;  [#uses=1]
      +bb:
      +	%t2 = load i32* %c, align 4
      +	%t3 = add i32 %t2, 1
       	store i32 %t3, i32* %g, align 4
       	br label %bb2
       
       bb1:		; preds = %entry
      -	%t5 = load i32* %b, align 4		;  [#uses=1]
      -	%t6 = add i32 %t5, 1		;  [#uses=1]
      +	%t5 = load i32* %b, align 4
      +	%t6 = add i32 %t5, 1
       	store i32 %t6, i32* %g, align 4
       	br label %bb2
       
       bb2:		; preds = %bb1, %bb
      -	%c_addr.0 = phi i32* [ %g, %bb1 ], [ %c, %bb ]		;  [#uses=1]
      -	%b_addr.0 = phi i32* [ %b, %bb1 ], [ %g, %bb ]		;  [#uses=1]
      -	%cv = load i32* %c_addr.0, align 4		;  [#uses=1]
      -	%bv = load i32* %b_addr.0, align 4		;  [#uses=1]
      -	%ret = add i32 %cv, %bv		;  [#uses=1]
      +	%c_addr.0 = phi i32* [ %g, %bb1 ], [ %c, %bb ]
      +	%b_addr.0 = phi i32* [ %b, %bb1 ], [ %g, %bb ]
      +	%cv = load i32* %c_addr.0, align 4
      +	%bv = load i32* %b_addr.0, align 4
      +; CHECK: %bv = phi i32
      +; CHECK: %cv = phi i32
      +; CHECK-NOT: load
      +; CHECK: ret i32
      +	%ret = add i32 %cv, %bv
       	ret i32 %ret
       }
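
For readers unfamiliar with the CHECK / CHECK-NOT pattern used in this test, here is a rough sketch of the matching rules in plain C++. It is a toy model only: Directive and runChecks are hypothetical names, only literal substrings are matched, and real FileCheck supports much more (regexes, variables, other directive kinds).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One directive from the test file. Only plain substrings are modeled here.
struct Directive {
  bool IsNot;        // true for CHECK-NOT, false for CHECK
  std::string Text;  // text that must (or must not) appear
};

// True if no forbidden string occurs anywhere in Region.
static bool regionIsClean(const std::string &Region,
                          const std::vector<std::string> &Forbidden) {
  for (const std::string &F : Forbidden)
    if (Region.find(F) != std::string::npos)
      return false;
  return true;
}

// Returns true when every CHECK matches in order and no CHECK-NOT text
// appears between the CHECK matches that surround it.
bool runChecks(const std::string &Output, const std::vector<Directive> &Dirs) {
  std::size_t Pos = 0;                   // just past the previous CHECK match
  std::vector<std::string> PendingNots;  // CHECK-NOTs waiting for next CHECK
  for (const Directive &D : Dirs) {
    assert(!D.Text.empty() && "directive needs a pattern");
    if (D.IsNot) {
      PendingNots.push_back(D.Text);
      continue;
    }
    std::size_t Found = Output.find(D.Text, Pos);
    if (Found == std::string::npos)
      return false;  // a CHECK line failed to match
    if (!regionIsClean(Output.substr(Pos, Found - Pos), PendingNots))
      return false;  // forbidden text shows up before this match
    PendingNots.clear();
    Pos = Found + D.Text.size();
  }
  // CHECK-NOTs after the last CHECK apply to the rest of the output.
  return regionIsClean(Output.substr(Pos), PendingNots);
}
```

Under this model, the directives in @test1 pass only if both phis appear in order and no "load" text remains between the %cv phi and the return.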
       
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 17:41:07 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Thu, 26 Nov 2009 23:41:07 -0000
      Subject: [llvm-commits] [llvm] r89978 - in /llvm/trunk:
       lib/Analysis/MemoryDependenceAnalysis.cpp lib/Target/README.txt
       test/Transforms/GVN/rle-phi-translate.ll
      Message-ID: <200911262341.nAQNf7Uf019235@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 17:41:07 2009
      New Revision: 89978
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89978&view=rev
      Log:
      Teach memdep to phi translate bitcasts.  This allows us to compile
      the example in GCC PR16799 to:
      
      LBB1_2:                                                     ## %bb1
      	movl	%eax, %eax
      	subq	%rax, %rdi
      	movq	%rdi, (%rcx)
      	movl	(%rdi), %eax
      	testl	%eax, %eax
      	je	LBB1_2
      
      instead of:
      
      LBB1_2:                                                     ## %bb1
      	movl	(%rdi), %ecx
      	subq	%rcx, %rdi
      	movq	%rdi, (%rax)
      	cmpl	$0, (%rdi)
      	je	LBB1_2
      
      
      Modified:
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
          llvm/trunk/lib/Target/README.txt
          llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=89978&r1=89977&r2=89978&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Thu Nov 26 17:41:07 2009
      @@ -690,10 +690,15 @@
         if (isa<PHINode>(Inst))
           return true;
         
      -  // TODO: BITCAST, GEP.
      -
      -  // ...
      +  // We can handle bitcast of a PHI, but the PHI needs to be in the same block
      +  // as the bitcast.
      +  if (BitCastInst *BC = dyn_cast<BitCastInst>(Inst))
      +    if (PHINode *PN = dyn_cast<PHINode>(BC->getOperand(0)))
      +      if (PN->getParent() == BC->getParent())
      +        return true;
         
      +  // TODO: GEP, ...
      +
         //   cerr << "MEMDEP: Could not PHI translate: " << *Pointer;
         //   if (isa(PtrInst) || isa(PtrInst))
         //     cerr << "OP:\t\t\t\t" << *PtrInst->getOperand(0);
      @@ -708,6 +713,25 @@
         if (PHINode *PN = dyn_cast<PHINode>(Inst))
           return PN->getIncomingValueForBlock(Pred);
         
      +  if (BitCastInst *BC = dyn_cast<BitCastInst>(Inst)) {
      +    PHINode *BCPN = cast<PHINode>(BC->getOperand(0));
      +    Value *PHIIn = BCPN->getIncomingValueForBlock(Pred);
      +    
      +    // Constants are trivial to phi translate.
      +    if (Constant *C = dyn_cast<Constant>(PHIIn))
      +      return ConstantExpr::getBitCast(C, BC->getType());
      +    
      +    // Otherwise we have to see if a bitcasted version of the incoming pointer
      +    // is available.  If so, we can use it, otherwise we have to fail.
      +    for (Value::use_iterator UI = PHIIn->use_begin(), E = PHIIn->use_end();
      +         UI != E; ++UI) {
      +      if (BitCastInst *BCI = dyn_cast<BitCastInst>(*UI))
      +        if (BCI->getType() == BC->getType())
      +          return BCI;
      +    }
      +    return 0;
      +  }
      +
         return 0;
       }
       
      
      Modified: llvm/trunk/lib/Target/README.txt
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/README.txt?rev=89978&r1=89977&r2=89978&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Target/README.txt (original)
      +++ llvm/trunk/lib/Target/README.txt Thu Nov 26 17:41:07 2009
      @@ -1284,8 +1284,6 @@
       http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34677 (licm does this, LPRE crit edge)
         llvm-gcc t2.c -S -o - -O0 -emit-llvm | llvm-as | opt -mem2reg -simplifycfg -gvn | llvm-dis
       
      -http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16799 [BITCAST PHI TRANS]
      -
       //===---------------------------------------------------------------------===//
       
       Type based alias analysis:
      
      Modified: llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll?rev=89978&r1=89977&r2=89978&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll Thu Nov 26 17:41:07 2009
      @@ -4,6 +4,7 @@
       target triple = "i386-apple-darwin7"
       
       define i32 @test1(i32* %b, i32* %c) nounwind {
      +; CHECK: @test1
       entry:
       	%g = alloca i32
       	%t1 = icmp eq i32* %b, null
      @@ -34,3 +35,28 @@
       	ret i32 %ret
       }
       
      +define i8 @test2(i1 %cond, i32* %b, i32* %c) nounwind {
      +; CHECK: @test2
      +entry:
      +	br i1 %cond, label %bb, label %bb1
      +
      +bb:
      +  %b1 = bitcast i32* %b to i8*
      +  store i8 4, i8* %b1
      +	br label %bb2
      +
      +bb1:
      +  %c1 = bitcast i32* %c to i8*
      +  store i8 92, i8* %c1
      +	br label %bb2
      +
      +bb2:
      +	%d = phi i32* [ %c, %bb1 ], [ %b, %bb ]
      +  %d1 = bitcast i32* %d to i8*
      +	%dv = load i8* %d1
      +; CHECK: %dv = phi i8
      +; CHECK-NOT: load
      +; CHECK: ret i8 %dv
      +	ret i8 %dv
      +}
      +
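
The bitcast case above can be sketched outside of LLVM roughly as follows. This is a toy model under stated assumptions: Node is a hypothetical stand-in for llvm::Value, types are plain strings, and the constant-folding path (ConstantExpr::getBitCast) is left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an IR value; not the real llvm::Value.
struct Node {
  enum Kind { Arg, Phi, BitCast };
  Kind K = Arg;
  std::string Type;                        // result type, e.g. "i8*"
  std::string Block;                       // block that defines the value
  Node *Src = nullptr;                     // BitCast only: the cast operand
  std::map<std::string, Node *> Incoming;  // Phi only: pred block -> value
  std::vector<Node *> Users;               // instructions using this value
};

// True for a PHI, or for a bitcast whose operand is a PHI in the same block.
bool isPHITranslatable(const Node &N) {
  if (N.K == Node::Phi)
    return true;
  return N.K == Node::BitCast && N.Src && N.Src->K == Node::Phi &&
         N.Src->Block == N.Block;
}

// Translate N into predecessor Pred. Returns nullptr when no equivalent
// value already exists, which the caller treats as a translation failure.
Node *phiTranslateForPred(Node &N, const std::string &Pred) {
  assert(isPHITranslatable(N) && "caller must check the predicate first");
  if (N.K == Node::Phi)
    return N.Incoming.at(Pred);
  Node *PHIIn = N.Src->Incoming.at(Pred);
  // Reuse an existing bitcast of the incoming pointer to the same type.
  for (Node *U : PHIIn->Users)
    if (U->K == Node::BitCast && U->Type == N.Type)
      return U;
  return nullptr;
}
```

With the values of @test2 modeled this way, translating %d1 into %bb yields %b1 and into %bb1 yields %c1, which is what lets the load become a phi of the stored values.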
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 18:07:37 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 00:07:37 -0000
      Subject: [llvm-commits] [llvm] r89979 - in /llvm/trunk:
       lib/Analysis/MemoryDependenceAnalysis.cpp
       test/Transforms/GVN/rle-phi-translate.ll
      Message-ID: <200911270007.nAR07cYO019941@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 18:07:37 2009
      New Revision: 89979
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89979&view=rev
      Log:
      teach memdep to do trivial PHI translation of GEPs.  More to
      come.
      
      Modified:
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
          llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=89979&r1=89978&r2=89979&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Thu Nov 26 18:07:37 2009
      @@ -697,7 +697,14 @@
             if (PN->getParent() == BC->getParent())
               return true;
         
      -  // TODO: GEP, ...
      +  // We can translate a GEP that uses a PHI in the current block for at least
      +  // one of its operands.
      +  if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Inst)) {
      +    for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i)
      +      if (PHINode *PN = dyn_cast<PHINode>(GEP->getOperand(i)))
      +        if (PN->getParent() == GEP->getParent())
      +          return true;
      +  }
       
         //   cerr << "MEMDEP: Could not PHI translate: " << *Pointer;
         //   if (isa(PtrInst) || isa(PtrInst))
      @@ -713,6 +720,7 @@
         if (PHINode *PN = dyn_cast<PHINode>(Inst))
           return PN->getIncomingValueForBlock(Pred);
         
      +  // Handle bitcast of PHI.
         if (BitCastInst *BC = dyn_cast<BitCastInst>(Inst)) {
           PHINode *BCPN = cast<PHINode>(BC->getOperand(0));
           Value *PHIIn = BCPN->getIncomingValueForBlock(Pred);
      @@ -732,6 +740,39 @@
           return 0;
         }
       
      +  // Handle getelementptr with at least one PHI operand.
      +  if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Inst)) {
      +    SmallVector GEPOps;
      +    Value *APHIOp = 0;
      +    for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i) {
      +      GEPOps.push_back(GEP->getOperand(i));
      +      if (PHINode *PN = dyn_cast<PHINode>(GEP->getOperand(i)))
      +        if (PN->getParent() == GEP->getParent())
      +          GEPOps.back() = APHIOp = PN->getIncomingValueForBlock(Pred);
      +    }
      +    
      +    // TODO: Simplify the GEP to handle 'gep x, 0' -> x etc.
      +    
      +    // Scan to see if we have this GEP available.
      +    for (Value::use_iterator UI = APHIOp->use_begin(), E = APHIOp->use_end();
      +         UI != E; ++UI) {
      +      if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(*UI))
      +        if (GEPI->getType() == GEP->getType() &&
      +            GEPI->getNumOperands() == GEPOps.size() &&
      +            GEPI->getParent()->getParent() == Inst->getParent()->getParent()) {
      +          bool Mismatch = false;
      +          for (unsigned i = 0, e = GEPOps.size(); i != e; ++i)
      +            if (GEPI->getOperand(i) != GEPOps[i]) {
      +              Mismatch = true;
      +              break;
      +            }
      +          if (!Mismatch)
      +            return GEPI;
      +        }
      +    }
      +    return 0;
      +  }
      +  
         return 0;
       }
       
      
      Modified: llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll?rev=89979&r1=89978&r2=89979&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll Thu Nov 26 18:07:37 2009
      @@ -54,9 +54,35 @@
       	%d = phi i32* [ %c, %bb1 ], [ %b, %bb ]
         %d1 = bitcast i32* %d to i8*
       	%dv = load i8* %d1
      -; CHECK: %dv = phi i8
      +; CHECK: %dv = phi i8 [ 92, %bb1 ], [ 4, %bb ]
       ; CHECK-NOT: load
       ; CHECK: ret i8 %dv
       	ret i8 %dv
       }
       
      +define i32 @test3(i1 %cond, i32* %b, i32* %c) nounwind {
      +; CHECK: @test3
      +entry:
      +	br i1 %cond, label %bb, label %bb1
      +
      +bb:
      +  %b1 = getelementptr i32* %b, i32 17
      +  store i32 4, i32* %b1
      +	br label %bb2
      +
      +bb1:
      +  %c1 = getelementptr i32* %c, i32 7
      +  store i32 82, i32* %c1
      +	br label %bb2
      +
      +bb2:
      +	%d = phi i32* [ %c, %bb1 ], [ %b, %bb ]
      +	%i = phi i32 [ 7, %bb1 ], [ 17, %bb ]
      +  %d1 = getelementptr i32* %d, i32 %i
      +	%dv = load i32* %d1
      +; CHECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      +; CHECK-NOT: load
      +; CHECK: ret i32 %dv
      +	ret i32 %dv
      +}
      +
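
As a rough illustration of the GEP case: substitute each same-block PHI operand with its incoming value for the predecessor, then look through the uses of one substituted operand for a GEP with identical operands. The sketch below is a toy model, not the LLVM code: Inst is a hypothetical type, constants are modeled as shared Leaf nodes, and the type and same-function checks from the patch are omitted.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy instruction/value node; not the real LLVM classes.
struct Inst {
  enum Kind { Leaf, Phi, Gep };
  Kind K = Leaf;
  std::string Block;                       // defining block
  std::vector<Inst *> Ops;                 // Gep only: pointer, then indices
  std::map<std::string, Inst *> Incoming;  // Phi only: pred block -> value
  std::vector<Inst *> Users;               // instructions using this value
};

// Returns an existing GEP that computes G's address in predecessor Pred,
// or nullptr if none is available.
Inst *translateGEP(Inst &G, const std::string &Pred) {
  assert(G.K == Inst::Gep && "expected a getelementptr");
  std::vector<Inst *> NewOps;
  Inst *APHIOp = nullptr;  // some operand that was PHI translated
  for (Inst *Op : G.Ops) {
    NewOps.push_back(Op);
    if (Op->K == Inst::Phi && Op->Block == G.Block)
      NewOps.back() = APHIOp = Op->Incoming.at(Pred);
  }
  if (!APHIOp)
    return nullptr;  // no PHI operand in this block, nothing to translate
  // Any matching GEP must use APHIOp, so scanning its users is enough.
  for (Inst *U : APHIOp->Users)
    if (U->K == Inst::Gep && U->Ops == NewOps)
      return U;
  return nullptr;
}
```

Modeling @test3 this way, %d1 translates to %b1 in %bb and to %c1 in %bb1.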
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 18:29:05 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 00:29:05 -0000
      Subject: [llvm-commits] [llvm] r89980 - in /llvm/trunk:
       include/llvm/Analysis/InstructionSimplify.h
       lib/Analysis/InstructionSimplify.cpp
       lib/Transforms/Scalar/InstructionCombining.cpp
      Message-ID: <200911270029.nAR0T6bV020547@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 18:29:05 2009
      New Revision: 89980
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89980&view=rev
      Log:
      factor some instcombine simplifications for getelementptr out to a new 
      SimplifyGEPInst method in InstructionSimplify.h.  No functionality change.
      
      Modified:
          llvm/trunk/include/llvm/Analysis/InstructionSimplify.h
          llvm/trunk/lib/Analysis/InstructionSimplify.cpp
          llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp
      
      Modified: llvm/trunk/include/llvm/Analysis/InstructionSimplify.h
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/InstructionSimplify.h?rev=89980&r1=89979&r2=89980&view=diff
      
      ==============================================================================
      --- llvm/trunk/include/llvm/Analysis/InstructionSimplify.h (original)
      +++ llvm/trunk/include/llvm/Analysis/InstructionSimplify.h Thu Nov 26 18:29:05 2009
      @@ -42,6 +42,11 @@
                                 const TargetData *TD = 0);
         
       
      +  /// SimplifyGEPInst - Given operands for a GetElementPtrInst, see if we can
      +  /// fold the result.  If not, this returns null.
      +  Value *SimplifyGEPInst(Value * const *Ops, unsigned NumOps,
      +                         const TargetData *TD = 0);
      +  
         //=== Helper functions for higher up the class hierarchy.
         
         
      
      Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=89980&r1=89979&r2=89980&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original)
      +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Nov 26 18:29:05 2009
      @@ -264,6 +264,34 @@
         return 0;
       }
       
      +/// SimplifyGEPInst - Given operands for a GetElementPtrInst, see if we can
      +/// fold the result.  If not, this returns null.
      +Value *llvm::SimplifyGEPInst(Value *const *Ops, unsigned NumOps,
      +                             const TargetData *TD) {
      +  // getelementptr P -> P.
      +  if (NumOps == 1)
      +    return Ops[0];
      +
      +  // TODO.
      +  //if (isa<UndefValue>(Ops[0]))
      +  //  return UndefValue::get(GEP.getType());
      +
      +  // getelementptr P, 0 -> P.
      +  if (NumOps == 2)
      +    if (ConstantInt *C = dyn_cast<ConstantInt>(Ops[1]))
      +      if (C->isZero())
      +        return Ops[0];
      +  
      +  // Check to see if this is constant foldable.
      +  for (unsigned i = 0; i != NumOps; ++i)
      +    if (!isa<Constant>(Ops[i]))
      +      return 0;
      +  
      +  return ConstantExpr::getGetElementPtr(cast<Constant>(Ops[0]),
      +                                        (Constant *const*)Ops+1, NumOps-1);
      +}
      +
      +
       //=== Helper functions for higher up the class hierarchy.
       
       /// SimplifyBinOp - Given operands for a BinaryOperator, see if we can
      @@ -309,6 +337,10 @@
         case Instruction::FCmp:
           return SimplifyFCmpInst(cast<FCmpInst>(I)->getPredicate(),
                                   I->getOperand(0), I->getOperand(1), TD);
      +  case Instruction::GetElementPtr: {
      +    SmallVector Ops(I->op_begin(), I->op_end());
      +    return SimplifyGEPInst(&Ops[0], Ops.size(), TD);
      +  }
         }
       }
       
      
      Modified: llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp?rev=89980&r1=89979&r2=89980&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp (original)
      +++ llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp Thu Nov 26 18:29:05 2009
      @@ -11429,21 +11429,16 @@
       }
       
       Instruction *InstCombiner::visitGetElementPtrInst(GetElementPtrInst &GEP) {
      +  SmallVector Ops(GEP.op_begin(), GEP.op_end());
      +
      +  if (Value *V = SimplifyGEPInst(&Ops[0], Ops.size(), TD))
      +    return ReplaceInstUsesWith(GEP, V);
      +
         Value *PtrOp = GEP.getOperand(0);
      -  // Eliminate 'getelementptr %P, i32 0' and 'getelementptr %P', they are noops.
      -  if (GEP.getNumOperands() == 1)
      -    return ReplaceInstUsesWith(GEP, PtrOp);
       
         if (isa<UndefValue>(GEP.getOperand(0)))
           return ReplaceInstUsesWith(GEP, UndefValue::get(GEP.getType()));
       
      -  bool HasZeroPointerIndex = false;
      -  if (Constant *C = dyn_cast<Constant>(GEP.getOperand(1)))
      -    HasZeroPointerIndex = C->isNullValue();
      -
      -  if (GEP.getNumOperands() == 2 && HasZeroPointerIndex)
      -    return ReplaceInstUsesWith(GEP, PtrOp);
      -
         // Eliminate unneeded casts for indices.
         if (TD) {
           bool MadeChange = false;
      @@ -11548,6 +11543,10 @@
             return 0;
           }
           
      +    bool HasZeroPointerIndex = false;
      +    if (ConstantInt *C = dyn_cast<ConstantInt>(GEP.getOperand(1)))
      +      HasZeroPointerIndex = C->isZero();
      +    
           // Transform: GEP (bitcast [10 x i8]* X to [0 x i8]*), i32 0, ...
           // into     : GEP [10 x i8]* X, i32 0, ...
           //
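
The two identity folds that SimplifyGEPInst takes over from instcombine can be sketched without LLVM as below. Operand and simplifyGEP are hypothetical names for this illustration; the operand-array-plus-count shape only mimics the new interface, and the constant-folding path for all-constant operands is not modeled.

```cpp
#include <cassert>
#include <cstddef>

// Toy operand: either an opaque value or a constant integer.
struct Operand {
  bool IsConstInt;
  long Value;  // meaningful only when IsConstInt is true
};

// Mirrors the two identity folds: 'gep P' -> P and 'gep P, 0' -> P.
// Returns the surviving operand, or nullptr when nothing folds.
const Operand *simplifyGEP(const Operand *Ops, std::size_t NumOps) {
  assert(NumOps >= 1 && "a GEP always has a pointer operand");
  if (NumOps == 1)
    return &Ops[0];
  if (NumOps == 2 && Ops[1].IsConstInt && Ops[1].Value == 0)
    return &Ops[0];
  return nullptr;  // folding of all-constant GEPs is left out of this sketch
}
```

Because the function takes bare operands rather than an instruction, a caller can ask about a GEP that does not exist yet, which seems to be the point of factoring it out of instcombine.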
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 18:34:38 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 00:34:38 -0000
      Subject: [llvm-commits] [llvm] r89981 - in /llvm/trunk:
       lib/Analysis/MemoryDependenceAnalysis.cpp
       test/Transforms/GVN/rle-phi-translate.ll
      Message-ID: <200911270034.nAR0YdXq020728@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 18:34:38 2009
      New Revision: 89981
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89981&view=rev
      Log:
      teach phi translation of GEPs to simplify geps like 'gep x, 0'.
      This allows us to compile the example from PR5313 into:
      
      LBB1_2:                                                     ## %bb
      	incl	%ecx
      	movb	%al, (%rsi)
      	movslq	%ecx, %rax
      	movb	(%rdi,%rax), %al
      	testb	%al, %al
      	jne	LBB1_2
      
      instead of:
      
      LBB1_2:                                                     ## %bb
      	movslq	%eax, %rcx
      	incl	%eax
      	movb	(%rdi,%rcx), %cl
      	movb	%cl, (%rsi)
      	movslq	%eax, %rcx
      	cmpb	$0, (%rdi,%rcx)
      	jne	LBB1_2
      
      
      Modified:
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
          llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=89981&r1=89980&r2=89981&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Thu Nov 26 18:34:38 2009
      @@ -20,6 +20,7 @@
       #include "llvm/IntrinsicInst.h"
       #include "llvm/Function.h"
       #include "llvm/Analysis/AliasAnalysis.h"
      +#include "llvm/Analysis/InstructionSimplify.h"
       #include "llvm/Analysis/MemoryBuiltins.h"
       #include "llvm/ADT/Statistic.h"
       #include "llvm/ADT/STLExtras.h"
      @@ -716,7 +717,8 @@
       /// PHITranslateForPred - Given a computation that satisfied the
       /// isPHITranslatable predicate, see if we can translate the computation into
       /// the specified predecessor block.  If so, return that value.
      -static Value *PHITranslateForPred(Instruction *Inst, BasicBlock *Pred) {
      +static Value *PHITranslateForPred(Instruction *Inst, BasicBlock *Pred,
      +                                  const TargetData *TD) {
         if (PHINode *PN = dyn_cast<PHINode>(Inst))
           return PN->getIncomingValueForBlock(Pred);
         
      @@ -751,7 +753,9 @@
                 GEPOps.back() = APHIOp = PN->getIncomingValueForBlock(Pred);
           }
           
      -    // TODO: Simplify the GEP to handle 'gep x, 0' -> x etc.
      +    // Simplify the GEP to handle 'gep x, 0' -> x etc.
      +    if (Value *V = SimplifyGEPInst(&GEPOps[0], GEPOps.size(), TD))
      +      return V;
           
           // Scan to see if we have this GEP available.
           for (Value::use_iterator UI = APHIOp->use_begin(), E = APHIOp->use_end();
      @@ -926,7 +930,7 @@
             
             for (BasicBlock **PI = PredCache->GetPreds(BB); *PI; ++PI) {
               BasicBlock *Pred = *PI;
      -        Value *PredPtr = PHITranslateForPred(PtrInst, Pred);
      +        Value *PredPtr = PHITranslateForPred(PtrInst, Pred, TD);
               
               // If PHI translation fails, bail out.
               if (PredPtr == 0)
      
      Modified: llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll?rev=89981&r1=89980&r2=89981&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll Thu Nov 26 18:34:38 2009
      @@ -86,3 +86,29 @@
       	ret i32 %dv
       }
       
      +; PR5313
      +define i32 @test4(i1 %cond, i32* %b, i32* %c) nounwind {
      +; CHECK: @test4
      +entry:
      +	br i1 %cond, label %bb, label %bb1
      +
      +bb:
      +  store i32 4, i32* %b
      +	br label %bb2
      +
      +bb1:
      +  %c1 = getelementptr i32* %c, i32 7
      +  store i32 82, i32* %c1
      +	br label %bb2
      +
      +bb2:
      +	%d = phi i32* [ %c, %bb1 ], [ %b, %bb ]
      +	%i = phi i32 [ 7, %bb1 ], [ 0, %bb ]
      +  %d1 = getelementptr i32* %d, i32 %i
      +	%dv = load i32* %d1
      +; CHECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      +; CHECK-NOT: load
      +; CHECK: ret i32 %dv
      +	ret i32 %dv
      +}
      +
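
A small sketch of why simplifying after operand substitution helps in @test4: in %bb the translated address is 'gep %b, 0', which folds to %b itself, so no pre-existing GEP has to be found. Val and the three helper functions are hypothetical names for this toy model; the scan for an existing GEP is not repeated here.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy value: an opaque value, a constant integer, or a phi.
struct Val {
  enum Kind { Opaque, ConstInt, Phi };
  Kind K = Opaque;
  long Int = 0;                           // ConstInt only
  std::string Block;                      // Phi only: block holding the phi
  std::map<std::string, Val *> Incoming;  // Phi only: pred block -> value
};

// Replace each PHI operand defined in GEPBlock by its incoming value for Pred.
std::vector<Val *> translateOperands(const std::vector<Val *> &Ops,
                                     const std::string &GEPBlock,
                                     const std::string &Pred) {
  std::vector<Val *> Out;
  for (Val *Op : Ops) {
    if (Op->K == Val::Phi && Op->Block == GEPBlock)
      Out.push_back(Op->Incoming.at(Pred));
    else
      Out.push_back(Op);
  }
  return Out;
}

// 'gep P' and 'gep P, 0' fold to P; anything else yields nullptr.
Val *simplifyGEP(const std::vector<Val *> &Ops) {
  assert(!Ops.empty() && "a GEP always has a pointer operand");
  if (Ops.size() == 1)
    return Ops[0];
  if (Ops.size() == 2 && Ops[1]->K == Val::ConstInt && Ops[1]->Int == 0)
    return Ops[0];
  return nullptr;
}

// Translate the operands into Pred, then try the identity folds.
Val *translateThenSimplify(const std::vector<Val *> &Ops,
                           const std::string &GEPBlock,
                           const std::string &Pred) {
  return simplifyGEP(translateOperands(Ops, GEPBlock, Pred));
}
```

In %bb1 the translated operands are (%c, 7), which does not fold, so the real code falls through to the use-list scan and finds %c1.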
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 18:35:06 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 00:35:06 -0000
      Subject: [llvm-commits] [llvm] r89982 - /llvm/trunk/lib/Target/README.txt
      Message-ID: <200911270035.nAR0Z6gj020752@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 18:35:04 2009
      New Revision: 89982
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89982&view=rev
      Log:
      this (and probably several others) are now done.
      
      Modified:
          llvm/trunk/lib/Target/README.txt
      
      Modified: llvm/trunk/lib/Target/README.txt
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/README.txt?rev=89982&r1=89981&r2=89982&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Target/README.txt (original)
      +++ llvm/trunk/lib/Target/README.txt Thu Nov 26 18:35:04 2009
      @@ -1196,39 +1196,6 @@
       
       GCC PR33344 is a similar case.
       
      -//===---------------------------------------------------------------------===//
      -
      -[PHI TRANSLATE INDEXED GEPs]  PR5313
      -
      -Load redundancy elimination for simple loop.  This loop:
      -
      -void append_text(const char* text,unsigned char * const  io) {
      -  while(*text)
      -    *io=*text++;
      -}
      -
      -Compiles to have a fully redundant load in the loop (%2):
      -
      -define void @append_text(i8* nocapture %text, i8* nocapture %io) nounwind {
      -entry:
      -  %0 = load i8* %text, align 1                    ;  [#uses=1]
      -  %1 = icmp eq i8 %0, 0                           ;  [#uses=1]
      -  br i1 %1, label %return, label %bb
      -
      -bb:                                               ; preds = %bb, %entry
      -  %indvar = phi i32 [ 0, %entry ], [ %tmp, %bb ]  ;  [#uses=2]
      -  %text_addr.04 = getelementptr i8* %text, i32 %indvar ;  [#uses=1]
      -  %2 = load i8* %text_addr.04, align 1            ;  [#uses=1]
      -  store i8 %2, i8* %io, align 1
      -  %tmp = add i32 %indvar, 1                       ;  [#uses=2]
      -  %scevgep = getelementptr i8* %text, i32 %tmp    ;  [#uses=1]
      -  %3 = load i8* %scevgep, align 1                 ;  [#uses=1]
      -  %4 = icmp eq i8 %3, 0                           ;  [#uses=1]
      -  br i1 %4, label %return, label %bb
      -
      -return:                                           ; preds = %bb, %entry
      -  ret void
      -}
       
       //===---------------------------------------------------------------------===//
       
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 19:52:22 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 01:52:22 -0000
      Subject: [llvm-commits] [llvm] r89985 - in /llvm/trunk:
       lib/Analysis/MemoryDependenceAnalysis.cpp
       test/Transforms/GVN/rle-phi-translate.ll
      Message-ID: <200911270152.nAR1qMpO023819@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 19:52:22 2009
      New Revision: 89985
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89985&view=rev
      Log:
      this is causing buildbot failures, disable for now.
      
      Modified:
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
          llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=89985&r1=89984&r2=89985&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Thu Nov 26 19:52:22 2009
      @@ -700,6 +700,7 @@
         
         // We can translate a GEP that uses a PHI in the current block for at least
         // one of its operands.
      +  if (0)
         if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Inst)) {
           for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i)
             if (PHINode *PN = dyn_cast<PHINode>(GEP->getOperand(i)))
      
      Modified: llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll?rev=89985&r1=89984&r2=89985&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll Thu Nov 26 19:52:22 2009
      @@ -80,9 +80,9 @@
       	%i = phi i32 [ 7, %bb1 ], [ 17, %bb ]
         %d1 = getelementptr i32* %d, i32 %i
       	%dv = load i32* %d1
      -; CHECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      -; CHECK-NOT: load
      -; CHECK: ret i32 %dv
      +; HECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      +; HECK-NOT: load
      +; HECK: ret i32 %dv
       	ret i32 %dv
       }
       
      @@ -106,9 +106,9 @@
       	%i = phi i32 [ 7, %bb1 ], [ 0, %bb ]
         %d1 = getelementptr i32* %d, i32 %i
       	%dv = load i32* %d1
      -; CHECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      -; CHECK-NOT: load
      -; CHECK: ret i32 %dv
      +; HECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      +; HECK-NOT: load
      +; HECK: ret i32 %dv
       	ret i32 %dv
       }
       
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 23:19:57 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 05:19:57 -0000
      Subject: [llvm-commits] [llvm] r89990 - in /llvm/trunk:
       lib/Analysis/MemoryDependenceAnalysis.cpp
       test/Transforms/GVN/rle-phi-translate.ll
      Message-ID: <200911270519.nAR5JvmJ030184@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 23:19:56 2009
      New Revision: 89990
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89990&view=rev
      Log:
      try again.
      
      Modified:
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
          llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=89990&r1=89989&r2=89990&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Thu Nov 26 23:19:56 2009
      @@ -700,7 +700,6 @@
         
         // We can translate a GEP that uses a PHI in the current block for at least
         // one of its operands.
      -  if (0)
         if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Inst)) {
           for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i)
             if (PHINode *PN = dyn_cast<PHINode>(GEP->getOperand(i)))
      @@ -747,11 +746,11 @@
         if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Inst)) {
           SmallVector GEPOps;
           Value *APHIOp = 0;
      +    BasicBlock *CurBB = GEP->getParent();
           for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i) {
      -      GEPOps.push_back(GEP->getOperand(i));
      -      if (PHINode *PN = dyn_cast<PHINode>(GEP->getOperand(i)))
      -        if (PN->getParent() == GEP->getParent())
      -          GEPOps.back() = APHIOp = PN->getIncomingValueForBlock(Pred);
      +      GEPOps.push_back(GEP->getOperand(i)->DoPHITranslation(CurBB, Pred));
      +      if (!isa(GEPOps.back()))
      +        APHIOp = GEPOps.back();
           }
           
           // Simplify the GEP to handle 'gep x, 0' -> x etc.
      @@ -762,9 +761,9 @@
           for (Value::use_iterator UI = APHIOp->use_begin(), E = APHIOp->use_end();
                UI != E; ++UI) {
             if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(*UI))
      -        if (GEPI->getType() == GEPI->getType() &&
      +        if (GEPI->getType() == GEP->getType() &&
                   GEPI->getNumOperands() == GEPOps.size() &&
      -            GEPI->getParent()->getParent() == Inst->getParent()->getParent()) {
      +            GEPI->getParent()->getParent() == CurBB->getParent()) {
                 bool Mismatch = false;
                 for (unsigned i = 0, e = GEPOps.size(); i != e; ++i)
                   if (GEPI->getOperand(i) != GEPOps[i]) {
      
      Modified: llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll?rev=89990&r1=89989&r2=89990&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll Thu Nov 26 23:19:56 2009
      @@ -80,9 +80,9 @@
       	%i = phi i32 [ 7, %bb1 ], [ 17, %bb ]
         %d1 = getelementptr i32* %d, i32 %i
       	%dv = load i32* %d1
      -; HECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      -; HECK-NOT: load
      -; HECK: ret i32 %dv
      +; CHECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      +; CHECK-NOT: load
      +; CHECK: ret i32 %dv
       	ret i32 %dv
       }
       
      @@ -106,9 +106,9 @@
       	%i = phi i32 [ 7, %bb1 ], [ 0, %bb ]
         %d1 = getelementptr i32* %d, i32 %i
       	%dv = load i32* %d1
      -; HECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      -; HECK-NOT: load
      -; HECK: ret i32 %dv
      +; CHECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      +; CHECK-NOT: load
      +; CHECK: ret i32 %dv
       	ret i32 %dv
       }
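
The GEP handling touched by r89990 has two steps: translate each GEP operand across the Pred->CurBB edge, then look for an already-existing GEP with exactly those operands. A rough, self-contained sketch of that idea follows. It does not use the LLVM API: the `Gep` struct, the string-named operands and the `%hyp7` instruction are hypothetical stand-ins, and only `%d`, `%i` and the incoming values 7 and 17 come from the test above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model of an address computation: a name plus its operand list.
// (Assumption: operands are identified by their textual names.)
struct Gep {
  std::string name;
  std::vector<std::string> ops;  // ops[0] is the base pointer
};

// Replace every operand that is a phi in the current block with the value
// it receives on the chosen incoming edge; other operands pass through.
std::vector<std::string>
translateOps(const std::vector<std::string> &ops,
             const std::map<std::string, std::string> &phiOnEdge) {
  std::vector<std::string> out;
  for (const std::string &op : ops) {
    auto it = phiOnEdge.find(op);
    out.push_back(it == phiOnEdge.end() ? op : it->second);
  }
  return out;
}

// Scan the existing computations for one with identical operands.
// Returns its name, or an empty string when nothing matches.
std::string findAvailable(const std::vector<Gep> &existing,
                          const std::vector<std::string> &wanted) {
  for (const Gep &g : existing)
    if (g.ops == wanted)
      return g.name;
  return "";
}

// Models '%d1 = getelementptr i32* %d, i32 %i' with
// '%i = phi i32 [ 7, %bb1 ], [ 17, %bb ]'.
std::string lookupOnEdge(const std::string &pred) {
  std::map<std::string, std::string> phi;
  phi["%i"] = (pred == "%bb1") ? "7" : "17";

  Gep hyp;                     // hypothetical pre-existing 'gep %d, 7'
  hyp.name = "%hyp7";
  hyp.ops.push_back("%d");
  hyp.ops.push_back("7");
  std::vector<Gep> existing;
  existing.push_back(hyp);

  std::vector<std::string> ops;
  ops.push_back("%d");
  ops.push_back("%i");
  return findAvailable(existing, translateOps(ops, phi));
}
```

In this model the edge from %bb1 finds the hypothetical `%hyp7`, while the edge from %bb finds nothing, so the translation would have to fail there (or, as of r89997, a new computation would have to be inserted).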
       
      
      
      
      
      From sabre at nondot.org  Thu Nov 26 23:53:01 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 05:53:01 -0000
      Subject: [llvm-commits] [llvm] r89991 - in /llvm/trunk:
       lib/Analysis/MemoryDependenceAnalysis.cpp
       test/Transforms/GVN/rle-phi-translate.ll
      Message-ID: <200911270553.nAR5r1Zu031051@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Thu Nov 26 23:53:01 2009
      New Revision: 89991
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89991&view=rev
      Log:
      redisable this, my bootstrap worked because it wasn't an optimized build, whoops.
      
      Modified:
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
          llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=89991&r1=89990&r2=89991&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Thu Nov 26 23:53:01 2009
      @@ -700,6 +700,7 @@
         
         // We can translate a GEP that uses a PHI in the current block for at least
         // one of its operands.
      +  if (0)
         if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Inst)) {
           for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i)
             if (PHINode *PN = dyn_cast<PHINode>(GEP->getOperand(i)))
      
      Modified: llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll?rev=89991&r1=89990&r2=89991&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll Thu Nov 26 23:53:01 2009
      @@ -80,9 +80,9 @@
       	%i = phi i32 [ 7, %bb1 ], [ 17, %bb ]
         %d1 = getelementptr i32* %d, i32 %i
       	%dv = load i32* %d1
      -; CHECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      -; CHECK-NOT: load
      -; CHECK: ret i32 %dv
      +; HECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      +; HECK-NOT: load
      +; HECK: ret i32 %dv
       	ret i32 %dv
       }
       
      @@ -106,9 +106,9 @@
       	%i = phi i32 [ 7, %bb1 ], [ 0, %bb ]
         %d1 = getelementptr i32* %d, i32 %i
       	%dv = load i32* %d1
      -; CHECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      -; CHECK-NOT: load
      -; CHECK: ret i32 %dv
      +; HECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      +; HECK-NOT: load
      +; HECK: ret i32 %dv
       	ret i32 %dv
       }
       
      
      
      
      
      From sabre at nondot.org  Fri Nov 27 00:31:14 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 06:31:14 -0000
      Subject: [llvm-commits] [llvm] r89992 - in /llvm/trunk:
       include/llvm/Analysis/MemoryDependenceAnalysis.h
       lib/Analysis/MemoryDependenceAnalysis.cpp lib/Transforms/Scalar/GVN.cpp
       test/Transforms/GVN/rle-phi-translate.ll
      Message-ID: <200911270631.nAR6VETi032167@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Fri Nov 27 00:31:14 2009
      New Revision: 89992
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89992&view=rev
      Log:
      Fix phi translation in load PRE to agree with the phi 
      translation done by memdep, and reenable gep translation 
      again.
      
      Modified:
          llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
          llvm/trunk/lib/Transforms/Scalar/GVN.cpp
          llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      
      Modified: llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h?rev=89992&r1=89991&r2=89992&view=diff
      
      ==============================================================================
      --- llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h (original)
      +++ llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h Fri Nov 27 00:31:14 2009
      @@ -244,6 +244,13 @@
                                             BasicBlock *BB,
                                            SmallVectorImpl &Result);
           
      +    /// PHITranslatePointer - Find an available version of the specified value
      +    /// PHI translated across the specified edge.  If MemDep isn't able to
      +    /// satisfy this request, it returns null.
      +    Value *PHITranslatePointer(Value *V,
      +                               BasicBlock *CurBB, BasicBlock *PredBB,
      +                               const TargetData *TD) const;
      +    
           /// removeInstruction - Remove an instruction from the dependence analysis,
           /// updating the dependence of instructions that previously depended on it.
           void removeInstruction(Instruction *InstToRemove);
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=89992&r1=89991&r2=89992&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Fri Nov 27 00:31:14 2009
      @@ -700,7 +700,6 @@
         
         // We can translate a GEP that uses a PHI in the current block for at least
         // one of its operands.
      -  if (0)
         if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Inst)) {
           for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i)
             if (PHINode *PN = dyn_cast<PHINode>(GEP->getOperand(i)))
      @@ -718,8 +717,15 @@
       /// PHITranslateForPred - Given a computation that satisfied the
       /// isPHITranslatable predicate, see if we can translate the computation into
       /// the specified predecessor block.  If so, return that value.
      -static Value *PHITranslateForPred(Instruction *Inst, BasicBlock *Pred,
      -                                  const TargetData *TD) {
      +Value *MemoryDependenceAnalysis::
      +PHITranslatePointer(Value *InVal, BasicBlock *CurBB, BasicBlock *Pred,
      +                    const TargetData *TD) const {  
      +  // If the input value is not an instruction, or if it is not defined in CurBB,
      +  // then we don't need to phi translate it.
      +  Instruction *Inst = dyn_cast<Instruction>(InVal);
      +  if (Inst == 0 || Inst->getParent() != CurBB)
      +    return InVal;
      +  
         if (PHINode *PN = dyn_cast<PHINode>(Inst))
           return PN->getIncomingValueForBlock(Pred);
         
      @@ -931,7 +937,7 @@
             
             for (BasicBlock **PI = PredCache->GetPreds(BB); *PI; ++PI) {
               BasicBlock *Pred = *PI;
      -        Value *PredPtr = PHITranslateForPred(PtrInst, Pred, TD);
      +        Value *PredPtr = PHITranslatePointer(PtrInst, BB, Pred, TD);
               
               // If PHI translation fails, bail out.
               if (PredPtr == 0)
      
      Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=89992&r1=89991&r2=89992&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original)
      +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Fri Nov 27 00:31:14 2009
      @@ -1427,13 +1427,19 @@
       
         // If the loaded pointer is PHI node defined in this block, do PHI translation
         // to get its value in the predecessor.
      -  Value *LoadPtr = LI->getOperand(0)->DoPHITranslation(LoadBB, UnavailablePred);
      +  Value *LoadPtr = MD->PHITranslatePointer(LI->getOperand(0),
      +                                           LoadBB, UnavailablePred, TD);
      +  if (LoadPtr == 0) {
      +    DEBUG(errs() << "COULDN'T PRE LOAD BECAUSE PTR CAN'T BE PHI TRANSLATED: "
      +          << *LI->getOperand(0) << '\n' << *LI << "\n");
      +    return false;
      +  }
       
         // Make sure the value is live in the predecessor.  If it was defined by a
         // non-PHI instruction in this block, we don't know how to recompute it above.
         if (Instruction *LPInst = dyn_cast<Instruction>(LoadPtr))
           if (!DT->dominates(LPInst->getParent(), UnavailablePred)) {
      -      DEBUG(errs() << "COULDN'T PRE LOAD BECAUSE PTR IS UNAVAILABLE IN PRED: "
      +      DEBUG(errs() << "COULDN'T PRE LOAD BECAUSE PTR DOES NOT DOMINATE PRED: "
                          << *LPInst << '\n' << *LI << "\n");
             return false;
           }
      
      Modified: llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll?rev=89992&r1=89991&r2=89992&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/rle-phi-translate.ll Fri Nov 27 00:31:14 2009
      @@ -80,9 +80,9 @@
       	%i = phi i32 [ 7, %bb1 ], [ 17, %bb ]
         %d1 = getelementptr i32* %d, i32 %i
       	%dv = load i32* %d1
      -; HECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      -; HECK-NOT: load
      -; HECK: ret i32 %dv
      +; CHECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      +; CHECK-NOT: load
      +; CHECK: ret i32 %dv
       	ret i32 %dv
       }
       
      @@ -106,9 +106,9 @@
       	%i = phi i32 [ 7, %bb1 ], [ 0, %bb ]
         %d1 = getelementptr i32* %d, i32 %i
       	%dv = load i32* %d1
      -; HECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      -; HECK-NOT: load
      -; HECK: ret i32 %dv
      +; CHECK: %dv = phi i32 [ 82, %bb1 ], [ 4, %bb ]
      +; CHECK-NOT: load
      +; CHECK: ret i32 %dv
       	ret i32 %dv
       }
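
As a reading aid for the PHITranslatePointer contract introduced in r89992, here is a minimal standalone sketch of the three outcomes the diff describes: a value not defined in CurBB is returned unchanged, a phi in CurBB yields its incoming value for the predecessor, and anything else fails with null. The `Block` and `Value` structs and the helper functions are hypothetical toys, not the LLVM classes, and the real function additionally handles bitcasts and GEPs.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-ins for BasicBlock and Value (assumptions, not the LLVM API).
struct Block { std::string name; };

struct Value {
  std::string name;
  const Block *parent;  // defining block, or nullptr for arguments
  bool isPhi;
  std::map<const Block *, const Value *> incoming;  // used by phis only
};

// Returns the value V has on the edge Pred -> CurBB, or nullptr when this
// sketch cannot translate it.
const Value *phiTranslate(const Value *V, const Block *CurBB,
                          const Block *Pred) {
  // Not defined in CurBB: no translation is needed.
  if (V->parent != CurBB)
    return V;
  // A phi in CurBB: take the operand flowing in from Pred.
  if (V->isPhi) {
    auto it = V->incoming.find(Pred);
    return it == V->incoming.end() ? nullptr : it->second;
  }
  // Some other instruction defined in CurBB: give up.
  return nullptr;
}

// Translates a phi of two pointer arguments p and q across one of its edges
// and returns the name of the result.
std::string translateExample(const std::string &predName) {
  Block block2{"block2"}, block3{"block3"}, block4{"block4"};
  Value p{"p", nullptr, false, {}};
  Value q{"q", nullptr, false, {}};
  Value P2{"P2", &block4, true, {}};
  P2.incoming[&block3] = &p;
  P2.incoming[&block2] = &q;
  const Block *pred = (predName == "block2") ? &block2 : &block3;
  const Value *r = phiTranslate(&P2, &block4, pred);
  return r ? r->name : "<null>";
}

// A non-phi instruction defined in the current block cannot be translated.
bool localNonPhiFails() {
  Block pred{"pred"}, cur{"cur"};
  Value x{"x", &cur, false, {}};
  return phiTranslate(&x, &cur, &pred) == nullptr;
}
```

A null result is what callers such as load PRE in GVN.cpp have to check for before using the translated pointer.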
       
      
      
      
      
      From sabre at nondot.org  Fri Nov 27 00:31:56 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 06:31:56 -0000
      Subject: [llvm-commits] [llvm] r89993 - in /llvm/trunk/test/Transforms/GVN:
      	lpre-basic.ll pre-load.ll
      Message-ID: <200911270631.nAR6Vu3l032196@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Fri Nov 27 00:31:55 2009
      New Revision: 89993
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89993&view=rev
      Log:
      rename test.
      
      Added:
          llvm/trunk/test/Transforms/GVN/pre-load.ll
            - copied unchanged from r89948, llvm/trunk/test/Transforms/GVN/lpre-basic.ll
      Removed:
          llvm/trunk/test/Transforms/GVN/lpre-basic.ll
      
      Removed: llvm/trunk/test/Transforms/GVN/lpre-basic.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/lpre-basic.ll?rev=89992&view=auto
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/lpre-basic.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/lpre-basic.ll (removed)
      @@ -1,18 +0,0 @@
      -; RUN: opt < %s -gvn -enable-load-pre -S | grep {%PRE = phi}
      -
      -define i32 @test(i32* %p, i1 %C) {
      -block1:
      -	br i1 %C, label %block2, label %block3
      -
      -block2:
      - br label %block4
      -
      -block3:
      -  %b = bitcast i32 0 to i32
      -  store i32 %b, i32* %p
      -  br label %block4
      -
      -block4:
      -  %PRE = load i32* %p
      -  ret i32 %PRE
      -}
      
      
      
      
      From sabre at nondot.org  Fri Nov 27 00:33:09 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 06:33:09 -0000
      Subject: [llvm-commits] [llvm] r89994 -
      	/llvm/trunk/test/Transforms/GVN/pre-load.ll
      Message-ID: <200911270633.nAR6X9x7032235@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Fri Nov 27 00:33:09 2009
      New Revision: 89994
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89994&view=rev
      Log:
      filecheckize
      
      Modified:
          llvm/trunk/test/Transforms/GVN/pre-load.ll
      
      Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-load.ll?rev=89994&r1=89993&r2=89994&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/pre-load.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/pre-load.ll Fri Nov 27 00:33:09 2009
      @@ -1,11 +1,14 @@
      -; RUN: opt < %s -gvn -enable-load-pre -S | grep {%PRE = phi}
      +; RUN: opt < %s -gvn -enable-load-pre -S | FileCheck %s
       
      -define i32 @test(i32* %p, i1 %C) {
      +define i32 @test1(i32* %p, i1 %C) {
      +; CHECK: @test1
       block1:
       	br i1 %C, label %block2, label %block3
       
       block2:
        br label %block4
      +; CHECK: block2:
      +; CHECK-NEXT: load i32* %p
       
       block3:
         %b = bitcast i32 0 to i32
      @@ -15,4 +18,7 @@
       block4:
         %PRE = load i32* %p
         ret i32 %PRE
      +; CHECK: block4:
      +; CHECK-NEXT: phi i32
      +; CHECK-NEXT: ret i32
       }
      
      
      
      
      From sabre at nondot.org  Fri Nov 27 00:36:28 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 06:36:28 -0000
      Subject: [llvm-commits] [llvm] r89995 -
      	/llvm/trunk/test/Transforms/GVN/rle-no-phi-translate.ll
      Message-ID: <200911270636.nAR6aSBM032342@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Fri Nov 27 00:36:28 2009
      New Revision: 89995
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89995&view=rev
      Log:
      this test is failing, and is expected to.
      
      Modified:
          llvm/trunk/test/Transforms/GVN/rle-no-phi-translate.ll
      
      Modified: llvm/trunk/test/Transforms/GVN/rle-no-phi-translate.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/rle-no-phi-translate.ll?rev=89995&r1=89994&r2=89995&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/rle-no-phi-translate.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/rle-no-phi-translate.ll Fri Nov 27 00:36:28 2009
      @@ -1,4 +1,5 @@
      -; RUN: opt < %s -gvn -S | grep load
      +; RUN: opt < %s -gvn -S | FileCheck %s
      +; XFAIL: *
       ; FIXME: This should be promotable, but memdep/gvn don't track values
       ; path/edge sensitively enough.
       
      @@ -20,5 +21,8 @@
       	%c_addr.0 = phi i32* [ %b, %entry ], [ %c, %bb ]		; <i32*> [#uses=1]
       	%cv = load i32* %c_addr.0, align 4		; <i32> [#uses=1]
       	ret i32 %cv
      +; CHECK: bb2:
      +; CHECK-NOT: load i32
      +; CHECK: ret i32 
       }
       
      
      
      
      
      From sabre at nondot.org  Fri Nov 27 00:42:42 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 06:42:42 -0000
      Subject: [llvm-commits] [llvm] r89996 -
      	/llvm/trunk/test/Transforms/GVN/pre-load.ll
      Message-ID: <200911270642.nAR6ggjW032502@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Fri Nov 27 00:42:42 2009
      New Revision: 89996
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89996&view=rev
      Log:
      add some tests for memdep phi translation + PRE.
      
      Modified:
          llvm/trunk/test/Transforms/GVN/pre-load.ll
      
      Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-load.ll?rev=89996&r1=89995&r2=89996&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/pre-load.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/pre-load.ll Fri Nov 27 00:42:42 2009
      @@ -11,8 +11,7 @@
       ; CHECK-NEXT: load i32* %p
       
       block3:
      -  %b = bitcast i32 0 to i32
      -  store i32 %b, i32* %p
      +  store i32 0, i32* %p
         br label %block4
       
       block4:
      @@ -22,3 +21,55 @@
       ; CHECK-NEXT: phi i32
       ; CHECK-NEXT: ret i32
       }
      +
      +define i32 @test2(i32* %p, i32* %q, i1 %C) {
      +; CHECK: @test2
      +block1:
      +	br i1 %C, label %block2, label %block3
      +
      +block2:
      + br label %block4
      +; CHECK: block2:
      +; CHECK-NEXT: load i32* %q
      +
      +block3:
      +  store i32 0, i32* %p
      +  br label %block4
      +
      +block4:
      +  %P2 = phi i32* [%p, %block3], [%q, %block2]
      +  %PRE = load i32* %P2
      +  ret i32 %PRE
      +; CHECK: block4:
      +; CHECK-NEXT: phi i32 [
      +; CHECK-NOT: load
      +; CHECK: ret i32
      +}
      +
      +define i32 @test3(i32* %p, i32* %q, i32** %Hack, i1 %C) {
      +; CHECK: @test3
      +block1:
      +  %B = getelementptr i32* %q, i32 1
      +  store i32* %B, i32** %Hack
      +	br i1 %C, label %block2, label %block3
      +
      +block2:
      + br label %block4
      +; CHECK: block2:
      +; CHECK-NEXT: load i32* %B
      +
      +block3:
      +  %A = getelementptr i32* %p, i32 1
      +  store i32 0, i32* %A
      +  br label %block4
      +
      +block4:
      +  %P2 = phi i32* [%p, %block3], [%q, %block2]
      +  %P3 = getelementptr i32* %P2, i32 1
      +  %PRE = load i32* %P3
      +  ret i32 %PRE
      +; CHECK: block4:
      +; CHECK-NEXT: phi i32 [
      +; CHECK-NOT: load
      +; CHECK: ret i32
      +}
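
For readers less used to the IR, @test3 can be approximated at source level. The sketch below is an assumption-laden C++ analogue, not output from any tool: `test3Before` mirrors the control flow of the test as written, and `test3After` shows roughly what the CHECK lines expect, namely a load of `q[1]` (the `%B` address) in block2 and a phi in block4 instead of a load. The function names and the `sameResult` helper are made up for illustration.

```cpp
#include <cassert>

// Before PRE: the load through the phi'd pointer happens in block4.
int test3Before(int *p, int *q, bool c) {
  int *P2;
  if (c) {        // block2
    P2 = q;
  } else {        // block3
    p[1] = 0;     // store i32 0, i32* %A
    P2 = p;
  }
  return P2[1];   // block4: %PRE = load i32* %P3
}

// After PRE (approximation): the load moves into block2 using the already
// available address q + 1, and the stored 0 is forwarded from block3.
int test3After(int *p, int *q, bool c) {
  int pre;
  if (c) {        // block2
    pre = q[1];   // load i32* %B
  } else {        // block3
    p[1] = 0;
    pre = 0;      // value known from the store
  }
  return pre;     // block4: phi i32 [...]
}

// Runs both versions on fresh data and reports whether they agree.
bool sameResult(bool c) {
  int a1[2] = {5, 6}, b1[2] = {7, 8};
  int a2[2] = {5, 6}, b2[2] = {7, 8};
  int before = test3Before(a1, b1, c);
  int after = test3After(a2, b2, c);
  return before == after && a1[1] == a2[1];
}
```

The point of the test is the block2 path: the address there is the phi-translated GEP `q + 1`, which already exists as `%B`, so memdep can reuse it rather than giving up.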
      
      
      
      
      From sabre at nondot.org  Fri Nov 27 02:25:11 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 08:25:11 -0000
      Subject: [llvm-commits] [llvm] r89997 - in /llvm/trunk:
       include/llvm/Analysis/MemoryDependenceAnalysis.h
       lib/Analysis/MemoryDependenceAnalysis.cpp lib/Transforms/Scalar/GVN.cpp
       test/Transforms/GVN/pre-load.ll
      Message-ID: <200911270825.nAR8PB9g003096@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Fri Nov 27 02:25:10 2009
      New Revision: 89997
      
      URL: http://llvm.org/viewvc/llvm-project?rev=89997&view=rev
      Log:
      teach GVN's load PRE to insert computations of the address in predecessors
      where it is not available.  It's unclear how to get this inserted 
      computation into GVN's scalar availability sets, Owen, help? :)
      
      Modified:
          llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
          llvm/trunk/lib/Transforms/Scalar/GVN.cpp
          llvm/trunk/test/Transforms/GVN/pre-load.ll
      
      Modified: llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h?rev=89997&r1=89996&r2=89997&view=diff
      
      ==============================================================================
      --- llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h (original)
      +++ llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h Fri Nov 27 02:25:10 2009
      @@ -250,6 +250,13 @@
           Value *PHITranslatePointer(Value *V,
                                      BasicBlock *CurBB, BasicBlock *PredBB,
                                      const TargetData *TD) const;
      +
      +    /// InsertPHITranslatedPointer - Insert a computation of the PHI translated
      +    /// version of 'V' for the edge PredBB->CurBB into the end of the PredBB
      +    /// block.
      +    Value *InsertPHITranslatedPointer(Value *V,
      +                                      BasicBlock *CurBB, BasicBlock *PredBB,
      +                                      const TargetData *TD) const;
           
           /// removeInstruction - Remove an instruction from the dependence analysis,
           /// updating the dependence of instructions that previously depended on it.
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=89997&r1=89996&r2=89997&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Fri Nov 27 02:25:10 2009
      @@ -752,19 +752,35 @@
         // Handle getelementptr with at least one PHI operand.
         if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Inst)) {
           SmallVector GEPOps;
      -    Value *APHIOp = 0;
           BasicBlock *CurBB = GEP->getParent();
           for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i) {
      -      GEPOps.push_back(GEP->getOperand(i)->DoPHITranslation(CurBB, Pred));
      -      if (!isa(GEPOps.back()))
      -        APHIOp = GEPOps.back();
      +      Value *GEPOp = GEP->getOperand(i);
      +      // No PHI translation is needed of operands whose values are live in to
      +      // the predecessor block.
      +      if (!isa<Instruction>(GEPOp) ||
      +          cast<Instruction>(GEPOp)->getParent() != CurBB) {
      +        GEPOps.push_back(GEPOp);
      +        continue;
      +      }
      +      
      +      // If the operand is a phi node, do phi translation.
      +      if (PHINode *PN = dyn_cast<PHINode>(GEPOp)) {
      +        GEPOps.push_back(PN->getIncomingValueForBlock(Pred));
      +        continue;
      +      }
      +      
      +      // Otherwise, we can't PHI translate this random value defined in this
      +      // block.
      +      return 0;
           }
           
           // Simplify the GEP to handle 'gep x, 0' -> x etc.
           if (Value *V = SimplifyGEPInst(&GEPOps[0], GEPOps.size(), TD))
             return V;
      -    
      +
      +
           // Scan to see if we have this GEP available.
      +    Value *APHIOp = GEPOps[0];
           for (Value::use_iterator UI = APHIOp->use_begin(), E = APHIOp->use_end();
                UI != E; ++UI) {
             if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(*UI))
      @@ -787,6 +803,52 @@
         return 0;
       }
       
      +/// InsertPHITranslatedPointer - Insert a computation of the PHI translated
      +/// version of 'V' for the edge PredBB->CurBB into the end of the PredBB
      +/// block.
      +///
      +/// This is only called when PHITranslatePointer returns a value that doesn't
      +/// dominate the block, so we don't need to handle the trivial cases here.
      +Value *MemoryDependenceAnalysis::
      +InsertPHITranslatedPointer(Value *InVal, BasicBlock *CurBB,
      +                           BasicBlock *PredBB, const TargetData *TD) const {
      +  // If the input value isn't an instruction in CurBB, it doesn't need phi
      +  // translation.
      +  Instruction *Inst = cast<Instruction>(InVal);
      +  assert(Inst->getParent() == CurBB && "Doesn't need phi trans");
      +
      +  // Handle bitcast of PHI.
      +  if (BitCastInst *BC = dyn_cast<BitCastInst>(Inst)) {
      +    PHINode *BCPN = cast<PHINode>(BC->getOperand(0));
      +    Value *PHIIn = BCPN->getIncomingValueForBlock(PredBB);
      +    
      +    // Otherwise insert a bitcast at the end of PredBB.
      +    return new BitCastInst(PHIIn, InVal->getType(),
      +                           InVal->getName()+".phi.trans.insert",
      +                           PredBB->getTerminator());
      +  }
      +  
      +  // Handle getelementptr with at least one PHI operand.
      +  if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Inst)) {
      +    SmallVector GEPOps;
      +    Value *APHIOp = 0;
      +    BasicBlock *CurBB = GEP->getParent();
      +    for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i) {
      +      GEPOps.push_back(GEP->getOperand(i)->DoPHITranslation(CurBB, PredBB));
      +      if (!isa(GEPOps.back()))
      +        APHIOp = GEPOps.back();
      +    }
      +    
      +    GetElementPtrInst *Result = 
      +      GetElementPtrInst::Create(GEPOps[0], GEPOps.begin()+1, GEPOps.end(),
      +                                InVal->getName()+".phi.trans.insert",
      +                                PredBB->getTerminator());
      +    Result->setIsInBounds(GEP->isInBounds());
      +    return Result;
      +  }
      +  
      +  return 0;
      +}
       
       /// getNonLocalPointerDepFromBB - Perform a dependency query based on
       /// pointer/pointeesize starting at the end of StartBB.  Add any clobber/def
      
      Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=89997&r1=89996&r2=89997&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original)
      +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Fri Nov 27 02:25:10 2009
      @@ -1425,32 +1425,40 @@
         assert(UnavailablePred != 0 &&
                "Fully available value should be eliminated above!");
       
      +  // We don't currently handle critical edges :(
      +  if (UnavailablePred->getTerminator()->getNumSuccessors() != 1) {
      +    DEBUG(errs() << "COULD NOT PRE LOAD BECAUSE OF CRITICAL EDGE '"
      +                 << UnavailablePred->getName() << "': " << *LI << '\n');
      +    return false;
      +  }
      +  
         // If the loaded pointer is PHI node defined in this block, do PHI translation
         // to get its value in the predecessor.
         Value *LoadPtr = MD->PHITranslatePointer(LI->getOperand(0),
                                                  LoadBB, UnavailablePred, TD);
      -  if (LoadPtr == 0) {
      -    DEBUG(errs() << "COULDN'T PRE LOAD BECAUSE PTR CAN'T BE PHI TRANSLATED: "
      -          << *LI->getOperand(0) << '\n' << *LI << "\n");
      -    return false;
      -  }
      +  // Make sure the value is live in the predecessor.  MemDep found a computation
      +  // of LPInst with the right value, but that does not dominate UnavailablePred,
      +  // then we can't use it.
      +  if (Instruction *LPInst = dyn_cast_or_null<Instruction>(LoadPtr))
      +    if (!DT->dominates(LPInst->getParent(), UnavailablePred))
      +      LoadPtr = 0;
       
      -  // Make sure the value is live in the predecessor.  If it was defined by a
      -  // non-PHI instruction in this block, we don't know how to recompute it above.
      -  if (Instruction *LPInst = dyn_cast<Instruction>(LoadPtr))
      -    if (!DT->dominates(LPInst->getParent(), UnavailablePred)) {
      -      DEBUG(errs() << "COULDN'T PRE LOAD BECAUSE PTR DOES NOT DOMINATE PRED: "
      -                   << *LPInst << '\n' << *LI << "\n");
      +  // If we don't have a computation of this phi translated value, try to insert
      +  // one.
      +  if (LoadPtr == 0) {
      +    LoadPtr = MD->InsertPHITranslatedPointer(LI->getOperand(0),
      +                                             LoadBB, UnavailablePred, TD);
      +    if (LoadPtr == 0) {
      +      DEBUG(errs() << "COULDN'T INSERT PHI TRANSLATED VALUE OF: "
      +                   << *LI->getOperand(0) << "\n");
             return false;
           }
      -
      -  // We don't currently handle critical edges :(
      -  if (UnavailablePred->getTerminator()->getNumSuccessors() != 1) {
      -    DEBUG(errs() << "COULD NOT PRE LOAD BECAUSE OF CRITICAL EDGE '"
      -                 << UnavailablePred->getName() << "': " << *LI << '\n');
      -    return false;
      +    
      +    // FIXME: This inserts a computation, but we don't tell scalar GVN
      +    // optimization stuff about it.  How do we do this?
      +    DEBUG(errs() << "INSERTED PHI TRANSLATED VALUE: " << *LoadPtr << "\n");
         }
      -
      +  
         // Make sure it is valid to move this load here.  We have to watch out for:
         //  @1 = getelementptr (i8* p, ...
         //  test p and branch if == 0
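
The critical-edge bailout in the hunk above can be illustrated outside LLVM. The following is only a minimal sketch on a toy CFG; ToyCFG, countPredecessorEdges, isCriticalEdge and canInsertLoadInPred are hypothetical names invented for this note, not LLVM APIs. It shows the usual definition of a critical edge next to the more conservative single-successor test that the patch applies to UnavailablePred.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy CFG (not an LLVM type): entry i lists the successor
// block indices of block i.
using ToyCFG = std::vector<std::vector<std::size_t>>;

// Counts how many CFG edges enter `block`.
std::size_t countPredecessorEdges(const ToyCFG &cfg, std::size_t block) {
  std::size_t n = 0;
  for (const auto &succs : cfg)
    for (std::size_t s : succs)
      if (s == block)
        ++n;
  return n;
}

// An edge pred->succ is critical when pred has more than one successor and
// succ has more than one predecessor: code placed at the end of pred would
// also run on paths that never reach succ.
bool isCriticalEdge(const ToyCFG &cfg, std::size_t pred, std::size_t succ) {
  return cfg[pred].size() > 1 && countPredecessorEdges(cfg, succ) > 1;
}

// Conservative test in the spirit of the patch: only insert the load into a
// predecessor that has exactly one successor.
bool canInsertLoadInPred(const ToyCFG &cfg, std::size_t pred) {
  return cfg[pred].size() == 1;
}
```

With a CFG where block 0 branches to blocks 1 and 2 and block 1 falls through to block 2, the edge 0->2 is critical, so block 0 would be rejected as an insertion point while block 1 would be accepted.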
      
      Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-load.ll?rev=89997&r1=89996&r2=89997&view=diff
      
      ==============================================================================
      --- llvm/trunk/test/Transforms/GVN/pre-load.ll (original)
      +++ llvm/trunk/test/Transforms/GVN/pre-load.ll Fri Nov 27 02:25:10 2009
      @@ -22,6 +22,7 @@
       ; CHECK-NEXT: ret i32
       }
       
      +; This is a simple phi translation case.
       define i32 @test2(i32* %p, i32* %q, i1 %C) {
       ; CHECK: @test2
       block1:
      @@ -46,6 +47,7 @@
       ; CHECK: ret i32
       }
       
      +; This is a PRE case that requires phi translation through a GEP.
       define i32 @test3(i32* %p, i32* %q, i32** %Hack, i1 %C) {
       ; CHECK: @test3
       block1:
      @@ -73,3 +75,35 @@
       ; CHECK-NOT: load
       ; CHECK: ret i32
       }
      +
      +;; Here the loaded address is available, but the computation is in 'block3'
      +;; which does not dominate 'block2'.
      +define i32 @test4(i32* %p, i32* %q, i32** %Hack, i1 %C) {
      +; CHECK: @test4
      +block1:
      +	br i1 %C, label %block2, label %block3
      +
      +block2:
      + br label %block4
      +; CHECK: block2:
      +; CHECK:   load i32*
      +; CHECK:   br label %block4
      +
      +block3:
      +  %B = getelementptr i32* %q, i32 1
      +  store i32* %B, i32** %Hack
      +
      +  %A = getelementptr i32* %p, i32 1
      +  store i32 0, i32* %A
      +  br label %block4
      +
      +block4:
      +  %P2 = phi i32* [%p, %block3], [%q, %block2]
      +  %P3 = getelementptr i32* %P2, i32 1
      +  %PRE = load i32* %P3
      +  ret i32 %PRE
      +; CHECK: block4:
      +; CHECK-NEXT: phi i32 [
      +; CHECK-NOT: load
      +; CHECK: ret i32
      +}
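
The situation exercised by test4 (an address computed in block3, which does not dominate block2) comes down to a dominance query. As a rough sketch only, assuming the immediate dominators are already known and stored in a plain array (the idom vector and the dominates helper below are hypothetical, not LLVM's DominatorTree interface), the check could look like this:

```cpp
#include <cassert>
#include <vector>

// idom[b] holds the immediate dominator of block b; the entry block is
// recorded as its own immediate dominator. Returns true if block a
// dominates block b (every block dominates itself).
bool dominates(const std::vector<int> &idom, int a, int b) {
  for (;;) {
    if (a == b)
      return true;
    if (idom[b] == b) // reached the entry block without meeting a
      return false;
    b = idom[b];
  }
}
```

Numbering block1..block4 of test4 as 0..3, every block has block1 as its immediate dominator, so idom is {0, 0, 0, 0}. The query dominates(idom, 2, 1) is false, which corresponds to the case where the existing GEP cannot be reused and a new one has to be inserted in block2.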
      
      
      
      
      From sabre at nondot.org  Fri Nov 27 02:32:53 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 08:32:53 -0000
      Subject: [llvm-commits] [llvm] r90000 -
      	/llvm/trunk/lib/Analysis/ValueTracking.cpp
      Message-ID: <200911270832.nAR8WrV4003527@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Fri Nov 27 02:32:52 2009
      New Revision: 90000
      
      URL: http://llvm.org/viewvc/llvm-project?rev=90000&view=rev
      Log:
      limit the recursion depth of GetLinearExpression.  This
      fixes a crash analyzing consumer-lame, which had an "%X = add %X, 1"
      in unreachable code.
      
      Modified:
          llvm/trunk/lib/Analysis/ValueTracking.cpp
      
      Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=90000&r1=89999&r2=90000&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original)
      +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Fri Nov 27 02:32:52 2009
      @@ -955,8 +955,15 @@
       /// have IntegerType.  Note that this looks through extends, so the high bits
       /// may not be represented in the result.
       static Value *GetLinearExpression(Value *V, APInt &Scale, APInt &Offset,
      -                                  const TargetData *TD) {
      +                                  const TargetData *TD, unsigned Depth) {
         assert(isa<IntegerType>(V->getType()) && "Not an integer value");
      +
      +  // Limit our recursion depth.
      +  if (Depth == 6) {
      +    Scale = 1;
      +    Offset = 0;
      +    return V;
      +  }
         
         if (BinaryOperator *BOp = dyn_cast<BinaryOperator>(V)) {
           if (ConstantInt *RHSC = dyn_cast<ConstantInt>(BOp->getOperand(1))) {
      @@ -969,16 +976,16 @@
                 break;
               // FALL THROUGH.
             case Instruction::Add:
      -        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      +        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD, Depth+1);
               Offset += RHSC->getValue();
               return V;
             case Instruction::Mul:
      -        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      +        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD, Depth+1);
               Offset *= RHSC->getValue();
               Scale *= RHSC->getValue();
               return V;
             case Instruction::Shl:
      -        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD);
      +        V = GetLinearExpression(BOp->getOperand(0), Scale, Offset, TD, Depth+1);
               Offset <<= RHSC->getValue().getLimitedValue();
               Scale <<= RHSC->getValue().getLimitedValue();
               return V;
      @@ -994,7 +1001,7 @@
           unsigned SmallWidth = CastOp->getType()->getPrimitiveSizeInBits();
           Scale.trunc(SmallWidth);
           Offset.trunc(SmallWidth);
      -    Value *Result = GetLinearExpression(CastOp, Scale, Offset, TD);
      +    Value *Result = GetLinearExpression(CastOp, Scale, Offset, TD, Depth+1);
           Scale.zext(OldWidth);
           Offset.zext(OldWidth);
           return Result;
      @@ -1088,7 +1095,7 @@
             // Use GetLinearExpression to decompose the index into a C1*V+C2 form.
             unsigned Width = cast<IntegerType>(Index->getType())->getBitWidth();
             APInt IndexScale(Width, 0), IndexOffset(Width, 0);
      -      Index = GetLinearExpression(Index, IndexScale, IndexOffset, TD);
      +      Index = GetLinearExpression(Index, IndexScale, IndexOffset, TD, 0);
             
             // The GEP index scale ("Scale") scales C1*V+C2, yielding (C1*V+C2)*Scale.
             // This gives us an aggregate computation of (C1*Scale)*V + C2*Scale.
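
The effect of the depth limit can be reproduced on a toy expression type. This is a sketch under stated assumptions: Node, MaxDepth and getLinearExpression below are stand-ins written for this note and use plain 64-bit integers instead of APInt; they are not the ValueTracking code. The point is that a self-referential "%X = add %X, 1" no longer recurses forever.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy node (not LLVM's Value): either a leaf or
// "operand OP constant".
struct Node {
  enum Kind { Leaf, AddConst, MulConst, ShlConst };
  Kind kind;
  Node *operand;          // may point back at the node itself in dead code
  std::uint64_t constant; // right-hand constant; shift amounts must be < 64
};

// Assumed cutoff, mirroring the "Depth == 6" check in the patch.
const unsigned MaxDepth = 6;

// Decomposes v into scale * base + offset and returns base. Once the depth
// limit is hit, the current node is treated as an opaque base value.
Node *getLinearExpression(Node *v, std::uint64_t &scale, std::uint64_t &offset,
                          unsigned depth) {
  if (depth == MaxDepth || v->kind == Node::Leaf) {
    scale = 1;
    offset = 0;
    return v;
  }
  Node *base = getLinearExpression(v->operand, scale, offset, depth + 1);
  switch (v->kind) {
  case Node::AddConst:
    offset += v->constant;
    break;
  case Node::MulConst:
    offset *= v->constant;
    scale *= v->constant;
    break;
  case Node::ShlConst:
    offset <<= v->constant;
    scale <<= v->constant;
    break;
  case Node::Leaf:
    break; // handled by the early return above
  }
  return base;
}
```

For (x * 4) + 3 this yields base x, scale 4 and offset 3. For a node whose operand is itself, the recursion stops after six levels and returns the node as its own base, which is still a correct (if unhelpful) decomposition.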
      
      
      
      
      From sabre at nondot.org  Fri Nov 27 02:37:22 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 08:37:22 -0000
      Subject: [llvm-commits] [llvm] r90001 -
      	/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      Message-ID: <200911270837.nAR8bMGN003677@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Fri Nov 27 02:37:22 2009
      New Revision: 90001
      
      URL: http://llvm.org/viewvc/llvm-project?rev=90001&view=rev
      Log:
      reduce nesting, no functionality change.
      
      Modified:
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90001&r1=90000&r2=90001&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Fri Nov 27 02:37:22 2009
      @@ -994,62 +994,63 @@
           
           // If this is a computation derived from a PHI node, use the suitably
           // translated incoming values for each pred as the phi translated version.
      -    if (isPHITranslatable(PtrInst)) {
      -      Cache = 0;
      +    if (!isPHITranslatable(PtrInst))
      +      goto PredTranslationFailure;
      +
      +    Cache = 0;
             
      -      for (BasicBlock **PI = PredCache->GetPreds(BB); *PI; ++PI) {
      -        BasicBlock *Pred = *PI;
      -        Value *PredPtr = PHITranslatePointer(PtrInst, BB, Pred, TD);
      -        
      -        // If PHI translation fails, bail out.
      -        if (PredPtr == 0)
      -          goto PredTranslationFailure;
      +    for (BasicBlock **PI = PredCache->GetPreds(BB); *PI; ++PI) {
      +      BasicBlock *Pred = *PI;
      +      Value *PredPtr = PHITranslatePointer(PtrInst, BB, Pred, TD);
      +      
      +      // If PHI translation fails, bail out.
      +      if (PredPtr == 0)
      +        goto PredTranslationFailure;
      +      
      +      // Check to see if we have already visited this pred block with another
      +      // pointer.  If so, we can't do this lookup.  This failure can occur
      +      // with PHI translation when a critical edge exists and the PHI node in
      +      // the successor translates to a pointer value different than the
      +      // pointer the block was first analyzed with.
      +      std::pair<DenseMap<BasicBlock*,Value*>::iterator, bool>
      +        InsertRes = Visited.insert(std::make_pair(Pred, PredPtr));
      +
      +      if (!InsertRes.second) {
      +        // If the predecessor was visited with PredPtr, then we already did
      +        // the analysis and can ignore it.
      +        if (InsertRes.first->second == PredPtr)
      +          continue;
               
      -        // Check to see if we have already visited this pred block with another
      -        // pointer.  If so, we can't do this lookup.  This failure can occur
      -        // with PHI translation when a critical edge exists and the PHI node in
      -        // the successor translates to a pointer value different than the
      -        // pointer the block was first analyzed with.
      -        std::pair<DenseMap<BasicBlock*,Value*>::iterator, bool>
      -          InsertRes = Visited.insert(std::make_pair(Pred, PredPtr));
      -
      -        if (!InsertRes.second) {
      -          // If the predecessor was visited with PredPtr, then we already did
      -          // the analysis and can ignore it.
      -          if (InsertRes.first->second == PredPtr)
      -            continue;
      -          
      -          // Otherwise, the block was previously analyzed with a different
      -          // pointer.  We can't represent the result of this case, so we just
      -          // treat this as a phi translation failure.
      -          goto PredTranslationFailure;
      -        }
      -
      -        // FIXME: it is entirely possible that PHI translating will end up with
      -        // the same value.  Consider PHI translating something like:
      -        // X = phi [x, bb1], [y, bb2].  PHI translating for bb1 doesn't *need*
      -        // to recurse here, pedantically speaking.
      -        
      -        // If we have a problem phi translating, fall through to the code below
      -        // to handle the failure condition.
      -        if (getNonLocalPointerDepFromBB(PredPtr, PointeeSize, isLoad, Pred,
      -                                        Result, Visited))
      -          goto PredTranslationFailure;
      +        // Otherwise, the block was previously analyzed with a different
      +        // pointer.  We can't represent the result of this case, so we just
      +        // treat this as a phi translation failure.
      +        goto PredTranslationFailure;
             }
      -      
      -      // Refresh the CacheInfo/Cache pointer so that it isn't invalidated.
      -      CacheInfo = &NonLocalPointerDeps[CacheKey];
      -      Cache = &CacheInfo->second;
      -      NumSortedEntries = Cache->size();
      -      
      -      // Since we did phi translation, the "Cache" set won't contain all of the
      -      // results for the query.  This is ok (we can still use it to accelerate
      -      // specific block queries) but we can't do the fastpath "return all
      -      // results from the set"  Clear out the indicator for this.
      -      CacheInfo->first = BBSkipFirstBlockPair();
      -      SkipFirstBlock = false;
      -      continue;
      +
      +      // FIXME: it is entirely possible that PHI translating will end up with
      +      // the same value.  Consider PHI translating something like:
      +      // X = phi [x, bb1], [y, bb2].  PHI translating for bb1 doesn't *need*
      +      // to recurse here, pedantically speaking.
      +      
      +      // If we have a problem phi translating, fall through to the code below
      +      // to handle the failure condition.
      +      if (getNonLocalPointerDepFromBB(PredPtr, PointeeSize, isLoad, Pred,
      +                                      Result, Visited))
      +        goto PredTranslationFailure;
           }
      +    
      +    // Refresh the CacheInfo/Cache pointer so that it isn't invalidated.
      +    CacheInfo = &NonLocalPointerDeps[CacheKey];
      +    Cache = &CacheInfo->second;
      +    NumSortedEntries = Cache->size();
      +    
      +    // Since we did phi translation, the "Cache" set won't contain all of the
      +    // results for the query.  This is ok (we can still use it to accelerate
      +    // specific block queries) but we can't do the fastpath "return all
      +    // results from the set"  Clear out the indicator for this.
      +    CacheInfo->first = BBSkipFirstBlockPair();
      +    SkipFirstBlock = false;
      +    continue;
       
         PredTranslationFailure:
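
The "visited with another pointer" check in the loop above relies on the insert-returns-(iterator, bool) idiom. A minimal sketch of that idiom with std::map in place of DenseMap follows; VisitResult and recordVisit are hypothetical names for illustration, and blocks and pointers are represented as strings purely to keep the example self-contained.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

enum class VisitResult { FirstVisit, AlreadySeenSamePointer, Conflict };

// Records that `pred` is being analyzed with `ptr`. insert() leaves an
// existing entry untouched and reports via .second whether it inserted.
VisitResult recordVisit(std::map<std::string, std::string> &visited,
                        const std::string &pred, const std::string &ptr) {
  std::pair<std::map<std::string, std::string>::iterator, bool> res =
      visited.insert(std::make_pair(pred, ptr));
  if (res.second)
    return VisitResult::FirstVisit;
  if (res.first->second == ptr)
    return VisitResult::AlreadySeenSamePointer; // analysis already done
  return VisitResult::Conflict; // same block, different translated pointer
}
```

In the terms of the patch, FirstVisit corresponds to continuing with the recursive query, AlreadySeenSamePointer to the "continue", and Conflict to the jump to PredTranslationFailure.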
           
      
      
      
      
      From sabre at nondot.org  Fri Nov 27 02:40:14 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 08:40:14 -0000
      Subject: [llvm-commits] [llvm] r90002 -
      	/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      Message-ID: <200911270840.nAR8eEjN005822@zion.cs.uiuc.edu>
      
      Author: lattner
      Date: Fri Nov 27 02:40:14 2009
      New Revision: 90002
      
      URL: http://llvm.org/viewvc/llvm-project?rev=90002&view=rev
      Log:
      add comment.
      
      Modified:
          llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      
      Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90002&r1=90001&r2=90002&view=diff
      
      ==============================================================================
      --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original)
      +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Fri Nov 27 02:40:14 2009
      @@ -1004,8 +1004,12 @@
             Value *PredPtr = PHITranslatePointer(PtrInst, BB, Pred, TD);
             
             // If PHI translation fails, bail out.
      -      if (PredPtr == 0)
      +      if (PredPtr == 0) {
      +        // FIXME: Instead of modelling this as a phi trans failure, we should
      +        // model this as a clobber in the one predecessor.  This will allow
      +        // us to PRE values that are only available in some preds but not all.
               goto PredTranslationFailure;
      +      }
             
             // Check to see if we have already visited this pred block with another
             // pointer.  If so, we can't do this lookup.  This failure can occur
      
      
      
      
      From baldrick at free.fr  Fri Nov 27 06:33:24 2009
      From: baldrick at free.fr (Duncan Sands)
      Date: Fri, 27 Nov 2009 12:33:24 -0000
      Subject: [llvm-commits] [llvm] r90003 - /llvm/trunk/docs/LangRef.html
      Message-ID: <200911271233.nARCXOwN026794@zion.cs.uiuc.edu>
      
      Author: baldrick
      Date: Fri Nov 27 06:33:22 2009
      New Revision: 90003
      
      URL: http://llvm.org/viewvc/llvm-project?rev=90003&view=rev
      Log:
      These code generator limitations have been removed.
      
      Modified:
          llvm/trunk/docs/LangRef.html
      
      Modified: llvm/trunk/docs/LangRef.html
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/LangRef.html?rev=90003&r1=90002&r2=90003&view=diff
      
      ==============================================================================
      --- llvm/trunk/docs/LangRef.html (original)
      +++ llvm/trunk/docs/LangRef.html Fri Nov 27 06:33:22 2009
      @@ -1440,11 +1440,6 @@
         
       
       
      -Note that the code generator does not yet support large integer types to be
      -   used as function return types. The specific limit on how large a return type
      -   the code generator can currently handle is target-dependent; currently it's
      -   often 64 bits for 32-bit targets and 128 bits for 64-bit targets.
      -
      @@ -1583,11 +1578,6 @@
          length array type. An implementation of 'pascal style arrays' in LLVM could
          use the type "{ i32, [0 x float]}", for example.
      -Note that the code generator does not yet support large aggregate types to be
      -   used as function return types. The specific limit on how large an aggregate
      -   return type the code generator can currently handle is target-dependent, and
      -   also dependent on the aggregate element types.
      -
      @@ -1680,11 +1670,6 @@
      -Note that the code generator does not yet support large aggregate types to be
      -   used as function return types. The specific limit on how large an aggregate
      -   return type the code generator can currently handle is target-dependent, and
      -   also dependent on the aggregate element types.
      -
      @@ -1803,11 +1788,6 @@
      -Note that the code generator does not yet support large vector types to be
      -   used as function return types. The specific limit on how large a vector
      -   return type codegen can currently handle is target-dependent; currently it's
      -   often a few times longer than a hardware vector register.
      -
      @@ -2600,14 +2580,6 @@
         ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
      -Note that the code generator does not yet fully support large
      -   return values. The specific sizes that are currently supported are
      -   dependent on the target. For integers, on 32-bit targets the limit
      -   is often 64 bits, and on 64-bit targets the limit is often 128 bits.
      -   For aggregate types, the current limits are dependent on the element
      -   types; for example targets are often limited to 2 total integer
      -   elements and 2 total floating-point elements.
      -




      From baldrick at free.fr  Fri Nov 27 07:38:04 2009
      From: baldrick at free.fr (Duncan Sands)
      Date: Fri, 27 Nov 2009 13:38:04 -0000
      Subject: [llvm-commits] [llvm] r90004 - /llvm/trunk/docs/LangRef.html
      Message-ID: <200911271338.nARDc4De029353@zion.cs.uiuc.edu>

      Author: baldrick
      Date: Fri Nov 27 07:38:03 2009
      New Revision: 90004

      URL: http://llvm.org/viewvc/llvm-project?rev=90004&view=rev
      Log:
      Vector types are no longer required to have a power-of-two length.

      Modified:
          llvm/trunk/docs/LangRef.html

      Modified: llvm/trunk/docs/LangRef.html
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/LangRef.html?rev=90004&r1=90003&r2=90004&view=diff

      ==============================================================================
      --- llvm/trunk/docs/LangRef.html (original)
      +++ llvm/trunk/docs/LangRef.html Fri Nov 27 07:38:03 2009
      @@ -1760,8 +1760,7 @@
       A vector type is a simple derived type that represents a vector of elements.
          Vector types are used when multiple primitive data are operated in parallel
          using a single instruction (SIMD). A vector type requires a size (number of
      -   elements) and an underlying primitive data type. Vectors must have a power
      -   of two length (1, 2, 4, 8, 16 ...). Vector types are considered
      +   elements) and an underlying primitive data type. Vector types are considered
          first class.
       
       Syntax:




      From baldrick at free.fr  Fri Nov 27 10:04:14 2009
      From: baldrick at free.fr (Duncan Sands)
      Date: Fri, 27 Nov 2009 16:04:14 -0000
      Subject: [llvm-commits] [llvm] r90005 - in /llvm/trunk/test/CodeGen:
      	Generic/2009-11-20-NewNode.ll MSP430/2009-11-20-NewNode.ll
      	PIC16/2009-11-20-NewNode.ll
      Message-ID: <200911271604.nARG4E0o002359@zion.cs.uiuc.edu>

      Author: baldrick
      Date: Fri Nov 27 10:04:14 2009
      New Revision: 90005

      URL: http://llvm.org/viewvc/llvm-project?rev=90005&view=rev
      Log:
      While this test is testing a problem in the generic part of codegen, the
      problem only shows for msp430 and pic16 which is why it specifies them
      using -march. But it is wrong to put such tests in CodeGen/Generic, since
      not everyone builds these targets. Put a copy of the test in each of the
      target test directories.

      Added:
          llvm/trunk/test/CodeGen/MSP430/2009-11-20-NewNode.ll
            - copied, changed from r90004, llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll
          llvm/trunk/test/CodeGen/PIC16/2009-11-20-NewNode.ll
            - copied, changed from r90004, llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll
      Removed:
          llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll

      Removed: llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll?rev=90004&view=auto

      ==============================================================================
      --- llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll (original)
      +++ llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll (removed)
      @@ -1,37 +0,0 @@
      -; RUN: llc -march=msp430 < %s
      -; RUN: llc -march=pic16 < %s
      -; PR5558
      -
      -define i64 @_strtoll_r(i16 %base) nounwind {
      -entry:
      -  br i1 undef, label %if.then, label %if.end27
      -
      -if.then: ; preds = %do.end
      -  br label %if.end27
      -
      -if.end27: ; preds = %if.then, %do.end
      -  %cond66 = select i1 undef, i64 -9223372036854775808, i64 9223372036854775807 ; [#uses=3]
      -  %conv69 = sext i16 %base to i64 ; [#uses=1]
      -  %div = udiv i64 %cond66, %conv69 ; [#uses=1]
      -  br label %for.cond
      -
      -for.cond: ; preds = %if.end116, %if.end27
      -  br i1 undef, label %if.then152, label %if.then93
      -
      -if.then93: ; preds = %for.cond
      -  br i1 undef, label %if.end116, label %if.then152
      -
      -if.end116: ; preds = %if.then93
      -  %cmp123 = icmp ugt i64 undef, %div ; [#uses=1]
      -  %or.cond = or i1 undef, %cmp123 ; [#uses=0]
      -  br label %for.cond
      -
      -if.then152: ; preds = %if.then93, %for.cond
      -  br i1 undef, label %if.end182, label %if.then172
      -
      -if.then172: ; preds = %if.then152
      -  ret i64 %cond66
      -
      -if.end182: ; preds = %if.then152
      -  ret i64 %cond66
      -}

      Copied: llvm/trunk/test/CodeGen/MSP430/2009-11-20-NewNode.ll (from r90004, llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll)
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/MSP430/2009-11-20-NewNode.ll?p2=llvm/trunk/test/CodeGen/MSP430/2009-11-20-NewNode.ll&p1=llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll&r1=90004&r2=90005&rev=90005&view=diff

      ==============================================================================
      --- llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll (original)
      +++ llvm/trunk/test/CodeGen/MSP430/2009-11-20-NewNode.ll Fri Nov 27 10:04:14 2009
      @@ -1,5 +1,4 @@
       ; RUN: llc -march=msp430 < %s
      -; RUN: llc -march=pic16 < %s
       ; PR5558
       
       define i64 @_strtoll_r(i16 %base) nounwind {

      Copied: llvm/trunk/test/CodeGen/PIC16/2009-11-20-NewNode.ll (from r90004, llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll)
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PIC16/2009-11-20-NewNode.ll?p2=llvm/trunk/test/CodeGen/PIC16/2009-11-20-NewNode.ll&p1=llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll&r1=90004&r2=90005&rev=90005&view=diff

      ==============================================================================
      --- llvm/trunk/test/CodeGen/Generic/2009-11-20-NewNode.ll (original)
      +++ llvm/trunk/test/CodeGen/PIC16/2009-11-20-NewNode.ll Fri Nov 27 10:04:14 2009
      @@ -1,4 +1,3 @@
      -; RUN: llc -march=msp430 < %s
       ; RUN: llc -march=pic16 < %s
       ; PR5558
       




      From sabre at nondot.org  Fri Nov 27 10:31:59 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 16:31:59 -0000
      Subject: [llvm-commits] [llvm] r90006 -
      	/llvm/trunk/test/Transforms/ScalarRepl/badarray.ll
      Message-ID: <200911271632.nARGW0YM003580@zion.cs.uiuc.edu>

      Author: lattner
      Date: Fri Nov 27 10:31:59 2009
      New Revision: 90006

      URL: http://llvm.org/viewvc/llvm-project?rev=90006&view=rev
      Log:
      filecheckize

      Modified:
          llvm/trunk/test/Transforms/ScalarRepl/badarray.ll

      Modified: llvm/trunk/test/Transforms/ScalarRepl/badarray.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/ScalarRepl/badarray.ll?rev=90006&r1=90005&r2=90006&view=diff

      ==============================================================================
      --- llvm/trunk/test/Transforms/ScalarRepl/badarray.ll (original)
      +++ llvm/trunk/test/Transforms/ScalarRepl/badarray.ll Fri Nov 27 10:31:59 2009
      @@ -1,9 +1,11 @@
      -; RUN: opt < %s -scalarrepl -instcombine -S | not grep alloca
      -; PR3466
      +; RUN: opt < %s -scalarrepl -S | FileCheck %s
       
      -define i32 @test() {
      -  %X = alloca [4 x i32] ; <[4 x i32]*> [#uses=1]
      -  ; Off end of array!
      +; PR3466
      +; Off end of array, don't transform.
      +define i32 @test1() {
      +; CHECK: @test1
      +; CHECK: %X = alloca
      +  %X = alloca [4 x i32]
         %Y = getelementptr [4 x i32]* %X, i64 0, i64 6 ; [#uses=2]
         store i32 0, i32* %Y
         %Z = load i32* %Y ; [#uses=1]
      @@ -11,8 +13,11 @@
       }
       
       
      +; Off end of array, don't transform.
       define i32 @test2() nounwind {
       entry:
      +; CHECK: @test2
      +; CHECK: %yx2.i = alloca
         %yx2.i = alloca float, align 4 ; [#uses=1]
         %yx26.i = bitcast float* %yx2.i to i64* ; [#uses=1]
         %0 = load i64* %yx26.i, align 8 ; [#uses=0]




      From sabre at nondot.org  Fri Nov 27 10:37:42 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 16:37:42 -0000
      Subject: [llvm-commits] [llvm] r90007 - in /llvm/trunk:
      	lib/Transforms/Scalar/ScalarReplAggregates.cpp
      	test/Transforms/ScalarRepl/badarray.ll
      Message-ID: <200911271637.nARGbgqK003728@zion.cs.uiuc.edu>

      Author: lattner
      Date: Fri Nov 27 10:37:41 2009
      New Revision: 90007

      URL: http://llvm.org/viewvc/llvm-project?rev=90007&view=rev
      Log:
      fix PR5436 by making the 'simple' case of SRoA not promote out of range
      array indexes. The "complex" case of SRoA still handles them, and correctly.
      This fixes a weirdness where we'd correctly avoid transforming A[0][42] if
      the 42 was too large, but we'd only do it if it was one gep, not two separate
      ones.

      Modified:
          llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp
          llvm/trunk/test/Transforms/ScalarRepl/badarray.ll

      Modified: llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp?rev=90007&r1=90006&r2=90007&view=diff

      ==============================================================================
      --- llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp (original)
      +++ llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp Fri Nov 27 10:37:41 2009
      @@ -469,15 +469,41 @@
             case Instruction::GetElementPtr: {
               GetElementPtrInst *GEP = cast<GetElementPtrInst>(User);
               bool AreAllZeroIndices = isFirstElt;
      -        if (GEP->getNumOperands() > 1) {
      -          if (!isa<ConstantInt>(GEP->getOperand(1)) ||
      -              !cast<ConstantInt>(GEP->getOperand(1))->isZero())
      -            // Using pointer arithmetic to navigate the array.
      -            return MarkUnsafe(Info);
      -
      -          if (AreAllZeroIndices)
      -            AreAllZeroIndices = GEP->hasAllZeroIndices();
      +        if (GEP->getNumOperands() > 1 &&
      +            (!isa<ConstantInt>(GEP->getOperand(1)) ||
      +             !cast<ConstantInt>(GEP->getOperand(1))->isZero()))
      +          // Using pointer arithmetic to navigate the array.
      +          return MarkUnsafe(Info);
      +
      +        // Verify that any array subscripts are in range.
      +        for (gep_type_iterator GEPIt = gep_type_begin(GEP),
      +             E = gep_type_end(GEP); GEPIt != E; ++GEPIt) {
      +          // Ignore struct elements, no extra checking needed for these.
      +          if (isa<StructType>(*GEPIt))
      +            continue;
      +
      +          // This GEP indexes an array. Verify that this is an in-range
      +          // constant integer. Specifically, consider A[0][i]. We cannot know that
      +          // the user isn't doing invalid things like allowing i to index an
      +          // out-of-range subscript that accesses A[1]. Because of this, we have
      +          // to reject SROA of any accesses into structs where any of the
      +          // components are variables.
      +          ConstantInt *IdxVal = dyn_cast<ConstantInt>(GEPIt.getOperand());
      +          if (!IdxVal) return MarkUnsafe(Info);
      +
      +          // Are all indices still zero?
      +          AreAllZeroIndices &= IdxVal->isZero();
      +
      +          if (const ArrayType *AT = dyn_cast<ArrayType>(*GEPIt)) {
      +            if (IdxVal->getZExtValue() >= AT->getNumElements())
      +              return MarkUnsafe(Info);
      +          } else if (const VectorType *VT = dyn_cast<VectorType>(*GEPIt)) {
      +            if (IdxVal->getZExtValue() >= VT->getNumElements())
      +              return MarkUnsafe(Info);
      +          }
               }
      +
      +
               isSafeElementUse(GEP, AreAllZeroIndices, AI, Info);
               if (Info.isUnsafe) return;
               break;

      Modified: llvm/trunk/test/Transforms/ScalarRepl/badarray.ll
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/ScalarRepl/badarray.ll?rev=90007&r1=90006&r2=90007&view=diff

      ==============================================================================
      --- llvm/trunk/test/Transforms/ScalarRepl/badarray.ll (original)
      +++ llvm/trunk/test/Transforms/ScalarRepl/badarray.ll Fri Nov 27 10:37:41 2009
      @@ -1,10 +1,14 @@
       ; RUN: opt < %s -scalarrepl -S | FileCheck %s
       
      +target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32"
      +target triple = "i386-pc-linux-gnu"
      +
      +
       ; PR3466
       ; Off end of array, don't transform.
       define i32 @test1() {
       ; CHECK: @test1
      -; CHECK: %X = alloca
      +; CHECK-NOT: = alloca
         %X = alloca [4 x i32]
         %Y = getelementptr [4 x i32]* %X, i64 0, i64 6 ; [#uses=2]
         store i32 0, i32* %Y
      @@ -17,9 +21,37 @@
       define i32 @test2() nounwind {
       entry:
       ; CHECK: @test2
      -; CHECK: %yx2.i = alloca
      +; CHECK-NOT: = alloca
         %yx2.i = alloca float, align 4 ; [#uses=1]
         %yx26.i = bitcast float* %yx2.i to i64* ; [#uses=1]
         %0 = load i64* %yx26.i, align 8 ; [#uses=0]
         unreachable
       }
      +
      +%base = type { i32, [0 x i8] }
      +%padded = type { %base, [1 x i32] }
      +
      +; PR5436
      +define void @test3() {
      +entry:
      +; CHECK: @test3
      +; CHECK-NOT: = alloca
      +; CHECK: store i64
      +  %var_1 = alloca %padded, align 8 ; <%padded*> [#uses=3]
      +  %0 = getelementptr inbounds %padded* %var_1, i32 0, i32 0 ; <%base*> [#uses=2]
      +
      +  %p2 = getelementptr inbounds %base* %0, i32 0, i32 1, i32 0 ; [#uses=1]
      +  store i8 72, i8* %p2, align 1
      +
      +  ; 72 -> a[0].
      +
      +  %callret = call %padded *@test3f() ; [#uses=2]
      +  %callretcast = bitcast %padded* %callret to i8* ; [#uses=1]
      +  %var_11 = bitcast %padded* %var_1 to i8* ; [#uses=1]
      +  call void @llvm.memcpy.i32(i8* %callretcast, i8* %var_11, i32 8, i32 4)
      +  ret void
      +}
      +
      +declare void @llvm.memcpy.i32(i8* nocapture, i8* nocapture, i32, i32) nounwind
      +
      +declare %padded* @test3f()




      From sabre at nondot.org  Fri Nov 27 10:53:57 2009
      From: sabre at nondot.org (Chris Lattner)
      Date: Fri, 27 Nov 2009 16:53:57 -0000
      Subject: [llvm-commits] [llvm] r90008 - /llvm/trunk/lib/Target/README.txt
      Message-ID: <200911271653.nARGrvks004248@zion.cs.uiuc.edu>

      Author: lattner
      Date: Fri Nov 27 10:53:57 2009
      New Revision: 90008

      URL: http://llvm.org/viewvc/llvm-project?rev=90008&view=rev
      Log:
      This testcase is actually only partially redundant, and requires the
      FIXME I added yesterday to be implemented.

Modified: llvm/trunk/lib/Target/README.txt Modified: llvm/trunk/lib/Target/README.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/README.txt?rev=90008&r1=90007&r2=90008&view=diff ============================================================================== --- llvm/trunk/lib/Target/README.txt (original) +++ llvm/trunk/lib/Target/README.txt Fri Nov 27 10:53:57 2009 @@ -1175,7 +1175,7 @@ //===---------------------------------------------------------------------===// -[PHI TRANSLATE GEPs] +[LOAD PRE with NON-AVAILABLE ADDRESS] GCC PR37166: Sinking of loads prevents SROA'ing the "g" struct on the stack leading to excess stack traffic. This could be handled by GVN with some crazy @@ -1192,7 +1192,7 @@ %10 = getelementptr %struct.f* %c_addr.0, i32 0, i32 0 %11 = load i32* %10, align 4 -%11 is fully redundant, an in BB2 it should have the value %8. +%11 is partially redundant, an in BB2 it should have the value %8. GCC PR33344 is a similar case. From arplynn at gmail.com Fri Nov 27 10:56:46 2009 From: arplynn at gmail.com (Alastair Lynn) Date: Fri, 27 Nov 2009 16:56:46 +0000 Subject: [llvm-commits] Testcases for previous overflow intrinsics xforms Message-ID: <3E593936-C4FE-4CC6-86DD-7AFA7587CBB4@gmail.com> Hi- This patch contains test cases for the instcombine simplification patch I sent in a couple of weeks ago. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: overflow-intrinsics-testcases.patch Type: application/octet-stream Size: 2924 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091127/9a8e8400/attachment.obj From sabre at nondot.org Fri Nov 27 11:12:30 2009 From: sabre at nondot.org (Chris Lattner) Date: Fri, 27 Nov 2009 17:12:30 -0000 Subject: [llvm-commits] [llvm] r90009 - /llvm/trunk/lib/Target/README.txt Message-ID: <200911271712.nARHCUFq004736@zion.cs.uiuc.edu> Author: lattner Date: Fri Nov 27 11:12:30 2009 New Revision: 90009 URL: http://llvm.org/viewvc/llvm-project?rev=90009&view=rev Log: add a deadargelim note. Modified: llvm/trunk/lib/Target/README.txt Modified: llvm/trunk/lib/Target/README.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/README.txt?rev=90009&r1=90008&r2=90009&view=diff ============================================================================== --- llvm/trunk/lib/Target/README.txt (original) +++ llvm/trunk/lib/Target/README.txt Fri Nov 27 11:12:30 2009 @@ -2,6 +2,29 @@ //===---------------------------------------------------------------------===// +Dead argument elimination should be enhanced to handle cases when an argument is +dead to an externally visible function. Though the argument can't be removed +from the externally visible function, the caller doesn't need to pass it in. +For example in this testcase: + + void foo(int X) __attribute__((noinline)); + void foo(int X) { sideeffect(); } + void bar(int A) { foo(A+1); } + +We compile bar to: + +define void @bar(i32 %A) nounwind ssp { + %0 = add nsw i32 %A, 1 ; [#uses=1] + tail call void @foo(i32 %0) nounwind noinline ssp + ret void +} + +The add is dead, we could pass in 'i32 undef' instead. This occurs for C++ +templates etc, which usually have linkonce_odr/weak_odr linkage, not internal +linkage. 
+ +//===---------------------------------------------------------------------===// + With the recent changes to make the implicit def/use set explicit in machineinstrs, we should change the target descriptions for 'call' instructions so that the .td files don't list all the call-clobbered registers as implicit From sabre at nondot.org Fri Nov 27 11:42:22 2009 From: sabre at nondot.org (Chris Lattner) Date: Fri, 27 Nov 2009 17:42:22 -0000 Subject: [llvm-commits] [llvm] r90011 - in /llvm/trunk: include/llvm/Analysis/InstructionSimplify.h lib/Analysis/InstructionSimplify.cpp lib/Transforms/Scalar/InstructionCombining.cpp Message-ID: <200911271742.nARHgN9w005647@zion.cs.uiuc.edu> Author: lattner Date: Fri Nov 27 11:42:22 2009 New Revision: 90011 URL: http://llvm.org/viewvc/llvm-project?rev=90011&view=rev Log: factor some logic out of instcombine into a new SimplifyAddInst method. Modified: llvm/trunk/include/llvm/Analysis/InstructionSimplify.h llvm/trunk/lib/Analysis/InstructionSimplify.cpp llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp Modified: llvm/trunk/include/llvm/Analysis/InstructionSimplify.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/InstructionSimplify.h?rev=90011&r1=90010&r2=90011&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/InstructionSimplify.h (original) +++ llvm/trunk/include/llvm/Analysis/InstructionSimplify.h Fri Nov 27 11:42:22 2009 @@ -20,6 +20,11 @@ class Instruction; class Value; class TargetData; + + /// SimplifyAddInst - Given operands for an Add, see if we can + /// fold the result. If not, this returns null. + Value *SimplifyAddInst(Value *LHS, Value *RHS, bool isNSW, bool isNUW, + const TargetData *TD = 0); /// SimplifyAndInst - Given operands for an And, see if we can /// fold the result. If not, this returns null. 
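As a rough illustration of the fold-or-return-null contract that the SimplifyAddInst declaration in this commit describes, here is a minimal sketch on a toy value type. ToyValue, its payload field, isConstantLike and simplifyAdd are hypothetical stand-ins invented for this example, not LLVM API; only the rules themselves (fold two constants, canonicalize a constant to the right-hand side, X + undef -> undef, X + 0 -> X) are taken from the commit. It assumes C++17 for std::optional.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical stand-in for llvm::Value; not the real class hierarchy.
struct ToyValue {
  enum Kind { Constant, Undef, Opaque } kind;
  std::uint64_t payload;  // constant bits for Constant, an identity tag for Opaque
};

// In LLVM, undef is itself a Constant, so both kinds count as "constant-like".
static bool isConstantLike(const ToyValue &v) {
  return v.kind != ToyValue::Opaque;
}

// Returns the simplified value, or std::nullopt when nothing folds
// (the real helper returns a null Value* in that case).
std::optional<ToyValue> simplifyAdd(ToyValue lhs, ToyValue rhs) {
  if (isConstantLike(lhs)) {
    if (isConstantLike(rhs)) {
      // Both operands are constants: fold them outright.
      if (lhs.kind == ToyValue::Undef || rhs.kind == ToyValue::Undef)
        return ToyValue{ToyValue::Undef, 0};
      // Unsigned arithmetic gives well-defined wrap-around.
      return ToyValue{ToyValue::Constant, lhs.payload + rhs.payload};
    }
    // Canonicalize the constant to the RHS.
    std::swap(lhs, rhs);
  }

  // X + undef -> undef
  if (rhs.kind == ToyValue::Undef)
    return rhs;

  // X + 0 -> X
  if (rhs.kind == ToyValue::Constant && rhs.payload == 0)
    return lhs;

  return std::nullopt;  // no simplification found
}
```

A caller would treat an empty result the way instcombine treats a null return: leave the add alone and carry on with its other transforms.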
Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=90011&r1=90010&r2=90011&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original) +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Fri Nov 27 11:42:22 2009 @@ -21,13 +21,41 @@ using namespace llvm; using namespace llvm::PatternMatch; -/// SimplifyAndInst - Given operands for an And, see if we can +/// SimplifyAddInst - Given operands for an Add, see if we can /// fold the result. If not, this returns null. -Value *llvm::SimplifyAndInst(Value *Op0, Value *Op1, +Value *llvm::SimplifyAddInst(Value *Op0, Value *Op1, bool isNSW, bool isNUW, const TargetData *TD) { if (Constant *CLHS = dyn_cast(Op0)) { if (Constant *CRHS = dyn_cast(Op1)) { Constant *Ops[] = { CLHS, CRHS }; + return ConstantFoldInstOperands(Instruction::Add, CLHS->getType(), + Ops, 2, TD); + } + + // Canonicalize the constant to the RHS. + std::swap(Op0, Op1); + } + + if (Constant *Op1C = dyn_cast(Op1)) { + // X + undef -> undef + if (isa(Op1C)) + return Op1C; + + // X + 0 --> X + if (Op1C->isNullValue()) + return Op0; + } + + // FIXME: Could pull several more out of instcombine. + return 0; +} + +/// SimplifyAndInst - Given operands for an And, see if we can +/// fold the result. If not, this returns null. +Value *llvm::SimplifyAndInst(Value *Op0, Value *Op1, const TargetData *TD) { + if (Constant *CLHS = dyn_cast(Op0)) { + if (Constant *CRHS = dyn_cast(Op1)) { + Constant *Ops[] = { CLHS, CRHS }; return ConstantFoldInstOperands(Instruction::And, CLHS->getType(), Ops, 2, TD); } @@ -83,8 +111,7 @@ /// SimplifyOrInst - Given operands for an Or, see if we can /// fold the result. If not, this returns null. 
-Value *llvm::SimplifyOrInst(Value *Op0, Value *Op1, - const TargetData *TD) { +Value *llvm::SimplifyOrInst(Value *Op0, Value *Op1, const TargetData *TD) { if (Constant *CLHS = dyn_cast(Op0)) { if (Constant *CRHS = dyn_cast(Op1)) { Constant *Ops[] = { CLHS, CRHS }; @@ -142,8 +169,6 @@ } - - static const Type *GetCompareTy(Value *Op) { return CmpInst::makeCmpResultType(Op->getType()); } @@ -327,6 +352,10 @@ switch (I->getOpcode()) { default: return ConstantFoldInstruction(I, TD); + case Instruction::Add: + return SimplifyAddInst(I->getOperand(0), I->getOperand(1), + cast(I)->hasNoSignedWrap(), + cast(I)->hasNoUnsignedWrap(), TD); case Instruction::And: return SimplifyAndInst(I->getOperand(0), I->getOperand(1), TD); case Instruction::Or: Modified: llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp?rev=90011&r1=90010&r2=90011&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp Fri Nov 27 11:42:22 2009 @@ -2163,8 +2163,8 @@ // Add has the property that adding any two 2's complement numbers can only // have one carry bit which can change a sign. As such, if LHS and RHS each - // have at least two sign bits, we know that the addition of the two values will - // sign extend fine. + // have at least two sign bits, we know that the addition of the two values + // will sign extend fine. 
if (ComputeNumSignBits(LHS) > 1 && ComputeNumSignBits(RHS) > 1) return true; @@ -2184,15 +2184,12 @@ bool Changed = SimplifyCommutative(I); Value *LHS = I.getOperand(0), *RHS = I.getOperand(1); - if (Constant *RHSC = dyn_cast(RHS)) { - // X + undef -> undef - if (isa(RHS)) - return ReplaceInstUsesWith(I, RHS); - - // X + 0 --> X - if (RHSC->isNullValue()) - return ReplaceInstUsesWith(I, LHS); + if (Value *V = SimplifyAddInst(LHS, RHS, I.hasNoSignedWrap(), + I.hasNoUnsignedWrap(), TD)) + return ReplaceInstUsesWith(I, V); + + if (Constant *RHSC = dyn_cast(RHS)) { if (ConstantInt *CI = dyn_cast(RHSC)) { // X + (signbit) --> X ^ signbit const APInt& Val = CI->getValue(); @@ -4322,7 +4319,6 @@ if (Value *V = SimplifyAndInst(Op0, Op1, TD)) return ReplaceInstUsesWith(I, V); - // See if we can simplify any instructions used by the instruction whose sole // purpose is to compute bits we don't care about. From sabre at nondot.org Fri Nov 27 12:08:31 2009 From: sabre at nondot.org (Chris Lattner) Date: Fri, 27 Nov 2009 18:08:31 -0000 Subject: [llvm-commits] [llvm] r90012 - /llvm/trunk/test/Transforms/GVN/pre-load.ll Message-ID: <200911271808.nARI8VsY006401@zion.cs.uiuc.edu> Author: lattner Date: Fri Nov 27 12:08:30 2009 New Revision: 90012 URL: http://llvm.org/viewvc/llvm-project?rev=90012&view=rev Log: add two simple test cases we now optimize (to one load in the loop each) and one we don't (corresponding to the fixme I added yesterday). 
Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-load.ll?rev=90012&r1=90011&r2=90012&view=diff ============================================================================== --- llvm/trunk/test/Transforms/GVN/pre-load.ll (original) +++ llvm/trunk/test/Transforms/GVN/pre-load.ll Fri Nov 27 12:08:30 2009 @@ -1,4 +1,5 @@ ; RUN: opt < %s -gvn -enable-load-pre -S | FileCheck %s +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" define i32 @test1(i32* %p, i1 %C) { ; CHECK: @test1 @@ -107,3 +108,123 @@ ; CHECK-NOT: load ; CHECK: ret i32 } + +;void test5(int N, double *G) { +; int j; +; for (j = 0; j < N - 1; j++) +; G[j] = G[j] + G[j+1]; +;} + +define void @test5(i32 %N, double* nocapture %G) nounwind ssp { +; CHECK: @test5 +entry: + %0 = add i32 %N, -1 + %1 = icmp sgt i32 %0, 0 + br i1 %1, label %bb.nph, label %return + +bb.nph: + %tmp = zext i32 %0 to i64 + br label %bb + +; CHECK: bb.nph: +; CHECK: load double* +; CHECK: br label %bb + +bb: + %indvar = phi i64 [ 0, %bb.nph ], [ %tmp6, %bb ] + %tmp6 = add i64 %indvar, 1 + %scevgep = getelementptr double* %G, i64 %tmp6 + %scevgep7 = getelementptr double* %G, i64 %indvar + %2 = load double* %scevgep7, align 8 + %3 = load double* %scevgep, align 8 + %4 = fadd double %2, %3 + store double %4, double* %scevgep7, align 8 + %exitcond = icmp eq i64 %tmp6, %tmp + br i1 %exitcond, label %return, label %bb + +; Should only be one load in the loop. 
+; CHECK: bb: +; CHECK: load double* +; CHECK-NOT: load double* +; CHECK: br i1 %exitcond + +return: + ret void +} + +;void test6(int N, double *G) { +; int j; +; for (j = 0; j < N - 1; j++) +; G[j+1] = G[j] + G[j+1]; +;} + +define void @test6(i32 %N, double* nocapture %G) nounwind ssp { +; CHECK: @test6 +entry: + %0 = add i32 %N, -1 + %1 = icmp sgt i32 %0, 0 + br i1 %1, label %bb.nph, label %return + +bb.nph: + %tmp = zext i32 %0 to i64 + br label %bb + +; CHECK: bb.nph: +; CHECK: load double* +; CHECK: br label %bb + +bb: + %indvar = phi i64 [ 0, %bb.nph ], [ %tmp6, %bb ] + %tmp6 = add i64 %indvar, 1 + %scevgep = getelementptr double* %G, i64 %tmp6 + %scevgep7 = getelementptr double* %G, i64 %indvar + %2 = load double* %scevgep7, align 8 + %3 = load double* %scevgep, align 8 + %4 = fadd double %2, %3 + store double %4, double* %scevgep, align 8 + %exitcond = icmp eq i64 %tmp6, %tmp + br i1 %exitcond, label %return, label %bb + +; Should only be one load in the loop. +; CHECK: bb: +; CHECK: load double* +; CHECK-NOT: load double* +; CHECK: br i1 %exitcond + +return: + ret void +} + + + +;;; --- todo + +;; Here the loaded address isn't available in 'block2' at all. 
+define i32 @testX(i32* %p, i32* %q, i32** %Hack, i1 %C) { +; CHECK: @testX +block1: + br i1 %C, label %block2, label %block3 + +block2: + br label %block4 +; HECK: block2: +; HECK: load i32* +; HECK: br label %block4 + +block3: + %A = getelementptr i32* %p, i32 1 + store i32 0, i32* %A + br label %block4 + +block4: + %P2 = phi i32* [%p, %block3], [%q, %block2] + %P3 = getelementptr i32* %P2, i32 1 + %PRE = load i32* %P3 + ret i32 %PRE +; HECK: block4: +; HECK-NEXT: phi i32 [ +; HECK-NOT: load +; HECK: ret i32 +} + + From sabre at nondot.org Fri Nov 27 13:11:32 2009 From: sabre at nondot.org (Chris Lattner) Date: Fri, 27 Nov 2009 19:11:32 -0000 Subject: [llvm-commits] [llvm] r90013 - in /llvm/trunk: lib/Analysis/MemoryDependenceAnalysis.cpp test/Transforms/GVN/pre-load.ll Message-ID: <200911271911.nARJBW1F008505@zion.cs.uiuc.edu> Author: lattner Date: Fri Nov 27 13:11:31 2009 New Revision: 90013 URL: http://llvm.org/viewvc/llvm-project?rev=90013&view=rev Log: add support for recursive phi translation and phi translation of add with immediate. This allows us to optimize this function: void test(int N, double* G) { long j; G[1] = 1; for (j = 1; j < N - 1; j++) G[j+1] = G[j] + G[j+1]; } to only do one load every iteration of the loop. Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp llvm/trunk/test/Transforms/GVN/pre-load.ll Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90013&r1=90012&r2=90013&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Fri Nov 27 13:11:31 2009 @@ -694,17 +694,31 @@ // We can handle bitcast of a PHI, but the PHI needs to be in the same block // as the bitcast. if (BitCastInst *BC = dyn_cast(Inst)) + // FIXME: Allow any phi translatable operand. 
if (PHINode *PN = dyn_cast(BC->getOperand(0))) if (PN->getParent() == BC->getParent()) return true; - // We can translate a GEP that uses a PHI in the current block for at least - // one of its operands. + // We can translate a GEP if all of its operands defined in this block are phi + // translatable. if (GetElementPtrInst *GEP = dyn_cast(Inst)) { - for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i) - if (PHINode *PN = dyn_cast(GEP->getOperand(i))) - if (PN->getParent() == GEP->getParent()) - return true; + for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i) { + Instruction *GEPOpI = dyn_cast(GEP->getOperand(i)); + if (GEPOpI == 0 || GEPOpI->getParent() != Inst->getParent()) + continue; + + if (!isPHITranslatable(GEPOpI)) + return false; + } + return true; + } + + if (Inst->getOpcode() == Instruction::Add && + isa(Inst->getOperand(1))) { + Instruction *GEPOpI = dyn_cast(Inst->getOperand(0)); + if (GEPOpI == 0 || GEPOpI->getParent() != Inst->getParent()) + return true; + return isPHITranslatable(GEPOpI); } // cerr << "MEMDEP: Could not PHI translate: " << *Pointer; @@ -731,6 +745,7 @@ // Handle bitcast of PHI. if (BitCastInst *BC = dyn_cast(Inst)) { + // FIXME: Recurse! PHINode *BCPN = cast(BC->getOperand(0)); Value *PHIIn = BCPN->getIncomingValueForBlock(Pred); @@ -749,7 +764,7 @@ return 0; } - // Handle getelementptr with at least one PHI operand. + // Handle getelementptr with at least one PHI translatable operand. if (GetElementPtrInst *GEP = dyn_cast(Inst)) { SmallVector GEPOps; BasicBlock *CurBB = GEP->getParent(); @@ -764,8 +779,8 @@ } // If the operand is a phi node, do phi translation. - if (PHINode *PN = dyn_cast(GEPOp)) { - GEPOps.push_back(PN->getIncomingValueForBlock(Pred)); + if (Value *InOp = PHITranslatePointer(GEPOp, CurBB, Pred, TD)) { + GEPOps.push_back(InOp); continue; } @@ -778,7 +793,6 @@ if (Value *V = SimplifyGEPInst(&GEPOps[0], GEPOps.size(), TD)) return V; - // Scan to see if we have this GEP available. 
Value *APHIOp = GEPOps[0]; for (Value::use_iterator UI = APHIOp->use_begin(), E = APHIOp->use_end(); @@ -800,6 +814,49 @@ return 0; } + // Handle add with a constant RHS. + if (Inst->getOpcode() == Instruction::Add && + isa(Inst->getOperand(1))) { + // PHI translate the LHS. + Value *LHS; + Constant *RHS = cast(Inst->getOperand(1)); + Instruction *OpI = dyn_cast(Inst->getOperand(0)); + bool isNSW = cast(Inst)->hasNoSignedWrap(); + bool isNUW = cast(Inst)->hasNoUnsignedWrap(); + + if (OpI == 0 || OpI->getParent() != Inst->getParent()) + LHS = Inst->getOperand(0); + else { + LHS = PHITranslatePointer(Inst->getOperand(0), CurBB, Pred, TD); + if (LHS == 0) + return 0; + } + + // If the PHI translated LHS is an add of a constant, fold the immediates. + if (BinaryOperator *BOp = dyn_cast(LHS)) + if (BOp->getOpcode() == Instruction::Add) + if (ConstantInt *CI = dyn_cast(BOp->getOperand(1))) { + LHS = BOp->getOperand(0); + RHS = ConstantExpr::getAdd(RHS, CI); + isNSW = isNUW = false; + } + + // See if the add simplifies away. + if (Value *Res = SimplifyAddInst(LHS, RHS, isNSW, isNUW, TD)) + return Res; + + // Otherwise, see if we have this add available somewhere. 
+ for (Value::use_iterator UI = LHS->use_begin(), E = LHS->use_end(); + UI != E; ++UI) { + if (BinaryOperator *BO = dyn_cast(*UI)) + if (BO->getOperand(0) == LHS && BO->getOperand(1) == RHS && + BO->getParent()->getParent() == CurBB->getParent()) + return BO; + } + + return 0; + } + return 0; } Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-load.ll?rev=90013&r1=90012&r2=90013&view=diff ============================================================================== --- llvm/trunk/test/Transforms/GVN/pre-load.ll (original) +++ llvm/trunk/test/Transforms/GVN/pre-load.ll Fri Nov 27 13:11:31 2009 @@ -195,6 +195,49 @@ ret void } +;void test7(int N, double* G) { +; long j; +; G[1] = 1; +; for (j = 1; j < N - 1; j++) +; G[j+1] = G[j] + G[j+1]; +;} + +; This requires phi translation of the adds. +define void @test7(i32 %N, double* nocapture %G) nounwind ssp { +entry: + %0 = getelementptr inbounds double* %G, i64 1 + store double 1.000000e+00, double* %0, align 8 + %1 = add i32 %N, -1 + %2 = icmp sgt i32 %1, 1 + br i1 %2, label %bb.nph, label %return + +bb.nph: + %tmp = sext i32 %1 to i64 + %tmp7 = add i64 %tmp, -1 + br label %bb + +bb: + %indvar = phi i64 [ 0, %bb.nph ], [ %tmp9, %bb ] + %tmp8 = add i64 %indvar, 2 + %scevgep = getelementptr double* %G, i64 %tmp8 + %tmp9 = add i64 %indvar, 1 + %scevgep10 = getelementptr double* %G, i64 %tmp9 + %3 = load double* %scevgep10, align 8 + %4 = load double* %scevgep, align 8 + %5 = fadd double %3, %4 + store double %5, double* %scevgep, align 8 + %exitcond = icmp eq i64 %tmp9, %tmp7 + br i1 %exitcond, label %return, label %bb + +; Should only be one load in the loop. 
+; CHECK: bb: +; CHECK: load double* +; CHECK-NOT: load double* +; CHECK: br i1 %exitcond + +return: + ret void +} ;;; --- todo From grosser at fim.uni-passau.de Fri Nov 27 13:43:18 2009 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Fri, 27 Nov 2009 20:43:18 +0100 Subject: [llvm-commits] [llvm] r89974 - /llvm/trunk/lib/VMCore/Metadata.cpp In-Reply-To: <1609_1259276091_4B0F073A_1609_73_1_200911262254.nAQMsQ5X017677@zion.cs.uiuc.edu> References: <1609_1259276091_4B0F073A_1609_73_1_200911262254.nAQMsQ5X017677@zion.cs.uiuc.edu> Message-ID: <1259350998.75448.149.camel@tobilaptop.fritz.box> On Thu, 2009-11-26 at 22:54 +0000, Nick Lewycky wrote: > Author: nicholas > Date: Thu Nov 26 16:54:26 2009 > New Revision: 89974 > > URL: http://llvm.org/viewvc/llvm-project?rev=89974&view=rev > Log: > Clean up file, no functionality change. > > Modified: > llvm/trunk/lib/VMCore/Metadata.cpp > > Modified: llvm/trunk/lib/VMCore/Metadata.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Metadata.cpp?rev=89974&r1=89973&r2=89974&view=diff > > ============================================================================== > --- llvm/trunk/lib/VMCore/Metadata.cpp (original) > +++ llvm/trunk/lib/VMCore/Metadata.cpp Thu Nov 26 16:54:26 2009 > @@ -33,10 +33,8 @@ > StringMapEntry &Entry = > pImpl->MDStringCache.GetOrCreateValue(Str); > MDString *&S = Entry.getValue(); > - if (S) return S; > - > - return S = > - new MDString(Context, Entry.getKey()); > + if (!S) S = new MDString(Context, Entry.getKey()); > + return S; > } > > MDString *MDString::get(LLVMContext &Context, const char *Str) { > @@ -44,10 +42,8 @@ > StringMapEntry &Entry = > pImpl->MDStringCache.GetOrCreateValue(Str ? StringRef(Str) : StringRef()); > MDString *&S = Entry.getValue(); > - if (S) return S; > - > - return S = > - new MDString(Context, Entry.getKey()); > + if (!S) new MDString(Context, Entry.getKey()); > + return S; Did you miss a "S = " here? 
- return S =
-   new MDString(Context, Entry.getKey());
+ if (!S) S = new MDString(Context, Entry.getKey());
          ^^^^

Tobi

From grosser at fim.uni-passau.de Fri Nov 27 13:44:46 2009
From: grosser at fim.uni-passau.de (Tobias Grosser)
Date: Fri, 27 Nov 2009 20:44:46 +0100
Subject: [llvm-commits] [PATCH] DOTGraphTraits improvements
Message-ID: <1259351086.75448.152.camel@tobilaptop.fritz.box>

Hi,

I worked on some dotty improvements. The main changes are:

* Better layout
* Allow hiding details in -only mode
* Do not print unneeded stuff
* Several small bug fixes

The effects can be seen on this page:

http://students.fim.uni-passau.de/~grosser/llvm/dotty_patch/dotty.html

The complete change set is attached, as well as one big patch to be applied at once.

These are the changes in detail:

* Remove ":" after BB name in -view-cfg-only

* Small PostDominatorTree improvements

  * Do not SEGFAULT if tree entryNode() is NULL
  * Print function names in dotty printer

* Only print edgeSourceLabels if they are not empty

  Graphviz can lay out the graphs better if a node does not contain source ports. Therefore only print the ports if the source ports are useful, that is, if they are not labeled with the empty string "".
  This patch also simplifies graphs without any edgeSourceLabels, e.g. the dominance trees.

* Do not point edge heads to source labels

  If no destination label is available, just point to the node itself instead of pointing to some source label. Source and destination labels are not related in any way.

* Instantiate DefaultDOTGraphTraits

  Allows querying isSimple() from all methods in DOTGraphTraits. It can be used as a general ShortNames flag.

* Remove ShortNames from getNodeLabel

  Convert ShortNames to the general isSimple(). isSimple() is only called in the classes where it actually makes a difference.
* Do not print edge source labels in -view-cfg-only & -dot-cfg-only Without branch instructions the labels are not informative and at the same time complicate the layout of the graph. This uses the new isSimple. Tobias -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-Remove-after-BB-name-in-view-cfg-only.patch Type: text/x-patch Size: 918 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091127/8b04095e/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-Small-PostDominatorTree-improvements.patch Type: text/x-patch Size: 2229 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091127/8b04095e/attachment-0001.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: 0003-Only-print-edgeSourceLabels-if-they-are-not-empty.patch Type: text/x-patch Size: 3627 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091127/8b04095e/attachment-0002.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: 0004-Do-not-point-edge-heads-to-source-labels.patch Type: text/x-patch Size: 1152 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091127/8b04095e/attachment-0003.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: 0005-Instantiate-DefaultDOTGraphTraits.patch Type: text/x-patch Size: 11913 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091127/8b04095e/attachment-0004.bin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 0006-Remove-ShortNames-from-getNodeLabel.patch Type: text/x-patch Size: 11918 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091127/8b04095e/attachment-0005.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: 0007-Do-not-print-edge-source-labels-in-simple-CFG.patch Type: text/x-patch Size: 1125 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091127/8b04095e/attachment-0006.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: complete_patch_set-llvm_graph_printer_improvements.patch Type: text/x-patch Size: 21305 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091127/8b04095e/attachment-0007.bin From sabre at nondot.org Fri Nov 27 13:56:01 2009 From: sabre at nondot.org (Chris Lattner) Date: Fri, 27 Nov 2009 19:56:01 -0000 Subject: [llvm-commits] [llvm] r90014 - /llvm/trunk/test/Transforms/GVN/rle-must-alias.ll Message-ID: <200911271956.nARJu1Xv010140@zion.cs.uiuc.edu> Author: lattner Date: Fri Nov 27 13:56:00 2009 New Revision: 90014 URL: http://llvm.org/viewvc/llvm-project?rev=90014&view=rev Log: I accidentally implemented this :) Modified: llvm/trunk/test/Transforms/GVN/rle-must-alias.ll Modified: llvm/trunk/test/Transforms/GVN/rle-must-alias.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/rle-must-alias.ll?rev=90014&r1=90013&r2=90014&view=diff ============================================================================== --- llvm/trunk/test/Transforms/GVN/rle-must-alias.ll (original) +++ llvm/trunk/test/Transforms/GVN/rle-must-alias.ll Fri Nov 27 13:56:00 2009 @@ -1,7 +1,6 @@ ; RUN: opt < %s -gvn -S | grep {DEAD = phi i32 } -; XFAIL: * -; FIXME: GVN should eliminate the fully redundant %9 GEP which +; GVN should eliminate the fully redundant %9 GEP which ; allows DEAD to be removed. This is PR3198. 
; The %7 and %4 loads combine to make %DEAD unneeded. From nicholas at mxc.ca Fri Nov 27 13:57:53 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Fri, 27 Nov 2009 19:57:53 -0000 Subject: [llvm-commits] [llvm] r90015 - /llvm/trunk/lib/VMCore/Metadata.cpp Message-ID: <200911271957.nARJvrnb010270@zion.cs.uiuc.edu> Author: nicholas Date: Fri Nov 27 13:57:53 2009 New Revision: 90015 URL: http://llvm.org/viewvc/llvm-project?rev=90015&view=rev Log: Oops! Fix bug introduced in my recent cleanup change. Thanks to Tobias Grosser for pointing this out. Modified: llvm/trunk/lib/VMCore/Metadata.cpp Modified: llvm/trunk/lib/VMCore/Metadata.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Metadata.cpp?rev=90015&r1=90014&r2=90015&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/Metadata.cpp (original) +++ llvm/trunk/lib/VMCore/Metadata.cpp Fri Nov 27 13:57:53 2009 @@ -42,7 +42,7 @@ StringMapEntry &Entry = pImpl->MDStringCache.GetOrCreateValue(Str ? StringRef(Str) : StringRef()); MDString *&S = Entry.getValue(); - if (!S) new MDString(Context, Entry.getKey()); + if (!S) S = new MDString(Context, Entry.getKey()); return S; } From nicholas at mxc.ca Fri Nov 27 13:58:09 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Fri, 27 Nov 2009 11:58:09 -0800 Subject: [llvm-commits] [llvm] r89974 - /llvm/trunk/lib/VMCore/Metadata.cpp In-Reply-To: <1259350998.75448.149.camel@tobilaptop.fritz.box> References: <1609_1259276091_4B0F073A_1609_73_1_200911262254.nAQMsQ5X017677@zion.cs.uiuc.edu> <1259350998.75448.149.camel@tobilaptop.fritz.box> Message-ID: <4B102F51.8090607@mxc.ca> Tobias Grosser wrote: > On Thu, 2009-11-26 at 22:54 +0000, Nick Lewycky wrote: >> Author: nicholas >> Date: Thu Nov 26 16:54:26 2009 >> New Revision: 89974 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=89974&view=rev >> Log: >> Clean up file, no functionality change. 
>> >> Modified: >> llvm/trunk/lib/VMCore/Metadata.cpp >> >> Modified: llvm/trunk/lib/VMCore/Metadata.cpp >> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/Metadata.cpp?rev=89974&r1=89973&r2=89974&view=diff >> >> ============================================================================== >> --- llvm/trunk/lib/VMCore/Metadata.cpp (original) >> +++ llvm/trunk/lib/VMCore/Metadata.cpp Thu Nov 26 16:54:26 2009 >> @@ -33,10 +33,8 @@ >> StringMapEntry &Entry = >> pImpl->MDStringCache.GetOrCreateValue(Str); >> MDString *&S = Entry.getValue(); >> - if (S) return S; >> - >> - return S = >> - new MDString(Context, Entry.getKey()); >> + if (!S) S = new MDString(Context, Entry.getKey()); >> + return S; >> } >> >> MDString *MDString::get(LLVMContext&Context, const char *Str) { >> @@ -44,10 +42,8 @@ >> StringMapEntry &Entry = >> pImpl->MDStringCache.GetOrCreateValue(Str ? StringRef(Str) : StringRef()); >> MDString *&S = Entry.getValue(); >> - if (S) return S; >> - >> - return S = >> - new MDString(Context, Entry.getKey()); >> + if (!S) new MDString(Context, Entry.getKey()); >> + return S; > > Did you miss a "S = " here? > > - return S = > - new MDString(Context, Entry.getKey()); > + if (!S) S = new MDString(Context, Entry.getKey()); > ^^^^ I totally did, thanks for catching it! Fixed in r90015. Nick From sabre at nondot.org Fri Nov 27 14:25:30 2009 From: sabre at nondot.org (Chris Lattner) Date: Fri, 27 Nov 2009 20:25:30 -0000 Subject: [llvm-commits] [llvm] r90016 - /llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Message-ID: <200911272025.nARKPU9P011343@zion.cs.uiuc.edu> Author: lattner Date: Fri Nov 27 14:25:30 2009 New Revision: 90016 URL: http://llvm.org/viewvc/llvm-project?rev=90016&view=rev Log: recursively phi translate bitcast operands too, for consistency. 
Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90016&r1=90015&r2=90016&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Fri Nov 27 14:25:30 2009 @@ -693,21 +693,22 @@ // We can handle bitcast of a PHI, but the PHI needs to be in the same block // as the bitcast. - if (BitCastInst *BC = dyn_cast(Inst)) - // FIXME: Allow any phi translatable operand. - if (PHINode *PN = dyn_cast(BC->getOperand(0))) - if (PN->getParent() == BC->getParent()) - return true; + if (BitCastInst *BC = dyn_cast(Inst)) { + Instruction *OpI = dyn_cast(BC->getOperand(0)); + if (OpI == 0 || OpI->getParent() != Inst->getParent()) + return true; + return isPHITranslatable(OpI); + } // We can translate a GEP if all of its operands defined in this block are phi // translatable. 
if (GetElementPtrInst *GEP = dyn_cast(Inst)) { for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i) { - Instruction *GEPOpI = dyn_cast(GEP->getOperand(i)); - if (GEPOpI == 0 || GEPOpI->getParent() != Inst->getParent()) + Instruction *OpI = dyn_cast(GEP->getOperand(i)); + if (OpI == 0 || OpI->getParent() != Inst->getParent()) continue; - if (!isPHITranslatable(GEPOpI)) + if (!isPHITranslatable(OpI)) return false; } return true; @@ -715,10 +716,10 @@ if (Inst->getOpcode() == Instruction::Add && isa(Inst->getOperand(1))) { - Instruction *GEPOpI = dyn_cast(Inst->getOperand(0)); - if (GEPOpI == 0 || GEPOpI->getParent() != Inst->getParent()) + Instruction *OpI = dyn_cast(Inst->getOperand(0)); + if (OpI == 0 || OpI->getParent() != Inst->getParent()) return true; - return isPHITranslatable(GEPOpI); + return isPHITranslatable(OpI); } // cerr << "MEMDEP: Could not PHI translate: " << *Pointer; @@ -745,9 +746,9 @@ // Handle bitcast of PHI. if (BitCastInst *BC = dyn_cast(Inst)) { - // FIXME: Recurse! - PHINode *BCPN = cast(BC->getOperand(0)); - Value *PHIIn = BCPN->getIncomingValueForBlock(Pred); + // PHI translate the input operand. + Value *PHIIn = PHITranslatePointer(BC->getOperand(0), CurBB, Pred, TD); + if (PHIIn == 0) return 0; // Constants are trivial to phi translate. if (Constant *C = dyn_cast(PHIIn)) @@ -779,14 +780,10 @@ } // If the operand is a phi node, do phi translation. - if (Value *InOp = PHITranslatePointer(GEPOp, CurBB, Pred, TD)) { - GEPOps.push_back(InOp); - continue; - } + Value *InOp = PHITranslatePointer(GEPOp, CurBB, Pred, TD); + if (InOp == 0) return 0; - // Otherwise, we can't PHI translate this random value defined in this - // block. - return 0; + GEPOps.push_back(InOp); } // Simplify the GEP to handle 'gep x, 0' -> x etc. 
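The recursive translatability check that r90013 and r90016 build up can be sketched on a toy IR as follows. This is only an illustration under stated assumptions: Op, Node, needsTranslation and the block numbering are hypothetical and are not the LLVM classes, and treating a PHI as the trivially translatable base case is an assumption, since the quoted hunks do not show that part of the function. The rule it models is the one in the diff: an operand matters only if it is an instruction defined in the same block as its user, and then it must itself be translatable.

```cpp
#include <vector>

// Hypothetical toy IR; not the LLVM class hierarchy.
enum class Op { Phi, BitCast, GEP, Add, Load, Argument, ConstantInt };

struct Node {
  Op op;
  int block;                          // defining block; -1 for non-instructions
  std::vector<const Node *> operands;

  bool isInstruction() const {
    return op != Op::Argument && op != Op::ConstantInt;
  }
};

// An operand only needs translating when it is an instruction defined in the
// same block as its user; anything else is usable in the predecessor as-is.
static bool needsTranslation(const Node *operand, const Node &user) {
  return operand->isInstruction() && operand->block == user.block;
}

bool isPHITranslatable(const Node &inst) {
  switch (inst.op) {
  case Op::Phi:
    // Assumed base case: a PHI translates to its incoming value.
    return true;

  case Op::BitCast: {
    const Node *in = inst.operands[0];
    return !needsTranslation(in, inst) || isPHITranslatable(*in);
  }

  case Op::GEP:
    // Every operand defined in this block must itself be translatable.
    for (const Node *operand : inst.operands)
      if (needsTranslation(operand, inst) && !isPHITranslatable(*operand))
        return false;
    return true;

  case Op::Add: {
    // Only "add X, constant" is handled.
    if (inst.operands.size() != 2 ||
        inst.operands[1]->op != Op::ConstantInt)
      return false;
    const Node *lhs = inst.operands[0];
    return !needsTranslation(lhs, inst) || isPHITranslatable(*lhs);
  }

  default:
    return false;  // e.g. a load: cannot be recomputed in the predecessor
  }
}
```

With this shape, the address pattern from the test7 case in pre-load.ll (a GEP indexed by an add of the induction-variable PHI and a constant) is accepted, while a GEP whose base is a load in the same block is rejected.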
From sabre at nondot.org Fri Nov 27 16:05:15 2009 From: sabre at nondot.org (Chris Lattner) Date: Fri, 27 Nov 2009 22:05:15 -0000 Subject: [llvm-commits] [llvm] r90019 - in /llvm/trunk: include/llvm/Analysis/MemoryDependenceAnalysis.h lib/Analysis/MemoryDependenceAnalysis.cpp lib/Transforms/Scalar/GVN.cpp Message-ID: <200911272205.nARM5F0o014568@zion.cs.uiuc.edu> Author: lattner Date: Fri Nov 27 16:05:15 2009 New Revision: 90019 URL: http://llvm.org/viewvc/llvm-project?rev=90019&view=rev Log: Rework InsertPHITranslatedPointer to handle the recursive case, this fixes PR5630 and sets the stage for the next phase of goodness (testcase pending). Modified: llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h?rev=90019&r1=90018&r2=90019&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h (original) +++ llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h Fri Nov 27 16:05:15 2009 @@ -30,6 +30,7 @@ class TargetData; class MemoryDependenceAnalysis; class PredIteratorCache; + class DominatorTree; /// MemDepResult - A memory dependence query can return one of three different /// answers, described below. @@ -244,19 +245,27 @@ BasicBlock *BB, SmallVectorImpl &Result); - /// PHITranslatePointer - Find an available version of the specified value + /// GetPHITranslatedValue - Find an available version of the specified value /// PHI translated across the specified edge. If MemDep isn't able to /// satisfy this request, it returns null. 
- Value *PHITranslatePointer(Value *V, - BasicBlock *CurBB, BasicBlock *PredBB, - const TargetData *TD) const; - + Value *GetPHITranslatedValue(Value *V, + BasicBlock *CurBB, BasicBlock *PredBB, + const TargetData *TD) const; + + /// GetAvailablePHITranslatedValue - Return the value computed by + /// PHITranslatePointer if it dominates PredBB, otherwise return null. + Value *GetAvailablePHITranslatedValue(Value *V, + BasicBlock *CurBB, BasicBlock *PredBB, + const TargetData *TD, + const DominatorTree &DT) const; + /// InsertPHITranslatedPointer - Insert a computation of the PHI translated /// version of 'V' for the edge PredBB->CurBB into the end of the PredBB /// block. Value *InsertPHITranslatedPointer(Value *V, BasicBlock *CurBB, BasicBlock *PredBB, - const TargetData *TD) const; + const TargetData *TD, + const DominatorTree &DT) const; /// removeInstruction - Remove an instruction from the dependence analysis, /// updating the dependence of instructions that previously depended on it. Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90019&r1=90018&r2=90019&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Fri Nov 27 16:05:15 2009 @@ -20,6 +20,7 @@ #include "llvm/IntrinsicInst.h" #include "llvm/Function.h" #include "llvm/Analysis/AliasAnalysis.h" +#include "llvm/Analysis/Dominators.h" #include "llvm/Analysis/InstructionSimplify.h" #include "llvm/Analysis/MemoryBuiltins.h" #include "llvm/ADT/Statistic.h" @@ -729,12 +730,12 @@ return false; } -/// PHITranslateForPred - Given a computation that satisfied the +/// GetPHITranslatedValue - Given a computation that satisfied the /// isPHITranslatable predicate, see if we can translate the computation into /// the specified predecessor block. 
If so, return that value. Value *MemoryDependenceAnalysis:: -PHITranslatePointer(Value *InVal, BasicBlock *CurBB, BasicBlock *Pred, - const TargetData *TD) const { +GetPHITranslatedValue(Value *InVal, BasicBlock *CurBB, BasicBlock *Pred, + const TargetData *TD) const { // If the input value is not an instruction, or if it is not defined in CurBB, // then we don't need to phi translate it. Instruction *Inst = dyn_cast(InVal); @@ -747,7 +748,7 @@ // Handle bitcast of PHI. if (BitCastInst *BC = dyn_cast(Inst)) { // PHI translate the input operand. - Value *PHIIn = PHITranslatePointer(BC->getOperand(0), CurBB, Pred, TD); + Value *PHIIn = GetPHITranslatedValue(BC->getOperand(0), CurBB, Pred, TD); if (PHIIn == 0) return 0; // Constants are trivial to phi translate. @@ -780,7 +781,7 @@ } // If the operand is a phi node, do phi translation. - Value *InOp = PHITranslatePointer(GEPOp, CurBB, Pred, TD); + Value *InOp = GetPHITranslatedValue(GEPOp, CurBB, Pred, TD); if (InOp == 0) return 0; GEPOps.push_back(InOp); @@ -824,7 +825,7 @@ if (OpI == 0 || OpI->getParent() != Inst->getParent()) LHS = Inst->getOperand(0); else { - LHS = PHITranslatePointer(Inst->getOperand(0), CurBB, Pred, TD); + LHS = GetPHITranslatedValue(Inst->getOperand(0), CurBB, Pred, TD); if (LHS == 0) return 0; } @@ -857,6 +858,25 @@ return 0; } +/// GetAvailablePHITranslatePointer - Return the value computed by +/// PHITranslatePointer if it dominates PredBB, otherwise return null. +Value *MemoryDependenceAnalysis:: +GetAvailablePHITranslatedValue(Value *V, + BasicBlock *CurBB, BasicBlock *PredBB, + const TargetData *TD, + const DominatorTree &DT) const { + // See if PHI translation succeeds. + V = GetPHITranslatedValue(V, CurBB, PredBB, TD); + if (V == 0) return 0; + + // Make sure the value is live in the predecessor. 
+ if (Instruction *Inst = dyn_cast_or_null(V)) + if (!DT.dominates(Inst->getParent(), PredBB)) + return 0; + return V; +} + + /// InsertPHITranslatedPointer - Insert a computation of the PHI translated /// version of 'V' for the edge PredBB->CurBB into the end of the PredBB /// block. @@ -865,19 +885,25 @@ /// dominate the block, so we don't need to handle the trivial cases here. Value *MemoryDependenceAnalysis:: InsertPHITranslatedPointer(Value *InVal, BasicBlock *CurBB, - BasicBlock *PredBB, const TargetData *TD) const { - // If the input value isn't an instruction in CurBB, it doesn't need phi - // translation. + BasicBlock *PredBB, const TargetData *TD, + const DominatorTree &DT) const { + // See if we have a version of this value already available and dominating + // PredBB. If so, there is no need to insert a new copy. + if (Value *Res = GetAvailablePHITranslatedValue(InVal, CurBB, PredBB, TD, DT)) + return Res; + + // If we don't have an available version of this value, it must be an + // instruction. Instruction *Inst = cast(InVal); - assert(Inst->getParent() == CurBB && "Doesn't need phi trans"); - - // Handle bitcast of PHI. + + // Handle bitcast of PHI translatable value. if (BitCastInst *BC = dyn_cast(Inst)) { - PHINode *BCPN = cast(BC->getOperand(0)); - Value *PHIIn = BCPN->getIncomingValueForBlock(PredBB); - + Value *OpVal = InsertPHITranslatedPointer(BC->getOperand(0), + CurBB, PredBB, TD, DT); + if (OpVal == 0) return 0; + // Otherwise insert a bitcast at the end of PredBB. - return new BitCastInst(PHIIn, InVal->getType(), + return new BitCastInst(OpVal, InVal->getType(), InVal->getName()+".phi.trans.insert", PredBB->getTerminator()); } @@ -885,12 +911,12 @@ // Handle getelementptr with at least one PHI operand. 
if (GetElementPtrInst *GEP = dyn_cast(Inst)) { SmallVector GEPOps; - Value *APHIOp = 0; BasicBlock *CurBB = GEP->getParent(); for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i) { - GEPOps.push_back(GEP->getOperand(i)->DoPHITranslation(CurBB, PredBB)); - if (!isa(GEPOps.back())) - APHIOp = GEPOps.back(); + Value *OpVal = InsertPHITranslatedPointer(GEP->getOperand(i), + CurBB, PredBB, TD, DT); + if (OpVal == 0) return 0; + GEPOps.push_back(OpVal); } GetElementPtrInst *Result = @@ -901,6 +927,28 @@ return Result; } +#if 0 + // FIXME: This code works, but it is unclear that we actually want to insert + // a big chain of computation in order to make a value available in a block. + // This needs to be evaluated carefully to consider its cost trade offs. + + // Handle add with a constant RHS. + if (Inst->getOpcode() == Instruction::Add && + isa(Inst->getOperand(1))) { + // PHI translate the LHS. + Value *OpVal = InsertPHITranslatedPointer(Inst->getOperand(0), + CurBB, PredBB, TD, DT); + if (OpVal == 0) return 0; + + BinaryOperator *Res = BinaryOperator::CreateAdd(OpVal, Inst->getOperand(1), + InVal->getName()+".phi.trans.insert", + PredBB->getTerminator()); + Res->setHasNoSignedWrap(cast(Inst)->hasNoSignedWrap()); + Res->setHasNoUnsignedWrap(cast(Inst)->hasNoUnsignedWrap()); + return Res; + } +#endif + return 0; } @@ -1055,15 +1103,9 @@ for (BasicBlock **PI = PredCache->GetPreds(BB); *PI; ++PI) { BasicBlock *Pred = *PI; - Value *PredPtr = PHITranslatePointer(PtrInst, BB, Pred, TD); - - // If PHI translation fails, bail out. - if (PredPtr == 0) { - // FIXME: Instead of modelling this as a phi trans failure, we should - // model this as a clobber in the one predecessor. This will allow - // us to PRE values that are only available in some preds but not all. - goto PredTranslationFailure; - } + // Get the PHI translated pointer in this predecessor. This can fail and + // return null if not translatable. 
+ Value *PredPtr = GetPHITranslatedValue(PtrInst, BB, Pred, TD); // Check to see if we have already visited this pred block with another // pointer. If so, we can't do this lookup. This failure can occur @@ -1084,6 +1126,19 @@ // treat this as a phi translation failure. goto PredTranslationFailure; } + + // If PHI translation was unable to find an available pointer in this + // predecessor, then we have to assume that the pointer is clobbered in + // that predecessor. We can still do PRE of the load, which would insert + // a computation of the pointer in this predecessor. + if (PredPtr == 0) { + goto PredTranslationFailure; +#if 0 // TODO. + Result.push_back(NonLocalDepEntry(Pred, + MemDepResult::getClobber(Pred->getTerminator()))); + continue; +#endif + } // FIXME: it is entirely possible that PHI translating will end up with // the same value. Consider PHI translating something like: Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=90019&r1=90018&r2=90019&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Fri Nov 27 16:05:15 2009 @@ -1432,31 +1432,21 @@ return false; } - // If the loaded pointer is PHI node defined in this block, do PHI translation - // to get its value in the predecessor. - Value *LoadPtr = MD->PHITranslatePointer(LI->getOperand(0), - LoadBB, UnavailablePred, TD); - // Make sure the value is live in the predecessor. MemDep found a computation - // of LPInst with the right value, but that does not dominate UnavailablePred, - // then we can't use it. - if (Instruction *LPInst = dyn_cast_or_null(LoadPtr)) - if (!DT->dominates(LPInst->getParent(), UnavailablePred)) - LoadPtr = 0; + // Do PHI translation to get its value in the predecessor if necessary. 
The + // returned pointer (if non-null) is guaranteed to dominate UnavailablePred. + // + // FIXME: This may insert a computation, but we don't tell scalar GVN + // optimization stuff about it. How do we do this? + Value *LoadPtr = + MD->InsertPHITranslatedPointer(LI->getOperand(0), LoadBB, + UnavailablePred, TD, *DT); - // If we don't have a computation of this phi translated value, try to insert - // one. + // If we couldn't find or insert a computation of this phi translated value, + // we fail PRE. if (LoadPtr == 0) { - LoadPtr = MD->InsertPHITranslatedPointer(LI->getOperand(0), - LoadBB, UnavailablePred, TD); - if (LoadPtr == 0) { - DEBUG(errs() << "COULDN'T INSERT PHI TRANSLATED VALUE OF: " - << *LI->getOperand(0) << "\n"); - return false; - } - - // FIXME: This inserts a computation, but we don't tell scalar GVN - // optimization stuff about it. How do we do this? - DEBUG(errs() << "INSERTED PHI TRANSLATED VALUE: " << *LoadPtr << "\n"); + DEBUG(errs() << "COULDN'T INSERT PHI TRANSLATED VALUE OF: " + << *LI->getOperand(0) << "\n"); + return false; } // Make sure it is valid to move this load here. We have to watch out for: From sabre at nondot.org Fri Nov 27 16:50:08 2009 From: sabre at nondot.org (Chris Lattner) Date: Fri, 27 Nov 2009 22:50:08 -0000 Subject: [llvm-commits] [llvm] r90022 - in /llvm/trunk: lib/Transforms/Scalar/GVN.cpp test/Transforms/GVN/crash.ll test/Transforms/GVN/pre-load.ll Message-ID: <200911272250.nARMo8Rx016049@zion.cs.uiuc.edu> Author: lattner Date: Fri Nov 27 16:50:07 2009 New Revision: 90022 URL: http://llvm.org/viewvc/llvm-project?rev=90022&view=rev Log: disable value insertion for now, I need to figure out how to inform GVN about the newly inserted values. This fixes PR5631. 
Added: llvm/trunk/test/Transforms/GVN/crash.ll Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp llvm/trunk/test/Transforms/GVN/pre-load.ll Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=90022&r1=90021&r2=90022&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Fri Nov 27 16:50:07 2009 @@ -1437,10 +1437,16 @@ // // FIXME: This may insert a computation, but we don't tell scalar GVN // optimization stuff about it. How do we do this? +#if 0 Value *LoadPtr = MD->InsertPHITranslatedPointer(LI->getOperand(0), LoadBB, UnavailablePred, TD, *DT); - +#else + Value *LoadPtr = + MD->GetAvailablePHITranslatedValue(LI->getOperand(0), LoadBB, + UnavailablePred, TD, *DT); +#endif + // If we couldn't find or insert a computation of this phi translated value, // we fail PRE. if (LoadPtr == 0) { Added: llvm/trunk/test/Transforms/GVN/crash.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/crash.ll?rev=90022&view=auto ============================================================================== --- llvm/trunk/test/Transforms/GVN/crash.ll (added) +++ llvm/trunk/test/Transforms/GVN/crash.ll Fri Nov 27 16:50:07 2009 @@ -0,0 +1,61 @@ +; RUN: opt -gvn %s -disable-output + +; PR5631 + +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" +target triple = "x86_64-apple-darwin10.0" + +define i32* @peel_to_type(i8* %name, i32 %namelen, i32* %o, i32 %expected_type) nounwind ssp { +entry: + br i1 undef, label %if.end13, label %while.body.preheader + + +if.end13: ; preds = %if.then6 + br label %while.body.preheader + +while.body.preheader: ; preds = %if.end13, %if.end + br label %while.body + +while.body: ; preds = %while.body.backedge, 
%while.body.preheader + %o.addr.0 = phi i32* [ undef, %while.body.preheader ], [ %o.addr.0.be, %while.body.backedge ] ; [#uses=2] + br i1 false, label %return.loopexit, label %lor.lhs.false + +lor.lhs.false: ; preds = %while.body + %tmp20 = bitcast i32* %o.addr.0 to i32* ; [#uses=1] + %tmp22 = load i32* %tmp20 ; [#uses=0] + br i1 undef, label %land.lhs.true24, label %if.end31 + +land.lhs.true24: ; preds = %lor.lhs.false + %call28 = call i32* @parse_object(i8* undef) nounwind ; [#uses=0] + br i1 undef, label %return.loopexit, label %if.end31 + +if.end31: ; preds = %land.lhs.true24, %lor.lhs.false + br i1 undef, label %return.loopexit, label %if.end41 + +if.end41: ; preds = %if.end31 + %tmp43 = bitcast i32* %o.addr.0 to i32* ; [#uses=1] + %tmp45 = load i32* %tmp43 ; [#uses=0] + br i1 undef, label %if.then50, label %if.else + +if.then50: ; preds = %if.end41 + %tmp53 = load i32** undef ; [#uses=1] + br label %while.body.backedge + +if.else: ; preds = %if.end41 + br i1 undef, label %if.then62, label %if.else67 + +if.then62: ; preds = %if.else + br label %while.body.backedge + +while.body.backedge: ; preds = %if.then62, %if.then50 + %o.addr.0.be = phi i32* [ %tmp53, %if.then50 ], [ undef, %if.then62 ] ; [#uses=1] + br label %while.body + +if.else67: ; preds = %if.else + ret i32* null + +return.loopexit: ; preds = %if.end31, %land.lhs.true24, %while.body + ret i32* undef +} + +declare i32* @parse_object(i8*) Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-load.ll?rev=90022&r1=90021&r2=90022&view=diff ============================================================================== --- llvm/trunk/test/Transforms/GVN/pre-load.ll (original) +++ llvm/trunk/test/Transforms/GVN/pre-load.ll Fri Nov 27 16:50:07 2009 @@ -86,9 +86,9 @@ block2: br label %block4 -; CHECK: block2: -; CHECK: load i32* -; CHECK: br label %block4 +; HECK: block2: +; HECK: load i32* +; HECK: br label %block4 block3: %B = 
getelementptr i32* %q, i32 1 @@ -103,10 +103,10 @@ %P3 = getelementptr i32* %P2, i32 1 %PRE = load i32* %P3 ret i32 %PRE -; CHECK: block4: -; CHECK-NEXT: phi i32 [ -; CHECK-NOT: load -; CHECK: ret i32 +; HECK: block4: +; HECK-NEXT: phi i32 [ +; HECK-NOT: load +; HECK: ret i32 } ;void test5(int N, double *G) { @@ -239,12 +239,10 @@ ret void } - -;;; --- todo - -;; Here the loaded address isn't available in 'block2' at all. -define i32 @testX(i32* %p, i32* %q, i32** %Hack, i1 %C) { -; CHECK: @testX +;; Here the loaded address isn't available in 'block2' at all, requiring a new +;; GEP to be inserted into it. +define i32 @test8(i32* %p, i32* %q, i32** %Hack, i1 %C) { +; CHECK: @test8 block1: br i1 %C, label %block2, label %block3 From sabre at nondot.org Sat Nov 28 08:54:10 2009 From: sabre at nondot.org (Chris Lattner) Date: Sat, 28 Nov 2009 14:54:10 -0000 Subject: [llvm-commits] [llvm] r90037 - /llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Message-ID: <200911281454.nASEsB8u006817@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 08:54:10 2009 New Revision: 90037 URL: http://llvm.org/viewvc/llvm-project?rev=90037&view=rev Log: enable code to handle un-phi-translatable cases more aggressively: if we don't have an address expression available in a predecessor, then model this as the value being clobbered at the end of the pred block instead of being modeled as a complete phi translation failure. This is important for PRE of loads because we want to see that the load is available in all but this predecessor, and complete phi translation failure results in not getting any information about predecessors. This doesn't do anything until I re-enable code insertion, since PRE now sees that the load is available in all but one predecessor, but can't insert the addressing in the predecessor that is missing it to eliminate the redundancy.
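Editorial note: the r90037 log describes recording a per-predecessor clobber instead of abandoning the whole query when the address cannot be translated into one predecessor. The following is a rough sketch of that idea only. `Dep`, `PredEntry`, `queryPreds` and `isPRECandidate` are hypothetical names, the "Available" state glosses over the further dependency scan the real analysis performs, and the "exactly one missing predecessor" rule is an assumption taken from the wording of the log.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified result kinds; the real MemDepResult has more states.
enum class Dep { Available, Clobber };

struct PredEntry {
  std::string pred;  // name of the predecessor block
  Dep dep;           // what the query learned about that predecessor
};

// translate(pred) returns true when the address can be expressed in `pred`.
std::vector<PredEntry>
queryPreds(const std::vector<std::string> &preds,
           const std::function<bool(const std::string &)> &translate) {
  std::vector<PredEntry> result;
  for (const std::string &p : preds) {
    if (!translate(p)) {
      // Instead of failing the whole query, record a clobber for this one
      // predecessor and keep going, so the other predecessors are still seen.
      result.push_back({p, Dep::Clobber});
      continue;
    }
    result.push_back({p, Dep::Available});
  }
  return result;
}

// Assumption for this sketch: load PRE is interesting when exactly one
// predecessor is missing the value.
bool isPRECandidate(const std::vector<PredEntry> &entries) {
  int missing = 0;
  for (const PredEntry &e : entries)
    if (e.dep == Dep::Clobber)
      ++missing;
  return missing == 1;
}
```

With the older behaviour (a single failure aborts the loop) the caller would learn nothing about the predecessors that do have the value, which is the information loss the log message points at.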
Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90037&r1=90036&r2=90037&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Sat Nov 28 08:54:10 2009 @@ -1132,12 +1132,9 @@ // that predecessor. We can still do PRE of the load, which would insert // a computation of the pointer in this predecessor. if (PredPtr == 0) { - goto PredTranslationFailure; -#if 0 // TODO. Result.push_back(NonLocalDepEntry(Pred, MemDepResult::getClobber(Pred->getTerminator()))); continue; -#endif } // FIXME: it is entirely possible that PHI translating will end up with From sabre at nondot.org Sat Nov 28 09:12:42 2009 From: sabre at nondot.org (Chris Lattner) Date: Sat, 28 Nov 2009 15:12:42 -0000 Subject: [llvm-commits] [llvm] r90038 - in /llvm/trunk: lib/Analysis/ValueTracking.cpp test/Transforms/DeadStoreElimination/crash.ll Message-ID: <200911281512.nASFCgJu007898@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 09:12:41 2009 New Revision: 90038 URL: http://llvm.org/viewvc/llvm-project?rev=90038&view=rev Log: implement a FIXME: limit the depth that DecomposeGEPExpression goes the same way that getUnderlyingObject does it. This fixes the 'DecomposeGEPExpression and getUnderlyingObject disagree!' assertion on sqlite3. 
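Editorial note: the r90038 log limits how deep DecomposeGEPExpression walks, in the same way getUnderlyingObject does. A small sketch of that `do { ... } while (--MaxLookup)` pattern is below. The `Expr` type, the `decompose` function and the `kMaxLookup` constant are hypothetical stand-ins; only the limit of 6 and the loop shape come from the diff that follows.

```cpp
#include <cassert>

// Toy pointer expression: each level adds a constant offset to its base.
// This type is hypothetical and unrelated to the LLVM Operator classes.
struct Expr {
  const Expr *base;  // nullptr when this node is the underlying object
  long offset;       // constant offset contributed by this level
};

// Same bound as the MaxLookup value used in the commit.
const unsigned kMaxLookup = 6;

// Walks toward the underlying object, accumulating offsets, but looks through
// at most kMaxLookup levels. If the chain is deeper, the node reached so far
// is returned together with the offset accumulated up to that point.
const Expr *decompose(const Expr *e, long &baseOffs) {
  unsigned remaining = kMaxLookup;
  baseOffs = 0;
  do {
    if (e->base == nullptr)
      return e;  // reached the underlying object
    baseOffs += e->offset;
    e = e->base;
  } while (--remaining);

  // The chain of expressions is too deep: return early.
  return e;
}
```

The point of using the same bound in both walkers is that they stop at the same node on deep chains, which is what the 'disagree' assertion mentioned in the log checks.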
Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp llvm/trunk/test/Transforms/DeadStoreElimination/crash.ll Modified: llvm/trunk/lib/Analysis/ValueTracking.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ValueTracking.cpp?rev=90038&r1=90037&r2=90038&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/ValueTracking.cpp (original) +++ llvm/trunk/lib/Analysis/ValueTracking.cpp Sat Nov 28 09:12:41 2009 @@ -1028,9 +1028,11 @@ const Value *llvm::DecomposeGEPExpression(const Value *V, int64_t &BaseOffs, SmallVectorImpl > &VarIndices, const TargetData *TD) { - // FIXME: Should limit depth like getUnderlyingObject? + // Limit recursion depth to limit compile time in crazy cases. + unsigned MaxLookup = 6; + BaseOffs = 0; - while (1) { + do { // See if this is a bitcast or GEP. const Operator *Op = dyn_cast(V); if (Op == 0) { @@ -1128,7 +1130,10 @@ // Analyze the base pointer next. V = GEPOp->getOperand(0); - } + } while (--MaxLookup); + + // If the chain of expressions is too deep, just return early. 
+ return V; } Modified: llvm/trunk/test/Transforms/DeadStoreElimination/crash.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/DeadStoreElimination/crash.ll?rev=90038&r1=90037&r2=90038&view=diff ============================================================================== --- llvm/trunk/test/Transforms/DeadStoreElimination/crash.ll (original) +++ llvm/trunk/test/Transforms/DeadStoreElimination/crash.ll Sat Nov 28 09:12:41 2009 @@ -1,4 +1,4 @@ -; RUN: opt < %s -dse | llvm-dis +; RUN: opt < %s -dse -S target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128" target triple = "i386-apple-darwin10.0" @@ -24,3 +24,20 @@ %2 = tail call signext i8 @foo(i8 signext undef, i8 signext 1) nounwind ; [#uses=1] br label %bb } + +define fastcc i32 @test2() nounwind ssp { +bb14: ; preds = %bb4 + %0 = bitcast i8* undef to i8** ; [#uses=1] + %1 = getelementptr inbounds i8** %0, i64 undef ; [#uses=1] + %2 = bitcast i8** %1 to i16* ; [#uses=2] + %3 = getelementptr inbounds i16* %2, i64 undef ; [#uses=1] + %4 = bitcast i16* %3 to i8* ; [#uses=1] + %5 = getelementptr inbounds i8* %4, i64 undef ; [#uses=1] + %6 = getelementptr inbounds i16* %2, i64 undef ; [#uses=1] + store i16 undef, i16* %6, align 2 + %7 = getelementptr inbounds i8* %5, i64 undef ; [#uses=1] + call void @llvm.memcpy.i64(i8* %7, i8* undef, i64 undef, i32 1) nounwind + unreachable +} + +declare void @llvm.memcpy.i64(i8* nocapture, i8* nocapture, i64, i32) nounwind From sabre at nondot.org Sat Nov 28 09:39:14 2009 From: sabre at nondot.org (Chris Lattner) Date: Sat, 28 Nov 2009 15:39:14 -0000 Subject: [llvm-commits] [llvm] r90039 - in /llvm/trunk: include/llvm/Analysis/MemoryDependenceAnalysis.h lib/Analysis/MemoryDependenceAnalysis.cpp lib/Transforms/Scalar/GVN.cpp Message-ID: <200911281539.nASFdEW7009838@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 09:39:14 2009 New Revision: 90039 URL: 
http://llvm.org/viewvc/llvm-project?rev=90039&view=rev Log: Enhance InsertPHITranslatedPointer to be able to return a list of newly inserted instructions. No functionality change until someone starts using it. Modified: llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp llvm/trunk/lib/Transforms/Scalar/GVN.cpp Modified: llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h?rev=90039&r1=90038&r2=90039&view=diff ============================================================================== --- llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h (original) +++ llvm/trunk/include/llvm/Analysis/MemoryDependenceAnalysis.h Sat Nov 28 09:39:14 2009 @@ -261,11 +261,12 @@ /// InsertPHITranslatedPointer - Insert a computation of the PHI translated /// version of 'V' for the edge PredBB->CurBB into the end of the PredBB - /// block. + /// block. All newly created instructions are added to the NewInsts list. Value *InsertPHITranslatedPointer(Value *V, BasicBlock *CurBB, BasicBlock *PredBB, const TargetData *TD, - const DominatorTree &DT) const; + const DominatorTree &DT, + SmallVectorImpl &NewInsts) const; /// removeInstruction - Remove an instruction from the dependence analysis, /// updating the dependence of instructions that previously depended on it. 
Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90039&r1=90038&r2=90039&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Sat Nov 28 09:39:14 2009 @@ -879,14 +879,13 @@ /// InsertPHITranslatedPointer - Insert a computation of the PHI translated /// version of 'V' for the edge PredBB->CurBB into the end of the PredBB -/// block. +/// block. All newly created instructions are added to the NewInsts list. /// -/// This is only called when PHITranslatePointer returns a value that doesn't -/// dominate the block, so we don't need to handle the trivial cases here. Value *MemoryDependenceAnalysis:: InsertPHITranslatedPointer(Value *InVal, BasicBlock *CurBB, BasicBlock *PredBB, const TargetData *TD, - const DominatorTree &DT) const { + const DominatorTree &DT, + SmallVectorImpl &NewInsts) const { // See if we have a version of this value already available and dominating // PredBB. If so, there is no need to insert a new copy. if (Value *Res = GetAvailablePHITranslatedValue(InVal, CurBB, PredBB, TD, DT)) @@ -899,13 +898,15 @@ // Handle bitcast of PHI translatable value. if (BitCastInst *BC = dyn_cast(Inst)) { Value *OpVal = InsertPHITranslatedPointer(BC->getOperand(0), - CurBB, PredBB, TD, DT); + CurBB, PredBB, TD, DT, NewInsts); if (OpVal == 0) return 0; // Otherwise insert a bitcast at the end of PredBB. - return new BitCastInst(OpVal, InVal->getType(), - InVal->getName()+".phi.trans.insert", - PredBB->getTerminator()); + BitCastInst *New = new BitCastInst(OpVal, InVal->getType(), + InVal->getName()+".phi.trans.insert", + PredBB->getTerminator()); + NewInsts.push_back(New); + return New; } // Handle getelementptr with at least one PHI operand. 
@@ -914,7 +915,7 @@ BasicBlock *CurBB = GEP->getParent(); for (unsigned i = 0, e = GEP->getNumOperands(); i != e; ++i) { Value *OpVal = InsertPHITranslatedPointer(GEP->getOperand(i), - CurBB, PredBB, TD, DT); + CurBB, PredBB, TD, DT, NewInsts); if (OpVal == 0) return 0; GEPOps.push_back(OpVal); } @@ -924,6 +925,7 @@ InVal->getName()+".phi.trans.insert", PredBB->getTerminator()); Result->setIsInBounds(GEP->isInBounds()); + NewInsts.push_back(Result); return Result; } @@ -937,7 +939,7 @@ isa(Inst->getOperand(1))) { // PHI translate the LHS. Value *OpVal = InsertPHITranslatedPointer(Inst->getOperand(0), - CurBB, PredBB, TD, DT); + CurBB, PredBB, TD, DT, NewInsts); if (OpVal == 0) return 0; BinaryOperator *Res = BinaryOperator::CreateAdd(OpVal, Inst->getOperand(1), @@ -945,6 +947,7 @@ PredBB->getTerminator()); Res->setHasNoSignedWrap(cast(Inst)->hasNoSignedWrap()); Res->setHasNoUnsignedWrap(cast(Inst)->hasNoUnsignedWrap()); + NewInsts.push_back(Res); return Res; } #endif Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=90039&r1=90038&r2=90039&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Sat Nov 28 09:39:14 2009 @@ -1437,10 +1437,11 @@ // // FIXME: This may insert a computation, but we don't tell scalar GVN // optimization stuff about it. How do we do this? + SmallVector NewInsts; #if 0 Value *LoadPtr = MD->InsertPHITranslatedPointer(LI->getOperand(0), LoadBB, - UnavailablePred, TD, *DT); + UnavailablePred, TD, *DT, NewInsts); #else Value *LoadPtr = MD->GetAvailablePHITranslatedValue(LI->getOperand(0), LoadBB, @@ -1465,6 +1466,7 @@ // we do not have this case. Otherwise, check that the load is safe to // put anywhere; this can be improved, but should be conservatively safe. if (!allSingleSucc && + // FIXME: REEVALUTE THIS. 
!isSafeToLoadUnconditionally(LoadPtr, UnavailablePred->getTerminator())) return false; From sabre at nondot.org Sat Nov 28 10:08:18 2009 From: sabre at nondot.org (Chris Lattner) Date: Sat, 28 Nov 2009 16:08:18 -0000 Subject: [llvm-commits] [llvm] r90041 - in /llvm/trunk: lib/Transforms/Scalar/GVN.cpp test/Transforms/GVN/pre-load.ll Message-ID: <200911281608.nASG8IMi012208@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 10:08:18 2009 New Revision: 90041 URL: http://llvm.org/viewvc/llvm-project?rev=90041&view=rev Log: reenable load address insertion in load pre. This allows us to handle cases like this: void test(int N, double* G) { long j; for (j = 1; j < N - 1; j++) G[j+1] = G[j] + G[j+1]; } where G[1] isn't live into the loop. Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp llvm/trunk/test/Transforms/GVN/pre-load.ll Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=90041&r1=90040&r2=90041&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Sat Nov 28 10:08:18 2009 @@ -1438,16 +1438,19 @@ // FIXME: This may insert a computation, but we don't tell scalar GVN // optimization stuff about it. How do we do this? SmallVector NewInsts; -#if 0 - Value *LoadPtr = - MD->InsertPHITranslatedPointer(LI->getOperand(0), LoadBB, - UnavailablePred, TD, *DT, NewInsts); -#else - Value *LoadPtr = - MD->GetAvailablePHITranslatedValue(LI->getOperand(0), LoadBB, - UnavailablePred, TD, *DT); -#endif + Value *LoadPtr = 0; + // If all preds have a single successor, then we know it is safe to insert the + // load on the pred (?!?), so we can insert code to materialize the pointer if + // it is not available. 
+ if (allSingleSucc) { + LoadPtr = MD->InsertPHITranslatedPointer(LI->getOperand(0), LoadBB, + UnavailablePred, TD, *DT,NewInsts); + } else { + LoadPtr = MD->GetAvailablePHITranslatedValue(LI->getOperand(0), LoadBB, + UnavailablePred, TD, *DT); + } + // If we couldn't find or insert a computation of this phi translated value, // we fail PRE. if (LoadPtr == 0) { @@ -1467,14 +1470,19 @@ // put anywhere; this can be improved, but should be conservatively safe. if (!allSingleSucc && // FIXME: REEVALUTE THIS. - !isSafeToLoadUnconditionally(LoadPtr, UnavailablePred->getTerminator())) + !isSafeToLoadUnconditionally(LoadPtr, UnavailablePred->getTerminator())) { + assert(NewInsts.empty() && "Should not have inserted instructions"); return false; + } // Okay, we can eliminate this load by inserting a reload in the predecessor // and using PHI construction to get the value in the other predecessors, do // it. DEBUG(errs() << "GVN REMOVING PRE LOAD: " << *LI << '\n'); - + DEBUG(if (!NewInsts.empty()) + errs() << "INSERTED " << NewInsts.size() << " INSTS: " + << *NewInsts.back() << '\n'); + Value *NewLoad = new LoadInst(LoadPtr, LI->getName()+".pre", false, LI->getAlignment(), UnavailablePred->getTerminator()); Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-load.ll?rev=90041&r1=90040&r2=90041&view=diff ============================================================================== --- llvm/trunk/test/Transforms/GVN/pre-load.ll (original) +++ llvm/trunk/test/Transforms/GVN/pre-load.ll Sat Nov 28 10:08:18 2009 @@ -86,9 +86,9 @@ block2: br label %block4 -; HECK: block2: -; HECK: load i32* -; HECK: br label %block4 +; CHECK: block2: +; CHECK: load i32* +; CHECK: br label %block4 block3: %B = getelementptr i32* %q, i32 1 @@ -103,10 +103,10 @@ %P3 = getelementptr i32* %P2, i32 1 %PRE = load i32* %P3 ret i32 %PRE -; HECK: block4: -; HECK-NEXT: phi i32 [ -; HECK-NOT: load -; HECK: ret i32 +; CHECK: block4: 
+; CHECK-NEXT: phi i32 [ +; CHECK-NOT: load +; CHECK: ret i32 } ;void test5(int N, double *G) { @@ -248,9 +248,9 @@ block2: br label %block4 -; HECK: block2: -; HECK: load i32* -; HECK: br label %block4 +; CHECK: block2: +; CHECK: load i32* +; CHECK: br label %block4 block3: %A = getelementptr i32* %p, i32 1 @@ -262,10 +262,10 @@ %P3 = getelementptr i32* %P2, i32 1 %PRE = load i32* %P3 ret i32 %PRE -; HECK: block4: -; HECK-NEXT: phi i32 [ -; HECK-NOT: load -; HECK: ret i32 +; CHECK: block4: +; CHECK-NEXT: phi i32 [ +; CHECK-NOT: load +; CHECK: ret i32 } From baldrick at free.fr Sat Nov 28 10:23:56 2009 From: baldrick at free.fr (Duncan Sands) Date: Sat, 28 Nov 2009 17:23:56 +0100 Subject: [llvm-commits] [llvm] r90038 - in /llvm/trunk: lib/Analysis/ValueTracking.cpp test/Transforms/DeadStoreElimination/crash.ll In-Reply-To: <200911281512.nASFCgJu007898@zion.cs.uiuc.edu> References: <200911281512.nASFCgJu007898@zion.cs.uiuc.edu> Message-ID: <4B114E9C.2060801@free.fr> Hi Chris, > + unsigned MaxLookup = 6; maybe this should be a symbolic constant shared with getUnderlyingObject? That said, the testcase should stop them getting out of sync. Ciao, Duncan. From nicholas at mxc.ca Sat Nov 28 14:22:49 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Sat, 28 Nov 2009 12:22:49 -0800 Subject: [llvm-commits] patch: make memdep scan memory use intrinsics In-Reply-To: <4B0A48A2.3030306@free.fr> References: <4B09F47B.1040008@mxc.ca> <4B0A48A2.3030306@free.fr> Message-ID: <4B118699.9090106@mxc.ca> Duncan Sands wrote: > Hi Nick, > >> + case Intrinsic::lifetime_start: >> + case Intrinsic::invariant_start: >> + case Intrinsic::invariant_end: >> + MemPtr = QueryInst->getOperand(1); >> + MemSize = cast(QueryInst->getOperand(0))->getZExtValue(); > > isn't the pointer operand 2, and the size operand 1? Yes. >> + break; >> + case Intrinsic::lifetime_end: >> + MemPtr = QueryInst->getOperand(2); >> + MemSize = cast(QueryInst->getOperand(1))->getZExtValue(); > > And here operands 3 and 2? 
Worse! It was supposed to be Intrinsic::invariant_end here, not lifetime_end! I've fixed this part and intend to commit it shortly. Nick From nicholas at mxc.ca Sat Nov 28 15:27:49 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Sat, 28 Nov 2009 21:27:49 -0000 Subject: [llvm-commits] [llvm] r90045 - in /llvm/trunk: lib/Analysis/MemoryDependenceAnalysis.cpp test/Transforms/DeadStoreElimination/lifetime.ll Message-ID: <200911282127.nASLRoum025712@zion.cs.uiuc.edu> Author: nicholas Date: Sat Nov 28 15:27:49 2009 New Revision: 90045 URL: http://llvm.org/viewvc/llvm-project?rev=90045&view=rev Log: Teach memdep to look for memory use intrinsics during dependency queries. Fixes PR5574. Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp llvm/trunk/test/Transforms/DeadStoreElimination/lifetime.ll Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90045&r1=90044&r2=90045&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Sat Nov 28 15:27:49 2009 @@ -201,7 +201,7 @@ // If we reach a lifetime begin or end marker, then the query ends here // because the value is undefined. } else if (II->getIntrinsicID() == Intrinsic::lifetime_start || - II->getIntrinsicID() == Intrinsic::lifetime_end) { + II->getIntrinsicID() == Intrinsic::lifetime_end) { uint64_t invariantSize = ~0ULL; if (ConstantInt *CI = dyn_cast(II->getOperand(1))) invariantSize = CI->getZExtValue(); @@ -369,20 +369,41 @@ // calls to free() erase the entire structure, not just a field. 
MemSize = ~0UL; } else if (isa(QueryInst) || isa(QueryInst)) { - CallSite QueryCS = CallSite::get(QueryInst); - bool isReadOnly = AA->onlyReadsMemory(QueryCS); - LocalCache = getCallSiteDependencyFrom(QueryCS, isReadOnly, ScanPos, - QueryParent); + int IntrinsicID = 0; // Intrinsic IDs start at 1. + if (IntrinsicInst *II = dyn_cast(QueryInst)) + IntrinsicID = II->getIntrinsicID(); + + switch (IntrinsicID) { + case Intrinsic::lifetime_start: + case Intrinsic::lifetime_end: + case Intrinsic::invariant_start: + MemPtr = QueryInst->getOperand(2); + MemSize = cast(QueryInst->getOperand(1))->getZExtValue(); + break; + case Intrinsic::invariant_end: + MemPtr = QueryInst->getOperand(3); + MemSize = cast(QueryInst->getOperand(2))->getZExtValue(); + break; + default: + CallSite QueryCS = CallSite::get(QueryInst); + bool isReadOnly = AA->onlyReadsMemory(QueryCS); + LocalCache = getCallSiteDependencyFrom(QueryCS, isReadOnly, ScanPos, + QueryParent); + } } else { // Non-memory instruction. LocalCache = MemDepResult::getClobber(--BasicBlock::iterator(ScanPos)); } // If we need to do a pointer scan, make it happen. - if (MemPtr) - LocalCache = getPointerDependencyFrom(MemPtr, MemSize, - isa(QueryInst), - ScanPos, QueryParent); + if (MemPtr) { + bool isLoad = !QueryInst->mayWriteToMemory(); + if (IntrinsicInst *II = dyn_cast(QueryInst)) { + isLoad |= II->getIntrinsicID() == Intrinsic::lifetime_end; + } + LocalCache = getPointerDependencyFrom(MemPtr, MemSize, isLoad, ScanPos, + QueryParent); + } // Remember the result! 
if (Instruction *I = LocalCache.getInst()) Modified: llvm/trunk/test/Transforms/DeadStoreElimination/lifetime.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/DeadStoreElimination/lifetime.ll?rev=90045&r1=90044&r2=90045&view=diff ============================================================================== --- llvm/trunk/test/Transforms/DeadStoreElimination/lifetime.ll (original) +++ llvm/trunk/test/Transforms/DeadStoreElimination/lifetime.ll Sat Nov 28 15:27:49 2009 @@ -1,6 +1,9 @@ ; RUN: opt -S -dse < %s | FileCheck %s -declare void @llvm.lifetime.end(i64, i8*) +target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128" + +declare void @llvm.lifetime.start(i64, i8* nocapture) nounwind +declare void @llvm.lifetime.end(i64, i8* nocapture) nounwind declare void @llvm.memset.i8(i8*, i8, i8, i32) define void @test1() { @@ -17,3 +20,18 @@ ret void ; CHECK: ret void } + +define void @test2(i32* %P) { +; CHECK: test2 + %Q = getelementptr i32* %P, i32 1 + %R = bitcast i32* %Q to i8* + call void @llvm.lifetime.start(i64 4, i8* %R) +; CHECK: lifetime.start + store i32 0, i32* %Q ;; This store is dead. +; CHECK-NOT: store + call void @llvm.lifetime.end(i64 4, i8* %R) +; CHECK: lifetime.end + ret void +} + + From sabre at nondot.org Sat Nov 28 18:51:18 2009 From: sabre at nondot.org (Chris Lattner) Date: Sun, 29 Nov 2009 00:51:18 -0000 Subject: [llvm-commits] [llvm] r90046 - in /llvm/trunk: lib/Transforms/Scalar/InstructionCombining.cpp test/Transforms/InstCombine/or.ll Message-ID: <200911290051.nAT0pI99031862@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 18:51:17 2009 New Revision: 90046 URL: http://llvm.org/viewvc/llvm-project?rev=90046&view=rev Log: Implement PR5634. 
Modified: llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp llvm/trunk/test/Transforms/InstCombine/or.ll Modified: llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp?rev=90046&r1=90045&r2=90046&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp Sat Nov 28 18:51:17 2009 @@ -4067,6 +4067,21 @@ /// FoldAndOfICmps - Fold (icmp)&(icmp) if possible. Instruction *InstCombiner::FoldAndOfICmps(Instruction &I, ICmpInst *LHS, ICmpInst *RHS) { + // (icmp eq A, null) & (icmp eq B, null) --> + // (icmp eq (ptrtoint(A)|ptrtoint(B)), 0) + if (TD && + LHS->getPredicate() == ICmpInst::ICMP_EQ && + RHS->getPredicate() == ICmpInst::ICMP_EQ && + isa(LHS->getOperand(1)) && + isa(RHS->getOperand(1))) { + const Type *IntPtrTy = TD->getIntPtrType(I.getContext()); + Value *A = Builder->CreatePtrToInt(LHS->getOperand(0), IntPtrTy); + Value *B = Builder->CreatePtrToInt(RHS->getOperand(0), IntPtrTy); + Value *NewOr = Builder->CreateOr(A, B); + return new ICmpInst(ICmpInst::ICMP_EQ, NewOr, + Constant::getNullValue(IntPtrTy)); + } + Value *Val, *Val2; ConstantInt *LHSCst, *RHSCst; ICmpInst::Predicate LHSCC, RHSCC; @@ -4078,12 +4093,20 @@ m_ConstantInt(RHSCst)))) return 0; - // (icmp ult A, C) & (icmp ult B, C) --> (icmp ult (A|B), C) - // where C is a power of 2 - if (LHSCst == RHSCst && LHSCC == RHSCC && LHSCC == ICmpInst::ICMP_ULT && - LHSCst->getValue().isPowerOf2()) { - Value *NewOr = Builder->CreateOr(Val, Val2); - return new ICmpInst(LHSCC, NewOr, LHSCst); + if (LHSCst == RHSCst && LHSCC == RHSCC) { + // (icmp ult A, C) & (icmp ult B, C) --> (icmp ult (A|B), C) + // where C is a power of 2 + if (LHSCC == ICmpInst::ICMP_ULT && + LHSCst->getValue().isPowerOf2()) { + Value *NewOr = Builder->CreateOr(Val, Val2); 
+ return new ICmpInst(LHSCC, NewOr, LHSCst); + } + + // (icmp eq A, 0) & (icmp eq B, 0) --> (icmp eq (A|B), 0) + if (LHSCC == ICmpInst::ICMP_EQ && LHSCst->isZero()) { + Value *NewOr = Builder->CreateOr(Val, Val2); + return new ICmpInst(LHSCC, NewOr, LHSCst); + } } // From here on, we only handle: @@ -4739,16 +4762,37 @@ /// FoldOrOfICmps - Fold (icmp)|(icmp) if possible. Instruction *InstCombiner::FoldOrOfICmps(Instruction &I, ICmpInst *LHS, ICmpInst *RHS) { + // (icmp ne A, null) | (icmp ne B, null) --> + // (icmp ne (ptrtoint(A)|ptrtoint(B)), 0) + if (TD && + LHS->getPredicate() == ICmpInst::ICMP_NE && + RHS->getPredicate() == ICmpInst::ICMP_NE && + isa(LHS->getOperand(1)) && + isa(RHS->getOperand(1))) { + const Type *IntPtrTy = TD->getIntPtrType(I.getContext()); + Value *A = Builder->CreatePtrToInt(LHS->getOperand(0), IntPtrTy); + Value *B = Builder->CreatePtrToInt(RHS->getOperand(0), IntPtrTy); + Value *NewOr = Builder->CreateOr(A, B); + return new ICmpInst(ICmpInst::ICMP_NE, NewOr, + Constant::getNullValue(IntPtrTy)); + } + Value *Val, *Val2; ConstantInt *LHSCst, *RHSCst; ICmpInst::Predicate LHSCC, RHSCC; // This only handles icmp of constants: (icmp1 A, C1) | (icmp2 B, C2). - if (!match(LHS, m_ICmp(LHSCC, m_Value(Val), - m_ConstantInt(LHSCst))) || - !match(RHS, m_ICmp(RHSCC, m_Value(Val2), - m_ConstantInt(RHSCst)))) + if (!match(LHS, m_ICmp(LHSCC, m_Value(Val), m_ConstantInt(LHSCst))) || + !match(RHS, m_ICmp(RHSCC, m_Value(Val2), m_ConstantInt(RHSCst)))) return 0; + + + // (icmp ne A, 0) | (icmp ne B, 0) --> (icmp ne (A|B), 0) + if (LHSCst == RHSCst && LHSCC == RHSCC && + LHSCC == ICmpInst::ICMP_NE && LHSCst->isZero()) { + Value *NewOr = Builder->CreateOr(Val, Val2); + return new ICmpInst(LHSCC, NewOr, LHSCst); + } // From here on, we only handle: // (icmp1 A, C1) | (icmp2 A, C2) --> something simpler. 
Modified: llvm/trunk/test/Transforms/InstCombine/or.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/or.ll?rev=90046&r1=90045&r2=90046&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/or.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/or.ll Sat Nov 28 18:51:17 2009 @@ -1,7 +1,8 @@ ; This test makes sure that these instructions are properly eliminated. -; ; RUN: opt < %s -instcombine -S | FileCheck %s +target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128" + define i32 @test1(i32 %A) { %B = or i32 %A, 0 ret i32 %B @@ -253,3 +254,44 @@ ; CHECK-NEXT: %F = and i1 ; CHECK-NEXT: ret i1 %F } + +; PR5634 +define i1 @test26(i32 %A, i32 %B) { + %C1 = icmp eq i32 %A, 0 + %C2 = icmp eq i32 %B, 0 + ; (A == 0) & (A == 0) --> (A|B) == 0 + %D = and i1 %C1, %C2 + ret i1 %D +; CHECK: @test26 +; CHECK: or i32 %A, %B +; CHECK: icmp eq i32 {{.*}}, 0 +; CHECK: ret i1 +} + +; PR5634 +define i1 @test27(i32* %A, i32* %B) { + %C1 = icmp eq i32* %A, null + %C2 = icmp eq i32* %B, null + ; (A == 0) & (A == 0) --> (A|B) == 0 + %D = and i1 %C1, %C2 + ret i1 %D +; CHECK: @test27 +; CHECK: ptrtoint i32* %A +; CHECK: ptrtoint i32* %B +; CHECK: or i32 +; CHECK: icmp eq i32 {{.*}}, 0 +; CHECK: ret i1 +} + +; PR5634 +define i1 @test28(i32 %A, i32 %B) { + %C1 = icmp ne i32 %A, 0 + %C2 = icmp ne i32 %B, 0 + ; (A != 0) | (A != 0) --> (A|B) != 0 + %D = or i1 %C1, %C2 + ret i1 %D +; CHECK: @test28 +; CHECK: or i32 %A, %B +; CHECK: icmp ne i32 {{.*}}, 0 +; CHECK: ret i1 +} From sabre at nondot.org Sat Nov 28 19:04:40 2009 From: sabre at nondot.org (Chris Lattner) Date: Sun, 29 Nov 2009 01:04:40 -0000 Subject: [llvm-commits] [llvm] r90047 - /llvm/trunk/test/Transforms/GVN/pre-load.ll Message-ID: <200911290104.nAT14eCn032299@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 19:04:40 2009 New 
Revision: 90047 URL: http://llvm.org/viewvc/llvm-project?rev=90047&view=rev Log: add a testcase for void test9(int N, double* G) { long j; for (j = 1; j < N - 1; j++) G[j+1] = G[j] + G[j+1]; } Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-load.ll?rev=90047&r1=90046&r2=90047&view=diff ============================================================================== --- llvm/trunk/test/Transforms/GVN/pre-load.ll (original) +++ llvm/trunk/test/Transforms/GVN/pre-load.ll Sat Nov 28 19:04:40 2009 @@ -268,4 +268,49 @@ ; CHECK: ret i32 } +;void test9(int N, double* G) { +; long j; +; for (j = 1; j < N - 1; j++) +; G[j+1] = G[j] + G[j+1]; +;} + +; This requires phi translation of the adds. +define void @test9(i32 %N, double* nocapture %G) nounwind ssp { +entry: + add i32 0, 0 + %1 = add i32 %N, -1 + %2 = icmp sgt i32 %1, 1 + br i1 %2, label %bb.nph, label %return + +bb.nph: + %tmp = sext i32 %1 to i64 + %tmp7 = add i64 %tmp, -1 + br label %bb + +; CHECK: bb.nph: +; CHECK: load double* +; CHECK: br label %bb + +bb: + %indvar = phi i64 [ 0, %bb.nph ], [ %tmp9, %bb ] + %tmp8 = add i64 %indvar, 2 + %scevgep = getelementptr double* %G, i64 %tmp8 + %tmp9 = add i64 %indvar, 1 + %scevgep10 = getelementptr double* %G, i64 %tmp9 + %3 = load double* %scevgep10, align 8 + %4 = load double* %scevgep, align 8 + %5 = fadd double %3, %4 + store double %5, double* %scevgep, align 8 + %exitcond = icmp eq i64 %tmp9, %tmp7 + br i1 %exitcond, label %return, label %bb + +; Should only be one load in the loop. 
+; CHECK: bb: +; CHECK: load double* +; CHECK-NOT: load double* +; CHECK: br i1 %exitcond + +return: + ret void +} From sabre at nondot.org Sat Nov 28 19:15:43 2009 From: sabre at nondot.org (Chris Lattner) Date: Sun, 29 Nov 2009 01:15:43 -0000 Subject: [llvm-commits] [llvm] r90048 - /llvm/trunk/test/Transforms/GVN/pre-load.ll Message-ID: <200911290115.nAT1FheS032602@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 19:15:43 2009 New Revision: 90048 URL: http://llvm.org/viewvc/llvm-project?rev=90048&view=rev Log: Add a testcase for: void test(int N, double* G) { long j; for (j = 1; j < N - 1; j++) G[j] = G[j] + G[j+1] + G[j-1]; } which we now compile to one load in the loop: LBB1_2: ## %bb movsd 16(%rsi,%rax,8), %xmm2 incq %rdx addsd %xmm2, %xmm1 addsd %xmm1, %xmm0 movapd %xmm2, %xmm1 movsd %xmm0, 8(%rsi,%rax,8) incq %rax cmpq %rcx, %rax jne LBB1_2 instead of: LBB1_2: ## %bb movsd 8(%rsi,%rax,8), %xmm0 addsd 16(%rsi,%rax,8), %xmm0 addsd (%rsi,%rax,8), %xmm0 movsd %xmm0, 8(%rsi,%rax,8) incq %rax cmpq %rcx, %rax jne LBB1_2 Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-load.ll?rev=90048&r1=90047&r2=90048&view=diff ============================================================================== --- llvm/trunk/test/Transforms/GVN/pre-load.ll (original) +++ llvm/trunk/test/Transforms/GVN/pre-load.ll Sat Nov 28 19:15:43 2009 @@ -314,3 +314,50 @@ ret void } +;void test10(int N, double* G) { +; long j; +; for (j = 1; j < N - 1; j++) +; G[j] = G[j] + G[j+1] + G[j-1]; +;} + +define void @test10(i32 %N, double* nocapture %G) nounwind ssp { +entry: + %0 = add i32 %N, -1 + %1 = icmp sgt i32 %0, 1 + br i1 %1, label %bb.nph, label %return + +bb.nph: + %tmp = sext i32 %0 to i64 + %tmp8 = add i64 %tmp, -1 + br label %bb +; CHECK: bb.nph: +; CHECK: load double* +; CHECK: load double* +; CHECK: br label %bb + + +bb: + %indvar = phi i64 [ 0, %bb.nph ], 
[ %tmp11, %bb ] + %scevgep = getelementptr double* %G, i64 %indvar + %tmp9 = add i64 %indvar, 2 + %scevgep10 = getelementptr double* %G, i64 %tmp9 + %tmp11 = add i64 %indvar, 1 + %scevgep12 = getelementptr double* %G, i64 %tmp11 + %2 = load double* %scevgep12, align 8 + %3 = load double* %scevgep10, align 8 + %4 = fadd double %2, %3 + %5 = load double* %scevgep, align 8 + %6 = fadd double %4, %5 + store double %6, double* %scevgep12, align 8 + %exitcond = icmp eq i64 %tmp11, %tmp8 + br i1 %exitcond, label %return, label %bb + +; Should only be one load in the loop. +; CHECK: bb: +; CHECK: load double* +; CHECK-NOT: load double* +; CHECK: br i1 %exitcond + +return: + ret void +} From sabre at nondot.org Sat Nov 28 19:28:58 2009 From: sabre at nondot.org (Chris Lattner) Date: Sun, 29 Nov 2009 01:28:58 -0000 Subject: [llvm-commits] [llvm] r90049 - /llvm/trunk/test/Transforms/GVN/pre-load.ll Message-ID: <200911290128.nAT1Swcg000737@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 19:28:58 2009 New Revision: 90049 URL: http://llvm.org/viewvc/llvm-project?rev=90049&view=rev Log: add PR# Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll Modified: llvm/trunk/test/Transforms/GVN/pre-load.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/GVN/pre-load.ll?rev=90049&r1=90048&r2=90049&view=diff ============================================================================== --- llvm/trunk/test/Transforms/GVN/pre-load.ll (original) +++ llvm/trunk/test/Transforms/GVN/pre-load.ll Sat Nov 28 19:28:58 2009 @@ -320,6 +320,7 @@ ; G[j] = G[j] + G[j+1] + G[j-1]; ;} +; PR5501 define void @test10(i32 %N, double* nocapture %G) nounwind ssp { entry: %0 = add i32 %N, -1 From sabre at nondot.org Sat Nov 28 20:19:52 2009 From: sabre at nondot.org (Chris Lattner) Date: Sun, 29 Nov 2009 02:19:52 -0000 Subject: [llvm-commits] [llvm] r90050 - /llvm/trunk/lib/Target/README.txt Message-ID: <200911290219.nAT2JqXd002589@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 20:19:52 
2009 New Revision: 90050 URL: http://llvm.org/viewvc/llvm-project?rev=90050&view=rev Log: update and consolidate the load pre notes. Modified: llvm/trunk/lib/Target/README.txt Modified: llvm/trunk/lib/Target/README.txt URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/README.txt?rev=90050&r1=90049&r2=90050&view=diff ============================================================================== --- llvm/trunk/lib/Target/README.txt (original) +++ llvm/trunk/lib/Target/README.txt Sat Nov 28 20:19:52 2009 @@ -1116,6 +1116,8 @@ //===---------------------------------------------------------------------===// +[STORE SINKING] + Store sinking: This code: void f (int n, int *cond, int *res) { @@ -1171,6 +1173,8 @@ //===---------------------------------------------------------------------===// +[STORE SINKING] + GCC PR37810 is an interesting case where we should sink load/store reload into the if block and outside the loop, so we don't reload/store it on the non-call path. @@ -1198,7 +1202,7 @@ //===---------------------------------------------------------------------===// -[LOAD PRE with NON-AVAILABLE ADDRESS] +[LOAD PRE CRIT EDGE SPLITTING] GCC PR37166: Sinking of loads prevents SROA'ing the "g" struct on the stack leading to excess stack traffic. This could be handled by GVN with some crazy @@ -1217,62 +1221,57 @@ %11 is partially redundant, an in BB2 it should have the value %8. -GCC PR33344 is a similar case. +GCC PR33344 and PR35287 are similar cases. //===---------------------------------------------------------------------===// +[LOAD PRE] + There are many load PRE testcases in testsuite/gcc.dg/tree-ssa/loadpre* in the -GCC testsuite. 
There are many pre testcases as ssa-pre-*.c +GCC testsuite, ones we don't get yet are (checked through loadpre25): + +[CRIT EDGE BREAKING] +loadpre3.c predcom-4.c + +[PRE OF READONLY CALL] +loadpre5.c + +[TURN SELECT INTO BRANCH] +loadpre14.c loadpre15.c + +actually a conditional increment: loadpre18.c loadpre19.c + + +//===---------------------------------------------------------------------===// + +[SCALAR PRE] +There are many PRE testcases in testsuite/gcc.dg/tree-ssa/ssa-pre-*.c in the +GCC testsuite. //===---------------------------------------------------------------------===// There are some interesting cases in testsuite/gcc.dg/tree-ssa/pred-comm* in the -GCC testsuite. For example, predcom-1.c is: +GCC testsuite. For example, we get the first example in predcom-1.c, but +miss the second one: - for (i = 2; i < 1000; i++) - fib[i] = (fib[i-1] + fib[i - 2]) & 0xffff; +unsigned fib[1000]; +unsigned avg[1000]; -which compiles into: +__attribute__ ((noinline)) +void count_averages(int n) { + int i; + for (i = 1; i < n; i++) + avg[i] = (((unsigned long) fib[i - 1] + fib[i] + fib[i + 1]) / 3) & 0xffff; +} -bb1: ; preds = %bb1, %bb1.thread - %indvar = phi i32 [ 0, %bb1.thread ], [ %0, %bb1 ] - %i.0.reg2mem.0 = add i32 %indvar, 2 - %0 = add i32 %indvar, 1 ; [#uses=3] - %1 = getelementptr [1000 x i32]* @fib, i32 0, i32 %0 - %2 = load i32* %1, align 4 ; [#uses=1] - %3 = getelementptr [1000 x i32]* @fib, i32 0, i32 %indvar - %4 = load i32* %3, align 4 ; [#uses=1] - %5 = add i32 %4, %2 ; [#uses=1] - %6 = and i32 %5, 65535 ; [#uses=1] - %7 = getelementptr [1000 x i32]* @fib, i32 0, i32 %i.0.reg2mem.0 - store i32 %6, i32* %7, align 4 - %exitcond = icmp eq i32 %0, 998 ; [#uses=1] - br i1 %exitcond, label %return, label %bb1 +which compiles into two loads instead of one in the loop. 
-This is basically: - LOAD fib[i+1] - LOAD fib[i] - STORE fib[i+2] - -instead of handling this as a loop or other xform, all we'd need to do is teach -load PRE to phi translate the %0 add (i+1) into the predecessor as (i'+1+1) = -(i'+2) (where i' is the previous iteration of i). This would find the store -which feeds it. +predcom-2.c is the same as predcom-1.c -predcom-2.c is apparently the same as predcom-1.c predcom-3.c is very similar but needs loads feeding each other instead of store->load. -predcom-4.c seems the same as the rest. - - -//===---------------------------------------------------------------------===// - -Other simple load PRE cases: -http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35287 [LPRE crit edge splitting] -http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34677 (licm does this, LPRE crit edge) - llvm-gcc t2.c -S -o - -O0 -emit-llvm | llvm-as | opt -mem2reg -simplifycfg -gvn | llvm-dis //===---------------------------------------------------------------------===// @@ -1305,7 +1304,7 @@ http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26629 With: llvm-gcc t2.c -S -o - -O0 -emit-llvm | llvm-as | opt -mem2reg -gvn -instcombine | llvm-dis -we miss it because we need 1) GEP PHI TRAN, 2) CRIT EDGE 3) MULTIPLE DIFFERENT +we miss it because we need 1) CRIT EDGE 2) MULTIPLE DIFFERENT VALS PRODUCED BY ONE BLOCK OVER DIFFERENT PATHS //===---------------------------------------------------------------------===// From sabre at nondot.org Sat Nov 28 20:44:33 2009 From: sabre at nondot.org (Chris Lattner) Date: Sun, 29 Nov 2009 02:44:33 -0000 Subject: [llvm-commits] [llvm] r90055 - /llvm/trunk/include/llvm/Intrinsics.td Message-ID: <200911290244.nAT2iXrj003405@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 20:44:33 2009 New Revision: 90055 URL: http://llvm.org/viewvc/llvm-project?rev=90055&view=rev Log: mark all the 'foo with overflow' intrinsics as readnone. 
Modified: llvm/trunk/include/llvm/Intrinsics.td Modified: llvm/trunk/include/llvm/Intrinsics.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Intrinsics.td?rev=90055&r1=90054&r2=90055&view=diff ============================================================================== --- llvm/trunk/include/llvm/Intrinsics.td (original) +++ llvm/trunk/include/llvm/Intrinsics.td Sat Nov 28 20:44:33 2009 @@ -341,19 +341,25 @@ // Expose the carry flag from add operations on two integrals. def int_sadd_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty], - [LLVMMatchType<0>, LLVMMatchType<0>]>; + [LLVMMatchType<0>, LLVMMatchType<0>], + [IntrNoMem]>; def int_uadd_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty], - [LLVMMatchType<0>, LLVMMatchType<0>]>; + [LLVMMatchType<0>, LLVMMatchType<0>], + [IntrNoMem]>; def int_ssub_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty], - [LLVMMatchType<0>, LLVMMatchType<0>]>; + [LLVMMatchType<0>, LLVMMatchType<0>], + [IntrNoMem]>; def int_usub_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty], - [LLVMMatchType<0>, LLVMMatchType<0>]>; + [LLVMMatchType<0>, LLVMMatchType<0>], + [IntrNoMem]>; def int_smul_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty], - [LLVMMatchType<0>, LLVMMatchType<0>]>; + [LLVMMatchType<0>, LLVMMatchType<0>], + [IntrNoMem]>; def int_umul_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty], - [LLVMMatchType<0>, LLVMMatchType<0>]>; + [LLVMMatchType<0>, LLVMMatchType<0>], + [IntrNoMem]>; //===------------------------- Atomic Intrinsics --------------------------===// // From sabre at nondot.org Sat Nov 28 20:57:29 2009 From: sabre at nondot.org (Chris Lattner) Date: Sun, 29 Nov 2009 02:57:29 -0000 Subject: [llvm-commits] [llvm] r90056 - in /llvm/trunk: lib/Transforms/Scalar/InstructionCombining.cpp test/Transforms/InstCombine/intrinsics.ll Message-ID: <200911290257.nAT2vTfK003822@zion.cs.uiuc.edu> Author: lattner Date: Sat Nov 28 20:57:29 2009 New Revision: 90056 URL: 
http://llvm.org/viewvc/llvm-project?rev=90056&view=rev Log: add testcases for the foo_with_overflow op xforms added recently and fix bugs exposed by the tests. Testcases from Alastair Lynn! Modified: llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp llvm/trunk/test/Transforms/InstCombine/intrinsics.ll Modified: llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp?rev=90056&r1=90055&r2=90056&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/InstructionCombining.cpp Sat Nov 28 20:57:29 2009 @@ -9934,9 +9934,9 @@ // Create a simple add instruction, and insert it into the struct. Instruction *Add = BinaryOperator::CreateAdd(LHS, RHS, "", &CI); Worklist.Add(Add); - Constant *V[2]; - V[0] = UndefValue::get(LHS->getType()); - V[1] = ConstantInt::getTrue(*Context); + Constant *V[] = { + UndefValue::get(LHS->getType()), ConstantInt::getTrue(*Context) + }; Constant *Struct = ConstantStruct::get(*Context, V, 2, false); return InsertValueInst::Create(Struct, Add, 0); } @@ -9946,9 +9946,9 @@ // Create a simple add instruction, and insert it into the struct. 
Instruction *Add = BinaryOperator::CreateNUWAdd(LHS, RHS, "", &CI); Worklist.Add(Add); - Constant *V[2]; - V[0] = UndefValue::get(LHS->getType()); - V[1] = ConstantInt::getFalse(*Context); + Constant *V[] = { + UndefValue::get(LHS->getType()), ConstantInt::getFalse(*Context) + }; Constant *Struct = ConstantStruct::get(*Context, V, 2, false); return InsertValueInst::Create(Struct, Add, 0); } @@ -9973,7 +9973,8 @@ // X + 0 -> {X, false} if (RHS->isZero()) { Constant *V[] = { - UndefValue::get(II->getType()), ConstantInt::getFalse(*Context) + UndefValue::get(II->getOperand(0)->getType()), + ConstantInt::getFalse(*Context) }; Constant *Struct = ConstantStruct::get(*Context, V, 2, false); return InsertValueInst::Create(Struct, II->getOperand(1), 0); @@ -9992,7 +9993,8 @@ // X - 0 -> {X, false} if (RHS->isZero()) { Constant *V[] = { - UndefValue::get(II->getType()), ConstantInt::getFalse(*Context) + UndefValue::get(II->getOperand(1)->getType()), + ConstantInt::getFalse(*Context) }; Constant *Struct = ConstantStruct::get(*Context, V, 2, false); return InsertValueInst::Create(Struct, II->getOperand(1), 0); @@ -10021,11 +10023,12 @@ // X * 1 -> {X, false} if (RHSI->equalsInt(1)) { - Constant *V[2]; - V[0] = UndefValue::get(II->getType()); - V[1] = ConstantInt::getFalse(*Context); + Constant *V[] = { + UndefValue::get(II->getOperand(1)->getType()), + ConstantInt::getFalse(*Context) + }; Constant *Struct = ConstantStruct::get(*Context, V, 2, false); - return InsertValueInst::Create(Struct, II->getOperand(1), 1); + return InsertValueInst::Create(Struct, II->getOperand(1), 0); } } break; Modified: llvm/trunk/test/Transforms/InstCombine/intrinsics.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/intrinsics.ll?rev=90056&r1=90055&r2=90056&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/intrinsics.ll (original) +++ 
llvm/trunk/test/Transforms/InstCombine/intrinsics.ll Sat Nov 28 20:57:29 2009 @@ -1,12 +1,79 @@ ; RUN: opt %s -instcombine -S | FileCheck %s -declare {i8, i1} @llvm.uadd.with.overflow.i8(i8, i8) +%overflow.result = type {i8, i1} + +declare %overflow.result @llvm.uadd.with.overflow.i8(i8, i8) +declare %overflow.result @llvm.umul.with.overflow.i8(i8, i8) define i8 @test1(i8 %A, i8 %B) { - %x = call {i8, i1} @llvm.uadd.with.overflow.i8(i8 %A, i8 %B) - %y = extractvalue {i8, i1} %x, 0 + %x = call %overflow.result @llvm.uadd.with.overflow.i8(i8 %A, i8 %B) + %y = extractvalue %overflow.result %x, 0 ret i8 %y ; CHECK: @test1 ; CHECK-NEXT: %y = add i8 %A, %B ; CHECK-NEXT: ret i8 %y } + +define i8 @test2(i8 %A, i8 %B, i1* %overflowPtr) { + %and.A = and i8 %A, 127 + %and.B = and i8 %B, 127 + %x = call %overflow.result @llvm.uadd.with.overflow.i8(i8 %and.A, i8 %and.B) + %y = extractvalue %overflow.result %x, 0 + %z = extractvalue %overflow.result %x, 1 + store i1 %z, i1* %overflowPtr + ret i8 %y +; CHECK: @test2 +; CHECK-NEXT: %and.A = and i8 %A, 127 +; CHECK-NEXT: %and.B = and i8 %B, 127 +; CHECK-NEXT: %1 = add nuw i8 %and.A, %and.B +; CHECK-NEXT: store i1 false, i1* %overflowPtr +; CHECK-NEXT: ret i8 %1 +} + +define i8 @test3(i8 %A, i8 %B, i1* %overflowPtr) { + %or.A = or i8 %A, -128 + %or.B = or i8 %B, -128 + %x = call %overflow.result @llvm.uadd.with.overflow.i8(i8 %or.A, i8 %or.B) + %y = extractvalue %overflow.result %x, 0 + %z = extractvalue %overflow.result %x, 1 + store i1 %z, i1* %overflowPtr + ret i8 %y +; CHECK: @test3 +; CHECK-NEXT: %or.A = or i8 %A, -128 +; CHECK-NEXT: %or.B = or i8 %B, -128 +; CHECK-NEXT: %1 = add i8 %or.A, %or.B +; CHECK-NEXT: store i1 true, i1* %overflowPtr +; CHECK-NEXT: ret i8 %1 +} + +define i8 @test4(i8 %A, i1* %overflowPtr) { + %x = call %overflow.result @llvm.uadd.with.overflow.i8(i8 undef, i8 %A) + %y = extractvalue %overflow.result %x, 0 + %z = extractvalue %overflow.result %x, 1 + store i1 %z, i1* %overflowPtr + ret i8 %y +; CHECK: 
@test4 +; CHECK-NEXT: ret i8 undef +} + +define i8 @test5(i8 %A, i1* %overflowPtr) { + %x = call %overflow.result @llvm.umul.with.overflow.i8(i8 0, i8 %A) + %y = extractvalue %overflow.result %x, 0 + %z = extractvalue %overflow.result %x, 1 + store i1 %z, i1* %overflowPtr + ret i8 %y +; CHECK: @test5 +; CHECK-NEXT: store i1 false, i1* %overflowPtr +; CHECK-NEXT: ret i8 0 +} + +define i8 @test6(i8 %A, i1* %overflowPtr) { + %x = call %overflow.result @llvm.umul.with.overflow.i8(i8 1, i8 %A) + %y = extractvalue %overflow.result %x, 0 + %z = extractvalue %overflow.result %x, 1 + store i1 %z, i1* %overflowPtr + ret i8 %y +; CHECK: @test6 +; CHECK-NEXT: store i1 false, i1* %overflowPtr +; CHECK-NEXT: ret i8 %A +} From clattner at apple.com Sat Nov 28 20:57:54 2009 From: clattner at apple.com (Chris Lattner) Date: Sat, 28 Nov 2009 18:57:54 -0800 Subject: [llvm-commits] Testcases for previous overflow intrinsics xforms In-Reply-To: <3E593936-C4FE-4CC6-86DD-7AFA7587CBB4@gmail.com> References: <3E593936-C4FE-4CC6-86DD-7AFA7587CBB4@gmail.com> Message-ID: <851FBC0D-6D87-4703-89F4-7FC6D68AE602@apple.com> Thanks, applied in r90056. -Chris On Nov 27, 2009, at 8:56 AM, Alastair Lynn wrote: > Hi- > > This patch contains test cases for the instcombine simplification patch I sent in a couple of weeks ago. > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From daniel at zuster.org Sun Nov 29 02:30:24 2009 From: daniel at zuster.org (Daniel Dunbar) Date: Sun, 29 Nov 2009 08:30:24 -0000 Subject: [llvm-commits] [llvm] r90065 - /llvm/trunk/utils/FileCheck/FileCheck.cpp Message-ID: <200911290830.nAT8UOOH016690@zion.cs.uiuc.edu> Author: ddunbar Date: Sun Nov 29 02:30:24 2009 New Revision: 90065 URL: http://llvm.org/viewvc/llvm-project?rev=90065&view=rev Log: Fix FileCheck crash when fuzzy scanning starting at the end of the file. 
Modified: llvm/trunk/utils/FileCheck/FileCheck.cpp Modified: llvm/trunk/utils/FileCheck/FileCheck.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/FileCheck/FileCheck.cpp?rev=90065&r1=90064&r2=90065&view=diff ============================================================================== --- llvm/trunk/utils/FileCheck/FileCheck.cpp (original) +++ llvm/trunk/utils/FileCheck/FileCheck.cpp Sun Nov 29 02:30:24 2009 @@ -398,7 +398,7 @@ } } - if (BestQuality < 50) { + if (Best != StringRef::npos && BestQuality < 50) { // Print the "possible intended match here" line if we found something // reasonable. SM.PrintMessage(SMLoc::getFromPointer(Buffer.data() + Best), From baldrick at free.fr Sun Nov 29 03:54:41 2009 From: baldrick at free.fr (Duncan Sands) Date: Sun, 29 Nov 2009 10:54:41 +0100 Subject: [llvm-commits] [llvm] r90045 - in /llvm/trunk: lib/Analysis/MemoryDependenceAnalysis.cpp test/Transforms/DeadStoreElimination/lifetime.ll In-Reply-To: <200911282127.nASLRoum025712@zion.cs.uiuc.edu> References: <200911282127.nASLRoum025712@zion.cs.uiuc.edu> Message-ID: <4B1244E1.2060500@free.fr> Hi Nick, > Teach memdep to look for memory use intrinsics during dependency queries. Fixes > PR5574. thanks for doing this. > - II->getIntrinsicID() == Intrinsic::lifetime_end) { > + II->getIntrinsicID() == Intrinsic::lifetime_end) { This line uses tab stops for indentation when it should use spaces. Ciao, Duncan. 
From baldrick at free.fr Sun Nov 29 04:14:53 2009 From: baldrick at free.fr (Duncan Sands) Date: Sun, 29 Nov 2009 10:14:53 -0000 Subject: [llvm-commits] [dragonegg] r90075 - in /dragonegg/trunk: llvm-abi.h llvm-backend.cpp llvm-convert.cpp llvm-internal.h llvm-types.cpp Message-ID: <200911291014.nATAErbH002965@zion.cs.uiuc.edu> Author: baldrick Date: Sun Nov 29 04:14:52 2009 New Revision: 90075 URL: http://llvm.org/viewvc/llvm-project?rev=90075&view=rev Log: The following testcase crashes dragonegg (and llvm-gcc): typedef __attribute__((aligned(16))) struct { unsigned long long w[3]; } UINT192; UINT192 ten2mk192M[] = { {{0xcddd6e04c0592104ULL, 0x0fcf80dc33721d53ULL, 0xa7c5ac471b478423ULL}}, {{0xcddd6e04c0592104ULL, 0x0fcf80dc33721d53ULL, 0xa7c5ac471b478423ULL}}, {{0xcddd6e04c0592104ULL, 0x0fcf80dc33721d53ULL, 0xa7c5ac471b478423ULL}} }; The reason is that gcc gives the array a size which is 8 bytes longer than what you would expect by multiplying the element size by the array length. It turns out that when the user increases the alignment of a type, then it does not round the size of the type by the user alignment, but if you form an array of that type then it does round the size of the array by the user alignment given to the array element. Go figure. Clang people might want to note that clang and gcc thus disagree as to what sizeof(ten2mk192M) is. This patch changes the way array types are converted: if the LLVM array is smaller than the gcc array, then it wraps the array in a struct with some padding after it to get the size right. Various places then need to be tweaked so as not to crash or give wrong results when this happens. The testcase is reduced from bid128.c in gcc-4.5. 
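The arithmetic behind the extra 8 bytes can be sketched as follows, assuming a 64-bit unsigned long long so that each UINT192 element is 24 bytes. roundUpTo and arrayTailPadding are hypothetical helpers written for illustration; they are not dragonegg or gcc functions, and they only model the rule described in the log (the array size is rounded up to the user alignment, the element size is not).

```cpp
#include <cassert>
#include <cstdint>

// Round Size up to the next multiple of Align. Align must be non-zero.
inline uint64_t roundUpTo(uint64_t Size, uint64_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  return (Size + Align - 1) / Align * Align;
}

// Hypothetical helper: bytes of tail padding needed after an array of NumElts
// elements of EltBytes each, when the array size (but not the element size)
// is rounded up to UserAlign.
inline uint64_t arrayTailPadding(uint64_t EltBytes, uint64_t NumElts,
                                 uint64_t UserAlign) {
  uint64_t Unpadded = EltBytes * NumElts;
  return roundUpTo(Unpadded, UserAlign) - Unpadded;
}

// Usage check for the testcase in the log: 3 elements of 24 bytes, aligned(16).
inline void demoArrayPadding() {
  assert(roundUpTo(72, 16) == 80);
  assert(arrayTailPadding(24, 3, 16) == 8);
  // An even number of elements already ends on a 16-byte boundary.
  assert(arrayTailPadding(24, 2, 16) == 0);
}
```

With three 24-byte elements and 16-byte alignment this gives 72 rounded up to 80, hence the 8 bytes of padding that the wrapper struct has to supply.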
Modified: dragonegg/trunk/llvm-abi.h dragonegg/trunk/llvm-backend.cpp dragonegg/trunk/llvm-convert.cpp dragonegg/trunk/llvm-internal.h dragonegg/trunk/llvm-types.cpp Modified: dragonegg/trunk/llvm-abi.h URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/llvm-abi.h?rev=90075&r1=90074&r2=90075&view=diff ============================================================================== --- dragonegg/trunk/llvm-abi.h (original) +++ dragonegg/trunk/llvm-abi.h Sun Nov 29 04:14:52 2009 @@ -537,6 +537,9 @@ (TREE_CODE(type) == QUAL_UNION_TYPE)) { HandleUnion(type, ScalarElts); } else if (TREE_CODE(type) == ARRAY_TYPE) { + // Array with padding? + if (isa<StructType>(Ty)) + Ty = cast<StructType>(Ty)->getTypeAtIndex(0U); const ArrayType *ATy = cast<ArrayType>(Ty); for (unsigned i = 0, e = ATy->getNumElements(); i != e; ++i) { C.EnterField(i, Ty); Modified: dragonegg/trunk/llvm-backend.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/llvm-backend.cpp?rev=90075&r1=90074&r2=90075&view=diff ============================================================================== --- dragonegg/trunk/llvm-backend.cpp (original) +++ dragonegg/trunk/llvm-backend.cpp Sun Nov 29 04:14:52 2009 @@ -276,10 +276,8 @@ // is the same as that of the given GCC declaration. 
static bool SizeOfGlobalMatchesDecl(GlobalValue *GV, tree decl) { const Type *Ty = GV->getType()->getElementType(); - if (!DECL_SIZE(decl) || !Ty->isSized()) + if (!isInt64(DECL_SIZE(decl), true) || !Ty->isSized()) return true; - if (!isInt64(DECL_SIZE(decl), true)) - return false; return TheTarget->getTargetData()->getTypeAllocSizeInBits(Ty) == getInt64(DECL_SIZE(decl), true); } Modified: dragonegg/trunk/llvm-convert.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/llvm-convert.cpp?rev=90075&r1=90074&r2=90075&view=diff ============================================================================== --- dragonegg/trunk/llvm-convert.cpp (original) +++ dragonegg/trunk/llvm-convert.cpp Sun Nov 29 04:14:52 2009 @@ -123,19 +123,19 @@ /// true, returns whether the value is non-negative and fits in a uint64_t. /// Always returns false for overflowed constants. bool isInt64(tree t, bool Unsigned) { + if (!t) + return false; if (HOST_BITS_PER_WIDE_INT == 64) return host_integerp(t, Unsigned) && !TREE_OVERFLOW (t); - else { - assert(HOST_BITS_PER_WIDE_INT == 32 && - "Only 32- and 64-bit hosts supported!"); - return - (TREE_CODE (t) == INTEGER_CST && !TREE_OVERFLOW (t)) - && ((TYPE_UNSIGNED(TREE_TYPE(t)) == Unsigned) || - // If the constant is signed and we want an unsigned result, check - // that the value is non-negative. If the constant is unsigned and - // we want a signed result, check it fits in 63 bits. - (HOST_WIDE_INT)TREE_INT_CST_HIGH(t) >= 0); - } + assert(HOST_BITS_PER_WIDE_INT == 32 && + "Only 32- and 64-bit hosts supported!"); + return + (TREE_CODE (t) == INTEGER_CST && !TREE_OVERFLOW (t)) + && ((TYPE_UNSIGNED(TREE_TYPE(t)) == Unsigned) || + // If the constant is signed and we want an unsigned result, check + // that the value is non-negative. If the constant is unsigned and + // we want a signed result, check it fits in 63 bits. + (HOST_WIDE_INT)TREE_INT_CST_HIGH(t) >= 0); } /// getInt64 - Extract the value of an INTEGER_CST as a 64 bit integer. 
If @@ -1746,22 +1746,9 @@ // Variable of fixed size that goes on the stack. Ty = ConvertType(type); } else { - // Dynamic-size object: must push space on the stack. - if (TREE_CODE(type) == ARRAY_TYPE - && isSequentialCompatible(type) - && TYPE_SIZE(type) == DECL_SIZE(decl)) { - Ty = ConvertType(TREE_TYPE(type)); // Get array element type. - // Compute the number of elements in the array. - Size = Emit(DECL_SIZE(decl), 0); - assert(!integer_zerop(TYPE_SIZE(TREE_TYPE(type))) - && "Array of positive size with elements of zero size!"); - Value *EltSize = Emit(TYPE_SIZE(TREE_TYPE(type)), 0); - Size = Builder.CreateUDiv(Size, EltSize, "len"); - } else { - // Compute the variable's size in bytes. - Size = Emit(DECL_SIZE_UNIT(decl), 0); - Ty = Type::getInt8Ty(Context); - } + // Compute the variable's size in bytes. + Size = Emit(DECL_SIZE_UNIT(decl), 0); + Ty = Type::getInt8Ty(Context); Size = Builder.CreateIntCast(Size, Type::getInt32Ty(Context), /*isSigned*/false); } @@ -5959,15 +5946,14 @@ // If we are indexing over a fixed-size type, just use a GEP. if (isSequentialCompatible(ArrayTreeType)) { - Value *Idx[2]; - Idx[0] = ConstantInt::get(IntPtrTy, 0); - Idx[1] = IndexVal; + // Avoid any assumptions about how the array type is represented in LLVM by + // doing the GEP on a pointer to the first array element. + const Type *EltTy = ConvertType(ElementType); + ArrayAddr = Builder.CreateBitCast(ArrayAddr, EltTy->getPointerTo()); Value *Ptr = POINTER_TYPE_OVERFLOW_UNDEFINED ? 
- Builder.CreateInBoundsGEP(ArrayAddr, Idx, Idx + 2) : - Builder.CreateGEP(ArrayAddr, Idx, Idx + 2); - - const Type *ElementTy = ConvertType(ElementType); - unsigned Alignment = MinAlign(ArrayAlign, TD.getABITypeAlignment(ElementTy)); + Builder.CreateInBoundsGEP(ArrayAddr, IndexVal) : + Builder.CreateGEP(ArrayAddr, IndexVal); + unsigned Alignment = MinAlign(ArrayAlign, TD.getABITypeAlignment(EltTy)); return LValue(Builder.CreateBitCast(Ptr, PointerType::getUnqual(ConvertType(TREE_TYPE(exp)))), Alignment); @@ -7496,8 +7482,7 @@ // Zero length array. if (ResultElts.empty()) - return ConstantArray::get( - cast(ConvertType(TREE_TYPE(exp))), ResultElts); + return Constant::getNullValue(ConvertType(TREE_TYPE(exp))); assert(SomeVal && "If we had some initializer, we should have some value!"); // Do a post-pass over all of the elements. We're taking care of two things @@ -7522,10 +7507,23 @@ return ConstantVector::get(ResultElts); } - if (AllEltsSameType) - return ConstantArray::get( - ArrayType::get(ElTy, ResultElts.size()), ResultElts); - return ConstantStruct::get(Context, ResultElts, false); + Constant *Res = AllEltsSameType ? + ConstantArray::get(ArrayType::get(ElTy, ResultElts.size()), ResultElts) : + ConstantStruct::get(Context, ResultElts, false); + + // If the array does not require extra padding, return it. + int64_t PadBits = getInt64(TYPE_SIZE(InitType), true) - + getTargetData().getTypeAllocSizeInBits(Res->getType()); + assert(PadBits >= 0 && "Supersized array initializer!"); + if (PadBits <= 0) + return Res; + + // Wrap the array in a struct with padding at the end. + Constant *PadElts[2]; + PadElts[0] = Res; + PadElts[1] = UndefValue::get(ArrayType::get(Type::getInt8Ty(Context), + PadBits / 8)); + return ConstantStruct::get(Context, PadElts, 2, false); } @@ -8180,8 +8178,6 @@ assert(isSequentialCompatible(TREE_TYPE(Array)) && "Global with variable size?"); - Constant *ArrayAddr; - // First subtract the lower bound, if any, in the type of the index. 
Constant *IndexVal = Convert(Index); tree LowerBound = array_ref_low_bound(exp); @@ -8190,19 +8186,19 @@ TheFolder->CreateSub(IndexVal, Convert(LowerBound)) : TheFolder->CreateNSWSub(IndexVal, Convert(LowerBound)); - ArrayAddr = EmitLV(Array); - const Type *IntPtrTy = getTargetData().getIntPtrType(Context); IndexVal = TheFolder->CreateIntCast(IndexVal, IntPtrTy, /*isSigned*/!TYPE_UNSIGNED(IndexType)); - Value *Idx[2]; - Idx[0] = ConstantInt::get(IntPtrTy, 0); - Idx[1] = IndexVal; + // Avoid any assumptions about how the array type is represented in LLVM by + // doing the GEP on a pointer to the first array element. + Constant *ArrayAddr = EmitLV(Array); + const Type *EltTy = ConvertType(TREE_TYPE(TREE_TYPE(Array))); + ArrayAddr = TheFolder->CreateBitCast(ArrayAddr, EltTy->getPointerTo()); return POINTER_TYPE_OVERFLOW_UNDEFINED ? - TheFolder->CreateInBoundsGetElementPtr(ArrayAddr, Idx, 2) : - TheFolder->CreateGetElementPtr(ArrayAddr, Idx, 2); + TheFolder->CreateInBoundsGetElementPtr(ArrayAddr, &IndexVal, 1) : + TheFolder->CreateGetElementPtr(ArrayAddr, &IndexVal, 1); } Constant *TreeConstantToLLVM::EmitLV_COMPONENT_REF(tree exp) { Modified: dragonegg/trunk/llvm-internal.h URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/llvm-internal.h?rev=90075&r1=90074&r2=90075&view=diff ============================================================================== --- dragonegg/trunk/llvm-internal.h (original) +++ dragonegg/trunk/llvm-internal.h Sun Nov 29 04:14:52 2009 @@ -216,7 +216,7 @@ /// isInt64 - Return true if t is an INTEGER_CST that fits in a 64 bit integer. /// If Unsigned is false, returns whether it fits in a int64_t. If Unsigned is /// true, returns whether the value is non-negative and fits in a uint64_t. -/// Always returns false for overflowed constants. +/// Always returns false for overflowed constants or if t is NULL. bool isInt64(tree_node *t, bool Unsigned); /// getInt64 - Extract the value of an INTEGER_CST as a 64 bit integer. 
If @@ -234,13 +234,42 @@ /// type and the corresponding LLVM SequentialType lay out their components /// identically in memory, so doing a GEP accesses the right memory location. /// We assume that objects without a known size do not. -bool isSequentialCompatible(tree_node *type); +inline bool isSequentialCompatible(tree_node *type) { + assert((TREE_CODE(type) == ARRAY_TYPE || + TREE_CODE(type) == POINTER_TYPE || + TREE_CODE(type) == REFERENCE_TYPE) && "not a sequential type!"); + // This relies on gcc types with constant size mapping to LLVM types with the + // same size. It is possible for the component type not to have a size: + // struct foo; extern foo bar[]; + return isInt64(TYPE_SIZE(TREE_TYPE(type)), true); +} /// OffsetIsLLVMCompatible - Return true if the given field is offset from the /// start of the record by a constant amount which is not humongously big. inline bool OffsetIsLLVMCompatible(tree_node *field_decl) { - return DECL_FIELD_OFFSET(field_decl) && - isInt64(DECL_FIELD_OFFSET(field_decl), true); + return isInt64(DECL_FIELD_OFFSET(field_decl), true); +} + +/// ArrayLengthOf - Returns the length of the given gcc array type, or ~0ULL if +/// the array has variable or unknown length. +inline uint64_t ArrayLengthOf(tree_node *type) { + assert(TREE_CODE(type) == ARRAY_TYPE && "Only for array types!"); + // If the element type has variable size and the array type has variable + // length, but by some miracle the product gives a constant size, then we + // also return ~0ULL here. I can live with this, and I bet you can too! + if (!isInt64(TYPE_SIZE(type), true) || + !isInt64(TYPE_SIZE(TREE_TYPE(type)), true)) + return ~0ULL; + // May return zero for arrays that gcc considers to have non-zero length, but + // only if the array type has zero size (this can happen if the element type + // has zero size), in which case the discrepancy doesn't matter. 
+ // + // If the user increased the alignment of the element type, then the size of + // the array type is rounded up by that alignment, but the size of the element + // is not. Since gcc requires the user alignment to be strictly smaller than + // the element size, this does not impact the length computation. + return integer_zerop(TYPE_SIZE(type)) ? 0 : getInt64(TYPE_SIZE(type), true) / + getInt64(TYPE_SIZE(TREE_TYPE(type)), true); } /// isBitfield - Returns whether to treat the specified field as a bitfield. Modified: dragonegg/trunk/llvm-types.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/llvm-types.cpp?rev=90075&r1=90074&r2=90075&view=diff ============================================================================== --- dragonegg/trunk/llvm-types.cpp (original) +++ dragonegg/trunk/llvm-types.cpp Sun Nov 29 04:14:52 2009 @@ -296,21 +296,6 @@ } } -/// isSequentialCompatible - Return true if the specified gcc array or pointer -/// type and the corresponding LLVM SequentialType lay out their components -/// identically in memory, so doing a GEP accesses the right memory location. -/// We assume that objects without a known size do not. -bool isSequentialCompatible(tree_node *type) { - assert((TREE_CODE(type) == ARRAY_TYPE || - TREE_CODE(type) == POINTER_TYPE || - TREE_CODE(type) == REFERENCE_TYPE) && "not a sequential type!"); - // This relies on gcc types with constant size mapping to LLVM types with the - // same size. It is possible for the component type not to have a size: - // struct foo; extern foo bar[]; - return TYPE_SIZE(TREE_TYPE(type)) && - isInt64(TYPE_SIZE(TREE_TYPE(type)), true); -} - /// isBitfield - Returns whether to treat the specified field as a bitfield. bool isBitfield(tree_node *field_decl) { tree type = DECL_BIT_FIELD_TYPE(field_decl); @@ -324,7 +309,7 @@ // Does not start on a byte boundary - must treat as a bitfield. 
return true; - if (!TYPE_SIZE(type) || !isInt64(TYPE_SIZE (type), true)) + if (!isInt64(TYPE_SIZE (type), true)) // No size or variable sized - play safe, treat as a bitfield. return true; @@ -537,12 +522,8 @@ // If the type does not overlap, don't bother checking below. - if (!TYPE_SIZE(type)) - // C-style variable length array? Be conservative. - return true; - if (!isInt64(TYPE_SIZE(type), true)) - // Negative size (!) or huge - be conservative. + // No size, negative size (!) or huge - be conservative. return true; if (!getInt64(TYPE_SIZE(type), true) || @@ -570,8 +551,10 @@ return true; case ARRAY_TYPE: { - unsigned EltSizeBits = TREE_INT_CST_LOW(TYPE_SIZE(TREE_TYPE(type))); - unsigned NumElts = cast(ConvertType(type))->getNumElements(); + uint64_t NumElts = ArrayLengthOf(type); + if (NumElts == ~0ULL) + return true; + unsigned EltSizeBits = getInt64(TYPE_SIZE(TREE_TYPE(type)), true); // Check each element for overlap. This is inelegant, but effective. for (unsigned i = 0; i != NumElts; ++i) @@ -849,51 +832,31 @@ if ((Ty = GET_TYPE_LLVM(type))) return Ty; - uint64_t ElementSize; - const Type *ElementTy; - if (isSequentialCompatible(type)) { - // The gcc element type maps to an LLVM type of the same size. - // Convert to an LLVM array of the converted element type. - ElementSize = getInt64(TYPE_SIZE(TREE_TYPE(type)), true); - ElementTy = ConvertType(TREE_TYPE(type)); - } else { - // The gcc element type has no size, or has variable size. Convert to an - // LLVM array of bytes. In the unlikely but theoretically possible case - // that the gcc array type has constant size, using an i8 for the element - // type ensures we can produce an LLVM array of the right size. - ElementSize = 8; - ElementTy = Type::getInt8Ty(Context); - } - - uint64_t NumElements; - if (!TYPE_SIZE(type)) { - // We get here if we have something that is declared to be an array with - // no dimension. 
This just becomes a zero length array of the element - // type, so 'int X[]' becomes '%X = external global [0 x i32]'. - // - // Note that this also affects new expressions, which return a pointer - // to an unsized array of elements. - NumElements = 0; - } else if (!isInt64(TYPE_SIZE(type), true)) { - // This handles cases like "int A[n]" which have a runtime constant - // number of elements, but is a compile-time variable. Since these - // are variable sized, we represent them as [0 x type]. - NumElements = 0; - } else if (integer_zerop(TYPE_SIZE(type))) { - // An array of zero length, or with an element type of zero size. - // Turn it into a zero length array of the element type. + const Type *ElementTy = ConvertType(TREE_TYPE(type)); + uint64_t NumElements = ArrayLengthOf(type); + + if (NumElements == ~0ULL) // Variable length array? NumElements = 0; - } else { - // Normal constant-size array. - assert(ElementSize - && "Array of positive size with elements of zero size!"); - NumElements = getInt64(TYPE_SIZE(type), true); - assert(!(NumElements % ElementSize) - && "Array size is not a multiple of the element size!"); - NumElements /= ElementSize; + + // Create the array type. + Ty = ArrayType::get(ElementTy, NumElements); + + // If the user increased the alignment of the array element type, then the + // size of the array is rounded up by that alignment even though the size + // of the array element type is not (!). Correct for this if necessary by + // adding padding. May also need padding if the element type has variable + // size and the array type has variable length, but by a miracle the product + // gives a constant size. 
+ if (isInt64(TYPE_SIZE(type), true)) { + uint64_t PadBits = getInt64(TYPE_SIZE(type), true) - + getTargetData().getTypeAllocSizeInBits(Ty); + if (PadBits) { + const Type *Padding = ArrayType::get(Type::getInt8Ty(Context), PadBits / 8); + Ty = StructType::get(Context, Ty, Padding, NULL); + } } - Ty = TypeDB.setType(type, ArrayType::get(ElementTy, NumElements)); + Ty = TypeDB.setType(type, Ty); break; } From baldrick at free.fr Sun Nov 29 08:18:34 2009 From: baldrick at free.fr (Duncan Sands) Date: Sun, 29 Nov 2009 14:18:34 -0000 Subject: [llvm-commits] [dragonegg] r90076 - in /dragonegg/trunk: darwin/llvm-os.h llvm-backend.cpp Message-ID: <200911291418.nATEIYo2002495@zion.cs.uiuc.edu> Author: baldrick Date: Sun Nov 29 08:18:33 2009 New Revision: 90076 URL: http://llvm.org/viewvc/llvm-project?rev=90076&view=rev Log: Port commit 89415 (void) from llvm-gcc: Adjust the alignment of strings so that they aren't over aligned. We only need them aligned to 8-bytes instead of 16-bytes for 64-bit. And 4 instead of 8 in 32-bit. Modified: dragonegg/trunk/darwin/llvm-os.h dragonegg/trunk/llvm-backend.cpp Modified: dragonegg/trunk/darwin/llvm-os.h URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/darwin/llvm-os.h?rev=90076&r1=90075&r2=90076&view=diff ============================================================================== --- dragonegg/trunk/darwin/llvm-os.h (original) +++ dragonegg/trunk/darwin/llvm-os.h Sun Nov 29 08:18:33 2009 @@ -42,4 +42,13 @@ argvec.push_back ("--relocation-model=static") #endif /* !TARGET_386 && !TARGET_ARM */ +/* Give a constant string a sufficient alignment for the platform. */ +/* radar 7291825 */ +#define TARGET_ADJUST_CSTRING_ALIGN(GV) \ + do { \ + if (GV->hasInternalLinkage()) { \ + GV->setAlignment(TARGET_64BIT ? 
8 : 4); \ + } \ + } while (0) + #endif /* LLVM_OS_H */ Modified: dragonegg/trunk/llvm-backend.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/llvm-backend.cpp?rev=90076&r1=90075&r2=90076&view=diff ============================================================================== --- dragonegg/trunk/llvm-backend.cpp (original) +++ dragonegg/trunk/llvm-backend.cpp Sun Nov 29 08:18:33 2009 @@ -1215,8 +1215,16 @@ unsigned TargetAlign = getTargetData().getABITypeAlignment(GV->getType()->getElementType()); if (DECL_USER_ALIGN(decl) || - 8 * TargetAlign < (unsigned)DECL_ALIGN(decl)) + 8 * TargetAlign < (unsigned)DECL_ALIGN(decl)) { GV->setAlignment(DECL_ALIGN(decl) / 8); + } +#ifdef TARGET_ADJUST_CSTRING_ALIGN + else if (DECL_INITIAL(decl) != error_mark_node && // uninitialized? + DECL_INITIAL(decl) && + TREE_CODE(DECL_INITIAL(decl)) == STRING_CST) { + TARGET_ADJUST_CSTRING_ALIGN(GV); + } +#endif } // Handle used decls From baldrick at free.fr Sun Nov 29 08:59:36 2009 From: baldrick at free.fr (Duncan Sands) Date: Sun, 29 Nov 2009 14:59:36 -0000 Subject: [llvm-commits] [dragonegg] r90079 - /dragonegg/trunk/llvm-debug.cpp Message-ID: <200911291459.nATExak2003832@zion.cs.uiuc.edu> Author: baldrick Date: Sun Nov 29 08:59:36 2009 New Revision: 90079 URL: http://llvm.org/viewvc/llvm-project?rev=90079&view=rev Log: Port commits 88701, 88967, 88970 and 89868 (dpatel) from llvm-gcc: - Avoid using c_str() on temp. string. - Use TrackingVH to hold debug info for a forward decl. - Fix context for static variable. - Use StringRef (again) in DebugInfo interface. 
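The first and last items of this log can be illustrated with a small sketch. It uses std::string_view as a stand-in for llvm::StringRef, and the Named class, getNameStr, getName and pickDisplayName are hypothetical names modelled loosely on the interface the patch touches; this is an assumption-laden illustration, not the dragonegg code.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Stand-in for an object that owns its name.
class Named {
  std::string Name;

public:
  explicit Named(std::string N) : Name(std::move(N)) {}
  // Returns a copy, so the result is a temporary at the call site.
  std::string getNameStr() const { return Name; }
  // Returns a non-owning view into storage that lives as long as *this.
  std::string_view getName() const { return Name; }
};

// Risky pattern, left as a comment on purpose because using the pointer
// afterwards is undefined behaviour:
//   const char *P = GV.getNameStr().c_str();  // temporary dies at the ';'
//
// Safer: start from a view of the owner's storage and override it only with
// another string whose lifetime is managed elsewhere.
inline std::string_view pickDisplayName(const Named &GV, const char *DeclName) {
  std::string_view DispName = GV.getName();
  if (DeclName)
    DispName = DeclName;
  return DispName;
}

// Usage check: fall back to the owner's name when no declared name exists.
inline void demoDisplayName() {
  Named GV("_ZL3fooi");
  assert(pickDisplayName(GV, nullptr) == "_ZL3fooi");
  assert(pickDisplayName(GV, "foo") == "foo");
}
```

A view type also gives "no name" a natural representation as an empty view, so callers test for emptiness instead of comparing against a null pointer.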
Modified: dragonegg/trunk/llvm-debug.cpp Modified: dragonegg/trunk/llvm-debug.cpp URL: http://llvm.org/viewvc/llvm-project/dragonegg/trunk/llvm-debug.cpp?rev=90079&r1=90078&r2=90079&view=diff ============================================================================== --- dragonegg/trunk/llvm-debug.cpp (original) +++ dragonegg/trunk/llvm-debug.cpp Sun Nov 29 08:59:36 2009 @@ -136,7 +136,7 @@ /// GetNodeName - Returns the name stored in a node regardless of whether the /// node is a TYPE or DECL. -static const char *GetNodeName(tree Node) { +static StringRef GetNodeName(tree Node) { tree Name = NULL; if (DECL_P(Node)) { @@ -150,11 +150,11 @@ return IDENTIFIER_POINTER(Name); } else if (TREE_CODE(Name) == TYPE_DECL && DECL_NAME(Name) && !DECL_IGNORED_P(Name)) { - return IDENTIFIER_POINTER(DECL_NAME(Name)); + return StringRef(IDENTIFIER_POINTER(DECL_NAME(Name))); } } - return NULL; + return StringRef(); } /// GetNodeLocation - Returns the location stored in a node regardless of @@ -195,12 +195,12 @@ return Location; } -static const char *getLinkageName(tree Node) { +static StringRef getLinkageName(tree Node) { // Use llvm value name as linkage name if it is available. if (DECL_LLVM_SET_P(Node)) { Value *V = DECL_LLVM(Node); - return V->getName().data(); + return V->getName(); } tree decl_name = DECL_NAME(Node); @@ -208,10 +208,10 @@ if (TREE_PUBLIC(Node) && DECL_ASSEMBLER_NAME(Node) != DECL_NAME(Node) && !DECL_ABSTRACT(Node)) { - return IDENTIFIER_POINTER(DECL_ASSEMBLER_NAME(Node)); + return StringRef(IDENTIFIER_POINTER(DECL_ASSEMBLER_NAME(Node))); } } - return NULL; + return StringRef(); } DebugInfo::DebugInfo(Module *m) @@ -231,7 +231,7 @@ BasicBlock *CurBB) { // Gather location information. 
expanded_location Loc = GetNodeLocation(FnDecl, false); - const char *LinkageName = getLinkageName(FnDecl); + StringRef LinkageName = getLinkageName(FnDecl); DISubprogram SP = DebugFactory.CreateSubprogram(findRegion(FnDecl), @@ -269,8 +269,6 @@ context = TYPE_MAIN_VARIANT (TREE_TYPE (TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (decl))))); - if (context && !TYPE_P (context)) - context = NULL_TREE; if (context != NULL_TREE) return findRegion(context); } @@ -361,14 +359,13 @@ // Gather location information. expanded_location Loc = expand_location(DECL_SOURCE_LOCATION(decl)); DIType TyD = getOrCreateType(TREE_TYPE(decl)); - std::string DispNameStr = GV->getNameStr(); - const char *DispName = DispNameStr.c_str(); + StringRef DispName = GV->getName(); if (DECL_NAME(decl)) { if (IDENTIFIER_POINTER(DECL_NAME(decl))) DispName = IDENTIFIER_POINTER(DECL_NAME(decl)); } - DebugFactory.CreateGlobalVariable(getOrCreateCompileUnit(Loc.file), + DebugFactory.CreateGlobalVariable(findRegion(decl), DispName, DispName, getLinkageName(decl), getOrCreateCompileUnit(Loc.file), Loc.line, @@ -379,7 +376,7 @@ /// createBasicType - Create BasicType. 
DIType DebugInfo::createBasicType(tree type) { - const char *TypeName = GetNodeName(type); + StringRef TypeName = GetNodeName(type); uint64_t Size = NodeSizeInBits(type); uint64_t Align = NodeAlignInBits(type); @@ -444,7 +441,7 @@ DebugFactory.GetOrCreateArray(EltTys.data(), EltTys.size()); return DebugFactory.CreateCompositeType(llvm::dwarf::DW_TAG_subroutine_type, - findRegion(type), NULL, + findRegion(type), StringRef(), getOrCreateCompileUnit(NULL), 0, 0, 0, 0, 0, llvm::DIType(), EltTypeArray); @@ -461,7 +458,7 @@ DW_TAG_reference_type; expanded_location Loc = GetNodeLocation(type); - const char *PName = FromTy.getName(); + StringRef PName = FromTy.getName(); return DebugFactory.CreateDerivedType(Tag, findRegion(type), PName, getOrCreateCompileUnit(NULL), 0 /*line no*/, @@ -518,7 +515,7 @@ DebugFactory.GetOrCreateArray(Subscripts.data(), Subscripts.size()); expanded_location Loc = GetNodeLocation(type); return DebugFactory.CreateCompositeType(llvm::dwarf::DW_TAG_array_type, - findRegion(type), NULL, + findRegion(type), StringRef(), getOrCreateCompileUnit(Loc.file), 0, NodeSizeInBits(type), NodeAlignInBits(type), 0, 0, @@ -599,13 +596,13 @@ /// also while creating FwdDecl for now. std::string FwdName; if (TYPE_CONTEXT(type)) { - const char *TypeContextName = GetNodeName(TYPE_CONTEXT(type)); - if (TypeContextName) + StringRef TypeContextName = GetNodeName(TYPE_CONTEXT(type)); + if (!TypeContextName.empty()) FwdName = TypeContextName; } - const char *TypeName = GetNodeName(type); - if (TypeName) - FwdName = FwdName + TypeName; + StringRef TypeName = GetNodeName(type); + if (!TypeName.empty()) + FwdName = FwdName + TypeName.data(); unsigned Flags = llvm::DIType::FlagFwdDecl; llvm::DICompositeType FwdDecl = DebugFactory.CreateCompositeType(Tag, @@ -622,6 +619,7 @@ return FwdDecl; // Insert into the TypeCache so that recursive uses will find it. 
+ llvm::TrackingVH<llvm::MDNode> FwdDeclNode = FwdDecl.getNode(); TypeCache[type] = WeakVH(FwdDecl.getNode()); // Convert all the elements. @@ -636,7 +634,7 @@ // FIXME : name, size, align etc... DIType DTy = DebugFactory.CreateDerivedType(DW_TAG_inheritance, - findRegion(type), NULL, + findRegion(type), StringRef(), llvm::DICompileUnit(), 0,0,0, getINTEGER_CSTVal(BINFO_OFFSET(BInfo)), 0, BaseClass); @@ -667,7 +665,7 @@ // Field type is the declared type of the field. tree FieldNodeType = FieldType(Member); DIType MemberType = getOrCreateType(FieldNodeType); - const char *MemberName = GetNodeName(Member); + StringRef MemberName = GetNodeName(Member); unsigned Flags = 0; if (TREE_PROTECTED(Member)) Flags = llvm::DIType::FlagProtected; @@ -700,7 +698,7 @@ expanded_location MemLoc = GetNodeLocation(Member, false); const char *MemberName = lang_hooks.dwarf_name(Member, 0); - const char *LinkageName = getLinkageName(Member); + StringRef LinkageName = getLinkageName(Member); DIType SPTy = getOrCreateType(TREE_TYPE(Member)); DISubprogram SP = DebugFactory.CreateSubprogram(findRegion(Member), MemberName, MemberName, @@ -724,7 +722,7 @@ // Now that we have a real decl for the struct, replace anything using the // old decl with the new one. This will recursively update the debug info. 
- FwdDecl.replaceAllUsesWith(RealDecl); + llvm::DIDerivedType(FwdDeclNode).replaceAllUsesWith(RealDecl); return RealDecl; } @@ -755,7 +753,7 @@ if (TYPE_VOLATILE(type)) { Ty = DebugFactory.CreateDerivedType(DW_TAG_volatile_type, - findRegion(type), NULL, + findRegion(type), StringRef(), getOrCreateCompileUnit(NULL), 0 /*line no*/, NodeSizeInBits(type), @@ -768,7 +766,7 @@ if (TYPE_READONLY(type)) Ty = DebugFactory.CreateDerivedType(DW_TAG_const_type, - findRegion(type), NULL, + findRegion(type), StringRef(), getOrCreateCompileUnit(NULL), 0 /*line no*/, NodeSizeInBits(type), @@ -935,7 +933,7 @@ DICompileUnit NewCU = DebugFactory.CreateCompileUnit(LangTag, FileName.c_str(), Directory.c_str(), version_string, isMain, - optimize, NULL, + optimize, StringRef(), ObjcRunTimeVer); CUCache[FullPath] = WeakVH(NewCU.getNode()); return NewCU; From espindola at google.com Sun Nov 29 10:34:30 2009 From: espindola at google.com (Rafael Espindola) Date: Sun, 29 Nov 2009 11:34:30 -0500 Subject: [llvm-commits] [patch] Implement no-canonical-prefixes. Message-ID: <38a0d8450911290834l4c32bba0m40da8daf624cf088@mail.gmail.com> Gcc's option -fno-canonical-prefixes is useful for the case where gcc and cc1 are symbolic links to different directories. The same is true for clang and clang-cc. For example, if we have bin/clang -> /foo/clang bin/clang-cc -> /bar/clang-cc clang will fail to find clang-cc since it will look for /foo/clang-cc. The attached patch implements -no-canonical-prefixes in the driver. For more information, see http://gcc.gnu.org/ml/gcc-patches/2009-07/msg00110.html Cheers, -- Rafael Ávila de Espíndola -------------- next part -------------- A non-text attachment was scrubbed... 
Name: no-canonical-prefixes.patch Type: text/x-diff Size: 2640 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091129/300cdc54/attachment.bin From kovarththanan.rajaratnam at gmail.com Sun Nov 29 04:44:06 2009 From: kovarththanan.rajaratnam at gmail.com (Kovarththanan Rajaratnam) Date: Sun, 29 Nov 2009 11:44:06 +0100 Subject: [llvm-commits] [PATCH] Handle realpath NULL pointer return value Message-ID: <4B125076.9020905@gmail.com> Hey, Is it OK if I commit the following: This patch ensures that Path::GetMainExecutable is able to handle the case where realpath() fails. When this occurs we segfault trying to create a std::string from a NULL pointer. Fixes PR5635. -- Best regards, Kovarththanan Rajaratnam -------------- next part -------------- A non-text attachment was scrubbed... Name: path_realpath.patch Type: text/x-patch Size: 926 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091129/9de41933/attachment.bin From nunoplopes at sapo.pt Sun Nov 29 10:52:17 2009 From: nunoplopes at sapo.pt (Nuno Lopes) Date: Sun, 29 Nov 2009 16:52:17 -0000 Subject: [llvm-commits] [PATCH] Handle realpath NULL pointer return value In-Reply-To: <4B125076.9020905@gmail.com> References: <4B125076.9020905@gmail.com> Message-ID: <03673C690AC64B52A57D97A09EA19AF3@pc07654> Looks good; please commit it. Nuno ----- Original Message ----- > Hey, > > Is it OK if I commit the following: > > This patch ensures that Path::GetMainExecutable is able to handle the > case where realpath() fails. When this occurs we segfault trying to > create a std::string from a NULL pointer. > > Fixes PR5635. 
> > -- > Best regards, > Kovarththanan Rajaratnam From kovarththanan.rajaratnam at gmail.com Sun Nov 29 11:19:49 2009 From: kovarththanan.rajaratnam at gmail.com (Kovarththanan Rajaratnam) Date: Sun, 29 Nov 2009 17:19:49 -0000 Subject: [llvm-commits] [llvm] r90082 - /llvm/trunk/lib/System/Unix/Path.inc Message-ID: <200911291719.nATHJnl2010837@zion.cs.uiuc.edu> Author: krj Date: Sun Nov 29 11:19:48 2009 New Revision: 90082 URL: http://llvm.org/viewvc/llvm-project?rev=90082&view=rev Log: This patch ensures that Path::GetMainExecutable is able to handle the case where realpath() fails. When this occurs we segfault trying to create a std::string from a NULL pointer. Fixes PR5635. Modified: llvm/trunk/lib/System/Unix/Path.inc Modified: llvm/trunk/lib/System/Unix/Path.inc URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Unix/Path.inc?rev=90082&r1=90081&r2=90082&view=diff ============================================================================== --- llvm/trunk/lib/System/Unix/Path.inc (original) +++ llvm/trunk/lib/System/Unix/Path.inc Sun Nov 29 11:19:48 2009 @@ -348,7 +348,9 @@ uint32_t size = sizeof(exe_path); if (_NSGetExecutablePath(exe_path, &size) == 0) { char link_path[MAXPATHLEN]; - return Path(std::string(realpath(exe_path, link_path))); + if (realpath(exe_path, link_path)) + return Path(std::string(link_path)); + return Path(); } #elif defined(__FreeBSD__) char exe_path[PATH_MAX]; @@ -370,7 +372,9 @@ // If the filename is a symlink, we need to resolve and return the location of // the actual executable. 
char link_path[MAXPATHLEN]; - return Path(std::string(realpath(DLInfo.dli_fname, link_path))); + if (realpath(DLInfo.dli_fname, link_path)) + return Path(std::string(link_path)); + return Path(); #endif return Path(); } From benny.kra at googlemail.com Sun Nov 29 11:42:58 2009 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Sun, 29 Nov 2009 17:42:58 -0000 Subject: [llvm-commits] [llvm] r90083 - /llvm/trunk/lib/System/Unix/Path.inc Message-ID: <200911291742.nATHgwm1011619@zion.cs.uiuc.edu> Author: d0k Date: Sun Nov 29 11:42:58 2009 New Revision: 90083 URL: http://llvm.org/viewvc/llvm-project?rev=90083&view=rev Log: Remove dead returns. Modified: llvm/trunk/lib/System/Unix/Path.inc Modified: llvm/trunk/lib/System/Unix/Path.inc URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/System/Unix/Path.inc?rev=90083&r1=90082&r2=90083&view=diff ============================================================================== --- llvm/trunk/lib/System/Unix/Path.inc (original) +++ llvm/trunk/lib/System/Unix/Path.inc Sun Nov 29 11:42:58 2009 @@ -350,7 +350,6 @@ char link_path[MAXPATHLEN]; if (realpath(exe_path, link_path)) return Path(std::string(link_path)); - return Path(); } #elif defined(__FreeBSD__) char exe_path[PATH_MAX]; @@ -374,7 +373,6 @@ char link_path[MAXPATHLEN]; if (realpath(DLInfo.dli_fname, link_path)) return Path(std::string(link_path)); - return Path(); #endif return Path(); } From nicholas at mxc.ca Sun Nov 29 12:10:45 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Sun, 29 Nov 2009 18:10:45 -0000 Subject: [llvm-commits] [llvm] r90085 - /llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Message-ID: <200911291810.nATIAj5m012577@zion.cs.uiuc.edu> Author: nicholas Date: Sun Nov 29 12:10:39 2009 New Revision: 90085 URL: http://llvm.org/viewvc/llvm-project?rev=90085&view=rev Log: Detabify. 
Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90085&r1=90084&r2=90085&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Sun Nov 29 12:10:39 2009 @@ -201,7 +201,7 @@ // If we reach a lifetime begin or end marker, then the query ends here // because the value is undefined. } else if (II->getIntrinsicID() == Intrinsic::lifetime_start || - II->getIntrinsicID() == Intrinsic::lifetime_end) { + II->getIntrinsicID() == Intrinsic::lifetime_end) { uint64_t invariantSize = ~0ULL; if (ConstantInt *CI = dyn_cast<ConstantInt>(II->getOperand(1))) invariantSize = CI->getZExtValue(); From nicholas at mxc.ca Sun Nov 29 12:10:53 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Sun, 29 Nov 2009 10:10:53 -0800 Subject: [llvm-commits] [llvm] r90045 - in /llvm/trunk: lib/Analysis/MemoryDependenceAnalysis.cpp test/Transforms/DeadStoreElimination/lifetime.ll In-Reply-To: <4B1244E1.2060500@free.fr> References: <200911282127.nASLRoum025712@zion.cs.uiuc.edu> <4B1244E1.2060500@free.fr> Message-ID: <4B12B92D.4020503@mxc.ca> Duncan Sands wrote: > Hi Nick, > >> Teach memdep to look for memory use intrinsics during dependency >> queries. Fixes >> PR5574. > > thanks for doing this. > >> - II->getIntrinsicID() == Intrinsic::lifetime_end) { >> + II->getIntrinsicID() == Intrinsic::lifetime_end) { > > This line uses tab stops for indentation when it should use spaces. Thanks Duncan! Fixed in r90085.
Nick From clattner at apple.com Sun Nov 29 13:42:08 2009 From: clattner at apple.com (Chris Lattner) Date: Sun, 29 Nov 2009 11:42:08 -0800 Subject: [llvm-commits] [llvm] r89659 - in /llvm/trunk: lib/Analysis/ConstantFolding.cpp lib/Transforms/IPO/GlobalOpt.cpp test/Transforms/GlobalOpt/constantfold-initializers.ll test/Transforms/InstCombine/cast.ll test/Transforms/InstCombine/shufflevec-constant.ll In-Reply-To: <200911231622.nANGMLiD029357@zion.cs.uiuc.edu> References: <200911231622.nANGMLiD029357@zion.cs.uiuc.edu> Message-ID: On Nov 23, 2009, at 8:22 AM, Dan Gohman wrote: > Author: djg > Date: Mon Nov 23 10:22:21 2009 > New Revision: 89659 > > URL: http://llvm.org/viewvc/llvm-project?rev=89659&view=rev > Log: > Make ConstantFoldConstantExpression recursively visit the entire > ConstantExpr, not just the top-level operator. This allows it to > fold many more constants. > > Also, make GlobalOpt call ConstantFoldConstantExpression on > GlobalVariable initializers. Hi Dan, My recollection is that instcombine uses this function, so making it more expensive is potentially bad. Have you looked for any compile time impact of this? -Chris From clattner at apple.com Sun Nov 29 14:12:22 2009 From: clattner at apple.com (Chris Lattner) Date: Sun, 29 Nov 2009 12:12:22 -0800 Subject: [llvm-commits] [PATCH] DOTGraphTraits improvements In-Reply-To: <1259351086.75448.152.camel@tobilaptop.fritz.box> References: <1259351086.75448.152.camel@tobilaptop.fritz.box> Message-ID: On Nov 27, 2009, at 11:44 AM, Tobias Grosser wrote: > Hi, > > I worked on some dotty improvements. The main changes are: > > * Better layout > * Allow to hide details in -only mode > * Do not print unneeded stuff > * Several small bug fixes > > The effects can be seen on this page: > http://students.fim.uni-passau.de/~grosser/llvm/dotty_patch/dotty.html Looks nice! The only thing I don't like is the removal of the T/F labels from the "only" CFG graph. 
> These are the changes in detail: > > * Remove ":" after BB name in -view-cfg-only Ok. > * Small PostDominatorTree improvements > > * Do not SEGFAULT if tree entryNode() is NULL > * Print function names in dotty printer Ok. > * Only print edgeSourceLabels if they are not empty Ok. > * Do not point edge heads to source labels > > If no destination label is available, just point to the node itself > instead of pointing to some source label. Source and destination > labels are not related in any way. Ok. > * Instantiate DefaultDOTGraphTraits Ok. > * Remove ShortNames from getNodeLabel > > Convert ShortNames to the general isSimple(). isSimple() is only > called in the classes where it actually makes a difference. + std::string getNodeLabel(const BasicBlock *Node, + const Function *Graph) { Please line up the second line with the ( + if (isSimple()) + return getSimpleNodeLabel (Node, Graph); + else + return getCompleteNodeLabel (Node, Graph); + No spaces before the '(' in function calls. Otherwise, ok. > * Do not print edge source labels in -view-cfg-only & -dot-cfg-only > > Without branch instructions the labels are not informative and at > the same time complicate the layout of the graph. This uses the new > isSimple. I'd prefer to keep these. I frequently look at a .ll file and a cfg-only graph at the same time, and the edges are really useful for this. Please commit the other patches, if you need commit access, please send me the info requested in the dev policy document off list. Thanks Tobias! -Chris -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20091129/9e131b03/attachment.html From benny.kra at googlemail.com Sun Nov 29 14:29:31 2009 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Sun, 29 Nov 2009 20:29:31 -0000 Subject: [llvm-commits] [llvm] r90089 - /llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp Message-ID: <200911292029.nATKTVou016982@zion.cs.uiuc.edu> Author: d0k Date: Sun Nov 29 14:29:30 2009 New Revision: 90089 URL: http://llvm.org/viewvc/llvm-project?rev=90089&view=rev Log: Fix two FIXMEs. Modified: llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp Modified: llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp?rev=90089&r1=90088&r2=90089&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp Sun Nov 29 14:29:30 2009 @@ -1480,8 +1480,7 @@ if (StoreInst *SI = dyn_cast<StoreInst>(User)) { assert(SI->getOperand(0) != Ptr && "Consistency error!"); - // FIXME: Remove once builder has Twine API. - Value *Old = Builder.CreateLoad(NewAI, (NewAI->getName()+".in").str().c_str()); + Value *Old = Builder.CreateLoad(NewAI, NewAI->getName() + ".in"); Value *New = ConvertScalar_InsertValue(SI->getOperand(0), Old, Offset, Builder); Builder.CreateStore(New, NewAI); @@ -1504,9 +1503,8 @@ if (Val) for (unsigned i = 1; i != NumBytes; ++i) APVal |= APVal << 8; - - // FIXME: Remove once builder has Twine API.
- Value *Old = Builder.CreateLoad(NewAI, (NewAI->getName()+".in").str().c_str()); + + Value *Old = Builder.CreateLoad(NewAI, NewAI->getName() + ".in"); Value *New = ConvertScalar_InsertValue( ConstantInt::get(User->getContext(), APVal), Old, Offset, Builder); From sabre at nondot.org Sun Nov 29 15:09:36 2009 From: sabre at nondot.org (Chris Lattner) Date: Sun, 29 Nov 2009 21:09:36 -0000 Subject: [llvm-commits] [llvm] r90093 - /llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Message-ID: <200911292109.nATL9a6x018872@zion.cs.uiuc.edu> Author: lattner Date: Sun Nov 29 15:09:36 2009 New Revision: 90093 URL: http://llvm.org/viewvc/llvm-project?rev=90093&view=rev Log: Fix a really nasty caching bug I introduced in memdep. An entry was being added to the Result vector, but not being put in the cache. This means that if the cache was reused wholesale for a later query that it would be missing this entry and we'd do an incorrect load elimination. Unfortunately, it's not really possible to write a useful testcase for this, but this unbreaks 255.vortex. Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90093&r1=90092&r2=90093&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Sun Nov 29 15:09:36 2009 @@ -1156,8 +1156,18 @@ // that predecessor. We can still do PRE of the load, which would insert // a computation of the pointer in this predecessor. if (PredPtr == 0) { - Result.push_back(NonLocalDepEntry(Pred, - MemDepResult::getClobber(Pred->getTerminator()))); + // Add the entry to the Result list. 
+ NonLocalDepEntry Entry(Pred, + MemDepResult::getClobber(Pred->getTerminator())); + Result.push_back(Entry); + + // Add it to the cache for this CacheKey so that subsequent queries get + // this result. + Cache = &NonLocalPointerDeps[CacheKey].second; + MemoryDependenceAnalysis::NonLocalDepInfo::iterator It = + std::upper_bound(Cache->begin(), Cache->end(), Entry); + Cache->insert(It, Entry); + Cache = 0; continue; } From sabre at nondot.org Sun Nov 29 15:14:59 2009 From: sabre at nondot.org (Chris Lattner) Date: Sun, 29 Nov 2009 21:14:59 -0000 Subject: [llvm-commits] [llvm] r90096 - /llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Message-ID: <200911292114.nATLExWx019236@zion.cs.uiuc.edu> Author: lattner Date: Sun Nov 29 15:14:59 2009 New Revision: 90096 URL: http://llvm.org/viewvc/llvm-project?rev=90096&view=rev Log: revert this patch for now, it causes failures of: LLVM::Transforms/GVN/2009-02-17-LoadPRECrash.ll LLVM::Transforms/GVN/2009-06-17-InvalidPRE.ll Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90096&r1=90095&r2=90096&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Sun Nov 29 15:14:59 2009 @@ -1156,18 +1156,8 @@ // that predecessor. We can still do PRE of the load, which would insert // a computation of the pointer in this predecessor. if (PredPtr == 0) { - // Add the entry to the Result list. - NonLocalDepEntry Entry(Pred, - MemDepResult::getClobber(Pred->getTerminator())); - Result.push_back(Entry); - - // Add it to the cache for this CacheKey so that subsequent queries get - // this result. 
- Cache = &NonLocalPointerDeps[CacheKey].second; - MemoryDependenceAnalysis::NonLocalDepInfo::iterator It = - std::upper_bound(Cache->begin(), Cache->end(), Entry); - Cache->insert(It, Entry); - Cache = 0; + Result.push_back(NonLocalDepEntry(Pred, + MemDepResult::getClobber(Pred->getTerminator()))); continue; } From benny.kra at googlemail.com Sun Nov 29 15:17:48 2009 From: benny.kra at googlemail.com (Benjamin Kramer) Date: Sun, 29 Nov 2009 21:17:48 -0000 Subject: [llvm-commits] [llvm] r90097 - /llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp Message-ID: <200911292117.nATLHml3019412@zion.cs.uiuc.edu> Author: d0k Date: Sun Nov 29 15:17:48 2009 New Revision: 90097 URL: http://llvm.org/viewvc/llvm-project?rev=90097&view=rev Log: Revert r90089 for now, it's breaking selfhost. Modified: llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp Modified: llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp?rev=90097&r1=90096&r2=90097&view=diff ============================================================================== --- llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp (original) +++ llvm/trunk/lib/Transforms/Scalar/ScalarReplAggregates.cpp Sun Nov 29 15:17:48 2009 @@ -1480,7 +1480,8 @@ if (StoreInst *SI = dyn_cast<StoreInst>(User)) { assert(SI->getOperand(0) != Ptr && "Consistency error!"); - Value *Old = Builder.CreateLoad(NewAI, NewAI->getName() + ".in"); + // FIXME: Remove once builder has Twine API. + Value *Old = Builder.CreateLoad(NewAI, (NewAI->getName()+".in").str().c_str()); Value *New = ConvertScalar_InsertValue(SI->getOperand(0), Old, Offset, Builder); Builder.CreateStore(New, NewAI); @@ -1503,8 +1504,9 @@ if (Val) for (unsigned i = 1; i != NumBytes; ++i) APVal |= APVal << 8; - - Value *Old = Builder.CreateLoad(NewAI, NewAI->getName() + ".in"); + + // FIXME: Remove once builder has Twine API.
+ Value *Old = Builder.CreateLoad(NewAI, (NewAI->getName()+".in").str().c_str()); Value *New = ConvertScalar_InsertValue( ConstantInt::get(User->getContext(), APVal), Old, Offset, Builder); From nicholas at mxc.ca Sun Nov 29 15:40:55 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Sun, 29 Nov 2009 21:40:55 -0000 Subject: [llvm-commits] [llvm] r90099 - in /llvm/trunk: lib/Analysis/ConstantFolding.cpp test/Transforms/InstCombine/getelementptr.ll Message-ID: <200911292140.nATLet7a020939@zion.cs.uiuc.edu> Author: nicholas Date: Sun Nov 29 15:40:55 2009 New Revision: 90099 URL: http://llvm.org/viewvc/llvm-project?rev=90099&view=rev Log: Teach ConstantFolding to do a better job when folding gep(bitcast). This permits the devirtualization of llvm.org/PR3100#c9 when compiled by clang. Modified: llvm/trunk/lib/Analysis/ConstantFolding.cpp llvm/trunk/test/Transforms/InstCombine/getelementptr.ll Modified: llvm/trunk/lib/Analysis/ConstantFolding.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ConstantFolding.cpp?rev=90099&r1=90098&r2=90099&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/ConstantFolding.cpp (original) +++ llvm/trunk/lib/Analysis/ConstantFolding.cpp Sun Nov 29 15:40:55 2009 @@ -564,6 +564,7 @@ // we eliminate over-indexing of the notional static type array bounds. // This makes it easy to determine if the getelementptr is "inbounds". // Also, this helps GlobalOpt do SROA on GlobalVariables. 
+ Ptr = cast(Ptr->stripPointerCasts()); const Type *Ty = Ptr->getType(); SmallVector NewIdxs; do { Modified: llvm/trunk/test/Transforms/InstCombine/getelementptr.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/getelementptr.ll?rev=90099&r1=90098&r2=90099&view=diff ============================================================================== --- llvm/trunk/test/Transforms/InstCombine/getelementptr.ll (original) +++ llvm/trunk/test/Transforms/InstCombine/getelementptr.ll Sun Nov 29 15:40:55 2009 @@ -445,7 +445,7 @@ i8* getelementptr (%t1* bitcast (%t0* @s to %t1*), i32 0, i32 1, i32 0)) nounwind ret i32 0 ; CHECK: @test35 -; CHECK: call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([17 x i8]* @"\01LC8", i64 0, i64 0), i8* bitcast (i8** getelementptr (%t1* bitcast (%t0* @s to %t1*), i64 1, i32 0) to i8*)) nounwind +; CHECK: call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([17 x i8]* @"\01LC8", i64 0, i64 0), i8* getelementptr inbounds (%t0* @s, i64 0, i32 1, i64 0)) nounwind } ; Instcombine should constant-fold the GEP so that indices that have From nicholas at mxc.ca Sun Nov 29 18:38:56 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Mon, 30 Nov 2009 00:38:56 -0000 Subject: [llvm-commits] [llvm] r90104 - /llvm/trunk/test/FrontendC/pr4349.c Message-ID: <200911300038.nAU0cu3K031389@zion.cs.uiuc.edu> Author: nicholas Date: Sun Nov 29 18:38:56 2009 New Revision: 90104 URL: http://llvm.org/viewvc/llvm-project?rev=90104&view=rev Log: Commit r90099 made LLVM simplify one of these constant expressions a little more. Update the syntax we're checking for and filecheckize it too. This will fix the selfhost buildbots but will 'break' the others (sigh) because they're still linked against older LLVM which is emitting less optimized IR. 
Modified: llvm/trunk/test/FrontendC/pr4349.c Modified: llvm/trunk/test/FrontendC/pr4349.c URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/FrontendC/pr4349.c?rev=90104&r1=90103&r2=90104&view=diff ============================================================================== --- llvm/trunk/test/FrontendC/pr4349.c (original) +++ llvm/trunk/test/FrontendC/pr4349.c Sun Nov 29 18:38:56 2009 @@ -1,9 +1,4 @@ -// RUN: %llvmgcc %s -S -emit-llvm -O0 -o - | grep svars2 | grep {\\\[2 x \\\[2 x i8\\\]\\\]} -// RUN: %llvmgcc %s -S -emit-llvm -O0 -o - | grep svars2 | grep {, i\[\[:digit:\]\]\\+ 1)} | count 1 -// RUN: %llvmgcc %s -S -emit-llvm -O0 -o - | grep svars3 | grep {\\\[2 x i16\\\]} -// RUN: %llvmgcc %s -S -emit-llvm -O0 -o - | grep svars3 | grep {, i\[\[:digit:\]\]\\+ 1)} | count 1 -// RUN: %llvmgcc %s -S -emit-llvm -O0 -o - | grep svars4 | grep {\\\[2 x \\\[2 x i8\\\]\\\]} | count 1 -// RUN: %llvmgcc %s -S -emit-llvm -O0 -o - | grep svars4 | grep {, i\[\[:digit:\]\]\\+ 1, i\[\[:digit:\]\]\\+ 1)} | count 1 +// RUN: %llvmgcc %s -S -emit-llvm -O0 -o - | FileCheck %s // PR 4349 union reg @@ -21,18 +16,22 @@ { void *ptr; }; +// CHECK: @svars1 = global [1 x %struct.svar] [%struct.svar { i8* bitcast (%struct.cpu* @cpu to i8*) }] struct svar svars1[] = { { &((cpu.pc).w[0]) } }; +// CHECK: @svars2 = global [1 x %struct.svar] [%struct.svar { i8* getelementptr ([2 x i8]* bitcast (%struct.cpu* @cpu to [2 x i8]*), i32 0, i32 1) }] struct svar svars2[] = { { &((cpu.pc).b[0][1]) } }; +// CHECK: @svars3 = global [1 x %struct.svar] [%struct.svar { i8* bitcast (i16* getelementptr ([2 x i16]* bitcast (%struct.cpu* @cpu to [2 x i16]*), i32 0, i32 1) to i8*) }] struct svar svars3[] = { { &((cpu.pc).w[1]) } }; +// CHECK: @svars4 = global [1 x %struct.svar] [%struct.svar { i8* getelementptr ([2 x [2 x i8]]* bitcast (%struct.cpu* @cpu to [2 x [2 x i8]]*), i32 0, i32 1, i32 1) }] struct svar svars4[] = { { &((cpu.pc).b[1][1]) } From nicholas at mxc.ca Sun Nov 29 20:23:57 2009 From: 
nicholas at mxc.ca (Nick Lewycky) Date: Mon, 30 Nov 2009 02:23:57 -0000 Subject: [llvm-commits] [llvm] r90106 - /llvm/trunk/test/FrontendC/pr4349.c Message-ID: <200911300223.nAU2Nvpj003634@zion.cs.uiuc.edu> Author: nicholas Date: Sun Nov 29 20:23:57 2009 New Revision: 90106 URL: http://llvm.org/viewvc/llvm-project?rev=90106&view=rev Log: Fix this test on 64-bit systems which seem to use i64 for gep indices sometimes while 32-bit gcc uses i32. Modified: llvm/trunk/test/FrontendC/pr4349.c Modified: llvm/trunk/test/FrontendC/pr4349.c URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/FrontendC/pr4349.c?rev=90106&r1=90105&r2=90106&view=diff ============================================================================== --- llvm/trunk/test/FrontendC/pr4349.c (original) +++ llvm/trunk/test/FrontendC/pr4349.c Sun Nov 29 20:23:57 2009 @@ -21,17 +21,17 @@ { { &((cpu.pc).w[0]) } }; -// CHECK: @svars2 = global [1 x %struct.svar] [%struct.svar { i8* getelementptr ([2 x i8]* bitcast (%struct.cpu* @cpu to [2 x i8]*), i32 0, i32 1) }] +// CHECK: @svars2 = global [1 x %struct.svar] [%struct.svar { i8* getelementptr ([2 x i8]* bitcast (%struct.cpu* @cpu to [2 x i8]*), i{{[0-9]+}} 0, i{{[0-9]+}} 1) }] struct svar svars2[] = { { &((cpu.pc).b[0][1]) } }; -// CHECK: @svars3 = global [1 x %struct.svar] [%struct.svar { i8* bitcast (i16* getelementptr ([2 x i16]* bitcast (%struct.cpu* @cpu to [2 x i16]*), i32 0, i32 1) to i8*) }] +// CHECK: @svars3 = global [1 x %struct.svar] [%struct.svar { i8* bitcast (i16* getelementptr ([2 x i16]* bitcast (%struct.cpu* @cpu to [2 x i16]*), i{{[0-9]+}} 0, i{{[0-9]+}} 1) to i8*) }] struct svar svars3[] = { { &((cpu.pc).w[1]) } }; -// CHECK: @svars4 = global [1 x %struct.svar] [%struct.svar { i8* getelementptr ([2 x [2 x i8]]* bitcast (%struct.cpu* @cpu to [2 x [2 x i8]]*), i32 0, i32 1, i32 1) }] +// CHECK: @svars4 = global [1 x %struct.svar] [%struct.svar { i8* getelementptr ([2 x [2 x i8]]* bitcast (%struct.cpu* @cpu to [2 x [2 x i8]]*), 
i{{[0-9]+}} 0, i{{[0-9]+}} 1, i{{[0-9]+}} 1) }] struct svar svars4[] = { { &((cpu.pc).b[1][1]) } From sabre at nondot.org Sun Nov 29 20:26:34 2009 From: sabre at nondot.org (Chris Lattner) Date: Mon, 30 Nov 2009 02:26:34 -0000 Subject: [llvm-commits] [llvm] r90107 - /llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Message-ID: <200911300226.nAU2QYBJ003714@zion.cs.uiuc.edu> Author: lattner Date: Sun Nov 29 20:26:29 2009 New Revision: 90107 URL: http://llvm.org/viewvc/llvm-project?rev=90107&view=rev Log: reapply r90093 with an addition of keeping the forward and reverse nonlocal memdep maps in synch, this should fix 255.vortex. Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Modified: llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp?rev=90107&r1=90106&r2=90107&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp (original) +++ llvm/trunk/lib/Analysis/MemoryDependenceAnalysis.cpp Sun Nov 29 20:26:29 2009 @@ -1156,8 +1156,21 @@ // that predecessor. We can still do PRE of the load, which would insert // a computation of the pointer in this predecessor. if (PredPtr == 0) { - Result.push_back(NonLocalDepEntry(Pred, - MemDepResult::getClobber(Pred->getTerminator()))); + // Add the entry to the Result list. + NonLocalDepEntry Entry(Pred, + MemDepResult::getClobber(Pred->getTerminator())); + Result.push_back(Entry); + + // Add it to the cache for this CacheKey so that subsequent queries get + // this result. + Cache = &NonLocalPointerDeps[CacheKey].second; + MemoryDependenceAnalysis::NonLocalDepInfo::iterator It = + std::upper_bound(Cache->begin(), Cache->end(), Entry); + Cache->insert(It, Entry); + Cache = 0; + + // Add it to the reverse map next. 
+ ReverseNonLocalPtrDeps[Pred->getTerminator()].insert(CacheKey); continue; } From wangmp at apple.com Sun Nov 29 20:42:02 2009 From: wangmp at apple.com (Mon P Wang) Date: Mon, 30 Nov 2009 02:42:02 -0000 Subject: [llvm-commits] [llvm] r90108 - in /llvm/trunk: include/llvm/CodeGen/SelectionDAG.h lib/CodeGen/SelectionDAG/LegalizeTypes.cpp lib/CodeGen/SelectionDAG/LegalizeTypes.h lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp lib/CodeGen/SelectionDAG/SelectionDAG.cpp lib/Target/X86/X86ISelLowering.cpp Message-ID: <200911300242.nAU2g38a004283@zion.cs.uiuc.edu> Author: wangmp Date: Sun Nov 29 20:42:02 2009 New Revision: 90108 URL: http://llvm.org/viewvc/llvm-project?rev=90108&view=rev Log: Added support to allow clients to custom widen. For X86, custom widen vectors for divide/remainder since these operations can trap by unroll them and adding undefs for the resulting vector. Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAG.h llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAG.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAG.h?rev=90108&r1=90107&r2=90108&view=diff ============================================================================== --- llvm/trunk/include/llvm/CodeGen/SelectionDAG.h (original) +++ llvm/trunk/include/llvm/CodeGen/SelectionDAG.h Sun Nov 29 20:42:02 2009 @@ -882,6 +882,14 @@ /// element of the result of the vector shuffle. 
SDValue getShuffleScalarElt(const ShuffleVectorSDNode *N, unsigned Idx); + /// UnrollVectorOp - Utility function used by legalize and lowering to + /// "unroll" a vector operation by splitting out the scalars and operating + /// on each element individually. If the ResNE is 0, fully unroll the vector + /// op. If ResNE is less than the width of the vector op, unroll up to ResNE. + /// If the ResNE is greater than the width of the vector op, unroll the + /// vector op and fill the end of the resulting vector with UNDEFS. + SDValue UnrollVectorOp(SDNode *N, unsigned ResNE = 0); + private: bool RemoveNodeFromCSEMaps(SDNode *N); void AddModifiedNodeToCSEMaps(SDNode *N, DAGUpdateListener *UpdateListener); Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp?rev=90108&r1=90107&r2=90108&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp Sun Nov 29 20:42:02 2009 @@ -907,6 +907,29 @@ return true; } + +/// CustomWidenLowerNode - Widen the node's results with custom code provided +/// by the target and return "true", or do nothing and return "false". +bool DAGTypeLegalizer::CustomWidenLowerNode(SDNode *N, EVT VT) { + // See if the target wants to custom lower this node. + if (TLI.getOperationAction(N->getOpcode(), VT) != TargetLowering::Custom) + return false; + + SmallVector Results; + TLI.ReplaceNodeResults(N, Results, DAG); + + if (Results.empty()) + // The target didn't want to custom widen lower its result after all. + return false; + + // Update the widening map. 
+ assert(Results.size() == N->getNumValues() && + "Custom lowering returned the wrong number of results!"); + for (unsigned i = 0, e = Results.size(); i != e; ++i) + SetWidenedVector(SDValue(N, i), Results[i]); + return true; +} + /// GetSplitDestVTs - Compute the VTs needed for the low/hi parts of a type /// which is split into two not necessarily identical pieces. void DAGTypeLegalizer::GetSplitDestVTs(EVT InVT, EVT &LoVT, EVT &HiVT) { Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h?rev=90108&r1=90107&r2=90108&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h Sun Nov 29 20:42:02 2009 @@ -188,6 +188,7 @@ SDValue BitConvertVectorToIntegerVector(SDValue Op); SDValue CreateStackStoreLoad(SDValue Op, EVT DestVT); bool CustomLowerNode(SDNode *N, EVT VT, bool LegalizeResult); + bool CustomWidenLowerNode(SDNode *N, EVT VT); SDValue GetVectorElementPointer(SDValue VecPtr, EVT EltVT, SDValue Index); SDValue JoinIntegers(SDValue Lo, SDValue Hi); SDValue LibCallify(RTLIB::Libcall LC, SDNode *N, bool isSigned); Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp?rev=90108&r1=90107&r2=90108&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp Sun Nov 29 20:42:02 2009 @@ -54,9 +54,6 @@ SDValue LegalizeOp(SDValue Op); // Assuming the node is legal, "legalize" the results SDValue TranslateLegalizeResults(SDValue Op, SDValue Result); - // Implements unrolling a generic vector operation, i.e. turning it into - // scalar operations. 
- SDValue UnrollVectorOp(SDValue Op); // Implements unrolling a VSETCC. SDValue UnrollVSETCC(SDValue Op); // Implements expansion for FNEG; falls back to UnrollVectorOp if FSUB @@ -211,7 +208,7 @@ else if (Node->getOpcode() == ISD::VSETCC) Result = UnrollVSETCC(Op); else - Result = UnrollVectorOp(Op); + Result = DAG.UnrollVectorOp(Op.getNode()); break; } @@ -256,7 +253,7 @@ return DAG.getNode(ISD::FSUB, Op.getDebugLoc(), Op.getValueType(), Zero, Op.getOperand(0)); } - return UnrollVectorOp(Op); + return DAG.UnrollVectorOp(Op.getNode()); } SDValue VectorLegalizer::UnrollVSETCC(SDValue Op) { @@ -282,56 +279,6 @@ return DAG.getNode(ISD::BUILD_VECTOR, dl, VT, &Ops[0], NumElems); } -/// UnrollVectorOp - We know that the given vector has a legal type, however -/// the operation it performs is not legal, and the target has requested that -/// the operation be expanded. "Unroll" the vector, splitting out the scalars -/// and operating on each element individually. -SDValue VectorLegalizer::UnrollVectorOp(SDValue Op) { - EVT VT = Op.getValueType(); - assert(Op.getNode()->getNumValues() == 1 && - "Can't unroll a vector with multiple results!"); - unsigned NE = VT.getVectorNumElements(); - EVT EltVT = VT.getVectorElementType(); - DebugLoc dl = Op.getDebugLoc(); - - SmallVector Scalars; - SmallVector Operands(Op.getNumOperands()); - for (unsigned i = 0; i != NE; ++i) { - for (unsigned j = 0; j != Op.getNumOperands(); ++j) { - SDValue Operand = Op.getOperand(j); - EVT OperandVT = Operand.getValueType(); - if (OperandVT.isVector()) { - // A vector operand; extract a single element. - EVT OperandEltVT = OperandVT.getVectorElementType(); - Operands[j] = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, - OperandEltVT, - Operand, - DAG.getConstant(i, MVT::i32)); - } else { - // A scalar operand; just use it as is. 
- Operands[j] = Operand; - } - } - - switch (Op.getOpcode()) { - default: - Scalars.push_back(DAG.getNode(Op.getOpcode(), dl, EltVT, - &Operands[0], Operands.size())); - break; - case ISD::SHL: - case ISD::SRA: - case ISD::SRL: - case ISD::ROTL: - case ISD::ROTR: - Scalars.push_back(DAG.getNode(Op.getOpcode(), dl, EltVT, Operands[0], - DAG.getShiftAmountOperand(Operands[1]))); - break; - } - } - - return DAG.getNode(ISD::BUILD_VECTOR, dl, VT, &Scalars[0], Scalars.size()); -} - } bool SelectionDAG::LegalizeVectors() { Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp?rev=90108&r1=90107&r2=90108&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp Sun Nov 29 20:42:02 2009 @@ -1118,8 +1118,12 @@ DEBUG(errs() << "Widen node result " << ResNo << ": "; N->dump(&DAG); errs() << "\n"); - SDValue Res = SDValue(); + // See if the target wants to custom widen this node. 
+ if (CustomWidenLowerNode(N, N->getValueType(ResNo))) + return; + + SDValue Res = SDValue(); switch (N->getOpcode()) { default: #ifndef NDEBUG Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=90108&r1=90107&r2=90108&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original) +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Sun Nov 29 20:42:02 2009 @@ -5807,6 +5807,66 @@ N->dump(G); } +SDValue SelectionDAG::UnrollVectorOp(SDNode *N, unsigned ResNE) { + assert(N->getNumValues() == 1 && + "Can't unroll a vector with multiple results!"); + + EVT VT = N->getValueType(0); + unsigned NE = VT.getVectorNumElements(); + EVT EltVT = VT.getVectorElementType(); + DebugLoc dl = N->getDebugLoc(); + + SmallVector Scalars; + SmallVector Operands(N->getNumOperands()); + + // If ResNE is 0, fully unroll the vector op. + if (ResNE == 0) + ResNE = NE; + else if (NE > ResNE) + NE = ResNE; + + unsigned i; + for (i= 0; i != NE; ++i) { + for (unsigned j = 0; j != N->getNumOperands(); ++j) { + SDValue Operand = N->getOperand(j); + EVT OperandVT = Operand.getValueType(); + if (OperandVT.isVector()) { + // A vector operand; extract a single element. + EVT OperandEltVT = OperandVT.getVectorElementType(); + Operands[j] = getNode(ISD::EXTRACT_VECTOR_ELT, dl, + OperandEltVT, + Operand, + getConstant(i, MVT::i32)); + } else { + // A scalar operand; just use it as is. 
+ Operands[j] = Operand; + } + } + + switch (N->getOpcode()) { + default: + Scalars.push_back(getNode(N->getOpcode(), dl, EltVT, + &Operands[0], Operands.size())); + break; + case ISD::SHL: + case ISD::SRA: + case ISD::SRL: + case ISD::ROTL: + case ISD::ROTR: + Scalars.push_back(getNode(N->getOpcode(), dl, EltVT, Operands[0], + getShiftAmountOperand(Operands[1]))); + break; + } + } + + for (; i < ResNE; ++i) + Scalars.push_back(getUNDEF(EltVT)); + + return getNode(ISD::BUILD_VECTOR, dl, + EVT::getVectorVT(*getContext(), EltVT, ResNE), + &Scalars[0], Scalars.size()); +} + void SelectionDAG::dump() const { errs() << "SelectionDAG has " << AllNodes.size() << " nodes:"; @@ -5962,3 +6022,4 @@ return false; return true; } + Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=90108&r1=90107&r2=90108&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sun Nov 29 20:42:02 2009 @@ -975,6 +975,19 @@ computeRegisterProperties(); + // Divide and reminder operations have no vector equivalent and can + // trap. Do a custom widening for these operations in which we never + // generate more divides/remainder than the original vector width. + for (unsigned VT = (unsigned)MVT::FIRST_VECTOR_VALUETYPE; + VT <= (unsigned)MVT::LAST_VECTOR_VALUETYPE; ++VT) { + if (!isTypeLegal((MVT::SimpleValueType)VT)) { + setOperationAction(ISD::SDIV, (MVT::SimpleValueType) VT, Custom); + setOperationAction(ISD::UDIV, (MVT::SimpleValueType) VT, Custom); + setOperationAction(ISD::SREM, (MVT::SimpleValueType) VT, Custom); + setOperationAction(ISD::UREM, (MVT::SimpleValueType) VT, Custom); + } + } + // FIXME: These should be based on subtarget info. Plus, the values should // be smaller when we are in optimizing for size mode. 
maxStoresPerMemset = 16; // For @llvm.memset -> sequence of stores @@ -7170,6 +7183,14 @@ Results.push_back(edx.getValue(1)); return; } + case ISD::SDIV: + case ISD::UDIV: + case ISD::SREM: + case ISD::UREM: { + EVT WidenVT = getTypeToTransformTo(*DAG.getContext(), N->getValueType(0)); + Results.push_back(DAG.UnrollVectorOp(N, WidenVT.getVectorNumElements())); + return; + } case ISD::ATOMIC_CMP_SWAP: { EVT T = N->getValueType(0); assert (T == MVT::i64 && "Only know how to expand i64 Cmp and Swap"); From wangmp at apple.com Sun Nov 29 20:42:27 2009 From: wangmp at apple.com (Mon P Wang) Date: Mon, 30 Nov 2009 02:42:27 -0000 Subject: [llvm-commits] [llvm] r90109 - /llvm/trunk/test/CodeGen/X86/scalar_widen_div.ll Message-ID: <200911300242.nAU2gRq5004305@zion.cs.uiuc.edu> Author: wangmp Date: Sun Nov 29 20:42:27 2009 New Revision: 90109 URL: http://llvm.org/viewvc/llvm-project?rev=90109&view=rev Log: Add test case for r90108 Added: llvm/trunk/test/CodeGen/X86/scalar_widen_div.ll Added: llvm/trunk/test/CodeGen/X86/scalar_widen_div.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/scalar_widen_div.ll?rev=90109&view=auto ============================================================================== --- llvm/trunk/test/CodeGen/X86/scalar_widen_div.ll (added) +++ llvm/trunk/test/CodeGen/X86/scalar_widen_div.ll Sun Nov 29 20:42:27 2009 @@ -0,0 +1,154 @@ +; RUN: llc < %s -disable-mmx -march=x86-64 -mattr=+sse42 | FileCheck %s + +; Verify when widening a divide/remainder operation, we only generate a +; divide/rem per element since divide/remainder can trap. 
+ +define void @vectorDiv (<2 x i32> addrspace(1)* %nsource, <2 x i32> addrspace(1)* %dsource, <2 x i32> addrspace(1)* %qdest) nounwind { +; CHECK: idivl +; CHECK: idivl +; CHECK-NOT: idivl +; CHECK: ret +entry: + %nsource.addr = alloca <2 x i32> addrspace(1)*, align 4 + %dsource.addr = alloca <2 x i32> addrspace(1)*, align 4 + %qdest.addr = alloca <2 x i32> addrspace(1)*, align 4 + %index = alloca i32, align 4 + store <2 x i32> addrspace(1)* %nsource, <2 x i32> addrspace(1)** %nsource.addr + store <2 x i32> addrspace(1)* %dsource, <2 x i32> addrspace(1)** %dsource.addr + store <2 x i32> addrspace(1)* %qdest, <2 x i32> addrspace(1)** %qdest.addr + %tmp = load <2 x i32> addrspace(1)** %qdest.addr + %tmp1 = load i32* %index + %arrayidx = getelementptr <2 x i32> addrspace(1)* %tmp, i32 %tmp1 + %tmp2 = load <2 x i32> addrspace(1)** %nsource.addr + %tmp3 = load i32* %index + %arrayidx4 = getelementptr <2 x i32> addrspace(1)* %tmp2, i32 %tmp3 + %tmp5 = load <2 x i32> addrspace(1)* %arrayidx4 + %tmp6 = load <2 x i32> addrspace(1)** %dsource.addr + %tmp7 = load i32* %index + %arrayidx8 = getelementptr <2 x i32> addrspace(1)* %tmp6, i32 %tmp7 + %tmp9 = load <2 x i32> addrspace(1)* %arrayidx8 + %tmp10 = sdiv <2 x i32> %tmp5, %tmp9 + store <2 x i32> %tmp10, <2 x i32> addrspace(1)* %arrayidx + ret void +} + +define <3 x i8> @test_char_div(<3 x i8> %num, <3 x i8> %div) { +; CHECK: idivb +; CHECK: idivb +; CHECK: idivb +; CHECK-NOT: idivb +; CHECK: ret + %div.r = sdiv <3 x i8> %num, %div + ret <3 x i8> %div.r +} + +define <3 x i8> @test_uchar_div(<3 x i8> %num, <3 x i8> %div) { +; CHECK: divb +; CHECK: divb +; CHECK: divb +; CHECK-NOT: divb +; CHECK: ret + %div.r = udiv <3 x i8> %num, %div + ret <3 x i8> %div.r +} + +define <5 x i16> @test_short_div(<5 x i16> %num, <5 x i16> %div) { +; CHECK: idivw +; CHECK: idivw +; CHECK: idivw +; CHECK: idivw +; CHECK: idivw +; CHECK-NOT: idivw +; CHECK: ret + %div.r = sdiv <5 x i16> %num, %div + ret <5 x i16> %div.r +} + +define <4 x i16> 
@test_ushort_div(<4 x i16> %num, <4 x i16> %div) { +; CHECK: divw +; CHECK: divw +; CHECK: divw +; CHECK: divw +; CHECK-NOT: divw +; CHECK: ret + %div.r = udiv <4 x i16> %num, %div + ret <4 x i16> %div.r +} + +define <3 x i32> @test_uint_div(<3 x i32> %num, <3 x i32> %div) { +; CHECK: divl +; CHECK: divl +; CHECK: divl +; CHECK-NOT: divl +; CHECK: ret + %div.r = udiv <3 x i32> %num, %div + ret <3 x i32> %div.r +} + +define <3 x i64> @test_long_div(<3 x i64> %num, <3 x i64> %div) { +; CHECK: idivq +; CHECK: idivq +; CHECK: idivq +; CHECK-NOT: idivq +; CHECK: ret + %div.r = sdiv <3 x i64> %num, %div + ret <3 x i64> %div.r +} + +define <3 x i64> @test_ulong_div(<3 x i64> %num, <3 x i64> %div) { +; CHECK: divq +; CHECK: divq +; CHECK: divq +; CHECK-NOT: divq +; CHECK: ret + %div.r = udiv <3 x i64> %num, %div + ret <3 x i64> %div.r +} + + +define <4 x i8> @test_char_rem(<4 x i8> %num, <4 x i8> %rem) { +; CHECK: idivb +; CHECK: idivb +; CHECK: idivb +; CHECK: idivb +; CHECK-NOT: idivb +; CHECK: ret + %rem.r = srem <4 x i8> %num, %rem + ret <4 x i8> %rem.r +} + +define <5 x i16> @test_short_rem(<5 x i16> %num, <5 x i16> %rem) { +; CHECK: idivw +; CHECK: idivw +; CHECK: idivw +; CHECK: idivw +; CHECK: idivw +; CHECK-NOT: idivw +; CHECK: ret + %rem.r = srem <5 x i16> %num, %rem + ret <5 x i16> %rem.r +} + +define <4 x i32> @test_uint_rem(<4 x i32> %num, <4 x i32> %rem) { +; CHECK: idivl +; CHECK: idivl +; CHECK: idivl +; CHECK: idivl +; CHECK-NOT: idivl +; CHECK: ret + %rem.r = srem <4 x i32> %num, %rem + ret <4 x i32> %rem.r +} + + +define <5 x i64> @test_ulong_rem(<5 x i64> %num, <5 x i64> %rem) { +; CHECK: divq +; CHECK: divq +; CHECK: divq +; CHECK: divq +; CHECK: divq +; CHECK-NOT: divq +; CHECK: ret + %rem.r = urem <5 x i64> %num, %rem + ret <5 x i64> %rem.r +} From grosser at fim.uni-passau.de Sun Nov 29 20:55:56 2009 From: grosser at fim.uni-passau.de (Tobias Grosser) Date: Mon, 30 Nov 2009 03:55:56 +0100 Subject: [llvm-commits] [PATCH] DOTGraphTraits improvements 
In-Reply-To: <10545_1259525561_4B12D5B3_10545_45_1_D153E70E-5C1A-4A71-9B5B-F2EB2E56E5BF@apple.com> References: <1259351086.75448.152.camel@tobilaptop.fritz.box> <10545_1259525561_4B12D5B3_10545_45_1_D153E70E-5C1A-4A71-9B5B-F2EB2E56E5BF@apple.com> Message-ID: <1259549756.18675.13.camel@tobilaptop.fritz.box> On Sun, 2009-11-29 at 12:12 -0800, Chris Lattner wrote: > > On Nov 27, 2009, at 11:44 AM, Tobias Grosser wrote: > > > Hi, > > > > I worked on some dotty improvements. The main changes are: > > > > * Better layout > > * Allow to hide details in -only mode > > * Do not print unneeded stuff > > * Several small bug fixes > > > > The effects can be seen on this page: > > http://students.fim.uni-passau.de/~grosser/llvm/dotty_patch/dotty.html > > > > > Looks nice! The only thing I don't like is the removal of the T/F > labels from the "only" CFG graph. > > > > > These are the changes in detail: > > > > * Remove ":" after BB name in -view-cfg-only > > > > > Ok. > > > * Small PostDominatorTree improvements > > > > * Do not SEGFAULT if tree entryNode() is NULL > > * Print function names in dotty printer > > > > > Ok. > > > > * Only print edgeSourceLabels if they are not empty > > > > > Ok. > > > * Do not point edge heads to source labels > > > > If no destination label is available, just point to the node > > itself > > instead of pointing to some source label. Source and > > destination > > labels are not related in any way. > > > > > Ok. > > > * Instantiate DefaultDOTGraphTraits > > > > > Ok. > > > * Remove ShortNames from getNodeLabel > > > > Convert ShortNames to the general isSimple(). isSimple() is only > > called in the classes where it actually makes a difference. 
> > > > > + std::string getNodeLabel(const BasicBlock *Node, > + const Function *Graph) { > > > Please line up the second line with the ( > > > + if (isSimple()) > + return getSimpleNodeLabel (Node, Graph); > + else > + return getCompleteNodeLabel (Node, Graph); > + > > > No spaces before the '(' in function calls. > > > Otherwise, ok. I always overlook these. Thanks for showing me these style issues. > > * Do not print edge source labels in -view-cfg-only & > > -dot-cfg-only > > > > Without branch instructions the labels are not informative and > > at > > the same time complicate the layout of the graph. This uses the > > new > > isSimple. > > > > I'd prefer to keep these. I frequently look at a .ll file and a > cfg-only graph at the same time, and the edges are really useful for > this. OK. I will propose another solution. It complicates graph layout. > Please commit the other patches, if you need commit access, please > send me the info requested in the dev policy document off list. I will do this. From clattner at apple.com Sun Nov 29 22:21:11 2009 From: clattner at apple.com (Chris Lattner) Date: Sun, 29 Nov 2009 20:21:11 -0800 Subject: [llvm-commits] [llvm] r90108 - in /llvm/trunk: include/llvm/CodeGen/SelectionDAG.h lib/CodeGen/SelectionDAG/LegalizeTypes.cpp lib/CodeGen/SelectionDAG/LegalizeTypes.h lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp lib/CodeGen/SelectionDAG/SelectionDAG.cpp lib/Target/X86/X86ISelLowering.cpp In-Reply-To: <200911300242.nAU2g38a004283@zion.cs.uiuc.edu> References: <200911300242.nAU2g38a004283@zion.cs.uiuc.edu> Message-ID: On Nov 29, 2009, at 6:42 PM, Mon P Wang wrote: > Author: wangmp > Date: Sun Nov 29 20:42:02 2009 > New Revision: 90108 > > URL: http://llvm.org/viewvc/llvm-project?rev=90108&view=rev > Log: > Added support to allow clients to custom widen. 
For X86, custom widen vectors for > divide/remainder since these operations can trap by unroll them and adding undefs > for the resulting vector. Hi Mon Ping, For a vec3 divide, wouldn't it be more efficient to shuffle one of the three valid elements of the divisor into the fourth element, then do a vec4 divide? I'd think that one vec4 divide + shuffle would be faster than a bunch of extracts, scalar divs then reconstructions. -Chris > > Modified: > llvm/trunk/include/llvm/CodeGen/SelectionDAG.h > llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp > llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h > llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp > llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp > llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp > llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > > Modified: llvm/trunk/include/llvm/CodeGen/SelectionDAG.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/SelectionDAG.h?rev=90108&r1=90107&r2=90108&view=diff > > ============================================================================== > --- llvm/trunk/include/llvm/CodeGen/SelectionDAG.h (original) > +++ llvm/trunk/include/llvm/CodeGen/SelectionDAG.h Sun Nov 29 20:42:02 2009 > @@ -882,6 +882,14 @@ > /// element of the result of the vector shuffle. > SDValue getShuffleScalarElt(const ShuffleVectorSDNode *N, unsigned Idx); > > + /// UnrollVectorOp - Utility function used by legalize and lowering to > + /// "unroll" a vector operation by splitting out the scalars and operating > + /// on each element individually. If the ResNE is 0, fully unroll the vector > + /// op. If ResNE is less than the width of the vector op, unroll up to ResNE. > + /// If the ResNE is greater than the width of the vector op, unroll the > + /// vector op and fill the end of the resulting vector with UNDEFS. 
> + SDValue UnrollVectorOp(SDNode *N, unsigned ResNE = 0); > + > private: > bool RemoveNodeFromCSEMaps(SDNode *N); > void AddModifiedNodeToCSEMaps(SDNode *N, DAGUpdateListener *UpdateListener); > > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp?rev=90108&r1=90107&r2=90108&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp (original) > +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp Sun Nov 29 20:42:02 2009 > @@ -907,6 +907,29 @@ > return true; > } > > + > +/// CustomWidenLowerNode - Widen the node's results with custom code provided > +/// by the target and return "true", or do nothing and return "false". > +bool DAGTypeLegalizer::CustomWidenLowerNode(SDNode *N, EVT VT) { > + // See if the target wants to custom lower this node. > + if (TLI.getOperationAction(N->getOpcode(), VT) != TargetLowering::Custom) > + return false; > + > + SmallVector Results; > + TLI.ReplaceNodeResults(N, Results, DAG); > + > + if (Results.empty()) > + // The target didn't want to custom widen lower its result after all. > + return false; > + > + // Update the widening map. > + assert(Results.size() == N->getNumValues() && > + "Custom lowering returned the wrong number of results!"); > + for (unsigned i = 0, e = Results.size(); i != e; ++i) > + SetWidenedVector(SDValue(N, i), Results[i]); > + return true; > +} > + > /// GetSplitDestVTs - Compute the VTs needed for the low/hi parts of a type > /// which is split into two not necessarily identical pieces. 
> void DAGTypeLegalizer::GetSplitDestVTs(EVT InVT, EVT &LoVT, EVT &HiVT) { > > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h?rev=90108&r1=90107&r2=90108&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h (original) > +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h Sun Nov 29 20:42:02 2009 > @@ -188,6 +188,7 @@ > SDValue BitConvertVectorToIntegerVector(SDValue Op); > SDValue CreateStackStoreLoad(SDValue Op, EVT DestVT); > bool CustomLowerNode(SDNode *N, EVT VT, bool LegalizeResult); > + bool CustomWidenLowerNode(SDNode *N, EVT VT); > SDValue GetVectorElementPointer(SDValue VecPtr, EVT EltVT, SDValue Index); > SDValue JoinIntegers(SDValue Lo, SDValue Hi); > SDValue LibCallify(RTLIB::Libcall LC, SDNode *N, bool isSigned); > > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp?rev=90108&r1=90107&r2=90108&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp (original) > +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp Sun Nov 29 20:42:02 2009 > @@ -54,9 +54,6 @@ > SDValue LegalizeOp(SDValue Op); > // Assuming the node is legal, "legalize" the results > SDValue TranslateLegalizeResults(SDValue Op, SDValue Result); > - // Implements unrolling a generic vector operation, i.e. turning it into > - // scalar operations. > - SDValue UnrollVectorOp(SDValue Op); > // Implements unrolling a VSETCC. 
> SDValue UnrollVSETCC(SDValue Op); > // Implements expansion for FNEG; falls back to UnrollVectorOp if FSUB > @@ -211,7 +208,7 @@ > else if (Node->getOpcode() == ISD::VSETCC) > Result = UnrollVSETCC(Op); > else > - Result = UnrollVectorOp(Op); > + Result = DAG.UnrollVectorOp(Op.getNode()); > break; > } > > @@ -256,7 +253,7 @@ > return DAG.getNode(ISD::FSUB, Op.getDebugLoc(), Op.getValueType(), > Zero, Op.getOperand(0)); > } > - return UnrollVectorOp(Op); > + return DAG.UnrollVectorOp(Op.getNode()); > } > > SDValue VectorLegalizer::UnrollVSETCC(SDValue Op) { > @@ -282,56 +279,6 @@ > return DAG.getNode(ISD::BUILD_VECTOR, dl, VT, &Ops[0], NumElems); > } > > -/// UnrollVectorOp - We know that the given vector has a legal type, however > -/// the operation it performs is not legal, and the target has requested that > -/// the operation be expanded. "Unroll" the vector, splitting out the scalars > -/// and operating on each element individually. > -SDValue VectorLegalizer::UnrollVectorOp(SDValue Op) { > - EVT VT = Op.getValueType(); > - assert(Op.getNode()->getNumValues() == 1 && > - "Can't unroll a vector with multiple results!"); > - unsigned NE = VT.getVectorNumElements(); > - EVT EltVT = VT.getVectorElementType(); > - DebugLoc dl = Op.getDebugLoc(); > - > - SmallVector Scalars; > - SmallVector Operands(Op.getNumOperands()); > - for (unsigned i = 0; i != NE; ++i) { > - for (unsigned j = 0; j != Op.getNumOperands(); ++j) { > - SDValue Operand = Op.getOperand(j); > - EVT OperandVT = Operand.getValueType(); > - if (OperandVT.isVector()) { > - // A vector operand; extract a single element. > - EVT OperandEltVT = OperandVT.getVectorElementType(); > - Operands[j] = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, > - OperandEltVT, > - Operand, > - DAG.getConstant(i, MVT::i32)); > - } else { > - // A scalar operand; just use it as is. 
> - Operands[j] = Operand; > - } > - } > - > - switch (Op.getOpcode()) { > - default: > - Scalars.push_back(DAG.getNode(Op.getOpcode(), dl, EltVT, > - &Operands[0], Operands.size())); > - break; > - case ISD::SHL: > - case ISD::SRA: > - case ISD::SRL: > - case ISD::ROTL: > - case ISD::ROTR: > - Scalars.push_back(DAG.getNode(Op.getOpcode(), dl, EltVT, Operands[0], > - DAG.getShiftAmountOperand(Operands[1]))); > - break; > - } > - } > - > - return DAG.getNode(ISD::BUILD_VECTOR, dl, VT, &Scalars[0], Scalars.size()); > -} > - > } > > bool SelectionDAG::LegalizeVectors() { > > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp?rev=90108&r1=90107&r2=90108&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (original) > +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp Sun Nov 29 20:42:02 2009 > @@ -1118,8 +1118,12 @@ > DEBUG(errs() << "Widen node result " << ResNo << ": "; > N->dump(&DAG); > errs() << "\n"); > - SDValue Res = SDValue(); > > + // See if the target wants to custom widen this node. 
> + if (CustomWidenLowerNode(N, N->getValueType(ResNo))) > + return; > + > + SDValue Res = SDValue(); > switch (N->getOpcode()) { > default: > #ifndef NDEBUG > > Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=90108&r1=90107&r2=90108&view=diff > > ============================================================================== > --- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original) > +++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Sun Nov 29 20:42:02 2009 > @@ -5807,6 +5807,66 @@ > N->dump(G); > } > > +SDValue SelectionDAG::UnrollVectorOp(SDNode *N, unsigned ResNE) { > + assert(N->getNumValues() == 1 && > + "Can't unroll a vector with multiple results!"); > + > + EVT VT = N->getValueType(0); > + unsigned NE = VT.getVectorNumElements(); > + EVT EltVT = VT.getVectorElementType(); > + DebugLoc dl = N->getDebugLoc(); > + > + SmallVector Scalars; > + SmallVector Operands(N->getNumOperands()); > + > + // If ResNE is 0, fully unroll the vector op. > + if (ResNE == 0) > + ResNE = NE; > + else if (NE > ResNE) > + NE = ResNE; > + > + unsigned i; > + for (i= 0; i != NE; ++i) { > + for (unsigned j = 0; j != N->getNumOperands(); ++j) { > + SDValue Operand = N->getOperand(j); > + EVT OperandVT = Operand.getValueType(); > + if (OperandVT.isVector()) { > + // A vector operand; extract a single element. > + EVT OperandEltVT = OperandVT.getVectorElementType(); > + Operands[j] = getNode(ISD::EXTRACT_VECTOR_ELT, dl, > + OperandEltVT, > + Operand, > + getConstant(i, MVT::i32)); > + } else { > + // A scalar operand; just use it as is. 
> + Operands[j] = Operand; > + } > + } > + > + switch (N->getOpcode()) { > + default: > + Scalars.push_back(getNode(N->getOpcode(), dl, EltVT, > + &Operands[0], Operands.size())); > + break; > + case ISD::SHL: > + case ISD::SRA: > + case ISD::SRL: > + case ISD::ROTL: > + case ISD::ROTR: > + Scalars.push_back(getNode(N->getOpcode(), dl, EltVT, Operands[0], > + getShiftAmountOperand(Operands[1]))); > + break; > + } > + } > + > + for (; i < ResNE; ++i) > + Scalars.push_back(getUNDEF(EltVT)); > + > + return getNode(ISD::BUILD_VECTOR, dl, > + EVT::getVectorVT(*getContext(), EltVT, ResNE), > + &Scalars[0], Scalars.size()); > +} > + > void SelectionDAG::dump() const { > errs() << "SelectionDAG has " << AllNodes.size() << " nodes:"; > > @@ -5962,3 +6022,4 @@ > return false; > return true; > } > + > > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=90108&r1=90107&r2=90108&view=diff > > ============================================================================== > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original) > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Sun Nov 29 20:42:02 2009 > @@ -975,6 +975,19 @@ > > computeRegisterProperties(); > > + // Divide and reminder operations have no vector equivalent and can > + // trap. Do a custom widening for these operations in which we never > + // generate more divides/remainder than the original vector width. 
> + for (unsigned VT = (unsigned)MVT::FIRST_VECTOR_VALUETYPE; > + VT <= (unsigned)MVT::LAST_VECTOR_VALUETYPE; ++VT) { > + if (!isTypeLegal((MVT::SimpleValueType)VT)) { > + setOperationAction(ISD::SDIV, (MVT::SimpleValueType) VT, Custom); > + setOperationAction(ISD::UDIV, (MVT::SimpleValueType) VT, Custom); > + setOperationAction(ISD::SREM, (MVT::SimpleValueType) VT, Custom); > + setOperationAction(ISD::UREM, (MVT::SimpleValueType) VT, Custom); > + } > + } > + > // FIXME: These should be based on subtarget info. Plus, the values should > // be smaller when we are in optimizing for size mode. > maxStoresPerMemset = 16; // For @llvm.memset -> sequence of stores > @@ -7170,6 +7183,14 @@ > Results.push_back(edx.getValue(1)); > return; > } > + case ISD::SDIV: > + case ISD::UDIV: > + case ISD::SREM: > + case ISD::UREM: { > + EVT WidenVT = getTypeToTransformTo(*DAG.getContext(), N->getValueType(0)); > + Results.push_back(DAG.UnrollVectorOp(N, WidenVT.getVectorNumElements())); > + return; > + } > case ISD::ATOMIC_CMP_SWAP: { > EVT T = N->getValueType(0); > assert (T == MVT::i64 && "Only know how to expand i64 Cmp and Swap"); > > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits From nicholas at mxc.ca Sun Nov 29 22:23:18 2009 From: nicholas at mxc.ca (Nick Lewycky) Date: Mon, 30 Nov 2009 04:23:18 -0000 Subject: [llvm-commits] [llvm] r90111 - in /llvm/trunk/docs/tutorial: JITTutorial1.html JITTutorial2-1.png JITTutorial2.html index.html Message-ID: <200911300423.nAU4NJXo009535@zion.cs.uiuc.edu> Author: nicholas Date: Sun Nov 29 22:23:17 2009 New Revision: 90111 URL: http://llvm.org/viewvc/llvm-project?rev=90111&view=rev Log: Remove the 'simple jit' tutorial as it wasn't really being maintained and its material is covered by the Kaleidoscope tutorial. 
Removed: llvm/trunk/docs/tutorial/JITTutorial1.html llvm/trunk/docs/tutorial/JITTutorial2-1.png llvm/trunk/docs/tutorial/JITTutorial2.html Modified: llvm/trunk/docs/tutorial/index.html Removed: llvm/trunk/docs/tutorial/JITTutorial1.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/tutorial/JITTutorial1.html?rev=90110&view=auto ============================================================================== --- llvm/trunk/docs/tutorial/JITTutorial1.html (original) +++ llvm/trunk/docs/tutorial/JITTutorial1.html (removed) @@ -1,207 +0,0 @@ - - - - - LLVM Tutorial 1: A First Function - - - - - - - - -
      LLVM Tutorial 1: A First Function
      - -
      -

      Written by Owen Anderson

      -
      - - - - - -
      - -

      For starters, let's consider a relatively straightforward function that takes three integer parameters and returns an arithmetic combination of them. This is nice and simple, especially since it involves no control flow:

      - -
      -
      -int mul_add(int x, int y, int z) {
      -  return x * y + z;
      -}
      -
      -
      - -

      As a preview, the LLVM IR we're going to end up generating for this function will look like:

      - -
      -
      -define i32 @mul_add(i32 %x, i32 %y, i32 %z) {
      -entry:
      -  %tmp = mul i32 %x, %y
      -  %tmp2 = add i32 %tmp, %z
      -  ret i32 %tmp2
      -}
      -
      -
      - -

      If you're unsure what the above code says, skim through the LLVM Language Reference Manual and convince yourself that the above LLVM IR is actually equivalent to the original function. Once you're satisfied with that, let's move on to actually generating it programmatically!
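The equivalence claimed above can be checked by hand. The following is a minimal standalone C++ sketch, not part of the removed tutorial: it evaluates the function once directly and once by following the three IR instructions in order. The helper name mul_add_ir_steps is made up for illustration, and the sketch assumes inputs small enough that the signed arithmetic does not overflow (C++ leaves signed overflow undefined, whereas the plain IR mul and add wrap).

```cpp
#include <cassert>

// The C function from the tutorial text.
int mul_add(int x, int y, int z) {
  return x * y + z;
}

// Hypothetical helper: follows the IR listing one instruction at a time.
int mul_add_ir_steps(int x, int y, int z) {
  int tmp = x * y;     // %tmp  = mul i32 %x, %y
  int tmp2 = tmp + z;  // %tmp2 = add i32 %tmp, %z
  // Both formulations must agree for non-overflowing inputs.
  assert(tmp2 == mul_add(x, y, z) && "IR steps disagree with the C function");
  return tmp2;         // ret i32 %tmp2
}
```

Calling mul_add_ir_steps(2, 3, 4) walks through tmp = 6 and tmp2 = 10, the same value mul_add(2, 3, 4) returns.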

      - -

      Of course, before we can start, we need to #include the appropriate LLVM header files:

      - -
      -
      -#include "llvm/Module.h"
      -#include "llvm/Function.h"
      -#include "llvm/PassManager.h"
      -#include "llvm/CallingConv.h"
      -#include "llvm/Analysis/Verifier.h"
      -#include "llvm/Assembly/PrintModulePass.h"
      -#include "llvm/Support/IRBuilder.h"
      -#include "llvm/Support/raw_ostream.h"
      -
      -
      - -

      Now, let's get started on our real program. Here's what our basic main() will look like:

      - -
      -
      -using namespace llvm;
      -
      -Module* makeLLVMModule();
      -
      -int main(int argc, char**argv) {
      -  Module* Mod = makeLLVMModule();
      -
      -  verifyModule(*Mod, PrintMessageAction);
      -
      -  PassManager PM;
      -  PM.add(createPrintModulePass(&outs()));
      -  PM.run(*Mod);
      -
      -  delete Mod;
      -  return 0;
      -}
      -
      -
      - -

      The first segment is pretty simple: it creates an LLVM "module." In LLVM, a module represents a single unit of code that is to be processed together. A module contains things like global variables, function declarations, and implementations. Here we've declared a makeLLVMModule() function to do the real work of creating the module. Don't worry, we'll be looking at that one next!

      - -

      The second segment runs the LLVM module verifier on our newly created module. While this probably isn't really necessary for a simple module like this one, it's always a good idea, especially if you're generating LLVM IR based on some input. The verifier will print an error message if your LLVM module is malformed in any way.

      - -

      Finally, we instantiate an LLVM PassManager and run the PrintModulePass on our module. LLVM uses an explicit pass infrastructure to manage optimizations and various other things. A PassManager, as should be obvious from its name, manages passes: it is responsible for scheduling them, invoking them, and ensuring the proper disposal after we're done with them. For this example, we're just using a trivial pass that prints out our module in textual form.

      - -

      Now onto the interesting part: creating and populating a module. Here's the first chunk of our makeLLVMModule():

      - -
      -
      -Module* makeLLVMModule() {
      -  // Module Construction
      -  Module* mod = new Module("test", getGlobalContext());
      -
      -
      - -

      Exciting, isn't it!? All we're doing here is instantiating a module and giving it a name. The name isn't particularly important unless you're going to be dealing with multiple modules at once.

      - -
      -
      -  Constant* c = mod->getOrInsertFunction("mul_add",
      -  /*ret type*/                           IntegerType::get(32),
      -  /*args*/                               IntegerType::get(32),
      -                                         IntegerType::get(32),
      -                                         IntegerType::get(32),
      -  /*varargs terminated with null*/       NULL);
      -  
      -  Function* mul_add = cast<Function>(c);
      -  mul_add->setCallingConv(CallingConv::C);
      -
      -
      - -

      We construct our Function by calling getOrInsertFunction() on our module, passing in the name, return type, and argument types of the function. In the case of our mul_add function, that means one 32-bit integer for the return value and three 32-bit integers for the arguments.

      - -

      You'll notice that getOrInsertFunction() doesn't actually return a Function*. This is because getOrInsertFunction() will return a cast of the existing function if the function already existed with a different prototype. Since we know that there's not already a mul_add function, we can safely just cast c to a Function*. - -

      In addition, we set the calling convention for our new function to be the C calling convention. This isn't strictly necessary, but it ensures that our new function will interoperate properly with C code, which is a good thing.

      - -
      -
      -  Function::arg_iterator args = mul_add->arg_begin();
      -  Value* x = args++;
      -  x->setName("x");
      -  Value* y = args++;
      -  y->setName("y");
      -  Value* z = args++;
      -  z->setName("z");
      -
      -
      - -

      While we're setting up our function, let's also give names to the parameters. This also isn't strictly necessary (LLVM will generate names for them if you don't specify them), but it'll make looking at our output somewhat more pleasant. To name the parameters, we iterate over the arguments of our function and call setName() on them. We'll also keep the pointer to x, y, and z around, since we'll need them when we get around to creating instructions.

      - -

      Great! We have a function now. But what good is a function if it has no body? Before we start working on a body for our new function, we need to recall some details of the LLVM IR. The IR, being an abstract assembly language, represents control flow using jumps (we call them branches), both conditional and unconditional. The straight-line sequences of code between branches are called basic blocks, or just blocks. To create a body for our function, we fill it with blocks:

      - -
      -
      -  BasicBlock* block = BasicBlock::Create(getGlobalContext(), "entry", mul_add);
      -  IRBuilder<> builder(block);
      -
      -
      - -

      We create a new basic block by calling the static BasicBlock::Create() factory method rather than a constructor. All we need to give it is the context, the block's name, and the function to which it belongs. In addition, we're creating an IRBuilder object, which is a convenience interface for creating instructions and appending them to the end of a block. Instructions can be created through their constructors as well, but some of their interfaces are quite complicated. Unless you need a lot of control, using IRBuilder will make your life simpler.

      - -
      -
      -  Value* tmp = builder.CreateBinOp(Instruction::Mul,
      -                                   x, y, "tmp");
      -  Value* tmp2 = builder.CreateBinOp(Instruction::Add,
      -                                    tmp, z, "tmp2");
      -
      -  builder.CreateRet(tmp2);
      -  
      -  return mod;
      -}
      -
      -
      - -

      The final step in creating our function is to create the instructions that make it up. Our mul_add function is composed of just three instructions: a multiply, an add, and a return. IRBuilder gives us a simple interface for constructing these instructions and appending them to the "entry" block. Each of the calls to IRBuilder returns a Value* that represents the value yielded by the instruction. You'll also notice that, above, x, y, and z are also Value*'s, so it's clear that instructions operate on Value*'s.
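      As a plain C++ analogy for the three instructions described above (this is not LLVM API code, only a sketch of the same data flow), each intermediate result can be written as its own named value. The local names tmp and tmp2 mirror the names passed to IRBuilder; the choice of a signed 32-bit C++ type is an assumption made for illustration, since LLVM integer types do not carry a sign.

```cpp
#include <cstdint>

// Plain C++ analogy of the generated function: one statement per
// LLVM instruction (mul, add, ret). Not LLVM API code.
std::int32_t mul_add(std::int32_t x, std::int32_t y, std::int32_t z) {
  const std::int32_t tmp = x * y;     // corresponds to the Mul instruction
  const std::int32_t tmp2 = tmp + z;  // corresponds to the Add instruction
  return tmp2;                        // corresponds to the Ret instruction
}
```

      Calling mul_add(2, 3, 4) computes 2 * 3 + 4, so each statement's result feeds the next one in the same way each Value* feeds the next IRBuilder call.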

      - -

      And that's it! Now you can compile and run your code, and get a wonderful textual print out of the LLVM IR we saw at the beginning. To compile, use the following command line as a guide:

      - -
      -
      -# c++ -g tut1.cpp `llvm-config --cxxflags --ldflags --libs core` -o tut1
      -# ./tut1
      -
      -
      - -

      The llvm-config utility is used to obtain the necessary GCC-compatible compiler flags for linking with LLVM. For this example, we only need the 'core' library. We'll use others once we start adding optimizers and the JIT engine.

      - -Next: A More Complicated Function -
      - - -
      -
      - Valid CSS! - Valid HTML 4.01! - - Owen Anderson
      - The LLVM Compiler Infrastructure
      - Last modified: $Date: 2009-07-21 11:05:13 -0700 (Tue, 21 Jul 2009) $ -
      - - - Removed: llvm/trunk/docs/tutorial/JITTutorial2-1.png URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/tutorial/JITTutorial2-1.png?rev=90110&view=auto ============================================================================== Binary file - no diff available. Removed: llvm/trunk/docs/tutorial/JITTutorial2.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/tutorial/JITTutorial2.html?rev=90110&view=auto ============================================================================== --- llvm/trunk/docs/tutorial/JITTutorial2.html (original) +++ llvm/trunk/docs/tutorial/JITTutorial2.html (removed) @@ -1,200 +0,0 @@ - - - - - LLVM Tutorial 2: A More Complicated Function - - - - - - - - -
      LLVM Tutorial 2: A More Complicated Function
      - -
      -

      Written by Owen Anderson

      -
      - - - - - -
      - -

      Now that we understand the basics of creating functions in LLVM, let's move on to a more complicated example: something with control flow. As an example, let's consider Euclid's Greatest Common Divisor (GCD) algorithm:

      - -
      -
      -unsigned gcd(unsigned x, unsigned y) {
      -  if(x == y) {
      -    return x;
      -  } else if(x < y) {
      -    return gcd(x, y - x);
      -  } else {
      -    return gcd(x - y, y);
      -  }
      -}
      -
      -
      - -

      With this example, we'll learn how to create functions with multiple blocks and control flow, and how to make function calls within your LLVM code. For starters, consider the diagram below.

      - -
      [Figure: control-flow graph (CFG) of the gcd function]
      - -

      This is a graphical representation of a program in LLVM IR. It places each basic block on a node of a graph and uses directed edges to indicate flow control. These blocks will be serialized when written to a text or bitcode file, but it is often useful conceptually to think of them as a graph. Again, if you are unsure about the code in the diagram, you should skim through the LLVM Language Reference Manual and convince yourself that it is, in fact, the GCD algorithm.

      - -

      The first part of our code is practically the same as from the first tutorial. The same basic setup is required: creating a module, verifying it, and running the PrintModulePass on it. Even the first segment of makeLLVMModule() looks essentially the same, except that gcd takes one fewer parameter than mul_add.

      - -
      -
      -#include "llvm/Module.h"
      -#include "llvm/Function.h"
      -#include "llvm/PassManager.h"
      -#include "llvm/Analysis/Verifier.h"
      -#include "llvm/Assembly/PrintModulePass.h"
      -#include "llvm/Support/IRBuilder.h"
      -#include "llvm/Support/raw_ostream.h"
      -
      -using namespace llvm;
      -
      -Module* makeLLVMModule();
      -
      -int main(int argc, char**argv) {
      -  Module* Mod = makeLLVMModule();
      -  
      -  verifyModule(*Mod, PrintMessageAction);
      -  
      -  PassManager PM;
      -  PM.add(createPrintModulePass(&outs()));
      -  PM.run(*Mod);
      -
      -  delete Mod;  
      -  return 0;
      -}
      -
      -Module* makeLLVMModule() {
      -  Module* mod = new Module("tut2", getGlobalContext());
      -  
      -  Constant* c = mod->getOrInsertFunction("gcd",
      -                                         IntegerType::get(getGlobalContext(), 32),
      -                                         IntegerType::get(getGlobalContext(), 32),
      -                                         IntegerType::get(getGlobalContext(), 32),
      -                                         NULL);
      -  Function* gcd = cast<Function>(c);
      -  
      -  Function::arg_iterator args = gcd->arg_begin();
      -  Value* x = args++;
      -  x->setName("x");
      -  Value* y = args++;
      -  y->setName("y");
      -
      -
      - -

      Here, however, is where our code begins to diverge from the first tutorial. Because gcd has control flow, it is composed of multiple blocks interconnected by branching (br) instructions. For those familiar with assembly language, a block is similar to a labeled set of instructions. For those not familiar with assembly language, a block is basically a set of instructions that can be branched to and is executed linearly until the block is terminated by one of a small number of control flow instructions, such as br or ret.

      - -

      Blocks correspond to the nodes in the diagram we looked at in the beginning of this tutorial. From the diagram, we can see that this function contains five blocks, so we'll go ahead and create them. Note that we're making use of LLVM's automatic name uniquing in this code sample, since we're giving two blocks the same name.

      - -
      -
      -  BasicBlock* entry = BasicBlock::Create(getGlobalContext(), "entry", gcd);
      -  BasicBlock* ret = BasicBlock::Create(getGlobalContext(), "return", gcd);
      -  BasicBlock* cond_false = BasicBlock::Create(getGlobalContext(), "cond_false", gcd);
      -  BasicBlock* cond_true = BasicBlock::Create(getGlobalContext(), "cond_true", gcd);
      -  BasicBlock* cond_false_2 = BasicBlock::Create(getGlobalContext(), "cond_false", gcd);
      -
      -
      - -
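      To relate the five blocks created above to the original C function, here is a rough sketch of gcd written in plain C++ with one label per block and explicit jumps between them. It is only an analogy, not LLVM API code: the function name gcd_blocks is made up for this illustration, and the assertion on nonzero inputs is an added assumption, because the subtraction-based algorithm does not terminate when exactly one argument is zero.

```cpp
#include <cassert>

// Plain C++ analogy of the five-block control-flow graph.
// Each label stands for one basic block; each goto or return stands
// for that block's terminator instruction.
unsigned gcd_blocks(unsigned x, unsigned y) {
  assert(x != 0 && y != 0);  // assumption: nonzero inputs only

  // entry: compare for equality, then branch conditionally
  if (x == y) goto ret;
  goto cond_false;

ret:
  return x;

cond_false:
  // second comparison: unsigned less-than
  if (x < y) goto cond_true;
  goto cond_false_2;

cond_true:
  return gcd_blocks(x, y - x);

cond_false_2:
  return gcd_blocks(x - y, y);
}
```

      Every block ends in exactly one control transfer (a jump or a return), which is the property that makes each labeled region a basic block.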

      Now we're ready to begin generating code! We'll start with the entry block. This block corresponds to the top-level if-statement in the original C code, so we need to compare x and y. To achieve this, we perform an explicit comparison using ICmpEQ. ICmpEQ stands for an integer comparison for equality and returns a 1-bit integer result. This 1-bit result is then used as the input to a conditional branch, with ret as the true and cond_false as the false case.

      - -
      -
      -  IRBuilder<> builder(entry);
      -  Value* xEqualsY = builder.CreateICmpEQ(x, y, "tmp");
      -  builder.CreateCondBr(xEqualsY, ret, cond_false);
      -
      -
      - -

      Our next block, ret, is pretty simple: it just returns the value of x. Recall that this block is only reached if x == y, so this is the correct behavior. Notice that instead of creating a new IRBuilder for each block, we can use SetInsertPoint to retarget our existing one. This saves on construction and memory allocation costs.

      - -
      -
      -  builder.SetInsertPoint(ret);
      -  builder.CreateRet(x);
      -
      -
      - -

      cond_false is a more interesting block: we now know that x != y, so we must branch again to determine which of x and y is larger. This is achieved using the ICmpULT instruction, which stands for integer comparison for unsigned less-than. In LLVM, integer types do not carry sign; a 32-bit integer pseudo-register can be interpreted as signed or unsigned without casting. Whether a signed or unsigned interpretation is desired is specified in the instruction. This is why several instructions in the LLVM IR, such as integer less-than, include a specifier for signed or unsigned.
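      The effect of choosing a signed or unsigned comparison on the same bits can be illustrated in plain C++ (a sketch, not LLVM API code). C++ attaches signedness to the type, so the example below reinterprets one 32-bit pattern both ways; the helper names less_unsigned and less_signed are made up for this illustration.

```cpp
#include <cstdint>

// Compare two 32-bit patterns as unsigned values, in the spirit of an
// unsigned less-than comparison.
bool less_unsigned(std::uint32_t a, std::uint32_t b) {
  return a < b;
}

// Compare the same two patterns as signed values, in the spirit of a
// signed less-than comparison. The conversion to int32_t wraps modulo
// 2^32 on two's-complement targets (guaranteed since C++20,
// implementation-defined before that).
bool less_signed(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::int32_t>(a) < static_cast<std::int32_t>(b);
}
```

      For the pattern 0xFFFFFFFF compared against 1, the unsigned comparison sees 4294967295 and answers false, while the signed comparison sees -1 and answers true. The bits are identical; only the comparison chosen differs.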

      - -

      Also note that we're again making use of LLVM's automatic name uniquing, this time at a register level. We've deliberately chosen to name every instruction "tmp" to illustrate that LLVM will give them all unique names without getting confused.

      - -
      -
      -  builder.SetInsertPoint(cond_false);
      -  Value* xLessThanY = builder.CreateICmpULT(x, y, "tmp");
      -  builder.CreateCondBr(xLessThanY, cond_true, cond_false_2);
      -
      -
      - -
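      The idea behind automatic name uniquing can be sketched with a small stand-alone class. This is a toy model written for illustration, not LLVM's implementation: the class name NameTable and the numeric-suffix scheme are assumptions, and the exact names LLVM picks may differ.

```cpp
#include <string>
#include <unordered_set>

// Toy model of name uniquing: hands back the requested name if it is
// free, otherwise appends the first numeric suffix that is not taken.
class NameTable {
 public:
  std::string unique(const std::string& requested) {
    // insert() reports through .second whether the name was new.
    if (used_.insert(requested).second) return requested;
    for (unsigned n = 1;; ++n) {
      const std::string candidate = requested + std::to_string(n);
      if (used_.insert(candidate).second) return candidate;
    }
  }

 private:
  std::unordered_set<std::string> used_;
};
```

      Requesting "tmp" three times from one NameTable yields "tmp", "tmp1", and "tmp2" in this sketch, which is why the tutorial code can reuse the same name for every instruction without conflicts.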

      Our last two blocks are quite similar; they're both recursive calls to gcd with different parameters. To create a call instruction, we have to create a vector (or any other container with InputIterators) to hold the arguments. We then pass in the beginning and ending iterators for this vector.

      - -
      -
      -  builder.SetInsertPoint(cond_true);
      -  Value* yMinusX = builder.CreateSub(y, x, "tmp");
      -  std::vector<Value*> args1;
      -  args1.push_back(x);
      -  args1.push_back(yMinusX);
      -  Value* recur_1 = builder.CreateCall(gcd, args1.begin(), args1.end(), "tmp");
      -  builder.CreateRet(recur_1);
      -  
      -  builder.SetInsertPoint(cond_false_2);
      -  Value* xMinusY = builder.CreateSub(x, y, "tmp");
      -  std::vector<Value*> args2;
      -  args2.push_back(xMinusY);
      -  args2.push_back(y);
      -  Value* recur_2 = builder.CreateCall(gcd, args2.begin(), args2.end(), "tmp");
      -  builder.CreateRet(recur_2);
      -  
      -  return mod;
      -}
      -
      -
      - -

      And that's it! You can compile and execute your code in the same way as before, by doing:

      - -
      -
      -# c++ -g tut2.cpp `llvm-config --cxxflags --ldflags --libs core` -o tut2
      -# ./tut2
      -
      -
      - -
      - - -
      -
      - Valid CSS! - Valid HTML 4.01! - - Owen Anderson