From isanbard at gmail.com Mon Oct 4 00:30:24 2010 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 04 Oct 2010 05:30:24 -0000 Subject: [llvm-commits] [www] r115496 - /www/trunk/Users.html Message-ID: <20101004053024.5B4692A6C12E@llvm.org> Author: void Date: Mon Oct 4 00:30:24 2010 New Revision: 115496 URL: http://llvm.org/viewvc/llvm-project?rev=115496&view=rev Log: Use HTML entities for non-latinate characters. Modified: www/trunk/Users.html Modified: www/trunk/Users.html URL: http://llvm.org/viewvc/llvm-project/www/trunk/Users.html?rev=115496&r1=115495&r2=115496&view=diff ============================================================================== --- www/trunk/Users.html (original) +++ www/trunk/Users.html Mon Oct 4 00:30:24 2010 @@ -402,9 +402,9 @@ - Institut d'Electronique et T??l??communications de Rennes
+ Institut d'Electronique et Télécommunications de Rennes
ARTEMIS - Institut Telecom/Telecom SudParis - Micka??l Raulet, Matthieu Wipliez, J??r??me Gorin + Mickaël Raulet, Matthieu Wipliez, Jérôme Gorin From stoklund at 2pi.dk Mon Oct 4 00:39:02 2010 From: stoklund at 2pi.dk (Jakob Stoklund Olesen) Date: Sun, 3 Oct 2010 22:39:02 -0700 Subject: [llvm-commits] [llvm] r115495 - /llvm/trunk/docs/ReleaseNotes.html In-Reply-To: <20101004043925.632362A6C12E@llvm.org> References: <20101004043925.632362A6C12E@llvm.org> Message-ID: On Oct 3, 2010, at 9:39 PM, Chris Lattner wrote: > +
  • The new SubRegIndex tablegen class allows subregisters to be indexed > + symbolically instead of numerically. If your target uses subregisters you > + will need to adapt to use SubRegIndex when you upgrade to 2.8.
  • Yup. > I don't think this is worth mentioning in the release notes since it doesn't really work yet. /jakob -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1929 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20101003/0ba83854/attachment.bin From nlewycky at google.com Mon Oct 4 01:39:49 2010 From: nlewycky at google.com (Nick Lewycky) Date: Sun, 3 Oct 2010 23:39:49 -0700 Subject: [llvm-commits] [llvm] r115393 - in /llvm/trunk: CMakeLists.txt lib/Target/MSP430/InstPrinter/CMakeLists.txt lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h lib/Target/MSP430/InstPrinter/Makefi In-Reply-To: References: <07E65C41-D605-4147-B976-1BC502C5BF68@apple.com> Message-ID: Here's what the GenLibDeps.pl script thinks is going on. libLLVMMSP430CodeGen.a uses but does not define the symbol: _ZN4llvm17MSP430InstPrinter15getRegisterNameEj aka. llvm::MSP430InstPrinter::getRegisterName(unsigned int) which is provided by libLLVMMSP430AsmPrinter.a. Going in the other direction, libLLVMMSP430AsmPrinter.a uses symbols: _ZN4llvm6MSP43011GR8RegClassE aka. llvm::MSP430::GR8RegClass _ZN4llvm6MSP43012GR16RegClassE aka. llvm::MSP430::GR16RegClass GR8RegClass and GR16RegClass are defined in MSP430RegisterInfo.o (which is rolled into ...CodeGen.a). Their only reference in ...AsmPrinter.a is by MSP430InstPrinter.o. The getRegisterName function is defined in MSP430InstPrinter.o (part of ...AsmPrinter.a) and its only reference in ...CodeGen.a is by MSP430AsmPrinter.o. Is that enough to go on? Nick On 1 October 2010 18:43, Nick Lewycky wrote: > Okay. You can see that almost all of the open-source builders were broken: > > http://google1.osuosl.org:8011/console > > in that time. It's impossible for > this particular error to occur in a cmake build because cmake doesn't run > find-cycles.pl (last I checked). 
My suspicion is that the cmake builders > were working fine while configure+make ones were not? > > I'm going to wind back to the broken point and try to reproduce the failure > and see if I can figure out what the cyclic dependency actually was. > > Nick > > > On 1 October 2010 18:27, Jim Grosbach wrote: > >> That's very strange. I do a configure/make here, and it works, and lots of >> bots using that were green as well. If there's a case I missed, I'd love to >> have some help tracking down what it is. Can you try a "make clean" and see >> if that works? Maybe there's just something stale that the configure portion >> of the patch needs to clean up. >> >> -Jim >> >> >> >> On Oct 1, 2010, at 6:24 PM, Nick Lewycky wrote: >> >> Nope, it broke under a regular configure+make in-srctree incremental build >> on multiple different machines. >> >> On 1 October 2010 18:22, Jim Grosbach wrote: >> >>> Nick, >>> >>> These only break for you under CMake, right? That's the only place I've >>> been able to reproduce failures. >>> >>> -Jim >>> >>> >>> On Oct 1, 2010, at 6:06 PM, Nick Lewycky wrote: >>> >>> > Author: nicholas >>> > Date: Fri Oct 1 20:06:42 2010 >>> > New Revision: 115393 >>> > >>> > URL: http://llvm.org/viewvc/llvm-project?rev=115393&view=rev >>> > Log: >>> > Revert patches r115363 r115367 r115391 due to build breakage: >>> > llvm[2]: Updated LibDeps.txt because dependencies changed >>> > llvm[2]: Checking for cyclic dependencies between LLVM libraries. 
>>> > find-cycles.pl: Circular dependency between *.a files: >>> > find-cycles.pl: libLLVMMSP430AsmPrinter.a libLLVMMSP430CodeGen.a >>> > >>> > >>> > Modified: >>> > llvm/trunk/CMakeLists.txt >>> > llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt >>> > llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp >>> > llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h >>> > llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile >>> > llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp >>> > llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp >>> > llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h >>> > llvm/trunk/lib/Target/MSP430/Makefile >>> > >>> > Modified: llvm/trunk/CMakeLists.txt >>> > URL: >>> http://llvm.org/viewvc/llvm-project/llvm/trunk/CMakeLists.txt?rev=115393&r1=115392&r2=115393&view=diff >>> > >>> ============================================================================== >>> > --- llvm/trunk/CMakeLists.txt (original) >>> > +++ llvm/trunk/CMakeLists.txt Fri Oct 1 20:06:42 2010 >>> > @@ -323,10 +323,6 @@ >>> > add_subdirectory(lib/Target/${t}/AsmPrinter) >>> > set(LLVM_ENUM_ASM_PRINTERS >>> > "${LLVM_ENUM_ASM_PRINTERS}LLVM_ASM_PRINTER(${t})\n") >>> > - if( EXISTS >>> ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/InstPrinter/CMakeLists.txt ) >>> > - add_subdirectory(lib/Target/${t}/InstPrinter) >>> > - set(LLVM_ENUM_ASM_PRINTERS >>> > - "${LLVM_ENUM_ASM_PRINTERS}LLVM_ASM_PRINTER(${t})\n") >>> > endif( EXISTS >>> ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/AsmPrinter/CMakeLists.txt ) >>> > if( EXISTS >>> ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/AsmParser/CMakeLists.txt ) >>> > add_subdirectory(lib/Target/${t}/AsmParser) >>> > >>> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt >>> > URL: >>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt?rev=115393&r1=115392&r2=115393&view=diff >>> > >>> ============================================================================== >>> > --- 
llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt (original) >>> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt Fri Oct 1 >>> 20:06:42 2010 >>> > @@ -1,6 +0,0 @@ >>> > -include_directories( ${CMAKE_CURRENT_BINARY_DIR}/.. >>> ${CMAKE_CURRENT_SOURCE_DIR}/.. ) >>> > - >>> > -add_llvm_library(LLVMMSP430AsmPrinter >>> > - MSP430InstPrinter.cpp >>> > - ) >>> > -add_dependencies(LLVMMSP430AsmPrinter MSP430CodeGenTable_gen) >>> > >>> > Modified: >>> llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp >>> > URL: >>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp?rev=115393&r1=115392&r2=115393&view=diff >>> > >>> ============================================================================== >>> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp >>> (original) >>> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp Fri >>> Oct 1 20:06:42 2010 >>> > @@ -1,114 +0,0 @@ >>> > -//===-- MSP430InstPrinter.cpp - Convert MSP430 MCInst to assembly >>> syntax --===// >>> > -// >>> > -// The LLVM Compiler Infrastructure >>> > -// >>> > -// This file is distributed under the University of Illinois Open >>> Source >>> > -// License. See LICENSE.TXT for details. >>> > -// >>> > >>> -//===----------------------------------------------------------------------===// >>> > -// >>> > -// This class prints an MSP430 MCInst to a .s file. 
>>> > -// >>> > >>> -//===----------------------------------------------------------------------===// >>> > - >>> > -#define DEBUG_TYPE "asm-printer" >>> > -#include "MSP430.h" >>> > -#include "MSP430InstrInfo.h" >>> > -#include "MSP430InstPrinter.h" >>> > -#include "llvm/MC/MCInst.h" >>> > -#include "llvm/MC/MCAsmInfo.h" >>> > -#include "llvm/MC/MCExpr.h" >>> > -#include "llvm/Support/ErrorHandling.h" >>> > -#include "llvm/Support/FormattedStream.h" >>> > -using namespace llvm; >>> > - >>> > - >>> > -// Include the auto-generated portion of the assembly writer. >>> > -#include "MSP430GenAsmWriter.inc" >>> > - >>> > -void MSP430InstPrinter::printInst(const MCInst *MI, raw_ostream &O) { >>> > - printInstruction(MI, O); >>> > -} >>> > - >>> > -void MSP430InstPrinter::printPCRelImmOperand(const MCInst *MI, >>> unsigned OpNo, >>> > - raw_ostream &O) { >>> > - const MCOperand &Op = MI->getOperand(OpNo); >>> > - if (Op.isImm()) >>> > - O << Op.getImm(); >>> > - else { >>> > - assert(Op.isExpr() && "unknown pcrel immediate operand"); >>> > - O << *Op.getExpr(); >>> > - } >>> > -} >>> > - >>> > -void MSP430InstPrinter::printOperand(const MCInst *MI, unsigned OpNo, >>> > - raw_ostream &O, const char >>> *Modifier) { >>> > - assert((Modifier == 0 || Modifier[0] == 0) && "No modifiers >>> supported"); >>> > - const MCOperand &Op = MI->getOperand(OpNo); >>> > - if (Op.isReg()) { >>> > - O << getRegisterName(Op.getReg()); >>> > - } else if (Op.isImm()) { >>> > - O << '#' << Op.getImm(); >>> > - } else { >>> > - assert(Op.isExpr() && "unknown operand kind in printOperand"); >>> > - O << '#' << *Op.getExpr(); >>> > - } >>> > -} >>> > - >>> > -void MSP430InstPrinter::printSrcMemOperand(const MCInst *MI, unsigned >>> OpNo, >>> > - raw_ostream &O, >>> > - const char *Modifier) { >>> > - const MCOperand &Base = MI->getOperand(OpNo); >>> > - const MCOperand &Disp = MI->getOperand(OpNo+1); >>> > - >>> > - // Print displacement first >>> > - >>> > - // If the global address expression 
is a part of displacement field >>> with a >>> > - // register base, we should not emit any prefix symbol here, e.g. >>> > - // mov.w &foo, r1 >>> > - // vs >>> > - // mov.w glb(r1), r2 >>> > - // Otherwise (!) msp430-as will silently miscompile the output :( >>> > - if (!Base.getReg()) >>> > - O << '&'; >>> > - >>> > - if (Disp.isExpr()) >>> > - O << *Disp.getExpr(); >>> > - else { >>> > - assert(Disp.isImm() && "Expected immediate in displacement >>> field"); >>> > - O << Disp.getImm(); >>> > - } >>> > - >>> > - // Print register base field >>> > - if (Base.getReg()) >>> > - O << '(' << getRegisterName(Base.getReg()) << ')'; >>> > -} >>> > - >>> > -void MSP430InstPrinter::printCCOperand(const MCInst *MI, unsigned >>> OpNo, >>> > - raw_ostream &O) { >>> > - unsigned CC = MI->getOperand(OpNo).getImm(); >>> > - >>> > - switch (CC) { >>> > - default: >>> > - llvm_unreachable("Unsupported CC code"); >>> > - break; >>> > - case MSP430CC::COND_E: >>> > - O << "eq"; >>> > - break; >>> > - case MSP430CC::COND_NE: >>> > - O << "ne"; >>> > - break; >>> > - case MSP430CC::COND_HS: >>> > - O << "hs"; >>> > - break; >>> > - case MSP430CC::COND_LO: >>> > - O << "lo"; >>> > - break; >>> > - case MSP430CC::COND_GE: >>> > - O << "ge"; >>> > - break; >>> > - case MSP430CC::COND_L: >>> > - O << 'l'; >>> > - break; >>> > - } >>> > -} >>> > >>> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h >>> > URL: >>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h?rev=115393&r1=115392&r2=115393&view=diff >>> > >>> ============================================================================== >>> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h >>> (original) >>> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h Fri >>> Oct 1 20:06:42 2010 >>> > @@ -1,43 +0,0 @@ >>> > -//===-- MSP430InstPrinter.h - Convert MSP430 MCInst to assembly syntax >>> ----===// >>> > -// >>> > -// The LLVM Compiler 
Infrastructure >>> > -// >>> > -// This file is distributed under the University of Illinois Open >>> Source >>> > -// License. See LICENSE.TXT for details. >>> > -// >>> > >>> -//===----------------------------------------------------------------------===// >>> > -// >>> > -// This class prints a MSP430 MCInst to a .s file. >>> > -// >>> > >>> -//===----------------------------------------------------------------------===// >>> > - >>> > -#ifndef MSP430INSTPRINTER_H >>> > -#define MSP430INSTPRINTER_H >>> > - >>> > -#include "llvm/MC/MCInstPrinter.h" >>> > - >>> > -namespace llvm { >>> > - class MCOperand; >>> > - >>> > - class MSP430InstPrinter : public MCInstPrinter { >>> > - public: >>> > - MSP430InstPrinter(const MCAsmInfo &MAI) : MCInstPrinter(MAI) { >>> > - } >>> > - >>> > - virtual void printInst(const MCInst *MI, raw_ostream &O); >>> > - >>> > - // Autogenerated by tblgen. >>> > - void printInstruction(const MCInst *MI, raw_ostream &O); >>> > - static const char *getRegisterName(unsigned RegNo); >>> > - >>> > - void printOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O, >>> > - const char *Modifier = 0); >>> > - void printPCRelImmOperand(const MCInst *MI, unsigned OpNo, >>> raw_ostream &O); >>> > - void printSrcMemOperand(const MCInst *MI, unsigned OpNo, >>> raw_ostream &O, >>> > - const char *Modifier = 0); >>> > - void printCCOperand(const MCInst *MI, unsigned OpNo, raw_ostream >>> &O); >>> > - >>> > - }; >>> > -} >>> > - >>> > -#endif >>> > >>> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile >>> > URL: >>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile?rev=115393&r1=115392&r2=115393&view=diff >>> > >>> ============================================================================== >>> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile (original) >>> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile Fri Oct 1 >>> 20:06:42 2010 >>> > @@ -1,15 +0,0 @@ >>> > -##===- 
lib/Target/MSP430/AsmPrinter/Makefile ---------------*- >>> Makefile -*-===## >>> > -# >>> > -# The LLVM Compiler Infrastructure >>> > -# >>> > -# This file is distributed under the University of Illinois Open >>> Source >>> > -# License. See LICENSE.TXT for details. >>> > -# >>> > >>> -##===----------------------------------------------------------------------===## >>> > -LEVEL = ../../../.. >>> > -LIBRARYNAME = LLVMMSP430AsmPrinter >>> > - >>> > -# Hack: we need to include 'main' MSP430 target directory to grab >>> private headers >>> > -CPP.Flags += -I$(PROJ_OBJ_DIR)/.. -I$(PROJ_SRC_DIR)/.. >>> > - >>> > -include $(LEVEL)/Makefile.common >>> > >>> > Modified: llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp >>> > URL: >>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp?rev=115393&r1=115392&r2=115393&view=diff >>> > >>> ============================================================================== >>> > --- llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp (original) >>> > +++ llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp Fri Oct 1 >>> 20:06:42 2010 >>> > @@ -1,179 +0,0 @@ >>> > -//===-- MSP430AsmPrinter.cpp - MSP430 LLVM assembly writer >>> ----------------===// >>> > -// >>> > -// The LLVM Compiler Infrastructure >>> > -// >>> > -// This file is distributed under the University of Illinois Open >>> Source >>> > -// License. See LICENSE.TXT for details. >>> > -// >>> > >>> -//===----------------------------------------------------------------------===// >>> > -// >>> > -// This file contains a printer that converts from our internal >>> representation >>> > -// of machine-dependent LLVM code to the MSP430 assembly language. 
>>> > -// >>> > >>> -//===----------------------------------------------------------------------===// >>> > - >>> > -#define DEBUG_TYPE "asm-printer" >>> > -#include "MSP430.h" >>> > -#include "MSP430InstrInfo.h" >>> > -#include "InstPrinter/MSP430InstPrinter.h" >>> > -#include "MSP430MCAsmInfo.h" >>> > -#include "MSP430MCInstLower.h" >>> > -#include "MSP430TargetMachine.h" >>> > -#include "llvm/Constants.h" >>> > -#include "llvm/DerivedTypes.h" >>> > -#include "llvm/Module.h" >>> > -#include "llvm/Assembly/Writer.h" >>> > -#include "llvm/CodeGen/AsmPrinter.h" >>> > -#include "llvm/CodeGen/MachineModuleInfo.h" >>> > -#include "llvm/CodeGen/MachineFunctionPass.h" >>> > -#include "llvm/CodeGen/MachineConstantPool.h" >>> > -#include "llvm/CodeGen/MachineInstr.h" >>> > -#include "llvm/MC/MCInst.h" >>> > -#include "llvm/MC/MCStreamer.h" >>> > -#include "llvm/MC/MCSymbol.h" >>> > -#include "llvm/Target/Mangler.h" >>> > -#include "llvm/Target/TargetData.h" >>> > -#include "llvm/Target/TargetLoweringObjectFile.h" >>> > -#include "llvm/Target/TargetRegistry.h" >>> > -#include "llvm/Support/raw_ostream.h" >>> > -using namespace llvm; >>> > - >>> > -namespace { >>> > - class MSP430AsmPrinter : public AsmPrinter { >>> > - public: >>> > - MSP430AsmPrinter(TargetMachine &TM, MCStreamer &Streamer) >>> > - : AsmPrinter(TM, Streamer) {} >>> > - >>> > - virtual const char *getPassName() const { >>> > - return "MSP430 Assembly Printer"; >>> > - } >>> > - >>> > - void printOperand(const MachineInstr *MI, int OpNum, >>> > - raw_ostream &O, const char* Modifier = 0); >>> > - void printSrcMemOperand(const MachineInstr *MI, int OpNum, >>> > - raw_ostream &O); >>> > - bool PrintAsmOperand(const MachineInstr *MI, unsigned OpNo, >>> > - unsigned AsmVariant, const char *ExtraCode, >>> > - raw_ostream &O); >>> > - bool PrintAsmMemoryOperand(const MachineInstr *MI, >>> > - unsigned OpNo, unsigned AsmVariant, >>> > - const char *ExtraCode, raw_ostream &O); >>> > - void EmitInstruction(const 
MachineInstr *MI); >>> > - }; >>> > -} // end of anonymous namespace >>> > - >>> > - >>> > -void MSP430AsmPrinter::printOperand(const MachineInstr *MI, int OpNum, >>> > - raw_ostream &O, const char >>> *Modifier) { >>> > - const MachineOperand &MO = MI->getOperand(OpNum); >>> > - switch (MO.getType()) { >>> > - default: assert(0 && "Not implemented yet!"); >>> > - case MachineOperand::MO_Register: >>> > - O << MSP430InstPrinter::getRegisterName(MO.getReg()); >>> > - return; >>> > - case MachineOperand::MO_Immediate: >>> > - if (!Modifier || strcmp(Modifier, "nohash")) >>> > - O << '#'; >>> > - O << MO.getImm(); >>> > - return; >>> > - case MachineOperand::MO_MachineBasicBlock: >>> > - O << *MO.getMBB()->getSymbol(); >>> > - return; >>> > - case MachineOperand::MO_GlobalAddress: { >>> > - bool isMemOp = Modifier && !strcmp(Modifier, "mem"); >>> > - uint64_t Offset = MO.getOffset(); >>> > - >>> > - // If the global address expression is a part of displacement >>> field with a >>> > - // register base, we should not emit any prefix symbol here, e.g. >>> > - // mov.w &foo, r1 >>> > - // vs >>> > - // mov.w glb(r1), r2 >>> > - // Otherwise (!) msp430-as will silently miscompile the output :( >>> > - if (!Modifier || strcmp(Modifier, "nohash")) >>> > - O << (isMemOp ? '&' : '#'); >>> > - if (Offset) >>> > - O << '(' << Offset << '+'; >>> > - >>> > - O << *Mang->getSymbol(MO.getGlobal()); >>> > - >>> > - if (Offset) >>> > - O << ')'; >>> > - >>> > - return; >>> > - } >>> > - case MachineOperand::MO_ExternalSymbol: { >>> > - bool isMemOp = Modifier && !strcmp(Modifier, "mem"); >>> > - O << (isMemOp ? 
'&' : '#'); >>> > - O << MAI->getGlobalPrefix() << MO.getSymbolName(); >>> > - return; >>> > - } >>> > - } >>> > -} >>> > - >>> > -void MSP430AsmPrinter::printSrcMemOperand(const MachineInstr *MI, int >>> OpNum, >>> > - raw_ostream &O) { >>> > - const MachineOperand &Base = MI->getOperand(OpNum); >>> > - const MachineOperand &Disp = MI->getOperand(OpNum+1); >>> > - >>> > - // Print displacement first >>> > - >>> > - // Imm here is in fact global address - print extra modifier. >>> > - if (Disp.isImm() && !Base.getReg()) >>> > - O << '&'; >>> > - printOperand(MI, OpNum+1, O, "nohash"); >>> > - >>> > - // Print register base field >>> > - if (Base.getReg()) { >>> > - O << '('; >>> > - printOperand(MI, OpNum, O); >>> > - O << ')'; >>> > - } >>> > -} >>> > - >>> > -/// PrintAsmOperand - Print out an operand for an inline asm >>> expression. >>> > -/// >>> > -bool MSP430AsmPrinter::PrintAsmOperand(const MachineInstr *MI, >>> unsigned OpNo, >>> > - unsigned AsmVariant, >>> > - const char *ExtraCode, >>> raw_ostream &O) { >>> > - // Does this asm operand have a single letter operand modifier? >>> > - if (ExtraCode && ExtraCode[0]) >>> > - return true; // Unknown modifier. >>> > - >>> > - printOperand(MI, OpNo, O); >>> > - return false; >>> > -} >>> > - >>> > -bool MSP430AsmPrinter::PrintAsmMemoryOperand(const MachineInstr *MI, >>> > - unsigned OpNo, unsigned >>> AsmVariant, >>> > - const char *ExtraCode, >>> > - raw_ostream &O) { >>> > - if (ExtraCode && ExtraCode[0]) { >>> > - return true; // Unknown modifier. 
>>> > - } >>> > - printSrcMemOperand(MI, OpNo, O); >>> > - return false; >>> > -} >>> > - >>> > >>> -//===----------------------------------------------------------------------===// >>> > -void MSP430AsmPrinter::EmitInstruction(const MachineInstr *MI) { >>> > - MSP430MCInstLower MCInstLowering(OutContext, *Mang, *this); >>> > - >>> > - MCInst TmpInst; >>> > - MCInstLowering.Lower(MI, TmpInst); >>> > - OutStreamer.EmitInstruction(TmpInst); >>> > -} >>> > - >>> > -static MCInstPrinter *createMSP430MCInstPrinter(const Target &T, >>> > - unsigned >>> SyntaxVariant, >>> > - const MCAsmInfo &MAI) >>> { >>> > - if (SyntaxVariant == 0) >>> > - return new MSP430InstPrinter(MAI); >>> > - return 0; >>> > -} >>> > - >>> > -// Force static initialization. >>> > -extern "C" void LLVMInitializeMSP430AsmPrinter() { >>> > - RegisterAsmPrinter X(TheMSP430Target); >>> > - TargetRegistry::RegisterMCInstPrinter(TheMSP430Target, >>> > - createMSP430MCInstPrinter); >>> > -} >>> > >>> > Modified: llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp >>> > URL: >>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp?rev=115393&r1=115392&r2=115393&view=diff >>> > >>> ============================================================================== >>> > --- llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp (original) >>> > +++ llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp Fri Oct 1 >>> 20:06:42 2010 >>> > @@ -1,150 +0,0 @@ >>> > -//===-- MSP430MCInstLower.cpp - Convert MSP430 MachineInstr to an >>> MCInst---===// >>> > -// >>> > -// The LLVM Compiler Infrastructure >>> > -// >>> > -// This file is distributed under the University of Illinois Open >>> Source >>> > -// License. See LICENSE.TXT for details. >>> > -// >>> > >>> -//===----------------------------------------------------------------------===// >>> > -// >>> > -// This file contains code to lower MSP430 MachineInstrs to their >>> corresponding >>> > -// MCInst records. 
>>> > -// >>> > >>> -//===----------------------------------------------------------------------===// >>> > - >>> > -#include "MSP430MCInstLower.h" >>> > -#include "llvm/CodeGen/AsmPrinter.h" >>> > -#include "llvm/CodeGen/MachineBasicBlock.h" >>> > -#include "llvm/CodeGen/MachineInstr.h" >>> > -#include "llvm/MC/MCAsmInfo.h" >>> > -#include "llvm/MC/MCContext.h" >>> > -#include "llvm/MC/MCExpr.h" >>> > -#include "llvm/MC/MCInst.h" >>> > -#include "llvm/Target/Mangler.h" >>> > -#include "llvm/Support/raw_ostream.h" >>> > -#include "llvm/Support/ErrorHandling.h" >>> > -#include "llvm/ADT/SmallString.h" >>> > -using namespace llvm; >>> > - >>> > -MCSymbol *MSP430MCInstLower:: >>> > -GetGlobalAddressSymbol(const MachineOperand &MO) const { >>> > - switch (MO.getTargetFlags()) { >>> > - default: llvm_unreachable("Unknown target flag on GV operand"); >>> > - case 0: break; >>> > - } >>> > - >>> > - return Printer.Mang->getSymbol(MO.getGlobal()); >>> > -} >>> > - >>> > -MCSymbol *MSP430MCInstLower:: >>> > -GetExternalSymbolSymbol(const MachineOperand &MO) const { >>> > - switch (MO.getTargetFlags()) { >>> > - default: assert(0 && "Unknown target flag on GV operand"); >>> > - case 0: break; >>> > - } >>> > - >>> > - return Printer.GetExternalSymbolSymbol(MO.getSymbolName()); >>> > -} >>> > - >>> > -MCSymbol *MSP430MCInstLower:: >>> > -GetJumpTableSymbol(const MachineOperand &MO) const { >>> > - SmallString<256> Name; >>> > - raw_svector_ostream(Name) << Printer.MAI->getPrivateGlobalPrefix() >>> << "JTI" >>> > - << Printer.getFunctionNumber() << '_' >>> > - << MO.getIndex(); >>> > - >>> > - switch (MO.getTargetFlags()) { >>> > - default: llvm_unreachable("Unknown target flag on GV operand"); >>> > - case 0: break; >>> > - } >>> > - >>> > - // Create a symbol for the name. 
>>> > - return Ctx.GetOrCreateSymbol(Name.str()); >>> > -} >>> > - >>> > -MCSymbol *MSP430MCInstLower:: >>> > -GetConstantPoolIndexSymbol(const MachineOperand &MO) const { >>> > - SmallString<256> Name; >>> > - raw_svector_ostream(Name) << Printer.MAI->getPrivateGlobalPrefix() >>> << "CPI" >>> > - << Printer.getFunctionNumber() << '_' >>> > - << MO.getIndex(); >>> > - >>> > - switch (MO.getTargetFlags()) { >>> > - default: llvm_unreachable("Unknown target flag on GV operand"); >>> > - case 0: break; >>> > - } >>> > - >>> > - // Create a symbol for the name. >>> > - return Ctx.GetOrCreateSymbol(Name.str()); >>> > -} >>> > - >>> > -MCSymbol *MSP430MCInstLower:: >>> > -GetBlockAddressSymbol(const MachineOperand &MO) const { >>> > - switch (MO.getTargetFlags()) { >>> > - default: assert(0 && "Unknown target flag on GV operand"); >>> > - case 0: break; >>> > - } >>> > - >>> > - return Printer.GetBlockAddressSymbol(MO.getBlockAddress()); >>> > -} >>> > - >>> > -MCOperand MSP430MCInstLower:: >>> > -LowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym) const { >>> > - // FIXME: We would like an efficient form for this, so we don't have >>> to do a >>> > - // lot of extra uniquing. 
>>> > - const MCExpr *Expr = MCSymbolRefExpr::Create(Sym, Ctx); >>> > - >>> > - switch (MO.getTargetFlags()) { >>> > - default: llvm_unreachable("Unknown target flag on GV operand"); >>> > - case 0: break; >>> > - } >>> > - >>> > - if (!MO.isJTI() && MO.getOffset()) >>> > - Expr = MCBinaryExpr::CreateAdd(Expr, >>> > - >>> MCConstantExpr::Create(MO.getOffset(), Ctx), >>> > - Ctx); >>> > - return MCOperand::CreateExpr(Expr); >>> > -} >>> > - >>> > -void MSP430MCInstLower::Lower(const MachineInstr *MI, MCInst &OutMI) >>> const { >>> > - OutMI.setOpcode(MI->getOpcode()); >>> > - >>> > - for (unsigned i = 0, e = MI->getNumOperands(); i != e; ++i) { >>> > - const MachineOperand &MO = MI->getOperand(i); >>> > - >>> > - MCOperand MCOp; >>> > - switch (MO.getType()) { >>> > - default: >>> > - MI->dump(); >>> > - assert(0 && "unknown operand type"); >>> > - case MachineOperand::MO_Register: >>> > - // Ignore all implicit register operands. >>> > - if (MO.isImplicit()) continue; >>> > - MCOp = MCOperand::CreateReg(MO.getReg()); >>> > - break; >>> > - case MachineOperand::MO_Immediate: >>> > - MCOp = MCOperand::CreateImm(MO.getImm()); >>> > - break; >>> > - case MachineOperand::MO_MachineBasicBlock: >>> > - MCOp = MCOperand::CreateExpr(MCSymbolRefExpr::Create( >>> > - MO.getMBB()->getSymbol(), Ctx)); >>> > - break; >>> > - case MachineOperand::MO_GlobalAddress: >>> > - MCOp = LowerSymbolOperand(MO, GetGlobalAddressSymbol(MO)); >>> > - break; >>> > - case MachineOperand::MO_ExternalSymbol: >>> > - MCOp = LowerSymbolOperand(MO, GetExternalSymbolSymbol(MO)); >>> > - break; >>> > - case MachineOperand::MO_JumpTableIndex: >>> > - MCOp = LowerSymbolOperand(MO, GetJumpTableSymbol(MO)); >>> > - break; >>> > - case MachineOperand::MO_ConstantPoolIndex: >>> > - MCOp = LowerSymbolOperand(MO, GetConstantPoolIndexSymbol(MO)); >>> > - break; >>> > - case MachineOperand::MO_BlockAddress: >>> > - MCOp = LowerSymbolOperand(MO, GetBlockAddressSymbol(MO)); >>> > - } >>> > - >>> > - 
OutMI.addOperand(MCOp); >>> > - } >>> > -} >>> > >>> > Modified: llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h >>> > URL: >>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h?rev=115393&r1=115392&r2=115393&view=diff >>> > >>> ============================================================================== >>> > --- llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h (original) >>> > +++ llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h Fri Oct 1 >>> 20:06:42 2010 >>> > @@ -1,50 +0,0 @@ >>> > -//===-- MSP430MCInstLower.h - Lower MachineInstr to MCInst >>> ----------------===// >>> > -// >>> > -// The LLVM Compiler Infrastructure >>> > -// >>> > -// This file is distributed under the University of Illinois Open >>> Source >>> > -// License. See LICENSE.TXT for details. >>> > -// >>> > >>> -//===----------------------------------------------------------------------===// >>> > - >>> > -#ifndef MSP430_MCINSTLOWER_H >>> > -#define MSP430_MCINSTLOWER_H >>> > - >>> > -#include "llvm/Support/Compiler.h" >>> > - >>> > -namespace llvm { >>> > - class AsmPrinter; >>> > - class MCAsmInfo; >>> > - class MCContext; >>> > - class MCInst; >>> > - class MCOperand; >>> > - class MCSymbol; >>> > - class MachineInstr; >>> > - class MachineModuleInfoMachO; >>> > - class MachineOperand; >>> > - class Mangler; >>> > - >>> > - /// MSP430MCInstLower - This class is used to lower an MachineInstr >>> > - /// into an MCInst. 
>>> > -class LLVM_LIBRARY_VISIBILITY MSP430MCInstLower { >>> > - MCContext &Ctx; >>> > - Mangler &Mang; >>> > - >>> > - AsmPrinter &Printer; >>> > -public: >>> > - MSP430MCInstLower(MCContext &ctx, Mangler &mang, AsmPrinter >>> &printer) >>> > - : Ctx(ctx), Mang(mang), Printer(printer) {} >>> > - void Lower(const MachineInstr *MI, MCInst &OutMI) const; >>> > - >>> > - MCOperand LowerSymbolOperand(const MachineOperand &MO, MCSymbol >>> *Sym) const; >>> > - >>> > - MCSymbol *GetGlobalAddressSymbol(const MachineOperand &MO) const; >>> > - MCSymbol *GetExternalSymbolSymbol(const MachineOperand &MO) const; >>> > - MCSymbol *GetJumpTableSymbol(const MachineOperand &MO) const; >>> > - MCSymbol *GetConstantPoolIndexSymbol(const MachineOperand &MO) >>> const; >>> > - MCSymbol *GetBlockAddressSymbol(const MachineOperand &MO) const; >>> > -}; >>> > - >>> > -} >>> > - >>> > -#endif >>> > >>> > Modified: llvm/trunk/lib/Target/MSP430/Makefile >>> > URL: >>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/Makefile?rev=115393&r1=115392&r2=115393&view=diff >>> > >>> ============================================================================== >>> > --- llvm/trunk/lib/Target/MSP430/Makefile (original) >>> > +++ llvm/trunk/lib/Target/MSP430/Makefile Fri Oct 1 20:06:42 2010 >>> > @@ -18,7 +18,7 @@ >>> > MSP430GenDAGISel.inc MSP430GenCallingConv.inc \ >>> > MSP430GenSubtarget.inc >>> > >>> > -DIRS = InstPrinter TargetInfo >>> > +DIRS = AsmPrinter TargetInfo >>> > >>> > include $(LEVEL)/Makefile.common >>> > >>> > >>> > >>> > _______________________________________________ >>> > llvm-commits mailing list >>> > llvm-commits at cs.uiuc.edu >>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >>> >>> >>> _______________________________________________ >>> llvm-commits mailing list >>> llvm-commits at cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >>> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20101003/02c3c775/attachment.html From geek4civic at gmail.com Mon Oct 4 01:59:01 2010 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Mon, 4 Oct 2010 15:59:01 +0900 Subject: [llvm-commits] [PATCH] FileCheck.cpp: made match regex '$' for DOSish \r\n Message-ID: Some tests have CHECK: {{foobar$}} to cause mismatch failure on mingw. I took the way to eliminate \r on MemoryBuffer. Is there any better way? ...Takumi -------------- next part -------------- diff --git a/utils/FileCheck/FileCheck.cpp b/utils/FileCheck/FileCheck.cpp index cd76d44..2077e09 100644 --- a/utils/FileCheck/FileCheck.cpp +++ b/utils/FileCheck/FileCheck.cpp @@ -446,6 +446,11 @@ static MemoryBuffer *CanonicalizeInputFile(MemoryBuffer *MB) { for (const char *Ptr = MB->getBufferStart(), *End = MB->getBufferEnd(); Ptr != End; ++Ptr) { + // Eliminate trailing dosish \r. + if (Ptr <= End - 2 && Ptr[0] == '\r' && Ptr[1] == '\n') { + continue; + } + // If C is not a horizontal whitespace, skip it. if (*Ptr != ' ' && *Ptr != '\t') { NewFile.push_back(*Ptr); From dgregor at apple.com Mon Oct 4 02:02:35 2010 From: dgregor at apple.com (Douglas Gregor) Date: Mon, 04 Oct 2010 07:02:35 -0000 Subject: [llvm-commits] [llvm] r115498 - /llvm/trunk/docs/ReleaseNotes.html Message-ID: <20101004070235.86C2D2A6C12C@llvm.org> Author: dgregor Date: Mon Oct 4 02:02:35 2010 New Revision: 115498 URL: http://llvm.org/viewvc/llvm-project?rev=115498&view=rev Log: Update LLVM 2.8 release notes for Clang Modified: llvm/trunk/docs/ReleaseNotes.html Modified: llvm/trunk/docs/ReleaseNotes.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115498&r1=115497&r2=115498&view=diff ============================================================================== --- llvm/trunk/docs/ReleaseNotes.html (original) +++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 02:02:35 2010 @@ -119,10 +119,18 @@

    In the LLVM 2.8 time-frame, the Clang team has made many improvements:

-    • Surely these guys have done something
-    • X86-64 abi improvements? Did they make it in?
+    • Clang C++ is now feature-complete with respect to the ISO C++ 1998 and 2003 standards.
+    • Added support for Objective-C++.
+    • Clang now uses LLVM-MC to directly generate object code and to parse inline assembly (on Darwin).
+    • Introduced many new warnings, including -Wmissing-field-initializers, -Wshadow, -Wno-protocol, -Wtautological-compare, -Wstrict-selector-match, -Wcast-align, -Wunused improvements, and greatly improved format-string checking.
+    • Introduced the "libclang" library, a C interface to Clang intended to support IDE clients.
+    • Added support for #pragma GCC visibility, #pragma align, and others.
+    • Added support for SSE, ARM NEON, and Altvec.
+    • Implemented support for blocks in C++.
+    • Implemented precompiled headers for C++.
+    • Improved abstract syntax trees to retain more accurate source information.
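Editorial sketch for the FileCheck patch earlier in this digest ("made match regex '$' for DOSish \r\n"): the idea is to drop a '\r' only when it is immediately followed by '\n', so that a pattern such as {{foobar$}} sees the same line ending on mingw as elsewhere. The Python below is an illustration of that rule, not the FileCheck code; the function name canonicalize is hypothetical, and the collapsing of space/tab runs into a single space is an assumption about what the surrounding C++ loop does.

```python
def canonicalize(data: bytes) -> bytes:
    """Hypothetical model of the patched canonicalization loop."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        c = data[i]
        # Drop a carriage return only when a line feed follows directly,
        # mirroring the "Ptr[0] == '\r' && Ptr[1] == '\n'" test in the patch.
        if c == 0x0D and i + 1 < n and data[i + 1] == 0x0A:
            i += 1
            continue
        # Anything that is not horizontal whitespace is copied through.
        if c not in (0x20, 0x09):
            out.append(c)
            i += 1
            continue
        # Assumed behaviour: a run of spaces/tabs becomes one space.
        out.append(0x20)
        while i < n and data[i] in (0x20, 0x09):
            i += 1
    return bytes(out)
```

A lone '\r' that is not part of a "\r\n" pair is left in place, which matches the bounds check in the posted patch.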
    From geek4civic at gmail.com Mon Oct 4 02:22:55 2010 From: geek4civic at gmail.com (NAKAMURA Takumi) Date: Mon, 4 Oct 2010 16:22:55 +0900 Subject: [llvm-commits] [Review request] lit: cygming support on the function "which" Message-ID: Hello, Daniel. Please take a look into my patch. confirmed on cygwin-1.5, cygwin-1.7 and mingw. * On Windows, os.path.exists(command) matches to directories, too. (eg. at seeking "bugpoint", the directory "test/BugPoint" matched when current directory is test/) * On Windows, unexecutable file matches. (eg. when "macho-dump" and "macho-dump.bat" exist, "macho-dump.bat" is not hit) * On Cygwin, the environment variable PATHEXT meddles. (on cygwin, os.path.exists("/path/to/clang") matches to "/path/to/clang.exe") In contrast, on win32, which should not seek suffix-less files. (eg. "test/Scripts/macho-dump" should be ignored) Thank you, ...Takumi -------------- next part -------------- diff --git a/utils/lit/lit/Util.py b/utils/lit/lit/Util.py index 414b714..5dbdd0b 100644 --- a/utils/lit/lit/Util.py +++ b/utils/lit/lit/Util.py @@ -56,7 +56,8 @@ def which(command, paths = None): paths = os.environ.get('PATH','') # Check for absolute match first. - if os.path.exists(command): + if (command != os.path.basename(command) + and os.path.isfile(command)): return command # Would be nice if Python had a lib function for this. @@ -64,7 +65,11 @@ def which(command, paths = None): paths = os.defpath # Get suffixes to search. - pathext = os.environ.get('PATHEXT', '').split(os.pathsep) + # On Cygwin, 'PATHEXT' exists but it should not be used. + if os.pathsep == ';': + pathext = os.environ.get('PATHEXT', '').split(';') + else: + pathext = [''] # Search the paths... 
for path in paths.split(os.pathsep): From baldrick at free.fr Mon Oct 4 04:11:50 2010 From: baldrick at free.fr (Duncan Sands) Date: Mon, 04 Oct 2010 09:11:50 -0000 Subject: [llvm-commits] [llvm] r115499 - /llvm/trunk/docs/ReleaseNotes.html Message-ID: <20101004091151.249432A6C12E@llvm.org> Author: baldrick Date: Mon Oct 4 04:11:50 2010 New Revision: 115499 URL: http://llvm.org/viewvc/llvm-project?rev=115499&view=rev Log: Altvec -> Altivec. Modified: llvm/trunk/docs/ReleaseNotes.html Modified: llvm/trunk/docs/ReleaseNotes.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115499&r1=115498&r2=115499&view=diff ============================================================================== --- llvm/trunk/docs/ReleaseNotes.html (original) +++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 04:11:50 2010 @@ -126,7 +126,7 @@
  • Introduced many new warnings, including -Wmissing-field-initializers, -Wshadow, -Wno-protocol, -Wtautological-compare, -Wstrict-selector-match, -Wcast-align, -Wunused improvements, and greatly improved format-string checking.
  • Introduced the "libclang" library, a C interface to Clang intended to support IDE clients.
  • Added support for #pragma GCC visibility, #pragma align, and others.
-  • Added support for SSE, ARM NEON, and Altvec.
+  • Added support for SSE, ARM NEON, and Altivec.
  • Implemented support for blocks in C++.
  • Implemented precompiled headers for C++.
  • Improved abstract syntax trees to retain more accurate source information.
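Editorial sketch for the lit "which" review request earlier in this digest: the three rules described there are (1) an explicit path must name a regular file, so a directory such as test/BugPoint no longer matches "bugpoint"; (2) PATHEXT is consulted only where the path separator is ';'; (3) elsewhere, including Cygwin, only the bare name is tried. The Python below is a self-contained illustration, not the lit source. The pathsep and environ parameters are hypothetical additions made so the rules can be exercised on any host, and using os.path.isfile inside the search loop is an assumption, since the posted patch is cut off before that loop.

```python
import os

def which(command, paths=None, pathsep=os.pathsep, environ=None):
    """Hypothetical stand-in for the patched lookup; returns a path or None."""
    environ = os.environ if environ is None else environ
    if paths is None:
        paths = environ.get('PATH', '')

    # Rule 1: only treat the command as a direct hit if it carries a
    # directory component and names a regular file (not a directory).
    if command != os.path.basename(command) and os.path.isfile(command):
        return command

    if not paths:
        paths = os.defpath

    # Rules 2 and 3: PATHEXT is honoured only when ';' separates paths.
    if pathsep == ';':
        pathext = environ.get('PATHEXT', '').split(';')
    else:
        pathext = ['']

    for path in paths.split(pathsep):
        for ext in pathext:
            candidate = os.path.join(path, command + ext)
            # Assumption: directories are rejected here as well.
            if os.path.isfile(candidate):
                return candidate
    return None
```

With PATHEXT set to ".exe;.bat" and ';' as separator, a lookup for "macho-dump" finds "macho-dump.bat" and never the suffix-less script, which is the behaviour the request asks for on win32.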
  • From baldrick at free.fr Mon Oct 4 05:04:14 2010 From: baldrick at free.fr (Duncan Sands) Date: Mon, 04 Oct 2010 10:04:14 -0000 Subject: [llvm-commits] [llvm] r115500 - /llvm/trunk/docs/ReleaseNotes.html Message-ID: <20101004100415.08CF12A6C12E@llvm.org> Author: baldrick Date: Mon Oct 4 05:04:14 2010 New Revision: 115500 URL: http://llvm.org/viewvc/llvm-project?rev=115500&view=rev Log: Fix a bunch of typos. Modified: llvm/trunk/docs/ReleaseNotes.html Modified: llvm/trunk/docs/ReleaseNotes.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115500&r1=115499&r2=115500&view=diff ============================================================================== --- llvm/trunk/docs/ReleaseNotes.html (original) +++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 05:04:14 2010 @@ -450,7 +450,7 @@ configurations are designed by MPEG Reconfigurable Video Coding (RVC) committee. MPEG RVC standard is built on a stream-based dataflow representation of decoders. It is composed of a standard library of coding tools written in -RVC-CAL language and a dataflow configuration &emdash; block diagram &emdash; +RVC-CAL language and a dataflow configuration — block diagram — of a decoder.

 Jade project is hosted as part of the Open
@@ -631,7 +631,7 @@

   • The new RegionInfo analysis pass identifies single-entry single-exit regions in the CFG. You can play with it with the "opt -regions analyze" or "opt -view-regions" commands.
-  • The loop optimizer has significantly improve strength reduction and analysis
+  • The loop optimizer has significantly improved strength reduction and analysis
     capabilities. Notably it is able to build on the trap value and signed integer overflow information to optimize <= and >= loops.
   • The CallGraphSCCPassManager now has some basic support for iterating within
@@ -733,7 +733,7 @@
     extends, and optimizes away compare instructions when the condition result is available from a previous instruction.
   • Atomic operations now get legalized into simpler atomic operations if not
-    natively supported, easy the implementation burden on targets.
+    natively supported, easing the implementation burden on targets.
   • The bottom-up pre-allocation scheduler is now register pressure aware, allowing it to avoid overscheduling in high pressure situations while still aggressively scheduling when registers are available.
@@ -782,7 +782,7 @@
     using a register in a different domain than where it was defined. This pass optimizes away these stalls.

-  • The X86 backend now promote 16-bit integer operations to 32-bits when
+  • The X86 backend now promotes 16-bit integer operations to 32-bits when
     possible. This avoids 0x66 prefixes, which are slow on some microarchitectures and bloat the code on all of them.
@@ -794,7 +794,7 @@
     the X86 "int $42" and "int3" instructions.
   • At the IR level, the <2 x float> datatype is now promoted and passed
-    around as a <4 x float> instead of being passed and returns as an MMX
+    around as a <4 x float> instead of being passed and returned as an MMX
     vector. If you have a frontend that uses this, please pass and return a <2 x i32> instead (using bitcasts).
@@ -829,7 +829,7 @@
   • Half float instructions are now supported.
   • NEON support has been improved to model instructions which operate onto
-    multiple consequtive registers more aggressively. This avoids lots of
+    multiple consecutive registers more aggressively. This avoids lots of
     extraneous register copies.
   • The ARM backend now uses a new "ARMGlobalMerge" pass, which merges several global variables into one, saving extra address computation (all the global
@@ -905,7 +905,7 @@
   • The build configuration machinery changed the output directory names. It
-    wasn't clear to many people that "Release-Asserts" build was a release build
+    wasn't clear to many people that a "Release-Asserts" build was a release build
     without asserts. To make this more clear, "Release" does not include assertions and "Release+Asserts" does (likewise, "Debug" and "Debug+Asserts").
    • From baldrick at free.fr Mon Oct 4 05:06:56 2010 From: baldrick at free.fr (Duncan Sands) Date: Mon, 04 Oct 2010 10:06:56 -0000 Subject: [llvm-commits] [llvm] r115501 - /llvm/trunk/docs/ReleaseNotes.html Message-ID: <20101004100656.D65942A6C12E@llvm.org> Author: baldrick Date: Mon Oct 4 05:06:56 2010 New Revision: 115501 URL: http://llvm.org/viewvc/llvm-project?rev=115501&view=rev Log: Ada support has moved to dragonegg - I am no longer working on Ada in llvm-gcc. Modified: llvm/trunk/docs/ReleaseNotes.html Modified: llvm/trunk/docs/ReleaseNotes.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115501&r1=115500&r2=115501&view=diff ============================================================================== --- llvm/trunk/docs/ReleaseNotes.html (original) +++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 05:06:56 2010 @@ -1161,37 +1161,9 @@ 4.2. If you are interested in Fortran, we recommend that you consider using dragonegg instead.

-The llvm-gcc 4.2 Ada compiler has basic functionality. However, this is not a
-mature technology, and problems should be expected. For example:
-  • The Ada front-end currently only builds on X86-32. This is mainly due
-to lack of trampoline support (pointers to nested functions) on other platforms.
-However, it also fails to build on X86-64
-which does support trampolines.
-  • The Ada front-end fails to bootstrap.
-This is due to lack of LLVM support for setjmp/longjmp style
-exception handling, which is used internally by the compiler.
-Workaround: configure with --disable-bootstrap.
-  • The c380004, c393010
-and cxg2021 ACATS tests fail
-(c380004 also fails with gcc-4.2 mainline).
-If the compiler is built with checks disabled then c393010
-causes the compiler to go into an infinite loop, using up all system memory.
-  • Some GCC specific Ada tests continue to crash the compiler.
-  • The -E binder option (exception backtraces)
-does not work and will result in programs
-crashing if an exception is raised. Workaround: do not use -E.
-  • Only discrete types are allowed to start
-or finish at a non-byte offset in a record. Workaround: do not pack records
-or use representation clauses that result in a field of a non-discrete type
-starting or finishing in the middle of a byte.
-  • The lli interpreter considers
-'main' as generated by the Ada binder to be invalid.
-Workaround: hand edit the file to use pointers for argv and
-envp rather than integers.
-  • The -fstack-check option is
-ignored.
+The llvm-gcc 4.2 Ada compiler has basic functionality, but is no longer being
+actively maintained. If you are interested in Ada, we recommend that you
+consider using dragonegg instead.

      From pichet2000 at gmail.com Mon Oct 4 06:44:10 2010 From: pichet2000 at gmail.com (Francois Pichet) Date: Mon, 4 Oct 2010 07:44:10 -0400 Subject: [llvm-commits] [PATCH[WIN32] fix clang\test\Lexer\preamble.c in win32 Message-ID: Hi, I am investigating all the win32 XFAIL in clang test. This patch is necessary to remove the XFAIL from Lexer\preamble.c The failing was due to this: 1. preamble.c contains CR+LF new lines 2. write() is called with a buffer containing the original (CR+LF) to output the result on the console. 3. In text mode(the default), write() convert LF to CR+LF even if LF is preceded by CR, hence we have CR+CR+LF which filecheck interprets as 2 lines. 4. Filecheck fails Solution: always use binary mode for output stream. Should not affect unix where O_BINARY is not defined (I believe) After this patch is accepted, i'll remove the XFAIL from preamble.c in the clang svn. -------------- next part -------------- A non-text attachment was scrubbed... Name: preamble_win32_fix.patch Type: application/octet-stream Size: 1612 bytes Desc: not available Url : http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20101004/08baa63a/attachment.obj From lhames at gmail.com Mon Oct 4 07:13:07 2010 From: lhames at gmail.com (Lang Hames) Date: Mon, 04 Oct 2010 12:13:07 -0000 Subject: [llvm-commits] [llvm] r115502 - /llvm/trunk/lib/CodeGen/RegAllocPBQP.cpp Message-ID: <20101004121307.75AB42A6C12E@llvm.org> Author: lhames Date: Mon Oct 4 07:13:07 2010 New Revision: 115502 URL: http://llvm.org/viewvc/llvm-project?rev=115502&view=rev Log: Removed the older style (in-allocator) problem construction system from the PBQP allocator. Problem construction is now done exclusively with the new builders. 
Modified: llvm/trunk/lib/CodeGen/RegAllocPBQP.cpp Modified: llvm/trunk/lib/CodeGen/RegAllocPBQP.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/RegAllocPBQP.cpp?rev=115502&r1=115501&r2=115502&view=diff ============================================================================== --- llvm/trunk/lib/CodeGen/RegAllocPBQP.cpp (original) +++ llvm/trunk/lib/CodeGen/RegAllocPBQP.cpp Mon Oct 4 07:13:07 2010 @@ -68,12 +68,6 @@ cl::init(false), cl::Hidden); static cl::opt -pbqpBuilder("pbqp-builder", - cl::desc("Use new builder system."), - cl::init(true), cl::Hidden); - - -static cl::opt pbqpPreSplitting("pbqp-pre-splitting", cl::desc("Pre-split before PBQP register allocation."), cl::init(false), cl::Hidden); @@ -129,76 +123,17 @@ LiveStacks *lss; VirtRegMap *vrm; - LI2NodeMap li2Node; - Node2LIMap node2LI; - AllowedSetMap allowedSets; RegSet vregsToAlloc, emptyIntervalVRegs; - NodeVector problemNodes; - - - /// Builds a PBQP cost vector. - template - PBQP::Vector buildCostVector(unsigned vReg, - const RegContainer &allowed, - const CoalesceMap &cealesces, - PBQP::PBQPNum spillCost) const; - - /// \brief Builds a PBQP interference matrix. - /// - /// @return Either a pointer to a non-zero PBQP matrix representing the - /// allocation option costs, or a null pointer for a zero matrix. - /// - /// Expects allowed sets for two interfering LiveIntervals. These allowed - /// sets should contain only allocable registers from the LiveInterval's - /// register class, with any interfering pre-colored registers removed. - template - PBQP::Matrix* buildInterferenceMatrix(const RegContainer &allowed1, - const RegContainer &allowed2) const; - - /// - /// Expects allowed sets for two potentially coalescable LiveIntervals, - /// and an estimated benefit due to coalescing. The allowed sets should - /// contain only allocable registers from the LiveInterval's register - /// classes, with any interfering pre-colored registers removed. 
- template - PBQP::Matrix* buildCoalescingMatrix(const RegContainer &allowed1, - const RegContainer &allowed2, - PBQP::PBQPNum cBenefit) const; - - /// \brief Finds coalescing opportunities and returns them as a map. - /// - /// Any entries in the map are guaranteed coalescable, even if their - /// corresponding live intervals overlap. - CoalesceMap findCoalesces(); /// \brief Finds the initial set of vreg intervals to allocate. void findVRegIntervalsToAlloc(); - /// \brief Constructs a PBQP problem representation of the register - /// allocation problem for this function. - /// - /// Old Construction Process - this functionality has been subsumed - /// by PBQPBuilder. This function will only be hanging around for a little - /// while until the new system has been fully tested. - /// - /// @return a PBQP solver object for the register allocation problem. - PBQP::Graph constructPBQPProblemOld(); - /// \brief Adds a stack interval if the given live interval has been /// spilled. Used to support stack slot coloring. void addStackInterval(const LiveInterval *spilled,MachineRegisterInfo* mri); /// \brief Given a solved PBQP problem maps this solution back to a register /// assignment. - /// - /// Old Construction Process - this functionality has been subsumed - /// by PBQPBuilder. This function will only be hanging around for a little - /// while until the new system has been fully tested. - /// - bool mapPBQPToRegAllocOld(const PBQP::Solution &solution); - - /// \brief Given a solved PBQP problem maps this solution back to a register - /// assignment. bool mapPBQPToRegAlloc(const PBQPRAProblem &problem, const PBQP::Solution &solution); @@ -510,306 +445,6 @@ MachineFunctionPass::getAnalysisUsage(au); } -template -PBQP::Vector RegAllocPBQP::buildCostVector(unsigned vReg, - const RegContainer &allowed, - const CoalesceMap &coalesces, - PBQP::PBQPNum spillCost) const { - - typedef typename RegContainer::const_iterator AllowedItr; - - // Allocate vector. 
Additional element (0th) used for spill option - PBQP::Vector v(allowed.size() + 1, 0); - - v[0] = spillCost; - - // Iterate over the allowed registers inserting coalesce benefits if there - // are any. - unsigned ai = 0; - for (AllowedItr itr = allowed.begin(), end = allowed.end(); - itr != end; ++itr, ++ai) { - - unsigned pReg = *itr; - - CoalesceMap::const_iterator cmItr = - coalesces.find(RegPair(vReg, pReg)); - - // No coalesce - on to the next preg. - if (cmItr == coalesces.end()) - continue; - - // We have a coalesce - insert the benefit. - v[ai + 1] = -cmItr->second; - } - - return v; -} - -template -PBQP::Matrix* RegAllocPBQP::buildInterferenceMatrix( - const RegContainer &allowed1, const RegContainer &allowed2) const { - - typedef typename RegContainer::const_iterator RegContainerIterator; - - // Construct a PBQP matrix representing the cost of allocation options. The - // rows and columns correspond to the allocation options for the two live - // intervals. Elements will be infinite where corresponding registers alias, - // since we cannot allocate aliasing registers to interfering live intervals. - // All other elements (non-aliasing combinations) will have zero cost. Note - // that the spill option (element 0,0) has zero cost, since we can allocate - // both intervals to memory safely (the cost for each individual allocation - // to memory is accounted for by the cost vectors for each live interval). - PBQP::Matrix *m = - new PBQP::Matrix(allowed1.size() + 1, allowed2.size() + 1, 0); - - // Assume this is a zero matrix until proven otherwise. Zero matrices occur - // between interfering live ranges with non-overlapping register sets (e.g. - // non-overlapping reg classes, or disjoint sets of allowed regs within the - // same class). The term "overlapping" is used advisedly: sets which do not - // intersect, but contain registers which alias, will have non-zero matrices. - // We optimize zero matrices away to improve solver speed. 
- bool isZeroMatrix = true; - - - // Row index. Starts at 1, since the 0th row is for the spill option, which - // is always zero. - unsigned ri = 1; - - // Iterate over allowed sets, insert infinities where required. - for (RegContainerIterator a1Itr = allowed1.begin(), a1End = allowed1.end(); - a1Itr != a1End; ++a1Itr) { - - // Column index, starts at 1 as for row index. - unsigned ci = 1; - unsigned reg1 = *a1Itr; - - for (RegContainerIterator a2Itr = allowed2.begin(), a2End = allowed2.end(); - a2Itr != a2End; ++a2Itr) { - - unsigned reg2 = *a2Itr; - - // If the row/column regs are identical or alias insert an infinity. - if (tri->regsOverlap(reg1, reg2)) { - (*m)[ri][ci] = std::numeric_limits::infinity(); - isZeroMatrix = false; - } - - ++ci; - } - - ++ri; - } - - // If this turns out to be a zero matrix... - if (isZeroMatrix) { - // free it and return null. - delete m; - return 0; - } - - // ...otherwise return the cost matrix. - return m; -} - -template -PBQP::Matrix* RegAllocPBQP::buildCoalescingMatrix( - const RegContainer &allowed1, const RegContainer &allowed2, - PBQP::PBQPNum cBenefit) const { - - typedef typename RegContainer::const_iterator RegContainerIterator; - - // Construct a PBQP Matrix representing the benefits of coalescing. As with - // interference matrices the rows and columns represent allowed registers - // for the LiveIntervals which are (potentially) to be coalesced. The amount - // -cBenefit will be placed in any element representing the same register - // for both intervals. - PBQP::Matrix *m = - new PBQP::Matrix(allowed1.size() + 1, allowed2.size() + 1, 0); - - // Reset costs to zero. - m->reset(0); - - // Assume the matrix is zero till proven otherwise. Zero matrices will be - // optimized away as in the interference case. - bool isZeroMatrix = true; - - // Row index. Starts at 1, since the 0th row is for the spill option, which - // is always zero. 
- unsigned ri = 1; - - // Iterate over the allowed sets, insert coalescing benefits where - // appropriate. - for (RegContainerIterator a1Itr = allowed1.begin(), a1End = allowed1.end(); - a1Itr != a1End; ++a1Itr) { - - // Column index, starts at 1 as for row index. - unsigned ci = 1; - unsigned reg1 = *a1Itr; - - for (RegContainerIterator a2Itr = allowed2.begin(), a2End = allowed2.end(); - a2Itr != a2End; ++a2Itr) { - - // If the row and column represent the same register insert a beneficial - // cost to preference this allocation - it would allow us to eliminate a - // move instruction. - if (reg1 == *a2Itr) { - (*m)[ri][ci] = -cBenefit; - isZeroMatrix = false; - } - - ++ci; - } - - ++ri; - } - - // If this turns out to be a zero matrix... - if (isZeroMatrix) { - // ...free it and return null. - delete m; - return 0; - } - - return m; -} - -RegAllocPBQP::CoalesceMap RegAllocPBQP::findCoalesces() { - - typedef MachineFunction::const_iterator MFIterator; - typedef MachineBasicBlock::const_iterator MBBIterator; - typedef LiveInterval::const_vni_iterator VNIIterator; - - CoalesceMap coalescesFound; - - // To find coalesces we need to iterate over the function looking for - // copy instructions. - for (MFIterator bbItr = mf->begin(), bbEnd = mf->end(); - bbItr != bbEnd; ++bbItr) { - - const MachineBasicBlock *mbb = &*bbItr; - - for (MBBIterator iItr = mbb->begin(), iEnd = mbb->end(); - iItr != iEnd; ++iItr) { - - const MachineInstr *instr = &*iItr; - - // If this isn't a copy then continue to the next instruction. - if (!instr->isCopy()) - continue; - - unsigned srcReg = instr->getOperand(1).getReg(); - unsigned dstReg = instr->getOperand(0).getReg(); - - // If the registers are already the same our job is nice and easy. - if (dstReg == srcReg) - continue; - - bool srcRegIsPhysical = TargetRegisterInfo::isPhysicalRegister(srcReg), - dstRegIsPhysical = TargetRegisterInfo::isPhysicalRegister(dstReg); - - // If both registers are physical then we can't coalesce. 
- if (srcRegIsPhysical && dstRegIsPhysical) - continue; - - // If it's a copy that includes two virtual register but the source and - // destination classes differ then we can't coalesce. - if (!srcRegIsPhysical && !dstRegIsPhysical && - mri->getRegClass(srcReg) != mri->getRegClass(dstReg)) - continue; - - // If one is physical and one is virtual, check that the physical is - // allocatable in the class of the virtual. - if (srcRegIsPhysical && !dstRegIsPhysical) { - const TargetRegisterClass *dstRegClass = mri->getRegClass(dstReg); - if (std::find(dstRegClass->allocation_order_begin(*mf), - dstRegClass->allocation_order_end(*mf), srcReg) == - dstRegClass->allocation_order_end(*mf)) - continue; - } - if (!srcRegIsPhysical && dstRegIsPhysical) { - const TargetRegisterClass *srcRegClass = mri->getRegClass(srcReg); - if (std::find(srcRegClass->allocation_order_begin(*mf), - srcRegClass->allocation_order_end(*mf), dstReg) == - srcRegClass->allocation_order_end(*mf)) - continue; - } - - // If we've made it here we have a copy with compatible register classes. - // We can probably coalesce, but we need to consider overlap. - const LiveInterval *srcLI = &lis->getInterval(srcReg), - *dstLI = &lis->getInterval(dstReg); - - if (srcLI->overlaps(*dstLI)) { - // Even in the case of an overlap we might still be able to coalesce, - // but we need to make sure that no definition of either range occurs - // while the other range is live. - - // Otherwise start by assuming we're ok. - bool badDef = false; - - // Test all defs of the source range. - for (VNIIterator - vniItr = srcLI->vni_begin(), vniEnd = srcLI->vni_end(); - vniItr != vniEnd; ++vniItr) { - - // If we find a poorly defined def we err on the side of caution. - if (!(*vniItr)->def.isValid()) { - badDef = true; - break; - } - - // If we find a def that kills the coalescing opportunity then - // record it and break from the loop. 
- if (dstLI->liveAt((*vniItr)->def)) { - badDef = true; - break; - } - } - - // If we have a bad def give up, continue to the next instruction. - if (badDef) - continue; - - // Otherwise test definitions of the destination range. - for (VNIIterator - vniItr = dstLI->vni_begin(), vniEnd = dstLI->vni_end(); - vniItr != vniEnd; ++vniItr) { - - // We want to make sure we skip the copy instruction itself. - if ((*vniItr)->getCopy() == instr) - continue; - - if (!(*vniItr)->def.isValid()) { - badDef = true; - break; - } - - if (srcLI->liveAt((*vniItr)->def)) { - badDef = true; - break; - } - } - - // As before a bad def we give up and continue to the next instr. - if (badDef) - continue; - } - - // If we make it to here then either the ranges didn't overlap, or they - // did, but none of their definitions would prevent us from coalescing. - // We're good to go with the coalesce. - - float cBenefit = std::pow(10.0f, (float)loopInfo->getLoopDepth(mbb)) / 5.0; - - coalescesFound[RegPair(srcReg, dstReg)] = cBenefit; - coalescesFound[RegPair(dstReg, srcReg)] = cBenefit; - } - - } - - return coalescesFound; -} - void RegAllocPBQP::findVRegIntervalsToAlloc() { // Iterate over all live ranges. @@ -834,171 +469,6 @@ } } -PBQP::Graph RegAllocPBQP::constructPBQPProblemOld() { - - typedef std::vector LIVector; - typedef std::vector RegVector; - - // This will store the physical intervals for easy reference. - LIVector physIntervals; - - // Start by clearing the old node <-> live interval mappings & allowed sets - li2Node.clear(); - node2LI.clear(); - allowedSets.clear(); - - // Populate physIntervals, update preg use: - for (LiveIntervals::iterator itr = lis->begin(), end = lis->end(); - itr != end; ++itr) { - - if (TargetRegisterInfo::isPhysicalRegister(itr->first)) { - physIntervals.push_back(itr->second); - mri->setPhysRegUsed(itr->second->reg); - } - } - - // Iterate over vreg intervals, construct live interval <-> node number - // mappings. 
- for (RegSet::const_iterator itr = vregsToAlloc.begin(), - end = vregsToAlloc.end(); - itr != end; ++itr) { - const LiveInterval *li = &lis->getInterval(*itr); - - li2Node[li] = node2LI.size(); - node2LI.push_back(li); - } - - // Get the set of potential coalesces. - CoalesceMap coalesces; - - if (pbqpCoalescing) { - coalesces = findCoalesces(); - } - - // Construct a PBQP solver for this problem - PBQP::Graph problem; - problemNodes.resize(vregsToAlloc.size()); - - // Resize allowedSets container appropriately. - allowedSets.resize(vregsToAlloc.size()); - - BitVector ReservedRegs = tri->getReservedRegs(*mf); - - // Iterate over virtual register intervals to compute allowed sets... - for (unsigned node = 0; node < node2LI.size(); ++node) { - - // Grab pointers to the interval and its register class. - const LiveInterval *li = node2LI[node]; - const TargetRegisterClass *liRC = mri->getRegClass(li->reg); - - // Start by assuming all allocable registers in the class are allowed... - RegVector liAllowed; - TargetRegisterClass::iterator aob = liRC->allocation_order_begin(*mf); - TargetRegisterClass::iterator aoe = liRC->allocation_order_end(*mf); - for (TargetRegisterClass::iterator it = aob; it != aoe; ++it) - if (!ReservedRegs.test(*it)) - liAllowed.push_back(*it); - - // Eliminate the physical registers which overlap with this range, along - // with all their aliases. - for (LIVector::iterator pItr = physIntervals.begin(), - pEnd = physIntervals.end(); pItr != pEnd; ++pItr) { - - if (!li->overlaps(**pItr)) - continue; - - unsigned pReg = (*pItr)->reg; - - // If we get here then the live intervals overlap, but we're still ok - // if they're coalescable. - if (coalesces.find(RegPair(li->reg, pReg)) != coalesces.end()) { - DEBUG(dbgs() << "CoalescingOverride: (" << li->reg << ", " << pReg << ")\n"); - continue; - } - - // If we get here then we have a genuine exclusion. - - // Remove the overlapping reg... 
- RegVector::iterator eraseItr = - std::find(liAllowed.begin(), liAllowed.end(), pReg); - - if (eraseItr != liAllowed.end()) - liAllowed.erase(eraseItr); - - const unsigned *aliasItr = tri->getAliasSet(pReg); - - if (aliasItr != 0) { - // ...and its aliases. - for (; *aliasItr != 0; ++aliasItr) { - RegVector::iterator eraseItr = - std::find(liAllowed.begin(), liAllowed.end(), *aliasItr); - - if (eraseItr != liAllowed.end()) { - liAllowed.erase(eraseItr); - } - } - } - } - - // Copy the allowed set into a member vector for use when constructing cost - // vectors & matrices, and mapping PBQP solutions back to assignments. - allowedSets[node] = AllowedSet(liAllowed.begin(), liAllowed.end()); - - // Set the spill cost to the interval weight, or epsilon if the - // interval weight is zero - PBQP::PBQPNum spillCost = (li->weight != 0.0) ? - li->weight : std::numeric_limits::min(); - - // Build a cost vector for this interval. - problemNodes[node] = - problem.addNode( - buildCostVector(li->reg, allowedSets[node], coalesces, spillCost)); - - } - - - // Now add the cost matrices... - for (unsigned node1 = 0; node1 < node2LI.size(); ++node1) { - const LiveInterval *li = node2LI[node1]; - - // Test for live range overlaps and insert interference matrices. 
- for (unsigned node2 = node1 + 1; node2 < node2LI.size(); ++node2) { - const LiveInterval *li2 = node2LI[node2]; - - CoalesceMap::const_iterator cmItr = - coalesces.find(RegPair(li->reg, li2->reg)); - - PBQP::Matrix *m = 0; - - if (cmItr != coalesces.end()) { - m = buildCoalescingMatrix(allowedSets[node1], allowedSets[node2], - cmItr->second); - } - else if (li->overlaps(*li2)) { - m = buildInterferenceMatrix(allowedSets[node1], allowedSets[node2]); - } - - if (m != 0) { - problem.addEdge(problemNodes[node1], - problemNodes[node2], - *m); - - delete m; - } - } - } - - assert(problem.getNumNodes() == allowedSets.size()); -/* - std::cerr << "Allocating for " << problem.getNumNodes() << " nodes, " - << problem.getNumEdges() << " edges.\n"; - - problem.printDot(std::cerr); -*/ - // We're done, PBQP problem constructed - return it. - return problem; -} - void RegAllocPBQP::addStackInterval(const LiveInterval *spilled, MachineRegisterInfo* mri) { int stackSlot = vrm->getStackSlot(spilled->reg); @@ -1020,77 +490,6 @@ stackInterval.MergeRangesInAsValue(rhsInterval, vni); } -bool RegAllocPBQP::mapPBQPToRegAllocOld(const PBQP::Solution &solution) { - - // Set to true if we have any spills - bool anotherRoundNeeded = false; - - // Clear the existing allocation. - vrm->clearAllVirt(); - - // Iterate over the nodes mapping the PBQP solution to a register assignment. - for (unsigned node = 0; node < node2LI.size(); ++node) { - unsigned virtReg = node2LI[node]->reg, - allocSelection = solution.getSelection(problemNodes[node]); - - - // If the PBQP solution is non-zero it's a physical register... - if (allocSelection != 0) { - // Get the physical reg, subtracting 1 to account for the spill option. - unsigned physReg = allowedSets[node][allocSelection - 1]; - - DEBUG(dbgs() << "VREG " << virtReg << " -> " - << tri->getName(physReg) << " (Option: " << allocSelection << ")\n"); - - assert(physReg != 0); - - // Add to the virt reg map and update the used phys regs. 
- vrm->assignVirt2Phys(virtReg, physReg); - } - // ...Otherwise it's a spill. - else { - - // Make sure we ignore this virtual reg on the next round - // of allocation - vregsToAlloc.erase(virtReg); - - // Insert spill ranges for this live range - const LiveInterval *spillInterval = node2LI[node]; - double oldSpillWeight = spillInterval->weight; - SmallVector spillIs; - rmf->rememberUseDefs(spillInterval); - std::vector newSpills = - lis->addIntervalsForSpills(*spillInterval, spillIs, loopInfo, *vrm); - addStackInterval(spillInterval, mri); - rmf->rememberSpills(spillInterval, newSpills); - - (void) oldSpillWeight; - DEBUG(dbgs() << "VREG " << virtReg << " -> SPILLED (Option: 0, Cost: " - << oldSpillWeight << ", New vregs: "); - - // Copy any newly inserted live intervals into the list of regs to - // allocate. - for (std::vector::const_iterator - itr = newSpills.begin(), end = newSpills.end(); - itr != end; ++itr) { - - assert(!(*itr)->empty() && "Empty spill range."); - - DEBUG(dbgs() << (*itr)->reg << " "); - - vregsToAlloc.insert((*itr)->reg); - } - - DEBUG(dbgs() << ")\n"); - - // We need another round if spill intervals were added. 
- anotherRoundNeeded |= !newSpills.empty(); - } - } - - return !anotherRoundNeeded; -} - bool RegAllocPBQP::mapPBQPToRegAlloc(const PBQPRAProblem &problem, const PBQP::Solution &solution) { // Set to true if we have any spills @@ -1255,32 +654,18 @@ bool pbqpAllocComplete = false; unsigned round = 0; - if (!pbqpBuilder) { - while (!pbqpAllocComplete) { - DEBUG(dbgs() << " PBQP Regalloc round " << round << ":\n"); + while (!pbqpAllocComplete) { + DEBUG(dbgs() << " PBQP Regalloc round " << round << ":\n"); - PBQP::Graph problem = constructPBQPProblemOld(); - PBQP::Solution solution = - PBQP::HeuristicSolver::solve(problem); + std::auto_ptr problem = + builder->build(mf, lis, loopInfo, vregsToAlloc); + PBQP::Solution solution = + PBQP::HeuristicSolver::solve( + problem->getGraph()); - pbqpAllocComplete = mapPBQPToRegAllocOld(solution); + pbqpAllocComplete = mapPBQPToRegAlloc(*problem, solution); - ++round; - } - } else { - while (!pbqpAllocComplete) { - DEBUG(dbgs() << " PBQP Regalloc round " << round << ":\n"); - - std::auto_ptr problem = - builder->build(mf, lis, loopInfo, vregsToAlloc); - PBQP::Solution solution = - PBQP::HeuristicSolver::solve( - problem->getGraph()); - - pbqpAllocComplete = mapPBQPToRegAlloc(*problem, solution); - - ++round; - } + ++round; } } @@ -1291,10 +676,6 @@ vregsToAlloc.clear(); emptyIntervalVRegs.clear(); - li2Node.clear(); - node2LI.clear(); - allowedSets.clear(); - problemNodes.clear(); DEBUG(dbgs() << "Post alloc VirtRegMap:\n" << *vrm << "\n"); From pichet2000 at gmail.com Mon Oct 4 07:31:20 2010 From: pichet2000 at gmail.com (Francois Pichet) Date: Mon, 04 Oct 2010 12:31:20 -0000 Subject: [llvm-commits] [llvm] r115503 - /llvm/trunk/unittests/Support/System.cpp Message-ID: <20101004123120.4CCB62A6C130@llvm.org> Author: fpichet Date: Mon Oct 4 07:31:20 2010 New Revision: 115503 URL: http://llvm.org/viewvc/llvm-project?rev=115503&view=rev Log: static_cast to long, otherwise MSVC 2008 won't compile. 
Modified: llvm/trunk/unittests/Support/System.cpp Modified: llvm/trunk/unittests/Support/System.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/unittests/Support/System.cpp?rev=115503&r1=115502&r2=115503&view=diff ============================================================================== --- llvm/trunk/unittests/Support/System.cpp (original) +++ llvm/trunk/unittests/Support/System.cpp Mon Oct 4 07:31:20 2010 @@ -11,6 +11,6 @@ TEST_F(SystemTest, TimeValue) { sys::TimeValue now = sys::TimeValue::now(); time_t now_t = time(NULL); - EXPECT_TRUE(abs(static_cast(now_t - now.toEpochTime())) < 2); + EXPECT_TRUE(abs(static_cast<long>(now_t - now.toEpochTime())) < 2); } } From bigcheesegs at gmail.com Mon Oct 4 09:45:37 2010 From: bigcheesegs at gmail.com (Michael Spencer) Date: Mon, 4 Oct 2010 10:45:37 -0400 Subject: [llvm-commits] [Review request] test: adding triplets and FileCheck-ize for cygming(and msvc) In-Reply-To: References: Message-ID: On Mon, Oct 4, 2010 at 12:27 AM, NAKAMURA Takumi wrote: > Good afternoon, guys! > > I have many patches for tests. At first, I post 6 patchesets I think > they would be simpler. > Please give me any comments. > Feel free for everyone to improve my patches and commit in yourself. > > Thank you in advance, ...Takumi Thank you for working on all these. The main problem I have is with changing the triples. This is not a real fix for the tests. The tests need to be changed to either support both Windows-style x86[-64] code and Linux/Darwin, or we need separate tests for them. Just changing the triple masks the problem. FileCheck has rather advanced facilities for regular expressions and variables. Most of the differences seem to be register choice due to the calling conventions that Windows uses. > ps. All tests pass on llvm/cygmingw with my other patches remained. > > > * Principle > > - FileCheck-ize everything I touched. > - Add appropriate triplets as "-mtriple=i686-linux" and > "-mtriple=x86_64-linux".
> > * 0001(1 file) -mtriple=i686-linux > > It was needed for Cygwin and Mingw. See above, and this test passes for me on pure Windows. > * 0002(29 files) -mtriple=x86_64-linux > > It was needed for incompatibility among win64 and others. > They have been already FileCheck-ized. See above. > * 0003(18 files) FileCheck-ize When replacing a grep line followed by count, you need to add a CHECK-NOT: line. If you don't, you're not checking the greater-than case. > [snip...] I'll review 4-6 later today, but they have many of the same problems as above. - Michael Spencer From daniel at zuster.org Mon Oct 4 09:57:59 2010 From: daniel at zuster.org (Daniel Dunbar) Date: Mon, 04 Oct 2010 14:57:59 -0000 Subject: [llvm-commits] [test-suite] r115505 - in /test-suite/trunk/SingleSource/UnitTests/ObjC: Makefile print-class-info-x86-32.m print-class-info-x86-32.reference_output print-class-info-x86-64.m print-class-info-x86-64.reference_output Message-ID: <20101004145759.878CA2A6C12E@llvm.org> Author: ddunbar Date: Mon Oct 4 09:57:59 2010 New Revision: 115505 URL: http://llvm.org/viewvc/llvm-project?rev=115505&view=rev Log: Remove print-class-info tests, they aren't very useful and are too tied to other platform details (specific Foundation version, etc.)
Removed: test-suite/trunk/SingleSource/UnitTests/ObjC/print-class-info-x86-32.m test-suite/trunk/SingleSource/UnitTests/ObjC/print-class-info-x86-32.reference_output test-suite/trunk/SingleSource/UnitTests/ObjC/print-class-info-x86-64.m test-suite/trunk/SingleSource/UnitTests/ObjC/print-class-info-x86-64.reference_output Modified: test-suite/trunk/SingleSource/UnitTests/ObjC/Makefile Modified: test-suite/trunk/SingleSource/UnitTests/ObjC/Makefile URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/UnitTests/ObjC/Makefile?rev=115505&r1=115504&r2=115505&view=diff ============================================================================== --- test-suite/trunk/SingleSource/UnitTests/ObjC/Makefile (original) +++ test-suite/trunk/SingleSource/UnitTests/ObjC/Makefile Mon Oct 4 09:57:59 2010 @@ -8,13 +8,6 @@ LDFLAGS += -lobjc -framework Foundation PROGRAM_REQUIRED_TO_EXIT_OK := 1 -ifneq ($(ARCH),x86) -PROGRAMS_TO_SKIP += print-class-info-x86-32 -endif -ifneq ($(ARCH),x86_64) -PROGRAMS_TO_SKIP += print-class-info-x86-64 -endif - # This is a known gcc / llvm-gcc miscompilation fixed in clang. 
ifdef CC_UNDER_TEST_IS_LLVM_GCC EXEC_XFAILS = dot-syntax-2 Removed: test-suite/trunk/SingleSource/UnitTests/ObjC/print-class-info-x86-32.m URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/UnitTests/ObjC/print-class-info-x86-32.m?rev=115504&view=auto ============================================================================== --- test-suite/trunk/SingleSource/UnitTests/ObjC/print-class-info-x86-32.m (original) +++ test-suite/trunk/SingleSource/UnitTests/ObjC/print-class-info-x86-32.m (removed) @@ -1,422 +0,0 @@ -#include -#include -#include -#include - - at protocol UnusedProtocol -+(void) makeWaffles; - at end - - at protocol P2 - at end - - at protocol P - at required -+(void) requiredProtocolClassMethod; --(void) requiredProtocolInstanceMethod; - at optional -+(void) optionalProtocolClassMethod; --(void) optionalProtocolInstanceMethod; - - at required - at property int requiredProtocolProperty; - at optional - at property int optionalProtocolProperty; // XXX this is not actually - // optional in Obj-C 2? or - // maybe just in old ABI? - at end - - at protocol CategoryProtocol -+(void) categoryClassMethod; --(void) categoryInstanceMethod; - at end - - at interface A : NSObject

      { - at private - int privateVar; - at protected - int protectedVar; - at public - int publicVar; - __weak id weakVar; - __strong id strongVar; - - int halfDynamicA, halfDynamicB; -} - - at property(assign) int requiredProtocolProperty; - at property(assign) int optionalProtocolProperty; - - at property(assign) int Ptest_a; - at property(copy) id Ptest_b; - at property(retain) id Ptest_c; - - at property(getter=iGetThings) int things; - at property(setter=iSetOtherThings:) int otherThings; - - at property(assign) int dynamicNotReally; - at property(assign) int halfDynamicA; - at property(assign) int halfDynamicB; - -+(void) classMethod; --(void) instanceMethod; - at end - - at interface A () -+(void) extendedClassMethod; --(void) extendedInstanceMethod; -+(void) requiredProtocolClassMethod; --(void) requiredProtocolInstanceMethod; - at end - - at implementation A - at dynamic Ptest_a, Ptest_b, Ptest_c; - at dynamic things, otherThings; - - at dynamic dynamicNotReally; --(int) dynamicNotReally {}; --(void) dynamicNotReally: (int) arg {}; - - at synthesize halfDynamicA; --(int) halfDynamicA {}; - - at synthesize halfDynamicB; --(void) halfDynamicB: (int) arg {}; - - at synthesize requiredProtocolProperty = privateVar; -#ifdef ABI2 - at synthesize optionalProtocolProperty = someRandomVar; -#else - at synthesize optionalProtocolProperty = publicVar; -#endif - -+(void) classMethod { - printf("I am a class method\n"); -} --(void) instanceMethod { - printf("I am an instance method\n"); -} - -+(void) requiredProtocolClassMethod { - printf("I am a required protocol class method\n"); -} - --(void) requiredProtocolInstanceMethod { - printf("I am a required protocol instance method\n"); -} - -+(void) extendedClassMethod { - printf("I am an extended class method\n"); -} --(void) extendedInstanceMethod { - printf("I am an extended instance method\n"); -} - at end - - at interface A ( A_Category ) -+(void) categoryClassMethod; --(void) categoryInstanceMethod; - - at 
property(assign) int categoryProperty; - at end - - at implementation A ( A_Category ) - at dynamic categoryProperty; - -+(void) categoryClassMethod { -} --(void) categoryInstanceMethod { -} - at end - -/***/ - -int ivar_cmp(const void *av, const void *bv) { - const Ivar *a = av; - const Ivar *b = bv; - return strcmp(ivar_getName(*a), ivar_getName(*b)); -} - -int methodDescription_cmp(const void *av, const void *bv) { - const struct objc_method_description *a = av; - const struct objc_method_description *b = bv; - return strcmp(sel_getName(a->name), sel_getName(b->name)); -} - -int method_cmp(const void *av, const void *bv) { - const Method *a = av; - const Method *b = bv; - return strcmp(method_getName(*a), method_getName(*b)); -} - -int property_cmp(const void *av, const void *bv) { - const objc_property_t *a = av; - const objc_property_t *b = bv; - return strcmp(property_getName(*a), property_getName(*b)); -} - -int protocol_cmp(const void *av, const void *bv) { - Protocol * const *a = av; - Protocol * const *b = bv; - return strcmp(protocol_getName(*a), protocol_getName(*b)); -} - -void sort_ivars(Ivar *ivars, unsigned numIvars) { - qsort(ivars, numIvars, sizeof(*ivars), ivar_cmp); -} - -void sort_methodDescriptions(struct objc_method_description *methods, unsigned numMethods) { - qsort(methods, numMethods, sizeof(*methods), methodDescription_cmp); -} - -void sort_methods(Method *methods, unsigned numMethods) { - unsigned i; - qsort(methods, numMethods, sizeof(*methods), method_cmp); -} - -void sort_properties(objc_property_t *properties, unsigned numProperties) { - qsort(properties, numProperties, sizeof(*properties), property_cmp); -} - -void sort_protocols(Protocol **protocols, unsigned numProtocols) { - qsort(protocols, numProtocols, sizeof(*protocols), protocol_cmp); -} - -/***/ - -static int indent = 0; -#define PRINT1(e0,t0) printf("%*s" #e0 ": %" #t0 "\n", indent*2, "", e0) -#define PRINT2(e0,t0,e1,t1) printf("%*s" #e0 ": %" #t0 ", " #e1 ": %" #t1 "\n", 
indent*2, "", e0, e1) -#define PRINT3(e0,t0,e1,t1,e2,t2) printf("%*s" #e0 ": %" #t0 ", " #e1 ": %" #t1 ", " #e2 ": %" #t2 "\n", indent*2, "", e0, e1, e2) -#define PRINT4(e0,t0,e1,t1,e2,t2,e3,t3) printf("%*s" #e0 ": %" #t0 ", " #e1 ": %" #t1 ", " #e2 ": %" #t2 ", " #e3 ": %" #t3 "\n", indent*2, "", e0, e1, e2, e3) -#define PRINT5(e0,t0,e1,t1,e2,t2,e3,t3,e4,t4) printf("%*s" #e0 ": %" #t0 ", " #e1 ": %" #t1 ", " #e2 ": %" #t2 ", " #e3 ": %" #t3 ", " #e4 ": %" #t4 "\n", indent*2, "", e0, e1, e2, e3, e4) -void printInfo(Class c, int printData) { - unsigned i; - - ++indent; - PRINT1(c != 0, d); - PRINT1(class_getName(c), s); - PRINT1(object_getClassName(c), s); - PRINT1(objc_getClass(class_getName(c)) == c, d); - PRINT1(class_conformsToProtocol(c, @protocol(P)), d); - - unsigned numIvars; - Ivar *ivars = class_copyIvarList(c, &numIvars); - sort_ivars(ivars, numIvars); - PRINT1(numIvars, d); - if (printData) { - ++indent; - for (i=0; i -#include -#include -#include - - at protocol UnusedProtocol -+(void) makeWaffles; - at end - - at protocol P2 - at end - - at protocol P - at required -+(void) requiredProtocolClassMethod; --(void) requiredProtocolInstanceMethod; - at optional -+(void) optionalProtocolClassMethod; --(void) optionalProtocolInstanceMethod; - - at required - at property int requiredProtocolProperty; - at optional - at property int optionalProtocolProperty; // XXX this is not actually - // optional in Obj-C 2? or - // maybe just in old ABI? - at end - - at protocol CategoryProtocol -+(void) categoryClassMethod; --(void) categoryInstanceMethod; - at end - - at interface A : NSObject

      { - at private - int privateVar; - at protected - int protectedVar; - at public - int publicVar; - __weak id weakVar; - __strong id strongVar; - - int halfDynamicA, halfDynamicB; -} - - at property(assign) int requiredProtocolProperty; - at property(assign) int optionalProtocolProperty; - - at property(assign) int Ptest_a; - at property(copy) id Ptest_b; - at property(retain) id Ptest_c; - - at property(getter=iGetThings) int things; - at property(setter=iSetOtherThings:) int otherThings; - - at property(assign) int dynamicNotReally; - at property(assign) int halfDynamicA; - at property(assign) int halfDynamicB; - -+(void) classMethod; --(void) instanceMethod; - at end - - at interface A () -+(void) extendedClassMethod; --(void) extendedInstanceMethod; -+(void) requiredProtocolClassMethod; --(void) requiredProtocolInstanceMethod; - at end - - at implementation A - at dynamic Ptest_a, Ptest_b, Ptest_c; - at dynamic things, otherThings; - - at dynamic dynamicNotReally; --(int) dynamicNotReally {}; --(void) dynamicNotReally: (int) arg {}; - - at synthesize halfDynamicA; --(int) halfDynamicA {}; - - at synthesize halfDynamicB; --(void) halfDynamicB: (int) arg {}; - - at synthesize requiredProtocolProperty = privateVar; -#ifdef ABI2 - at synthesize optionalProtocolProperty = someRandomVar; -#else - at synthesize optionalProtocolProperty = publicVar; -#endif - -+(void) classMethod { - printf("I am a class method\n"); -} --(void) instanceMethod { - printf("I am an instance method\n"); -} - -+(void) requiredProtocolClassMethod { - printf("I am a required protocol class method\n"); -} - --(void) requiredProtocolInstanceMethod { - printf("I am a required protocol instance method\n"); -} - -+(void) extendedClassMethod { - printf("I am an extended class method\n"); -} --(void) extendedInstanceMethod { - printf("I am an extended instance method\n"); -} - at end - - at interface A ( A_Category ) -+(void) categoryClassMethod; --(void) categoryInstanceMethod; - - at 
property(assign) int categoryProperty; - at end - - at implementation A ( A_Category ) - at dynamic categoryProperty; - -+(void) categoryClassMethod { -} --(void) categoryInstanceMethod { -} - at end - -/***/ - -int ivar_cmp(const void *av, const void *bv) { - const Ivar *a = av; - const Ivar *b = bv; - return strcmp(ivar_getName(*a), ivar_getName(*b)); -} - -int methodDescription_cmp(const void *av, const void *bv) { - const struct objc_method_description *a = av; - const struct objc_method_description *b = bv; - return strcmp(sel_getName(a->name), sel_getName(b->name)); -} - -int method_cmp(const void *av, const void *bv) { - const Method *a = av; - const Method *b = bv; - return strcmp(method_getName(*a), method_getName(*b)); -} - -int property_cmp(const void *av, const void *bv) { - const objc_property_t *a = av; - const objc_property_t *b = bv; - return strcmp(property_getName(*a), property_getName(*b)); -} - -int protocol_cmp(const void *av, const void *bv) { - Protocol * const *a = av; - Protocol * const *b = bv; - return strcmp(protocol_getName(*a), protocol_getName(*b)); -} - -void sort_ivars(Ivar *ivars, unsigned numIvars) { - qsort(ivars, numIvars, sizeof(*ivars), ivar_cmp); -} - -void sort_methodDescriptions(struct objc_method_description *methods, unsigned numMethods) { - qsort(methods, numMethods, sizeof(*methods), methodDescription_cmp); -} - -void sort_methods(Method *methods, unsigned numMethods) { - unsigned i; - qsort(methods, numMethods, sizeof(*methods), method_cmp); -} - -void sort_properties(objc_property_t *properties, unsigned numProperties) { - qsort(properties, numProperties, sizeof(*properties), property_cmp); -} - -void sort_protocols(Protocol **protocols, unsigned numProtocols) { - qsort(protocols, numProtocols, sizeof(*protocols), protocol_cmp); -} - -/***/ - -static int indent = 0; -#define PRINT1(e0,t0) printf("%*s" #e0 ": %" #t0 "\n", indent*2, "", e0) -#define PRINT2(e0,t0,e1,t1) printf("%*s" #e0 ": %" #t0 ", " #e1 ": %" #t1 "\n", 
indent*2, "", e0, e1) -#define PRINT3(e0,t0,e1,t1,e2,t2) printf("%*s" #e0 ": %" #t0 ", " #e1 ": %" #t1 ", " #e2 ": %" #t2 "\n", indent*2, "", e0, e1, e2) -#define PRINT4(e0,t0,e1,t1,e2,t2,e3,t3) printf("%*s" #e0 ": %" #t0 ", " #e1 ": %" #t1 ", " #e2 ": %" #t2 ", " #e3 ": %" #t3 "\n", indent*2, "", e0, e1, e2, e3) -#define PRINT5(e0,t0,e1,t1,e2,t2,e3,t3,e4,t4) printf("%*s" #e0 ": %" #t0 ", " #e1 ": %" #t1 ", " #e2 ": %" #t2 ", " #e3 ": %" #t3 ", " #e4 ": %" #t4 "\n", indent*2, "", e0, e1, e2, e3, e4) -void printInfo(Class c, int printData) { - unsigned i; - - ++indent; - PRINT1(c != 0, d); - PRINT1(class_getName(c), s); - PRINT1(object_getClassName(c), s); - PRINT1(objc_getClass(class_getName(c)) == c, d); - PRINT1(class_conformsToProtocol(c, @protocol(P)), d); - - unsigned numIvars; - Ivar *ivars = class_copyIvarList(c, &numIvars); - sort_ivars(ivars, numIvars); - PRINT1(numIvars, d); - if (printData) { - ++indent; - for (i=0; i Author: rafael Date: Mon Oct 4 10:28:43 2010 New Revision: 115506 URL: http://llvm.org/viewvc/llvm-project?rev=115506&view=rev Log: Correctly compute the relocation when it is not in the first fragment. 
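A minimal sketch of the arithmetic behind this fix, written in Python for illustration only; the actual change is the C++ one-liner in ELFObjectWriter.cpp shown in the diff that follows. The function and parameter names are hypothetical. The assumption, taken from the shape of the patched expression, is that a fixup's offset is relative to the start of its fragment, so the fragment's own offset has to be added before the address of symbol B is subtracted.

```python
def pcrel_value(addend, fragment_offset, fixup_offset, symbol_b_address):
    """Value for an `A - B` expression emitted as a PC-relative relocation.

    Mirrors the shape of the patched expression; all names are illustrative.
    """
    return addend + fragment_offset + fixup_offset - symbol_b_address


def pcrel_value_old(addend, fixup_offset, symbol_b_address):
    """Pre-fix form: the fragment offset is left out."""
    return addend + fixup_offset - symbol_b_address


# In the first fragment (offset 0) both forms agree.
assert pcrel_value(0, 0, 8, 4) == pcrel_value_old(0, 8, 4)
# In a later fragment the old form is short by exactly the fragment offset.
assert pcrel_value(0, 16, 8, 4) - pcrel_value_old(0, 8, 4) == 16
```

This is presumably why the test change replaces `.zero 4` with `.zero 1` followed by `.align 4`: the intent appears to be to push the relocated data out of the first fragment so the missing term becomes visible.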
Modified: llvm/trunk/lib/MC/ELFObjectWriter.cpp llvm/trunk/test/MC/ELF/pic-diff.s Modified: llvm/trunk/lib/MC/ELFObjectWriter.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/ELFObjectWriter.cpp?rev=115506&r1=115505&r2=115506&view=diff ============================================================================== --- llvm/trunk/lib/MC/ELFObjectWriter.cpp (original) +++ llvm/trunk/lib/MC/ELFObjectWriter.cpp Mon Oct 4 10:28:43 2010 @@ -539,7 +539,7 @@ const MCSymbol &SymbolB = RefB->getSymbol(); MCSymbolData &SDB = Asm.getSymbolData(SymbolB); IsPCRel = true; - Value += Fixup.getOffset() - Layout.getSymbolAddress(&SDB); + Value += Layout.getFragmentOffset(Fragment) + Fixup.getOffset() - Layout.getSymbolAddress(&SDB); } // Check that this case has already been fully resolved before we get Modified: llvm/trunk/test/MC/ELF/pic-diff.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ELF/pic-diff.s?rev=115506&r1=115505&r2=115506&view=diff ============================================================================== --- llvm/trunk/test/MC/ELF/pic-diff.s (original) +++ llvm/trunk/test/MC/ELF/pic-diff.s Mon Oct 4 10:28:43 2010 @@ -19,7 +19,8 @@ // CHECK-NEXT: ), // CHECK-NEXT: ]) -.zero 4 +.zero 1 +.align 4 foo: .zero 8 .long baz - foo From baldrick at free.fr Mon Oct 4 10:38:45 2010 From: baldrick at free.fr (Duncan Sands) Date: Mon, 04 Oct 2010 15:38:45 -0000 Subject: [llvm-commits] [zorg] r115507 - in /zorg/trunk: buildbot/osuosl/master/config/builders.py zorg/buildbot/builders/LLVMGCCBuilder.py Message-ID: <20101004153845.979242A6C12E@llvm.org> Author: baldrick Date: Mon Oct 4 10:38:45 2010 New Revision: 115507 URL: http://llvm.org/viewvc/llvm-project?rev=115507&view=rev Log: Make it possible to specify extra languages for llvm-gcc. Give this a whirl by having the i386 buildbot build Fortran as well as C and C++. 
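The effect of the extra-languages option described in this log message can be sketched roughly as below. This is a standalone simplification and not the zorg code itself: `configure_args` is a hypothetical helper, and the `gxxincludedir` handling present in the real builder is left out.

```python
def configure_args(extra_languages=None, extra_configure_args=()):
    """Build a simplified llvm-gcc configure command line (hypothetical helper)."""
    args = ["../llvm-gcc.src/configure"]
    languages = "--enable-languages=c,c++"
    if extra_languages:
        # Extra languages are appended, comma-separated, to the C/C++ baseline.
        languages += "," + extra_languages
    args.append(languages)
    args.extend(extra_configure_args)
    return args


print(configure_args(extra_languages="fortran",
                     extra_configure_args=["--disable-multilib"]))
```

Keeping C and C++ as a fixed baseline means existing builders that do not pass the new argument get the same configure line as before.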
Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/buildbot/osuosl/master/config/builders.py?rev=115507&r1=115506&r2=115507&view=diff ============================================================================== --- zorg/trunk/buildbot/osuosl/master/config/builders.py (original) +++ zorg/trunk/buildbot/osuosl/master/config/builders.py Mon Oct 4 10:38:45 2010 @@ -94,6 +94,7 @@ 'slavenames':["gcc11"], 'builddir':"llvm-gcc-i386-linux-selfhost", 'factory':LLVMGCCBuilder.getLLVMGCCBuildFactory(triple='i686-pc-linux-gnu', + extra_languages="fortran", extra_configure_args=['--disable-multilib', '--enable-targets=all','--with-as=/home/baldrick/bin32/as'])}, ] Modified: zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py?rev=115507&r1=115506&r2=115507&view=diff ============================================================================== --- zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py (original) +++ zorg/trunk/zorg/buildbot/builders/LLVMGCCBuilder.py Mon Oct 4 10:38:45 2010 @@ -13,8 +13,9 @@ triple=None, build=None, host=None, target=None, useTwoStage=True, stage1_config='Release+Asserts', stage2_config='Release+Asserts', make='make', - extra_configure_args=[], verbose=False, env = {}, - defaultBranch='trunk', timeout=20): + extra_configure_args=[], extra_languages=None, + verbose=False, env = {}, defaultBranch='trunk', + timeout=20): if build or host or target: if not build or not host or not target: raise ValueError,"Must specify all of 'build', 'host', 'target' if used." @@ -103,8 +104,11 @@ workdir=".", env=env)) # Configure llvm-gcc. 
- base_llvmgcc_configure_args = ["../llvm-gcc.src/configure", - "--enable-languages=c,c++"] + base_llvmgcc_configure_args = ["../llvm-gcc.src/configure"] + llvmgcc_languages = "--enable-languages=c,c++" + if extra_languages: + llvmgcc_languages = llvmgcc_languages + "," + extra_languages + base_llvmgcc_configure_args.append(llvmgcc_languages) if gxxincludedir: base_llvmgcc_configure_args.append('--with-gxx-include-dir=' + gxxincludedir) base_llvmgcc_configure_args.extend(extra_configure_args) From grosbach at apple.com Mon Oct 4 10:39:57 2010 From: grosbach at apple.com (Jim Grosbach) Date: Mon, 4 Oct 2010 08:39:57 -0700 Subject: [llvm-commits] [llvm] r115393 - in /llvm/trunk: CMakeLists.txt lib/Target/MSP430/InstPrinter/CMakeLists.txt lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h lib/Target/MSP430/InstPrinter/Makefi In-Reply-To: References: <07E65C41-D605-4147-B976-1BC502C5BF68@apple.com> Message-ID: <512DB063-78F7-4D30-A660-DECA1CC817D2@apple.com> Hi Nick, That's a great help, yes. Thank you. I'm hoping to get some time to look at this again this afternoon. It's very odd that this only shows up on Linux. Now that we know that, I should be able to reproduce this issue. The getRegisterName() reference is expected, but going the other direction (from the instruction printer to codegen) is not, and if there really is such a reference, it's a definite bug that needs to be cleaned up. I don't see an obvious reference to GR8 or GR16 in the instruction printer source, though. I'll look more closely later. Thanks again, Jim On Oct 3, 2010, at 11:39 PM, Nick Lewycky wrote: > Here's what the GenLibDeps.pl script thinks is going on. > > libLLVMMSP430CodeGen.a uses but does not define symbols: > _ZN4llvm17MSP430InstPrinter15getRegisterNameEj aka. llvm::MSP430InstPrinter::getRegisterName(unsigned int) > which are provided by libLLVMMSP430AsmPrinter.a.
Going in the other direction, libLLVMMSP430AsmPrinter.a uses symbols: > _ZN4llvm6MSP43011GR8RegClassE aka. llvm::MSP430::GR8RegClass > _ZN4llvm6MSP43012GR16RegClassE aka. llvm::MSP430::GR16RegClass > > GR8RegClass and GR16RegClass is defined in MSP430RegisterInfo.o (which is rolled into ...CodeGen.a). Its only reference in ...AsmPrinter.a is by MSP430InstPrinter.o. > > The getRegisterName function is defined in MSP430InstPrinter.o (part of ...AsmPrinter.a) and its only reference in ...CodeGen.a is by MSP430AsmPrinter.o. > > Is that enough to go on? > > Nick > > On 1 October 2010 18:43, Nick Lewycky wrote: > Okay. You can see that almost all of the open-source builders were broken: > > http://google1.osuosl.org:8011/console > > in that time. It's impossible for this particular error to occur in a cmake build because cmake doesn't run find-cycles.pl (last i checked). My suspicion is that the cmake builders were working fine while configure+make ones were not? > > I'm going to wind back to the broken point and try to reproduce the failure and see if I can figure out what the cyclic dependency actually was. > > Nick > > > On 1 October 2010 18:27, Jim Grosbach wrote: > That's very strange. I do a configure/make here, and it works, and lots of bots using that were green as well. If there's a case I missed, I'd love to have some help tracking down what it is. Can you try a "make clean" and see if that works? Maybe there's just something stale that the configure portion of the patch needs to clean up. > > -Jim > > > > On Oct 1, 2010, at 6:24 PM, Nick Lewycky wrote: > >> Nope, it broke under a regular configure+make in-srctree incremental build on multiple different machines. >> >> On 1 October 2010 18:22, Jim Grosbach wrote: >> Nick, >> >> These only break for you under CMake, right? That's the only place I've been able to reproduce failures. 
>> >> -Jim >> >> >> On Oct 1, 2010, at 6:06 PM, Nick Lewycky wrote: >> >> > Author: nicholas >> > Date: Fri Oct 1 20:06:42 2010 >> > New Revision: 115393 >> > >> > URL: http://llvm.org/viewvc/llvm-project?rev=115393&view=rev >> > Log: >> > Revert patches r115363 r115367 r115391 due to build breakage: >> > llvm[2]: Updated LibDeps.txt because dependencies changed >> > llvm[2]: Checking for cyclic dependencies between LLVM libraries. >> > find-cycles.pl: Circular dependency between *.a files: >> > find-cycles.pl: libLLVMMSP430AsmPrinter.a libLLVMMSP430CodeGen.a >> > >> > >> > Modified: >> > llvm/trunk/CMakeLists.txt >> > llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt >> > llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp >> > llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h >> > llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile >> > llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp >> > llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp >> > llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h >> > llvm/trunk/lib/Target/MSP430/Makefile >> > >> > Modified: llvm/trunk/CMakeLists.txt >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/CMakeLists.txt?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/CMakeLists.txt (original) >> > +++ llvm/trunk/CMakeLists.txt Fri Oct 1 20:06:42 2010 >> > @@ -323,10 +323,6 @@ >> > add_subdirectory(lib/Target/${t}/AsmPrinter) >> > set(LLVM_ENUM_ASM_PRINTERS >> > "${LLVM_ENUM_ASM_PRINTERS}LLVM_ASM_PRINTER(${t})\n") >> > - if( EXISTS ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/InstPrinter/CMakeLists.txt ) >> > - add_subdirectory(lib/Target/${t}/InstPrinter) >> > - set(LLVM_ENUM_ASM_PRINTERS >> > - "${LLVM_ENUM_ASM_PRINTERS}LLVM_ASM_PRINTER(${t})\n") >> > endif( EXISTS ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/AsmPrinter/CMakeLists.txt ) >> > if( EXISTS ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/AsmParser/CMakeLists.txt ) >> > 
add_subdirectory(lib/Target/${t}/AsmParser) >> > >> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt (original) >> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt Fri Oct 1 20:06:42 2010 >> > @@ -1,6 +0,0 @@ >> > -include_directories( ${CMAKE_CURRENT_BINARY_DIR}/.. ${CMAKE_CURRENT_SOURCE_DIR}/.. ) >> > - >> > -add_llvm_library(LLVMMSP430AsmPrinter >> > - MSP430InstPrinter.cpp >> > - ) >> > -add_dependencies(LLVMMSP430AsmPrinter MSP430CodeGenTable_gen) >> > >> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp (original) >> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp Fri Oct 1 20:06:42 2010 >> > @@ -1,114 +0,0 @@ >> > -//===-- MSP430InstPrinter.cpp - Convert MSP430 MCInst to assembly syntax --===// >> > -// >> > -// The LLVM Compiler Infrastructure >> > -// >> > -// This file is distributed under the University of Illinois Open Source >> > -// License. See LICENSE.TXT for details. >> > -// >> > -//===----------------------------------------------------------------------===// >> > -// >> > -// This class prints an MSP430 MCInst to a .s file. 
>> > -// >> > -//===----------------------------------------------------------------------===// >> > - >> > -#define DEBUG_TYPE "asm-printer" >> > -#include "MSP430.h" >> > -#include "MSP430InstrInfo.h" >> > -#include "MSP430InstPrinter.h" >> > -#include "llvm/MC/MCInst.h" >> > -#include "llvm/MC/MCAsmInfo.h" >> > -#include "llvm/MC/MCExpr.h" >> > -#include "llvm/Support/ErrorHandling.h" >> > -#include "llvm/Support/FormattedStream.h" >> > -using namespace llvm; >> > - >> > - >> > -// Include the auto-generated portion of the assembly writer. >> > -#include "MSP430GenAsmWriter.inc" >> > - >> > -void MSP430InstPrinter::printInst(const MCInst *MI, raw_ostream &O) { >> > - printInstruction(MI, O); >> > -} >> > - >> > -void MSP430InstPrinter::printPCRelImmOperand(const MCInst *MI, unsigned OpNo, >> > - raw_ostream &O) { >> > - const MCOperand &Op = MI->getOperand(OpNo); >> > - if (Op.isImm()) >> > - O << Op.getImm(); >> > - else { >> > - assert(Op.isExpr() && "unknown pcrel immediate operand"); >> > - O << *Op.getExpr(); >> > - } >> > -} >> > - >> > -void MSP430InstPrinter::printOperand(const MCInst *MI, unsigned OpNo, >> > - raw_ostream &O, const char *Modifier) { >> > - assert((Modifier == 0 || Modifier[0] == 0) && "No modifiers supported"); >> > - const MCOperand &Op = MI->getOperand(OpNo); >> > - if (Op.isReg()) { >> > - O << getRegisterName(Op.getReg()); >> > - } else if (Op.isImm()) { >> > - O << '#' << Op.getImm(); >> > - } else { >> > - assert(Op.isExpr() && "unknown operand kind in printOperand"); >> > - O << '#' << *Op.getExpr(); >> > - } >> > -} >> > - >> > -void MSP430InstPrinter::printSrcMemOperand(const MCInst *MI, unsigned OpNo, >> > - raw_ostream &O, >> > - const char *Modifier) { >> > - const MCOperand &Base = MI->getOperand(OpNo); >> > - const MCOperand &Disp = MI->getOperand(OpNo+1); >> > - >> > - // Print displacement first >> > - >> > - // If the global address expression is a part of displacement field with a >> > - // register base, we should not 
emit any prefix symbol here, e.g. >> > - // mov.w &foo, r1 >> > - // vs >> > - // mov.w glb(r1), r2 >> > - // Otherwise (!) msp430-as will silently miscompile the output :( >> > - if (!Base.getReg()) >> > - O << '&'; >> > - >> > - if (Disp.isExpr()) >> > - O << *Disp.getExpr(); >> > - else { >> > - assert(Disp.isImm() && "Expected immediate in displacement field"); >> > - O << Disp.getImm(); >> > - } >> > - >> > - // Print register base field >> > - if (Base.getReg()) >> > - O << '(' << getRegisterName(Base.getReg()) << ')'; >> > -} >> > - >> > -void MSP430InstPrinter::printCCOperand(const MCInst *MI, unsigned OpNo, >> > - raw_ostream &O) { >> > - unsigned CC = MI->getOperand(OpNo).getImm(); >> > - >> > - switch (CC) { >> > - default: >> > - llvm_unreachable("Unsupported CC code"); >> > - break; >> > - case MSP430CC::COND_E: >> > - O << "eq"; >> > - break; >> > - case MSP430CC::COND_NE: >> > - O << "ne"; >> > - break; >> > - case MSP430CC::COND_HS: >> > - O << "hs"; >> > - break; >> > - case MSP430CC::COND_LO: >> > - O << "lo"; >> > - break; >> > - case MSP430CC::COND_GE: >> > - O << "ge"; >> > - break; >> > - case MSP430CC::COND_L: >> > - O << 'l'; >> > - break; >> > - } >> > -} >> > >> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h (original) >> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h Fri Oct 1 20:06:42 2010 >> > @@ -1,43 +0,0 @@ >> > -//===-- MSP430InstPrinter.h - Convert MSP430 MCInst to assembly syntax ----===// >> > -// >> > -// The LLVM Compiler Infrastructure >> > -// >> > -// This file is distributed under the University of Illinois Open Source >> > -// License. See LICENSE.TXT for details. 
>> > -// >> > -//===----------------------------------------------------------------------===// >> > -// >> > -// This class prints a MSP430 MCInst to a .s file. >> > -// >> > -//===----------------------------------------------------------------------===// >> > - >> > -#ifndef MSP430INSTPRINTER_H >> > -#define MSP430INSTPRINTER_H >> > - >> > -#include "llvm/MC/MCInstPrinter.h" >> > - >> > -namespace llvm { >> > - class MCOperand; >> > - >> > - class MSP430InstPrinter : public MCInstPrinter { >> > - public: >> > - MSP430InstPrinter(const MCAsmInfo &MAI) : MCInstPrinter(MAI) { >> > - } >> > - >> > - virtual void printInst(const MCInst *MI, raw_ostream &O); >> > - >> > - // Autogenerated by tblgen. >> > - void printInstruction(const MCInst *MI, raw_ostream &O); >> > - static const char *getRegisterName(unsigned RegNo); >> > - >> > - void printOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O, >> > - const char *Modifier = 0); >> > - void printPCRelImmOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O); >> > - void printSrcMemOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O, >> > - const char *Modifier = 0); >> > - void printCCOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O); >> > - >> > - }; >> > -} >> > - >> > -#endif >> > >> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile (original) >> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile Fri Oct 1 20:06:42 2010 >> > @@ -1,15 +0,0 @@ >> > -##===- lib/Target/MSP430/AsmPrinter/Makefile ---------------*- Makefile -*-===## >> > -# >> > -# The LLVM Compiler Infrastructure >> > -# >> > -# This file is distributed under the University of Illinois Open Source >> > -# License. See LICENSE.TXT for details. 
>> > -# >> > -##===----------------------------------------------------------------------===## >> > -LEVEL = ../../../.. >> > -LIBRARYNAME = LLVMMSP430AsmPrinter >> > - >> > -# Hack: we need to include 'main' MSP430 target directory to grab private headers >> > -CPP.Flags += -I$(PROJ_OBJ_DIR)/.. -I$(PROJ_SRC_DIR)/.. >> > - >> > -include $(LEVEL)/Makefile.common >> > >> > Modified: llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp (original) >> > +++ llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp Fri Oct 1 20:06:42 2010 >> > @@ -1,179 +0,0 @@ >> > -//===-- MSP430AsmPrinter.cpp - MSP430 LLVM assembly writer ----------------===// >> > -// >> > -// The LLVM Compiler Infrastructure >> > -// >> > -// This file is distributed under the University of Illinois Open Source >> > -// License. See LICENSE.TXT for details. >> > -// >> > -//===----------------------------------------------------------------------===// >> > -// >> > -// This file contains a printer that converts from our internal representation >> > -// of machine-dependent LLVM code to the MSP430 assembly language. 
>> > -// >> > -//===----------------------------------------------------------------------===// >> > - >> > -#define DEBUG_TYPE "asm-printer" >> > -#include "MSP430.h" >> > -#include "MSP430InstrInfo.h" >> > -#include "InstPrinter/MSP430InstPrinter.h" >> > -#include "MSP430MCAsmInfo.h" >> > -#include "MSP430MCInstLower.h" >> > -#include "MSP430TargetMachine.h" >> > -#include "llvm/Constants.h" >> > -#include "llvm/DerivedTypes.h" >> > -#include "llvm/Module.h" >> > -#include "llvm/Assembly/Writer.h" >> > -#include "llvm/CodeGen/AsmPrinter.h" >> > -#include "llvm/CodeGen/MachineModuleInfo.h" >> > -#include "llvm/CodeGen/MachineFunctionPass.h" >> > -#include "llvm/CodeGen/MachineConstantPool.h" >> > -#include "llvm/CodeGen/MachineInstr.h" >> > -#include "llvm/MC/MCInst.h" >> > -#include "llvm/MC/MCStreamer.h" >> > -#include "llvm/MC/MCSymbol.h" >> > -#include "llvm/Target/Mangler.h" >> > -#include "llvm/Target/TargetData.h" >> > -#include "llvm/Target/TargetLoweringObjectFile.h" >> > -#include "llvm/Target/TargetRegistry.h" >> > -#include "llvm/Support/raw_ostream.h" >> > -using namespace llvm; >> > - >> > -namespace { >> > - class MSP430AsmPrinter : public AsmPrinter { >> > - public: >> > - MSP430AsmPrinter(TargetMachine &TM, MCStreamer &Streamer) >> > - : AsmPrinter(TM, Streamer) {} >> > - >> > - virtual const char *getPassName() const { >> > - return "MSP430 Assembly Printer"; >> > - } >> > - >> > - void printOperand(const MachineInstr *MI, int OpNum, >> > - raw_ostream &O, const char* Modifier = 0); >> > - void printSrcMemOperand(const MachineInstr *MI, int OpNum, >> > - raw_ostream &O); >> > - bool PrintAsmOperand(const MachineInstr *MI, unsigned OpNo, >> > - unsigned AsmVariant, const char *ExtraCode, >> > - raw_ostream &O); >> > - bool PrintAsmMemoryOperand(const MachineInstr *MI, >> > - unsigned OpNo, unsigned AsmVariant, >> > - const char *ExtraCode, raw_ostream &O); >> > - void EmitInstruction(const MachineInstr *MI); >> > - }; >> > -} // end of anonymous 
namespace >> > - >> > - >> > -void MSP430AsmPrinter::printOperand(const MachineInstr *MI, int OpNum, >> > - raw_ostream &O, const char *Modifier) { >> > - const MachineOperand &MO = MI->getOperand(OpNum); >> > - switch (MO.getType()) { >> > - default: assert(0 && "Not implemented yet!"); >> > - case MachineOperand::MO_Register: >> > - O << MSP430InstPrinter::getRegisterName(MO.getReg()); >> > - return; >> > - case MachineOperand::MO_Immediate: >> > - if (!Modifier || strcmp(Modifier, "nohash")) >> > - O << '#'; >> > - O << MO.getImm(); >> > - return; >> > - case MachineOperand::MO_MachineBasicBlock: >> > - O << *MO.getMBB()->getSymbol(); >> > - return; >> > - case MachineOperand::MO_GlobalAddress: { >> > - bool isMemOp = Modifier && !strcmp(Modifier, "mem"); >> > - uint64_t Offset = MO.getOffset(); >> > - >> > - // If the global address expression is a part of displacement field with a >> > - // register base, we should not emit any prefix symbol here, e.g. >> > - // mov.w &foo, r1 >> > - // vs >> > - // mov.w glb(r1), r2 >> > - // Otherwise (!) msp430-as will silently miscompile the output :( >> > - if (!Modifier || strcmp(Modifier, "nohash")) >> > - O << (isMemOp ? '&' : '#'); >> > - if (Offset) >> > - O << '(' << Offset << '+'; >> > - >> > - O << *Mang->getSymbol(MO.getGlobal()); >> > - >> > - if (Offset) >> > - O << ')'; >> > - >> > - return; >> > - } >> > - case MachineOperand::MO_ExternalSymbol: { >> > - bool isMemOp = Modifier && !strcmp(Modifier, "mem"); >> > - O << (isMemOp ? '&' : '#'); >> > - O << MAI->getGlobalPrefix() << MO.getSymbolName(); >> > - return; >> > - } >> > - } >> > -} >> > - >> > -void MSP430AsmPrinter::printSrcMemOperand(const MachineInstr *MI, int OpNum, >> > - raw_ostream &O) { >> > - const MachineOperand &Base = MI->getOperand(OpNum); >> > - const MachineOperand &Disp = MI->getOperand(OpNum+1); >> > - >> > - // Print displacement first >> > - >> > - // Imm here is in fact global address - print extra modifier. 
>> > - if (Disp.isImm() && !Base.getReg()) >> > - O << '&'; >> > - printOperand(MI, OpNum+1, O, "nohash"); >> > - >> > - // Print register base field >> > - if (Base.getReg()) { >> > - O << '('; >> > - printOperand(MI, OpNum, O); >> > - O << ')'; >> > - } >> > -} >> > - >> > -/// PrintAsmOperand - Print out an operand for an inline asm expression. >> > -/// >> > -bool MSP430AsmPrinter::PrintAsmOperand(const MachineInstr *MI, unsigned OpNo, >> > - unsigned AsmVariant, >> > - const char *ExtraCode, raw_ostream &O) { >> > - // Does this asm operand have a single letter operand modifier? >> > - if (ExtraCode && ExtraCode[0]) >> > - return true; // Unknown modifier. >> > - >> > - printOperand(MI, OpNo, O); >> > - return false; >> > -} >> > - >> > -bool MSP430AsmPrinter::PrintAsmMemoryOperand(const MachineInstr *MI, >> > - unsigned OpNo, unsigned AsmVariant, >> > - const char *ExtraCode, >> > - raw_ostream &O) { >> > - if (ExtraCode && ExtraCode[0]) { >> > - return true; // Unknown modifier. >> > - } >> > - printSrcMemOperand(MI, OpNo, O); >> > - return false; >> > -} >> > - >> > -//===----------------------------------------------------------------------===// >> > -void MSP430AsmPrinter::EmitInstruction(const MachineInstr *MI) { >> > - MSP430MCInstLower MCInstLowering(OutContext, *Mang, *this); >> > - >> > - MCInst TmpInst; >> > - MCInstLowering.Lower(MI, TmpInst); >> > - OutStreamer.EmitInstruction(TmpInst); >> > -} >> > - >> > -static MCInstPrinter *createMSP430MCInstPrinter(const Target &T, >> > - unsigned SyntaxVariant, >> > - const MCAsmInfo &MAI) { >> > - if (SyntaxVariant == 0) >> > - return new MSP430InstPrinter(MAI); >> > - return 0; >> > -} >> > - >> > -// Force static initialization. 
>> > -extern "C" void LLVMInitializeMSP430AsmPrinter() { >> > - RegisterAsmPrinter X(TheMSP430Target); >> > - TargetRegistry::RegisterMCInstPrinter(TheMSP430Target, >> > - createMSP430MCInstPrinter); >> > -} >> > >> > Modified: llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp (original) >> > +++ llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp Fri Oct 1 20:06:42 2010 >> > @@ -1,150 +0,0 @@ >> > -//===-- MSP430MCInstLower.cpp - Convert MSP430 MachineInstr to an MCInst---===// >> > -// >> > -// The LLVM Compiler Infrastructure >> > -// >> > -// This file is distributed under the University of Illinois Open Source >> > -// License. See LICENSE.TXT for details. >> > -// >> > -//===----------------------------------------------------------------------===// >> > -// >> > -// This file contains code to lower MSP430 MachineInstrs to their corresponding >> > -// MCInst records. 
>> > -// >> > -//===----------------------------------------------------------------------===// >> > - >> > -#include "MSP430MCInstLower.h" >> > -#include "llvm/CodeGen/AsmPrinter.h" >> > -#include "llvm/CodeGen/MachineBasicBlock.h" >> > -#include "llvm/CodeGen/MachineInstr.h" >> > -#include "llvm/MC/MCAsmInfo.h" >> > -#include "llvm/MC/MCContext.h" >> > -#include "llvm/MC/MCExpr.h" >> > -#include "llvm/MC/MCInst.h" >> > -#include "llvm/Target/Mangler.h" >> > -#include "llvm/Support/raw_ostream.h" >> > -#include "llvm/Support/ErrorHandling.h" >> > -#include "llvm/ADT/SmallString.h" >> > -using namespace llvm; >> > - >> > -MCSymbol *MSP430MCInstLower:: >> > -GetGlobalAddressSymbol(const MachineOperand &MO) const { >> > - switch (MO.getTargetFlags()) { >> > - default: llvm_unreachable("Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - return Printer.Mang->getSymbol(MO.getGlobal()); >> > -} >> > - >> > -MCSymbol *MSP430MCInstLower:: >> > -GetExternalSymbolSymbol(const MachineOperand &MO) const { >> > - switch (MO.getTargetFlags()) { >> > - default: assert(0 && "Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - return Printer.GetExternalSymbolSymbol(MO.getSymbolName()); >> > -} >> > - >> > -MCSymbol *MSP430MCInstLower:: >> > -GetJumpTableSymbol(const MachineOperand &MO) const { >> > - SmallString<256> Name; >> > - raw_svector_ostream(Name) << Printer.MAI->getPrivateGlobalPrefix() << "JTI" >> > - << Printer.getFunctionNumber() << '_' >> > - << MO.getIndex(); >> > - >> > - switch (MO.getTargetFlags()) { >> > - default: llvm_unreachable("Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - // Create a symbol for the name. 
>> > - return Ctx.GetOrCreateSymbol(Name.str()); >> > -} >> > - >> > -MCSymbol *MSP430MCInstLower:: >> > -GetConstantPoolIndexSymbol(const MachineOperand &MO) const { >> > - SmallString<256> Name; >> > - raw_svector_ostream(Name) << Printer.MAI->getPrivateGlobalPrefix() << "CPI" >> > - << Printer.getFunctionNumber() << '_' >> > - << MO.getIndex(); >> > - >> > - switch (MO.getTargetFlags()) { >> > - default: llvm_unreachable("Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - // Create a symbol for the name. >> > - return Ctx.GetOrCreateSymbol(Name.str()); >> > -} >> > - >> > -MCSymbol *MSP430MCInstLower:: >> > -GetBlockAddressSymbol(const MachineOperand &MO) const { >> > - switch (MO.getTargetFlags()) { >> > - default: assert(0 && "Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - return Printer.GetBlockAddressSymbol(MO.getBlockAddress()); >> > -} >> > - >> > -MCOperand MSP430MCInstLower:: >> > -LowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym) const { >> > - // FIXME: We would like an efficient form for this, so we don't have to do a >> > - // lot of extra uniquing. 
>> > - const MCExpr *Expr = MCSymbolRefExpr::Create(Sym, Ctx); >> > - >> > - switch (MO.getTargetFlags()) { >> > - default: llvm_unreachable("Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - if (!MO.isJTI() && MO.getOffset()) >> > - Expr = MCBinaryExpr::CreateAdd(Expr, >> > - MCConstantExpr::Create(MO.getOffset(), Ctx), >> > - Ctx); >> > - return MCOperand::CreateExpr(Expr); >> > -} >> > - >> > -void MSP430MCInstLower::Lower(const MachineInstr *MI, MCInst &OutMI) const { >> > - OutMI.setOpcode(MI->getOpcode()); >> > - >> > - for (unsigned i = 0, e = MI->getNumOperands(); i != e; ++i) { >> > - const MachineOperand &MO = MI->getOperand(i); >> > - >> > - MCOperand MCOp; >> > - switch (MO.getType()) { >> > - default: >> > - MI->dump(); >> > - assert(0 && "unknown operand type"); >> > - case MachineOperand::MO_Register: >> > - // Ignore all implicit register operands. >> > - if (MO.isImplicit()) continue; >> > - MCOp = MCOperand::CreateReg(MO.getReg()); >> > - break; >> > - case MachineOperand::MO_Immediate: >> > - MCOp = MCOperand::CreateImm(MO.getImm()); >> > - break; >> > - case MachineOperand::MO_MachineBasicBlock: >> > - MCOp = MCOperand::CreateExpr(MCSymbolRefExpr::Create( >> > - MO.getMBB()->getSymbol(), Ctx)); >> > - break; >> > - case MachineOperand::MO_GlobalAddress: >> > - MCOp = LowerSymbolOperand(MO, GetGlobalAddressSymbol(MO)); >> > - break; >> > - case MachineOperand::MO_ExternalSymbol: >> > - MCOp = LowerSymbolOperand(MO, GetExternalSymbolSymbol(MO)); >> > - break; >> > - case MachineOperand::MO_JumpTableIndex: >> > - MCOp = LowerSymbolOperand(MO, GetJumpTableSymbol(MO)); >> > - break; >> > - case MachineOperand::MO_ConstantPoolIndex: >> > - MCOp = LowerSymbolOperand(MO, GetConstantPoolIndexSymbol(MO)); >> > - break; >> > - case MachineOperand::MO_BlockAddress: >> > - MCOp = LowerSymbolOperand(MO, GetBlockAddressSymbol(MO)); >> > - } >> > - >> > - OutMI.addOperand(MCOp); >> > - } >> > -} >> > >> > Modified: 
llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h (original) >> > +++ llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h Fri Oct 1 20:06:42 2010 >> > @@ -1,50 +0,0 @@ >> > -//===-- MSP430MCInstLower.h - Lower MachineInstr to MCInst ----------------===// >> > -// >> > -// The LLVM Compiler Infrastructure >> > -// >> > -// This file is distributed under the University of Illinois Open Source >> > -// License. See LICENSE.TXT for details. >> > -// >> > -//===----------------------------------------------------------------------===// >> > - >> > -#ifndef MSP430_MCINSTLOWER_H >> > -#define MSP430_MCINSTLOWER_H >> > - >> > -#include "llvm/Support/Compiler.h" >> > - >> > -namespace llvm { >> > - class AsmPrinter; >> > - class MCAsmInfo; >> > - class MCContext; >> > - class MCInst; >> > - class MCOperand; >> > - class MCSymbol; >> > - class MachineInstr; >> > - class MachineModuleInfoMachO; >> > - class MachineOperand; >> > - class Mangler; >> > - >> > - /// MSP430MCInstLower - This class is used to lower an MachineInstr >> > - /// into an MCInst. 
>> > -class LLVM_LIBRARY_VISIBILITY MSP430MCInstLower { >> > - MCContext &Ctx; >> > - Mangler &Mang; >> > - >> > - AsmPrinter &Printer; >> > -public: >> > - MSP430MCInstLower(MCContext &ctx, Mangler &mang, AsmPrinter &printer) >> > - : Ctx(ctx), Mang(mang), Printer(printer) {} >> > - void Lower(const MachineInstr *MI, MCInst &OutMI) const; >> > - >> > - MCOperand LowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym) const; >> > - >> > - MCSymbol *GetGlobalAddressSymbol(const MachineOperand &MO) const; >> > - MCSymbol *GetExternalSymbolSymbol(const MachineOperand &MO) const; >> > - MCSymbol *GetJumpTableSymbol(const MachineOperand &MO) const; >> > - MCSymbol *GetConstantPoolIndexSymbol(const MachineOperand &MO) const; >> > - MCSymbol *GetBlockAddressSymbol(const MachineOperand &MO) const; >> > -}; >> > - >> > -} >> > - >> > -#endif >> > >> > Modified: llvm/trunk/lib/Target/MSP430/Makefile >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/Makefile?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/Makefile (original) >> > +++ llvm/trunk/lib/Target/MSP430/Makefile Fri Oct 1 20:06:42 2010 >> > @@ -18,7 +18,7 @@ >> > MSP430GenDAGISel.inc MSP430GenCallingConv.inc \ >> > MSP430GenSubtarget.inc >> > >> > -DIRS = InstPrinter TargetInfo >> > +DIRS = AsmPrinter TargetInfo >> > >> > include $(LEVEL)/Makefile.common >> > >> > >> > >> > _______________________________________________ >> > llvm-commits mailing list >> > llvm-commits at cs.uiuc.edu >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.cs.uiuc.edu/pipermail/llvm-commits/attachments/20101004/de38a366/attachment.html

From rafael.espindola at gmail.com Mon Oct 4 10:59:01 2010
From: rafael.espindola at gmail.com (Rafael Espindola)
Date: Mon, 04 Oct 2010 15:59:01 -0000
Subject: [llvm-commits] [llvm] r115509 - in /llvm/trunk: lib/MC/ELFObjectWriter.cpp test/MC/ELF/pic-diff.s
Message-ID: <20101004155901.6C2592A6C12E@llvm.org>

Author: rafael
Date: Mon Oct 4 10:59:01 2010
New Revision: 115509

URL: http://llvm.org/viewvc/llvm-project?rev=115509&view=rev
Log:
Include the section address in the computation of the relocation.

Modified:
    llvm/trunk/lib/MC/ELFObjectWriter.cpp
    llvm/trunk/test/MC/ELF/pic-diff.s

Modified: llvm/trunk/lib/MC/ELFObjectWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/ELFObjectWriter.cpp?rev=115509&r1=115508&r2=115509&view=diff
==============================================================================
--- llvm/trunk/lib/MC/ELFObjectWriter.cpp (original)
+++ llvm/trunk/lib/MC/ELFObjectWriter.cpp Mon Oct 4 10:59:01 2010
@@ -539,7 +539,14 @@
       const MCSymbol &SymbolB = RefB->getSymbol();
       MCSymbolData &SDB = Asm.getSymbolData(SymbolB);
       IsPCRel = true;
-      Value += Layout.getFragmentOffset(Fragment) + Fixup.getOffset() - Layout.getSymbolAddress(&SDB);
+      MCSectionData *Sec = Fragment->getParent();
+
+      // Offset of the symbol in the section
+      int64_t a = Layout.getSymbolAddress(&SDB) - Layout.getSectionAddress(Sec);
+
+      // Ofeset of the relocation in the section
+      int64_t b = Layout.getFragmentOffset(Fragment) + Fixup.getOffset();
+      Value += b - a;
     }
 
     // Check that this case has already been fully resolved before we get

Modified: llvm/trunk/test/MC/ELF/pic-diff.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ELF/pic-diff.s?rev=115509&r1=115508&r2=115509&view=diff
==============================================================================
--- llvm/trunk/test/MC/ELF/pic-diff.s (original)
+++ llvm/trunk/test/MC/ELF/pic-diff.s Mon Oct 4 10:59:01 2010
@@ -19,6 +19,9 @@
 // CHECK-NEXT: ),
 // CHECK-NEXT: ])
 
+.zero 4
+.data
+
 .zero 1
 .align 4
 foo:

From sabre at nondot.org Mon Oct 4 11:06:37 2010
From: sabre at nondot.org (Chris Lattner)
Date: Mon, 04 Oct 2010 16:06:37 -0000
Subject: [llvm-commits] [llvm] r115510 - /llvm/trunk/docs/ReleaseNotes.html
Message-ID: <20101004160637.EA97B2A6C12E@llvm.org>

Author: lattner
Date: Mon Oct 4 11:06:37 2010
New Revision: 115510

URL: http://llvm.org/viewvc/llvm-project?rev=115510&view=rev
Log:
another tweak

Modified:
    llvm/trunk/docs/ReleaseNotes.html

Modified: llvm/trunk/docs/ReleaseNotes.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115510&r1=115509&r2=115510&view=diff
==============================================================================
--- llvm/trunk/docs/ReleaseNotes.html (original)
+++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 11:06:37 2010
@@ -505,7 +505,7 @@

From sabre at nondot.org Mon Oct 4 11:14:54 2010
From: sabre at nondot.org (Chris Lattner)
Date: Mon, 4 Oct 2010 09:14:54 -0700
Subject: [llvm-commits] [llvm] r115495 - /llvm/trunk/docs/ReleaseNotes.html
In-Reply-To:
References: <20101004043925.632362A6C12E@llvm.org>
Message-ID: <87FEB9A7-90A3-4C77-A224-841117883015@nondot.org>

Ok, thanks!

On Oct 3, 2010, at 10:39 PM, Jakob Stoklund Olesen wrote:

>
> On Oct 3, 2010, at 9:39 PM, Chris Lattner wrote:
>
>> + • The new SubRegIndex tablegen class allows subregisters to be indexed
>> + symbolically instead of numerically. If your target uses subregisters you
>> + will need to adapt to use SubRegIndex when you upgrade to 2.8.
>
> Yup.
>
>>
>
> I don't think this is worth mentioning in the release notes since it doesn't really work yet.
>
> /jakob
>

From criswell at uiuc.edu Mon Oct 4 11:19:26 2010
From: criswell at uiuc.edu (John Criswell)
Date: Mon, 04 Oct 2010 16:19:26 -0000
Subject: [llvm-commits] [poolalloc] r115511 - /poolalloc/trunk/lib/PoolAllocate/TransformFunctionBody.cpp
Message-ID: <20101004161926.AE6172A6C12E@llvm.org>

Author: criswell
Date: Mon Oct 4 11:19:26 2010
New Revision: 115511

URL: http://llvm.org/viewvc/llvm-project?rev=115511&view=rev
Log:
Fixed assertion message. No functionality changes.

Modified:
    poolalloc/trunk/lib/PoolAllocate/TransformFunctionBody.cpp

Modified: poolalloc/trunk/lib/PoolAllocate/TransformFunctionBody.cpp
URL: http://llvm.org/viewvc/llvm-project/poolalloc/trunk/lib/PoolAllocate/TransformFunctionBody.cpp?rev=115511&r1=115510&r2=115511&view=diff
==============================================================================
--- poolalloc/trunk/lib/PoolAllocate/TransformFunctionBody.cpp (original)
+++ poolalloc/trunk/lib/PoolAllocate/TransformFunctionBody.cpp Mon Oct 4 11:19:26 2010
@@ -684,7 +684,7 @@
 void FuncTransform::visitRuntimeCheck (CallSite CS) {
   // A run-time check should have at least one argument for a pool
-  assert ((CS.arg_size() > 1) && "strdup takes one argument!");
+  assert ((CS.arg_size() > 1) && "Runtime check takes more than one argument!");
 
   //
   // Get the pool handle for the pointer argument.

From sabre at nondot.org Mon Oct 4 11:46:07 2010
From: sabre at nondot.org (Chris Lattner)
Date: Mon, 04 Oct 2010 16:46:07 -0000
Subject: [llvm-commits] [llvm] r115515 - /llvm/trunk/docs/ReleaseNotes.html
Message-ID: <20101004164607.EE40A2A6C12E@llvm.org>

Author: lattner
Date: Mon Oct 4 11:46:07 2010
New Revision: 115515

URL: http://llvm.org/viewvc/llvm-project?rev=115515&view=rev
Log:
scheduler update

Modified:
    llvm/trunk/docs/ReleaseNotes.html

Modified: llvm/trunk/docs/ReleaseNotes.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115515&r1=115514&r2=115515&view=diff
==============================================================================
--- llvm/trunk/docs/ReleaseNotes.html (original)
+++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 11:46:07 2010
@@ -734,12 +734,11 @@
 is available from a previous instruction.
    • Atomic operations now get legalized into simpler atomic operations if not natively supported, easing the implementation burden on targets.
    • -
    • The bottom-up pre-allocation scheduler is now register pressure aware, - allowing it to avoid overscheduling in high pressure situations while still - aggressively scheduling when registers are available.
    • -
    • A new instruction-level-parallelism pre-allocation scheduler is available, - which is also register pressure aware. This scheduler has shown substantial - wins on X86-64 and is on by default.
    • +
    • We have added two new bottom-up pre-allocation register pressure aware schedulers: +
        +
      1. The hybrid scheduler schedules aggressively to minimize schedule length when registers are available and avoid overscheduling in high pressure situations.
      2. +
      3. The instruction-level-parallelism scheduler schedules for maximum ILP when registers are available and avoid overscheduling in high pressure situations.
      4. +
    • The tblgen type inference algorithm was rewritten to be more consistent and diagnose more target bugs. If you have an out-of-tree backend, you may find that it finds bugs in your target description. This support also

From dpatel at apple.com Mon Oct 4 11:51:59 2010
From: dpatel at apple.com (Devang Patel)
Date: Mon, 04 Oct 2010 16:51:59 -0000
Subject: [llvm-commits] [llvm] r115516 - /llvm/trunk/docs/SourceLevelDebugging.html
Message-ID: <20101004165159.9CA892A6C12E@llvm.org>

Author: dpatel
Date: Mon Oct 4 11:51:59 2010
New Revision: 115516

URL: http://llvm.org/viewvc/llvm-project?rev=115516&view=rev
Log:
Fix lexical block's tag number.

Modified:
    llvm/trunk/docs/SourceLevelDebugging.html

Modified: llvm/trunk/docs/SourceLevelDebugging.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/SourceLevelDebugging.html?rev=115516&r1=115515&r2=115516&view=diff
==============================================================================
--- llvm/trunk/docs/SourceLevelDebugging.html (original)
+++ llvm/trunk/docs/SourceLevelDebugging.html Mon Oct 4 11:51:59 2010
@@ -460,7 +460,7 @@
       !3 = metadata !{
      -  i32,     ;; Tag = 13 + LLVMDebugVersion (DW_TAG_lexical_block)
      +  i32,     ;; Tag = 11 + LLVMDebugVersion (DW_TAG_lexical_block)
         metadata,;; Reference to context descriptor
         i32,     ;; Line number
         i32      ;; Column number
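
[Editorial sketch, not part of the commit.] The corrected comment can be checked with a little arithmetic: DWARF assigns DW_TAG_lexical_block the value 0x0b (11), whereas 13 (0x0d) is DW_TAG_member, which is why the old text was wrong. The snippet below assumes the debug-version constant is 8 << 16, which is believed to be the value used around LLVM 2.8. That constant and all helper names here are illustrative assumptions, not LLVM APIs.

```cpp
#include <cstdint>

// DW_TAG_lexical_block as assigned by the DWARF specification.
constexpr std::uint32_t kDwTagLexicalBlock = 0x0b;

// ASSUMPTION: debug-version constant of the LLVM 2.8 era (8 << 16).
constexpr std::uint32_t kAssumedDebugVersion = 8u << 16;

// Hypothetical helper: build the first i32 field of a descriptor,
// i.e. "Tag = <DWARF tag> + LLVMDebugVersion".
constexpr std::uint32_t encodeTag(std::uint32_t dwarfTag, std::uint32_t version) {
  return dwarfTag + version;
}

// Hypothetical helpers: split a descriptor tag field back apart.
// The DWARF tag lives in the low 16 bits, the version in the high 16 bits.
constexpr std::uint32_t dwarfTagOf(std::uint32_t field) { return field & 0xffffu; }
constexpr std::uint32_t versionOf(std::uint32_t field) { return field & 0xffff0000u; }

// Compile-time check that the lexical-block tag round-trips as 11.
static_assert(dwarfTagOf(encodeTag(kDwTagLexicalBlock, kAssumedDebugVersion)) == 11,
              "lexical block tag should round-trip as 11");
```

Under the assumed version constant, the encoded field for a lexical block would be 524299 (0x8000b); with a different version constant only the high bits change.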
      
      
      
      From ggreif at gmail.com  Mon Oct  4 12:03:49 2010
      From: ggreif at gmail.com (Gabor Greif)
      Date: Mon, 04 Oct 2010 17:03:49 -0000
      Subject: [llvm-commits] [llvm] r115518 - /llvm/trunk/docs/ReleaseNotes.html
      Message-ID: <20101004170349.DEE3D2A6C12E@llvm.org>
      
      Author: ggreif
      Date: Mon Oct  4 12:03:49 2010
      New Revision: 115518
      
      URL: http://llvm.org/viewvc/llvm-project?rev=115518&view=rev
      Log:
      minor tweaks and typos
      
      Modified:
          llvm/trunk/docs/ReleaseNotes.html
      
      Modified: llvm/trunk/docs/ReleaseNotes.html
      URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115518&r1=115517&r2=115518&view=diff
      ==============================================================================
      --- llvm/trunk/docs/ReleaseNotes.html (original)
      +++ llvm/trunk/docs/ReleaseNotes.html Mon Oct  4 12:03:49 2010
      @@ -321,8 +321,8 @@
       language and compiler written on top of LLVM, intended for producing
       single-address-space managed code operating systems that
       run faster than the equivalent multiple-address-space C systems.
      -More in-depth blurb is available on the wiki.

      +More in-depth blurb is available on the wiki.

@@ -339,8 +339,8 @@
 href="http://vrt-sourcefire.blogspot.com/2010/09/introduction-to-clamavs-low-level.html">bytecode signatures that allow writing detections for complex malware.
 It uses LLVM's JIT to speed up the execution of bytecode on
-X86,X86-64,PPC32/64, falling back to its own interpreter otherwise.
-The git version was updated to work with LLVM 2.8
+X86, X86-64, PPC32/64, falling back to its own interpreter otherwise.
+The git version was updated to work with LLVM 2.8.

 The DTMC provides support for Transactional Memory, which is an easy-to-use and efficient way to synchronize accesses to shared memory. Transactions can contain normal C/C++ code (e.g.,
-__transaction { list.remove(x); x.refCount--; }) and will be executed
+__transaction { list.remove(x); x.refCount--; }) and will be executed
 virtually atomically and isolated from other transactions.

@@ -774,7 +774,7 @@
 • The X86 backend now supports holding X87 floating point stack values in registers across basic blocks, dramatically improving performance of code
-  that uses long double, and when targetting CPUs that don't support SSE.
+  that uses long double, and when targeting CPUs that don't support SSE.
 • The X86 backend now uses a SSEDomainFix pass to optimize SSE operations. On Nehalem ("Core i7") and newer CPUs there is a 2 cycle latency penalty on
@@ -799,7 +799,7 @@
 • When printing .s files in verbose assembly mode (the default for clang -S), the X86 backend now decodes X86 shuffle instructions and prints human
-  readable comments after the most inscrutible of them, e.g.:
+  readable comments after the most inscrutable of them, e.g.:
      insertps $113, %xmm3, %xmm0 # xmm0 = zero,xmm0[1,2],xmm3[1]
@@ -854,7 +854,7 @@
 • The llvm.arm.neon.vabdl and llvm.arm.neon.vabal intrinsics (lengthening
-  vector absolute difference with and without accumlation) have been removed.
+  vector absolute difference with and without accumulation) have been removed.
   They are represented using the llvm.arm.neon.vabd intrinsic (vector absolute difference) followed by a vector zero-extend operation, and for vabal, a vector add.
@@ -947,7 +947,7 @@
   operands are now address-space qualified. If you were creating these intrinsic calls and prototypes yourself (as opposed to using Intrinsic::getDeclaration), you can use
-  UpgradeIntrinsicFunction/UpgradeIntrinsicCall to be portable accross releases.
+  UpgradeIntrinsicFunction/UpgradeIntrinsicCall to be portable across releases.
 • SetCurrentDebugLocation takes a DebugLoc now instead of a MDNode.

From daniel at zuster.org Mon Oct 4 12:06:49 2010
From: daniel at zuster.org (Daniel Dunbar)
Date: Mon, 04 Oct 2010 17:06:49 -0000
Subject: [llvm-commits] [llvm] r115520 - /llvm/trunk/docs/ReleaseNotes.html
Message-ID: <20101004170649.883912A6C12E@llvm.org>

Author: ddunbar
Date: Mon Oct 4 12:06:49 2010
New Revision: 115520

URL: http://llvm.org/viewvc/llvm-project?rev=115520&view=rev
Log:
A few more random Clang release notes.

Modified:
    llvm/trunk/docs/ReleaseNotes.html

Modified: llvm/trunk/docs/ReleaseNotes.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115520&r1=115519&r2=115520&view=diff
==============================================================================
--- llvm/trunk/docs/ReleaseNotes.html (original)
+++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 12:06:49 2010
@@ -127,9 +127,18 @@
 • Introduced the "libclang" library, a C interface to Clang intended to support IDE clients.
 • Added support for #pragma GCC visibility, #pragma align, and others.
 • Added support for SSE, ARM NEON, and Altivec.
+• Improved support for many Microsoft extensions.
 • Implemented support for blocks in C++.
 • Implemented precompiled headers for C++.
 • Improved abstract syntax trees to retain more accurate source information.
+• Added driver support for handling LLVM IR and bitcode files directly.
+• Major improvements to compiler correctness for exception handling.
+• Improved generated code quality in some areas:
+
+  • Good code generation for X86-32 and X86-64 ABI handling.
+  • Improved code generation for bit-fields, although important work remains.
+
+
      From anton at korobeynikov.info Mon Oct 4 12:11:00 2010 From: anton at korobeynikov.info (Anton Korobeynikov) Date: Mon, 4 Oct 2010 21:11:00 +0400 Subject: [llvm-commits] [Review request] test: adding triplets and FileCheck-ize for cygming(and msvc) In-Reply-To: References: Message-ID: Hi Takumi, > Thank you for working on all these. The main problem I have is with > changing the triples. This is not a real fix for the tests. The tests > need to be changed to either support both Windows style x86[-64] code > and Linux/Darwin, or we need separate tests for them. Just changing > the triple masks the problem. Michael's right, it might be possible that a test failure indicates the real problem on mingw. It seems unwise just to "switch the problem off". -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University From gohman at apple.com Mon Oct 4 12:24:08 2010 From: gohman at apple.com (Dan Gohman) Date: Mon, 04 Oct 2010 17:24:08 -0000 Subject: [llvm-commits] [llvm] r115521 - /llvm/trunk/lib/Analysis/ScalarEvolution.cpp Message-ID: <20101004172408.740BF2A6C12E@llvm.org> Author: djg Date: Mon Oct 4 12:24:08 2010 New Revision: 115521 URL: http://llvm.org/viewvc/llvm-project?rev=115521&view=rev Log: Don't add the operand count to SCEV uniquing data; FoldingSetNodeID already knows its own length, so this is redundant. Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp Modified: llvm/trunk/lib/Analysis/ScalarEvolution.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/ScalarEvolution.cpp?rev=115521&r1=115520&r2=115521&view=diff ============================================================================== --- llvm/trunk/lib/Analysis/ScalarEvolution.cpp (original) +++ llvm/trunk/lib/Analysis/ScalarEvolution.cpp Mon Oct 4 12:24:08 2010 @@ -1711,7 +1711,6 @@ // already have one, otherwise create a new one. 
FoldingSetNodeID ID; ID.AddInteger(scAddExpr); - ID.AddInteger(Ops.size()); for (unsigned i = 0, e = Ops.size(); i != e; ++i) ID.AddPointer(Ops[i]); void *IP = 0; @@ -1917,7 +1916,6 @@ // already have one, otherwise create a new one. FoldingSetNodeID ID; ID.AddInteger(scMulExpr); - ID.AddInteger(Ops.size()); for (unsigned i = 0, e = Ops.size(); i != e; ++i) ID.AddPointer(Ops[i]); void *IP = 0; @@ -2131,7 +2129,6 @@ // already have one, otherwise create a new one. FoldingSetNodeID ID; ID.AddInteger(scAddRecExpr); - ID.AddInteger(Operands.size()); for (unsigned i = 0, e = Operands.size(); i != e; ++i) ID.AddPointer(Operands[i]); ID.AddPointer(L); @@ -2242,7 +2239,6 @@ // already have one, otherwise create a new one. FoldingSetNodeID ID; ID.AddInteger(scSMaxExpr); - ID.AddInteger(Ops.size()); for (unsigned i = 0, e = Ops.size(); i != e; ++i) ID.AddPointer(Ops[i]); void *IP = 0; @@ -2347,7 +2343,6 @@ // already have one, otherwise create a new one. FoldingSetNodeID ID; ID.AddInteger(scUMaxExpr); - ID.AddInteger(Ops.size()); for (unsigned i = 0, e = Ops.size(); i != e; ++i) ID.AddPointer(Ops[i]); void *IP = 0; From espindola at google.com Mon Oct 4 12:29:59 2010 From: espindola at google.com (Rafael Espindola) Date: Mon, 4 Oct 2010 13:29:59 -0400 Subject: [llvm-commits] [patch] AsmParser hook for UseCodeAlign In-Reply-To: References: Message-ID: On 1 October 2010 17:12, Jan Voung wrote: > Thanks for the review Rafael. Attached is a new patch with the fixed comment > plus tests for ELF and COFF. I think MachO already had a test. I think this patch is OK. Thanks! 
> - Jan Cheers, -- Rafael Ávila de Espíndola From jvoung at google.com Mon Oct 4 12:32:41 2010 From: jvoung at google.com (Jan Wen Voung) Date: Mon, 04 Oct 2010 17:32:41 -0000 Subject: [llvm-commits] [llvm] r115523 - in /llvm/trunk: include/llvm/MC/MCSection.h include/llvm/MC/MCSectionCOFF.h include/llvm/MC/MCSectionELF.h include/llvm/MC/MCSectionMachO.h lib/MC/MCParser/AsmParser.cpp lib/MC/MCSectionCOFF.cpp lib/MC/MCSectionELF.cpp lib/MC/MCSectionMachO.cpp lib/Target/PIC16/PIC16Section.cpp lib/Target/PIC16/PIC16Section.h test/MC/COFF/align-nops.s test/MC/COFF/dg.exp test/MC/ELF/align-nops.s Message-ID: <20101004173241.F34182A6C12E@llvm.org> Author: jvoung Date: Mon Oct 4 12:32:41 2010 New Revision: 115523 URL: http://llvm.org/viewvc/llvm-project?rev=115523&view=rev Log: Add hook in MCSection to decide when to use "optimized nops", for each section kind. Previously, optimized nops were only used for MachO. Also added tests for ELF and COFF. Added: llvm/trunk/test/MC/COFF/align-nops.s llvm/trunk/test/MC/ELF/align-nops.s Modified: llvm/trunk/include/llvm/MC/MCSection.h llvm/trunk/include/llvm/MC/MCSectionCOFF.h llvm/trunk/include/llvm/MC/MCSectionELF.h llvm/trunk/include/llvm/MC/MCSectionMachO.h llvm/trunk/lib/MC/MCParser/AsmParser.cpp llvm/trunk/lib/MC/MCSectionCOFF.cpp llvm/trunk/lib/MC/MCSectionELF.cpp llvm/trunk/lib/MC/MCSectionMachO.cpp llvm/trunk/lib/Target/PIC16/PIC16Section.cpp llvm/trunk/lib/Target/PIC16/PIC16Section.h llvm/trunk/test/MC/COFF/dg.exp Modified: llvm/trunk/include/llvm/MC/MCSection.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCSection.h?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCSection.h (original) +++ llvm/trunk/include/llvm/MC/MCSection.h Mon Oct 4 12:32:41 2010 @@ -61,6 +61,10 @@ return false; } + // UseCodeAlign - Return true if a .align directive should use + // "optimized nops" to fill instead of 0s. 
+ virtual bool UseCodeAlign() const = 0; + static bool classof(const MCSection *) { return true; } }; Modified: llvm/trunk/include/llvm/MC/MCSectionCOFF.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCSectionCOFF.h?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCSectionCOFF.h (original) +++ llvm/trunk/include/llvm/MC/MCSectionCOFF.h Mon Oct 4 12:32:41 2010 @@ -55,6 +55,7 @@ virtual void PrintSwitchToSection(const MCAsmInfo &MAI, raw_ostream &OS) const; + virtual bool UseCodeAlign() const; static bool classof(const MCSection *S) { return S->getVariant() == SV_COFF; Modified: llvm/trunk/include/llvm/MC/MCSectionELF.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCSectionELF.h?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCSectionELF.h (original) +++ llvm/trunk/include/llvm/MC/MCSectionELF.h Mon Oct 4 12:32:41 2010 @@ -178,7 +178,8 @@ void PrintSwitchToSection(const MCAsmInfo &MAI, raw_ostream &OS) const; - + virtual bool UseCodeAlign() const; + /// isBaseAddressKnownZero - We know that non-allocatable sections (like /// debug info) have a base of zero. 
virtual bool isBaseAddressKnownZero() const { Modified: llvm/trunk/include/llvm/MC/MCSectionMachO.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCSectionMachO.h?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/include/llvm/MC/MCSectionMachO.h (original) +++ llvm/trunk/include/llvm/MC/MCSectionMachO.h Mon Oct 4 12:32:41 2010 @@ -165,6 +165,7 @@ virtual void PrintSwitchToSection(const MCAsmInfo &MAI, raw_ostream &OS) const; + virtual bool UseCodeAlign() const; static bool classof(const MCSection *S) { return S->getVariant() == SV_MachO; Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCParser/AsmParser.cpp?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCParser/AsmParser.cpp (original) +++ llvm/trunk/lib/MC/MCParser/AsmParser.cpp Mon Oct 4 12:32:41 2010 @@ -1654,12 +1654,7 @@ // Check whether we should use optimal code alignment for this .align // directive. - // - // FIXME: This should be using a target hook. 
- bool UseCodeAlign = false; - if (const MCSectionMachO *S = dyn_cast<MCSectionMachO>( - getStreamer().getCurrentSection())) - UseCodeAlign = S->hasAttribute(MCSectionMachO::S_ATTR_PURE_INSTRUCTIONS); + bool UseCodeAlign = getStreamer().getCurrentSection()->UseCodeAlign(); if ((!HasFillExpr || Lexer.getMAI().getTextAlignFillValue() == FillExpr) && ValueSize == 1 && UseCodeAlign) { getStreamer().EmitCodeAlignment(Alignment, MaxBytesToFill); Modified: llvm/trunk/lib/MC/MCSectionCOFF.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCSectionCOFF.cpp?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCSectionCOFF.cpp (original) +++ llvm/trunk/lib/MC/MCSectionCOFF.cpp Mon Oct 4 12:32:41 2010 @@ -74,3 +74,7 @@ } } } + +bool MCSectionCOFF::UseCodeAlign() const { + return getKind().isText(); +} Modified: llvm/trunk/lib/MC/MCSectionELF.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCSectionELF.cpp?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCSectionELF.cpp (original) +++ llvm/trunk/lib/MC/MCSectionELF.cpp Mon Oct 4 12:32:41 2010 @@ -112,6 +112,10 @@ OS << '\n'; } +bool MCSectionELF::UseCodeAlign() const { + return getFlags() & MCSectionELF::SHF_EXECINSTR; +} + // HasCommonSymbols - True if this section holds common symbols, this is // indicated on the ELF object file by a symbol with SHN_COMMON section // header index. 
Modified: llvm/trunk/lib/MC/MCSectionMachO.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCSectionMachO.cpp?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCSectionMachO.cpp (original) +++ llvm/trunk/lib/MC/MCSectionMachO.cpp Mon Oct 4 12:32:41 2010 @@ -148,6 +148,10 @@ OS << '\n'; } +bool MCSectionMachO::UseCodeAlign() const { + return hasAttribute(MCSectionMachO::S_ATTR_PURE_INSTRUCTIONS); +} + /// StripSpaces - This removes leading and trailing spaces from the StringRef. static void StripSpaces(StringRef &Str) { while (!Str.empty() && isspace(Str[0])) @@ -283,4 +287,3 @@ return ""; } - Modified: llvm/trunk/lib/Target/PIC16/PIC16Section.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PIC16/PIC16Section.cpp?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/lib/Target/PIC16/PIC16Section.cpp (original) +++ llvm/trunk/lib/Target/PIC16/PIC16Section.cpp Mon Oct 4 12:32:41 2010 @@ -102,3 +102,7 @@ OS << '\n'; } + +bool PIC16Section::UseCodeAlign() const { + return isCODE_Type(); +} Modified: llvm/trunk/lib/Target/PIC16/PIC16Section.h URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PIC16/PIC16Section.h?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/lib/Target/PIC16/PIC16Section.h (original) +++ llvm/trunk/lib/Target/PIC16/PIC16Section.h Mon Oct 4 12:32:41 2010 @@ -88,6 +88,8 @@ virtual void PrintSwitchToSection(const MCAsmInfo &MAI, raw_ostream &OS) const; + virtual bool UseCodeAlign() const; + static bool classof(const MCSection *S) { return S->getVariant() == SV_PIC16; } Added: llvm/trunk/test/MC/COFF/align-nops.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/COFF/align-nops.s?rev=115523&view=auto 
============================================================================== --- llvm/trunk/test/MC/COFF/align-nops.s (added) +++ llvm/trunk/test/MC/COFF/align-nops.s Mon Oct 4 12:32:41 2010 @@ -0,0 +1,45 @@ +// RUN: llvm-mc -filetype=obj -triple i686-pc-win32 %s -o %t +// RUN: coff-dump.py %abs_tmp | FileCheck %s + +// Test that we get optimal nops in text + .text +f0: + .long 0 + .align 8, 0x90 + .long 0 + .align 8 + +// But not in another section + .data + .long 0 + .align 8, 0x90 + .long 0 + .align 8 + +//CHECK: Name = .text +//CHECK-NEXT: VirtualSize +//CHECK-NEXT: VirtualAddress +//CHECK-NEXT: SizeOfRawData = 16 +//CHECK-NEXT: PointerToRawData +//CHECK-NEXT: PointerToRelocations +//CHECK-NEXT: PointerToLineNumbers +//CHECK-NEXT: NumberOfRelocations +//CHECK-NEXT: NumberOfLineNumbers +//CHECK-NEXT: Charateristics = 0x400001 +//CHECK-NEXT: IMAGE_SCN_ALIGN_8BYTES +//CHECK-NEXT: SectionData = +//CHECK-NEXT: 00 00 00 00 0F 1F 40 00 - 00 00 00 00 0F 1F 40 00 + +//CHECK: Name = .data +//CHECK-NEXT: VirtualSize +//CHECK-NEXT: VirtualAddress +//CHECK-NEXT: SizeOfRawData = 16 +//CHECK-NEXT: PointerToRawData +//CHECK-NEXT: PointerToRelocations +//CHECK-NEXT: PointerToLineNumbers +//CHECK-NEXT: NumberOfRelocations +//CHECK-NEXT: NumberOfLineNumbers +//CHECK-NEXT: Charateristics = 0x400001 +//CHECK-NEXT: IMAGE_SCN_ALIGN_8BYTES +//CHECK-NEXT: SectionData = +//CHECK-NEXT: 00 00 00 00 90 90 90 90 - 00 00 00 00 00 00 00 00 Modified: llvm/trunk/test/MC/COFF/dg.exp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/COFF/dg.exp?rev=115523&r1=115522&r2=115523&view=diff ============================================================================== --- llvm/trunk/test/MC/COFF/dg.exp (original) +++ llvm/trunk/test/MC/COFF/dg.exp Mon Oct 4 12:32:41 2010 @@ -1,5 +1,5 @@ load_lib llvm.exp if { [llvm_supports_target X86] } { - RunLLVMTests [lsort [glob -nocomplain $srcdir/$subdir/*.{ll}]] + RunLLVMTests [lsort [glob -nocomplain $srcdir/$subdir/*.{ll,s}]] } Added: 
llvm/trunk/test/MC/ELF/align-nops.s URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ELF/align-nops.s?rev=115523&view=auto ============================================================================== --- llvm/trunk/test/MC/ELF/align-nops.s (added) +++ llvm/trunk/test/MC/ELF/align-nops.s Mon Oct 4 12:32:41 2010 @@ -0,0 +1,40 @@ +// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s -o - | elf-dump --dump-section-data | FileCheck %s + +// Test that we get optimal nops in text + .text +f0: + .long 0 + .align 8, 0x90 + .long 0 + .align 8 + +// But not in another section + .data + .long 0 + .align 8, 0x90 + .long 0 + .align 8 + +// CHECK: (('sh_name', 1) # '.text' +// CHECK-NEXT: ('sh_type', 1) +// CHECK-NEXT: ('sh_flags', 6) +// CHECK-NEXT: ('sh_addr', +// CHECK-NEXT: ('sh_offset', +// CHECK-NEXT: ('sh_size', 16) +// CHECK-NEXT: ('sh_link', 0) +// CHECK-NEXT: ('sh_info', 0) +// CHECK-NEXT: ('sh_addralign', 8) +// CHECK-NEXT: ('sh_entsize', 0) +// CHECK-NEXT: ('_section_data', '00000000 0f1f4000 00000000 0f1f4000') + +// CHECK: (('sh_name', 7) # '.data' +// CHECK-NEXT: ('sh_type', 1) +// CHECK-NEXT: ('sh_flags', 3) +// CHECK-NEXT: ('sh_addr', +// CHECK-NEXT: ('sh_offset', +// CHECK-NEXT: ('sh_size', 16) +// CHECK-NEXT: ('sh_link', 0) +// CHECK-NEXT: ('sh_info', 0) +// CHECK-NEXT: ('sh_addralign', 8) +// CHECK-NEXT: ('sh_entsize', 0) +// CHECK-NEXT: ('_section_data', '00000000 90909090 00000000 00000000') From daniel at zuster.org Mon Oct 4 12:39:47 2010 From: daniel at zuster.org (Daniel Dunbar) Date: Mon, 04 Oct 2010 17:39:47 -0000 Subject: [llvm-commits] [llvm] r115524 - /llvm/trunk/docs/ReleaseNotes.html Message-ID: <20101004173947.ECF2C2A6C12E@llvm.org> Author: ddunbar Date: Mon Oct 4 12:39:47 2010 New Revision: 115524 URL: http://llvm.org/viewvc/llvm-project?rev=115524&view=rev Log: Add KLEE 2.8 release notes. 
Modified: llvm/trunk/docs/ReleaseNotes.html Modified: llvm/trunk/docs/ReleaseNotes.html URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115524&r1=115523&r2=115524&view=diff ============================================================================== --- llvm/trunk/docs/ReleaseNotes.html (original) +++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 12:39:47 2010 @@ -284,6 +284,43 @@ + + + + +
      +

      +KLEE is a symbolic execution framework for +programs in LLVM bitcode form. KLEE tries to symbolically evaluate "all" paths +through the application and records state transitions that lead to fault +states. This allows it to construct testcases that lead to faults and can even +be used to verify some algorithms. +

      + +

      Although KLEE does not have any major new features as of 2.8, we have made +various minor improvements, particularly to ease development:

      +
        +
      • Added support for LLVM 2.8. KLEE currently maintains compatibility with + LLVM 2.6, 2.7, and 2.8.
      • +
      • Added a buildbot for 2.6, 2.7, and trunk. A 2.8 buildbot will be coming + soon following release.
      • +
      • Fixed many C++ code issues to allow building with Clang++. Mostly + complete, except for the version of MiniSAT which is inside the KLEE STP + version.
      • +
      • Improved support for building with separate source and build + directories.
      • +
      • Added support for "long double" on x86.
      • +
      • Initial work on KLEE support for using 'lit' test runner instead of + DejaGNU.
      • +
      • Added configure support for using an external version of + STP.
      • +
      + +
      + +
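The path-exploration idea in the KLEE notes above can be illustrated with a minimal sketch. This is an analogy only, not how KLEE works internally: KLEE reasons about paths symbolically with a constraint solver, whereas the stand-in below simply enumerates every value of a one-byte input and records the ones that reach a fault. The names `classify` and `find_faulting_inputs` are hypothetical and are not part of KLEE or LLVM.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical function under test. It has one "fault" path, which is
// reached only when both branch conditions hold. Returns false on a fault.
bool classify(std::uint8_t x) {
  if (x > 200) {
    if ((x & 0x0F) == 0x07) {
      return false;  // fault state
    }
  }
  return true;
}

// Brute-force stand-in for symbolic exploration (an assumption made for
// illustration): try every possible input and collect those that fault.
// A symbolic executor would instead solve the branch conditions
// "x > 200 && (x & 0x0F) == 0x07" to produce such test cases directly.
std::vector<int> find_faulting_inputs() {
  std::vector<int> faults;
  for (int v = 0; v <= 255; ++v) {
    if (!classify(static_cast<std::uint8_t>(v)))
      faults.push_back(v);
  }
  return faults;
}
```

Each value returned by `find_faulting_inputs()` plays the role of a generated test case that drives the program into the fault state. Enumeration only works here because the input domain is tiny; that limit is what motivates a solver-based approach.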
      External Open Source Projects Using LLVM 2.8 From grosbach at apple.com Mon Oct 4 12:49:26 2010 From: grosbach at apple.com (Jim Grosbach) Date: Mon, 4 Oct 2010 10:49:26 -0700 Subject: [llvm-commits] [llvm] r115393 - in /llvm/trunk: CMakeLists.txt lib/Target/MSP430/InstPrinter/CMakeLists.txt lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h lib/Target/MSP430/InstPrinter/Makefi In-Reply-To: References: <07E65C41-D605-4147-B976-1BC502C5BF68@apple.com> Message-ID: <78259CE4-A19D-4C6C-9F87-5C84F023EAD9@apple.com> OK, I think I see the culprit. MSP430InstPrinter.cpp includes MSP430InstrInfo.h, which in turn includes MSP430RegisterInfo.h, which references those classes. Nothing actually uses those references, however, and it appears that for OSX either the references aren't emitted to the .s file at all or the assembler/linker are smart enough to ignore them since they're not used. Either way, the InstPrinter shouldn't be including MSP430InstrInfo.h. I'll fix that and re-apply the other patches once I verify they build OK on Linux. -Jim On Oct 3, 2010, at 11:39 PM, Nick Lewycky wrote: > Here's what the GenLibDeps.pl script thinks is going on. > > libLLVMMSP430CodeGen.a uses but does not define symbols: > _ZN4llvm17MSP430InstPrinter15getRegisterNameEj aka. llvm::MSP430InstPrinter::getRegisterName(unsigned int) > which are provided by libLLVMMSP430AsmPrinter.a. Going in the other direction, libLLVMMSP430AsmPrinter.a uses symbols: > _ZN4llvm6MSP43011GR8RegClassE aka. llvm::MSP430::GR8RegClass > _ZN4llvm6MSP43012GR16RegClassE aka. llvm::MSP430::GR16RegClass > > GR8RegClass and GR16RegClass is defined in MSP430RegisterInfo.o (which is rolled into ...CodeGen.a). Its only reference in ...AsmPrinter.a is by MSP430InstPrinter.o. > > The getRegisterName function is defined in MSP430InstPrinter.o (part of ...AsmPrinter.a) and its only reference in ...CodeGen.a is by MSP430AsmPrinter.o. > > Is that enough to go on? 
> > Nick > > On 1 October 2010 18:43, Nick Lewycky wrote: > Okay. You can see that almost all of the open-source builders were broken: > > http://google1.osuosl.org:8011/console > > in that time. It's impossible for this particular error to occur in a cmake build because cmake doesn't run find-cycles.pl (last i checked). My suspicion is that the cmake builders were working fine while configure+make ones were not? > > I'm going to wind back to the broken point and try to reproduce the failure and see if I can figure out what the cyclic dependency actually was. > > Nick > > > On 1 October 2010 18:27, Jim Grosbach wrote: > That's very strange. I do a configure/make here, and it works, and lots of bots using that were green as well. If there's a case I missed, I'd love to have some help tracking down what it is. Can you try a "make clean" and see if that works? Maybe there's just something stale that the configure portion of the patch needs to clean up. > > -Jim > > > > On Oct 1, 2010, at 6:24 PM, Nick Lewycky wrote: > >> Nope, it broke under a regular configure+make in-srctree incremental build on multiple different machines. >> >> On 1 October 2010 18:22, Jim Grosbach wrote: >> Nick, >> >> These only break for you under CMake, right? That's the only place I've been able to reproduce failures. >> >> -Jim >> >> >> On Oct 1, 2010, at 6:06 PM, Nick Lewycky wrote: >> >> > Author: nicholas >> > Date: Fri Oct 1 20:06:42 2010 >> > New Revision: 115393 >> > >> > URL: http://llvm.org/viewvc/llvm-project?rev=115393&view=rev >> > Log: >> > Revert patches r115363 r115367 r115391 due to build breakage: >> > llvm[2]: Updated LibDeps.txt because dependencies changed >> > llvm[2]: Checking for cyclic dependencies between LLVM libraries. 
>> > find-cycles.pl: Circular dependency between *.a files: >> > find-cycles.pl: libLLVMMSP430AsmPrinter.a libLLVMMSP430CodeGen.a >> > >> > >> > Modified: >> > llvm/trunk/CMakeLists.txt >> > llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt >> > llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp >> > llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h >> > llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile >> > llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp >> > llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp >> > llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h >> > llvm/trunk/lib/Target/MSP430/Makefile >> > >> > Modified: llvm/trunk/CMakeLists.txt >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/CMakeLists.txt?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/CMakeLists.txt (original) >> > +++ llvm/trunk/CMakeLists.txt Fri Oct 1 20:06:42 2010 >> > @@ -323,10 +323,6 @@ >> > add_subdirectory(lib/Target/${t}/AsmPrinter) >> > set(LLVM_ENUM_ASM_PRINTERS >> > "${LLVM_ENUM_ASM_PRINTERS}LLVM_ASM_PRINTER(${t})\n") >> > - if( EXISTS ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/InstPrinter/CMakeLists.txt ) >> > - add_subdirectory(lib/Target/${t}/InstPrinter) >> > - set(LLVM_ENUM_ASM_PRINTERS >> > - "${LLVM_ENUM_ASM_PRINTERS}LLVM_ASM_PRINTER(${t})\n") >> > endif( EXISTS ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/AsmPrinter/CMakeLists.txt ) >> > if( EXISTS ${LLVM_MAIN_SRC_DIR}/lib/Target/${t}/AsmParser/CMakeLists.txt ) >> > add_subdirectory(lib/Target/${t}/AsmParser) >> > >> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt (original) >> > +++ 
llvm/trunk/lib/Target/MSP430/InstPrinter/CMakeLists.txt Fri Oct 1 20:06:42 2010 >> > @@ -1,6 +0,0 @@ >> > -include_directories( ${CMAKE_CURRENT_BINARY_DIR}/.. ${CMAKE_CURRENT_SOURCE_DIR}/.. ) >> > - >> > -add_llvm_library(LLVMMSP430AsmPrinter >> > - MSP430InstPrinter.cpp >> > - ) >> > -add_dependencies(LLVMMSP430AsmPrinter MSP430CodeGenTable_gen) >> > >> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp (original) >> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.cpp Fri Oct 1 20:06:42 2010 >> > @@ -1,114 +0,0 @@ >> > -//===-- MSP430InstPrinter.cpp - Convert MSP430 MCInst to assembly syntax --===// >> > -// >> > -// The LLVM Compiler Infrastructure >> > -// >> > -// This file is distributed under the University of Illinois Open Source >> > -// License. See LICENSE.TXT for details. >> > -// >> > -//===----------------------------------------------------------------------===// >> > -// >> > -// This class prints an MSP430 MCInst to a .s file. >> > -// >> > -//===----------------------------------------------------------------------===// >> > - >> > -#define DEBUG_TYPE "asm-printer" >> > -#include "MSP430.h" >> > -#include "MSP430InstrInfo.h" >> > -#include "MSP430InstPrinter.h" >> > -#include "llvm/MC/MCInst.h" >> > -#include "llvm/MC/MCAsmInfo.h" >> > -#include "llvm/MC/MCExpr.h" >> > -#include "llvm/Support/ErrorHandling.h" >> > -#include "llvm/Support/FormattedStream.h" >> > -using namespace llvm; >> > - >> > - >> > -// Include the auto-generated portion of the assembly writer. 
>> > -#include "MSP430GenAsmWriter.inc" >> > - >> > -void MSP430InstPrinter::printInst(const MCInst *MI, raw_ostream &O) { >> > - printInstruction(MI, O); >> > -} >> > - >> > -void MSP430InstPrinter::printPCRelImmOperand(const MCInst *MI, unsigned OpNo, >> > - raw_ostream &O) { >> > - const MCOperand &Op = MI->getOperand(OpNo); >> > - if (Op.isImm()) >> > - O << Op.getImm(); >> > - else { >> > - assert(Op.isExpr() && "unknown pcrel immediate operand"); >> > - O << *Op.getExpr(); >> > - } >> > -} >> > - >> > -void MSP430InstPrinter::printOperand(const MCInst *MI, unsigned OpNo, >> > - raw_ostream &O, const char *Modifier) { >> > - assert((Modifier == 0 || Modifier[0] == 0) && "No modifiers supported"); >> > - const MCOperand &Op = MI->getOperand(OpNo); >> > - if (Op.isReg()) { >> > - O << getRegisterName(Op.getReg()); >> > - } else if (Op.isImm()) { >> > - O << '#' << Op.getImm(); >> > - } else { >> > - assert(Op.isExpr() && "unknown operand kind in printOperand"); >> > - O << '#' << *Op.getExpr(); >> > - } >> > -} >> > - >> > -void MSP430InstPrinter::printSrcMemOperand(const MCInst *MI, unsigned OpNo, >> > - raw_ostream &O, >> > - const char *Modifier) { >> > - const MCOperand &Base = MI->getOperand(OpNo); >> > - const MCOperand &Disp = MI->getOperand(OpNo+1); >> > - >> > - // Print displacement first >> > - >> > - // If the global address expression is a part of displacement field with a >> > - // register base, we should not emit any prefix symbol here, e.g. >> > - // mov.w &foo, r1 >> > - // vs >> > - // mov.w glb(r1), r2 >> > - // Otherwise (!) 
msp430-as will silently miscompile the output :( >> > - if (!Base.getReg()) >> > - O << '&'; >> > - >> > - if (Disp.isExpr()) >> > - O << *Disp.getExpr(); >> > - else { >> > - assert(Disp.isImm() && "Expected immediate in displacement field"); >> > - O << Disp.getImm(); >> > - } >> > - >> > - // Print register base field >> > - if (Base.getReg()) >> > - O << '(' << getRegisterName(Base.getReg()) << ')'; >> > -} >> > - >> > -void MSP430InstPrinter::printCCOperand(const MCInst *MI, unsigned OpNo, >> > - raw_ostream &O) { >> > - unsigned CC = MI->getOperand(OpNo).getImm(); >> > - >> > - switch (CC) { >> > - default: >> > - llvm_unreachable("Unsupported CC code"); >> > - break; >> > - case MSP430CC::COND_E: >> > - O << "eq"; >> > - break; >> > - case MSP430CC::COND_NE: >> > - O << "ne"; >> > - break; >> > - case MSP430CC::COND_HS: >> > - O << "hs"; >> > - break; >> > - case MSP430CC::COND_LO: >> > - O << "lo"; >> > - break; >> > - case MSP430CC::COND_GE: >> > - O << "ge"; >> > - break; >> > - case MSP430CC::COND_L: >> > - O << 'l'; >> > - break; >> > - } >> > -} >> > >> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h (original) >> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/MSP430InstPrinter.h Fri Oct 1 20:06:42 2010 >> > @@ -1,43 +0,0 @@ >> > -//===-- MSP430InstPrinter.h - Convert MSP430 MCInst to assembly syntax ----===// >> > -// >> > -// The LLVM Compiler Infrastructure >> > -// >> > -// This file is distributed under the University of Illinois Open Source >> > -// License. See LICENSE.TXT for details. 
>> > -// >> > -//===----------------------------------------------------------------------===// >> > -// >> > -// This class prints a MSP430 MCInst to a .s file. >> > -// >> > -//===----------------------------------------------------------------------===// >> > - >> > -#ifndef MSP430INSTPRINTER_H >> > -#define MSP430INSTPRINTER_H >> > - >> > -#include "llvm/MC/MCInstPrinter.h" >> > - >> > -namespace llvm { >> > - class MCOperand; >> > - >> > - class MSP430InstPrinter : public MCInstPrinter { >> > - public: >> > - MSP430InstPrinter(const MCAsmInfo &MAI) : MCInstPrinter(MAI) { >> > - } >> > - >> > - virtual void printInst(const MCInst *MI, raw_ostream &O); >> > - >> > - // Autogenerated by tblgen. >> > - void printInstruction(const MCInst *MI, raw_ostream &O); >> > - static const char *getRegisterName(unsigned RegNo); >> > - >> > - void printOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O, >> > - const char *Modifier = 0); >> > - void printPCRelImmOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O); >> > - void printSrcMemOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O, >> > - const char *Modifier = 0); >> > - void printCCOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O); >> > - >> > - }; >> > -} >> > - >> > -#endif >> > >> > Modified: llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile (original) >> > +++ llvm/trunk/lib/Target/MSP430/InstPrinter/Makefile Fri Oct 1 20:06:42 2010 >> > @@ -1,15 +0,0 @@ >> > -##===- lib/Target/MSP430/AsmPrinter/Makefile ---------------*- Makefile -*-===## >> > -# >> > -# The LLVM Compiler Infrastructure >> > -# >> > -# This file is distributed under the University of Illinois Open Source >> > -# License. See LICENSE.TXT for details. 
>> > -# >> > -##===----------------------------------------------------------------------===## >> > -LEVEL = ../../../.. >> > -LIBRARYNAME = LLVMMSP430AsmPrinter >> > - >> > -# Hack: we need to include 'main' MSP430 target directory to grab private headers >> > -CPP.Flags += -I$(PROJ_OBJ_DIR)/.. -I$(PROJ_SRC_DIR)/.. >> > - >> > -include $(LEVEL)/Makefile.common >> > >> > Modified: llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp (original) >> > +++ llvm/trunk/lib/Target/MSP430/MSP430AsmPrinter.cpp Fri Oct 1 20:06:42 2010 >> > @@ -1,179 +0,0 @@ >> > -//===-- MSP430AsmPrinter.cpp - MSP430 LLVM assembly writer ----------------===// >> > -// >> > -// The LLVM Compiler Infrastructure >> > -// >> > -// This file is distributed under the University of Illinois Open Source >> > -// License. See LICENSE.TXT for details. >> > -// >> > -//===----------------------------------------------------------------------===// >> > -// >> > -// This file contains a printer that converts from our internal representation >> > -// of machine-dependent LLVM code to the MSP430 assembly language. 
>> > -// >> > -//===----------------------------------------------------------------------===// >> > - >> > -#define DEBUG_TYPE "asm-printer" >> > -#include "MSP430.h" >> > -#include "MSP430InstrInfo.h" >> > -#include "InstPrinter/MSP430InstPrinter.h" >> > -#include "MSP430MCAsmInfo.h" >> > -#include "MSP430MCInstLower.h" >> > -#include "MSP430TargetMachine.h" >> > -#include "llvm/Constants.h" >> > -#include "llvm/DerivedTypes.h" >> > -#include "llvm/Module.h" >> > -#include "llvm/Assembly/Writer.h" >> > -#include "llvm/CodeGen/AsmPrinter.h" >> > -#include "llvm/CodeGen/MachineModuleInfo.h" >> > -#include "llvm/CodeGen/MachineFunctionPass.h" >> > -#include "llvm/CodeGen/MachineConstantPool.h" >> > -#include "llvm/CodeGen/MachineInstr.h" >> > -#include "llvm/MC/MCInst.h" >> > -#include "llvm/MC/MCStreamer.h" >> > -#include "llvm/MC/MCSymbol.h" >> > -#include "llvm/Target/Mangler.h" >> > -#include "llvm/Target/TargetData.h" >> > -#include "llvm/Target/TargetLoweringObjectFile.h" >> > -#include "llvm/Target/TargetRegistry.h" >> > -#include "llvm/Support/raw_ostream.h" >> > -using namespace llvm; >> > - >> > -namespace { >> > - class MSP430AsmPrinter : public AsmPrinter { >> > - public: >> > - MSP430AsmPrinter(TargetMachine &TM, MCStreamer &Streamer) >> > - : AsmPrinter(TM, Streamer) {} >> > - >> > - virtual const char *getPassName() const { >> > - return "MSP430 Assembly Printer"; >> > - } >> > - >> > - void printOperand(const MachineInstr *MI, int OpNum, >> > - raw_ostream &O, const char* Modifier = 0); >> > - void printSrcMemOperand(const MachineInstr *MI, int OpNum, >> > - raw_ostream &O); >> > - bool PrintAsmOperand(const MachineInstr *MI, unsigned OpNo, >> > - unsigned AsmVariant, const char *ExtraCode, >> > - raw_ostream &O); >> > - bool PrintAsmMemoryOperand(const MachineInstr *MI, >> > - unsigned OpNo, unsigned AsmVariant, >> > - const char *ExtraCode, raw_ostream &O); >> > - void EmitInstruction(const MachineInstr *MI); >> > - }; >> > -} // end of anonymous 
namespace >> > - >> > - >> > -void MSP430AsmPrinter::printOperand(const MachineInstr *MI, int OpNum, >> > - raw_ostream &O, const char *Modifier) { >> > - const MachineOperand &MO = MI->getOperand(OpNum); >> > - switch (MO.getType()) { >> > - default: assert(0 && "Not implemented yet!"); >> > - case MachineOperand::MO_Register: >> > - O << MSP430InstPrinter::getRegisterName(MO.getReg()); >> > - return; >> > - case MachineOperand::MO_Immediate: >> > - if (!Modifier || strcmp(Modifier, "nohash")) >> > - O << '#'; >> > - O << MO.getImm(); >> > - return; >> > - case MachineOperand::MO_MachineBasicBlock: >> > - O << *MO.getMBB()->getSymbol(); >> > - return; >> > - case MachineOperand::MO_GlobalAddress: { >> > - bool isMemOp = Modifier && !strcmp(Modifier, "mem"); >> > - uint64_t Offset = MO.getOffset(); >> > - >> > - // If the global address expression is a part of displacement field with a >> > - // register base, we should not emit any prefix symbol here, e.g. >> > - // mov.w &foo, r1 >> > - // vs >> > - // mov.w glb(r1), r2 >> > - // Otherwise (!) msp430-as will silently miscompile the output :( >> > - if (!Modifier || strcmp(Modifier, "nohash")) >> > - O << (isMemOp ? '&' : '#'); >> > - if (Offset) >> > - O << '(' << Offset << '+'; >> > - >> > - O << *Mang->getSymbol(MO.getGlobal()); >> > - >> > - if (Offset) >> > - O << ')'; >> > - >> > - return; >> > - } >> > - case MachineOperand::MO_ExternalSymbol: { >> > - bool isMemOp = Modifier && !strcmp(Modifier, "mem"); >> > - O << (isMemOp ? '&' : '#'); >> > - O << MAI->getGlobalPrefix() << MO.getSymbolName(); >> > - return; >> > - } >> > - } >> > -} >> > - >> > -void MSP430AsmPrinter::printSrcMemOperand(const MachineInstr *MI, int OpNum, >> > - raw_ostream &O) { >> > - const MachineOperand &Base = MI->getOperand(OpNum); >> > - const MachineOperand &Disp = MI->getOperand(OpNum+1); >> > - >> > - // Print displacement first >> > - >> > - // Imm here is in fact global address - print extra modifier. 
>> > - if (Disp.isImm() && !Base.getReg()) >> > - O << '&'; >> > - printOperand(MI, OpNum+1, O, "nohash"); >> > - >> > - // Print register base field >> > - if (Base.getReg()) { >> > - O << '('; >> > - printOperand(MI, OpNum, O); >> > - O << ')'; >> > - } >> > -} >> > - >> > -/// PrintAsmOperand - Print out an operand for an inline asm expression. >> > -/// >> > -bool MSP430AsmPrinter::PrintAsmOperand(const MachineInstr *MI, unsigned OpNo, >> > - unsigned AsmVariant, >> > - const char *ExtraCode, raw_ostream &O) { >> > - // Does this asm operand have a single letter operand modifier? >> > - if (ExtraCode && ExtraCode[0]) >> > - return true; // Unknown modifier. >> > - >> > - printOperand(MI, OpNo, O); >> > - return false; >> > -} >> > - >> > -bool MSP430AsmPrinter::PrintAsmMemoryOperand(const MachineInstr *MI, >> > - unsigned OpNo, unsigned AsmVariant, >> > - const char *ExtraCode, >> > - raw_ostream &O) { >> > - if (ExtraCode && ExtraCode[0]) { >> > - return true; // Unknown modifier. >> > - } >> > - printSrcMemOperand(MI, OpNo, O); >> > - return false; >> > -} >> > - >> > -//===----------------------------------------------------------------------===// >> > -void MSP430AsmPrinter::EmitInstruction(const MachineInstr *MI) { >> > - MSP430MCInstLower MCInstLowering(OutContext, *Mang, *this); >> > - >> > - MCInst TmpInst; >> > - MCInstLowering.Lower(MI, TmpInst); >> > - OutStreamer.EmitInstruction(TmpInst); >> > -} >> > - >> > -static MCInstPrinter *createMSP430MCInstPrinter(const Target &T, >> > - unsigned SyntaxVariant, >> > - const MCAsmInfo &MAI) { >> > - if (SyntaxVariant == 0) >> > - return new MSP430InstPrinter(MAI); >> > - return 0; >> > -} >> > - >> > -// Force static initialization. 
>> > -extern "C" void LLVMInitializeMSP430AsmPrinter() { >> > - RegisterAsmPrinter X(TheMSP430Target); >> > - TargetRegistry::RegisterMCInstPrinter(TheMSP430Target, >> > - createMSP430MCInstPrinter); >> > -} >> > >> > Modified: llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp (original) >> > +++ llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.cpp Fri Oct 1 20:06:42 2010 >> > @@ -1,150 +0,0 @@ >> > -//===-- MSP430MCInstLower.cpp - Convert MSP430 MachineInstr to an MCInst---===// >> > -// >> > -// The LLVM Compiler Infrastructure >> > -// >> > -// This file is distributed under the University of Illinois Open Source >> > -// License. See LICENSE.TXT for details. >> > -// >> > -//===----------------------------------------------------------------------===// >> > -// >> > -// This file contains code to lower MSP430 MachineInstrs to their corresponding >> > -// MCInst records. 
>> > -// >> > -//===----------------------------------------------------------------------===// >> > - >> > -#include "MSP430MCInstLower.h" >> > -#include "llvm/CodeGen/AsmPrinter.h" >> > -#include "llvm/CodeGen/MachineBasicBlock.h" >> > -#include "llvm/CodeGen/MachineInstr.h" >> > -#include "llvm/MC/MCAsmInfo.h" >> > -#include "llvm/MC/MCContext.h" >> > -#include "llvm/MC/MCExpr.h" >> > -#include "llvm/MC/MCInst.h" >> > -#include "llvm/Target/Mangler.h" >> > -#include "llvm/Support/raw_ostream.h" >> > -#include "llvm/Support/ErrorHandling.h" >> > -#include "llvm/ADT/SmallString.h" >> > -using namespace llvm; >> > - >> > -MCSymbol *MSP430MCInstLower:: >> > -GetGlobalAddressSymbol(const MachineOperand &MO) const { >> > - switch (MO.getTargetFlags()) { >> > - default: llvm_unreachable("Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - return Printer.Mang->getSymbol(MO.getGlobal()); >> > -} >> > - >> > -MCSymbol *MSP430MCInstLower:: >> > -GetExternalSymbolSymbol(const MachineOperand &MO) const { >> > - switch (MO.getTargetFlags()) { >> > - default: assert(0 && "Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - return Printer.GetExternalSymbolSymbol(MO.getSymbolName()); >> > -} >> > - >> > -MCSymbol *MSP430MCInstLower:: >> > -GetJumpTableSymbol(const MachineOperand &MO) const { >> > - SmallString<256> Name; >> > - raw_svector_ostream(Name) << Printer.MAI->getPrivateGlobalPrefix() << "JTI" >> > - << Printer.getFunctionNumber() << '_' >> > - << MO.getIndex(); >> > - >> > - switch (MO.getTargetFlags()) { >> > - default: llvm_unreachable("Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - // Create a symbol for the name. 
>> > - return Ctx.GetOrCreateSymbol(Name.str()); >> > -} >> > - >> > -MCSymbol *MSP430MCInstLower:: >> > -GetConstantPoolIndexSymbol(const MachineOperand &MO) const { >> > - SmallString<256> Name; >> > - raw_svector_ostream(Name) << Printer.MAI->getPrivateGlobalPrefix() << "CPI" >> > - << Printer.getFunctionNumber() << '_' >> > - << MO.getIndex(); >> > - >> > - switch (MO.getTargetFlags()) { >> > - default: llvm_unreachable("Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - // Create a symbol for the name. >> > - return Ctx.GetOrCreateSymbol(Name.str()); >> > -} >> > - >> > -MCSymbol *MSP430MCInstLower:: >> > -GetBlockAddressSymbol(const MachineOperand &MO) const { >> > - switch (MO.getTargetFlags()) { >> > - default: assert(0 && "Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - return Printer.GetBlockAddressSymbol(MO.getBlockAddress()); >> > -} >> > - >> > -MCOperand MSP430MCInstLower:: >> > -LowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym) const { >> > - // FIXME: We would like an efficient form for this, so we don't have to do a >> > - // lot of extra uniquing. 
>> > - const MCExpr *Expr = MCSymbolRefExpr::Create(Sym, Ctx); >> > - >> > - switch (MO.getTargetFlags()) { >> > - default: llvm_unreachable("Unknown target flag on GV operand"); >> > - case 0: break; >> > - } >> > - >> > - if (!MO.isJTI() && MO.getOffset()) >> > - Expr = MCBinaryExpr::CreateAdd(Expr, >> > - MCConstantExpr::Create(MO.getOffset(), Ctx), >> > - Ctx); >> > - return MCOperand::CreateExpr(Expr); >> > -} >> > - >> > -void MSP430MCInstLower::Lower(const MachineInstr *MI, MCInst &OutMI) const { >> > - OutMI.setOpcode(MI->getOpcode()); >> > - >> > - for (unsigned i = 0, e = MI->getNumOperands(); i != e; ++i) { >> > - const MachineOperand &MO = MI->getOperand(i); >> > - >> > - MCOperand MCOp; >> > - switch (MO.getType()) { >> > - default: >> > - MI->dump(); >> > - assert(0 && "unknown operand type"); >> > - case MachineOperand::MO_Register: >> > - // Ignore all implicit register operands. >> > - if (MO.isImplicit()) continue; >> > - MCOp = MCOperand::CreateReg(MO.getReg()); >> > - break; >> > - case MachineOperand::MO_Immediate: >> > - MCOp = MCOperand::CreateImm(MO.getImm()); >> > - break; >> > - case MachineOperand::MO_MachineBasicBlock: >> > - MCOp = MCOperand::CreateExpr(MCSymbolRefExpr::Create( >> > - MO.getMBB()->getSymbol(), Ctx)); >> > - break; >> > - case MachineOperand::MO_GlobalAddress: >> > - MCOp = LowerSymbolOperand(MO, GetGlobalAddressSymbol(MO)); >> > - break; >> > - case MachineOperand::MO_ExternalSymbol: >> > - MCOp = LowerSymbolOperand(MO, GetExternalSymbolSymbol(MO)); >> > - break; >> > - case MachineOperand::MO_JumpTableIndex: >> > - MCOp = LowerSymbolOperand(MO, GetJumpTableSymbol(MO)); >> > - break; >> > - case MachineOperand::MO_ConstantPoolIndex: >> > - MCOp = LowerSymbolOperand(MO, GetConstantPoolIndexSymbol(MO)); >> > - break; >> > - case MachineOperand::MO_BlockAddress: >> > - MCOp = LowerSymbolOperand(MO, GetBlockAddressSymbol(MO)); >> > - } >> > - >> > - OutMI.addOperand(MCOp); >> > - } >> > -} >> > >> > Modified: 
llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h (original) >> > +++ llvm/trunk/lib/Target/MSP430/MSP430MCInstLower.h Fri Oct 1 20:06:42 2010 >> > @@ -1,50 +0,0 @@ >> > -//===-- MSP430MCInstLower.h - Lower MachineInstr to MCInst ----------------===// >> > -// >> > -// The LLVM Compiler Infrastructure >> > -// >> > -// This file is distributed under the University of Illinois Open Source >> > -// License. See LICENSE.TXT for details. >> > -// >> > -//===----------------------------------------------------------------------===// >> > - >> > -#ifndef MSP430_MCINSTLOWER_H >> > -#define MSP430_MCINSTLOWER_H >> > - >> > -#include "llvm/Support/Compiler.h" >> > - >> > -namespace llvm { >> > - class AsmPrinter; >> > - class MCAsmInfo; >> > - class MCContext; >> > - class MCInst; >> > - class MCOperand; >> > - class MCSymbol; >> > - class MachineInstr; >> > - class MachineModuleInfoMachO; >> > - class MachineOperand; >> > - class Mangler; >> > - >> > - /// MSP430MCInstLower - This class is used to lower an MachineInstr >> > - /// into an MCInst. 
>> > -class LLVM_LIBRARY_VISIBILITY MSP430MCInstLower { >> > - MCContext &Ctx; >> > - Mangler &Mang; >> > - >> > - AsmPrinter &Printer; >> > -public: >> > - MSP430MCInstLower(MCContext &ctx, Mangler &mang, AsmPrinter &printer) >> > - : Ctx(ctx), Mang(mang), Printer(printer) {} >> > - void Lower(const MachineInstr *MI, MCInst &OutMI) const; >> > - >> > - MCOperand LowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym) const; >> > - >> > - MCSymbol *GetGlobalAddressSymbol(const MachineOperand &MO) const; >> > - MCSymbol *GetExternalSymbolSymbol(const MachineOperand &MO) const; >> > - MCSymbol *GetJumpTableSymbol(const MachineOperand &MO) const; >> > - MCSymbol *GetConstantPoolIndexSymbol(const MachineOperand &MO) const; >> > - MCSymbol *GetBlockAddressSymbol(const MachineOperand &MO) const; >> > -}; >> > - >> > -} >> > - >> > -#endif >> > >> > Modified: llvm/trunk/lib/Target/MSP430/Makefile >> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/Makefile?rev=115393&r1=115392&r2=115393&view=diff >> > ============================================================================== >> > --- llvm/trunk/lib/Target/MSP430/Makefile (original) >> > +++ llvm/trunk/lib/Target/MSP430/Makefile Fri Oct 1 20:06:42 2010 >> > @@ -18,7 +18,7 @@ >> > MSP430GenDAGISel.inc MSP430GenCallingConv.inc \ >> > MSP430GenSubtarget.inc >> > >> > -DIRS = InstPrinter TargetInfo >> > +DIRS = AsmPrinter TargetInfo >> > >> > include $(LEVEL)/Makefile.common >> > >> > >> > >> > _______________________________________________ >> > llvm-commits mailing list >> > llvm-commits at cs.uiuc.edu >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> >> >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits >> > > > From grosbach at apple.com Mon Oct 4 12:53:40 2010 From: grosbach at apple.com (Jim Grosbach) Date: Mon, 04 Oct 2010 17:53:40 -0000 Subject: [llvm-commits] 
[llvm] r115525 - /llvm/trunk/lib/Target/MSP430/AsmPrinter/MSP430InstPrinter.cpp
Message-ID: <20101004175340.63A8A2A6C12E@llvm.org>

Author: grosbach
Date: Mon Oct 4 12:53:40 2010
New Revision: 115525

URL: http://llvm.org/viewvc/llvm-project?rev=115525&view=rev
Log:
Remove unneeded header

Modified:
    llvm/trunk/lib/Target/MSP430/AsmPrinter/MSP430InstPrinter.cpp

Modified: llvm/trunk/lib/Target/MSP430/AsmPrinter/MSP430InstPrinter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/MSP430/AsmPrinter/MSP430InstPrinter.cpp?rev=115525&r1=115524&r2=115525&view=diff
==============================================================================
--- llvm/trunk/lib/Target/MSP430/AsmPrinter/MSP430InstPrinter.cpp (original)
+++ llvm/trunk/lib/Target/MSP430/AsmPrinter/MSP430InstPrinter.cpp Mon Oct 4 12:53:40 2010
@@ -13,7 +13,6 @@
 #define DEBUG_TYPE "asm-printer"
 #include "MSP430.h"
-#include "MSP430InstrInfo.h"
 #include "MSP430InstPrinter.h"
 #include "llvm/MC/MCInst.h"
 #include "llvm/MC/MCAsmInfo.h"

From ggreif at gmail.com Mon Oct 4 12:54:30 2010
From: ggreif at gmail.com (Gabor Greif)
Date: Mon, 04 Oct 2010 17:54:30 -0000
Subject: [llvm-commits] [llvm] r115526 - /llvm/trunk/docs/ReleaseNotes.html
Message-ID: <20101004175430.EAF092A6C12E@llvm.org>

Author: ggreif
Date: Mon Oct 4 12:54:30 2010
New Revision: 115526

URL: http://llvm.org/viewvc/llvm-project?rev=115526&view=rev
Log:
validator fixes; others remain, somebody who knows some html, please have a look

Modified:
    llvm/trunk/docs/ReleaseNotes.html

Modified: llvm/trunk/docs/ReleaseNotes.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115526&r1=115525&r2=115526&view=diff
==============================================================================
--- llvm/trunk/docs/ReleaseNotes.html (original)
+++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 12:54:30 2010
@@ -197,7 +197,6 @@
    • Fortran programs using common variables now link correctly.
    • GNU OMP constructs no longer crash the compiler.
    -

    @@ -379,7 +378,7 @@

    -Clam AntiVirus is an open source (GPL)
    +Clam AntiVirus is an open source (GPL)
     anti-virus toolkit for UNIX, designed especially for e-mail scanning on mail gateways. Since version 0.96 it has bytecode
    @@ -616,7 +615,6 @@

  • LLVM 2.8 now has pretty decent support for debugging optimized code. You should be able to reliably get debug info for function arguments, assuming that the value is actually available where you have stopped.
  • -
  • A new 'llvm-diff' tool is available that does a semantic diff of .ll files.
  • The MC subproject has made major progress in this release.

From rafael.espindola at gmail.com Mon Oct 4 13:44:25 2010
From: rafael.espindola at gmail.com (Rafael Espindola)
Date: Mon, 04 Oct 2010 18:44:25 -0000
Subject: [llvm-commits] [llvm] r115537 - in /llvm/trunk: lib/MC/ELFObjectWriter.cpp test/MC/ELF/got.s
Message-ID: <20101004184425.2FF772A6C12C@llvm.org>

Author: rafael
Date: Mon Oct 4 13:44:25 2010
New Revision: 115537

URL: http://llvm.org/viewvc/llvm-project?rev=115537&view=rev
Log:
Produce a R_X86_64_GOT32 when needed.

Added:
    llvm/trunk/test/MC/ELF/got.s
Modified:
    llvm/trunk/lib/MC/ELFObjectWriter.cpp

Modified: llvm/trunk/lib/MC/ELFObjectWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/ELFObjectWriter.cpp?rev=115537&r1=115536&r2=115537&view=diff
==============================================================================
--- llvm/trunk/lib/MC/ELFObjectWriter.cpp (original)
+++ llvm/trunk/lib/MC/ELFObjectWriter.cpp Mon Oct 4 13:44:25 2010
@@ -576,6 +576,8 @@
   FixedValue = Value;
   // determine the type of the relocation
+
+  MCSymbolRefExpr::VariantKind Modifier = Target.getSymA()->getKind();
   unsigned Type;
   if (Is64Bit) {
     if (IsPCRel) {
@@ -587,7 +589,16 @@
       case X86::reloc_signed_4byte:
       case X86::reloc_pcrel_4byte:
         assert(isInt<32>(Target.getConstant()));
-        Type = ELF::R_X86_64_32S;
+        switch (Modifier) {
+        case MCSymbolRefExpr::VK_None:
+          Type = ELF::R_X86_64_32S;
+          break;
+        case MCSymbolRefExpr::VK_GOT:
+          Type = ELF::R_X86_64_GOT32;
+          break;
+        default:
+          llvm_unreachable("Unimplemented");
+        }
         break;
       case FK_Data_4:
         Type = ELF::R_X86_64_32;

Added: llvm/trunk/test/MC/ELF/got.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ELF/got.s?rev=115537&view=auto
==============================================================================
--- llvm/trunk/test/MC/ELF/got.s (added)
+++ llvm/trunk/test/MC/ELF/got.s Mon Oct 4 13:44:25 2010
@@ -0,0 +1,14 @@
+// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s -o - | elf-dump |
FileCheck %s
+
+// Test that this produces a R_X86_64_GOT32.
+
+        movl foo@GOT, %eax
+
+// CHECK: ('_relocations', [
+// CHECK-NEXT: # Relocation 0
+// CHECK-NEXT: (('r_offset',
+// CHECK-NEXT: ('r_sym',
+// CHECK-NEXT: ('r_type', 3)
+// CHECK-NEXT: ('r_addend',
+// CHECK-NEXT: ),
+// CHECK-NEXT: ])

From criswell at uiuc.edu Mon Oct 4 14:01:12 2010
From: criswell at uiuc.edu (John Criswell)
Date: Mon, 04 Oct 2010 19:01:12 -0000
Subject: [llvm-commits] [poolalloc] r115539 - /poolalloc/trunk/lib/PoolAllocate/Heuristic.cpp
Message-ID: <20101004190112.C4D1F2A6C12C@llvm.org>

Author: criswell
Date: Mon Oct 4 14:01:12 2010
New Revision: 115539

URL: http://llvm.org/viewvc/llvm-project?rev=115539&view=rev
Log:
Modified heuristic code to locate local DSNodes in function DSGraphs which are
mirrored in the Globals Graph. This should help ensure that values DSNodes are
consistently assigned the same pool regardless of which DSNode is examined.

Modified:
    poolalloc/trunk/lib/PoolAllocate/Heuristic.cpp

Modified: poolalloc/trunk/lib/PoolAllocate/Heuristic.cpp
URL: http://llvm.org/viewvc/llvm-project/poolalloc/trunk/lib/PoolAllocate/Heuristic.cpp?rev=115539&r1=115538&r2=115539&view=diff
==============================================================================
--- poolalloc/trunk/lib/PoolAllocate/Heuristic.cpp (original)
+++ poolalloc/trunk/lib/PoolAllocate/Heuristic.cpp Mon Oct 4 14:01:12 2010
@@ -301,6 +301,17 @@
   GetNodesReachableFromGlobals (GG, GlobalHeapNodes);
   //
+  // Now find all DSNodes belonging to function-local DSGraphs which are
+  // mirrored in the globals graph. These DSNodes require a global pool, too.
+  //
+  for (Module::iterator F = M->begin(); F != M->end(); ++F) {
+    if (Graphs->hasDSGraph(*F)) {
+      DSGraph* G = Graphs->getDSGraph(*F);
+      GetNodesReachableFromGlobals (G, GlobalHeapNodes);
+    }
+  }
+
+  //
   // We do not want to create pools for all memory objects reachable from
   // globals. We only want those that are or could be heap objects.
   //

From rafael.espindola at gmail.com Mon Oct 4 14:04:13 2010
From: rafael.espindola at gmail.com (Rafael Espindola)
Date: Mon, 04 Oct 2010 19:04:13 -0000
Subject: [llvm-commits] [llvm] r115541 - in /llvm/trunk: lib/MC/ELFObjectWriter.cpp test/MC/ELF/plt.s
Message-ID: <20101004190413.7E5C22A6C12C@llvm.org>

Author: rafael
Date: Mon Oct 4 14:04:13 2010
New Revision: 115541

URL: http://llvm.org/viewvc/llvm-project?rev=115541&view=rev
Log:
Produce a R_X86_64_PLT32 when needed.

Added:
    llvm/trunk/test/MC/ELF/plt.s
Modified:
    llvm/trunk/lib/MC/ELFObjectWriter.cpp

Modified: llvm/trunk/lib/MC/ELFObjectWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/ELFObjectWriter.cpp?rev=115541&r1=115540&r2=115541&view=diff
==============================================================================
--- llvm/trunk/lib/MC/ELFObjectWriter.cpp (original)
+++ llvm/trunk/lib/MC/ELFObjectWriter.cpp Mon Oct 4 14:04:13 2010
@@ -581,7 +581,16 @@
   unsigned Type;
   if (Is64Bit) {
     if (IsPCRel) {
-      Type = ELF::R_X86_64_PC32;
+      switch (Modifier) {
+      case MCSymbolRefExpr::VK_None:
+        Type = ELF::R_X86_64_PC32;
+        break;
+      case MCSymbolRefExpr::VK_PLT:
+        Type = ELF::R_X86_64_PLT32;
+        break;
+      default:
+        llvm_unreachable("Unimplemented");
+      }
     } else {
       switch ((unsigned)Fixup.getKind()) {
       default: llvm_unreachable("invalid fixup kind!");

Added: llvm/trunk/test/MC/ELF/plt.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ELF/plt.s?rev=115541&view=auto
==============================================================================
--- llvm/trunk/test/MC/ELF/plt.s (added)
+++ llvm/trunk/test/MC/ELF/plt.s Mon Oct 4 14:04:13 2010
@@ -0,0 +1,14 @@
+// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s -o - | elf-dump | FileCheck %s
+
+// Test that this produces a R_X86_64_PLT32.
+
+        jmp foo@PLT
+
+// CHECK: ('_relocations', [
+// CHECK-NEXT: # Relocation 0
+// CHECK-NEXT: (('r_offset',
+// CHECK-NEXT: ('r_sym',
+// CHECK-NEXT: ('r_type', 4)
+// CHECK-NEXT: ('r_addend',
+// CHECK-NEXT: ),
+// CHECK-NEXT: ])

From baldrick at free.fr Mon Oct 4 14:44:15 2010
From: baldrick at free.fr (Duncan Sands)
Date: Mon, 04 Oct 2010 19:44:15 -0000
Subject: [llvm-commits] [zorg] r115544 - /zorg/trunk/buildbot/osuosl/master/config/builders.py
Message-ID: <20101004194415.8C6E72A6C12C@llvm.org>

Author: baldrick
Date: Mon Oct 4 14:44:15 2010
New Revision: 115544

URL: http://llvm.org/viewvc/llvm-project?rev=115544&view=rev
Log:
Turn off Fortran for the moment - need to install some additional
libraries on the build slave.

Modified:
    zorg/trunk/buildbot/osuosl/master/config/builders.py

Modified: zorg/trunk/buildbot/osuosl/master/config/builders.py
URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/buildbot/osuosl/master/config/builders.py?rev=115544&r1=115543&r2=115544&view=diff
==============================================================================
--- zorg/trunk/buildbot/osuosl/master/config/builders.py (original)
+++ zorg/trunk/buildbot/osuosl/master/config/builders.py Mon Oct 4 14:44:15 2010
@@ -94,7 +94,6 @@
      'slavenames':["gcc11"],
      'builddir':"llvm-gcc-i386-linux-selfhost",
      'factory':LLVMGCCBuilder.getLLVMGCCBuildFactory(triple='i686-pc-linux-gnu',
-                 extra_languages="fortran",
                  extra_configure_args=['--disable-multilib', '--enable-targets=all','--with-as=/home/baldrick/bin32/as'])},
     ]

From rafael.espindola at gmail.com Mon Oct 4 14:46:28 2010
From: rafael.espindola at gmail.com (Rafael Espindola)
Date: Mon, 04 Oct 2010 19:46:28 -0000
Subject: [llvm-commits] [llvm] r115545 - /llvm/trunk/lib/MC/ELFObjectWriter.cpp
Message-ID: <20101004194628.51E012A6C12C@llvm.org>

Author: rafael
Date: Mon Oct 4 14:46:28 2010
New Revision: 115545

URL: http://llvm.org/viewvc/llvm-project?rev=115545&view=rev
Log:
Move isFixupKindX86PCRel.

Modified:
    llvm/trunk/lib/MC/ELFObjectWriter.cpp

Modified: llvm/trunk/lib/MC/ELFObjectWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/ELFObjectWriter.cpp?rev=115545&r1=115544&r2=115545&view=diff
==============================================================================
--- llvm/trunk/lib/MC/ELFObjectWriter.cpp (original)
+++ llvm/trunk/lib/MC/ELFObjectWriter.cpp Mon Oct 4 14:46:28 2010
@@ -57,21 +57,21 @@
   SD.setFlags(OtherFlags | (Binding << ELF_STB_Shift));
 }
+static bool isFixupKindX86PCRel(unsigned Kind) {
+  switch (Kind) {
+  default:
+    return false;
+  case X86::reloc_pcrel_1byte:
+  case X86::reloc_pcrel_4byte:
+  case X86::reloc_riprel_4byte:
+  case X86::reloc_riprel_4byte_movq_load:
+    return true;
+  }
+}
+
 namespace {
   class ELFObjectWriterImpl {
-    static bool isFixupKindX86PCRel(unsigned Kind) {
-      switch (Kind) {
-      default:
-        return false;
-      case X86::reloc_pcrel_1byte:
-      case X86::reloc_pcrel_4byte:
-      case X86::reloc_riprel_4byte:
-      case X86::reloc_riprel_4byte_movq_load:
-        return true;
-      }
-    }
-
     /*static bool isFixupKindX86RIPRel(unsigned Kind) {
       return Kind == X86::reloc_riprel_4byte ||
         Kind == X86::reloc_riprel_4byte_movq_load;

From rafael.espindola at gmail.com Mon Oct 4 14:51:40 2010
From: rafael.espindola at gmail.com (Rafael Espindola)
Date: Mon, 04 Oct 2010 19:51:40 -0000
Subject: [llvm-commits] [llvm] r115547 - in /llvm/trunk: lib/MC/ELFObjectWriter.cpp test/MC/ELF/got.s
Message-ID: <20101004195140.0B5CA2A6C12C@llvm.org>

Author: rafael
Date: Mon Oct 4 14:51:39 2010
New Revision: 115547

URL: http://llvm.org/viewvc/llvm-project?rev=115547&view=rev
Log:
Implement ELF::R_X86_64_GOTPCREL.

Modified:
    llvm/trunk/lib/MC/ELFObjectWriter.cpp
    llvm/trunk/test/MC/ELF/got.s

Modified: llvm/trunk/lib/MC/ELFObjectWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/ELFObjectWriter.cpp?rev=115547&r1=115546&r2=115547&view=diff
==============================================================================
--- llvm/trunk/lib/MC/ELFObjectWriter.cpp (original)
+++ llvm/trunk/lib/MC/ELFObjectWriter.cpp Mon Oct 4 14:51:39 2010
@@ -588,6 +588,9 @@
       case MCSymbolRefExpr::VK_PLT:
         Type = ELF::R_X86_64_PLT32;
         break;
+      case llvm::MCSymbolRefExpr::VK_GOTPCREL:
+        Type = ELF::R_X86_64_GOTPCREL;
+        break;
       default:
         llvm_unreachable("Unimplemented");
       }

Modified: llvm/trunk/test/MC/ELF/got.s
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/MC/ELF/got.s?rev=115547&r1=115546&r2=115547&view=diff
==============================================================================
--- llvm/trunk/test/MC/ELF/got.s (original)
+++ llvm/trunk/test/MC/ELF/got.s Mon Oct 4 14:51:39 2010
@@ -2,7 +2,8 @@
 // Test that this produces a R_X86_64_GOT32.
-        movl foo@GOT, %eax
+        movl foo@GOT, %eax
+        movl foo@GOTPCREL(%rip), %eax
 // CHECK: ('_relocations', [
 // CHECK-NEXT: # Relocation 0
@@ -11,4 +12,10 @@
 // CHECK-NEXT: ('r_type', 3)
 // CHECK-NEXT: ('r_addend',
 // CHECK-NEXT: ),
+// CHECK-NEXT: # Relocation 1
+// CHECK-NEXT: (('r_offset',
+// CHECK-NEXT: ('r_sym',
+// CHECK-NEXT: ('r_type', 9)
+// CHECK-NEXT: ('r_addend',
+// CHECK-NEXT: ),
 // CHECK-NEXT: ])

From daniel at zuster.org Mon Oct 4 15:11:39 2010
From: daniel at zuster.org (Daniel Dunbar)
Date: Mon, 04 Oct 2010 20:11:39 -0000
Subject: [llvm-commits] [llvm] r115549 - /llvm/trunk/docs/ReleaseNotes.html
Message-ID: <20101004201139.B607E2A6C12C@llvm.org>

Author: ddunbar
Date: Mon Oct 4 15:11:39 2010
New Revision: 115549

URL: http://llvm.org/viewvc/llvm-project?rev=115549&view=rev
Log:
ReleaseNotes: Note a header rename.

Modified:
    llvm/trunk/docs/ReleaseNotes.html

Modified: llvm/trunk/docs/ReleaseNotes.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115549&r1=115548&r2=115549&view=diff
==============================================================================
--- llvm/trunk/docs/ReleaseNotes.html (original)
+++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 15:11:39 2010
@@ -1012,7 +1012,7 @@
  • - Some APIs got renamed:
    + Some APIs were renamed:
    • llvm_report_error -> report_fatal_error
    • llvm_install_error_handler -> install_fatal_error_handler
    • @@ -1021,6 +1021,13 @@
  • +
  • + Some public headers were renamed: +
      +
    • llvm/Assembly/AsmAnnotationWriter.h was renamed
    + to llvm/Assembly/AssemblyAnnotationWriter.h
    +
    • +
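
A note on the header rename recorded in r115549: client code that has to build against both the old and the new header name could pick the right spelling at preprocessing time. The sketch below is an illustration only and is not part of the commit. It assumes a compiler that provides `__has_include` (standard since C++17); the macro `ANNOTATION_WRITER_HEADER` and the function `annotationWriterHeader()` are hypothetical names introduced just for this example.

```cpp
// Sketch only: include whichever spelling of the annotation-writer header
// is present on the include path. ANNOTATION_WRITER_HEADER and
// annotationWriterHeader() are hypothetical names used for illustration.
#include <cstring>

#if defined(__has_include)
#  if __has_include("llvm/Assembly/AssemblyAnnotationWriter.h")
#    include "llvm/Assembly/AssemblyAnnotationWriter.h"  // name after the rename
#    define ANNOTATION_WRITER_HEADER "new"
#  elif __has_include("llvm/Assembly/AsmAnnotationWriter.h")
#    include "llvm/Assembly/AsmAnnotationWriter.h"       // name before the rename
#    define ANNOTATION_WRITER_HEADER "old"
#  endif
#endif

#ifndef ANNOTATION_WRITER_HEADER
#  define ANNOTATION_WRITER_HEADER "none"  // neither header was found
#endif

// Reports which branch was taken, so a build can log or check it.
const char *annotationWriterHeader() { return ANNOTATION_WRITER_HEADER; }
```

On a machine without these LLVM headers the function simply reports "none", so the translation unit still compiles.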

From daniel at zuster.org Mon Oct 4 15:11:41 2010
From: daniel at zuster.org (Daniel Dunbar)
Date: Mon, 04 Oct 2010 20:11:41 -0000
Subject: [llvm-commits] [llvm] r115550 - /llvm/trunk/docs/ReleaseNotes.html
Message-ID: <20101004201141.ED1222A6C12D@llvm.org>

Author: ddunbar
Date: Mon Oct 4 15:11:41 2010
New Revision: 115550

URL: http://llvm.org/viewvc/llvm-project?rev=115550&view=rev
Log:
ReleaseNotes: Note some changes to LLVM development infrastructure.

Modified:
    llvm/trunk/docs/ReleaseNotes.html

Modified: llvm/trunk/docs/ReleaseNotes.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ReleaseNotes.html?rev=115550&r1=115549&r2=115550&view=diff
==============================================================================
--- llvm/trunk/docs/ReleaseNotes.html (original)
+++ llvm/trunk/docs/ReleaseNotes.html Mon Oct 4 15:11:41 2010
@@ -1032,6 +1032,45 @@
+
+
+
+
    + +

    This section lists changes to the LLVM development infrastructure. This
    +mostly impacts users who actively work on LLVM or follow development on
    +mainline, but may also impact users who leverage the LLVM build infrastructure
    +or are interested in LLVM qualification.

    + +
      +
    • The default for make check is now to use
    + the lit testing tool, which is
    + part of LLVM itself. You can use lit directly as well, or use
    + the llvm-lit tool which is created as part of a Makefile or CMake
    + build (and knows how to find the appropriate tools). See the lit
    + documentation and the blog
    + post, and PR5217
    + for more information.
    • + +
    • The LLVM test-suite infrastructure has a new "simple" test format
    + (make TEST=simple). The new format is intended to require only a
    + compiler and not a full set of LLVM tools. This makes it useful for testing
    + released compilers, for running the test suite with other compilers (for
    + performance comparisons), and makes sure that we are testing the compiler as
    + users would see it. The new format is also designed to work using reference
    + outputs instead of comparison to a baseline compiler, which makes it run much
    + faster and makes it less system dependent.
    • + +
    • Significant progress has been made on a new interface to running the
    + LLVM test-suite (aka the LLVM "nightly tests") using
    + the LNT infrastructure. The LNT
    + interface to the test-suite brings significantly improved reporting
    + capabilities for monitoring the correctness and generated code quality
    + produced by LLVM over time.
    • +
    +
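
The release note in r115550 says the "simple" test format checks a program against reference outputs rather than against a baseline compiler. A minimal sketch of that idea follows; it is an assumption-laden illustration and not the test-suite's actual implementation. The helper names `rstrip` and `matchesReference` are hypothetical, and ignoring trailing whitespace is a choice made here only to show why such a comparison can be less system dependent.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: drop trailing spaces, tabs and carriage returns.
static std::string rstrip(const std::string &s) {
  std::size_t end = s.find_last_not_of(" \t\r");
  return end == std::string::npos ? std::string() : s.substr(0, end + 1);
}

// Hypothetical check: true when the program output matches the stored
// reference output line for line, ignoring trailing whitespace.
bool matchesReference(const std::string &actual, const std::string &reference) {
  std::istringstream a(actual), r(reference);
  std::string la, lr;
  while (true) {
    bool haveA = static_cast<bool>(std::getline(a, la));
    bool haveR = static_cast<bool>(std::getline(r, lr));
    if (!haveA || !haveR)
      return haveA == haveR;  // both outputs must end at the same line
    if (rstrip(la) != rstrip(lr))
      return false;
  }
}
```

With a check of this shape only the compiler under test has to run; the reference output is produced once and stored, which is the property the note describes.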

From enderby at apple.com Mon Oct 4 15:17:24 2010
From: enderby at apple.com (Kevin Enderby)
Date: Mon, 04 Oct 2010 20:17:24 -0000
Subject: [llvm-commits] [llvm] r115551 - in /llvm/trunk: include/llvm/MC/MCContext.h lib/MC/MCContext.cpp lib/MC/MCDwarf.cpp lib/MC/MCParser/AsmParser.cpp
Message-ID: <20101004201724.4612F2A6C12C@llvm.org>

Author: enderby
Date: Mon Oct 4 15:17:24 2010
New Revision: 115551

URL: http://llvm.org/viewvc/llvm-project?rev=115551&view=rev
Log:
Incorporate suggestions by Daniel Dunbar after his review. Thanks Daniel!

1) Changed ValidateDwarfFileNumber() to isValidDwarfFileNumber() to be better
   named, since it is just a predicate and isn't actually changing any state.

2) Added a missing return in the comments for setCurrentDwarfLoc() in
   include/llvm/MC/MCContext.h to fix formatting.

3) Changed clearDwarfLocSeen() to ClearDwarfLocSeen() since it does change state.

4) Simplified the last test in isValidDwarfFileNumber() to just a one-line
   boolean test of MCDwarfFiles[FileNumber] != 0 for the final return statement.

Modified:
    llvm/trunk/include/llvm/MC/MCContext.h
    llvm/trunk/lib/MC/MCContext.cpp
    llvm/trunk/lib/MC/MCDwarf.cpp
    llvm/trunk/lib/MC/MCParser/AsmParser.cpp

Modified: llvm/trunk/include/llvm/MC/MCContext.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCContext.h?rev=115551&r1=115550&r2=115551&view=diff
==============================================================================
--- llvm/trunk/include/llvm/MC/MCContext.h (original)
+++ llvm/trunk/include/llvm/MC/MCContext.h Mon Oct 4 15:17:24 2010
@@ -160,7 +160,7 @@
     /// GetDwarfFile - creates an entry in the dwarf file and directory tables.
     unsigned GetDwarfFile(StringRef FileName, unsigned FileNumber);
-    bool ValidateDwarfFileNumber(unsigned FileNumber);
+    bool isValidDwarfFileNumber(unsigned FileNumber);
     bool hasDwarfFiles(void) {
       return MCDwarfFiles.size() != 0;
@@ -177,7 +177,8 @@
     }
     /// setCurrentDwarfLoc - saves the information from the currently parsed
-    /// dwarf .loc directive and sets DwarfLocSeen. When the next instruction /// is assembled an entry in the line number table with this information and
+    /// dwarf .loc directive and sets DwarfLocSeen. When the next instruction
+    /// is assembled an entry in the line number table with this information and
     /// the address of the instruction will be created.
     void setCurrentDwarfLoc(unsigned FileNum, unsigned Line, unsigned Column,
                             unsigned Flags, unsigned Isa) {
@@ -188,7 +189,7 @@
       CurrentDwarfLoc.setIsa(Isa);
       DwarfLocSeen = true;
     }
-    void clearDwarfLocSeen() { DwarfLocSeen = false; }
+    void ClearDwarfLocSeen() { DwarfLocSeen = false; }
     bool getDwarfLocSeen() { return DwarfLocSeen; }
     const MCDwarfLoc &getCurrentDwarfLoc() { return CurrentDwarfLoc; }

Modified: llvm/trunk/lib/MC/MCContext.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCContext.cpp?rev=115551&r1=115550&r2=115551&view=diff
==============================================================================
--- llvm/trunk/lib/MC/MCContext.cpp (original)
+++ llvm/trunk/lib/MC/MCContext.cpp Mon Oct 4 15:17:24 2010
@@ -255,15 +255,11 @@
   return FileNumber;
 }
-/// ValidateDwarfFileNumber - takes a dwarf file number and returns true if it
+/// isValidDwarfFileNumber - takes a dwarf file number and returns true if it
 /// currently is assigned and false otherwise.
-bool MCContext::ValidateDwarfFileNumber(unsigned FileNumber) { +bool MCContext::isValidDwarfFileNumber(unsigned FileNumber) { if(FileNumber == 0 || FileNumber >= MCDwarfFiles.size()) return false; - MCDwarfFile *&ExistingFile = MCDwarfFiles[FileNumber]; - if (ExistingFile) - return true; - else - return false; + return MCDwarfFiles[FileNumber] != 0; } Modified: llvm/trunk/lib/MC/MCDwarf.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCDwarf.cpp?rev=115551&r1=115550&r2=115551&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCDwarf.cpp (original) +++ llvm/trunk/lib/MC/MCDwarf.cpp Mon Oct 4 15:17:24 2010 @@ -76,7 +76,7 @@ MCLineEntry LineEntry(LineSym, DwarfLoc); // clear DwarfLocSeen saying the current .loc info is now used. - MCOS->getContext().clearDwarfLocSeen(); + MCOS->getContext().ClearDwarfLocSeen(); // Get the MCLineSection for this section, if one does not exist for this // section create it. Modified: llvm/trunk/lib/MC/MCParser/AsmParser.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCParser/AsmParser.cpp?rev=115551&r1=115550&r2=115551&view=diff ============================================================================== --- llvm/trunk/lib/MC/MCParser/AsmParser.cpp (original) +++ llvm/trunk/lib/MC/MCParser/AsmParser.cpp Mon Oct 4 15:17:24 2010 @@ -2031,7 +2031,7 @@ int64_t FileNumber = getTok().getIntVal(); if (FileNumber < 1) return TokError("file number less than one in '.loc' directive"); - if (!getContext().ValidateDwarfFileNumber(FileNumber)) + if (!getContext().isValidDwarfFileNumber(FileNumber)) return TokError("unassigned file number in '.loc' directive"); Lex(); From isanbard at gmail.com Mon Oct 4 15:24:02 2010 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 04 Oct 2010 20:24:02 -0000 Subject: [llvm-commits] [llvm] r115552 - in /llvm/trunk: include/llvm/IntrinsicsX86.td lib/Target/X86/X86InstrMMX.td lib/VMCore/AutoUpgrade.cpp 
test/Assembler/AutoUpgradeMMXIntrinsics.ll Message-ID: <20101004202402.19DAB2A6C12C@llvm.org> Author: void Date: Mon Oct 4 15:24:01 2010 New Revision: 115552 URL: http://llvm.org/viewvc/llvm-project?rev=115552&view=rev Log: The pshufw instruction came about in MMX2 when SSE was introduced. Don't place it in with the SSSE3 instructions. Steward! Could you place this chair by the aft sun deck? I'm trying to get away from the Astors. They are such boors! Modified: llvm/trunk/include/llvm/IntrinsicsX86.td llvm/trunk/lib/Target/X86/X86InstrMMX.td llvm/trunk/lib/VMCore/AutoUpgrade.cpp llvm/trunk/test/Assembler/AutoUpgradeMMXIntrinsics.ll Modified: llvm/trunk/include/llvm/IntrinsicsX86.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IntrinsicsX86.td?rev=115552&r1=115551&r2=115552&view=diff ============================================================================== --- llvm/trunk/include/llvm/IntrinsicsX86.td (original) +++ llvm/trunk/include/llvm/IntrinsicsX86.td Mon Oct 4 15:24:01 2010 @@ -630,7 +630,7 @@ def int_x86_ssse3_pshuf_b_128 : GCCBuiltin<"__builtin_ia32_pshufb128">, Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, llvm_v16i8_ty], [IntrNoMem]>; - def int_x86_ssse3_pshuf_w : GCCBuiltin<"__builtin_ia32_pshufw">, + def int_x86_sse_pshuf_w : GCCBuiltin<"__builtin_ia32_pshufw">, Intrinsic<[llvm_x86mmx_ty], [llvm_x86mmx_ty, llvm_i8_ty], [IntrNoMem]>; } Modified: llvm/trunk/lib/Target/X86/X86InstrMMX.td URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrMMX.td?rev=115552&r1=115551&r2=115552&view=diff ============================================================================== --- llvm/trunk/lib/Target/X86/X86InstrMMX.td (original) +++ llvm/trunk/lib/Target/X86/X86InstrMMX.td Mon Oct 4 15:24:01 2010 @@ -342,13 +342,13 @@ (outs VR64:$dst), (ins VR64:$src1, i8imm:$src2), "pshufw\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set VR64:$dst, - (int_x86_ssse3_pshuf_w VR64:$src1, imm:$src2))]>; + (int_x86_sse_pshuf_w VR64:$src1, 
imm:$src2))]>; def MMX_PSHUFWmi : MMXIi8<0x70, MRMSrcMem, (outs VR64:$dst), (ins i64mem:$src1, i8imm:$src2), "pshufw\t{$src2, $src1, $dst|$dst, $src1, $src2}", [(set VR64:$dst, - (int_x86_ssse3_pshuf_w (load_mmx addr:$src1), - imm:$src2))]>; + (int_x86_sse_pshuf_w (load_mmx addr:$src1), + imm:$src2))]>; Modified: llvm/trunk/lib/VMCore/AutoUpgrade.cpp URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/VMCore/AutoUpgrade.cpp?rev=115552&r1=115551&r2=115552&view=diff ============================================================================== --- llvm/trunk/lib/VMCore/AutoUpgrade.cpp (original) +++ llvm/trunk/lib/VMCore/AutoUpgrade.cpp Mon Oct 4 15:24:01 2010 @@ -528,6 +528,16 @@ // or 0. NewFn = 0; return true; + } else if (Name.compare(5, 17, "x86.ssse3.pshuf.w", 17) == 0) { + // This is an SSE/MMX instruction. + const Type *X86_MMXTy = VectorType::getX86_MMXTy(FTy->getContext()); + NewFn = + cast(M->getOrInsertFunction("llvm.x86.sse.pshuf.w", + X86_MMXTy, + X86_MMXTy, + Type::getInt8Ty(F->getContext()), + (Type*)0)); + return true; } break; @@ -631,22 +641,23 @@ NewCI->setTailCall(OldCI->isTailCall()); NewCI->setCallingConv(OldCI->getCallingConv()); - // Handle any uses of the old CallInst. + // Handle any uses of the old CallInst. If the type has changed, add a cast. if (!OldCI->use_empty()) { - // If the type has changed, add a cast. - Instruction *I = OldCI; if (OldCI->getType() != NewCI->getType()) { Function *OldFn = OldCI->getCalledFunction(); CastInst *RetCast = CastInst::Create(CastInst::getCastOpcode(NewCI, true, OldFn->getReturnType(), true), NewCI, OldFn->getReturnType(), NewCI->getName(),OldCI); - I = RetCast; + + // Replace all uses of the old call with the new cast which has the + // correct type. + OldCI->replaceAllUsesWith(RetCast); + } else { + OldCI->replaceAllUsesWith(NewCI); } - // Replace all uses of the old call with the new cast which has the - // correct type. 
- OldCI->replaceAllUsesWith(I); } + // Clean up the old call now that it has been completely upgraded. OldCI->eraseFromParent(); } @@ -1150,6 +1161,25 @@ ConstructNewCallInst(NewFn, CI, Operands, 3); break; } + case Intrinsic::x86_sse_pshuf_w: { + IRBuilder<> Builder(C); + Builder.SetInsertPoint(CI->getParent(), CI); + + // Cast the operand to the X86 MMX type. + Value *Operands[2]; + Operands[0] = + Builder.CreateBitCast(CI->getArgOperand(0), + NewFn->getFunctionType()->getParamType(0), + "upgraded."); + Operands[1] = + Builder.CreateTrunc(CI->getArgOperand(1), + Type::getInt8Ty(C), + "upgraded."); + + ConstructNewCallInst(NewFn, CI, Operands, 2); + break; + } + #if 0 case Intrinsic::x86_mmx_cvtsi32_si64: { // The return type needs to be changed. Modified: llvm/trunk/test/Assembler/AutoUpgradeMMXIntrinsics.ll URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Assembler/AutoUpgradeMMXIntrinsics.ll?rev=115552&r1=115551&r2=115552&view=diff ============================================================================== --- llvm/trunk/test/Assembler/AutoUpgradeMMXIntrinsics.ll (original) +++ llvm/trunk/test/Assembler/AutoUpgradeMMXIntrinsics.ll Mon Oct 4 15:24:01 2010 @@ -4,6 +4,7 @@ ; RUN: grep {llvm\\.x86\\.mmx} %t | not grep {\\\<2 x i32\\\>} ; RUN: grep {llvm\\.x86\\.mmx} %t | not grep {\\\<4 x i16\\\>} ; RUN: grep {llvm\\.x86\\.mmx} %t | not grep {\\\<8 x i8\\\>} +; RUN: grep {llvm\\.x86\\.sse\\.pshuf\\.w} %t | not grep i32 ; Addition declare <8 x i8> @llvm.x86.mmx.padd.b(<8 x i8>, <8 x i8>) nounwind readnone @@ -207,6 +208,7 @@ declare <1 x i64> @llvm.x86.mmx.palignr.b(<1 x i64>, <1 x i64>, i8) nounwind readnone declare i32 @llvm.x86.mmx.pextr.w(<1 x i64>, i32) nounwind readnone declare <1 x i64> @llvm.x86.mmx.pinsr.w(<1 x i64>, i32, i32) nounwind readnone +declare <4 x i16> @llvm.x86.ssse3.pshuf.w(<4 x i16>, i32) nounwind readnone define void @misc(<8 x i8> %A, <8 x i8> %B, <4 x i16> %C, <4 x i16> %D, <2 x i32> %E, <2 x i32> %F, <1 x i64> %G, <1 x i64> %H, 
i32* %I, i8 %J, i16 %K, i32 %L) { @@ -216,5 +218,6 @@ %r2 = call <1 x i64> @llvm.x86.mmx.palignr.b(<1 x i64> %G, <1 x i64> %H, i8 %J) %r3 = call i32 @llvm.x86.mmx.pextr.w(<1 x i64> %G, i32 37) %r4 = call <1 x i64> @llvm.x86.mmx.pinsr.w(<1 x i64> %G, i32 37, i32 927) + %r5 = call <4 x i16> @llvm.x86.ssse3.pshuf.w(<4 x i16> %C, i32 37) ret void } From isanbard at gmail.com Mon Oct 4 15:36:41 2010 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 04 Oct 2010 20:36:41 -0000 Subject: [llvm-commits] [www-releases] r115553 - /www-releases/trunk/2.8/ Message-ID: <20101004203641.C3B972A6C12C@llvm.org> Author: void Date: Mon Oct 4 15:36:41 2010 New Revision: 115553 URL: http://llvm.org/viewvc/llvm-project?rev=115553&view=rev Log: Creating directory for 2.8 goodness. Added: www-releases/trunk/2.8/ From isanbard at gmail.com Mon Oct 4 15:49:24 2010 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 04 Oct 2010 20:49:24 -0000 Subject: [llvm-commits] [www-releases] r115556 [3/3] - in /www-releases/trunk/2.8: ./ docs/ docs/CommandGuide/ docs/CommandGuide/html/ docs/CommandGuide/man/ docs/CommandGuide/man/man1/ docs/CommandGuide/ps/ docs/HistoricalNotes/ docs/img/ docs/tutorial/ Message-ID: <20101004204926.EEDC12A6C12D@llvm.org> Added: www-releases/trunk/2.8/docs/WritingAnLLVMBackend.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/WritingAnLLVMBackend.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/WritingAnLLVMBackend.html (added) +++ www-releases/trunk/2.8/docs/WritingAnLLVMBackend.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,2556 @@ + + + + + Writing an LLVM Compiler Backend + + + + + +
    + Writing an LLVM Compiler Backend +
    + +
      +
    1. Introduction
    2. Target Machine
    3. Target Registration
    4. Register Set and Register Classes
    5. Instruction Set
    6. Instruction Selector
    7. Assembly Printer
    8. Subtarget Support
    9. JIT Support
    + +
    +

    Written by Mason Woo and + Misha Brukman

    +
    + + + + + +
    + +

    +This document describes techniques for writing compiler backends that convert +the LLVM Intermediate Representation (IR) to code for a specified machine or +other languages. Code intended for a specific machine can take the form of +either assembly code or binary code (usable for a JIT compiler). +

    + +

    +The backend of LLVM features a target-independent code generator that may create +output for several types of target CPUs — including X86, PowerPC, Alpha, +and SPARC. The backend may also be used to generate code targeted at SPUs of the +Cell processor or GPUs to support the execution of compute kernels. +

    + +

    +The document focuses on existing examples found in subdirectories +of llvm/lib/Target in a downloaded LLVM release. In particular, this +document focuses on the example of creating a static compiler (one that emits +text assembly) for a SPARC target, because SPARC has fairly standard +characteristics, such as a RISC instruction set and straightforward calling +conventions. +

    + +
    + + + +
    + +

    +The audience for this document is anyone who needs to write an LLVM backend to +generate code for a specific hardware or software target. +

    + +
    + + + +
    + +

    +These essential documents must be read before reading this document: +

    + +
      +
    • LLVM Language Reference + Manual — a reference manual for the LLVM assembly language.
    • The LLVM + Target-Independent Code Generator — a guide to the components + (classes and code generation algorithms) for translating the LLVM internal + representation into machine code for a specified target. Pay particular + attention to the descriptions of code generation stages: Instruction + Selection, Scheduling and Formation, SSA-based Optimization, Register + Allocation, Prolog/Epilog Code Insertion, Late Machine Code Optimizations, + and Code Emission.
    • TableGen + Fundamentals — a document that describes the TableGen + (tblgen) application that manages domain-specific information to + support LLVM code generation. TableGen processes input from a target + description file (.td suffix) and generates C++ code that can be + used for code generation.
    • Writing an LLVM + Pass — The assembly printer is a FunctionPass, as are + several SelectionDAG processing steps.
    + +

    +To follow the SPARC examples in this document, have a copy of +The SPARC Architecture +Manual, Version 8 for reference. For details about the ARM instruction +set, refer to the ARM Architecture +Reference Manual. For more about the GNU Assembler format +(GAS), see +Using As, +especially for the assembly printer. Using As contains a list of target +machine dependent features. +

    + +
    + + + +
    + +

    +To write a compiler backend for LLVM that converts the LLVM IR to code for a +specified target (machine or other language), follow these steps: +

    + +
      +
    • Create a subclass of the TargetMachine class that describes characteristics + of your target machine. Copy existing examples of specific TargetMachine + class and header files; for example, start with + SparcTargetMachine.cpp and SparcTargetMachine.h, but + change the file names for your target. Similarly, change code that + references "Sparc" to reference your target.
    • Describe the register set of the target. Use TableGen to generate code for + register definition, register aliases, and register classes from a + target-specific RegisterInfo.td input file. You should also write + additional code for a subclass of the TargetRegisterInfo class that + represents the class register file data used for register allocation and + also describes the interactions between registers.
    • Describe the instruction set of the target. Use TableGen to generate code + for target-specific instructions from target-specific versions of + TargetInstrFormats.td and TargetInstrInfo.td. You should + write additional code for a subclass of the TargetInstrInfo class to + represent machine instructions supported by the target machine.
    • Describe the selection and conversion of the LLVM IR from a Directed Acyclic + Graph (DAG) representation of instructions to native target-specific + instructions. Use TableGen to generate code that matches patterns and + selects instructions based on additional information in a target-specific + version of TargetInstrInfo.td. Write code + for XXXISelDAGToDAG.cpp, where XXX identifies the specific target, + to perform pattern matching and DAG-to-DAG instruction selection. Also write + code in XXXISelLowering.cpp to replace or remove operations and + data types that are not supported natively in a SelectionDAG.
    • Write code for an assembly printer that converts LLVM IR to a GAS format for + your target machine. You should add assembly strings to the instructions + defined in your target-specific version of TargetInstrInfo.td. You + should also write code for a subclass of AsmPrinter that performs the + LLVM-to-assembly conversion and a trivial subclass of TargetAsmInfo.
    • Optionally, add support for subtargets (i.e., variants with different + capabilities). You should also write code for a subclass of the + TargetSubtarget class, which allows you to use the -mcpu= + and -mattr= command-line options.
    • Optionally, add JIT support and create a machine code emitter (subclass of + TargetJITInfo) that is used to emit binary code directly into memory.
    + +

    +In the .cpp and .h files, initially stub up these methods and +then implement them later. Initially, you may not know which private members +the class will need and which components will need to be subclassed. +

    + +
    + + + +
    + +

    +To actually create your compiler backend, you need to create and modify a few +files. The absolute minimum is discussed here. But to actually use the LLVM +target-independent code generator, you must perform the steps described in +the LLVM +Target-Independent Code Generator document. +

    + +

    +First, you should create a subdirectory under lib/Target to hold all +the files related to your target. If your target is called "Dummy," create the +directory lib/Target/Dummy. +

    + +

    +In this new +directory, create a Makefile. It is easiest to copy a +Makefile of another target and modify it. It should at least contain +the LEVEL, LIBRARYNAME and TARGET variables, and then +include $(LEVEL)/Makefile.common. The library can be +named LLVMDummy (for example, see the MIPS target). Alternatively, you +can split the library into LLVMDummyCodeGen +and LLVMDummyAsmPrinter, the latter of which should be implemented in a +subdirectory below lib/Target/Dummy (for example, see the PowerPC +target). +

    + +

    +Note that these two naming schemes are hardcoded into llvm-config. +Using any other naming scheme will confuse llvm-config and produce a +lot of (seemingly unrelated) linker errors when linking llc. +

    + +

    +To make your target actually do something, you need to implement a subclass of +TargetMachine. This implementation should typically be in the file +lib/Target/DummyTargetMachine.cpp, but any file in +the lib/Target directory will be built and should work. To use LLVM's +target independent code generator, you should do what all current machine +backends do: create a subclass of LLVMTargetMachine. (To create a +target from scratch, create a subclass of TargetMachine.) +

    + +

    +To get LLVM to actually build and link your target, you need to add it to +the TARGETS_TO_BUILD variable. To do this, you modify the configure +script to know about your target when parsing the --enable-targets +option. Search the configure script for TARGETS_TO_BUILD, add your +target to the lists there (some creativity required), and then +reconfigure. Alternatively, you can change autoconf/configure.ac and +regenerate configure by running ./autoconf/AutoRegen.sh. +

    + +
    + + + + + +
    + +

    +LLVMTargetMachine is designed as a base class for targets implemented +with the LLVM target-independent code generator. The LLVMTargetMachine +class should be specialized by a concrete target class that implements the +various virtual methods. LLVMTargetMachine is defined as a subclass of +TargetMachine in include/llvm/Target/TargetMachine.h. The +TargetMachine class implementation (TargetMachine.cpp) also +processes numerous command-line options. +

    + +

    +To create a concrete target-specific subclass of LLVMTargetMachine, +start by copying an existing TargetMachine class and header. You +should name the files that you create to reflect your specific target. For +instance, for the SPARC target, name the files SparcTargetMachine.h and +SparcTargetMachine.cpp. +

    + +

    +For a target machine XXX, the implementation of +XXXTargetMachine must have access methods to obtain objects that +represent target components. These methods are named get*Info, and are +intended to obtain the instruction set (getInstrInfo), register set +(getRegisterInfo), stack frame layout (getFrameInfo), and +similar information. XXXTargetMachine must also implement the +getTargetData method to access an object with target-specific data +characteristics, such as data type size and alignment requirements. +

    + +

    +For instance, for the SPARC target, the header file +SparcTargetMachine.h declares prototypes for several get*Info +and getTargetData methods that simply return a class member. +

    + +
    +
    +namespace llvm {
    +
    +class Module;
    +
    +class SparcTargetMachine : public LLVMTargetMachine {
    +  const TargetData DataLayout;       // Calculates type size & alignment
    +  SparcSubtarget Subtarget;
    +  SparcInstrInfo InstrInfo;
    +  TargetFrameInfo FrameInfo;
    +  
    +protected:
    +  virtual const TargetAsmInfo *createTargetAsmInfo() const;
    +  
    +public:
    +  SparcTargetMachine(const Module &M, const std::string &FS);
    +
    +  virtual const SparcInstrInfo *getInstrInfo() const {return &InstrInfo; }
    +  virtual const TargetFrameInfo *getFrameInfo() const {return &FrameInfo; }
    +  virtual const TargetSubtarget *getSubtargetImpl() const{return &Subtarget; }
    +  virtual const TargetRegisterInfo *getRegisterInfo() const {
    +    return &InstrInfo.getRegisterInfo();
    +  }
    +  virtual const TargetData *getTargetData() const { return &DataLayout; }
    +  static unsigned getModuleMatchQuality(const Module &M);
    +
    +  // Pass Pipeline Configuration
    +  virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
    +  virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
    +};
    +
    +} // end namespace llvm
    +
    +
    + +
    + + +
    + +
      +
    • getInstrInfo()
    • getRegisterInfo()
    • getFrameInfo()
    • getTargetData()
    • getSubtargetImpl()
    + +

    For some targets, you also need to support the following methods:

    + +
      +
    • getTargetLowering()
    • getJITInfo()
    + +

    +In addition, the XXXTargetMachine constructor should specify a +TargetDescription string that determines the data layout for the target +machine, including characteristics such as pointer size, alignment, and +endianness. For example, the constructor for SparcTargetMachine contains the +following: +

    + +
    +
    +SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
    +  : DataLayout("E-p:32:32-f128:128:128"),
    +    Subtarget(M, FS), InstrInfo(Subtarget),
    +    FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
    +}
    +
    +
    + +
    + +
    + +

    Hyphens separate portions of the TargetDescription string.

    + +
      +
    • An upper-case "E" in the string indicates a big-endian target data + model. A lower-case "e" indicates little-endian.
    • "p:" is followed by pointer information: size, ABI alignment, and + preferred alignment. If only two figures follow "p:", then the + first value is pointer size, and the second value is both ABI and preferred + alignment.
    • Then a letter for numeric type alignment: "i", "f", + "v", or "a" (corresponding to integer, floating point, + vector, or aggregate). "i", "v", or "a" are + followed by ABI alignment and preferred alignment. "f" is followed + by three values: the first indicates the size of a long double, then ABI + alignment, and then preferred alignment.
    + +
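To make the string layout described above concrete, here is a minimal C++ sketch that splits a description string on hyphens and reads the endianness and pointer fields. It is illustrative only: LayoutSummary, split, and summarizeLayout are hypothetical names, not LLVM APIs, and the sketch ignores the "i", "f", "v", and "a" specifications.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical summary of the fields discussed above; not an LLVM type.
struct LayoutSummary {
  bool BigEndian = false;
  unsigned PointerSize = 0;      // in bits
  unsigned PointerABIAlign = 0;  // in bits
  unsigned PointerPrefAlign = 0; // in bits
};

// Split S into pieces separated by the character Sep.
static std::vector<std::string> split(const std::string &S, char Sep) {
  std::vector<std::string> Parts;
  std::stringstream In(S);
  std::string Item;
  while (std::getline(In, Item, Sep))
    Parts.push_back(Item);
  return Parts;
}

// Read endianness and pointer information from a string such as
// "E-p:32:32-f128:128:128". All other specifications are skipped.
LayoutSummary summarizeLayout(const std::string &Desc) {
  assert(!Desc.empty() && "description string must not be empty");
  LayoutSummary L;
  for (const std::string &Spec : split(Desc, '-')) {
    if (Spec == "E") { L.BigEndian = true; continue; }
    if (Spec == "e") { L.BigEndian = false; continue; }
    std::vector<std::string> F = split(Spec, ':');
    if (F.size() < 3 || F[0] != "p")
      continue;
    L.PointerSize = static_cast<unsigned>(std::stoul(F[1]));
    L.PointerABIAlign = static_cast<unsigned>(std::stoul(F[2]));
    // With only two figures after "p:", the second one serves as both
    // the ABI and the preferred alignment.
    L.PointerPrefAlign =
        F.size() > 3 ? static_cast<unsigned>(std::stoul(F[3]))
                     : L.PointerABIAlign;
  }
  return L;
}
```

Calling summarizeLayout on the SPARC string from the constructor above should report a big-endian layout with 32-bit pointers whose ABI and preferred alignment are both 32 bits.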
    + + + + + +
    + +

    +You must also register your target with the TargetRegistry, which +other LLVM tools use to look up and use your target at +runtime. The TargetRegistry can be used directly, but for most targets +there are helper templates which should take care of the work for you.

    + +

    +All targets should declare a global Target object which is used to +represent the target during registration. Then, in the target's TargetInfo +library, the target should define that object and use +the RegisterTarget template to register the target. For example, the Sparc registration code looks like this: +

    + +
    +
    +Target llvm::TheSparcTarget;
    +
    +extern "C" void LLVMInitializeSparcTargetInfo() { 
    +  RegisterTarget<Triple::sparc, /*HasJIT=*/false>
    +    X(TheSparcTarget, "sparc", "Sparc");
    +}
    +
    +
    + +

    +This allows the TargetRegistry to look up the target by name or by +target triple. In addition, most targets will also register additional features +which are available in separate libraries. These registration steps are +separate, because some clients may wish to only link in some parts of the target +-- the JIT code generator does not require the use of the assembly printer, for +example. Here is an example of registering the Sparc assembly printer: +

    + +
    +
    +extern "C" void LLVMInitializeSparcAsmPrinter() { 
    +  RegisterAsmPrinter<SparcAsmPrinter> X(TheSparcTarget);
    +}
    +
    +
    + +

    +For more information, see +"llvm/Target/TargetRegistry.h". +

    + +
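The registration pattern above, a helper object whose constructor records the target in a global table, can be sketched in standalone C++ as follows. This is a toy model under stated assumptions, not LLVM's implementation: ToyTarget, ToyRegistry, RegisterToyTarget, RegisterToyAsmPrinter, and the two Initialize functions are all hypothetical names chosen to mirror the shape of the Sparc snippets.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a target record; LLVM's real Target differs.
struct ToyTarget {
  std::string Name;
  std::string Description;
  bool HasAsmPrinter = false;
};

// Hypothetical registry keyed by target name.
class ToyRegistry {
public:
  static void add(ToyTarget &T, const std::string &Name,
                  const std::string &Desc) {
    assert(!Name.empty() && "a target needs a name to be looked up");
    T.Name = Name;
    T.Description = Desc;
    table()[Name] = &T;
  }

  // Returns the registered target, or nullptr if the name is unknown.
  static ToyTarget *lookup(const std::string &Name) {
    auto It = table().find(Name);
    return It == table().end() ? nullptr : It->second;
  }

private:
  // Function-local static avoids initialization-order problems.
  static std::map<std::string, ToyTarget *> &table() {
    static std::map<std::string, ToyTarget *> Table;
    return Table;
  }
};

// Helper whose constructor performs the registration.
struct RegisterToyTarget {
  RegisterToyTarget(ToyTarget &T, const char *Name, const char *Desc) {
    ToyRegistry::add(T, Name, Desc);
  }
};

// A separate helper for an optional component, registered independently.
struct RegisterToyAsmPrinter {
  explicit RegisterToyAsmPrinter(ToyTarget &T) { T.HasAsmPrinter = true; }
};

ToyTarget TheToyTarget;

extern "C" void InitializeToyTargetInfo() {
  RegisterToyTarget X(TheToyTarget, "toy", "Toy");
}

extern "C" void InitializeToyAsmPrinter() {
  RegisterToyAsmPrinter X(TheToyTarget);
}
```

Keeping the two initialization functions separate reflects the point made above: a client that only needs the target record can call the first function and never link in the printer.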
    + + + + + +
    + +

    +You should describe a concrete target-specific class that represents the +register file of a target machine. This class is called XXXRegisterInfo +(where XXX identifies the target) and represents the class register +file data that is used for register allocation. It also describes the +interactions between registers. +

    + +

    +You also need to define register classes to categorize related registers. A +register class should be added for groups of registers that are all treated the +same way for some instruction. Typical examples are register classes for +integer, floating-point, or vector registers. A register allocator allows an +instruction to use any register in a specified register class to perform the +instruction in a similar manner. Register classes allocate virtual registers to +instructions from these sets, and register classes let the target-independent +register allocator automatically choose the actual registers. +

    + +

    +Much of the code for registers, including register definition, register aliases, +and register classes, is generated by TableGen from XXXRegisterInfo.td +input files and placed in XXXGenRegisterInfo.h.inc and +XXXGenRegisterInfo.inc output files. Some of the code in the +implementation of XXXRegisterInfo requires hand-coding. +

    + +
    + + + + +
    + +

    +The XXXRegisterInfo.td file typically starts with register definitions +for a target machine. The Register class (specified +in Target.td) is used to define an object for each register. The +specified string n becomes the Name of the register. The +basic Register object does not have any subregisters and does not +specify any aliases. +

    + +
    +
    +class Register<string n> {
    +  string Namespace = "";
    +  string AsmName = n;
    +  string Name = n;
    +  int SpillSize = 0;
    +  int SpillAlignment = 0;
    +  list<Register> Aliases = [];
    +  list<Register> SubRegs = [];
    +  list<int> DwarfNumbers = [];
    +}
    +
    +
    + +

    +For example, in the X86RegisterInfo.td file, there are register +definitions that utilize the Register class, such as: +

    + +
    +
    +def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;
    +
    +
    + +

    +This defines the register AL and assigns it values (with +DwarfRegNum) that are used by gcc, gdb, or a debug +information writer to identify a register. For register +AL, DwarfRegNum takes an array of 3 values representing 3 +different modes: the first element is for X86-64, the second for exception +handling (EH) on X86-32, and the third is generic. -1 is a special Dwarf number +that indicates the gcc number is undefined, and -2 indicates the register number +is invalid for this mode. +

    + +

    +From the previously described line in the X86RegisterInfo.td file, +TableGen generates this code in the X86GenRegisterInfo.inc file: +

    + +
    +
    +static const unsigned GR8[] = { X86::AL, ... };
    +
    +const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };
    +
    +const TargetRegisterDesc RegisterDescriptors[] = { 
    +  ...
    +{ "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...
    +
    +
    + +

    +From the register info file, TableGen generates a TargetRegisterDesc +object for each register. TargetRegisterDesc is defined in +include/llvm/Target/TargetRegisterInfo.h with the following fields: +

    + +
    +
    +struct TargetRegisterDesc {
    +  const char     *AsmName;      // Assembly language name for the register
    +  const char     *Name;         // Printable name for the reg (for debugging)
    +  const unsigned *AliasSet;     // Register Alias Set
    +  const unsigned *SubRegs;      // Sub-register set
    +  const unsigned *ImmSubRegs;   // Immediate sub-register set
    +  const unsigned *SuperRegs;    // Super-register set
    +};
    +
    + +

    +TableGen uses the entire target description file (.td) to determine +text names for the register (in the AsmName and Name fields of +TargetRegisterDesc) and the relationships of other registers to the +defined register (in the other TargetRegisterDesc fields). In this +example, other definitions establish the registers "AX", +"EAX", and "RAX" as aliases for one another, so TableGen +generates a null-terminated array (AL_AliasSet) for this register alias +set. +

    + +
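As a rough illustration of how a null-terminated alias set like AL_AliasSet can be consumed, the following C++ sketch walks such an array until it reaches the 0 terminator. The enum values and the helper names aliasCount and aliases are hypothetical and exist only for this example; they are not taken from the generated X86 code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical register numbers. The value 0 is reserved so that it can
// terminate register lists.
enum ToyReg : unsigned { NoRegister = 0, AL, AX, EAX, RAX };

// Same shape as the generated alias set shown above.
const unsigned AL_AliasSet[] = { AX, EAX, RAX, 0 };

// Count the entries that precede the 0 terminator.
std::size_t aliasCount(const unsigned *Set) {
  assert(Set && "alias set must not be null");
  std::size_t N = 0;
  while (Set[N] != 0)
    ++N;
  return N;
}

// Return true if Reg appears in the null-terminated set.
bool aliases(const unsigned *Set, unsigned Reg) {
  assert(Set && "alias set must not be null");
  for (; *Set != 0; ++Set)
    if (*Set == Reg)
      return true;
  return false;
}
```

The terminator convention means no separate length needs to be stored next to each set, at the cost of reserving register number 0.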

    +The Register class is commonly used as a base class for more complex +classes. In Target.td, the Register class is the base for the +RegisterWithSubRegs class that is used to define registers that need to +specify subregisters in the SubRegs list, as shown here: +

    + +
    +
    +class RegisterWithSubRegs<string n,
    +list<Register> subregs> : Register<n> {
    +  let SubRegs = subregs;
    +}
    +
    +
    + +

    +In SparcRegisterInfo.td, additional register classes are defined for +SPARC: a Register subclass, SparcReg, and further subclasses: Ri, +Rf, and Rd. SPARC registers are identified by 5-bit ID +numbers, which is a feature common to these subclasses. Note the use of +'let' expressions to override values that are initially defined in a +superclass (such as SubRegs field in the Rd class). +

    + +
    +
    +class SparcReg<string n> : Register<n> {
    +  field bits<5> Num;
    +  let Namespace = "SP";
    +}
    +// Ri - 32-bit integer registers
    +class Ri<bits<5> num, string n> :
    +SparcReg<n> {
    +  let Num = num;
    +}
    +// Rf - 32-bit floating-point registers
    +class Rf<bits<5> num, string n> :
    +SparcReg<n> {
    +  let Num = num;
    +}
    +// Rd - Slots in the FP register file for 64-bit
    +// floating-point values.
    +class Rd<bits<5> num, string n,
    +list<Register> subregs> : SparcReg<n> {
    +  let Num = num;
    +  let SubRegs = subregs;
    +}
    +
    +
    + +

    +In the SparcRegisterInfo.td file, there are register definitions that +utilize these subclasses of Register, such as: +

    + +
    +
    +def G0 : Ri< 0, "G0">,
    +DwarfRegNum<[0]>;
    +def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
    +...
    +def F0 : Rf< 0, "F0">,
    +DwarfRegNum<[32]>;
    +def F1 : Rf< 1, "F1">,
    +DwarfRegNum<[33]>;
    +...
    +def D0 : Rd< 0, "F0", [F0, F1]>,
    +DwarfRegNum<[32]>;
    +def D1 : Rd< 2, "F2", [F2, F3]>,
    +DwarfRegNum<[34]>;
    +
    +
    + +

    +The last two registers shown above (D0 and D1) are +double-precision floating-point registers that are aliases for pairs of +single-precision floating-point sub-registers. In addition to aliases, the +sub-register and super-register relationships of the defined register are in +fields of a register's TargetRegisterDesc. +

    + +
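The pairing of double-precision registers with single-precision sub-registers can be sketched numerically. The C++ below assumes that the pattern visible in D0 and D1 (D0 covers F0 and F1 with Num 0, D1 covers F2 and F3 with Num 2) continues for every D register; that generalization is an assumption of this sketch, and subRegsOfD and encodingOfD are hypothetical helper names.

```cpp
#include <cassert>
#include <utility>

// Assumption: D<n> is made of F<2n> and F<2n+1>, extrapolated from the
// D0 and D1 definitions shown above.
std::pair<unsigned, unsigned> subRegsOfD(unsigned DIndex) {
  assert(DIndex < 16 && "this sketch covers D0 through D15 only");
  return std::make_pair(2 * DIndex, 2 * DIndex + 1);
}

// Assumption: the 5-bit Num of D<n> equals the index of its first
// single-precision sub-register, as in Rd<2, "F2", [F2, F3]> for D1.
unsigned encodingOfD(unsigned DIndex) {
  assert(DIndex < 16 && "this sketch covers D0 through D15 only");
  return 2 * DIndex;
}
```

Under these assumptions, D1 maps to the pair (F2, F3) and carries the encoding number 2, which matches the definition of D1 in the listing above.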
    + + + + +
    + +

    +The RegisterClass class (specified in Target.td) is used to +define an object that represents a group of related registers and also defines +the default allocation order of the registers. A target description file +XXXRegisterInfo.td that uses Target.td can construct register +classes using the following class: +

    + +
    +
    +class RegisterClass<string namespace,
    +list<ValueType> regTypes, int alignment,
    +                    list<Register> regList> {
    +  string Namespace = namespace;
    +  list<ValueType> RegTypes = regTypes;
    +  int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
    +  int Alignment = alignment;
    +
    +  // CopyCost is the cost of copying a value between two registers
    +  // default value 1 means a single instruction
    +  // A negative value means copying is extremely expensive or impossible
    +  int CopyCost = 1;  
    +  list<Register> MemberList = regList;
    +  
    +  // for register classes that are subregisters of this class
    +  list<RegisterClass> SubRegClassList = [];  
    +  
    +  code MethodProtos = [{}];  // to insert arbitrary code
    +  code MethodBodies = [{}];
    +}
    +
    +
    + +

    To define a RegisterClass, use the following 4 arguments:

    + +
      +
    • The first argument of the definition is the name of the namespace.
    • + +
    • The second argument is a list of ValueType register type values + that are defined in include/llvm/CodeGen/ValueTypes.td. Defined + values include integer types (such as i16, i32, + and i1 for Boolean), floating-point types + (f32, f64), and vector types (for example, v8i16 + for an 8 x i16 vector). All registers in a RegisterClass + must have the same ValueType, but some registers may store vector + data in different configurations. For example, a register that can process a + 128-bit vector may be able to handle 16 8-bit integer elements, 8 16-bit + integers, 4 32-bit integers, and so on.
    • + +
    • The third argument of the RegisterClass definition specifies the + alignment required of the registers when they are stored or loaded to + memory.
    • + +
    • The final argument, regList, specifies which registers are in this + class. If an allocation_order_* method is not specified, + then regList also defines the order of allocation used by the + register allocator.
    • +
    + +

    +In SparcRegisterInfo.td, three RegisterClass objects are defined: +FPRegs, DFPRegs, and IntRegs. For all three register +classes, the first argument defines the namespace with the string +'SP'. FPRegs defines a group of 32 single-precision +floating-point registers (F0 to F31); DFPRegs defines +a group of 16 double-precision registers +(D0-D15). For IntRegs, the MethodProtos +and MethodBodies fields are used by TableGen to insert the specified +code into generated output. +

    + +
    +
    +def FPRegs : RegisterClass<"SP", [f32], 32,
    +  [F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F14, F15,
    +   F16, F17, F18, F19, F20, F21, F22, F23, F24, F25, F26, F27, F28, F29, F30, F31]>;
    +
    +def DFPRegs : RegisterClass<"SP", [f64], 64,
    +  [D0, D1, D2, D3, D4, D5, D6, D7, D8, D9, D10, D11, D12, D13, D14, D15]>;
    + 
    +def IntRegs : RegisterClass<"SP", [i32], 32,
    +    [L0, L1, L2, L3, L4, L5, L6, L7,
    +     I0, I1, I2, I3, I4, I5,
    +     O0, O1, O2, O3, O4, O5, O7,
    +     G1,
    +     // Non-allocatable regs:
    +     G2, G3, G4, 
    +     O6,        // stack ptr
    +     I6,        // frame ptr
    +     I7,        // return address
    +     G0,        // constant zero
    +     G5, G6, G7 // reserved for kernel
    +    ]> {
    +  let MethodProtos = [{
    +    iterator allocation_order_end(const MachineFunction &MF) const;
    +  }];
    +  let MethodBodies = [{
    +    IntRegsClass::iterator
    +    IntRegsClass::allocation_order_end(const MachineFunction &MF) const {
    +      return end() - 10  // Don't allocate special registers
    +         -1;
    +    }
    +  }];
    +}
    +
    +
    + +
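The effect of the allocation_order_end override can be pictured with a small stand-alone C++ sketch. It is not LLVM code: ToyIntRegs and allocatableCount are hypothetical names, and register names stand in for the real register enum values. It only mirrors the arithmetic end() - 10 - 1 applied to the 32-entry list shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for the generated IntRegs class: names only, no LLVM types.
struct ToyIntRegs {
  std::vector<std::string> Regs = {
      "L0", "L1", "L2", "L3", "L4", "L5", "L6", "L7",
      "I0", "I1", "I2", "I3", "I4", "I5",
      "O0", "O1", "O2", "O3", "O4", "O5", "O7",
      "G1",
      // Non-allocatable registers, kept at the tail of the list:
      "G2", "G3", "G4", "O6", "I6", "I7", "G0", "G5", "G6", "G7"};

  typedef std::vector<std::string>::const_iterator iterator;

  iterator begin() const { return Regs.begin(); }
  iterator end() const { return Regs.end(); }

  // Mirrors the arithmetic in the MethodBodies override: skip the ten
  // special registers at the tail, plus one more (by position, G1).
  iterator allocation_order_end() const {
    assert(Regs.size() >= 11 && "register list too short to trim");
    return end() - 10 - 1;
  }
};

// Number of registers the allocator may hand out in this toy model.
inline std::size_t allocatableCount(const ToyIntRegs &RC) {
  return static_cast<std::size_t>(RC.allocation_order_end() - RC.begin());
}
```

In this model, 21 of the 32 registers fall inside the allocation range, and O7 is the last one. This is why the order of regList matters: the override trims from the end of the list, so registers that must not be allocated have to be listed last.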

    +Using SparcRegisterInfo.td with TableGen generates several output files +that are intended for inclusion in other source code that you write. +SparcRegisterInfo.td generates SparcGenRegisterInfo.h.inc, +which should be included in the header file for the SPARC +register implementation that you write (SparcRegisterInfo.h). In +SparcGenRegisterInfo.h.inc a new structure is defined called +SparcGenRegisterInfo that uses TargetRegisterInfo as its +base. It also specifies types, based upon the defined register +classes: DFPRegsClass, FPRegsClass, and IntRegsClass. +

    + +

    +SparcRegisterInfo.td also generates SparcGenRegisterInfo.inc, +which is included at the bottom of SparcRegisterInfo.cpp, the SPARC +register implementation. The code below shows only the generated integer +registers and associated register classes. The order of registers +in IntRegs reflects the order in the definition of IntRegs in +the target description file. Take special note of the use +of MethodBodies in SparcRegisterInfo.td to create code in +SparcGenRegisterInfo.inc. MethodProtos generates similar code +in SparcGenRegisterInfo.h.inc. +

    + +
    +
      // IntRegs Register Class...
    +  static const unsigned IntRegs[] = {
    +    SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
    +    SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
    +    SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
    +    SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
    +    SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
    +    SP::G6, SP::G7, 
    +  };
    +
    +  // IntRegsVTs Register Class Value Types...
    +  static const MVT::ValueType IntRegsVTs[] = {
    +    MVT::i32, MVT::Other
    +  };
    +
    +namespace SP {   // Register class instances
    +  DFPRegsClass    DFPRegsRegClass;
    +  FPRegsClass     FPRegsRegClass;
    +  IntRegsClass    IntRegsRegClass;
    +...
    +  // IntRegs Sub-register Classess...
    +  static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
    +    NULL
    +  };
    +...
    +  // IntRegs Super-register Classess...
    +  static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
    +    NULL
    +  };
    +...
    +  // IntRegs Register Class sub-classes...
    +  static const TargetRegisterClass* const IntRegsSubclasses [] = {
    +    NULL
    +  };
    +...
    +  // IntRegs Register Class super-classes...
    +  static const TargetRegisterClass* const IntRegsSuperclasses [] = {
    +    NULL
    +  };
    +...
    +  IntRegsClass::iterator
    +  IntRegsClass::allocation_order_end(const MachineFunction &MF) const {
    +     return end()-10  // Don't allocate special registers
    +         -1;
    +  }
    +  
    +  IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID, 
    +    IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses, 
    +    IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
    +}
    +
    +
    + +
    + + + + +
    + +

    +The final step is to hand code portions of XXXRegisterInfo, which +implements the interface described in TargetRegisterInfo.h. These +functions return 0, NULL, or false, unless +overridden. Here is a list of functions that are overridden for the SPARC +implementation in SparcRegisterInfo.cpp: +

    + +
      +
    • getCalleeSavedRegs — Returns a list of callee-saved registers + in the order of the desired callee-save stack frame offset.
    • + +
    • getReservedRegs — Returns a bitset indexed by physical + register numbers, indicating if a particular register is unavailable.
    • + +
    • hasFP — Return a Boolean indicating if a function should have + a dedicated frame pointer register.
    • + +
    • eliminateCallFramePseudoInstr — If call frame setup or + destroy pseudo instructions are used, this can be called to eliminate + them.
    • + +
    • eliminateFrameIndex — Eliminate abstract frame indices from + instructions that may use them.
    • + +
    • emitPrologue — Insert prologue code into the function.
    • + +
    • emitEpilogue — Insert epilogue code into the function.
    • +
    + +
    + + + + + +
    + +

    +During the early stages of code generation, the LLVM IR code is converted to a +SelectionDAG with nodes that are instances of the SDNode class +containing target instructions. An SDNode has an opcode, operands, type +requirements, and operation properties (for example, whether an operation is +commutative or whether it loads from memory). The various operation node +types are described in the include/llvm/CodeGen/SelectionDAGNodes.h +file (values of the NodeType enum in the ISD namespace). +

    + +

    +TableGen uses the following target description (.td) input files to +generate much of the code for instruction definition: +

    + +
      +
    • Target.td — Where the Instruction, Operand, + InstrInfo, and other fundamental classes are defined.
    • + +
    • TargetSelectionDAG.td — Used by SelectionDAG + instruction selection generators, contains SDTC* classes (selection + DAG type constraint), definitions of SelectionDAG nodes (such as + imm, cond, bb, add, fadd, + sub), and pattern support (Pattern, Pat, + PatFrag, PatLeaf, ComplexPattern).
    • + +
    • XXXInstrFormats.td — Patterns for definitions of + target-specific instructions.
    • + +
    • XXXInstrInfo.td — Target-specific definitions of instruction + templates, condition codes, and instructions of an instruction set. For + architecture modifications, a different file name may be used. For example, + for Pentium with SSE instruction, this file is X86InstrSSE.td, and + for Pentium with MMX, this file is X86InstrMMX.td.
    • +
    + +

    +There is also a target-specific XXX.td file, where XXX is the +name of the target. The XXX.td file includes the other .td +input files, but its contents are only directly important for subtargets. +

    + +

    +You should describe a concrete target-specific class XXXInstrInfo that +represents machine instructions supported by a target machine. +XXXInstrInfo contains an array of XXXInstrDescriptor objects, +each of which describes one instruction. An instruction descriptor defines:

    + +
      +
    • Opcode mnemonic
    • + +
    • Number of operands
    • + +
    • List of implicit register definitions and uses
    • + +
    • Target-independent properties (such as memory access, is commutable)
    • + +
    • Target-specific flags
    • +
    + +

    +The Instruction class (defined in Target.td) is mostly used as a base +for more complex instruction classes. +

    + +
    +
    class Instruction {
    +  string Namespace = "";
    +  dag OutOperandList;       // A dag containing the MI def operand list.
    +  dag InOperandList;        // A dag containing the MI use operand list.
    +  string AsmString = "";    // The .s format to print the instruction with.
    +  list<dag> Pattern;  // Set to the DAG pattern for this instruction
    +  list<Register> Uses = []; 
    +  list<Register> Defs = [];
    +  list<Predicate> Predicates = [];  // predicates turned into isel match code
    +  ... remainder not shown for space ...
    +}
    +
    +
    + +

    +A SelectionDAG node (SDNode) should contain an object +representing a target-specific instruction that is defined +in XXXInstrInfo.td. The instruction objects should represent +instructions from the architecture manual of the target machine (such as the +SPARC Architecture Manual for the SPARC target). +

    + +

    +A single instruction from the architecture manual is often modeled as multiple +target instructions, depending upon its operands. For example, a manual might +describe an add instruction that takes a register or an immediate operand. An +LLVM target could model this with two instructions named ADDri and +ADDrr. +

    + +

    +You should define a class for each instruction category and define each opcode +as a subclass of the category with appropriate parameters such as the fixed +binary encoding of opcodes and extended opcodes. You should map the register +bits to the bits of the instruction in which they are encoded (for the +JIT). Also you should specify how the instruction should be printed when the +automatic assembly printer is used. +

    + +

    +As is described in the SPARC Architecture Manual, Version 8, there are three +major 32-bit formats for instructions. Format 1 is only for the CALL +instruction. Format 2 is for branch on condition codes and SETHI (set +high bits of a register) instructions. Format 3 is for other instructions. +

    + +

    +Each of these formats has corresponding classes in SparcInstrFormat.td. +InstSP is a base class for other instruction classes. Additional base +classes are specified for more precise formats: for example +in SparcInstrFormat.td, F2_1 is for SETHI, +and F2_2 is for branches. There are three other base +classes: F3_1 for register/register operations, F3_2 for +register/immediate operations, and F3_3 for floating-point +operations. SparcInstrInfo.td also adds the base class Pseudo for +synthetic SPARC instructions. +

    + +

    +SparcInstrInfo.td largely consists of operand and instruction +definitions for the SPARC target. In SparcInstrInfo.td, the following +target description file entry, LDrr, defines the Load Integer +instruction for a Word (the LD SPARC opcode) from a memory address to a +register. The first parameter, the value 3 (11 in binary), is the +operation value for this category of operation. The second parameter +(000000 in binary) is the specific operation value +for LD/Load Word. The third parameter is the output destination, which +is a register operand and defined in the Register target description +file (IntRegs). +

    + +
    +
    def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
    +                 "ld [$addr], $dst",
    +                 [(set IntRegs:$dst, (load ADDRrr:$addr))]>;
    +
    +
    + +

    +The fourth parameter is the input source, which uses the address +operand MEMrr that is defined earlier in SparcInstrInfo.td: +

    + +
    +
    def MEMrr : Operand<i32> {
    +  let PrintMethod = "printMemOperand";
    +  let MIOperandInfo = (ops IntRegs, IntRegs);
    +}
    +
    +
    + +

    +The fifth parameter is a string that is used by the assembly printer and can be +left as an empty string until the assembly printer interface is implemented. The +sixth and final parameter is the pattern used to match the instruction during +the SelectionDAG Select Phase described in +(The LLVM +Target-Independent Code Generator). This parameter is detailed in the next +section, Instruction Selector. +

    + +

    +Instruction class definitions are not overloaded for different operand types, so +separate versions of instructions are needed for register, memory, or immediate +value operands. For example, to perform a Load Integer instruction for a Word +from an immediate operand to a register, the following instruction class is +defined: +

    + +
    +
    def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
    +                 "ld [$addr], $dst",
    +                 [(set IntRegs:$dst, (load ADDRri:$addr))]>;
    +
    +
    + +

    +Writing these definitions for so many similar instructions can involve a lot of +cut and paste. In td files, the multiclass directive enables the +creation of templates to define several instruction classes at once (using +the defm directive). For example in SparcInstrInfo.td, the +multiclass pattern F3_12 is defined to create 2 instruction +classes each time F3_12 is invoked: +

    + +
    +
    multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
    +  def rr  : F3_1 <2, Op3Val, 
    +                 (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
    +                 !strconcat(OpcStr, " $b, $c, $dst"),
    +                 [(set IntRegs:$dst, (OpNode IntRegs:$b, IntRegs:$c))]>;
    +  def ri  : F3_2 <2, Op3Val,
    +                 (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
    +                 !strconcat(OpcStr, " $b, $c, $dst"),
    +                 [(set IntRegs:$dst, (OpNode IntRegs:$b, simm13:$c))]>;
    +}
    +
    +
    + +

    +So when the defm directive is used for the XOR +and ADD instructions, as seen below, it creates four instruction +objects: XORrr, XORri, ADDrr, and ADDri. +

    + +
    +
    +defm XOR   : F3_12<"xor", 0b000011, xor>;
    +defm ADD   : F3_12<"add", 0b000000, add>;
    +
    +
    + +
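The naming rule that defm applies can be sketched in a few lines of stand-alone C++. This is only a toy model of the name concatenation, not of what TableGen does internally; expandDefm is a hypothetical helper, and the multiclass is reduced to the list of suffixes used by its inner def entries.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model: a multiclass is reduced to the suffixes of its inner defs;
// a defm prepends its own name to each suffix.
inline std::vector<std::string>
expandDefm(const std::string &Prefix,
           const std::vector<std::string> &Suffixes) {
  assert(!Prefix.empty() && "defm needs a name prefix");
  std::vector<std::string> Names;
  for (const std::string &S : Suffixes)
    Names.push_back(Prefix + S);
  return Names;
}
```

With the suffix list {"rr", "ri"} standing in for F3_12, calling expandDefm("XOR", ...) yields XORrr and XORri, and calling it with "ADD" yields ADDrr and ADDri, matching the four instruction objects named above.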

    +SparcInstrInfo.td also includes definitions for condition codes that +are referenced by branch instructions. The following definitions +in SparcInstrInfo.td indicate the bit location of the SPARC condition +code. For example, the 10th bit represents the 'greater than' +condition for integers, and the 22nd bit represents the 'greater +than' condition for floats. +

    + +
    +
    +def ICC_NE  : ICC_VAL< 9>;  // Not Equal
    +def ICC_E   : ICC_VAL< 1>;  // Equal
    +def ICC_G   : ICC_VAL<10>;  // Greater
    +...
    +def FCC_U   : FCC_VAL<23>;  // Unordered
    +def FCC_G   : FCC_VAL<22>;  // Greater
    +def FCC_UG  : FCC_VAL<21>;  // Unordered or Greater
    +...
    +
    +
    + +

    +(Note that Sparc.h also defines enums that correspond to the same SPARC +condition codes. Care must be taken to ensure the values in Sparc.h +correspond to the values in SparcInstrInfo.td. I.e., +SPCC::ICC_NE = 9, SPCC::FCC_U = 23 and so on.) +

    + +
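One way to keep hand-maintained enum values from drifting away from the .td values is to pin them with compile-time checks. The sketch below is an assumption about how such a check could look, not the contents of Sparc.h: the ToySPCC namespace and the describe helper are hypothetical, and only the six values that appear in the excerpt above are listed.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the condition-code values from the .td excerpt.
namespace ToySPCC {
enum CondCodes {
  ICC_NE = 9,   // Not Equal
  ICC_E = 1,    // Equal
  ICC_G = 10,   // Greater
  FCC_U = 23,   // Unordered
  FCC_G = 22,   // Greater
  FCC_UG = 21   // Unordered or Greater
};

// Compile-time reminders: if someone edits the enum, these fail to build
// until the .td file has been checked as well.
static_assert(ICC_NE == 9, "keep in sync with ICC_VAL<9> in the .td file");
static_assert(FCC_U == 23, "keep in sync with FCC_VAL<23> in the .td file");

// Returns the description used in the .td comments for each code.
inline std::string describe(CondCodes CC) {
  switch (CC) {
  case ICC_NE: return "Not Equal";
  case ICC_E:  return "Equal";
  case ICC_G:  return "Greater";
  case FCC_U:  return "Unordered";
  case FCC_G:  return "Greater";
  case FCC_UG: return "Unordered or Greater";
  }
  assert(false && "unknown condition code");
  return "unknown";
}
} // namespace ToySPCC
```

The static_assert lines do not read the .td file; they only make the expected numbers explicit in the C++ source, so a mismatch has to be introduced deliberately.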
    + + + + +
    + +

    +The code generator backend maps instruction operands to fields in the +instruction. Operands are assigned to unbound fields in the instruction in the +order they are defined. Fields are bound when they are assigned a value. For +example, the Sparc target defines the XNORrr instruction as +a F3_1 format instruction having three operands. +

    + +
    +
    +def XNORrr  : F3_1<2, 0b000111,
    +                   (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
    +                   "xnor $b, $c, $dst",
    +                   [(set IntRegs:$dst, (not (xor IntRegs:$b, IntRegs:$c)))]>;
    +
    +
    + +

    +The instruction templates in SparcInstrFormats.td show the base class +for F3_1 is InstSP. +

    + +
    +
    +class InstSP<dag outs, dag ins, string asmstr, list<dag> pattern> : Instruction {
    +  field bits<32> Inst;
    +  let Namespace = "SP";
    +  bits<2> op;
    +  let Inst{31-30} = op;       
    +  dag OutOperandList = outs;
    +  dag InOperandList = ins;
    +  let AsmString   = asmstr;
    +  let Pattern = pattern;
    +}
    +
    +
    + +

    InstSP leaves the op field unbound.

    + +
    +
    +class F3<dag outs, dag ins, string asmstr, list<dag> pattern>
    +    : InstSP<outs, ins, asmstr, pattern> {
    +  bits<5> rd;
    +  bits<6> op3;
    +  bits<5> rs1;
    +  let op{1} = 1;   // Op = 2 or 3
    +  let Inst{29-25} = rd;
    +  let Inst{24-19} = op3;
    +  let Inst{18-14} = rs1;
    +}
    +
    +
    + +

    +F3 binds the high bit of the op field and defines the rd, +op3, and rs1 fields. F3 format instructions will +bind their operands to the rd, op3, and rs1 fields. +

    + +
    +
    +class F3_1<bits<2> opVal, bits<6> op3val, dag outs, dag ins,
    +           string asmstr, list<dag> pattern> : F3<outs, ins, asmstr, pattern> {
    +  bits<8> asi = 0; // asi not currently used
    +  bits<5> rs2;
    +  let op         = opVal;
    +  let op3        = op3val;
    +  let Inst{13}   = 0;     // i field = 0
    +  let Inst{12-5} = asi;   // address space identifier
    +  let Inst{4-0}  = rs2;
    +}
    +
    +
    + +

    +F3_1 binds the op and op3 fields and defines the rs2 +field. F3_1 format instructions will bind the operands to the rd, +rs1, and rs2 fields. This results in the XNORrr +instruction binding $dst, $b, and $c operands to +the rd, rs1, and rs2 fields respectively. +

    + +
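The bit layout described by InstSP, F3, and F3_1 can be checked with a small stand-alone C++ sketch. It is not the code TableGen generates: encodeF3_1 is a hypothetical helper, and it takes register numbers directly rather than resolving $dst, $b, and $c operands.

```cpp
#include <cassert>
#include <cstdint>

// Packs the fields of an F3_1-style instruction word using the bit ranges
// from the InstSP, F3, and F3_1 excerpts above.
inline std::uint32_t encodeF3_1(unsigned Op, unsigned Op3, unsigned Rd,
                                unsigned Rs1, unsigned Rs2,
                                unsigned Asi = 0) {
  assert(Op < 4 && Op3 < 64 && "op is 2 bits, op3 is 6 bits");
  assert(Rd < 32 && Rs1 < 32 && Rs2 < 32 && "register fields are 5 bits");
  assert(Asi < 256 && "asi is 8 bits");
  std::uint32_t Inst = 0;
  Inst |= static_cast<std::uint32_t>(Op) << 30;   // Inst{31-30} = op
  Inst |= static_cast<std::uint32_t>(Rd) << 25;   // Inst{29-25} = rd
  Inst |= static_cast<std::uint32_t>(Op3) << 19;  // Inst{24-19} = op3
  Inst |= static_cast<std::uint32_t>(Rs1) << 14;  // Inst{18-14} = rs1
  // Inst{13} is the i field, fixed at 0 for this format.
  Inst |= static_cast<std::uint32_t>(Asi) << 5;   // Inst{12-5} = asi
  Inst |= static_cast<std::uint32_t>(Rs2);        // Inst{4-0} = rs2
  return Inst;
}
```

For an XNORrr-like instruction (op = 2, op3 = 0b000111) with rd = 1, rs1 = 2, and rs2 = 3, this produces the word 0x82388003. The register numbers here are arbitrary illustration values.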
    + + + + +
    + +

    +The final step is to hand code portions of XXXInstrInfo, which +implements the interface described in TargetInstrInfo.h. These +functions return 0 or a Boolean or they assert, unless +overridden. Here's a list of functions that are overridden for the SPARC +implementation in SparcInstrInfo.cpp: +

    + +
      +
    • isLoadFromStackSlot — If the specified machine instruction is + a direct load from a stack slot, return the register number of the + destination and the FrameIndex of the stack slot.
    • + +
    • isStoreToStackSlot — If the specified machine instruction is + a direct store to a stack slot, return the register number of the + source and the FrameIndex of the stack slot.
    • + +
    • copyPhysReg — Copy values between a pair of physical + registers.
    • + +
    • storeRegToStackSlot — Store a register value to a stack + slot.
    • + +
    • loadRegFromStackSlot — Load a register value from a stack + slot.
    • + +
    • storeRegToAddr — Store a register value to memory.
    • + +
    • loadRegFromAddr — Load a register value from memory.
    • + +
    • foldMemoryOperand — Attempt to fold a load or store for the + specified operand(s) into the instruction that uses it.
    • +
    + +
    + + + +
    + +

    +Performance can be improved by combining instructions or by eliminating +instructions that are never reached. The AnalyzeBranch method +in XXXInstrInfo may be implemented to examine conditional instructions +and remove unnecessary instructions. AnalyzeBranch looks at the end of +a machine basic block (MBB) for opportunities for improvement, such as branch +folding and if conversion. The BranchFolder and IfConverter +machine function passes (see the source files BranchFolding.cpp and +IfConversion.cpp in the lib/CodeGen directory) call +AnalyzeBranch to improve the control flow graph that represents the +instructions. +

    + +

    +Several implementations of AnalyzeBranch (for ARM, Alpha, and X86) can +be examined as models for your own AnalyzeBranch implementation. Since +SPARC does not implement a useful AnalyzeBranch, the ARM target +implementation is shown below. +

    + +

    AnalyzeBranch returns a Boolean value and takes four parameters:

    + +
      +
    • MachineBasicBlock &MBB — The incoming block to be + examined.
    • + +
    • MachineBasicBlock *&TBB — A destination block that is + returned. For a conditional branch that evaluates to true, TBB is + the destination.
    • + +
    • MachineBasicBlock *&FBB — For a conditional branch that + evaluates to false, FBB is returned as the destination.
    • + +
    • std::vector<MachineOperand> &Cond — List of + operands to evaluate a condition for a conditional branch.
    • +
    + +

    +In the simplest case, if a block ends without a branch, then it falls through to +the successor block. No destination blocks are specified for either TBB +or FBB, so both parameters return NULL. The start of +the AnalyzeBranch (see code below for the ARM target) shows the +function parameters and the code for the simplest case. +

    + +
    +
    bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
    +        MachineBasicBlock *&TBB, MachineBasicBlock *&FBB,
    +        std::vector<MachineOperand> &Cond) const
    +{
    +  MachineBasicBlock::iterator I = MBB.end();
    +  if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
    +    return false;
    +
    +
    + +

    +If a block ends with a single unconditional branch instruction, then +AnalyzeBranch (shown below) should return the destination of that +branch in the TBB parameter. +

    + +
    +
    +  if (LastOpc == ARM::B || LastOpc == ARM::tB) {
    +    TBB = LastInst->getOperand(0).getMBB();
    +    return false;
    +  }
    +
    +
    + +

    +If a block ends with two unconditional branches, then the second branch is never +reached. In that situation, as shown below, remove the last branch instruction +and return the penultimate branch in the TBB parameter. +

    + +
    +
    +  if ((SecondLastOpc == ARM::B || SecondLastOpc==ARM::tB) &&
    +      (LastOpc == ARM::B || LastOpc == ARM::tB)) {
    +    TBB = SecondLastInst->getOperand(0).getMBB();
    +    I = LastInst;
    +    I->eraseFromParent();
    +    return false;
    +  }
    +
    +
    + +

    +A block may end with a single conditional branch instruction that falls through +to the successor block if the condition evaluates to false. In that case, +AnalyzeBranch (shown below) should return the destination of that +conditional branch in the TBB parameter and a list of operands in +the Cond parameter to evaluate the condition. +

    + +
    +
    +  if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
    +    // Block ends with fall-through condbranch.
    +    TBB = LastInst->getOperand(0).getMBB();
    +    Cond.push_back(LastInst->getOperand(1));
    +    Cond.push_back(LastInst->getOperand(2));
    +    return false;
    +  }
    +
    +
    + +

    +If a block ends with both a conditional branch and an ensuing unconditional +branch, then AnalyzeBranch (shown below) should return the conditional +branch destination (assuming it corresponds to a conditional evaluation of +'true') in the TBB parameter and the unconditional branch +destination in the FBB (corresponding to a conditional evaluation of +'false'). A list of operands to evaluate the condition should be +returned in the Cond parameter. +

    + +
    +
    +  unsigned SecondLastOpc = SecondLastInst->getOpcode();
    +
    +  if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
    +      (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
    +    TBB =  SecondLastInst->getOperand(0).getMBB();
    +    Cond.push_back(SecondLastInst->getOperand(1));
    +    Cond.push_back(SecondLastInst->getOperand(2));
    +    FBB = LastInst->getOperand(0).getMBB();
    +    return false;
    +  }
    +
    +
    + +

    +For the last two cases (ending with a single conditional branch or ending with +one conditional and one unconditional branch), the operands returned in +the Cond parameter can be passed to methods of other instructions to +create new branches or perform other operations. An implementation +of AnalyzeBranch requires the helper methods RemoveBranch +and InsertBranch to manage subsequent operations. +

    + +

    +AnalyzeBranch should return false indicating success in most circumstances. +AnalyzeBranch should only return true when the method is stumped about what to +do, for example, if a block has three terminating branches. AnalyzeBranch may +return true if it encounters a terminator it cannot handle, such as an indirect +branch. +

    + +
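The cases above can be pulled together in one stand-alone C++ sketch. Everything in it is a hypothetical, heavily simplified stand-in (ToyBlock, ToyInst, ToyOpcode, and a single int in place of the condition operands); it is not the ARM implementation, and unlike the excerpts it resets TBB and FBB on entry. It only shows the shape of the decision logic and the convention that false means success.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the LLVM machine-code classes.
enum class ToyOpcode { Other, Branch, CondBranch, IndirectBranch };

struct ToyBlock;

struct ToyInst {
  ToyOpcode Opc;
  ToyBlock *Target; // branch destination; null for non-branches
  int CondOperand;  // stand-in for the condition operands
};

struct ToyBlock {
  std::vector<ToyInst> Insts;
};

inline bool isTerminator(const ToyInst &I) {
  return I.Opc != ToyOpcode::Other;
}

// Returns false on success and true when the block cannot be analyzed.
inline bool analyzeBranch(ToyBlock &MBB, ToyBlock *&TBB, ToyBlock *&FBB,
                          std::vector<int> &Cond) {
  assert(Cond.empty() && "caller is expected to pass an empty Cond list");
  TBB = nullptr;
  FBB = nullptr;

  // Count the terminators at the end of the block.
  std::size_t N = MBB.Insts.size(), NumTerm = 0;
  while (NumTerm < N && isTerminator(MBB.Insts[N - 1 - NumTerm]))
    ++NumTerm;

  if (NumTerm == 0)
    return false; // Falls through; TBB and FBB stay null.
  if (NumTerm > 2)
    return true;  // Three or more terminators: give up.

  const ToyInst Last = MBB.Insts[N - 1];
  if (NumTerm == 1) {
    if (Last.Opc == ToyOpcode::Branch) {
      TBB = Last.Target;
      return false;
    }
    if (Last.Opc == ToyOpcode::CondBranch) {
      TBB = Last.Target;
      Cond.push_back(Last.CondOperand);
      return false;
    }
    return true; // For example, an indirect branch.
  }

  const ToyInst SecondLast = MBB.Insts[N - 2];
  if (SecondLast.Opc == ToyOpcode::CondBranch &&
      Last.Opc == ToyOpcode::Branch) {
    TBB = SecondLast.Target;
    Cond.push_back(SecondLast.CondOperand);
    FBB = Last.Target;
    return false;
  }
  if (SecondLast.Opc == ToyOpcode::Branch && Last.Opc == ToyOpcode::Branch) {
    TBB = SecondLast.Target;
    MBB.Insts.pop_back(); // The second branch is never reached.
    return false;
  }
  return true;
}
```

A caller would typically test the return value first and look at TBB, FBB, and Cond only when it is false.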
    + + + + + +
    + +

    +LLVM uses a SelectionDAG to represent LLVM IR instructions, and nodes +of the SelectionDAG ideally represent native target +instructions. During code generation, instruction selection passes are performed +to convert non-native DAG instructions into native target-specific +instructions. The pass described in XXXISelDAGToDAG.cpp is used to +match patterns and perform DAG-to-DAG instruction selection. Optionally, a pass +may be defined (in XXXBranchSelector.cpp) to perform similar DAG-to-DAG +operations for branch instructions. Later, the code in +XXXISelLowering.cpp replaces or removes operations and data types not +supported natively (legalizes) in a SelectionDAG. +

    + +

    +TableGen generates code for instruction selection using the following target +description input files: +

    + +
      +
    • XXXInstrInfo.td — Contains definitions of instructions in a + target-specific instruction set, generates XXXGenDAGISel.inc, which + is included in XXXISelDAGToDAG.cpp.
    • + +
    • XXXCallingConv.td — Contains the calling and return value + conventions for the target architecture, and it generates + XXXGenCallingConv.inc, which is included in + XXXISelLowering.cpp.
    • +
    + +

    +The implementation of an instruction selection pass must include a header that +declares the FunctionPass class or a subclass of FunctionPass. In +XXXTargetMachine.cpp, a Pass Manager (PM) should add each instruction +selection pass into the queue of passes to run. +

    + +

    +The LLVM static compiler (llc) is an excellent tool for visualizing the +contents of DAGs. To display the SelectionDAG before or after specific +processing phases, use the command line options for llc, described +at +SelectionDAG Instruction Selection Process. +

    + +

    +To describe instruction selector behavior, you should add patterns for lowering +LLVM code into a SelectionDAG as the last parameter of the instruction +definitions in XXXInstrInfo.td. For example, in +SparcInstrInfo.td, this entry defines a register store operation, and +the last parameter describes a pattern with the store DAG operator. +

    + +
    +
    +def STrr  : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
    +                 "st $src, [$addr]", [(store IntRegs:$src, ADDRrr:$addr)]>;
    +
    +
    + +

    +ADDRrr is a memory mode that is also defined in +SparcInstrInfo.td: +

    + +
    +
    +def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;
    +
    +
    + +

    +The definition of ADDRrr refers to SelectADDRrr, which is a +function defined in an implementation of the Instruction Selector (such +as SparcISelDAGToDAG.cpp). +

    + +

    +In lib/Target/TargetSelectionDAG.td, the DAG operator for store is +defined below: +

    + +
    +
    +def store : PatFrag<(ops node:$val, node:$ptr),
    +                    (st node:$val, node:$ptr), [{
    +  if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
    +    return !ST->isTruncatingStore() && 
    +           ST->getAddressingMode() == ISD::UNINDEXED;
    +  return false;
    +}]>;
    +
    +
    + +

    +XXXInstrInfo.td also generates (in XXXGenDAGISel.inc) the +SelectCode method that is used to call the appropriate processing +method for an instruction. In this example, SelectCode +calls Select_ISD_STORE for the ISD::STORE opcode. +

    + +
    +
    +SDNode *SelectCode(SDValue N) {
    +  ... 
    +  MVT::ValueType NVT = N.getNode()->getValueType(0);
    +  switch (N.getOpcode()) {
    +  case ISD::STORE: {
    +    switch (NVT) {
    +    default:
    +      return Select_ISD_STORE(N);
    +      break;
    +    }
    +    break;
    +  }
    +  ...
    +
    +
    + +

    +The pattern for STrr is matched, so elsewhere in +XXXGenDAGISel.inc, code for STrr is created for +Select_ISD_STORE. The Emit_22 method is also generated +in XXXGenDAGISel.inc to complete the processing of this +instruction. +

    + +
    +
    +SDNode *Select_ISD_STORE(const SDValue &N) {
    +  SDValue Chain = N.getOperand(0);
    +  if (Predicate_store(N.getNode())) {
    +    SDValue N1 = N.getOperand(1);
    +    SDValue N2 = N.getOperand(2);
    +    SDValue CPTmp0;
    +    SDValue CPTmp1;
    +
    +    // Pattern: (st:void IntRegs:i32:$src, 
    +    //           ADDRrr:i32:$addr)<<P:Predicate_store>>
    +    // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
    +    // Pattern complexity = 13  cost = 1  size = 0
    +    if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
    +        N1.getNode()->getValueType(0) == MVT::i32 &&
    +        N2.getNode()->getValueType(0) == MVT::i32) {
    +      return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
    +    }
    +...
    +
    +
    + +
    + + + + +
    + +

    +The Legalize phase converts a DAG to use types and operations that are natively +supported by the target. For natively unsupported types and operations, you need +to add code to the target-specific XXXTargetLowering implementation to convert +unsupported types and operations to supported ones. +

    + +

    +In the constructor for the XXXTargetLowering class, first use the +addRegisterClass method to specify which types are supported and which +register classes are associated with them. The code for the register classes is +generated by TableGen from XXXRegisterInfo.td and placed +in XXXGenRegisterInfo.h.inc. For example, the implementation of the +constructor for the SparcTargetLowering class (in +SparcISelLowering.cpp) starts with the following code: +

    + +
    +
    +addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
    +addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
    +addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass); 
    +
    +
    + +

    +You should examine the node types in the ISD namespace +(include/llvm/CodeGen/SelectionDAGNodes.h) and determine which +operations the target natively supports. For operations that do not have +native support, add a callback to the constructor for the XXXTargetLowering +class, so the instruction selection process knows what to do. The TargetLowering +class callback methods (declared in llvm/Target/TargetLowering.h) are: +

    + +
      +
    • setOperationAction — General operation.
    • + +
    • setLoadExtAction — Load with extension.
    • + +
    • setTruncStoreAction — Truncating store.
    • + +
    • setIndexedLoadAction — Indexed load.
    • + +
    • setIndexedStoreAction — Indexed store.
    • + +
    • setConvertAction — Type conversion.
    • + +
    • setCondCodeAction — Support for a given condition code.
    • +
    + +

    +Note: on older releases, setLoadXAction is used instead +of setLoadExtAction. Also, on older releases, +setCondCodeAction may not be supported. Examine your release +to see what methods are specifically supported. +

    + +

    +These callbacks are used to determine whether an operation works +with a specified type (or types). In all cases, the third parameter is +a LegalizeAction enum value: Promote, Expand, +Custom, or Legal. SparcISelLowering.cpp +contains examples of all four LegalizeAction values. +

    + +
    + + +
    + Promote +
    + +
    + +

    +For an operation without native support for a given type, the specified type may +be promoted to a larger type that is supported. For example, SPARC does not +support a sign-extending load for Boolean values (i1 type), so +in SparcISelLowering.cpp the third parameter below, Promote, +changes i1 type values to a larger type before loading. +

    + +
    +
    +setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
    +
    +
    + +
    + + +
    + Expand +
    + +
    + +

    +For a type without native support, a value may need to be broken down further, +rather than promoted. For an operation without native support, a combination of +other operations may be used to similar effect. In SPARC, the floating-point +sine and cosine trig operations are supported by expansion to other operations, +as indicated by the third parameter, Expand, to +setOperationAction: +

    + +
    +
    +setOperationAction(ISD::FSIN, MVT::f32, Expand);
    +setOperationAction(ISD::FCOS, MVT::f32, Expand);
    +
    +
    + +
    + + +
    + Custom +
    + +
    + +

    +For some operations, simple type promotion or operation expansion may be +insufficient. In some cases, a special intrinsic function must be implemented. +

    + +

    +For example, a constant value may require special treatment, or an operation may +require spilling and restoring registers in the stack and working with register +allocators. +

    + +

    +As seen in SparcISelLowering.cpp code below, to perform a type +conversion from a floating point value to a signed integer, first the +setOperationAction should be called with Custom as the third +parameter: +

    + +
    +
    +setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
    +
    +
    + +

    +In the LowerOperation method, for each Custom operation, a +case statement should be added to indicate what function to call. In the +following code, an FP_TO_SINT opcode will call +the LowerFP_TO_SINT method: +

    + +
    +
    +SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
    +  switch (Op.getOpcode()) {
    +  case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
    +  ...
    +  }
    +}
    +
    +
    + +

    +Finally, the LowerFP_TO_SINT method is implemented, using an FP +register to convert the floating-point value to an integer. +

    + +
    +
    +static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &DAG) {
    +  assert(Op.getValueType() == MVT::i32);
    +  Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
    +  return DAG.getNode(ISD::BIT_CONVERT, MVT::i32, Op);
    +}
    +
    +
    + +
    + + +
    + Legal +
    + +
    + +

    +The Legal LegalizeAction enum value simply indicates that an +operation is natively supported. Legal represents the default +condition, so it is rarely used. In SparcISelLowering.cpp, the action +for CTPOP (an operation to count the bits set in an integer) is +natively supported only for SPARC v9. The following code enables +the Expand conversion technique for non-v9 SPARC implementations. +

    + +
    +
    +setOperationAction(ISD::CTPOP, MVT::i32, Expand);
    +...
    +if (TM.getSubtarget<SparcSubtarget>().isV9())
    +  setOperationAction(ISD::CTPOP, MVT::i32, Legal);
    +
    +
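The four actions described above can be pictured as entries in a lookup table keyed by operation and type, with Legal as the value for anything never mentioned. The following is a minimal, self-contained sketch of that idea only. The names ToyOpcode, ToyType, ToyAction, ToyLowering, and makeToySparcLowering are hypothetical and are not part of LLVM; the real TargetLowering class stores this information differently.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Toy stand-ins for ISD opcodes and MVT types (hypothetical, not LLVM's).
enum ToyOpcode { TOY_FSIN, TOY_CTPOP, TOY_FP_TO_SINT, TOY_ADD };
enum ToyType { TOY_i1, TOY_i32, TOY_f32 };
enum ToyAction { Legal, Promote, Expand, Custom };

class ToyLowering {
  typedef std::map<std::pair<ToyOpcode, ToyType>, ToyAction> ActionMap;
  ActionMap Actions;

public:
  // Record how the legalizer should treat the pair (Op, VT).
  void setOperationAction(ToyOpcode Op, ToyType VT, ToyAction A) {
    Actions[std::make_pair(Op, VT)] = A;
  }

  // Any pair that was never mentioned is treated as Legal, the default.
  ToyAction getOperationAction(ToyOpcode Op, ToyType VT) const {
    ActionMap::const_iterator I = Actions.find(std::make_pair(Op, VT));
    return I == Actions.end() ? Legal : I->second;
  }
};

// Mirrors the CTPOP example: Expand first, then Legal if the subtarget is V9.
inline ToyLowering makeToySparcLowering(bool IsV9) {
  ToyLowering TL;
  TL.setOperationAction(TOY_FSIN, TOY_f32, Expand);
  TL.setOperationAction(TOY_FP_TO_SINT, TOY_i32, Custom);
  TL.setOperationAction(TOY_CTPOP, TOY_i32, Expand);
  if (IsV9)
    TL.setOperationAction(TOY_CTPOP, TOY_i32, Legal);
  return TL;
}
```

In this model a later call for the same pair simply overwrites the earlier one, which is why the CTPOP example can set Expand unconditionally and then switch to Legal for V9.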
    + +
    + + + + +
    + +

    +To support target-specific calling conventions, XXXCallingConv.td +uses interfaces (such as CCIfType and CCAssignToReg) that are defined in +lib/Target/TargetCallingConv.td. TableGen can take the target +descriptor file XXXCallingConv.td and generate the header +file XXXGenCallingConv.inc, which is typically included +in XXXISelLowering.cpp. You can use the interfaces in +TargetCallingConv.td to specify: +

    + +
      +
    • The order of parameter allocation.
    • + +
    • Where parameters and return values are placed (that is, on the stack or in + registers).
    • + +
    • Which registers may be used.
    • + +
    • Whether the caller or callee unwinds the stack.
    • +
    + +

    +The following example demonstrates the use of the CCIfType and +CCAssignToReg interfaces. If the CCIfType predicate is true +(that is, if the current argument is of type f32 or f64), then +the action is performed. In this case, the CCAssignToReg action assigns +the argument value to the first available register: either R0 +or R1. +

    + +
    +
    +CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>
    +
    +
    + +

    +SparcCallingConv.td contains definitions for a target-specific +return-value calling convention (RetCC_Sparc32) and a basic 32-bit C calling +convention (CC_Sparc32). The definition of RetCC_Sparc32 +(shown below) indicates which registers are used for specified scalar return +types. A single-precision float is returned to register F0, and a +double-precision float goes to register D0. A 32-bit integer is +returned in register I0 or I1. +

    + +
    +
    +def RetCC_Sparc32 : CallingConv<[
    +  CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
    +  CCIfType<[f32], CCAssignToReg<[F0]>>,
    +  CCIfType<[f64], CCAssignToReg<[D0]>>
    +]>;
    +
    +
    + +

    +The definition of CC_Sparc32 in SparcCallingConv.td introduces +CCAssignToStack, which assigns the value to a stack slot with the +specified size and alignment. In the example below, the first parameter, 4, +indicates the size of the slot, and the second parameter, also 4, indicates the +stack alignment along 4-byte units. (Special cases: if size is zero, then the +ABI size is used; if alignment is zero, then the ABI alignment is used.) +

    + +
    +
    +def CC_Sparc32 : CallingConv<[
    +  // All arguments get passed in integer registers if there is space.
    +  CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
    +  CCAssignToStack<4, 4>
    +]>;
    +
    +
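To make the "first available register, otherwise a stack slot" behavior concrete, here is a small sketch that models it outside of LLVM. Everything in it (ToyCCState, ToyLoc, assignToReg, assignToStack, analyzeToySparc32) is a hypothetical stand-in. It ignores argument types, does not model the zero-size and zero-alignment special cases, and starts stack offsets at zero, which a real ABI may not do.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the state a calling-convention analysis carries.
struct ToyCCState {
  std::vector<bool> RegUsed; // one flag per register in the list
  unsigned StackOffset;      // next free byte in the argument area
  explicit ToyCCState(std::size_t NumRegs)
      : RegUsed(NumRegs, false), StackOffset(0) {}
};

struct ToyLoc {
  bool InReg;
  std::string Reg; // meaningful when InReg is true
  unsigned Offset; // meaningful when InReg is false
};

// Like CCAssignToReg: take the first unused register; fail if none is left.
inline bool assignToReg(ToyCCState &S, const std::vector<std::string> &Regs,
                        ToyLoc &Out) {
  for (std::size_t i = 0; i < Regs.size(); ++i) {
    if (!S.RegUsed[i]) {
      S.RegUsed[i] = true;
      Out.InReg = true;
      Out.Reg = Regs[i];
      Out.Offset = 0;
      return true;
    }
  }
  return false;
}

// Like CCAssignToStack<Size, Align>: round up to Align, then reserve Size.
inline void assignToStack(ToyCCState &S, unsigned Size, unsigned Align,
                          ToyLoc &Out) {
  assert(Align != 0 && "the 'use ABI alignment' case is not modeled here");
  S.StackOffset = (S.StackOffset + Align - 1) / Align * Align;
  Out.InReg = false;
  Out.Reg.clear();
  Out.Offset = S.StackOffset;
  S.StackOffset += Size;
}

// Toy version of CC_Sparc32: try I0..I5, then fall back to 4-byte slots.
inline std::vector<ToyLoc> analyzeToySparc32(unsigned NumArgs) {
  static const char *const Names[] = {"I0", "I1", "I2", "I3", "I4", "I5"};
  std::vector<std::string> Regs(Names, Names + 6);
  ToyCCState S(Regs.size());
  std::vector<ToyLoc> Locs;
  for (unsigned i = 0; i < NumArgs; ++i) {
    ToyLoc L;
    if (!assignToReg(S, Regs, L))
      assignToStack(S, 4, 4, L);
    Locs.push_back(L);
  }
  return Locs;
}
```

With eight arguments, this model places the first six in I0 through I5 and the remaining two in consecutive 4-byte stack slots.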
    + +

    +CCDelegateTo is another commonly used interface, which tries to find a +specified sub-calling convention, and, if a match is found, it is invoked. In +the following example (in X86CallingConv.td), the definition of +RetCC_X86_32_C ends with CCDelegateTo. After the current value +is assigned to the register ST0 or ST1, +the RetCC_X86Common is invoked. +

    + +
    +
    +def RetCC_X86_32_C : CallingConv<[
    +  CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
    +  CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
    +  CCDelegateTo<RetCC_X86Common>
    +]>;
    +
    +
    + +

    +CCIfCC is an interface that attempts to match the given name to the +current calling convention. If the name identifies the current calling +convention, then a specified action is invoked. In the following example (in +X86CallingConv.td), if the Fast calling convention is in use, +then RetCC_X86_32_Fast is invoked. If the SSECall calling +convention is in use, then RetCC_X86_32_SSE is invoked. +

    + +
    +
    +def RetCC_X86_32 : CallingConv<[
    +  CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
    +  CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
    +  CCDelegateTo<RetCC_X86_32_C>
    +]>;
    +
    +
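The interplay of CCIfCC and CCDelegateTo reads naturally as a chain of guarded calls, where the last entry is an unconditional fallback. The sketch below models only that control flow. The enum ToyCC and the retCC_* functions are hypothetical names chosen for illustration, and the strings they produce merely record which convention handled the value.

```cpp
#include <cassert>
#include <string>

// Hypothetical calling-convention IDs (the real ones live in CallingConv).
enum ToyCC { ToyC, ToyFast, ToySSECall };

// Each "convention" returns true once it has handled the value.
inline bool retCC_Common(std::string &Where) {
  Where = "common";
  return true;
}

inline bool retCC_C(std::string &Where) {
  // Rules specific to the C convention would be tried here first.
  // With none matching, delegate to the common convention.
  return retCC_Common(Where);
}

inline bool retCC_Fast(std::string &Where) {
  Where = "fast";
  return true;
}

inline bool retCC_SSE(std::string &Where) {
  Where = "sse";
  return true;
}

// Toy RetCC_X86_32: guards are checked in order, then an unconditional
// delegation to the C convention acts as the fallback.
inline bool retCC_Toy32(ToyCC CC, std::string &Where) {
  if (CC == ToyFast && retCC_Fast(Where))
    return true;
  if (CC == ToySSECall && retCC_SSE(Where))
    return true;
  return retCC_C(Where);
}
```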
    + +

    Other calling convention interfaces include:

    + +
      +
    • CCIf <predicate, action> — If the predicate matches, + apply the action.
    • + +
    • CCIfInReg <action> — If the argument is marked with the + 'inreg' attribute, then apply the action.
    • + +
    • CCIfNest <action> — If the argument is marked with the + 'nest' attribute, then apply the action.
    • + +
    • CCIfNotVarArg <action> — If the current function does + not take a variable number of arguments, apply the action.
    • + +
    • CCAssignToRegWithShadow <registerList, shadowList> — + similar to CCAssignToReg, but with a shadow list of registers.
    • + +
    • CCPassByVal <size, align> — Assign value to a stack + slot with the minimum specified size and alignment.
    • + +
    • CCPromoteToType <type> — Promote the current value to + the specified type.
    • + +
    • CallingConv <[actions]> — Define each calling + convention that is supported.
    • +
    + +
    + + + + + +
    + +

    +During the code emission stage, the code generator may utilize an LLVM pass to +produce assembly output. To do this, you want to implement the code for a +printer that converts LLVM IR to a GAS-format assembly language for your target +machine, using the following steps: +

    + +
      +
    • Define all the assembly strings for your target, adding them to the + instructions defined in the XXXInstrInfo.td file. + (See Instruction Set.) TableGen will produce + an output file (XXXGenAsmWriter.inc) with an implementation of + the printInstruction method for the XXXAsmPrinter class.
    • + +
    • Write XXXTargetAsmInfo.h, which contains the bare-bones declaration + of the XXXTargetAsmInfo class (a subclass + of TargetAsmInfo).
    • + +
    • Write XXXTargetAsmInfo.cpp, which contains target-specific values + for TargetAsmInfo properties and sometimes new implementations for + methods.
    • + +
    • Write XXXAsmPrinter.cpp, which implements the AsmPrinter + class that performs the LLVM-to-assembly conversion.
    • +
    + +

    +The code in XXXTargetAsmInfo.h is usually a trivial declaration of the +XXXTargetAsmInfo class for use in XXXTargetAsmInfo.cpp. +Similarly, XXXTargetAsmInfo.cpp usually has a few declarations of +XXXTargetAsmInfo replacement values that override the default values +in TargetAsmInfo.cpp. For example in SparcTargetAsmInfo.cpp: +

    + +
    +
    +SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
    +  Data16bitsDirective = "\t.half\t";
    +  Data32bitsDirective = "\t.word\t";
    +  Data64bitsDirective = 0;  // .xword is only supported by V9.
    +  ZeroDirective = "\t.skip\t";
    +  CommentString = "!";
    +  ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
    +}
    +
    +
    + +

    +The X86 assembly printer implementation (X86TargetAsmInfo) is an +example where the target-specific TargetAsmInfo class uses an +overridden method: ExpandInlineAsm. +

    + +

    +A target-specific implementation of AsmPrinter is written in +XXXAsmPrinter.cpp, which implements the AsmPrinter class that +converts the LLVM to printable assembly. The implementation must include the +following headers that have declarations for the AsmPrinter and +MachineFunctionPass classes. The MachineFunctionPass is a +subclass of FunctionPass. +

    + +
    +
    +#include "llvm/CodeGen/AsmPrinter.h"
    +#include "llvm/CodeGen/MachineFunctionPass.h" 
    +
    +
    + +

    +As a FunctionPass, AsmPrinter first +calls doInitialization to set up the AsmPrinter. In +SparcAsmPrinter, a Mangler object is instantiated to process +variable names. +

    + +

    +In XXXAsmPrinter.cpp, the runOnMachineFunction method +(declared in MachineFunctionPass) must be implemented +for XXXAsmPrinter. In MachineFunctionPass, +the runOnFunction method invokes runOnMachineFunction. +Target-specific implementations of runOnMachineFunction differ, but +generally do the following to process each machine function: +

    + +
      +
    • Call SetupMachineFunction to perform initialization.
    • + +
    • Call EmitConstantPool to print out (to the output stream) constants + which have been spilled to memory.
    • + +
    • Call EmitJumpTableInfo to print out jump tables used by the current + function.
    • + +
    • Print out the label for the current function.
    • + +
    • Print out the code for the function, including basic block labels and the + assembly for the instruction (using printInstruction)
    • +
    + +

    +The XXXAsmPrinter implementation must also include the code generated +by TableGen that is output in the XXXGenAsmWriter.inc file. The code +in XXXGenAsmWriter.inc contains an implementation of the +printInstruction method that may call these methods: +

    + +
      +
    • printOperand
    • + +
    • printMemOperand
    • + +
    • printCCOperand (for conditional statements)
    • + +
    • printDataDirective
    • + +
    • printDeclare
    • + +
    • printImplicitDef
    • + +
    • printInlineAsm
    • +
    + +

    +The implementations of printDeclare, printImplicitDef, +printInlineAsm, and printLabel in AsmPrinter.cpp are +generally adequate for printing assembly and do not need to be +overridden. +

    + +

    +The printOperand method is implemented with a long switch/case +statement for the type of operand: register, immediate, basic block, external +symbol, global address, constant pool index, or jump table index. For an +instruction with a memory address operand, the printMemOperand method +should be implemented to generate the proper output. Similarly, +printCCOperand should be used to print a conditional operand. +
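As a rough illustration of the switch over operand kinds, the sketch below prints a simplified operand to a string. ToyOperand, printToyOperand, and printToyMemOperand are hypothetical, and so are the output spellings (such as %r3 and .LCPI0); a real target chooses its own syntax and works with MachineOperand rather than this struct.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical operand model; the real class is MachineOperand.
struct ToyOperand {
  enum Kind {
    Register, Immediate, BasicBlock,
    ExternalSymbol, ConstantPoolIndex, JumpTableIndex
  };
  Kind K;
  long Value;       // register number, immediate, or index
  std::string Name; // symbol or block name, when relevant
};

// One case per operand kind, as in a typical printOperand.
inline std::string printToyOperand(const ToyOperand &MO) {
  std::ostringstream OS;
  switch (MO.K) {
  case ToyOperand::Register:          OS << "%r" << MO.Value; break;
  case ToyOperand::Immediate:         OS << MO.Value; break;
  case ToyOperand::BasicBlock:        OS << ".LBB" << MO.Name; break;
  case ToyOperand::ExternalSymbol:    OS << MO.Name; break;
  case ToyOperand::ConstantPoolIndex: OS << ".LCPI" << MO.Value; break;
  case ToyOperand::JumpTableIndex:    OS << ".LJTI" << MO.Value; break;
  }
  return OS.str();
}

// A memory operand here is a base register plus a displacement,
// printed as [base+disp]; a zero displacement is omitted.
inline std::string printToyMemOperand(const ToyOperand &Base,
                                      const ToyOperand &Disp) {
  assert(Base.K == ToyOperand::Register && "base must be a register here");
  std::string S = "[" + printToyOperand(Base);
  if (!(Disp.K == ToyOperand::Immediate && Disp.Value == 0))
    S += "+" + printToyOperand(Disp);
  return S + "]";
}
```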

    + +

    doFinalization should be overridden in XXXAsmPrinter, and +it should be called to shut down the assembly printer. During +doFinalization, global variables and constants are printed to +output. +

    + +
    + + + + + +
    + +

    +Subtarget support is used to inform the code generation process of instruction +set variations for a given chip set. For example, the LLVM SPARC implementation +provided covers three major versions of the SPARC microprocessor architecture: +Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a 64-bit +architecture), and the UltraSPARC architecture. V8 has 16 double-precision +floating-point registers that are also usable as either 32 single-precision or 8 +quad-precision registers. V8 is also purely big-endian. V9 has 32 +double-precision floating-point registers that are also usable as 16 +quad-precision registers, but cannot be used as single-precision registers. The +UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set +extensions. +

    + +

    +If subtarget support is needed, you should implement a target-specific +XXXSubtarget class for your architecture. This class should process the +command-line options -mcpu= and -mattr=. +

    + +

    +TableGen uses definitions in the Target.td and Sparc.td files +to generate code in SparcGenSubtarget.inc. In Target.td, shown +below, the SubtargetFeature interface is defined. The first 4 string +parameters of the SubtargetFeature interface are a feature name, an +attribute set by the feature, the value of the attribute, and a description of +the feature. (The fifth parameter is a list of features whose presence is +implied, and its default value is an empty array.) +

    + +
    +
    +class SubtargetFeature<string n, string a,  string v, string d,
    +                       list<SubtargetFeature> i = []> {
    +  string Name = n;
    +  string Attribute = a;
    +  string Value = v;
    +  string Desc = d;
    +  list<SubtargetFeature> Implies = i;
    +}
    +
    +
    + +

    +In the Sparc.td file, the SubtargetFeature is used to define the +following features. +

    + +
    +
    +def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
    +                     "Enable SPARC-V9 instructions">;
    +def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8", 
    +                     "V8DeprecatedInsts", "true",
    +                     "Enable deprecated V8 instructions in V9 mode">;
    +def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
    +                     "Enable UltraSPARC Visual Instruction Set extensions">;
    +
    +
    + +

    +Elsewhere in Sparc.td, the Proc class is defined and then is used to +define particular SPARC processor subtypes that may have the previously +described features. +

    + +
    +
    +class Proc<string Name, list<SubtargetFeature> Features>
    +  : Processor<Name, NoItineraries, Features>;
    + 
    +def : Proc<"generic",         []>;
    +def : Proc<"v8",              []>;
    +def : Proc<"supersparc",      []>;
    +def : Proc<"sparclite",       []>;
    +def : Proc<"f934",            []>;
    +def : Proc<"hypersparc",      []>;
    +def : Proc<"sparclite86x",    []>;
    +def : Proc<"sparclet",        []>;
    +def : Proc<"tsc701",          []>;
    +def : Proc<"v9",              [FeatureV9]>;
    +def : Proc<"ultrasparc",      [FeatureV9, FeatureV8Deprecated]>;
    +def : Proc<"ultrasparc3",     [FeatureV9, FeatureV8Deprecated]>;
    +def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;
    +
    +
    + +

    +From the Target.td and Sparc.td files, the resulting +SparcGenSubtarget.inc specifies enum values to identify the features, arrays of +constants to represent the CPU features and CPU subtypes, and the +ParseSubtargetFeatures method that parses the features string that sets +specified subtarget options. The generated SparcGenSubtarget.inc file +should be included in SparcSubtarget.cpp. The target-specific +implementation of the XXXSubtarget constructor should follow this pseudocode: +

    + +
    +
    +XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
    +  // Set the default features
    +  // Determine default and user specified characteristics of the CPU
    +  // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
    +  // Perform any additional operations
    +}
    +
    +
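The pseudocode above can be fleshed out as a toy, assuming the features string is a comma-separated list of names prefixed with + or - (the form accepted by -mattr=). ToySubtarget and parseFeatures are hypothetical; the real ParseSubtargetFeatures is generated by TableGen and is table-driven rather than hand-written like this.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical subtarget holding the three attributes named in Sparc.td.
struct ToySubtarget {
  bool IsV9, V8DeprecatedInsts, IsVIS;
  ToySubtarget() : IsV9(false), V8DeprecatedInsts(false), IsVIS(false) {}

  // Apply the CPU's default features, then the user's "+feat,-feat" list.
  void parseFeatures(const std::string &FS, const std::string &CPU) {
    if (CPU == "v9") {
      IsV9 = true;
    } else if (CPU == "ultrasparc" || CPU == "ultrasparc3") {
      IsV9 = V8DeprecatedInsts = true;
    } else if (CPU == "ultrasparc3-vis") {
      IsV9 = V8DeprecatedInsts = IsVIS = true;
    }

    std::istringstream In(FS);
    std::string Tok;
    while (std::getline(In, Tok, ',')) {
      // Entries that lack a +/- prefix or a name are ignored in this sketch.
      if (Tok.size() < 2 || (Tok[0] != '+' && Tok[0] != '-'))
        continue;
      bool Enable = Tok[0] == '+';
      std::string Name = Tok.substr(1);
      if (Name == "v9")
        IsV9 = Enable;
      else if (Name == "deprecated-v8")
        V8DeprecatedInsts = Enable;
      else if (Name == "vis")
        IsVIS = Enable;
    }
  }
};
```

Because the explicit list is applied after the CPU defaults, a string such as "+vis,-deprecated-v8" with CPU "ultrasparc" leaves V9 and VIS enabled and the deprecated V8 instructions disabled.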
    + +
    + + + + + +
    + +

    +The implementation of a target machine optionally includes a Just-In-Time (JIT) +code generator that emits machine code and auxiliary structures as binary output +that can be written directly to memory. To do this, implement JIT code +generation by performing the following steps: +

    + +
      +
    • Write an XXXCodeEmitter.cpp file that contains a machine function + pass that transforms target-machine instructions into relocatable machine + code.
    • + +
    • Write an XXXJITInfo.cpp file that implements the JIT interfaces for + target-specific code-generation activities, such as emitting machine code + and stubs.
    • + +
    • Modify XXXTargetMachine so that it provides a + TargetJITInfo object through its getJITInfo method.
    • +
    + +

    +There are several different approaches to writing the JIT support code. For +instance, TableGen and target descriptor files may be used for creating a JIT +code generator, but are not mandatory. For the Alpha and PowerPC target +machines, TableGen is used to generate XXXGenCodeEmitter.inc, which +contains the binary coding of machine instructions and the +getBinaryCodeForInstr method to access those codes. Other JIT +implementations do not. +

    + +

    +Both XXXJITInfo.cpp and XXXCodeEmitter.cpp must include the +llvm/CodeGen/MachineCodeEmitter.h header file that defines the +MachineCodeEmitter class containing code for several callback functions +that write data (in bytes, words, strings, etc.) to the output stream. +

    + +
    + + + + +
    + +

    +In XXXCodeEmitter.cpp, a target-specific implementation of the Emitter class +is written as a function pass (subclass +of MachineFunctionPass). The target-specific implementation +of runOnMachineFunction (invoked by +runOnFunction in MachineFunctionPass) iterates through each +MachineBasicBlock and calls emitInstruction to process each +instruction and emit binary code. emitInstruction is largely +implemented with case statements on the instruction types defined in +XXXInstrInfo.h. For example, in X86CodeEmitter.cpp, +the emitInstruction method is built around the following switch/case +statements: +

    + +
    +
    +switch (Desc->TSFlags & X86::FormMask) {
    +case X86II::Pseudo:  // for not yet implemented instructions 
    +   ...               // or pseudo-instructions
    +   break;
    +case X86II::RawFrm:  // for instructions with a fixed opcode value
    +   ...
    +   break;
    +case X86II::AddRegFrm: // for instructions that have one register operand 
    +   ...                 // added to their opcode
    +   break;
    +case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
    +   ...                 // to specify a destination (register)
    +   break;
    +case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
    +   ...                 // to specify a destination (memory)
    +   break;
    +case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
    +   ...                 // to specify a source (register)
    +   break;
    +case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
    +   ...                 // to specify a source (memory)
    +   break;
    +case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on 
    +case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
    +case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
    +case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
    +   ...  
    +   break;
    +case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
    +case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
    +case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
    +case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
    +   ...  
    +   break;
    +case X86II::MRMInitReg: // for instructions whose source and
    +   ...                  // destination are the same register
    +   break;
    +}
    +
    +
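The switch above dispatches on a bit field extracted from the instruction's flags word. The sketch below shows only that masking step. The toy namespace, its constants, and the choice of a 6-bit form field are assumptions made for illustration and are not taken from the X86 backend.

```cpp
#include <cassert>
#include <string>

namespace toy {
// Hypothetical encoding: the low 6 bits of the flags word select the form.
const unsigned Pseudo = 0;
const unsigned RawFrm = 1;
const unsigned AddRegFrm = 2;
const unsigned FormMask = 63;
// An unrelated flag stored above the form bits.
const unsigned HasImm = 1u << 6;

inline std::string describeForm(unsigned TSFlags) {
  // Masking strips every bit except the form field before dispatching.
  switch (TSFlags & FormMask) {
  case Pseudo:    return "pseudo";
  case RawFrm:    return "raw";
  case AddRegFrm: return "add-reg";
  default:        return "unhandled";
  }
}
} // namespace toy
```

Flags outside the masked field, such as HasImm here, do not affect which case is taken.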
    + +

    +The implementations of these case statements often first emit the opcode and +then get the operand(s). Then depending upon the operand, helper methods may be +called to process the operand(s). For example, in X86CodeEmitter.cpp, +for the X86II::AddRegFrm case, the first data emitted +(by emitByte) is the opcode added to the register operand. Then an +object representing the machine operand, MO1, is extracted. The helper +methods such as isImmediate, +isGlobalAddress, isExternalSymbol, isConstantPoolIndex, and +isJumpTableIndex determine the operand +type. (X86CodeEmitter.cpp also has private methods such +as emitConstant, emitGlobalAddress, +emitExternalSymbolAddress, emitConstPoolAddress, +and emitJumpTableAddress that emit the data into the output stream.) +

    + +
    +
    +case X86II::AddRegFrm:
    +  MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));
    +  
    +  if (CurOp != NumOps) {
    +    const MachineOperand &MO1 = MI.getOperand(CurOp++);
    +    unsigned Size = X86InstrInfo::sizeOfImm(Desc);
    +    if (MO1.isImmediate())
    +      emitConstant(MO1.getImm(), Size);
    +    else {
    +      unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
    +        : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
    +      if (Opcode == X86::MOV64ri) 
    +        rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
    +      if (MO1.isGlobalAddress()) {
    +        bool NeedStub = isa<Function>(MO1.getGlobal());
    +        bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
    +        emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
    +                          NeedStub, isLazy);
    +      } else if (MO1.isExternalSymbol())
    +        emitExternalSymbolAddress(MO1.getSymbolName(), rt);
    +      else if (MO1.isConstantPoolIndex())
    +        emitConstPoolAddress(MO1.getIndex(), rt);
    +      else if (MO1.isJumpTableIndex())
    +        emitJumpTableAddress(MO1.getIndex(), rt);
    +    }
    +  }
    +  break;
    +
    +
    + +

    +In the previous example, XXXCodeEmitter.cpp uses the +variable rt, which is a RelocationType enum that may be used to +relocate addresses (for example, a global address with a PIC base offset). The +RelocationType enum for that target is defined in the short +target-specific XXXRelocations.h file. The RelocationType is used by +the relocate method defined in XXXJITInfo.cpp to rewrite +addresses for referenced global symbols. +

    + +

    +For example, X86Relocations.h specifies the following relocation types +for the X86 addresses. In all four cases, the relocated value is added to the +value already in memory. For reloc_pcrel_word +and reloc_picrel_word, there is an additional initial adjustment. +

    + +
    +
    +enum RelocationType {
    +  reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
    +  reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
    +  reloc_absolute_word = 2, // absolute relocation; no additional adjustment 
    +  reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
    +};
    +
    +
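A simplified model of how a relocate routine might apply these four kinds is sketched below. It is an assumption-laden illustration, not the X86 implementation: toyRelocate and ToyRelocType are hypothetical, addresses are passed as plain integers, and the PC-relative adjustment is taken to be relative to the end of the 4-byte field being patched.

```cpp
#include <cassert>
#include <cstring>
#include <stdint.h>

enum ToyRelocType {
  toy_pcrel_word,
  toy_picrel_word,
  toy_absolute_word,
  toy_absolute_dword
};

// Patch the field at Buf[Pos]. BufAddr is the address the buffer is assumed
// to occupy, so the sketch does not depend on where Buf really lives.
inline void toyRelocate(unsigned char *Buf, uint64_t BufAddr, uint64_t Pos,
                        ToyRelocType RT, uint64_t Target, uint64_t PICBase) {
  uint64_t Value = Target;
  if (RT == toy_pcrel_word)
    Value = Target - (BufAddr + Pos + 4); // relative to the end of the field
  else if (RT == toy_picrel_word)
    Value = Target - PICBase;             // relative to the PIC base

  // In every case the result is added to the value already in memory.
  if (RT == toy_absolute_dword) {
    uint64_t Old;
    std::memcpy(&Old, Buf + Pos, 8);
    Old += Value;
    std::memcpy(Buf + Pos, &Old, 8);
  } else {
    uint32_t Old;
    std::memcpy(&Old, Buf + Pos, 4);
    Old += static_cast<uint32_t>(Value);
    std::memcpy(Buf + Pos, &Old, 4);
  }
}
```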
    + +
    + + + + +
    + +

    +XXXJITInfo.cpp implements the JIT interfaces for target-specific +code-generation activities, such as emitting machine code and stubs. At minimum, +a target-specific version of XXXJITInfo implements the following: +

    + +
      +
    • getLazyResolverFunction — Initializes the JIT, gives the + target a function that is used for compilation.
    • + +
    • emitFunctionStub — Returns a native function with a specified + address for a callback function.
    • + +
    • relocate — Changes the addresses of referenced globals, based + on relocation types.
    • + +
    • Callback functions that are wrappers to a function stub that is used when the + real target is not initially known.
    • +
    + +

    +getLazyResolverFunction is generally trivial to implement. It stores the +incoming parameter as the global JITCompilerFunction and returns the +callback function that will be used as a function wrapper. For the Alpha target +(in AlphaJITInfo.cpp), the getLazyResolverFunction +implementation is simply: +

    + +
    +
    +TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(  
    +                                            JITCompilerFn F) {
    +  JITCompilerFunction = F;
    +  return AlphaCompilationCallback;
    +}
    +
    +
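The store-then-return pattern can be exercised in isolation with ordinary function pointers. In the sketch below, the typedefs and every Toy-prefixed name are hypothetical simplifications. In particular, the real compilation callback is usually written in assembly and saves registers, which this stand-in does not attempt.

```cpp
#include <cassert>

// Hypothetical signatures, simplified from the JIT's function-pointer types.
typedef void *(*ToyJITCompilerFn)(void *Stub);
typedef void (*ToyLazyResolverFn)();

static ToyJITCompilerFn ToyJITCompilerFunction = 0;
static void *ToyLastResolved = 0;

// Stand-in for the callback: ask the JIT to compile, remember the result.
static void toyCompilationCallback() {
  assert(ToyJITCompilerFunction && "resolver used before initialization");
  ToyLastResolved = ToyJITCompilerFunction(0);
}

// Store the compiler entry point, then return the callback that stubs use.
static ToyLazyResolverFn toyGetLazyResolverFunction(ToyJITCompilerFn F) {
  ToyJITCompilerFunction = F;
  return toyCompilationCallback;
}

// A fake "compiler" for demonstration; it returns a recognizable address.
static int ToyCompiledBody;
static void *toyCompile(void *) { return &ToyCompiledBody; }
```

Calling the returned resolver routes back through whichever compiler function was stored, which is the whole purpose of the indirection.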
    + +

    +For the X86 target, the getLazyResolverFunction implementation is a +little more complicated, because it returns a different callback function for +processors with SSE instructions and XMM registers. +

    + +

    +The callback function initially saves and later restores the callee register +values, incoming arguments, and frame and return address. The callback function +needs low-level access to the registers or stack, so it is typically implemented +with assembler. +

    + +
    + + + +
    +
    + Valid CSS + Valid HTML 4.01 + + Mason Woo and Misha Brukman
    + The LLVM Compiler Infrastructure +
    + Last modified: $Date: 2010-07-16 15:35:46 -0700 (Fri, 16 Jul 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/WritingAnLLVMPass.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/WritingAnLLVMPass.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/WritingAnLLVMPass.html (added) +++ www-releases/trunk/2.8/docs/WritingAnLLVMPass.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1840 @@ + + + + + Writing an LLVM Pass + + + + +
    + Writing an LLVM Pass +
    + +
      +
    1. Introduction - What is a pass?
    2. +
    3. Quick Start - Writing hello world +
    4. +
    5. Pass classes and requirements + +
    6. Pass Registration +
    7. +
    8. Specifying interactions between passes +
    9. +
    10. Implementing Analysis Groups +
    11. +
    12. Pass Statistics +
    13. What PassManager does +
    14. +
    15. Registering dynamically loaded passes +
    16. +
    17. Using GDB with dynamically loaded passes +
    18. +
    19. Future extensions planned +
    20. +
    + +
    +

    Written by Chris Lattner and + Jim Laskey

    +
    + + + + + +
    + +

    The LLVM Pass Framework is an important part of the LLVM system, because LLVM +passes are where most of the interesting parts of the compiler exist. Passes +perform the transformations and optimizations that make up the compiler, they +build the analysis results that are used by these transformations, and they are, +above all, a structuring technique for compiler code.

    + +

    All LLVM passes are subclasses of the Pass +class, which implement functionality by overriding virtual methods inherited +from Pass. Depending on how your pass works, you should inherit from +the ModulePass, CallGraphSCCPass, FunctionPass, LoopPass, or BasicBlockPass classes, which give the system +more information about what your pass does, and how it can be combined with +other passes. One of the main features of the LLVM Pass Framework is that it +schedules passes to run in an efficient way based on the constraints that your +pass meets (which are indicated by which class it derives from).

    + +

    We start by showing you how to construct a pass, everything from setting up +the code, to compiling, loading, and executing it. After the basics are down, +more advanced features are discussed.

    + +
    + + + + + +
    + +

    Here we describe how to write the "hello world" of passes. The "Hello" pass +is designed to simply print out the names of non-external functions that exist in +the program being compiled. It does not modify the program at all; it just +inspects it. The source code and files for this pass are available in the LLVM +source tree in the lib/Transforms/Hello directory.

    + +
    + + + + +
    + +

    First, you need to create a new directory somewhere in the LLVM source + base. For this example, we'll assume that you made + lib/Transforms/Hello. Next, you must set up a build script + (Makefile) that will compile the source code for the new pass. To do this, + copy the following into Makefile:

    +
    + +
    +# Makefile for hello pass
    +
    +# Path to top level of LLVM hierarchy
    +LEVEL = ../../..
    +
    +# Name of the library to build
    +LIBRARYNAME = Hello
    +
    +# Make the shared library become a loadable module so the tools can 
    +# dlopen/dlsym on the resulting library.
    +LOADABLE_MODULE = 1
    +
    +# Include the makefile implementation stuff
    +include $(LEVEL)/Makefile.common
    +
    + +

    This makefile specifies that all of the .cpp files in the current +directory are to be compiled and linked together into a +Debug+Asserts/lib/Hello.so shared object that can be dynamically loaded by +the opt or bugpoint tools via their -load options. +If your operating system uses a suffix other than .so (such as Windows or +Mac OS X), the appropriate extension will be used.

    + +

    Now that we have the build scripts set up, we just need to write the code for +the pass itself.

    + +
    + + + + +
    + +

    Now that we have a way to compile our new pass, we just have to write it. +Start out with:

    + +
    +#include "llvm/Pass.h"
    +#include "llvm/Function.h"
    +#include "llvm/Support/raw_ostream.h"
    +
    + +

    These are needed because we are writing a Pass, +we are operating on Functions, +and we will be doing some printing.

    + +

    Next we have:

    +
    +using namespace llvm;
    +
    +

    ... which is required because the functions from the include files +live in the llvm namespace. +

    + +

    Next we have:

    + +
    +namespace {
    +
    + +

    ... which starts out an anonymous namespace. Anonymous namespaces are to C++ +what the "static" keyword is to C (at global scope). It makes the +things declared inside of the anonymous namespace only visible to the current +file. If you're not familiar with them, consult a decent C++ book for more +information.

    + +

    Next, we declare our pass itself:

    + +
    +  struct Hello : public FunctionPass {
    +

    + +

    This declares a "Hello" class that is a subclass of FunctionPass. The different builtin pass subclasses are described in detail later, but for now, know that FunctionPasses operate on a function at a time.

    + +
    +     static char ID;
    +     Hello() : FunctionPass(&ID) {}
    +

    + +

    This declares the pass identifier used by LLVM to identify the pass. This allows LLVM to avoid using expensive C++ runtime type information.

    + +
    +    virtual bool runOnFunction(Function &F) {
    +      errs() << "Hello: " << F.getName() << "\n";
    +      return false;
    +    }
    +  };  // end of struct Hello
    +
    + +

    We declare a "runOnFunction" method, which overrides an abstract virtual method inherited from FunctionPass. This is where we are supposed to do our thing, so we just print out our message with the name of each function.

    + +
    +  char Hello::ID = 0;
    +
    + +

    We initialize the pass ID here. LLVM uses the address of ID to identify the pass, so the initialization value is not important.
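
    The idea of identifying a pass by the address of a static member, rather than by its value or by C++ runtime type information, can be illustrated outside of LLVM. The following is a minimal standalone sketch and not LLVM code: PassA, PassB and sameKind are hypothetical names introduced only for this illustration.

```cpp
// Hypothetical stand-ins for two pass classes; these are not LLVM classes.
struct PassA { static char ID; };
struct PassB { static char ID; };

// Both IDs hold the same value, but each static member is a distinct object
// with its own address.
char PassA::ID = 0;
char PassB::ID = 0;

// Two identifiers denote the same pass kind only if they are the same object.
bool sameKind(const void *X, const void *Y) { return X == Y; }
```

    Comparing &PassA::ID with &PassB::ID through sameKind yields false even though both members hold 0, which is why the initial value does not matter.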

    + +
    +  INITIALIZE_PASS(Hello, "hello", "Hello World Pass",
    +                        false /* Only looks at CFG */,
    +                        false /* Analysis Pass */);
    +}  // end of anonymous namespace
    +
    + +

    Lastly, we register our class Hello, giving it a command line argument "hello", and a name "Hello World Pass". The last two arguments describe its behavior. If a pass walks the CFG without modifying it, then the fourth argument is set to true. If a pass is an analysis pass, for example the dominator tree pass, then true is supplied as the fifth argument.

    + +

    As a whole, the .cpp file looks like:

    + +
    +#include "llvm/Pass.h"
    +#include "llvm/Function.h"
    +#include "llvm/Support/raw_ostream.h"
    +
    +using namespace llvm;
    +
    +namespace {
    +  struct Hello : public FunctionPass {
    +    
    +    static char ID;
    +    Hello() : FunctionPass(&ID) {}
    +
    +    virtual bool runOnFunction(Function &F) {
    +      errs() << "Hello: " << F.getName() << "\n";
    +      return false;
    +    }
    +  };
    +  
    +  char Hello::ID = 0;
    +  INITIALIZE_PASS(Hello, "hello", "Hello World Pass", false, false);
    +}
    +
    +
    + +

    Now that it's all together, compile the file with a simple "gmake" command in the local directory and you should get a new "Debug+Asserts/lib/Hello.so" file. Note that everything in this file is contained in an anonymous namespace: this reflects the fact that passes are self-contained units that do not need external interfaces (although they can have them) to be useful.

    + +
    + + + + +
    + +

    Now that you have a brand new shiny shared object file, we can use the +opt command to run an LLVM program through your pass. Because you +registered your pass with the INITIALIZE_PASS macro, you will be able to +use the opt tool to access it, once loaded.

    + +

    To test it, follow the example at the end of the Getting Started Guide to compile "Hello World" to LLVM. We can now run the bitcode file (hello.bc) for the program through our transformation like this (of course, any bitcode file will work):

    + +
    +$ opt -load ../../../Debug+Asserts/lib/Hello.so -hello < hello.bc > /dev/null
    +Hello: __main
    +Hello: puts
    +Hello: main
    +
    + +

    The '-load' option specifies that 'opt' should load your +pass as a shared object, which makes '-hello' a valid command line +argument (which is one reason you need to register your +pass). Because the hello pass does not modify the program in any +interesting way, we just throw away the result of opt (sending it to +/dev/null).

    + +

    To see what happened to the other string you registered, try running +opt with the -help option:

    + +
    +$ opt -load ../../../Debug+Asserts/lib/Hello.so -help
    +OVERVIEW: llvm .bc -> .bc modular optimizer
    +
    +USAGE: opt [options] <input bitcode>
    +
    +OPTIONS:
    +  Optimizations available:
    +...
    +    -funcresolve    - Resolve Functions
    +    -gcse           - Global Common Subexpression Elimination
    +    -globaldce      - Dead Global Elimination
    +    -hello          - Hello World Pass
    +    -indvars        - Canonicalize Induction Variables
    +    -inline         - Function Integration/Inlining
    +    -instcombine    - Combine redundant instructions
    +...
    +
    + +

    The pass name gets added as the information string for your pass, giving some documentation to users of opt. Now that you have a working pass, you would go ahead and make it do the cool transformations you want. Once you get it all working and tested, it may become useful to find out how fast your pass is. The PassManager provides a nice command line option (-time-passes) that allows you to get information about the execution time of your pass along with the other passes you queue up. For example:

    + +
    +$ opt -load ../../../Debug+Asserts/lib/Hello.so -hello -time-passes < hello.bc > /dev/null
    +Hello: __main
    +Hello: puts
    +Hello: main
    +===============================================================================
    +                      ... Pass execution timing report ...
    +===============================================================================
    +  Total Execution Time: 0.02 seconds (0.0479059 wall clock)
    +
    +   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Pass Name ---
    +   0.0100 (100.0%)   0.0000 (  0.0%)   0.0100 ( 50.0%)   0.0402 ( 84.0%)  Bitcode Writer
    +   0.0000 (  0.0%)   0.0100 (100.0%)   0.0100 ( 50.0%)   0.0031 (  6.4%)  Dominator Set Construction
    +   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0013 (  2.7%)  Module Verifier
    +   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0033 (  6.9%)  Hello World Pass
    +   0.0100 (100.0%)   0.0100 (100.0%)   0.0200 (100.0%)   0.0479 (100.0%)  TOTAL
    +
    + +

    As you can see, our implementation above is pretty fast :). The additional passes listed are automatically inserted by the 'opt' tool to verify that the LLVM emitted by your pass is still valid, well-formed LLVM that hasn't been broken somehow.

    + +

    Now that you have seen the basics of the mechanics behind passes, we can talk +about some more details of how they work and how to use them.

    + +
    + + + + + +
    + +

    One of the first things that you should do when designing a new pass is to +decide what class you should subclass for your pass. The Hello World example uses the FunctionPass class for its implementation, but we +did not discuss why or when this should occur. Here we talk about the classes +available, from the most general to the most specific.

    + +

    When choosing a superclass for your Pass, you should choose the most +specific class possible, while still being able to meet the requirements +listed. This gives the LLVM Pass Infrastructure information necessary to +optimize how passes are run, so that the resultant compiler isn't unnecessarily +slow.

    + +
    + + + + +
    + +

    The most plain and boring type of pass is the "ImmutablePass" +class. This pass type is used for passes that do not have to be run, do not +change state, and never need to be updated. This is not a normal type of +transformation or analysis, but can provide information about the current +compiler configuration.

    + +

    Although this pass class is very infrequently used, it is important for +providing information about the current target machine being compiled for, and +other static information that can affect the various transformations.

    + +

    ImmutablePasses never invalidate other transformations, are never +invalidated, and are never "run".

    + +
    + + + + +
    + +

    The "ModulePass" +class is the most general of all superclasses that you can use. Deriving from +ModulePass indicates that your pass uses the entire program as a unit, +referring to function bodies in no predictable order, or adding and removing +functions. Because nothing is known about the behavior of ModulePass +subclasses, no optimization can be done for their execution.

    + +

    A module pass can use function-level passes (e.g. dominators) through the getAnalysis interface: getAnalysis<DominatorTree>(llvm::Function *) takes the function to retrieve the analysis result for. This works only if the function pass does not require any module or immutable passes. Note that this can only be done for functions for which the analysis ran, e.g. in the case of dominators you should only ask for the DominatorTree for function definitions, not declarations.

    + +

    To write a correct ModulePass subclass, derive from ModulePass and override the runOnModule method with the following signature:

    + +
    + + + + +
    + +
    +  virtual bool runOnModule(Module &M) = 0;
    +
    + +

    The runOnModule method performs the interesting work of the pass. +It should return true if the module was modified by the transformation and +false otherwise.

    + +
    + + + + +
    + +

    The "CallGraphSCCPass" is used by passes that need to traverse the program bottom-up on the call graph (callees before callers). Deriving from CallGraphSCCPass provides some mechanics for building and traversing the CallGraph, but also allows the system to optimize execution of CallGraphSCCPasses. If your pass meets the requirements outlined below, and doesn't meet the requirements of a FunctionPass or BasicBlockPass, you should derive from CallGraphSCCPass.

    + +

    TODO: explain briefly what SCC, Tarjan's algo, and B-U mean.

    + +

    To be explicit, CallGraphSCCPass subclasses are:

    + +
    1. ... not allowed to modify any Functions that are not in the current SCC.
    2. ... not allowed to inspect any Functions other than those in the current SCC and the direct callees of the SCC.
    3. ... required to preserve the current CallGraph object, updating it to reflect any changes made to the program.
    4. ... not allowed to add or remove SCCs from the current Module, though they may change the contents of an SCC.
    5. ... allowed to add or remove global variables from the current Module.
    6. ... allowed to maintain state across invocations of runOnSCC (including global data).
    + +

    Implementing a CallGraphSCCPass is slightly tricky in some cases because it has to handle SCCs with more than one node in them. All of the virtual methods described below should return true if they modified the program, or false if they didn't.
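
    For readers unfamiliar with the terms: a strongly connected component (SCC) of the call graph is a maximal set of functions that can all reach one another through calls, for example a group of mutually recursive functions. Tarjan's algorithm finds the SCCs with a single depth-first search and, when edges point from caller to callee, emits them callees first, which is the bottom-up order mentioned above. The following is a standalone sketch of that algorithm and not LLVM's implementation: the integer function numbering, the Graph type and the name tarjanSCCs are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Call graph as adjacency lists: G[F] lists the functions that F calls.
// This representation is an assumption made for the sketch.
using Graph = std::vector<std::vector<int>>;

// Returns the SCCs of G. With caller -> callee edges, an SCC is completed
// only after every SCC it calls into, so the result is in bottom-up order.
std::vector<std::vector<int>> tarjanSCCs(const Graph &G) {
  const int N = static_cast<int>(G.size());
  std::vector<int> Index(N, -1), Low(N, 0);
  std::vector<bool> OnStack(N, false);
  std::vector<int> Stack;
  std::vector<std::vector<int>> Result;
  int Next = 0;

  std::function<void(int)> Visit = [&](int V) {
    Index[V] = Low[V] = Next++;
    Stack.push_back(V);
    OnStack[V] = true;
    for (int W : G[V]) {
      assert(W >= 0 && W < N && "callee index out of range");
      if (Index[W] == -1) {        // Unvisited callee: recurse into it.
        Visit(W);
        Low[V] = std::min(Low[V], Low[W]);
      } else if (OnStack[W]) {     // Call back into the SCC being built.
        Low[V] = std::min(Low[V], Index[W]);
      }
    }
    if (Low[V] == Index[V]) {      // V is the root of an SCC: pop it.
      std::vector<int> SCC;
      int W;
      do {
        W = Stack.back();
        Stack.pop_back();
        OnStack[W] = false;
        SCC.push_back(W);
      } while (W != V);
      std::sort(SCC.begin(), SCC.end());
      Result.push_back(SCC);
    }
  };

  for (int V = 0; V < N; ++V)
    if (Index[V] == -1)
      Visit(V);
  return Result;
}
```

    A pass written against this order sees a leaf function before its callers, and sees mutually recursive functions together as one multi-node SCC.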

    + +
    + + + + +
    + +
    +  virtual bool doInitialization(CallGraph &CG);
    +
    + +

    The doInitialization method is allowed to do most of the things that CallGraphSCCPasses are not allowed to do. It can add and remove functions, get pointers to functions, etc. The doInitialization method is designed to do simple initialization that does not depend on the SCCs being processed. The doInitialization method call is not scheduled to overlap with any other pass executions (thus it should be very fast).

    + +
    + + + + +
    + +
    +  virtual bool runOnSCC(CallGraphSCC &SCC) = 0;
    +
    + +

    The runOnSCC method performs the interesting work of the pass, and +should return true if the module was modified by the transformation, false +otherwise.

    + +
    + + + + +
    + +
    +  virtual bool doFinalization(CallGraph &CG);
    +
    + +

    The doFinalization method is an infrequently used method that is called when the pass framework has finished calling runOnSCC for every SCC in the program being compiled.

    + +
    + + + + +
    + +

    In contrast to ModulePass subclasses, FunctionPass subclasses do have a predictable, local behavior that can be expected by the system. All FunctionPasses execute on each function in the program independently of all of the other functions in the program. FunctionPasses do not require that they are executed in a particular order, and FunctionPasses do not modify external functions.

    + +

    To be explicit, FunctionPass subclasses are not allowed to:

    + +
    1. Modify a Function other than the one currently being processed.
    2. Add or remove Functions from the current Module.
    3. Add or remove global variables from the current Module.
    4. Maintain state across invocations of runOnFunction (including global data).
    + +

    Implementing a FunctionPass is usually straightforward (see the Hello World pass for an example). FunctionPasses may override three virtual methods to do their work. All of these methods should return true if they modified the program, or false if they didn't.

    + +
    + + + + +
    + +
    +  virtual bool doInitialization(Module &M);
    +
    + +

    The doInitialization method is allowed to do most of the things that FunctionPasses are not allowed to do. It can add and remove functions, get pointers to functions, etc. The doInitialization method is designed to do simple initialization that does not depend on the functions being processed. The doInitialization method call is not scheduled to overlap with any other pass executions (thus it should be very fast).

    + +

    A good example of how this method should be used is the LowerAllocations +pass. This pass converts malloc and free instructions into +platform dependent malloc() and free() function calls. It +uses the doInitialization method to get a reference to the malloc and +free functions that it needs, adding prototypes to the module if necessary.

    + +
    + + + + +
    + +
    +  virtual bool runOnFunction(Function &F) = 0;
    +

    + +

    The runOnFunction method must be implemented by your subclass to do +the transformation or analysis work of your pass. As usual, a true value should +be returned if the function is modified.

    + +
    + + + + +
    + +
    +  virtual bool doFinalization(Module &M);
    +
    + +

    The doFinalization method is an infrequently used method that is +called when the pass framework has finished calling runOnFunction for every function in the +program being compiled.
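
    The order in which the three hooks are called, and the way their boolean results combine into a single "was the program modified" answer, can be sketched with mock types. This is an illustration under stated assumptions and not the LLVM pass manager: MockFunction, MockModule, MockFunctionPass, ClearBodies and runPass are all hypothetical names.

```cpp
#include <string>
#include <vector>

// Mock stand-ins; the real llvm::Function and llvm::Module are far richer.
struct MockFunction { std::string Name; int NumInsts; };
struct MockModule { std::vector<MockFunction> Functions; };

// Hypothetical interface mirroring the three hooks described above.
struct MockFunctionPass {
  virtual ~MockFunctionPass() {}
  virtual bool doInitialization(MockModule &) { return false; }
  virtual bool runOnFunction(MockFunction &F) = 0;
  virtual bool doFinalization(MockModule &) { return false; }
};

// Example pass: empties every non-empty function body and reports the change.
struct ClearBodies : MockFunctionPass {
  bool runOnFunction(MockFunction &F) override {
    if (F.NumInsts == 0)
      return false;              // Nothing to do, so nothing was modified.
    F.NumInsts = 0;
    return true;
  }
};

// Driver: initialize once, visit each function, then finalize once.
bool runPass(MockFunctionPass &P, MockModule &M) {
  bool Changed = P.doInitialization(M);
  for (MockFunction &F : M.Functions)
    Changed |= P.runOnFunction(F);
  Changed |= P.doFinalization(M);
  return Changed;
}
```

    Running ClearBodies twice over the same module reports a change the first time and none the second, which is the behavior a caller relies on when deciding whether analyses need to be recomputed.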

    + +
    + + + + +
    + +

    All LoopPasses execute on each loop in the function independently of all of the other loops in the function. LoopPass processes loops in loop nest order such that the outermost loop is processed last.

    + +

    LoopPass subclasses are allowed to update the loop nest using the LPPassManager interface. Implementing a loop pass is usually straightforward. LoopPasses may override three virtual methods to do their work. All these methods should return true if they modified the program, or false if they didn't.

    +
    + + + + +
    + +
    +  virtual bool doInitialization(Loop *, LPPassManager &LPM);
    +
    + +

    The doInitialization method is designed to do simple initialization that does not depend on the functions being processed. The doInitialization method call is not scheduled to overlap with any other pass executions (thus it should be very fast). The LPPassManager interface should be used to access Function or Module level analysis information.

    + +
    + + + + + +
    + +
    +  virtual bool runOnLoop(Loop *, LPPassManager &LPM) = 0;
    +

    + +

    The runOnLoop method must be implemented by your subclass to do the transformation or analysis work of your pass. As usual, a true value should be returned if the function is modified. The LPPassManager interface should be used to update the loop nest.

    + +
    + + + + +
    + +
    +  virtual bool doFinalization();
    +
    + +

    The doFinalization method is an infrequently used method that is +called when the pass framework has finished calling runOnLoop for every loop in the +program being compiled.

    + +
    + + + + + + +
    + +

    BasicBlockPasses are just like FunctionPasses, except that they must limit their scope of inspection and modification to a single basic block at a time. As such, they are not allowed to do any of the following:

    + +
    1. Modify or inspect any basic blocks outside of the current one.
    2. Maintain state across invocations of runOnBasicBlock.
    3. Modify the control flow graph (by altering terminator instructions).
    4. Any of the things forbidden for FunctionPasses.
    + +

    BasicBlockPasses are useful for traditional local and "peephole" optimizations. They may override the same doInitialization(Module &) and doFinalization(Module &) methods that FunctionPasses have, and they also have the following virtual methods that may be implemented:

    + +
    + + + + +
    + +
    +  virtual bool doInitialization(Function &F);
    +
    + +

    The doInitialization method is allowed to do most of the things that BasicBlockPasses are not allowed to do, but that FunctionPasses can. The doInitialization method is designed to do simple initialization that does not depend on the BasicBlocks being processed. The doInitialization method call is not scheduled to overlap with any other pass executions (thus it should be very fast).

    + +
    + + + + +
    + +
    +  virtual bool runOnBasicBlock(BasicBlock &BB) = 0;
    +
    + +

    Override this function to do the work of the BasicBlockPass. This function is not allowed to inspect or modify basic blocks other than the parameter, and is not allowed to modify the CFG. A true value must be returned if the basic block is modified.

    + +
    + + + + +
    + +
    +  virtual bool doFinalization(Function &F);
    +
    + +

    The doFinalization method is an infrequently used method that is +called when the pass framework has finished calling runOnBasicBlock for every BasicBlock in the +program being compiled. This can be used to perform per-function +finalization.

    + +
    + + + + +
    + +

    A MachineFunctionPass is a part of the LLVM code generator that +executes on the machine-dependent representation of each LLVM function in the +program.

    + +

    Code generator passes are registered and initialized specially by +TargetMachine::addPassesToEmitFile and similar routines, so they +cannot generally be run from the opt or bugpoint +commands.

    + +

    A MachineFunctionPass is also a FunctionPass, so all +the restrictions that apply to a FunctionPass also apply to it. +MachineFunctionPasses also have additional restrictions. In particular, +MachineFunctionPasses are not allowed to do any of the following:

    + +
    1. Modify or create any LLVM IR Instructions, BasicBlocks, Arguments, Functions, GlobalVariables, GlobalAliases, or Modules.
    2. Modify a MachineFunction other than the one currently being processed.
    3. Maintain state across invocations of runOnMachineFunction (including global data).
    + +
    + + + + +
    + +
    +  virtual bool runOnMachineFunction(MachineFunction &MF) = 0;
    +
    + +

    runOnMachineFunction can be considered the main entry point of a +MachineFunctionPass; that is, you should override this method to do the +work of your MachineFunctionPass.

    + +

    The runOnMachineFunction method is called on every +MachineFunction in a Module, so that the +MachineFunctionPass may perform optimizations on the machine-dependent +representation of the function. If you want to get at the LLVM Function +for the MachineFunction you're working on, use +MachineFunction's getFunction() accessor method -- but +remember, you may not modify the LLVM Function or its contents from a +MachineFunctionPass.

    + +
    + + + + + +
    + +

    In the Hello World example pass we illustrated how +pass registration works, and discussed some of the reasons that it is used and +what it does. Here we discuss how and why passes are registered.

    + +

    As we saw above, passes are registered with the INITIALIZE_PASS macro. The first parameter is the pass class. The second parameter is the name of the pass that is to be used on the command line to specify that the pass should be added to a program (for example, with opt or bugpoint). The third parameter is the name of the pass, which is used for the -help output of programs, as well as for debug output generated by the --debug-pass option. The last two parameters state whether the pass only looks at the CFG and whether it is an analysis pass.

    + +

    If you want your pass to be easily dumpable, you should +implement the virtual print method:

    + +
    + + + + +
    + +
    +  virtual void print(std::ostream &O, const Module *M) const;
    +
    + +

    The print method must be implemented by "analyses" in order to print +a human readable version of the analysis results. This is useful for debugging +an analysis itself, as well as for other people to figure out how an analysis +works. Use the opt -analyze argument to invoke this method.

    + +

    The stream parameter O specifies the stream to write the results on, and the Module parameter gives a pointer to the top level module of the program that has been analyzed. Note however that this pointer may be null in certain circumstances (such as calling Pass::dump() from a debugger), so it should only be used to enhance debug output; it should not be depended on.

    + +
    + + + + + +
    + +

    One of the main responsibilities of the PassManager is to make sure +that passes interact with each other correctly. Because PassManager +tries to optimize the execution of passes it must +know how the passes interact with each other and what dependencies exist between +the various passes. To track this, each pass can declare the set of passes that +are required to be executed before the current pass, and the passes which are +invalidated by the current pass.

    + +

    Typically this functionality is used to require that analysis results are +computed before your pass is run. Running arbitrary transformation passes can +invalidate the computed analysis results, which is what the invalidation set +specifies. If a pass does not implement the getAnalysisUsage method, it defaults to not +having any prerequisite passes, and invalidating all other passes.

    + +
    + + + + +
    + +
    +  virtual void getAnalysisUsage(AnalysisUsage &Info) const;
    +
    + +

    By implementing the getAnalysisUsage method, the required and +invalidated sets may be specified for your transformation. The implementation +should fill in the AnalysisUsage +object with information about which passes are required and not invalidated. To +do this, a pass may call any of the following methods on the AnalysisUsage +object:

    +
    + + + + +
    +

    +If your pass requires a previous pass to be executed (an analysis for example), +it can use one of these methods to arrange for it to be run before your pass. +LLVM has many different types of analyses and passes that can be required, +spanning the range from DominatorSet to BreakCriticalEdges. +Requiring BreakCriticalEdges, for example, guarantees that there will +be no critical edges in the CFG when your pass has been run. +

    + +

    +Some analyses chain to other analyses to do their job. For example, an AliasAnalysis implementation is required to chain to other alias analysis passes. In +cases where analyses chain, the addRequiredTransitive method should be +used instead of the addRequired method. This informs the PassManager +that the transitively required pass should be alive as long as the requiring +pass is. +

    +
    + + + + +
    +

    +One of the jobs of the PassManager is to optimize how and when analyses are run. +In particular, it attempts to avoid recomputing data unless it needs to. For +this reason, passes are allowed to declare that they preserve (i.e., they don't +invalidate) an existing analysis if it's available. For example, a simple +constant folding pass would not modify the CFG, so it can't possibly affect the +results of dominator analysis. By default, all passes are assumed to invalidate +all others. +
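
    The bookkeeping described above (compute a required analysis only when no valid result is cached, and drop every result the pass did not declare as preserved) can be sketched as follows. This is a simplified standalone model and not the PassManager's actual algorithm: PassInfo, schedule and the string-based analysis names are assumptions made for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical description of one pass: what it needs and what it keeps valid.
struct PassInfo {
  std::string Name;
  std::vector<std::string> Required;  // Analyses that must be valid first.
  std::set<std::string> Preserved;    // Analyses still valid afterwards.
  bool PreservesAll;                  // True for passes that change nothing.
};

// Returns the sequence of work performed, with a "compute X" entry wherever
// a required analysis had no valid cached result.
std::vector<std::string> schedule(const std::vector<PassInfo> &Passes) {
  std::set<std::string> Available;    // Analyses whose results are valid.
  std::vector<std::string> Log;
  for (const PassInfo &P : Passes) {
    for (const std::string &A : P.Required)
      if (Available.insert(A).second) // Not cached: must be (re)computed.
        Log.push_back("compute " + A);
    Log.push_back("run " + P.Name);
    if (!P.PreservesAll) {
      // Everything the pass did not declare as preserved is invalidated.
      std::set<std::string> Kept;
      for (const std::string &A : Available)
        if (P.Preserved.count(A))
          Kept.insert(A);
      Available.swap(Kept);
    }
  }
  return Log;
}
```

    In this model, a constant folding pass that preserves a dominator analysis lets the next pass reuse the cached result, while a pass that declares nothing forces the analysis to be computed again for whoever requires it next.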

    + +

    +The AnalysisUsage class provides several methods which are useful in +certain circumstances that are related to addPreserved. In particular, +the setPreservesAll method can be called to indicate that the pass does +not modify the LLVM program at all (which is true for analyses), and the +setPreservesCFG method can be used by transformations that change +instructions in the program but do not modify the CFG or terminator instructions +(note that this property is implicitly set for BasicBlockPass's). +

    + +

    +addPreserved is particularly useful for transformations like +BreakCriticalEdges. This pass knows how to update a small set of loop +and dominator related analyses if they exist, so it can preserve them, despite +the fact that it hacks on the CFG. +

    +
    + + + + +
    + +
    +  // This is an example implementation from an analysis, which does not modify
    +  // the program at all, yet has a prerequisite.
    +  void PostDominanceFrontier::getAnalysisUsage(AnalysisUsage &AU) const {
    +    AU.setPreservesAll();
    +    AU.addRequired<PostDominatorTree>();
    +  }
    +
    + +

    and:

    + +
    +  // This example modifies the program, but does not modify the CFG
    +  void LICM::getAnalysisUsage(AnalysisUsage &AU) const {
    +    AU.setPreservesCFG();
    +    AU.addRequired<LoopInfo>();
    +  }
    +
    + +
    + + + + +
    + +

    The Pass::getAnalysis<> method is automatically inherited by +your class, providing you with access to the passes that you declared that you +required with the getAnalysisUsage +method. It takes a single template argument that specifies which pass class you +want, and returns a reference to that pass. For example:

    + +
    +   bool LICM::runOnFunction(Function &F) {
    +     LoopInfo &LI = getAnalysis<LoopInfo>();
    +     ...
    +   }
    +
    + +

    This method call returns a reference to the pass desired. You may get a +runtime assertion failure if you attempt to get an analysis that you did not +declare as required in your getAnalysisUsage implementation. This +method can be called by your run* method implementation, or by any +other local method invoked by your run* method. + +A module level pass can use function level analysis info using this interface. +For example:

    + +
    +   bool ModuleLevelPass::runOnModule(Module &M) {
    +     ...
    +     DominatorTree &DT = getAnalysis<DominatorTree>(Func);
    +     ...
    +   }
    +
    + +

    In the above example, runOnFunction for DominatorTree is called by the pass manager before returning a reference to the desired pass.

    + +

    +If your pass is capable of updating analyses if they exist (e.g., +BreakCriticalEdges, as described above), you can use the +getAnalysisIfAvailable method, which returns a pointer to the analysis +if it is active. For example:

    + +
    +  ...
    +  if (DominatorSet *DS = getAnalysisIfAvailable<DominatorSet>()) {
    +    // A DominatorSet is active.  This code will update it.
    +  }
    +  ...
    +
    + +
    + + + + + +
    + +

    Now that we understand the basics of how passes are defined, how they are +used, and how they are required from other passes, it's time to get a little bit +fancier. All of the pass relationships that we have seen so far are very +simple: one pass depends on one other specific pass to be run before it can run. +For many applications, this is great, for others, more flexibility is +required.

    + +

    In particular, some analyses are defined such that there is a single simple interface to the analysis results, but multiple ways of calculating them. Consider alias analysis for example. The most trivial alias analysis returns "may alias" for any alias query. The most sophisticated analysis is a flow-sensitive, context-sensitive interprocedural analysis that can take a significant amount of time to execute (and obviously, there is a lot of room between these two extremes for other implementations). To cleanly support situations like this, the LLVM Pass Infrastructure supports the notion of Analysis Groups.

    + +
    + + + + +
    + +

    An Analysis Group is a single simple interface that may be implemented by +multiple different passes. Analysis Groups can be given human readable names +just like passes, but unlike passes, they need not derive from the Pass +class. An analysis group may have one or more implementations, one of which is +the "default" implementation.

    + +

    Analysis groups are used by client passes just like other passes are: the +AnalysisUsage::addRequired() and Pass::getAnalysis() methods. +In order to resolve this requirement, the PassManager +scans the available passes to see if any implementations of the analysis group +are available. If none is available, the default implementation is created for +the pass to use. All standard rules for interaction +between passes still apply.

    + +

    Although Pass Registration is optional for normal passes, all analysis group implementations must be registered, and must use the INITIALIZE_AG_PASS macro to join the implementation pool. Also, a default implementation of the interface must be registered with RegisterAnalysisGroup.

    + +

    As a concrete example of an Analysis Group in action, consider the AliasAnalysis +analysis group. The default implementation of the alias analysis interface (the +basicaa +pass) just does a few simple checks that don't require significant analysis to +compute (such as: two different globals can never alias each other, etc). +Passes that use the AliasAnalysis +interface (for example the gcse pass), do +not care which implementation of alias analysis is actually provided, they just +use the designated interface.

    + +

    From the user's perspective, commands work just like normal. Issuing the +command 'opt -gcse ...' will cause the basicaa class to be +instantiated and added to the pass sequence. Issuing the command 'opt +-somefancyaa -gcse ...' will cause the gcse pass to use the +somefancyaa alias analysis (which doesn't actually exist, it's just a +hypothetical example) instead.
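
    The selection rule illustrated by the two opt commands above can be modeled in a few lines. This is a sketch of the idea only, not the PassManager's code: GroupImpl and resolveGroup are hypothetical names, and the rule that the most recently scheduled implementation wins when several are present is an assumption of this sketch.

```cpp
#include <string>
#include <vector>

// One registered implementation of an analysis group (hypothetical model).
struct GroupImpl {
  std::string Arg;   // Command line argument, e.g. "basicaa".
  bool IsDefault;    // True for the group's single default implementation.
};

// Picks the implementation a client pass would see. Scheduled lists the pass
// arguments that appear before the client pass on the command line.
std::string resolveGroup(const std::vector<GroupImpl> &Group,
                         const std::vector<std::string> &Scheduled) {
  std::string Chosen;
  for (const std::string &Arg : Scheduled)
    for (const GroupImpl &I : Group)
      if (I.Arg == Arg)
        Chosen = Arg;            // Assumption: the latest one wins.
  if (!Chosen.empty())
    return Chosen;
  for (const GroupImpl &I : Group)
    if (I.IsDefault)
      return I.Arg;              // Nothing scheduled: use the default.
  return "";                     // The group has no default registered.
}
```

    With a group containing a default "basicaa" and a non-default "somefancyaa", an empty schedule resolves to "basicaa", and a schedule that names "somefancyaa" resolves to "somefancyaa".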

    + +
    + + + + +
    + +

    The RegisterAnalysisGroup template is used to register the analysis group itself, while the INITIALIZE_AG_PASS macro is used to add pass implementations to the analysis group. First, an analysis group should be registered, with a human readable name provided for it. Unlike registration of passes, there is no command line argument to be specified for the Analysis Group Interface itself, because it is "abstract":

    + +
    +  static RegisterAnalysisGroup<AliasAnalysis> A("Alias Analysis");
    +
    + +

    Once the analysis is registered, passes can declare that they are valid +implementations of the interface by using the following code:

    + +
    +namespace {
    +  // Declare that we implement the AliasAnalysis interface
    +  INITIALIZE_AG_PASS(FancyAA, AliasAnalysis, "somefancyaa",
    +                     "A more complex alias analysis implementation",
    +                     false, // Is CFG Only?
    +                     true,  // Is Analysis?
    +                     false  // Is default Analysis Group implementation?
    +                    );
    +}
    +
    + +

    This just shows a class FancyAA that +uses the INITIALIZE_AG_PASS macro both to register and +to "join" the AliasAnalysis +analysis group. Every implementation of an analysis group should join using +this macro.

    + +
    +namespace {
    +  // Declare that we implement the AliasAnalysis interface
    +  INITIALIZE_AG_PASS(BasicAA, AliasAnalysis, "basicaa",
    +                     "Basic Alias Analysis (default AA impl)",
    +                     false, // Is CFG Only?
    +                     true,  // Is Analysis?
    +                     true   // Is default Analysis Group implementation?
    +                    );
    +}
    +
    + +

    Here we show how the default implementation is specified (using the final +argument to the INITIALIZE_AG_PASS macro). There must be exactly +one default implementation available at all times for an Analysis Group to be +used. Only the default implementation can derive from ImmutablePass. +Here we declare that the + BasicAliasAnalysis +pass is the default implementation for the interface.
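
The selection rule described above can be illustrated with a small standalone sketch. This is a toy model, not LLVM's implementation: the Implementation and AnalysisGroup types below are hypothetical names invented for illustration, and the model assumes that an unknown or missing request falls back to the default.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are NOT LLVM classes.
struct Implementation {
  std::string Arg; // command line name, e.g. "basicaa"
  bool IsDefault;  // plays the role of the final INITIALIZE_AG_PASS argument
};

class AnalysisGroup {
  std::vector<Implementation> Impls;

public:
  // Models a pass "joining" the group.
  void join(const std::string &Arg, bool IsDefault) {
    Impls.push_back({Arg, IsDefault});
  }

  // Returns the implementation named by Requested if one has joined,
  // otherwise the default implementation (assumption of this sketch).
  std::string resolve(const std::string &Requested) const {
    const Implementation *Default = nullptr;
    for (const Implementation &I : Impls) {
      if (!Requested.empty() && I.Arg == Requested)
        return I.Arg;
      if (I.IsDefault) {
        assert(!Default && "a group must not have two defaults");
        Default = &I;
      }
    }
    assert(Default && "a group needs a default implementation");
    return Default->Arg;
  }
};
```

With "basicaa" joined as the default and "somefancyaa" joined as a non-default, resolve("") yields "basicaa" and resolve("somefancyaa") yields "somefancyaa", mirroring the 'opt -gcse' and 'opt -somefancyaa -gcse' cases.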

    + +
    + + + + + +
    +

    The Statistic +class is designed to be an easy way to expose various success +metrics from passes. These statistics are printed at the end of a +run when the -stats command line option is given. +See the Statistics section in the Programmer's Manual for details. + +

    + + + + + + +
    + +

    The PassManager +class +takes a list of passes, ensures their prerequisites +are set up correctly, and then schedules passes to run efficiently. All of the +LLVM tools that run passes use the PassManager for execution of these +passes.

    + +

    The PassManager does two main things to try to reduce the execution +time of a series of passes:

    + +
      +
    1. Share analysis results - The PassManager attempts to avoid +recomputing analysis results as much as possible. This means keeping track of +which analyses are available already, which analyses get invalidated, and which +analyses are needed to be run for a pass. An important part of this work is that the +PassManager tracks the exact lifetime of all analysis results, allowing +it to free memory allocated to holding analysis +results as soon as they are no longer needed.

    2. Pipeline the execution of passes on the program - The +PassManager attempts to get better cache and memory usage behavior out +of a series of passes by pipelining the passes together. This means that, given +a series of consecutive FunctionPasses, it +will execute all of the FunctionPasses on +the first function, then all of the FunctionPasses on the second function, +and so on, until the entire program has been run through the passes. + +

      This improves the cache behavior of the compiler, because it is only touching +the LLVM program representation for a single function at a time, instead of +traversing the entire program. It reduces the memory consumption of the compiler, +because, for example, only one DominatorSet +needs to be calculated at a time. This also makes it possible to implement +some interesting enhancements in the future.

    + +
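
The pipelining order described in the list above can be made concrete with a short sketch. It is only a toy: passes and functions are plain strings here, and pipelinedOrder is a hypothetical helper, not part of the PassManager.

```cpp
#include <string>
#include <vector>

// Toy model: "passes" and "functions" are just names, not LLVM objects.
// Returns the order in which (pass, function) pairs are visited when
// consecutive function passes are pipelined.
std::vector<std::string>
pipelinedOrder(const std::vector<std::string> &Passes,
               const std::vector<std::string> &Functions) {
  std::vector<std::string> Order;
  for (const std::string &F : Functions) // outer loop: one function at a time
    for (const std::string &P : Passes)  // inner loop: every pass on it
      Order.push_back(P + "(" + F + ")");
  return Order;
}
```

For passes {gcse, licm} and functions {main, puts}, the visit order is gcse(main), licm(main), gcse(puts), licm(puts): each function is finished before the next one is touched.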

    The effectiveness of the PassManager is influenced directly by how +much information it has about the behaviors of the passes it is scheduling. For +example, the "preserved" set is intentionally conservative in the face of an +unimplemented getAnalysisUsage method. +Not implementing it when it should be implemented will have the effect of not +allowing any analysis results to live across the execution of your pass.

    + +

    The PassManager class exposes a --debug-pass command line +option that is useful for debugging pass execution, seeing how things work, and +diagnosing when you should be preserving more analyses than you currently are. +To get information about all of the variants of the --debug-pass +option, just type 'opt -help-hidden'.

    + +

    By using the --debug-pass=Structure option, for example, we can see +how our Hello World pass interacts with other passes. +Let's try it out with the gcse and licm passes:

    + +
    +$ opt -load ../../../Debug+Asserts/lib/Hello.so -gcse -licm --debug-pass=Structure < hello.bc > /dev/null
    +Module Pass Manager
    +  Function Pass Manager
    +    Dominator Set Construction
    +    Immediate Dominators Construction
    +    Global Common Subexpression Elimination
    +--  Immediate Dominators Construction
    +--  Global Common Subexpression Elimination
    +    Natural Loop Construction
    +    Loop Invariant Code Motion
    +--  Natural Loop Construction
    +--  Loop Invariant Code Motion
    +    Module Verifier
    +--  Dominator Set Construction
    +--  Module Verifier
    +  Bitcode Writer
    +--Bitcode Writer
    +
    + +

    This output shows us when passes are constructed and when the analysis +results are known to be dead (prefixed with '--'). Here we see that +GCSE uses dominator and immediate dominator information to do its job. The LICM +pass uses natural loop information, which uses dominator sets, but not immediate +dominators. Because immediate dominators are no longer useful after the GCSE +pass, they are immediately destroyed. The dominator sets are then reused to +compute natural loop information, which is then used by the LICM pass.

    + +

    After the LICM pass, the module verifier runs (which is automatically added +by the 'opt' tool), which uses the dominator set to check that the +resultant LLVM code is well formed. After it finishes, the dominator set +information is destroyed, after being computed once, and shared by three +passes.

    + +

    Let's see how this changes when we run the Hello +World pass in between the two passes:

    + +
    +$ opt -load ../../../Debug+Asserts/lib/Hello.so -gcse -hello -licm --debug-pass=Structure < hello.bc > /dev/null
    +Module Pass Manager
    +  Function Pass Manager
    +    Dominator Set Construction
    +    Immediate Dominators Construction
    +    Global Common Subexpression Elimination
    +--  Dominator Set Construction
    +--  Immediate Dominators Construction
    +--  Global Common Subexpression Elimination
    +    Hello World Pass
    +--  Hello World Pass
    +    Dominator Set Construction
    +    Natural Loop Construction
    +    Loop Invariant Code Motion
    +--  Natural Loop Construction
    +--  Loop Invariant Code Motion
    +    Module Verifier
    +--  Dominator Set Construction
    +--  Module Verifier
    +  Bitcode Writer
    +--Bitcode Writer
    +Hello: __main
    +Hello: puts
    +Hello: main
    +
    + +

    Here we see that the Hello World pass has killed the +Dominator Set pass, even though it doesn't modify the code at all! To fix this, +we need to add the following getAnalysisUsage method to our pass:

    + +
    +    // We don't modify the program, so we preserve all analyses
    +    virtual void getAnalysisUsage(AnalysisUsage &AU) const {
    +      AU.setPreservesAll();
    +    }
    +
    + +

    Now when we run our pass, we get this output:

    + +
    +$ opt -load ../../../Debug+Asserts/lib/Hello.so -gcse -hello -licm --debug-pass=Structure < hello.bc > /dev/null
    +Pass Arguments:  -gcse -hello -licm
    +Module Pass Manager
    +  Function Pass Manager
    +    Dominator Set Construction
    +    Immediate Dominators Construction
    +    Global Common Subexpression Elimination
    +--  Immediate Dominators Construction
    +--  Global Common Subexpression Elimination
    +    Hello World Pass
    +--  Hello World Pass
    +    Natural Loop Construction
    +    Loop Invariant Code Motion
    +--  Loop Invariant Code Motion
    +--  Natural Loop Construction
    +    Module Verifier
    +--  Dominator Set Construction
    +--  Module Verifier
    +  Bitcode Writer
    +--Bitcode Writer
    +Hello: __main
    +Hello: puts
    +Hello: main
    +
    + +

    This shows that we don't accidentally invalidate dominator information +anymore, and therefore do not have to compute it twice.
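
The effect of the preserved set on recomputation can be reproduced with a small standalone model. Everything here is an assumption made for illustration: ToyPass and ToyPassManager are hypothetical types, gcse and licm are simplified to preserve everything, and a pass that does not preserve all analyses is treated as invalidating all of them.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only; these are hypothetical types, not LLVM classes.
struct ToyPass {
  std::string Name;
  std::vector<std::string> Required; // analyses this pass needs
  bool PreservesAll;                 // models AU.setPreservesAll()
};

class ToyPassManager {
  std::set<std::string> Available;         // analyses currently valid
  std::map<std::string, int> ComputeCount; // how often each was computed

public:
  void run(const ToyPass &P) {
    for (const std::string &A : P.Required)
      if (Available.insert(A).second) // was not available: compute it now
        ++ComputeCount[A];
    if (!P.PreservesAll)
      Available.clear(); // conservative: assume everything was invalidated
  }

  int timesComputed(const std::string &A) const {
    std::map<std::string, int>::const_iterator It = ComputeCount.find(A);
    return It == ComputeCount.end() ? 0 : It->second;
  }
};
```

Running gcse, hello, licm with a hello pass that preserves nothing computes the dominator information twice in this model; marking hello as preserving everything brings that down to once, which matches the two --debug-pass=Structure listings.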

    + +
    + + + + +
    + +
    +  virtual void releaseMemory();
    +
    + +

    The PassManager automatically determines when to compute analysis +results, and how long to keep them around for. Because the lifetime of the pass +object itself is effectively the entire duration of the compilation process, we +need some way to free analysis results when they are no longer useful. The +releaseMemory virtual method is the way to do this.

    + +

    If you are writing an analysis or any other pass that retains a significant +amount of state (for use by another pass which "requires" your pass and uses the +getAnalysis method) you should implement +releaseMemory to, well, release the memory allocated to maintain this +internal state. This method is called after the run* method for the +class, before the next call of run* in your pass.
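
As a rough illustration of the shape such a method takes, here is a standalone sketch. ToyCallCountAnalysis is a hypothetical class that does not derive from any LLVM type; its run method only stands in for a real run* method, and the std::map stands in for whatever state a real analysis would keep.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Toy analysis; hypothetical class, not derived from any LLVM type.
class ToyCallCountAnalysis {
  std::map<std::string, unsigned> Counts; // state kept for client passes

public:
  // Stands in for a run* method: records the result for one function.
  void run(const std::string &FunctionName, unsigned NumCalls) {
    Counts[FunctionName] = NumCalls;
  }

  // Stands in for releaseMemory(): drop everything computed by run.
  void releaseMemory() { Counts.clear(); }

  // Number of functions for which results are currently held.
  std::size_t entries() const { return Counts.size(); }
};
```

The point of the design is that the object itself stays alive for the whole compilation while its contents are dropped as soon as no client needs them.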

    + +
    + + + + + +
    + +

    Size matters when constructing production quality tools using llvm, +both for the purposes of distribution, and for regulating the resident code size +when running on the target system. Therefore, it becomes desirable to +selectively use some passes while omitting others, and to maintain the flexibility +to change configurations later on. You want to be able to do all this and +provide feedback to the user. This is where pass registration comes into +play.

    + +

    The fundamental mechanisms for pass registration are the +MachinePassRegistry class and subclasses of +MachinePassRegistryNode.

    + +

    An instance of MachinePassRegistry is used to maintain a list of +MachinePassRegistryNode objects. This instance maintains the list and +communicates additions and deletions to the command line interface.

    + +

    An instance of a MachinePassRegistryNode subclass is used to maintain +information provided about a particular pass. This information includes the +command line name, the command help string and the address of the function used +to create an instance of the pass. A global static constructor of one of these +instances registers with a corresponding MachinePassRegistry, and +the static destructor unregisters. Thus a pass that is statically linked +in the tool will be registered at start up. A dynamically loaded pass will +register on load and unregister at unload.
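
The register-on-construction, unregister-on-destruction idea can be sketched in isolation. ToyRegistryNode below is a hypothetical class that only imitates the mechanism; it is not MachinePassRegistryNode and omits the help string, the creator function and the command line listener.

```cpp
#include <string>

// Toy model of self-registration; hypothetical, not an LLVM class.
class ToyRegistryNode {
  static ToyRegistryNode *Head; // intrusive singly linked list of nodes
  ToyRegistryNode *Next;
  std::string Name;

public:
  explicit ToyRegistryNode(const std::string &N) : Next(Head), Name(N) {
    Head = this; // the constructor registers the node
  }
  ToyRegistryNode(const ToyRegistryNode &) = delete;

  ~ToyRegistryNode() { // the destructor unregisters the node
    for (ToyRegistryNode **I = &Head; *I; I = &(*I)->Next)
      if (*I == this) {
        *I = Next;
        break;
      }
  }

  static bool isRegistered(const std::string &N) {
    for (ToyRegistryNode *I = Head; I; I = I->Next)
      if (I->Name == N)
        return true;
    return false;
  }
};

ToyRegistryNode *ToyRegistryNode::Head = nullptr;

// A namespace-scope object like this registers itself before main() runs.
static ToyRegistryNode MyRegAllocNode("myregalloc");
```

Because the list head is a plain pointer initialized to null before any constructor runs, nodes defined in different translation units (or in a shared object loaded later) can register in any order.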

    + +
    + + + + +
    + +

    There are predefined registries to track instruction scheduling +(RegisterScheduler) and register allocation (RegisterRegAlloc) +machine passes. Here we will describe how to register a register +allocator machine pass.

    + +

    Implement your register allocator machine pass. In your register allocator +.cpp file add the following include:

    + +
    +  #include "llvm/CodeGen/RegAllocRegistry.h"
    +
    + +

    Also in your register allocator .cpp file, define a creator function in the +form:

    + +
    +  FunctionPass *createMyRegisterAllocator() {
    +    return new MyRegisterAllocator();
    +  }
    +
    + +

    Note that the signature of this function should match the type of +RegisterRegAlloc::FunctionPassCtor. In the same file add the +"installing" declaration, in the form:

    + +
    +  static RegisterRegAlloc myRegAlloc("myregalloc",
    +    "  my register allocator help string",
    +    createMyRegisterAllocator);
    +
    + +

    Note that the two spaces prior to the help string produce a tidy result on the +-help query.

    + +
    +$ llc -help
    +  ...
    +  -regalloc                    - Register allocator to use (default=linearscan)
    +    =linearscan                -   linear scan register allocator
    +    =local                     -   local register allocator
    +    =simple                    -   simple register allocator
    +    =myregalloc                -   my register allocator help string
    +  ...
    +
    + +

    And that's it. The user is now free to use -regalloc=myregalloc as +an option. Registering instruction schedulers is similar except use the +RegisterScheduler class. Note that the +RegisterScheduler::FunctionPassCtor is significantly different from +RegisterRegAlloc::FunctionPassCtor.

    + +

    To force the load/linking of your register allocator into the llc/lli tools, +add your creator function's global declaration to "Passes.h" and add a "pseudo" +call line to llvm/CodeGen/LinkAllCodegenComponents.h.

    + +
    + + + + + +
    + +

    The easiest way to get started is to clone one of the existing registries; we +recommend llvm/CodeGen/RegAllocRegistry.h. The key things to modify +are the class name and the FunctionPassCtor type.

    + +

    Then you need to declare the registry. For example, if your pass registry is +RegisterMyPasses, then define:

    + +
    +MachinePassRegistry RegisterMyPasses::Registry;
    +
    + +

    And finally, declare the command line option for your passes. Example:

    + +
    +  cl::opt<RegisterMyPasses::FunctionPassCtor, false,
    +          RegisterPassParser<RegisterMyPasses> >
    +  MyPassOpt("mypass",
    +            cl::init(&createDefaultMyPass),
    +            cl::desc("my pass option help")); 
    +
    + +

    Here the command option is "mypass", with createDefaultMyPass as the default +creator.

    + +
    + + + + + +
    + +

    Unfortunately, using GDB with dynamically loaded passes is not as easy as it +should be. First of all, you can't set a breakpoint in a shared object that has +not been loaded yet, and second of all there are problems with inlined functions +in shared objects. Here are some suggestions for debugging your pass with +GDB.

    + +

    For the sake of discussion, I'm going to assume that you are debugging a +transformation invoked by opt, although nothing described here depends +on that.

    + +
    + + + + +
    + +

    The first thing to do is start gdb on the opt process:

    + +
    +$ gdb opt
    +GNU gdb 5.0
    +Copyright 2000 Free Software Foundation, Inc.
    +GDB is free software, covered by the GNU General Public License, and you are
    +welcome to change it and/or distribute copies of it under certain conditions.
    +Type "show copying" to see the conditions.
    +There is absolutely no warranty for GDB.  Type "show warranty" for details.
    +This GDB was configured as "sparc-sun-solaris2.6"...
    +(gdb)
    +
    + +

    Note that opt has a lot of debugging information in it, so it takes +time to load. Be patient. Since we cannot set a breakpoint in our pass yet +(the shared object isn't loaded until runtime), we must execute the process, and +have it stop before it invokes our pass, but after it has loaded the shared +object. The most foolproof way of doing this is to set a breakpoint in +PassManager::run and then run the process with the arguments you +want:

    + +
    +(gdb) break llvm::PassManager::run
    +Breakpoint 1 at 0x2413bc: file Pass.cpp, line 70.
    +(gdb) run test.bc -load $(LLVMTOP)/llvm/Debug+Asserts/lib/[libname].so -[passoption]
    +Starting program: opt test.bc -load $(LLVMTOP)/llvm/Debug+Asserts/lib/[libname].so -[passoption]
    +Breakpoint 1, PassManager::run (this=0xffbef174, M=@0x70b298) at Pass.cpp:70
    +70      bool PassManager::run(Module &M) { return PM->run(M); }
    +(gdb)
    +
    + +

    Once opt stops in the PassManager::run method you are +now free to set breakpoints in your pass so that you can trace through execution +or do other standard debugging stuff.

    + +
    + + + + +
    + +

    Once you have the basics down, there are a couple of problems that GDB has, +some with solutions, some without.

    + +
      +
    • Inline functions have bogus stack information. In general, GDB does a +pretty good job getting stack traces and stepping through inline functions. +When a pass is dynamically loaded, however, it somehow completely loses this +capability. The only solution I know of is to de-inline a function (move it +from the body of a class to a .cpp file).
    • Restarting the program breaks breakpoints. After following the information +above, you have succeeded in getting some breakpoints planted in your pass. Next +thing you know, you restart the program (i.e., you type 'run' again), +and you start getting errors about breakpoints being unsettable. The only way I +have found to "fix" this problem is to delete the breakpoints that are +already set in your pass, run the program, and re-set the breakpoints once +execution stops in PassManager::run.
    + +

    Hopefully these tips will help with common case debugging situations. If +you'd like to contribute some tips of your own, just contact Chris.

    + +
    + + + + + +
    + +

    Although the LLVM Pass Infrastructure is very capable as it stands, and does +some nifty stuff, there are things we'd like to add in the future. Here is +where we are going:

    + +
    + + + + +
    + +

    Multiple CPU machines are becoming more common and compilation can never be +fast enough: obviously we should allow for a multithreaded compiler. Because of +the semantics defined for passes above (specifically they cannot maintain state +across invocations of their run* methods), a nice clean way to +implement a multithreaded compiler would be for the PassManager class +to create multiple instances of each pass object, and allow the separate +instances to be hacking on different parts of the program at the same time.

    + +

    This implementation would prevent each of the passes from having to implement +multithreaded constructs, requiring only the LLVM core to have locking in a few +places (for global resources). Although this is a simple extension, we simply +haven't had time (or multiprocessor machines, thus a reason) to implement this. +Despite that, we have kept the LLVM passes SMP ready, and you should too.

    + +
    + + +
    +
    + Chris Lattner
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-07-21 16:07:00 -0700 (Wed, 21 Jul 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/doxygen.cfg.in URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/doxygen.cfg.in?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/doxygen.cfg.in (added) +++ www-releases/trunk/2.8/docs/doxygen.cfg.in Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1419 @@ +# Doxyfile 1.5.6 + +# This file describes the settings to be used by the documentation system +# doxygen (www.doxygen.org) for a project +# +# All text after a hash (#) is considered a comment and will be ignored +# The format is: +# TAG = value [value, ...] +# For lists items can also be appended using: +# TAG += value [value, ...] +# Values that contain spaces should be placed between quotes (" ") + +#--------------------------------------------------------------------------- +# Project related configuration options +#--------------------------------------------------------------------------- + +# This tag specifies the encoding used for all characters in the config file +# that follow. The default is UTF-8 which is also the encoding used for all +# text before the first occurrence of this tag. Doxygen uses libiconv (or the +# iconv built into libc) for the transcoding. See +# http://www.gnu.org/software/libiconv for the list of possible encodings. + +DOXYFILE_ENCODING = UTF-8 + +# The PROJECT_NAME tag is a single word (or a sequence of words surrounded +# by quotes) that should identify the project. + +PROJECT_NAME = LLVM + +# The PROJECT_NUMBER tag can be used to enter a project or revision number. +# This could be handy for archiving the generated documentation or +# if some version control system is used. + +PROJECT_NUMBER = @PACKAGE_VERSION@ + +# The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) +# base path where the generated documentation will be put. +# If a relative path is entered, it will be relative to the location +# where doxygen was started. 
If left blank the current directory will be used. + +OUTPUT_DIRECTORY = @abs_top_builddir@/docs/doxygen + +# If the CREATE_SUBDIRS tag is set to YES, then doxygen will create +# 4096 sub-directories (in 2 levels) under the output directory of each output +# format and will distribute the generated files over these directories. +# Enabling this option can be useful when feeding doxygen a huge amount of +# source files, where putting all generated files in the same directory would +# otherwise cause performance problems for the file system. + +CREATE_SUBDIRS = NO + +# The OUTPUT_LANGUAGE tag is used to specify the language in which all +# documentation generated by doxygen is written. Doxygen will use this +# information to generate all constant output in the proper language. +# The default language is English, other supported languages are: +# Afrikaans, Arabic, Brazilian, Catalan, Chinese, Chinese-Traditional, +# Croatian, Czech, Danish, Dutch, Farsi, Finnish, French, German, Greek, +# Hungarian, Italian, Japanese, Japanese-en (Japanese with English messages), +# Korean, Korean-en, Lithuanian, Norwegian, Macedonian, Persian, Polish, +# Portuguese, Romanian, Russian, Serbian, Slovak, Slovene, Spanish, Swedish, +# and Ukrainian. + +OUTPUT_LANGUAGE = English + +# If the BRIEF_MEMBER_DESC tag is set to YES (the default) Doxygen will +# include brief member descriptions after the members that are listed in +# the file and class documentation (similar to JavaDoc). +# Set to NO to disable this. + +BRIEF_MEMBER_DESC = YES + +# If the REPEAT_BRIEF tag is set to YES (the default) Doxygen will prepend +# the brief description of a member or function before the detailed description. +# Note: if both HIDE_UNDOC_MEMBERS and BRIEF_MEMBER_DESC are set to NO, the +# brief descriptions will be completely suppressed. + +REPEAT_BRIEF = YES + +# This tag implements a quasi-intelligent brief description abbreviator +# that is used to form the text in various listings. 
Each string +# in this list, if found as the leading text of the brief description, will be +# stripped from the text and the result after processing the whole list, is +# used as the annotated text. Otherwise, the brief description is used as-is. +# If left blank, the following values are used ("$name" is automatically +# replaced with the name of the entity): "The $name class" "The $name widget" +# "The $name file" "is" "provides" "specifies" "contains" +# "represents" "a" "an" "the" + +ABBREVIATE_BRIEF = + +# If the ALWAYS_DETAILED_SEC and REPEAT_BRIEF tags are both set to YES then +# Doxygen will generate a detailed section even if there is only a brief +# description. + +ALWAYS_DETAILED_SEC = NO + +# If the INLINE_INHERITED_MEMB tag is set to YES, doxygen will show all +# inherited members of a class in the documentation of that class as if those +# members were ordinary class members. Constructors, destructors and assignment +# operators of the base classes will not be shown. + +INLINE_INHERITED_MEMB = NO + +# If the FULL_PATH_NAMES tag is set to YES then Doxygen will prepend the full +# path before files name in the file list and in the header files. If set +# to NO the shortest path that makes the file name unique will be used. + +FULL_PATH_NAMES = NO + +# If the FULL_PATH_NAMES tag is set to YES then the STRIP_FROM_PATH tag +# can be used to strip a user-defined part of the path. Stripping is +# only done if one of the specified strings matches the left-hand part of +# the path. The tag can be used to show relative paths in the file list. +# If left blank the directory from which doxygen is run is used as the +# path to strip. + +STRIP_FROM_PATH = ../.. + +# The STRIP_FROM_INC_PATH tag can be used to strip a user-defined part of +# the path mentioned in the documentation of a class, which tells +# the reader which header file to include in order to use a class. +# If left blank only the name of the header file containing the class +# definition is used. 
Otherwise one should specify the include paths that +# are normally passed to the compiler using the -I flag. + +STRIP_FROM_INC_PATH = + +# If the SHORT_NAMES tag is set to YES, doxygen will generate much shorter +# (but less readable) file names. This can be useful is your file systems +# doesn't support long names like on DOS, Mac, or CD-ROM. + +SHORT_NAMES = NO + +# If the JAVADOC_AUTOBRIEF tag is set to YES then Doxygen +# will interpret the first line (until the first dot) of a JavaDoc-style +# comment as the brief description. If set to NO, the JavaDoc +# comments will behave just like regular Qt-style comments +# (thus requiring an explicit @brief command for a brief description.) + +JAVADOC_AUTOBRIEF = NO + +# If the QT_AUTOBRIEF tag is set to YES then Doxygen will +# interpret the first line (until the first dot) of a Qt-style +# comment as the brief description. If set to NO, the comments +# will behave just like regular Qt-style comments (thus requiring +# an explicit \brief command for a brief description.) + +QT_AUTOBRIEF = NO + +# The MULTILINE_CPP_IS_BRIEF tag can be set to YES to make Doxygen +# treat a multi-line C++ special comment block (i.e. a block of //! or /// +# comments) as a brief description. This used to be the default behaviour. +# The new default is to treat a multi-line C++ comment block as a detailed +# description. Set this tag to YES if you prefer the old behaviour instead. + +MULTILINE_CPP_IS_BRIEF = NO + +# If the DETAILS_AT_TOP tag is set to YES then Doxygen +# will output the detailed description near the top, like JavaDoc. +# If set to NO, the detailed description appears after the member +# documentation. + +DETAILS_AT_TOP = NO + +# If the INHERIT_DOCS tag is set to YES (the default) then an undocumented +# member inherits the documentation from any documented member that it +# re-implements. + +INHERIT_DOCS = YES + +# If the SEPARATE_MEMBER_PAGES tag is set to YES, then doxygen will produce +# a new page for each member. 
If set to NO, the documentation of a member will +# be part of the file/class/namespace that contains it. + +SEPARATE_MEMBER_PAGES = NO + +# The TAB_SIZE tag can be used to set the number of spaces in a tab. +# Doxygen uses this value to replace tabs by spaces in code fragments. + +TAB_SIZE = 2 + +# This tag can be used to specify a number of aliases that acts +# as commands in the documentation. An alias has the form "name=value". +# For example adding "sideeffect=\par Side Effects:\n" will allow you to +# put the command \sideeffect (or @sideeffect) in the documentation, which +# will result in a user-defined paragraph with heading "Side Effects:". +# You can put \n's in the value part of an alias to insert newlines. + +ALIASES = + +# Set the OPTIMIZE_OUTPUT_FOR_C tag to YES if your project consists of C +# sources only. Doxygen will then generate output that is more tailored for C. +# For instance, some of the names that are used will be different. The list +# of all members will be omitted, etc. + +OPTIMIZE_OUTPUT_FOR_C = NO + +# Set the OPTIMIZE_OUTPUT_JAVA tag to YES if your project consists of Java +# sources only. Doxygen will then generate output that is more tailored for +# Java. For instance, namespaces will be presented as packages, qualified +# scopes will look different, etc. + +OPTIMIZE_OUTPUT_JAVA = NO + +# Set the OPTIMIZE_FOR_FORTRAN tag to YES if your project consists of Fortran +# sources only. Doxygen will then generate output that is more tailored for +# Fortran. + +OPTIMIZE_FOR_FORTRAN = NO + +# Set the OPTIMIZE_OUTPUT_VHDL tag to YES if your project consists of VHDL +# sources. Doxygen will then generate output that is tailored for +# VHDL. + +OPTIMIZE_OUTPUT_VHDL = NO + +# If you use STL classes (i.e. std::string, std::vector, etc.) 
but do not want +# to include (a tag file for) the STL sources as input, then you should +# set this tag to YES in order to let doxygen match functions declarations and +# definitions whose arguments contain STL classes (e.g. func(std::string); v.s. +# func(std::string) {}). This also make the inheritance and collaboration +# diagrams that involve STL classes more complete and accurate. + +BUILTIN_STL_SUPPORT = NO + +# If you use Microsoft's C++/CLI language, you should set this option to YES to +# enable parsing support. + +CPP_CLI_SUPPORT = NO + +# Set the SIP_SUPPORT tag to YES if your project consists of sip sources only. +# Doxygen will parse them like normal C++ but will assume all classes use public +# instead of private inheritance when no explicit protection keyword is present. + +SIP_SUPPORT = NO + +# For Microsoft's IDL there are propget and propput attributes to indicate getter +# and setter methods for a property. Setting this option to YES (the default) +# will make doxygen to replace the get and set methods by a property in the +# documentation. This will only work if the methods are indeed getting or +# setting a simple type. If this is not the case, or you want to show the +# methods anyway, you should set this option to NO. + +IDL_PROPERTY_SUPPORT = YES + +# If member grouping is used in the documentation and the DISTRIBUTE_GROUP_DOC +# tag is set to YES, then doxygen will reuse the documentation of the first +# member in the group (if any) for the other members of the group. By default +# all members of a group must be documented explicitly. + +DISTRIBUTE_GROUP_DOC = NO + +# Set the SUBGROUPING tag to YES (the default) to allow class member groups of +# the same type (for instance a group of public functions) to be put as a +# subgroup of that type (e.g. under the Public Functions section). Set it to +# NO to prevent subgrouping. Alternatively, this can be done per class using +# the \nosubgrouping command. 
+ +SUBGROUPING = YES + +# When TYPEDEF_HIDES_STRUCT is enabled, a typedef of a struct, union, or enum +# is documented as struct, union, or enum with the name of the typedef. So +# typedef struct TypeS {} TypeT, will appear in the documentation as a struct +# with name TypeT. When disabled the typedef will appear as a member of a file, +# namespace, or class. And the struct will be named TypeS. This can typically +# be useful for C code in case the coding convention dictates that all compound +# types are typedef'ed and only the typedef is referenced, never the tag name. + +TYPEDEF_HIDES_STRUCT = NO + +#--------------------------------------------------------------------------- +# Build related configuration options +#--------------------------------------------------------------------------- + +# If the EXTRACT_ALL tag is set to YES doxygen will assume all entities in +# documentation are documented, even if no documentation was available. +# Private class members and static file members will be hidden unless +# the EXTRACT_PRIVATE and EXTRACT_STATIC tags are set to YES + +EXTRACT_ALL = YES + +# If the EXTRACT_PRIVATE tag is set to YES all private members of a class +# will be included in the documentation. + +EXTRACT_PRIVATE = NO + +# If the EXTRACT_STATIC tag is set to YES all static members of a file +# will be included in the documentation. + +EXTRACT_STATIC = YES + +# If the EXTRACT_LOCAL_CLASSES tag is set to YES classes (and structs) +# defined locally in source files will be included in the documentation. +# If set to NO only classes defined in header files are included. + +EXTRACT_LOCAL_CLASSES = YES + +# This flag is only useful for Objective-C code. When set to YES local +# methods, which are defined in the implementation section but not in +# the interface are included in the documentation. +# If set to NO (the default) only methods in the interface are included. 
+ +EXTRACT_LOCAL_METHODS = NO + +# If this flag is set to YES, the members of anonymous namespaces will be +# extracted and appear in the documentation as a namespace called +# 'anonymous_namespace{file}', where file will be replaced with the base +# name of the file that contains the anonymous namespace. By default +# anonymous namespace are hidden. + +EXTRACT_ANON_NSPACES = NO + +# If the HIDE_UNDOC_MEMBERS tag is set to YES, Doxygen will hide all +# undocumented members of documented classes, files or namespaces. +# If set to NO (the default) these members will be included in the +# various overviews, but no documentation section is generated. +# This option has no effect if EXTRACT_ALL is enabled. + +HIDE_UNDOC_MEMBERS = NO + +# If the HIDE_UNDOC_CLASSES tag is set to YES, Doxygen will hide all +# undocumented classes that are normally visible in the class hierarchy. +# If set to NO (the default) these classes will be included in the various +# overviews. This option has no effect if EXTRACT_ALL is enabled. + +HIDE_UNDOC_CLASSES = NO + +# If the HIDE_FRIEND_COMPOUNDS tag is set to YES, Doxygen will hide all +# friend (class|struct|union) declarations. +# If set to NO (the default) these declarations will be included in the +# documentation. + +HIDE_FRIEND_COMPOUNDS = NO + +# If the HIDE_IN_BODY_DOCS tag is set to YES, Doxygen will hide any +# documentation blocks found inside the body of a function. +# If set to NO (the default) these blocks will be appended to the +# function's detailed documentation block. + +HIDE_IN_BODY_DOCS = NO + +# The INTERNAL_DOCS tag determines if documentation +# that is typed after a \internal command is included. If the tag is set +# to NO (the default) then the documentation will be excluded. +# Set it to YES to include the internal documentation. + +INTERNAL_DOCS = NO + +# If the CASE_SENSE_NAMES tag is set to NO then Doxygen will only generate +# file names in lower-case letters. 
If set to YES upper-case letters are also +# allowed. This is useful if you have classes or files whose names only differ +# in case and if your file system supports case sensitive file names. Windows +# and Mac users are advised to set this option to NO. + +CASE_SENSE_NAMES = YES + +# If the HIDE_SCOPE_NAMES tag is set to NO (the default) then Doxygen +# will show members with their full class and namespace scopes in the +# documentation. If set to YES the scope will be hidden. + +HIDE_SCOPE_NAMES = NO + +# If the SHOW_INCLUDE_FILES tag is set to YES (the default) then Doxygen +# will put a list of the files that are included by a file in the documentation +# of that file. + +SHOW_INCLUDE_FILES = YES + +# If the INLINE_INFO tag is set to YES (the default) then a tag [inline] +# is inserted in the documentation for inline members. + +INLINE_INFO = YES + +# If the SORT_MEMBER_DOCS tag is set to YES (the default) then doxygen +# will sort the (detailed) documentation of file and class members +# alphabetically by member name. If set to NO the members will appear in +# declaration order. + +SORT_MEMBER_DOCS = YES + +# If the SORT_BRIEF_DOCS tag is set to YES then doxygen will sort the +# brief documentation of file, namespace and class members alphabetically +# by member name. If set to NO (the default) the members will appear in +# declaration order. + +SORT_BRIEF_DOCS = NO + +# If the SORT_GROUP_NAMES tag is set to YES then doxygen will sort the +# hierarchy of group names into alphabetical order. If set to NO (the default) +# the group names will appear in their defined order. + +SORT_GROUP_NAMES = NO + +# If the SORT_BY_SCOPE_NAME tag is set to YES, the class list will be +# sorted by fully-qualified names, including namespaces. If set to +# NO (the default), the class list will be sorted only by class name, +# not including the namespace part. +# Note: This option is not very useful if HIDE_SCOPE_NAMES is set to YES. 
+# Note: This option applies only to the class list, not to the +# alphabetical list. + +SORT_BY_SCOPE_NAME = NO + +# The GENERATE_TODOLIST tag can be used to enable (YES) or +# disable (NO) the todo list. This list is created by putting \todo +# commands in the documentation. + +GENERATE_TODOLIST = YES + +# The GENERATE_TESTLIST tag can be used to enable (YES) or +# disable (NO) the test list. This list is created by putting \test +# commands in the documentation. + +GENERATE_TESTLIST = YES + +# The GENERATE_BUGLIST tag can be used to enable (YES) or +# disable (NO) the bug list. This list is created by putting \bug +# commands in the documentation. + +GENERATE_BUGLIST = YES + +# The GENERATE_DEPRECATEDLIST tag can be used to enable (YES) or +# disable (NO) the deprecated list. This list is created by putting +# \deprecated commands in the documentation. + +GENERATE_DEPRECATEDLIST= YES + +# The ENABLED_SECTIONS tag can be used to enable conditional +# documentation sections, marked by \if sectionname ... \endif. + +ENABLED_SECTIONS = + +# The MAX_INITIALIZER_LINES tag determines the maximum number of lines +# the initial value of a variable or define consists of for it to appear in +# the documentation. If the initializer consists of more lines than specified +# here it will be hidden. Use a value of 0 to hide initializers completely. +# The appearance of the initializer of individual variables and defines in the +# documentation can be controlled using \showinitializer or \hideinitializer +# command in the documentation regardless of this setting. + +MAX_INITIALIZER_LINES = 30 + +# Set the SHOW_USED_FILES tag to NO to disable the list of files generated +# at the bottom of the documentation of classes and structs. If set to YES the +# list will mention the files that were used to generate the documentation. 
+ +SHOW_USED_FILES = YES + +# If the sources in your project are distributed over multiple directories +# then setting the SHOW_DIRECTORIES tag to YES will show the directory hierarchy +# in the documentation. The default is NO. + +SHOW_DIRECTORIES = YES + +# Set the SHOW_FILES tag to NO to disable the generation of the Files page. +# This will remove the Files entry from the Quick Index and from the +# Folder Tree View (if specified). The default is YES. + +SHOW_FILES = YES + +# Set the SHOW_NAMESPACES tag to NO to disable the generation of the +# Namespaces page. This will remove the Namespaces entry from the Quick Index +# and from the Folder Tree View (if specified). The default is YES. + +SHOW_NAMESPACES = YES + +# The FILE_VERSION_FILTER tag can be used to specify a program or script that +# doxygen should invoke to get the current version for each file (typically from +# the version control system). Doxygen will invoke the program by executing (via +# popen()) the command <command> <input-file>, where <command> is the value of +# the FILE_VERSION_FILTER tag, and <input-file> is the name of an input file +# provided by doxygen. Whatever the program writes to standard output +# is used as the file version. See the manual for examples. + +FILE_VERSION_FILTER = + +#--------------------------------------------------------------------------- +# configuration options related to warning and progress messages +#--------------------------------------------------------------------------- + +# The QUIET tag can be used to turn on/off the messages that are generated +# by doxygen. Possible values are YES and NO. If left blank NO is used. + +QUIET = NO + +# The WARNINGS tag can be used to turn on/off the warning messages that are +# generated by doxygen. Possible values are YES and NO. If left blank +# NO is used. + +WARNINGS = NO + +# If WARN_IF_UNDOCUMENTED is set to YES, then doxygen will generate warnings +# for undocumented members. If EXTRACT_ALL is set to YES then this flag will +# automatically be disabled. 
+ +WARN_IF_UNDOCUMENTED = NO + +# If WARN_IF_DOC_ERROR is set to YES, doxygen will generate warnings for +# potential errors in the documentation, such as not documenting some +# parameters in a documented function, or documenting parameters that +# don't exist or using markup commands wrongly. + +WARN_IF_DOC_ERROR = YES + +# This WARN_NO_PARAMDOC option can be abled to get warnings for +# functions that are documented, but have no documentation for their parameters +# or return value. If set to NO (the default) doxygen will only warn about +# wrong or incomplete parameter documentation, but not about the absence of +# documentation. + +WARN_NO_PARAMDOC = NO + +# The WARN_FORMAT tag determines the format of the warning messages that +# doxygen can produce. The string should contain the $file, $line, and $text +# tags, which will be replaced by the file and line number from which the +# warning originated and the warning text. Optionally the format may contain +# $version, which will be replaced by the version of the file (if it could +# be obtained via FILE_VERSION_FILTER) + +WARN_FORMAT = + +# The WARN_LOGFILE tag can be used to specify a file to which warning +# and error messages should be written. If left blank the output is written +# to stderr. + +WARN_LOGFILE = + +#--------------------------------------------------------------------------- +# configuration options related to the input files +#--------------------------------------------------------------------------- + +# The INPUT tag can be used to specify the files and/or directories that contain +# documented source files. You may enter file names like "myfile.cpp" or +# directories like "/usr/src/myproject". Separate the files or directories +# with spaces. + +INPUT = @abs_top_srcdir@/include \ + @abs_top_srcdir@/lib \ + @abs_top_srcdir@/docs/doxygen.intro + +# This tag can be used to specify the character encoding of the source files +# that doxygen parses. 
Internally doxygen uses the UTF-8 encoding, which is +# also the default input encoding. Doxygen uses libiconv (or the iconv built +# into libc) for the transcoding. See http://www.gnu.org/software/libiconv for +# the list of possible encodings. + +INPUT_ENCODING = UTF-8 + +# If the value of the INPUT tag contains directories, you can use the +# FILE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp +# and *.h) to filter out the source-files in the directories. If left +# blank the following patterns are tested: +# *.c *.cc *.cxx *.cpp *.c++ *.java *.ii *.ixx *.ipp *.i++ *.inl *.h *.hh *.hxx +# *.hpp *.h++ *.idl *.odl *.cs *.php *.php3 *.inc *.m *.mm *.py *.f90 + +FILE_PATTERNS = + +# The RECURSIVE tag can be used to turn specify whether or not subdirectories +# should be searched for input files as well. Possible values are YES and NO. +# If left blank NO is used. + +RECURSIVE = YES + +# The EXCLUDE tag can be used to specify files and/or directories that should +# excluded from the INPUT source files. This way you can easily exclude a +# subdirectory from a directory tree whose root is specified with the INPUT tag. + +EXCLUDE = + +# The EXCLUDE_SYMLINKS tag can be used select whether or not files or +# directories that are symbolic links (a Unix filesystem feature) are excluded +# from the input. + +EXCLUDE_SYMLINKS = NO + +# If the value of the INPUT tag contains directories, you can use the +# EXCLUDE_PATTERNS tag to specify one or more wildcard patterns to exclude +# certain files from those directories. Note that the wildcards are matched +# against the file with absolute path, so to exclude all test directories +# for example use the pattern */test/* + +EXCLUDE_PATTERNS = + +# The EXCLUDE_SYMBOLS tag can be used to specify one or more symbol names +# (namespaces, classes, functions, etc.) that should be excluded from the +# output. The symbol name can be a fully qualified name, a word, or if the +# wildcard * is used, a substring. 
Examples: ANamespace, AClass, +# AClass::ANamespace, ANamespace::*Test + +EXCLUDE_SYMBOLS = + +# The EXAMPLE_PATH tag can be used to specify one or more files or +# directories that contain example code fragments that are included (see +# the \include command). + +EXAMPLE_PATH = @abs_top_srcdir@/examples + +# If the value of the EXAMPLE_PATH tag contains directories, you can use the +# EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp +# and *.h) to filter out the source-files in the directories. If left +# blank all files are included. + +EXAMPLE_PATTERNS = + +# If the EXAMPLE_RECURSIVE tag is set to YES then subdirectories will be +# searched for input files to be used with the \include or \dontinclude +# commands irrespective of the value of the RECURSIVE tag. +# Possible values are YES and NO. If left blank NO is used. + +EXAMPLE_RECURSIVE = YES + +# The IMAGE_PATH tag can be used to specify one or more files or +# directories that contain image that are included in the documentation (see +# the \image command). + +IMAGE_PATH = @abs_top_srcdir@/docs/img + +# The INPUT_FILTER tag can be used to specify a program that doxygen should +# invoke to filter for each input file. Doxygen will invoke the filter program +# by executing (via popen()) the command <filter> <input-file>, where <filter> +# is the value of the INPUT_FILTER tag, and <input-file> is the name of an +# input file. Doxygen will then use the output that the filter program writes +# to standard output. If FILTER_PATTERNS is specified, this tag will be +# ignored. + +INPUT_FILTER = + +# The FILTER_PATTERNS tag can be used to specify filters on a per file pattern +# basis. Doxygen will compare the file name with each pattern and apply the +# filter if there is a match. The filters are a list of the form: +# pattern=filter (like *.cpp=my_cpp_filter). See INPUT_FILTER for further +# info on how filters are used. If FILTER_PATTERNS is empty, INPUT_FILTER +# is applied to all files. 
+ +FILTER_PATTERNS = + +# If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using +# INPUT_FILTER) will be used to filter the input files when producing source +# files to browse (i.e. when SOURCE_BROWSER is set to YES). + +FILTER_SOURCE_FILES = NO + +#--------------------------------------------------------------------------- +# configuration options related to source browsing +#--------------------------------------------------------------------------- + +# If the SOURCE_BROWSER tag is set to YES then a list of source files will +# be generated. Documented entities will be cross-referenced with these sources. +# Note: To get rid of all source code in the generated output, make sure also +# VERBATIM_HEADERS is set to NO. + +SOURCE_BROWSER = YES + +# Setting the INLINE_SOURCES tag to YES will include the body +# of functions and classes directly in the documentation. + +INLINE_SOURCES = NO + +# Setting the STRIP_CODE_COMMENTS tag to YES (the default) will instruct +# doxygen to hide any special comment blocks from generated source code +# fragments. Normal C and C++ comments will always remain visible. + +STRIP_CODE_COMMENTS = NO + +# If the REFERENCED_BY_RELATION tag is set to YES +# then for each documented function all documented +# functions referencing it will be listed. + +REFERENCED_BY_RELATION = YES + +# If the REFERENCES_RELATION tag is set to YES +# then for each documented function all documented entities +# called/used by that function will be listed. + +REFERENCES_RELATION = YES + +# If the REFERENCES_LINK_SOURCE tag is set to YES (the default) +# and SOURCE_BROWSER tag is set to YES, then the hyperlinks from +# functions in REFERENCES_RELATION and REFERENCED_BY_RELATION lists will +# link to the source code. Otherwise they will link to the documentstion. 
+ +REFERENCES_LINK_SOURCE = YES + +# If the USE_HTAGS tag is set to YES then the references to source code +# will point to the HTML generated by the htags(1) tool instead of doxygen +# built-in source browser. The htags tool is part of GNU's global source +# tagging system (see http://www.gnu.org/software/global/global.html). You +# will need version 4.8.6 or higher. + +USE_HTAGS = NO + +# If the VERBATIM_HEADERS tag is set to YES (the default) then Doxygen +# will generate a verbatim copy of the header file for each class for +# which an include is specified. Set to NO to disable this. + +VERBATIM_HEADERS = YES + +#--------------------------------------------------------------------------- +# configuration options related to the alphabetical class index +#--------------------------------------------------------------------------- + +# If the ALPHABETICAL_INDEX tag is set to YES, an alphabetical index +# of all compounds will be generated. Enable this if the project +# contains a lot of classes, structs, unions or interfaces. + +ALPHABETICAL_INDEX = YES + +# If the alphabetical index is enabled (see ALPHABETICAL_INDEX) then +# the COLS_IN_ALPHA_INDEX tag can be used to specify the number of columns +# in which this list will be split (can be a number in the range [1..20]) + +COLS_IN_ALPHA_INDEX = 4 + +# In case all classes in a project start with a common prefix, all +# classes will be put under the same header in the alphabetical index. +# The IGNORE_PREFIX tag can be used to specify one or more prefixes that +# should be ignored while generating the index headers. + +IGNORE_PREFIX = llvm:: + +#--------------------------------------------------------------------------- +# configuration options related to the HTML output +#--------------------------------------------------------------------------- + +# If the GENERATE_HTML tag is set to YES (the default) Doxygen will +# generate HTML output. 
+ +GENERATE_HTML = YES + +# The HTML_OUTPUT tag is used to specify where the HTML docs will be put. +# If a relative path is entered the value of OUTPUT_DIRECTORY will be +# put in front of it. If left blank `html' will be used as the default path. + +HTML_OUTPUT = html + +# The HTML_FILE_EXTENSION tag can be used to specify the file extension for +# each generated HTML page (for example: .htm,.php,.asp). If it is left blank +# doxygen will generate files with .html extension. + +HTML_FILE_EXTENSION = .html + +# The HTML_HEADER tag can be used to specify a personal HTML header for +# each generated HTML page. If it is left blank doxygen will generate a +# standard header. + +HTML_HEADER = @abs_top_srcdir@/docs/doxygen.header + +# The HTML_FOOTER tag can be used to specify a personal HTML footer for +# each generated HTML page. If it is left blank doxygen will generate a +# standard footer. + +HTML_FOOTER = @abs_top_srcdir@/docs/doxygen.footer + +# The HTML_STYLESHEET tag can be used to specify a user-defined cascading +# style sheet that is used by each HTML page. It can be used to +# fine-tune the look of the HTML output. If the tag is left blank doxygen +# will generate a default style sheet. Note that doxygen will try to copy +# the style sheet file to the HTML output directory, so don't put your own +# stylesheet in the HTML output directory as well, or it will be erased! + +HTML_STYLESHEET = @abs_top_srcdir@/docs/doxygen.css + +# If the HTML_ALIGN_MEMBERS tag is set to YES, the members of classes, +# files or namespaces will be aligned in HTML using tables. If set to +# NO a bullet list will be used. + +HTML_ALIGN_MEMBERS = YES + +# If the GENERATE_HTMLHELP tag is set to YES, additional index files +# will be generated that can be used as input for tools like the +# Microsoft HTML help workshop to generate a compiled HTML help file (.chm) +# of the generated HTML documentation. 
+ +GENERATE_HTMLHELP = NO + +# If the GENERATE_DOCSET tag is set to YES, additional index files +# will be generated that can be used as input for Apple's Xcode 3 +# integrated development environment, introduced with OSX 10.5 (Leopard). +# To create a documentation set, doxygen will generate a Makefile in the +# HTML output directory. Running make will produce the docset in that +# directory and running "make install" will install the docset in +# ~/Library/Developer/Shared/Documentation/DocSets so that Xcode will find +# it at startup. + +GENERATE_DOCSET = NO + +# When GENERATE_DOCSET tag is set to YES, this tag determines the name of the +# feed. A documentation feed provides an umbrella under which multiple +# documentation sets from a single provider (such as a company or product suite) +# can be grouped. + +DOCSET_FEEDNAME = "Doxygen generated docs" + +# When GENERATE_DOCSET tag is set to YES, this tag specifies a string that +# should uniquely identify the documentation set bundle. This should be a +# reverse domain-name style string, e.g. com.mycompany.MyDocSet. Doxygen +# will append .docset to the name. + +DOCSET_BUNDLE_ID = org.doxygen.Project + +# If the HTML_DYNAMIC_SECTIONS tag is set to YES then the generated HTML +# documentation will contain sections that can be hidden and shown after the +# page has loaded. For this to work a browser that supports +# JavaScript and DHTML is required (for instance Mozilla 1.0+, Firefox +# Netscape 6.0+, Internet explorer 5.0+, Konqueror, or Safari). + +HTML_DYNAMIC_SECTIONS = NO + +# If the GENERATE_HTMLHELP tag is set to YES, the CHM_FILE tag can +# be used to specify the file name of the resulting .chm file. You +# can add a path in front of the file if the result should not be +# written to the html output directory. + +CHM_FILE = + +# If the GENERATE_HTMLHELP tag is set to YES, the HHC_LOCATION tag can +# be used to specify the location (absolute path including file name) of +# the HTML help compiler (hhc.exe). 
If non-empty doxygen will try to run +# the HTML help compiler on the generated index.hhp. + +HHC_LOCATION = + +# If the GENERATE_HTMLHELP tag is set to YES, the GENERATE_CHI flag +# controls if a separate .chi index file is generated (YES) or that +# it should be included in the master .chm file (NO). + +GENERATE_CHI = NO + +# If the GENERATE_HTMLHELP tag is set to YES, the CHM_INDEX_ENCODING +# is used to encode HtmlHelp index (hhk), content (hhc) and project file +# content. + +CHM_INDEX_ENCODING = + +# If the GENERATE_HTMLHELP tag is set to YES, the BINARY_TOC flag +# controls whether a binary table of contents is generated (YES) or a +# normal table of contents (NO) in the .chm file. + +BINARY_TOC = NO + +# The TOC_EXPAND flag can be set to YES to add extra items for group members +# to the contents of the HTML help documentation and to the tree view. + +TOC_EXPAND = NO + +# The DISABLE_INDEX tag can be used to turn on/off the condensed index at +# top of each HTML page. The value NO (the default) enables the index and +# the value YES disables it. + +DISABLE_INDEX = NO + +# This tag can be used to set the number of enum values (range [1..20]) +# that doxygen will group on one line in the generated HTML documentation. + +ENUM_VALUES_PER_LINE = 4 + +# The GENERATE_TREEVIEW tag is used to specify whether a tree-like index +# structure should be generated to display hierarchical information. +# If the tag value is set to FRAME, a side panel will be generated +# containing a tree-like index structure (just like the one that +# is generated for HTML Help). For this to work a browser that supports +# JavaScript, DHTML, CSS and frames is required (for instance Mozilla 1.0+, +# Netscape 6.0+, Internet explorer 5.0+, or Konqueror). Windows users are +# probably better off using the HTML help feature. 
Other possible values +# for this tag are: HIERARCHIES, which will generate the Groups, Directories, +# and Class Hiererachy pages using a tree view instead of an ordered list; +# ALL, which combines the behavior of FRAME and HIERARCHIES; and NONE, which +# disables this behavior completely. For backwards compatibility with previous +# releases of Doxygen, the values YES and NO are equivalent to FRAME and NONE +# respectively. + +GENERATE_TREEVIEW = NO + +# If the treeview is enabled (see GENERATE_TREEVIEW) then this tag can be +# used to set the initial width (in pixels) of the frame in which the tree +# is shown. + +TREEVIEW_WIDTH = 250 + +# Use this tag to change the font size of Latex formulas included +# as images in the HTML documentation. The default is 10. Note that +# when you change the font size after a successful doxygen run you need +# to manually remove any form_*.png images from the HTML output directory +# to force them to be regenerated. + +FORMULA_FONTSIZE = 10 + +#--------------------------------------------------------------------------- +# configuration options related to the LaTeX output +#--------------------------------------------------------------------------- + +# If the GENERATE_LATEX tag is set to YES (the default) Doxygen will +# generate Latex output. + +GENERATE_LATEX = NO + +# The LATEX_OUTPUT tag is used to specify where the LaTeX docs will be put. +# If a relative path is entered the value of OUTPUT_DIRECTORY will be +# put in front of it. If left blank `latex' will be used as the default path. + +LATEX_OUTPUT = + +# The LATEX_CMD_NAME tag can be used to specify the LaTeX command name to be +# invoked. If left blank `latex' will be used as the default command name. + +LATEX_CMD_NAME = latex + +# The MAKEINDEX_CMD_NAME tag can be used to specify the command name to +# generate index for LaTeX. If left blank `makeindex' will be used as the +# default command name. 
+ +MAKEINDEX_CMD_NAME = makeindex + +# If the COMPACT_LATEX tag is set to YES Doxygen generates more compact +# LaTeX documents. This may be useful for small projects and may help to +# save some trees in general. + +COMPACT_LATEX = NO + +# The PAPER_TYPE tag can be used to set the paper type that is used +# by the printer. Possible values are: a4, a4wide, letter, legal and +# executive. If left blank a4wide will be used. + +PAPER_TYPE = letter + +# The EXTRA_PACKAGES tag can be to specify one or more names of LaTeX +# packages that should be included in the LaTeX output. + +EXTRA_PACKAGES = + +# The LATEX_HEADER tag can be used to specify a personal LaTeX header for +# the generated latex document. The header should contain everything until +# the first chapter. If it is left blank doxygen will generate a +# standard header. Notice: only use this tag if you know what you are doing! + +LATEX_HEADER = + +# If the PDF_HYPERLINKS tag is set to YES, the LaTeX that is generated +# is prepared for conversion to pdf (using ps2pdf). The pdf file will +# contain links (just like the HTML output) instead of page references +# This makes the output suitable for online browsing using a pdf viewer. + +PDF_HYPERLINKS = NO + +# If the USE_PDFLATEX tag is set to YES, pdflatex will be used instead of +# plain latex in the generated Makefile. Set this option to YES to get a +# higher quality PDF documentation. + +USE_PDFLATEX = NO + +# If the LATEX_BATCHMODE tag is set to YES, doxygen will add the \\batchmode. +# command to the generated LaTeX files. This will instruct LaTeX to keep +# running if errors occur, instead of asking the user for help. +# This option is also used when generating formulas in HTML. + +LATEX_BATCHMODE = NO + +# If LATEX_HIDE_INDICES is set to YES then doxygen will not +# include the index chapters (such as File Index, Compound Index, etc.) +# in the output. 
+ +LATEX_HIDE_INDICES = NO + +#--------------------------------------------------------------------------- +# configuration options related to the RTF output +#--------------------------------------------------------------------------- + +# If the GENERATE_RTF tag is set to YES Doxygen will generate RTF output +# The RTF output is optimized for Word 97 and may not look very pretty with +# other RTF readers or editors. + +GENERATE_RTF = NO + +# The RTF_OUTPUT tag is used to specify where the RTF docs will be put. +# If a relative path is entered the value of OUTPUT_DIRECTORY will be +# put in front of it. If left blank `rtf' will be used as the default path. + +RTF_OUTPUT = + +# If the COMPACT_RTF tag is set to YES Doxygen generates more compact +# RTF documents. This may be useful for small projects and may help to +# save some trees in general. + +COMPACT_RTF = NO + +# If the RTF_HYPERLINKS tag is set to YES, the RTF that is generated +# will contain hyperlink fields. The RTF file will +# contain links (just like the HTML output) instead of page references. +# This makes the output suitable for online browsing using WORD or other +# programs which support those fields. +# Note: wordpad (write) and others do not support links. + +RTF_HYPERLINKS = NO + +# Load stylesheet definitions from file. Syntax is similar to doxygen's +# config file, i.e. a series of assignments. You only have to provide +# replacements, missing definitions are set to their default value. + +RTF_STYLESHEET_FILE = + +# Set optional variables used in the generation of an rtf document. +# Syntax is similar to doxygen's config file. 
+ +RTF_EXTENSIONS_FILE = + +#--------------------------------------------------------------------------- +# configuration options related to the man page output +#--------------------------------------------------------------------------- + +# If the GENERATE_MAN tag is set to YES (the default) Doxygen will +# generate man pages + +GENERATE_MAN = NO + +# The MAN_OUTPUT tag is used to specify where the man pages will be put. +# If a relative path is entered the value of OUTPUT_DIRECTORY will be +# put in front of it. If left blank `man' will be used as the default path. + +MAN_OUTPUT = + +# The MAN_EXTENSION tag determines the extension that is added to +# the generated man pages (default is the subroutine's section .3) + +MAN_EXTENSION = + +# If the MAN_LINKS tag is set to YES and Doxygen generates man output, +# then it will generate one additional man file for each entity +# documented in the real man page(s). These additional files +# only source the real man page, but without them the man command +# would be unable to find the correct page. The default is NO. + +MAN_LINKS = NO + +#--------------------------------------------------------------------------- +# configuration options related to the XML output +#--------------------------------------------------------------------------- + +# If the GENERATE_XML tag is set to YES Doxygen will +# generate an XML file that captures the structure of +# the code including all documentation. + +GENERATE_XML = NO + +# The XML_OUTPUT tag is used to specify where the XML pages will be put. +# If a relative path is entered the value of OUTPUT_DIRECTORY will be +# put in front of it. If left blank `xml' will be used as the default path. + +XML_OUTPUT = xml + +# The XML_SCHEMA tag can be used to specify an XML schema, +# which can be used by a validating XML parser to check the +# syntax of the XML files. 
+ +XML_SCHEMA = + +# The XML_DTD tag can be used to specify an XML DTD, +# which can be used by a validating XML parser to check the +# syntax of the XML files. + +XML_DTD = + +# If the XML_PROGRAMLISTING tag is set to YES Doxygen will +# dump the program listings (including syntax highlighting +# and cross-referencing information) to the XML output. Note that +# enabling this will significantly increase the size of the XML output. + +XML_PROGRAMLISTING = YES + +#--------------------------------------------------------------------------- +# configuration options for the AutoGen Definitions output +#--------------------------------------------------------------------------- + +# If the GENERATE_AUTOGEN_DEF tag is set to YES Doxygen will +# generate an AutoGen Definitions (see autogen.sf.net) file +# that captures the structure of the code including all +# documentation. Note that this feature is still experimental +# and incomplete at the moment. + +GENERATE_AUTOGEN_DEF = NO + +#--------------------------------------------------------------------------- +# configuration options related to the Perl module output +#--------------------------------------------------------------------------- + +# If the GENERATE_PERLMOD tag is set to YES Doxygen will +# generate a Perl module file that captures the structure of +# the code including all documentation. Note that this +# feature is still experimental and incomplete at the +# moment. + +GENERATE_PERLMOD = NO + +# If the PERLMOD_LATEX tag is set to YES Doxygen will generate +# the necessary Makefile rules, Perl scripts and LaTeX code to be able +# to generate PDF and DVI output from the Perl module output. + +PERLMOD_LATEX = NO + +# If the PERLMOD_PRETTY tag is set to YES the Perl module output will be +# nicely formatted so it can be parsed by a human reader. This is useful +# if you want to understand what is going on. 
On the other hand, if this +# tag is set to NO the size of the Perl module output will be much smaller +# and Perl will parse it just the same. + +PERLMOD_PRETTY = YES + +# The names of the make variables in the generated doxyrules.make file +# are prefixed with the string contained in PERLMOD_MAKEVAR_PREFIX. +# This is useful so different doxyrules.make files included by the same +# Makefile don't overwrite each other's variables. + +PERLMOD_MAKEVAR_PREFIX = + +#--------------------------------------------------------------------------- +# Configuration options related to the preprocessor +#--------------------------------------------------------------------------- + +# If the ENABLE_PREPROCESSING tag is set to YES (the default) Doxygen will +# evaluate all C-preprocessor directives found in the sources and include +# files. + +ENABLE_PREPROCESSING = YES + +# If the MACRO_EXPANSION tag is set to YES Doxygen will expand all macro +# names in the source code. If set to NO (the default) only conditional +# compilation will be performed. Macro expansion can be done in a controlled +# way by setting EXPAND_ONLY_PREDEF to YES. + +MACRO_EXPANSION = NO + +# If the EXPAND_ONLY_PREDEF and MACRO_EXPANSION tags are both set to YES +# then the macro expansion is limited to the macros specified with the +# PREDEFINED and EXPAND_AS_DEFINED tags. + +EXPAND_ONLY_PREDEF = NO + +# If the SEARCH_INCLUDES tag is set to YES (the default) the includes files +# in the INCLUDE_PATH (see below) will be search if a #include is found. + +SEARCH_INCLUDES = YES + +# The INCLUDE_PATH tag can be used to specify one or more directories that +# contain include files that are not input files but should be processed by +# the preprocessor. + +INCLUDE_PATH = ../include + +# You can use the INCLUDE_FILE_PATTERNS tag to specify one or more wildcard +# patterns (like *.h and *.hpp) to filter out the header-files in the +# directories. 
If left blank, the patterns specified with FILE_PATTERNS will +# be used. + +INCLUDE_FILE_PATTERNS = + +# The PREDEFINED tag can be used to specify one or more macro names that +# are defined before the preprocessor is started (similar to the -D option of +# gcc). The argument of the tag is a list of macros of the form: name +# or name=definition (no spaces). If the definition and the = are +# omitted =1 is assumed. To prevent a macro definition from being +# undefined via #undef or recursively expanded use the := operator +# instead of the = operator. + +PREDEFINED = + +# If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then +# this tag can be used to specify a list of macro names that should be expanded. +# The macro definition that is found in the sources will be used. +# Use the PREDEFINED tag if you want to use a different macro definition. + +EXPAND_AS_DEFINED = + +# If the SKIP_FUNCTION_MACROS tag is set to YES (the default) then +# doxygen's preprocessor will remove all function-like macros that are alone +# on a line, have an all uppercase name, and do not end with a semicolon. Such +# function macros are typically used for boiler-plate code, and will confuse +# the parser if not removed. + +SKIP_FUNCTION_MACROS = YES + +#--------------------------------------------------------------------------- +# Configuration::additions related to external references +#--------------------------------------------------------------------------- + +# The TAGFILES option can be used to specify one or more tagfiles. +# Optionally an initial location of the external documentation +# can be added for each tagfile. The format of a tag file without +# this location is as follows: +# TAGFILES = file1 file2 ... +# Adding location for the tag files is done as follows: +# TAGFILES = file1=loc1 "file2 = loc2" ... +# where "loc1" and "loc2" can be relative or absolute paths or +# URLs. 
If a location is present for each tag, the installdox tool +# does not have to be run to correct the links. +# Note that each tag file must have a unique name +# (where the name does NOT include the path) +# If a tag file is not located in the directory in which doxygen +# is run, you must also specify the path to the tagfile here. + +TAGFILES = + +# When a file name is specified after GENERATE_TAGFILE, doxygen will create +# a tag file that is based on the input files it reads. + +GENERATE_TAGFILE = + +# If the ALLEXTERNALS tag is set to YES all external classes will be listed +# in the class index. If set to NO only the inherited external classes +# will be listed. + +ALLEXTERNALS = YES + +# If the EXTERNAL_GROUPS tag is set to YES all external groups will be listed +# in the modules index. If set to NO, only the current project's groups will +# be listed. + +EXTERNAL_GROUPS = YES + +# The PERL_PATH should be the absolute path and name of the perl script +# interpreter (i.e. the result of `which perl'). + +PERL_PATH = + +#--------------------------------------------------------------------------- +# Configuration options related to the dot tool +#--------------------------------------------------------------------------- + +# If the CLASS_DIAGRAMS tag is set to YES (the default) Doxygen will +# generate a inheritance diagram (in HTML, RTF and LaTeX) for classes with base +# or super classes. Setting the tag to NO turns the diagrams off. Note that +# this option is superseded by the HAVE_DOT option below. This is only a +# fallback. It is recommended to install and use dot, since it yields more +# powerful graphs. + +CLASS_DIAGRAMS = YES + +# You can define message sequence charts within doxygen comments using the \msc +# command. Doxygen will then run the mscgen tool (see +# http://www.mcternan.me.uk/mscgen/) to produce the chart and insert it in the +# documentation. The MSCGEN_PATH tag allows you to specify the directory where +# the mscgen tool resides. 
If left empty the tool is assumed to be found in the +# default search path. + +MSCGEN_PATH = + +# If set to YES, the inheritance and collaboration graphs will hide +# inheritance and usage relations if the target is undocumented +# or is not a class. + +HIDE_UNDOC_RELATIONS = NO + +# If you set the HAVE_DOT tag to YES then doxygen will assume the dot tool is +# available from the path. This tool is part of Graphviz, a graph visualization +# toolkit from AT&T and Lucent Bell Labs. The other options in this section +# have no effect if this option is set to NO (the default) + +HAVE_DOT = YES + +# By default doxygen will write a font called FreeSans.ttf to the output +# directory and reference it in all dot files that doxygen generates. This +# font does not include all possible unicode characters however, so when you need +# these (or just want a differently looking font) you can specify the font name +# using DOT_FONTNAME. You need to make sure dot is able to find the font, +# which can be done by putting it in a standard location or by setting the +# DOTFONTPATH environment variable or by setting DOT_FONTPATH to the directory +# containing the font. + +DOT_FONTNAME = FreeSans + +# By default doxygen will tell dot to use the output directory to look for the +# FreeSans.ttf font (which doxygen will put there itself). If you specify a +# different font using DOT_FONTNAME you can set the path where dot +# can find it using this tag. + +DOT_FONTPATH = + +# If the CLASS_GRAPH and HAVE_DOT tags are set to YES then doxygen +# will generate a graph for each documented class showing the direct and +# indirect inheritance relations. Setting this tag to YES will force the +# CLASS_DIAGRAMS tag to NO.
+ +CLASS_GRAPH = YES + +# If the COLLABORATION_GRAPH and HAVE_DOT tags are set to YES then doxygen +# will generate a graph for each documented class showing the direct and +# indirect implementation dependencies (inheritance, containment, and +# class references variables) of the class with other documented classes. + +COLLABORATION_GRAPH = YES + +# If the GROUP_GRAPHS and HAVE_DOT tags are set to YES then doxygen +# will generate a graph for groups, showing the direct groups dependencies + +GROUP_GRAPHS = YES + +# If the UML_LOOK tag is set to YES doxygen will generate inheritance and +# collaboration diagrams in a style similar to the OMG's Unified Modeling +# Language. + +UML_LOOK = NO + +# If set to YES, the inheritance and collaboration graphs will show the +# relations between templates and their instances. + +TEMPLATE_RELATIONS = YES + +# If the ENABLE_PREPROCESSING, SEARCH_INCLUDES, INCLUDE_GRAPH, and HAVE_DOT +# tags are set to YES then doxygen will generate a graph for each documented +# file showing the direct and indirect include dependencies of the file with +# other documented files. + +INCLUDE_GRAPH = YES + +# If the ENABLE_PREPROCESSING, SEARCH_INCLUDES, INCLUDED_BY_GRAPH, and +# HAVE_DOT tags are set to YES then doxygen will generate a graph for each +# documented header file showing the documented files that directly or +# indirectly include this file. + +INCLUDED_BY_GRAPH = YES + +# If the CALL_GRAPH and HAVE_DOT options are set to YES then +# doxygen will generate a call dependency graph for every global function +# or class method. Note that enabling this option will significantly increase +# the time of a run. So in most cases it will be better to enable call graphs +# for selected functions only using the \callgraph command. + +CALL_GRAPH = NO + +# If the CALLER_GRAPH and HAVE_DOT tags are set to YES then +# doxygen will generate a caller dependency graph for every global function +# or class method. 
Note that enabling this option will significantly increase +# the time of a run. So in most cases it will be better to enable caller +# graphs for selected functions only using the \callergraph command. + +CALLER_GRAPH = NO + +# If the GRAPHICAL_HIERARCHY and HAVE_DOT tags are set to YES then doxygen +# will generate a graphical hierarchy of all classes instead of a textual one. + +GRAPHICAL_HIERARCHY = YES + +# If the DIRECTORY_GRAPH, SHOW_DIRECTORIES and HAVE_DOT tags are set to YES +# then doxygen will show the dependencies a directory has on other directories +# in a graphical way. The dependency relations are determined by the #include +# relations between the files in the directories. + +DIRECTORY_GRAPH = YES + +# The DOT_IMAGE_FORMAT tag can be used to set the image format of the images +# generated by dot. Possible values are png, jpg, or gif. +# If left blank png will be used. + +DOT_IMAGE_FORMAT = png + +# The tag DOT_PATH can be used to specify the path where the dot tool can be +# found. If left blank, it is assumed the dot tool can be found in the path. + +DOT_PATH = @DOT@ + +# The DOTFILE_DIRS tag can be used to specify one or more directories that +# contain dot files that are included in the documentation (see the +# \dotfile command). + +DOTFILE_DIRS = + +# The DOT_GRAPH_MAX_NODES tag can be used to set the maximum number of +# nodes that will be shown in the graph. If the number of nodes in a graph +# becomes larger than this value, doxygen will truncate the graph, which is +# visualized by representing a node as a red box. Note that if the +# number of direct children of the root node in a graph is already larger than +# DOT_GRAPH_MAX_NODES then the graph will not be shown at all. Also note +# that the size of a graph can be further restricted by MAX_DOT_GRAPH_DEPTH. + +DOT_GRAPH_MAX_NODES = 50 + +# The MAX_DOT_GRAPH_DEPTH tag can be used to set the maximum depth of the +# graphs generated by dot.
A depth value of 3 means that only nodes reachable +# from the root by following a path via at most 3 edges will be shown. Nodes +# that lie farther from the root node will be omitted. Note that setting this +# option to 1 or 2 may greatly reduce the computation time needed for large +# code bases. Also note that the size of a graph can be further restricted by +# DOT_GRAPH_MAX_NODES. Using a depth of 0 means no depth restriction. + +MAX_DOT_GRAPH_DEPTH = 0 + +# Set the DOT_TRANSPARENT tag to YES to generate images with a transparent +# background. This is enabled by default, which results in a transparent +# background. Warning: Depending on the platform used, enabling this option +# may lead to badly anti-aliased labels on the edges of a graph (i.e. they +# become hard to read). + +DOT_TRANSPARENT = YES + +# Set the DOT_MULTI_TARGETS tag to YES to allow dot to generate multiple output +# files in one run (i.e. multiple -o and -T options on the command line). This +# makes dot run faster, but since only newer versions of dot (>1.8.10) +# support this, this feature is disabled by default. + +DOT_MULTI_TARGETS = NO + +# If the GENERATE_LEGEND tag is set to YES (the default) Doxygen will +# generate a legend page explaining the meaning of the various boxes and +# arrows in the dot generated graphs. + +GENERATE_LEGEND = YES + +# If the DOT_CLEANUP tag is set to YES (the default) Doxygen will +# remove the intermediate dot files that are used to generate +# the various graphs. + +DOT_CLEANUP = YES + +#--------------------------------------------------------------------------- +# Configuration::additions related to the search engine +#--------------------------------------------------------------------------- + +# The SEARCHENGINE tag specifies whether or not a search engine should be +# used. If set to NO the values of all tags below this one will be ignored.
+ +SEARCHENGINE = NO Added: www-releases/trunk/2.8/docs/doxygen.css URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/doxygen.css?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/doxygen.css (added) +++ www-releases/trunk/2.8/docs/doxygen.css Mon Oct 4 15:49:23 2010 @@ -0,0 +1,378 @@ +BODY,H1,H2,H3,H4,H5,H6,P,CENTER,TD,TH,UL,DL,DIV { + font-family: Verdana,Geneva,Arial,Helvetica,sans-serif; +} +BODY,TD { + font-size: 90%; +} +H1 { + text-align: center; + font-size: 140%; + font-weight: bold; +} +H2 { + font-size: 120%; + font-style: italic; +} +H3 { + font-size: 100%; +} +CAPTION { font-weight: bold } +DIV.qindex { + width: 100%; + background-color: #eeeeff; + border: 1px solid #b0b0b0; + text-align: center; + margin: 2px; + padding: 2px; + line-height: 140%; +} +DIV.nav { + width: 100%; + background-color: #eeeeff; + border: 1px solid #b0b0b0; + text-align: center; + margin: 2px; + padding: 2px; + line-height: 140%; +} +DIV.navtab { + background-color: #eeeeff; + border: 1px solid #b0b0b0; + text-align: center; + margin: 2px; + margin-right: 15px; + padding: 2px; +} +TD.navtab { + font-size: 70%; +} +A.qindex { + text-decoration: none; + font-weight: bold; + color: #1A419D; +} +A.qindex:visited { + text-decoration: none; + font-weight: bold; + color: #1A419D +} +A.qindex:hover { + text-decoration: none; + background-color: #ddddff; +} +A.qindexHL { + text-decoration: none; + font-weight: bold; + background-color: #6666cc; + color: #ffffff; + border: 1px double #9295C2; +} +A.qindexHL:hover { + text-decoration: none; + background-color: #6666cc; + color: #ffffff; +} +A.qindexHL:visited { + text-decoration: none; background-color: #6666cc; color: #ffffff } +A.el { text-decoration: none; font-weight: bold } +A.elRef { font-weight: bold } +A.code:link { text-decoration: none; font-weight: normal; color: #0000FF} +A.code:visited { text-decoration: none; font-weight: 
normal; color: #0000FF} +A.codeRef:link { font-weight: normal; color: #0000FF} +A.codeRef:visited { font-weight: normal; color: #0000FF} +A:hover { text-decoration: none; background-color: #f2f2ff } +DL.el { margin-left: -1cm } +.fragment { + font-family: Fixed, monospace; + font-size: 95%; +} +PRE.fragment { + border: 1px solid #CCCCCC; + background-color: #f5f5f5; + margin-top: 4px; + margin-bottom: 4px; + margin-left: 2px; + margin-right: 8px; + padding-left: 6px; + padding-right: 6px; + padding-top: 4px; + padding-bottom: 4px; +} +DIV.ah { background-color: black; font-weight: bold; color: #ffffff; margin-bottom: 3px; margin-top: 3px } +TD.md { background-color: #F4F4FB; font-weight: bold; } +TD.mdPrefix { + background-color: #F4F4FB; + color: #606060; + font-size: 80%; +} +TD.mdname1 { background-color: #F4F4FB; font-weight: bold; color: #602020; } +TD.mdname { background-color: #F4F4FB; font-weight: bold; color: #602020; width: 600px; } +DIV.groupHeader { + margin-left: 16px; + margin-top: 12px; + margin-bottom: 6px; + font-weight: bold; +} +DIV.groupText { margin-left: 16px; font-style: italic; font-size: 90% } +BODY { + background: white; + color: black; + margin-right: 20px; + margin-left: 20px; +} +TD.indexkey { + background-color: #eeeeff; + font-weight: bold; + padding-right : 10px; + padding-top : 2px; + padding-left : 10px; + padding-bottom : 2px; + margin-left : 0px; + margin-right : 0px; + margin-top : 2px; + margin-bottom : 2px; + border: 1px solid #CCCCCC; +} +TD.indexvalue { + background-color: #eeeeff; + font-style: italic; + padding-right : 10px; + padding-top : 2px; + padding-left : 10px; + padding-bottom : 2px; + margin-left : 0px; + margin-right : 0px; + margin-top : 2px; + margin-bottom : 2px; + border: 1px solid #CCCCCC; +} +TR.memlist { + background-color: #f0f0f0; +} +P.formulaDsp { text-align: center; } +IMG.formulaDsp { } +IMG.formulaInl { vertical-align: middle; } +SPAN.keyword { color: #008000 } +SPAN.keywordtype { color: #604020 } 
+SPAN.keywordflow { color: #e08000 } +SPAN.comment { color: #800000 } +SPAN.preprocessor { color: #806020 } +SPAN.stringliteral { color: #002080 } +SPAN.charliteral { color: #008080 } +.mdTable { + border: 1px solid #868686; + background-color: #F4F4FB; +} +.mdRow { + padding: 8px 10px; +} +.mdescLeft { + padding: 0px 8px 4px 8px; + font-size: 80%; + font-style: italic; + background-color: #FAFAFA; + border-top: 1px none #E0E0E0; + border-right: 1px none #E0E0E0; + border-bottom: 1px none #E0E0E0; + border-left: 1px none #E0E0E0; + margin: 0px; +} +.mdescRight { + padding: 0px 8px 4px 8px; + font-size: 80%; + font-style: italic; + background-color: #FAFAFA; + border-top: 1px none #E0E0E0; + border-right: 1px none #E0E0E0; + border-bottom: 1px none #E0E0E0; + border-left: 1px none #E0E0E0; + margin: 0px; +} +.memItemLeft { + padding: 1px 0px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: #E0E0E0; + border-top-style: solid; + border-right-style: none; + border-bottom-style: none; + border-left-style: none; + background-color: #FAFAFA; + font-size: 80%; +} +.memItemRight { + padding: 1px 8px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: #E0E0E0; + border-top-style: solid; + border-right-style: none; + border-bottom-style: none; + border-left-style: none; + background-color: #FAFAFA; + font-size: 80%; +} +.memTemplItemLeft { + padding: 1px 0px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: 
#E0E0E0; + border-top-style: none; + border-right-style: none; + border-bottom-style: none; + border-left-style: none; + background-color: #FAFAFA; + font-size: 80%; +} +.memTemplItemRight { + padding: 1px 8px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: #E0E0E0; + border-top-style: none; + border-right-style: none; + border-bottom-style: none; + border-left-style: none; + background-color: #FAFAFA; + font-size: 80%; +} +.memTemplParams { + padding: 1px 0px 0px 8px; + margin: 4px; + border-top-width: 1px; + border-right-width: 1px; + border-bottom-width: 1px; + border-left-width: 1px; + border-top-color: #E0E0E0; + border-right-color: #E0E0E0; + border-bottom-color: #E0E0E0; + border-left-color: #E0E0E0; + border-top-style: solid; + border-right-style: none; + border-bottom-style: none; + border-left-style: none; + color: #606060; + background-color: #FAFAFA; + font-size: 80%; +} +.search { color: #003399; + font-weight: bold; +} +FORM.search { + margin-bottom: 0px; + margin-top: 0px; +} +INPUT.search { font-size: 75%; + color: #000080; + font-weight: normal; + background-color: #eeeeff; +} +TD.tiny { font-size: 75%; +} +a { + color: #252E78; +} +a:visited { + color: #3D2185; +} +.dirtab { padding: 4px; + border-collapse: collapse; + border: 1px solid #b0b0b0; +} +TH.dirtab { background: #eeeeff; + font-weight: bold; +} +HR { height: 1px; + border: none; + border-top: 1px solid black; +} + +/* + * LLVM Modifications. + * Note: Everything above here is generated with the "doxygen -w html" command. See + * "doxygen --help" for details. What follows are CSS overrides for LLVM + * specific formatting. We want to keep the above so it can be replaced with + * subsequent doxygen upgrades.
+ */ + +.footer { + font-size: 80%; + font-weight: bold; + text-align: center; + vertical-align: middle; +} +.title { + font-size: 25pt; + color: black; background: url("../img/lines.gif"); + font-weight: bold; + border-width: 1px; + border-style: solid none solid none; + text-align: center; + vertical-align: middle; + padding-left: 8pt; + padding-top: 1px; + padding-bottom: 2px +} +A:link { + cursor: pointer; + text-decoration: none; + font-weight: bolder; +} +A:visited { + cursor: pointer; + text-decoration: underline; + font-weight: bolder; +} +A:hover { + cursor: pointer; + text-decoration: underline; + font-weight: bolder; +} +A:active { + cursor: pointer; + text-decoration: underline; + font-weight: bolder; + font-style: italic; +} +H1 { + text-align: center; + font-size: 140%; + font-weight: bold; +} +H2 { + font-size: 120%; + font-style: italic; +} +H3 { + font-size: 100%; +} +A.qindex {} +A.qindexRef {} +A.el { text-decoration: none; font-weight: bold } +A.elRef { font-weight: bold } +A.code { text-decoration: none; font-weight: normal; color: #4444ee } +A.codeRef { font-weight: normal; color: #4444ee } Added: www-releases/trunk/2.8/docs/doxygen.footer URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/doxygen.footer?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/doxygen.footer (added) +++ www-releases/trunk/2.8/docs/doxygen.footer Mon Oct 4 15:49:23 2010 @@ -0,0 +1,13 @@ +
    + + +
    + + + + Added: www-releases/trunk/2.8/docs/doxygen.header URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/doxygen.header?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/doxygen.header (added) +++ www-releases/trunk/2.8/docs/doxygen.header Mon Oct 4 15:49:23 2010 @@ -0,0 +1,9 @@ + + + + + +LLVM: $title + + +

    LLVM API Documentation

    Added: www-releases/trunk/2.8/docs/doxygen.intro URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/doxygen.intro?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/doxygen.intro (added) +++ www-releases/trunk/2.8/docs/doxygen.intro Mon Oct 4 15:49:23 2010 @@ -0,0 +1,18 @@ +/// @mainpage Low Level Virtual Machine +/// +/// @section main_intro Introduction +/// Welcome to the Low Level Virtual Machine (LLVM). +/// +/// This documentation describes the @b internal software that makes +/// up LLVM, not the @b external use of LLVM. There are no instructions +/// here on how to use LLVM, only the APIs that make up the software. For usage +/// instructions, please see the programmer's guide or reference manual. +/// +/// @section main_caveat Caveat +/// This documentation is generated directly from the source code with doxygen. +/// Since LLVM is constantly under active development, what you're about to +/// read is out of date! However, it may still be useful since certain portions +/// of LLVM are very stable. +/// +/// @section main_changelog Change Log +/// - Original content written 12/30/2003 by Reid Spencer Added: www-releases/trunk/2.8/docs/img/Debugging.gif URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/img/Debugging.gif?rev=115556&view=auto ============================================================================== Binary file - no diff available. Propchange: www-releases/trunk/2.8/docs/img/Debugging.gif ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: www-releases/trunk/2.8/docs/img/libdeps.gif URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/img/libdeps.gif?rev=115556&view=auto ============================================================================== Binary file - no diff available. 
Propchange: www-releases/trunk/2.8/docs/img/libdeps.gif ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: www-releases/trunk/2.8/docs/img/lines.gif URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/img/lines.gif?rev=115556&view=auto ============================================================================== Binary file - no diff available. Propchange: www-releases/trunk/2.8/docs/img/lines.gif ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: www-releases/trunk/2.8/docs/img/objdeps.gif URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/img/objdeps.gif?rev=115556&view=auto ============================================================================== Binary file - no diff available. Propchange: www-releases/trunk/2.8/docs/img/objdeps.gif ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: www-releases/trunk/2.8/docs/img/venusflytrap.jpg URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/img/venusflytrap.jpg?rev=115556&view=auto ============================================================================== Binary file - no diff available. Propchange: www-releases/trunk/2.8/docs/img/venusflytrap.jpg ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: www-releases/trunk/2.8/docs/index.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/index.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/index.html (added) +++ www-releases/trunk/2.8/docs/index.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,293 @@ + + + + Documentation for the LLVM System at SVN head + + + + +
    Documentation for the LLVM System at SVN head
    + +

    If you are using a released version of LLVM, see the download page to find your documentation.

    + + + +
    +

    Written by The LLVM Team

    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
      +
    • The LLVM Announcements List: This is a low volume list that provides important announcements regarding LLVM. It gets email about once a month.
    • The Developer's List: This list is for people who want to be included in technical discussions of LLVM. People post to this list when they have questions about writing code for or using the LLVM tools. It is relatively low volume.
    • The Bugs & Patches Archive: This list gets emailed every time a bug is opened and closed, and when people submit patches to be included in LLVM. It is higher volume than the LLVMdev list.
    • The Commits Archive: This list contains all commit messages that are made when LLVM developers commit code changes to the repository. It is useful for those who want to stay on the bleeding edge of LLVM development. This list is very high volume.
    • The Test Results Archive: A message is automatically sent to this list by every active nightly tester when it completes. As such, this list gets email several times each day, making it a high volume list.
    + + + +
    +
    + Valid CSS + Valid HTML 4.01 + + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/llvm.css URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/llvm.css?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/llvm.css (added) +++ www-releases/trunk/2.8/docs/llvm.css Mon Oct 4 15:49:23 2010 @@ -0,0 +1,100 @@ +/* + * LLVM documentation style sheet + */ + +/* Common styles */ +.body { color: black; background: white; margin: 0 0 0 0 } + +/* No borders on image links */ +a:link img, a:visited img { border-style: none } + +address img { float: right; width: 88px; height: 31px; } +address { clear: right; } + +table { text-align: center; border: 2px solid black; + border-collapse: collapse; margin-top: 1em; margin-left: 1em; + margin-right: 1em; margin-bottom: 1em; } +tr, td { border: 2px solid gray; padding: 4pt 4pt 2pt 2pt; } +th { border: 2px solid gray; font-weight: bold; font-size: 105%; + background: url("img/lines.gif"); + font-family: "Georgia,Palatino,Times,Roman,SanSerif"; + text-align: center; vertical-align: middle; } +/* + * Documentation + */ +/* Common for title and header */ +.doc_title, .doc_section, .doc_subsection, h1, h2 { + color: black; background: url("img/lines.gif"); + font-family: "Georgia,Palatino,Times,Roman,SanSerif"; font-weight: bold; + border-width: 1px; + border-style: solid none solid none; + text-align: center; + vertical-align: middle; + padding-left: 8pt; + padding-top: 1px; + padding-bottom: 2px +} + +h1, .doc_section { text-align: center; font-size: 22pt; + margin: 20pt 0pt 5pt 0pt; } + +.doc_title, .title { text-align: left; font-size: 25pt } + +h2, .doc_subsection { width: 75%; + text-align: left; font-size: 12pt; + padding: 4pt 4pt 4pt 4pt; + margin: 1.5em 0.5em 0.5em 0.5em } + +h3, .doc_subsubsection { margin: 2.0em 0.5em 0.5em 0.5em; + font-weight: bold; font-style: oblique; + border-bottom: 1px solid #999999; font-size: 12pt; + width: 75%; } + +.doc_author { text-align: left; 
font-weight: bold; padding-left: 20pt } +.doc_text { text-align: left; padding-left: 20pt; padding-right: 10pt } + +.doc_footer { text-align: left; padding: 0 0 0 0 } + +.doc_hilite { color: blue; font-weight: bold; } + +.doc_table { text-align: center; width: 90%; + padding: 1px 1px 1px 1px; border: 1px; } + +.doc_warning { color: red; font-weight: bold } + +/*
    would use this class, and
    adds more padding */
    +.doc_code, .literal-block
    +                { border: solid 1px gray; background: #eeeeee;
    +                  margin: 0 1em 0 1em;
    +                  padding: 0 1em 0 1em;
    +                  display: table;
    +                }
    +
    +/* It is preferable to use <pre class="doc_code"> everywhere instead of the
    + * <div class="doc_code"><pre>...</pre></div> construct.
    + *
    + * Once all docs use <pre> for code regions, this style can be merged with the
    + * one above, and we can drop the [pre] qualifier.
    + */
    +pre.doc_code, .literal-block { padding: 1em 2em 1em 1em }
    +
    +.doc_notes      { background: #fafafa; border: 1px solid #cecece;
    +                  display: table; padding: 0 1em 0 .1em }
    +
    +table.layout    { text-align: left; border: none; border-collapse: collapse;
    +                  padding: 4px 4px 4px 4px; }
    +tr.layout, td.layout, td.left, td.right
    +                { border: none; padding: 4pt 4pt 2pt 2pt; vertical-align: top; }
    +td.left         { text-align: left }
    +td.right        { text-align: right }
    +th.layout       { border: none; font-weight: bold; font-size: 105%;
    +                  text-align: center; vertical-align: middle; }
    +
    +/* Left align table cell */
    +.td_left        { border: 2px solid gray; text-align: left; }
    +
    +/* ReST-specific */
    +.title { margin-top: 0 }
    +.topic-title{ display: none }
    +div.contents ul { list-style-type: decimal }
    +.toc-backref    { color: black; text-decoration: none; }
    
    Added: www-releases/trunk/2.8/docs/re_format.7
    URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/re_format.7?rev=115556&view=auto
    ==============================================================================
    --- www-releases/trunk/2.8/docs/re_format.7 (added)
    +++ www-releases/trunk/2.8/docs/re_format.7 Mon Oct  4 15:49:23 2010
    @@ -0,0 +1,756 @@
    +.\"	$OpenBSD: re_format.7,v 1.14 2007/05/31 19:19:30 jmc Exp $
    +.\"
    +.\" Copyright (c) 1997, Phillip F Knaack. All rights reserved.
    +.\"
    +.\" Copyright (c) 1992, 1993, 1994 Henry Spencer.
    +.\" Copyright (c) 1992, 1993, 1994
    +.\"	The Regents of the University of California.  All rights reserved.
    +.\"
    +.\" This code is derived from software contributed to Berkeley by
    +.\" Henry Spencer.
    +.\"
    +.\" Redistribution and use in source and binary forms, with or without
    +.\" modification, are permitted provided that the following conditions
    +.\" are met:
    +.\" 1. Redistributions of source code must retain the above copyright
    +.\"    notice, this list of conditions and the following disclaimer.
    +.\" 2. Redistributions in binary form must reproduce the above copyright
    +.\"    notice, this list of conditions and the following disclaimer in the
    +.\"    documentation and/or other materials provided with the distribution.
    +.\" 3. Neither the name of the University nor the names of its contributors
    +.\"    may be used to endorse or promote products derived from this software
    +.\"    without specific prior written permission.
    +.\"
    +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
    +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
    +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
    +.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
    +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
    +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
    +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
    +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
    +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
    +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
    +.\" SUCH DAMAGE.
    +.\"
    +.\"	@(#)re_format.7	8.3 (Berkeley) 3/20/94
    +.\"
    +.Dd $Mdocdate: May 31 2007 $
    +.Dt RE_FORMAT 7
    +.Os
    +.Sh NAME
    +.Nm re_format
    +.Nd POSIX regular expressions
    +.Sh DESCRIPTION
    +Regular expressions (REs),
    +as defined in
    +.St -p1003.1-2004 ,
    +come in two forms:
    +basic regular expressions
    +(BREs)
    +and extended regular expressions
    +(EREs).
    +Both forms of regular expressions are supported
    +by the interfaces described in
    +.Xr regex 3 .
    +Applications dealing with regular expressions
    +may use one or the other form
    +(or indeed both).
    +For example,
    +.Xr ed 1
    +uses BREs,
    +whilst
    +.Xr egrep 1
    +talks EREs.
    +Consult the manual page for the specific application to find out which
    +it uses.
    +.Pp
    +POSIX leaves some aspects of RE syntax and semantics open;
    +.Sq **
    +marks decisions on these aspects that
    +may not be fully portable to other POSIX implementations.
    +.Pp
    +This manual page first describes regular expressions in general,
    +specifically extended regular expressions,
    +and then discusses differences between them and basic regular expressions.
    +.Sh EXTENDED REGULAR EXPRESSIONS
    +An ERE is one** or more non-empty**
    +.Em branches ,
    +separated by
    +.Sq \*(Ba .
    +It matches anything that matches one of the branches.
    +.Pp
    +A branch is one** or more
    +.Em pieces ,
    +concatenated.
    +It matches a match for the first, followed by a match for the second, etc.
    +.Pp
    +A piece is an
    +.Em atom
    +possibly followed by a single**
    +.Sq * ,
    +.Sq + ,
    +.Sq ?\& ,
    +or
    +.Em bound .
    +An atom followed by
    +.Sq *
    +matches a sequence of 0 or more matches of the atom.
    +An atom followed by
    +.Sq +
    +matches a sequence of 1 or more matches of the atom.
    +An atom followed by
    +.Sq ?\&
    +matches a sequence of 0 or 1 matches of the atom.
    +.Pp
    +A bound is
    +.Sq {
    +followed by an unsigned decimal integer,
    +possibly followed by
    +.Sq ,\&
    +possibly followed by another unsigned decimal integer,
    +always followed by
    +.Sq } .
    +The integers must lie between 0 and
    +.Dv RE_DUP_MAX
    +(255**) inclusive,
    +and if there are two of them, the first may not exceed the second.
    +An atom followed by a bound containing one integer
    +.Ar i
    +and no comma matches
    +a sequence of exactly
    +.Ar i
    +matches of the atom.
    +An atom followed by a bound
    +containing one integer
    +.Ar i
    +and a comma matches
    +a sequence of
    +.Ar i
    +or more matches of the atom.
    +An atom followed by a bound
    +containing two integers
    +.Ar i
    +and
    +.Ar j
    +matches a sequence of
    +.Ar i
    +through
    +.Ar j
    +(inclusive) matches of the atom.
    +.Pp
    +An atom is a regular expression enclosed in
    +.Sq ()
    +(matching a part of the regular expression),
    +an empty set of
    +.Sq ()
    +(matching the null string)**,
    +a
    +.Em bracket expression
    +(see below),
    +.Sq .\&
    +(matching any single character),
    +.Sq ^
    +(matching the null string at the beginning of a line),
    +.Sq $
    +(matching the null string at the end of a line),
    +a
    +.Sq \e
    +followed by one of the characters
    +.Sq ^.[$()|*+?{\e
    +(matching that character taken as an ordinary character),
    +a
    +.Sq \e
    +followed by any other character**
    +(matching that character taken as an ordinary character,
    +as if the
    +.Sq \e
    +had not been present**),
    +or a single character with no other significance (matching that character).
    +A
    +.Sq {
    +followed by a character other than a digit is an ordinary character,
    +not the beginning of a bound**.
    +It is illegal to end an RE with
    +.Sq \e .
    +.Pp
    +A bracket expression is a list of characters enclosed in
    +.Sq [] .
    +It normally matches any single character from the list (but see below).
    +If the list begins with
    +.Sq ^ ,
    +it matches any single character
    +.Em not
    +from the rest of the list
    +(but see below).
    +If two characters in the list are separated by
    +.Sq - ,
    +this is shorthand for the full
    +.Em range
    +of characters between those two (inclusive) in the
    +collating sequence, e.g.\&
    +.Sq [0-9]
    +in ASCII matches any decimal digit.
    +It is illegal** for two ranges to share an endpoint, e.g.\&
    +.Sq a-c-e .
    +Ranges are very collating-sequence-dependent,
    +and portable programs should avoid relying on them.
    +.Pp
    +To include a literal
    +.Sq ]\&
    +in the list, make it the first character
    +(following a possible
    +.Sq ^ ) .
    +To include a literal
    +.Sq - ,
    +make it the first or last character,
    +or the second endpoint of a range.
    +To use a literal
    +.Sq -
    +as the first endpoint of a range,
    +enclose it in
    +.Sq [.
    +and
    +.Sq .]
    +to make it a collating element (see below).
    +With the exception of these and some combinations using
    +.Sq [
    +(see next paragraphs),
    +all other special characters, including
    +.Sq \e ,
    +lose their special significance within a bracket expression.
    +.Pp
    +Within a bracket expression, a collating element
    +(a character,
    +a multi-character sequence that collates as if it were a single character,
    +or a collating-sequence name for either)
    +enclosed in
    +.Sq [.
    +and
    +.Sq .]
    +stands for the sequence of characters of that collating element.
    +The sequence is a single element of the bracket expression's list.
    +A bracket expression containing a multi-character collating element
    +can thus match more than one character,
    +e.g. if the collating sequence includes a
    +.Sq ch
    +collating element,
    +then the RE
    +.Sq [[.ch.]]*c
    +matches the first five characters of
    +.Sq chchcc .
    +.Pp
    +Within a bracket expression, a collating element enclosed in
    +.Sq [=
    +and
    +.Sq =]
    +is an equivalence class, standing for the sequences of characters
    +of all collating elements equivalent to that one, including itself.
    +(If there are no other equivalent collating elements,
    +the treatment is as if the enclosing delimiters were
    +.Sq [.
    +and
    +.Sq .] . )
    +For example, if
    +.Sq x
    +and
    +.Sq y
    +are the members of an equivalence class,
    +then
    +.Sq [[=x=]] ,
    +.Sq [[=y=]] ,
    +and
    +.Sq [xy]
    +are all synonymous.
    +An equivalence class may not** be an endpoint of a range.
    +.Pp
    +Within a bracket expression, the name of a
    +.Em character class
    +enclosed
    +in
    +.Sq [:
    +and
    +.Sq :]
    +stands for the list of all characters belonging to that class.
    +Standard character class names are:
    +.Bd -literal -offset indent
    +alnum	digit	punct
    +alpha	graph	space
    +blank	lower	upper
    +cntrl	print	xdigit
    +.Ed
    +.Pp
    +These stand for the character classes defined in
    +.Xr ctype 3 .
    +A locale may provide others.
    +A character class may not be used as an endpoint of a range.
    +.Pp
    +There are two special cases** of bracket expressions:
    +the bracket expressions
    +.Sq [[:<:]]
    +and
    +.Sq [[:>:]]
    +match the null string at the beginning and end of a word, respectively.
    +A word is defined as a sequence of
    +characters starting and ending with a word character
    +which is neither preceded nor followed by
    +word characters.
    +A word character is an
    +.Em alnum
    +character (as defined by
    +.Xr ctype 3 )
    +or an underscore.
    +This is an extension,
    +compatible with but not specified by POSIX,
    +and should be used with
    +caution in software intended to be portable to other systems.
    +.Pp
    +In the event that an RE could match more than one substring of a given
    +string,
    +the RE matches the one starting earliest in the string.
    +If the RE could match more than one substring starting at that point,
    +it matches the longest.
    +Subexpressions also match the longest possible substrings, subject to
    +the constraint that the whole match be as long as possible,
    +with subexpressions starting earlier in the RE taking priority over
    +ones starting later.
    +Note that higher-level subexpressions thus take priority over
    +their lower-level component subexpressions.
    +.Pp
    +Match lengths are measured in characters, not collating elements.
    +A null string is considered longer than no match at all.
    +For example,
    +.Sq bb*
    +matches the three middle characters of
    +.Sq abbbc ;
    +.Sq (wee|week)(knights|nights)
    +matches all ten characters of
    +.Sq weeknights ;
    +when
    +.Sq (.*).*
    +is matched against
    +.Sq abc ,
    +the parenthesized subexpression matches all three characters;
    +and when
    +.Sq (a*)*
    +is matched against
    +.Sq bc ,
    +both the whole RE and the parenthesized subexpression match the null string.
    +.Pp
    +If case-independent matching is specified,
    +the effect is much as if all case distinctions had vanished from the
    +alphabet.
    +When an alphabetic that exists in multiple cases appears as an
    +ordinary character outside a bracket expression, it is effectively
    +transformed into a bracket expression containing both cases,
    +e.g.\&
    +.Sq x
    +becomes
    +.Sq [xX] .
    +When it appears inside a bracket expression,
    +all case counterparts of it are added to the bracket expression,
    +so that, for example,
    +.Sq [x]
    +becomes
    +.Sq [xX]
    +and
    +.Sq [^x]
    +becomes
    +.Sq [^xX] .
    +.Pp
    +No particular limit is imposed on the length of REs**.
    +Programs intended to be portable should not employ REs longer
    +than 256 bytes,
    +as an implementation can refuse to accept such REs and remain
    +POSIX-compliant.
    +.Pp
    +The following is a list of extended regular expressions:
    +.Bl -tag -width Ds
    +.It Ar c
    +Any character
    +.Ar c
    +not listed below matches itself.
    +.It \e Ns Ar c
    +Any backslash-escaped character
    +.Ar c
    +matches itself.
    +.It \&.
    +Matches any single character that is not a newline
    +.Pq Sq \en .
    +.It Bq Ar char-class
    +Matches any single character in
    +.Ar char-class .
    +To include a
    +.Ql \&]
    +in
    +.Ar char-class ,
    +it must be the first character.
    +A range of characters may be specified by separating the end characters
    +of the range with a
    +.Ql - ;
    +e.g.\&
    +.Ar a-z
    +specifies the lower case characters.
    +The following literal expressions can also be used in
    +.Ar char-class
    +to specify sets of characters:
    +.Bd -unfilled -offset indent
    +[:alnum:] [:cntrl:] [:lower:] [:space:]
    +[:alpha:] [:digit:] [:print:] [:upper:]
    +[:blank:] [:graph:] [:punct:] [:xdigit:]
    +.Ed
    +.Pp
    +If
    +.Ql -
    +appears as the first or last character of
    +.Ar char-class ,
    +then it matches itself.
    +All other characters in
    +.Ar char-class
    +match themselves.
    +.Pp
    +Patterns in
    +.Ar char-class
    +of the form
    +.Eo [.
    +.Ar col-elm
    +.Ec .]\&
    +or
    +.Eo [=
    +.Ar col-elm
    +.Ec =]\& ,
    +where
    +.Ar col-elm
    +is a collating element, are interpreted according to
    +.Xr setlocale 3
    +.Pq not currently supported .
    +.It Bq ^ Ns Ar char-class
    +Matches any single character, other than newline, not in
    +.Ar char-class .
    +.Ar char-class
    +is defined as above.
    +.It ^
    +If
    +.Sq ^
    +is the first character of a regular expression, then it
    +anchors the regular expression to the beginning of a line.
    +Otherwise, it matches itself.
    +.It $
    +If
    +.Sq $
    +is the last character of a regular expression,
    +it anchors the regular expression to the end of a line.
    +Otherwise, it matches itself.
    +.It [[:<:]]
    +Anchors the single character regular expression or subexpression
    +immediately following it to the beginning of a word.
    +.It [[:>:]]
    +Anchors the single character regular expression or subexpression
    +immediately following it to the end of a word.
    +.It Pq Ar re
    +Defines a subexpression
    +.Ar re .
    +Any set of characters enclosed in parentheses
    +matches whatever the set of characters without parentheses matches
    +(that is a long-winded way of saying the constructs
    +.Sq (re)
    +and
    +.Sq re
    +match identically).
    +.It *
    +Matches the single character regular expression or subexpression
    +immediately preceding it zero or more times.
    +If
    +.Sq *
    +is the first character of a regular expression or subexpression,
    +then it matches itself.
    +The
    +.Sq *
    +operator sometimes yields unexpected results.
    +For example, the regular expression
    +.Ar b*
    +matches the beginning of the string
    +.Qq abbb
    +(as opposed to the substring
    +.Qq bbb ) ,
    +since a null match is the only leftmost match.
    +.It +
    +Matches the single character regular expression
    +or subexpression immediately preceding it
    +one or more times.
    +.It ?
    +Matches the single character regular expression
    +or subexpression immediately preceding it
    +0 or 1 times.
    +.Sm off
    +.It Xo
    +.Pf { Ar n , m No }\ \&
    +.Pf { Ar n , No }\ \&
    +.Pf { Ar n No }
    +.Xc
    +.Sm on
    +Matches the single character regular expression or subexpression
    +immediately preceding it at least
    +.Ar n
    +and at most
    +.Ar m
    +times.
    +If
    +.Ar m
    +is omitted, then it matches at least
    +.Ar n
    +times.
    +If the comma is also omitted, then it matches exactly
    +.Ar n
    +times.
    +.It \*(Ba
    +Used to separate patterns.
    +For example,
    +the pattern
    +.Sq cat\*(Badog
    +matches either
    +.Sq cat
    +or
    +.Sq dog .
    +.El
    +.Sh BASIC REGULAR EXPRESSIONS
    +Basic regular expressions differ in several respects:
    +.Bl -bullet -offset 3n
    +.It
    +.Sq \*(Ba ,
    +.Sq + ,
    +and
    +.Sq ?\&
    +are ordinary characters and there is no equivalent
    +for their functionality.
    +.It
    +The delimiters for bounds are
    +.Sq \e{
    +and
    +.Sq \e} ,
    +with
    +.Sq {
    +and
    +.Sq }
    +by themselves ordinary characters.
    +.It
    +The parentheses for nested subexpressions are
    +.Sq \e(
    +and
    +.Sq \e) ,
    +with
    +.Sq (
    +and
    +.Sq )\&
    +by themselves ordinary characters.
    +.It
    +.Sq ^
    +is an ordinary character except at the beginning of the
    +RE or** the beginning of a parenthesized subexpression.
    +.It
    +.Sq $
    +is an ordinary character except at the end of the
    +RE or** the end of a parenthesized subexpression.
    +.It
    +.Sq *
    +is an ordinary character if it appears at the beginning of the
    +RE or the beginning of a parenthesized subexpression
    +(after a possible leading
    +.Sq ^ ) .
    +.It
    +Finally, there is one new type of atom, a
    +.Em back-reference :
    +.Sq \e
    +followed by a non-zero decimal digit
    +.Ar d
    +matches the same sequence of characters matched by the
    +.Ar d Ns th
    +parenthesized subexpression
    +(numbering subexpressions by the positions of their opening parentheses,
    +left to right),
    +so that, for example,
    +.Sq \e([bc]\e)\e1
    +matches
    +.Sq bb\&
    +or
    +.Sq cc
    +but not
    +.Sq bc .
    +.El
    +.Pp
    +The following is a list of basic regular expressions:
    +.Bl -tag -width Ds
    +.It Ar c
    +Any character
    +.Ar c
    +not listed below matches itself.
    +.It \e Ns Ar c
    +Any backslash-escaped character
    +.Ar c ,
    +except for
    +.Sq { ,
    +.Sq } ,
    +.Sq \&( ,
    +and
    +.Sq \&) ,
    +matches itself.
    +.It \&.
    +Matches any single character that is not a newline
    +.Pq Sq \en .
    +.It Bq Ar char-class
    +Matches any single character in
    +.Ar char-class .
    +To include a
    +.Ql \&]
    +in
    +.Ar char-class ,
    +it must be the first character.
    +A range of characters may be specified by separating the end characters
    +of the range with a
    +.Ql - ;
    +e.g.\&
    +.Ar a-z
    +specifies the lower case characters.
    +The following literal expressions can also be used in
    +.Ar char-class
    +to specify sets of characters:
    +.Bd -unfilled -offset indent
    +[:alnum:] [:cntrl:] [:lower:] [:space:]
    +[:alpha:] [:digit:] [:print:] [:upper:]
    +[:blank:] [:graph:] [:punct:] [:xdigit:]
    +.Ed
    +.Pp
    +If
    +.Ql -
    +appears as the first or last character of
    +.Ar char-class ,
    +then it matches itself.
    +All other characters in
    +.Ar char-class
    +match themselves.
    +.Pp
    +Patterns in
    +.Ar char-class
    +of the form
    +.Eo [.
    +.Ar col-elm
    +.Ec .]\&
    +or
    +.Eo [=
    +.Ar col-elm
    +.Ec =]\& ,
    +where
    +.Ar col-elm
    +is a collating element, are interpreted according to
    +.Xr setlocale 3
    +.Pq not currently supported .
    +.It Bq ^ Ns Ar char-class
    +Matches any single character, other than newline, not in
    +.Ar char-class .
    +.Ar char-class
    +is defined as above.
    +.It ^
    +If
    +.Sq ^
    +is the first character of a regular expression, then it
    +anchors the regular expression to the beginning of a line.
    +Otherwise, it matches itself.
    +.It $
    +If
    +.Sq $
    +is the last character of a regular expression,
    +it anchors the regular expression to the end of a line.
    +Otherwise, it matches itself.
    +.It [[:<:]]
    +Anchors the single character regular expression or subexpression
    +immediately following it to the beginning of a word.
    +.It [[:>:]]
    +Anchors the single character regular expression or subexpression
    +immediately following it to the end of a word.
    +.It \e( Ns Ar re Ns \e)
    +Defines a subexpression
    +.Ar re .
    +Subexpressions may be nested.
    +A subsequent backreference of the form
    +.Pf \e Ns Ar n ,
    +where
    +.Ar n
    +is a number in the range [1,9], expands to the text matched by the
    +.Ar n Ns th
    +subexpression.
    +For example, the regular expression
    +.Ar \e(.*\e)\e1
    +matches any string consisting of identical adjacent substrings.
    +Subexpressions are ordered relative to their left delimiter.
    +.It *
    +Matches the single character regular expression or subexpression
    +immediately preceding it zero or more times.
    +If
    +.Sq *
    +is the first character of a regular expression or subexpression,
    +then it matches itself.
    +The
    +.Sq *
    +operator sometimes yields unexpected results.
    +For example, the regular expression
    +.Ar b*
    +matches the beginning of the string
    +.Qq abbb
    +(as opposed to the substring
    +.Qq bbb ) ,
    +since a null match is the only leftmost match.
    +.Sm off
    +.It Xo
    +.Pf \e{ Ar n , m No \e}\ \&
    +.Pf \e{ Ar n , No \e}\ \&
    +.Pf \e{ Ar n No \e}
    +.Xc
    +.Sm on
    +Matches the single character regular expression or subexpression
    +immediately preceding it at least
    +.Ar n
    +and at most
    +.Ar m
    +times.
    +If
    +.Ar m
    +is omitted, then it matches at least
    +.Ar n
    +times.
    +If the comma is also omitted, then it matches exactly
    +.Ar n
    +times.
    +.El
    +.Sh SEE ALSO
    +.Xr ctype 3 ,
    +.Xr regex 3
    +.Sh STANDARDS
    +.St -p1003.1-2004 :
    +Base Definitions, Chapter 9 (Regular Expressions).
    +.Sh BUGS
    +Having two kinds of REs is a botch.
    +.Pp
    +The current POSIX spec says that
    +.Sq )\&
    +is an ordinary character in the absence of an unmatched
    +.Sq ( ;
    +this was an unintentional result of a wording error,
    +and change is likely.
    +Avoid relying on it.
    +.Pp
    +Back-references are a dreadful botch,
    +posing major problems for efficient implementations.
    +They are also somewhat vaguely defined
    +(does
    +.Sq a\e(\e(b\e)*\e2\e)*d
    +match
    +.Sq abbbd ? ) .
    +Avoid using them.
    +.Pp
    +POSIX's specification of case-independent matching is vague.
    +The
    +.Dq one case implies all cases
    +definition given above
    +is the current consensus among implementors as to the right interpretation.
    +.Pp
    +The syntax for word boundaries is incredibly ugly.
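
The leftmost-longest rule and the BRE back-reference example described in this manual page can be checked against a system's own regex library. The following is a minimal sketch, assuming a POSIX `<regex.h>` whose matcher follows the rules above; the `Span` struct and the `find` helper are hypothetical names introduced only for this illustration and are not part of any library.

```cpp
#include <regex.h>

#include <cassert>
#include <string>

// Hypothetical result type for this sketch: where the overall match fell.
struct Span {
  bool matched;
  long start;   // offset of the first matched character
  long length;  // number of characters matched
};

// Hypothetical helper: compile `pattern` as an ERE (extended == true) or a
// BRE (extended == false) and report the overall match within `text`.
static Span find(const std::string &pattern, const std::string &text,
                 bool extended) {
  Span result = {false, 0, 0};
  regex_t re;
  if (regcomp(&re, pattern.c_str(), extended ? REG_EXTENDED : 0) != 0)
    return result;  // the pattern did not compile

  regmatch_t m[1];
  if (regexec(&re, text.c_str(), 1, m, 0) == 0) {
    // Sanity check: a successful match always has a valid span.
    assert(m[0].rm_so >= 0 && m[0].rm_eo >= m[0].rm_so);
    result.matched = true;
    result.start = static_cast<long>(m[0].rm_so);
    result.length = static_cast<long>(m[0].rm_eo - m[0].rm_so);
  }
  regfree(&re);
  return result;
}
```

With this helper, `find("bb*", "abbbc", true)` should report the three middle characters, `find("(wee|week)(knights|nights)", "weeknights", true)` should cover all ten characters, and the BRE `\([bc]\)\1` (written `"\\([bc]\\)\\1"` in C++ source) should match `bb` but not `bc`. Implementations that do not follow the leftmost-longest rule may report different spans.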
    
    Added: www-releases/trunk/2.8/docs/tutorial/LangImpl1.html
    URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/LangImpl1.html?rev=115556&view=auto
    ==============================================================================
    --- www-releases/trunk/2.8/docs/tutorial/LangImpl1.html (added)
    +++ www-releases/trunk/2.8/docs/tutorial/LangImpl1.html Mon Oct  4 15:49:23 2010
    @@ -0,0 +1,348 @@
    +
    +
    +
    +
    +  Kaleidoscope: Tutorial Introduction and the Lexer
    +  
    +  
    +  
    +
    +
    +
    +
    +
    Kaleidoscope: Tutorial Introduction and the Lexer
    + + + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    Welcome to the "Implementing a language with LLVM" tutorial. This tutorial +runs through the implementation of a simple language, showing how fun and +easy it can be. This tutorial will get you up and started as well as help to +build a framework you can extend to other languages. The code in this tutorial +can also be used as a playground to hack on other LLVM specific things. +

    + +

    +The goal of this tutorial is to progressively unveil our language, describing +how it is built up over time. This will let us cover a fairly broad range of +language design and LLVM-specific usage issues, showing and explaining the code +for it all along the way, without overwhelming you with tons of details up +front.

    + +

    It is useful to point out ahead of time that this tutorial is really about +teaching compiler techniques and LLVM specifically, not about teaching +modern and sane software engineering principles. In practice, this means that +we'll take a number of shortcuts to simplify the exposition. For example, the +code leaks memory, uses global variables all over the place, doesn't use nice +design patterns like visitors, etc... but it +is very simple. If you dig in and use the code as a basis for future projects, +fixing these deficiencies shouldn't be hard.

    + +

    I've tried to put this tutorial together in a way that makes chapters easy to +skip over if you are already familiar with or are uninterested in the various +pieces. The structure of the tutorial is: +

    + +
      +
    • Chapter #1: Introduction to the Kaleidoscope +language, and the definition of its Lexer - This shows where we are going +and the basic functionality that we want it to have. In order to make this +tutorial maximally understandable and hackable, we choose to implement +everything in C++ instead of using lexer and parser generators. LLVM obviously +works just fine with such tools, so feel free to use one if you prefer.
    • +
    • Chapter #2: Implementing a Parser and +AST - With the lexer in place, we can talk about parsing techniques and +basic AST construction. This tutorial describes recursive descent parsing and +operator precedence parsing. Nothing in Chapters 1 or 2 is LLVM-specific; +the code doesn't even link in LLVM at this point. :)
    • +
    • Chapter #3: Code generation to LLVM IR - +With the AST ready, we can show off how easy generation of LLVM IR really +is.
    • +
    • Chapter #4: Adding JIT and Optimizer +Support - Because a lot of people are interested in using LLVM as a JIT, +we'll dive right into it and show you the 3 lines it takes to add JIT support. +LLVM is also useful in many other ways, but this is one simple and "sexy" way +to show off its power. :)
    • +
    • Chapter #5: Extending the Language: Control +Flow - With the language up and running, we show how to extend it with +control flow operations (if/then/else and a 'for' loop). This gives us a chance +to talk about simple SSA construction and control flow.
    • +
    • Chapter #6: Extending the Language: +User-defined Operators - This is a silly but fun chapter that talks about +extending the language to let the user program define their own arbitrary +unary and binary operators (with assignable precedence!). This lets us build a +significant piece of the "language" as library routines.
    • +
    • Chapter #7: Extending the Language: Mutable +Variables - This chapter talks about adding user-defined local variables +along with an assignment operator. The interesting part about this is how +easy and trivial it is to construct SSA form in LLVM: no, LLVM does not +require your front-end to construct SSA form!
    • +
    • Chapter #8: Conclusion and other useful LLVM +tidbits - This chapter wraps up the series by talking about potential +ways to extend the language, but also includes a bunch of pointers to info about +"special topics" like adding garbage collection support, exceptions, debugging, +support for "spaghetti stacks", and a bunch of other tips and tricks.
    • + +
    + +

    By the end of the tutorial, we'll have written a bit less than 700 +non-comment, non-blank lines of code. With this small amount of code, we'll +have built up a very reasonable compiler for a non-trivial language including +a hand-written lexer, parser, AST, as well as code generation support with a JIT +compiler. While other systems may have interesting "hello world" tutorials, +I think the breadth of this tutorial is a great testament to the strengths of +LLVM and why you should consider it if you're interested in language or compiler +design.

    + +

    A note about this tutorial: we expect you to extend the language and play +with it on your own. Take the code and go crazy hacking away at it; compilers +don't need to be scary creatures - it can be a lot of fun to play with +languages!

    + +
    + + + + + +
    + +

    This tutorial will be illustrated with a toy language that we'll call +"Kaleidoscope" (derived +from Greek words meaning "beautiful", "form", and "view"). +Kaleidoscope is a procedural language that allows you to define functions, use +conditionals, math, etc. Over the course of the tutorial, we'll extend +Kaleidoscope to support the if/then/else construct, a for loop, user defined +operators, JIT compilation with a simple command line interface, etc.

    + +

    Because we want to keep things simple, the only datatype in Kaleidoscope is a +64-bit floating point type (aka 'double' in C parlance). As such, all values +are implicitly double precision and the language doesn't require type +declarations. This gives the language a very nice and simple syntax. For +example, the following code computes Fibonacci numbers:

    + +
    +
    +# Compute the x'th fibonacci number.
    +def fib(x)
    +  if x < 3 then
    +    1
    +  else
    +    fib(x-1)+fib(x-2)
    +
    +# This expression will compute the 40th number.
    +fib(40)
    +
    +
    + +

    We also allow Kaleidoscope to call into standard library functions (the LLVM +JIT makes this completely trivial). This means that you can use the 'extern' +keyword to declare a function before you use it (this is also useful for mutually +recursive functions). For example:

    + +
    +
    +extern sin(arg);
    +extern cos(arg);
    +extern atan2(arg1 arg2);
    +
    +atan2(sin(.4), cos(42))
    +
    +
    + +

    A more interesting example is included in Chapter 6 where we write a little +Kaleidoscope application that displays +a Mandelbrot Set at various levels of magnification.

    + +

    Let's dive into the implementation of this language!

    + +
    + + + + + +
    + +

    When it comes to implementing a language, the first thing needed is +the ability to process a text file and recognize what it says. The traditional +way to do this is to use a "lexer" (aka 'scanner') +to break the input up into "tokens". Each token returned by the lexer includes +a token code and potentially some metadata (e.g. the numeric value of a number). +First, we define the possibilities: +

    + +
    +
    +// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
    +// of these for known things.
    +enum Token {
    +  tok_eof = -1,
    +
    +  // commands
    +  tok_def = -2, tok_extern = -3,
    +
    +  // primary
    +  tok_identifier = -4, tok_number = -5,
    +};
    +
    +static std::string IdentifierStr;  // Filled in if tok_identifier
    +static double NumVal;              // Filled in if tok_number
    +
    +
    + +

    Each token returned by our lexer will either be one of the Token enum values +or it will be an 'unknown' character like '+', which is returned as its ASCII +value. If the current token is an identifier, the IdentifierStr +global variable holds the name of the identifier. If the current token is a +numeric literal (like 1.0), NumVal holds its value. Note that we use +global variables for simplicity; this is not the best choice for a real language +implementation :). +

    + +

    The actual implementation of the lexer is a single function named +gettok. The gettok function is called to return the next token +from standard input. Its definition starts as:

    + +
    +
    +/// gettok - Return the next token from standard input.
    +static int gettok() {
    +  static int LastChar = ' ';
    +
    +  // Skip any whitespace.
    +  while (isspace(LastChar))
    +    LastChar = getchar();
    +
    +
    + +

    +gettok works by calling the C getchar() function to read +characters one at a time from standard input. It eats them as it recognizes +them and stores the last character read, but not processed, in LastChar. The +first thing that it has to do is ignore whitespace between tokens. This is +accomplished with the loop above.

    + +

    The next thing gettok needs to do is recognize identifiers and +specific keywords like "def". Kaleidoscope does this with this simple loop:

    + +
    +
    +  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    +    IdentifierStr = LastChar;
    +    while (isalnum((LastChar = getchar())))
    +      IdentifierStr += LastChar;
    +
    +    if (IdentifierStr == "def") return tok_def;
    +    if (IdentifierStr == "extern") return tok_extern;
    +    return tok_identifier;
    +  }
    +
    +
    + +

    Note that this code sets the 'IdentifierStr' global whenever it +lexes an identifier. Also, since language keywords are matched by the same +loop, we handle them here inline. Numeric values are similar:

    + +
    +
    +  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    +    std::string NumStr;
    +    do {
    +      NumStr += LastChar;
    +      LastChar = getchar();
    +    } while (isdigit(LastChar) || LastChar == '.');
    +
    +    NumVal = strtod(NumStr.c_str(), 0);
    +    return tok_number;
    +  }
    +
    +
    + +

    This is all pretty straight-forward code for processing input. When reading +a numeric value from input, we use the C strtod function to convert it +to a numeric value that we store in NumVal. Note that this isn't doing +sufficient error checking: it will incorrectly read "1.23.45.67" and handle it as +if you typed in "1.23". Feel free to extend it :). Next we handle comments: +
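
Before moving on to comments, a brief aside on the error checking mentioned above. The following is a minimal sketch, not part of the tutorial code: ParseNumberStrict is a hypothetical helper that uses strtod's end-pointer argument to reject input such as "1.23.45.67" instead of silently reading it as "1.23".

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not in the tutorial): converts Text to a double and
// reports failure unless strtod consumed the entire string.
static bool ParseNumberStrict(const std::string &Text, double &Out) {
  const char *Begin = Text.c_str();
  char *End = 0;
  double Val = strtod(Begin, &End);
  // End == Begin: nothing was converted.  *End != 0: trailing characters.
  if (End == Begin || *End != '\0')
    return false;
  Out = Val;
  return true;
}
```

With a check like this, the lexer could report an error for a malformed NumStr rather than returning tok_number. The tutorial's own comment-handling code follows.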

    + +
    +
    +  if (LastChar == '#') {
    +    // Comment until end of line.
    +    do LastChar = getchar();
    +    while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
    +    
    +    if (LastChar != EOF)
    +      return gettok();
    +  }
    +
    +
    + +

    We handle comments by skipping to the end of the line and then returning the +next token. Finally, if the input doesn't match one of the above cases, it is +either an operator character like '+' or the end of the file. These are handled +with this code:

    + +
    +
    +  // Check for end of file.  Don't eat the EOF.
    +  if (LastChar == EOF)
    +    return tok_eof;
    +  
    +  // Otherwise, just return the character as its ascii value.
    +  int ThisChar = LastChar;
    +  LastChar = getchar();
    +  return ThisChar;
    +}
    +
    +
    + +

    With this, we have the complete lexer for the basic Kaleidoscope language +(the full code listing for the Lexer is +available in the next chapter of the tutorial). +Next we'll build a simple parser that uses this to +build an Abstract Syntax Tree. When we have that, we'll include a driver +so that you can use the lexer and parser together. +
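
To experiment with the lexer without typing at a terminal, the same logic can be pointed at a string. This is only a sketch under stated assumptions: StringLexer, its next() member, and LexAll are hypothetical names that do not appear in the tutorial, and the globals IdentifierStr and NumVal become members here. The token logic itself mirrors gettok as shown above.

```cpp
#include <cctype>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

enum Token { tok_eof = -1, tok_def = -2, tok_extern = -3,
             tok_identifier = -4, tok_number = -5 };

// Hypothetical variant of gettok that reads from a string instead of stdin.
struct StringLexer {
  std::string Input;
  size_t Pos;
  int LastChar;
  std::string IdentifierStr;  // filled in if tok_identifier
  double NumVal;              // filled in if tok_number

  explicit StringLexer(const std::string &In)
    : Input(In), Pos(0), LastChar(' '), NumVal(0) {}

  // Stand-in for getchar(): next character, or EOF at the end of Input.
  int next() {
    return Pos < Input.size() ? (unsigned char)Input[Pos++] : EOF;
  }

  int gettok() {
    while (isspace(LastChar))            // skip whitespace
      LastChar = next();

    if (isalpha(LastChar)) {             // identifier: [a-zA-Z][a-zA-Z0-9]*
      IdentifierStr = (char)LastChar;
      while (isalnum(LastChar = next()))
        IdentifierStr += (char)LastChar;
      if (IdentifierStr == "def") return tok_def;
      if (IdentifierStr == "extern") return tok_extern;
      return tok_identifier;
    }

    if (isdigit(LastChar) || LastChar == '.') {  // number: [0-9.]+
      std::string NumStr;
      do {
        NumStr += (char)LastChar;
        LastChar = next();
      } while (isdigit(LastChar) || LastChar == '.');
      NumVal = strtod(NumStr.c_str(), 0);
      return tok_number;
    }

    if (LastChar == '#') {               // comment until end of line
      do LastChar = next();
      while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
      if (LastChar != EOF)
        return gettok();
    }

    if (LastChar == EOF)                 // don't eat the EOF
      return tok_eof;

    int ThisChar = LastChar;             // any other character: its ASCII value
    LastChar = next();
    return ThisChar;
  }
};

// Collects every token up to and including tok_eof.
static std::vector<int> LexAll(const std::string &Src) {
  StringLexer L(Src);
  std::vector<int> Toks;
  int T;
  do {
    T = L.gettok();
    Toks.push_back(T);
  } while (T != tok_eof);
  return Toks;
}
```

For example, LexAll("def f(x) x+1.5 # note") yields tok_def, tok_identifier, '(', tok_identifier, ')', tok_identifier, '+', tok_number, and tok_eof; the trailing comment produces no token.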

    + +Next: Implementing a Parser and AST +
    + + +
    +
    + Chris Lattner
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/tutorial/LangImpl2.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/LangImpl2.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/tutorial/LangImpl2.html (added) +++ www-releases/trunk/2.8/docs/tutorial/LangImpl2.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1233 @@ + + + + + Kaleidoscope: Implementing a Parser and AST + + + + + + + +
    Kaleidoscope: Implementing a Parser and AST
    + + + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    Welcome to Chapter 2 of the "Implementing a language +with LLVM" tutorial. This chapter shows you how to use the lexer, built in +Chapter 1, to build a full parser for +our Kaleidoscope language. Once we have a parser, we'll define and build an Abstract Syntax +Tree (AST).

    + +

    The parser we will build uses a combination of Recursive Descent +Parsing and Operator-Precedence +Parsing to parse the Kaleidoscope language (the latter for +binary expressions and the former for everything else). Before we get to +parsing though, let's talk about the output of the parser: the Abstract Syntax +Tree.

    + +
    + + + + + +
    + +

    The AST for a program captures its behavior in such a way that it is easy for +later stages of the compiler (e.g. code generation) to interpret. We basically +want one object for each construct in the language, and the AST should closely +model the language. In Kaleidoscope, we have expressions, a prototype, and a +function object. We'll start with expressions first:

    + +
    +
    +/// ExprAST - Base class for all expression nodes.
    +class ExprAST {
    +public:
    +  virtual ~ExprAST() {}
    +};
    +
    +/// NumberExprAST - Expression class for numeric literals like "1.0".
    +class NumberExprAST : public ExprAST {
    +  double Val;
    +public:
    +  NumberExprAST(double val) : Val(val) {}
    +};
    +
    +
    + +

    The code above shows the definition of the base ExprAST class and one +subclass which we use for numeric literals. The important thing to note about +this code is that the NumberExprAST class captures the numeric value of the +literal as an instance variable. This allows later phases of the compiler to +know what the stored numeric value is.

    + +

    Right now we only create the AST, so there are no useful accessor methods on +them. It would be very easy to add a virtual method to pretty print the code, +for example. Here are the other expression AST node definitions that we'll use +in the basic form of the Kaleidoscope language: +
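
Before those definitions, here is a sketch of the pretty-printing idea mentioned above. It is an illustration only: the print method, the ToString helper, and the BinaryExprAST destructor are assumptions added for this example and are not part of the tutorial's classes.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class ExprAST {
public:
  virtual ~ExprAST() {}
  // Hypothetical addition: write a textual form of this node to OS.
  virtual void print(std::ostream &OS) const = 0;
};

class NumberExprAST : public ExprAST {
  double Val;
public:
  NumberExprAST(double val) : Val(val) {}
  virtual void print(std::ostream &OS) const { OS << Val; }
};

class VariableExprAST : public ExprAST {
  std::string Name;
public:
  VariableExprAST(const std::string &name) : Name(name) {}
  virtual void print(std::ostream &OS) const { OS << Name; }
};

class BinaryExprAST : public ExprAST {
  char Op;
  ExprAST *LHS, *RHS;
public:
  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
    : Op(op), LHS(lhs), RHS(rhs) {}
  // Owns its operands in this sketch (an assumption, not tutorial behavior).
  ~BinaryExprAST() { delete LHS; delete RHS; }
  virtual void print(std::ostream &OS) const {
    OS << '(';
    LHS->print(OS);
    OS << ' ' << Op << ' ';
    RHS->print(OS);
    OS << ')';
  }
};

// Convenience wrapper: render a node into a std::string.
static std::string ToString(const ExprAST &E) {
  std::ostringstream OS;
  E.print(OS);
  return OS.str();
}
```

Each node prints itself and recurses into its children, so the parentheses in the output come from the tree shape rather than from the original source text. The tutorial's own node definitions, which have no print method, follow.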

    + +
    +
    +/// VariableExprAST - Expression class for referencing a variable, like "a".
    +class VariableExprAST : public ExprAST {
    +  std::string Name;
    +public:
    +  VariableExprAST(const std::string &name) : Name(name) {}
    +};
    +
    +/// BinaryExprAST - Expression class for a binary operator.
    +class BinaryExprAST : public ExprAST {
    +  char Op;
    +  ExprAST *LHS, *RHS;
    +public:
    +  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) 
    +    : Op(op), LHS(lhs), RHS(rhs) {}
    +};
    +
    +/// CallExprAST - Expression class for function calls.
    +class CallExprAST : public ExprAST {
    +  std::string Callee;
    +  std::vector<ExprAST*> Args;
    +public:
    +  CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
    +    : Callee(callee), Args(args) {}
    +};
    +
    +
    + +

    This is all (intentionally) rather straight-forward: variables capture the +variable name, binary operators capture their opcode (e.g. '+'), and calls +capture a function name as well as a list of any argument expressions. One thing +that is nice about our AST is that it captures the language features without +talking about the syntax of the language. Note that there is no discussion about +precedence of binary operators, lexical structure, etc.

    + +

    For our basic language, these are all of the expression nodes we'll define. +Because it doesn't have conditional control flow, it isn't Turing-complete; +we'll fix that in a later installment. The two things we need next are a way +to talk about the interface to a function, and a way to talk about functions +themselves:

    + +
    +
    +/// PrototypeAST - This class represents the "prototype" for a function,
    +/// which captures its name, and its argument names (thus implicitly the number
    +/// of arguments the function takes).
    +class PrototypeAST {
    +  std::string Name;
    +  std::vector<std::string> Args;
    +public:
    +  PrototypeAST(const std::string &name, const std::vector<std::string> &args)
    +    : Name(name), Args(args) {}
    +};
    +
    +/// FunctionAST - This class represents a function definition itself.
    +class FunctionAST {
    +  PrototypeAST *Proto;
    +  ExprAST *Body;
    +public:
    +  FunctionAST(PrototypeAST *proto, ExprAST *body)
    +    : Proto(proto), Body(body) {}
    +};
    +
    +
    + +

    In Kaleidoscope, functions are typed with just a count of their arguments. +Since all values are double precision floating point, the type of each argument +doesn't need to be stored anywhere. In a more aggressive and realistic +language, the "ExprAST" class would probably have a type field.
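
Since the argument count is the only "type" information a function has, a later phase could check calls against it. The following is a rough sketch of that idea; FunctionArity, DeclareFunction, and CallMatches are hypothetical names and are not part of the parser built in this chapter.

```cpp
#include <map>
#include <string>

// Hypothetical symbol table: function name -> number of arguments.
static std::map<std::string, unsigned> FunctionArity;

// Records a prototype's name and argument count.
static void DeclareFunction(const std::string &Name, unsigned NumArgs) {
  FunctionArity[Name] = NumArgs;
}

// Returns true if Name was declared and takes exactly NumArgs arguments.
static bool CallMatches(const std::string &Name, unsigned NumArgs) {
  std::map<std::string, unsigned>::const_iterator I = FunctionArity.find(Name);
  return I != FunctionArity.end() && I->second == NumArgs;
}
```

Because every value is a double, comparing the counts is all the checking such a table would need to do.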

    + +

    With this scaffolding, we can now talk about parsing expressions and function +bodies in Kaleidoscope.

    + +
    + + + + + +
    + +

    Now that we have an AST to build, we need to define the parser code to build +it. The idea here is that we want to parse something like "x+y" (which is +returned as three tokens by the lexer) into an AST that could be generated with +calls like this:

    + +
    +
    +  ExprAST *X = new VariableExprAST("x");
    +  ExprAST *Y = new VariableExprAST("y");
    +  ExprAST *Result = new BinaryExprAST('+', X, Y);
    +
    +
    + +

    In order to do this, we'll start by defining some basic helper routines:

    + +
    +
    +/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
    +/// token the parser is looking at.  getNextToken reads another token from the
    +/// lexer and updates CurTok with its results.
    +static int CurTok;
    +static int getNextToken() {
    +  return CurTok = gettok();
    +}
    +
    +
    + +

    +This implements a simple token buffer around the lexer. This allows +us to look one token ahead at what the lexer is returning. Every function in +our parser will assume that CurTok is the current token that needs to be +parsed.

    + +
    +
    +
    +/// Error* - These are little helper functions for error handling.
    +ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
    +PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
    +FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
    +
    +
    + +

    +The Error routines are simple helper routines that our parser will use +to handle errors. The error recovery in our parser will not be the best and +is not particularly user-friendly, but it will be enough for our tutorial. These +routines make it easier to handle errors in routines that have various return +types: they always return null.

    + +

    With these basic helper functions, we can implement the first +piece of our grammar: numeric literals.

    + +
    + + + + + +
    + +

    We start with numeric literals, because they are the simplest to process. +For each production in our grammar, we'll define a function which parses that +production. For numeric literals, we have: +

    + +
    +
    +/// numberexpr ::= number
    +static ExprAST *ParseNumberExpr() {
    +  ExprAST *Result = new NumberExprAST(NumVal);
    +  getNextToken(); // consume the number
    +  return Result;
    +}
    +
    +
    + +

    This routine is very simple: it expects to be called when the current token +is a tok_number token. It takes the current number value, creates +a NumberExprAST node, advances the lexer to the next token, and finally +returns.

    + +

    There are some interesting aspects to this. The most important one is that +this routine eats all of the tokens that correspond to the production and +returns the lexer buffer with the next token (which is not part of the grammar +production) ready to go. This is a fairly standard way to go for recursive +descent parsers. For a better example, the parenthesis operator is defined like +this:

    + +
    +
    +/// parenexpr ::= '(' expression ')'
    +static ExprAST *ParseParenExpr() {
    +  getNextToken();  // eat (.
    +  ExprAST *V = ParseExpression();
    +  if (!V) return 0;
    +  
    +  if (CurTok != ')')
    +    return Error("expected ')'");
    +  getNextToken();  // eat ).
    +  return V;
    +}
    +
    +
    + +

    This function illustrates a number of interesting things about the +parser:

    + +

    +1) It shows how we use the Error routines. When called, this function expects +that the current token is a '(' token, but after parsing the subexpression, it +is possible that there is no ')' waiting. For example, if the user types in +"(4 x" instead of "(4)", the parser should emit an error. Because errors can +occur, the parser needs a way to indicate that they happened: in our parser, we +return null on an error.

    + +

    2) Another interesting aspect of this function is that it uses recursion by +calling ParseExpression (we will soon see that ParseExpression can call +ParseParenExpr). This is powerful because it allows us to handle +recursive grammars, and keeps each production very simple. Note that +parentheses do not cause construction of AST nodes themselves. While we could +do it this way, the most important role of parentheses is to guide the parser +and provide grouping. Once the parser constructs the AST, parentheses are not +needed.
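
Both points can be seen in miniature with a parser that reads from a string. This is a sketch under simplifying assumptions, not the tutorial's code: MiniParser is a hypothetical name, the grammar is reduced to numbers and nested parentheses, and the "AST" is just a heap-allocated double. As in the tutorial, null signals an error.

```cpp
#include <cctype>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical miniature parser over a string.
struct MiniParser {
  std::string Input;
  size_t Pos;
  explicit MiniParser(const std::string &In) : Input(In), Pos(0) {}

  // Current non-space character, or 0 at the end of the input.
  char cur() {
    while (Pos < Input.size() && isspace((unsigned char)Input[Pos]))
      ++Pos;
    return Pos < Input.size() ? Input[Pos] : 0;
  }

  // primary ::= number | '(' primary ')'
  double *parsePrimary() {
    if (cur() == '(') {
      ++Pos;                        // eat '('
      double *V = parsePrimary();   // recurse for the nested expression
      if (!V) return 0;
      if (cur() != ')') {           // e.g. "(4 x"
        fprintf(stderr, "Error: expected ')'\n");
        delete V;
        return 0;
      }
      ++Pos;                        // eat ')'
      return V;                     // no node is created for the parentheses
    }
    if (isdigit((unsigned char)cur())) {
      const char *Begin = Input.c_str() + Pos;
      char *End = 0;
      double Val = strtod(Begin, &End);
      Pos += End - Begin;           // consume the characters strtod read
      return new double(Val);
    }
    fprintf(stderr, "Error: unknown token when expecting an expression\n");
    return 0;
  }
};
```

Parsing "((42))" returns the same single value that parsing "42" would, while "(4 x" prints an error and returns null.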

    + +

    The next simple production is for handling variable references and function +calls:

    + +
    +
    +/// identifierexpr
    +///   ::= identifier
    +///   ::= identifier '(' expression* ')'
    +static ExprAST *ParseIdentifierExpr() {
    +  std::string IdName = IdentifierStr;
    +  
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '(') // Simple variable ref.
    +    return new VariableExprAST(IdName);
    +  
    +  // Call.
    +  getNextToken();  // eat (
    +  std::vector<ExprAST*> Args;
    +  if (CurTok != ')') {
    +    while (1) {
    +      ExprAST *Arg = ParseExpression();
    +      if (!Arg) return 0;
    +      Args.push_back(Arg);
    +
    +      if (CurTok == ')') break;
    +
    +      if (CurTok != ',')
    +        return Error("Expected ')' or ',' in argument list");
    +      getNextToken();
    +    }
    +  }
    +
    +  // Eat the ')'.
    +  getNextToken();
    +  
    +  return new CallExprAST(IdName, Args);
    +}
    +
    +
    + +

    This routine follows the same style as the other routines. (It expects to be +called if the current token is a tok_identifier token.) It also has +recursion and error handling. One interesting aspect of this is that it uses +look-ahead to determine if the current identifier is a standalone +variable reference or if it is a function call expression. It handles this by +checking to see if the token after the identifier is a '(' token, constructing +either a VariableExprAST or CallExprAST node as appropriate. +

    + +

    Now that we have all of our simple expression-parsing logic in place, we can +define a helper function to wrap it together into one entry point. We call this +class of expressions "primary" expressions, for reasons that will become more +clear later in the tutorial. In order to +parse an arbitrary primary expression, we need to determine what sort of +expression it is:

    + +
    +
    +/// primary
    +///   ::= identifierexpr
    +///   ::= numberexpr
    +///   ::= parenexpr
    +static ExprAST *ParsePrimary() {
    +  switch (CurTok) {
    +  default: return Error("unknown token when expecting an expression");
    +  case tok_identifier: return ParseIdentifierExpr();
    +  case tok_number:     return ParseNumberExpr();
    +  case '(':            return ParseParenExpr();
    +  }
    +}
    +
    +
    + +

    Now that you see the definition of this function, it is more obvious why we +can assume the state of CurTok in the various functions. This uses look-ahead +to determine which sort of expression is being inspected, and then parses it +with a function call.

    + +

    Now that basic expressions are handled, we need to handle binary expressions. +They are a bit more complex.

    + +
    + + + + + +
    + +

    Binary expressions are significantly harder to parse because they are often +ambiguous. For example, when given the string "x+y*z", the parser can choose +to parse it as either "(x+y)*z" or "x+(y*z)". With common definitions from +mathematics, we expect the latter parse, because "*" (multiplication) has +higher precedence than "+" (addition).

    + +

    There are many ways to handle this, but an elegant and efficient way is to +use Operator-Precedence +Parsing. This parsing technique uses the precedence of binary operators to +guide recursion. To start with, we need a table of precedences:

    + +
    +
    +/// BinopPrecedence - This holds the precedence for each binary operator that is
    +/// defined.
    +static std::map<char, int> BinopPrecedence;
    +
    +/// GetTokPrecedence - Get the precedence of the pending binary operator token.
    +static int GetTokPrecedence() {
    +  if (!isascii(CurTok))
    +    return -1;
    +    
    +  // Make sure it's a declared binop.
    +  int TokPrec = BinopPrecedence[CurTok];
    +  if (TokPrec <= 0) return -1;
    +  return TokPrec;
    +}
    +
    +int main() {
    +  // Install standard binary operators.
    +  // 1 is lowest precedence.
    +  BinopPrecedence['<'] = 10;
    +  BinopPrecedence['+'] = 20;
    +  BinopPrecedence['-'] = 20;
    +  BinopPrecedence['*'] = 40;  // highest.
    +  ...
    +}
    +
    +
    + +

    For the basic form of Kaleidoscope, we will only support 4 binary operators +(this can obviously be extended by you, our brave and intrepid reader). The +GetTokPrecedence function returns the precedence for the current token, +or -1 if the token is not a binary operator. Having a map makes it easy to add +new operators and makes it clear that the algorithm doesn't depend on the +specific operators involved, but it would be easy enough to eliminate the map +and do the comparisons in the GetTokPrecedence function. (Or just use +a fixed-size array).
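
The fixed-size array alternative mentioned above might look like the following. This is a sketch only; PrecedenceTable, InstallStandardBinops, and GetPrecedence are hypothetical names, and the function takes the token as a parameter instead of reading CurTok.

```cpp
// Hypothetical array-based replacement for the BinopPrecedence map.
// The index is the operator's ASCII code; 0 means "not a binary operator".
static int PrecedenceTable[128];

static void InstallStandardBinops() {
  PrecedenceTable['<'] = 10;
  PrecedenceTable['+'] = 20;
  PrecedenceTable['-'] = 20;
  PrecedenceTable['*'] = 40;  // highest
}

// Returns the precedence of Tok, or -1 if Tok is not a declared binop.
static int GetPrecedence(int Tok) {
  if (Tok < 0 || Tok >= 128)  // negative token codes such as tok_identifier
    return -1;
  int Prec = PrecedenceTable[Tok];
  return Prec <= 0 ? -1 : Prec;
}
```

Unlike the map version, a lookup here never inserts a new entry, at the cost of a fixed 128-entry table.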

    + +

    With the helper above defined, we can now start parsing binary expressions. +The basic idea of operator precedence parsing is to break down an expression +with potentially ambiguous binary operators into pieces. Consider, for example, +the expression "a+b+(c+d)*e*f+g". Operator precedence parsing considers this +as a stream of primary expressions separated by binary operators. As such, +it will first parse the leading primary expression "a", then it will see the +pairs [+, b] [+, (c+d)] [*, e] [*, f] and [+, g]. Note that because parentheses +are primary expressions, the binary expression parser doesn't need to worry +about nested subexpressions like (c+d) at all. +
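
The decomposition into a leading primary and a stream of pairs can be made concrete with a small string-splitting sketch. It is not how the parser works internally, since the parser consumes tokens as it goes; ReadPrimary, OpPair, and SplitPairs are hypothetical names, and the code assumes single-character variables, balanced parentheses, and no spaces.

```cpp
#include <string>
#include <utility>
#include <vector>

// Reads one primary starting at Pos: either a parenthesized group or a
// single character.  Advances Pos past it.
static std::string ReadPrimary(const std::string &S, size_t &Pos) {
  size_t Start = Pos;
  if (S[Pos] == '(') {
    int Depth = 0;
    do {
      if (S[Pos] == '(') ++Depth;
      else if (S[Pos] == ')') --Depth;
      ++Pos;
    } while (Depth > 0 && Pos < S.size());
  } else {
    ++Pos;
  }
  return S.substr(Start, Pos - Start);
}

typedef std::pair<char, std::string> OpPair;

// Stores the leading primary in First and returns the [binop, primary] pairs.
static std::vector<OpPair> SplitPairs(const std::string &S,
                                      std::string &First) {
  std::vector<OpPair> Pairs;
  if (S.empty())
    return Pairs;
  size_t Pos = 0;
  First = ReadPrimary(S, Pos);
  while (Pos + 1 < S.size()) {
    char Op = S[Pos++];
    std::string Primary = ReadPrimary(S, Pos);
    Pairs.push_back(OpPair(Op, Primary));
  }
  return Pairs;
}
```

For "a+b+(c+d)*e*f+g" this produces the leading primary "a" and the five pairs listed above, with "(c+d)" kept intact as one primary.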

    + +

    +To start, an expression is a primary expression potentially followed by a +sequence of [binop,primaryexpr] pairs:

    + +
    +
    +/// expression
    +///   ::= primary binoprhs
    +///
    +static ExprAST *ParseExpression() {
    +  ExprAST *LHS = ParsePrimary();
    +  if (!LHS) return 0;
    +  
    +  return ParseBinOpRHS(0, LHS);
    +}
    +
    +
    + +

    ParseBinOpRHS is the function that parses the sequence of pairs for +us. It takes a precedence and a pointer to an expression for the part that has been +parsed so far. Note that "x" is a perfectly valid expression: As such, "binoprhs" is +allowed to be empty, in which case it returns the expression that is passed into +it. In our example above, the code passes the expression for "a" into +ParseBinOpRHS and the current token is "+".

    + +

    The precedence value passed into ParseBinOpRHS indicates the +minimal operator precedence that the function is allowed to eat. For +example, if the current pair stream is [+, x] and ParseBinOpRHS is +passed in a precedence of 40, it will not consume any tokens (because the +precedence of '+' is only 20). With this in mind, ParseBinOpRHS starts +with:

    + +
    +
    +/// binoprhs
    +///   ::= ('+' primary)*
    +static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
    +  // If this is a binop, find its precedence.
    +  while (1) {
    +    int TokPrec = GetTokPrecedence();
    +    
    +    // If this is a binop that binds at least as tightly as the current binop,
    +    // consume it, otherwise we are done.
    +    if (TokPrec < ExprPrec)
    +      return LHS;
    +
    +
    + +

    This code gets the precedence of the current token and checks to see if it is +too low. Because we defined invalid tokens to have a precedence of -1, this +check implicitly knows that the pair-stream ends when the token stream runs out +of binary operators. If the token passes this check (its precedence is not too low), we know that it is a binary +operator and that it will be included in this expression:

    + +
    +
    +    // Okay, we know this is a binop.
    +    int BinOp = CurTok;
    +    getNextToken();  // eat binop
    +    
    +    // Parse the primary expression after the binary operator.
    +    ExprAST *RHS = ParsePrimary();
    +    if (!RHS) return 0;
    +
    +
    + +

    As such, this code eats (and remembers) the binary operator and then parses +the primary expression that follows. This builds up the whole pair, the first of +which is [+, b] for the running example.

    + +

    Now that we parsed the left-hand side of an expression and one pair of the +RHS sequence, we have to decide which way the expression associates. In +particular, we could have "(a+b) binop unparsed" or "a + (b binop unparsed)". +To determine this, we look ahead at "binop" to determine its precedence and +compare it to BinOp's precedence (which is '+' in this case):

    + +
    +
    +    // If BinOp binds less tightly with RHS than the operator after RHS, let
    +    // the pending operator take RHS as its LHS.
    +    int NextPrec = GetTokPrecedence();
    +    if (TokPrec < NextPrec) {
    +
    +
    + +

    If the precedence of the binop to the right of "RHS" is lower than or equal to the +precedence of our current operator, then we know that the parentheses associate +as "(a+b) binop ...". In our example, the current operator is "+" and the next +operator is "+", so they have the same precedence. In this case we'll +create the AST node for "a+b", and then continue parsing:

    + +
    +
    +      ... if body omitted ...
    +    }
    +    
    +    // Merge LHS/RHS.
    +    LHS = new BinaryExprAST(BinOp, LHS, RHS);
    +  }  // loop around to the top of the while loop.
    +}
    +
    +
    + +

    In our example above, this will turn "a+b+" into "(a+b)" and execute the next +iteration of the loop, with "+" as the current token. The code above will eat, +remember, and parse "(c+d)" as the primary expression, which makes the +current pair equal to [+, (c+d)]. It will then evaluate the 'if' conditional above with +"*" as the binop to the right of the primary. In this case, the precedence of "*" is +higher than the precedence of "+" so the if condition will be entered.

    + +

    The critical question left here is "how can the if condition parse the right +hand side in full"? In particular, to build the AST correctly for our example, +it needs to get all of "(c+d)*e*f" as the RHS expression variable. The code to +do this is surprisingly simple (code from the above two blocks duplicated for +context):

    + +
    +
    +    // If BinOp binds less tightly with RHS than the operator after RHS, let
    +    // the pending operator take RHS as its LHS.
    +    int NextPrec = GetTokPrecedence();
    +    if (TokPrec < NextPrec) {
    +      RHS = ParseBinOpRHS(TokPrec+1, RHS);
    +      if (RHS == 0) return 0;
    +    }
    +    // Merge LHS/RHS.
    +    LHS = new BinaryExprAST(BinOp, LHS, RHS);
    +  }  // loop around to the top of the while loop.
    +}
    +
    +
    + +

    At this point, we know that the binary operator to the RHS of our primary +has higher precedence than the binop we are currently parsing. As such, we know +that any sequence of pairs whose operators are all higher precedence than "+" +should be parsed together and returned as "RHS". To do this, we recursively +invoke the ParseBinOpRHS function specifying "TokPrec+1" as the minimum +precedence required for it to continue. In our example above, this will cause +it to return the AST node for "(c+d)*e*f" as RHS, which is then set as the RHS +of the '+' expression.

    + +

    Finally, on the next iteration of the while loop, the "+g" piece is parsed +and added to the AST. With this little bit of code (14 non-trivial lines), we +correctly handle fully general binary expression parsing in a very elegant way. +This was a whirlwind tour of this code, and it is somewhat subtle. I recommend +running through it with a few tough examples to see how it works. +
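
One way to run through a few tough examples is a string-based version of the same algorithm that produces a fully parenthesized string instead of an AST. This is a sketch under simplifying assumptions and not the tutorial's code: MiniExprParser is a hypothetical name, variables are single letters, the input has no whitespace, and error handling is omitted. The structure of parseBinOpRHS follows ParseBinOpRHS above.

```cpp
#include <string>

// Hypothetical string-based parser; the "AST" is a parenthesized string.
struct MiniExprParser {
  std::string Input;
  size_t Pos;
  explicit MiniExprParser(const std::string &In) : Input(In), Pos(0) {}

  // Current character, or -1 at the end of the input.
  int curTok() const { return Pos < Input.size() ? Input[Pos] : -1; }

  static int precedence(int Tok) {
    switch (Tok) {
    case '<': return 10;
    case '+': case '-': return 20;
    case '*': return 40;
    default:  return -1;  // not a binary operator
    }
  }

  // primary ::= letter | '(' expression ')'
  std::string parsePrimary() {
    if (Pos >= Input.size())
      return "?";                          // placeholder for missing input
    if (curTok() == '(') {
      ++Pos;                               // eat '('
      std::string V = parseExpression();
      if (curTok() == ')') ++Pos;          // eat ')'; errors ignored here
      return V;
    }
    return std::string(1, Input[Pos++]);
  }

  // binoprhs ::= (binop primary)*
  std::string parseBinOpRHS(int ExprPrec, std::string LHS) {
    while (true) {
      int TokPrec = precedence(curTok());
      if (TokPrec < ExprPrec)
        return LHS;                        // operator binds too loosely
      char BinOp = (char)curTok();
      ++Pos;                               // eat binop
      std::string RHS = parsePrimary();
      // If the next operator binds tighter, let it take RHS as its LHS.
      int NextPrec = precedence(curTok());
      if (TokPrec < NextPrec)
        RHS = parseBinOpRHS(TokPrec + 1, RHS);
      LHS = "(" + LHS + BinOp + RHS + ")"; // merge LHS/RHS
    }
  }

  std::string parseExpression() { return parseBinOpRHS(0, parsePrimary()); }
};
```

Parsing "x+y*z" gives "(x+(y*z))", and the running example "a+b+(c+d)*e*f+g" gives "(((a+b)+(((c+d)*e)*f))+g)", which matches the grouping described in the walkthrough.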

    + +

    This wraps up handling of expressions. At this point, we can point the +parser at an arbitrary token stream and build an expression from it, stopping +at the first token that is not part of the expression. Next up we need to +handle function definitions, etc.

    + +
    + + + + + +
    + +

    +The next thing missing is handling of function prototypes. In Kaleidoscope, +these are used both for 'extern' function declarations as well as function body +definitions. The code to do this is straight-forward and not very interesting +(once you've survived expressions): +

    + +
    +
    +/// prototype
    +///   ::= id '(' id* ')'
    +static PrototypeAST *ParsePrototype() {
    +  if (CurTok != tok_identifier)
    +    return ErrorP("Expected function name in prototype");
    +
    +  std::string FnName = IdentifierStr;
    +  getNextToken();
    +  
    +  if (CurTok != '(')
    +    return ErrorP("Expected '(' in prototype");
    +  
    +  // Read the list of argument names.
    +  std::vector<std::string> ArgNames;
    +  while (getNextToken() == tok_identifier)
    +    ArgNames.push_back(IdentifierStr);
    +  if (CurTok != ')')
    +    return ErrorP("Expected ')' in prototype");
    +  
    +  // success.
    +  getNextToken();  // eat ')'.
    +  
    +  return new PrototypeAST(FnName, ArgNames);
    +}
    +
    +
    + +

    Given this, a function definition is very simple, just a prototype plus +an expression to implement the body:

    + +
    +
    +/// definition ::= 'def' prototype expression
    +static FunctionAST *ParseDefinition() {
    +  getNextToken();  // eat def.
    +  PrototypeAST *Proto = ParsePrototype();
    +  if (Proto == 0) return 0;
    +
    +  if (ExprAST *E = ParseExpression())
    +    return new FunctionAST(Proto, E);
    +  return 0;
    +}
    +
    +
    + +

    In addition, we support 'extern' to declare functions like 'sin' and 'cos' as +well as to support forward declaration of user functions. These 'extern's are just +prototypes with no body:

    + +
    +
    +/// external ::= 'extern' prototype
    +static PrototypeAST *ParseExtern() {
    +  getNextToken();  // eat extern.
    +  return ParsePrototype();
    +}
    +
    +
    + +

    Finally, we'll also let the user type in arbitrary top-level expressions and +evaluate them on the fly. We will handle this by defining anonymous nullary +(zero argument) functions for them:

    + +
    +
    +/// toplevelexpr ::= expression
    +static FunctionAST *ParseTopLevelExpr() {
    +  if (ExprAST *E = ParseExpression()) {
    +    // Make an anonymous proto.
    +    PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
    +    return new FunctionAST(Proto, E);
    +  }
    +  return 0;
    +}
    +
    +
    + +

    Now that we have all the pieces, let's build a little driver that will let us +actually execute this code we've built!

    + +
    + + + + + +
    + +

    The driver for this simply invokes all of the parsing pieces with a top-level +dispatch loop. There isn't much interesting here, so I'll just include the +top-level loop. See below for full code in the "Top-Level +Parsing" section.

    + +
    +
    +/// top ::= definition | external | expression | ';'
    +static void MainLoop() {
    +  while (1) {
    +    fprintf(stderr, "ready> ");
    +    switch (CurTok) {
    +    case tok_eof:    return;
    +    case ';':        getNextToken(); break;  // ignore top-level semicolons.
    +    case tok_def:    HandleDefinition(); break;
    +    case tok_extern: HandleExtern(); break;
    +    default:         HandleTopLevelExpression(); break;
    +    }
    +  }
    +}
    +
    +
    + +

    The most interesting part of this is that we ignore top-level semicolons. +Why is this, you ask? The basic reason is that if you type "4 + 5" at the +command line, the parser doesn't know whether that is the end of what you will type +or not. For example, on the next line you could type "def foo..." in which case +4+5 is the end of a top-level expression. Alternatively you could type "* 6", +which would continue the expression. Having top-level semicolons allows you to +type "4+5;", and the parser will know you are done.

    + +
    + + + + + +
    + +

    With just under 400 lines of commented code (240 lines of non-comment, +non-blank code), we fully defined our minimal language, including a lexer, +parser, and AST builder. With this done, the executable will validate +Kaleidoscope code and tell us if it is grammatically invalid. For +example, here is a sample interaction:

    + +
    +
    +$ ./a.out
    +ready> def foo(x y) x+foo(y, 4.0);
    +Parsed a function definition.
    +ready> def foo(x y) x+y y;
    +Parsed a function definition.
    +Parsed a top-level expr
    +ready> def foo(x y) x+y );
    +Parsed a function definition.
    +Error: unknown token when expecting an expression
    +ready> extern sin(a);
    +ready> Parsed an extern
    +ready> ^D
    +$ 
    +
    +
    + +

    There is a lot of room for extension here. You can define new AST nodes, +extend the language in many ways, etc. In the next +installment, we will describe how to generate LLVM Intermediate +Representation (IR) from the AST.

    + +
    + + + + + +
    + +

    +Here is the complete code listing for this and the previous chapter. +Note that it is fully self-contained: you don't need LLVM or any external +libraries at all for this. (Besides the C and C++ standard libraries, of +course.) To build this, just compile with:

    + +
    +
    +   # Compile
    +   g++ -g -O3 toy.cpp 
    +   # Run
    +   ./a.out 
    +
    +
    + +

    Here is the code:

    + +
    +
    +#include <cctype>
    +#include <cstdio>
    +#include <cstdlib>
    +#include <string>
    +#include <map>
    +#include <vector>
    +
    +//===----------------------------------------------------------------------===//
    +// Lexer
    +//===----------------------------------------------------------------------===//
    +
    +// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
    +// of these for known things.
    +enum Token {
    +  tok_eof = -1,
    +
    +  // commands
    +  tok_def = -2, tok_extern = -3,
    +
    +  // primary
    +  tok_identifier = -4, tok_number = -5
    +};
    +
    +static std::string IdentifierStr;  // Filled in if tok_identifier
    +static double NumVal;              // Filled in if tok_number
    +
    +/// gettok - Return the next token from standard input.
    +static int gettok() {
    +  static int LastChar = ' ';
    +
    +  // Skip any whitespace.
    +  while (isspace(LastChar))
    +    LastChar = getchar();
    +
    +  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    +    IdentifierStr = LastChar;
    +    while (isalnum((LastChar = getchar())))
    +      IdentifierStr += LastChar;
    +
    +    if (IdentifierStr == "def") return tok_def;
    +    if (IdentifierStr == "extern") return tok_extern;
    +    return tok_identifier;
    +  }
    +
    +  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    +    std::string NumStr;
    +    do {
    +      NumStr += LastChar;
    +      LastChar = getchar();
    +    } while (isdigit(LastChar) || LastChar == '.');
    +
    +    NumVal = strtod(NumStr.c_str(), 0);
    +    return tok_number;
    +  }
    +
    +  if (LastChar == '#') {
    +    // Comment until end of line.
    +    do LastChar = getchar();
    +    while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
    +    
    +    if (LastChar != EOF)
    +      return gettok();
    +  }
    +  
    +  // Check for end of file.  Don't eat the EOF.
    +  if (LastChar == EOF)
    +    return tok_eof;
    +
    +  // Otherwise, just return the character as its ascii value.
    +  int ThisChar = LastChar;
    +  LastChar = getchar();
    +  return ThisChar;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Abstract Syntax Tree (aka Parse Tree)
    +//===----------------------------------------------------------------------===//
    +
    +/// ExprAST - Base class for all expression nodes.
    +class ExprAST {
    +public:
    +  virtual ~ExprAST() {}
    +};
    +
    +/// NumberExprAST - Expression class for numeric literals like "1.0".
    +class NumberExprAST : public ExprAST {
    +  double Val;
    +public:
    +  NumberExprAST(double val) : Val(val) {}
    +};
    +
    +/// VariableExprAST - Expression class for referencing a variable, like "a".
    +class VariableExprAST : public ExprAST {
    +  std::string Name;
    +public:
    +  VariableExprAST(const std::string &name) : Name(name) {}
    +};
    +
    +/// BinaryExprAST - Expression class for a binary operator.
    +class BinaryExprAST : public ExprAST {
    +  char Op;
    +  ExprAST *LHS, *RHS;
    +public:
    +  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) 
    +    : Op(op), LHS(lhs), RHS(rhs) {}
    +};
    +
    +/// CallExprAST - Expression class for function calls.
    +class CallExprAST : public ExprAST {
    +  std::string Callee;
    +  std::vector<ExprAST*> Args;
    +public:
    +  CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
    +    : Callee(callee), Args(args) {}
    +};
    +
    +/// PrototypeAST - This class represents the "prototype" for a function,
    +/// which captures its name, and its argument names (thus implicitly the number
    +/// of arguments the function takes).
    +class PrototypeAST {
    +  std::string Name;
    +  std::vector<std::string> Args;
    +public:
    +  PrototypeAST(const std::string &name, const std::vector<std::string> &args)
    +    : Name(name), Args(args) {}
    +  
    +};
    +
    +/// FunctionAST - This class represents a function definition itself.
    +class FunctionAST {
    +  PrototypeAST *Proto;
    +  ExprAST *Body;
    +public:
    +  FunctionAST(PrototypeAST *proto, ExprAST *body)
    +    : Proto(proto), Body(body) {}
    +  
    +};
    +
    +//===----------------------------------------------------------------------===//
    +// Parser
    +//===----------------------------------------------------------------------===//
    +
    +/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
    +/// token the parser is looking at.  getNextToken reads another token from the
    +/// lexer and updates CurTok with its results.
    +static int CurTok;
    +static int getNextToken() {
    +  return CurTok = gettok();
    +}
    +
    +/// BinopPrecedence - This holds the precedence for each binary operator that is
    +/// defined.
    +static std::map<char, int> BinopPrecedence;
    +
    +/// GetTokPrecedence - Get the precedence of the pending binary operator token.
    +static int GetTokPrecedence() {
    +  if (!isascii(CurTok))
    +    return -1;
    +  
    +  // Make sure it's a declared binop.
    +  int TokPrec = BinopPrecedence[CurTok];
    +  if (TokPrec <= 0) return -1;
    +  return TokPrec;
    +}
    +
    +/// Error* - These are little helper functions for error handling.
    +ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
    +PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
    +FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
    +
    +static ExprAST *ParseExpression();
    +
    +/// identifierexpr
    +///   ::= identifier
    +///   ::= identifier '(' expression* ')'
    +static ExprAST *ParseIdentifierExpr() {
    +  std::string IdName = IdentifierStr;
    +  
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '(') // Simple variable ref.
    +    return new VariableExprAST(IdName);
    +  
    +  // Call.
    +  getNextToken();  // eat (
    +  std::vector<ExprAST*> Args;
    +  if (CurTok != ')') {
    +    while (1) {
    +      ExprAST *Arg = ParseExpression();
    +      if (!Arg) return 0;
    +      Args.push_back(Arg);
    +
    +      if (CurTok == ')') break;
    +
    +      if (CurTok != ',')
    +        return Error("Expected ')' or ',' in argument list");
    +      getNextToken();
    +    }
    +  }
    +
    +  // Eat the ')'.
    +  getNextToken();
    +  
    +  return new CallExprAST(IdName, Args);
    +}
    +
    +/// numberexpr ::= number
    +static ExprAST *ParseNumberExpr() {
    +  ExprAST *Result = new NumberExprAST(NumVal);
    +  getNextToken(); // consume the number
    +  return Result;
    +}
    +
    +/// parenexpr ::= '(' expression ')'
    +static ExprAST *ParseParenExpr() {
    +  getNextToken();  // eat (.
    +  ExprAST *V = ParseExpression();
    +  if (!V) return 0;
    +  
    +  if (CurTok != ')')
    +    return Error("expected ')'");
    +  getNextToken();  // eat ).
    +  return V;
    +}
    +
    +/// primary
    +///   ::= identifierexpr
    +///   ::= numberexpr
    +///   ::= parenexpr
    +static ExprAST *ParsePrimary() {
    +  switch (CurTok) {
    +  default: return Error("unknown token when expecting an expression");
    +  case tok_identifier: return ParseIdentifierExpr();
    +  case tok_number:     return ParseNumberExpr();
    +  case '(':            return ParseParenExpr();
    +  }
    +}
    +
    +/// binoprhs
    +///   ::= ('+' primary)*
    +static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
    +  // If this is a binop, find its precedence.
    +  while (1) {
    +    int TokPrec = GetTokPrecedence();
    +    
    +    // If this is a binop that binds at least as tightly as the current binop,
    +    // consume it, otherwise we are done.
    +    if (TokPrec < ExprPrec)
    +      return LHS;
    +    
    +    // Okay, we know this is a binop.
    +    int BinOp = CurTok;
    +    getNextToken();  // eat binop
    +    
    +    // Parse the primary expression after the binary operator.
    +    ExprAST *RHS = ParsePrimary();
    +    if (!RHS) return 0;
    +    
    +    // If BinOp binds less tightly with RHS than the operator after RHS, let
    +    // the pending operator take RHS as its LHS.
    +    int NextPrec = GetTokPrecedence();
    +    if (TokPrec < NextPrec) {
    +      RHS = ParseBinOpRHS(TokPrec+1, RHS);
    +      if (RHS == 0) return 0;
    +    }
    +    
    +    // Merge LHS/RHS.
    +    LHS = new BinaryExprAST(BinOp, LHS, RHS);
    +  }
    +}
    +
    +/// expression
    +///   ::= primary binoprhs
    +///
    +static ExprAST *ParseExpression() {
    +  ExprAST *LHS = ParsePrimary();
    +  if (!LHS) return 0;
    +  
    +  return ParseBinOpRHS(0, LHS);
    +}
    +
    +/// prototype
    +///   ::= id '(' id* ')'
    +static PrototypeAST *ParsePrototype() {
    +  if (CurTok != tok_identifier)
    +    return ErrorP("Expected function name in prototype");
    +
    +  std::string FnName = IdentifierStr;
    +  getNextToken();
    +  
    +  if (CurTok != '(')
    +    return ErrorP("Expected '(' in prototype");
    +  
    +  std::vector<std::string> ArgNames;
    +  while (getNextToken() == tok_identifier)
    +    ArgNames.push_back(IdentifierStr);
    +  if (CurTok != ')')
    +    return ErrorP("Expected ')' in prototype");
    +  
    +  // success.
    +  getNextToken();  // eat ')'.
    +  
    +  return new PrototypeAST(FnName, ArgNames);
    +}
    +
    +/// definition ::= 'def' prototype expression
    +static FunctionAST *ParseDefinition() {
    +  getNextToken();  // eat def.
    +  PrototypeAST *Proto = ParsePrototype();
    +  if (Proto == 0) return 0;
    +
    +  if (ExprAST *E = ParseExpression())
    +    return new FunctionAST(Proto, E);
    +  return 0;
    +}
    +
    +/// toplevelexpr ::= expression
    +static FunctionAST *ParseTopLevelExpr() {
    +  if (ExprAST *E = ParseExpression()) {
    +    // Make an anonymous proto.
    +    PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
    +    return new FunctionAST(Proto, E);
    +  }
    +  return 0;
    +}
    +
    +/// external ::= 'extern' prototype
    +static PrototypeAST *ParseExtern() {
    +  getNextToken();  // eat extern.
    +  return ParsePrototype();
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Top-Level parsing
    +//===----------------------------------------------------------------------===//
    +
    +static void HandleDefinition() {
    +  if (ParseDefinition()) {
    +    fprintf(stderr, "Parsed a function definition.\n");
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleExtern() {
    +  if (ParseExtern()) {
    +    fprintf(stderr, "Parsed an extern\n");
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleTopLevelExpression() {
    +  // Evaluate a top-level expression into an anonymous function.
    +  if (ParseTopLevelExpr()) {
    +    fprintf(stderr, "Parsed a top-level expr\n");
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +/// top ::= definition | external | expression | ';'
    +static void MainLoop() {
    +  while (1) {
    +    fprintf(stderr, "ready> ");
    +    switch (CurTok) {
    +    case tok_eof:    return;
    +    case ';':        getNextToken(); break;  // ignore top-level semicolons.
    +    case tok_def:    HandleDefinition(); break;
    +    case tok_extern: HandleExtern(); break;
    +    default:         HandleTopLevelExpression(); break;
    +    }
    +  }
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Main driver code.
    +//===----------------------------------------------------------------------===//
    +
    +int main() {
    +  // Install standard binary operators.
    +  // 1 is lowest precedence.
    +  BinopPrecedence['<'] = 10;
    +  BinopPrecedence['+'] = 20;
    +  BinopPrecedence['-'] = 20;
    +  BinopPrecedence['*'] = 40;  // highest.
    +
    +  // Prime the first token.
    +  fprintf(stderr, "ready> ");
    +  getNextToken();
    +
    +  // Run the main "interpreter loop" now.
    +  MainLoop();
    +
    +  return 0;
    +}
    +
    +
    +Next: Implementing Code Generation to LLVM IR +
    + + +
    +
    + Chris Lattner
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/tutorial/LangImpl3.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/LangImpl3.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/tutorial/LangImpl3.html (added) +++ www-releases/trunk/2.8/docs/tutorial/LangImpl3.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1269 @@ + + + + + Kaleidoscope: Implementing code generation to LLVM IR + + + + + + + +
    Kaleidoscope: Code generation to LLVM IR
    + + + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    Welcome to Chapter 3 of the "Implementing a language +with LLVM" tutorial. This chapter shows you how to transform the Abstract Syntax Tree, built in Chapter 2, into LLVM IR. +This will teach you a little bit about how LLVM does things, as well as +demonstrate how easy it is to use. It's much more work to build a lexer and +parser than it is to generate LLVM IR code. :) +

    + +

    Please note: the code in this chapter and later chapters requires LLVM 2.2 or +later; LLVM 2.1 and earlier will not work with it. Also note that you need +to use a version of this tutorial that matches your LLVM release: if you are +using an official LLVM release, use the version of the documentation included +with your release or on the llvm.org +releases page.

    + +
    + + + + + +
    + +

    +In order to generate LLVM IR, we want some simple setup to get started. First +we define virtual code generation (codegen) methods in each AST class:

    + +
    +
    +/// ExprAST - Base class for all expression nodes.
    +class ExprAST {
    +public:
    +  virtual ~ExprAST() {}
    +  virtual Value *Codegen() = 0;
    +};
    +
    +/// NumberExprAST - Expression class for numeric literals like "1.0".
    +class NumberExprAST : public ExprAST {
    +  double Val;
    +public:
    +  NumberExprAST(double val) : Val(val) {}
    +  virtual Value *Codegen();
    +};
    +...
    +
    +
    + +

    The Codegen() method says to emit IR for that AST node along with all the things it +depends on, and each implementation returns an LLVM Value object. +"Value" is the class used to represent a "Static Single +Assignment (SSA) register" or "SSA value" in LLVM. The most distinctive aspect +of SSA values is that their value is computed as the related instruction +executes, and it does not get a new value until (and if) the instruction +re-executes. In other words, there is no way to "change" an SSA value. For +more information, please read up on Static Single +Assignment: the concepts are really quite natural once you grok them.

    + +

    Note that instead of adding virtual methods to the ExprAST class hierarchy, +it could also make sense to use a visitor pattern or some +other way to model this. Again, this tutorial won't dwell on good software +engineering practices: for our purposes, adding a virtual method is +simplest.

    + +

    The +second thing we want is an "Error" method like we used for the parser, which will +be used to report errors found during code generation (for example, use of an +undeclared parameter):

    + +
    +
    +Value *ErrorV(const char *Str) { Error(Str); return 0; }
    +
    +static Module *TheModule;
    +static IRBuilder<> Builder(getGlobalContext());
    +static std::map<std::string, Value*> NamedValues;
    +
    +
    + +

    The static variables will be used during code generation. TheModule +is the LLVM construct that contains all of the functions and global variables in +a chunk of code. In many ways, it is the top-level structure that the LLVM IR +uses to contain code.

    + +

    The Builder object is a helper object that makes it easy to generate +LLVM instructions. Instances of the IRBuilder +class template keep track of the current place to insert instructions and have +methods to create new instructions.

    + +

    The NamedValues map keeps track of which values are defined in the +current scope and what their LLVM representation is. (In other words, it is a +symbol table for the code). In this form of Kaleidoscope, the only things that +can be referenced are function parameters. As such, function parameters will +be in this map when generating code for their function body.
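
One detail of a std::map-based symbol table is worth keeping in mind: operator[] default-inserts a null entry when the name is missing, so a failed lookup quietly grows the table. The following is a minimal standalone sketch of a lookup that leaves the table untouched. FakeValue, SymbolTable and lookupValue are hypothetical names used only for illustration (FakeValue stands in for llvm::Value); none of them is part of the tutorial's code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for llvm::Value; only its address matters here.
struct FakeValue { double Payload; };

// Symbol table in the style of NamedValues: name -> value pointer.
typedef std::map<std::string, FakeValue*> SymbolTable;

// Look up Name without modifying the table. Returns a null pointer when the
// name is unknown, mirroring the "return null on error" convention.
FakeValue *lookupValue(const SymbolTable &Table, const std::string &Name) {
  SymbolTable::const_iterator It = Table.find(Name);
  return It == Table.end() ? nullptr : It->second;
}
```

With a table that holds only "a", looking up "b" through this helper returns null and the table still has one entry, whereas Table["b"] would leave a second, null entry behind.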

    + +

    +With these basics in place, we can start talking about how to generate code for +each expression. Note that this assumes that the Builder has been set +up to generate code into something. For now, we'll assume that this +has already been done, and we'll just use it to emit code. +

    + +
    + + + + + +
    + +

    Generating LLVM code for expression nodes is very straightforward: fewer +than 45 lines of commented code for all four of our expression nodes. First +we'll do numeric literals:

    + +
    +
    +Value *NumberExprAST::Codegen() {
    +  return ConstantFP::get(getGlobalContext(), APFloat(Val));
    +}
    +
    +
    + +

    In the LLVM IR, numeric constants are represented with the +ConstantFP class, which holds the numeric value in an APFloat +internally (APFloat has the capability of holding floating point +constants of Arbitrary Precision). This code basically just +creates and returns a ConstantFP. Note that in the LLVM IR, +constants are all uniqued together and shared. For this reason, the API +uses the "foo::get(...)" idiom instead of "new foo(..)" or "foo::Create(..)".
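
To make the "get" idiom concrete, here is a small standalone sketch of a uniqued object: the constructor is private, and a static get() hands out one shared instance per value. UniquedDouble is a hypothetical class invented for this illustration; it is not how ConstantFP is implemented, and it ignores details such as NaN keys.

```cpp
#include <cassert>
#include <map>

// Hypothetical miniature of a uniqued constant; not the real llvm::ConstantFP.
class UniquedDouble {
  double Val;
  explicit UniquedDouble(double V) : Val(V) {}  // private: callers must use get()
public:
  double getValue() const { return Val; }

  // Return the single shared object for V, creating it on first use.
  // Assumption of this sketch: V is never NaN (NaN cannot be a map key).
  static UniquedDouble *get(double V) {
    static std::map<double, UniquedDouble*> Pool;
    std::map<double, UniquedDouble*>::iterator It = Pool.find(V);
    if (It != Pool.end())
      return It->second;
    UniquedDouble *C = new UniquedDouble(V);  // lives for the whole program
    Pool[V] = C;
    return C;
  }
};
```

Because every request for the same value yields the same pointer, two such constants can be compared for equality by comparing addresses, which is the practical payoff of uniquing.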

    + +
    +
    +Value *VariableExprAST::Codegen() {
    +  // Look this variable up in the function.
    +  Value *V = NamedValues[Name];
    +  return V ? V : ErrorV("Unknown variable name");
    +}
    +
    +
    + +

    References to variables are also quite simple using LLVM. In the simple version +of Kaleidoscope, we assume that the variable has already been emitted somewhere +and its value is available. In practice, the only values that can be in the +NamedValues map are function arguments. This +code simply checks to see that the specified name is in the map (if not, an +unknown variable is being referenced) and returns the value for it. In future +chapters, we'll add support for loop induction +variables in the symbol table, and for local variables.

    + +
    +
    +Value *BinaryExprAST::Codegen() {
    +  Value *L = LHS->Codegen();
    +  Value *R = RHS->Codegen();
    +  if (L == 0 || R == 0) return 0;
    +  
    +  switch (Op) {
    +  case '+': return Builder.CreateFAdd(L, R, "addtmp");
    +  case '-': return Builder.CreateFSub(L, R, "subtmp");
    +  case '*': return Builder.CreateFMul(L, R, "multmp");
    +  case '<':
    +    L = Builder.CreateFCmpULT(L, R, "cmptmp");
    +    // Convert bool 0/1 to double 0.0 or 1.0
    +    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
    +                                "booltmp");
    +  default: return ErrorV("invalid binary operator");
    +  }
    +}
    +
    +
    + +

    Binary operators start to get more interesting. The basic idea here is that +we recursively emit code for the left-hand side of the expression, then the +right-hand side, then we compute the result of the binary expression. In this +code, we do a simple switch on the opcode to create the right LLVM instruction. +

    + +

    In the example above, the LLVM builder class is starting to show its value. +IRBuilder knows where to insert the newly created instruction; all you have to +do is specify what instruction to create (e.g. with CreateFAdd), which +operands to use (L and R here) and optionally provide a name +for the generated instruction.

    + +

    One nice thing about LLVM is that the name is just a hint. For instance, if +the code above emits multiple "addtmp" variables, LLVM will automatically +provide each one with an increasing, unique numeric suffix. Local value names +for instructions are purely optional, but they make it much easier to read the +IR dumps.
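
The renaming behaviour can be modelled in a few lines. The sketch below is a simplified model, not LLVM's implementation: NameTable and makeUnique are hypothetical names, and the single shared counter is an assumption chosen because it reproduces the numbering visible in the IR dumps later in this chapter (multmp, multmp1, multmp2, addtmp, multmp3, addtmp4).

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical per-function name table; a simplified model of name uniquing.
class NameTable {
  std::set<std::string> Used;   // names already taken
  unsigned LastUnique;          // counter shared by all hints (an assumption)
public:
  NameTable() : LastUnique(0) {}

  // Return Hint if it is free; otherwise append the next counter value,
  // repeating until the result is unused. The result is recorded as taken.
  std::string makeUnique(const std::string &Hint) {
    std::string Name = Hint;
    while (Used.count(Name)) {
      std::ostringstream OS;
      OS << Hint << ++LastUnique;
      Name = OS.str();
    }
    Used.insert(Name);
    return Name;
  }
};
```

Requesting "multmp" three times, then "addtmp", "multmp" and "addtmp" again yields multmp, multmp1, multmp2, addtmp, multmp3 and addtmp4 in this model.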

    + +

    LLVM instructions are constrained by +strict rules: for example, the left and right operands of +an add instruction must have the same +type, and the result type of the add must match the operand types. Because +all values in Kaleidoscope are doubles, this makes for very simple code for add, +sub and mul.

    + +

    On the other hand, LLVM specifies that the fcmp instruction always returns an 'i1' value +(a one bit integer). The problem with this is that Kaleidoscope wants the value to be a 0.0 or 1.0 value. In order to get these semantics, we combine the fcmp instruction with +a uitofp instruction. This instruction +converts its input integer into a floating point value by treating the input +as an unsigned value. In contrast, if we used the sitofp instruction, the Kaleidoscope '<' +operator would return 0.0 and -1.0, depending on the input value.
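
The difference between the two conversions comes down to how a single bit is interpreted. The following standalone sketch models that in plain C++; the function names are hypothetical, and the comparison uses the ordered C++ '<' operator, so NaN inputs (where the unordered fcmp used above would differ) are outside its scope.

```cpp
#include <cassert>

// Interpret the low bit of Bit as an unsigned 1-bit integer: 0 or 1.
double oneBitUnsignedToDouble(unsigned Bit) {
  return (Bit & 1u) ? 1.0 : 0.0;
}

// Interpret the low bit of Bit as a signed (two's complement) 1-bit integer.
// The only bit is the sign bit, so the representable values are 0 and -1.
double oneBitSignedToDouble(unsigned Bit) {
  return (Bit & 1u) ? -1.0 : 0.0;
}

// Model of the Kaleidoscope '<' operator: compare, then widen the 1-bit
// result the way an unsigned conversion would. NaN handling is not modelled.
double lessThan(double L, double R) {
  return oneBitUnsignedToDouble(L < R);
}
```

Here lessThan(1.0, 2.0) gives 1.0, while feeding the same true bit through oneBitSignedToDouble gives -1.0, which is the result the text warns about.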

    + +
    +
    +Value *CallExprAST::Codegen() {
    +  // Look up the name in the global module table.
    +  Function *CalleeF = TheModule->getFunction(Callee);
    +  if (CalleeF == 0)
    +    return ErrorV("Unknown function referenced");
    +  
    +  // If argument mismatch error.
    +  if (CalleeF->arg_size() != Args.size())
    +    return ErrorV("Incorrect # arguments passed");
    +
    +  std::vector<Value*> ArgsV;
    +  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
    +    ArgsV.push_back(Args[i]->Codegen());
    +    if (ArgsV.back() == 0) return 0;
    +  }
    +  
    +  return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
    +}
    +
    +
    + +

    Code generation for function calls is quite straightforward with LLVM. The +code above initially does a function name lookup in the LLVM Module's symbol +table. Recall that the LLVM Module is the container that holds all of the +functions we are JIT'ing. By giving each function the same name as what the +user specifies, we can use the LLVM symbol table to resolve function names for +us.

    + +

    Once we have the function to call, we recursively codegen each argument that +is to be passed in, and create an LLVM call +instruction. Note that LLVM uses the native C calling conventions by +default, allowing these calls to also call into standard library functions like +"sin" and "cos", with no additional effort.

    + +

    This wraps up our handling of the four basic expressions that we have so far +in Kaleidoscope. Feel free to go in and add some more. For example, by +browsing the LLVM language reference you'll find +several other interesting instructions that are really easy to plug into our +basic framework.

    + +
    + + + + + +
    + +

    Code generation for prototypes and functions must handle a number of +details, which make their code less beautiful than expression code +generation, but allow us to illustrate some important points. First, let's +talk about code generation for prototypes: they are used both for function +bodies and external function declarations. The code starts with:

    + +
    +
    +Function *PrototypeAST::Codegen() {
    +  // Make the function type:  double(double,double) etc.
    +  std::vector<const Type*> Doubles(Args.size(),
    +                                   Type::getDoubleTy(getGlobalContext()));
    +  FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
    +                                       Doubles, false);
    +  
    +  Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
    +
    +
    + +

    This code packs a lot of power into a few lines. Note first that this +function returns a "Function*" instead of a "Value*". Because a "prototype" +really talks about the external interface for a function (not the value computed +by an expression), it makes sense for it to return the LLVM Function it +corresponds to when codegen'd.

    + +

    The call to FunctionType::get creates +the FunctionType that should be used for a given Prototype. Since all +function arguments in Kaleidoscope are of type double, the first line creates +a vector of "N" LLVM double types. It then uses the FunctionType::get +method to create a function type that takes "N" doubles as arguments, returns +one double as a result, and that is not vararg (the false parameter indicates +this). Note that Types in LLVM are uniqued just like Constants are, so you +don't "new" a type, you "get" it.

    + +

    The final line above actually creates the function that the prototype will +correspond to. This indicates the type, linkage and name to use, as well as which +module to insert into. "external linkage" +means that the function may be defined outside the current module and/or that it +is callable by functions outside the module. The Name passed in is the name the +user specified: since "TheModule" is specified, this name is registered +in the symbol table of "TheModule", which is used by the function call code +above.

    + +
    +
    +  // If F conflicted, there was already something named 'Name'.  If it has a
    +  // body, don't allow redefinition or reextern.
    +  if (F->getName() != Name) {
    +    // Delete the one we just made and get the existing one.
    +    F->eraseFromParent();
    +    F = TheModule->getFunction(Name);
    +
    +
    + +

    The Module symbol table works just like the Function symbol table when it +comes to name conflicts: if a new function is created with a name that was previously +added to the symbol table, it will get implicitly renamed when added to the +Module. The code above exploits this fact to determine if there was a previous +definition of this function.

    + +

    In Kaleidoscope, I choose to allow redefinitions of functions in two cases: +first, we want to allow 'extern'ing a function more than once, as long as the +prototypes for the externs match (since all arguments have the same type, we +just have to check that the number of arguments matches). Second, we want to +allow 'extern'ing a function and then defining a body for it. This is useful +when defining mutually recursive functions.

    + +

    In order to implement this, the code above first checks to see if there is +a collision on the name of the function. If so, it deletes the function we just +created (by calling eraseFromParent) and then calls +getFunction to get the existing function with the specified name. Note +that many APIs in LLVM have "erase" forms and "remove" forms. The "remove" form +unlinks the object from its parent (e.g. a Function from a Module) and returns +it. The "erase" form unlinks the object and then deletes it.
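
The "erase" versus "remove" distinction is an ownership question, which the following standalone sketch tries to make visible. Child and Parent are hypothetical stand-ins for Function and Module, written only for this illustration; the live-object counter exists purely so the effect of each call can be observed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical child type that counts how many instances are alive.
struct Child {
  static int LiveCount;
  Child() { ++LiveCount; }
  ~Child() { --LiveCount; }
};
int Child::LiveCount = 0;

// Hypothetical parent that owns its children.
class Parent {
  std::vector<Child*> Children;
public:
  ~Parent() {
    for (std::size_t i = 0, e = Children.size(); i != e; ++i)
      delete Children[i];
  }
  Child *add(Child *C) { Children.push_back(C); return C; }
  std::size_t size() const { return Children.size(); }

  // "remove" form: unlink C and hand ownership back to the caller.
  Child *remove(Child *C) {
    std::vector<Child*>::iterator It =
        std::find(Children.begin(), Children.end(), C);
    assert(It != Children.end() && "child not owned by this parent");
    Children.erase(It);
    return C;
  }

  // "erase" form: unlink C and then delete it.
  void erase(Child *C) { delete remove(C); }
};
```

After remove() the object is still alive and the caller must delete or re-insert it; after erase() the pointer is dangling and must not be used again.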

    + +
    +
    +    // If F already has a body, reject this.
    +    if (!F->empty()) {
    +      ErrorF("redefinition of function");
    +      return 0;
    +    }
    +    
    +    // If F took a different number of args, reject.
    +    if (F->arg_size() != Args.size()) {
    +      ErrorF("redefinition of function with different # args");
    +      return 0;
    +    }
    +  }
    +
    +
    + +

    In order to verify the logic above, we first check to see if the pre-existing +function is "empty". In this case, empty means that it has no basic blocks in +it, which means it has no body. If it has no body, it is a forward +declaration. Since we don't allow anything after a full definition of the +function, the code rejects the case where the pre-existing function is not empty. If the previous reference to a function +was an 'extern', we simply verify that the number of arguments for that +definition and this one match up. If not, we emit an error.

    + +
    +
    +  // Set names for all arguments.
    +  unsigned Idx = 0;
    +  for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
    +       ++AI, ++Idx) {
    +    AI->setName(Args[Idx]);
    +    
    +    // Add arguments to variable symbol table.
    +    NamedValues[Args[Idx]] = AI;
    +  }
    +  return F;
    +}
    +
    +
    + +

    The last bit of code for prototypes loops over all of the arguments in the +function, setting the name of the LLVM Argument objects to match, and registering +the arguments in the NamedValues map for future use by the +VariableExprAST AST node. Once this is set up, it returns the Function +object to the caller. Note that we don't check for conflicting +argument names here (e.g. "extern foo(a b a)"). Doing so would be very +straightforward with the mechanics we have already used above.

    + +
    +
    +Function *FunctionAST::Codegen() {
    +  NamedValues.clear();
    +  
    +  Function *TheFunction = Proto->Codegen();
    +  if (TheFunction == 0)
    +    return 0;
    +
    +
    + +

    Code generation for function definitions starts out simply enough: we first +clear out the NamedValues map to make sure that there isn't anything in it from the +last function we compiled, then codegen the prototype (Proto) and verify that it is ok. +The order matters, because code generation of the prototype registers the arguments in NamedValues. It also ensures that there +is an LLVM Function object that is ready to go for us.

    + +
    +
    +  // Create a new basic block to start insertion into.
    +  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
    +  Builder.SetInsertPoint(BB);
    +  
    +  if (Value *RetVal = Body->Codegen()) {
    +
    +
    + +

    Now we get to the point where the Builder is set up. The first +line creates a new basic +block (named "entry"), which is inserted into TheFunction. The +second line then tells the builder that new instructions should be inserted into +the end of the new basic block. Basic blocks in LLVM are an important part +of functions that define the Control Flow Graph. +Since we don't have any control flow, our functions will only contain one +block at this point. We'll fix this in Chapter 5 :).
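
As a rough mental model of the insertion point, consider the standalone sketch below. MiniBlock and MiniBuilder are hypothetical classes invented for this illustration, with instructions represented as plain strings; the real BasicBlock and IRBuilder are considerably richer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a basic block: a named list of instructions.
struct MiniBlock {
  std::string Name;
  std::vector<std::string> Insts;   // instructions, in execution order
  explicit MiniBlock(const std::string &N) : Name(N) {}
};

// Hypothetical stand-in for a builder: it remembers where to insert.
class MiniBuilder {
  MiniBlock *InsertBlock;           // block that receives new instructions
public:
  MiniBuilder() : InsertBlock(nullptr) {}

  // From now on, new instructions are appended to the end of BB.
  void setInsertPoint(MiniBlock *BB) { InsertBlock = BB; }

  // Append one instruction (as text) at the current insertion point.
  void create(const std::string &Inst) {
    assert(InsertBlock && "no insertion point set");
    InsertBlock->Insts.push_back(Inst);
  }
};
```

The point of the design is that code emitting an instruction never names the block it goes into; whoever set the insertion point decided that earlier, which is why the expression codegen methods can stay so short.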

    + +
    +
    +  if (Value *RetVal = Body->Codegen()) {
    +    // Finish off the function.
    +    Builder.CreateRet(RetVal);
    +
    +    // Validate the generated code, checking for consistency.
    +    verifyFunction(*TheFunction);
    +
    +    return TheFunction;
    +  }
    +
    +
    + +

    Once the insertion point is set up, we call the Codegen() method for +the root expression of the function. If no error happens, this emits code to +compute the expression into the entry block and returns the value that was +computed. Assuming no error, we then create an LLVM ret instruction, which completes the function. +Once the function is built, we call verifyFunction, which +is provided by LLVM. This function does a variety of consistency checks on the +generated code, to determine if our compiler is doing everything right. Using +this is important: it can catch a lot of bugs. Once the function is finished +and validated, we return it.

    + +
    +
    +  // Error reading body, remove function.
    +  TheFunction->eraseFromParent();
    +  return 0;
    +}
    +
    +
    + +

    The only piece left here is handling of the error case. For simplicity, we +handle this by merely deleting the function we produced with the +eraseFromParent method. This allows the user to redefine a function +that they incorrectly typed in before: if we didn't delete it, it would live in +the symbol table, with a body, preventing future redefinition.

    + +

    This code does have a bug, though. Since PrototypeAST::Codegen +can return a previously defined forward declaration, our code can actually delete +a forward declaration. There are a number of ways to fix this bug; see what you +can come up with! Here is a testcase:

    + +
    +
    +extern foo(a b);     # ok, defines foo.
    +def foo(a b) c;      # error, 'c' is invalid.
    +def bar() foo(1, 2); # error, unknown function "foo"
    +
    +
    + +
    + + + + + +
    + +

    +For now, code generation to LLVM doesn't really get us much, except that we can +look at the pretty IR calls. The sample code inserts calls to Codegen into the +"HandleDefinition", "HandleExtern" etc functions, and then +dumps out the LLVM IR. This gives a nice way to look at the LLVM IR for simple +functions. For example: +

    + +
    +
    +ready> 4+5;
    +Read top-level expression:
    +define double @""() {
    +entry:
    +        ret double 9.000000e+00
    +}
    +
    +
    + +

    Note how the parser turns the top-level expression into an anonymous function +for us. This will be handy when we add JIT +support in the next chapter. Also note that the code is very literally +transcribed: no optimizations are being performed except simple constant +folding done by IRBuilder. We will +add optimizations explicitly in +the next chapter.

    + +
    +
    +ready> def foo(a b) a*a + 2*a*b + b*b;
    +Read function definition:
    +define double @foo(double %a, double %b) {
    +entry:
    +        %multmp = fmul double %a, %a
    +        %multmp1 = fmul double 2.000000e+00, %a
    +        %multmp2 = fmul double %multmp1, %b
    +        %addtmp = fadd double %multmp, %multmp2
    +        %multmp3 = fmul double %b, %b
    +        %addtmp4 = fadd double %addtmp, %multmp3
    +        ret double %addtmp4
    +}
    +
    +
    + +

    This shows some simple arithmetic. Notice the striking similarity to the +LLVM builder calls that we use to create the instructions.

    + +
    +
    +ready> def bar(a) foo(a, 4.0) + bar(31337);
    +Read function definition:
    +define double @bar(double %a) {
    +entry:
    +        %calltmp = call double @foo(double %a, double 4.000000e+00)
    +        %calltmp1 = call double @bar(double 3.133700e+04)
    +        %addtmp = fadd double %calltmp, %calltmp1
    +        ret double %addtmp
    +}
    +
    +
    + +

    This shows some function calls. Note that this function will take a long +time to execute if you call it. In the future we'll add conditional control +flow to actually make recursion useful :).

    + +
    +
    +ready> extern cos(x);
    +Read extern: 
    +declare double @cos(double)
    +
    +ready> cos(1.234);
    +Read top-level expression:
    +define double @""() {
    +entry:
    +        %calltmp = call double @cos(double 1.234000e+00)
    +        ret double %calltmp
    +}
    +
    +
    + +

    This shows an extern for the libm "cos" function, and a call to it.

    + + +
    +
    +ready> ^D
    +; ModuleID = 'my cool jit'
    +
    +define double @""() {
    +entry:
    +        ret double 9.000000e+00
    +}
    +
    +define double @foo(double %a, double %b) {
    +entry:
    +        %multmp = fmul double %a, %a
    +        %multmp1 = fmul double 2.000000e+00, %a
    +        %multmp2 = fmul double %multmp1, %b
    +        %addtmp = fadd double %multmp, %multmp2
    +        %multmp3 = fmul double %b, %b
    +        %addtmp4 = fadd double %addtmp, %multmp3
    +        ret double %addtmp4
    +}
    +
    +define double @bar(double %a) {
    +entry:
    +        %calltmp = call double @foo(double %a, double 4.000000e+00)
    +        %calltmp1 = call double @bar(double 3.133700e+04)
    +        %addtmp = fadd double %calltmp, %calltmp1
    +        ret double %addtmp
    +}
    +
    +declare double @cos(double)
    +
    +define double @""() {
    +entry:
    +        %calltmp = call double @cos(double 1.234000e+00)
    +        ret double %calltmp
    +}
    +
    +
    + +

    When you quit the current demo, it dumps out the IR for the entire module +generated. Here you can see the big picture with all the functions referencing +each other.

    + +

    This wraps up the third chapter of the Kaleidoscope tutorial. Up next, we'll +describe how to add JIT codegen and optimizer +support to this so we can actually start running code!

    + +
    + + + + + + +
    + +

    +Here is the complete code listing for our running example, enhanced with the +LLVM code generator. Because this uses the LLVM libraries, we need to link +them in. To do this, we use the llvm-config tool to inform +our makefile/command line about which options to use:

    + +
    +
    +   # Compile
    +   g++ -g -O3 toy.cpp `llvm-config --cppflags --ldflags --libs core` -o toy
    +   # Run
    +   ./toy
    +
    +
    + +

    Here is the code:

    + +
    +
    +// To build this:
    +// See the llvm-config command line shown above.
    +
    +#include "llvm/DerivedTypes.h"
    +#include "llvm/LLVMContext.h"
    +#include "llvm/Module.h"
    +#include "llvm/Analysis/Verifier.h"
    +#include "llvm/Support/IRBuilder.h"
    +#include <cctype>
    +#include <cstdio>
    +#include <cstdlib>
    +#include <string>
    +#include <map>
    +#include <vector>
    +using namespace llvm;
    +
    +//===----------------------------------------------------------------------===//
    +// Lexer
    +//===----------------------------------------------------------------------===//
    +
    +// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
    +// of these for known things.
    +enum Token {
    +  tok_eof = -1,
    +
    +  // commands
    +  tok_def = -2, tok_extern = -3,
    +
    +  // primary
    +  tok_identifier = -4, tok_number = -5
    +};
    +
    +static std::string IdentifierStr;  // Filled in if tok_identifier
    +static double NumVal;              // Filled in if tok_number
    +
    +/// gettok - Return the next token from standard input.
    +static int gettok() {
    +  static int LastChar = ' ';
    +
    +  // Skip any whitespace.
    +  while (isspace(LastChar))
    +    LastChar = getchar();
    +
    +  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    +    IdentifierStr = LastChar;
    +    while (isalnum((LastChar = getchar())))
    +      IdentifierStr += LastChar;
    +
    +    if (IdentifierStr == "def") return tok_def;
    +    if (IdentifierStr == "extern") return tok_extern;
    +    return tok_identifier;
    +  }
    +
    +  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    +    std::string NumStr;
    +    do {
    +      NumStr += LastChar;
    +      LastChar = getchar();
    +    } while (isdigit(LastChar) || LastChar == '.');
    +
    +    NumVal = strtod(NumStr.c_str(), 0);
    +    return tok_number;
    +  }
    +
    +  if (LastChar == '#') {
    +    // Comment until end of line.
    +    do LastChar = getchar();
    +    while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
    +    
    +    if (LastChar != EOF)
    +      return gettok();
    +  }
    +  
    +  // Check for end of file.  Don't eat the EOF.
    +  if (LastChar == EOF)
    +    return tok_eof;
    +
    +  // Otherwise, just return the character as its ascii value.
    +  int ThisChar = LastChar;
    +  LastChar = getchar();
    +  return ThisChar;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Abstract Syntax Tree (aka Parse Tree)
    +//===----------------------------------------------------------------------===//
    +
    +/// ExprAST - Base class for all expression nodes.
    +class ExprAST {
    +public:
    +  virtual ~ExprAST() {}
    +  virtual Value *Codegen() = 0;
    +};
    +
    +/// NumberExprAST - Expression class for numeric literals like "1.0".
    +class NumberExprAST : public ExprAST {
    +  double Val;
    +public:
    +  NumberExprAST(double val) : Val(val) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// VariableExprAST - Expression class for referencing a variable, like "a".
    +class VariableExprAST : public ExprAST {
    +  std::string Name;
    +public:
    +  VariableExprAST(const std::string &name) : Name(name) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// BinaryExprAST - Expression class for a binary operator.
    +class BinaryExprAST : public ExprAST {
    +  char Op;
    +  ExprAST *LHS, *RHS;
    +public:
    +  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) 
    +    : Op(op), LHS(lhs), RHS(rhs) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// CallExprAST - Expression class for function calls.
    +class CallExprAST : public ExprAST {
    +  std::string Callee;
    +  std::vector<ExprAST*> Args;
    +public:
    +  CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
    +    : Callee(callee), Args(args) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// PrototypeAST - This class represents the "prototype" for a function,
    +/// which captures its name, and its argument names (thus implicitly the number
    +/// of arguments the function takes).
    +class PrototypeAST {
    +  std::string Name;
    +  std::vector<std::string> Args;
    +public:
    +  PrototypeAST(const std::string &name, const std::vector<std::string> &args)
    +    : Name(name), Args(args) {}
    +  
    +  Function *Codegen();
    +};
    +
    +/// FunctionAST - This class represents a function definition itself.
    +class FunctionAST {
    +  PrototypeAST *Proto;
    +  ExprAST *Body;
    +public:
    +  FunctionAST(PrototypeAST *proto, ExprAST *body)
    +    : Proto(proto), Body(body) {}
    +  
    +  Function *Codegen();
    +};
    +
    +//===----------------------------------------------------------------------===//
    +// Parser
    +//===----------------------------------------------------------------------===//
    +
    +/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
    +/// token the parser is looking at.  getNextToken reads another token from the
    +/// lexer and updates CurTok with its results.
    +static int CurTok;
    +static int getNextToken() {
    +  return CurTok = gettok();
    +}
    +
    +/// BinopPrecedence - This holds the precedence for each binary operator that is
    +/// defined.
    +static std::map<char, int> BinopPrecedence;
    +
    +/// GetTokPrecedence - Get the precedence of the pending binary operator token.
    +static int GetTokPrecedence() {
    +  if (!isascii(CurTok))
    +    return -1;
    +  
    +  // Make sure it's a declared binop.
    +  int TokPrec = BinopPrecedence[CurTok];
    +  if (TokPrec <= 0) return -1;
    +  return TokPrec;
    +}
    +
    +/// Error* - These are little helper functions for error handling.
    +ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
    +PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
    +FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
    +
    +static ExprAST *ParseExpression();
    +
    +/// identifierexpr
    +///   ::= identifier
    +///   ::= identifier '(' expression* ')'
    +static ExprAST *ParseIdentifierExpr() {
    +  std::string IdName = IdentifierStr;
    +  
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '(') // Simple variable ref.
    +    return new VariableExprAST(IdName);
    +  
    +  // Call.
    +  getNextToken();  // eat (
    +  std::vector<ExprAST*> Args;
    +  if (CurTok != ')') {
    +    while (1) {
    +      ExprAST *Arg = ParseExpression();
    +      if (!Arg) return 0;
    +      Args.push_back(Arg);
    +
    +      if (CurTok == ')') break;
    +
    +      if (CurTok != ',')
    +        return Error("Expected ')' or ',' in argument list");
    +      getNextToken();
    +    }
    +  }
    +
    +  // Eat the ')'.
    +  getNextToken();
    +  
    +  return new CallExprAST(IdName, Args);
    +}
    +
    +/// numberexpr ::= number
    +static ExprAST *ParseNumberExpr() {
    +  ExprAST *Result = new NumberExprAST(NumVal);
    +  getNextToken(); // consume the number
    +  return Result;
    +}
    +
    +/// parenexpr ::= '(' expression ')'
    +static ExprAST *ParseParenExpr() {
    +  getNextToken();  // eat (.
    +  ExprAST *V = ParseExpression();
    +  if (!V) return 0;
    +  
    +  if (CurTok != ')')
    +    return Error("expected ')'");
    +  getNextToken();  // eat ).
    +  return V;
    +}
    +
    +/// primary
    +///   ::= identifierexpr
    +///   ::= numberexpr
    +///   ::= parenexpr
    +static ExprAST *ParsePrimary() {
    +  switch (CurTok) {
    +  default: return Error("unknown token when expecting an expression");
    +  case tok_identifier: return ParseIdentifierExpr();
    +  case tok_number:     return ParseNumberExpr();
    +  case '(':            return ParseParenExpr();
    +  }
    +}
    +
    +/// binoprhs
    +///   ::= ('+' primary)*
    +static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
    +  // If this is a binop, find its precedence.
    +  while (1) {
    +    int TokPrec = GetTokPrecedence();
    +    
    +    // If this is a binop that binds at least as tightly as the current binop,
    +    // consume it, otherwise we are done.
    +    if (TokPrec < ExprPrec)
    +      return LHS;
    +    
    +    // Okay, we know this is a binop.
    +    int BinOp = CurTok;
    +    getNextToken();  // eat binop
    +    
    +    // Parse the primary expression after the binary operator.
    +    ExprAST *RHS = ParsePrimary();
    +    if (!RHS) return 0;
    +    
    +    // If BinOp binds less tightly with RHS than the operator after RHS, let
    +    // the pending operator take RHS as its LHS.
    +    int NextPrec = GetTokPrecedence();
    +    if (TokPrec < NextPrec) {
    +      RHS = ParseBinOpRHS(TokPrec+1, RHS);
    +      if (RHS == 0) return 0;
    +    }
    +    
    +    // Merge LHS/RHS.
    +    LHS = new BinaryExprAST(BinOp, LHS, RHS);
    +  }
    +}
    +
    +/// expression
    +///   ::= primary binoprhs
    +///
    +static ExprAST *ParseExpression() {
    +  ExprAST *LHS = ParsePrimary();
    +  if (!LHS) return 0;
    +  
    +  return ParseBinOpRHS(0, LHS);
    +}
    +
    +/// prototype
    +///   ::= id '(' id* ')'
    +static PrototypeAST *ParsePrototype() {
    +  if (CurTok != tok_identifier)
    +    return ErrorP("Expected function name in prototype");
    +
    +  std::string FnName = IdentifierStr;
    +  getNextToken();
    +  
    +  if (CurTok != '(')
    +    return ErrorP("Expected '(' in prototype");
    +  
    +  std::vector<std::string> ArgNames;
    +  while (getNextToken() == tok_identifier)
    +    ArgNames.push_back(IdentifierStr);
    +  if (CurTok != ')')
    +    return ErrorP("Expected ')' in prototype");
    +  
    +  // success.
    +  getNextToken();  // eat ')'.
    +  
    +  return new PrototypeAST(FnName, ArgNames);
    +}
    +
    +/// definition ::= 'def' prototype expression
    +static FunctionAST *ParseDefinition() {
    +  getNextToken();  // eat def.
    +  PrototypeAST *Proto = ParsePrototype();
    +  if (Proto == 0) return 0;
    +
    +  if (ExprAST *E = ParseExpression())
    +    return new FunctionAST(Proto, E);
    +  return 0;
    +}
    +
    +/// toplevelexpr ::= expression
    +static FunctionAST *ParseTopLevelExpr() {
    +  if (ExprAST *E = ParseExpression()) {
    +    // Make an anonymous proto.
    +    PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
    +    return new FunctionAST(Proto, E);
    +  }
    +  return 0;
    +}
    +
    +/// external ::= 'extern' prototype
    +static PrototypeAST *ParseExtern() {
    +  getNextToken();  // eat extern.
    +  return ParsePrototype();
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Code Generation
    +//===----------------------------------------------------------------------===//
    +
    +static Module *TheModule;
    +static IRBuilder<> Builder(getGlobalContext());
    +static std::map<std::string, Value*> NamedValues;
    +
    +Value *ErrorV(const char *Str) { Error(Str); return 0; }
    +
    +Value *NumberExprAST::Codegen() {
    +  return ConstantFP::get(getGlobalContext(), APFloat(Val));
    +}
    +
    +Value *VariableExprAST::Codegen() {
    +  // Look this variable up in the function.
    +  Value *V = NamedValues[Name];
    +  return V ? V : ErrorV("Unknown variable name");
    +}
    +
    +Value *BinaryExprAST::Codegen() {
    +  Value *L = LHS->Codegen();
    +  Value *R = RHS->Codegen();
    +  if (L == 0 || R == 0) return 0;
    +  
    +  switch (Op) {
    +  case '+': return Builder.CreateFAdd(L, R, "addtmp");
    +  case '-': return Builder.CreateFSub(L, R, "subtmp");
    +  case '*': return Builder.CreateFMul(L, R, "multmp");
    +  case '<':
    +    L = Builder.CreateFCmpULT(L, R, "cmptmp");
    +    // Convert bool 0/1 to double 0.0 or 1.0
    +    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
    +                                "booltmp");
    +  default: return ErrorV("invalid binary operator");
    +  }
    +}
    +
    +Value *CallExprAST::Codegen() {
    +  // Look up the name in the global module table.
    +  Function *CalleeF = TheModule->getFunction(Callee);
    +  if (CalleeF == 0)
    +    return ErrorV("Unknown function referenced");
    +  
    +  // Report an error if the argument count does not match.
    +  if (CalleeF->arg_size() != Args.size())
    +    return ErrorV("Incorrect # arguments passed");
    +
    +  std::vector<Value*> ArgsV;
    +  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
    +    ArgsV.push_back(Args[i]->Codegen());
    +    if (ArgsV.back() == 0) return 0;
    +  }
    +  
    +  return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
    +}
    +
    +Function *PrototypeAST::Codegen() {
    +  // Make the function type:  double(double,double) etc.
    +  std::vector<const Type*> Doubles(Args.size(),
    +                                   Type::getDoubleTy(getGlobalContext()));
    +  FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
    +                                       Doubles, false);
    +  
    +  Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
    +  
    +  // If F conflicted, there was already something named 'Name'.  If it has a
    +  // body, don't allow redefinition or reextern.
    +  if (F->getName() != Name) {
    +    // Delete the one we just made and get the existing one.
    +    F->eraseFromParent();
    +    F = TheModule->getFunction(Name);
    +    
    +    // If F already has a body, reject this.
    +    if (!F->empty()) {
    +      ErrorF("redefinition of function");
    +      return 0;
    +    }
    +    
    +    // If F took a different number of args, reject.
    +    if (F->arg_size() != Args.size()) {
    +      ErrorF("redefinition of function with different # args");
    +      return 0;
    +    }
    +  }
    +  
    +  // Set names for all arguments.
    +  unsigned Idx = 0;
    +  for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
    +       ++AI, ++Idx) {
    +    AI->setName(Args[Idx]);
    +    
    +    // Add arguments to variable symbol table.
    +    NamedValues[Args[Idx]] = AI;
    +  }
    +  
    +  return F;
    +}
    +
    +Function *FunctionAST::Codegen() {
    +  NamedValues.clear();
    +  
    +  Function *TheFunction = Proto->Codegen();
    +  if (TheFunction == 0)
    +    return 0;
    +  
    +  // Create a new basic block to start insertion into.
    +  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
    +  Builder.SetInsertPoint(BB);
    +  
    +  if (Value *RetVal = Body->Codegen()) {
    +    // Finish off the function.
    +    Builder.CreateRet(RetVal);
    +
    +    // Validate the generated code, checking for consistency.
    +    verifyFunction(*TheFunction);
    +
    +    return TheFunction;
    +  }
    +  
    +  // Error reading body, remove function.
    +  TheFunction->eraseFromParent();
    +  return 0;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Top-Level parsing and JIT Driver
    +//===----------------------------------------------------------------------===//
    +
    +static void HandleDefinition() {
    +  if (FunctionAST *F = ParseDefinition()) {
    +    if (Function *LF = F->Codegen()) {
    +      fprintf(stderr, "Read function definition:");
    +      LF->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleExtern() {
    +  if (PrototypeAST *P = ParseExtern()) {
    +    if (Function *F = P->Codegen()) {
    +      fprintf(stderr, "Read extern: ");
    +      F->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleTopLevelExpression() {
    +  // Evaluate a top-level expression into an anonymous function.
    +  if (FunctionAST *F = ParseTopLevelExpr()) {
    +    if (Function *LF = F->Codegen()) {
    +      fprintf(stderr, "Read top-level expression:");
    +      LF->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +/// top ::= definition | external | expression | ';'
    +static void MainLoop() {
    +  while (1) {
    +    fprintf(stderr, "ready> ");
    +    switch (CurTok) {
    +    case tok_eof:    return;
    +    case ';':        getNextToken(); break;  // ignore top-level semicolons.
    +    case tok_def:    HandleDefinition(); break;
    +    case tok_extern: HandleExtern(); break;
    +    default:         HandleTopLevelExpression(); break;
    +    }
    +  }
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// "Library" functions that can be "extern'd" from user code.
    +//===----------------------------------------------------------------------===//
    +
    +/// putchard - putchar that takes a double and returns 0.
    +extern "C" 
    +double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Main driver code.
    +//===----------------------------------------------------------------------===//
    +
    +int main() {
    +  LLVMContext &Context = getGlobalContext();
    +
    +  // Install standard binary operators.
    +  // 1 is lowest precedence.
    +  BinopPrecedence['<'] = 10;
    +  BinopPrecedence['+'] = 20;
    +  BinopPrecedence['-'] = 20;
    +  BinopPrecedence['*'] = 40;  // highest.
    +
    +  // Prime the first token.
    +  fprintf(stderr, "ready> ");
    +  getNextToken();
    +
    +  // Make the module, which holds all the code.
    +  TheModule = new Module("my cool jit", Context);
    +
    +  // Run the main "interpreter loop" now.
    +  MainLoop();
    +
    +  // Print out all of the generated code.
    +  TheModule->dump();
    +
    +  return 0;
    +}
    +
    +
    +Next: Adding JIT and Optimizer Support +
    + + +
    +
    + Chris Lattner
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-09-01 13:09:20 -0700 (Wed, 01 Sep 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/tutorial/LangImpl4.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/LangImpl4.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/tutorial/LangImpl4.html (added) +++ www-releases/trunk/2.8/docs/tutorial/LangImpl4.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1132 @@ + + + + + Kaleidoscope: Adding JIT and Optimizer Support + + + + + + + +
    Kaleidoscope: Adding JIT and Optimizer Support
    + + + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    Welcome to Chapter 4 of the "Implementing a language +with LLVM" tutorial. Chapters 1-3 described the implementation of a simple +language and added support for generating LLVM IR. This chapter describes +two new techniques: adding optimizer support to your language, and adding JIT +compiler support. These additions will demonstrate how to get nice, efficient code +for the Kaleidoscope language.

    + +
    + + + + + +
    + +

    +Our demonstration for Chapter 3 is elegant and easy to extend. Unfortunately, +it does not produce wonderful code. The IRBuilder, however, does give us +obvious optimizations when compiling simple code:

    + +
    +
    +ready> def test(x) 1+2+x;
    +Read function definition:
    +define double @test(double %x) {
    +entry:
    +        %addtmp = fadd double 3.000000e+00, %x
    +        ret double %addtmp
    +}
    +
    +
    + +

    This code is not a literal transcription of the AST built by parsing the +input. That would be: + +

    +
    +ready> def test(x) 1+2+x;
    +Read function definition:
    +define double @test(double %x) {
    +entry:
    +        %addtmp = fadd double 2.000000e+00, 1.000000e+00
    +        %addtmp1 = fadd double %addtmp, %x
    +        ret double %addtmp1
    +}
    +
    +
    + +

    Constant folding, as seen above, is a very common and very important optimization: so much so that many language implementors implement constant folding support in their AST representation.

    + +

    With LLVM, you don't need this support in the AST. Since all calls to build LLVM IR go through the LLVM IR builder, the builder itself checks to see if there is a constant folding opportunity when you call it. If so, it just does the constant fold and returns the constant instead of creating an instruction.

    Well, that was easy :). In practice, we recommend always using IRBuilder when generating code like this. It has no "syntactic overhead" for its use (you don't have to uglify your compiler with constant checks everywhere) and it can dramatically reduce the amount of LLVM IR that is generated in some cases (particularly for languages with a macro preprocessor or that use a lot of constants).
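The fold-while-building behavior described above can be illustrated with a toy builder. This is a minimal sketch that does not use LLVM at all: ToyValue, ToyBuilder and their members are hypothetical names invented for the illustration, and the real IRBuilder is considerably more general.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy IR value: either a floating-point constant or a named result.
struct ToyValue {
  bool IsConstant;
  double Constant;   // meaningful only when IsConstant is true
  std::string Text;  // printable form, e.g. "3.000000" or "%x"
};

// Hypothetical builder (not LLVM's IRBuilder) that folds constants at the
// moment an instruction is requested.
class ToyBuilder {
  std::vector<std::string> Instructions;  // emitted fadd lines, in order
  unsigned NextId = 0;

public:
  ToyValue constant(double V) { return {true, V, std::to_string(V)}; }
  ToyValue argument(const std::string &Name) {
    return {false, 0.0, "%" + Name};
  }

  ToyValue createFAdd(const ToyValue &L, const ToyValue &R) {
    assert(!L.Text.empty() && !R.Text.empty());
    // Both operands are constants: fold them and emit no instruction.
    if (L.IsConstant && R.IsConstant)
      return constant(L.Constant + R.Constant);
    // Otherwise emit a real instruction and return its result name.
    std::string Name = "%addtmp" + std::to_string(NextId++);
    Instructions.push_back(Name + " = fadd double " + L.Text + ", " + R.Text);
    return {false, 0.0, Name};
  }

  const std::vector<std::string> &instructions() const { return Instructions; }
};
```

Building "1+2+x" with this sketch means calling createFAdd on the two constants first and then on the folded result and the argument, so only one fadd is recorded. The caller never has to check for constants itself, which is the "no syntactic overhead" point made above.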

    + +

    On the other hand, the IRBuilder is limited by the fact +that it does all of its analysis inline with the code as it is built. If you +take a slightly more complex example:

    + +
    +
    +ready> def test(x) (1+2+x)*(x+(1+2));
    +ready> Read function definition:
    +define double @test(double %x) {
    +entry:
    +        %addtmp = fadd double 3.000000e+00, %x
    +        %addtmp1 = fadd double %x, 3.000000e+00
    +        %multmp = fmul double %addtmp, %addtmp1
    +        ret double %multmp
    +}
    +
    +
    + +

    In this case, the LHS and RHS of the multiplication are the same value. We'd +really like to see this generate "tmp = x+3; result = tmp*tmp;" instead +of computing "x+3" twice.

    + +

    Unfortunately, no amount of local analysis will be able to detect and correct this. This requires two transformations: reassociation of expressions (to make the adds lexically identical) and Common Subexpression Elimination (CSE) to delete the redundant add instruction. Fortunately, LLVM provides a broad range of optimizations that you can use, in the form of "passes".
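To make the two transformations concrete, here is a toy sketch over a straight-line list of instructions. It is not how LLVM's reassociate or GVN passes are implemented: ToyInst and canonicalizeAndCSE are hypothetical names, and the "reassociation" here is reduced to putting the operands of commutative operations into a canonical order, which is only a small part of what a real reassociation pass does.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

struct ToyInst {
  std::string Name;      // result name, e.g. "%addtmp"
  std::string Op;        // "fadd" or "fmul"
  std::string LHS, RHS;  // operand names or constant text
};

// Puts commutative operands into a canonical order, then drops every
// instruction that repeats an earlier (Op, LHS, RHS) triple and redirects
// later uses to the surviving result.
std::vector<ToyInst> canonicalizeAndCSE(const std::vector<ToyInst> &Block) {
  std::map<std::tuple<std::string, std::string, std::string>, std::string> Seen;
  std::map<std::string, std::string> Replaced;  // dropped name -> survivor
  std::vector<ToyInst> Out;

  for (ToyInst I : Block) {
    // Redirect operands that referred to an eliminated instruction.
    if (Replaced.count(I.LHS)) I.LHS = Replaced[I.LHS];
    if (Replaced.count(I.RHS)) I.RHS = Replaced[I.RHS];

    // This sketch only handles commutative operations, so swapping is safe.
    assert(I.Op == "fadd" || I.Op == "fmul");
    if (I.RHS < I.LHS) std::swap(I.LHS, I.RHS);

    auto Key = std::make_tuple(I.Op, I.LHS, I.RHS);
    auto It = Seen.find(Key);
    if (It != Seen.end()) {
      Replaced[I.Name] = It->second;  // redundant: reuse the earlier result
      continue;
    }
    Seen[Key] = I.Name;
    Out.push_back(I);
  }
  return Out;
}
```

Fed the three instructions from the example above (with the constant written as "3.0"), the sketch keeps one add and rewrites the multiply to use that add's result for both operands. The ordering rule is plain string comparison, chosen only because it is simple; any consistent ordering would serve.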

    + +
    + + + + + +
    + +

    LLVM provides many optimization passes, which do many different sorts of +things and have different tradeoffs. Unlike other systems, LLVM doesn't hold +to the mistaken notion that one set of optimizations is right for all languages +and for all situations. LLVM allows a compiler implementor to make complete +decisions about what optimizations to use, in which order, and in what +situation.

    + +

    As a concrete example, LLVM supports "whole module" passes, which look across as large a body of code as they can (often a whole file, but if run at link time, this can be a substantial portion of the whole program). It also supports and includes "per-function" passes, which operate on a single function at a time, without looking at other functions. For more information on passes and how they are run, see the How to Write a Pass document and the List of LLVM Passes.

    + +

    For Kaleidoscope, we are currently generating functions on the fly, one at +a time, as the user types them in. We aren't shooting for the ultimate +optimization experience in this setting, but we also want to catch the easy and +quick stuff where possible. As such, we will choose to run a few per-function +optimizations as the user types the function in. If we wanted to make a "static +Kaleidoscope compiler", we would use exactly the code we have now, except that +we would defer running the optimizer until the entire file has been parsed.

    + +

    In order to get per-function optimizations going, we need to set up a +FunctionPassManager to hold and +organize the LLVM optimizations that we want to run. Once we have that, we can +add a set of optimizations to run. The code looks like this:

    + +
    +
    +  FunctionPassManager OurFPM(TheModule);
    +
    +  // Set up the optimizer pipeline.  Start with registering info about how the
    +  // target lays out data structures.
    +  OurFPM.add(new TargetData(*TheExecutionEngine->getTargetData()));
    +  // Do simple "peephole" optimizations and bit-twiddling optzns.
    +  OurFPM.add(createInstructionCombiningPass());
    +  // Reassociate expressions.
    +  OurFPM.add(createReassociatePass());
    +  // Eliminate Common SubExpressions.
    +  OurFPM.add(createGVNPass());
    +  // Simplify the control flow graph (deleting unreachable blocks, etc).
    +  OurFPM.add(createCFGSimplificationPass());
    +
    +  OurFPM.doInitialization();
    +
    +  // Set the global so the code gen can use this.
    +  TheFPM = &OurFPM;
    +
    +  // Run the main "interpreter loop" now.
    +  MainLoop();
    +
    +
    + +

    This code defines a FunctionPassManager, "OurFPM". It requires a pointer to the Module to construct itself. Once it is set up, we use a series of "add" calls to add a bunch of LLVM passes. The first pass is basically boilerplate: it adds a pass so that later optimizations know how the data structures in the program are laid out. The "TheExecutionEngine" variable is related to the JIT, which we will get to in the next section.

    + +

    In this case, we choose to add 4 optimization passes. The passes we chose +here are a pretty standard set of "cleanup" optimizations that are useful for +a wide variety of code. I won't delve into what they do but, believe me, +they are a good starting place :).
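The "add passes, then run them over each function" pattern can be sketched without LLVM. Everything below is hypothetical and invented for illustration (ToyFunction, ToyPass, ToyFunctionPassManager, dropAdjacentDuplicates); the real FunctionPassManager also handles analyses, initialization and scheduling, none of which is modeled here.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a function body: just a list of instruction strings.
using ToyFunction = std::vector<std::string>;

// A pass transforms the function in place and reports whether it changed it.
using ToyPass = std::function<bool(ToyFunction &)>;

class ToyFunctionPassManager {
  std::vector<ToyPass> Passes;

public:
  void add(ToyPass P) {
    assert(P && "pass must be callable");
    Passes.push_back(std::move(P));
  }

  // Runs every pass once, in the order it was added.
  bool run(ToyFunction &F) {
    bool Changed = false;
    for (ToyPass &P : Passes)
      Changed |= P(F);
    return Changed;
  }
};

// Example pass: removes instructions that exactly repeat the previous one.
bool dropAdjacentDuplicates(ToyFunction &F) {
  auto NewEnd = std::unique(F.begin(), F.end());
  bool Changed = NewEnd != F.end();
  F.erase(NewEnd, F.end());
  return Changed;
}
```

A caller would construct one manager, add its passes once, and then call run on every function as it is generated. Because passes run in insertion order, an early pass can set up opportunities for a later one, which is why the order of the "add" calls matters.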

    + +

    Once the PassManager is set up, we need to make use of it. We do this by +running it after our newly created function is constructed (in +FunctionAST::Codegen), but before it is returned to the client:

    + +
    +
    +  if (Value *RetVal = Body->Codegen()) {
    +    // Finish off the function.
    +    Builder.CreateRet(RetVal);
    +
    +    // Validate the generated code, checking for consistency.
    +    verifyFunction(*TheFunction);
    +
    +    // Optimize the function.
    +    TheFPM->run(*TheFunction);
    +    
    +    return TheFunction;
    +  }
    +
    +
    + +

    As you can see, this is pretty straightforward. The +FunctionPassManager optimizes and updates the LLVM Function* in place, +improving (hopefully) its body. With this in place, we can try our test above +again:

    + +
    +
    +ready> def test(x) (1+2+x)*(x+(1+2));
    +ready> Read function definition:
    +define double @test(double %x) {
    +entry:
    +        %addtmp = fadd double %x, 3.000000e+00
    +        %multmp = fmul double %addtmp, %addtmp
    +        ret double %multmp
    +}
    +
    +
    + +

    As expected, we now get our nicely optimized code, saving a floating point +add instruction from every execution of this function.

    + +

    LLVM provides a wide variety of optimizations that can be used in certain +circumstances. Some documentation about the various +passes is available, but it isn't very complete. Another good source of +ideas can come from looking at the passes that llvm-gcc or +llvm-ld run to get started. The "opt" tool allows you to +experiment with passes from the command line, so you can see if they do +anything.

    + +

    Now that we have reasonable code coming out of our front-end, let's talk about executing it!

    + +
    + + + + + +
    + +

    Code that is available in LLVM IR can have a wide variety of tools +applied to it. For example, you can run optimizations on it (as we did above), +you can dump it out in textual or binary forms, you can compile the code to an +assembly file (.s) for some target, or you can JIT compile it. The nice thing +about the LLVM IR representation is that it is the "common currency" between +many different parts of the compiler. +

    + +

    In this section, we'll add JIT compiler support to our interpreter. The +basic idea that we want for Kaleidoscope is to have the user enter function +bodies as they do now, but immediately evaluate the top-level expressions they +type in. For example, if they type in "1 + 2;", we should evaluate and print +out 3. If they define a function, they should be able to call it from the +command line.

    + +

    In order to do this, we first declare and initialize the JIT. This is done +by adding a global variable and a call in main:

    + +
    +
    +static ExecutionEngine *TheExecutionEngine;
    +...
    +int main() {
    +  ..
    +  // Create the JIT.  This takes ownership of the module.
    +  TheExecutionEngine = EngineBuilder(TheModule).create();
    +  ..
    +}
    +
    +
    + +

    This creates an abstract "Execution Engine" which can be either a JIT compiler or the LLVM interpreter. LLVM will automatically pick a JIT compiler for you if one is available for your platform; otherwise it will fall back to the interpreter.

    + +

    Once the ExecutionEngine is created, the JIT is ready to be used. +There are a variety of APIs that are useful, but the simplest one is the +"getPointerToFunction(F)" method. This method JIT compiles the +specified LLVM Function and returns a function pointer to the generated machine +code. In our case, this means that we can change the code that parses a +top-level expression to look like this:

    + +
    +
    +static void HandleTopLevelExpression() {
    +  // Evaluate a top-level expression into an anonymous function.
    +  if (FunctionAST *F = ParseTopLevelExpr()) {
    +    if (Function *LF = F->Codegen()) {
    +      LF->dump();  // Dump the function for exposition purposes.
    +    
    +      // JIT the function, returning a function pointer.
    +      void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
    +      
    +      // Cast it to the right type (takes no arguments, returns a double) so we
    +      // can call it as a native function.
    +      double (*FP)() = (double (*)())(intptr_t)FPtr;
    +      fprintf(stderr, "Evaluated to %f\n", FP());
    +    }
    +  }
    +}
    +
    +
    + +

    Recall that we compile top-level expressions into a self-contained LLVM function that takes no arguments and returns the computed double. Because the LLVM JIT compiler matches the native platform ABI, you can just cast the result pointer to a function pointer of that type and call it directly. This means there is no difference between JIT compiled code and native machine code that is statically linked into your application.
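To make the cast concrete without pulling in LLVM, here is a minimal sketch in which an ordinary native function stands in for the JIT's output. The names AnonExpr, GetCodeAddress, and CallAsDoubleFn are hypothetical and exist only for this illustration; only the cast itself mirrors the tutorial code. Converting between a function pointer and void* is conditionally supported in C++, so this assumes a platform where it works, as the tutorial code does.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Stand-in for JIT output: a native function with the same signature as a
// compiled top-level expression (no arguments, returns a double).
static double AnonExpr() { return 4.0 + 5.0; }

// Hypothetical helper: hands back an untyped code address, the way
// getPointerToFunction hands back a void*.
static void *GetCodeAddress() {
  return reinterpret_cast<void *>(reinterpret_cast<std::intptr_t>(&AnonExpr));
}

// Cast the untyped address back to the right function pointer type and call it.
static double CallAsDoubleFn(void *FPtr) {
  assert(FPtr && "null code address");
  typedef double (*DoubleFn)();
  DoubleFn FP = reinterpret_cast<DoubleFn>(reinterpret_cast<std::intptr_t>(FPtr));
  return FP();
}
```

Calling CallAsDoubleFn(GetCodeAddress()) goes through exactly the same call sequence as calling AnonExpr() directly, which is the point of the paragraph above: once the address is cast to the right type, the caller cannot tell where the machine code came from.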

    + +

    With just these two changes, let's see how Kaleidoscope works now!

    + +
    +
    +ready> 4+5;
    +define double @""() {
    +entry:
    +        ret double 9.000000e+00
    +}
    +
    +Evaluated to 9.000000
    +
    +
    + +

    Well, this looks like it is basically working. The dump of the function shows the "no argument function that always returns double" that we synthesize for each top-level expression that is typed in. This demonstrates very basic functionality, but can we do more?

    + +
    +
    +ready> def testfunc(x y) x + y*2;  
    +Read function definition:
    +define double @testfunc(double %x, double %y) {
    +entry:
    +        %multmp = fmul double %y, 2.000000e+00
    +        %addtmp = fadd double %multmp, %x
    +        ret double %addtmp
    +}
    +
    +ready> testfunc(4, 10);
    +define double @""() {
    +entry:
    +        %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01)
    +        ret double %calltmp
    +}
    +
    +Evaluated to 24.000000
    +
    +
    + +

    This illustrates that we can now call user code, but there is something a bit subtle going on here. Note that we invoked the JIT only on the anonymous function that calls testfunc, never on testfunc itself. What actually happened here is that the JIT scanned for all non-JIT'd functions transitively called from the anonymous function and compiled all of them before returning from getPointerToFunction().
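The "transitively called" part can be pictured as a reachability walk over a call graph. The sketch below is not how the LLVM JIT is implemented; it is a toy model, with a hypothetical CallGraph type, a hypothetical FunctionsToCompile helper, and "anon" standing in for the unnamed top-level function, that computes which functions would need machine code before the root could run.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: name of a function with a body -> names it calls.
typedef std::map<std::string, std::vector<std::string> > CallGraph;

// Collect every not-yet-compiled function with a body that is reachable
// from Root (Root included).
static std::set<std::string>
FunctionsToCompile(const CallGraph &CG, const std::string &Root,
                   const std::set<std::string> &Compiled) {
  assert(CG.count(Root) && "root must have a body");
  std::set<std::string> Needed;
  std::vector<std::string> Worklist(1, Root);
  while (!Worklist.empty()) {
    std::string Name = Worklist.back();
    Worklist.pop_back();

    CallGraph::const_iterator It = CG.find(Name);
    if (It == CG.end()) continue;               // no body: resolved externally
    if (Compiled.count(Name)) continue;         // machine code already exists
    if (!Needed.insert(Name).second) continue;  // already visited

    // Queue the callees of this function.
    Worklist.insert(Worklist.end(), It->second.begin(), It->second.end());
  }
  return Needed;
}

// The session above: an anonymous function that calls testfunc.
static CallGraph ExampleGraph() {
  CallGraph CG;
  CG["anon"].push_back("testfunc");
  CG["testfunc"] = std::vector<std::string>();  // defined, calls nothing
  return CG;
}
```

For the example graph, the walk starting at "anon" yields both "anon" and "testfunc", matching the behavior described above. If "testfunc" is already in the compiled set, only "anon" remains.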

    + +

    The JIT provides a number of other more advanced interfaces for things like freeing allocated machine code, rejit'ing functions to update them, etc. However, even with this simple code, we get some surprisingly powerful capabilities. Check this out (I removed the dump of the anonymous functions; you should get the idea by now :) :

    + +
    +
    +ready> extern sin(x);
    +Read extern: 
    +declare double @sin(double)
    +
    +ready> extern cos(x);
    +Read extern: 
    +declare double @cos(double)
    +
    +ready> sin(1.0);
    +Evaluated to 0.841471
    +
    +ready> def foo(x) sin(x)*sin(x) + cos(x)*cos(x);
    +Read function definition:
    +define double @foo(double %x) {
    +entry:
    +        %calltmp = call double @sin(double %x)
    +        %multmp = fmul double %calltmp, %calltmp
    +        %calltmp2 = call double @cos(double %x)
    +        %multmp4 = fmul double %calltmp2, %calltmp2
    +        %addtmp = fadd double %multmp, %multmp4
    +        ret double %addtmp
    +}
    +
    +ready> foo(4.0);
    +Evaluated to 1.000000
    +
    +
    + +

    Whoa, how does the JIT know about sin and cos? The answer is surprisingly simple: in this example, the JIT started execution of a function and got to a function call. It realized that the function was not yet JIT compiled and invoked the standard set of routines to resolve the function. In this case, there is no body defined for the function, so the JIT ended up calling "dlsym("sin")" on the Kaleidoscope process itself. Since "sin" is defined within the JIT's address space, it simply patches up calls in the module to call the libm version of sin directly.

    + +

    The LLVM JIT provides a number of interfaces (look in the ExecutionEngine.h file) for controlling how unknown functions get resolved. It allows you to establish explicit mappings between IR objects and addresses (useful for LLVM global variables that you want to map to static tables, for example), lets you decide on the fly based on the function name, and even allows you to have the JIT compile functions lazily the first time they're called.
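As a rough picture of what "explicit mappings" and "deciding based on the function name" mean, here is a small stand-alone sketch. SymbolTable, AddMapping, Resolve, and MakeDefaultTable are hypothetical names made up for this illustration and are not part of the ExecutionEngine interface; the sketch only models the idea of looking up a native address by name when a function has no body.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

typedef double (*UnaryFn)(double);

// Hypothetical resolver: explicit name -> address mappings, consulted when a
// called function has no body of its own.
class SymbolTable {
  std::map<std::string, UnaryFn> Known;
public:
  void AddMapping(const std::string &Name, UnaryFn Fn) {
    assert(Fn && "cannot map a name to a null address");
    Known[Name] = Fn;
  }

  // Returns a null pointer when the name is unknown.
  UnaryFn Resolve(const std::string &Name) const {
    std::map<std::string, UnaryFn>::const_iterator It = Known.find(Name);
    return It == Known.end() ? 0 : It->second;
  }
};

// Wrappers give us unambiguous addresses for the overloaded <cmath> functions.
static double Sin(double X) { return std::sin(X); }
static double Cos(double X) { return std::cos(X); }

static SymbolTable MakeDefaultTable() {
  SymbolTable T;
  T.AddMapping("sin", &Sin);
  T.AddMapping("cos", &Cos);
  return T;
}
```

With this table, Resolve("sin") returns a callable pointer and Resolve("tan") returns null, which is the point where a real resolver would either fall back to another lookup strategy or report an error.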

    + +

    One interesting application of this is that we can now extend the language by writing arbitrary C++ code to implement operations. For example, if we add:

    + +
    +
    +/// putchard - putchar that takes a double and returns 0.
    +extern "C" 
    +double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +
    + +

    Now we can produce simple output to the console by using things like: "extern putchard(x); putchard(120);", which prints a lowercase 'x' on the console (120 is the ASCII code for 'x'). Similar code could be used to implement file I/O, console input, and many other capabilities in Kaleidoscope.
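As one more example of the same pattern, here is a sketch of a second library function. The name printd is a hypothetical choice for this illustration and is not part of this chapter's listing; like putchard, it takes a double and returns a double so that it matches Kaleidoscope's only data type, and it is declared extern "C" so its symbol name is not mangled.

```cpp
#include <cstdio>

/// printd - hypothetical helper: print a double followed by a newline and
/// return 0, so it can be used as a Kaleidoscope expression.
extern "C"
double printd(double X) {
  std::printf("%f\n", X);
  return 0;
}
```

If this were compiled into the toy binary, "extern printd(x); printd(4+5);" should resolve to it the same way putchard is resolved, assuming the symbol is visible to the dynamic lookup.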

    + +

    This completes the JIT and optimizer chapter of the Kaleidoscope tutorial. At this point, we can compile a non-Turing-complete programming language, and optimize and JIT compile it in a user-driven way. Next up we'll look into extending the language with control flow constructs, tackling some interesting LLVM IR issues along the way.

    + +
    + + + + + +
    + +

    Here is the complete code listing for our running example, enhanced with the LLVM JIT and optimizer. To build this example, use:

    + +
    +
    +   # Compile
    +   g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
    +   # Run
    +   ./toy
    +
    +
    + +

    If you are compiling this on Linux, make sure to add the "-rdynamic" option as well. This makes sure that the external functions are resolved properly at runtime.

    + +

    Here is the code:

    + +
    +
    +#include "llvm/DerivedTypes.h"
    +#include "llvm/ExecutionEngine/ExecutionEngine.h"
    +#include "llvm/ExecutionEngine/JIT.h"
    +#include "llvm/LLVMContext.h"
    +#include "llvm/Module.h"
    +#include "llvm/PassManager.h"
    +#include "llvm/Analysis/Verifier.h"
    +#include "llvm/Target/TargetData.h"
    +#include "llvm/Target/TargetSelect.h"
    +#include "llvm/Transforms/Scalar.h"
    +#include "llvm/Support/IRBuilder.h"
    +#include <cctype>
    +#include <cstdio>
    +#include <cstdlib>
    +#include <string>
    +#include <map>
    +#include <vector>
    +using namespace llvm;
    +
    +//===----------------------------------------------------------------------===//
    +// Lexer
    +//===----------------------------------------------------------------------===//
    +
    +// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
    +// of these for known things.
    +enum Token {
    +  tok_eof = -1,
    +
    +  // commands
    +  tok_def = -2, tok_extern = -3,
    +
    +  // primary
    +  tok_identifier = -4, tok_number = -5
    +};
    +
    +static std::string IdentifierStr;  // Filled in if tok_identifier
    +static double NumVal;              // Filled in if tok_number
    +
    +/// gettok - Return the next token from standard input.
    +static int gettok() {
    +  static int LastChar = ' ';
    +
    +  // Skip any whitespace.
    +  while (isspace(LastChar))
    +    LastChar = getchar();
    +
    +  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    +    IdentifierStr = LastChar;
    +    while (isalnum((LastChar = getchar())))
    +      IdentifierStr += LastChar;
    +
    +    if (IdentifierStr == "def") return tok_def;
    +    if (IdentifierStr == "extern") return tok_extern;
    +    return tok_identifier;
    +  }
    +
    +  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    +    std::string NumStr;
    +    do {
    +      NumStr += LastChar;
    +      LastChar = getchar();
    +    } while (isdigit(LastChar) || LastChar == '.');
    +
    +    NumVal = strtod(NumStr.c_str(), 0);
    +    return tok_number;
    +  }
    +
    +  if (LastChar == '#') {
    +    // Comment until end of line.
    +    do LastChar = getchar();
    +    while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
    +    
    +    if (LastChar != EOF)
    +      return gettok();
    +  }
    +  
    +  // Check for end of file.  Don't eat the EOF.
    +  if (LastChar == EOF)
    +    return tok_eof;
    +
    +  // Otherwise, just return the character as its ascii value.
    +  int ThisChar = LastChar;
    +  LastChar = getchar();
    +  return ThisChar;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Abstract Syntax Tree (aka Parse Tree)
    +//===----------------------------------------------------------------------===//
    +
    +/// ExprAST - Base class for all expression nodes.
    +class ExprAST {
    +public:
    +  virtual ~ExprAST() {}
    +  virtual Value *Codegen() = 0;
    +};
    +
    +/// NumberExprAST - Expression class for numeric literals like "1.0".
    +class NumberExprAST : public ExprAST {
    +  double Val;
    +public:
    +  NumberExprAST(double val) : Val(val) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// VariableExprAST - Expression class for referencing a variable, like "a".
    +class VariableExprAST : public ExprAST {
    +  std::string Name;
    +public:
    +  VariableExprAST(const std::string &name) : Name(name) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// BinaryExprAST - Expression class for a binary operator.
    +class BinaryExprAST : public ExprAST {
    +  char Op;
    +  ExprAST *LHS, *RHS;
    +public:
    +  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) 
    +    : Op(op), LHS(lhs), RHS(rhs) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// CallExprAST - Expression class for function calls.
    +class CallExprAST : public ExprAST {
    +  std::string Callee;
    +  std::vector<ExprAST*> Args;
    +public:
    +  CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
    +    : Callee(callee), Args(args) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// PrototypeAST - This class represents the "prototype" for a function,
    +/// which captures its name, and its argument names (thus implicitly the number
    +/// of arguments the function takes).
    +class PrototypeAST {
    +  std::string Name;
    +  std::vector<std::string> Args;
    +public:
    +  PrototypeAST(const std::string &name, const std::vector<std::string> &args)
    +    : Name(name), Args(args) {}
    +  
    +  Function *Codegen();
    +};
    +
    +/// FunctionAST - This class represents a function definition itself.
    +class FunctionAST {
    +  PrototypeAST *Proto;
    +  ExprAST *Body;
    +public:
    +  FunctionAST(PrototypeAST *proto, ExprAST *body)
    +    : Proto(proto), Body(body) {}
    +  
    +  Function *Codegen();
    +};
    +
    +//===----------------------------------------------------------------------===//
    +// Parser
    +//===----------------------------------------------------------------------===//
    +
    +/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
    +/// token the parser is looking at.  getNextToken reads another token from the
    +/// lexer and updates CurTok with its results.
    +static int CurTok;
    +static int getNextToken() {
    +  return CurTok = gettok();
    +}
    +
    +/// BinopPrecedence - This holds the precedence for each binary operator that is
    +/// defined.
    +static std::map<char, int> BinopPrecedence;
    +
    +/// GetTokPrecedence - Get the precedence of the pending binary operator token.
    +static int GetTokPrecedence() {
    +  if (!isascii(CurTok))
    +    return -1;
    +  
    +  // Make sure it's a declared binop.
    +  int TokPrec = BinopPrecedence[CurTok];
    +  if (TokPrec <= 0) return -1;
    +  return TokPrec;
    +}
    +
    +/// Error* - These are little helper functions for error handling.
    +ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
    +PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
    +FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
    +
    +static ExprAST *ParseExpression();
    +
    +/// identifierexpr
    +///   ::= identifier
    +///   ::= identifier '(' expression* ')'
    +static ExprAST *ParseIdentifierExpr() {
    +  std::string IdName = IdentifierStr;
    +  
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '(') // Simple variable ref.
    +    return new VariableExprAST(IdName);
    +  
    +  // Call.
    +  getNextToken();  // eat (
    +  std::vector<ExprAST*> Args;
    +  if (CurTok != ')') {
    +    while (1) {
    +      ExprAST *Arg = ParseExpression();
    +      if (!Arg) return 0;
    +      Args.push_back(Arg);
    +
    +      if (CurTok == ')') break;
    +
    +      if (CurTok != ',')
    +        return Error("Expected ')' or ',' in argument list");
    +      getNextToken();
    +    }
    +  }
    +
    +  // Eat the ')'.
    +  getNextToken();
    +  
    +  return new CallExprAST(IdName, Args);
    +}
    +
    +/// numberexpr ::= number
    +static ExprAST *ParseNumberExpr() {
    +  ExprAST *Result = new NumberExprAST(NumVal);
    +  getNextToken(); // consume the number
    +  return Result;
    +}
    +
    +/// parenexpr ::= '(' expression ')'
    +static ExprAST *ParseParenExpr() {
    +  getNextToken();  // eat (.
    +  ExprAST *V = ParseExpression();
    +  if (!V) return 0;
    +  
    +  if (CurTok != ')')
    +    return Error("expected ')'");
    +  getNextToken();  // eat ).
    +  return V;
    +}
    +
    +/// primary
    +///   ::= identifierexpr
    +///   ::= numberexpr
    +///   ::= parenexpr
    +static ExprAST *ParsePrimary() {
    +  switch (CurTok) {
    +  default: return Error("unknown token when expecting an expression");
    +  case tok_identifier: return ParseIdentifierExpr();
    +  case tok_number:     return ParseNumberExpr();
    +  case '(':            return ParseParenExpr();
    +  }
    +}
    +
    +/// binoprhs
    +///   ::= ('+' primary)*
    +static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
    +  // If this is a binop, find its precedence.
    +  while (1) {
    +    int TokPrec = GetTokPrecedence();
    +    
    +    // If this is a binop that binds at least as tightly as the current binop,
    +    // consume it, otherwise we are done.
    +    if (TokPrec < ExprPrec)
    +      return LHS;
    +    
    +    // Okay, we know this is a binop.
    +    int BinOp = CurTok;
    +    getNextToken();  // eat binop
    +    
    +    // Parse the primary expression after the binary operator.
    +    ExprAST *RHS = ParsePrimary();
    +    if (!RHS) return 0;
    +    
    +    // If BinOp binds less tightly with RHS than the operator after RHS, let
    +    // the pending operator take RHS as its LHS.
    +    int NextPrec = GetTokPrecedence();
    +    if (TokPrec < NextPrec) {
    +      RHS = ParseBinOpRHS(TokPrec+1, RHS);
    +      if (RHS == 0) return 0;
    +    }
    +    
    +    // Merge LHS/RHS.
    +    LHS = new BinaryExprAST(BinOp, LHS, RHS);
    +  }
    +}
    +
    +/// expression
    +///   ::= primary binoprhs
    +///
    +static ExprAST *ParseExpression() {
    +  ExprAST *LHS = ParsePrimary();
    +  if (!LHS) return 0;
    +  
    +  return ParseBinOpRHS(0, LHS);
    +}
    +
    +/// prototype
    +///   ::= id '(' id* ')'
    +static PrototypeAST *ParsePrototype() {
    +  if (CurTok != tok_identifier)
    +    return ErrorP("Expected function name in prototype");
    +
    +  std::string FnName = IdentifierStr;
    +  getNextToken();
    +  
    +  if (CurTok != '(')
    +    return ErrorP("Expected '(' in prototype");
    +  
    +  std::vector<std::string> ArgNames;
    +  while (getNextToken() == tok_identifier)
    +    ArgNames.push_back(IdentifierStr);
    +  if (CurTok != ')')
    +    return ErrorP("Expected ')' in prototype");
    +  
    +  // success.
    +  getNextToken();  // eat ')'.
    +  
    +  return new PrototypeAST(FnName, ArgNames);
    +}
    +
    +/// definition ::= 'def' prototype expression
    +static FunctionAST *ParseDefinition() {
    +  getNextToken();  // eat def.
    +  PrototypeAST *Proto = ParsePrototype();
    +  if (Proto == 0) return 0;
    +
    +  if (ExprAST *E = ParseExpression())
    +    return new FunctionAST(Proto, E);
    +  return 0;
    +}
    +
    +/// toplevelexpr ::= expression
    +static FunctionAST *ParseTopLevelExpr() {
    +  if (ExprAST *E = ParseExpression()) {
    +    // Make an anonymous proto.
    +    PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
    +    return new FunctionAST(Proto, E);
    +  }
    +  return 0;
    +}
    +
    +/// external ::= 'extern' prototype
    +static PrototypeAST *ParseExtern() {
    +  getNextToken();  // eat extern.
    +  return ParsePrototype();
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Code Generation
    +//===----------------------------------------------------------------------===//
    +
    +static Module *TheModule;
    +static IRBuilder<> Builder(getGlobalContext());
    +static std::map<std::string, Value*> NamedValues;
    +static FunctionPassManager *TheFPM;
    +
    +Value *ErrorV(const char *Str) { Error(Str); return 0; }
    +
    +Value *NumberExprAST::Codegen() {
    +  return ConstantFP::get(getGlobalContext(), APFloat(Val));
    +}
    +
    +Value *VariableExprAST::Codegen() {
    +  // Look this variable up in the function.
    +  Value *V = NamedValues[Name];
    +  return V ? V : ErrorV("Unknown variable name");
    +}
    +
    +Value *BinaryExprAST::Codegen() {
    +  Value *L = LHS->Codegen();
    +  Value *R = RHS->Codegen();
    +  if (L == 0 || R == 0) return 0;
    +  
    +  switch (Op) {
    +  case '+': return Builder.CreateFAdd(L, R, "addtmp");
    +  case '-': return Builder.CreateFSub(L, R, "subtmp");
    +  case '*': return Builder.CreateFMul(L, R, "multmp");
    +  case '<':
    +    L = Builder.CreateFCmpULT(L, R, "cmptmp");
    +    // Convert bool 0/1 to double 0.0 or 1.0
    +    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
    +                                "booltmp");
    +  default: return ErrorV("invalid binary operator");
    +  }
    +}
    +
    +Value *CallExprAST::Codegen() {
    +  // Look up the name in the global module table.
    +  Function *CalleeF = TheModule->getFunction(Callee);
    +  if (CalleeF == 0)
    +    return ErrorV("Unknown function referenced");
    +  
    +  // If argument mismatch error.
    +  if (CalleeF->arg_size() != Args.size())
    +    return ErrorV("Incorrect # arguments passed");
    +
    +  std::vector<Value*> ArgsV;
    +  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
    +    ArgsV.push_back(Args[i]->Codegen());
    +    if (ArgsV.back() == 0) return 0;
    +  }
    +  
    +  return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
    +}
    +
    +Function *PrototypeAST::Codegen() {
    +  // Make the function type:  double(double,double) etc.
    +  std::vector<const Type*> Doubles(Args.size(),
    +                                   Type::getDoubleTy(getGlobalContext()));
    +  FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
    +                                       Doubles, false);
    +  
    +  Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
    +  
    +  // If F conflicted, there was already something named 'Name'.  If it has a
    +  // body, don't allow redefinition or reextern.
    +  if (F->getName() != Name) {
    +    // Delete the one we just made and get the existing one.
    +    F->eraseFromParent();
    +    F = TheModule->getFunction(Name);
    +    
    +    // If F already has a body, reject this.
    +    if (!F->empty()) {
    +      ErrorF("redefinition of function");
    +      return 0;
    +    }
    +    
    +    // If F took a different number of args, reject.
    +    if (F->arg_size() != Args.size()) {
    +      ErrorF("redefinition of function with different # args");
    +      return 0;
    +    }
    +  }
    +  
    +  // Set names for all arguments.
    +  unsigned Idx = 0;
    +  for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
    +       ++AI, ++Idx) {
    +    AI->setName(Args[Idx]);
    +    
    +    // Add arguments to variable symbol table.
    +    NamedValues[Args[Idx]] = AI;
    +  }
    +  
    +  return F;
    +}
    +
    +Function *FunctionAST::Codegen() {
    +  NamedValues.clear();
    +  
    +  Function *TheFunction = Proto->Codegen();
    +  if (TheFunction == 0)
    +    return 0;
    +  
    +  // Create a new basic block to start insertion into.
    +  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
    +  Builder.SetInsertPoint(BB);
    +  
    +  if (Value *RetVal = Body->Codegen()) {
    +    // Finish off the function.
    +    Builder.CreateRet(RetVal);
    +
    +    // Validate the generated code, checking for consistency.
    +    verifyFunction(*TheFunction);
    +
    +    // Optimize the function.
    +    TheFPM->run(*TheFunction);
    +    
    +    return TheFunction;
    +  }
    +  
    +  // Error reading body, remove function.
    +  TheFunction->eraseFromParent();
    +  return 0;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Top-Level parsing and JIT Driver
    +//===----------------------------------------------------------------------===//
    +
    +static ExecutionEngine *TheExecutionEngine;
    +
    +static void HandleDefinition() {
    +  if (FunctionAST *F = ParseDefinition()) {
    +    if (Function *LF = F->Codegen()) {
    +      fprintf(stderr, "Read function definition:");
    +      LF->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleExtern() {
    +  if (PrototypeAST *P = ParseExtern()) {
    +    if (Function *F = P->Codegen()) {
    +      fprintf(stderr, "Read extern: ");
    +      F->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleTopLevelExpression() {
    +  // Evaluate a top-level expression into an anonymous function.
    +  if (FunctionAST *F = ParseTopLevelExpr()) {
    +    if (Function *LF = F->Codegen()) {
    +      // JIT the function, returning a function pointer.
    +      void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
    +      
    +      // Cast it to the right type (takes no arguments, returns a double) so we
    +      // can call it as a native function.
    +      double (*FP)() = (double (*)())(intptr_t)FPtr;
    +      fprintf(stderr, "Evaluated to %f\n", FP());
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +/// top ::= definition | external | expression | ';'
    +static void MainLoop() {
    +  while (1) {
    +    fprintf(stderr, "ready> ");
    +    switch (CurTok) {
    +    case tok_eof:    return;
    +    case ';':        getNextToken(); break;  // ignore top-level semicolons.
    +    case tok_def:    HandleDefinition(); break;
    +    case tok_extern: HandleExtern(); break;
    +    default:         HandleTopLevelExpression(); break;
    +    }
    +  }
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// "Library" functions that can be "extern'd" from user code.
    +//===----------------------------------------------------------------------===//
    +
    +/// putchard - putchar that takes a double and returns 0.
    +extern "C" 
    +double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Main driver code.
    +//===----------------------------------------------------------------------===//
    +
    +int main() {
    +  InitializeNativeTarget();
    +  LLVMContext &Context = getGlobalContext();
    +
    +  // Install standard binary operators.
    +  // 1 is lowest precedence.
    +  BinopPrecedence['<'] = 10;
    +  BinopPrecedence['+'] = 20;
    +  BinopPrecedence['-'] = 20;
    +  BinopPrecedence['*'] = 40;  // highest.
    +
    +  // Prime the first token.
    +  fprintf(stderr, "ready> ");
    +  getNextToken();
    +
    +  // Make the module, which holds all the code.
    +  TheModule = new Module("my cool jit", Context);
    +
    +  // Create the JIT.  This takes ownership of the module.
    +  std::string ErrStr;
    +  TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).create();
    +  if (!TheExecutionEngine) {
    +    fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
    +    exit(1);
    +  }
    +
    +  FunctionPassManager OurFPM(TheModule);
    +
    +  // Set up the optimizer pipeline.  Start with registering info about how the
    +  // target lays out data structures.
    +  OurFPM.add(new TargetData(*TheExecutionEngine->getTargetData()));
    +  // Do simple "peephole" optimizations and bit-twiddling optzns.
    +  OurFPM.add(createInstructionCombiningPass());
    +  // Reassociate expressions.
    +  OurFPM.add(createReassociatePass());
    +  // Eliminate Common SubExpressions.
    +  OurFPM.add(createGVNPass());
    +  // Simplify the control flow graph (deleting unreachable blocks, etc).
    +  OurFPM.add(createCFGSimplificationPass());
    +
    +  OurFPM.doInitialization();
    +
    +  // Set the global so the code gen can use this.
    +  TheFPM = &OurFPM;
    +
    +  // Run the main "interpreter loop" now.
    +  MainLoop();
    +
    +  TheFPM = 0;
    +
    +  // Print out all of the generated code.
    +  TheModule->dump();
    +
    +  return 0;
    +}
    +
    +
    + +Next: Extending the language: control flow +
    + + +
    +
    + Chris Lattner
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-06-13 23:09:39 -0700 (Sun, 13 Jun 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/tutorial/LangImpl5-cfg.png URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/LangImpl5-cfg.png?rev=115556&view=auto ============================================================================== Binary file - no diff available. Propchange: www-releases/trunk/2.8/docs/tutorial/LangImpl5-cfg.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: www-releases/trunk/2.8/docs/tutorial/LangImpl5.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/LangImpl5.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/tutorial/LangImpl5.html (added) +++ www-releases/trunk/2.8/docs/tutorial/LangImpl5.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1777 @@ + + + + + Kaleidoscope: Extending the Language: Control Flow + + + + + + + +
    Kaleidoscope: Extending the Language: Control Flow
    + + + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    Welcome to Chapter 5 of the "Implementing a language with LLVM" tutorial. Parts 1-4 described the implementation of the simple Kaleidoscope language and included support for generating LLVM IR, followed by optimizations and a JIT compiler. Unfortunately, as presented, Kaleidoscope is mostly useless: it has no control flow other than call and return. This means that you can't have conditional branches in the code, significantly limiting its power. In this episode of "build that compiler", we'll extend Kaleidoscope to have an if/then/else expression plus a simple 'for' loop.

    + +
    + + + + + +
    + +

    Extending Kaleidoscope to support if/then/else is quite straightforward. It basically requires adding support for this "new" concept to the lexer, parser, AST, and LLVM code emitter. This example is nice, because it shows how easy it is to "grow" a language over time, incrementally extending it as new ideas are discovered.

    + +

    Before we get going on "how" we add this extension, let's talk about "what" we want. The basic idea is that we want to be able to write this sort of thing:

    + +
    +
    +def fib(x)
    +  if x < 3 then
    +    1
    +  else
    +    fib(x-1)+fib(x-2);
    +
    +
    + +

    In Kaleidoscope, every construct is an expression: there are no statements. As such, the if/then/else expression needs to return a value like any other. Since we're using a mostly functional form, we'll have it evaluate its conditional, then return the 'then' or 'else' value based on how the condition was resolved. This is very similar to the C "?:" expression.

    + +

    The semantics of the if/then/else expression is that it evaluates the condition to a boolean equality value: 0.0 is considered to be false and everything else is considered to be true. If the condition is true, the first subexpression is evaluated and returned; if the condition is false, the second subexpression is evaluated and returned. Since Kaleidoscope allows side-effects, this behavior is important to nail down.
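These semantics can be modeled in a few lines of plain C++. The sketch below is only an illustration of the rule, not code from the tutorial: IfThenElse, ThenBranch, ElseBranch, and the two counters are hypothetical names, and the branches are passed as function pointers so that only the selected one runs. NaN conditions are left aside here.

```cpp
#include <cassert>

typedef double (*Thunk)();

// Evaluate exactly one branch: 0.0 selects Else, any other value selects Then.
static double IfThenElse(double Cond, Thunk Then, Thunk Else) {
  assert(Then && Else && "both branches must be present");
  return Cond != 0.0 ? Then() : Else();
}

// Branches with a visible side effect, so we can observe which one ran.
static int ThenCalls = 0, ElseCalls = 0;
static double ThenBranch() { ++ThenCalls; return 1.0; }
static double ElseBranch() { ++ElseCalls; return 2.0; }
```

Calling IfThenElse(3.0, ThenBranch, ElseBranch) returns 1.0 and bumps only ThenCalls; calling it with 0.0 returns 2.0 and bumps only ElseCalls. That the unselected branch never runs is what makes the evaluation order matter once expressions can have side effects.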

    + +

    Now that we know what we "want", let's break this down into its constituent pieces.

    + +
    + + + + + + +
    + +

    The lexer extensions are straightforward. First we add new enum values for the relevant tokens:

    + +
    +
    +  // control
    +  tok_if = -6, tok_then = -7, tok_else = -8,
    +
    +
    + +

    Once we have that, we recognize the new keywords in the lexer. This is pretty simple stuff:

    + +
    +
    +    ...
    +    if (IdentifierStr == "def") return tok_def;
    +    if (IdentifierStr == "extern") return tok_extern;
    +    if (IdentifierStr == "if") return tok_if;
    +    if (IdentifierStr == "then") return tok_then;
    +    if (IdentifierStr == "else") return tok_else;
    +    return tok_identifier;
    +
    +
    + +
    + + + + + +
    + +

    To represent the new expression we add a new AST node for it:

    + +
    +
    +/// IfExprAST - Expression class for if/then/else.
    +class IfExprAST : public ExprAST {
    +  ExprAST *Cond, *Then, *Else;
    +public:
    +  IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
    +    : Cond(cond), Then(then), Else(_else) {}
    +  virtual Value *Codegen();
    +};
    +
    +
    + +

    The AST node just has pointers to the various subexpressions.

    + +
    + + + + + +
    + +

    Now that we have the relevant tokens coming from the lexer and we have the AST node to build, our parsing logic is relatively straightforward. First we define a new parsing function:

    + +
    +
    +/// ifexpr ::= 'if' expression 'then' expression 'else' expression
    +static ExprAST *ParseIfExpr() {
    +  getNextToken();  // eat the if.
    +  
    +  // condition.
    +  ExprAST *Cond = ParseExpression();
    +  if (!Cond) return 0;
    +  
    +  if (CurTok != tok_then)
    +    return Error("expected then");
    +  getNextToken();  // eat the then
    +  
    +  ExprAST *Then = ParseExpression();
    +  if (Then == 0) return 0;
    +  
    +  if (CurTok != tok_else)
    +    return Error("expected else");
    +  
    +  getNextToken();
    +  
    +  ExprAST *Else = ParseExpression();
    +  if (!Else) return 0;
    +  
    +  return new IfExprAST(Cond, Then, Else);
    +}
    +
    +
    + +

    Next we hook it up as a primary expression:

    + +
    +
    +static ExprAST *ParsePrimary() {
    +  switch (CurTok) {
    +  default: return Error("unknown token when expecting an expression");
    +  case tok_identifier: return ParseIdentifierExpr();
    +  case tok_number:     return ParseNumberExpr();
    +  case '(':            return ParseParenExpr();
    +  case tok_if:         return ParseIfExpr();
    +  }
    +}
    +
    +
    + +
    + + + + + +
    + +

    Now that we have it parsing and building the AST, the final piece is adding +LLVM code generation support. This is the most interesting part of the +if/then/else example, because this is where it starts to introduce new concepts. +All of the code above has been thoroughly described in previous chapters. +

    + +

    To motivate the code we want to produce, let's take a look at a simple +example. Consider:

    + +
    +
    +extern foo();
    +extern bar();
    +def baz(x) if x then foo() else bar();
    +
    +
    + +

    If you disable optimizations, the code you'll (soon) get from Kaleidoscope +looks like this:

    + +
    +
    +declare double @foo()
    +
    +declare double @bar()
    +
    +define double @baz(double %x) {
    +entry:
    +	%ifcond = fcmp one double %x, 0.000000e+00
    +	br i1 %ifcond, label %then, label %else
    +
    +then:		; preds = %entry
    +	%calltmp = call double @foo()
    +	br label %ifcont
    +
    +else:		; preds = %entry
    +	%calltmp1 = call double @bar()
    +	br label %ifcont
    +
    +ifcont:		; preds = %else, %then
    +	%iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
    +	ret double %iftmp
    +}
    +
    +
    + +

    To visualize the control flow graph, you can use a nifty feature of the LLVM +'opt' tool. If you put this LLVM IR +into "t.ll" and run "llvm-as < t.ll | opt -analyze -view-cfg", a window will pop up and you'll +see this graph:

    + +
    [Figure: Example CFG]
    + +

    Another way to get this is to call "F->viewCFG()" or +"F->viewCFGOnly()" (where F is a "Function*") either by +inserting actual calls into the code and recompiling or by calling these in the +debugger. LLVM has many nice features for visualizing various graphs.

    + +

    Getting back to the generated code, it is fairly simple: the entry block +evaluates the conditional expression ("x" in our case here) and compares the +result to 0.0 with the "fcmp one" +instruction ('one' is "Ordered and Not Equal"). Based on the result of this +expression, the code jumps to either the "then" or "else" blocks, which contain +the expressions for the true/false cases.
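To make the "Ordered and Not Equal" predicate concrete, here is a small C++ sketch of what "fcmp one" computes. The helper names fcmpONE and toBool are made up for illustration and are not LLVM APIs; they only model the predicate's truth table, including the NaN case.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of LLVM): models the result of
// "fcmp one double %a, %b". The predicate is true only when neither
// operand is NaN (ordered) and the operands differ (not equal).
bool fcmpONE(double a, double b) {
  if (std::isnan(a) || std::isnan(b))
    return false;  // unordered: at least one operand is NaN
  return a != b;
}

// Models how a Kaleidoscope double becomes a 1-bit truth value:
// any ordered value other than zero counts as true.
bool toBool(double cond) {
  return fcmpONE(cond, 0.0);
}
```

Under this model both 0.0 and -0.0 are false, and NaN is false as well, because the comparison is ordered.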

    + +

    Once the then/else blocks are finished executing, they both branch back to the +'ifcont' block to execute the code that happens after the if/then/else. In this +case the only thing left to do is to return to the caller of the function. The +question then becomes: how does the code know which expression to return?

    + +

    The answer to this question involves an important SSA operation: the +Phi +operation. If you're not familiar with SSA, the Wikipedia +article is a good introduction and there are various other introductions to +it available on your favorite search engine. The short version is that +"execution" of the Phi operation requires "remembering" which block control came +from. The Phi operation takes on the value corresponding to the input control +block. In this case, if control comes in from the "then" block, it gets the +value of "calltmp". If control comes from the "else" block, it gets the value +of "calltmp1".
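One way to picture the "remembering" is to hand-translate the IR for "baz" into C++ and track the predecessor block explicitly. This is only a mental model, not how LLVM executes code: the bodies of foo and bar are placeholders chosen for the example, and the variable pred is an invented stand-in for the incoming edge that the Phi consults.

```cpp
#include <cassert>
#include <string>

// Placeholder bodies for the two external functions in the example.
double foo() { return 1.0; }
double bar() { return 2.0; }

// Hand-written model of the IR for "baz". 'pred' records the block that
// control most recently left, which is what the phi needs to know.
double baz(double x) {
  std::string pred;
  double calltmp = 0.0, calltmp1 = 0.0;

  // entry: %ifcond = fcmp one double %x, 0.0 (ordered and not equal)
  bool ifcond = (x < 0.0 || x > 0.0);
  if (ifcond) {
    // then:
    calltmp = foo();
    pred = "then";
  } else {
    // else:
    calltmp1 = bar();
    pred = "else";
  }

  // ifcont: %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
  assert(pred == "then" || pred == "else");
  double iftmp = (pred == "then") ? calltmp : calltmp1;
  return iftmp;
}
```

With these placeholder bodies, a nonzero argument takes the "then" edge and yields the value of foo(), while zero takes the "else" edge and yields the value of bar().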

    + +

    At this point, you are probably starting to think "Oh no! This means my +simple and elegant front-end will have to start generating SSA form in order to +use LLVM!". Fortunately, this is not the case, and we strongly advise +not implementing an SSA construction algorithm in your front-end +unless there is an amazingly good reason to do so. In practice, there are two +sorts of values that float around in code written for your average imperative +programming language that might need Phi nodes:

    + +
      +
    1. Code that involves user variables: x = 1; x = x + 1;
    2. Values that are implicit in the structure of your AST, such as the Phi node +in this case.
    + +

    In Chapter 7 of this tutorial ("mutable +variables"), we'll talk about #1 +in depth. For now, just believe me that you don't need SSA construction to +handle this case. For #2, you have the choice of using the techniques that we will +describe for #1, or you can insert Phi nodes directly, if convenient. In this +case, it is really really easy to generate the Phi node, so we choose to do it +directly.

    + +

    Okay, enough of the motivation and overview, let's generate code!

    + +
    + + + + + +
    + +

    In order to generate code for this, we implement the Codegen method +for IfExprAST:

    + +
    +
    +Value *IfExprAST::Codegen() {
    +  Value *CondV = Cond->Codegen();
    +  if (CondV == 0) return 0;
    +  
    +  // Convert condition to a bool by comparing non-equal to 0.0.
    +  CondV = Builder.CreateFCmpONE(CondV, 
    +                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
    +                                "ifcond");
    +
    +
    + +

    This code is straightforward and similar to what we saw before. We emit the +expression for the condition, then compare that value to zero to get a truth +value as a 1-bit (bool) value.

    + +
    +
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +  
    +  // Create blocks for the then and else cases.  Insert the 'then' block at the
    +  // end of the function.
    +  BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
    +  BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
    +  BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
    +
    +  Builder.CreateCondBr(CondV, ThenBB, ElseBB);
    +
    +
    + +

    This code creates the basic blocks that are related to the if/then/else +statement, and correspond directly to the blocks in the example above. The +first line gets the current Function object that is being built. It +gets this by asking the builder for the current BasicBlock, and asking that +block for its "parent" (the function it is currently embedded into).

    + +

    Once it has that, it creates three blocks. Note that it passes "TheFunction" +into the constructor for the "then" block. This causes the constructor to +automatically insert the new block into the end of the specified function. The +other two blocks are created, but aren't yet inserted into the function.

    + +

    Once the blocks are created, we can emit the conditional branch that chooses +between them. Note that creating new blocks does not implicitly affect the +IRBuilder, so it is still inserting into the block that the condition +went into. Also note that it is creating a branch to the "then" block and the +"else" block, even though the "else" block isn't inserted into the function yet. +This is all ok: it is the standard way that LLVM supports forward +references.

    + +
    +
    +  // Emit then value.
    +  Builder.SetInsertPoint(ThenBB);
    +  
    +  Value *ThenV = Then->Codegen();
    +  if (ThenV == 0) return 0;
    +  
    +  Builder.CreateBr(MergeBB);
    +  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
    +  ThenBB = Builder.GetInsertBlock();
    +
    +
    + +

    After the conditional branch is inserted, we move the builder to start +inserting into the "then" block. Strictly speaking, this call moves the +insertion point to be at the end of the specified block. However, since the +"then" block is empty, it also starts out by inserting at the beginning of the +block. :)

    + +

    Once the insertion point is set, we recursively codegen the "then" expression +from the AST. To finish off the "then" block, we create an unconditional branch +to the merge block. One interesting (and very important) aspect of the LLVM IR +is that it requires all basic blocks +to be "terminated" with a control flow +instruction such as return or branch. This means that all control flow, +including fall-throughs, must be made explicit in the LLVM IR. If you +violate this rule, the verifier will emit an error.
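As a rough illustration of the termination rule, the following toy check walks a list of blocks and rejects any block that does not end in a branch or return. It is far simpler than LLVM's verifier and uses invented types (ToyBlock holds opcode names as strings); it is meant only to show the shape of the rule.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model, not LLVM's data structures: a block is a name plus a list of
// opcode strings in program order.
struct ToyBlock {
  std::string Name;
  std::vector<std::string> Opcodes;
};

// Only the two terminators mentioned in the text are modeled here.
bool isTerminator(const std::string &Opcode) {
  return Opcode == "br" || Opcode == "ret";
}

// Returns true when every block is non-empty and ends in a terminator.
bool allBlocksTerminated(const std::vector<ToyBlock> &Blocks) {
  for (const ToyBlock &BB : Blocks)
    if (BB.Opcodes.empty() || !isTerminator(BB.Opcodes.back()))
      return false;
  return true;
}
```

Applied to the four blocks of the "baz" example, the check passes; dropping the branch at the end of "then" makes it fail.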

    + +

    The final line here is quite subtle, but is very important. The basic issue +is that when we create the Phi node in the merge block, we need to set up the +block/value pairs that indicate how the Phi will work. Importantly, the Phi +node expects to have an entry for each predecessor of the block in the CFG. Why, +then, are we getting the current block when we just set it to ThenBB 5 lines +above? The problem is that the "Then" expression may actually itself change the +block that the Builder is emitting into if, for example, it contains a nested +"if/then/else" expression. Because calling Codegen recursively could +arbitrarily change the notion of the current block, we are required to get an +up-to-date value for code that will set up the Phi node.
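To see why the block has to be re-read after the recursive call, consider this toy model of the bookkeeping. ToyBuilder and emitIf are invented for illustration and track only block names; they are not LLVM classes, and the numbering of block names is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the builder: it remembers which blocks exist and which
// one is the current insertion block, nothing more.
struct ToyBuilder {
  std::vector<std::string> Blocks;  // blocks in creation order
  std::string Current;              // current insertion block

  std::string createBlock(const std::string &Name) {
    std::string Unique = Name + std::to_string(Blocks.size());
    Blocks.push_back(Unique);
    return Unique;
  }
  void setInsertPoint(const std::string &BB) { Current = BB; }
};

// Mimics the block handling of IfExprAST::Codegen. 'Depth' is how many
// further if/then/else expressions are nested inside the "then" branch.
// Returns the block that really branches to the merge block on the "then"
// side, which is the block a Phi entry would have to name.
std::string emitIf(ToyBuilder &B, int Depth) {
  std::string ThenBB = B.createBlock("then");
  std::string ElseBB = B.createBlock("else");
  std::string MergeBB = B.createBlock("ifcont");

  B.setInsertPoint(ThenBB);
  if (Depth > 0)
    emitIf(B, Depth - 1);           // leaves the builder in the inner merge block
  std::string ThenEnd = B.Current;  // analogous to ThenBB = Builder.GetInsertBlock()

  B.setInsertPoint(ElseBB);         // nothing is nested on the else side here
  B.setInsertPoint(MergeBB);
  return ThenEnd;
}
```

Without nesting, the "then" side ends in the block it started in. With one nested expression, it ends in the inner merge block, so using the original ThenBB in the Phi would name a block that is not actually a predecessor of the merge block.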

    + +
    +
    +  // Emit else block.
    +  TheFunction->getBasicBlockList().push_back(ElseBB);
    +  Builder.SetInsertPoint(ElseBB);
    +  
    +  Value *ElseV = Else->Codegen();
    +  if (ElseV == 0) return 0;
    +  
    +  Builder.CreateBr(MergeBB);
    +  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
    +  ElseBB = Builder.GetInsertBlock();
    +
    +
    + +

    Code generation for the 'else' block is basically identical to codegen for +the 'then' block. The only significant difference is the first line, which adds +the 'else' block to the function. Recall previously that the 'else' block was +created, but not added to the function. Now that the 'then' and 'else' blocks +are emitted, we can finish up with the merge code:

    + +
    +
    +  // Emit merge block.
    +  TheFunction->getBasicBlockList().push_back(MergeBB);
    +  Builder.SetInsertPoint(MergeBB);
    +  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()),
    +                                  "iftmp");
    +  
    +  PN->addIncoming(ThenV, ThenBB);
    +  PN->addIncoming(ElseV, ElseBB);
    +  return PN;
    +}
    +
    +
    + +

    The first two lines here are now familiar: the first adds the "merge" block +to the Function object (it was previously floating, like the else block above). +The second line changes the insertion point so that newly created code will go +into the "merge" block. Once that is done, we need to create the PHI node and +set up the block/value pairs for the PHI.

    + +

    Finally, the Codegen function returns the phi node as the value computed by +the if/then/else expression. In our example above, this returned value will +feed into the code for the top-level function, which will create the return +instruction.

    + +

    Overall, we now have the ability to execute conditional code in +Kaleidoscope. With this extension, Kaleidoscope is a fairly complete language +that can calculate a wide variety of numeric functions. Next up we'll add +another useful expression that is familiar from non-functional languages...

    + +
    + + + + + +
    + +

    Now that we know how to add basic control flow constructs to the language, +we have the tools to add more powerful things. Let's add something more +aggressive, a 'for' expression:

    + +
    +
    + extern putchard(char);
    + def printstar(n)
    +   for i = 1, i < n, 1.0 in
    +     putchard(42);  # ascii 42 = '*'
    +     
    + # print 100 '*' characters
    + printstar(100);
    +
    +
    + +

    This expression defines a new variable ("i" in this case) which iterates from +a starting value, while the condition ("i < n" in this case) is true, +incrementing by an optional step value ("1.0" in this case). If the step value +is omitted, it defaults to 1.0. While the condition is true, it executes its +body expression. Because we don't have anything better to return, we'll just +define the loop as always returning 0.0. In the future when we have mutable +variables, it will get more useful.

    + +

    As before, let's talk about the changes that we need to make to Kaleidoscope to +support this.

    + +
    + + + + + +
    + +

    The lexer extensions are the same sort of thing as for if/then/else:

    + +
    +
    +  ... in enum Token ...
    +  // control
    +  tok_if = -6, tok_then = -7, tok_else = -8,
    +  tok_for = -9, tok_in = -10
    +
    +  ... in gettok ...
    +  if (IdentifierStr == "def") return tok_def;
    +  if (IdentifierStr == "extern") return tok_extern;
    +  if (IdentifierStr == "if") return tok_if;
    +  if (IdentifierStr == "then") return tok_then;
    +  if (IdentifierStr == "else") return tok_else;
    +  if (IdentifierStr == "for") return tok_for;
    +  if (IdentifierStr == "in") return tok_in;
    +  return tok_identifier;
    +
    +
    + +
    + + + + + +
    + +

    The AST node is just as simple. It basically boils down to capturing +the variable name and the constituent expressions in the node.

    + +
    +
    +/// ForExprAST - Expression class for for/in.
    +class ForExprAST : public ExprAST {
    +  std::string VarName;
    +  ExprAST *Start, *End, *Step, *Body;
    +public:
    +  ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end,
    +             ExprAST *step, ExprAST *body)
    +    : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
    +  virtual Value *Codegen();
    +};
    +
    +
    + +
    + + + + + +
    + +

    The parser code is also fairly standard. The only interesting thing here is +handling of the optional step value. The parser code handles it by checking to +see if the second comma is present. If not, it sets the step value to null in +the AST node:

    + +
    +
    +/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
    +static ExprAST *ParseForExpr() {
    +  getNextToken();  // eat the for.
    +
    +  if (CurTok != tok_identifier)
    +    return Error("expected identifier after for");
    +  
    +  std::string IdName = IdentifierStr;
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '=')
    +    return Error("expected '=' after for");
    +  getNextToken();  // eat '='.
    +  
    +  
    +  ExprAST *Start = ParseExpression();
    +  if (Start == 0) return 0;
    +  if (CurTok != ',')
    +    return Error("expected ',' after for start value");
    +  getNextToken();
    +  
    +  ExprAST *End = ParseExpression();
    +  if (End == 0) return 0;
    +  
    +  // The step value is optional.
    +  ExprAST *Step = 0;
    +  if (CurTok == ',') {
    +    getNextToken();
    +    Step = ParseExpression();
    +    if (Step == 0) return 0;
    +  }
    +  
    +  if (CurTok != tok_in)
    +    return Error("expected 'in' after for");
    +  getNextToken();  // eat 'in'.
    +  
    +  ExprAST *Body = ParseExpression();
    +  if (Body == 0) return 0;
    +
    +  return new ForExprAST(IdName, Start, End, Step, Body);
    +}
    +
    +
    + +
    + + + + + +
    + +

    Now we get to the good part: the LLVM IR we want to generate for this thing. +With the simple example above, we get this LLVM IR (note that this dump is +generated with optimizations disabled for clarity): +

    + +
    +
    +declare double @putchard(double)
    +
    +define double @printstar(double %n) {
    +entry:
    +        ; initial value = 1.0 (inlined into phi)
    +	br label %loop
    +
    +loop:		; preds = %loop, %entry
    +	%i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
    +        ; body
    +	%calltmp = call double @putchard(double 4.200000e+01)
    +        ; increment
    +	%nextvar = fadd double %i, 1.000000e+00
    +
    +        ; termination test
    +	%cmptmp = fcmp ult double %i, %n
    +	%booltmp = uitofp i1 %cmptmp to double
    +	%loopcond = fcmp one double %booltmp, 0.000000e+00
    +	br i1 %loopcond, label %loop, label %afterloop
    +
    +afterloop:		; preds = %loop
    +        ; loop always returns 0.0
    +	ret double 0.000000e+00
    +}
    +
    +
    + +
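Read as ordinary control flow, the IR above runs the body once before the first test, and it evaluates the end condition with the value of the variable from before the increment. The following C++ sketch models that behavior. The names runFor and countStars are hypothetical, the end condition and body are passed as callbacks purely for the sake of the example, and NaN handling is ignored.

```cpp
#include <cassert>
#include <functional>

// Hypothetical model of the run-time behavior of the 'for' expression,
// following the block layout of the IR: body, increment, test, back edge.
double runFor(double Start,
              const std::function<bool(double)> &End,
              double Step,
              const std::function<void(double)> &Body) {
  double I = Start;
  bool Continue;
  do {
    Body(I);                 // body
    double Next = I + Step;  // %nextvar = fadd double %i, step
    Continue = End(I);       // the test still sees the pre-increment value
    I = Next;                // the phi picks up %nextvar on the back edge
  } while (Continue);
  return 0.0;                // the loop always returns 0.0
}

// Counts how often the body of "for i = 1, i < n, 1.0" would run.
int countStars(double N) {
  int Count = 0;
  runFor(1.0, [N](double I) { return I < N; }, 1.0,
         [&Count](double) { ++Count; });
  return Count;
}
```

Under this model printstar(100) runs its body 100 times, and a loop whose condition is false from the start still runs its body once.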

    This loop contains all the same constructs we saw before: a phi node, several +expressions, and some basic blocks. Let's see how this fits together.

    + +
    + + + + + +
    + +

    The first part of Codegen is very simple: we just output the start expression +for the loop value:

    + +
    +
    +Value *ForExprAST::Codegen() {
    +  // Emit the start code first, without 'variable' in scope.
    +  Value *StartVal = Start->Codegen();
    +  if (StartVal == 0) return 0;
    +
    +
    + +

    With this out of the way, the next step is to set up the LLVM basic block +for the start of the loop body. In the case above, the whole loop body is one +block, but remember that the body code itself could consist of multiple blocks +(e.g. if it contains an if/then/else or a for/in expression).

    + +
    +
    +  // Make the new basic block for the loop header, inserting after current
    +  // block.
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +  BasicBlock *PreheaderBB = Builder.GetInsertBlock();
    +  BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
    +  
    +  // Insert an explicit fall through from the current block to the LoopBB.
    +  Builder.CreateBr(LoopBB);
    +
    +
    + +

    This code is similar to what we saw for if/then/else. Because we will need +it to create the Phi node, we remember the block that falls through into the +loop. Once we have that, we create the actual block that starts the loop and +create an unconditional branch for the fall-through between the two blocks.

    + +
    +
    +  // Start insertion in LoopBB.
    +  Builder.SetInsertPoint(LoopBB);
    +  
    +  // Start the PHI node with an entry for Start.
    +  PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), VarName.c_str());
    +  Variable->addIncoming(StartVal, PreheaderBB);
    +
    +
    + +

    Now that the "preheader" for the loop is set up, we switch to emitting code +for the loop body. To begin with, we move the insertion point and create the +PHI node for the loop induction variable. Since we already know the incoming +value for the starting value, we add it to the Phi node. Note that the Phi will +eventually get a second value for the backedge, but we can't set it up yet +(because it doesn't exist!).

    + +
    +
    +  // Within the loop, the variable is defined equal to the PHI node.  If it
    +  // shadows an existing variable, we have to restore it, so save it now.
    +  Value *OldVal = NamedValues[VarName];
    +  NamedValues[VarName] = Variable;
    +  
    +  // Emit the body of the loop.  This, like any other expr, can change the
    +  // current BB.  Note that we ignore the value computed by the body, but don't
    +  // allow an error.
    +  if (Body->Codegen() == 0)
    +    return 0;
    +
    +
    + +

    Now the code starts to get more interesting. Our 'for' loop introduces a new +variable to the symbol table. This means that our symbol table can now contain +either function arguments or loop variables. To handle this, before we codegen +the body of the loop, we add the loop variable as the current value for its +name. Note that it is possible that there is a variable of the same name in the +outer scope. It would be easy to make this an error (emit an error and return +null if there is already an entry for VarName) but we choose to allow shadowing +of variables. In order to handle this correctly, we remember the Value that +we are potentially shadowing in OldVal (which will be null if there is +no shadowed variable).

    + +

    Once the loop variable is set into the symbol table, the code recursively +codegen's the body. This allows the body to use the loop variable: any +references to it will naturally find it in the symbol table.

    + +
    +
    +  // Emit the step value.
    +  Value *StepVal;
    +  if (Step) {
    +    StepVal = Step->Codegen();
    +    if (StepVal == 0) return 0;
    +  } else {
    +    // If not specified, use 1.0.
    +    StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
    +  }
    +  
    +  Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
    +
    +
    + +

    Now that the body is emitted, we compute the next value of the iteration +variable by adding the step value, or 1.0 if it isn't present. 'NextVar' +will be the value of the loop variable on the next iteration of the loop.

    + +
    +
    +  // Compute the end condition.
    +  Value *EndCond = End->Codegen();
    +  if (EndCond == 0) return EndCond;
    +  
    +  // Convert condition to a bool by comparing non-equal to 0.0.
    +  EndCond = Builder.CreateFCmpONE(EndCond, 
    +                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
    +                                  "loopcond");
    +
    +
    + +

    Finally, we evaluate the exit value of the loop, to determine whether the +loop should exit. This mirrors the condition evaluation for the if/then/else +statement.

    + +
    +
    +  // Create the "after loop" block and insert it.
    +  BasicBlock *LoopEndBB = Builder.GetInsertBlock();
    +  BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
    +  
    +  // Insert the conditional branch into the end of LoopEndBB.
    +  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
    +  
    +  // Any new code will be inserted in AfterBB.
    +  Builder.SetInsertPoint(AfterBB);
    +
    +
    + +

    With the code for the body of the loop complete, we just need to finish up +the control flow for it. This code remembers the end block (for the phi node), then creates the block for the loop exit ("afterloop"). Based on the value of the +exit condition, it creates a conditional branch that chooses between executing +the loop again and exiting the loop. Any future code is emitted in the +"afterloop" block, so it sets the insertion position to it.

    + +
    +
    +  // Add a new entry to the PHI node for the backedge.
    +  Variable->addIncoming(NextVar, LoopEndBB);
    +  
    +  // Restore the unshadowed variable.
    +  if (OldVal)
    +    NamedValues[VarName] = OldVal;
    +  else
    +    NamedValues.erase(VarName);
    +  
    +  // for expr always returns 0.0.
    +  return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
    +}
    +
    +
    + +

    The final code handles various cleanups: now that we have the "NextVar" +value, we can add the incoming value to the loop PHI node. After that, we +remove the loop variable from the symbol table, so that it isn't in scope after +the for loop. Finally, code generation of the for loop always returns 0.0, so +that is what we return from ForExprAST::Codegen.
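The save-and-restore pattern used for the loop variable can be isolated in a few lines. The sketch below uses a toy symbol table that maps names to pointers to double instead of LLVM Value pointers; ToyNamedValues and withShadowedBinding are invented names, and the callback stands in for codegen of the loop body.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy symbol table: a null pointer means "no value", mirroring how a map of
// Value pointers behaves when a name is looked up with operator[].
static std::map<std::string, const double *> ToyNamedValues;

// Binds 'Name' to 'NewVal' while 'Body' runs, then restores whatever was
// bound before, erasing the entry if the name was previously unbound.
template <typename Fn>
void withShadowedBinding(const std::string &Name, const double *NewVal,
                         Fn Body) {
  // operator[] default-inserts a null pointer when Name is absent, so
  // OldVal is null when the name was unbound.
  const double *OldVal = ToyNamedValues[Name];
  ToyNamedValues[Name] = NewVal;

  Body();

  if (OldVal)
    ToyNamedValues[Name] = OldVal;  // restore the shadowed binding
  else
    ToyNamedValues.erase(Name);     // the name goes out of scope again
}
```

The erase in the else branch matters: because the lookup with operator[] inserted a null entry, simply leaving the map alone would keep a dangling name in the table after the loop.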

    + +

    With this, we conclude the "adding control flow to Kaleidoscope" chapter of +the tutorial. In this chapter we added two control flow constructs, and used them to motivate a couple of aspects of the LLVM IR that are important for front-end implementors +to know. In the next chapter of our saga, we will get a bit crazier and add +user-defined operators to our poor innocent +language.

    + +
    + + + + + +
    + +

    +Here is the complete code listing for our running example, enhanced with the +if/then/else and for expressions. To build this example, use: +

    + +
    +
    +   # Compile
    +   g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
    +   # Run
    +   ./toy
    +
    +
    + +

    Here is the code:

    + +
    +
    +#include "llvm/DerivedTypes.h"
    +#include "llvm/ExecutionEngine/ExecutionEngine.h"
    +#include "llvm/ExecutionEngine/JIT.h"
    +#include "llvm/LLVMContext.h"
    +#include "llvm/Module.h"
    +#include "llvm/PassManager.h"
    +#include "llvm/Analysis/Verifier.h"
    +#include "llvm/Target/TargetData.h"
    +#include "llvm/Target/TargetSelect.h"
    +#include "llvm/Transforms/Scalar.h"
    +#include "llvm/Support/IRBuilder.h"
    +#include <cstdio>
    +#include <string>
    +#include <map>
    +#include <vector>
    +using namespace llvm;
    +
    +//===----------------------------------------------------------------------===//
    +// Lexer
    +//===----------------------------------------------------------------------===//
    +
    +// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
    +// of these for known things.
    +enum Token {
    +  tok_eof = -1,
    +
    +  // commands
    +  tok_def = -2, tok_extern = -3,
    +
    +  // primary
    +  tok_identifier = -4, tok_number = -5,
    +  
    +  // control
    +  tok_if = -6, tok_then = -7, tok_else = -8,
    +  tok_for = -9, tok_in = -10
    +};
    +
    +static std::string IdentifierStr;  // Filled in if tok_identifier
    +static double NumVal;              // Filled in if tok_number
    +
    +/// gettok - Return the next token from standard input.
    +static int gettok() {
    +  static int LastChar = ' ';
    +
    +  // Skip any whitespace.
    +  while (isspace(LastChar))
    +    LastChar = getchar();
    +
    +  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    +    IdentifierStr = LastChar;
    +    while (isalnum((LastChar = getchar())))
    +      IdentifierStr += LastChar;
    +
    +    if (IdentifierStr == "def") return tok_def;
    +    if (IdentifierStr == "extern") return tok_extern;
    +    if (IdentifierStr == "if") return tok_if;
    +    if (IdentifierStr == "then") return tok_then;
    +    if (IdentifierStr == "else") return tok_else;
    +    if (IdentifierStr == "for") return tok_for;
    +    if (IdentifierStr == "in") return tok_in;
    +    return tok_identifier;
    +  }
    +
    +  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    +    std::string NumStr;
    +    do {
    +      NumStr += LastChar;
    +      LastChar = getchar();
    +    } while (isdigit(LastChar) || LastChar == '.');
    +
    +    NumVal = strtod(NumStr.c_str(), 0);
    +    return tok_number;
    +  }
    +
    +  if (LastChar == '#') {
    +    // Comment until end of line.
    +    do LastChar = getchar();
    +    while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
    +    
    +    if (LastChar != EOF)
    +      return gettok();
    +  }
    +  
    +  // Check for end of file.  Don't eat the EOF.
    +  if (LastChar == EOF)
    +    return tok_eof;
    +
    +  // Otherwise, just return the character as its ascii value.
    +  int ThisChar = LastChar;
    +  LastChar = getchar();
    +  return ThisChar;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Abstract Syntax Tree (aka Parse Tree)
    +//===----------------------------------------------------------------------===//
    +
    +/// ExprAST - Base class for all expression nodes.
    +class ExprAST {
    +public:
    +  virtual ~ExprAST() {}
    +  virtual Value *Codegen() = 0;
    +};
    +
    +/// NumberExprAST - Expression class for numeric literals like "1.0".
    +class NumberExprAST : public ExprAST {
    +  double Val;
    +public:
    +  NumberExprAST(double val) : Val(val) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// VariableExprAST - Expression class for referencing a variable, like "a".
    +class VariableExprAST : public ExprAST {
    +  std::string Name;
    +public:
    +  VariableExprAST(const std::string &name) : Name(name) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// BinaryExprAST - Expression class for a binary operator.
    +class BinaryExprAST : public ExprAST {
    +  char Op;
    +  ExprAST *LHS, *RHS;
    +public:
    +  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) 
    +    : Op(op), LHS(lhs), RHS(rhs) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// CallExprAST - Expression class for function calls.
    +class CallExprAST : public ExprAST {
    +  std::string Callee;
    +  std::vector<ExprAST*> Args;
    +public:
    +  CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
    +    : Callee(callee), Args(args) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// IfExprAST - Expression class for if/then/else.
    +class IfExprAST : public ExprAST {
    +  ExprAST *Cond, *Then, *Else;
    +public:
    +  IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
    +  : Cond(cond), Then(then), Else(_else) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// ForExprAST - Expression class for for/in.
    +class ForExprAST : public ExprAST {
    +  std::string VarName;
    +  ExprAST *Start, *End, *Step, *Body;
    +public:
    +  ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end,
    +             ExprAST *step, ExprAST *body)
    +    : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// PrototypeAST - This class represents the "prototype" for a function,
    +/// which captures its name, and its argument names (thus implicitly the number
    +/// of arguments the function takes).
    +class PrototypeAST {
    +  std::string Name;
    +  std::vector<std::string> Args;
    +public:
    +  PrototypeAST(const std::string &name, const std::vector<std::string> &args)
    +    : Name(name), Args(args) {}
    +  
    +  Function *Codegen();
    +};
    +
    +/// FunctionAST - This class represents a function definition itself.
    +class FunctionAST {
    +  PrototypeAST *Proto;
    +  ExprAST *Body;
    +public:
    +  FunctionAST(PrototypeAST *proto, ExprAST *body)
    +    : Proto(proto), Body(body) {}
    +  
    +  Function *Codegen();
    +};
    +
    +//===----------------------------------------------------------------------===//
    +// Parser
    +//===----------------------------------------------------------------------===//
    +
    +/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
    +/// token the parser is looking at.  getNextToken reads another token from the
    +/// lexer and updates CurTok with its results.
    +static int CurTok;
    +static int getNextToken() {
    +  return CurTok = gettok();
    +}
    +
    +/// BinopPrecedence - This holds the precedence for each binary operator that is
    +/// defined.
    +static std::map<char, int> BinopPrecedence;
    +
    +/// GetTokPrecedence - Get the precedence of the pending binary operator token.
    +static int GetTokPrecedence() {
    +  if (!isascii(CurTok))
    +    return -1;
    +  
    +  // Make sure it's a declared binop.
    +  int TokPrec = BinopPrecedence[CurTok];
    +  if (TokPrec <= 0) return -1;
    +  return TokPrec;
    +}
    +
    +/// Error* - These are little helper functions for error handling.
    +ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
    +PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
    +FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
    +
    +static ExprAST *ParseExpression();
    +
    +/// identifierexpr
    +///   ::= identifier
    +///   ::= identifier '(' expression* ')'
    +static ExprAST *ParseIdentifierExpr() {
    +  std::string IdName = IdentifierStr;
    +  
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '(') // Simple variable ref.
    +    return new VariableExprAST(IdName);
    +  
    +  // Call.
    +  getNextToken();  // eat (
    +  std::vector<ExprAST*> Args;
    +  if (CurTok != ')') {
    +    while (1) {
    +      ExprAST *Arg = ParseExpression();
    +      if (!Arg) return 0;
    +      Args.push_back(Arg);
    +
    +      if (CurTok == ')') break;
    +
    +      if (CurTok != ',')
    +        return Error("Expected ')' or ',' in argument list");
    +      getNextToken();
    +    }
    +  }
    +
    +  // Eat the ')'.
    +  getNextToken();
    +  
    +  return new CallExprAST(IdName, Args);
    +}
    +
    +/// numberexpr ::= number
    +static ExprAST *ParseNumberExpr() {
    +  ExprAST *Result = new NumberExprAST(NumVal);
    +  getNextToken(); // consume the number
    +  return Result;
    +}
    +
    +/// parenexpr ::= '(' expression ')'
    +static ExprAST *ParseParenExpr() {
    +  getNextToken();  // eat (.
    +  ExprAST *V = ParseExpression();
    +  if (!V) return 0;
    +  
    +  if (CurTok != ')')
    +    return Error("expected ')'");
    +  getNextToken();  // eat ).
    +  return V;
    +}
    +
    +/// ifexpr ::= 'if' expression 'then' expression 'else' expression
    +static ExprAST *ParseIfExpr() {
    +  getNextToken();  // eat the if.
    +  
    +  // condition.
    +  ExprAST *Cond = ParseExpression();
    +  if (!Cond) return 0;
    +  
    +  if (CurTok != tok_then)
    +    return Error("expected then");
    +  getNextToken();  // eat the then
    +  
    +  ExprAST *Then = ParseExpression();
    +  if (Then == 0) return 0;
    +  
    +  if (CurTok != tok_else)
    +    return Error("expected else");
    +  
    +  getNextToken();
    +  
    +  ExprAST *Else = ParseExpression();
    +  if (!Else) return 0;
    +  
    +  return new IfExprAST(Cond, Then, Else);
    +}
    +
    +/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
    +static ExprAST *ParseForExpr() {
    +  getNextToken();  // eat the for.
    +
    +  if (CurTok != tok_identifier)
    +    return Error("expected identifier after for");
    +  
    +  std::string IdName = IdentifierStr;
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '=')
    +    return Error("expected '=' after for");
    +  getNextToken();  // eat '='.
    +  
    +  
    +  ExprAST *Start = ParseExpression();
    +  if (Start == 0) return 0;
    +  if (CurTok != ',')
    +    return Error("expected ',' after for start value");
    +  getNextToken();
    +  
    +  ExprAST *End = ParseExpression();
    +  if (End == 0) return 0;
    +  
    +  // The step value is optional.
    +  ExprAST *Step = 0;
    +  if (CurTok == ',') {
    +    getNextToken();
    +    Step = ParseExpression();
    +    if (Step == 0) return 0;
    +  }
    +  
    +  if (CurTok != tok_in)
    +    return Error("expected 'in' after for");
    +  getNextToken();  // eat 'in'.
    +  
    +  ExprAST *Body = ParseExpression();
    +  if (Body == 0) return 0;
    +
    +  return new ForExprAST(IdName, Start, End, Step, Body);
    +}
    +
    +/// primary
    +///   ::= identifierexpr
    +///   ::= numberexpr
    +///   ::= parenexpr
    +///   ::= ifexpr
    +///   ::= forexpr
    +static ExprAST *ParsePrimary() {
    +  switch (CurTok) {
    +  default: return Error("unknown token when expecting an expression");
    +  case tok_identifier: return ParseIdentifierExpr();
    +  case tok_number:     return ParseNumberExpr();
    +  case '(':            return ParseParenExpr();
    +  case tok_if:         return ParseIfExpr();
    +  case tok_for:        return ParseForExpr();
    +  }
    +}
    +
    +/// binoprhs
    +///   ::= ('+' primary)*
    +static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
    +  // If this is a binop, find its precedence.
    +  while (1) {
    +    int TokPrec = GetTokPrecedence();
    +    
    +    // If this is a binop that binds at least as tightly as the current binop,
    +    // consume it, otherwise we are done.
    +    if (TokPrec < ExprPrec)
    +      return LHS;
    +    
    +    // Okay, we know this is a binop.
    +    int BinOp = CurTok;
    +    getNextToken();  // eat binop
    +    
    +    // Parse the primary expression after the binary operator.
    +    ExprAST *RHS = ParsePrimary();
    +    if (!RHS) return 0;
    +    
    +    // If BinOp binds less tightly with RHS than the operator after RHS, let
    +    // the pending operator take RHS as its LHS.
    +    int NextPrec = GetTokPrecedence();
    +    if (TokPrec < NextPrec) {
    +      RHS = ParseBinOpRHS(TokPrec+1, RHS);
    +      if (RHS == 0) return 0;
    +    }
    +    
    +    // Merge LHS/RHS.
    +    LHS = new BinaryExprAST(BinOp, LHS, RHS);
    +  }
    +}
    +
    +/// expression
    +///   ::= primary binoprhs
    +///
    +static ExprAST *ParseExpression() {
    +  ExprAST *LHS = ParsePrimary();
    +  if (!LHS) return 0;
    +  
    +  return ParseBinOpRHS(0, LHS);
    +}
    +
    +/// prototype
    +///   ::= id '(' id* ')'
    +static PrototypeAST *ParsePrototype() {
    +  if (CurTok != tok_identifier)
    +    return ErrorP("Expected function name in prototype");
    +
    +  std::string FnName = IdentifierStr;
    +  getNextToken();
    +  
    +  if (CurTok != '(')
    +    return ErrorP("Expected '(' in prototype");
    +  
    +  std::vector<std::string> ArgNames;
    +  while (getNextToken() == tok_identifier)
    +    ArgNames.push_back(IdentifierStr);
    +  if (CurTok != ')')
    +    return ErrorP("Expected ')' in prototype");
    +  
    +  // success.
    +  getNextToken();  // eat ')'.
    +  
    +  return new PrototypeAST(FnName, ArgNames);
    +}
    +
    +/// definition ::= 'def' prototype expression
    +static FunctionAST *ParseDefinition() {
    +  getNextToken();  // eat def.
    +  PrototypeAST *Proto = ParsePrototype();
    +  if (Proto == 0) return 0;
    +
    +  if (ExprAST *E = ParseExpression())
    +    return new FunctionAST(Proto, E);
    +  return 0;
    +}
    +
    +/// toplevelexpr ::= expression
    +static FunctionAST *ParseTopLevelExpr() {
    +  if (ExprAST *E = ParseExpression()) {
    +    // Make an anonymous proto.
    +    PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
    +    return new FunctionAST(Proto, E);
    +  }
    +  return 0;
    +}
    +
    +/// external ::= 'extern' prototype
    +static PrototypeAST *ParseExtern() {
    +  getNextToken();  // eat extern.
    +  return ParsePrototype();
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Code Generation
    +//===----------------------------------------------------------------------===//
    +
    +static Module *TheModule;
    +static IRBuilder<> Builder(getGlobalContext());
    +static std::map<std::string, Value*> NamedValues;
    +static FunctionPassManager *TheFPM;
    +
    +Value *ErrorV(const char *Str) { Error(Str); return 0; }
    +
    +Value *NumberExprAST::Codegen() {
    +  return ConstantFP::get(getGlobalContext(), APFloat(Val));
    +}
    +
    +Value *VariableExprAST::Codegen() {
    +  // Look this variable up in the function.
    +  Value *V = NamedValues[Name];
    +  return V ? V : ErrorV("Unknown variable name");
    +}
    +
    +Value *BinaryExprAST::Codegen() {
    +  Value *L = LHS->Codegen();
    +  Value *R = RHS->Codegen();
    +  if (L == 0 || R == 0) return 0;
    +  
    +  switch (Op) {
    +  case '+': return Builder.CreateFAdd(L, R, "addtmp");
    +  case '-': return Builder.CreateFSub(L, R, "subtmp");
    +  case '*': return Builder.CreateFMul(L, R, "multmp");
    +  case '<':
    +    L = Builder.CreateFCmpULT(L, R, "cmptmp");
    +    // Convert bool 0/1 to double 0.0 or 1.0
    +    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
    +                                "booltmp");
    +  default: return ErrorV("invalid binary operator");
    +  }
    +}
    +
    +Value *CallExprAST::Codegen() {
    +  // Look up the name in the global module table.
    +  Function *CalleeF = TheModule->getFunction(Callee);
    +  if (CalleeF == 0)
    +    return ErrorV("Unknown function referenced");
    +  
    +  // If argument mismatch error.
    +  if (CalleeF->arg_size() != Args.size())
    +    return ErrorV("Incorrect # arguments passed");
    +
    +  std::vector<Value*> ArgsV;
    +  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
    +    ArgsV.push_back(Args[i]->Codegen());
    +    if (ArgsV.back() == 0) return 0;
    +  }
    +  
    +  return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
    +}
    +
    +Value *IfExprAST::Codegen() {
    +  Value *CondV = Cond->Codegen();
    +  if (CondV == 0) return 0;
    +  
    +  // Convert condition to a bool by comparing non-equal to 0.0.
    +  CondV = Builder.CreateFCmpONE(CondV, 
    +                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
    +                                "ifcond");
    +  
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +  
    +  // Create blocks for the then and else cases.  Insert the 'then' block at the
    +  // end of the function.
    +  BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
    +  BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
    +  BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
    +  
    +  Builder.CreateCondBr(CondV, ThenBB, ElseBB);
    +  
    +  // Emit then value.
    +  Builder.SetInsertPoint(ThenBB);
    +  
    +  Value *ThenV = Then->Codegen();
    +  if (ThenV == 0) return 0;
    +  
    +  Builder.CreateBr(MergeBB);
    +  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
    +  ThenBB = Builder.GetInsertBlock();
    +  
    +  // Emit else block.
    +  TheFunction->getBasicBlockList().push_back(ElseBB);
    +  Builder.SetInsertPoint(ElseBB);
    +  
    +  Value *ElseV = Else->Codegen();
    +  if (ElseV == 0) return 0;
    +  
    +  Builder.CreateBr(MergeBB);
    +  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
    +  ElseBB = Builder.GetInsertBlock();
    +  
    +  // Emit merge block.
    +  TheFunction->getBasicBlockList().push_back(MergeBB);
    +  Builder.SetInsertPoint(MergeBB);
    +  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()),
    +                                  "iftmp");
    +  
    +  PN->addIncoming(ThenV, ThenBB);
    +  PN->addIncoming(ElseV, ElseBB);
    +  return PN;
    +}
    +
    +Value *ForExprAST::Codegen() {
    +  // Output this as:
    +  //   ...
    +  //   start = startexpr
    +  //   goto loop
    +  // loop: 
    +  //   variable = phi [start, loopheader], [nextvariable, loopend]
    +  //   ...
    +  //   bodyexpr
    +  //   ...
    +  // loopend:
    +  //   step = stepexpr
    +  //   nextvariable = variable + step
    +  //   endcond = endexpr
    +  //   br endcond, loop, outloop
    +  // outloop:
    +  
    +  // Emit the start code first, without 'variable' in scope.
    +  Value *StartVal = Start->Codegen();
    +  if (StartVal == 0) return 0;
    +  
    +  // Make the new basic block for the loop header, inserting after current
    +  // block.
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +  BasicBlock *PreheaderBB = Builder.GetInsertBlock();
    +  BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
    +  
    +  // Insert an explicit fall through from the current block to the LoopBB.
    +  Builder.CreateBr(LoopBB);
    +
    +  // Start insertion in LoopBB.
    +  Builder.SetInsertPoint(LoopBB);
    +  
    +  // Start the PHI node with an entry for Start.
    +  PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), VarName.c_str());
    +  Variable->addIncoming(StartVal, PreheaderBB);
    +  
    +  // Within the loop, the variable is defined equal to the PHI node.  If it
    +  // shadows an existing variable, we have to restore it, so save it now.
    +  Value *OldVal = NamedValues[VarName];
    +  NamedValues[VarName] = Variable;
    +  
    +  // Emit the body of the loop.  This, like any other expr, can change the
    +  // current BB.  Note that we ignore the value computed by the body, but don't
    +  // allow an error.
    +  if (Body->Codegen() == 0)
    +    return 0;
    +  
    +  // Emit the step value.
    +  Value *StepVal;
    +  if (Step) {
    +    StepVal = Step->Codegen();
    +    if (StepVal == 0) return 0;
    +  } else {
    +    // If not specified, use 1.0.
    +    StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
    +  }
    +  
    +  Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
    +
    +  // Compute the end condition.
    +  Value *EndCond = End->Codegen();
    +  if (EndCond == 0) return EndCond;
    +  
    +  // Convert condition to a bool by comparing non-equal to 0.0.
    +  EndCond = Builder.CreateFCmpONE(EndCond, 
    +                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
    +                                  "loopcond");
    +  
    +  // Create the "after loop" block and insert it.
    +  BasicBlock *LoopEndBB = Builder.GetInsertBlock();
    +  BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
    +  
    +  // Insert the conditional branch into the end of LoopEndBB.
    +  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
    +  
    +  // Any new code will be inserted in AfterBB.
    +  Builder.SetInsertPoint(AfterBB);
    +  
    +  // Add a new entry to the PHI node for the backedge.
    +  Variable->addIncoming(NextVar, LoopEndBB);
    +  
    +  // Restore the unshadowed variable.
    +  if (OldVal)
    +    NamedValues[VarName] = OldVal;
    +  else
    +    NamedValues.erase(VarName);
    +
    +  
    +  // for expr always returns 0.0.
    +  return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
    +}
    +
    +Function *PrototypeAST::Codegen() {
    +  // Make the function type:  double(double,double) etc.
    +  std::vector<const Type*> Doubles(Args.size(),
    +                                   Type::getDoubleTy(getGlobalContext()));
    +  FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
    +                                       Doubles, false);
    +  
    +  Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
    +  
    +  // If F conflicted, there was already something named 'Name'.  If it has a
    +  // body, don't allow redefinition or reextern.
    +  if (F->getName() != Name) {
    +    // Delete the one we just made and get the existing one.
    +    F->eraseFromParent();
    +    F = TheModule->getFunction(Name);
    +    
    +    // If F already has a body, reject this.
    +    if (!F->empty()) {
    +      ErrorF("redefinition of function");
    +      return 0;
    +    }
    +    
    +    // If F took a different number of args, reject.
    +    if (F->arg_size() != Args.size()) {
    +      ErrorF("redefinition of function with different # args");
    +      return 0;
    +    }
    +  }
    +  
    +  // Set names for all arguments.
    +  unsigned Idx = 0;
    +  for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
    +       ++AI, ++Idx) {
    +    AI->setName(Args[Idx]);
    +    
    +    // Add arguments to variable symbol table.
    +    NamedValues[Args[Idx]] = AI;
    +  }
    +  
    +  return F;
    +}
    +
    +Function *FunctionAST::Codegen() {
    +  NamedValues.clear();
    +  
    +  Function *TheFunction = Proto->Codegen();
    +  if (TheFunction == 0)
    +    return 0;
    +  
    +  // Create a new basic block to start insertion into.
    +  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
    +  Builder.SetInsertPoint(BB);
    +  
    +  if (Value *RetVal = Body->Codegen()) {
    +    // Finish off the function.
    +    Builder.CreateRet(RetVal);
    +
    +    // Validate the generated code, checking for consistency.
    +    verifyFunction(*TheFunction);
    +
    +    // Optimize the function.
    +    TheFPM->run(*TheFunction);
    +    
    +    return TheFunction;
    +  }
    +  
    +  // Error reading body, remove function.
    +  TheFunction->eraseFromParent();
    +  return 0;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Top-Level parsing and JIT Driver
    +//===----------------------------------------------------------------------===//
    +
    +static ExecutionEngine *TheExecutionEngine;
    +
    +static void HandleDefinition() {
    +  if (FunctionAST *F = ParseDefinition()) {
    +    if (Function *LF = F->Codegen()) {
    +      fprintf(stderr, "Read function definition:");
    +      LF->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleExtern() {
    +  if (PrototypeAST *P = ParseExtern()) {
    +    if (Function *F = P->Codegen()) {
    +      fprintf(stderr, "Read extern: ");
    +      F->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleTopLevelExpression() {
    +  // Evaluate a top-level expression into an anonymous function.
    +  if (FunctionAST *F = ParseTopLevelExpr()) {
    +    if (Function *LF = F->Codegen()) {
    +      // JIT the function, returning a function pointer.
    +      void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
    +      
    +      // Cast it to the right type (takes no arguments, returns a double) so we
    +      // can call it as a native function.
    +      double (*FP)() = (double (*)())(intptr_t)FPtr;
    +      fprintf(stderr, "Evaluated to %f\n", FP());
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +/// top ::= definition | external | expression | ';'
    +static void MainLoop() {
    +  while (1) {
    +    fprintf(stderr, "ready> ");
    +    switch (CurTok) {
    +    case tok_eof:    return;
    +    case ';':        getNextToken(); break;  // ignore top-level semicolons.
    +    case tok_def:    HandleDefinition(); break;
    +    case tok_extern: HandleExtern(); break;
    +    default:         HandleTopLevelExpression(); break;
    +    }
    +  }
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// "Library" functions that can be "extern'd" from user code.
    +//===----------------------------------------------------------------------===//
    +
    +/// putchard - putchar that takes a double and returns 0.
    +extern "C" 
    +double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Main driver code.
    +//===----------------------------------------------------------------------===//
    +
    +int main() {
    +  InitializeNativeTarget();
    +  LLVMContext &Context = getGlobalContext();
    +
    +  // Install standard binary operators.
    +  // 1 is lowest precedence.
    +  BinopPrecedence['<'] = 10;
    +  BinopPrecedence['+'] = 20;
    +  BinopPrecedence['-'] = 20;
    +  BinopPrecedence['*'] = 40;  // highest.
    +
    +  // Prime the first token.
    +  fprintf(stderr, "ready> ");
    +  getNextToken();
    +
    +  // Make the module, which holds all the code.
    +  TheModule = new Module("my cool jit", Context);
    +
    +  // Create the JIT.  This takes ownership of the module.
    +  std::string ErrStr;
    +  TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).create();
    +  if (!TheExecutionEngine) {
    +    fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
    +    exit(1);
    +  }
    +
    +  FunctionPassManager OurFPM(TheModule);
    +
    +  // Set up the optimizer pipeline.  Start with registering info about how the
    +  // target lays out data structures.
    +  OurFPM.add(new TargetData(*TheExecutionEngine->getTargetData()));
    +  // Do simple "peephole" optimizations and bit-twiddling optzns.
    +  OurFPM.add(createInstructionCombiningPass());
    +  // Reassociate expressions.
    +  OurFPM.add(createReassociatePass());
    +  // Eliminate Common SubExpressions.
    +  OurFPM.add(createGVNPass());
    +  // Simplify the control flow graph (deleting unreachable blocks, etc).
    +  OurFPM.add(createCFGSimplificationPass());
    +
    +  OurFPM.doInitialization();
    +
    +  // Set the global so the code gen can use this.
    +  TheFPM = &OurFPM;
    +
    +  // Run the main "interpreter loop" now.
    +  MainLoop();
    +
    +  TheFPM = 0;
    +
    +  // Print out all of the generated code.
    +  TheModule->dump();
    +
    +  return 0;
    +}
    +
    +
    +Next: Extending the language: user-defined operators
    +
    +Chris Lattner
    +The LLVM Compiler Infrastructure
    +Last modified: $Date: 2010-09-01 13:09:20 -0700 (Wed, 01 Sep 2010) $
    +

    Added: www-releases/trunk/2.8/docs/tutorial/LangImpl6.html
    URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/LangImpl6.html?rev=115556&view=auto
    ==============================================================================
    --- www-releases/trunk/2.8/docs/tutorial/LangImpl6.html (added)
    +++ www-releases/trunk/2.8/docs/tutorial/LangImpl6.html Mon Oct 4 15:49:23 2010
    @@ -0,0 +1,1814 @@
    +
    Kaleidoscope: Extending the Language: User-defined Operators
    + + + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    Welcome to Chapter 6 of the "Implementing a language with LLVM" tutorial. At this point in our tutorial, we have a fully functional language that is fairly minimal, but also useful. There is still one big problem with it, however: our language doesn't have many useful operators (like division, logical negation, or even any comparisons besides less-than).

    + +

    This chapter of the tutorial takes a wild digression into adding user-defined operators to the simple and beautiful Kaleidoscope language. This digression gives us a language that is simple and ugly in some ways, but also a powerful one. One of the great things about creating your own language is that you get to decide what is good or bad. In this tutorial we'll assume that it is okay to use this as a way to show some interesting parsing techniques.

    + +

    At the end of this tutorial, we'll run through an example Kaleidoscope application that renders the Mandelbrot set. This gives an example of what you can build with Kaleidoscope and its feature set.

    + +
    + + + + + +
    + +

    The "operator overloading" that we will add to Kaleidoscope is more general than that of languages like C++. In C++, you are only allowed to redefine existing operators: you can't programmatically change the grammar, introduce new operators, change precedence levels, etc. In this chapter, we will add this capability to Kaleidoscope, which will let the user round out the set of operators that are supported.

    + +

    The point of going into user-defined operators in a tutorial like this is to show the power and flexibility of using a hand-written parser. Thus far, the parser we have been implementing uses recursive descent for most parts of the grammar and operator precedence parsing for the expressions. See Chapter 2 for details. Without operator precedence parsing, it would be very difficult to allow the programmer to introduce new operators into the grammar. With it, the grammar is dynamically extensible as the JIT runs.

    + +

    The two specific features we'll add are programmable unary operators (right now, Kaleidoscope has no unary operators at all) and programmable binary operators. An example of this is:

    + +
    +
    +# Logical unary not.
    +def unary!(v)
    +  if v then
    +    0
    +  else
    +    1;
    +
    +# Define > with the same precedence as <.
    +def binary> 10 (LHS RHS)
    +  RHS < LHS;
    +
    +# Binary "logical or", (note that it does not "short circuit")
    +def binary| 5 (LHS RHS)
    +  if LHS then
    +    1
    +  else if RHS then
    +    1
    +  else
    +    0;
    +
    +# Define = with slightly lower precedence than relationals.
    +def binary= 9 (LHS RHS)
    +  !(LHS < RHS | LHS > RHS);
    +
    +
    + +

    Many languages aspire to being able to implement their standard runtime library in the language itself. In Kaleidoscope, we can implement significant parts of the language in the library!

    + +

    We will break down implementation of these features into two parts: implementing support for user-defined binary operators and adding unary operators.

    + +
    + + + + + +
    + +

    Adding support for user-defined binary operators is pretty simple with our current framework. We'll first add support for the unary/binary keywords:

    + +
    +
    +enum Token {
    +  ...
    +  // operators
    +  tok_binary = -11, tok_unary = -12
    +};
    +...
    +static int gettok() {
    +...
    +    if (IdentifierStr == "for") return tok_for;
    +    if (IdentifierStr == "in") return tok_in;
    +    if (IdentifierStr == "binary") return tok_binary;
    +    if (IdentifierStr == "unary") return tok_unary;
    +    return tok_identifier;
    +
    +
    + +

    This just adds lexer support for the unary and binary keywords, like we did in previous chapters. One nice thing about our current AST is that we represent binary operators in full generality by using their ASCII code as the opcode. For our extended operators, we'll use this same representation, so we don't need any new AST or parser support.

    + +

    On the other hand, we have to be able to represent the definitions of these new operators, in the "def binary| 5" part of the function definition. In our grammar so far, the "name" for the function definition is parsed as the "prototype" production and into the PrototypeAST AST node. To represent our new user-defined operators as prototypes, we have to extend the PrototypeAST AST node like this:

    + +
    +
    +/// PrototypeAST - This class represents the "prototype" for a function,
    +/// which captures its argument names as well as if it is an operator.
    +class PrototypeAST {
    +  std::string Name;
    +  std::vector<std::string> Args;
    +  bool isOperator;
    +  unsigned Precedence;  // Precedence if a binary op.
    +public:
    +  PrototypeAST(const std::string &name, const std::vector<std::string> &args,
    +               bool isoperator = false, unsigned prec = 0)
    +  : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
    +  
    +  bool isUnaryOp() const { return isOperator && Args.size() == 1; }
    +  bool isBinaryOp() const { return isOperator && Args.size() == 2; }
    +  
    +  char getOperatorName() const {
    +    assert(isUnaryOp() || isBinaryOp());
    +    return Name[Name.size()-1];
    +  }
    +  
    +  unsigned getBinaryPrecedence() const { return Precedence; }
    +  
    +  Function *Codegen();
    +};
    +
    +
    + +

    Basically, in addition to knowing a name for the prototype, we now keep track of whether it was an operator, and if it was, what precedence level the operator is at. The precedence is only used for binary operators (as you'll see below, it just doesn't apply for unary operators). Now that we have a way to represent the prototype for a user-defined operator, we need to parse it:

    + +
    +
    +/// prototype
    +///   ::= id '(' id* ')'
    +///   ::= binary LETTER number? '(' id id ')'
    +static PrototypeAST *ParsePrototype() {
    +  std::string FnName;
    +  
    +  unsigned Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
    +  unsigned BinaryPrecedence = 30;
    +  
    +  switch (CurTok) {
    +  default:
    +    return ErrorP("Expected function name in prototype");
    +  case tok_identifier:
    +    FnName = IdentifierStr;
    +    Kind = 0;
    +    getNextToken();
    +    break;
    +  case tok_binary:
    +    getNextToken();
    +    if (!isascii(CurTok))
    +      return ErrorP("Expected binary operator");
    +    FnName = "binary";
    +    FnName += (char)CurTok;
    +    Kind = 2;
    +    getNextToken();
    +    
    +    // Read the precedence if present.
    +    if (CurTok == tok_number) {
    +      if (NumVal < 1 || NumVal > 100)
    +        return ErrorP("Invalid precedence: must be 1..100");
    +      BinaryPrecedence = (unsigned)NumVal;
    +      getNextToken();
    +    }
    +    break;
    +  }
    +  
    +  if (CurTok != '(')
    +    return ErrorP("Expected '(' in prototype");
    +  
    +  std::vector<std::string> ArgNames;
    +  while (getNextToken() == tok_identifier)
    +    ArgNames.push_back(IdentifierStr);
    +  if (CurTok != ')')
    +    return ErrorP("Expected ')' in prototype");
    +  
    +  // success.
    +  getNextToken();  // eat ')'.
    +  
    +  // Verify right number of names for operator.
    +  if (Kind && ArgNames.size() != Kind)
    +    return ErrorP("Invalid number of operands for operator");
    +  
    +  return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
    +}
    +
    +
    + +

    This is all fairly straightforward parsing code, and we have already seen a lot of similar code in the past. One interesting part about the code above is the couple of lines that set up FnName for binary operators. This builds names like "binary@" for a newly defined "@" operator. This takes advantage of the fact that symbol names in the LLVM symbol table are allowed to have any character in them, including embedded nul characters.
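
The naming scheme described above can be sketched in isolation. This is a minimal illustration, not code from the tutorial: the helper names operatorFunctionName and operatorFromName are hypothetical, and the second one mirrors what getOperatorName does by reading the last character of the name.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build the function name used for an operator,
// e.g. ("binary", '|') -> "binary|".
static std::string operatorFunctionName(const char *Kind, char Op) {
  std::string Name(Kind);
  Name += Op;
  return Name;
}

// Hypothetical helper: recover the operator character from such a name.
// The operator is always the last character of the name.
static char operatorFromName(const std::string &Name) {
  assert(!Name.empty() && "operator names are never empty");
  return Name[Name.size() - 1];
}
```

Because the operator character is simply appended, the same scheme should work for any ASCII operator without extra escaping.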

    + +

    The next interesting thing to add is codegen support for these binary operators. Given our current structure, this is a simple addition of a default case for our existing binary operator node:

    + +
    +
    +Value *BinaryExprAST::Codegen() {
    +  Value *L = LHS->Codegen();
    +  Value *R = RHS->Codegen();
    +  if (L == 0 || R == 0) return 0;
    +  
    +  switch (Op) {
    +  case '+': return Builder.CreateFAdd(L, R, "addtmp");
    +  case '-': return Builder.CreateFSub(L, R, "subtmp");
    +  case '*': return Builder.CreateFMul(L, R, "multmp");
    +  case '<':
    +    L = Builder.CreateFCmpULT(L, R, "cmptmp");
    +    // Convert bool 0/1 to double 0.0 or 1.0
    +    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
    +                                "booltmp");
    +  default: break;
    +  }
    +  
    +  // If it wasn't a builtin binary operator, it must be a user defined one. Emit
    +  // a call to it.
    +  Function *F = TheModule->getFunction(std::string("binary")+Op);
    +  assert(F && "binary operator not found!");
    +  
    +  Value *Ops[] = { L, R };
    +  return Builder.CreateCall(F, Ops, Ops+2, "binop");
    +}
    +
    +
    +
    + +

    As you can see above, the new code is actually really simple. It just does a lookup for the appropriate operator in the symbol table and generates a function call to it. Since user-defined operators are just built as normal functions (because the "prototype" boils down to a function with the right name) everything falls into place.
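
The "built-ins first, then fall back to a named function" dispatch can be sketched without LLVM. The code below is an assumption-laden stand-in: it evaluates doubles directly instead of emitting IR, and SymbolTable, BinaryFn, userGreater and evalBinary are hypothetical names that play the roles of TheModule, a user-defined operator body, and BinaryExprAST::Codegen.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef double (*BinaryFn)(double, double);

// Hypothetical stand-in for the module's symbol table.
static std::map<std::string, BinaryFn> SymbolTable;

// A "user-defined" operator body: a > b, expressed as b < a.
static double userGreater(double L, double R) { return R < L ? 1.0 : 0.0; }

// Evaluate L Op R. Built-in operators are handled by the switch; anything
// else is looked up under the name "binary" + Op. Returns false if no such
// function has been registered.
static bool evalBinary(char Op, double L, double R, double &Out) {
  switch (Op) {
  case '+': Out = L + R; return true;
  case '-': Out = L - R; return true;
  case '*': Out = L * R; return true;
  case '<': Out = L < R ? 1.0 : 0.0; return true;
  default: break;
  }
  std::map<std::string, BinaryFn>::const_iterator It =
      SymbolTable.find(std::string("binary") + Op);
  if (It == SymbolTable.end())
    return false;
  Out = It->second(L, R);
  return true;
}
```

Unlike the tutorial code, which asserts that the function exists, this sketch reports a missing operator through its return value so the failure case is easy to observe.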

    + +

    The final piece of code we are missing is a bit of top-level magic:

    + +
    +
    +Function *FunctionAST::Codegen() {
    +  NamedValues.clear();
    +  
    +  Function *TheFunction = Proto->Codegen();
    +  if (TheFunction == 0)
    +    return 0;
    +  
    +  // If this is an operator, install it.
    +  if (Proto->isBinaryOp())
    +    BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();
    +  
    +  // Create a new basic block to start insertion into.
    +  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
    +  Builder.SetInsertPoint(BB);
    +  
    +  if (Value *RetVal = Body->Codegen()) {
    +    ...
    +
    +
    + +

    Basically, before codegening a function, if it is a user-defined operator, we register it in the precedence table. This allows the binary operator parsing logic we already have in place to handle it. Since we are working on a fully-general operator precedence parser, this is all we need to do to "extend the grammar".
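
To see why one table entry is enough, here is a small self-contained sketch of the same precedence-climbing loop. It is a simplification under stated assumptions: operands are single characters, the result is a parenthesized string rather than an AST, and the names Prec, precOf, parsePrimarySketch, parseBinOpRHSSketch and parseSketch are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for BinopPrecedence.
static std::map<char, int> Prec;

// Precedence of the character at Pos, or -1 if it is not a known operator.
static int precOf(const std::string &S, std::size_t Pos) {
  if (Pos >= S.size()) return -1;
  std::map<char, int>::const_iterator It = Prec.find(S[Pos]);
  return It == Prec.end() ? -1 : It->second;
}

// primary ::= any single character (a simplification for this sketch).
static std::string parsePrimarySketch(const std::string &S, std::size_t &Pos) {
  if (Pos >= S.size()) return "?";
  return std::string(1, S[Pos++]);
}

// Same structure as ParseBinOpRHS, but building a string.
static std::string parseBinOpRHSSketch(int ExprPrec, std::string LHS,
                                       const std::string &S, std::size_t &Pos) {
  while (true) {
    int TokPrec = precOf(S, Pos);
    if (TokPrec < ExprPrec) return LHS;  // not an operator we should consume
    char Op = S[Pos++];
    std::string RHS = parsePrimarySketch(S, Pos);
    // If the next operator binds tighter, let it take RHS as its LHS.
    if (TokPrec < precOf(S, Pos))
      RHS = parseBinOpRHSSketch(TokPrec + 1, RHS, S, Pos);
    LHS = "(" + LHS + Op + RHS + ")";
  }
}

static std::string parseSketch(const std::string &S) {
  std::size_t Pos = 0;
  std::string LHS = parsePrimarySketch(S, Pos);
  return parseBinOpRHSSketch(0, LHS, S, Pos);
}
```

In this sketch, "a|b<c" stops after "a" while '|' is unknown; once Prec['|'] = 5 is installed, the same input groups as "(a|(b<c))" with no change to the parsing functions.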

    + +

    Now we have useful user-defined binary operators. This builds a lot on the previous framework we built for other operators. Adding unary operators is a bit more challenging, because we don't have any framework for it yet. Let's see what it takes.

    + +
    + + + + + +
    + +

    Since we don't currently support unary operators in the Kaleidoscope language, we'll need to add everything to support them. Above, we added simple support for the 'unary' keyword to the lexer. In addition to that, we need an AST node:

    + +
    +
    +/// UnaryExprAST - Expression class for a unary operator.
    +class UnaryExprAST : public ExprAST {
    +  char Opcode;
    +  ExprAST *Operand;
    +public:
    +  UnaryExprAST(char opcode, ExprAST *operand) 
    +    : Opcode(opcode), Operand(operand) {}
    +  virtual Value *Codegen();
    +};
    +
    +
    + +

    This AST node is very simple and obvious by now. It directly mirrors the binary operator AST node, except that it only has one child. With this, we need to add the parsing logic. Parsing a unary operator is pretty simple; we'll add a new function to do it:

    + +
    +
    +/// unary
    +///   ::= primary
    +///   ::= '!' unary
    +static ExprAST *ParseUnary() {
    +  // If the current token is not an operator, it must be a primary expr.
    +  if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
    +    return ParsePrimary();
    +  
    +  // If this is a unary operator, read it.
    +  int Opc = CurTok;
    +  getNextToken();
    +  if (ExprAST *Operand = ParseUnary())
    +    return new UnaryExprAST(Opc, Operand);
    +  return 0;
    +}
    +
    +
    + +

    The grammar we add is pretty straightforward here. If we see a unary operator when parsing a primary expression, we eat the operator as a prefix and parse the remaining piece as another unary expression. This allows us to handle multiple unary operators (e.g. "!!x"). Note that unary operators can't have ambiguous parses like binary operators can, so there is no need for precedence information.
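The recursion is easier to see on plain strings. This sketch is not the tutorial's parser: `parseUnarySketch` is a hypothetical helper that treats any alphanumeric run as a primary expression and any other character as a prefix operator, and it returns a parenthesized string instead of an AST.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Parse a string such as "!!x" and return a fully parenthesized form.
// Pos is advanced past the consumed characters. An empty result means a
// parse error, playing the role of the null pointer in the real parser.
static std::string parseUnarySketch(const std::string &Src, std::size_t &Pos) {
  if (Pos >= Src.size())
    return "";

  // Primary expression: a run of letters and digits.
  if (std::isalnum(static_cast<unsigned char>(Src[Pos]))) {
    std::size_t Start = Pos;
    while (Pos < Src.size() &&
           std::isalnum(static_cast<unsigned char>(Src[Pos])))
      ++Pos;
    return Src.substr(Start, Pos - Start);
  }

  // Otherwise eat one prefix operator and parse the rest as another unary.
  char Opc = Src[Pos++];
  std::string Operand = parseUnarySketch(Src, Pos);
  if (Operand.empty())
    return "";
  return std::string("(") + Opc + Operand + ")";
}
```

Each prefix operator consumes one character and recurses, so nesting falls out of the call stack and no precedence information is consulted.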

    + +

    The problem with this function is that we need to call ParseUnary from somewhere. To do this, we change previous callers of ParsePrimary to call ParseUnary instead:

    + +
    +
    +/// binoprhs
    +///   ::= ('+' unary)*
    +static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
    +  ...
    +    // Parse the unary expression after the binary operator.
    +    ExprAST *RHS = ParseUnary();
    +    if (!RHS) return 0;
    +  ...
    +}
    +/// expression
    +///   ::= unary binoprhs
    +///
    +static ExprAST *ParseExpression() {
    +  ExprAST *LHS = ParseUnary();
    +  if (!LHS) return 0;
    +  
    +  return ParseBinOpRHS(0, LHS);
    +}
    +
    +
    + +

    With these two simple changes, we are now able to parse unary operators and build the AST for them. Next up, we need to add parser support for prototypes, to parse the unary operator prototype. We extend the binary operator code above with:

    + +
    +
    +/// prototype
    +///   ::= id '(' id* ')'
    +///   ::= binary LETTER number? (id, id)
    +///   ::= unary LETTER (id)
    +static PrototypeAST *ParsePrototype() {
    +  std::string FnName;
    +  
    +  unsigned Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
    +  unsigned BinaryPrecedence = 30;
    +  
    +  switch (CurTok) {
    +  default:
    +    return ErrorP("Expected function name in prototype");
    +  case tok_identifier:
    +    FnName = IdentifierStr;
    +    Kind = 0;
    +    getNextToken();
    +    break;
    +  case tok_unary:
    +    getNextToken();
    +    if (!isascii(CurTok))
    +      return ErrorP("Expected unary operator");
    +    FnName = "unary";
    +    FnName += (char)CurTok;
    +    Kind = 1;
    +    getNextToken();
    +    break;
    +  case tok_binary:
    +    ...
    +
    +
    + +

    As with binary operators, we name unary operators with a name that includes the operator character. This assists us at code generation time. Speaking of which, the final piece we need to add is codegen support for unary operators. It looks like this:

    + +
    +
    +Value *UnaryExprAST::Codegen() {
    +  Value *OperandV = Operand->Codegen();
    +  if (OperandV == 0) return 0;
    +  
    +  Function *F = TheModule->getFunction(std::string("unary")+Opcode);
    +  if (F == 0)
    +    return ErrorV("Unknown unary operator");
    +  
    +  return Builder.CreateCall(F, OperandV, "unop");
    +}
    +
    +
    + +

    This code is similar to, but simpler than, the code for binary operators. It is simpler primarily because it doesn't need to handle any predefined operators.

    + +
    + + + + + +
    + +

    It is somewhat hard to believe, but with a few simple extensions we've covered in the last chapters, we have grown a real-ish language. With this, we can do a lot of interesting things, including I/O, math, and a bunch of other things. For example, we can now add a nice sequencing operator (printd is defined to print out the specified value and a newline):

    + +
    +
    +ready> extern printd(x);
    +Read extern: declare double @printd(double)
    +ready> def binary : 1 (x y) 0;  # Low-precedence operator that ignores operands.
    +..
    +ready> printd(123) : printd(456) : printd(789);
    +123.000000
    +456.000000
    +789.000000
    +Evaluated to 0.000000
    +
    +
    + +

    We can also define a bunch of other "primitive" operations, such as:

    + +
    +
    +# Logical unary not.
    +def unary!(v)
    +  if v then
    +    0
    +  else
    +    1;
    +    
    +# Unary negate.
    +def unary-(v)
    +  0-v;
    +
    +# Define > with the same precedence as <.
    +def binary> 10 (LHS RHS)
    +  RHS < LHS;
    +
    +# Binary logical or, which does not short circuit. 
    +def binary| 5 (LHS RHS)
    +  if LHS then
    +    1
    +  else if RHS then
    +    1
    +  else
    +    0;
    +
    +# Binary logical and, which does not short circuit. 
    +def binary& 6 (LHS RHS)
    +  if !LHS then
    +    0
    +  else
    +    !!RHS;
    +
    +# Define = with slightly lower precedence than relationals.
    +def binary = 9 (LHS RHS)
    +  !(LHS < RHS | LHS > RHS);
    +
    +
    +
    + + +

    Given the previous if/then/else support, we can also define interesting functions for I/O. For example, the following prints out a character whose "density" reflects the value passed in: the lower the value, the denser the character:

    + +
    +
    +ready>
    +
    +extern putchard(char)
    +def printdensity(d)
    +  if d > 8 then
    +    putchard(32)  # ' '
    +  else if d > 4 then
    +    putchard(46)  # '.'
    +  else if d > 2 then
    +    putchard(43)  # '+'
    +  else
    +    putchard(42); # '*'
    +...
    +ready> printdensity(1): printdensity(2): printdensity(3) : 
    +          printdensity(4): printdensity(5): printdensity(9): putchard(10);
    +**++. 
    +Evaluated to 0.000000
    +
    +
    + +
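The threshold chain in printdensity can be checked outside Kaleidoscope. This is a sketch only: `densityChar` is a hypothetical C++ rendering that returns the character instead of calling putchard, so the mapping is easy to inspect.

```cpp
#include <cassert>

// Same thresholds as the Kaleidoscope printdensity above, using strict
// greater-than comparisons.
static char densityChar(double D) {
  if (D > 8)
    return ' ';
  if (D > 4)
    return '.';
  if (D > 2)
    return '+';
  return '*';
}
```

Because every comparison is strict, the boundary values 2, 4, and 8 fall into the denser bucket: 2 maps to '*', 4 to '+', and 8 to '.'.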

    Based on these simple primitive operations, we can start to define more interesting things. For example, here's a little function that determines the number of iterations it takes for a certain function in the complex plane to diverge:

    + +
    +
    +# determine whether the specific location diverges.
    +# Solve for z = z^2 + c in the complex plane.
    +def mandleconverger(real imag iters creal cimag)
    +  if iters > 255 | (real*real + imag*imag > 4) then
    +    iters
    +  else
    +    mandleconverger(real*real - imag*imag + creal,
    +                    2*real*imag + cimag,
    +                    iters+1, creal, cimag);
    +
    +# return the number of iterations required for the iteration to escape
    +def mandleconverge(real imag)
    +  mandleconverger(real, imag, 0, real, imag);
    +
    +
    + +

    This "z = z^2 + c" function is a beautiful little creature that is the basis for computation of the Mandelbrot Set. Our mandleconverge function returns the number of iterations that it takes for a complex orbit to escape, giving up once the count exceeds 255. This is not a very useful function by itself, but if you plot its value over a two-dimensional plane, you can see the Mandelbrot set. Given that we are limited to using putchard here, our amazing graphical output is limited, but we can whip together something using the density plotter above:
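For readers who want to cross-check the iteration, here is an iterative C++ sketch of the same computation. `mandleConvergeSketch` is a hypothetical name; the loop condition and update rule are transcribed from the recursive Kaleidoscope listing above, with the recursion unrolled into a loop.

```cpp
#include <cassert>

// Count iterations of z = z^2 + c, starting from z = c, until the orbit
// leaves the circle of radius 2 or the count exceeds 255.
static int mandleConvergeSketch(double CReal, double CImag) {
  double Real = CReal, Imag = CImag;
  int Iters = 0;
  while (!(Iters > 255 || Real * Real + Imag * Imag > 4)) {
    // Compute both components from the old values before overwriting them.
    double NextReal = Real * Real - Imag * Imag + CReal;
    double NextImag = 2 * Real * Imag + CImag;
    Real = NextReal;
    Imag = NextImag;
    ++Iters;
  }
  return Iters;
}
```

The temporaries matter: the Kaleidoscope version evaluates both arguments of the recursive call from the old real and imag, so updating Real in place before computing the imaginary part would give a different orbit.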

    + +
    +
    +# compute and plot the Mandelbrot set with the specified 2-dimensional range
    +# info.
    +def mandelhelp(xmin xmax xstep   ymin ymax ystep)
    +  for y = ymin, y < ymax, ystep in (
    +    (for x = xmin, x < xmax, xstep in
    +       printdensity(mandleconverge(x,y)))
    +    : putchard(10)
    +  )
    + 
    +# mandel - This is a convenient helper function for plotting the Mandelbrot set
    +# from the specified position with the specified magnification.
    +def mandel(realstart imagstart realmag imagmag) 
    +  mandelhelp(realstart, realstart+realmag*78, realmag,
    +             imagstart, imagstart+imagmag*40, imagmag);
    +
    +
    + +

    Given this, we can try plotting out the Mandelbrot set! Let's try it out:

    + +
    +
    +ready> mandel(-2.3, -1.3, 0.05, 0.07);
    +*******************************+++++++++++*************************************
    +*************************+++++++++++++++++++++++*******************************
    +**********************+++++++++++++++++++++++++++++****************************
    +*******************+++++++++++++++++++++.. ...++++++++*************************
    +*****************++++++++++++++++++++++.... ...+++++++++***********************
    +***************+++++++++++++++++++++++.....   ...+++++++++*********************
    +**************+++++++++++++++++++++++....     ....+++++++++********************
    +*************++++++++++++++++++++++......      .....++++++++*******************
    +************+++++++++++++++++++++.......       .......+++++++******************
    +***********+++++++++++++++++++....                ... .+++++++*****************
    +**********+++++++++++++++++.......                     .+++++++****************
    +*********++++++++++++++...........                    ...+++++++***************
    +********++++++++++++............                      ...++++++++**************
    +********++++++++++... ..........                        .++++++++**************
    +*******+++++++++.....                                   .+++++++++*************
    +*******++++++++......                                  ..+++++++++*************
    +*******++++++.......                                   ..+++++++++*************
    +*******+++++......                                     ..+++++++++*************
    +*******.... ....                                      ...+++++++++*************
    +*******.... .                                         ...+++++++++*************
    +*******+++++......                                    ...+++++++++*************
    +*******++++++.......                                   ..+++++++++*************
    +*******++++++++......                                   .+++++++++*************
    +*******+++++++++.....                                  ..+++++++++*************
    +********++++++++++... ..........                        .++++++++**************
    +********++++++++++++............                      ...++++++++**************
    +*********++++++++++++++..........                     ...+++++++***************
    +**********++++++++++++++++........                     .+++++++****************
    +**********++++++++++++++++++++....                ... ..+++++++****************
    +***********++++++++++++++++++++++.......       .......++++++++*****************
    +************+++++++++++++++++++++++......      ......++++++++******************
    +**************+++++++++++++++++++++++....      ....++++++++********************
    +***************+++++++++++++++++++++++.....   ...+++++++++*********************
    +*****************++++++++++++++++++++++....  ...++++++++***********************
    +*******************+++++++++++++++++++++......++++++++*************************
    +*********************++++++++++++++++++++++.++++++++***************************
    +*************************+++++++++++++++++++++++*******************************
    +******************************+++++++++++++************************************
    +*******************************************************************************
    +*******************************************************************************
    +*******************************************************************************
    +Evaluated to 0.000000
    +ready> mandel(-2, -1, 0.02, 0.04);
    +**************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
    +***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    +*********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
    +*******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
    +*****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
    +***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
    +**************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
    +************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
    +***********++++++++++++++++++++++++++++++++++++++++++++++++++........        . 
    +**********++++++++++++++++++++++++++++++++++++++++++++++.............          
    +********+++++++++++++++++++++++++++++++++++++++++++..................          
    +*******+++++++++++++++++++++++++++++++++++++++.......................          
    +******+++++++++++++++++++++++++++++++++++...........................           
    +*****++++++++++++++++++++++++++++++++............................              
    +*****++++++++++++++++++++++++++++...............................               
    +****++++++++++++++++++++++++++......   .........................               
    +***++++++++++++++++++++++++.........     ......    ...........                 
    +***++++++++++++++++++++++............                                          
    +**+++++++++++++++++++++..............                                          
    +**+++++++++++++++++++................                                          
    +*++++++++++++++++++.................                                           
    +*++++++++++++++++............ ...                                              
    +*++++++++++++++..............                                                  
    +*+++....++++................                                                   
    +*..........  ...........                                                       
    +*                                                                              
    +*..........  ...........                                                       
    +*+++....++++................                                                   
    +*++++++++++++++..............                                                  
    +*++++++++++++++++............ ...                                              
    +*++++++++++++++++++.................                                           
    +**+++++++++++++++++++................                                          
    +**+++++++++++++++++++++..............                                          
    +***++++++++++++++++++++++............                                          
    +***++++++++++++++++++++++++.........     ......    ...........                 
    +****++++++++++++++++++++++++++......   .........................               
    +*****++++++++++++++++++++++++++++...............................               
    +*****++++++++++++++++++++++++++++++++............................              
    +******+++++++++++++++++++++++++++++++++++...........................           
    +*******+++++++++++++++++++++++++++++++++++++++.......................          
    +********+++++++++++++++++++++++++++++++++++++++++++..................          
    +Evaluated to 0.000000
    +ready> mandel(-0.9, -1.4, 0.02, 0.03);
    +*******************************************************************************
    +*******************************************************************************
    +*******************************************************************************
    +**********+++++++++++++++++++++************************************************
    +*+++++++++++++++++++++++++++++++++++++++***************************************
    ++++++++++++++++++++++++++++++++++++++++++++++**********************************
    +++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
    ++++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
    ++++++++++++++++++++++++++++++++....   ......+++++++++++++++++++****************
    ++++++++++++++++++++++++++++++.......  ........+++++++++++++++++++**************
    +++++++++++++++++++++++++++++........   ........++++++++++++++++++++************
    ++++++++++++++++++++++++++++.........     ..  ...+++++++++++++++++++++**********
    +++++++++++++++++++++++++++...........        ....++++++++++++++++++++++********
    +++++++++++++++++++++++++.............       .......++++++++++++++++++++++******
    ++++++++++++++++++++++++.............        ........+++++++++++++++++++++++****
    +++++++++++++++++++++++...........           ..........++++++++++++++++++++++***
    +++++++++++++++++++++...........                .........++++++++++++++++++++++*
    +++++++++++++++++++............                  ...........++++++++++++++++++++
    +++++++++++++++++...............                 .............++++++++++++++++++
    +++++++++++++++.................                 ...............++++++++++++++++
    +++++++++++++..................                  .................++++++++++++++
    ++++++++++..................                      .................+++++++++++++
    +++++++........        .                               .........  ..++++++++++++
    +++............                                         ......    ....++++++++++
    +..............                                                    ...++++++++++
    +..............                                                    ....+++++++++
    +..............                                                    .....++++++++
    +.............                                                    ......++++++++
    +...........                                                     .......++++++++
    +.........                                                       ........+++++++
    +.........                                                       ........+++++++
    +.........                                                           ....+++++++
    +........                                                             ...+++++++
    +.......                                                              ...+++++++
    +                                                                    ....+++++++
    +                                                                   .....+++++++
    +                                                                    ....+++++++
    +                                                                    ....+++++++
    +                                                                    ....+++++++
    +Evaluated to 0.000000
    +ready> ^D
    +
    +
    + +

    At this point, you may be starting to realize that Kaleidoscope is a real and powerful language. It may not be self-similar :), but it can be used to plot things that are!

    + +

    With this, we conclude the "adding user-defined operators" chapter of the tutorial. We have successfully augmented our language, adding the ability to extend the language in the library, and we have shown how this can be used to build a simple but interesting end-user application in Kaleidoscope. At this point, Kaleidoscope can build a variety of applications that are functional and can call functions with side-effects, but it can't actually define and mutate a variable itself.

    + +

    Strikingly, variable mutation is an important feature of some languages, and it is not at all obvious how to add support for mutable variables without having to add an "SSA construction" phase to your front-end. In the next chapter, we will describe how you can add variable mutation without building SSA in your front-end.

    + +
    + + + + + +
    + +

    Here is the complete code listing for our running example, enhanced with support for user-defined operators. To build this example, use:

    + +
    +
    +   # Compile
    +   g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
    +   # Run
    +   ./toy
    +
    +
    + +

    Here is the code:

    + +
    +
    +#include "llvm/DerivedTypes.h"
    +#include "llvm/ExecutionEngine/ExecutionEngine.h"
    +#include "llvm/ExecutionEngine/JIT.h"
    +#include "llvm/LLVMContext.h"
    +#include "llvm/Module.h"
    +#include "llvm/PassManager.h"
    +#include "llvm/Analysis/Verifier.h"
    +#include "llvm/Target/TargetData.h"
    +#include "llvm/Target/TargetSelect.h"
    +#include "llvm/Transforms/Scalar.h"
    +#include "llvm/Support/IRBuilder.h"
    +#include <cassert>
    +#include <cctype>
    +#include <cstdio>
    +#include <cstdlib>
    +#include <string>
    +#include <map>
    +#include <vector>
    +using namespace llvm;
    +
    +//===----------------------------------------------------------------------===//
    +// Lexer
    +//===----------------------------------------------------------------------===//
    +
    +// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
    +// of these for known things.
    +enum Token {
    +  tok_eof = -1,
    +
    +  // commands
    +  tok_def = -2, tok_extern = -3,
    +
    +  // primary
    +  tok_identifier = -4, tok_number = -5,
    +  
    +  // control
    +  tok_if = -6, tok_then = -7, tok_else = -8,
    +  tok_for = -9, tok_in = -10,
    +  
    +  // operators
    +  tok_binary = -11, tok_unary = -12
    +};
    +
    +static std::string IdentifierStr;  // Filled in if tok_identifier
    +static double NumVal;              // Filled in if tok_number
    +
    +/// gettok - Return the next token from standard input.
    +static int gettok() {
    +  static int LastChar = ' ';
    +
    +  // Skip any whitespace.
    +  while (isspace(LastChar))
    +    LastChar = getchar();
    +
    +  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    +    IdentifierStr = LastChar;
    +    while (isalnum((LastChar = getchar())))
    +      IdentifierStr += LastChar;
    +
    +    if (IdentifierStr == "def") return tok_def;
    +    if (IdentifierStr == "extern") return tok_extern;
    +    if (IdentifierStr == "if") return tok_if;
    +    if (IdentifierStr == "then") return tok_then;
    +    if (IdentifierStr == "else") return tok_else;
    +    if (IdentifierStr == "for") return tok_for;
    +    if (IdentifierStr == "in") return tok_in;
    +    if (IdentifierStr == "binary") return tok_binary;
    +    if (IdentifierStr == "unary") return tok_unary;
    +    return tok_identifier;
    +  }
    +
    +  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    +    std::string NumStr;
    +    do {
    +      NumStr += LastChar;
    +      LastChar = getchar();
    +    } while (isdigit(LastChar) || LastChar == '.');
    +
    +    NumVal = strtod(NumStr.c_str(), 0);
    +    return tok_number;
    +  }
    +
    +  if (LastChar == '#') {
    +    // Comment until end of line.
    +    do LastChar = getchar();
    +    while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
    +    
    +    if (LastChar != EOF)
    +      return gettok();
    +  }
    +  
    +  // Check for end of file.  Don't eat the EOF.
    +  if (LastChar == EOF)
    +    return tok_eof;
    +
    +  // Otherwise, just return the character as its ascii value.
    +  int ThisChar = LastChar;
    +  LastChar = getchar();
    +  return ThisChar;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Abstract Syntax Tree (aka Parse Tree)
    +//===----------------------------------------------------------------------===//
    +
    +/// ExprAST - Base class for all expression nodes.
    +class ExprAST {
    +public:
    +  virtual ~ExprAST() {}
    +  virtual Value *Codegen() = 0;
    +};
    +
    +/// NumberExprAST - Expression class for numeric literals like "1.0".
    +class NumberExprAST : public ExprAST {
    +  double Val;
    +public:
    +  NumberExprAST(double val) : Val(val) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// VariableExprAST - Expression class for referencing a variable, like "a".
    +class VariableExprAST : public ExprAST {
    +  std::string Name;
    +public:
    +  VariableExprAST(const std::string &name) : Name(name) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// UnaryExprAST - Expression class for a unary operator.
    +class UnaryExprAST : public ExprAST {
    +  char Opcode;
    +  ExprAST *Operand;
    +public:
    +  UnaryExprAST(char opcode, ExprAST *operand) 
    +    : Opcode(opcode), Operand(operand) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// BinaryExprAST - Expression class for a binary operator.
    +class BinaryExprAST : public ExprAST {
    +  char Op;
    +  ExprAST *LHS, *RHS;
    +public:
    +  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) 
    +    : Op(op), LHS(lhs), RHS(rhs) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// CallExprAST - Expression class for function calls.
    +class CallExprAST : public ExprAST {
    +  std::string Callee;
    +  std::vector<ExprAST*> Args;
    +public:
    +  CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
    +    : Callee(callee), Args(args) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// IfExprAST - Expression class for if/then/else.
    +class IfExprAST : public ExprAST {
    +  ExprAST *Cond, *Then, *Else;
    +public:
    +  IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
    +  : Cond(cond), Then(then), Else(_else) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// ForExprAST - Expression class for for/in.
    +class ForExprAST : public ExprAST {
    +  std::string VarName;
    +  ExprAST *Start, *End, *Step, *Body;
    +public:
    +  ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end,
    +             ExprAST *step, ExprAST *body)
    +    : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// PrototypeAST - This class represents the "prototype" for a function,
    +/// which captures its name, and its argument names (thus implicitly the number
    +/// of arguments the function takes), as well as if it is an operator.
    +class PrototypeAST {
    +  std::string Name;
    +  std::vector<std::string> Args;
    +  bool isOperator;
    +  unsigned Precedence;  // Precedence if a binary op.
    +public:
    +  PrototypeAST(const std::string &name, const std::vector<std::string> &args,
    +               bool isoperator = false, unsigned prec = 0)
    +  : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
    +  
    +  bool isUnaryOp() const { return isOperator && Args.size() == 1; }
    +  bool isBinaryOp() const { return isOperator && Args.size() == 2; }
    +  
    +  char getOperatorName() const {
    +    assert(isUnaryOp() || isBinaryOp());
    +    return Name[Name.size()-1];
    +  }
    +  
    +  unsigned getBinaryPrecedence() const { return Precedence; }
    +  
    +  Function *Codegen();
    +};
    +
    +/// FunctionAST - This class represents a function definition itself.
    +class FunctionAST {
    +  PrototypeAST *Proto;
    +  ExprAST *Body;
    +public:
    +  FunctionAST(PrototypeAST *proto, ExprAST *body)
    +    : Proto(proto), Body(body) {}
    +  
    +  Function *Codegen();
    +};
    +
    +//===----------------------------------------------------------------------===//
    +// Parser
    +//===----------------------------------------------------------------------===//
    +
    +/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
    +/// token the parser is looking at.  getNextToken reads another token from the
    +/// lexer and updates CurTok with its results.
    +static int CurTok;
    +static int getNextToken() {
    +  return CurTok = gettok();
    +}
    +
    +/// BinopPrecedence - This holds the precedence for each binary operator that is
    +/// defined.
    +static std::map<char, int> BinopPrecedence;
    +
    +/// GetTokPrecedence - Get the precedence of the pending binary operator token.
    +static int GetTokPrecedence() {
    +  if (!isascii(CurTok))
    +    return -1;
    +  
    +  // Make sure it's a declared binop.
    +  int TokPrec = BinopPrecedence[CurTok];
    +  if (TokPrec <= 0) return -1;
    +  return TokPrec;
    +}
    +
    +/// Error* - These are little helper functions for error handling.
    +ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
    +PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
    +FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
    +
    +static ExprAST *ParseExpression();
    +
    +/// identifierexpr
    +///   ::= identifier
    +///   ::= identifier '(' expression* ')'
    +static ExprAST *ParseIdentifierExpr() {
    +  std::string IdName = IdentifierStr;
    +  
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '(') // Simple variable ref.
    +    return new VariableExprAST(IdName);
    +  
    +  // Call.
    +  getNextToken();  // eat (
    +  std::vector<ExprAST*> Args;
    +  if (CurTok != ')') {
    +    while (1) {
    +      ExprAST *Arg = ParseExpression();
    +      if (!Arg) return 0;
    +      Args.push_back(Arg);
    +
    +      if (CurTok == ')') break;
    +
    +      if (CurTok != ',')
    +        return Error("Expected ')' or ',' in argument list");
    +      getNextToken();
    +    }
    +  }
    +
    +  // Eat the ')'.
    +  getNextToken();
    +  
    +  return new CallExprAST(IdName, Args);
    +}
    +
    +/// numberexpr ::= number
    +static ExprAST *ParseNumberExpr() {
    +  ExprAST *Result = new NumberExprAST(NumVal);
    +  getNextToken(); // consume the number
    +  return Result;
    +}
    +
    +/// parenexpr ::= '(' expression ')'
    +static ExprAST *ParseParenExpr() {
    +  getNextToken();  // eat (.
    +  ExprAST *V = ParseExpression();
    +  if (!V) return 0;
    +  
    +  if (CurTok != ')')
    +    return Error("expected ')'");
    +  getNextToken();  // eat ).
    +  return V;
    +}
    +
    +/// ifexpr ::= 'if' expression 'then' expression 'else' expression
    +static ExprAST *ParseIfExpr() {
    +  getNextToken();  // eat the if.
    +  
    +  // condition.
    +  ExprAST *Cond = ParseExpression();
    +  if (!Cond) return 0;
    +  
    +  if (CurTok != tok_then)
    +    return Error("expected then");
    +  getNextToken();  // eat the then
    +  
    +  ExprAST *Then = ParseExpression();
    +  if (Then == 0) return 0;
    +  
    +  if (CurTok != tok_else)
    +    return Error("expected else");
    +  
    +  getNextToken();
    +  
    +  ExprAST *Else = ParseExpression();
    +  if (!Else) return 0;
    +  
    +  return new IfExprAST(Cond, Then, Else);
    +}
    +
    +/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
    +static ExprAST *ParseForExpr() {
    +  getNextToken();  // eat the for.
    +
    +  if (CurTok != tok_identifier)
    +    return Error("expected identifier after for");
    +  
    +  std::string IdName = IdentifierStr;
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '=')
    +    return Error("expected '=' after for");
    +  getNextToken();  // eat '='.
    +  
    +  
    +  ExprAST *Start = ParseExpression();
    +  if (Start == 0) return 0;
    +  if (CurTok != ',')
    +    return Error("expected ',' after for start value");
    +  getNextToken();
    +  
    +  ExprAST *End = ParseExpression();
    +  if (End == 0) return 0;
    +  
    +  // The step value is optional.
    +  ExprAST *Step = 0;
    +  if (CurTok == ',') {
    +    getNextToken();
    +    Step = ParseExpression();
    +    if (Step == 0) return 0;
    +  }
    +  
    +  if (CurTok != tok_in)
    +    return Error("expected 'in' after for");
    +  getNextToken();  // eat 'in'.
    +  
    +  ExprAST *Body = ParseExpression();
    +  if (Body == 0) return 0;
    +
    +  return new ForExprAST(IdName, Start, End, Step, Body);
    +}
    +
    +/// primary
    +///   ::= identifierexpr
    +///   ::= numberexpr
    +///   ::= parenexpr
    +///   ::= ifexpr
    +///   ::= forexpr
    +static ExprAST *ParsePrimary() {
    +  switch (CurTok) {
    +  default: return Error("unknown token when expecting an expression");
    +  case tok_identifier: return ParseIdentifierExpr();
    +  case tok_number:     return ParseNumberExpr();
    +  case '(':            return ParseParenExpr();
    +  case tok_if:         return ParseIfExpr();
    +  case tok_for:        return ParseForExpr();
    +  }
    +}
    +
    +/// unary
    +///   ::= primary
    +///   ::= '!' unary
    +static ExprAST *ParseUnary() {
    +  // If the current token is not an operator, it must be a primary expr.
    +  if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
    +    return ParsePrimary();
    +  
    +  // If this is a unary operator, read it.
    +  int Opc = CurTok;
    +  getNextToken();
    +  if (ExprAST *Operand = ParseUnary())
    +    return new UnaryExprAST(Opc, Operand);
    +  return 0;
    +}
    +
    +/// binoprhs
    +///   ::= ('+' unary)*
    +static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
    +  // If this is a binop, find its precedence.
    +  while (1) {
    +    int TokPrec = GetTokPrecedence();
    +    
    +    // If this is a binop that binds at least as tightly as the current binop,
    +    // consume it, otherwise we are done.
    +    if (TokPrec < ExprPrec)
    +      return LHS;
    +    
    +    // Okay, we know this is a binop.
    +    int BinOp = CurTok;
    +    getNextToken();  // eat binop
    +    
    +    // Parse the unary expression after the binary operator.
    +    ExprAST *RHS = ParseUnary();
    +    if (!RHS) return 0;
    +    
    +    // If BinOp binds less tightly with RHS than the operator after RHS, let
    +    // the pending operator take RHS as its LHS.
    +    int NextPrec = GetTokPrecedence();
    +    if (TokPrec < NextPrec) {
    +      RHS = ParseBinOpRHS(TokPrec+1, RHS);
    +      if (RHS == 0) return 0;
    +    }
    +    
    +    // Merge LHS/RHS.
    +    LHS = new BinaryExprAST(BinOp, LHS, RHS);
    +  }
    +}
    +
    +/// expression
    +///   ::= unary binoprhs
    +///
    +static ExprAST *ParseExpression() {
    +  ExprAST *LHS = ParseUnary();
    +  if (!LHS) return 0;
    +  
    +  return ParseBinOpRHS(0, LHS);
    +}
    +
    +/// prototype
    +///   ::= id '(' id* ')'
    +///   ::= binary LETTER number? (id, id)
    +///   ::= unary LETTER (id)
    +static PrototypeAST *ParsePrototype() {
    +  std::string FnName;
    +  
    +  unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
    +  unsigned BinaryPrecedence = 30;
    +  
    +  switch (CurTok) {
    +  default:
    +    return ErrorP("Expected function name in prototype");
    +  case tok_identifier:
    +    FnName = IdentifierStr;
    +    Kind = 0;
    +    getNextToken();
    +    break;
    +  case tok_unary:
    +    getNextToken();
    +    if (!isascii(CurTok))
    +      return ErrorP("Expected unary operator");
    +    FnName = "unary";
    +    FnName += (char)CurTok;
    +    Kind = 1;
    +    getNextToken();
    +    break;
    +  case tok_binary:
    +    getNextToken();
    +    if (!isascii(CurTok))
    +      return ErrorP("Expected binary operator");
    +    FnName = "binary";
    +    FnName += (char)CurTok;
    +    Kind = 2;
    +    getNextToken();
    +    
    +    // Read the precedence if present.
    +    if (CurTok == tok_number) {
    +      if (NumVal < 1 || NumVal > 100)
    +        return ErrorP("Invalid precedence: must be 1..100");
    +      BinaryPrecedence = (unsigned)NumVal;
    +      getNextToken();
    +    }
    +    break;
    +  }
    +  
    +  if (CurTok != '(')
    +    return ErrorP("Expected '(' in prototype");
    +  
    +  std::vector<std::string> ArgNames;
    +  while (getNextToken() == tok_identifier)
    +    ArgNames.push_back(IdentifierStr);
    +  if (CurTok != ')')
    +    return ErrorP("Expected ')' in prototype");
    +  
    +  // success.
    +  getNextToken();  // eat ')'.
    +  
    +  // Verify right number of names for operator.
    +  if (Kind && ArgNames.size() != Kind)
    +    return ErrorP("Invalid number of operands for operator");
    +  
    +  return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
    +}
    +
    +/// definition ::= 'def' prototype expression
    +static FunctionAST *ParseDefinition() {
    +  getNextToken();  // eat def.
    +  PrototypeAST *Proto = ParsePrototype();
    +  if (Proto == 0) return 0;
    +
    +  if (ExprAST *E = ParseExpression())
    +    return new FunctionAST(Proto, E);
    +  return 0;
    +}
    +
    +/// toplevelexpr ::= expression
    +static FunctionAST *ParseTopLevelExpr() {
    +  if (ExprAST *E = ParseExpression()) {
    +    // Make an anonymous proto.
    +    PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
    +    return new FunctionAST(Proto, E);
    +  }
    +  return 0;
    +}
    +
    +/// external ::= 'extern' prototype
    +static PrototypeAST *ParseExtern() {
    +  getNextToken();  // eat extern.
    +  return ParsePrototype();
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Code Generation
    +//===----------------------------------------------------------------------===//
    +
    +static Module *TheModule;
    +static IRBuilder<> Builder(getGlobalContext());
    +static std::map<std::string, Value*> NamedValues;
    +static FunctionPassManager *TheFPM;
    +
    +Value *ErrorV(const char *Str) { Error(Str); return 0; }
    +
    +Value *NumberExprAST::Codegen() {
    +  return ConstantFP::get(getGlobalContext(), APFloat(Val));
    +}
    +
    +Value *VariableExprAST::Codegen() {
    +  // Look this variable up in the function.
    +  Value *V = NamedValues[Name];
    +  return V ? V : ErrorV("Unknown variable name");
    +}
    +
    +Value *UnaryExprAST::Codegen() {
    +  Value *OperandV = Operand->Codegen();
    +  if (OperandV == 0) return 0;
    +  
    +  Function *F = TheModule->getFunction(std::string("unary")+Opcode);
    +  if (F == 0)
    +    return ErrorV("Unknown unary operator");
    +  
    +  return Builder.CreateCall(F, OperandV, "unop");
    +}
    +
    +Value *BinaryExprAST::Codegen() {
    +  Value *L = LHS->Codegen();
    +  Value *R = RHS->Codegen();
    +  if (L == 0 || R == 0) return 0;
    +  
    +  switch (Op) {
    +  case '+': return Builder.CreateFAdd(L, R, "addtmp");
    +  case '-': return Builder.CreateFSub(L, R, "subtmp");
    +  case '*': return Builder.CreateFMul(L, R, "multmp");
    +  case '<':
    +    L = Builder.CreateFCmpULT(L, R, "cmptmp");
    +    // Convert bool 0/1 to double 0.0 or 1.0
    +    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
    +                                "booltmp");
    +  default: break;
    +  }
    +  
    +  // If it wasn't a builtin binary operator, it must be a user defined one. Emit
    +  // a call to it.
    +  Function *F = TheModule->getFunction(std::string("binary")+Op);
    +  assert(F && "binary operator not found!");
    +  
    +  Value *Ops[] = { L, R };
    +  return Builder.CreateCall(F, Ops, Ops+2, "binop");
    +}
    +
    +Value *CallExprAST::Codegen() {
    +  // Look up the name in the global module table.
    +  Function *CalleeF = TheModule->getFunction(Callee);
    +  if (CalleeF == 0)
    +    return ErrorV("Unknown function referenced");
    +  
    +  // If argument mismatch error.
    +  if (CalleeF->arg_size() != Args.size())
    +    return ErrorV("Incorrect # arguments passed");
    +
    +  std::vector<Value*> ArgsV;
    +  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
    +    ArgsV.push_back(Args[i]->Codegen());
    +    if (ArgsV.back() == 0) return 0;
    +  }
    +  
    +  return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
    +}
    +
    +Value *IfExprAST::Codegen() {
    +  Value *CondV = Cond->Codegen();
    +  if (CondV == 0) return 0;
    +  
    +  // Convert condition to a bool by comparing not-equal to 0.0.
    +  CondV = Builder.CreateFCmpONE(CondV, 
    +                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
    +                                "ifcond");
    +  
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +  
    +  // Create blocks for the then and else cases.  Insert the 'then' block at the
    +  // end of the function.
    +  BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
    +  BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
    +  BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
    +  
    +  Builder.CreateCondBr(CondV, ThenBB, ElseBB);
    +  
    +  // Emit then value.
    +  Builder.SetInsertPoint(ThenBB);
    +  
    +  Value *ThenV = Then->Codegen();
    +  if (ThenV == 0) return 0;
    +  
    +  Builder.CreateBr(MergeBB);
    +  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
    +  ThenBB = Builder.GetInsertBlock();
    +  
    +  // Emit else block.
    +  TheFunction->getBasicBlockList().push_back(ElseBB);
    +  Builder.SetInsertPoint(ElseBB);
    +  
    +  Value *ElseV = Else->Codegen();
    +  if (ElseV == 0) return 0;
    +  
    +  Builder.CreateBr(MergeBB);
    +  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
    +  ElseBB = Builder.GetInsertBlock();
    +  
    +  // Emit merge block.
    +  TheFunction->getBasicBlockList().push_back(MergeBB);
    +  Builder.SetInsertPoint(MergeBB);
    +  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()),
    +                                  "iftmp");
    +  
    +  PN->addIncoming(ThenV, ThenBB);
    +  PN->addIncoming(ElseV, ElseBB);
    +  return PN;
    +}
    +
    +Value *ForExprAST::Codegen() {
    +  // Output this as:
    +  //   ...
    +  //   start = startexpr
    +  //   goto loop
    +  // loop: 
    +  //   variable = phi [start, loopheader], [nextvariable, loopend]
    +  //   ...
    +  //   bodyexpr
    +  //   ...
    +  // loopend:
    +  //   step = stepexpr
    +  //   nextvariable = variable + step
    +  //   endcond = endexpr
    +  //   br endcond, loop, endloop
    +  // endloop:
    +  
    +  // Emit the start code first, without 'variable' in scope.
    +  Value *StartVal = Start->Codegen();
    +  if (StartVal == 0) return 0;
    +  
    +  // Make the new basic block for the loop header, inserting after current
    +  // block.
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +  BasicBlock *PreheaderBB = Builder.GetInsertBlock();
    +  BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
    +  
    +  // Insert an explicit fall through from the current block to the LoopBB.
    +  Builder.CreateBr(LoopBB);
    +
    +  // Start insertion in LoopBB.
    +  Builder.SetInsertPoint(LoopBB);
    +  
    +  // Start the PHI node with an entry for Start.
    +  PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), VarName.c_str());
    +  Variable->addIncoming(StartVal, PreheaderBB);
    +  
    +  // Within the loop, the variable is defined equal to the PHI node.  If it
    +  // shadows an existing variable, we have to restore it, so save it now.
    +  Value *OldVal = NamedValues[VarName];
    +  NamedValues[VarName] = Variable;
    +  
    +  // Emit the body of the loop.  This, like any other expr, can change the
    +  // current BB.  Note that we ignore the value computed by the body, but don't
    +  // allow an error.
    +  if (Body->Codegen() == 0)
    +    return 0;
    +  
    +  // Emit the step value.
    +  Value *StepVal;
    +  if (Step) {
    +    StepVal = Step->Codegen();
    +    if (StepVal == 0) return 0;
    +  } else {
    +    // If not specified, use 1.0.
    +    StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
    +  }
    +  
    +  Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
    +
    +  // Compute the end condition.
    +  Value *EndCond = End->Codegen();
    +  if (EndCond == 0) return EndCond;
    +  
    +  // Convert condition to a bool by comparing not-equal to 0.0.
    +  EndCond = Builder.CreateFCmpONE(EndCond, 
    +                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
    +                                  "loopcond");
    +  
    +  // Create the "after loop" block and insert it.
    +  BasicBlock *LoopEndBB = Builder.GetInsertBlock();
    +  BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
    +  
    +  // Insert the conditional branch into the end of LoopEndBB.
    +  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
    +  
    +  // Any new code will be inserted in AfterBB.
    +  Builder.SetInsertPoint(AfterBB);
    +  
    +  // Add a new entry to the PHI node for the backedge.
    +  Variable->addIncoming(NextVar, LoopEndBB);
    +  
    +  // Restore the unshadowed variable.
    +  if (OldVal)
    +    NamedValues[VarName] = OldVal;
    +  else
    +    NamedValues.erase(VarName);
    +
    +  
    +  // for expr always returns 0.0.
    +  return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
    +}
    +
    +Function *PrototypeAST::Codegen() {
    +  // Make the function type:  double(double,double) etc.
    +  std::vector<const Type*> Doubles(Args.size(),
    +                                   Type::getDoubleTy(getGlobalContext()));
    +  FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
    +                                       Doubles, false);
    +  
    +  Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
    +  
    +  // If F conflicted, there was already something named 'Name'.  If it has a
    +  // body, don't allow redefinition or reextern.
    +  if (F->getName() != Name) {
    +    // Delete the one we just made and get the existing one.
    +    F->eraseFromParent();
    +    F = TheModule->getFunction(Name);
    +    
    +    // If F already has a body, reject this.
    +    if (!F->empty()) {
    +      ErrorF("redefinition of function");
    +      return 0;
    +    }
    +    
    +    // If F took a different number of args, reject.
    +    if (F->arg_size() != Args.size()) {
    +      ErrorF("redefinition of function with different # args");
    +      return 0;
    +    }
    +  }
    +  
    +  // Set names for all arguments.
    +  unsigned Idx = 0;
    +  for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
    +       ++AI, ++Idx) {
    +    AI->setName(Args[Idx]);
    +    
    +    // Add arguments to variable symbol table.
    +    NamedValues[Args[Idx]] = AI;
    +  }
    +  
    +  return F;
    +}
    +
    +Function *FunctionAST::Codegen() {
    +  NamedValues.clear();
    +  
    +  Function *TheFunction = Proto->Codegen();
    +  if (TheFunction == 0)
    +    return 0;
    +  
    +  // If this is an operator, install it.
    +  if (Proto->isBinaryOp())
    +    BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();
    +  
    +  // Create a new basic block to start insertion into.
    +  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
    +  Builder.SetInsertPoint(BB);
    +  
    +  if (Value *RetVal = Body->Codegen()) {
    +    // Finish off the function.
    +    Builder.CreateRet(RetVal);
    +
    +    // Validate the generated code, checking for consistency.
    +    verifyFunction(*TheFunction);
    +
    +    // Optimize the function.
    +    TheFPM->run(*TheFunction);
    +    
    +    return TheFunction;
    +  }
    +  
    +  // Error reading body, remove function.
    +  TheFunction->eraseFromParent();
    +
    +  if (Proto->isBinaryOp())
    +    BinopPrecedence.erase(Proto->getOperatorName());
    +  return 0;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Top-Level parsing and JIT Driver
    +//===----------------------------------------------------------------------===//
    +
    +static ExecutionEngine *TheExecutionEngine;
    +
    +static void HandleDefinition() {
    +  if (FunctionAST *F = ParseDefinition()) {
    +    if (Function *LF = F->Codegen()) {
    +      fprintf(stderr, "Read function definition:");
    +      LF->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleExtern() {
    +  if (PrototypeAST *P = ParseExtern()) {
    +    if (Function *F = P->Codegen()) {
    +      fprintf(stderr, "Read extern: ");
    +      F->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleTopLevelExpression() {
    +  // Evaluate a top-level expression into an anonymous function.
    +  if (FunctionAST *F = ParseTopLevelExpr()) {
    +    if (Function *LF = F->Codegen()) {
    +      // JIT the function, returning a function pointer.
    +      void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
    +      
    +      // Cast it to the right type (takes no arguments, returns a double) so we
    +      // can call it as a native function.
    +      double (*FP)() = (double (*)())(intptr_t)FPtr;
    +      fprintf(stderr, "Evaluated to %f\n", FP());
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +/// top ::= definition | external | expression | ';'
    +static void MainLoop() {
    +  while (1) {
    +    fprintf(stderr, "ready> ");
    +    switch (CurTok) {
    +    case tok_eof:    return;
    +    case ';':        getNextToken(); break;  // ignore top-level semicolons.
    +    case tok_def:    HandleDefinition(); break;
    +    case tok_extern: HandleExtern(); break;
    +    default:         HandleTopLevelExpression(); break;
    +    }
    +  }
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// "Library" functions that can be "extern'd" from user code.
    +//===----------------------------------------------------------------------===//
    +
    +/// putchard - putchar that takes a double and returns 0.
    +extern "C" 
    +double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +/// printd - printf that takes a double prints it as "%f\n", returning 0.
    +extern "C" 
    +double printd(double X) {
    +  printf("%f\n", X);
    +  return 0;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Main driver code.
    +//===----------------------------------------------------------------------===//
    +
    +int main() {
    +  InitializeNativeTarget();
    +  LLVMContext &Context = getGlobalContext();
    +
    +  // Install standard binary operators.
    +  // 1 is lowest precedence.
    +  BinopPrecedence['<'] = 10;
    +  BinopPrecedence['+'] = 20;
    +  BinopPrecedence['-'] = 20;
    +  BinopPrecedence['*'] = 40;  // highest.
    +
    +  // Prime the first token.
    +  fprintf(stderr, "ready> ");
    +  getNextToken();
    +
    +  // Make the module, which holds all the code.
    +  TheModule = new Module("my cool jit", Context);
    +
    +  // Create the JIT.  This takes ownership of the module.
    +  std::string ErrStr;
    +  TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).create();
    +  if (!TheExecutionEngine) {
    +    fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
    +    exit(1);
    +  }
    +
    +  FunctionPassManager OurFPM(TheModule);
    +
    +  // Set up the optimizer pipeline.  Start with registering info about how the
    +  // target lays out data structures.
    +  OurFPM.add(new TargetData(*TheExecutionEngine->getTargetData()));
    +  // Do simple "peephole" optimizations and bit-twiddling optzns.
    +  OurFPM.add(createInstructionCombiningPass());
    +  // Reassociate expressions.
    +  OurFPM.add(createReassociatePass());
    +  // Eliminate Common SubExpressions.
    +  OurFPM.add(createGVNPass());
    +  // Simplify the control flow graph (deleting unreachable blocks, etc).
    +  OurFPM.add(createCFGSimplificationPass());
    +
    +  OurFPM.doInitialization();
    +
    +  // Set the global so the code gen can use this.
    +  TheFPM = &OurFPM;
    +
    +  // Run the main "interpreter loop" now.
    +  MainLoop();
    +
    +  TheFPM = 0;
    +
    +  // Print out all of the generated code.
    +  TheModule->dump();
    +
    +  return 0;
    +}
    +
    +
    + +Next: Extending the language: mutable variables / SSA construction +
    + + +
    +
    + Chris Lattner
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-09-01 13:09:20 -0700 (Wed, 01 Sep 2010) $ +
    +

    Added: www-releases/trunk/2.8/docs/tutorial/LangImpl7.html
    URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/LangImpl7.html?rev=115556&view=auto
    ==============================================================================
    --- www-releases/trunk/2.8/docs/tutorial/LangImpl7.html (added)
    +++ www-releases/trunk/2.8/docs/tutorial/LangImpl7.html Mon Oct 4 15:49:23 2010
    @@ -0,0 +1,2164 @@
    + Kaleidoscope: Extending the Language: Mutable Variables / SSA construction
    Kaleidoscope: Extending the Language: Mutable Variables
    + + + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    Welcome to Chapter 7 of the "Implementing a language with LLVM" tutorial. In chapters 1 through 6, we've built a very respectable, albeit simple, functional programming language. In our journey, we learned some parsing techniques, how to build and represent an AST, how to build LLVM IR, and how to optimize the resultant code as well as JIT compile it.

    + +

    While Kaleidoscope is interesting as a functional language, the fact that it is functional makes it "too easy" to generate LLVM IR for it. In particular, a functional language makes it very easy to build LLVM IR directly in SSA form. Since LLVM requires that the input code be in SSA form, this is a very nice property. However, it is often unclear to newcomers how to generate code for an imperative language with mutable variables.

    + +

    The short (and happy) summary of this chapter is that there is no need for your front-end to build SSA form: LLVM provides highly tuned and well tested support for this, though the way it works is a bit unexpected for some.

    + +
    + + + + + +
    + +

    To understand why mutable variables cause complexities in SSA construction, consider this extremely simple C example:

    + +
    +
    +int G, H;
    +int test(_Bool Condition) {
    +  int X;
    +  if (Condition)
    +    X = G;
    +  else
    +    X = H;
    +  return X;
    +}
    +
    +
    + +

    In this case, we have the variable "X", whose value depends on the path executed in the program. Because there are two different possible values for X before the return instruction, a PHI node is inserted to merge the two values. The LLVM IR that we want for this example looks like this:

    + +
    +
    +@G = weak global i32 0   ; type of @G is i32*
    +@H = weak global i32 0   ; type of @H is i32*
    +
    +define i32 @test(i1 %Condition) {
    +entry:
    +	br i1 %Condition, label %cond_true, label %cond_false
    +
    +cond_true:
    +	%X.0 = load i32* @G
    +	br label %cond_next
    +
    +cond_false:
    +	%X.1 = load i32* @H
    +	br label %cond_next
    +
    +cond_next:
    +	%X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
    +	ret i32 %X.2
    +}
    +
    +
    + +

    In this example, the loads from the G and H global variables are explicit in the LLVM IR, and they live in the then/else branches of the if statement (cond_true/cond_false). In order to merge the incoming values, the X.2 phi node in the cond_next block selects the right value to use based on where control flow is coming from: if control flow comes from the cond_false block, X.2 gets the value of X.1. Alternatively, if control flow comes from cond_true, it gets the value of X.0. The intent of this chapter is not to explain the details of SSA form. For more information, see one of the many online references.

    + +
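    The selection rule of the phi node can be modeled in plain C++. The following is only a hand-written sketch of the SSA version of test(), not something LLVM produces: the function name testModel, the Pred string, and the initial values of G and H are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the globals @G and @H; the initial values are arbitrary.
static int G = 10, H = 20;

// Hand-written model of the SSA form of test(); hypothetical, for illustration.
int testModel(bool Condition) {
  std::string Pred;      // name of the block control flow arrived from
  int X0 = 0, X1 = 0;    // %X.0 and %X.1; only one is assigned on any run

  if (Condition) {       // cond_true
    X0 = G;
    Pred = "cond_true";
  } else {               // cond_false
    X1 = H;
    Pred = "cond_false";
  }

  // cond_next: %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  int X2 = (Pred == "cond_false") ? X1 : X0;
  return X2;
}
```

    Calling testModel(true) takes the cond_true edge and yields the value of G, while testModel(false) yields the value of H. Each of X0, X1 and X2 is assigned exactly once, which is the property SSA form requires.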

    The question for this article is "who places the phi nodes when lowering assignments to mutable variables?". The issue here is that LLVM requires that its IR be in SSA form: there is no "non-SSA" mode for it. However, SSA construction requires non-trivial algorithms and data structures, so it is inconvenient and wasteful for every front-end to have to reproduce this logic.

    + +
    + + + + + +
    + +

    The 'trick' here is that while LLVM does require all register values to be in SSA form, it does not require (or permit) memory objects to be in SSA form. In the example above, note that the loads from G and H are direct accesses to G and H: they are not renamed or versioned. This differs from some other compiler systems, which do try to version memory objects. In LLVM, instead of encoding dataflow analysis of memory into the LLVM IR, it is handled with Analysis Passes which are computed on demand.

    + +

    With this in mind, the high-level idea is that we want to make a stack variable (which lives in memory, because it is on the stack) for each mutable object in a function. To take advantage of this trick, we need to talk about how LLVM represents stack variables.

    + +

    In LLVM, all memory accesses are explicit with load/store instructions, and it is carefully designed not to have (or need) an "address-of" operator. Notice how the type of the @G/@H global variables is actually "i32*" even though the variable is defined as "i32". What this means is that @G defines space for an i32 in the global data area, but its name actually refers to the address for that space. Stack variables work the same way, except that instead of being declared with global variable definitions, they are declared with the LLVM alloca instruction:

    + +
    +
    +define i32 @example() {
    +entry:
    +	%X = alloca i32           ; type of %X is i32*.
    +	...
    +	%tmp = load i32* %X       ; load the stack value %X from the stack.
    +	%tmp2 = add i32 %tmp, 1   ; increment it
    +	store i32 %tmp2, i32* %X  ; store it back
    +	...
    +
    +
    + +

    This code shows an example of how you can declare and manipulate a stack variable in the LLVM IR. Stack memory allocated with the alloca instruction is fully general: you can pass the address of the stack slot to functions, you can store it in other variables, etc. In our example above, we could rewrite the example to use the alloca technique to avoid using a PHI node:

    + +
    +
    +@G = weak global i32 0   ; type of @G is i32*
    +@H = weak global i32 0   ; type of @H is i32*
    +
    +define i32 @test(i1 %Condition) {
    +entry:
    +	%X = alloca i32           ; type of %X is i32*.
    +	br i1 %Condition, label %cond_true, label %cond_false
    +
    +cond_true:
    +	%X.0 = load i32* @G
    +        store i32 %X.0, i32* %X   ; Update X
    +	br label %cond_next
    +
    +cond_false:
    +	%X.1 = load i32* @H
    +        store i32 %X.1, i32* %X   ; Update X
    +	br label %cond_next
    +
    +cond_next:
    +	%X.2 = load i32* %X       ; Read X
    +	ret i32 %X.2
    +}
    +
    +
    + +

    With this, we have discovered a way to handle arbitrary mutable variables without the need to create Phi nodes at all:

    + +
      +
    1. Each mutable variable becomes a stack allocation.
    2. Each read of the variable becomes a load from the stack.
    3. Each update of the variable becomes a store to the stack.
    4. Taking the address of a variable just uses the stack address directly.
    + +
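    The four rules above can be sketched as a tiny emitter that prints textual IR. This is a minimal illustration, not an LLVM API: the StackLowering struct and its member names are hypothetical, every variable is assumed to be an i32, and the output uses the same load/store syntax as the examples in this chapter.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of LLVM): collects textual IR lines that
// follow the four lowering rules for mutable i32 variables.
struct StackLowering {
  std::vector<std::string> Lines;  // emitted instructions, in order
  unsigned NextTmp = 0;            // counter used to name temporaries

  // Rule 1: each mutable variable becomes a stack allocation.
  void declareVar(const std::string &Name) {
    Lines.push_back("%" + Name + " = alloca i32");
  }

  // Rule 2: each read becomes a load; returns the name of the loaded value.
  std::string readVar(const std::string &Name) {
    std::string Tmp = "%tmp" + std::to_string(NextTmp++);
    Lines.push_back(Tmp + " = load i32* %" + Name);
    return Tmp;
  }

  // Rule 3: each update becomes a store to the stack slot.
  void writeVar(const std::string &Name, const std::string &Value) {
    Lines.push_back("store i32 " + Value + ", i32* %" + Name);
  }

  // Rule 4: the address of a variable is simply the alloca's own name.
  std::string addressOf(const std::string &Name) const { return "%" + Name; }
};
```

    Declaring "X", reading it, and writing the loaded value back emits an alloca, a load and a store, in that order. No phi node appears anywhere, because the stack slot carries the value across control flow.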

    While this solution has solved our immediate problem, it introduced another one: we have now apparently introduced a lot of stack traffic for very simple and common operations, a major performance problem. Fortunately for us, the LLVM optimizer has a highly-tuned optimization pass named "mem2reg" that handles this case, promoting allocas like this into SSA registers, inserting Phi nodes as appropriate. If you run this example through the pass, for example, you'll get:

    + +
    +
    +$ llvm-as < example.ll | opt -mem2reg | llvm-dis
    +@G = weak global i32 0
    +@H = weak global i32 0
    +
    +define i32 @test(i1 %Condition) {
    +entry:
    +	br i1 %Condition, label %cond_true, label %cond_false
    +
    +cond_true:
    +	%X.0 = load i32* @G
    +	br label %cond_next
    +
    +cond_false:
    +	%X.1 = load i32* @H
    +	br label %cond_next
    +
    +cond_next:
    +	%X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
    +	ret i32 %X.01
    +}
    +
    +
    + +

    The mem2reg pass implements the standard "iterated dominance frontier" algorithm for constructing SSA form and has a number of optimizations that speed up (very common) degenerate cases. The mem2reg optimization pass is the answer to dealing with mutable variables, and we highly recommend that you depend on it. Note that mem2reg only works on variables in certain circumstances:

    + +
      +
    1. mem2reg is alloca-driven: it looks for allocas and if it can handle them, it promotes them. It does not apply to global variables or heap allocations.
    2. mem2reg only looks for alloca instructions in the entry block of the function. Being in the entry block guarantees that the alloca is only executed once, which makes analysis simpler.
    3. mem2reg only promotes allocas whose uses are direct loads and stores. If the address of the stack object is passed to a function, or if any funny pointer arithmetic is involved, the alloca will not be promoted.
    4. mem2reg only works on allocas of first class values (such as pointers, scalars and vectors), and only if the array size of the allocation is 1 (or missing in the .ll file). mem2reg is not capable of promoting structs or arrays to registers. Note that the "scalarrepl" pass is more powerful and can promote structs, "unions", and arrays in many cases.
    + +
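    The conditions above can be read as a checklist. The sketch below restates them as a C++ predicate over a hand-filled summary of one alloca. It is an assumption-laden illustration only: AllocaInfo, its fields, and looksPromotable are hypothetical names, and this is not how mem2reg itself is implemented or queried.

```cpp
#include <cassert>

// Hypothetical summary of one alloca instruction; the fields mirror the
// conditions listed above and are filled in by hand for illustration.
struct AllocaInfo {
  bool InEntryBlock;              // the alloca lives in the entry block
  bool OnlyDirectLoadsAndStores;  // address never escapes, no pointer math
  bool FirstClassType;            // pointer, scalar or vector (no struct/array)
  unsigned ArraySize;             // array size operand of the alloca
};

// Returns true when every listed condition for promotion holds.
// (The first condition, "it is an alloca at all", is implied by the type.)
bool looksPromotable(const AllocaInfo &A) {
  return A.InEntryBlock && A.OnlyDirectLoadsAndStores &&
         A.FirstClassType && A.ArraySize == 1;
}
```

    For example, a scalar alloca in the entry block that is only loaded and stored passes the check, while the same alloca fails it as soon as its address is handed to a call, since OnlyDirectLoadsAndStores would then be false.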

    All of these properties are easy to satisfy for most imperative languages, and we'll illustrate it below with Kaleidoscope. The final question you may be asking is: should I bother with this nonsense for my front-end? Wouldn't it be better if I just did SSA construction directly, avoiding use of the mem2reg optimization pass? In short, we strongly recommend that you use this technique for building SSA form, unless there is an extremely good reason not to. Using this technique is:

    + +
      +
    • Proven and well tested: llvm-gcc and clang both use this technique for local mutable variables. As such, the most common clients of LLVM are using this to handle the bulk of their variables. You can be sure that bugs are found fast and fixed early.
    • Extremely Fast: mem2reg has a number of special cases that make it fast in common cases as well as fully general. For example, it has fast-paths for variables that are only used in a single block, variables that only have one assignment point, good heuristics to avoid insertion of unneeded phi nodes, etc.
    • Needed for debug info generation: Debug information in LLVM relies on having the address of the variable exposed so that debug info can be attached to it. This technique dovetails very naturally with this style of debug info.
    + +

    If nothing else, this makes it much easier to get your front-end up and running, and is very simple to implement. Let's extend Kaleidoscope with mutable variables now!

    + +
    + + + + + +
    + +

    Now that we know the sort of problem we want to tackle, let's see what this looks like in the context of our little Kaleidoscope language. We're going to add two features:

    + +
      +
    1. The ability to mutate variables with the '=' operator.
    2. The ability to define new variables.
    + +

    While the first item is really what this is about, we only have variables +for incoming arguments as well as for induction variables, and redefining those only +goes so far :). Also, the ability to define new variables is a +useful thing regardless of whether you will be mutating them. Here's a +motivating example that shows how we could use these:

    + +
    +
    +# Define ':' for sequencing: as a low-precedence operator that ignores operands
    +# and just returns the RHS.
    +def binary : 1 (x y) y;
    +
    +# Recursive fib, we could do this before.
    +def fib(x)
    +  if (x < 3) then
    +    1
    +  else
    +    fib(x-1)+fib(x-2);
    +
    +# Iterative fib.
    +def fibi(x)
    +  var a = 1, b = 1, c in
    +  (for i = 3, i < x in 
    +     c = a + b :
    +     a = b :
    +     b = c) :
    +  b;
    +
    +# Call it. 
    +fibi(10);
    +
    +
    + +

    +In order to mutate variables, we have to change our existing variables to use +the "alloca trick". Once we have that, we'll add our new operator, then extend +Kaleidoscope to support new variable definitions. +

    + +
    + + + + + +
    + +

    +The symbol table in Kaleidoscope is managed at code generation time by the 'NamedValues' map. This map currently keeps track of the LLVM "Value*" that holds the double value for the named variable. In order to support mutation, we need to change this slightly, so that NamedValues holds the memory location of the variable in question. Note that this change is a refactoring: it changes the structure of the code, but does not (by itself) change the behavior of the compiler. All of these changes are isolated in the Kaleidoscope code generator.

    + +

    +At this point in Kaleidoscope's development, it only supports variables for two +things: incoming arguments to functions and the induction variable of 'for' +loops. For consistency, we'll allow mutation of these variables in addition to +other user-defined variables. This means that these will both need memory +locations. +

    + +

    To start our transformation of Kaleidoscope, we'll change the NamedValues +map so that it maps to AllocaInst* instead of Value*. Once we do this, the C++ +compiler will tell us what parts of the code we need to update:

    + +
    +
    +static std::map<std::string, AllocaInst*> NamedValues;
    +
    +
    + +

    Also, since we will need to create these allocas, we'll use a helper function that ensures that the allocas are created in the entry block of the function:

    + +
    +
    +/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
    +/// the function.  This is used for mutable variables etc.
    +static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
    +                                          const std::string &VarName) {
    +  IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
    +                 TheFunction->getEntryBlock().begin());
    +  return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0,
    +                           VarName.c_str());
    +}
    +
    +
    + +

    This funny looking code creates an IRBuilder object that is pointing at +the first instruction (.begin()) of the entry block. It then creates an alloca +with the expected name and returns it. Because all values in Kaleidoscope are +doubles, there is no need to pass in a type to use.

    + +

    With this in place, the first functionality change we want to make is to +variable references. In our new scheme, variables live on the stack, so code +generating a reference to them actually needs to produce a load from the stack +slot:

    + +
    +
    +Value *VariableExprAST::Codegen() {
    +  // Look this variable up in the function.
    +  Value *V = NamedValues[Name];
    +  if (V == 0) return ErrorV("Unknown variable name");
    +
    +  // Load the value.
    +  return Builder.CreateLoad(V, Name.c_str());
    +}
    +
    +
    + +

    As you can see, this is pretty straightforward. Now we need to update the +things that define the variables to set up the alloca. We'll start with +ForExprAST::Codegen (see the full code listing for +the unabridged code):

    + +
    +
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +
    +  // Create an alloca for the variable in the entry block.
    +  AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    +  
    +  // Emit the start code first, without 'variable' in scope.
    +  Value *StartVal = Start->Codegen();
    +  if (StartVal == 0) return 0;
    +  
    +  // Store the value into the alloca.
    +  Builder.CreateStore(StartVal, Alloca);
    +  ...
    +
    +  // Compute the end condition.
    +  Value *EndCond = End->Codegen();
    +  if (EndCond == 0) return EndCond;
    +  
    +  // Reload, increment, and restore the alloca.  This handles the case where
    +  // the body of the loop mutates the variable.
    +  Value *CurVar = Builder.CreateLoad(Alloca);
    +  Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
    +  Builder.CreateStore(NextVar, Alloca);
    +  ...
    +
    +
    + +

    This code is virtually identical to the code before we allowed mutable variables. The +big difference is that we no longer have to construct a PHI node, and we use +load/store to access the variable as needed.
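    The order of operations in the loop code above can be mirrored in plain C++. The sketch below is not LLVM code; it is a hand-written rendering of the fibi example from earlier, assuming that the elided part of the listing emits the loop body before the end condition and that an omitted step defaults to 1.0. The names fibiSketch and iSlot are made up for illustration, with iSlot standing in for the loop variable's alloca.

```cpp
#include <cassert>

// Plain C++ rendering of 'fibi', following the order used by the loop
// codegen: body, end condition, then reload/increment/store of the loop
// variable. All values are doubles, as in Kaleidoscope.
double fibiSketch(double x) {
  double a = 1, b = 1, c = 0;  // 'var a = 1, b = 1, c in' (c defaults to 0.0)
  double iSlot = 3;            // stack slot for 'i', initialized by the store
  const double step = 1.0;     // assumed default step
  bool keepGoing;
  do {
    c = a + b;                 // loop body
    a = b;
    b = c;
    keepGoing = iSlot < x;     // end condition sees the pre-increment value
    double cur = iSlot;        // reload from the slot
    iSlot = cur + step;        // increment and store back
  } while (keepGoing);
  return b;
}
```

    Under these assumptions the body always runs at least once, and the end condition is tested against the value of i before the increment.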

    + +

    To support mutable argument variables, we need to also make allocas for them. +The code for this is also pretty simple:

    + +
    +
    +/// CreateArgumentAllocas - Create an alloca for each argument and register the
    +/// argument in the symbol table so that references to it will succeed.
    +void PrototypeAST::CreateArgumentAllocas(Function *F) {
    +  Function::arg_iterator AI = F->arg_begin();
    +  for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
    +    // Create an alloca for this variable.
    +    AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
    +
    +    // Store the initial value into the alloca.
    +    Builder.CreateStore(AI, Alloca);
    +
    +    // Add arguments to variable symbol table.
    +    NamedValues[Args[Idx]] = Alloca;
    +  }
    +}
    +
    +
    + +

    For each argument, we make an alloca, store the input value to the function +into the alloca, and register the alloca as the memory location for the +argument. This method gets invoked by FunctionAST::Codegen right after +it sets up the entry block for the function.

    + +

    The final missing piece is adding the mem2reg pass, which allows us to get +good codegen once again:

    + +
    +
    +    // Set up the optimizer pipeline.  Start with registering info about how the
    +    // target lays out data structures.
    +    OurFPM.add(new TargetData(*TheExecutionEngine->getTargetData()));
    +    // Promote allocas to registers.
    +    OurFPM.add(createPromoteMemoryToRegisterPass());
    +    // Do simple "peephole" optimizations and bit-twiddling optzns.
    +    OurFPM.add(createInstructionCombiningPass());
    +    // Reassociate expressions.
    +    OurFPM.add(createReassociatePass());
    +
    +
    + +

    It is interesting to see what the code looks like before and after the +mem2reg optimization runs. For example, this is the before/after code for our +recursive fib function. Before the optimization:

    + +
    +
    +define double @fib(double %x) {
    +entry:
    +	%x1 = alloca double
    +	store double %x, double* %x1
    +	%x2 = load double* %x1
    +	%cmptmp = fcmp ult double %x2, 3.000000e+00
    +	%booltmp = uitofp i1 %cmptmp to double
    +	%ifcond = fcmp one double %booltmp, 0.000000e+00
    +	br i1 %ifcond, label %then, label %else
    +
    +then:		; preds = %entry
    +	br label %ifcont
    +
    +else:		; preds = %entry
    +	%x3 = load double* %x1
    +	%subtmp = fsub double %x3, 1.000000e+00
    +	%calltmp = call double @fib(double %subtmp)
    +	%x4 = load double* %x1
    +	%subtmp5 = fsub double %x4, 2.000000e+00
    +	%calltmp6 = call double @fib(double %subtmp5)
    +	%addtmp = fadd double %calltmp, %calltmp6
    +	br label %ifcont
    +
    +ifcont:		; preds = %else, %then
    +	%iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
    +	ret double %iftmp
    +}
    +
    +
    + +

    Here there is only one variable (x, the input argument) but you can still +see the extremely simple-minded code generation strategy we are using. In the +entry block, an alloca is created, and the initial input value is stored into +it. Each reference to the variable does a reload from the stack. Also, note +that we didn't modify the if/then/else expression, so it still inserts a PHI +node. While we could make an alloca for it, it is actually easier to create a +PHI node for it, so we still just make the PHI.

    + +

    Here is the code after the mem2reg pass runs:

    + +
    +
    +define double @fib(double %x) {
    +entry:
    +	%cmptmp = fcmp ult double %x, 3.000000e+00
    +	%booltmp = uitofp i1 %cmptmp to double
    +	%ifcond = fcmp one double %booltmp, 0.000000e+00
    +	br i1 %ifcond, label %then, label %else
    +
    +then:
    +	br label %ifcont
    +
    +else:
    +	%subtmp = fsub double %x, 1.000000e+00
    +	%calltmp = call double @fib(double %subtmp)
    +	%subtmp5 = fsub double %x, 2.000000e+00
    +	%calltmp6 = call double @fib(double %subtmp5)
    +	%addtmp = fadd double %calltmp, %calltmp6
    +	br label %ifcont
    +
    +ifcont:		; preds = %else, %then
    +	%iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
    +	ret double %iftmp
    +}
    +
    +
    + +

    This is a trivial case for mem2reg, since there are no redefinitions of the variable. The point of showing this is to calm your tension about inserting such blatant inefficiencies :).

    + +
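    To see why a slot with a single store is the easy case, here is a toy promotion over a made-up three-field instruction format. This is only a sketch: ToyInst and promoteSingleStoreSketch are hypothetical, the code assumes the single store comes before every load of the slot, and it is not the iterated dominance frontier algorithm that mem2reg implements.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical instruction: Op is "alloca" (defines slot Dst), "store"
// (value Src into slot Dst), "load" (slot Src into value Dst), or any other
// opcode that uses Src and defines Dst.
struct ToyInst { std::string Op, Dst, Src; };

// Removes slots that are stored to exactly once, forwarding the stored
// value to every use of a load from that slot.
std::vector<ToyInst> promoteSingleStoreSketch(const std::vector<ToyInst> &In) {
  std::map<std::string, int> StoreCount;
  for (const ToyInst &I : In)
    if (I.Op == "store") ++StoreCount[I.Dst];

  std::map<std::string, std::string> SlotValue;  // slot -> stored value
  std::map<std::string, std::string> Renamed;    // load result -> stored value
  std::vector<ToyInst> Out;
  for (ToyInst I : In) {
    if (Renamed.count(I.Src)) I.Src = Renamed[I.Src];  // use forwarded value
    if (I.Op == "alloca" && StoreCount[I.Dst] == 1)
      continue;                                  // the slot disappears
    if (I.Op == "store" && StoreCount[I.Dst] == 1) {
      SlotValue[I.Dst] = I.Src;
      continue;                                  // the store disappears
    }
    if (I.Op == "load" && SlotValue.count(I.Src)) {
      Renamed[I.Dst] = SlotValue[I.Src];
      continue;                                  // the load disappears
    }
    Out.push_back(I);
  }
  return Out;
}
```

    Fed a model of the entry block of fib (alloca, store of x, load, compare), only the compare survives, and it now reads x directly.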

    After the rest of the optimizers run, we get:

    + +
    +
    +define double @fib(double %x) {
    +entry:
    +	%cmptmp = fcmp ult double %x, 3.000000e+00
    +	%booltmp = uitofp i1 %cmptmp to double
    +	%ifcond = fcmp ueq double %booltmp, 0.000000e+00
    +	br i1 %ifcond, label %else, label %ifcont
    +
    +else:
    +	%subtmp = fsub double %x, 1.000000e+00
    +	%calltmp = call double @fib(double %subtmp)
    +	%subtmp5 = fsub double %x, 2.000000e+00
    +	%calltmp6 = call double @fib(double %subtmp5)
    +	%addtmp = fadd double %calltmp, %calltmp6
    +	ret double %addtmp
    +
    +ifcont:
    +	ret double 1.000000e+00
    +}
    +
    +
    + +

    Here we see that the simplifycfg pass decided to clone the return instruction +into the end of the 'else' block. This allowed it to eliminate some branches +and the PHI node.

    + +

    Now that all symbol table references are updated to use stack variables, +we'll add the assignment operator.

    + +
    + + + + + +
    + +

    With our current framework, adding a new assignment operator is really +simple. We will parse it just like any other binary operator, but handle it +internally (instead of allowing the user to define it). The first step is to +set a precedence:

    + +
    +
    + int main() {
    +   // Install standard binary operators.
    +   // 1 is lowest precedence.
    +   BinopPrecedence['='] = 2;
    +   BinopPrecedence['<'] = 10;
    +   BinopPrecedence['+'] = 20;
    +   BinopPrecedence['-'] = 20;
    +
    +
    + +

    Now that the parser knows the precedence of the binary operator, it takes +care of all the parsing and AST generation. We just need to implement codegen +for the assignment operator. This looks like:

    + +
    +
    +Value *BinaryExprAST::Codegen() {
    +  // Special case '=' because we don't want to emit the LHS as an expression.
    +  if (Op == '=') {
    +    // Assignment requires the LHS to be an identifier.
    +    VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS);
    +    if (!LHSE)
    +      return ErrorV("destination of '=' must be a variable");
    +
    +
    + +

    Unlike the rest of the binary operators, our assignment operator doesn't +follow the "emit LHS, emit RHS, do computation" model. As such, it is handled +as a special case before the other binary operators are handled. The other +strange thing is that it requires the LHS to be a variable. It is invalid to +have "(x+1) = expr" - only things like "x = expr" are allowed. +

    + +
    +
    +    // Codegen the RHS.
    +    Value *Val = RHS->Codegen();
    +    if (Val == 0) return 0;
    +
    +    // Look up the name.
    +    Value *Variable = NamedValues[LHSE->getName()];
    +    if (Variable == 0) return ErrorV("Unknown variable name");
    +
    +    Builder.CreateStore(Val, Variable);
    +    return Val;
    +  }
    +  ...  
    +
    +
    + +

    Once we have the variable, codegen'ing the assignment is straightforward: +we emit the RHS of the assignment, create a store, and return the computed +value. Returning a value allows for chained assignments like "X = (Y = Z)".
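    The "store, then return the stored value" behaviour can be sketched without LLVM. In the hypothetical model below, Slots plays the role of NamedValues and assignSketch mirrors the '=' special case; both names are made up for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for NamedValues: each name maps to a memory slot.
static std::map<std::string, double> Slots;

// Mirrors the '=' special case: the left side is a name, not a value.
// Returns false if the name is unknown. Otherwise it performs the store and
// reports the stored value through Result, so that the assignment can itself
// be used as the right-hand side of another assignment.
bool assignSketch(const std::string &Name, double Val, double &Result) {
  std::map<std::string, double>::iterator It = Slots.find(Name);
  if (It == Slots.end())
    return false;      // corresponds to "Unknown variable name"
  It->second = Val;    // the store
  Result = Val;        // the value of the whole expression
  return true;
}
```

    Evaluating the inner assignment first and passing its result to the outer one gives the effect of "X = (Y = Z)".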

    + +

    Now that we have an assignment operator, we can mutate loop variables and +arguments. For example, we can now run code like this:

    + +
    +
    +# Function to print a double.
    +extern printd(x);
    +
    +# Define ':' for sequencing: as a low-precedence operator that ignores operands
    +# and just returns the RHS.
    +def binary : 1 (x y) y;
    +
    +def test(x)
    +  printd(x) :
    +  x = 4 :
    +  printd(x);
    +
    +test(123);
    +
    +
    + +

    When run, this example prints "123" and then "4", showing that we did actually mutate the value! Okay, we have now officially implemented our goal: getting this to work requires SSA construction in the general case. However, to be really useful, we want the ability to define our own local variables, so let's add this next!

    + +
    + + + + + +
    + +

    Adding var/in is just like any other extension we made to Kaleidoscope: we extend the lexer, the parser, the AST and the code generator. The first step for adding our new 'var/in' construct is to extend the lexer. As before, this is pretty trivial; the code looks like this:

    + +
    +
    +enum Token {
    +  ...
    +  // var definition
    +  tok_var = -13
    +...
    +}
    +...
    +static int gettok() {
    +...
    +    if (IdentifierStr == "in") return tok_in;
    +    if (IdentifierStr == "binary") return tok_binary;
    +    if (IdentifierStr == "unary") return tok_unary;
    +    if (IdentifierStr == "var") return tok_var;
    +    return tok_identifier;
    +...
    +
    +
    + +

    The next step is to define the AST node that we will construct. For var/in, +it looks like this:

    + +
    +
    +/// VarExprAST - Expression class for var/in
    +class VarExprAST : public ExprAST {
    +  std::vector<std::pair<std::string, ExprAST*> > VarNames;
    +  ExprAST *Body;
    +public:
    +  VarExprAST(const std::vector<std::pair<std::string, ExprAST*> > &varnames,
    +             ExprAST *body)
    +  : VarNames(varnames), Body(body) {}
    +  
    +  virtual Value *Codegen();
    +};
    +
    +
    + +

    var/in allows a list of names to be defined all at once, and each name can optionally have an initializer value. As such, we capture this information in the VarNames vector. Also, var/in has a body, and this body is allowed to access the variables defined by the var/in.

    + +

    With this in place, we can define the parser pieces. The first thing we do is add +it as a primary expression:

    + +
    +
    +/// primary
    +///   ::= identifierexpr
    +///   ::= numberexpr
    +///   ::= parenexpr
    +///   ::= ifexpr
    +///   ::= forexpr
    +///   ::= varexpr
    +static ExprAST *ParsePrimary() {
    +  switch (CurTok) {
    +  default: return Error("unknown token when expecting an expression");
    +  case tok_identifier: return ParseIdentifierExpr();
    +  case tok_number:     return ParseNumberExpr();
    +  case '(':            return ParseParenExpr();
    +  case tok_if:         return ParseIfExpr();
    +  case tok_for:        return ParseForExpr();
    +  case tok_var:        return ParseVarExpr();
    +  }
    +}
    +
    +
    + +

    Next we define ParseVarExpr:

    + +
    +
    +/// varexpr ::= 'var' identifier ('=' expression)? 
    +///                   (',' identifier ('=' expression)?)* 'in' expression
    +static ExprAST *ParseVarExpr() {
    +  getNextToken();  // eat the var.
    +
    +  std::vector<std::pair<std::string, ExprAST*> > VarNames;
    +
    +  // At least one variable name is required.
    +  if (CurTok != tok_identifier)
    +    return Error("expected identifier after var");
    +
    +
    + +

    The first part of this code parses the list of identifier/expr pairs into the +local VarNames vector. + +

    +
    +  while (1) {
    +    std::string Name = IdentifierStr;
    +    getNextToken();  // eat identifier.
    +
    +    // Read the optional initializer.
    +    ExprAST *Init = 0;
    +    if (CurTok == '=') {
    +      getNextToken(); // eat the '='.
    +      
    +      Init = ParseExpression();
    +      if (Init == 0) return 0;
    +    }
    +    
    +    VarNames.push_back(std::make_pair(Name, Init));
    +    
    +    // End of var list, exit loop.
    +    if (CurTok != ',') break;
    +    getNextToken(); // eat the ','.
    +    
    +    if (CurTok != tok_identifier)
    +      return Error("expected identifier list after var");
    +  }
    +
    +
    + +

    Once all the variables are parsed, we then parse the body and create the +AST node:

    + +
    +
    +  // At this point, we have to have 'in'.
    +  if (CurTok != tok_in)
    +    return Error("expected 'in' keyword after 'var'");
    +  getNextToken();  // eat 'in'.
    +  
    +  ExprAST *Body = ParseExpression();
    +  if (Body == 0) return 0;
    +  
    +  return new VarExprAST(VarNames, Body);
    +}
    +
    +
    + +
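    The grammar handled by ParseVarExpr can also be exercised in isolation. The sketch below works on a pre-split list of string tokens instead of the tutorial's lexer, and it assumes every initializer is a single token (the real parser accepts a full expression there). VarList, isIdent and parseVarListSketch are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Name paired with its initializer token; an empty string means "none".
typedef std::vector<std::pair<std::string, std::string> > VarList;

static bool isIdent(const std::string &T) {
  return !T.empty() && std::isalpha(static_cast<unsigned char>(T[0])) &&
         T != "var" && T != "in";
}

// Parses: 'var' ident ('=' tok)? (',' ident ('=' tok)?)* 'in'
// On success, Pos is left just past 'in', where the body would start.
bool parseVarListSketch(const std::vector<std::string> &Toks, std::size_t &Pos,
                        VarList &Out) {
  if (Pos >= Toks.size() || Toks[Pos] != "var") return false;
  ++Pos;  // eat 'var'
  // At least one variable name is required.
  if (Pos >= Toks.size() || !isIdent(Toks[Pos])) return false;
  while (true) {
    std::string Name = Toks[Pos++];  // eat identifier
    std::string Init;                // the optional initializer
    if (Pos < Toks.size() && Toks[Pos] == "=") {
      ++Pos;  // eat '='
      if (Pos >= Toks.size()) return false;
      Init = Toks[Pos++];
    }
    Out.push_back(std::make_pair(Name, Init));
    if (Pos >= Toks.size() || Toks[Pos] != ",") break;  // end of var list
    ++Pos;  // eat ','
    if (Pos >= Toks.size() || !isIdent(Toks[Pos])) return false;
  }
  if (Pos >= Toks.size() || Toks[Pos] != "in") return false;
  ++Pos;  // eat 'in'
  return true;
}
```

    For the token sequence of "var a = 1, b = 1, c in b", this yields three entries, the last one without an initializer.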

    Now that we can parse and represent the code, we need to support emission of +LLVM IR for it. This code starts out with:

    + +
    +
    +Value *VarExprAST::Codegen() {
    +  std::vector<AllocaInst *> OldBindings;
    +  
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +
    +  // Register all variables and emit their initializer.
    +  for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
    +    const std::string &VarName = VarNames[i].first;
    +    ExprAST *Init = VarNames[i].second;
    +
    +
    + +

    Basically it loops over all the variables, installing them one at a time. +For each variable we put into the symbol table, we remember the previous value +that we replace in OldBindings.

    + +
    +
    +    // Emit the initializer before adding the variable to scope, this prevents
    +    // the initializer from referencing the variable itself, and permits stuff
    +    // like this:
    +    //  var a = 1 in
    +    //    var a = a in ...   # refers to outer 'a'.
    +    Value *InitVal;
    +    if (Init) {
    +      InitVal = Init->Codegen();
    +      if (InitVal == 0) return 0;
    +    } else { // If not specified, use 0.0.
    +      InitVal = ConstantFP::get(getGlobalContext(), APFloat(0.0));
    +    }
    +    
    +    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    +    Builder.CreateStore(InitVal, Alloca);
    +
    +    // Remember the old variable binding so that we can restore the binding when
    +    // we unrecurse.
    +    OldBindings.push_back(NamedValues[VarName]);
    +    
    +    // Remember this binding.
    +    NamedValues[VarName] = Alloca;
    +  }
    +
    +
    + +

    There are more comments here than code. The basic idea is that we emit the +initializer, create the alloca, then update the symbol table to point to it. +Once all the variables are installed in the symbol table, we evaluate the body +of the var/in expression:

    + +
    +
    +  // Codegen the body, now that all vars are in scope.
    +  Value *BodyVal = Body->Codegen();
    +  if (BodyVal == 0) return 0;
    +
    +
    + +

    Finally, before returning, we restore the previous variable bindings:

    + +
    +
    +  // Pop all our variables from scope.
    +  for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
    +    NamedValues[VarNames[i].first] = OldBindings[i];
    +
    +  // Return the body computation.
    +  return BodyVal;
    +}
    +
    +
    + +

    The end result of all of this is that we get properly scoped variable +definitions, and we even (trivially) allow mutation of them :).
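    The save/install/restore pattern used by VarExprAST::Codegen is independent of LLVM and can be sketched on its own. In the hypothetical model below, Bindings stands in for NamedValues and each new variable gets a plain double as its slot; withVarsSketch is a made-up helper, not part of the tutorial code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical symbol table: name -> pointer to its slot (null if unbound).
static std::map<std::string, double *> Bindings;

// Binds each name to a fresh slot initialized to 0.0, runs Body with those
// bindings in scope, then restores whatever the names were bound to before.
template <typename BodyFn>
double withVarsSketch(const std::vector<std::string> &Names, BodyFn Body) {
  std::vector<double> Storage(Names.size(), 0.0);  // sized once, so the
                                                   // slot addresses are stable
  std::vector<double *> Old;
  for (std::size_t i = 0; i != Names.size(); ++i) {
    Old.push_back(Bindings[Names[i]]);  // remember the shadowed binding
    Bindings[Names[i]] = &Storage[i];   // install the new one
  }
  double Result = Body();               // evaluate the body in the new scope
  for (std::size_t i = 0; i != Names.size(); ++i)
    Bindings[Names[i]] = Old[i];        // pop our variables from scope
  return Result;
}
```

    A body that mutates a shadowing variable leaves the outer variable of the same name untouched once the scope ends.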

    + +

    With this, we completed what we set out to do. Our nice iterative fib +example from the intro compiles and runs just fine. The mem2reg pass optimizes +all of our stack variables into SSA registers, inserting PHI nodes where needed, +and our front-end remains simple: no "iterated dominance frontier" computation +anywhere in sight.

    + +
    + + + + + +
    + +

    +Here is the complete code listing for our running example, enhanced with mutable +variables and var/in support. To build this example, use: +

    + +
    +
    +   # Compile
    +   g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
    +   # Run
    +   ./toy
    +
    +
    + +

    Here is the code:

    + +
    +
    +#include "llvm/DerivedTypes.h"
    +#include "llvm/ExecutionEngine/ExecutionEngine.h"
    +#include "llvm/ExecutionEngine/JIT.h"
    +#include "llvm/LLVMContext.h"
    +#include "llvm/Module.h"
    +#include "llvm/PassManager.h"
    +#include "llvm/Analysis/Verifier.h"
    +#include "llvm/Target/TargetData.h"
    +#include "llvm/Target/TargetSelect.h"
    +#include "llvm/Transforms/Scalar.h"
    +#include "llvm/Support/IRBuilder.h"
    +#include <cassert>
    +#include <cctype>
    +#include <cstdio>
    +#include <cstdlib>
    +#include <string>
    +#include <map>
    +#include <vector>
    +using namespace llvm;
    +
    +//===----------------------------------------------------------------------===//
    +// Lexer
    +//===----------------------------------------------------------------------===//
    +
    +// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
    +// of these for known things.
    +enum Token {
    +  tok_eof = -1,
    +
    +  // commands
    +  tok_def = -2, tok_extern = -3,
    +
    +  // primary
    +  tok_identifier = -4, tok_number = -5,
    +  
    +  // control
    +  tok_if = -6, tok_then = -7, tok_else = -8,
    +  tok_for = -9, tok_in = -10,
    +  
    +  // operators
    +  tok_binary = -11, tok_unary = -12,
    +  
    +  // var definition
    +  tok_var = -13
    +};
    +
    +static std::string IdentifierStr;  // Filled in if tok_identifier
    +static double NumVal;              // Filled in if tok_number
    +
    +/// gettok - Return the next token from standard input.
    +static int gettok() {
    +  static int LastChar = ' ';
    +
    +  // Skip any whitespace.
    +  while (isspace(LastChar))
    +    LastChar = getchar();
    +
    +  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    +    IdentifierStr = LastChar;
    +    while (isalnum((LastChar = getchar())))
    +      IdentifierStr += LastChar;
    +
    +    if (IdentifierStr == "def") return tok_def;
    +    if (IdentifierStr == "extern") return tok_extern;
    +    if (IdentifierStr == "if") return tok_if;
    +    if (IdentifierStr == "then") return tok_then;
    +    if (IdentifierStr == "else") return tok_else;
    +    if (IdentifierStr == "for") return tok_for;
    +    if (IdentifierStr == "in") return tok_in;
    +    if (IdentifierStr == "binary") return tok_binary;
    +    if (IdentifierStr == "unary") return tok_unary;
    +    if (IdentifierStr == "var") return tok_var;
    +    return tok_identifier;
    +  }
    +
    +  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    +    std::string NumStr;
    +    do {
    +      NumStr += LastChar;
    +      LastChar = getchar();
    +    } while (isdigit(LastChar) || LastChar == '.');
    +
    +    NumVal = strtod(NumStr.c_str(), 0);
    +    return tok_number;
    +  }
    +
    +  if (LastChar == '#') {
    +    // Comment until end of line.
    +    do LastChar = getchar();
    +    while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
    +    
    +    if (LastChar != EOF)
    +      return gettok();
    +  }
    +  
    +  // Check for end of file.  Don't eat the EOF.
    +  if (LastChar == EOF)
    +    return tok_eof;
    +
    +  // Otherwise, just return the character as its ascii value.
    +  int ThisChar = LastChar;
    +  LastChar = getchar();
    +  return ThisChar;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Abstract Syntax Tree (aka Parse Tree)
    +//===----------------------------------------------------------------------===//
    +
    +/// ExprAST - Base class for all expression nodes.
    +class ExprAST {
    +public:
    +  virtual ~ExprAST() {}
    +  virtual Value *Codegen() = 0;
    +};
    +
    +/// NumberExprAST - Expression class for numeric literals like "1.0".
    +class NumberExprAST : public ExprAST {
    +  double Val;
    +public:
    +  NumberExprAST(double val) : Val(val) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// VariableExprAST - Expression class for referencing a variable, like "a".
    +class VariableExprAST : public ExprAST {
    +  std::string Name;
    +public:
    +  VariableExprAST(const std::string &name) : Name(name) {}
    +  const std::string &getName() const { return Name; }
    +  virtual Value *Codegen();
    +};
    +
    +/// UnaryExprAST - Expression class for a unary operator.
    +class UnaryExprAST : public ExprAST {
    +  char Opcode;
    +  ExprAST *Operand;
    +public:
    +  UnaryExprAST(char opcode, ExprAST *operand) 
    +    : Opcode(opcode), Operand(operand) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// BinaryExprAST - Expression class for a binary operator.
    +class BinaryExprAST : public ExprAST {
    +  char Op;
    +  ExprAST *LHS, *RHS;
    +public:
    +  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) 
    +    : Op(op), LHS(lhs), RHS(rhs) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// CallExprAST - Expression class for function calls.
    +class CallExprAST : public ExprAST {
    +  std::string Callee;
    +  std::vector<ExprAST*> Args;
    +public:
    +  CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
    +    : Callee(callee), Args(args) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// IfExprAST - Expression class for if/then/else.
    +class IfExprAST : public ExprAST {
    +  ExprAST *Cond, *Then, *Else;
    +public:
    +  IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
    +  : Cond(cond), Then(then), Else(_else) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// ForExprAST - Expression class for for/in.
    +class ForExprAST : public ExprAST {
    +  std::string VarName;
    +  ExprAST *Start, *End, *Step, *Body;
    +public:
    +  ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end,
    +             ExprAST *step, ExprAST *body)
    +    : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
    +  virtual Value *Codegen();
    +};
    +
    +/// VarExprAST - Expression class for var/in
    +class VarExprAST : public ExprAST {
    +  std::vector<std::pair<std::string, ExprAST*> > VarNames;
    +  ExprAST *Body;
    +public:
    +  VarExprAST(const std::vector<std::pair<std::string, ExprAST*> > &varnames,
    +             ExprAST *body)
    +  : VarNames(varnames), Body(body) {}
    +  
    +  virtual Value *Codegen();
    +};
    +
    +/// PrototypeAST - This class represents the "prototype" for a function,
    +/// which captures its name, and its argument names (thus implicitly the number
    +/// of arguments the function takes), as well as if it is an operator.
    +class PrototypeAST {
    +  std::string Name;
    +  std::vector<std::string> Args;
    +  bool isOperator;
    +  unsigned Precedence;  // Precedence if a binary op.
    +public:
    +  PrototypeAST(const std::string &name, const std::vector<std::string> &args,
    +               bool isoperator = false, unsigned prec = 0)
    +  : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
    +  
    +  bool isUnaryOp() const { return isOperator && Args.size() == 1; }
    +  bool isBinaryOp() const { return isOperator && Args.size() == 2; }
    +  
    +  char getOperatorName() const {
    +    assert(isUnaryOp() || isBinaryOp());
    +    return Name[Name.size()-1];
    +  }
    +  
    +  unsigned getBinaryPrecedence() const { return Precedence; }
    +  
    +  Function *Codegen();
    +  
    +  void CreateArgumentAllocas(Function *F);
    +};
    +
    +/// FunctionAST - This class represents a function definition itself.
    +class FunctionAST {
    +  PrototypeAST *Proto;
    +  ExprAST *Body;
    +public:
    +  FunctionAST(PrototypeAST *proto, ExprAST *body)
    +    : Proto(proto), Body(body) {}
    +  
    +  Function *Codegen();
    +};
    +
    +//===----------------------------------------------------------------------===//
    +// Parser
    +//===----------------------------------------------------------------------===//
    +
    +/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
    +/// token the parser is looking at.  getNextToken reads another token from the
    +/// lexer and updates CurTok with its results.
    +static int CurTok;
    +static int getNextToken() {
    +  return CurTok = gettok();
    +}
    +
    +/// BinopPrecedence - This holds the precedence for each binary operator that is
    +/// defined.
    +static std::map<char, int> BinopPrecedence;
    +
    +/// GetTokPrecedence - Get the precedence of the pending binary operator token.
    +static int GetTokPrecedence() {
    +  if (!isascii(CurTok))
    +    return -1;
    +  
    +  // Make sure it's a declared binop.
    +  int TokPrec = BinopPrecedence[CurTok];
    +  if (TokPrec <= 0) return -1;
    +  return TokPrec;
    +}
    +
    +/// Error* - These are little helper functions for error handling.
    +ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
    +PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
    +FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
    +
    +static ExprAST *ParseExpression();
    +
    +/// identifierexpr
    +///   ::= identifier
    +///   ::= identifier '(' expression* ')'
    +static ExprAST *ParseIdentifierExpr() {
    +  std::string IdName = IdentifierStr;
    +  
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '(') // Simple variable ref.
    +    return new VariableExprAST(IdName);
    +  
    +  // Call.
    +  getNextToken();  // eat (
    +  std::vector<ExprAST*> Args;
    +  if (CurTok != ')') {
    +    while (1) {
    +      ExprAST *Arg = ParseExpression();
    +      if (!Arg) return 0;
    +      Args.push_back(Arg);
    +
    +      if (CurTok == ')') break;
    +
    +      if (CurTok != ',')
    +        return Error("Expected ')' or ',' in argument list");
    +      getNextToken();
    +    }
    +  }
    +
    +  // Eat the ')'.
    +  getNextToken();
    +  
    +  return new CallExprAST(IdName, Args);
    +}
    +
    +/// numberexpr ::= number
    +static ExprAST *ParseNumberExpr() {
    +  ExprAST *Result = new NumberExprAST(NumVal);
    +  getNextToken(); // consume the number
    +  return Result;
    +}
    +
    +/// parenexpr ::= '(' expression ')'
    +static ExprAST *ParseParenExpr() {
    +  getNextToken();  // eat (.
    +  ExprAST *V = ParseExpression();
    +  if (!V) return 0;
    +  
    +  if (CurTok != ')')
    +    return Error("expected ')'");
    +  getNextToken();  // eat ).
    +  return V;
    +}
    +
    +/// ifexpr ::= 'if' expression 'then' expression 'else' expression
    +static ExprAST *ParseIfExpr() {
    +  getNextToken();  // eat the if.
    +  
    +  // condition.
    +  ExprAST *Cond = ParseExpression();
    +  if (!Cond) return 0;
    +  
    +  if (CurTok != tok_then)
    +    return Error("expected then");
    +  getNextToken();  // eat the then
    +  
    +  ExprAST *Then = ParseExpression();
    +  if (Then == 0) return 0;
    +  
    +  if (CurTok != tok_else)
    +    return Error("expected else");
    +  
    +  getNextToken();
    +  
    +  ExprAST *Else = ParseExpression();
    +  if (!Else) return 0;
    +  
    +  return new IfExprAST(Cond, Then, Else);
    +}
    +
    +/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
    +static ExprAST *ParseForExpr() {
    +  getNextToken();  // eat the for.
    +
    +  if (CurTok != tok_identifier)
    +    return Error("expected identifier after for");
    +  
    +  std::string IdName = IdentifierStr;
    +  getNextToken();  // eat identifier.
    +  
    +  if (CurTok != '=')
    +    return Error("expected '=' after for");
    +  getNextToken();  // eat '='.
    +  
    +  
    +  ExprAST *Start = ParseExpression();
    +  if (Start == 0) return 0;
    +  if (CurTok != ',')
    +    return Error("expected ',' after for start value");
    +  getNextToken();
    +  
    +  ExprAST *End = ParseExpression();
    +  if (End == 0) return 0;
    +  
    +  // The step value is optional.
    +  ExprAST *Step = 0;
    +  if (CurTok == ',') {
    +    getNextToken();
    +    Step = ParseExpression();
    +    if (Step == 0) return 0;
    +  }
    +  
    +  if (CurTok != tok_in)
    +    return Error("expected 'in' after for");
    +  getNextToken();  // eat 'in'.
    +  
    +  ExprAST *Body = ParseExpression();
    +  if (Body == 0) return 0;
    +
    +  return new ForExprAST(IdName, Start, End, Step, Body);
    +}
    +
    +/// varexpr ::= 'var' identifier ('=' expression)?
    +///                    (',' identifier ('=' expression)?)* 'in' expression
    +static ExprAST *ParseVarExpr() {
    +  getNextToken();  // eat the var.
    +
    +  std::vector<std::pair<std::string, ExprAST*> > VarNames;
    +
    +  // At least one variable name is required.
    +  if (CurTok != tok_identifier)
    +    return Error("expected identifier after var");
    +  
    +  while (1) {
    +    std::string Name = IdentifierStr;
    +    getNextToken();  // eat identifier.
    +
    +    // Read the optional initializer.
    +    ExprAST *Init = 0;
    +    if (CurTok == '=') {
    +      getNextToken(); // eat the '='.
    +      
    +      Init = ParseExpression();
    +      if (Init == 0) return 0;
    +    }
    +    
    +    VarNames.push_back(std::make_pair(Name, Init));
    +    
    +    // End of var list, exit loop.
    +    if (CurTok != ',') break;
    +    getNextToken(); // eat the ','.
    +    
    +    if (CurTok != tok_identifier)
    +      return Error("expected identifier list after var");
    +  }
    +  
    +  // At this point, we have to have 'in'.
    +  if (CurTok != tok_in)
    +    return Error("expected 'in' keyword after 'var'");
    +  getNextToken();  // eat 'in'.
    +  
    +  ExprAST *Body = ParseExpression();
    +  if (Body == 0) return 0;
    +  
    +  return new VarExprAST(VarNames, Body);
    +}
    +
    +/// primary
    +///   ::= identifierexpr
    +///   ::= numberexpr
    +///   ::= parenexpr
    +///   ::= ifexpr
    +///   ::= forexpr
    +///   ::= varexpr
    +static ExprAST *ParsePrimary() {
    +  switch (CurTok) {
    +  default: return Error("unknown token when expecting an expression");
    +  case tok_identifier: return ParseIdentifierExpr();
    +  case tok_number:     return ParseNumberExpr();
    +  case '(':            return ParseParenExpr();
    +  case tok_if:         return ParseIfExpr();
    +  case tok_for:        return ParseForExpr();
    +  case tok_var:        return ParseVarExpr();
    +  }
    +}
    +
    +/// unary
    +///   ::= primary
    +///   ::= '!' unary
    +static ExprAST *ParseUnary() {
    +  // If the current token is not an operator, it must be a primary expr.
    +  if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
    +    return ParsePrimary();
    +  
    +  // If this is a unary operator, read it.
    +  int Opc = CurTok;
    +  getNextToken();
    +  if (ExprAST *Operand = ParseUnary())
    +    return new UnaryExprAST(Opc, Operand);
    +  return 0;
    +}
    +
    +/// binoprhs
    +///   ::= ('+' unary)*
    +static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
    +  // If this is a binop, find its precedence.
    +  while (1) {
    +    int TokPrec = GetTokPrecedence();
    +    
    +    // If this is a binop that binds at least as tightly as the current binop,
    +    // consume it, otherwise we are done.
    +    if (TokPrec < ExprPrec)
    +      return LHS;
    +    
    +    // Okay, we know this is a binop.
    +    int BinOp = CurTok;
    +    getNextToken();  // eat binop
    +    
    +    // Parse the unary expression after the binary operator.
    +    ExprAST *RHS = ParseUnary();
    +    if (!RHS) return 0;
    +    
    +    // If BinOp binds less tightly with RHS than the operator after RHS, let
    +    // the pending operator take RHS as its LHS.
    +    int NextPrec = GetTokPrecedence();
    +    if (TokPrec < NextPrec) {
    +      RHS = ParseBinOpRHS(TokPrec+1, RHS);
    +      if (RHS == 0) return 0;
    +    }
    +    
    +    // Merge LHS/RHS.
    +    LHS = new BinaryExprAST(BinOp, LHS, RHS);
    +  }
    +}
    +
    +/// expression
    +///   ::= unary binoprhs
    +///
    +static ExprAST *ParseExpression() {
    +  ExprAST *LHS = ParseUnary();
    +  if (!LHS) return 0;
    +  
    +  return ParseBinOpRHS(0, LHS);
    +}
    +
    +/// prototype
    +///   ::= id '(' id* ')'
    +///   ::= binary LETTER number? (id, id)
    +///   ::= unary LETTER (id)
    +static PrototypeAST *ParsePrototype() {
    +  std::string FnName;
    +  
    +  unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
    +  unsigned BinaryPrecedence = 30;
    +  
    +  switch (CurTok) {
    +  default:
    +    return ErrorP("Expected function name in prototype");
    +  case tok_identifier:
    +    FnName = IdentifierStr;
    +    Kind = 0;
    +    getNextToken();
    +    break;
    +  case tok_unary:
    +    getNextToken();
    +    if (!isascii(CurTok))
    +      return ErrorP("Expected unary operator");
    +    FnName = "unary";
    +    FnName += (char)CurTok;
    +    Kind = 1;
    +    getNextToken();
    +    break;
    +  case tok_binary:
    +    getNextToken();
    +    if (!isascii(CurTok))
    +      return ErrorP("Expected binary operator");
    +    FnName = "binary";
    +    FnName += (char)CurTok;
    +    Kind = 2;
    +    getNextToken();
    +    
    +    // Read the precedence if present.
    +    if (CurTok == tok_number) {
    +      if (NumVal < 1 || NumVal > 100)
    +        return ErrorP("Invalid precedence: must be 1..100");
    +      BinaryPrecedence = (unsigned)NumVal;
    +      getNextToken();
    +    }
    +    break;
    +  }
    +  
    +  if (CurTok != '(')
    +    return ErrorP("Expected '(' in prototype");
    +  
    +  std::vector<std::string> ArgNames;
    +  while (getNextToken() == tok_identifier)
    +    ArgNames.push_back(IdentifierStr);
    +  if (CurTok != ')')
    +    return ErrorP("Expected ')' in prototype");
    +  
    +  // success.
    +  getNextToken();  // eat ')'.
    +  
    +  // Verify right number of names for operator.
    +  if (Kind && ArgNames.size() != Kind)
    +    return ErrorP("Invalid number of operands for operator");
    +  
    +  return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
    +}
    +
    +/// definition ::= 'def' prototype expression
    +static FunctionAST *ParseDefinition() {
    +  getNextToken();  // eat def.
    +  PrototypeAST *Proto = ParsePrototype();
    +  if (Proto == 0) return 0;
    +
    +  if (ExprAST *E = ParseExpression())
    +    return new FunctionAST(Proto, E);
    +  return 0;
    +}
    +
    +/// toplevelexpr ::= expression
    +static FunctionAST *ParseTopLevelExpr() {
    +  if (ExprAST *E = ParseExpression()) {
    +    // Make an anonymous proto.
    +    PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
    +    return new FunctionAST(Proto, E);
    +  }
    +  return 0;
    +}
    +
    +/// external ::= 'extern' prototype
    +static PrototypeAST *ParseExtern() {
    +  getNextToken();  // eat extern.
    +  return ParsePrototype();
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Code Generation
    +//===----------------------------------------------------------------------===//
    +
    +static Module *TheModule;
    +static IRBuilder<> Builder(getGlobalContext());
    +static std::map<std::string, AllocaInst*> NamedValues;
    +static FunctionPassManager *TheFPM;
    +
    +Value *ErrorV(const char *Str) { Error(Str); return 0; }
    +
    +/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
    +/// the function.  This is used for mutable variables etc.
    +static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
    +                                          const std::string &VarName) {
    +  IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
    +                 TheFunction->getEntryBlock().begin());
    +  return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0,
    +                           VarName.c_str());
    +}
    +
    +Value *NumberExprAST::Codegen() {
    +  return ConstantFP::get(getGlobalContext(), APFloat(Val));
    +}
    +
    +Value *VariableExprAST::Codegen() {
    +  // Look this variable up in the function.
    +  Value *V = NamedValues[Name];
    +  if (V == 0) return ErrorV("Unknown variable name");
    +
    +  // Load the value.
    +  return Builder.CreateLoad(V, Name.c_str());
    +}
    +
    +Value *UnaryExprAST::Codegen() {
    +  Value *OperandV = Operand->Codegen();
    +  if (OperandV == 0) return 0;
    +  
    +  Function *F = TheModule->getFunction(std::string("unary")+Opcode);
    +  if (F == 0)
    +    return ErrorV("Unknown unary operator");
    +  
    +  return Builder.CreateCall(F, OperandV, "unop");
    +}
    +
    +Value *BinaryExprAST::Codegen() {
    +  // Special case '=' because we don't want to emit the LHS as an expression.
    +  if (Op == '=') {
    +    // Assignment requires the LHS to be an identifier.
    +    VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS);
    +    if (!LHSE)
    +      return ErrorV("destination of '=' must be a variable");
    +    // Codegen the RHS.
    +    Value *Val = RHS->Codegen();
    +    if (Val == 0) return 0;
    +
    +    // Look up the name.
    +    Value *Variable = NamedValues[LHSE->getName()];
    +    if (Variable == 0) return ErrorV("Unknown variable name");
    +
    +    Builder.CreateStore(Val, Variable);
    +    return Val;
    +  }
    +  
    +  Value *L = LHS->Codegen();
    +  Value *R = RHS->Codegen();
    +  if (L == 0 || R == 0) return 0;
    +  
    +  switch (Op) {
    +  case '+': return Builder.CreateFAdd(L, R, "addtmp");
    +  case '-': return Builder.CreateFSub(L, R, "subtmp");
    +  case '*': return Builder.CreateFMul(L, R, "multmp");
    +  case '<':
    +    L = Builder.CreateFCmpULT(L, R, "cmptmp");
    +    // Convert bool 0/1 to double 0.0 or 1.0
    +    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
    +                                "booltmp");
    +  default: break;
    +  }
    +  
    +  // If it wasn't a builtin binary operator, it must be a user defined one. Emit
    +  // a call to it.
    +  Function *F = TheModule->getFunction(std::string("binary")+Op);
    +  assert(F && "binary operator not found!");
    +  
    +  Value *Ops[] = { L, R };
    +  return Builder.CreateCall(F, Ops, Ops+2, "binop");
    +}
    +
    +Value *CallExprAST::Codegen() {
    +  // Look up the name in the global module table.
    +  Function *CalleeF = TheModule->getFunction(Callee);
    +  if (CalleeF == 0)
    +    return ErrorV("Unknown function referenced");
    +  
    +  // Reject the call if the argument count does not match the callee.
    +  if (CalleeF->arg_size() != Args.size())
    +    return ErrorV("Incorrect # arguments passed");
    +
    +  std::vector<Value*> ArgsV;
    +  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
    +    ArgsV.push_back(Args[i]->Codegen());
    +    if (ArgsV.back() == 0) return 0;
    +  }
    +  
    +  return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
    +}
    +
    +Value *IfExprAST::Codegen() {
    +  Value *CondV = Cond->Codegen();
    +  if (CondV == 0) return 0;
    +  
    +  // Convert condition to a bool by comparing not equal to 0.0.
    +  CondV = Builder.CreateFCmpONE(CondV, 
    +                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
    +                                "ifcond");
    +  
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +  
    +  // Create blocks for the then and else cases.  Insert the 'then' block at the
    +  // end of the function.
    +  BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
    +  BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
    +  BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
    +  
    +  Builder.CreateCondBr(CondV, ThenBB, ElseBB);
    +  
    +  // Emit then value.
    +  Builder.SetInsertPoint(ThenBB);
    +  
    +  Value *ThenV = Then->Codegen();
    +  if (ThenV == 0) return 0;
    +  
    +  Builder.CreateBr(MergeBB);
    +  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
    +  ThenBB = Builder.GetInsertBlock();
    +  
    +  // Emit else block.
    +  TheFunction->getBasicBlockList().push_back(ElseBB);
    +  Builder.SetInsertPoint(ElseBB);
    +  
    +  Value *ElseV = Else->Codegen();
    +  if (ElseV == 0) return 0;
    +  
    +  Builder.CreateBr(MergeBB);
    +  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
    +  ElseBB = Builder.GetInsertBlock();
    +  
    +  // Emit merge block.
    +  TheFunction->getBasicBlockList().push_back(MergeBB);
    +  Builder.SetInsertPoint(MergeBB);
    +  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()),
    +                                  "iftmp");
    +  
    +  PN->addIncoming(ThenV, ThenBB);
    +  PN->addIncoming(ElseV, ElseBB);
    +  return PN;
    +}
    +
    +Value *ForExprAST::Codegen() {
    +  // Output this as:
    +  //   var = alloca double
    +  //   ...
    +  //   start = startexpr
    +  //   store start -> var
    +  //   goto loop
    +  // loop: 
    +  //   ...
    +  //   bodyexpr
    +  //   ...
    +  // loopend:
    +  //   step = stepexpr
    +  //   endcond = endexpr
    +  //
    +  //   curvar = load var
    +  //   nextvar = curvar + step
    +  //   store nextvar -> var
    +  //   br endcond, loop, afterloop
    +  // afterloop:
    +  
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +
    +  // Create an alloca for the variable in the entry block.
    +  AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    +  
    +  // Emit the start code first, without 'variable' in scope.
    +  Value *StartVal = Start->Codegen();
    +  if (StartVal == 0) return 0;
    +  
    +  // Store the value into the alloca.
    +  Builder.CreateStore(StartVal, Alloca);
    +  
    +  // Make the new basic block for the loop header, inserting after current
    +  // block.
    +  BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
    +  
    +  // Insert an explicit fall through from the current block to the LoopBB.
    +  Builder.CreateBr(LoopBB);
    +
    +  // Start insertion in LoopBB.
    +  Builder.SetInsertPoint(LoopBB);
    +  
    +  // Within the loop, the variable refers to the alloca created above.  If it
    +  // shadows an existing variable, we have to restore it, so save it now.
    +  AllocaInst *OldVal = NamedValues[VarName];
    +  NamedValues[VarName] = Alloca;
    +  
    +  // Emit the body of the loop.  This, like any other expr, can change the
    +  // current BB.  Note that we ignore the value computed by the body, but don't
    +  // allow an error.
    +  if (Body->Codegen() == 0)
    +    return 0;
    +  
    +  // Emit the step value.
    +  Value *StepVal;
    +  if (Step) {
    +    StepVal = Step->Codegen();
    +    if (StepVal == 0) return 0;
    +  } else {
    +    // If not specified, use 1.0.
    +    StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
    +  }
    +  
    +  // Compute the end condition.
    +  Value *EndCond = End->Codegen();
    +  if (EndCond == 0) return EndCond;
    +  
    +  // Reload, increment, and store back to the alloca.  This handles the case
    +  // where the body of the loop mutates the variable.
    +  Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
    +  Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
    +  Builder.CreateStore(NextVar, Alloca);
    +  
    +  // Convert condition to a bool by comparing not equal to 0.0.
    +  EndCond = Builder.CreateFCmpONE(EndCond, 
    +                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
    +                                  "loopcond");
    +  
    +  // Create the "after loop" block and insert it.
    +  BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
    +  
    +  // Insert the conditional branch at the end of the current block.
    +  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
    +  
    +  // Any new code will be inserted in AfterBB.
    +  Builder.SetInsertPoint(AfterBB);
    +  
    +  // Restore the unshadowed variable.
    +  if (OldVal)
    +    NamedValues[VarName] = OldVal;
    +  else
    +    NamedValues.erase(VarName);
    +
    +  
    +  // for expr always returns 0.0.
    +  return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
    +}
    +
    +Value *VarExprAST::Codegen() {
    +  std::vector<AllocaInst *> OldBindings;
    +  
    +  Function *TheFunction = Builder.GetInsertBlock()->getParent();
    +
    +  // Register all variables and emit their initializer.
    +  for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
    +    const std::string &VarName = VarNames[i].first;
    +    ExprAST *Init = VarNames[i].second;
    +    
    +    // Emit the initializer before adding the variable to scope, this prevents
    +    // the initializer from referencing the variable itself, and permits stuff
    +    // like this:
    +    //  var a = 1 in
    +    //    var a = a in ...   # refers to outer 'a'.
    +    Value *InitVal;
    +    if (Init) {
    +      InitVal = Init->Codegen();
    +      if (InitVal == 0) return 0;
    +    } else { // If not specified, use 0.0.
    +      InitVal = ConstantFP::get(getGlobalContext(), APFloat(0.0));
    +    }
    +    
    +    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    +    Builder.CreateStore(InitVal, Alloca);
    +
    +    // Remember the old variable binding so that we can restore the binding when
    +    // we unrecurse.
    +    OldBindings.push_back(NamedValues[VarName]);
    +    
    +    // Remember this binding.
    +    NamedValues[VarName] = Alloca;
    +  }
    +  
    +  // Codegen the body, now that all vars are in scope.
    +  Value *BodyVal = Body->Codegen();
    +  if (BodyVal == 0) return 0;
    +  
    +  // Pop all our variables from scope.
    +  for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
    +    NamedValues[VarNames[i].first] = OldBindings[i];
    +
    +  // Return the body computation.
    +  return BodyVal;
    +}
    +
    +Function *PrototypeAST::Codegen() {
    +  // Make the function type:  double(double,double) etc.
    +  std::vector<const Type*> Doubles(Args.size(),
    +                                   Type::getDoubleTy(getGlobalContext()));
    +  FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
    +                                       Doubles, false);
    +  
    +  Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
    +  
    +  // If F conflicted, there was already something named 'Name'.  If it has a
    +  // body, don't allow redefinition or reextern.
    +  if (F->getName() != Name) {
    +    // Delete the one we just made and get the existing one.
    +    F->eraseFromParent();
    +    F = TheModule->getFunction(Name);
    +    
    +    // If F already has a body, reject this.
    +    if (!F->empty()) {
    +      ErrorF("redefinition of function");
    +      return 0;
    +    }
    +    
    +    // If F took a different number of args, reject.
    +    if (F->arg_size() != Args.size()) {
    +      ErrorF("redefinition of function with different # args");
    +      return 0;
    +    }
    +  }
    +  
    +  // Set names for all arguments.
    +  unsigned Idx = 0;
    +  for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
    +       ++AI, ++Idx)
    +    AI->setName(Args[Idx]);
    +    
    +  return F;
    +}
    +
    +/// CreateArgumentAllocas - Create an alloca for each argument and register the
    +/// argument in the symbol table so that references to it will succeed.
    +void PrototypeAST::CreateArgumentAllocas(Function *F) {
    +  Function::arg_iterator AI = F->arg_begin();
    +  for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
    +    // Create an alloca for this variable.
    +    AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
    +
    +    // Store the initial value into the alloca.
    +    Builder.CreateStore(AI, Alloca);
    +
    +    // Add arguments to variable symbol table.
    +    NamedValues[Args[Idx]] = Alloca;
    +  }
    +}
    +
    +Function *FunctionAST::Codegen() {
    +  NamedValues.clear();
    +  
    +  Function *TheFunction = Proto->Codegen();
    +  if (TheFunction == 0)
    +    return 0;
    +  
    +  // If this is an operator, install it.
    +  if (Proto->isBinaryOp())
    +    BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();
    +  
    +  // Create a new basic block to start insertion into.
    +  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
    +  Builder.SetInsertPoint(BB);
    +  
    +  // Add all arguments to the symbol table and create their allocas.
    +  Proto->CreateArgumentAllocas(TheFunction);
    +
    +  if (Value *RetVal = Body->Codegen()) {
    +    // Finish off the function.
    +    Builder.CreateRet(RetVal);
    +
    +    // Validate the generated code, checking for consistency.
    +    verifyFunction(*TheFunction);
    +
    +    // Optimize the function.
    +    TheFPM->run(*TheFunction);
    +    
    +    return TheFunction;
    +  }
    +  
    +  // Error reading body, remove function.
    +  TheFunction->eraseFromParent();
    +
    +  if (Proto->isBinaryOp())
    +    BinopPrecedence.erase(Proto->getOperatorName());
    +  return 0;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Top-Level parsing and JIT Driver
    +//===----------------------------------------------------------------------===//
    +
    +static ExecutionEngine *TheExecutionEngine;
    +
    +static void HandleDefinition() {
    +  if (FunctionAST *F = ParseDefinition()) {
    +    if (Function *LF = F->Codegen()) {
    +      fprintf(stderr, "Read function definition:");
    +      LF->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleExtern() {
    +  if (PrototypeAST *P = ParseExtern()) {
    +    if (Function *F = P->Codegen()) {
    +      fprintf(stderr, "Read extern: ");
    +      F->dump();
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +static void HandleTopLevelExpression() {
    +  // Evaluate a top-level expression into an anonymous function.
    +  if (FunctionAST *F = ParseTopLevelExpr()) {
    +    if (Function *LF = F->Codegen()) {
    +      // JIT the function, returning a function pointer.
    +      void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
    +      
    +      // Cast it to the right type (takes no arguments, returns a double) so we
    +      // can call it as a native function.
    +      double (*FP)() = (double (*)())(intptr_t)FPtr;
    +      fprintf(stderr, "Evaluated to %f\n", FP());
    +    }
    +  } else {
    +    // Skip token for error recovery.
    +    getNextToken();
    +  }
    +}
    +
    +/// top ::= definition | external | expression | ';'
    +static void MainLoop() {
    +  while (1) {
    +    fprintf(stderr, "ready> ");
    +    switch (CurTok) {
    +    case tok_eof:    return;
    +    case ';':        getNextToken(); break;  // ignore top-level semicolons.
    +    case tok_def:    HandleDefinition(); break;
    +    case tok_extern: HandleExtern(); break;
    +    default:         HandleTopLevelExpression(); break;
    +    }
    +  }
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// "Library" functions that can be "extern'd" from user code.
    +//===----------------------------------------------------------------------===//
    +
    +/// putchard - putchar that takes a double and returns 0.
    +extern "C" 
    +double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +/// printd - printf that takes a double and prints it as "%f\n", returning 0.
    +extern "C" 
    +double printd(double X) {
    +  printf("%f\n", X);
    +  return 0;
    +}
    +
    +//===----------------------------------------------------------------------===//
    +// Main driver code.
    +//===----------------------------------------------------------------------===//
    +
    +int main() {
    +  InitializeNativeTarget();
    +  LLVMContext &Context = getGlobalContext();
    +
    +  // Install standard binary operators.
    +  // 1 is lowest precedence.
    +  BinopPrecedence['='] = 2;
    +  BinopPrecedence['<'] = 10;
    +  BinopPrecedence['+'] = 20;
    +  BinopPrecedence['-'] = 20;
    +  BinopPrecedence['*'] = 40;  // highest.
    +
    +  // Prime the first token.
    +  fprintf(stderr, "ready> ");
    +  getNextToken();
    +
    +  // Make the module, which holds all the code.
    +  TheModule = new Module("my cool jit", Context);
    +
    +  // Create the JIT.  This takes ownership of the module.
    +  std::string ErrStr;
    +  TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).create();
    +  if (!TheExecutionEngine) {
    +    fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
    +    exit(1);
    +  }
    +
    +  FunctionPassManager OurFPM(TheModule);
    +
    +  // Set up the optimizer pipeline.  Start with registering info about how the
    +  // target lays out data structures.
    +  OurFPM.add(new TargetData(*TheExecutionEngine->getTargetData()));
    +  // Promote allocas to registers.
    +  OurFPM.add(createPromoteMemoryToRegisterPass());
    +  // Do simple "peephole" optimizations and bit-twiddling optzns.
    +  OurFPM.add(createInstructionCombiningPass());
    +  // Reassociate expressions.
    +  OurFPM.add(createReassociatePass());
    +  // Eliminate Common SubExpressions.
    +  OurFPM.add(createGVNPass());
    +  // Simplify the control flow graph (deleting unreachable blocks, etc).
    +  OurFPM.add(createCFGSimplificationPass());
    +
    +  OurFPM.doInitialization();
    +
    +  // Set the global so the code gen can use this.
    +  TheFPM = &OurFPM;
    +
    +  // Run the main "interpreter loop" now.
    +  MainLoop();
    +
    +  TheFPM = 0;
    +
    +  // Print out all of the generated code.
    +  TheModule->dump();
    +
    +  return 0;
    +}
    +
    +
    + +Next: Conclusion and other useful LLVM tidbits +
    + + +
    +
    + Valid CSS! + Valid HTML 4.01! + + Chris Lattner
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-09-01 13:09:20 -0700 (Wed, 01 Sep 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/tutorial/LangImpl8.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/LangImpl8.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/tutorial/LangImpl8.html (added) +++ www-releases/trunk/2.8/docs/tutorial/LangImpl8.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,365 @@ + + + + + Kaleidoscope: Conclusion and other useful LLVM tidbits + + + + + + + +
    Kaleidoscope: Conclusion and other useful LLVM + tidbits
    + + + + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    Welcome to the final chapter of the "Implementing a +language with LLVM" tutorial. In the course of this tutorial, we have grown +our little Kaleidoscope language from being a useless toy, to being a +semi-interesting (but probably still useless) toy. :)

    + +

    It is interesting to see how far we've come, and how little code it has +taken. We built the entire lexer, parser, AST, code generator, and an +interactive run-loop (with a JIT!) by hand in under 700 lines of +(non-comment/non-blank) code.

    + +

    Our little language supports a couple of interesting features: it supports +user-defined binary and unary operators, it uses JIT compilation for immediate +evaluation, and it supports a few control flow constructs with SSA construction. +
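
The user-defined binary operators rest on one small idea from the parser: a mutable precedence table (BinopPrecedence) consulted by an operator-precedence loop (ParseBinOpRHS). The following is a minimal standalone sketch of that idea, not the tutorial's code. The names Prec, MiniParser, installStandardOperators and evaluate are hypothetical, it handles only numeric literals, and it evaluates doubles directly instead of building an AST or emitting LLVM IR. The '|' operator stands in for an operator a user might define.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical stand-in for the tutorial's BinopPrecedence table:
// operator character -> precedence (higher binds tighter, must be >= 1).
static std::map<char, int> Prec;

static void installStandardOperators() {
  Prec['<'] = 10;
  Prec['+'] = 20;
  Prec['-'] = 20;
  Prec['*'] = 40;
}

// Hypothetical miniature parser; error handling is omitted for brevity.
struct MiniParser {
  std::string Src;
  std::size_t Pos;
  explicit MiniParser(const std::string &S) : Src(S), Pos(0) {}

  void skipSpaces() {
    while (Pos < Src.size() &&
           std::isspace(static_cast<unsigned char>(Src[Pos])))
      ++Pos;
  }

  // Precedence of the pending operator, or -1 if it is not in the table.
  int tokPrecedence() {
    skipSpaces();
    if (Pos >= Src.size()) return -1;
    std::map<char, int>::const_iterator I = Prec.find(Src[Pos]);
    return I == Prec.end() ? -1 : I->second;
  }

  // primary ::= number (the only primary this sketch supports).
  double parsePrimary() {
    skipSpaces();
    const char *Begin = Src.c_str() + Pos;
    char *End = 0;
    double V = std::strtod(Begin, &End);
    Pos += static_cast<std::size_t>(End - Begin);
    return V;
  }

  static double apply(char Op, double L, double R) {
    switch (Op) {
    case '+': return L + R;
    case '-': return L - R;
    case '*': return L * R;
    case '<': return L < R ? 1.0 : 0.0;
    case '|': return (L != 0.0 || R != 0.0) ? 1.0 : 0.0; // "user-defined"
    default:
      assert(false && "operator has a precedence but no implementation");
      return 0.0;
    }
  }

  // Same shape as ParseBinOpRHS: consume operators that bind at least as
  // tightly as ExprPrec, recursing when the next operator binds tighter.
  double parseBinOpRHS(int ExprPrec, double LHS) {
    while (true) {
      int TokPrec = tokPrecedence();
      if (TokPrec < ExprPrec) return LHS;
      char Op = Src[Pos++];
      double RHS = parsePrimary();
      int NextPrec = tokPrecedence();
      if (TokPrec < NextPrec)
        RHS = parseBinOpRHS(TokPrec + 1, RHS);
      LHS = apply(Op, LHS, RHS);
    }
  }

  double parseExpression() { return parseBinOpRHS(0, parsePrimary()); }
};

static double evaluate(const std::string &S) {
  return MiniParser(S).parseExpression();
}
```

After calling installStandardOperators(), evaluate("1 + 2 * 3") yields 7 because '*' binds tighter than '+'. evaluate("0 | 1") stops at the unknown '|' and yields 0 until Prec['|'] = 5 installs it, after which the same input yields 1. Adding an operator is therefore a single table update, which is why the tutorial can install operators while it is still parsing.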

    + +

    Part of the idea of this tutorial was to show you how easy and fun it can be +to define, build, and play with languages. Building a compiler need not be a +scary or mystical process! Now that you've seen some of the basics, I strongly +encourage you to take the code and hack on it. For example, try adding:

    + +
      +
    • global variables - While global variables have questionable value in +modern software engineering, they are often useful when putting together quick +little hacks like the Kaleidoscope compiler itself. Fortunately, our current +setup makes it very easy to add global variables: just have value lookup check +to see if an unresolved variable is in the global variable symbol table before +rejecting it. To create a new global variable, make an instance of the LLVM +GlobalVariable class.
    • + +
    • typed variables - Kaleidoscope currently only supports variables of +type double. This gives the language a very nice elegance, because only +supporting one type means that you never have to specify types. Different +languages have different ways of handling this. The easiest way is to require +the user to specify types for every variable definition, and record the type +of the variable in the symbol table along with its Value*.
    • + +
    • arrays, structs, vectors, etc - Once you add types, you can start +extending the type system in all sorts of interesting ways. Simple arrays are +very easy and are quite useful for many different applications. Adding them is +mostly an exercise in learning how the LLVM getelementptr instruction works: it +is so nifty/unconventional, it has its own FAQ! If you add support +for recursive types (e.g. linked lists), make sure to read the section in the LLVM +Programmer's Manual that describes how to construct them.
    • + +
    • standard runtime - Our current language allows the user to access +arbitrary external functions, and we use it for things like "printd" and +"putchard". As you extend the language to add higher-level constructs, often +these constructs make the most sense if they are lowered to calls into a +language-supplied runtime. For example, if you add hash tables to the language, +it would probably make sense to add the routines to a runtime, instead of +inlining them all the way.
    • + +
    • memory management - Currently we can only access the stack in +Kaleidoscope. It would also be useful to be able to allocate heap memory, +either with calls to the standard libc malloc/free interface or with a garbage +collector. If you would like to use garbage collection, note that LLVM fully +supports Accurate Garbage Collection +including algorithms that move objects and need to scan/update the stack.
    • + +
    • debugger support - LLVM supports generation of DWARF Debug info, which is understood by +common debuggers like GDB. Adding support for debug info is fairly +straightforward. The best way to understand it is to compile some C/C++ code +with "llvm-gcc -g -O0" and take a look at what it produces.
    • + +
    • exception handling support - LLVM supports generation of zero cost exceptions which interoperate +with code compiled in other languages. You could also generate code by +implicitly making every function return an error value and checking it. You +could also make explicit use of setjmp/longjmp. There are many different ways +to go here.
    • + +
    • object orientation, generics, database access, complex numbers, +geometric programming, ... - Really, there is +no end of crazy features that you can add to the language.
    • + +
    • unusual domains - We've been talking about applying LLVM to a domain +that many people are interested in: building a compiler for a specific language. +However, there are many other domains that can use compiler technology that are +not typically considered. For example, LLVM has been used to implement OpenGL +graphics acceleration, translate C++ code to ActionScript, and many other +cute and clever things. Maybe you will be the first to JIT compile a regular +expression interpreter into native code with LLVM?
    • + +
    + +
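As a rough sketch of the global-variable suggestion in the list above, value lookup could consult a second table before rejecting a name. The names `named_values`, `global_values`, and `lookup` are hypothetical, and plain floats stand in for LLVM values here; a real front-end would presumably store LLVM values (such as `GlobalVariable` instances) instead.

```ocaml
(* Hypothetical sketch: plain floats stand in for LLVM values. *)
let named_values : (string, float) Hashtbl.t = Hashtbl.create 16   (* locals *)
let global_values : (string, float) Hashtbl.t = Hashtbl.create 16  (* globals *)

(* Check the local table first, then fall back to the global table
   before rejecting the name. *)
let lookup name =
  match Hashtbl.find_opt named_values name with
  | Some v -> Ok v
  | None ->
      (match Hashtbl.find_opt global_values name with
       | Some v -> Ok v
       | None -> Error ("unknown variable name: " ^ name))

(* Example population of both tables. *)
let () =
  Hashtbl.replace named_values "x" 1.0;
  Hashtbl.replace global_values "pi" 3.14159
```

With the tables populated as shown, `lookup "x"` resolves locally, `lookup "pi"` falls through to the globals, and any other name yields an `Error`.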

    +Have fun - try doing something crazy and unusual. Building a language like +everyone else always has is much less fun than trying something a little crazy +or off the wall and seeing how it turns out. If you get stuck or want to talk +about it, feel free to email the llvmdev mailing +list: it has lots of people who are interested in languages and are often +willing to help out. +

    + +

    Before we end this tutorial, I want to talk about some "tips and tricks" for generating +LLVM IR. These are some of the more subtle things that may not be obvious, but +are very useful if you want to take advantage of LLVM's capabilities.

    + +
    + + + + + +
    + +

    We have a couple of common questions about code in the LLVM IR form - let's just +get these out of the way right now, shall we?

    + +
    + + + + + +
    + +

    Kaleidoscope is an example of a "portable language": any program written in +Kaleidoscope will work the same way on any target that it runs on. Many other +languages have this property, e.g. Lisp, Java, Haskell, JavaScript, Python, etc. +(note that while these languages are portable, not all their libraries are).

    + +

    One nice aspect of LLVM is that it is often capable of preserving target +independence in the IR: you can take the LLVM IR for a Kaleidoscope-compiled +program and run it on any target that LLVM supports, even emitting C code and +compiling that on targets that LLVM doesn't support natively. You can trivially +tell that the Kaleidoscope compiler generates target-independent code because it +never queries for any target-specific information when generating code.

    + +

    The fact that LLVM provides a compact, target-independent representation for +code gets a lot of people excited. Unfortunately, these people are usually +thinking about C or a language from the C family when they are asking questions +about language portability. I say "unfortunately", because there is really no +way to make (fully general) C code portable, other than shipping the source code +around (and of course, C source code is not actually portable in general +either - ever port a really old application from 32 to 64 bits?).

    + +

    The problem with C (again, in its full generality) is that it is heavily +laden with target-specific assumptions. As one simple example, the preprocessor +often destructively removes target-independence from the code when it processes +the input text:

    + +
    +
    +#ifdef __i386__
    +  int X = 1;
    +#else
    +  int X = 42;
    +#endif
    +
    +
    + +

    While it is possible to engineer more and more complex solutions to problems +like this, it cannot be solved in full generality in a way that is better than shipping +the actual source code.

    + +

    That said, there are interesting subsets of C that can be made portable. If +you are willing to fix primitive types to a fixed size (say int = 32-bits, +and long = 64-bits), don't care about ABI compatibility with existing binaries, +and are willing to give up some other minor features, you can have portable +code. This can make sense for specialized domains such as an +in-kernel language.

    + +
    + + + + + +
    + +

    Many of the languages above are also "safe" languages: it is impossible for +a program written in Java to corrupt its address space and crash the process +(assuming the JVM has no bugs). +Safety is an interesting property that requires a combination of language +design, runtime support, and often operating system support.

    + +

    It is certainly possible to implement a safe language in LLVM, but LLVM IR +does not itself guarantee safety. The LLVM IR allows unsafe pointer casts, +use-after-free bugs, buffer overruns, and a variety of other problems. Safety +needs to be implemented as a layer on top of LLVM and, conveniently, several +groups have investigated this. Ask on the llvmdev mailing +list if you are interested in more details.

    + +
    + + + + + +
    + +

    One thing about LLVM that turns off many people is that it does not solve all +the world's problems in one system (sorry 'world hunger', someone else will have +to solve you some other day). One specific complaint is that people perceive +LLVM as being incapable of performing high-level language-specific optimization: +LLVM "loses too much information".

    + +

    Unfortunately, this is really not the place to give you a full and unified +version of "Chris Lattner's theory of compiler design". Instead, I'll make a +few observations:

    + +

    First, you're right that LLVM does lose information. For example, as of this +writing, there is no way to distinguish in the LLVM IR whether an SSA-value came +from a C "int" or a C "long" on an ILP32 machine (other than debug info). Both +get compiled down to an 'i32' value and the information about what it came from +is lost. The more general issue here is that the LLVM type system uses +"structural equivalence" instead of "name equivalence". Another place this +surprises people is if you have two types in a high-level language that have the +same structure (e.g. two different structs that have a single int field): these +types will compile down into a single LLVM type and it will be impossible to +tell what it came from.
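To make the structural-equivalence point concrete, here is a toy model, not the LLVM API. The `lltype` variant, the `source_struct` record, and `lower` are all hypothetical names invented for this sketch; the only claim is that once lowering drops the source-level name, two differently named structs with the same fields compare equal.

```ocaml
(* Hypothetical miniature of a structural type system; not the LLVM API. *)
type lltype =
  | I32
  | Double
  | Struct of lltype list

(* A source-level struct, which the front-end identifies by name. *)
type source_struct = { name : string; fields : lltype list }

(* Lowering keeps only the structure and drops the name. *)
let lower s = Struct s.fields

(* Two distinct source types that happen to share a layout. *)
let meters = { name = "Meters"; fields = [ I32 ] }
let seconds = { name = "Seconds"; fields = [ I32 ] }
```

Here `meters.name` and `seconds.name` differ, yet `lower meters = lower seconds` holds, so nothing downstream of `lower` can tell the two apart.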

    + +

    Second, while LLVM does lose information, LLVM is not a fixed target: we +continue to enhance and improve it in many different ways. In addition to +adding new features (LLVM did not always support exceptions or debug info), we +also extend the IR to capture important information for optimization (e.g. +whether an argument is sign or zero extended, information about pointers +aliasing, etc). Many of the enhancements are user-driven: people want LLVM to +include some specific feature, so they go ahead and extend it.

    + +

    Third, it is possible and easy to add language-specific +optimizations, and you have a number of choices in how to do it. As one trivial +example, it is easy to add language-specific optimization passes that +"know" things about code compiled for a language. In the case of the C family, +there is an optimization pass that "knows" about the standard C library +functions. If you call "exit(0)" in main(), it knows that it is safe to +optimize that into "return 0;" because C specifies what the 'exit' +function does.

    + +

    In addition to simple library knowledge, it is possible to embed a variety of +other language-specific information into the LLVM IR. If you have a specific +need and run into a wall, please bring the topic up on the llvmdev list. At the +very worst, you can always treat LLVM as if it were a "dumb code generator" and +implement the high-level optimizations you desire in your front-end, on the +language-specific AST. +

    + +
    + + + + + +
    + +

    There is a variety of useful tips and tricks that you come to know after +working on/with LLVM that aren't obvious at first glance. Instead of letting +everyone rediscover them, this section talks about some of these issues.

    + +
    + + + + + +
    + +

    One interesting thing that comes up, if you are trying to keep the code +generated by your compiler "target independent", is that you often need to know +the size of some LLVM type or the offset of some field in an LLVM structure. +For example, you might need to pass the size of a type into a function that +allocates memory.

    + +

    Unfortunately, this can vary widely across targets: for example the width of +a pointer is trivially target-specific. However, there is a clever +way to use the getelementptr instruction that allows you to compute this +in a portable way.

    + +
    + + + + + +
    + +

    Some languages want to explicitly manage their stack frames, often so that +they are garbage collected or to allow easy implementation of closures. There +are often better ways to implement these features than explicit stack frames, +but LLVM +does support them, if you want. It requires your front-end to convert the +code into Continuation +Passing Style and to use tail calls (which LLVM also supports).

    + +
    + + +
    +
    + Chris Lattner
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + +

    Added: www-releases/trunk/2.8/docs/tutorial/Makefile
    URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/Makefile?rev=115556&view=auto
    ==============================================================================
    --- www-releases/trunk/2.8/docs/tutorial/Makefile (added)
    +++ www-releases/trunk/2.8/docs/tutorial/Makefile Mon Oct 4 15:49:23 2010
    @@ -0,0 +1,28 @@
    +##===- docs/tutorial/Makefile ------------------------------*- Makefile -*-===##
    +#
    +# The LLVM Compiler Infrastructure
    +#
    +# This file is distributed under the University of Illinois Open Source
    +# License. See LICENSE.TXT for details.
    +#
    +##===----------------------------------------------------------------------===##
    +
    +LEVEL := ../..
    +include $(LEVEL)/Makefile.common
    +
    +HTML := $(wildcard $(PROJ_SRC_DIR)/*.html)
    +EXTRA_DIST := $(HTML) index.html
    +HTML_DIR := $(DESTDIR)$(PROJ_docsdir)/html/tutorial
    +
    +install-local:: $(HTML)
    +	$(Echo) Installing HTML Tutorial Documentation
    +	$(Verb) $(MKDIR) $(HTML_DIR)
    +	$(Verb) $(DataInstall) $(HTML) $(HTML_DIR)
    +	$(Verb) $(DataInstall) $(PROJ_SRC_DIR)/index.html $(HTML_DIR)
    +
    +uninstall-local::
    +	$(Echo) Uninstalling Tutorial Documentation
    +	$(Verb) $(RM) -rf $(HTML_DIR)
    +
    +printvars::
    +	$(Echo) "HTML : " '$(HTML)'

    Added: www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl1.html
    URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl1.html?rev=115556&view=auto
    ==============================================================================
    --- www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl1.html (added)
    +++ www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl1.html Mon Oct 4 15:49:23 2010
    @@ -0,0 +1,365 @@
    + + + + + Kaleidoscope: Tutorial Introduction and the Lexer + + + + + + + + +
    Kaleidoscope: Tutorial Introduction and the Lexer
    + + + +
    +

    + Written by Chris Lattner + and Erick Tryzelaar +

    +
    + + + + + +
    + +

    Welcome to the "Implementing a language with LLVM" tutorial. This tutorial +runs through the implementation of a simple language, showing how fun and +easy it can be. This tutorial will get you up and started as well as help to +build a framework you can extend to other languages. The code in this tutorial +can also be used as a playground to hack on other LLVM specific things. +

    + +

    +The goal of this tutorial is to progressively unveil our language, describing +how it is built up over time. This will let us cover a fairly broad range of +language design and LLVM-specific usage issues, showing and explaining the code +for it all along the way, without overwhelming you with tons of details up +front.

    + +

    It is useful to point out ahead of time that this tutorial is really about +teaching compiler techniques and LLVM specifically, not about teaching +modern and sane software engineering principles. In practice, this means that +we'll take a number of shortcuts to simplify the exposition. For example, the +code leaks memory, uses global variables all over the place, doesn't use nice +design patterns like visitors, etc... but it +is very simple. If you dig in and use the code as a basis for future projects, +fixing these deficiencies shouldn't be hard.

    + +

    I've tried to put this tutorial together in a way that makes chapters easy to +skip over if you are already familiar with or are uninterested in the various +pieces. The structure of the tutorial is: +

    + +
      +
    • Chapter #1: Introduction to the Kaleidoscope +language, and the definition of its Lexer - This shows where we are going +and the basic functionality that we want it to do. In order to make this +tutorial maximally understandable and hackable, we choose to implement +everything in Objective Caml instead of using lexer and parser generators. +LLVM obviously works just fine with such tools; feel free to use one if you +prefer.
    • +
    • Chapter #2: Implementing a Parser and +AST - With the lexer in place, we can talk about parsing techniques and +basic AST construction. This tutorial describes recursive descent parsing and +operator precedence parsing. Nothing in Chapters 1 or 2 is LLVM-specific; +the code doesn't even link in LLVM at this point. :)
    • +
    • Chapter #3: Code generation to LLVM +IR - With the AST ready, we can show off how easy generation of LLVM IR +really is.
    • +
    • Chapter #4: Adding JIT and Optimizer +Support - Because a lot of people are interested in using LLVM as a JIT, +we'll dive right into it and show you the 3 lines it takes to add JIT support. +LLVM is also useful in many other ways, but this is one simple and "sexy" way +to show off its power. :)
    • +
    • Chapter #5: Extending the Language: +Control Flow - With the language up and running, we show how to extend it +with control flow operations (if/then/else and a 'for' loop). This gives us a +chance to talk about simple SSA construction and control flow.
    • +
    • Chapter #6: Extending the Language: +User-defined Operators - This is a silly but fun chapter that talks about +extending the language to let the user program define their own arbitrary +unary and binary operators (with assignable precedence!). This lets us build a +significant piece of the "language" as library routines.
    • +
    • Chapter #7: Extending the Language: +Mutable Variables - This chapter talks about adding user-defined local +variables along with an assignment operator. The interesting part about this +is how easy and trivial it is to construct SSA form in LLVM: no, LLVM does +not require your front-end to construct SSA form!
    • +
    • Chapter #8: Conclusion and other +useful LLVM tidbits - This chapter wraps up the series by talking about +potential ways to extend the language, but also includes a bunch of pointers to +info about "special topics" like adding garbage collection support, exceptions, +debugging, support for "spaghetti stacks", and a bunch of other tips and +tricks.
    • + +
    + +

    By the end of the tutorial, we'll have written a bit less than 700 +non-comment, non-blank lines of code. With this small amount of code, we'll +have built up a very reasonable compiler for a non-trivial language including +a hand-written lexer, parser, AST, as well as code generation support with a JIT +compiler. While other systems may have interesting "hello world" tutorials, +I think the breadth of this tutorial is a great testament to the strengths of +LLVM and why you should consider it if you're interested in language or compiler +design.

    + +

    A note about this tutorial: we expect you to extend the language and play +with it on your own. Take the code and go crazy hacking away at it. Compilers +don't need to be scary creatures - it can be a lot of fun to play with +languages!

    + +
    + + + + + +
    + +

    This tutorial will be illustrated with a toy language that we'll call +"Kaleidoscope" (derived +from Greek roots meaning "beautiful", "form", and "view"). +Kaleidoscope is a procedural language that allows you to define functions, use +conditionals, math, etc. Over the course of the tutorial, we'll extend +Kaleidoscope to support the if/then/else construct, a for loop, user-defined +operators, JIT compilation with a simple command line interface, etc.

    + +

    Because we want to keep things simple, the only datatype in Kaleidoscope is a +64-bit floating point type (aka 'float' in O'Caml parlance). As such, all +values are implicitly double precision and the language doesn't require type +declarations. This gives the language a very nice and simple syntax. For +example, the following simple example computes Fibonacci numbers:

    + +
    +
    +# Compute the x'th fibonacci number.
    +def fib(x)
    +  if x < 3 then
    +    1
    +  else
    +    fib(x-1)+fib(x-2)
    +
    +# This expression will compute the 40th number.
    +fib(40)
    +
    +
    + +

    We also allow Kaleidoscope to call into standard library functions (the LLVM +JIT makes this completely trivial). This means that you can use the 'extern' +keyword to define a function before you use it (this is also useful for mutually +recursive functions). For example:

    + +
    +
    +extern sin(arg);
    +extern cos(arg);
    +extern atan2(arg1 arg2);
    +
    +atan2(sin(.4), cos(42))
    +
    +
    + +

    A more interesting example is included in Chapter 6 where we write a little +Kaleidoscope application that displays +a Mandelbrot Set at various levels of magnification.

    + +

    Let's dive into the implementation of this language!

    + +
    + + + + + +
    + +

    When it comes to implementing a language, the first thing needed is +the ability to process a text file and recognize what it says. The traditional +way to do this is to use a "lexer" (aka 'scanner') +to break the input up into "tokens". Each token returned by the lexer includes +a token code and potentially some metadata (e.g. the numeric value of a number). +First, we define the possibilities: +

    + +
    +
    +(* The lexer returns 'Kwd' for an unknown character, otherwise one of
    + * the other variants for known things. *)
    +type token =
    +  (* commands *)
    +  | Def | Extern
    +
    +  (* primary *)
    +  | Ident of string | Number of float
    +
    +  (* unknown *)
    +  | Kwd of char
    +
    +
    + +

    Each token returned by our lexer will be one of the token variant values. +An unknown character like '+' will be returned as Token.Kwd '+'. If +the current token is an identifier, the value will be Token.Ident s. If +the current token is a numeric literal (like 1.0), the value will be +Token.Number 1.0. +
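As a small illustration of how code downstream of the lexer can inspect these variants, the sketch below pattern-matches on a token to render it as text. The type is repeated so the block stands alone; `string_of_token` is a hypothetical debugging helper, not part of the tutorial's code.

```ocaml
(* Same shape as the token type in the text, repeated so the sketch
   stands alone. *)
type token =
  | Def | Extern
  | Ident of string | Number of float
  | Kwd of char

(* Hypothetical helper: render a token for debugging output. *)
let string_of_token = function
  | Def -> "Def"
  | Extern -> "Extern"
  | Ident s -> Printf.sprintf "Ident %S" s
  | Number n -> Printf.sprintf "Number %g" n
  | Kwd c -> Printf.sprintf "Kwd %C" c
```

Because the match is exhaustive, the compiler will warn if a new token variant is added later and this function is not updated.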

    + +

    The actual implementation of the lexer is a collection of functions driven +by a function named Lexer.lex. The Lexer.lex function is +called to return the next token from standard input. We will use +Camlp4 +to simplify the tokenization of the standard input. Its definition starts +as:

    + +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer
    + *===----------------------------------------------------------------------===*)
    +
    +let rec lex = parser
    +  (* Skip any whitespace. *)
    +  | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
    +
    +
    + +

    +Lexer.lex works by recursing over a char Stream.t to read +characters one at a time from the standard input. It eats them as it recognizes +them and stores them in a Token.token variant. The first thing that +it has to do is ignore whitespace between tokens. This is accomplished with the +recursive call above.

    + +

    The next thing Lexer.lex needs to do is recognize identifiers and +specific keywords like "def". Kaleidoscope does this with a pattern match +and a helper function.

    + +

    +
    +  (* identifier: [a-zA-Z][a-zA-Z0-9]* *)
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +
    +...
    +
    +and lex_ident buffer = parser
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +  | [< stream=lex >] ->
    +      match Buffer.contents buffer with
    +      | "def" -> [< 'Token.Def; stream >]
    +      | "extern" -> [< 'Token.Extern; stream >]
    +      | id -> [< 'Token.Ident id; stream >]
    +
    +
    + +

    Numeric values are similar:

    + +
    +
    +  (* number: [0-9.]+ *)
    +  | [< ' ('0' .. '9' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +
    +...
    +
    +and lex_number buffer = parser
    +  | [< ' ('0' .. '9' | '.' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +  | [< stream=lex >] ->
    +      [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
    +
    +
    + +

    This is all pretty straight-forward code for processing input. When reading +a numeric value from input, we use the OCaml float_of_string function +to convert it to a numeric value that we store in Token.Number. Note +that this isn't doing sufficient error checking: it will raise Failure +if the string is "1.23.45.67". Feel free to extend it :). Next we handle +comments: +

    + +
    +
    +  (* Comment until end of line. *)
    +  | [< ' ('#'); stream >] ->
    +      lex_comment stream
    +
    +...
    +
    +and lex_comment = parser
    +  | [< ' ('\n'); stream=lex >] -> stream
    +  | [< 'c; e=lex_comment >] -> e
    +  | [< >] -> [< >]
    +
    +
    + +

    We handle comments by skipping to the end of the line and then returning the +next token. Finally, if the input doesn't match one of the above cases, it is +either an operator character like '+' or the end of the file. These are handled +with this code:

    + +
    +
    +  (* Otherwise, just return the character as its ascii value. *)
    +  | [< 'c; stream >] ->
    +      [< 'Token.Kwd c; lex stream >]
    +
    +  (* end of stream. *)
    +  | [< >] -> [< >]
    +
    +
    + +

    With this, we have the complete lexer for the basic Kaleidoscope language +(the full code listing for the Lexer is +available in the next chapter of the +tutorial). Next we'll build a simple parser that +uses this to build an Abstract Syntax Tree. When we have that, we'll +include a driver so that you can use the lexer and parser together. +
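For readers without Camlp4 at hand, the same lexing rules can be sketched in plain OCaml over a string. This is an assumption-laden rewrite, not the tutorial's code: it returns a token list instead of a stream, the helper names (`is_alpha`, `is_digit`, `span`, `go`) are invented here, and it uses `float_of_string_opt` (available from OCaml 4.05) so that malformed numbers such as "1.23.45.67" are reported explicitly.

```ocaml
type token =
  | Def | Extern
  | Ident of string | Number of float
  | Kwd of char

let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* Tokenize a whole string; returns the tokens in source order. *)
let lex (s : string) : token list =
  let n = String.length s in
  (* First index >= i at which predicate [p] stops holding. *)
  let rec span p i = if i < n && p s.[i] then span p (i + 1) else i in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      match s.[i] with
      (* Skip any whitespace. *)
      | ' ' | '\n' | '\r' | '\t' -> go (i + 1) acc
      (* Comment until end of line. *)
      | '#' -> go (span (fun c -> c <> '\n') i) acc
      (* identifier: [a-zA-Z][a-zA-Z0-9]* *)
      | c when is_alpha c ->
          let j = span (fun c -> is_alpha c || is_digit c) i in
          let tok =
            match String.sub s i (j - i) with
            | "def" -> Def
            | "extern" -> Extern
            | id -> Ident id
          in
          go j (tok :: acc)
      (* number: a digit followed by digits and dots *)
      | c when is_digit c ->
          let j = span (fun c -> is_digit c || c = '.') i in
          let text = String.sub s i (j - i) in
          (match float_of_string_opt text with
           | Some f -> go j (Number f :: acc)
           | None -> failwith ("malformed number: " ^ text))
      (* Otherwise, return the character itself. *)
      | c -> go (i + 1) (Kwd c :: acc)
  in
  go 0 []
```

A call such as `lex "def foo(x) x+1.5"` yields one token per lexeme, with the parentheses and `+` coming back as `Kwd` values.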

    + +Next: Implementing a Parser and AST +
    + + +
    +
    + Chris Lattner
    + Erick Tryzelaar
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + +

    Added: www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl2.html
    URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl2.html?rev=115556&view=auto
    ==============================================================================
    --- www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl2.html (added)
    +++ www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl2.html Mon Oct 4 15:49:23 2010
    @@ -0,0 +1,1045 @@
    + + + + + Kaleidoscope: Implementing a Parser and AST + + + + + + + + +
    Kaleidoscope: Implementing a Parser and AST
    + + + +
    +

    + Written by Chris Lattner + and Erick Tryzelaar +

    +
    + + + + + +
    + +

    Welcome to Chapter 2 of the "Implementing a language +with LLVM in Objective Caml" tutorial. This chapter shows you how to use +the lexer, built in Chapter 1, to build a +full parser for our +Kaleidoscope language. Once we have a parser, we'll define and build an Abstract Syntax +Tree (AST).

    + +

    The parser we will build uses a combination of Recursive Descent +Parsing and Operator-Precedence +Parsing to parse the Kaleidoscope language (the latter for +binary expressions and the former for everything else). Before we get to +parsing though, let's talk about the output of the parser: the Abstract Syntax +Tree.

    + +
    + + + + + +
    + +

    The AST for a program captures its behavior in such a way that it is easy for +later stages of the compiler (e.g. code generation) to interpret. We basically +want one object for each construct in the language, and the AST should closely +model the language. In Kaleidoscope, we have expressions, a prototype, and a +function object. We'll start with expressions first:

    + +
    +
    +(* expr - Base type for all expression nodes. *)
    +type expr =
    +  (* variant for numeric literals like "1.0". *)
    +  | Number of float
    +
    +
    + +

    The code above shows the definition of the base expr type and one +variant which we use for numeric literals. The important thing to note about +this code is that the Number variant captures the numeric value of the +literal as its payload. This allows later phases of the compiler to +know what the stored numeric value is.

    + +

    Right now we only create the AST, so there are no useful functions on +them. It would be very easy to add a function to pretty print the code, +for example. Here are the other expression AST node definitions that we'll use +in the basic form of the Kaleidoscope language: +

    + +
    +
    +  (* variant for referencing a variable, like "a". *)
    +  | Variable of string
    +
    +  (* variant for a binary operator. *)
    +  | Binary of char * expr * expr
    +
    +  (* variant for function calls. *)
    +  | Call of string * expr array
    +
    +
    + +

    This is all (intentionally) rather straight-forward: variables capture the +variable name, binary operators capture their opcode (e.g. '+'), and calls +capture a function name as well as a list of any argument expressions. One thing +that is nice about our AST is that it captures the language features without +talking about the syntax of the language. Note that there is no discussion about +precedence of binary operators, lexical structure, etc.
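The pretty-printing function mentioned earlier could look roughly like the sketch below. `string_of_expr` is a hypothetical name, and the fully parenthesized output format is just one choice; since the AST carries no precedence information, explicit parentheses are the simplest way to keep the printed form unambiguous.

```ocaml
(* The expression variants from the text, repeated so the sketch
   stands alone. *)
type expr =
  | Number of float
  | Variable of string
  | Binary of char * expr * expr
  | Call of string * expr array

(* Hypothetical helper: print an expression with explicit parentheses
   around every binary operation. *)
let rec string_of_expr = function
  | Number n -> Printf.sprintf "%g" n
  | Variable name -> name
  | Binary (op, lhs, rhs) ->
      Printf.sprintf "(%s %c %s)" (string_of_expr lhs) op (string_of_expr rhs)
  | Call (callee, args) ->
      let args = Array.to_list (Array.map string_of_expr args) in
      Printf.sprintf "%s(%s)" callee (String.concat ", " args)
```

For instance, the tree for `x + fib(2)` is `Binary ('+', Variable "x", Call ("fib", [| Number 2.0 |]))`, which this helper renders with the grouping made explicit.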

    + +

    For our basic language, these are all of the expression nodes we'll define. +Because it doesn't have conditional control flow, it isn't Turing-complete; +we'll fix that in a later installment. The two things we need next are a way +to talk about the interface to a function, and a way to talk about functions +themselves:

    + +
    +
    +(* proto - This type represents the "prototype" for a function, which captures
    + * its name, and its argument names (thus implicitly the number of arguments the
    + * function takes). *)
    +type proto = Prototype of string * string array
    +
    +(* func - This type represents a function definition itself. *)
    +type func = Function of proto * expr
    +
    +
    + +

    In Kaleidoscope, functions are typed with just a count of their arguments. +Since all values are double precision floating point, the type of each argument +doesn't need to be stored anywhere. In a more aggressive and realistic +language, the "expr" variants would probably have a type field.
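To illustrate, here is how a definition like `def add(a b) a+b` might be represented with these types, along with a hypothetical `arity` helper that recovers the argument count, which is the only "type" information a prototype carries.

```ocaml
(* Types from the text, repeated so the sketch stands alone. *)
type expr =
  | Number of float
  | Variable of string
  | Binary of char * expr * expr
  | Call of string * expr array

type proto = Prototype of string * string array
type func = Function of proto * expr

(* Hypothetical helper: the number of arguments a prototype declares. *)
let arity (Prototype (_, args)) = Array.length args

(* AST for: def add(a b) a+b *)
let add_proto = Prototype ("add", [| "a"; "b" |])
let add_func = Function (add_proto, Binary ('+', Variable "a", Variable "b"))
```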

    + +

    With this scaffolding, we can now talk about parsing expressions and function +bodies in Kaleidoscope.

    + +
    + + + + + +
    + +

    Now that we have an AST to build, we need to define the parser code to build +it. The idea here is that we want to parse something like "x+y" (which is +returned as three tokens by the lexer) into an AST that could be generated with +calls like this:

    + +
    +
    +  let x = Variable "x" in
    +  let y = Variable "y" in
    +  let result = Binary ('+', x, y) in
    +  ...
    +
    +
    + +

    +The error handling routines make use of the builtin Stream.Failure and +Stream.Error. Stream.Failure is raised when the parser is +unable to find any matching token in the first position of a pattern. +Stream.Error is raised when the first token matches, but the rest do +not. The error recovery in our parser will not be the best and is not +particularly user-friendly, but it will be enough for our tutorial. These +exceptions make it easier to handle errors in routines that have various return +types.

    + +

    With these basic types and exceptions, we can implement the first +piece of our grammar: numeric literals.

    + +
    + + + + + +
    + +

    We start with numeric literals, because they are the simplest to process. +For each production in our grammar, we'll define a function which parses that +production. We call this class of expressions "primary" expressions, for +reasons that will become more clear +later in the tutorial. In order to parse an arbitrary primary expression, +we need to determine what sort of expression it is. For numeric literals, we +have:

    + +
    +
    +(* primary
    + *   ::= identifier
    + *   ::= numberexpr
    + *   ::= parenexpr *)
    +let rec parse_primary = parser
    +  (* numberexpr ::= number *)
    +  | [< 'Token.Number n >] -> Ast.Number n
    +
    +
    + +

    This routine is very simple: it expects to be called when the current token +is a Token.Number token. It takes the current number value, creates +an Ast.Number node, advances the lexer to the next token, and finally +returns.

    + +

    There are some interesting aspects to this. The most important one is that +this routine eats all of the tokens that correspond to the production and +returns the lexer buffer with the next token (which is not part of the grammar +production) ready to go. This is a fairly standard way to go for recursive +descent parsers. For a better example, the parenthesis operator is defined like +this:

    + +
    +
    +  (* parenexpr ::= '(' expression ')' *)
    +  | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
    +
    +
    + +

    This function illustrates a number of interesting things about the +parser:

    + +

    +1) It shows how we use the Stream.Error exception. When called, this +function expects that the current token is a '(' token, but after parsing the +subexpression, it is possible that there is no ')' waiting. For example, if +the user types in "(4 x" instead of "(4)", the parser should emit an error. +Because errors can occur, the parser needs a way to indicate that they +happened. In our parser, we use the camlp4 shortcut syntax token ?? "parse +error", where if the token before the ?? does not match, then +Stream.Error "parse error" will be raised.

    + +

    2) Another interesting aspect of this function is that it uses recursion by +calling Parser.parse_expr (we will soon see that +Parser.parse_expr can call Parser.parse_primary). This is +powerful because it allows us to handle recursive grammars, and keeps each +production very simple. Note that parentheses do not cause construction of AST +nodes themselves. While we could do it this way, the most important role of +parentheses is to guide the parser and provide grouping. Once the parser +constructs the AST, parentheses are not needed.

    + +

    The next simple production is for handling variable references and function +calls:

    + +
    +
    +  (* identifierexpr
    +   *   ::= identifier
    +   *   ::= identifier '(' argumentexpr ')' *)
    +  | [< 'Token.Ident id; stream >] ->
    +      let rec parse_args accumulator = parser
    +        | [< e=parse_expr; stream >] ->
    +            begin parser
    +              | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
    +              | [< >] -> e :: accumulator
    +            end stream
    +        | [< >] -> accumulator
    +      in
    +      let rec parse_ident id = parser
    +        (* Call. *)
    +        | [< 'Token.Kwd '(';
    +             args=parse_args [];
    +             'Token.Kwd ')' ?? "expected ')'">] ->
    +            Ast.Call (id, Array.of_list (List.rev args))
    +
    +        (* Simple variable ref. *)
    +        | [< >] -> Ast.Variable id
    +      in
    +      parse_ident id stream
    +
    +
    + +

    This routine follows the same style as the other routines. (It expects to be +called if the current token is a Token.Ident token). It also has +recursion and error handling. One interesting aspect of this is that it uses +look-ahead to determine if the current identifier is a standalone +variable reference or if it is a function call expression. It handles this by +checking to see if the token after the identifier is a '(' token, constructing +either an Ast.Variable or Ast.Call node as appropriate. +
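The look-ahead decision can also be sketched in plain OCaml over a token list. This is a simplified illustration, not the tutorial's camlp4 parser: the token and expr types are cut-down local copies, arguments are restricted to numbers and plain variables, and failwith stands in for Stream.Error.

```ocaml
type token = Ident of string | Number of float | Kwd of char
type expr = Num of float | Variable of string | Call of string * expr array

(* Parse one argument; this sketch only accepts numbers and variables. *)
let parse_arg = function
  | Number n :: rest -> (Num n, rest)
  | Ident id :: rest -> (Variable id, rest)
  | _ -> failwith "expected an argument"

(* Parse a non-empty, comma-separated argument list and the closing ')'. *)
let rec parse_args acc tokens =
  let arg, rest = parse_arg tokens in
  match rest with
  | Kwd ',' :: rest' -> parse_args (arg :: acc) rest'
  | Kwd ')' :: rest' -> (List.rev (arg :: acc), rest')
  | _ -> failwith "expected ')' or ','"

(* One token of look-ahead after the identifier decides between a call and
 * a simple variable reference. *)
let parse_identifier = function
  | Ident id :: Kwd '(' :: Kwd ')' :: rest -> (Call (id, [||]), rest)
  | Ident id :: Kwd '(' :: rest ->
      let args, rest' = parse_args [] rest in
      (Call (id, Array.of_list args), rest')
  | Ident id :: rest -> (Variable id, rest)
  | _ -> failwith "expected an identifier"
```

Each function returns the parsed node together with the unconsumed tokens, which plays the role that the mutable stream plays in the tutorial code.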

    + +

    We finish up by raising an exception if we received a token we didn't +expect:

    + +
    +
    +  | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
    +
    +
    + +

    Now that basic expressions are handled, we need to handle binary expressions. +They are a bit more complex.

    + +
    + + + + + +
    + +

    Binary expressions are significantly harder to parse because they are often +ambiguous. For example, when given the string "x+y*z", the parser can choose +to parse it as either "(x+y)*z" or "x+(y*z)". With common definitions from +mathematics, we expect the latter parse, because "*" (multiplication) has +higher precedence than "+" (addition).

    + +

    There are many ways to handle this, but an elegant and efficient way is to +use Operator-Precedence +Parsing. This parsing technique uses the precedence of binary operators to +guide recursion. To start with, we need a table of precedences:

    + +
    +
    +(* binop_precedence - This holds the precedence for each binary operator that is
    + * defined *)
    +let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
    +
    +(* precedence - Get the precedence of the pending binary operator token. *)
    +let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
    +
    +...
    +
    +let main () =
    +  (* Install standard binary operators.
    +   * 1 is the lowest precedence. *)
    +  Hashtbl.add Parser.binop_precedence '<' 10;
    +  Hashtbl.add Parser.binop_precedence '+' 20;
    +  Hashtbl.add Parser.binop_precedence '-' 20;
    +  Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
    +  ...
    +
    +
    + +

    For the basic form of Kaleidoscope, we will only support 4 binary operators +(this can obviously be extended by you, our brave and intrepid reader). The +Parser.precedence function returns the precedence for the current +token, or -1 if the token is not a binary operator. Having a Hashtbl.t +makes it easy to add new operators and makes it clear that the algorithm doesn't +depend on the specific operators involved, but it would be easy enough to +eliminate the Hashtbl.t and do the comparisons in the +Parser.precedence function. (Or just use a fixed-size array).
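As a sketch of the alternative mentioned above, the table can be replaced by a plain pattern match. It keeps the same contract (the operator's precedence, or -1 for anything that is not a binary operator) and assumes the four operators are fixed at compile time:

```ocaml
(* precedence - Return the precedence of a binary operator character, or -1
 * if the character is not one of the four known binary operators. *)
let precedence = function
  | '<' -> 10
  | '+' | '-' -> 20
  | '*' -> 40  (* highest *)
  | _ -> -1
```

The trade-off is that new operators can no longer be installed at run time, which the Hashtbl.t version permits.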

    + +

    With the helper above defined, we can now start parsing binary expressions. +The basic idea of operator precedence parsing is to break down an expression +with potentially ambiguous binary operators into pieces. Consider, for example, +the expression "a+b+(c+d)*e*f+g". Operator precedence parsing considers this +as a stream of primary expressions separated by binary operators. As such, +it will first parse the leading primary expression "a", then it will see the +pairs [+, b] [+, (c+d)] [*, e] [*, f] and [+, g]. Note that because parentheses +are primary expressions, the binary expression parser doesn't need to worry +about nested subexpressions like (c+d) at all. +

    + +

    +To start, an expression is a primary expression potentially followed by a +sequence of [binop,primaryexpr] pairs:

    + +
    +
    +(* expression
    + *   ::= primary binoprhs *)
    +and parse_expr = parser
    +  | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
    +
    +
    + +

    Parser.parse_bin_rhs is the function that parses the sequence of +pairs for us. It takes a precedence and the expression for the part +that has been parsed so far. Note that "x" is a perfectly valid expression; as +such, "binoprhs" is allowed to be empty, in which case it returns the expression +that is passed into it. In our example above, the code passes the expression for +"a" into Parser.parse_bin_rhs and the current token is "+".

    + +

    The precedence value passed into Parser.parse_bin_rhs indicates the +minimal operator precedence that the function is allowed to eat. For +example, if the current pair stream is [+, x] and Parser.parse_bin_rhs +is passed in a precedence of 40, it will not consume any tokens (because the +precedence of '+' is only 20). With this in mind, Parser.parse_bin_rhs +starts with:

    + +
    +
    +(* binoprhs
    + *   ::= ('+' primary)* *)
    +and parse_bin_rhs expr_prec lhs stream =
    +  match Stream.peek stream with
    +  (* If this is a binop, find its precedence. *)
    +  | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
    +      let token_prec = precedence c in
    +
    +      (* If this is a binop that binds at least as tightly as the current binop,
    +       * consume it, otherwise we are done. *)
    +      if token_prec < expr_prec then lhs else begin
    +
    +
    + +

    This code gets the precedence of the current token and checks to see if it is +too low. Because we defined invalid tokens to have a precedence of -1, this +check implicitly knows that the pair-stream ends when the token stream runs out +of binary operators. If this check succeeds, we know that the token is a binary +operator and that it will be included in this expression:

    + +
    +
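    +        (* Eat the binop. *)
    +        Stream.junk stream;
    +
    +        (* Parse the primary expression after the binary operator. *)
    +        let rhs = parse_primary stream in
    +
    +        (* Okay, we know this is a binop. *)
    +        let rhs =
    +          match Stream.peek stream with
    +          | Some (Token.Kwd c2) ->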
    +
    +
    + +

    As such, this code eats (and remembers) the binary operator and then parses +the primary expression that follows. This builds up the whole pair, the first of +which is [+, b] for the running example.

    + +

    Now that we parsed the left-hand side of an expression and one pair of the +RHS sequence, we have to decide which way the expression associates. In +particular, we could have "(a+b) binop unparsed" or "a + (b binop unparsed)". +To determine this, we look ahead at "binop" to determine its precedence and +compare it to BinOp's precedence (which is '+' in this case):

    + +
    +
    +              (* If BinOp binds less tightly with rhs than the operator after
    +               * rhs, let the pending operator take rhs as its lhs. *)
    +              let next_prec = precedence c2 in
    +              if token_prec < next_prec
    +
    +
    + +

    If the precedence of the binop to the right of "RHS" is lower than or equal to the +precedence of our current operator, then we know that the parentheses associate +as "(a+b) binop ...". In our example, the current operator is "+" and the next +operator is "+", so they have the same precedence. In this case we'll +create the AST node for "a+b", and then continue parsing:

    + +
    +
    +          ... if body omitted ...
    +        in
    +
    +        (* Merge lhs/rhs. *)
    +        let lhs = Ast.Binary (c, lhs, rhs) in
    +        parse_bin_rhs expr_prec lhs stream
    +      end
    +
    +
    + +

    In our example above, this will turn "a+b+" into "(a+b)" and execute the next +iteration of the loop, with "+" as the current token. The code above will eat, +remember, and parse "(c+d)" as the primary expression, which makes the +current pair equal to [+, (c+d)]. It will then evaluate the 'if' conditional above with +"*" as the binop to the right of the primary. In this case, the precedence of "*" is +higher than the precedence of "+" so the if condition will be entered.

    + +

    The critical question left here is "how can the if condition parse the right +hand side in full"? In particular, to build the AST correctly for our example, +it needs to get all of "(c+d)*e*f" as the RHS expression variable. The code to +do this is surprisingly simple (code from the above two blocks duplicated for +context):

    + +
    +
    +          match Stream.peek stream with
    +          | Some (Token.Kwd c2) ->
    +              (* If BinOp binds less tightly with rhs than the operator after
    +               * rhs, let the pending operator take rhs as its lhs. *)
    +              if token_prec < precedence c2
    +              then parse_bin_rhs (token_prec + 1) rhs stream
    +              else rhs
    +          | _ -> rhs
    +        in
    +
    +        (* Merge lhs/rhs. *)
    +        let lhs = Ast.Binary (c, lhs, rhs) in
    +        parse_bin_rhs expr_prec lhs stream
    +      end
    +
    +
    + +

    At this point, we know that the binary operator to the RHS of our primary +has higher precedence than the binop we are currently parsing. As such, we know +that any sequence of pairs whose operators are all higher precedence than "+" +should be parsed together and returned as "RHS". To do this, we recursively +invoke the Parser.parse_bin_rhs function specifying "token_prec+1" as +the minimum precedence required for it to continue. In our example above, this +will cause it to return the AST node for "(c+d)*e*f" as RHS, which is then set +as the RHS of the '+' expression.

    + +

    Finally, on the next recursive call to Parser.parse_bin_rhs, the "+g" piece is parsed +and added to the AST. With this little bit of code (14 non-trivial lines), we +correctly handle fully general binary expression parsing in a very elegant way. +This was a whirlwind tour of this code, and it is somewhat subtle. I recommend +running through it with a few tough examples to see how it works. +
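To run through the algorithm by hand, it can help to have a version that does not depend on camlp4 or the Stream module. The following is a minimal sketch under several assumptions: tokens are held in a plain list, each function returns the remaining tokens alongside its result, identifiers are single lowercase letters, and tokens_of_string and to_string are hypothetical helpers added only for experimentation.

```ocaml
type token = Ident of string | Kwd of char
type expr = Variable of string | Binary of char * expr * expr

(* Precedence of a binary operator, or -1 for any other character. *)
let precedence = function
  | '<' -> 10 | '+' | '-' -> 20 | '*' -> 40 | _ -> -1

(* primary ::= identifier | '(' expression ')' *)
let rec parse_primary = function
  | Ident id :: rest -> (Variable id, rest)
  | Kwd '(' :: rest ->
      let e, rest = parse_expr rest in
      begin match rest with
      | Kwd ')' :: rest -> (e, rest)
      | _ -> failwith "expected ')'"
      end
  | _ -> failwith "unknown token when expecting an expression"

(* binoprhs ::= (binop primary)* *)
and parse_bin_rhs expr_prec lhs tokens =
  match tokens with
  | Kwd c :: rest when precedence c >= 0 && precedence c >= expr_prec ->
      let token_prec = precedence c in
      let rhs, rest = parse_primary rest in
      (* If the next operator binds tighter, let it take rhs as its lhs. *)
      let rhs, rest =
        match rest with
        | Kwd c2 :: _ when token_prec < precedence c2 ->
            parse_bin_rhs (token_prec + 1) rhs rest
        | _ -> (rhs, rest)
      in
      (* Merge lhs/rhs and keep consuming pairs at this precedence level. *)
      parse_bin_rhs expr_prec (Binary (c, lhs, rhs)) rest
  | _ -> (lhs, tokens)

(* expression ::= primary binoprhs *)
and parse_expr tokens =
  let lhs, rest = parse_primary tokens in
  parse_bin_rhs 0 lhs rest

(* Hypothetical helper: one token per character, letters are identifiers. *)
let tokens_of_string s =
  List.init (String.length s) (fun i ->
    match s.[i] with
    | 'a' .. 'z' as c -> Ident (String.make 1 c)
    | c -> Kwd c)

(* Hypothetical helper: print the tree with explicit grouping. *)
let rec to_string = function
  | Variable v -> v
  | Binary (op, l, r) ->
      "(" ^ to_string l ^ String.make 1 op ^ to_string r ^ ")"
```

With these definitions, to_string (fst (parse_expr (tokens_of_string "a+b+(c+d)*e*f+g"))) evaluates to "(((a+b)+(((c+d)*e)*f))+g)", which matches the grouping described in the walkthrough above.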

    + +

    This wraps up handling of expressions. At this point, we can point the +parser at an arbitrary token stream and build an expression from it, stopping +at the first token that is not part of the expression. Next up we need to +handle function definitions, etc.

    + +
    + + + + + +
    + +

    +The next thing missing is handling of function prototypes. In Kaleidoscope, +these are used both for 'extern' function declarations as well as function body +definitions. The code to do this is straightforward and not very interesting +(once you've survived expressions): +

    + +
    +
    +(* prototype
    + *   ::= id '(' id* ')' *)
    +let parse_prototype =
    +  let rec parse_args accumulator = parser
    +    | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    +    | [< >] -> accumulator
    +  in
    +
    +  parser
    +  | [< 'Token.Ident id;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +       args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      (* success. *)
    +      Ast.Prototype (id, Array.of_list (List.rev args))
    +
    +  | [< >] ->
    +      raise (Stream.Error "expected function name in prototype")
    +
    +
    + +

    Given this, a function definition is very simple, just a prototype plus +an expression to implement the body:

    + +
    +
    +(* definition ::= 'def' prototype expression *)
    +let parse_definition = parser
    +  | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
    +      Ast.Function (p, e)
    +
    +
    + +

    In addition, we support 'extern' to declare functions like 'sin' and 'cos' as +well as to support forward declaration of user functions. These 'extern's are just +prototypes with no body:

    + +
    +
    +(*  external ::= 'extern' prototype *)
    +let parse_extern = parser
    +  | [< 'Token.Extern; e=parse_prototype >] -> e
    +
    +
    + +

    Finally, we'll also let the user type in arbitrary top-level expressions and +evaluate them on the fly. We will handle this by defining anonymous nullary +(zero argument) functions for them:

    + +
    +
    +(* toplevelexpr ::= expression *)
    +let parse_toplevel = parser
    +  | [< e=parse_expr >] ->
    +      (* Make an anonymous proto. *)
    +      Ast.Function (Ast.Prototype ("", [||]), e)
    +
    +
    + +

    Now that we have all the pieces, let's build a little driver that will let us +actually execute this code we've built!

    + +
    + + + + + +
    + +

    The driver for this simply invokes all of the parsing pieces with a top-level +dispatch loop. There isn't much interesting here, so I'll just include the +top-level loop. See below for full code in the "Top-Level +Parsing" section.

    + +
    +
    +(* top ::= definition | external | expression | ';' *)
    +let rec main_loop stream =
    +  match Stream.peek stream with
    +  | None -> ()
    +
    +  (* ignore top-level semicolons. *)
    +  | Some (Token.Kwd ';') ->
    +      Stream.junk stream;
    +      main_loop stream
    +
    +  | Some token ->
    +      begin
    +        try match token with
    +        | Token.Def ->
    +            ignore(Parser.parse_definition stream);
    +            print_endline "parsed a function definition.";
    +        | Token.Extern ->
    +            ignore(Parser.parse_extern stream);
    +            print_endline "parsed an extern.";
    +        | _ ->
    +            (* Evaluate a top-level expression into an anonymous function. *)
    +            ignore(Parser.parse_toplevel stream);
    +            print_endline "parsed a top-level expr";
    +        with Stream.Error s ->
    +          (* Skip token for error recovery. *)
    +          Stream.junk stream;
    +          print_endline s;
    +      end;
    +      print_string "ready> "; flush stdout;
    +      main_loop stream
    +
    +
    + +

    The most interesting part of this is that we ignore top-level semicolons. +Why is this, you ask? The basic reason is that if you type "4 + 5" at the +command line, the parser doesn't know whether that is the end of what you will type +or not. For example, on the next line you could type "def foo..." in which case +4+5 is the end of a top-level expression. Alternatively you could type "* 6", +which would continue the expression. Having top-level semicolons allows you to +type "4+5;", and the parser will know you are done.

    + +
    + + + + + +
    + +

    With just under 300 lines of commented code (240 lines of non-comment, +non-blank code), we fully defined our minimal language, including a lexer, +parser, and AST builder. With this done, the executable will validate +Kaleidoscope code and tell us if it is grammatically invalid. For +example, here is a sample interaction:

    + +
    +
    +$ ./toy.byte
    +ready> def foo(x y) x+foo(y, 4.0);
    +Parsed a function definition.
    +ready> def foo(x y) x+y y;
    +Parsed a function definition.
    +Parsed a top-level expr
    +ready> def foo(x y) x+y );
    +Parsed a function definition.
    +Error: unknown token when expecting an expression
    +ready> extern sin(a);
    +ready> Parsed an extern
    +ready> ^D
    +$
    +
    +
    + +

    There is a lot of room for extension here. You can define new AST nodes, +extend the language in many ways, etc. In the +next installment, we will describe how to generate LLVM Intermediate +Representation (IR) from the AST.

    + +
    + + + + + +
    + +

    +Here is the complete code listing for this and the previous chapter. +Note that it is fully self-contained: you don't need LLVM or any external +libraries at all for this. (Besides the OCaml standard libraries, of +course.) To build this, just compile with:

    + +
    +
    +# Compile
    +ocamlbuild toy.byte
    +# Run
    +./toy.byte
    +
    +
    + +

    Here is the code:

    + +
    +
    _tags:
    +
    +
    +<{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
    +
    +
    + +
    token.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer Tokens
    + *===----------------------------------------------------------------------===*)
    +
    +(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
    + * these others for known things. *)
    +type token =
    +  (* commands *)
    +  | Def | Extern
    +
    +  (* primary *)
    +  | Ident of string | Number of float
    +
    +  (* unknown *)
    +  | Kwd of char
    +
    +
    + +
    lexer.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer
    + *===----------------------------------------------------------------------===*)
    +
    +let rec lex = parser
    +  (* Skip any whitespace. *)
    +  | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
    +
    +  (* identifier: [a-zA-Z][a-zA-Z0-9]* *)
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +
    +  (* number: [0-9.]+ *)
    +  | [< ' ('0' .. '9' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +
    +  (* Comment until end of line. *)
    +  | [< ' ('#'); stream >] ->
    +      lex_comment stream
    +
    +  (* Otherwise, just return the character as its ascii value. *)
    +  | [< 'c; stream >] ->
    +      [< 'Token.Kwd c; lex stream >]
    +
    +  (* end of stream. *)
    +  | [< >] -> [< >]
    +
    +and lex_number buffer = parser
    +  | [< ' ('0' .. '9' | '.' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +  | [< stream=lex >] ->
    +      [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
    +
    +and lex_ident buffer = parser
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +  | [< stream=lex >] ->
    +      match Buffer.contents buffer with
    +      | "def" -> [< 'Token.Def; stream >]
    +      | "extern" -> [< 'Token.Extern; stream >]
    +      | id -> [< 'Token.Ident id; stream >]
    +
    +and lex_comment = parser
    +  | [< ' ('\n'); stream=lex >] -> stream
    +  | [< 'c; e=lex_comment >] -> e
    +  | [< >] -> [< >]
    +
    +
    + +
    ast.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Abstract Syntax Tree (aka Parse Tree)
    + *===----------------------------------------------------------------------===*)
    +
    +(* expr - Base type for all expression nodes. *)
    +type expr =
    +  (* variant for numeric literals like "1.0". *)
    +  | Number of float
    +
    +  (* variant for referencing a variable, like "a". *)
    +  | Variable of string
    +
    +  (* variant for a binary operator. *)
    +  | Binary of char * expr * expr
    +
    +  (* variant for function calls. *)
    +  | Call of string * expr array
    +
    +(* proto - This type represents the "prototype" for a function, which captures
    + * its name, and its argument names (thus implicitly the number of arguments the
    + * function takes). *)
    +type proto = Prototype of string * string array
    +
    +(* func - This type represents a function definition itself. *)
    +type func = Function of proto * expr
    +
    +
    + +
    parser.ml:
    +
    +
    +(*===---------------------------------------------------------------------===
    + * Parser
    + *===---------------------------------------------------------------------===*)
    +
    +(* binop_precedence - This holds the precedence for each binary operator that is
    + * defined *)
    +let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
    +
    +(* precedence - Get the precedence of the pending binary operator token. *)
    +let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
    +
    +(* primary
    + *   ::= identifier
    + *   ::= numberexpr
    + *   ::= parenexpr *)
    +let rec parse_primary = parser
    +  (* numberexpr ::= number *)
    +  | [< 'Token.Number n >] -> Ast.Number n
    +
    +  (* parenexpr ::= '(' expression ')' *)
    +  | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
    +
    +  (* identifierexpr
    +   *   ::= identifier
    +   *   ::= identifier '(' argumentexpr ')' *)
    +  | [< 'Token.Ident id; stream >] ->
    +      let rec parse_args accumulator = parser
    +        | [< e=parse_expr; stream >] ->
    +            begin parser
    +              | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
    +              | [< >] -> e :: accumulator
    +            end stream
    +        | [< >] -> accumulator
    +      in
    +      let rec parse_ident id = parser
    +        (* Call. *)
    +        | [< 'Token.Kwd '(';
    +             args=parse_args [];
    +             'Token.Kwd ')' ?? "expected ')'">] ->
    +            Ast.Call (id, Array.of_list (List.rev args))
    +
    +        (* Simple variable ref. *)
    +        | [< >] -> Ast.Variable id
    +      in
    +      parse_ident id stream
    +
    +  | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
    +
    +(* binoprhs
    + *   ::= ('+' primary)* *)
    +and parse_bin_rhs expr_prec lhs stream =
    +  match Stream.peek stream with
    +  (* If this is a binop, find its precedence. *)
    +  | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
    +      let token_prec = precedence c in
    +
    +      (* If this is a binop that binds at least as tightly as the current binop,
    +       * consume it, otherwise we are done. *)
    +      if token_prec < expr_prec then lhs else begin
    +        (* Eat the binop. *)
    +        Stream.junk stream;
    +
    +        (* Parse the primary expression after the binary operator. *)
    +        let rhs = parse_primary stream in
    +
    +        (* Okay, we know this is a binop. *)
    +        let rhs =
    +          match Stream.peek stream with
    +          | Some (Token.Kwd c2) ->
    +              (* If BinOp binds less tightly with rhs than the operator after
    +               * rhs, let the pending operator take rhs as its lhs. *)
    +              let next_prec = precedence c2 in
    +              if token_prec < next_prec
    +              then parse_bin_rhs (token_prec + 1) rhs stream
    +              else rhs
    +          | _ -> rhs
    +        in
    +
    +        (* Merge lhs/rhs. *)
    +        let lhs = Ast.Binary (c, lhs, rhs) in
    +        parse_bin_rhs expr_prec lhs stream
    +      end
    +  | _ -> lhs
    +
    +(* expression
    + *   ::= primary binoprhs *)
    +and parse_expr = parser
    +  | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
    +
    +(* prototype
    + *   ::= id '(' id* ')' *)
    +let parse_prototype =
    +  let rec parse_args accumulator = parser
    +    | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    +    | [< >] -> accumulator
    +  in
    +
    +  parser
    +  | [< 'Token.Ident id;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +       args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      (* success. *)
    +      Ast.Prototype (id, Array.of_list (List.rev args))
    +
    +  | [< >] ->
    +      raise (Stream.Error "expected function name in prototype")
    +
    +(* definition ::= 'def' prototype expression *)
    +let parse_definition = parser
    +  | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
    +      Ast.Function (p, e)
    +
    +(* toplevelexpr ::= expression *)
    +let parse_toplevel = parser
    +  | [< e=parse_expr >] ->
    +      (* Make an anonymous proto. *)
    +      Ast.Function (Ast.Prototype ("", [||]), e)
    +
    +(*  external ::= 'extern' prototype *)
    +let parse_extern = parser
    +  | [< 'Token.Extern; e=parse_prototype >] -> e
    +
    +
    + +
    toplevel.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Top-Level parsing and JIT Driver
    + *===----------------------------------------------------------------------===*)
    +
    +(* top ::= definition | external | expression | ';' *)
    +let rec main_loop stream =
    +  match Stream.peek stream with
    +  | None -> ()
    +
    +  (* ignore top-level semicolons. *)
    +  | Some (Token.Kwd ';') ->
    +      Stream.junk stream;
    +      main_loop stream
    +
    +  | Some token ->
    +      begin
    +        try match token with
    +        | Token.Def ->
    +            ignore(Parser.parse_definition stream);
    +            print_endline "parsed a function definition.";
    +        | Token.Extern ->
    +            ignore(Parser.parse_extern stream);
    +            print_endline "parsed an extern.";
    +        | _ ->
    +            (* Evaluate a top-level expression into an anonymous function. *)
    +            ignore(Parser.parse_toplevel stream);
    +            print_endline "parsed a top-level expr";
    +        with Stream.Error s ->
    +          (* Skip token for error recovery. *)
    +          Stream.junk stream;
    +          print_endline s;
    +      end;
    +      print_string "ready> "; flush stdout;
    +      main_loop stream
    +
    +
    + +
    toy.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Main driver code.
    + *===----------------------------------------------------------------------===*)
    +
    +let main () =
    +  (* Install standard binary operators.
    +   * 1 is the lowest precedence. *)
    +  Hashtbl.add Parser.binop_precedence '<' 10;
    +  Hashtbl.add Parser.binop_precedence '+' 20;
    +  Hashtbl.add Parser.binop_precedence '-' 20;
    +  Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
    +
    +  (* Prime the first token. *)
    +  print_string "ready> "; flush stdout;
    +  let stream = Lexer.lex (Stream.of_channel stdin) in
    +
    +  (* Run the main "interpreter loop" now. *)
    +  Toplevel.main_loop stream;
    +;;
    +
    +main ()
    +
    +
    +
    + +Next: Implementing Code Generation to LLVM IR +
    + + +
    +
    + Valid CSS! + Valid HTML 4.01! + + Chris Lattner + Erick Tryzelaar
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl3.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl3.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl3.html (added) +++ www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl3.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1093 @@ + + + + + Kaleidoscope: Implementing code generation to LLVM IR + + + + + + + + +
    Kaleidoscope: Code generation to LLVM IR
    + + + +
    +

    + Written by Chris Lattner + and Erick Tryzelaar +

    +
    + + + + + +
    + +

    Welcome to Chapter 3 of the "Implementing a language +with LLVM" tutorial. This chapter shows you how to transform the Abstract Syntax Tree, built in Chapter 2, into +LLVM IR. This will teach you a little bit about how LLVM does things, as well +as demonstrate how easy it is to use. It's much more work to build a lexer and +parser than it is to generate LLVM IR code. :) +

    + +

    Please note: the code in this chapter and later chapters requires LLVM 2.3 or +LLVM SVN to work. LLVM 2.2 and before will not work with it.

    + +
    + + + + + +
    + +

    +In order to generate LLVM IR, we want some simple setup to get started. First +we define a code generation (codegen) function that pattern-matches on each AST variant:

    + +
    +
    +let rec codegen_expr = function
    +  | Ast.Number n -> ...
    +  | Ast.Variable name -> ...
    +
    +
    + +

    The Codegen.codegen_expr function says to emit IR for that AST node +along with all the things it depends on, and they all return an LLVM Value +object. "Value" is the class used to represent a "Static Single +Assignment (SSA) register" or "SSA value" in LLVM. The most distinct aspect +of SSA values is that their value is computed as the related instruction +executes, and it does not get a new value until (and if) the instruction +re-executes. In other words, there is no way to "change" an SSA value. For +more information, please read up on Static Single +Assignment - the concepts are really quite natural once you grok them.

    + +

    The +second thing we want is an "Error" exception like we used for the parser, which +will be used to report errors found during code generation (for example, use of +an undeclared parameter):

    + +
    +
    +exception Error of string
    +
    +let context = global_context ()
    +let the_module = create_module context "my cool jit"
    +let builder = builder context
    +let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
    +let double_type = double_type context
    +
    +
    + +

    These top-level values will be used during code generation. +Codegen.the_module is the LLVM construct that contains all of the +functions and global variables in a chunk of code. In many ways, it is the +top-level structure that the LLVM IR uses to contain code.

    + +

    The Codegen.builder object is a helper object that makes it easy to +generate LLVM instructions. Instances of the IRBuilder +class keep track of the current place to insert instructions and have methods to +create new instructions.

    + +

    The Codegen.named_values map keeps track of which values are defined +in the current scope and what their LLVM representation is. (In other words, it +is a symbol table for the code). In this form of Kaleidoscope, the only things +that can be referenced are function parameters. As such, function parameters +will be in this map when generating code for their function body.

    + +

    +With these basics in place, we can start talking about how to generate code for +each expression. Note that this assumes that the Codegen.builder has +been set up to generate code into something. For now, we'll assume +that this has already been done, and we'll just use it to emit code.

    + +
    + + + + + +
    + +

    Generating LLVM code for expression nodes is very straightforward: less +than 30 lines of commented code for all four of our expression nodes. First +we'll do numeric literals:

    + +
    +
    +  | Ast.Number n -> const_float double_type n
    +
    +
    + +

    In the LLVM IR, numeric constants are represented with the +ConstantFP class, which holds the numeric value in an APFloat +internally (APFloat has the capability of holding floating point +constants of Arbitrary Precision). This code basically just +creates and returns a ConstantFP. Note that in the LLVM IR, +constants are all uniqued together and shared. For this reason, the C++ API +uses the "foo::get(..)" idiom instead of "new foo(..)" or "foo::Create(..)".

    + +
    +
    +  | Ast.Variable name ->
    +      (try Hashtbl.find named_values name with
    +        | Not_found -> raise (Error "unknown variable name"))
    +
    +
    + +

    References to variables are also quite simple using LLVM. In the simple +version of Kaleidoscope, we assume that the variable has already been emitted +somewhere and its value is available. In practice, the only values that can be +in the Codegen.named_values map are function arguments. This code +simply checks to see that the specified name is in the map (if not, an unknown +variable is being referenced) and returns the value for it. In future chapters, +we'll add support for loop induction variables +in the symbol table, and for local +variables.

    + +
    +
    +  | Ast.Binary (op, lhs, rhs) ->
    +      let lhs_val = codegen_expr lhs in
    +      let rhs_val = codegen_expr rhs in
    +      begin
    +        match op with
    +        | '+' -> build_fadd lhs_val rhs_val "addtmp" builder
    +        | '-' -> build_fsub lhs_val rhs_val "subtmp" builder
    +        | '*' -> build_fmul lhs_val rhs_val "multmp" builder
    +        | '<' ->
    +            (* Convert bool 0/1 to double 0.0 or 1.0 *)
    +            let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
    +            build_uitofp i double_type "booltmp" builder
    +        | _ -> raise (Error "invalid binary operator")
    +      end
    +
    +
    + +

    Binary operators start to get more interesting. The basic idea here is that +we recursively emit code for the left-hand side of the expression, then the +right-hand side, then we compute the result of the binary expression. In this +code, we do a simple switch on the opcode to create the right LLVM instruction. +
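The same recursive shape is easier to see in a tiny evaluator that computes a float directly instead of emitting an instruction. This is only an illustrative sketch: the expr type below is a cut-down copy of the tutorial's Ast.expr, and eval is a hypothetical function that is not part of the tutorial code.

```ocaml
exception Error of string

(* Cut-down copy of Ast.expr: just literals and binary operators. *)
type expr =
  | Number of float
  | Binary of char * expr * expr

(* Handle the left-hand side, then the right-hand side, then combine
 * them according to the operator, exactly as codegen_expr does. *)
let rec eval = function
  | Number n -> n
  | Binary (op, lhs, rhs) ->
      let l = eval lhs in
      let r = eval rhs in
      begin
        match op with
        | '+' -> l +. r
        | '-' -> l -. r
        | '*' -> l *. r
        | '<' -> if l < r then 1.0 else 0.0
        | _ -> raise (Error "invalid binary operator")
      end
```

Replacing each arithmetic operation with a builder call turns this interpreter back into the code generator shown above.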

    + +

    In the example above, the LLVM builder class is starting to show its value. +IRBuilder knows where to insert the newly created instruction; all you have to +do is specify what instruction to create (e.g. by calling one of the Llvm.build_* functions), +which operands to use (lhs_val and rhs_val here) and optionally +provide a name for the generated instruction.

    + +

    One nice thing about LLVM is that the name is just a hint. For instance, if +the code above emits multiple "addtmp" variables, LLVM will automatically +provide each one with an increasing, unique numeric suffix. Local value names +for instructions are purely optional, but it makes it much easier to read the +IR dumps.
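To make the renaming concrete, here is a small sketch of one way such suffixes could be assigned. It is not LLVM's implementation; the names used, last_unique and unique_name are hypothetical. A single counter shared by all names reproduces the suffixes that appear in the IR dumps later in this chapter.

```ocaml
(* Names already taken in the current function. *)
let used : (string, unit) Hashtbl.t = Hashtbl.create 16

(* One counter shared by every base name. *)
let last_unique = ref 0

(* Return the requested name if it is free, otherwise append the next
 * counter value until a free name is found. *)
let rec unique_name (base : string) : string =
  if not (Hashtbl.mem used base) then begin
    Hashtbl.add used base ();
    base
  end else begin
    incr last_unique;
    let candidate = base ^ string_of_int !last_unique in
    if Hashtbl.mem used candidate then unique_name base
    else begin
      Hashtbl.add used candidate ();
      candidate
    end
  end

(* Request names in the order the "foo" example in this chapter emits them. *)
let names =
  List.rev
    (List.fold_left
       (fun acc base -> unique_name base :: acc)
       []
       ["multmp"; "multmp"; "multmp"; "addtmp"; "multmp"; "addtmp"])
```

With these requests, names is multmp, multmp1, multmp2, addtmp, multmp3, addtmp4.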

    + +

    LLVM instructions are constrained by +strict rules: for example, the left and right operands of +an add instruction must have the same +type, and the result type of the add must match the operand types. Because +all values in Kaleidoscope are doubles, this makes for very simple code for add, +sub and mul.

    + +

    On the other hand, LLVM specifies that the fcmp instruction always returns an 'i1' value +(a one bit integer). The problem with this is that Kaleidoscope wants the value to be a 0.0 or 1.0 value. In order to get these semantics, we combine the fcmp instruction with +a uitofp instruction. This instruction +converts its input integer into a floating point value by treating the input +as an unsigned value. In contrast, if we used the sitofp instruction, the Kaleidoscope '<' +operator would return 0.0 and -1.0, depending on the input value.
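The difference between the two conversions can be modelled in plain OCaml. The functions below are hypothetical stand-ins that mimic the semantics of the instructions on a one-bit value; they are not calls into the LLVM bindings. The sketch also assumes the documented meaning of the "ult" predicate: true if the operands are unordered (either is NaN) or the first is less than the second.

```ocaml
(* Model of 'fcmp ult': true when a < b or when either operand is NaN. *)
let fcmp_ult (a : float) (b : float) : bool = not (a >= b)

(* Model of 'uitofp' on an i1: the single bit is read as unsigned, 0 or 1. *)
let uitofp_i1 (bit : bool) : float = if bit then 1.0 else 0.0

(* Model of 'sitofp' on an i1: the single bit is the sign bit, 0 or -1. *)
let sitofp_i1 (bit : bool) : float = if bit then (-1.0) else 0.0

(* What the Kaleidoscope '<' operator computes. *)
let kaleidoscope_lt (a : float) (b : float) : float =
  uitofp_i1 (fcmp_ult a b)
```

One consequence of choosing an unordered predicate is that a comparison involving NaN yields 1.0 in this model.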

    + +
    +
    +  | Ast.Call (callee, args) ->
    +      (* Look up the name in the module table. *)
    +      let callee =
    +        match lookup_function callee the_module with
    +        | Some callee -> callee
    +        | None -> raise (Error "unknown function referenced")
    +      in
    +      let params = params callee in
    +
    +      (* If argument mismatch error. *)
    +      if Array.length params == Array.length args then () else
    +        raise (Error "incorrect # arguments passed");
    +      let args = Array.map codegen_expr args in
    +      build_call callee args "calltmp" builder
    +
    +
    + +

    Code generation for function calls is quite straightforward with LLVM. The +code above initially does a function name lookup in the LLVM Module's symbol +table. Recall that the LLVM Module is the container that holds all of the +functions we are JIT'ing. By giving each function the same name as what the +user specifies, we can use the LLVM symbol table to resolve function names for +us.

    + +

    Once we have the function to call, we recursively codegen each argument that +is to be passed in, and create an LLVM call +instruction. Note that LLVM uses the native C calling conventions by +default, allowing these calls to also call into standard library functions like +"sin" and "cos", with no additional effort.

    + +

    This wraps up our handling of the four basic expressions that we have so far +in Kaleidoscope. Feel free to go in and add some more. For example, by +browsing the LLVM language reference you'll find +several other interesting instructions that are really easy to plug into our +basic framework.

    + +
    + + + + + +
    + +

    Code generation for prototypes and functions must handle a number of +details, which makes their code less beautiful than expression code +generation, but allows us to illustrate some important points. First, let's +talk about code generation for prototypes: they are used both for function +bodies and external function declarations. The code starts with:

    + +
    +
    +let codegen_proto = function
    +  | Ast.Prototype (name, args) ->
    +      (* Make the function type: double(double,double) etc. *)
    +      let doubles = Array.make (Array.length args) double_type in
    +      let ft = function_type double_type doubles in
    +      let f =
    +        match lookup_function name the_module with
    +
    +
    + +

    This code packs a lot of power into a few lines. Note first that this +function returns a "Function*" instead of a "Value*" (although at the moment +they both are modeled by llvalue in ocaml). Because a "prototype" +really talks about the external interface for a function (not the value computed +by an expression), it makes sense for it to return the LLVM Function it +corresponds to when codegen'd.

    + +

    The call to Llvm.function_type creates the Llvm.lltype +that should be used for a given Prototype. Since all function arguments in +Kaleidoscope are of type double, the first line creates an array of "N" LLVM +double types. It then uses the Llvm.function_type function to create a +function type that takes "N" doubles as arguments, returns one double as a +result, and that is not vararg (a vararg type is created with the function +Llvm.var_arg_function_type instead). Note that Types in LLVM are uniqued just +like Constants are, so you don't "new" a type, you "get" it.

    + +

    The final line above checks if the function has already been defined in +Codegen.the_module. If not, we will create it.

    + +
    +
    +        | None -> declare_function name ft the_module
    +
    +
    + +

    This indicates the type and name to use, as well as which module to insert +into. By default we assume a function has +Llvm.Linkage.ExternalLinkage. "External +linkage" means that the function may be defined outside the current module +and/or that it is callable by functions outside the module. The "name" +passed in is the name the user specified: this name is registered in +the symbol table of Codegen.the_module, which is used by the function call +code above.

    + +

    In Kaleidoscope, I choose to allow redefinitions of functions in two cases: +first, we want to allow 'extern'ing a function more than once, as long as the +prototypes for the externs match (since all arguments have the same type, we +just have to check that the number of arguments match). Second, we want to +allow 'extern'ing a function and then defining a body for it. This is useful +when defining mutually recursive functions.

    + +
    +
    +        (* If 'f' conflicted, there was already something named 'name'. If it
    +         * has a body, don't allow redefinition or reextern. *)
    +        | Some f ->
    +            (* If 'f' already has a body, reject this. *)
    +            if Array.length (basic_blocks f) == 0 then () else
    +              raise (Error "redefinition of function");
    +
    +            (* If 'f' took a different number of arguments, reject. *)
    +            if Array.length (params f) == Array.length args then () else
    +              raise (Error "redefinition of function with different # args");
    +            f
    +      in
    +
    +
    + +

    To enforce these rules, we first check whether the pre-existing +function is "empty". In this case, empty means that it has no basic blocks in +it, which means it has no body. A function with no body is a forward +declaration. If the function already has a body, it is a full definition, and since we +don't allow anything after a full definition of the +function, the code rejects this case. If the previous reference to a function +was an 'extern', we simply verify that the number of arguments for that +definition and this one match up. If not, we emit an error.
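The decision logic can be isolated from LLVM in a short sketch. Here a hypothetical record type decl summarizes what the module knows about a function (its arity and whether it has a body), and the hypothetical function declare plays the role of the lookup in codegen_proto:

```ocaml
exception Error of string

(* Hypothetical summary of a function already present in the module. *)
type decl = { arity : int; has_body : bool }

let table : (string, decl) Hashtbl.t = Hashtbl.create 16

(* Reuse a matching forward declaration, create a new one if the name
 * is unknown, and reject everything else. *)
let declare (name : string) (arity : int) : decl =
  match Hashtbl.find_opt table name with
  | None ->
      let d = { arity; has_body = false } in
      Hashtbl.add table name d;
      d
  | Some d when d.has_body ->
      raise (Error "redefinition of function")
  | Some d when d.arity <> arity ->
      raise (Error "redefinition of function with different # args")
  | Some d -> d
```

The order of the guards matters: a function that already has a body is rejected before its argument count is even considered, which matches the order of the checks in the tutorial code.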

    + +
    +
    +      (* Set names for all arguments. *)
    +      Array.iteri (fun i a ->
    +        let n = args.(i) in
    +        set_value_name n a;
    +        Hashtbl.add named_values n a;
    +      ) (params f);
    +      f
    +
    +
    + +

    The last bit of code for prototypes loops over all of the arguments in the +function, setting the name of the LLVM Argument objects to match, and registering +the arguments in the Codegen.named_values map for future use by the +Ast.Variable variant. Once this is set up, it returns the Function +object to the caller. Note that we don't check for conflicting +argument names here (e.g. "extern foo(a b a)"). Doing so would be very +straightforward with the mechanics we have already used above.
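As one possible way to add the missing check, the sketch below scans the argument names of a prototype and raises on the first repeat. The function name check_unique_args and the error message are hypothetical; the tutorial code does not contain them.

```ocaml
exception Error of string

(* Raise Error if any argument name occurs more than once. *)
let check_unique_args (args : string array) : unit =
  let seen : (string, unit) Hashtbl.t = Hashtbl.create 8 in
  Array.iter
    (fun name ->
      if Hashtbl.mem seen name then
        raise (Error ("duplicate argument name: " ^ name));
      Hashtbl.add seen name ())
    args
```

A call such as check_unique_args args at the top of the Ast.Prototype case would reject "extern foo(a b a)" before any LLVM objects are created.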

    + +
    +
    +let codegen_func = function
    +  | Ast.Function (proto, body) ->
    +      Hashtbl.clear named_values;
    +      let the_function = codegen_proto proto in
    +
    +
    + +

    Code generation for function definitions starts out simply enough: we first +clear out the Codegen.named_values map to make sure that there isn't +anything in it from the last function we compiled. We then codegen the +prototype (proto) and verify that it is ok. Code generation of the prototype ensures +that there is an LLVM Function object that is ready to go for us.

    + +
    +
    +      (* Create a new basic block to start insertion into. *)
    +      let bb = append_block context "entry" the_function in
    +      position_at_end bb builder;
    +
    +      try
    +        let ret_val = codegen_expr body in
    +
    +
    + +

    Now we get to the point where the Codegen.builder is set up. The +first line creates a new +basic block (named +"entry"), which is inserted into the_function. The second line then +tells the builder that new instructions should be inserted into the end of the +new basic block. Basic blocks in LLVM are an important part of functions that +define the Control Flow Graph. +Since we don't have any control flow, our functions will only contain one +block at this point. We'll fix this in Chapter +5 :).

    + +
    +
    +        let ret_val = codegen_expr body in
    +
    +        (* Finish off the function. *)
    +        let _ = build_ret ret_val builder in
    +
    +        (* Validate the generated code, checking for consistency. *)
    +        Llvm_analysis.assert_valid_function the_function;
    +
    +        the_function
    +
    +
    + +

    Once the insertion point is set up, we call the Codegen.codegen_expr +function for the root expression of the function. If no error happens, this emits +code to compute the expression into the entry block and returns the value that +was computed. Assuming no error, we then create an LLVM ret instruction, which completes the function. +Once the function is built, we call +Llvm_analysis.assert_valid_function, which is provided by LLVM. This +function does a variety of consistency checks on the generated code, to +determine if our compiler is doing everything right. Using this is important: +it can catch a lot of bugs. Once the function is finished and validated, we +return it.

    + +
    +
    +      with e ->
    +        delete_function the_function;
    +        raise e
    +
    +
    + +

    The only piece left here is handling of the error case. For simplicity, we +handle this by merely deleting the function we produced with the +Llvm.delete_function method. This allows the user to redefine a +function that they incorrectly typed in before: if we didn't delete it, it +would live in the symbol table, with a body, preventing future redefinition.

    + +

    This code does have a bug, though. Since the Codegen.codegen_proto +can return a previously defined forward declaration, our code can actually delete +a forward declaration. There are a number of ways to fix this bug, see what you +can come up with! Here is a testcase:

    + +
    +
    +extern foo(a b);     # ok, defines foo.
    +def foo(a b) c;      # error, 'c' is invalid.
    +def bar() foo(1, 2); # error, unknown function "foo"
    +
    +
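One of the possible fixes is to remember whether the function existed before this definition began, and on failure to remove only what this definition created. The sketch below models that idea without LLVM: decl, extern and define are hypothetical stand-ins, and gen_body plays the role of codegen_expr.

```ocaml
exception Error of string

(* Hypothetical stand-in for an LLVM function: arity plus a body flag. *)
type decl = { arity : int; mutable has_body : bool }

let table : (string, decl) Hashtbl.t = Hashtbl.create 16

(* 'extern': add a forward declaration unless one is already present. *)
let extern (name : string) (arity : int) : unit =
  if not (Hashtbl.mem table name) then
    Hashtbl.add table name { arity; has_body = false }

(* 'def': gen_body stands in for codegen_expr and may raise Error. *)
let define (name : string) (arity : int) (gen_body : unit -> unit) : unit =
  let pre_existing = Hashtbl.mem table name in
  extern name arity;
  let d = Hashtbl.find table name in
  try
    gen_body ();
    d.has_body <- true
  with e ->
    (* Only undo what this call created; keep an earlier 'extern'. *)
    if not pre_existing then Hashtbl.remove table name;
    raise e
```

In this model, the failing "def foo(a b) c" from the testcase leaves the earlier "extern foo(a b)" in place, so the later call to foo still resolves. A real fix would also have to discard any partially built body of the pre-existing function, which this sketch does not model.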
    + +
    + + + + + +
    + +

    +For now, code generation to LLVM doesn't really get us much, except that we can +look at the pretty IR calls. The sample code inserts calls to Codegen into the +"Toplevel.main_loop", and then dumps out the LLVM IR. This gives a +nice way to look at the LLVM IR for simple functions. For example: +

    + +
    +
    +ready> 4+5;
    +Read top-level expression:
    +define double @""() {
    +entry:
    +        %addtmp = fadd double 4.000000e+00, 5.000000e+00
    +        ret double %addtmp
    +}
    +
    +
    + +

    Note how the parser turns the top-level expression into anonymous functions +for us. This will be handy when we add JIT +support in the next chapter. Also note that the code is very literally +transcribed, no optimizations are being performed. We will +add optimizations explicitly +in the next chapter.

    + +
    +
    +ready> def foo(a b) a*a + 2*a*b + b*b;
    +Read function definition:
    +define double @foo(double %a, double %b) {
    +entry:
    +        %multmp = fmul double %a, %a
    +        %multmp1 = fmul double 2.000000e+00, %a
    +        %multmp2 = fmul double %multmp1, %b
    +        %addtmp = fadd double %multmp, %multmp2
    +        %multmp3 = fmul double %b, %b
    +        %addtmp4 = fadd double %addtmp, %multmp3
    +        ret double %addtmp4
    +}
    +
    +
    + +

    This shows some simple arithmetic. Notice the striking similarity to the +LLVM builder calls that we use to create the instructions.

    + +
    +
    +ready> def bar(a) foo(a, 4.0) + bar(31337);
    +Read function definition:
    +define double @bar(double %a) {
    +entry:
    +        %calltmp = call double @foo(double %a, double 4.000000e+00)
    +        %calltmp1 = call double @bar(double 3.133700e+04)
    +        %addtmp = fadd double %calltmp, %calltmp1
    +        ret double %addtmp
    +}
    +
    +
    + +

    This shows some function calls. Note that this function will take a long +time to execute if you call it. In the future we'll add conditional control +flow to actually make recursion useful :).

    + +
    +
    +ready> extern cos(x);
    +Read extern:
    +declare double @cos(double)
    +
    +ready> cos(1.234);
    +Read top-level expression:
    +define double @""() {
    +entry:
    +        %calltmp = call double @cos(double 1.234000e+00)
    +        ret double %calltmp
    +}
    +
    +
    + +

    This shows an extern for the libm "cos" function, and a call to it.

    + + +
    +
    +ready> ^D
    +; ModuleID = 'my cool jit'
    +
    +define double @""() {
    +entry:
    +        %addtmp = fadd double 4.000000e+00, 5.000000e+00
    +        ret double %addtmp
    +}
    +
    +define double @foo(double %a, double %b) {
    +entry:
    +        %multmp = fmul double %a, %a
    +        %multmp1 = fmul double 2.000000e+00, %a
    +        %multmp2 = fmul double %multmp1, %b
    +        %addtmp = fadd double %multmp, %multmp2
    +        %multmp3 = fmul double %b, %b
    +        %addtmp4 = fadd double %addtmp, %multmp3
    +        ret double %addtmp4
    +}
    +
    +define double @bar(double %a) {
    +entry:
    +        %calltmp = call double @foo(double %a, double 4.000000e+00)
    +        %calltmp1 = call double @bar(double 3.133700e+04)
    +        %addtmp = fadd double %calltmp, %calltmp1
    +        ret double %addtmp
    +}
    +
    +declare double @cos(double)
    +
    +define double @""() {
    +entry:
    +        %calltmp = call double @cos(double 1.234000e+00)
    +        ret double %calltmp
    +}
    +
    +
    + +

    When you quit the current demo, it dumps out the IR for the entire module +generated. Here you can see the big picture with all the functions referencing +each other.

    + +

    This wraps up the third chapter of the Kaleidoscope tutorial. Up next, we'll +describe how to add JIT codegen and optimizer +support to this so we can actually start running code!

    + +
    + + + + + + +
    + +

    +Here is the complete code listing for our running example, enhanced with the +LLVM code generator. Because this uses the LLVM libraries, we need to link +them in. To do this, we use the llvm-config tool to inform +our makefile/command line about which options to use:

    + +
    +
    +# Compile
    +ocamlbuild toy.byte
    +# Run
    +./toy.byte
    +
    +
    + +

    Here is the code:

    + +
    +
    _tags:
    +
    +
    +<{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
    +<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
    +
    +
    + +
    myocamlbuild.ml:
    +
    +
    +open Ocamlbuild_plugin;;
    +
    +ocaml_lib ~extern:true "llvm";;
    +ocaml_lib ~extern:true "llvm_analysis";;
    +
    +flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);;
    +
    +
    + +
    token.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer Tokens
    + *===----------------------------------------------------------------------===*)
    +
    +(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
    + * these others for known things. *)
    +type token =
    +  (* commands *)
    +  | Def | Extern
    +
    +  (* primary *)
    +  | Ident of string | Number of float
    +
    +  (* unknown *)
    +  | Kwd of char
    +
    +
    + +
    lexer.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer
    + *===----------------------------------------------------------------------===*)
    +
    +let rec lex = parser
    +  (* Skip any whitespace. *)
    +  | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
    +
    +  (* identifier: [a-zA-Z][a-zA-Z0-9]* *)
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +
    +  (* number: [0-9.]+ *)
    +  | [< ' ('0' .. '9' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +
    +  (* Comment until end of line. *)
    +  | [< ' ('#'); stream >] ->
    +      lex_comment stream
    +
    +  (* Otherwise, just return the character as its ascii value. *)
    +  | [< 'c; stream >] ->
    +      [< 'Token.Kwd c; lex stream >]
    +
    +  (* end of stream. *)
    +  | [< >] -> [< >]
    +
    +and lex_number buffer = parser
    +  | [< ' ('0' .. '9' | '.' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +  | [< stream=lex >] ->
    +      [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
    +
    +and lex_ident buffer = parser
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +  | [< stream=lex >] ->
    +      match Buffer.contents buffer with
    +      | "def" -> [< 'Token.Def; stream >]
    +      | "extern" -> [< 'Token.Extern; stream >]
    +      | id -> [< 'Token.Ident id; stream >]
    +
    +and lex_comment = parser
    +  | [< ' ('\n'); stream=lex >] -> stream
    +  | [< 'c; e=lex_comment >] -> e
    +  | [< >] -> [< >]
    +
    +
    + +
    ast.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Abstract Syntax Tree (aka Parse Tree)
    + *===----------------------------------------------------------------------===*)
    +
    +(* expr - Base type for all expression nodes. *)
    +type expr =
    +  (* variant for numeric literals like "1.0". *)
    +  | Number of float
    +
    +  (* variant for referencing a variable, like "a". *)
    +  | Variable of string
    +
    +  (* variant for a binary operator. *)
    +  | Binary of char * expr * expr
    +
    +  (* variant for function calls. *)
    +  | Call of string * expr array
    +
    +(* proto - This type represents the "prototype" for a function, which captures
    + * its name, and its argument names (thus implicitly the number of arguments the
    + * function takes). *)
    +type proto = Prototype of string * string array
    +
    +(* func - This type represents a function definition itself. *)
    +type func = Function of proto * expr
    +
    +
    + +
    parser.ml:
    +
    +
    +(*===---------------------------------------------------------------------===
    + * Parser
    + *===---------------------------------------------------------------------===*)
    +
    +(* binop_precedence - This holds the precedence for each binary operator that is
    + * defined *)
    +let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
    +
    +(* precedence - Get the precedence of the pending binary operator token. *)
    +let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
    +
    +(* primary
    + *   ::= identifier
    + *   ::= numberexpr
    + *   ::= parenexpr *)
    +let rec parse_primary = parser
    +  (* numberexpr ::= number *)
    +  | [< 'Token.Number n >] -> Ast.Number n
    +
    +  (* parenexpr ::= '(' expression ')' *)
    +  | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
    +
    +  (* identifierexpr
    +   *   ::= identifier
    +   *   ::= identifier '(' argumentexpr ')' *)
    +  | [< 'Token.Ident id; stream >] ->
    +      let rec parse_args accumulator = parser
    +        | [< e=parse_expr; stream >] ->
    +            begin parser
    +              | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
    +              | [< >] -> e :: accumulator
    +            end stream
    +        | [< >] -> accumulator
    +      in
    +      let rec parse_ident id = parser
    +        (* Call. *)
    +        | [< 'Token.Kwd '(';
    +             args=parse_args [];
    +             'Token.Kwd ')' ?? "expected ')'">] ->
    +            Ast.Call (id, Array.of_list (List.rev args))
    +
    +        (* Simple variable ref. *)
    +        | [< >] -> Ast.Variable id
    +      in
    +      parse_ident id stream
    +
    +  | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
    +
    +(* binoprhs
    + *   ::= ('+' primary)* *)
    +and parse_bin_rhs expr_prec lhs stream =
    +  match Stream.peek stream with
    +  (* If this is a binop, find its precedence. *)
    +  | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
    +      let token_prec = precedence c in
    +
    +      (* If this is a binop that binds at least as tightly as the current binop,
    +       * consume it, otherwise we are done. *)
    +      if token_prec < expr_prec then lhs else begin
    +        (* Eat the binop. *)
    +        Stream.junk stream;
    +
    +        (* Parse the primary expression after the binary operator. *)
    +        let rhs = parse_primary stream in
    +
    +        (* Okay, we know this is a binop. *)
    +        let rhs =
    +          match Stream.peek stream with
    +          | Some (Token.Kwd c2) ->
    +              (* If BinOp binds less tightly with rhs than the operator after
    +               * rhs, let the pending operator take rhs as its lhs. *)
    +              let next_prec = precedence c2 in
    +              if token_prec < next_prec
    +              then parse_bin_rhs (token_prec + 1) rhs stream
    +              else rhs
    +          | _ -> rhs
    +        in
    +
    +        (* Merge lhs/rhs. *)
    +        let lhs = Ast.Binary (c, lhs, rhs) in
    +        parse_bin_rhs expr_prec lhs stream
    +      end
    +  | _ -> lhs
    +
    +(* expression
    + *   ::= primary binoprhs *)
    +and parse_expr = parser
    +  | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
    +
    +(* prototype
    + *   ::= id '(' id* ')' *)
    +let parse_prototype =
    +  let rec parse_args accumulator = parser
    +    | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    +    | [< >] -> accumulator
    +  in
    +
    +  parser
    +  | [< 'Token.Ident id;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +       args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      (* success. *)
    +      Ast.Prototype (id, Array.of_list (List.rev args))
    +
    +  | [< >] ->
    +      raise (Stream.Error "expected function name in prototype")
    +
    +(* definition ::= 'def' prototype expression *)
    +let parse_definition = parser
    +  | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
    +      Ast.Function (p, e)
    +
    +(* toplevelexpr ::= expression *)
    +let parse_toplevel = parser
    +  | [< e=parse_expr >] ->
    +      (* Make an anonymous proto. *)
    +      Ast.Function (Ast.Prototype ("", [||]), e)
    +
    +(*  external ::= 'extern' prototype *)
    +let parse_extern = parser
    +  | [< 'Token.Extern; e=parse_prototype >] -> e
    +
    +
    + +
    codegen.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Code Generation
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +
    +exception Error of string
    +
    +let context = global_context ()
    +let the_module = create_module context "my cool jit"
    +let builder = builder context
    +let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
    +let double_type = double_type context
    +
    +let rec codegen_expr = function
    +  | Ast.Number n -> const_float double_type n
    +  | Ast.Variable name ->
    +      (try Hashtbl.find named_values name with
    +        | Not_found -> raise (Error "unknown variable name"))
    +  | Ast.Binary (op, lhs, rhs) ->
    +      let lhs_val = codegen_expr lhs in
    +      let rhs_val = codegen_expr rhs in
    +      begin
    +        match op with
    +        | '+' -> build_fadd lhs_val rhs_val "addtmp" builder
    +        | '-' -> build_fsub lhs_val rhs_val "subtmp" builder
    +        | '*' -> build_fmul lhs_val rhs_val "multmp" builder
    +        | '<' ->
    +            (* Convert bool 0/1 to double 0.0 or 1.0 *)
    +            let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
    +            build_uitofp i double_type "booltmp" builder
    +        | _ -> raise (Error "invalid binary operator")
    +      end
    +  | Ast.Call (callee, args) ->
    +      (* Look up the name in the module table. *)
    +      let callee =
    +        match lookup_function callee the_module with
    +        | Some callee -> callee
    +        | None -> raise (Error "unknown function referenced")
    +      in
    +      let params = params callee in
    +
    +      (* If argument mismatch error. *)
    +      if Array.length params == Array.length args then () else
    +        raise (Error "incorrect # arguments passed");
    +      let args = Array.map codegen_expr args in
    +      build_call callee args "calltmp" builder
    +
    +let codegen_proto = function
    +  | Ast.Prototype (name, args) ->
    +      (* Make the function type: double(double,double) etc. *)
    +      let doubles = Array.make (Array.length args) double_type in
    +      let ft = function_type double_type doubles in
    +      let f =
    +        match lookup_function name the_module with
    +        | None -> declare_function name ft the_module
    +
    +        (* If 'f' conflicted, there was already something named 'name'. If it
    +         * has a body, don't allow redefinition or reextern. *)
    +        | Some f ->
    +            (* If 'f' already has a body, reject this. *)
    +            if block_begin f <> At_end f then
    +              raise (Error "redefinition of function");
    +
    +            (* If 'f' took a different number of arguments, reject. *)
    +            if element_type (type_of f) <> ft then
    +              raise (Error "redefinition of function with different # args");
    +            f
    +      in
    +
    +      (* Set names for all arguments. *)
    +      Array.iteri (fun i a ->
    +        let n = args.(i) in
    +        set_value_name n a;
    +        Hashtbl.add named_values n a;
    +      ) (params f);
    +      f
    +
    +let codegen_func = function
    +  | Ast.Function (proto, body) ->
    +      Hashtbl.clear named_values;
    +      let the_function = codegen_proto proto in
    +
    +      (* Create a new basic block to start insertion into. *)
    +      let bb = append_block context "entry" the_function in
    +      position_at_end bb builder;
    +
    +      try
    +        let ret_val = codegen_expr body in
    +
    +        (* Finish off the function. *)
    +        let _ = build_ret ret_val builder in
    +
    +        (* Validate the generated code, checking for consistency. *)
    +        Llvm_analysis.assert_valid_function the_function;
    +
    +        the_function
    +      with e ->
    +        delete_function the_function;
    +        raise e
    +
    +
    + +
    toplevel.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Top-Level parsing and JIT Driver
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +
    +(* top ::= definition | external | expression | ';' *)
    +let rec main_loop stream =
    +  match Stream.peek stream with
    +  | None -> ()
    +
    +  (* ignore top-level semicolons. *)
    +  | Some (Token.Kwd ';') ->
    +      Stream.junk stream;
    +      main_loop stream
    +
    +  | Some token ->
    +      begin
    +        try match token with
    +        | Token.Def ->
    +            let e = Parser.parse_definition stream in
    +            print_endline "parsed a function definition.";
    +            dump_value (Codegen.codegen_func e);
    +        | Token.Extern ->
    +            let e = Parser.parse_extern stream in
    +            print_endline "parsed an extern.";
    +            dump_value (Codegen.codegen_proto e);
    +        | _ ->
    +            (* Evaluate a top-level expression into an anonymous function. *)
    +            let e = Parser.parse_toplevel stream in
    +            print_endline "parsed a top-level expr";
    +            dump_value (Codegen.codegen_func e);
    +        with Stream.Error s | Codegen.Error s ->
    +          (* Skip token for error recovery. *)
    +          Stream.junk stream;
    +          print_endline s;
    +      end;
    +      print_string "ready> "; flush stdout;
    +      main_loop stream
    +
    +
    + +
    toy.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Main driver code.
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +
    +let main () =
    +  (* Install standard binary operators.
    +   * 1 is the lowest precedence. *)
    +  Hashtbl.add Parser.binop_precedence '<' 10;
    +  Hashtbl.add Parser.binop_precedence '+' 20;
    +  Hashtbl.add Parser.binop_precedence '-' 20;
    +  Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
    +
    +  (* Prime the first token. *)
    +  print_string "ready> "; flush stdout;
    +  let stream = Lexer.lex (Stream.of_channel stdin) in
    +
    +  (* Run the main "interpreter loop" now. *)
    +  Toplevel.main_loop stream;
    +
    +  (* Print out all the generated code. *)
    +  dump_module Codegen.the_module
    +;;
    +
    +main ()
    +
    +
    +
    + +Next: Adding JIT and Optimizer Support +
    + + +
    +
    + Chris Lattner
    + Erick Tryzelaar
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-28 10:07:41 -0700 (Fri, 28 May 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl4.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl4.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl4.html (added) +++ www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl4.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1029 @@ + + + + + Kaleidoscope: Adding JIT and Optimizer Support + + + + + + + + +
    Kaleidoscope: Adding JIT and Optimizer Support
    + + + +
    +

    + Written by Chris Lattner + and Erick Tryzelaar +

    +
    + + + + + +
    + +

    Welcome to Chapter 4 of the "Implementing a language +with LLVM" tutorial. Chapters 1-3 described the implementation of a simple +language and added support for generating LLVM IR. This chapter describes +two new techniques: adding optimizer support to your language, and adding JIT +compiler support. These additions will demonstrate how to get nice, efficient code +for the Kaleidoscope language.

    + +
    + + + + + +
    + +

    Note: the default IRBuilder now always includes the constant +folding optimisations below.

    + +

    +Our demonstration for Chapter 3 is elegant and easy to extend. Unfortunately, +it does not produce wonderful code. For example, when compiling simple code, +we don't get obvious optimizations:

    + +
    +
    +ready> def test(x) 1+2+x;
    +Read function definition:
    +define double @test(double %x) {
    +entry:
    +        %addtmp = fadd double 1.000000e+00, 2.000000e+00
    +        %addtmp1 = fadd double %addtmp, %x
    +        ret double %addtmp1
    +}
    +
    +
    + +

    This code is a very, very literal transcription of the AST built by parsing +the input. As such, this transcription lacks optimizations like constant folding +(we'd like to get "add x, 3.0" in the example above) as well as other +more important optimizations. Constant folding, in particular, is a very common +and very important optimization: so much so that many language implementors +implement constant folding support in their AST representation.

    + +

    With LLVM, you don't need this support in the AST. Since all calls to build +LLVM IR go through the LLVM builder, it would be nice if the builder itself +checked to see if there was a constant folding opportunity when you call it. +If so, it could just do the constant fold and return the constant instead of +creating an instruction. This is exactly what the LLVMFoldingBuilder +class does. + +
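    The idea of a folding builder can be sketched in a few lines of plain OCaml. This is a toy model under stated assumptions, not the LLVM API: the value type and the names build_fadd_plain and build_fadd_folding are hypothetical and exist only for this illustration.

```ocaml
(* Toy stand-in for LLVM values; NOT the LLVM API. *)
type value =
  | Const of float          (* a compile-time constant *)
  | Arg of string           (* a function argument *)
  | FAdd of value * value   (* an emitted add instruction *)

(* Plain builder (hypothetical): always emits an instruction. *)
let build_fadd_plain lhs rhs = FAdd (lhs, rhs)

(* Folding builder (hypothetical): when both operands are constants,
   return the folded constant instead of emitting an instruction. *)
let build_fadd_folding lhs rhs =
  match lhs, rhs with
  | Const a, Const b -> Const (a +. b)
  | _ -> FAdd (lhs, rhs)

let () =
  (* 1+2+x parses as (1+2)+x. *)
  let plain =
    build_fadd_plain (build_fadd_plain (Const 1.0) (Const 2.0)) (Arg "x") in
  let folded =
    build_fadd_folding (build_fadd_folding (Const 1.0) (Const 2.0)) (Arg "x") in
  assert (plain = FAdd (FAdd (Const 1.0, Const 2.0), Arg "x"));
  assert (folded = FAdd (Const 3.0, Arg "x"))
```

    The caller's code is identical in both cases; only the builder function differs, which is why switching builders needs no other changes.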

    All we did was switch from LLVMBuilder to +LLVMFoldingBuilder. Though we change no other code, we now have all of our +instructions implicitly constant folded without us having to do anything +about it. For example, the input above now compiles to:

    + +
    +
    +ready> def test(x) 1+2+x;
    +Read function definition:
    +define double @test(double %x) {
    +entry:
    +        %addtmp = fadd double 3.000000e+00, %x
    +        ret double %addtmp
    +}
    +
    +
    + +

    Well, that was easy :). In practice, we recommend always using +LLVMFoldingBuilder when generating code like this. It has no +"syntactic overhead" for its use (you don't have to uglify your compiler with +constant checks everywhere) and it can dramatically reduce the amount of +LLVM IR that is generated in some cases (particularly for languages with a macro +preprocessor or that use a lot of constants).

    + +

    On the other hand, the LLVMFoldingBuilder is limited by the fact +that it does all of its analysis inline with the code as it is built. If you +take a slightly more complex example:

    + +
    +
    +ready> def test(x) (1+2+x)*(x+(1+2));
    +ready> Read function definition:
    +define double @test(double %x) {
    +entry:
    +        %addtmp = fadd double 3.000000e+00, %x
    +        %addtmp1 = fadd double %x, 3.000000e+00
    +        %multmp = fmul double %addtmp, %addtmp1
    +        ret double %multmp
    +}
    +
    +
    + +

    In this case, the LHS and RHS of the multiplication are the same value. We'd +really like to see this generate "tmp = x+3; result = tmp*tmp;" instead +of computing "x+3" twice.

    + +

    Unfortunately, no amount of local analysis will be able to detect and correct +this. This requires two transformations: reassociation of expressions (to +make the adds lexically identical) and Common Subexpression Elimination (CSE) +to delete the redundant add instruction. Fortunately, LLVM provides a broad +range of optimizations that you can use, in the form of "passes".
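    To make the two transformations concrete, here is a minimal sketch on a toy expression type. It assumes a much simpler setting than LLVM's real reassociate and GVN passes: canonicalize and distinct_adds are hypothetical helpers that only move a constant operand of an add to the right-hand side and count structurally distinct adds.

```ocaml
type expr =
  | Num of float
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

(* Stand-in for reassociation: fold constant adds and put a constant
   operand on the right, so equal adds become structurally identical. *)
let rec canonicalize = function
  | Add (a, b) ->
      (match canonicalize a, canonicalize b with
       | Num x, Num y -> Num (x +. y)
       | (Num _ as c), other -> Add (other, c)
       | a', b' -> Add (a', b'))
  | Mul (a, b) -> Mul (canonicalize a, canonicalize b)
  | e -> e

(* Collect every Add node in the expression. *)
let rec adds = function
  | Num _ | Var _ -> []
  | Add (a, b) as e -> e :: (adds a @ adds b)
  | Mul (a, b) -> adds a @ adds b

(* Stand-in for CSE: identical adds need to be computed only once. *)
let distinct_adds e = List.length (List.sort_uniq compare (adds e))

let () =
  (* (3+x)*(x+3), the form the folding builder produced. *)
  let e = Mul (Add (Num 3.0, Var "x"), Add (Var "x", Num 3.0)) in
  assert (distinct_adds e = 2);
  assert (distinct_adds (canonicalize e) = 1)
```

    Neither step is enough alone: without canonicalization the two adds differ structurally, and without sharing both identical adds would still be computed.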

    + +
    + + + + + +
    + +

    LLVM provides many optimization passes, which do many different sorts of +things and have different tradeoffs. Unlike other systems, LLVM doesn't hold +to the mistaken notion that one set of optimizations is right for all languages +and for all situations. LLVM allows a compiler implementor to make complete +decisions about what optimizations to use, in which order, and in what +situation.

    + +

    As a concrete example, LLVM supports "whole module" passes, which look +across as large a body of code as they can (often a whole file, but if run +at link time, this can be a substantial portion of the whole program). It also +supports and includes "per-function" passes, which just operate on a single +function at a time, without looking at other functions. For more information +on passes and how they are run, see the How +to Write a Pass document and the List of LLVM +Passes.

    + +

    For Kaleidoscope, we are currently generating functions on the fly, one at +a time, as the user types them in. We aren't shooting for the ultimate +optimization experience in this setting, but we also want to catch the easy and +quick stuff where possible. As such, we will choose to run a few per-function +optimizations as the user types the function in. If we wanted to make a "static +Kaleidoscope compiler", we would use exactly the code we have now, except that +we would defer running the optimizer until the entire file has been parsed.

    + +

    In order to get per-function optimizations going, we need to set up a +Llvm.PassManager to hold and +organize the LLVM optimizations that we want to run. Once we have that, we can +add a set of optimizations to run. The code looks like this:

    + +
    +
    +  (* Create the JIT. *)
    +  let the_execution_engine = ExecutionEngine.create Codegen.the_module in
    +  let the_fpm = PassManager.create_function Codegen.the_module in
    +
    +  (* Set up the optimizer pipeline.  Start with registering info about how the
    +   * target lays out data structures. *)
    +  TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
    +
    +  (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
    +  add_instruction_combining the_fpm;
    +
    +  (* reassociate expressions. *)
    +  add_reassociation the_fpm;
    +
    +  (* Eliminate Common SubExpressions. *)
    +  add_gvn the_fpm;
    +
    +  (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
    +  add_cfg_simplification the_fpm;
    +
    +  ignore (PassManager.initialize the_fpm);
    +
    +  (* Run the main "interpreter loop" now. *)
    +  Toplevel.main_loop the_fpm the_execution_engine stream;
    +
    +
    + +

    The meat of the matter here is the definition of "the_fpm". It +requires the_module to construct itself. Once it is +set up, we use a series of "add" calls to add a bunch of LLVM passes. The +first pass is basically boilerplate: it adds a pass so that later optimizations +know how the data structures in the program are laid out. The +"the_execution_engine" variable is related to the JIT, which we will +get to in the next section.

    + +

    In this case, we choose to add 4 optimization passes. The passes we chose +here are a pretty standard set of "cleanup" optimizations that are useful for +a wide variety of code. I won't delve into what they do but, believe me, +they are a good starting place :).
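    Why the order of passes matters can be shown with a toy pass manager. This is a sketch under stated assumptions, not the Llvm.PassManager API: a "function" is modeled as a list of instruction names, and remove_nops, merge_duplicates and run_function are hypothetical names for this illustration only.

```ocaml
(* Toy model: a function is a list of instruction names, and a pass
   rewrites one function into another. NOT the LLVM API. *)
type func = string list
type pass = func -> func

(* Hypothetical pass: delete "nop" instructions. *)
let remove_nops : pass = List.filter (fun i -> i <> "nop")

(* Hypothetical pass: collapse runs of identical adjacent instructions. *)
let rec merge_duplicates : pass = function
  | a :: (b :: _ as rest) when a = b -> merge_duplicates rest
  | a :: rest -> a :: merge_duplicates rest
  | [] -> []

(* Run the passes in the order given and report whether anything changed. *)
let run_function (passes : pass list) (f : func) : func * bool =
  List.fold_left
    (fun (cur, changed) p ->
      let next = p cur in
      (next, changed || next <> cur))
    (f, false) passes

let () =
  let f = [ "nop"; "add"; "nop"; "add" ] in
  (* Removing nops first exposes the duplicate for the second pass. *)
  assert (run_function [ remove_nops; merge_duplicates ] f = ([ "add" ], true));
  (* In the other order the duplicates are not yet adjacent. *)
  assert (run_function [ merge_duplicates; remove_nops ] f
          = ([ "add"; "add" ], true))
```

    The same reasoning applies to the pipeline above: reassociation runs before the pass that eliminates common subexpressions so that equivalent adds look identical by the time they are compared.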

    + +

    Once the Llvm.PassManager is set up, we need to make use of it. +We do this by running it after our newly created function is constructed (in +Codegen.codegen_func), but before it is returned to the client:

    + +
    +
    +let codegen_func the_fpm = function
    +      ...
    +      try
    +        let ret_val = codegen_expr body in
    +
    +        (* Finish off the function. *)
    +        let _ = build_ret ret_val builder in
    +
    +        (* Validate the generated code, checking for consistency. *)
    +        Llvm_analysis.assert_valid_function the_function;
    +
    +        (* Optimize the function. *)
    +        let _ = PassManager.run_function the_function the_fpm in
    +
    +        the_function
    +
    +
    + +

    As you can see, this is pretty straightforward. The the_fpm +optimizes and updates the LLVM Function* in place, improving (hopefully) its +body. With this in place, we can try our test above again:

    + +
    +
    +ready> def test(x) (1+2+x)*(x+(1+2));
    +ready> Read function definition:
    +define double @test(double %x) {
    +entry:
    +        %addtmp = fadd double %x, 3.000000e+00
    +        %multmp = fmul double %addtmp, %addtmp
    +        ret double %multmp
    +}
    +
    +
    + +

    As expected, we now get our nicely optimized code, saving a floating point +add instruction from every execution of this function.

    + +

    LLVM provides a wide variety of optimizations that can be used in certain +circumstances. Some documentation about the various +passes is available, but it isn't very complete. Another good source of +ideas can come from looking at the passes that llvm-gcc or +llvm-ld run to get started. The "opt" tool allows you to +experiment with passes from the command line, so you can see if they do +anything.

    + +

    Now that we have reasonable code coming out of our front-end, let's talk about +executing it!

    + +
    + + + + + +
    + +

    Code that is available in LLVM IR can have a wide variety of tools +applied to it. For example, you can run optimizations on it (as we did above), +you can dump it out in textual or binary forms, you can compile the code to an +assembly file (.s) for some target, or you can JIT compile it. The nice thing +about the LLVM IR representation is that it is the "common currency" between +many different parts of the compiler. +

    + +

    In this section, we'll add JIT compiler support to our interpreter. The +basic idea that we want for Kaleidoscope is to have the user enter function +bodies as they do now, but immediately evaluate the top-level expressions they +type in. For example, if they type in "1 + 2;", we should evaluate and print +out 3. If they define a function, they should be able to call it from the +command line.

    + +

    In order to do this, we first declare and initialize the JIT. This is done +by adding a let binding in main that creates the execution engine:

    + +
    +
    +...
    +let main () =
    +  ...
    +  (* Create the JIT. *)
    +  let the_execution_engine = ExecutionEngine.create Codegen.the_module in
    +  ...
    +
    +
    + +

    This creates an abstract "Execution Engine" which can be either a JIT +compiler or the LLVM interpreter. LLVM will automatically pick a JIT compiler +for you if one is available for your platform, otherwise it will fall back to +the interpreter.

    + +

    Once the Llvm_executionengine.ExecutionEngine.t is created, the JIT +is ready to be used. There are a variety of APIs that are useful, but the +simplest one is the "Llvm_executionengine.ExecutionEngine.run_function" +function. This function JIT compiles the specified LLVM Function, runs it with the +given arguments, and returns the result as a GenericValue. In our case, this means that we +can change the code that parses a top-level expression to look like this:

    + +
    +
    +            (* Evaluate a top-level expression into an anonymous function. *)
    +            let e = Parser.parse_toplevel stream in
    +            print_endline "parsed a top-level expr";
    +            let the_function = Codegen.codegen_func the_fpm e in
    +            dump_value the_function;
    +
    +            (* JIT the function and run it, returning a generic value. *)
    +            let result = ExecutionEngine.run_function the_function [||]
    +              the_execution_engine in
    +
    +            print_string "Evaluated to ";
    +            print_float (GenericValue.as_float Codegen.double_type result);
    +            print_newline ();
    +
    +
    + +

    Recall that we compile top-level expressions into a self-contained LLVM +function that takes no arguments and returns the computed double. Because the +LLVM JIT compiler matches the native platform ABI, this means that you can just +cast the result pointer to a function pointer of that type and call it directly. +This means there is no difference between JIT compiled code and native machine +code that is statically linked into your application.

    + +

    With just these two changes, let's see how Kaleidoscope works now!

    + +
    +
    +ready> 4+5;
    +define double @""() {
    +entry:
    +        ret double 9.000000e+00
    +}
    +
    +Evaluated to 9.000000
    +
    +
    + +

    Well, this looks like it is basically working. The dump of the function +shows the "no argument function that always returns double" that we synthesize +for each top-level expression that is typed in. This demonstrates very basic +functionality, but can we do more?

    + +
    +
    +ready> def testfunc(x y) x + y*2; 
    +Read function definition:
    +define double @testfunc(double %x, double %y) {
    +entry:
    +        %multmp = fmul double %y, 2.000000e+00
    +        %addtmp = fadd double %multmp, %x
    +        ret double %addtmp
    +}
    +
    +ready> testfunc(4, 10);
    +define double @""() {
    +entry:
    +        %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01)
    +        ret double %calltmp
    +}
    +
    +Evaluated to 24.000000
    +
    +
    + +

    This illustrates that we can now call user code, but there is something a bit +subtle going on here. Note that we only invoke the JIT on the anonymous +functions that call testfunc, but we never invoke it +on testfunc itself. What actually happened here is that the JIT +scanned for all non-JIT'd functions transitively called from the anonymous +function and compiled all of them before returning +from run_function.
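    The "transitively called" part can be modeled as a walk over a call graph. The following is a minimal sketch in plain OCaml, assuming a hand-written call graph; call_graph and reachable are hypothetical names, and this is not how the JIT is actually implemented.

```ocaml
(* Toy call graph: each function name maps to the names it calls.
   "" is the anonymous top-level function, as in the transcript above.
   "other" is defined but never called from it. *)
let call_graph = [ ("", [ "testfunc" ]); ("testfunc", []); ("other", []) ]

(* Collect every function reachable from [name], each exactly once. *)
let rec reachable graph seen name =
  if List.mem name seen then seen
  else
    let direct = try List.assoc name graph with Not_found -> [] in
    List.fold_left (reachable graph) (name :: seen) direct

let () =
  let to_compile = List.sort compare (reachable call_graph [] "") in
  (* The anonymous function and testfunc are needed; "other" is not. *)
  assert (to_compile = [ ""; "testfunc" ])
```

    The seen list doubles as the result and as a guard against visiting a function twice, which also keeps the walk finite for recursive functions.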

    + +

    The JIT provides a number of other more advanced interfaces for things like +freeing allocated machine code, rejit'ing functions to update them, etc. +However, even with this simple code, we get some surprisingly powerful +capabilities - check this out (the dump of the anonymous functions is omitted; +you should get the idea by now :) ):

    + +
    +
    +ready> extern sin(x);
    +Read extern:
    +declare double @sin(double)
    +
    +ready> extern cos(x);
    +Read extern:
    +declare double @cos(double)
    +
    +ready> sin(1.0);
    +Evaluated to 0.841471
    +
    +ready> def foo(x) sin(x)*sin(x) + cos(x)*cos(x);
    +Read function definition:
    +define double @foo(double %x) {
    +entry:
    +        %calltmp = call double @sin(double %x)
    +        %multmp = fmul double %calltmp, %calltmp
    +        %calltmp2 = call double @cos(double %x)
    +        %multmp4 = fmul double %calltmp2, %calltmp2
    +        %addtmp = fadd double %multmp, %multmp4
    +        ret double %addtmp
    +}
    +
    +ready> foo(4.0);
    +Evaluated to 1.000000
    +
    +
    + +

    Whoa, how does the JIT know about sin and cos? The answer is surprisingly +simple: in this example, the JIT started execution of a function and got to a +function call. It realized that the function was not yet JIT compiled and +invoked the standard set of routines to resolve the function. In this case, +there is no body defined for the function, so the JIT ended up calling +"dlsym("sin")" on the Kaleidoscope process itself. Since +"sin" is defined within the JIT's address space, it simply patches up +calls in the module to call the libm version of sin directly.
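    The lookup order described above (bodies defined in the module first, then symbols of the host process) can be sketched as follows. This is a toy model under stated assumptions: module_defs, host_symbols, resolve and call are hypothetical names, and an association list stands in for what dlsym would find in the process.

```ocaml
(* Functions with a body defined in the module (toy model). *)
let module_defs : (string, float -> float) Hashtbl.t = Hashtbl.create 8

(* Stand-in for symbols that dlsym would find in the host process. *)
let host_symbols : (string * (float -> float)) list =
  [ ("sin", sin); ("cos", cos) ]

(* Prefer a definition from the module; otherwise fall back to the host. *)
let resolve name =
  match Hashtbl.find_opt module_defs name with
  | Some f -> Some f
  | None -> List.assoc_opt name host_symbols

let call name x =
  match resolve name with
  | Some f -> f x
  | None -> failwith ("unresolved symbol: " ^ name)

let () =
  (* foo(x) = sin(x)*sin(x) + cos(x)*cos(x), as in the session above. *)
  Hashtbl.replace module_defs "foo" (fun x ->
      (call "sin" x *. call "sin" x) +. (call "cos" x *. call "cos" x));
  assert (abs_float (call "foo" 4.0 -. 1.0) < 1e-9);
  assert (match resolve "nosuch" with None -> true | Some _ -> false)
```

    Here "foo" resolves to the module's own definition, while "sin" and "cos" have no body in the module and fall through to the host table.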

    + +

    The LLVM JIT provides a number of interfaces (look in the +llvm_executionengine.mli file) for controlling how unknown functions +get resolved. It allows you to establish explicit mappings between IR objects +and addresses (useful for LLVM global variables that you want to map to static +tables, for example), allows you to dynamically decide on the fly based on the +function name, and even allows you to have the JIT compile functions lazily the +first time they're called.

    + +

    One interesting application of this is that we can now extend the language +by writing arbitrary C code to implement operations. For example, if we add: +

    + +
    +
    +#include <stdio.h>
    +
    +/* putchard - putchar that takes a double and returns 0. */
    +double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +
    + +

    Now we can produce simple output to the console by using things like: +"extern putchard(x); putchard(120);", which prints a lowercase 'x' on +the console (120 is the ASCII code for 'x'). Similar code could be used to +implement file I/O, console input, and many other capabilities in +Kaleidoscope.

    + +

    This completes the JIT and optimizer chapter of the Kaleidoscope tutorial. At +this point, we can compile a non-Turing-complete programming language, optimize +and JIT compile it in a user-driven way. Next up we'll look into extending the language with control flow +constructs, tackling some interesting LLVM IR issues along the way.

    + +
    + + + + + +
    + +

    +Here is the complete code listing for our running example, enhanced with the +LLVM JIT and optimizer. To build this example, use: +

    + +
    +
    +# Compile
    +ocamlbuild toy.byte
    +# Run
    +./toy.byte
    +
    +
    + +

    Here is the code:

    + +
    +
    _tags:
    +
    +
    +<{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
    +<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
    +<*.{byte,native}>: use_llvm_executionengine, use_llvm_target
    +<*.{byte,native}>: use_llvm_scalar_opts, use_bindings
    +
    +
    + +
    myocamlbuild.ml:
    +
    +
    +open Ocamlbuild_plugin;;
    +
    +ocaml_lib ~extern:true "llvm";;
    +ocaml_lib ~extern:true "llvm_analysis";;
    +ocaml_lib ~extern:true "llvm_executionengine";;
    +ocaml_lib ~extern:true "llvm_target";;
    +ocaml_lib ~extern:true "llvm_scalar_opts";;
    +
    +flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);;
    +dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
    +
    +
    + +
    token.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer Tokens
    + *===----------------------------------------------------------------------===*)
    +
    +(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
    + * these others for known things. *)
    +type token =
    +  (* commands *)
    +  | Def | Extern
    +
    +  (* primary *)
    +  | Ident of string | Number of float
    +
    +  (* unknown *)
    +  | Kwd of char
    +
    +
    + +
    lexer.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer
    + *===----------------------------------------------------------------------===*)
    +
    +let rec lex = parser
    +  (* Skip any whitespace. *)
    +  | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
    +
    +  (* identifier: [a-zA-Z][a-zA-Z0-9]* *)
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +
    +  (* number: [0-9.]+ *)
    +  | [< ' ('0' .. '9' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +
    +  (* Comment until end of line. *)
    +  | [< ' ('#'); stream >] ->
    +      lex_comment stream
    +
    +  (* Otherwise, just return the character as its ascii value. *)
    +  | [< 'c; stream >] ->
    +      [< 'Token.Kwd c; lex stream >]
    +
    +  (* end of stream. *)
    +  | [< >] -> [< >]
    +
    +and lex_number buffer = parser
    +  | [< ' ('0' .. '9' | '.' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +  | [< stream=lex >] ->
    +      [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
    +
    +and lex_ident buffer = parser
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +  | [< stream=lex >] ->
    +      match Buffer.contents buffer with
    +      | "def" -> [< 'Token.Def; stream >]
    +      | "extern" -> [< 'Token.Extern; stream >]
    +      | id -> [< 'Token.Ident id; stream >]
    +
    +and lex_comment = parser
    +  | [< ' ('\n'); stream=lex >] -> stream
    +  | [< 'c; e=lex_comment >] -> e
    +  | [< >] -> [< >]
    +
    +
    + +
    ast.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Abstract Syntax Tree (aka Parse Tree)
    + *===----------------------------------------------------------------------===*)
    +
    +(* expr - Base type for all expression nodes. *)
    +type expr =
    +  (* variant for numeric literals like "1.0". *)
    +  | Number of float
    +
    +  (* variant for referencing a variable, like "a". *)
    +  | Variable of string
    +
    +  (* variant for a binary operator. *)
    +  | Binary of char * expr * expr
    +
    +  (* variant for function calls. *)
    +  | Call of string * expr array
    +
    +(* proto - This type represents the "prototype" for a function, which captures
    + * its name, and its argument names (thus implicitly the number of arguments the
    + * function takes). *)
    +type proto = Prototype of string * string array
    +
    +(* func - This type represents a function definition itself. *)
    +type func = Function of proto * expr
    +
    +
    + +
    parser.ml:
    +
    +
    +(*===---------------------------------------------------------------------===
    + * Parser
    + *===---------------------------------------------------------------------===*)
    +
    +(* binop_precedence - This holds the precedence for each binary operator that is
    + * defined *)
    +let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
    +
    +(* precedence - Get the precedence of the pending binary operator token. *)
    +let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
    +
    +(* primary
    + *   ::= identifier
    + *   ::= numberexpr
    + *   ::= parenexpr *)
    +let rec parse_primary = parser
    +  (* numberexpr ::= number *)
    +  | [< 'Token.Number n >] -> Ast.Number n
    +
    +  (* parenexpr ::= '(' expression ')' *)
    +  | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
    +
    +  (* identifierexpr
    +   *   ::= identifier
    +   *   ::= identifier '(' argumentexpr ')' *)
    +  | [< 'Token.Ident id; stream >] ->
    +      let rec parse_args accumulator = parser
    +        | [< e=parse_expr; stream >] ->
    +            begin parser
    +              | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
    +              | [< >] -> e :: accumulator
    +            end stream
    +        | [< >] -> accumulator
    +      in
    +      let rec parse_ident id = parser
    +        (* Call. *)
    +        | [< 'Token.Kwd '(';
    +             args=parse_args [];
    +             'Token.Kwd ')' ?? "expected ')'">] ->
    +            Ast.Call (id, Array.of_list (List.rev args))
    +
    +        (* Simple variable ref. *)
    +        | [< >] -> Ast.Variable id
    +      in
    +      parse_ident id stream
    +
    +  | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
    +
    +(* binoprhs
    + *   ::= ('+' primary)* *)
    +and parse_bin_rhs expr_prec lhs stream =
    +  match Stream.peek stream with
    +  (* If this is a binop, find its precedence. *)
    +  | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
    +      let token_prec = precedence c in
    +
    +      (* If this is a binop that binds at least as tightly as the current binop,
    +       * consume it, otherwise we are done. *)
    +      if token_prec < expr_prec then lhs else begin
    +        (* Eat the binop. *)
    +        Stream.junk stream;
    +
    +        (* Parse the primary expression after the binary operator. *)
    +        let rhs = parse_primary stream in
    +
    +        (* Okay, we know this is a binop. *)
    +        let rhs =
    +          match Stream.peek stream with
    +          | Some (Token.Kwd c2) ->
    +              (* If BinOp binds less tightly with rhs than the operator after
    +               * rhs, let the pending operator take rhs as its lhs. *)
    +              let next_prec = precedence c2 in
    +              if token_prec < next_prec
    +              then parse_bin_rhs (token_prec + 1) rhs stream
    +              else rhs
    +          | _ -> rhs
    +        in
    +
    +        (* Merge lhs/rhs. *)
    +        let lhs = Ast.Binary (c, lhs, rhs) in
    +        parse_bin_rhs expr_prec lhs stream
    +      end
    +  | _ -> lhs
    +
    +(* expression
    + *   ::= primary binoprhs *)
    +and parse_expr = parser
    +  | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
    +
    +(* prototype
    + *   ::= id '(' id* ')' *)
    +let parse_prototype =
    +  let rec parse_args accumulator = parser
    +    | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    +    | [< >] -> accumulator
    +  in
    +
    +  parser
    +  | [< 'Token.Ident id;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +       args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      (* success. *)
    +      Ast.Prototype (id, Array.of_list (List.rev args))
    +
    +  | [< >] ->
    +      raise (Stream.Error "expected function name in prototype")
    +
    +(* definition ::= 'def' prototype expression *)
    +let parse_definition = parser
    +  | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
    +      Ast.Function (p, e)
    +
    +(* toplevelexpr ::= expression *)
    +let parse_toplevel = parser
    +  | [< e=parse_expr >] ->
    +      (* Make an anonymous proto. *)
    +      Ast.Function (Ast.Prototype ("", [||]), e)
    +
    +(*  external ::= 'extern' prototype *)
    +let parse_extern = parser
    +  | [< 'Token.Extern; e=parse_prototype >] -> e
    +
    +
    + +
    codegen.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Code Generation
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +
    +exception Error of string
    +
    +let context = global_context ()
    +let the_module = create_module context "my cool jit"
    +let builder = builder context
    +let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
    +let double_type = double_type context
    +
    +let rec codegen_expr = function
    +  | Ast.Number n -> const_float double_type n
    +  | Ast.Variable name ->
    +      (try Hashtbl.find named_values name with
    +        | Not_found -> raise (Error "unknown variable name"))
    +  | Ast.Binary (op, lhs, rhs) ->
    +      let lhs_val = codegen_expr lhs in
    +      let rhs_val = codegen_expr rhs in
    +      begin
    +        match op with
    +        | '+' -> build_fadd lhs_val rhs_val "addtmp" builder
    +        | '-' -> build_fsub lhs_val rhs_val "subtmp" builder
    +        | '*' -> build_fmul lhs_val rhs_val "multmp" builder
    +        | '<' ->
    +            (* Convert bool 0/1 to double 0.0 or 1.0 *)
    +            let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
    +            build_uitofp i double_type "booltmp" builder
    +        | _ -> raise (Error "invalid binary operator")
    +      end
    +  | Ast.Call (callee, args) ->
    +      (* Look up the name in the module table. *)
    +      let callee =
    +        match lookup_function callee the_module with
    +        | Some callee -> callee
    +        | None -> raise (Error "unknown function referenced")
    +      in
    +      let params = params callee in
    +
    +      (* If argument mismatch error. *)
    +      if Array.length params == Array.length args then () else
    +        raise (Error "incorrect # arguments passed");
    +      let args = Array.map codegen_expr args in
    +      build_call callee args "calltmp" builder
    +
    +let codegen_proto = function
    +  | Ast.Prototype (name, args) ->
    +      (* Make the function type: double(double,double) etc. *)
    +      let doubles = Array.make (Array.length args) double_type in
    +      let ft = function_type double_type doubles in
    +      let f =
    +        match lookup_function name the_module with
    +        | None -> declare_function name ft the_module
    +
    +        (* If 'f' conflicted, there was already something named 'name'. If it
    +         * has a body, don't allow redefinition or reextern. *)
    +        | Some f ->
    +            (* If 'f' already has a body, reject this. *)
    +            if block_begin f <> At_end f then
    +              raise (Error "redefinition of function");
    +
    +            (* If 'f' took a different number of arguments, reject. *)
    +            if element_type (type_of f) <> ft then
    +              raise (Error "redefinition of function with different # args");
    +            f
    +      in
    +
    +      (* Set names for all arguments. *)
    +      Array.iteri (fun i a ->
    +        let n = args.(i) in
    +        set_value_name n a;
    +        Hashtbl.add named_values n a;
    +      ) (params f);
    +      f
    +
    +let codegen_func the_fpm = function
    +  | Ast.Function (proto, body) ->
    +      Hashtbl.clear named_values;
    +      let the_function = codegen_proto proto in
    +
    +      (* Create a new basic block to start insertion into. *)
    +      let bb = append_block context "entry" the_function in
    +      position_at_end bb builder;
    +
    +      try
    +        let ret_val = codegen_expr body in
    +
    +        (* Finish off the function. *)
    +        let _ = build_ret ret_val builder in
    +
    +        (* Validate the generated code, checking for consistency. *)
    +        Llvm_analysis.assert_valid_function the_function;
    +
    +        (* Optimize the function. *)
    +        let _ = PassManager.run_function the_function the_fpm in
    +
    +        the_function
    +      with e ->
    +        delete_function the_function;
    +        raise e
    +
    +
    + +
    toplevel.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Top-Level parsing and JIT Driver
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +open Llvm_executionengine
    +
    +(* top ::= definition | external | expression | ';' *)
    +let rec main_loop the_fpm the_execution_engine stream =
    +  match Stream.peek stream with
    +  | None -> ()
    +
    +  (* ignore top-level semicolons. *)
    +  | Some (Token.Kwd ';') ->
    +      Stream.junk stream;
    +      main_loop the_fpm the_execution_engine stream
    +
    +  | Some token ->
    +      begin
    +        try match token with
    +        | Token.Def ->
    +            let e = Parser.parse_definition stream in
    +            print_endline "parsed a function definition.";
    +            dump_value (Codegen.codegen_func the_fpm e);
    +        | Token.Extern ->
    +            let e = Parser.parse_extern stream in
    +            print_endline "parsed an extern.";
    +            dump_value (Codegen.codegen_proto e);
    +        | _ ->
    +            (* Evaluate a top-level expression into an anonymous function. *)
    +            let e = Parser.parse_toplevel stream in
    +            print_endline "parsed a top-level expr";
    +            let the_function = Codegen.codegen_func the_fpm e in
    +            dump_value the_function;
    +
    +            (* JIT the function, returning a function pointer. *)
    +            let result = ExecutionEngine.run_function the_function [||]
    +              the_execution_engine in
    +
    +            print_string "Evaluated to ";
    +            print_float (GenericValue.as_float Codegen.double_type result);
    +            print_newline ();
    +        with Stream.Error s | Codegen.Error s ->
    +          (* Skip token for error recovery. *)
    +          Stream.junk stream;
    +          print_endline s;
    +      end;
    +      print_string "ready> "; flush stdout;
    +      main_loop the_fpm the_execution_engine stream
    +
    +
    + +
    toy.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Main driver code.
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +open Llvm_executionengine
    +open Llvm_target
    +open Llvm_scalar_opts
    +
    +let main () =
    +  ignore (initialize_native_target ());
    +
    +  (* Install standard binary operators.
    +   * 1 is the lowest precedence. *)
    +  Hashtbl.add Parser.binop_precedence '<' 10;
    +  Hashtbl.add Parser.binop_precedence '+' 20;
    +  Hashtbl.add Parser.binop_precedence '-' 20;
    +  Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
    +
    +  (* Prime the first token. *)
    +  print_string "ready> "; flush stdout;
    +  let stream = Lexer.lex (Stream.of_channel stdin) in
    +
    +  (* Create the JIT. *)
    +  let the_execution_engine = ExecutionEngine.create Codegen.the_module in
    +  let the_fpm = PassManager.create_function Codegen.the_module in
    +
    +  (* Set up the optimizer pipeline.  Start with registering info about how the
    +   * target lays out data structures. *)
    +  TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
    +
    +  (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
    +  add_instruction_combination the_fpm;
    +
    +  (* reassociate expressions. *)
    +  add_reassociation the_fpm;
    +
    +  (* Eliminate Common SubExpressions. *)
    +  add_gvn the_fpm;
    +
    +  (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
    +  add_cfg_simplification the_fpm;
    +
    +  ignore (PassManager.initialize the_fpm);
    +
    +  (* Run the main "interpreter loop" now. *)
    +  Toplevel.main_loop the_fpm the_execution_engine stream;
    +
    +  (* Print out all the generated code. *)
    +  dump_module Codegen.the_module
    +;;
    +
    +main ()
    +
    +
    + +
    bindings.c
    +
    +
    +#include <stdio.h>
    +
    +/* putchard - putchar that takes a double and returns 0. */
    +extern double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +
    +
    + +Next: Extending the language: control flow +
    + + +
    +
    + Chris Lattner
    + Erick Tryzelaar
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-28 10:07:41 -0700 (Fri, 28 May 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl5.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl5.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl5.html (added) +++ www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl5.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1569 @@ + + + + + Kaleidoscope: Extending the Language: Control Flow + + + + + + + + +
    Kaleidoscope: Extending the Language: Control Flow
    + + + +
    +

    + Written by Chris Lattner + and Erick Tryzelaar +

    +
    + + + + + +
    + +

    Welcome to Chapter 5 of the "Implementing a language +with LLVM" tutorial. Parts 1-4 described the implementation of the simple +Kaleidoscope language and included support for generating LLVM IR, followed by +optimizations and a JIT compiler. Unfortunately, as presented, Kaleidoscope is +mostly useless: it has no control flow other than call and return. This means +that you can't have conditional branches in the code, significantly limiting its +power. In this episode of "build that compiler", we'll extend Kaleidoscope to +have an if/then/else expression plus a simple 'for' loop.

    + +
    + + + + + +
    + +

    +Extending Kaleidoscope to support if/then/else is quite straightforward. It +basically requires adding support for this "new" concept to the lexer, +parser, AST, and LLVM code emitter. This example is nice, because it shows how +easy it is to "grow" a language over time, incrementally extending it as new +ideas are discovered.

    + +

    Before we get going on "how" we add this extension, let's talk about "what" we +want. The basic idea is that we want to be able to write this sort of thing: +

    + +
    +
    +def fib(x)
    +  if x < 3 then
    +    1
    +  else
    +    fib(x-1)+fib(x-2);
    +
    +
    + +

    In Kaleidoscope, every construct is an expression: there are no statements. +As such, the if/then/else expression needs to return a value like any other. +Since we're using a mostly functional form, we'll have it evaluate its +conditional, then return the 'then' or 'else' value based on how the condition +was resolved. This is very similar to the C "?:" expression.

    + +

    The semantics of the if/then/else expression is that it evaluates the +condition to a boolean equality value: 0.0 is considered to be false and +everything else is considered to be true. +If the condition is true, the first subexpression is evaluated and returned; if +the condition is false, the second subexpression is evaluated and returned. +Since Kaleidoscope allows side effects, this behavior is important to nail down. +
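The rule above can be sketched as a tiny interpreter. This is only an illustration: the `expr` type, `Print`, and `printed` are hypothetical names invented here, not part of the tutorial's Ast module, and NaN conditions are not modelled. It shows that exactly one branch is evaluated, which matters once expressions can have side effects.

```ocaml
(* Hypothetical mini-AST for illustration; not the tutorial's Ast module. *)
type expr =
  | Number of float
  | Print of float               (* side effect: records the value, yields 0.0 *)
  | If of expr * expr * expr

(* Values "printed" so far, most recent first. *)
let printed : float list ref = ref []

let rec eval = function
  | Number n -> n
  | Print n ->
      printed := n :: !printed;
      0.0
  | If (cond, then_, else_) ->
      (* 0.0 is false; any other value is true. Only one branch is evaluated. *)
      if eval cond <> 0.0 then eval then_ else eval else_

let () =
  (* The condition is non-zero, so the 'else' branch and its side effect are skipped. *)
  let result = eval (If (Number 2.5, Number 1.0, Print 9.0)) in
  Printf.printf "result = %g, side effects = %d\n" result (List.length !printed)
```

Evaluating both branches and then selecting one would give the same result here but would also run the `Print`, which is why the evaluation order needs to be pinned down.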

    + +

    Now that we know what we "want", let's break this down into its constituent +pieces.

    + +
    + + + + + + +
    + +

    The lexer extensions are straightforward. First we add new variants +for the relevant tokens:

    + +
    +
    +  (* control *)
    +  | If | Then | Else | For | In
    +
    +
    + +

    Once we have that, we recognize the new keywords in the lexer. This is pretty simple +stuff:

    + +
    +
    +      ...
    +      match Buffer.contents buffer with
    +      | "def" -> [< 'Token.Def; stream >]
    +      | "extern" -> [< 'Token.Extern; stream >]
    +      | "if" -> [< 'Token.If; stream >]
    +      | "then" -> [< 'Token.Then; stream >]
    +      | "else" -> [< 'Token.Else; stream >]
    +      | "for" -> [< 'Token.For; stream >]
    +      | "in" -> [< 'Token.In; stream >]
    +      | id -> [< 'Token.Ident id; stream >]
    +
    +
    + +
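As a standalone illustration of the keyword match above, here is a minimal sketch that avoids the camlp4 stream syntax. The `token` type below is a cut-down stand-in for the tutorial's Token.token, and `token_of_word` is a hypothetical helper name.

```ocaml
(* Cut-down stand-in for Token.token; only the variants needed here. *)
type token =
  | Def | Extern
  | If | Then | Else | For | In
  | Ident of string

(* Classify a complete identifier-like word: keywords first, then the fallback. *)
let token_of_word = function
  | "def" -> Def
  | "extern" -> Extern
  | "if" -> If
  | "then" -> Then
  | "else" -> Else
  | "for" -> For
  | "in" -> In
  | id -> Ident id
```

Because the match is on the whole buffered word, a name such as "iffy" or "index" still lexes as an identifier rather than as a keyword followed by more characters.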
    + + + + + +
    + +

    To represent the new expression we add a new AST variant for it:

    + +
    +
    +type expr =
    +  ...
    +  (* variant for if/then/else. *)
    +  | If of expr * expr * expr
    +
    +
    + +

    The AST variant just has pointers to the various subexpressions.
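To make the shape of the new node concrete, the sketch below builds the AST for the body of the fib example by hand. The other variants (`Number`, `Variable`, `Binary`, `Call`) are assumed to look like those from earlier chapters, and `count_ifs` is a hypothetical helper added only to show a traversal.

```ocaml
(* Assumed to mirror the earlier chapters' Ast.expr, plus the new If variant. *)
type expr =
  | Number of float
  | Variable of string
  | Binary of char * expr * expr
  | Call of string * expr array
  | If of expr * expr * expr

(* if x < 3 then 1 else fib(x-1)+fib(x-2) *)
let fib_body =
  If (Binary ('<', Variable "x", Number 3.0),
      Number 1.0,
      Binary ('+',
              Call ("fib", [| Binary ('-', Variable "x", Number 1.0) |]),
              Call ("fib", [| Binary ('-', Variable "x", Number 2.0) |])))

(* Count If nodes, visiting every subexpression. *)
let rec count_ifs = function
  | Number _ | Variable _ -> 0
  | Binary (_, l, r) -> count_ifs l + count_ifs r
  | Call (_, args) -> Array.fold_left (fun n a -> n + count_ifs a) 0 args
  | If (c, t, e) -> 1 + count_ifs c + count_ifs t + count_ifs e
```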

    + +
    + + + + + +
    + +

    Now that we have the relevant tokens coming from the lexer and we have the +AST node to build, our parsing logic is relatively straightforward. First we +add a new case to the parsing function:

    + +
    +
    +let rec parse_primary = parser
    +  ...
    +  (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
    +  | [< 'Token.If; c=parse_expr;
    +       'Token.Then ?? "expected 'then'"; t=parse_expr;
    +       'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
    +      Ast.If (c, t, e)
    +
    +
    + +

    Because this rule is a case of parse_primary, it is already hooked up as a primary expression:

    + +
    +
    +let rec parse_primary = parser
    +  ...
    +  (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
    +  | [< 'Token.If; c=parse_expr;
    +       'Token.Then ?? "expected 'then'"; t=parse_expr;
    +       'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
    +      Ast.If (c, t, e)
    +
    +
    + +
    + + + + + +
    + +

    Now that we have it parsing and building the AST, the final piece is adding +LLVM code generation support. This is the most interesting part of the +if/then/else example, because this is where it starts to introduce new concepts. +All of the code above has been thoroughly described in previous chapters. +

    + +

    To motivate the code we want to produce, let's take a look at a simple +example. Consider:

    + +
    +
    +extern foo();
    +extern bar();
    +def baz(x) if x then foo() else bar();
    +
    +
    + +

    If you disable optimizations, the code you'll (soon) get from Kaleidoscope +looks like this:

    + +
    +
    +declare double @foo()
    +
    +declare double @bar()
    +
    +define double @baz(double %x) {
    +entry:
    +  %ifcond = fcmp one double %x, 0.000000e+00
    +  br i1 %ifcond, label %then, label %else
    +
    +then:    ; preds = %entry
    +  %calltmp = call double @foo()
    +  br label %ifcont
    +
    +else:    ; preds = %entry
    +  %calltmp1 = call double @bar()
    +  br label %ifcont
    +
    +ifcont:    ; preds = %else, %then
    +  %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
    +  ret double %iftmp
    +}
    +
    +
    + +

    To visualize the control flow graph, you can use a nifty feature of the LLVM +'opt' tool. If you put this LLVM IR +into "t.ll" and run "llvm-as < t.ll | opt -analyze -view-cfg", a window will pop up and you'll +see this graph:

    + +
    Example CFG
    + +

    Another way to get this is to call "Llvm_analysis.view_function_cfg +f" or "Llvm_analysis.view_function_cfg_only f" (where f +is a "Function") either by inserting actual calls into the code and +recompiling or by calling these in the debugger. LLVM has many nice features +for visualizing various graphs.

    + +

    Getting back to the generated code, it is fairly simple: the entry block +evaluates the conditional expression ("x" in our case here) and compares the +result to 0.0 with the "fcmp one" +instruction ('one' is "Ordered and Not Equal"). Based on the result of this +expression, the code jumps to either the "then" or "else" blocks, which contain +the expressions for the true/false cases.
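The "ordered" and "unordered" qualifiers only matter when a NaN is involved. The sketch below models the two predicates used in this chapter's IR ('one' and 'ult') on plain OCaml floats. The function names are made up for illustration and are not part of the Llvm bindings.

```ocaml
(* NaN is the only float that is not equal to itself. *)
let is_nan (x : float) = x <> x

(* A comparison is "unordered" when at least one operand is NaN. *)
let unordered a b = is_nan a || is_nan b

(* 'one': ordered and not equal. *)
let fcmp_one a b = (not (unordered a b)) && a <> b

(* 'ult': unordered or less than. *)
let fcmp_ult a b = unordered a b || a < b
```

Under this model a NaN condition compares false against 0.0 with 'one', so it would take the "else" branch.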

    + +

    Once the then/else blocks are finished executing, they both branch back to the +'ifcont' block to execute the code that happens after the if/then/else. In this +case the only thing left to do is to return to the caller of the function. The +question then becomes: how does the code know which expression to return?

    + +

    The answer to this question involves an important SSA operation: the +Phi +operation. If you're not familiar with SSA, the Wikipedia +article is a good introduction and there are various other introductions to +it available on your favorite search engine. The short version is that +"execution" of the Phi operation requires "remembering" which block control came +from. The Phi operation takes on the value corresponding to the input control +block. In this case, if control comes in from the "then" block, it gets the +value of "calltmp". If control comes from the "else" block, it gets the value +of "calltmp1".
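One way to picture this is to treat a Phi as a lookup table keyed by the predecessor block. The sketch below is a toy model only: blocks are plain strings, values are floats, and `eval_phi` is a hypothetical name. The (value, block) ordering follows the incoming list that the tutorial passes to build_phi.

```ocaml
(* A phi is modelled as a list of (value, predecessor label) pairs. *)
let eval_phi (incoming : (float * string) list) ~(came_from : string) : float =
  match List.find_opt (fun (_, pred) -> pred = came_from) incoming with
  | Some (value, _) -> value
  | None -> invalid_arg ("phi has no entry for predecessor " ^ came_from)

let () =
  (* Stand-ins for %calltmp and %calltmp1 in the IR above. *)
  let iftmp = [ (10.0, "then"); (20.0, "else") ] in
  Printf.printf "via then: %g, via else: %g\n"
    (eval_phi iftmp ~came_from:"then")
    (eval_phi iftmp ~came_from:"else")
```

The missing-entry case corresponds to a Phi that lacks an entry for one of its predecessors, which is the situation the code generator has to avoid.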

    + +

    At this point, you are probably starting to think "Oh no! This means my +simple and elegant front-end will have to start generating SSA form in order to +use LLVM!". Fortunately, this is not the case, and we strongly advise +not implementing an SSA construction algorithm in your front-end +unless there is an amazingly good reason to do so. In practice, there are two +sorts of values that float around in code written for your average imperative +programming language that might need Phi nodes:

    + +
      +
    1. Code that involves user variables: x = 1; x = x + 1;
    2. Values that are implicit in the structure of your AST, such as the Phi node +in this case.
    + +

    In Chapter 7 of this tutorial ("mutable +variables"), we'll talk about #1 +in depth. For now, just believe me that you don't need SSA construction to +handle this case. For #2, you have the choice of using the techniques that we will +describe for #1, or you can insert Phi nodes directly, if convenient. In this +case, it is really really easy to generate the Phi node, so we choose to do it +directly.

    + +

    Okay, enough of the motivation and overview, let's generate code!

    + +
    + + + + + +
    + +

    In order to generate code for this, we add a case for Ast.If +to codegen_expr:

    + +
    +
    +let rec codegen_expr = function
    +  ...
    +  | Ast.If (cond, then_, else_) ->
    +      let cond = codegen_expr cond in
    +
    +      (* Convert condition to a bool by comparing not-equal to 0.0 *)
    +      let zero = const_float double_type 0.0 in
    +      let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
    +
    +
    + +

    This code is straightforward and similar to what we saw before. We emit the +expression for the condition, then compare that value to zero to get a truth +value as a 1-bit (bool) value.

    + +
    +
    +      (* Grab the first block so that we might later add the conditional branch
    +       * to it at the end of the function. *)
    +      let start_bb = insertion_block builder in
    +      let the_function = block_parent start_bb in
    +
    +      let then_bb = append_block context "then" the_function in
    +      position_at_end then_bb builder;
    +
    +
    + +

    +As opposed to the C++ tutorial, we have to build +our basic blocks bottom up since we can't have dangling BasicBlocks. We start +off by saving a pointer to the first block (which might not be the entry +block), which we'll need to build a conditional branch later. We do this by +asking the builder for the current BasicBlock. The fourth line +gets the current Function object that is being built. It gets this by asking +start_bb for its "parent" (the function it is currently embedded +into).

    + +

    Once it has that, it creates one block. It is automatically appended into +the function's list of blocks.

    + +
    +
    +      (* Emit 'then' value. *)
    +      position_at_end then_bb builder;
    +      let then_val = codegen_expr then_ in
    +
    +      (* Codegen of 'then' can change the current block, update then_bb for the
    +       * phi. We create a new name because one is used for the phi node, and the
    +       * other is used for the conditional branch. *)
    +      let new_then_bb = insertion_block builder in
    +
    +
    + +

    We move the builder to start inserting into the "then" block. Strictly +speaking, this call moves the insertion point to be at the end of the specified +block. However, since the "then" block is empty, it also starts out by +inserting at the beginning of the block. :)

    + +

    Once the insertion point is set, we recursively codegen the "then" expression +from the AST.

    + +

    The final line here is quite subtle, but is very important. The basic issue +is that when we create the Phi node in the merge block, we need to set up the +block/value pairs that indicate how the Phi will work. Importantly, the Phi +node expects to have an entry for each predecessor of the block in the CFG. Why, +then, are we getting the current block when we just set it to then_bb a few lines +above? The problem is that the "then" expression may itself change the +block that the builder is emitting into if, for example, it contains a nested +"if/then/else" expression. Because calling codegen_expr recursively could +arbitrarily change the notion of the current block, we are required to get an +up-to-date value for code that will set up the Phi node.
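The effect is easy to reproduce in a toy model where the "builder" is just a mutable label. Everything below is hypothetical scaffolding (`current`, `fresh`, `phi_preds`, a two-variant `expr`) rather than the tutorial's code. It only tracks how the current block moves while a nested if/then/else is generated.

```ocaml
type expr = Number of float | If of expr * expr * expr

(* Toy builder state: label of the block being filled, and a label counter. *)
let current = ref "entry"
let counter = ref 0

let fresh name =
  incr counter;
  Printf.sprintf "%s%d" name !counter

(* For each 'if': (block created for 'then', block that really reaches the phi). *)
let phi_preds : (string * string) list ref = ref []

let rec codegen = function
  | Number _ -> ()
  | If (cond, then_, else_) ->
      codegen cond;
      let then_bb = fresh "then" in
      current := then_bb;
      codegen then_;
      (* Re-read the current block: a nested 'if' may have moved it. *)
      let new_then_bb = !current in
      current := fresh "else";
      codegen else_;
      phi_preds := (then_bb, new_then_bb) :: !phi_preds;
      current := fresh "ifcont"

let () =
  (* if 1 then (if 1 then 2 else 3) else 4 *)
  codegen (If (Number 1.0, If (Number 1.0, Number 2.0, Number 3.0), Number 4.0));
  List.iter
    (fun (then_bb, pred) ->
      Printf.printf "%s: phi predecessor is %s\n" then_bb pred)
    (List.rev !phi_preds)
```

For the outer expression the predecessor is the inner merge block rather than the block originally created for "then", which is exactly why the tutorial records new_then_bb separately from then_bb.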

    + +
    +
    +      (* Emit 'else' value. *)
    +      let else_bb = append_block context "else" the_function in
    +      position_at_end else_bb builder;
    +      let else_val = codegen_expr else_ in
    +
    +      (* Codegen of 'else' can change the current block, update else_bb for the
    +       * phi. *)
    +      let new_else_bb = insertion_block builder in
    +
    +
    + +

    Code generation for the 'else' block is basically identical to codegen for +the 'then' block.

    + +
    +
    +      (* Emit merge block. *)
    +      let merge_bb = append_block context "ifcont" the_function in
    +      position_at_end merge_bb builder;
    +      let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
    +      let phi = build_phi incoming "iftmp" builder in
    +
    +
    + +

    The first two lines here are now familiar: the first adds the "merge" block +to the Function object. The second line changes the insertion point so that +newly created code will go into the "merge" block. Once that is done, we need +to create the PHI node and set up the block/value pairs for the PHI.

    + +
    +
    +      (* Return to the start block to add the conditional branch. *)
    +      position_at_end start_bb builder;
    +      ignore (build_cond_br cond_val then_bb else_bb builder);
    +
    +
    + +

    Once the blocks are created, we can emit the conditional branch that chooses +between them. Note that creating new blocks does not implicitly affect the +IRBuilder, so it is still inserting into the block that the condition +went into. This is why we needed to save the "start" block.

    + +
    +
    +      (* Set an unconditional branch at the end of the 'then' block and the
    +       * 'else' block to the 'merge' block. *)
    +      position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
    +      position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
    +
    +      (* Finally, set the builder to the end of the merge block. *)
    +      position_at_end merge_bb builder;
    +
    +      phi
    +
    +
    + +

    To finish off the blocks, we create an unconditional branch +to the merge block. One interesting (and very important) aspect of the LLVM IR +is that it requires all basic blocks +to be "terminated" with a control flow +instruction such as return or branch. This means that all control flow, +including fall-throughs, must be made explicit in the LLVM IR. If you +violate this rule, the verifier will emit an error. + +
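The termination rule can be written down as a small check over a toy IR. This is a sketch under stated assumptions: `instr`, `block`, and `well_terminated` are invented here for illustration and only test the last instruction, which is a small part of what the real verifier does.

```ocaml
(* Toy instruction set: one ordinary instruction and three terminators. *)
type instr =
  | Call of string
  | Br of string
  | CondBr of string * string
  | Ret

type block = { label : string; body : instr list }

let is_terminator = function
  | Br _ | CondBr _ | Ret -> true
  | Call _ -> false

(* A block passes when it is non-empty and its last instruction is a terminator. *)
let well_terminated (b : block) : bool =
  match List.rev b.body with
  | [] -> false
  | last :: _ -> is_terminator last

(* Labels of the blocks in a function that break the rule. *)
let unterminated (blocks : block list) : string list =
  List.filter (fun b -> not (well_terminated b)) blocks
  |> List.map (fun b -> b.label)
```

In this model, a "then" block that calls foo() and simply stops would be reported, while the same block followed by a branch to "ifcont" would pass.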

    Finally, the CodeGen function returns the phi node as the value computed by +the if/then/else expression. In our example above, this returned value will +feed into the code for the top-level function, which will create the return +instruction.

    + +

    Overall, we now have the ability to execute conditional code in +Kaleidoscope. With this extension, Kaleidoscope is a fairly complete language +that can calculate a wide variety of numeric functions. Next up we'll add +another useful expression that is familiar from non-functional languages...

    + +
    + + + + + +
    + +

    Now that we know how to add basic control flow constructs to the language, +we have the tools to add more powerful things. Let's add something more +aggressive, a 'for' expression:

    + +
    +
    + extern putchard(char);
    + def printstar(n)
    +   for i = 1, i < n, 1.0 in
    +     putchard(42);  # ascii 42 = '*'
    +
    + # print 100 '*' characters
    + printstar(100);
    +
    +
    + +

    This expression defines a new variable ("i" in this case) which iterates from +a starting value, while the condition ("i < n" in this case) is true, +incrementing by an optional step value ("1.0" in this case). If the step value +is omitted, it defaults to 1.0. While the condition is true, the loop executes its +body expression. Because we don't have anything better to return, we'll just +define the loop as always returning 0.0. In the future when we have mutable +variables, it will get more useful.

    + +

    As before, let's talk about the changes that we need to make to Kaleidoscope to +support this.

    + +
    + + + + + +
    + +

    The lexer extensions are the same sort of thing as for if/then/else:

    + +
    +
    +  ... in Token.token ...
    +  (* control *)
    +  | If | Then | Else
    +  | For | In
    +
    +  ... in Lexer.lex_ident...
    +      match Buffer.contents buffer with
    +      | "def" -> [< 'Token.Def; stream >]
    +      | "extern" -> [< 'Token.Extern; stream >]
    +      | "if" -> [< 'Token.If; stream >]
    +      | "then" -> [< 'Token.Then; stream >]
    +      | "else" -> [< 'Token.Else; stream >]
    +      | "for" -> [< 'Token.For; stream >]
    +      | "in" -> [< 'Token.In; stream >]
    +      | id -> [< 'Token.Ident id; stream >]
    +
    +
    + +
    + + + + + +
    + +

    The AST variant is just as simple. It basically boils down to capturing +the variable name and the constituent expressions in the node.

    + +
    +
    +type expr =
    +  ...
    +  (* variant for for/in. *)
    +  | For of string * expr * expr * expr option * expr
    +
    +
    + +
    + + + + + +
    + +

    The parser code is also fairly standard. The only interesting thing here is +handling of the optional step value. The parser code handles it by checking to +see if the second comma is present. If not, it sets the step value to None in +the AST node:

    + +
    +
    +let rec parse_primary = parser
    +  ...
    +  (* forexpr
    +        ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
    +  | [< 'Token.For;
    +       'Token.Ident id ?? "expected identifier after for";
    +       'Token.Kwd '=' ?? "expected '=' after for";
    +       stream >] ->
    +      begin parser
    +        | [<
    +             start=parse_expr;
    +             'Token.Kwd ',' ?? "expected ',' after for";
    +             end_=parse_expr;
    +             stream >] ->
    +            let step =
    +              begin parser
    +              | [< 'Token.Kwd ','; step=parse_expr >] -> Some step
    +              | [< >] -> None
    +              end stream
    +            in
    +            begin parser
    +            | [< 'Token.In; body=parse_expr >] ->
    +                Ast.For (id, start, end_, step, body)
    +            | [< >] ->
    +                raise (Stream.Error "expected 'in' after for")
    +            end stream
    +        | [< >] ->
    +            raise (Stream.Error "expected '=' after for")
    +      end stream
    +
    +
    + +
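The optional-step logic can be isolated in a few lines over a plain token list. This is a simplified sketch: the `token` type and `parse_optional_step` are hypothetical, and the step is restricted to a numeric literal, whereas the tutorial's parser accepts any expression.

```ocaml
(* Minimal token set for the tail of a 'for' header. *)
type token = Comma | In | Number of float

(* Return the optional step together with the tokens that remain.
   A leading comma means a step follows; anything else means no step. *)
let parse_optional_step (tokens : token list) : float option * token list =
  match tokens with
  | Comma :: Number step :: rest -> (Some step, rest)
  | rest -> (None, rest)
```

When no comma is present the token list is returned untouched, so the caller can go on to expect 'in' exactly as the nested parser in the listing does.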
    + + + + + +
    + +

    Now we get to the good part: the LLVM IR we want to generate for this thing. +With the simple example above, we get this LLVM IR (note that this dump is +generated with optimizations disabled for clarity): +

    + +
    +
    +declare double @putchard(double)
    +
    +define double @printstar(double %n) {
    +entry:
    +        ; initial value = 1.0 (inlined into phi)
    +  br label %loop
    +
    +loop:    ; preds = %loop, %entry
    +  %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
    +        ; body
    +  %calltmp = call double @putchard(double 4.200000e+01)
    +        ; increment
    +  %nextvar = fadd double %i, 1.000000e+00
    +
    +        ; termination test
    +  %cmptmp = fcmp ult double %i, %n
    +  %booltmp = uitofp i1 %cmptmp to double
    +  %loopcond = fcmp one double %booltmp, 0.000000e+00
    +  br i1 %loopcond, label %loop, label %afterloop
    +
    +afterloop:    ; preds = %loop
    +        ; loop always returns 0.0
    +  ret double 0.000000e+00
    +}
    +
    +
    + +

    This loop contains all the same constructs we saw before: a phi node, several +expressions, and some basic blocks. Let's see how this fits together.
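Reading the IR listing, the body comes first, then the increment, then the termination test on the current value of the loop variable. The sketch below mirrors that order in plain OCaml. `run_for` and its labelled arguments are hypothetical names, and the model is one reading of the listing rather than a specification of the language.

```ocaml
(* Mirrors the block layout of the IR: body, increment, termination test. *)
let run_for ~start ~step ~(cond : float -> float) ~(body : float -> unit) : float =
  let rec loop i =
    body i;                            (* body *)
    let next = i +. step in            (* increment *)
    if cond i <> 0.0 then loop next    (* test uses the current value, like %i *)
  in
  loop start;
  0.0                                  (* the loop always evaluates to 0.0 *)

let () =
  let stars = Buffer.create 16 in
  let _ =
    run_for ~start:1.0 ~step:1.0
      ~cond:(fun i -> if i < 5.0 then 1.0 else 0.0)
      ~body:(fun _ -> Buffer.add_char stars '*')
  in
  print_endline (Buffer.contents stars)
```

One consequence of this layout is that the body runs at least once, even when the condition is false for the starting value.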

    + +
    + + + + + +
    + +

    The first part of Codegen is very simple: we just output the start expression +for the loop value:

    + +
    +
    +let rec codegen_expr = function
    +  ...
    +  | Ast.For (var_name, start, end_, step, body) ->
    +      (* Emit the start code first, without 'variable' in scope. *)
    +      let start_val = codegen_expr start in
    +
    +
    + +

    With this out of the way, the next step is to set up the LLVM basic block +for the start of the loop body. In the case above, the whole loop body is one +block, but remember that the body code itself could consist of multiple blocks +(e.g. if it contains an if/then/else or a for/in expression).

    + +
    +
    +      (* Make the new basic block for the loop header, inserting after current
    +       * block. *)
    +      let preheader_bb = insertion_block builder in
    +      let the_function = block_parent preheader_bb in
    +      let loop_bb = append_block context "loop" the_function in
    +
    +      (* Insert an explicit fall through from the current block to the
    +       * loop_bb. *)
    +      ignore (build_br loop_bb builder);
    +
    +
    + +

    This code is similar to what we saw for if/then/else. Because we will need +it to create the Phi node, we remember the block that falls through into the +loop. Once we have that, we create the actual block that starts the loop and +create an unconditional branch for the fall-through between the two blocks.

    + +
    +
    +      (* Start insertion in loop_bb. *)
    +      position_at_end loop_bb builder;
    +
    +      (* Start the PHI node with an entry for start. *)
    +      let variable = build_phi [(start_val, preheader_bb)] var_name builder in
    +
    +
    + +

    Now that the "preheader" for the loop is set up, we switch to emitting code +for the loop body. To begin with, we move the insertion point and create the +PHI node for the loop induction variable. Since we already know the incoming +value for the starting value, we add it to the Phi node. Note that the Phi will +eventually get a second value for the backedge, but we can't set it up yet +(because it doesn't exist!).

    + +
    +
    +      (* Within the loop, the variable is defined equal to the PHI node. If it
    +       * shadows an existing variable, we have to restore it, so save it
    +       * now. *)
    +      let old_val =
    +        try Some (Hashtbl.find named_values var_name) with Not_found -> None
    +      in
    +      Hashtbl.add named_values var_name variable;
    +
    +      (* Emit the body of the loop.  This, like any other expr, can change the
    +       * current BB.  Note that we ignore the value computed by the body, but
    +       * don't allow an error *)
    +      ignore (codegen_expr body);
    +
    +
    + +

    Now the code starts to get more interesting. Our 'for' loop introduces a new +variable to the symbol table. This means that our symbol table can now contain +either function arguments or loop variables. To handle this, before we codegen +the body of the loop, we add the loop variable as the current value for its +name. Note that it is possible that there is a variable of the same name in the +outer scope. It would be easy to make this an error (raise an error +if there is already an entry for var_name) but we choose to allow shadowing +of variables. In order to handle this correctly, we remember the Value that +we are potentially shadowing in old_val (which will be None if there is +no shadowed variable).

    + +

    Once the loop variable is set into the symbol table, the code recursively +codegen's the body. This allows the body to use the loop variable: any +references to it will naturally find it in the symbol table.

    + +
    +
    +      (* Emit the step value. *)
    +      let step_val =
    +        match step with
    +        | Some step -> codegen_expr step
    +        (* If not specified, use 1.0. *)
    +        | None -> const_float double_type 1.0
    +      in
    +
    +      let next_var = build_fadd variable step_val "nextvar" builder in
    +
    +
    + +

    Now that the body is emitted, we compute the next value of the iteration +variable by adding the step value, or 1.0 if it isn't present. +'next_var' will be the value of the loop variable on the next iteration +of the loop.

    + +
    +
    +      (* Compute the end condition. *)
    +      let end_cond = codegen_expr end_ in
    +
    +      (* Convert condition to a bool by comparing not-equal to 0.0. *)
    +      let zero = const_float double_type 0.0 in
    +      let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
    +
    +
    + +

    Finally, we evaluate the exit value of the loop, to determine whether the +loop should exit. This mirrors the condition evaluation for the if/then/else +expression.

    + +
    +
    +      (* Create the "after loop" block and insert it. *)
    +      let loop_end_bb = insertion_block builder in
    +      let after_bb = append_block context "afterloop" the_function in
    +
    +      (* Insert the conditional branch into the end of loop_end_bb. *)
    +      ignore (build_cond_br end_cond loop_bb after_bb builder);
    +
    +      (* Any new code will be inserted in after_bb. *)
    +      position_at_end after_bb builder;
    +
    +
    + +

    With the code for the body of the loop complete, we just need to finish up +the control flow for it. This code remembers the end block (for the phi node), then creates the block for the loop exit ("afterloop"). Based on the value of the +exit condition, it creates a conditional branch that chooses between executing +the loop again and exiting the loop. Any future code is emitted in the +"afterloop" block, so it sets the insertion position to it.

    + +
    +
    +      (* Add a new entry to the PHI node for the backedge. *)
    +      add_incoming (next_var, loop_end_bb) variable;
    +
    +      (* Restore the unshadowed variable. *)
    +      begin match old_val with
    +      | Some old_val -> Hashtbl.add named_values var_name old_val
    +      | None -> ()
    +      end;
    +
    +      (* for expr always returns 0.0. *)
    +      const_null double_type
    +
    +
    + +

    The final code handles various cleanups: now that we have the +"next_var" value, we can add the incoming value to the loop PHI node. +After that, we restore any binding that the loop variable shadowed in the symbol table, so that +the outer variable is visible again after the for loop. Finally, code generation of the for loop always +returns 0.0, so that is what we return from Codegen.codegen_expr.
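
The save-and-restore step can be tried on its own with a standard-library hash table. Note that the listing restores the old value with `Hashtbl.add` and does nothing when there was no old value, so in that case the loop variable's entry stays in the table. The sketch below shows one possible variation that uses `Hashtbl.replace` and `Hashtbl.remove` instead. `with_binding` is a hypothetical helper, and the table holds floats rather than `llvalue`s to keep the example self-contained.

```ocaml
(* Sketch of saving and restoring a shadowed symbol-table entry.
 * with_binding is a hypothetical helper, not part of the tutorial code. *)
let named_values : (string, float) Hashtbl.t = Hashtbl.create 10

let with_binding name value f =
  (* Remember whatever the name referred to before, if anything. *)
  let old_val = Hashtbl.find_opt named_values name in
  Hashtbl.replace named_values name value;
  let result = f () in
  (* Put the outer binding back, or drop the name entirely. *)
  begin match old_val with
  | Some v -> Hashtbl.replace named_values name v
  | None -> Hashtbl.remove named_values name
  end;
  result

let () =
  Hashtbl.replace named_values "i" 42.0;
  let inside = with_binding "i" 1.0 (fun () -> Hashtbl.find named_values "i") in
  assert (inside = 1.0);
  (* The outer binding is visible again after the scope ends. *)
  assert (Hashtbl.find named_values "i" = 42.0);
  ignore (with_binding "j" 2.0 (fun () -> ()));
  (* A name that shadowed nothing is gone afterwards. *)
  assert (not (Hashtbl.mem named_values "j"))
```

The helper does not restore the table if `f` raises an exception. That is acceptable for a sketch, but a real implementation would probably want to handle it.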

    + +

    With this, we conclude the "adding control flow to Kaleidoscope" chapter of +the tutorial. In this chapter we added two control flow constructs, and used +them to motivate a couple of aspects of the LLVM IR that are important for +front-end implementors to know. In the next chapter of our saga, we will get +a bit crazier and add user-defined operators +to our poor innocent language.

    + +
    + + + + + +
    + +

    +Here is the complete code listing for our running example, enhanced with the +if/then/else and for expressions. To build this example, use: +

    + +
    +
    +# Compile
    +ocamlbuild toy.byte
    +# Run
    +./toy.byte
    +
    +
    + +

    Here is the code:

    + +
    +
    _tags:
    +
    +
    +<{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
    +<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
    +<*.{byte,native}>: use_llvm_executionengine, use_llvm_target
    +<*.{byte,native}>: use_llvm_scalar_opts, use_bindings
    +
    +
    + +
    myocamlbuild.ml:
    +
    +
    +open Ocamlbuild_plugin;;
    +
    +ocaml_lib ~extern:true "llvm";;
    +ocaml_lib ~extern:true "llvm_analysis";;
    +ocaml_lib ~extern:true "llvm_executionengine";;
    +ocaml_lib ~extern:true "llvm_target";;
    +ocaml_lib ~extern:true "llvm_scalar_opts";;
    +
    +flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);;
    +dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
    +
    +
    + +
    token.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer Tokens
    + *===----------------------------------------------------------------------===*)
    +
    +(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
    + * these others for known things. *)
    +type token =
    +  (* commands *)
    +  | Def | Extern
    +
    +  (* primary *)
    +  | Ident of string | Number of float
    +
    +  (* unknown *)
    +  | Kwd of char
    +
    +  (* control *)
    +  | If | Then | Else
    +  | For | In
    +
    +
    + +
    lexer.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer
    + *===----------------------------------------------------------------------===*)
    +
    +let rec lex = parser
    +  (* Skip any whitespace. *)
    +  | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
    +
    +  (* identifier: [a-zA-Z][a-zA-Z0-9]* *)
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +
    +  (* number: [0-9.]+ *)
    +  | [< ' ('0' .. '9' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +
    +  (* Comment until end of line. *)
    +  | [< ' ('#'); stream >] ->
    +      lex_comment stream
    +
    +  (* Otherwise, just return the character as its ascii value. *)
    +  | [< 'c; stream >] ->
    +      [< 'Token.Kwd c; lex stream >]
    +
    +  (* end of stream. *)
    +  | [< >] -> [< >]
    +
    +and lex_number buffer = parser
    +  | [< ' ('0' .. '9' | '.' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +  | [< stream=lex >] ->
    +      [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
    +
    +and lex_ident buffer = parser
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +  | [< stream=lex >] ->
    +      match Buffer.contents buffer with
    +      | "def" -> [< 'Token.Def; stream >]
    +      | "extern" -> [< 'Token.Extern; stream >]
    +      | "if" -> [< 'Token.If; stream >]
    +      | "then" -> [< 'Token.Then; stream >]
    +      | "else" -> [< 'Token.Else; stream >]
    +      | "for" -> [< 'Token.For; stream >]
    +      | "in" -> [< 'Token.In; stream >]
    +      | id -> [< 'Token.Ident id; stream >]
    +
    +and lex_comment = parser
    +  | [< ' ('\n'); stream=lex >] -> stream
    +  | [< 'c; e=lex_comment >] -> e
    +  | [< >] -> [< >]
    +
    +
    + +
    ast.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Abstract Syntax Tree (aka Parse Tree)
    + *===----------------------------------------------------------------------===*)
    +
    +(* expr - Base type for all expression nodes. *)
    +type expr =
    +  (* variant for numeric literals like "1.0". *)
    +  | Number of float
    +
    +  (* variant for referencing a variable, like "a". *)
    +  | Variable of string
    +
    +  (* variant for a binary operator. *)
    +  | Binary of char * expr * expr
    +
    +  (* variant for function calls. *)
    +  | Call of string * expr array
    +
    +  (* variant for if/then/else. *)
    +  | If of expr * expr * expr
    +
    +  (* variant for for/in. *)
    +  | For of string * expr * expr * expr option * expr
    +
    +(* proto - This type represents the "prototype" for a function, which captures
    + * its name, and its argument names (thus implicitly the number of arguments the
    + * function takes). *)
    +type proto = Prototype of string * string array
    +
    +(* func - This type represents a function definition itself. *)
    +type func = Function of proto * expr
    +
    +
    + +
    parser.ml:
    +
    +
    +(*===---------------------------------------------------------------------===
    + * Parser
    + *===---------------------------------------------------------------------===*)
    +
    +(* binop_precedence - This holds the precedence for each binary operator that is
    + * defined *)
    +let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
    +
    +(* precedence - Get the precedence of the pending binary operator token. *)
    +let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
    +
    +(* primary
    + *   ::= identifier
    + *   ::= numberexpr
    + *   ::= parenexpr
    + *   ::= ifexpr
    + *   ::= forexpr *)
    +let rec parse_primary = parser
    +  (* numberexpr ::= number *)
    +  | [< 'Token.Number n >] -> Ast.Number n
    +
    +  (* parenexpr ::= '(' expression ')' *)
    +  | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
    +
    +  (* identifierexpr
    +   *   ::= identifier
    +   *   ::= identifier '(' argumentexpr ')' *)
    +  | [< 'Token.Ident id; stream >] ->
    +      let rec parse_args accumulator = parser
    +        | [< e=parse_expr; stream >] ->
    +            begin parser
    +              | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
    +              | [< >] -> e :: accumulator
    +            end stream
    +        | [< >] -> accumulator
    +      in
    +      let rec parse_ident id = parser
    +        (* Call. *)
    +        | [< 'Token.Kwd '(';
    +             args=parse_args [];
    +             'Token.Kwd ')' ?? "expected ')'">] ->
    +            Ast.Call (id, Array.of_list (List.rev args))
    +
    +        (* Simple variable ref. *)
    +        | [< >] -> Ast.Variable id
    +      in
    +      parse_ident id stream
    +
    +  (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
    +  | [< 'Token.If; c=parse_expr;
    +       'Token.Then ?? "expected 'then'"; t=parse_expr;
    +       'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
    +      Ast.If (c, t, e)
    +
    +  (* forexpr
    +        ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
    +  | [< 'Token.For;
    +       'Token.Ident id ?? "expected identifier after for";
    +       'Token.Kwd '=' ?? "expected '=' after for";
    +       stream >] ->
    +      begin parser
    +        | [<
    +             start=parse_expr;
    +             'Token.Kwd ',' ?? "expected ',' after for";
    +             end_=parse_expr;
    +             stream >] ->
    +            let step =
    +              begin parser
    +              | [< 'Token.Kwd ','; step=parse_expr >] -> Some step
    +              | [< >] -> None
    +              end stream
    +            in
    +            begin parser
    +            | [< 'Token.In; body=parse_expr >] ->
    +                Ast.For (id, start, end_, step, body)
    +            | [< >] ->
    +                raise (Stream.Error "expected 'in' after for")
    +            end stream
    +        | [< >] ->
    +            raise (Stream.Error "expected '=' after for")
    +      end stream
    +
    +  | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
    +
    +(* binoprhs
    + *   ::= ('+' primary)* *)
    +and parse_bin_rhs expr_prec lhs stream =
    +  match Stream.peek stream with
    +  (* If this is a binop, find its precedence. *)
    +  | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
    +      let token_prec = precedence c in
    +
    +      (* If this is a binop that binds at least as tightly as the current binop,
    +       * consume it, otherwise we are done. *)
    +      if token_prec < expr_prec then lhs else begin
    +        (* Eat the binop. *)
    +        Stream.junk stream;
    +
    +        (* Parse the primary expression after the binary operator. *)
    +        let rhs = parse_primary stream in
    +
    +        (* Okay, we know this is a binop. *)
    +        let rhs =
    +          match Stream.peek stream with
    +          | Some (Token.Kwd c2) ->
    +              (* If BinOp binds less tightly with rhs than the operator after
    +               * rhs, let the pending operator take rhs as its lhs. *)
    +              let next_prec = precedence c2 in
    +              if token_prec < next_prec
    +              then parse_bin_rhs (token_prec + 1) rhs stream
    +              else rhs
    +          | _ -> rhs
    +        in
    +
    +        (* Merge lhs/rhs. *)
    +        let lhs = Ast.Binary (c, lhs, rhs) in
    +        parse_bin_rhs expr_prec lhs stream
    +      end
    +  | _ -> lhs
    +
    +(* expression
    + *   ::= primary binoprhs *)
    +and parse_expr = parser
    +  | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
    +
    +(* prototype
    + *   ::= id '(' id* ')' *)
    +let parse_prototype =
    +  let rec parse_args accumulator = parser
    +    | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    +    | [< >] -> accumulator
    +  in
    +
    +  parser
    +  | [< 'Token.Ident id;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +       args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      (* success. *)
    +      Ast.Prototype (id, Array.of_list (List.rev args))
    +
    +  | [< >] ->
    +      raise (Stream.Error "expected function name in prototype")
    +
    +(* definition ::= 'def' prototype expression *)
    +let parse_definition = parser
    +  | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
    +      Ast.Function (p, e)
    +
    +(* toplevelexpr ::= expression *)
    +let parse_toplevel = parser
    +  | [< e=parse_expr >] ->
    +      (* Make an anonymous proto. *)
    +      Ast.Function (Ast.Prototype ("", [||]), e)
    +
    +(*  external ::= 'extern' prototype *)
    +let parse_extern = parser
    +  | [< 'Token.Extern; e=parse_prototype >] -> e
    +
    +
    + +
    codegen.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Code Generation
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +
    +exception Error of string
    +
    +let context = global_context ()
    +let the_module = create_module context "my cool jit"
    +let builder = builder context
    +let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
    +let double_type = double_type context
    +
    +let rec codegen_expr = function
    +  | Ast.Number n -> const_float double_type n
    +  | Ast.Variable name ->
    +      (try Hashtbl.find named_values name with
    +        | Not_found -> raise (Error "unknown variable name"))
    +  | Ast.Binary (op, lhs, rhs) ->
    +      let lhs_val = codegen_expr lhs in
    +      let rhs_val = codegen_expr rhs in
    +      begin
    +        match op with
    +        | '+' -> build_add lhs_val rhs_val "addtmp" builder
    +        | '-' -> build_sub lhs_val rhs_val "subtmp" builder
    +        | '*' -> build_mul lhs_val rhs_val "multmp" builder
    +        | '<' ->
    +            (* Convert bool 0/1 to double 0.0 or 1.0 *)
    +            let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
    +            build_uitofp i double_type "booltmp" builder
    +        | _ -> raise (Error "invalid binary operator")
    +      end
    +  | Ast.Call (callee, args) ->
    +      (* Look up the name in the module table. *)
    +      let callee =
    +        match lookup_function callee the_module with
    +        | Some callee -> callee
    +        | None -> raise (Error "unknown function referenced")
    +      in
    +      let params = params callee in
    +
    +      (* Report an error if the argument counts do not match. *)
    +      if Array.length params == Array.length args then () else
    +        raise (Error "incorrect # arguments passed");
    +      let args = Array.map codegen_expr args in
    +      build_call callee args "calltmp" builder
    +  | Ast.If (cond, then_, else_) ->
    +      let cond = codegen_expr cond in
    +
    +      (* Convert condition to a bool by comparing not equal to 0.0 *)
    +      let zero = const_float double_type 0.0 in
    +      let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
    +
    +      (* Grab the first block so that we might later add the conditional branch
    +       * to it at the end of the function. *)
    +      let start_bb = insertion_block builder in
    +      let the_function = block_parent start_bb in
    +
    +      let then_bb = append_block context "then" the_function in
    +
    +      (* Emit 'then' value. *)
    +      position_at_end then_bb builder;
    +      let then_val = codegen_expr then_ in
    +
    +      (* Codegen of 'then' can change the current block, update then_bb for the
    +       * phi. We create a new name because one is used for the phi node, and the
    +       * other is used for the conditional branch. *)
    +      let new_then_bb = insertion_block builder in
    +
    +      (* Emit 'else' value. *)
    +      let else_bb = append_block context "else" the_function in
    +      position_at_end else_bb builder;
    +      let else_val = codegen_expr else_ in
    +
    +      (* Codegen of 'else' can change the current block, update else_bb for the
    +       * phi. *)
    +      let new_else_bb = insertion_block builder in
    +
    +      (* Emit merge block. *)
    +      let merge_bb = append_block context "ifcont" the_function in
    +      position_at_end merge_bb builder;
    +      let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
    +      let phi = build_phi incoming "iftmp" builder in
    +
    +      (* Return to the start block to add the conditional branch. *)
    +      position_at_end start_bb builder;
    +      ignore (build_cond_br cond_val then_bb else_bb builder);
    +
    +      (* Set an unconditional branch at the end of the 'then' block and the
    +       * 'else' block to the 'merge' block. *)
    +      position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
    +      position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
    +
    +      (* Finally, set the builder to the end of the merge block. *)
    +      position_at_end merge_bb builder;
    +
    +      phi
    +  | Ast.For (var_name, start, end_, step, body) ->
    +      (* Emit the start code first, without 'variable' in scope. *)
    +      let start_val = codegen_expr start in
    +
    +      (* Make the new basic block for the loop header, inserting after current
    +       * block. *)
    +      let preheader_bb = insertion_block builder in
    +      let the_function = block_parent preheader_bb in
    +      let loop_bb = append_block context "loop" the_function in
    +
    +      (* Insert an explicit fall through from the current block to the
    +       * loop_bb. *)
    +      ignore (build_br loop_bb builder);
    +
    +      (* Start insertion in loop_bb. *)
    +      position_at_end loop_bb builder;
    +
    +      (* Start the PHI node with an entry for start. *)
    +      let variable = build_phi [(start_val, preheader_bb)] var_name builder in
    +
    +      (* Within the loop, the variable is defined equal to the PHI node. If it
    +       * shadows an existing variable, we have to restore it, so save it
    +       * now. *)
    +      let old_val =
    +        try Some (Hashtbl.find named_values var_name) with Not_found -> None
    +      in
    +      Hashtbl.add named_values var_name variable;
    +
    +      (* Emit the body of the loop.  This, like any other expr, can change the
    +       * current BB.  Note that we ignore the value computed by the body, but
    +       * don't allow an error *)
    +      ignore (codegen_expr body);
    +
    +      (* Emit the step value. *)
    +      let step_val =
    +        match step with
    +        | Some step -> codegen_expr step
    +        (* If not specified, use 1.0. *)
    +        | None -> const_float double_type 1.0
    +      in
    +
    +      let next_var = build_add variable step_val "nextvar" builder in
    +
    +      (* Compute the end condition. *)
    +      let end_cond = codegen_expr end_ in
    +
    +      (* Convert condition to a bool by comparing not equal to 0.0. *)
    +      let zero = const_float double_type 0.0 in
    +      let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
    +
    +      (* Create the "after loop" block and insert it. *)
    +      let loop_end_bb = insertion_block builder in
    +      let after_bb = append_block context "afterloop" the_function in
    +
    +      (* Insert the conditional branch into the end of loop_end_bb. *)
    +      ignore (build_cond_br end_cond loop_bb after_bb builder);
    +
    +      (* Any new code will be inserted in after_bb. *)
    +      position_at_end after_bb builder;
    +
    +      (* Add a new entry to the PHI node for the backedge. *)
    +      add_incoming (next_var, loop_end_bb) variable;
    +
    +      (* Restore the unshadowed variable. *)
    +      begin match old_val with
    +      | Some old_val -> Hashtbl.add named_values var_name old_val
    +      | None -> ()
    +      end;
    +
    +      (* for expr always returns 0.0. *)
    +      const_null double_type
    +
    +let codegen_proto = function
    +  | Ast.Prototype (name, args) ->
    +      (* Make the function type: double(double,double) etc. *)
    +      let doubles = Array.make (Array.length args) double_type in
    +      let ft = function_type double_type doubles in
    +      let f =
    +        match lookup_function name the_module with
    +        | None -> declare_function name ft the_module
    +
    +        (* If 'f' conflicted, there was already something named 'name'. If it
    +         * has a body, don't allow redefinition or reextern. *)
    +        | Some f ->
    +            (* If 'f' already has a body, reject this. *)
    +            if block_begin f <> At_end f then
    +              raise (Error "redefinition of function");
    +
    +            (* If 'f' took a different number of arguments, reject. *)
    +            if element_type (type_of f) <> ft then
    +              raise (Error "redefinition of function with different # args");
    +            f
    +      in
    +
    +      (* Set names for all arguments. *)
    +      Array.iteri (fun i a ->
    +        let n = args.(i) in
    +        set_value_name n a;
    +        Hashtbl.add named_values n a;
    +      ) (params f);
    +      f
    +
    +let codegen_func the_fpm = function
    +  | Ast.Function (proto, body) ->
    +      Hashtbl.clear named_values;
    +      let the_function = codegen_proto proto in
    +
    +      (* Create a new basic block to start insertion into. *)
    +      let bb = append_block context "entry" the_function in
    +      position_at_end bb builder;
    +
    +      try
    +        let ret_val = codegen_expr body in
    +
    +        (* Finish off the function. *)
    +        let _ = build_ret ret_val builder in
    +
    +        (* Validate the generated code, checking for consistency. *)
    +        Llvm_analysis.assert_valid_function the_function;
    +
    +        (* Optimize the function. *)
    +        let _ = PassManager.run_function the_function the_fpm in
    +
    +        the_function
    +      with e ->
    +        delete_function the_function;
    +        raise e
    +
    +
    + +
    toplevel.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Top-Level parsing and JIT Driver
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +open Llvm_executionengine
    +
    +(* top ::= definition | external | expression | ';' *)
    +let rec main_loop the_fpm the_execution_engine stream =
    +  match Stream.peek stream with
    +  | None -> ()
    +
    +  (* ignore top-level semicolons. *)
    +  | Some (Token.Kwd ';') ->
    +      Stream.junk stream;
    +      main_loop the_fpm the_execution_engine stream
    +
    +  | Some token ->
    +      begin
    +        try match token with
    +        | Token.Def ->
    +            let e = Parser.parse_definition stream in
    +            print_endline "parsed a function definition.";
    +            dump_value (Codegen.codegen_func the_fpm e);
    +        | Token.Extern ->
    +            let e = Parser.parse_extern stream in
    +            print_endline "parsed an extern.";
    +            dump_value (Codegen.codegen_proto e);
    +        | _ ->
    +            (* Evaluate a top-level expression into an anonymous function. *)
    +            let e = Parser.parse_toplevel stream in
    +            print_endline "parsed a top-level expr";
    +            let the_function = Codegen.codegen_func the_fpm e in
    +            dump_value the_function;
    +
    +            (* JIT the function, returning a function pointer. *)
    +            let result = ExecutionEngine.run_function the_function [||]
    +              the_execution_engine in
    +
    +            print_string "Evaluated to ";
    +            print_float (GenericValue.as_float Codegen.double_type result);
    +            print_newline ();
    +        with Stream.Error s | Codegen.Error s ->
    +          (* Skip token for error recovery. *)
    +          Stream.junk stream;
    +          print_endline s;
    +      end;
    +      print_string "ready> "; flush stdout;
    +      main_loop the_fpm the_execution_engine stream
    +
    +
    + +
    toy.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Main driver code.
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +open Llvm_executionengine
    +open Llvm_target
    +open Llvm_scalar_opts
    +
    +let main () =
    +  ignore (initialize_native_target ());
    +
    +  (* Install standard binary operators.
    +   * 1 is the lowest precedence. *)
    +  Hashtbl.add Parser.binop_precedence '<' 10;
    +  Hashtbl.add Parser.binop_precedence '+' 20;
    +  Hashtbl.add Parser.binop_precedence '-' 20;
    +  Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
    +
    +  (* Prime the first token. *)
    +  print_string "ready> "; flush stdout;
    +  let stream = Lexer.lex (Stream.of_channel stdin) in
    +
    +  (* Create the JIT. *)
    +  let the_execution_engine = ExecutionEngine.create Codegen.the_module in
    +  let the_fpm = PassManager.create_function Codegen.the_module in
    +
    +  (* Set up the optimizer pipeline.  Start with registering info about how the
    +   * target lays out data structures. *)
    +  TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
    +
    +  (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
    +  add_instruction_combination the_fpm;
    +
    +  (* reassociate expressions. *)
    +  add_reassociation the_fpm;
    +
    +  (* Eliminate Common SubExpressions. *)
    +  add_gvn the_fpm;
    +
    +  (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
    +  add_cfg_simplification the_fpm;
    +
    +  ignore (PassManager.initialize the_fpm);
    +
    +  (* Run the main "interpreter loop" now. *)
    +  Toplevel.main_loop the_fpm the_execution_engine stream;
    +
    +  (* Print out all the generated code. *)
    +  dump_module Codegen.the_module
    +;;
    +
    +main ()
    +
    +
    + +
    bindings.c:
    +
    +
    +#include <stdio.h>
    +
    +/* putchard - putchar that takes a double and returns 0. */
    +extern double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +
    +
    + +Next: Extending the language: user-defined +operators +
    + + +
    +
    + Chris Lattner
    + Erick Tryzelaar
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-28 10:07:41 -0700 (Fri, 28 May 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl6.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl6.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl6.html (added) +++ www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl6.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1574 @@ + + + + + Kaleidoscope: Extending the Language: User-defined Operators + + + + + + + + +
    Kaleidoscope: Extending the Language: User-defined Operators
    + + + +
    +

    + Written by Chris Lattner + and Erick Tryzelaar +

    +
    + + + + + +
    + +

    Welcome to Chapter 6 of the "Implementing a language +with LLVM" tutorial. At this point in our tutorial, we now have a fully +functional language that is fairly minimal, but also useful. There +is still one big problem with it, however. Our language doesn't have many +useful operators (like division, logical negation, or even any comparisons +besides less-than).

    + +

    This chapter of the tutorial takes a wild digression into adding user-defined +operators to the simple and beautiful Kaleidoscope language. This digression now +gives us a simple and ugly language in some ways, but also a powerful one at the +same time. One of the great things about creating your own language is that you +get to decide what is good or bad. In this tutorial we'll assume that it is +okay to use this as a way to show some interesting parsing techniques.

    + +

    At the end of this tutorial, we'll run through an example Kaleidoscope +application that renders the Mandelbrot set. This gives +an example of what you can build with Kaleidoscope and its feature set.

    + +
    + + + + + +
    + +

    +The "operator overloading" that we will add to Kaleidoscope is more general than +in languages like C++. In C++, you are only allowed to redefine existing +operators: you can't programmatically change the grammar, introduce new +operators, change precedence levels, etc. In this chapter, we will add this +capability to Kaleidoscope, which will let the user round out the set of +operators that are supported.

    + +

    The point of going into user-defined operators in a tutorial like this is to +show the power and flexibility of using a hand-written parser. Thus far, the parser +we have been implementing uses recursive descent for most parts of the grammar and +operator precedence parsing for the expressions. See Chapter 2 for details. Without using operator +precedence parsing, it would be very difficult to allow the programmer to +introduce new operators into the grammar: the grammar is dynamically extensible +as the JIT runs.
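
To see why a table-driven operator precedence parser extends so easily, here is a minimal sketch that parses a list of tokens instead of a camlp4 stream. The `token` and `expr` types are simplified stand-ins for the tutorial's own, and operands are restricted to numbers. Only the mutable `binop_precedence` table is taken from the parser shown in this tutorial.

```ocaml
(* Sketch of operator-precedence parsing driven by a mutable table.
 * The token and expr types are simplified stand-ins for this example. *)
type token = Num of float | Op of char

type expr = Number of float | Binary of char * expr * expr

let binop_precedence : (char, int) Hashtbl.t = Hashtbl.create 10

let precedence c =
  try Hashtbl.find binop_precedence c with Not_found -> -1

(* Operands are just numbers in this sketch. *)
let parse_primary = function
  | Num n :: rest -> (Number n, rest)
  | _ -> failwith "expected a number"

let rec parse_bin_rhs expr_prec lhs tokens =
  match tokens with
  | Op c :: rest when Hashtbl.mem binop_precedence c
                      && precedence c >= expr_prec ->
      let token_prec = precedence c in
      let rhs, rest = parse_primary rest in
      (* If the next operator binds tighter, it takes rhs as its lhs. *)
      let rhs, rest =
        match rest with
        | Op c2 :: _ when token_prec < precedence c2 ->
            parse_bin_rhs (token_prec + 1) rhs rest
        | _ -> (rhs, rest)
      in
      parse_bin_rhs expr_prec (Binary (c, lhs, rhs)) rest
  | _ -> (lhs, tokens)

let parse_expr tokens =
  let lhs, rest = parse_primary tokens in
  parse_bin_rhs 0 lhs rest

let () =
  Hashtbl.replace binop_precedence '<' 10;
  Hashtbl.replace binop_precedence '+' 20;
  let tokens = [Num 1.0; Op '|'; Num 2.0; Op '<'; Num 3.0] in
  (* '|' is unknown at first, so parsing stops in front of it. *)
  let _, rest = parse_expr tokens in
  assert (rest = [Op '|'; Num 2.0; Op '<'; Num 3.0]);
  (* Registering a precedence at run time makes the same input parse. *)
  Hashtbl.replace binop_precedence '|' 5;
  let ast, rest = parse_expr tokens in
  assert (rest = []);
  assert (ast = Binary ('|', Number 1.0,
                        Binary ('<', Number 2.0, Number 3.0)))
```

No grammar rule changes between the two parses. The only difference is one new entry in the table, which is what makes it practical to let user code introduce operators while the JIT is running.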

    + +

    The two specific features we'll add are programmable unary operators (right +now, Kaleidoscope has no unary operators at all) as well as binary operators. +An example of this is:

    + +
    +
    +# Logical unary not.
    +def unary!(v)
    +  if v then
    +    0
    +  else
    +    1;
    +
    +# Define > with the same precedence as <.
    +def binary> 10 (LHS RHS)
    +  RHS < LHS;
    +
    +# Binary "logical or", (note that it does not "short circuit")
    +def binary| 5 (LHS RHS)
    +  if LHS then
    +    1
    +  else if RHS then
    +    1
    +  else
    +    0;
    +
    +# Define = with slightly lower precedence than relationals.
    +def binary= 9 (LHS RHS)
    +  !(LHS < RHS | LHS > RHS);
    +
    +
    + +

    Many languages aspire to being able to implement their standard runtime +library in the language itself. In Kaleidoscope, we can implement significant +parts of the language in the library!
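
As a sanity check on the definitions above, they can be mirrored as ordinary OCaml functions over floats. This is only a model: the function names are made up for the sketch, any value other than 0.0 is treated as true, and NaN handling is ignored.

```ocaml
(* OCaml stand-ins for the Kaleidoscope definitions above. Every value is a
 * float, and anything other than 0.0 counts as true. The function names are
 * made up for this sketch. *)
let truthy v = v <> 0.0
let of_bool b = if b then 1.0 else 0.0

(* binary< is built in and yields 0.0 or 1.0. *)
let lt lhs rhs = of_bool (lhs < rhs)

(* def unary!(v) if v then 0 else 1 *)
let not_ v = if truthy v then 0.0 else 1.0

(* def binary> 10 (LHS RHS) RHS < LHS *)
let gt lhs rhs = lt rhs lhs

(* def binary| 5 (LHS RHS): both operands are always evaluated. *)
let or_ lhs rhs = if truthy lhs then 1.0 else if truthy rhs then 1.0 else 0.0

(* def binary= 9 (LHS RHS) !(LHS < RHS | LHS > RHS)
 * '|' has the lowest precedence, so the two comparisons group first. *)
let eq lhs rhs = not_ (or_ (lt lhs rhs) (gt lhs rhs))

let () =
  assert (eq 2.0 2.0 = 1.0);
  assert (eq 2.0 3.0 = 0.0);
  assert (gt 3.0 2.0 = 1.0);
  assert (not_ 5.0 = 0.0)
```

Each operator is built from the single built-in comparison plus `if/then/else`, which is the sense in which parts of the language can live in the library.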

    + +

    We will break down implementation of these features into two parts: +implementing support for user-defined binary operators and adding unary +operators.

    + +
    + + + + + +
    + +

    Adding support for user-defined binary operators is pretty simple with our +current framework. We'll first add support for the unary/binary keywords:

    + +
    +
    +type token =
    +  ...
    +  (* operators *)
    +  | Binary | Unary
    +
    +...
    +
    +and lex_ident buffer = parser
    +  ...
    +      | "for" -> [< 'Token.For; stream >]
    +      | "in" -> [< 'Token.In; stream >]
    +      | "binary" -> [< 'Token.Binary; stream >]
    +      | "unary" -> [< 'Token.Unary; stream >]
    +
    +
    + +

    This just adds lexer support for the unary and binary keywords, like we +did in previous chapters. One nice +thing about our current AST is that it represents binary operators in full +generality by using their ASCII code as the opcode. For our extended +operators, we'll use this same representation, so we don't need any new AST or +parser support.
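
The benefit of a character opcode can be illustrated with a small evaluator, used here in place of LLVM code generation. The `user_ops` table, its `"binary" ^ op` key format and the float-valued `eval` function are all assumptions made for this example, not the tutorial's own code.

```ocaml
(* Sketch: a char opcode lets one Binary variant cover built-in and
 * user-defined operators. The user_ops table and its key format are
 * assumptions made for this example. *)
type expr =
  | Number of float
  | Binary of char * expr * expr

let user_ops : (string, float -> float -> float) Hashtbl.t = Hashtbl.create 10

let rec eval = function
  | Number n -> n
  | Binary (op, lhs, rhs) ->
      let l = eval lhs in
      let r = eval rhs in
      begin match op with
      | '+' -> l +. r
      | '-' -> l -. r
      | '*' -> l *. r
      | '<' -> if l < r then 1.0 else 0.0
      | _ ->
          (* Not built in: look for a definition named after the opcode. *)
          let name = "binary" ^ String.make 1 op in
          begin match Hashtbl.find_opt user_ops name with
          | Some f -> f l r
          | None -> failwith "invalid binary operator"
          end
      end

let () =
  Hashtbl.replace user_ops "binary>" (fun l r -> if r < l then 1.0 else 0.0);
  assert (eval (Binary ('>', Number 3.0, Number 2.0)) = 1.0);
  assert (eval (Binary ('+', Number 1.0, Number 2.0)) = 3.0)
```

The `expr` type is unchanged when `>` becomes available. Only the fallback branch for unknown opcodes has to know where user definitions live.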

    + +

    On the other hand, we have to be able to represent the definitions of these +new operators, in the "def binary| 5" part of the function definition. In our +grammar so far, the "name" for the function definition is parsed as the +"prototype" production and into the Ast.Prototype AST node. To +represent our new user-defined operators as prototypes, we have to extend +the Ast.Prototype AST node like this:

    + +
    +
    +(* proto - This type represents the "prototype" for a function, which captures
    + * its name, and its argument names (thus implicitly the number of arguments the
    + * function takes). *)
    +type proto =
    +  | Prototype of string * string array
    +  | BinOpPrototype of string * string array * int
    +
    +
    + +

    Basically, in addition to knowing a name for the prototype, we now keep track of whether it was an operator, and if it was, what precedence level the operator is at. The precedence is only used for binary operators (as you'll see below, it just doesn't apply for unary operators). Now that we have a way to represent the prototype for a user-defined operator, we need to parse it:

    + +
    +
    +(* prototype
    + *   ::= id '(' id* ')'
    + *   ::= binary LETTER number? (id, id)
    + *   ::= unary LETTER number? (id) *)
    +let parse_prototype =
    +  let rec parse_args accumulator = parser
    +    | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    +    | [< >] -> accumulator
    +  in
    +  let parse_operator = parser
    +    | [< 'Token.Unary >] -> "unary", 1
    +    | [< 'Token.Binary >] -> "binary", 2
    +  in
    +  let parse_binary_precedence = parser
    +    | [< 'Token.Number n >] -> int_of_float n
    +    | [< >] -> 30
    +  in
    +  parser
    +  | [< 'Token.Ident id;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +       args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      (* success. *)
    +      Ast.Prototype (id, Array.of_list (List.rev args))
    +  | [< (prefix, kind)=parse_operator;
    +       'Token.Kwd op ?? "expected an operator";
    +       (* Read the precedence if present. *)
    +       binary_precedence=parse_binary_precedence;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +        args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      let name = prefix ^ (String.make 1 op) in
    +      let args = Array.of_list (List.rev args) in
    +
    +      (* Verify right number of arguments for operator. *)
    +      if Array.length args <> kind
    +      then raise (Stream.Error "invalid number of operands for operator")
    +      else
    +        if kind = 1 then
    +          Ast.Prototype (name, args)
    +        else
    +          Ast.BinOpPrototype (name, args, binary_precedence)
    +  | [< >] ->
    +      raise (Stream.Error "expected function name in prototype")
    +
    +
    + +

    This is all fairly straightforward parsing code, and we have already seen a lot of similar code in the past. One interesting part about the code above is the couple of lines that set up name for binary operators. This builds names like "binary@" for a newly defined "@" operator. This then takes advantage of the fact that symbol names in the LLVM symbol table are allowed to have any character in them, including embedded nul characters.
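To see the naming and operand-count check in isolation, here is a minimal sketch in plain OCaml. It assumes a local copy of the proto type, and make_operator_proto is a hypothetical helper that mirrors only the tail of parse_prototype; it raises Failure where the parser raises Stream.Error.

```ocaml
type proto =
  | Prototype of string * string array
  | BinOpPrototype of string * string array * int

(* make_operator_proto (hypothetical helper): prefix is "unary" or "binary",
 * kind is the number of operands that prefix requires, op is the operator
 * character, and args are the argument names in source order. *)
let make_operator_proto prefix kind op precedence args =
  (* The function name is the prefix followed by the operator character. *)
  let name = prefix ^ String.make 1 op in
  let args = Array.of_list args in
  if Array.length args <> kind then
    failwith "invalid number of operands for operator"
  else if kind = 1 then Prototype (name, args)
  else BinOpPrototype (name, args, precedence)

let () =
  assert (make_operator_proto "binary" 2 '|' 5 ["LHS"; "RHS"]
          = BinOpPrototype ("binary|", [|"LHS"; "RHS"|], 5));
  (* Unary operators become ordinary prototypes; the precedence is dropped. *)
  assert (make_operator_proto "unary" 1 '!' 30 ["v"]
          = Prototype ("unary!", [|"v"|]))
```

Note that a unary operator ends up as a plain Prototype, so only the name distinguishes it from a regular function.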

    + +

    The next interesting thing to add is codegen support for these binary operators. Given our current structure, this is a simple addition of a default case for our existing binary operator node:

    + +
    +
    +let rec codegen_expr = function
    +  ...
    +  | Ast.Binary (op, lhs, rhs) ->
    +      let lhs_val = codegen_expr lhs in
    +      let rhs_val = codegen_expr rhs in
    +      begin
    +        match op with
    +        | '+' -> build_add lhs_val rhs_val "addtmp" builder
    +        | '-' -> build_sub lhs_val rhs_val "subtmp" builder
    +        | '*' -> build_mul lhs_val rhs_val "multmp" builder
    +        | '<' ->
    +            (* Convert bool 0/1 to double 0.0 or 1.0 *)
    +            let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
    +            build_uitofp i double_type "booltmp" builder
    +        | _ ->
    +            (* If it wasn't a builtin binary operator, it must be a user defined
    +             * one. Emit a call to it. *)
    +            let callee = "binary" ^ (String.make 1 op) in
    +            let callee =
    +              match lookup_function callee the_module with
    +              | Some callee -> callee
    +              | None -> raise (Error "binary operator not found!")
    +            in
    +            build_call callee [|lhs_val; rhs_val|] "binop" builder
    +      end
    +
    +
    + +

    As you can see above, the new code is actually really simple. It just does a lookup for the appropriate operator in the symbol table and generates a function call to it. Since user-defined operators are just built as normal functions (because the "prototype" boils down to a function with the right name), everything falls into place.
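The same dispatch can be modelled without LLVM. The sketch below is an assumption-laden stand-in, not the tutorial's code generator: it evaluates expressions directly instead of emitting IR, and a hypothetical Hashtbl named functions plays the role of the module's symbol table.

```ocaml
type expr =
  | Number of float
  | Binary of char * expr * expr

(* functions stands in for the LLVM module's symbol table (an assumption of
 * this sketch): it maps a function name to an OCaml implementation. *)
let functions : (string, float array -> float) Hashtbl.t = Hashtbl.create 10

let rec eval = function
  | Number n -> n
  | Binary (op, lhs, rhs) ->
      let l = eval lhs in
      let r = eval rhs in
      begin match op with
      | '+' -> l +. r
      | '-' -> l -. r
      | '*' -> l *. r
      | '<' -> if l < r then 1.0 else 0.0
      | _ ->
          (* Not a builtin operator: look up "binary<op>" and call it. *)
          let callee = "binary" ^ String.make 1 op in
          begin match Hashtbl.find_opt functions callee with
          | Some f -> f [| l; r |]
          | None -> failwith "binary operator not found!"
          end
      end

let () =
  (* Register '>' as a library function defined in terms of '<'. *)
  Hashtbl.replace functions "binary>"
    (fun args -> if args.(1) < args.(0) then 1.0 else 0.0);
  assert (eval (Binary ('>', Number 3.0, Number 2.0)) = 1.0);
  assert (eval (Binary ('+', Number 1.0, Number 2.0)) = 3.0)
```

The builtin cases never consult the table, so a user cannot redefine '+' in this model; only characters that fall through to the default case are looked up.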

    + +

    The final piece of code we are missing is a bit of top level magic:

    + +
    +
    +let codegen_func the_fpm = function
    +  | Ast.Function (proto, body) ->
    +      Hashtbl.clear named_values;
    +      let the_function = codegen_proto proto in
    +
    +      (* If this is an operator, install it. *)
    +      begin match proto with
    +      | Ast.BinOpPrototype (name, args, prec) ->
    +          let op = name.[String.length name - 1] in
    +          Hashtbl.add Parser.binop_precedence op prec;
    +      | _ -> ()
    +      end;
    +
    +      (* Create a new basic block to start insertion into. *)
    +      let bb = append_block context "entry" the_function in
    +      position_at_end bb builder;
    +      ...
    +
    +
    + +

    Basically, before generating code for a function, if it is a user-defined operator, we register it in the precedence table. This allows the binary operator parsing logic we already have in place to handle it. Since we are working on a fully-general operator precedence parser, this is all we need to do to "extend the grammar".
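The registration step is small enough to exercise on its own. This sketch reuses the binop_precedence table and precedence function from parser.ml; install_operator is a hypothetical name for the two lines inside codegen_func, and the precedence given to '<' is assumed from the earlier chapters.

```ocaml
(* Precedence table keyed by operator character, as in parser.ml. *)
let binop_precedence : (char, int) Hashtbl.t = Hashtbl.create 10

(* Unknown operators report -1, which the parser treats as "not a binop". *)
let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1

(* install_operator (hypothetical helper): the operator character is the
 * last character of a name such as "binary|". *)
let install_operator name prec =
  let op = name.[String.length name - 1] in
  Hashtbl.add binop_precedence op prec

let () =
  (* One builtin operator; the value 10 is assumed from earlier chapters. *)
  Hashtbl.add binop_precedence '<' 10;
  (* Before the definition is seen, '|' is not a binary operator. *)
  assert (precedence '|' = -1);
  install_operator "binary|" 5;
  assert (precedence '|' = 5)
```

Because the table is consulted at parse time, an operator only becomes usable in expressions that are parsed after its definition has been processed.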

    + +

    Now we have useful user-defined binary operators. This builds a lot on the previous framework we built for other operators. Adding unary operators is a bit more challenging, because we don't have any framework for it yet. Let's see what it takes.

    + +
    + + + + + +
    + +

    Since we don't currently support unary operators in the Kaleidoscope language, we'll need to add everything to support them. Above, we added simple support for the 'unary' keyword to the lexer. In addition to that, we need an AST node:

    + +
    +
    +type expr =
    +  ...
    +  (* variant for a unary operator. *)
    +  | Unary of char * expr
    +  ...
    +
    +
    + +

    This AST node is very simple and obvious by now. It directly mirrors the binary operator AST node, except that it only has one child. With this, we need to add the parsing logic. Parsing a unary operator is pretty simple; we'll add a new function to do it:

    + +
    +
    +(* unary
    + *   ::= primary
    + *   ::= '!' unary *)
    +and parse_unary = parser
    +  (* If this is a unary operator, read it. *)
    +  | [< 'Token.Kwd op when op <> '(' && op <> ')'; operand=parse_unary >] ->
    +      Ast.Unary (op, operand)
    +
    +  (* If the current token is not an operator, it must be a primary expr. *)
    +  | [< stream >] -> parse_primary stream
    +
    +
    + +

    The grammar we add is pretty straightforward here. If we see a unary operator when parsing a primary expression, we eat the operator as a prefix and parse the remaining piece as another unary operator. This allows us to handle multiple unary operators (e.g. "!!x"). Note that unary operators can't have ambiguous parses like binary operators can, so there is no need for precedence information.
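The recursion in this grammar can be tried on a plain token list, without camlp4 streams. The sketch below is a simplified model under stated assumptions: the token and expr types are cut-down local copies, parse_primary handles only numbers and identifiers, and each parser returns the remaining tokens alongside the AST.

```ocaml
type token = Kwd of char | Ident of string | Number of float

type expr =
  | Num of float
  | Variable of string
  | Unary of char * expr

(* parse_primary: only the number and identifier cases are modelled here. *)
let parse_primary = function
  | Number n :: rest -> (Num n, rest)
  | Ident id :: rest -> (Variable id, rest)
  | _ -> failwith "expected a primary expression"

(* unary
 *   ::= primary
 *   ::= op unary *)
let rec parse_unary = function
  (* Any operator character other than a parenthesis is a prefix operator. *)
  | Kwd op :: rest when op <> '(' && op <> ')' ->
      let operand, rest = parse_unary rest in
      (Unary (op, operand), rest)
  (* Otherwise it must be a primary expression. *)
  | tokens -> parse_primary tokens

let () =
  let ast, rest = parse_unary [Kwd '!'; Kwd '!'; Ident "x"] in
  assert (ast = Unary ('!', Unary ('!', Variable "x")));
  assert (rest = [])
```

Each prefix operator wraps whatever the recursive call returns, so "!!x" nests from the outside in and the operand of the innermost operator is always a primary expression.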

    + +

    The problem with this function is that we need to call parse_unary from somewhere. To do this, we change previous callers of parse_primary to call parse_unary instead:

    + +
    +
    +(* binoprhs
    + *   ::= ('+' unary)* *)
    +and parse_bin_rhs expr_prec lhs stream =
    +        ...
    +        (* Parse the unary expression after the binary operator. *)
    +        let rhs = parse_unary stream in
    +        ...
    +
    +...
    +
    +(* expression
    + *   ::= unary binoprhs *)
    +and parse_expr = parser
    +  | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
    +
    +
    + +

    With these two simple changes, we are now able to parse unary operators and build the AST for them. Next up, we need to add parser support for prototypes, to parse the unary operator prototype. We extend the binary operator code above with:

    + +
    +
    +(* prototype
    + *   ::= id '(' id* ')'
    + *   ::= binary LETTER number? (id, id)
    + *   ::= unary LETTER number? (id) *)
    +let parse_prototype =
    +  let rec parse_args accumulator = parser
    +    | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    +    | [< >] -> accumulator
    +  in
    +  let parse_operator = parser
    +    | [< 'Token.Unary >] -> "unary", 1
    +    | [< 'Token.Binary >] -> "binary", 2
    +  in
    +  let parse_binary_precedence = parser
    +    | [< 'Token.Number n >] -> int_of_float n
    +    | [< >] -> 30
    +  in
    +  parser
    +  | [< 'Token.Ident id;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +       args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      (* success. *)
    +      Ast.Prototype (id, Array.of_list (List.rev args))
    +  | [< (prefix, kind)=parse_operator;
    +       'Token.Kwd op ?? "expected an operator";
    +       (* Read the precedence if present. *)
    +       binary_precedence=parse_binary_precedence;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +        args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      let name = prefix ^ (String.make 1 op) in
    +      let args = Array.of_list (List.rev args) in
    +
    +      (* Verify right number of arguments for operator. *)
    +      if Array.length args <> kind
    +      then raise (Stream.Error "invalid number of operands for operator")
    +      else
    +        if kind = 1 then
    +          Ast.Prototype (name, args)
    +        else
    +          Ast.BinOpPrototype (name, args, binary_precedence)
    +  | [< >] ->
    +      raise (Stream.Error "expected function name in prototype")
    +
    +
    + +

    As with binary operators, we name unary operators with a name that includes the operator character. This assists us at code generation time. Speaking of, the final piece we need to add is codegen support for unary operators. It looks like this:

    + +
    +
    +let rec codegen_expr = function
    +  ...
    +  | Ast.Unary (op, operand) ->
    +      let operand = codegen_expr operand in
    +      let callee = "unary" ^ (String.make 1 op) in
    +      let callee =
    +        match lookup_function callee the_module with
    +        | Some callee -> callee
    +        | None -> raise (Error "unknown unary operator")
    +      in
    +      build_call callee [|operand|] "unop" builder
    +
    +
    + +

    This code is similar to, but simpler than, the code for binary operators. It is simpler primarily because it doesn't need to handle any predefined operators.

    + +
    + + + + + +
    + +

    It is somewhat hard to believe, but with a few simple extensions we've covered in the last chapters, we have grown a real-ish language. With this, we can do a lot of interesting things, including I/O, math, and a bunch of other things. For example, we can now add a nice sequencing operator (printd is defined to print out the specified value and a newline):

    + +
    +
    +ready> extern printd(x);
    +Read extern: declare double @printd(double)
    +ready> def binary : 1 (x y) 0;  # Low-precedence operator that ignores operands.
    +..
    +ready> printd(123) : printd(456) : printd(789);
    +123.000000
    +456.000000
    +789.000000
    +Evaluated to 0.000000
    +
    +
    + +

    We can also define a bunch of other "primitive" operations, such as:

    + +
    +
    +# Logical unary not.
    +def unary!(v)
    +  if v then
    +    0
    +  else
    +    1;
    +
    +# Unary negate.
    +def unary-(v)
    +  0-v;
    +
    +# Define > with the same precedence as <.
    +def binary> 10 (LHS RHS)
    +  RHS < LHS;
    +
    +# Binary logical or, which does not short circuit.
    +def binary| 5 (LHS RHS)
    +  if LHS then
    +    1
    +  else if RHS then
    +    1
    +  else
    +    0;
    +
    +# Binary logical and, which does not short circuit.
    +def binary& 6 (LHS RHS)
    +  if !LHS then
    +    0
    +  else
    +    !!RHS;
    +
    +# Define = with slightly lower precedence than relationals.
    +def binary = 9 (LHS RHS)
    +  !(LHS < RHS | LHS > RHS);
    +
    +
    +
    + + +

    Given the previous if/then/else support, we can also define interesting functions for I/O. For example, the following prints out a character whose "density" reflects the value passed in: the lower the value, the denser the character:

    + +
    +
    +ready>
    +
    +extern putchard(char)
    +def printdensity(d)
    +  if d > 8 then
    +    putchard(32)  # ' '
    +  else if d > 4 then
    +    putchard(46)  # '.'
    +  else if d > 2 then
    +    putchard(43)  # '+'
    +  else
    +    putchard(42); # '*'
    +...
    +ready> printdensity(1): printdensity(2): printdensity(3) :
    +          printdensity(4): printdensity(5): printdensity(9): putchard(10);
    +*++..
    +Evaluated to 0.000000
    +
    +
    + +

    Based on these simple primitive operations, we can start to define more interesting things. For example, here's a little function that solves for the number of iterations it takes a function in the complex plane to diverge:

    + +
    +
    +# determine whether the specific location diverges.
    +# Solve for z = z^2 + c in the complex plane.
    +def mandleconverger(real imag iters creal cimag)
    +  if iters > 255 | (real*real + imag*imag > 4) then
    +    iters
    +  else
    +    mandleconverger(real*real - imag*imag + creal,
    +                    2*real*imag + cimag,
    +                    iters+1, creal, cimag);
    +
    +# return the number of iterations required for the iteration to escape
    +def mandleconverge(real imag)
    +  mandleconverger(real, imag, 0, real, imag);
    +
    +
    + +

    This "z = z^2 + c" function is a beautiful little creature that is the basis for computation of the Mandelbrot Set. Our mandleconverge function returns the number of iterations that it takes for a complex orbit to escape, saturating to 255. This is not a very useful function by itself, but if you plot its value over a two-dimensional plane, you can see the Mandelbrot set. Given that we are limited to using putchard here, our amazing graphical output is limited, but we can whip together something using the density plotter above:

    + +
    +
    +# compute and plot the mandelbrot set with the specified 2 dimensional range
    +# info.
    +def mandelhelp(xmin xmax xstep   ymin ymax ystep)
    +  for y = ymin, y < ymax, ystep in (
    +    (for x = xmin, x < xmax, xstep in
    +       printdensity(mandleconverge(x,y)))
    +    : putchard(10)
    +  )
    +
    +# mandel - This is a convenient helper function for plotting the mandelbrot set
    +# from the specified position with the specified Magnification.
    +def mandel(realstart imagstart realmag imagmag)
    +  mandelhelp(realstart, realstart+realmag*78, realmag,
    +             imagstart, imagstart+imagmag*40, imagmag);
    +
    +
    + +
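For comparison, the escape-time iteration can be written directly in OCaml. This is a sketch for checking the Kaleidoscope version by hand, not part of the tutorial's code; it keeps the same function names and argument order, uses OCaml's || in place of the user-defined | operator, and counts iterations with an int rather than a double.

```ocaml
(* Iterate z = z^2 + c, starting from z = c, until the orbit leaves the
 * circle of radius 2 or the iteration count exceeds 255. *)
let rec mandleconverger real imag iters creal cimag =
  if iters > 255 || real *. real +. imag *. imag > 4.0 then iters
  else
    mandleconverger
      (real *. real -. imag *. imag +. creal)
      (2.0 *. real *. imag +. cimag)
      (iters + 1) creal cimag

(* Number of iterations before the orbit of the point (real, imag) escapes. *)
let mandleconverge real imag = mandleconverger real imag 0 real imag

let () =
  (* The origin never escapes, so the count runs up to the cap. *)
  assert (mandleconverge 0.0 0.0 = 256);
  (* A point far outside the set is rejected before any iteration. *)
  assert (mandleconverge 2.0 2.0 = 0)
```

Points inside the set all return the same capped count, which printdensity maps to a space; that is why the set itself shows up as the blank region in the plots.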

    Given this, we can try plotting out the Mandelbrot set! Let's try it out:

    + +
    +
    +ready> mandel(-2.3, -1.3, 0.05, 0.07);
    +*******************************+++++++++++*************************************
    +*************************+++++++++++++++++++++++*******************************
    +**********************+++++++++++++++++++++++++++++****************************
    +*******************+++++++++++++++++++++.. ...++++++++*************************
    +*****************++++++++++++++++++++++.... ...+++++++++***********************
    +***************+++++++++++++++++++++++.....   ...+++++++++*********************
    +**************+++++++++++++++++++++++....     ....+++++++++********************
    +*************++++++++++++++++++++++......      .....++++++++*******************
    +************+++++++++++++++++++++.......       .......+++++++******************
    +***********+++++++++++++++++++....                ... .+++++++*****************
    +**********+++++++++++++++++.......                     .+++++++****************
    +*********++++++++++++++...........                    ...+++++++***************
    +********++++++++++++............                      ...++++++++**************
    +********++++++++++... ..........                        .++++++++**************
    +*******+++++++++.....                                   .+++++++++*************
    +*******++++++++......                                  ..+++++++++*************
    +*******++++++.......                                   ..+++++++++*************
    +*******+++++......                                     ..+++++++++*************
    +*******.... ....                                      ...+++++++++*************
    +*******.... .                                         ...+++++++++*************
    +*******+++++......                                    ...+++++++++*************
    +*******++++++.......                                   ..+++++++++*************
    +*******++++++++......                                   .+++++++++*************
    +*******+++++++++.....                                  ..+++++++++*************
    +********++++++++++... ..........                        .++++++++**************
    +********++++++++++++............                      ...++++++++**************
    +*********++++++++++++++..........                     ...+++++++***************
    +**********++++++++++++++++........                     .+++++++****************
    +**********++++++++++++++++++++....                ... ..+++++++****************
    +***********++++++++++++++++++++++.......       .......++++++++*****************
    +************+++++++++++++++++++++++......      ......++++++++******************
    +**************+++++++++++++++++++++++....      ....++++++++********************
    +***************+++++++++++++++++++++++.....   ...+++++++++*********************
    +*****************++++++++++++++++++++++....  ...++++++++***********************
    +*******************+++++++++++++++++++++......++++++++*************************
    +*********************++++++++++++++++++++++.++++++++***************************
    +*************************+++++++++++++++++++++++*******************************
    +******************************+++++++++++++************************************
    +*******************************************************************************
    +*******************************************************************************
    +*******************************************************************************
    +Evaluated to 0.000000
    +ready> mandel(-2, -1, 0.02, 0.04);
    +**************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
    +***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    +*********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
    +*******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
    +*****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
    +***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
    +**************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
    +************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
    +***********++++++++++++++++++++++++++++++++++++++++++++++++++........        .
    +**********++++++++++++++++++++++++++++++++++++++++++++++.............
    +********+++++++++++++++++++++++++++++++++++++++++++..................
    +*******+++++++++++++++++++++++++++++++++++++++.......................
    +******+++++++++++++++++++++++++++++++++++...........................
    +*****++++++++++++++++++++++++++++++++............................
    +*****++++++++++++++++++++++++++++...............................
    +****++++++++++++++++++++++++++......   .........................
    +***++++++++++++++++++++++++.........     ......    ...........
    +***++++++++++++++++++++++............
    +**+++++++++++++++++++++..............
    +**+++++++++++++++++++................
    +*++++++++++++++++++.................
    +*++++++++++++++++............ ...
    +*++++++++++++++..............
    +*+++....++++................
    +*..........  ...........
    +*
    +*..........  ...........
    +*+++....++++................
    +*++++++++++++++..............
    +*++++++++++++++++............ ...
    +*++++++++++++++++++.................
    +**+++++++++++++++++++................
    +**+++++++++++++++++++++..............
    +***++++++++++++++++++++++............
    +***++++++++++++++++++++++++.........     ......    ...........
    +****++++++++++++++++++++++++++......   .........................
    +*****++++++++++++++++++++++++++++...............................
    +*****++++++++++++++++++++++++++++++++............................
    +******+++++++++++++++++++++++++++++++++++...........................
    +*******+++++++++++++++++++++++++++++++++++++++.......................
    +********+++++++++++++++++++++++++++++++++++++++++++..................
    +Evaluated to 0.000000
    +ready> mandel(-0.9, -1.4, 0.02, 0.03);
    +*******************************************************************************
    +*******************************************************************************
    +*******************************************************************************
    +**********+++++++++++++++++++++************************************************
    +*+++++++++++++++++++++++++++++++++++++++***************************************
    ++++++++++++++++++++++++++++++++++++++++++++++**********************************
    +++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
    ++++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
    ++++++++++++++++++++++++++++++++....   ......+++++++++++++++++++****************
    ++++++++++++++++++++++++++++++.......  ........+++++++++++++++++++**************
    +++++++++++++++++++++++++++++........   ........++++++++++++++++++++************
    ++++++++++++++++++++++++++++.........     ..  ...+++++++++++++++++++++**********
    +++++++++++++++++++++++++++...........        ....++++++++++++++++++++++********
    +++++++++++++++++++++++++.............       .......++++++++++++++++++++++******
    ++++++++++++++++++++++++.............        ........+++++++++++++++++++++++****
    +++++++++++++++++++++++...........           ..........++++++++++++++++++++++***
    +++++++++++++++++++++...........                .........++++++++++++++++++++++*
    +++++++++++++++++++............                  ...........++++++++++++++++++++
    +++++++++++++++++...............                 .............++++++++++++++++++
    +++++++++++++++.................                 ...............++++++++++++++++
    +++++++++++++..................                  .................++++++++++++++
    ++++++++++..................                      .................+++++++++++++
    +++++++........        .                               .........  ..++++++++++++
    +++............                                         ......    ....++++++++++
    +..............                                                    ...++++++++++
    +..............                                                    ....+++++++++
    +..............                                                    .....++++++++
    +.............                                                    ......++++++++
    +...........                                                     .......++++++++
    +.........                                                       ........+++++++
    +.........                                                       ........+++++++
    +.........                                                           ....+++++++
    +........                                                             ...+++++++
    +.......                                                              ...+++++++
    +                                                                    ....+++++++
    +                                                                   .....+++++++
    +                                                                    ....+++++++
    +                                                                    ....+++++++
    +                                                                    ....+++++++
    +Evaluated to 0.000000
    +ready> ^D
    +
    +
    + +

    At this point, you may be starting to realize that Kaleidoscope is a real and powerful language. It may not be self-similar :), but it can be used to plot things that are!

    + +

    With this, we conclude the "adding user-defined operators" chapter of the tutorial. We have successfully augmented our language, adding the ability to extend the language in the library, and we have shown how this can be used to build a simple but interesting end-user application in Kaleidoscope. At this point, Kaleidoscope can build a variety of applications that are functional and can call functions with side-effects, but it can't actually define and mutate a variable itself.

    + +

    Strikingly, variable mutation is an important feature of some languages, and it is not at all obvious how to add support for mutable variables without having to add an "SSA construction" phase to your front-end. In the next chapter, we will describe how you can add variable mutation without building SSA in your front-end.

    + +
    + + + + + + +
    + +

    Here is the complete code listing for our running example, enhanced with the support for user-defined operators. To build this example, use:

    + +
    +
    +# Compile
    +ocamlbuild toy.byte
    +# Run
    +./toy.byte
    +
    +
    + +

    Here is the code:

    + +
    +
    _tags:
    +
    +
    +<{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
    +<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
    +<*.{byte,native}>: use_llvm_executionengine, use_llvm_target
    +<*.{byte,native}>: use_llvm_scalar_opts, use_bindings
    +
    +
    + +
    myocamlbuild.ml:
    +
    +
    +open Ocamlbuild_plugin;;
    +
    +ocaml_lib ~extern:true "llvm";;
    +ocaml_lib ~extern:true "llvm_analysis";;
    +ocaml_lib ~extern:true "llvm_executionengine";;
    +ocaml_lib ~extern:true "llvm_target";;
    +ocaml_lib ~extern:true "llvm_scalar_opts";;
    +
    +flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);;
    +dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
    +
    +
    + +
    token.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer Tokens
    + *===----------------------------------------------------------------------===*)
    +
    +(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
    + * these others for known things. *)
    +type token =
    +  (* commands *)
    +  | Def | Extern
    +
    +  (* primary *)
    +  | Ident of string | Number of float
    +
    +  (* unknown *)
    +  | Kwd of char
    +
    +  (* control *)
    +  | If | Then | Else
    +  | For | In
    +
    +  (* operators *)
    +  | Binary | Unary
    +
    +
    + +
    lexer.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer
    + *===----------------------------------------------------------------------===*)
    +
    +let rec lex = parser
    +  (* Skip any whitespace. *)
    +  | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
    +
    +  (* identifier: [a-zA-Z][a-zA-Z0-9]* *)
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +
    +  (* number: [0-9.]+ *)
    +  | [< ' ('0' .. '9' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +
    +  (* Comment until end of line. *)
    +  | [< ' ('#'); stream >] ->
    +      lex_comment stream
    +
    +  (* Otherwise, just return the character as its ascii value. *)
    +  | [< 'c; stream >] ->
    +      [< 'Token.Kwd c; lex stream >]
    +
    +  (* end of stream. *)
    +  | [< >] -> [< >]
    +
    +and lex_number buffer = parser
    +  | [< ' ('0' .. '9' | '.' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +  | [< stream=lex >] ->
    +      [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
    +
    +and lex_ident buffer = parser
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +  | [< stream=lex >] ->
    +      match Buffer.contents buffer with
    +      | "def" -> [< 'Token.Def; stream >]
    +      | "extern" -> [< 'Token.Extern; stream >]
    +      | "if" -> [< 'Token.If; stream >]
    +      | "then" -> [< 'Token.Then; stream >]
    +      | "else" -> [< 'Token.Else; stream >]
    +      | "for" -> [< 'Token.For; stream >]
    +      | "in" -> [< 'Token.In; stream >]
    +      | "binary" -> [< 'Token.Binary; stream >]
    +      | "unary" -> [< 'Token.Unary; stream >]
    +      | id -> [< 'Token.Ident id; stream >]
    +
    +and lex_comment = parser
    +  | [< ' ('\n'); stream=lex >] -> stream
    +  | [< 'c; e=lex_comment >] -> e
    +  | [< >] -> [< >]
    +
    +
    + +
    ast.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Abstract Syntax Tree (aka Parse Tree)
    + *===----------------------------------------------------------------------===*)
    +
    +(* expr - Base type for all expression nodes. *)
    +type expr =
    +  (* variant for numeric literals like "1.0". *)
    +  | Number of float
    +
    +  (* variant for referencing a variable, like "a". *)
    +  | Variable of string
    +
    +  (* variant for a unary operator. *)
    +  | Unary of char * expr
    +
    +  (* variant for a binary operator. *)
    +  | Binary of char * expr * expr
    +
    +  (* variant for function calls. *)
    +  | Call of string * expr array
    +
    +  (* variant for if/then/else. *)
    +  | If of expr * expr * expr
    +
    +  (* variant for for/in. *)
    +  | For of string * expr * expr * expr option * expr
    +
    +(* proto - This type represents the "prototype" for a function, which captures
    + * its name, and its argument names (thus implicitly the number of arguments the
    + * function takes). *)
    +type proto =
    +  | Prototype of string * string array
    +  | BinOpPrototype of string * string array * int
    +
    +(* func - This type represents a function definition itself. *)
    +type func = Function of proto * expr
    +
    +
    + +
    parser.ml:
    +
    +
    +(*===---------------------------------------------------------------------===
    + * Parser
    + *===---------------------------------------------------------------------===*)
    +
    +(* binop_precedence - This holds the precedence for each binary operator that is
    + * defined *)
    +let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
    +
    +(* precedence - Get the precedence of the pending binary operator token. *)
    +let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
    +
    +(* primary
    + *   ::= identifier
    + *   ::= numberexpr
    + *   ::= parenexpr
    + *   ::= ifexpr
    + *   ::= forexpr *)
    +let rec parse_primary = parser
    +  (* numberexpr ::= number *)
    +  | [< 'Token.Number n >] -> Ast.Number n
    +
    +  (* parenexpr ::= '(' expression ')' *)
    +  | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
    +
    +  (* identifierexpr
    +   *   ::= identifier
    +   *   ::= identifier '(' argumentexpr ')' *)
    +  | [< 'Token.Ident id; stream >] ->
    +      let rec parse_args accumulator = parser
    +        | [< e=parse_expr; stream >] ->
    +            begin parser
    +              | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
    +              | [< >] -> e :: accumulator
    +            end stream
    +        | [< >] -> accumulator
    +      in
    +      let rec parse_ident id = parser
    +        (* Call. *)
    +        | [< 'Token.Kwd '(';
    +             args=parse_args [];
    +             'Token.Kwd ')' ?? "expected ')'">] ->
    +            Ast.Call (id, Array.of_list (List.rev args))
    +
    +        (* Simple variable ref. *)
    +        | [< >] -> Ast.Variable id
    +      in
    +      parse_ident id stream
    +
    +  (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
    +  | [< 'Token.If; c=parse_expr;
    +       'Token.Then ?? "expected 'then'"; t=parse_expr;
    +       'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
    +      Ast.If (c, t, e)
    +
    +  (* forexpr
    +        ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
    +  | [< 'Token.For;
    +       'Token.Ident id ?? "expected identifier after for";
    +       'Token.Kwd '=' ?? "expected '=' after for";
    +       stream >] ->
    +      begin parser
    +        | [<
    +             start=parse_expr;
    +             'Token.Kwd ',' ?? "expected ',' after for";
    +             end_=parse_expr;
    +             stream >] ->
    +            let step =
    +              begin parser
    +              | [< 'Token.Kwd ','; step=parse_expr >] -> Some step
    +              | [< >] -> None
    +              end stream
    +            in
    +            begin parser
    +            | [< 'Token.In; body=parse_expr >] ->
    +                Ast.For (id, start, end_, step, body)
    +            | [< >] ->
    +                raise (Stream.Error "expected 'in' after for")
    +            end stream
    +        | [< >] ->
    +            raise (Stream.Error "expected '=' after for")
    +      end stream
    +
    +  | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
    +
    +(* unary
    + *   ::= primary
    + *   ::= '!' unary *)
    +and parse_unary = parser
    +  (* If this is a unary operator, read it. *)
    +  | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_unary >] ->
    +      Ast.Unary (op, operand)
    +
    +  (* If the current token is not an operator, it must be a primary expr. *)
    +  | [< stream >] -> parse_primary stream
    +
    +(* binoprhs
    + *   ::= ('+' unary)* *)
    +and parse_bin_rhs expr_prec lhs stream =
    +  match Stream.peek stream with
    +  (* If this is a binop, find its precedence. *)
    +  | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
    +      let token_prec = precedence c in
    +
    +      (* If this is a binop that binds at least as tightly as the current binop,
    +       * consume it, otherwise we are done. *)
    +      if token_prec < expr_prec then lhs else begin
    +        (* Eat the binop. *)
    +        Stream.junk stream;
    +
    +        (* Parse the unary expression after the binary operator. *)
    +        let rhs = parse_unary stream in
    +
    +        (* Okay, we know this is a binop. *)
    +        let rhs =
    +          match Stream.peek stream with
    +          | Some (Token.Kwd c2) ->
    +              (* If BinOp binds less tightly with rhs than the operator after
    +               * rhs, let the pending operator take rhs as its lhs. *)
    +              let next_prec = precedence c2 in
    +              if token_prec < next_prec
    +              then parse_bin_rhs (token_prec + 1) rhs stream
    +              else rhs
    +          | _ -> rhs
    +        in
    +
    +        (* Merge lhs/rhs. *)
    +        let lhs = Ast.Binary (c, lhs, rhs) in
    +        parse_bin_rhs expr_prec lhs stream
    +      end
    +  | _ -> lhs
    +
    +(* expression
    + *   ::= unary binoprhs *)
    +and parse_expr = parser
    +  | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
    +
    +(* prototype
    + *   ::= id '(' id* ')'
    + *   ::= binary LETTER number? (id, id)
    + *   ::= unary LETTER number? (id) *)
    +let parse_prototype =
    +  let rec parse_args accumulator = parser
    +    | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    +    | [< >] -> accumulator
    +  in
    +  let parse_operator = parser
    +    | [< 'Token.Unary >] -> "unary", 1
    +    | [< 'Token.Binary >] -> "binary", 2
    +  in
    +  let parse_binary_precedence = parser
    +    | [< 'Token.Number n >] -> int_of_float n
    +    | [< >] -> 30
    +  in
    +  parser
    +  | [< 'Token.Ident id;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +       args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      (* success. *)
    +      Ast.Prototype (id, Array.of_list (List.rev args))
    +  | [< (prefix, kind)=parse_operator;
    +       'Token.Kwd op ?? "expected an operator";
    +       (* Read the precedence if present. *)
    +       binary_precedence=parse_binary_precedence;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +        args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      let name = prefix ^ (String.make 1 op) in
    +      let args = Array.of_list (List.rev args) in
    +
    +      (* Verify right number of arguments for operator. *)
    +      if Array.length args != kind
    +      then raise (Stream.Error "invalid number of operands for operator")
    +      else
    +        if kind == 1 then
    +          Ast.Prototype (name, args)
    +        else
    +          Ast.BinOpPrototype (name, args, binary_precedence)
    +  | [< >] ->
    +      raise (Stream.Error "expected function name in prototype")
    +
    +(* definition ::= 'def' prototype expression *)
    +let parse_definition = parser
    +  | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
    +      Ast.Function (p, e)
    +
    +(* toplevelexpr ::= expression *)
    +let parse_toplevel = parser
    +  | [< e=parse_expr >] ->
    +      (* Make an anonymous proto. *)
    +      Ast.Function (Ast.Prototype ("", [||]), e)
    +
    +(*  external ::= 'extern' prototype *)
    +let parse_extern = parser
    +  | [< 'Token.Extern; e=parse_prototype >] -> e
    +
    +
    + +
    codegen.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Code Generation
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +
    +exception Error of string
    +
    +let context = global_context ()
    +let the_module = create_module context "my cool jit"
    +let builder = builder context
    +let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
    +let double_type = double_type context
    +
    +let rec codegen_expr = function
    +  | Ast.Number n -> const_float double_type n
    +  | Ast.Variable name ->
    +      (try Hashtbl.find named_values name with
    +        | Not_found -> raise (Error "unknown variable name"))
    +  | Ast.Unary (op, operand) ->
    +      let operand = codegen_expr operand in
    +      let callee = "unary" ^ (String.make 1 op) in
    +      let callee =
    +        match lookup_function callee the_module with
    +        | Some callee -> callee
    +        | None -> raise (Error "unknown unary operator")
    +      in
    +      build_call callee [|operand|] "unop" builder
    +  | Ast.Binary (op, lhs, rhs) ->
    +      let lhs_val = codegen_expr lhs in
    +      let rhs_val = codegen_expr rhs in
    +      begin
    +        match op with
    +        | '+' -> build_add lhs_val rhs_val "addtmp" builder
    +        | '-' -> build_sub lhs_val rhs_val "subtmp" builder
    +        | '*' -> build_mul lhs_val rhs_val "multmp" builder
    +        | '<' ->
    +            (* Convert bool 0/1 to double 0.0 or 1.0 *)
    +            let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
    +            build_uitofp i double_type "booltmp" builder
    +        | _ ->
    +            (* If it wasn't a builtin binary operator, it must be a user defined
    +             * one. Emit a call to it. *)
    +            let callee = "binary" ^ (String.make 1 op) in
    +            let callee =
    +              match lookup_function callee the_module with
    +              | Some callee -> callee
    +              | None -> raise (Error "binary operator not found!")
    +            in
    +            build_call callee [|lhs_val; rhs_val|] "binop" builder
    +      end
    +  | Ast.Call (callee, args) ->
    +      (* Look up the name in the module table. *)
    +      let callee =
    +        match lookup_function callee the_module with
    +        | Some callee -> callee
    +        | None -> raise (Error "unknown function referenced")
    +      in
    +      let params = params callee in
    +
    +      (* Reject the call if the argument count does not match. *)
    +      if Array.length params == Array.length args then () else
    +        raise (Error "incorrect # arguments passed");
    +      let args = Array.map codegen_expr args in
    +      build_call callee args "calltmp" builder
    +  | Ast.If (cond, then_, else_) ->
    +      let cond = codegen_expr cond in
    +
    +      (* Convert condition to a bool by comparing not-equal to 0.0 *)
    +      let zero = const_float double_type 0.0 in
    +      let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
    +
    +      (* Grab the first block so that we might later add the conditional branch
    +       * to it at the end of the function. *)
    +      let start_bb = insertion_block builder in
    +      let the_function = block_parent start_bb in
    +
    +      let then_bb = append_block context "then" the_function in
    +
    +      (* Emit 'then' value. *)
    +      position_at_end then_bb builder;
    +      let then_val = codegen_expr then_ in
    +
    +      (* Codegen of 'then' can change the current block, update then_bb for the
    +       * phi. We create a new name because one is used for the phi node, and the
    +       * other is used for the conditional branch. *)
    +      let new_then_bb = insertion_block builder in
    +
    +      (* Emit 'else' value. *)
    +      let else_bb = append_block context "else" the_function in
    +      position_at_end else_bb builder;
    +      let else_val = codegen_expr else_ in
    +
    +      (* Codegen of 'else' can change the current block, update else_bb for the
    +       * phi. *)
    +      let new_else_bb = insertion_block builder in
    +
    +      (* Emit merge block. *)
    +      let merge_bb = append_block context "ifcont" the_function in
    +      position_at_end merge_bb builder;
    +      let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
    +      let phi = build_phi incoming "iftmp" builder in
    +
    +      (* Return to the start block to add the conditional branch. *)
    +      position_at_end start_bb builder;
    +      ignore (build_cond_br cond_val then_bb else_bb builder);
    +
    +      (* Set an unconditional branch at the end of the 'then' block and the
    +       * 'else' block to the 'merge' block. *)
    +      position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
    +      position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
    +
    +      (* Finally, set the builder to the end of the merge block. *)
    +      position_at_end merge_bb builder;
    +
    +      phi
    +  | Ast.For (var_name, start, end_, step, body) ->
    +      (* Emit the start code first, without 'variable' in scope. *)
    +      let start_val = codegen_expr start in
    +
    +      (* Make the new basic block for the loop header, inserting after current
    +       * block. *)
    +      let preheader_bb = insertion_block builder in
    +      let the_function = block_parent preheader_bb in
    +      let loop_bb = append_block context "loop" the_function in
    +
    +      (* Insert an explicit fall through from the current block to the
    +       * loop_bb. *)
    +      ignore (build_br loop_bb builder);
    +
    +      (* Start insertion in loop_bb. *)
    +      position_at_end loop_bb builder;
    +
    +      (* Start the PHI node with an entry for start. *)
    +      let variable = build_phi [(start_val, preheader_bb)] var_name builder in
    +
    +      (* Within the loop, the variable is defined equal to the PHI node. If it
    +       * shadows an existing variable, we have to restore it, so save it
    +       * now. *)
    +      let old_val =
    +        try Some (Hashtbl.find named_values var_name) with Not_found -> None
    +      in
    +      Hashtbl.add named_values var_name variable;
    +
    +      (* Emit the body of the loop.  This, like any other expr, can change the
    +       * current BB.  Note that we ignore the value computed by the body, but
    +       * don't allow an error *)
    +      ignore (codegen_expr body);
    +
    +      (* Emit the step value. *)
    +      let step_val =
    +        match step with
    +        | Some step -> codegen_expr step
    +        (* If not specified, use 1.0. *)
    +        | None -> const_float double_type 1.0
    +      in
    +
    +      let next_var = build_add variable step_val "nextvar" builder in
    +
    +      (* Compute the end condition. *)
    +      let end_cond = codegen_expr end_ in
    +
    +      (* Convert condition to a bool by comparing not-equal to 0.0. *)
    +      let zero = const_float double_type 0.0 in
    +      let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
    +
    +      (* Create the "after loop" block and insert it. *)
    +      let loop_end_bb = insertion_block builder in
    +      let after_bb = append_block context "afterloop" the_function in
    +
    +      (* Insert the conditional branch into the end of loop_end_bb. *)
    +      ignore (build_cond_br end_cond loop_bb after_bb builder);
    +
    +      (* Any new code will be inserted in after_bb. *)
    +      position_at_end after_bb builder;
    +
    +      (* Add a new entry to the PHI node for the backedge. *)
    +      add_incoming (next_var, loop_end_bb) variable;
    +
    +      (* Restore the unshadowed variable. *)
    +      begin match old_val with
    +      | Some old_val -> Hashtbl.add named_values var_name old_val
    +      | None -> ()
    +      end;
    +
    +      (* for expr always returns 0.0. *)
    +      const_null double_type
    +
    +let codegen_proto = function
    +  | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) ->
    +      (* Make the function type: double(double,double) etc. *)
    +      let doubles = Array.make (Array.length args) double_type in
    +      let ft = function_type double_type doubles in
    +      let f =
    +        match lookup_function name the_module with
    +        | None -> declare_function name ft the_module
    +
    +        (* If 'f' conflicted, there was already something named 'name'. If it
    +         * has a body, don't allow redefinition or reextern. *)
    +        | Some f ->
    +            (* If 'f' already has a body, reject this. *)
    +            if block_begin f <> At_end f then
    +              raise (Error "redefinition of function");
    +
    +            (* If 'f' took a different number of arguments, reject. *)
    +            if element_type (type_of f) <> ft then
    +              raise (Error "redefinition of function with different # args");
    +            f
    +      in
    +
    +      (* Set names for all arguments. *)
    +      Array.iteri (fun i a ->
    +        let n = args.(i) in
    +        set_value_name n a;
    +        Hashtbl.add named_values n a;
    +      ) (params f);
    +      f
    +
    +let codegen_func the_fpm = function
    +  | Ast.Function (proto, body) ->
    +      Hashtbl.clear named_values;
    +      let the_function = codegen_proto proto in
    +
    +      (* If this is an operator, install it. *)
    +      begin match proto with
    +      | Ast.BinOpPrototype (name, args, prec) ->
    +          let op = name.[String.length name - 1] in
    +          Hashtbl.add Parser.binop_precedence op prec;
    +      | _ -> ()
    +      end;
    +
    +      (* Create a new basic block to start insertion into. *)
    +      let bb = append_block context "entry" the_function in
    +      position_at_end bb builder;
    +
    +      try
    +        let ret_val = codegen_expr body in
    +
    +        (* Finish off the function. *)
    +        let _ = build_ret ret_val builder in
    +
    +        (* Validate the generated code, checking for consistency. *)
    +        Llvm_analysis.assert_valid_function the_function;
    +
    +        (* Optimize the function. *)
    +        let _ = PassManager.run_function the_function the_fpm in
    +
    +        the_function
    +      with e ->
    +        delete_function the_function;
    +        raise e
    +
    +
    + +
    toplevel.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Top-Level parsing and JIT Driver
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +open Llvm_executionengine
    +
    +(* top ::= definition | external | expression | ';' *)
    +let rec main_loop the_fpm the_execution_engine stream =
    +  match Stream.peek stream with
    +  | None -> ()
    +
    +  (* ignore top-level semicolons. *)
    +  | Some (Token.Kwd ';') ->
    +      Stream.junk stream;
    +      main_loop the_fpm the_execution_engine stream
    +
    +  | Some token ->
    +      begin
    +        try match token with
    +        | Token.Def ->
    +            let e = Parser.parse_definition stream in
    +            print_endline "parsed a function definition.";
    +            dump_value (Codegen.codegen_func the_fpm e);
    +        | Token.Extern ->
    +            let e = Parser.parse_extern stream in
    +            print_endline "parsed an extern.";
    +            dump_value (Codegen.codegen_proto e);
    +        | _ ->
    +            (* Evaluate a top-level expression into an anonymous function. *)
    +            let e = Parser.parse_toplevel stream in
    +            print_endline "parsed a top-level expr";
    +            let the_function = Codegen.codegen_func the_fpm e in
    +            dump_value the_function;
    +
    +            (* JIT the function, returning a function pointer. *)
    +            let result = ExecutionEngine.run_function the_function [||]
    +              the_execution_engine in
    +
    +            print_string "Evaluated to ";
    +            print_float (GenericValue.as_float Codegen.double_type result);
    +            print_newline ();
    +        with Stream.Error s | Codegen.Error s ->
    +          (* Skip token for error recovery. *)
    +          Stream.junk stream;
    +          print_endline s;
    +      end;
    +      print_string "ready> "; flush stdout;
    +      main_loop the_fpm the_execution_engine stream
    +
    +
    + +
    toy.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Main driver code.
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +open Llvm_executionengine
    +open Llvm_target
    +open Llvm_scalar_opts
    +
    +let main () =
    +  ignore (initialize_native_target ());
    +
    +  (* Install standard binary operators.
    +   * 1 is the lowest precedence. *)
    +  Hashtbl.add Parser.binop_precedence '<' 10;
    +  Hashtbl.add Parser.binop_precedence '+' 20;
    +  Hashtbl.add Parser.binop_precedence '-' 20;
    +  Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
    +
    +  (* Prime the first token. *)
    +  print_string "ready> "; flush stdout;
    +  let stream = Lexer.lex (Stream.of_channel stdin) in
    +
    +  (* Create the JIT. *)
    +  let the_execution_engine = ExecutionEngine.create Codegen.the_module in
    +  let the_fpm = PassManager.create_function Codegen.the_module in
    +
    +  (* Set up the optimizer pipeline.  Start with registering info about how the
    +   * target lays out data structures. *)
    +  TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
    +
    +  (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
    +  add_instruction_combination the_fpm;
    +
    +  (* reassociate expressions. *)
    +  add_reassociation the_fpm;
    +
    +  (* Eliminate Common SubExpressions. *)
    +  add_gvn the_fpm;
    +
    +  (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
    +  add_cfg_simplification the_fpm;
    +
    +  ignore (PassManager.initialize the_fpm);
    +
    +  (* Run the main "interpreter loop" now. *)
    +  Toplevel.main_loop the_fpm the_execution_engine stream;
    +
    +  (* Print out all the generated code. *)
    +  dump_module Codegen.the_module
    +;;
    +
    +main ()
    +
    +
    + +
    bindings.c
    +
    +
    +#include <stdio.h>
    +
    +/* putchard - putchar that takes a double and returns 0. */
    +extern double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +/* printd - printf that takes a double and prints it as "%f\n", returning 0. */
    +extern double printd(double X) {
    +  printf("%f\n", X);
    +  return 0;
    +}
    +
    +
    +
    +Next: Extending the language: mutable variables /
    +SSA construction
    + + +
    +
    + Chris Lattner
    + Erick Tryzelaar
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-06-21 13:31:30 -0700 (Mon, 21 Jun 2010) $ +

    Added: www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl7.html
    URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl7.html?rev=115556&view=auto
    ==============================================================================
    --- www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl7.html (added)
    +++ www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl7.html Mon Oct 4 15:49:23 2010
    @@ -0,0 +1,1907 @@
    + Kaleidoscope: Extending the Language: Mutable Variables / SSA construction
    Kaleidoscope: Extending the Language: Mutable Variables
    + + + +
    +

    +Written by Chris Lattner
    +and Erick Tryzelaar

    +
    + + + + + +
    + +

    +Welcome to Chapter 7 of the "Implementing a language
    +with LLVM" tutorial. In chapters 1 through 6, we've built a very
    +respectable, albeit simple, functional
    +programming language. In our journey, we learned some parsing techniques,
    +how to build and represent an AST, how to build LLVM IR, and how to optimize
    +the resultant code as well as JIT compile it.

    + +

    +While Kaleidoscope is interesting as a functional language, the fact that it
    +is functional makes it "too easy" to generate LLVM IR for it. In particular, a
    +functional language makes it very easy to build LLVM IR directly in SSA form.
    +Since LLVM requires that the input code be in SSA form, this is a very nice
    +property. It is, however, often unclear to newcomers how to generate code for an
    +imperative language with mutable variables.

    + +

    +The short (and happy) summary of this chapter is that there is no need for
    +your front-end to build SSA form: LLVM provides highly tuned and well-tested
    +support for this, though the way it works is a bit unexpected for some.

    + +
    + + + + + +
    + +

    +To understand why mutable variables cause complexities in SSA construction,
    +consider this extremely simple C example:

    + +
    +
    +int G, H;
    +int test(_Bool Condition) {
    +  int X;
    +  if (Condition)
    +    X = G;
    +  else
    +    X = H;
    +  return X;
    +}
    +
    +
    + +

    +In this case, we have the variable "X", whose value depends on the path
    +executed in the program. Because there are two different possible values for X
    +before the return instruction, a PHI node is inserted to merge the two values.
    +The LLVM IR that we want for this example looks like this:

    + +
    +
    +@G = weak global i32 0   ; type of @G is i32*
    +@H = weak global i32 0   ; type of @H is i32*
    +
    +define i32 @test(i1 %Condition) {
    +entry:
    +  br i1 %Condition, label %cond_true, label %cond_false
    +
    +cond_true:
    +  %X.0 = load i32* @G
    +  br label %cond_next
    +
    +cond_false:
    +  %X.1 = load i32* @H
    +  br label %cond_next
    +
    +cond_next:
    +  %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
    +  ret i32 %X.2
    +}
    +
    +
    + +

    +In this example, the loads from the G and H global variables are explicit in
    +the LLVM IR, and they live in the then/else branches of the if statement
    +(cond_true/cond_false). In order to merge the incoming values, the X.2 phi node
    +in the cond_next block selects the right value to use based on where control
    +flow is coming from: if control flow comes from the cond_false block, X.2 gets
    +the value of X.1. Alternatively, if control flow comes from cond_true, it gets
    +the value of X.0. The intent of this chapter is not to explain the details of
    +SSA form. For more information, see one of the many online
    +references.
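The selection rule of a phi node can be mimicked in a few lines of plain OCaml. This is only a toy model under stated assumptions: `eval_phi` and `test` are hypothetical names made up for this sketch, block labels are plain strings, and the values loaded from @G and @H are passed in as the arguments `g` and `h`. It is not how LLVM represents or executes phi nodes.

```ocaml
(* Hypothetical model: a phi is a list of (incoming value, predecessor label). *)
type phi = (int * string) list

(* Select the incoming value for the block control flow actually came from. *)
let eval_phi (incoming : phi) (pred : string) : int =
  match List.find_opt (fun (_, label) -> label = pred) incoming with
  | Some (v, _) -> v
  | None -> invalid_arg ("phi has no entry for predecessor " ^ pred)

(* Model of @test: g and h stand in for the values loaded from @G and @H.
 * Unlike the real IR, this toy computes both x0 and x1 up front; only the
 * phi selection at the end depends on the path taken. *)
let test ~g ~h condition =
  (* entry: remember which block the branch goes through. *)
  let pred = if condition then "cond_true" else "cond_false" in
  let x0 = g in   (* %X.0 = load i32* @G, in cond_true *)
  let x1 = h in   (* %X.1 = load i32* @H, in cond_false *)
  (* cond_next: %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ] *)
  eval_phi [ (x1, "cond_false"); (x0, "cond_true") ] pred
```

For example, `test ~g:7 ~h:9 true` picks the value that arrived from cond_true, and `test ~g:7 ~h:9 false` picks the one from cond_false.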

    + +

    +The question for this article is "who places the phi nodes when lowering
    +assignments to mutable variables?". The issue here is that LLVM
    +requires that its IR be in SSA form: there is no "non-SSA" mode for it.
    +However, SSA construction requires non-trivial algorithms and data structures,
    +so it is inconvenient and wasteful for every front-end to have to reproduce this
    +logic.

    + +
    + + + + + +
    + +

    +The 'trick' here is that while LLVM does require all register values to be
    +in SSA form, it does not require (or permit) memory objects to be in SSA form.
    +In the example above, note that the loads from G and H are direct accesses to
    +G and H: they are not renamed or versioned. This differs from some other
    +compiler systems, which do try to version memory objects. LLVM does not
    +encode dataflow analysis of memory into the LLVM IR; instead, that analysis is
    +handled by Analysis Passes, which are computed on demand.

    + +

    +With this in mind, the high-level idea is that we want to make a stack variable
    +(which lives in memory, because it is on the stack) for each mutable object in
    +a function. To take advantage of this trick, we need to talk about how LLVM
    +represents stack variables.

    + +

    +In LLVM, all memory accesses are explicit with load/store instructions, and
    +it is carefully designed not to have (or need) an "address-of" operator. Notice
    +how the type of the @G/@H global variables is actually "i32*" even though the
    +variable is defined as "i32". What this means is that @G defines space
    +for an i32 in the global data area, but its name actually refers to the
    +address for that space. Stack variables work the same way, except that instead of
    +being declared with global variable definitions, they are declared with the
    +LLVM alloca instruction:

    + +
    +
    +define i32 @example() {
    +entry:
    +  %X = alloca i32           ; type of %X is i32*.
    +  ...
    +  %tmp = load i32* %X       ; load the stack value %X from the stack.
    +  %tmp2 = add i32 %tmp, 1   ; increment it
    +  store i32 %tmp2, i32* %X  ; store it back
    +  ...
    +
    +
    + +

    +This code shows an example of how you can declare and manipulate a stack
    +variable in the LLVM IR. Stack memory allocated with the alloca instruction is
    +fully general: you can pass the address of the stack slot to functions, you can
    +store it in other variables, etc. We could rewrite our earlier @test
    +example to use the alloca technique and avoid using a PHI node:

    + +
    +
    +@G = weak global i32 0   ; type of @G is i32*
    +@H = weak global i32 0   ; type of @H is i32*
    +
    +define i32 @test(i1 %Condition) {
    +entry:
    +  %X = alloca i32           ; type of %X is i32*.
    +  br i1 %Condition, label %cond_true, label %cond_false
    +
    +cond_true:
    +  %X.0 = load i32* @G
    +  store i32 %X.0, i32* %X   ; Update X
    +  br label %cond_next
    +
    +cond_false:
    +  %X.1 = load i32* @H
    +  store i32 %X.1, i32* %X   ; Update X
    +  br label %cond_next
    +
    +cond_next:
    +  %X.2 = load i32* %X       ; Read X
    +  ret i32 %X.2
    +}
    +
    +
    + +

    +With this, we have discovered a way to handle arbitrary mutable variables
    +without the need to create Phi nodes at all:

    + +
    +1. Each mutable variable becomes a stack allocation.
    +2. Each read of the variable becomes a load from the stack.
    +3. Each update of the variable becomes a store to the stack.
    +4. Taking the address of a variable just uses the stack address directly.
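The first three rules above can be sketched as a tiny lowering pass. Everything in this block is an assumption made for illustration: the `expr`, `stmt`, `operand` and `instr` types and the `lower` function are invented here and only imitate the shape of alloca/load/store; they are not part of the LLVM OCaml bindings. Rule 4 (taking an address) is not modeled.

```ocaml
(* Hypothetical source language: integer expressions and assignments. *)
type expr = Const of int | Var of string | Add of expr * expr
type stmt = Declare of string | Assign of string * expr

(* Hypothetical target: a flat list of pseudo-instructions. *)
type operand = Imm of int | Reg of string
type instr =
  | Alloca of string                    (* Alloca slot *)
  | Load of string * string             (* Load (dest, slot) *)
  | Store of operand * string           (* Store (value, slot) *)
  | AddI of string * operand * operand  (* AddI (dest, lhs, rhs) *)

let lower (program : stmt list) : instr list =
  let counter = ref 0 in
  let fresh () = incr counter; Printf.sprintf "tmp%d" !counter in
  let out = ref [] in
  let emit i = out := i :: !out in
  let rec lower_expr = function
    | Const n -> Imm n
    | Var x ->
        (* Rule 2: every read is a load from the variable's stack slot. *)
        let t = fresh () in
        emit (Load (t, x));
        Reg t
    | Add (a, b) ->
        let a' = lower_expr a in
        let b' = lower_expr b in
        let t = fresh () in
        emit (AddI (t, a', b'));
        Reg t
  in
  List.iter
    (function
      (* Rule 1: every mutable variable gets a stack allocation. *)
      | Declare x -> emit (Alloca x)
      (* Rule 3: every update is a store to the variable's stack slot. *)
      | Assign (x, e) ->
          let v = lower_expr e in
          emit (Store (v, x)))
    program;
  List.rev !out
```

Lowering `[Declare "x"; Assign ("x", Const 1); Assign ("x", Add (Var "x", Const 1))]` yields an alloca, a store, and then the same load, add, store sequence as the `@example` function shown earlier. No phi nodes are ever created by the lowering itself.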
    + +

    +While this solution has solved our immediate problem, it introduced another
    +one: we have now apparently introduced a lot of stack traffic for very simple
    +and common operations, a major performance problem. Fortunately for us, the
    +LLVM optimizer has a highly-tuned optimization pass named "mem2reg" that handles
    +this case, promoting allocas like this into SSA registers, inserting Phi nodes
    +as appropriate. If you run the alloca version of the example through the pass,
    +you'll get:

    + +
    +
    +$ llvm-as < example.ll | opt -mem2reg | llvm-dis
    +@G = weak global i32 0
    +@H = weak global i32 0
    +
    +define i32 @test(i1 %Condition) {
    +entry:
    +  br i1 %Condition, label %cond_true, label %cond_false
    +
    +cond_true:
    +  %X.0 = load i32* @G
    +  br label %cond_next
    +
    +cond_false:
    +  %X.1 = load i32* @H
    +  br label %cond_next
    +
    +cond_next:
    +  %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
    +  ret i32 %X.01
    +}
    +
    +
    + +

    +The mem2reg pass implements the standard "iterated dominance frontier"
    +algorithm for constructing SSA form and has a number of optimizations that speed
    +up (very common) degenerate cases. The mem2reg optimization pass is the answer
    +to dealing with mutable variables, and we highly recommend that you depend on
    +it. Note that mem2reg only works on variables in certain circumstances:

    + +
    +1. mem2reg is alloca-driven: it looks for allocas and if it can handle them, it
    +promotes them. It does not apply to global variables or heap allocations.
    +
    +2. mem2reg only looks for alloca instructions in the entry block of the
    +function. Being in the entry block guarantees that the alloca is only executed
    +once, which makes analysis simpler.
    +
    +3. mem2reg only promotes allocas whose uses are direct loads and stores. If
    +the address of the stack object is passed to a function, or if any funny pointer
    +arithmetic is involved, the alloca will not be promoted.
    +
    +4. mem2reg only works on allocas of first class
    +values (such as pointers, scalars and vectors), and only if the array size
    +of the allocation is 1 (or missing in the .ll file). mem2reg is not capable of
    +promoting structs or arrays to registers. Note that the "scalarrepl" pass is
    +more powerful and can promote structs, "unions", and arrays in many cases.
    + +
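The conditions above can be read as a single predicate over a stack slot. The following is only a minimal sketch of that reading in plain OCaml: the types `use`, `ty` and `alloca` and the function `is_promotable` are hypothetical names invented for illustration, not part of LLVM or its OCaml bindings, and the real pass inspects actual IR rather than a record like this.

```ocaml
(* Hypothetical, simplified model of a stack slot. None of these types
 * come from LLVM; they only mirror the conditions listed in the text. *)
type use = Load | Store | PassedToCall | PointerArithmetic
type ty = Scalar | Pointer | Vector | StructTy | ArrayTy

type alloca = {
  in_entry_block : bool;  (* is the alloca in the function's entry block? *)
  elem_ty : ty;           (* the allocated type *)
  array_size : int;       (* number of elements allocated *)
  uses : use list;        (* how the address is used *)
}

(* Only pointers, scalars and vectors count as first class here. *)
let is_first_class = function
  | Scalar | Pointer | Vector -> true
  | StructTy | ArrayTy -> false

(* A slot is promotable only if every condition holds at once. *)
let is_promotable a =
  a.in_entry_block
  && is_first_class a.elem_ty
  && a.array_size = 1
  && List.for_all
       (function Load | Store -> true
               | PassedToCall | PointerArithmetic -> false)
       a.uses

let simple =
  { in_entry_block = true; elem_ty = Scalar; array_size = 1;
    uses = [Store; Load; Load] }
let escaped = { simple with uses = [Store; PassedToCall] }
let aggregate = { simple with elem_ty = StructTy }
```

In this model `simple` is promotable, while `escaped` (its address is passed to a call) and `aggregate` (a struct) are not.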

    All of these properties are easy to satisfy for most imperative languages, and we'll illustrate this below with Kaleidoscope. The final question you may be asking is: should I bother with this nonsense for my front-end? Wouldn't it be better if I just did SSA construction directly, avoiding use of the mem2reg optimization pass? In short, we strongly recommend that you use this technique for building SSA form, unless there is an extremely good reason not to. Using this technique is:

    + +
    • Proven and well tested: llvm-gcc and clang both use this technique for local mutable variables. As such, the most common clients of LLVM are using this to handle the bulk of their variables. You can be sure that bugs are found fast and fixed early.
    • Extremely fast: mem2reg has a number of special cases that make it fast in common cases as well as fully general. For example, it has fast-paths for variables that are only used in a single block, variables that only have one assignment point, good heuristics to avoid insertion of unneeded phi nodes, etc.
    • Needed for debug info generation: Debug information in LLVM relies on having the address of the variable exposed so that debug info can be attached to it. This technique dovetails very naturally with this style of debug info.
    + +

    If nothing else, this makes it much easier to get your front-end up and running, and is very simple to implement. Let's extend Kaleidoscope with mutable variables now!

    + +
    + + + + + +
    + +

    Now that we know the sort of problem we want to tackle, let's see what this looks like in the context of our little Kaleidoscope language. We're going to add two features:

    + +
    1. The ability to mutate variables with the '=' operator.
    2. The ability to define new variables.
    + +

    While the first item is really what this is about, we only have variables for incoming arguments as well as for induction variables, and redefining those only goes so far :). Also, the ability to define new variables is a useful thing regardless of whether you will be mutating them. Here's a motivating example that shows how we could use these:

    + +
    +
    +# Define ':' for sequencing: as a low-precedence operator that ignores operands
    +# and just returns the RHS.
    +def binary : 1 (x y) y;
    +
    +# Recursive fib, we could do this before.
    +def fib(x)
    +  if (x < 3) then
    +    1
    +  else
    +    fib(x-1)+fib(x-2);
    +
    +# Iterative fib.
    +def fibi(x)
    +  var a = 1, b = 1, c in
    +  (for i = 3, i < x in
    +     c = a + b :
    +     a = b :
    +     b = c) :
    +  b;
    +
    +# Call it.
    +fibi(10);
    +
    +
    + +

    In order to mutate variables, we have to change our existing variables to use the "alloca trick". Once we have that, we'll add our new operator, then extend Kaleidoscope to support new variable definitions.

    + +
    + + + + + +
    + +

    The symbol table in Kaleidoscope is managed at code generation time by the 'named_values' map. This map currently keeps track of the LLVM "Value*" that holds the double value for the named variable. In order to support mutation, we need to change this slightly, so that named_values holds the memory location of the variable in question. Note that this change is a refactoring: it changes the structure of the code, but does not (by itself) change the behavior of the compiler. All of these changes are isolated in the Kaleidoscope code generator.

    At this point in Kaleidoscope's development, it only supports variables for two things: incoming arguments to functions and the induction variable of 'for' loops. For consistency, we'll allow mutation of these variables in addition to other user-defined variables. This means that these will both need memory locations.

    + +

    To start our transformation of Kaleidoscope, we'll change the named_values map so that it maps to AllocaInst* instead of Value*. Once we do this, the C++ compiler will tell us what parts of the code we need to update:

    Note: the ocaml bindings currently model both Value*s and AllocaInst*s as Llvm.llvalues, but this may change in the future to be more type safe.

    + +
    +
    +let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
    +
    +
    + +

    Also, since we will need to create these allocas, we'll use a helper function that ensures that the allocas are created in the entry block of the function:

    + +
    +
    +(* Create an alloca instruction in the entry block of the function. This
    + * is used for mutable variables etc. *)
    +let create_entry_block_alloca the_function var_name =
    +  let builder = builder_at (instr_begin (entry_block the_function)) in
    +  build_alloca double_type var_name builder
    +
    +
    + +

    This funny looking code creates an Llvm.llbuilder object that is pointing at the first instruction of the entry block. It then creates an alloca with the expected name and returns it. Because all values in Kaleidoscope are doubles, there is no need to pass in a type to use.

    With this in place, the first functionality change we want to make is to variable references. In our new scheme, variables live on the stack, so code generating a reference to them actually needs to produce a load from the stack slot:

    + +
    +
    +let rec codegen_expr = function
    +  ...
    +  | Ast.Variable name ->
    +      let v = try Hashtbl.find named_values name with
    +        | Not_found -> raise (Error "unknown variable name")
    +      in
    +      (* Load the value. *)
    +      build_load v name builder
    +
    +
    + +

    As you can see, this is pretty straightforward. Now we need to update the things that define the variables to set up the alloca. We'll start with codegen_expr Ast.For ... (see the full code listing for the unabridged code):

    + +
    +
    +  | Ast.For (var_name, start, end_, step, body) ->
    +      let the_function = block_parent (insertion_block builder) in
    +
    +      (* Create an alloca for the variable in the entry block. *)
    +      let alloca = create_entry_block_alloca the_function var_name in
    +
    +      (* Emit the start code first, without 'variable' in scope. *)
    +      let start_val = codegen_expr start in
    +
    +      (* Store the value into the alloca. *)
    +      ignore(build_store start_val alloca builder);
    +
    +      ...
    +
    +      (* Within the loop, the variable is bound to its alloca. If it
    +       * shadows an existing variable, we have to restore it, so save it
    +       * now. *)
    +      let old_val =
    +        try Some (Hashtbl.find named_values var_name) with Not_found -> None
    +      in
    +      Hashtbl.add named_values var_name alloca;
    +
    +      ...
    +
    +      (* Compute the end condition. *)
    +      let end_cond = codegen_expr end_ in
    +
    +      (* Reload, increment, and store back into the alloca. This handles the
    +       * case where the body of the loop mutates the variable. *)
    +      let cur_var = build_load alloca var_name builder in
    +      let next_var = build_add cur_var step_val "nextvar" builder in
    +      ignore(build_store next_var alloca builder);
    +      ...
    +
    +
    + +

    This code is virtually identical to the code before we allowed mutable variables. The big difference is that we no longer have to construct a PHI node, and we use load/store to access the variable as needed.
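    To see why reloading the variable matters, here is a small model of the loop in plain OCaml. It is only a sketch: a `float ref` stands in for the alloca, `!` for a load and `:=` for a store, and the function `run_for` and its labelled arguments are hypothetical names, not part of the tutorial's code. The order of operations (body, end condition, then reload, increment and store) follows the snippet above.

```ocaml
(* [slot] plays the role of the alloca: a mutable cell that is read with
 * [!] (a load) and written with [:=] (a store). *)
let run_for ~start ~end_cond ~step ~body =
  let slot = ref start in
  let keep_going = ref true in
  while !keep_going do
    (* The body receives the slot itself, so it may mutate the variable. *)
    body slot;
    (* Compute the end condition from the current value. *)
    let cond = end_cond !slot in
    (* Reload, increment, and store back. *)
    slot := !slot +. step;
    keep_going := cond
  done;
  !slot

(* A body that only reads the variable sees 1, 2 and 3. *)
let visited = ref []
let final_plain =
  run_for ~start:1.0 ~end_cond:(fun i -> i < 3.0) ~step:1.0
    ~body:(fun slot -> visited := !slot :: !visited)

(* A body that also bumps the variable makes the loop finish sooner. *)
let iterations = ref 0
let final_mutating =
  run_for ~start:1.0 ~end_cond:(fun i -> i < 3.0) ~step:1.0
    ~body:(fun slot -> incr iterations; slot := !slot +. 1.0)
```

    Because the increment reloads from the slot instead of reusing a value computed before the body ran, the second loop runs only twice.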

    + +

    To support mutable argument variables, we need to also make allocas for them. The code for this is also pretty simple:

    + +
    +
    +(* Create an alloca for each argument and register the argument in the symbol
    + * table so that references to it will succeed. *)
    +let create_argument_allocas the_function proto =
    +  let args = match proto with
    +    | Ast.Prototype (_, args) | Ast.BinOpPrototype (_, args, _) -> args
    +  in
    +  Array.iteri (fun i ai ->
    +    let var_name = args.(i) in
    +    (* Create an alloca for this variable. *)
    +    let alloca = create_entry_block_alloca the_function var_name in
    +
    +    (* Store the initial value into the alloca. *)
    +    ignore(build_store ai alloca builder);
    +
    +    (* Add arguments to variable symbol table. *)
    +    Hashtbl.add named_values var_name alloca;
    +  ) (params the_function)
    +
    +
    + +

    For each argument, we make an alloca, store the input value to the function into the alloca, and register the alloca as the memory location for the argument. This method gets invoked by Codegen.codegen_func right after it sets up the entry block for the function.

    The final missing piece is adding the mem2reg pass, which allows us to get good codegen once again:

    + +
    +
    +let main () =
    +  ...
    +  let the_fpm = PassManager.create_function Codegen.the_module in
    +
    +  (* Set up the optimizer pipeline.  Start with registering info about how the
    +   * target lays out data structures. *)
    +  TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
    +
    +  (* Promote allocas to registers. *)
    +  add_memory_to_register_promotion the_fpm;
    +
    +  (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
    +  add_instruction_combining the_fpm;
    +
    +  (* reassociate expressions. *)
    +  add_reassociation the_fpm;
    +
    +
    + +

    It is interesting to see what the code looks like before and after the mem2reg optimization runs. For example, this is the before/after code for our recursive fib function. Before the optimization:

    + +
    +
    +define double @fib(double %x) {
    +entry:
    +  %x1 = alloca double
    +  store double %x, double* %x1
    +  %x2 = load double* %x1
    +  %cmptmp = fcmp ult double %x2, 3.000000e+00
    +  %booltmp = uitofp i1 %cmptmp to double
    +  %ifcond = fcmp one double %booltmp, 0.000000e+00
    +  br i1 %ifcond, label %then, label %else
    +
    +then:    ; preds = %entry
    +  br label %ifcont
    +
    +else:    ; preds = %entry
    +  %x3 = load double* %x1
    +  %subtmp = fsub double %x3, 1.000000e+00
    +  %calltmp = call double @fib(double %subtmp)
    +  %x4 = load double* %x1
    +  %subtmp5 = fsub double %x4, 2.000000e+00
    +  %calltmp6 = call double @fib(double %subtmp5)
    +  %addtmp = fadd double %calltmp, %calltmp6
    +  br label %ifcont
    +
    +ifcont:    ; preds = %else, %then
    +  %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
    +  ret double %iftmp
    +}
    +
    +
    + +

    Here there is only one variable (x, the input argument) but you can still see the extremely simple-minded code generation strategy we are using. In the entry block, an alloca is created, and the initial input value is stored into it. Each reference to the variable does a reload from the stack. Also, note that we didn't modify the if/then/else expression, so it still inserts a PHI node. While we could make an alloca for it, it is actually easier to create a PHI node for it, so we still just make the PHI.

    + +

    Here is the code after the mem2reg pass runs:

    + +
    +
    +define double @fib(double %x) {
    +entry:
    +  %cmptmp = fcmp ult double %x, 3.000000e+00
    +  %booltmp = uitofp i1 %cmptmp to double
    +  %ifcond = fcmp one double %booltmp, 0.000000e+00
    +  br i1 %ifcond, label %then, label %else
    +
    +then:
    +  br label %ifcont
    +
    +else:
    +  %subtmp = fsub double %x, 1.000000e+00
    +  %calltmp = call double @fib(double %subtmp)
    +  %subtmp5 = fsub double %x, 2.000000e+00
    +  %calltmp6 = call double @fib(double %subtmp5)
    +  %addtmp = fadd double %calltmp, %calltmp6
    +  br label %ifcont
    +
    +ifcont:    ; preds = %else, %then
    +  %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
    +  ret double %iftmp
    +}
    +
    +
    + +

    This is a trivial case for mem2reg, since there are no redefinitions of the variable. The point of showing this is to calm your tension about inserting such blatant inefficiencies :).

    + +

    After the rest of the optimizers run, we get:

    + +
    +
    +define double @fib(double %x) {
    +entry:
    +  %cmptmp = fcmp ult double %x, 3.000000e+00
    +  %booltmp = uitofp i1 %cmptmp to double
    +  %ifcond = fcmp ueq double %booltmp, 0.000000e+00
    +  br i1 %ifcond, label %else, label %ifcont
    +
    +else:
    +  %subtmp = fsub double %x, 1.000000e+00
    +  %calltmp = call double @fib(double %subtmp)
    +  %subtmp5 = fsub double %x, 2.000000e+00
    +  %calltmp6 = call double @fib(double %subtmp5)
    +  %addtmp = fadd double %calltmp, %calltmp6
    +  ret double %addtmp
    +
    +ifcont:
    +  ret double 1.000000e+00
    +}
    +
    +
    + +

    Here we see that the simplifycfg pass decided to clone the return instruction into the end of the 'else' block. This allowed it to eliminate some branches and the PHI node.

    Now that all symbol table references are updated to use stack variables, we'll add the assignment operator.

    + +
    + + + + + +
    + +

    With our current framework, adding a new assignment operator is really simple. We will parse it just like any other binary operator, but handle it internally (instead of allowing the user to define it). The first step is to set a precedence:

    + +
    +
    +let main () =
    +  (* Install standard binary operators.
    +   * 1 is the lowest precedence. *)
    +  Hashtbl.add Parser.binop_precedence '=' 2;
    +  Hashtbl.add Parser.binop_precedence '<' 10;
    +  Hashtbl.add Parser.binop_precedence '+' 20;
    +  Hashtbl.add Parser.binop_precedence '-' 20;
    +  ...
    +
    +
    + +
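    To make the effect of giving '=' a precedence of 2 concrete, here is a self-contained sketch of operator-precedence parsing over a plain token list. It is modeled on parse_bin_rhs from the full listing but is not that code: the `token` type, the list-based `parse_primary` and `parse_expr`, and the use of `failwith` are simplifications assumed for illustration, since the tutorial's parser works on camlp4 streams.

```ocaml
(* Simplified tokens and AST, assumed for this sketch only. *)
type token = Num of float | Id of string | Op of char
type expr =
  | Number of float
  | Variable of string
  | Binary of char * expr * expr

let binop_precedence : (char, int) Hashtbl.t = Hashtbl.create 10
let () =
  List.iter (fun (c, p) -> Hashtbl.replace binop_precedence c p)
    ['=', 2; '<', 10; '+', 20; '-', 20]

let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1

(* Each parser returns the expression and the remaining tokens. *)
let parse_primary = function
  | Num n :: rest -> (Number n, rest)
  | Id s :: rest -> (Variable s, rest)
  | _ -> failwith "expected a primary expression"

let rec parse_bin_rhs expr_prec lhs tokens =
  match tokens with
  | Op c :: rest when Hashtbl.mem binop_precedence c ->
      let token_prec = precedence c in
      (* Stop if this operator binds less tightly than the current one. *)
      if token_prec < expr_prec then (lhs, tokens)
      else begin
        let rhs, rest = parse_primary rest in
        (* If the next operator binds tighter, it takes rhs as its lhs. *)
        let rhs, rest =
          match rest with
          | Op c2 :: _ when token_prec < precedence c2 ->
              parse_bin_rhs (token_prec + 1) rhs rest
          | _ -> (rhs, rest)
        in
        parse_bin_rhs expr_prec (Binary (c, lhs, rhs)) rest
      end
  | _ -> (lhs, tokens)

let parse_expr tokens =
  let lhs, rest = parse_primary tokens in
  parse_bin_rhs 0 lhs rest
```

    With these precedences, the tokens for `x = a + 1` parse as `Binary ('=', Variable "x", Binary ('+', Variable "a", Number 1.0))`: the addition groups first because '+' binds tighter than '='. In this sketch equal-precedence operators group to the left, so `x = y = 3` would parse as `(x = y) = 3`.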

    Now that the parser knows the precedence of the binary operator, it takes care of all the parsing and AST generation. We just need to implement codegen for the assignment operator. This looks like:

    + +
    +
    +let rec codegen_expr = function
    +  ...
    +  | Ast.Binary (op, lhs, rhs) ->
    +      begin match op with
    +      | '=' ->
    +          (* Special case '=' because we don't want to emit the LHS as an
    +           * expression. *)
    +          let name =
    +            match lhs with
    +            | Ast.Variable name -> name
    +            | _ -> raise (Error "destination of '=' must be a variable")
    +          in
    +
    +
    + +

    Unlike the rest of the binary operators, our assignment operator doesn't follow the "emit LHS, emit RHS, do computation" model. As such, it is handled as a special case before the other binary operators are handled. The other strange thing is that it requires the LHS to be a variable. It is invalid to have "(x+1) = expr"; only things like "x = expr" are allowed.

    + + +
    +
    +          (* Codegen the rhs. *)
    +          let val_ = codegen_expr rhs in
    +
    +          (* Lookup the name. *)
    +          let variable = try Hashtbl.find named_values name with
    +          | Not_found -> raise (Error "unknown variable name")
    +          in
    +          ignore(build_store val_ variable builder);
    +          val_
    +      | _ ->
    +			...
    +
    +
    + +

    Once we have the variable, codegen'ing the assignment is straightforward: we emit the RHS of the assignment, create a store, and return the computed value. Returning a value allows for chained assignments like "X = (Y = Z)".
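    The same three steps can be tried out without LLVM. The sketch below is a tiny evaluator, not the tutorial's code generator: it computes floats directly, and the `slots` table of `float ref` cells and the helper `find_slot` are hypothetical stand-ins for named_values and the alloca it maps to. The structure of the '=' case (check that the LHS is a variable, evaluate the RHS, look up the slot, store, return the value) mirrors the snippet above.

```ocaml
exception Error of string

(* A cut-down AST, assumed for this sketch. *)
type expr =
  | Number of float
  | Variable of string
  | Binary of char * expr * expr

(* Each variable owns a mutable cell, standing in for its stack slot. *)
let slots : (string, float ref) Hashtbl.t = Hashtbl.create 10

let find_slot name =
  try Hashtbl.find slots name
  with Not_found -> raise (Error "unknown variable name")

let rec eval = function
  | Number n -> n
  | Variable name -> !(find_slot name)
  | Binary ('=', lhs, rhs) ->
      (* The LHS is not evaluated; it must name a variable. *)
      let name =
        match lhs with
        | Variable name -> name
        | _ -> raise (Error "destination of '=' must be a variable")
      in
      (* Evaluate the RHS, store it, and return the stored value. *)
      let value = eval rhs in
      find_slot name := value;
      value
  | Binary ('+', lhs, rhs) ->
      (* Bind explicitly so the operands are evaluated left to right. *)
      let l = eval lhs in
      let r = eval rhs in
      l +. r
  | Binary (_, _, _) -> raise (Error "invalid binary operator")

let () =
  Hashtbl.replace slots "x" (ref 0.0);
  Hashtbl.replace slots "y" (ref 0.0)

(* x = (y = 3): the inner assignment's value feeds the outer one. *)
let chained =
  eval (Binary ('=', Variable "x",
                Binary ('=', Variable "y", Number 3.0)))
```

    Because the '=' case returns `value`, both x and y end up holding 3, while an expression such as "(x+1) = 2" raises the "destination of '=' must be a variable" error.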

    + +

    Now that we have an assignment operator, we can mutate loop variables and arguments. For example, we can now run code like this:

    + +
    +
    +# Function to print a double.
    +extern printd(x);
    +
    +# Define ':' for sequencing: as a low-precedence operator that ignores operands
    +# and just returns the RHS.
    +def binary : 1 (x y) y;
    +
    +def test(x)
    +  printd(x) :
    +  x = 4 :
    +  printd(x);
    +
    +test(123);
    +
    +
    + +

    When run, this example prints "123" and then "4", showing that we did actually mutate the value! Okay, we have now officially implemented our goal: getting this to work requires SSA construction in the general case. However, to be really useful, we want the ability to define our own local variables. Let's add this next!

    + +
    + + + + + +
    + +

    Adding var/in is just like any other extension we made to Kaleidoscope: we extend the lexer, the parser, the AST and the code generator. The first step for adding our new 'var/in' construct is to extend the lexer. As before, this is pretty trivial; the code looks like this:

    + +
    +
    +type token =
    +  ...
    +  (* var definition *)
    +  | Var
    +
    +...
    +
    +and lex_ident buffer = parser
    +      ...
    +      | "in" -> [< 'Token.In; stream >]
    +      | "binary" -> [< 'Token.Binary; stream >]
    +      | "unary" -> [< 'Token.Unary; stream >]
    +      | "var" -> [< 'Token.Var; stream >]
    +      ...
    +
    +
    + +
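    Keyword recognition is just a string match once the identifier has been collected. Here is a minimal standalone sketch of that step: the function `token_of_ident` is a hypothetical helper, and it returns a bare token rather than a camlp4 stream as lex_ident does.

```ocaml
(* A subset of the tutorial's token type, repeated so the sketch is
 * self-contained. *)
type token =
  | Def | Extern
  | If | Then | Else
  | For | In
  | Binary | Unary
  | Var
  | Ident of string

(* Map a completed identifier to a keyword token, or to Ident. *)
let token_of_ident = function
  | "def" -> Def
  | "extern" -> Extern
  | "if" -> If
  | "then" -> Then
  | "else" -> Else
  | "for" -> For
  | "in" -> In
  | "binary" -> Binary
  | "unary" -> Unary
  | "var" -> Var
  | id -> Ident id
```

    Only the exact string "var" becomes the new token; a longer name such as "variable" is still an ordinary identifier.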

    The next step is to define the AST node that we will construct. For var/in, it looks like this:

    + +
    +
    +type expr =
    +  ...
    +  (* variant for var/in. *)
    +  | Var of (string * expr option) array * expr
    +  ...
    +
    +
    + +

    var/in allows a list of names to be defined all at once, and each name can optionally have an initializer value. As such, we capture this information in the (string * expr option) array carried by Ast.Var. Also, var/in has a body; this body is allowed to access the variables defined by the var/in.
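    As an illustration, here is how the declaration part of the iterative fib example, "var a = 1, b = 1, c in ...", could be written as a value of this shape. This is a sketch with a cut-down `expr` type; the body is shortened to a reference to b, and `declared_names` is a hypothetical helper added only to inspect the node.

```ocaml
(* A cut-down version of the AST, enough to build a var/in node. *)
type expr =
  | Number of float
  | Variable of string
  | Var of (string * expr option) array * expr

(* "var a = 1, b = 1, c in ...": c has no initializer, hence None. *)
let decls =
  [| ("a", Some (Number 1.0)); ("b", Some (Number 1.0)); ("c", None) |]

(* The body is abbreviated to a single variable reference here. *)
let example = Var (decls, Variable "b")

(* List the names a var/in node introduces, in source order. *)
let declared_names = function
  | Var (var_names, _) -> Array.to_list (Array.map fst var_names)
  | Number _ | Variable _ -> []
```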

    + +

    With this in place, we can define the parser pieces. The first thing we do is add it as a primary expression:

    + +
    +
    +(* primary
    + *   ::= identifier
    + *   ::= numberexpr
    + *   ::= parenexpr
    + *   ::= ifexpr
    + *   ::= forexpr
    + *   ::= varexpr *)
    +let rec parse_primary = parser
    +  ...
    +  (* varexpr
    +   *   ::= 'var' identifier ('=' expression)?
    +   *             (',' identifier ('=' expression)?)* 'in' expression *)
    +  | [< 'Token.Var;
    +       (* At least one variable name is required. *)
    +       'Token.Ident id ?? "expected identifier after var";
    +       init=parse_var_init;
    +       var_names=parse_var_names [(id, init)];
    +       (* At this point, we have to have 'in'. *)
    +       'Token.In ?? "expected 'in' keyword after 'var'";
    +       body=parse_expr >] ->
    +      Ast.Var (Array.of_list (List.rev var_names), body)
    +
    +...
    +
    +and parse_var_init = parser
    +  (* read in the optional initializer. *)
    +  | [< 'Token.Kwd '='; e=parse_expr >] -> Some e
    +  | [< >] -> None
    +
    +and parse_var_names accumulator = parser
    +  | [< 'Token.Kwd ',';
    +       'Token.Ident id ?? "expected identifier list after var";
    +       init=parse_var_init;
    +       e=parse_var_names ((id, init) :: accumulator) >] -> e
    +  | [< >] -> accumulator
    +
    +
    + +
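    The accumulate-then-reverse pattern used by parse_var_names is easier to follow on a plain list. The sketch below assumes a simplified token type and restricts initializers to numeric literals; `parse_var_decls` is a hypothetical wrapper, and errors are reported with `failwith` instead of the stream parser's `??` annotations.

```ocaml
(* Simplified tokens, assumed for this sketch. *)
type token = Ident of string | Kwd of char | Number of float | In

(* Read the optional initializer (a numeric literal in this sketch). *)
let parse_var_init = function
  | Kwd '=' :: Number n :: rest -> (Some n, rest)
  | rest -> (None, rest)

(* Collect ", name [= init]" repetitions, newest first. *)
let rec parse_var_names accumulator = function
  | Kwd ',' :: Ident id :: rest ->
      let init, rest = parse_var_init rest in
      parse_var_names ((id, init) :: accumulator) rest
  | Kwd ',' :: _ -> failwith "expected identifier list after var"
  | rest -> (accumulator, rest)

(* Parse everything between 'var' and the body. *)
let parse_var_decls = function
  | Ident id :: rest ->
      let init, rest = parse_var_init rest in
      let names, rest = parse_var_names [(id, init)] rest in
      begin match rest with
      | In :: rest ->
          (* The accumulator is reversed to restore source order. *)
          (Array.of_list (List.rev names), rest)
      | _ -> failwith "expected 'in' keyword after 'var'"
      end
  | _ -> failwith "expected identifier after var"

let decls, remaining =
  parse_var_decls
    [Ident "a"; Kwd '='; Number 1.0; Kwd ','; Ident "b"; Kwd '=';
     Number 1.0; Kwd ','; Ident "c"; In; Ident "b"]
```

    Consing onto the accumulator keeps each step constant time; the single List.rev at the end puts a, b and c back in the order they were written.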

    Now that we can parse and represent the code, we need to support emission of LLVM IR for it. This code starts out with:

    + +
    +
    +let rec codegen_expr = function
    +  ...
    +  | Ast.Var (var_names, body) ->
    +      let old_bindings = ref [] in
    +
    +      let the_function = block_parent (insertion_block builder) in
    +
    +      (* Register all variables and emit their initializer. *)
    +      Array.iter (fun (var_name, init) ->
    +
    +
    + +

    Basically it loops over all the variables, installing them one at a time. For each variable we put into the symbol table, we remember the previous value that we replace in old_bindings.

    + +
    +
    +        (* Emit the initializer before adding the variable to scope, this
    +         * prevents the initializer from referencing the variable itself, and
    +         * permits stuff like this:
    +         *   var a = 1 in
    +         *     var a = a in ...   # refers to outer 'a'. *)
    +        let init_val =
    +          match init with
    +          | Some init -> codegen_expr init
    +          (* If not specified, use 0.0. *)
    +          | None -> const_float double_type 0.0
    +        in
    +
    +        let alloca = create_entry_block_alloca the_function var_name in
    +        ignore(build_store init_val alloca builder);
    +
    +        (* Remember the old variable binding so that we can restore the binding
    +         * when we unrecurse. *)
    +
    +        begin
    +          try
    +            let old_value = Hashtbl.find named_values var_name in
    +            old_bindings := (var_name, old_value) :: !old_bindings;
    +          with Not_found -> ()
    +        end;
    +
    +        (* Remember this binding. *)
    +        Hashtbl.add named_values var_name alloca;
    +      ) var_names;
    +
    +
    + +

    There are more comments here than code. The basic idea is that we emit the initializer, create the alloca, then update the symbol table to point to it. Once all the variables are installed in the symbol table, we evaluate the body of the var/in expression:

    + +
    +
    +      (* Codegen the body, now that all vars are in scope. *)
    +      let body_val = codegen_expr body in
    +
    +
    + +

    Finally, before returning, we restore the previous variable bindings:

    + +
    +
    +      (* Pop all our variables from scope. *)
    +      List.iter (fun (var_name, old_value) ->
    +        Hashtbl.add named_values var_name old_value
    +      ) !old_bindings;
    +
    +      (* Return the body computation. *)
    +      body_val
    +
    +
    + +

    The end result of all of this is that we get properly scoped variable definitions, and we even (trivially) allow mutation of them :).
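    The scoping rules can be checked with a small evaluator that follows the same install, evaluate, restore sequence. This is a sketch rather than the tutorial's code generator: it computes floats directly, uses `float ref` cells in place of allocas, and adds a hypothetical `Add` node so that the example has something to compute. Unlike the snippet above, which re-adds saved bindings with Hashtbl.add, it uses Hashtbl.replace and Hashtbl.remove so that a name with no previous binding leaves scope entirely.

```ocaml
(* A cut-down AST, assumed for this sketch. *)
type expr =
  | Number of float
  | Variable of string
  | Add of expr * expr
  | Var of (string * expr option) array * expr

let named_values : (string, float ref) Hashtbl.t = Hashtbl.create 10

let rec eval = function
  | Number n -> n
  | Variable name -> !(Hashtbl.find named_values name)
  | Add (lhs, rhs) ->
      let l = eval lhs in
      let r = eval rhs in
      l +. r
  | Var (var_names, body) ->
      let old_bindings = ref [] in
      Array.iter (fun (var_name, init) ->
        (* Evaluate the initializer before the name enters scope, so
         * "var a = a in ..." refers to the outer 'a'. *)
        let init_val =
          match init with
          | Some init -> eval init
          | None -> 0.0
        in
        (* Remember what the name was bound to, if anything. *)
        old_bindings :=
          (var_name, Hashtbl.find_opt named_values var_name)
          :: !old_bindings;
        Hashtbl.replace named_values var_name (ref init_val)
      ) var_names;
      let body_val = eval body in
      (* Undo the bindings, most recently installed first. *)
      List.iter (fun (var_name, old) ->
        match old with
        | Some slot -> Hashtbl.replace named_values var_name slot
        | None -> Hashtbl.remove named_values var_name
      ) !old_bindings;
      body_val

(* var a = 1 in (var a = a + 1 in a) + a *)
let result =
  eval (Var ([| ("a", Some (Number 1.0)) |],
             Add (Var ([| ("a", Some (Add (Variable "a", Number 1.0))) |],
                       Variable "a"),
                  Variable "a")))
```

    The inner 'a' is initialized from the outer one and is 2 inside the inner body; once that body finishes, the outer 'a' (still 1) is visible again, so the whole expression evaluates to 3.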

    + +

    With this, we have completed what we set out to do. Our nice iterative fib example from the intro compiles and runs just fine. The mem2reg pass optimizes all of our stack variables into SSA registers, inserting PHI nodes where needed, and our front-end remains simple: no "iterated dominance frontier" computation anywhere in sight.

    + +
    + + + + + +
    + +

    Here is the complete code listing for our running example, enhanced with mutable variables and var/in support. To build this example, use:

    + +
    +
    +# Compile
    +ocamlbuild toy.byte
    +# Run
    +./toy.byte
    +
    +
    + +

    Here is the code:

    + +
    +
    _tags:
    +
    +
    +<{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
    +<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
    +<*.{byte,native}>: use_llvm_executionengine, use_llvm_target
    +<*.{byte,native}>: use_llvm_scalar_opts, use_bindings
    +
    +
    + +
    myocamlbuild.ml:
    +
    +
    +open Ocamlbuild_plugin;;
    +
    +ocaml_lib ~extern:true "llvm";;
    +ocaml_lib ~extern:true "llvm_analysis";;
    +ocaml_lib ~extern:true "llvm_executionengine";;
    +ocaml_lib ~extern:true "llvm_target";;
    +ocaml_lib ~extern:true "llvm_scalar_opts";;
    +
    +flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);;
    +dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
    +
    +
    + +
    token.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer Tokens
    + *===----------------------------------------------------------------------===*)
    +
    +(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
    + * these others for known things. *)
    +type token =
    +  (* commands *)
    +  | Def | Extern
    +
    +  (* primary *)
    +  | Ident of string | Number of float
    +
    +  (* unknown *)
    +  | Kwd of char
    +
    +  (* control *)
    +  | If | Then | Else
    +  | For | In
    +
    +  (* operators *)
    +  | Binary | Unary
    +
    +  (* var definition *)
    +  | Var
    +
    +
    + +
    lexer.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Lexer
    + *===----------------------------------------------------------------------===*)
    +
    +let rec lex = parser
    +  (* Skip any whitespace. *)
    +  | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
    +
    +  (* identifier: [a-zA-Z][a-zA-Z0-9]* *)
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +
    +  (* number: [0-9.]+ *)
    +  | [< ' ('0' .. '9' as c); stream >] ->
    +      let buffer = Buffer.create 1 in
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +
    +  (* Comment until end of line. *)
    +  | [< ' ('#'); stream >] ->
    +      lex_comment stream
    +
    +  (* Otherwise, just return the character as its ascii value. *)
    +  | [< 'c; stream >] ->
    +      [< 'Token.Kwd c; lex stream >]
    +
    +  (* end of stream. *)
    +  | [< >] -> [< >]
    +
    +and lex_number buffer = parser
    +  | [< ' ('0' .. '9' | '.' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_number buffer stream
    +  | [< stream=lex >] ->
    +      [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
    +
    +and lex_ident buffer = parser
    +  | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
    +      Buffer.add_char buffer c;
    +      lex_ident buffer stream
    +  | [< stream=lex >] ->
    +      match Buffer.contents buffer with
    +      | "def" -> [< 'Token.Def; stream >]
    +      | "extern" -> [< 'Token.Extern; stream >]
    +      | "if" -> [< 'Token.If; stream >]
    +      | "then" -> [< 'Token.Then; stream >]
    +      | "else" -> [< 'Token.Else; stream >]
    +      | "for" -> [< 'Token.For; stream >]
    +      | "in" -> [< 'Token.In; stream >]
    +      | "binary" -> [< 'Token.Binary; stream >]
    +      | "unary" -> [< 'Token.Unary; stream >]
    +      | "var" -> [< 'Token.Var; stream >]
    +      | id -> [< 'Token.Ident id; stream >]
    +
    +and lex_comment = parser
    +  | [< ' ('\n'); stream=lex >] -> stream
    +  | [< 'c; e=lex_comment >] -> e
    +  | [< >] -> [< >]
    +
    +
    + +
    ast.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Abstract Syntax Tree (aka Parse Tree)
    + *===----------------------------------------------------------------------===*)
    +
    +(* expr - Base type for all expression nodes. *)
    +type expr =
    +  (* variant for numeric literals like "1.0". *)
    +  | Number of float
    +
    +  (* variant for referencing a variable, like "a". *)
    +  | Variable of string
    +
    +  (* variant for a unary operator. *)
    +  | Unary of char * expr
    +
    +  (* variant for a binary operator. *)
    +  | Binary of char * expr * expr
    +
    +  (* variant for function calls. *)
    +  | Call of string * expr array
    +
    +  (* variant for if/then/else. *)
    +  | If of expr * expr * expr
    +
    +  (* variant for for/in. *)
    +  | For of string * expr * expr * expr option * expr
    +
    +  (* variant for var/in. *)
    +  | Var of (string * expr option) array * expr
    +
    +(* proto - This type represents the "prototype" for a function, which captures
    + * its name, and its argument names (thus implicitly the number of arguments the
    + * function takes). *)
    +type proto =
    +  | Prototype of string * string array
    +  | BinOpPrototype of string * string array * int
    +
    +(* func - This type represents a function definition itself. *)
    +type func = Function of proto * expr
    +
    +
    + +
    parser.ml:
    +
    +
    +(*===---------------------------------------------------------------------===
    + * Parser
    + *===---------------------------------------------------------------------===*)
    +
    +(* binop_precedence - This holds the precedence for each binary operator that is
    + * defined *)
    +let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
    +
    +(* precedence - Get the precedence of the pending binary operator token. *)
    +let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
    +
    +(* primary
    + *   ::= identifier
    + *   ::= numberexpr
    + *   ::= parenexpr
    + *   ::= ifexpr
    + *   ::= forexpr
    + *   ::= varexpr *)
    +let rec parse_primary = parser
    +  (* numberexpr ::= number *)
    +  | [< 'Token.Number n >] -> Ast.Number n
    +
    +  (* parenexpr ::= '(' expression ')' *)
    +  | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
    +
    +  (* identifierexpr
    +   *   ::= identifier
    +   *   ::= identifier '(' argumentexpr ')' *)
    +  | [< 'Token.Ident id; stream >] ->
    +      let rec parse_args accumulator = parser
    +        | [< e=parse_expr; stream >] ->
    +            begin parser
    +              | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
    +              | [< >] -> e :: accumulator
    +            end stream
    +        | [< >] -> accumulator
    +      in
    +      let rec parse_ident id = parser
    +        (* Call. *)
    +        | [< 'Token.Kwd '(';
    +             args=parse_args [];
    +             'Token.Kwd ')' ?? "expected ')'">] ->
    +            Ast.Call (id, Array.of_list (List.rev args))
    +
    +        (* Simple variable ref. *)
    +        | [< >] -> Ast.Variable id
    +      in
    +      parse_ident id stream
    +
    +  (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
    +  | [< 'Token.If; c=parse_expr;
    +       'Token.Then ?? "expected 'then'"; t=parse_expr;
    +       'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
    +      Ast.If (c, t, e)
    +
    +  (* forexpr
    +        ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
    +  | [< 'Token.For;
    +       'Token.Ident id ?? "expected identifier after for";
    +       'Token.Kwd '=' ?? "expected '=' after for";
    +       stream >] ->
    +      begin parser
    +        | [<
    +             start=parse_expr;
    +             'Token.Kwd ',' ?? "expected ',' after for";
    +             end_=parse_expr;
    +             stream >] ->
    +            let step =
    +              begin parser
    +              | [< 'Token.Kwd ','; step=parse_expr >] -> Some step
    +              | [< >] -> None
    +              end stream
    +            in
    +            begin parser
    +            | [< 'Token.In; body=parse_expr >] ->
    +                Ast.For (id, start, end_, step, body)
    +            | [< >] ->
    +                raise (Stream.Error "expected 'in' after for")
    +            end stream
    +        | [< >] ->
    +            raise (Stream.Error "expected '=' after for")
    +      end stream
    +
    +  (* varexpr
    +   *   ::= 'var' identifier ('=' expression)?
    +   *             (',' identifier ('=' expression)?)* 'in' expression *)
    +  | [< 'Token.Var;
    +       (* At least one variable name is required. *)
    +       'Token.Ident id ?? "expected identifier after var";
    +       init=parse_var_init;
    +       var_names=parse_var_names [(id, init)];
    +       (* At this point, we have to have 'in'. *)
    +       'Token.In ?? "expected 'in' keyword after 'var'";
    +       body=parse_expr >] ->
    +      Ast.Var (Array.of_list (List.rev var_names), body)
    +
    +  | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
    +
    +(* unary
    + *   ::= primary
    + *   ::= '!' unary *)
    +and parse_unary = parser
    +  (* If this is a unary operator, read it. *)
    +  | [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_unary >] ->
    +      Ast.Unary (op, operand)
    +
    +  (* If the current token is not an operator, it must be a primary expr. *)
    +  | [< stream >] -> parse_primary stream
    +
    +(* binoprhs
    + *   ::= ('+' unary)* *)
    +and parse_bin_rhs expr_prec lhs stream =
    +  match Stream.peek stream with
    +  (* If this is a binop, find its precedence. *)
    +  | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
    +      let token_prec = precedence c in
    +
    +      (* If this is a binop that binds at least as tightly as the current binop,
    +       * consume it, otherwise we are done. *)
    +      if token_prec < expr_prec then lhs else begin
    +        (* Eat the binop. *)
    +        Stream.junk stream;
    +
    +        (* Parse the unary expression after the binary operator. *)
    +        let rhs = parse_unary stream in
    +
    +        (* Okay, we know this is a binop. *)
    +        let rhs =
    +          match Stream.peek stream with
    +          | Some (Token.Kwd c2) ->
    +              (* If BinOp binds less tightly with rhs than the operator after
    +               * rhs, let the pending operator take rhs as its lhs. *)
    +              let next_prec = precedence c2 in
    +              if token_prec < next_prec
    +              then parse_bin_rhs (token_prec + 1) rhs stream
    +              else rhs
    +          | _ -> rhs
    +        in
    +
    +        (* Merge lhs/rhs. *)
    +        let lhs = Ast.Binary (c, lhs, rhs) in
    +        parse_bin_rhs expr_prec lhs stream
    +      end
    +  | _ -> lhs
    +
    +and parse_var_init = parser
    +  (* read in the optional initializer. *)
    +  | [< 'Token.Kwd '='; e=parse_expr >] -> Some e
    +  | [< >] -> None
    +
    +and parse_var_names accumulator = parser
    +  | [< 'Token.Kwd ',';
    +       'Token.Ident id ?? "expected identifier list after var";
    +       init=parse_var_init;
    +       e=parse_var_names ((id, init) :: accumulator) >] -> e
    +  | [< >] -> accumulator
    +
    +(* expression
    + *   ::= unary binoprhs *)
    +and parse_expr = parser
    +  | [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
    +
    +(* prototype
    + *   ::= id '(' id* ')'
    + *   ::= binary LETTER number? (id, id)
    + *   ::= unary LETTER number? (id) *)
    +let parse_prototype =
    +  let rec parse_args accumulator = parser
    +    | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
    +    | [< >] -> accumulator
    +  in
    +  let parse_operator = parser
    +    | [< 'Token.Unary >] -> "unary", 1
    +    | [< 'Token.Binary >] -> "binary", 2
    +  in
    +  let parse_binary_precedence = parser
    +    | [< 'Token.Number n >] -> int_of_float n
    +    | [< >] -> 30
    +  in
    +  parser
    +  | [< 'Token.Ident id;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +       args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      (* success. *)
    +      Ast.Prototype (id, Array.of_list (List.rev args))
    +  | [< (prefix, kind)=parse_operator;
    +       'Token.Kwd op ?? "expected an operator";
    +       (* Read the precedence if present. *)
    +       binary_precedence=parse_binary_precedence;
    +       'Token.Kwd '(' ?? "expected '(' in prototype";
    +        args=parse_args [];
    +       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
    +      let name = prefix ^ (String.make 1 op) in
    +      let args = Array.of_list (List.rev args) in
    +
    +      (* Verify right number of arguments for operator. *)
    +      if Array.length args != kind
    +      then raise (Stream.Error "invalid number of operands for operator")
    +      else
    +        if kind == 1 then
    +          Ast.Prototype (name, args)
    +        else
    +          Ast.BinOpPrototype (name, args, binary_precedence)
    +  | [< >] ->
    +      raise (Stream.Error "expected function name in prototype")
    +
    +(* definition ::= 'def' prototype expression *)
    +let parse_definition = parser
    +  | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
    +      Ast.Function (p, e)
    +
    +(* toplevelexpr ::= expression *)
    +let parse_toplevel = parser
    +  | [< e=parse_expr >] ->
    +      (* Make an anonymous proto. *)
    +      Ast.Function (Ast.Prototype ("", [||]), e)
    +
    +(*  external ::= 'extern' prototype *)
    +let parse_extern = parser
    +  | [< 'Token.Extern; e=parse_prototype >] -> e
    +
    +
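The operator-precedence loop in parse_bin_rhs above can also be sketched without camlp4 stream parsers. What follows is a minimal, self-contained illustration rather than the tutorial's code: the `token` and `expr` types, the list-based token input, and the stand-in `parse_primary` are assumptions made for this example. The precedence values are the ones toy.ml installs for '<', '+', '-', and '*'.

```ocaml
(* Hypothetical, simplified token and AST types, for illustration only. *)
type token = Num of float | Op of char

type expr =
  | Number of float
  | Binary of char * expr * expr

(* Same precedences that toy.ml installs for '<', '+', '-', and '*'. *)
let binop_precedence : (char, int) Hashtbl.t = Hashtbl.create 10

let () =
  List.iter (fun (op, prec) -> Hashtbl.add binop_precedence op prec)
    [('<', 10); ('+', 20); ('-', 20); ('*', 40)]

(* Unknown operators get -1 so that they never bind. *)
let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1

(* Parse a single number; stands in for parse_unary / parse_primary. *)
let parse_primary = function
  | Num n :: rest -> (Number n, rest)
  | _ -> failwith "expected a number"

(* Returns the parsed expression together with the unconsumed tokens. *)
let rec parse_bin_rhs expr_prec lhs tokens =
  match tokens with
  | Op c :: rest when Hashtbl.mem binop_precedence c ->
      let token_prec = precedence c in
      (* An operator that binds less tightly than the current one ends
       * this sub-expression. *)
      if token_prec < expr_prec then (lhs, tokens)
      else begin
        let rhs, rest = parse_primary rest in
        (* If the next operator binds more tightly, it takes rhs as its
         * left-hand side first. *)
        let rhs, rest =
          match rest with
          | Op c2 :: _ when token_prec < precedence c2 ->
              parse_bin_rhs (token_prec + 1) rhs rest
          | _ -> (rhs, rest)
        in
        (* Merge lhs/rhs and keep looking for more operators. *)
        parse_bin_rhs expr_prec (Binary (c, lhs, rhs)) rest
      end
  | _ -> (lhs, tokens)

let parse_expr tokens =
  let lhs, rest = parse_primary tokens in
  parse_bin_rhs 0 lhs rest
```

With these definitions, `parse_expr [Num 1.; Op '+'; Num 2.; Op '*'; Num 3.]` groups the multiplication under the addition, and a chain of equal-precedence operators such as `1 - 2 - 3` associates to the left.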
    + +
    codegen.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Code Generation
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +
    +exception Error of string
    +
    +let context = global_context ()
    +let the_module = create_module context "my cool jit"
    +let builder = builder context
    +let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
    +let double_type = double_type context
    +
    +(* Create an alloca instruction in the entry block of the function. This
    + * is used for mutable variables etc. *)
    +let create_entry_block_alloca the_function var_name =
    +  let builder = builder_at context (instr_begin (entry_block the_function)) in
    +  build_alloca double_type var_name builder
    +
    +let rec codegen_expr = function
    +  | Ast.Number n -> const_float double_type n
    +  | Ast.Variable name ->
    +      let v = try Hashtbl.find named_values name with
    +        | Not_found -> raise (Error "unknown variable name")
    +      in
    +      (* Load the value. *)
    +      build_load v name builder
    +  | Ast.Unary (op, operand) ->
    +      let operand = codegen_expr operand in
    +      let callee = "unary" ^ (String.make 1 op) in
    +      let callee =
    +        match lookup_function callee the_module with
    +        | Some callee -> callee
    +        | None -> raise (Error "unknown unary operator")
    +      in
    +      build_call callee [|operand|] "unop" builder
    +  | Ast.Binary (op, lhs, rhs) ->
    +      begin match op with
    +      | '=' ->
    +          (* Special case '=' because we don't want to emit the LHS as an
    +           * expression. *)
    +          let name =
    +            match lhs with
    +            | Ast.Variable name -> name
    +            | _ -> raise (Error "destination of '=' must be a variable")
    +          in
    +
    +          (* Codegen the rhs. *)
    +          let val_ = codegen_expr rhs in
    +
    +          (* Lookup the name. *)
    +          let variable = try Hashtbl.find named_values name with
    +          | Not_found -> raise (Error "unknown variable name")
    +          in
    +          ignore(build_store val_ variable builder);
    +          val_
    +      | _ ->
    +          let lhs_val = codegen_expr lhs in
    +          let rhs_val = codegen_expr rhs in
    +          begin
    +            match op with
    +            | '+' -> build_fadd lhs_val rhs_val "addtmp" builder
    +            | '-' -> build_fsub lhs_val rhs_val "subtmp" builder
    +            | '*' -> build_fmul lhs_val rhs_val "multmp" builder
    +            | '<' ->
    +                (* Convert bool 0/1 to double 0.0 or 1.0 *)
    +                let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
    +                build_uitofp i double_type "booltmp" builder
    +            | _ ->
    +                (* If it wasn't a builtin binary operator, it must be a user defined
    +                 * one. Emit a call to it. *)
    +                let callee = "binary" ^ (String.make 1 op) in
    +                let callee =
    +                  match lookup_function callee the_module with
    +                  | Some callee -> callee
    +                  | None -> raise (Error "binary operator not found!")
    +                in
    +                build_call callee [|lhs_val; rhs_val|] "binop" builder
    +          end
    +      end
    +  | Ast.Call (callee, args) ->
    +      (* Look up the name in the module table. *)
    +      let callee =
    +        match lookup_function callee the_module with
    +        | Some callee -> callee
    +        | None -> raise (Error "unknown function referenced")
    +      in
    +      let params = params callee in
    +
    +      (* Report an error if the argument counts do not match. *)
    +      if Array.length params == Array.length args then () else
    +        raise (Error "incorrect # arguments passed");
    +      let args = Array.map codegen_expr args in
    +      build_call callee args "calltmp" builder
    +  | Ast.If (cond, then_, else_) ->
    +      let cond = codegen_expr cond in
    +
    +      (* Convert condition to a bool by comparing not-equal to 0.0. *)
    +      let zero = const_float double_type 0.0 in
    +      let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
    +
    +      (* Grab the first block so that we might later add the conditional branch
    +       * to it at the end of the function. *)
    +      let start_bb = insertion_block builder in
    +      let the_function = block_parent start_bb in
    +
    +      let then_bb = append_block context "then" the_function in
    +
    +      (* Emit 'then' value. *)
    +      position_at_end then_bb builder;
    +      let then_val = codegen_expr then_ in
    +
    +      (* Codegen of 'then' can change the current block, update then_bb for the
    +       * phi. We create a new name because one is used for the phi node, and the
    +       * other is used for the conditional branch. *)
    +      let new_then_bb = insertion_block builder in
    +
    +      (* Emit 'else' value. *)
    +      let else_bb = append_block context "else" the_function in
    +      position_at_end else_bb builder;
    +      let else_val = codegen_expr else_ in
    +
    +      (* Codegen of 'else' can change the current block, update else_bb for the
    +       * phi. *)
    +      let new_else_bb = insertion_block builder in
    +
    +      (* Emit merge block. *)
    +      let merge_bb = append_block context "ifcont" the_function in
    +      position_at_end merge_bb builder;
    +      let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
    +      let phi = build_phi incoming "iftmp" builder in
    +
    +      (* Return to the start block to add the conditional branch. *)
    +      position_at_end start_bb builder;
    +      ignore (build_cond_br cond_val then_bb else_bb builder);
    +
    +      (* Set an unconditional branch at the end of the 'then' block and the
    +       * 'else' block to the 'merge' block. *)
    +      position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
    +      position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
    +
    +      (* Finally, set the builder to the end of the merge block. *)
    +      position_at_end merge_bb builder;
    +
    +      phi
    +  | Ast.For (var_name, start, end_, step, body) ->
    +      (* Output this as:
    +       *   var = alloca double
    +       *   ...
    +       *   start = startexpr
    +       *   store start -> var
    +       *   goto loop
    +       * loop:
    +       *   ...
    +       *   bodyexpr
    +       *   ...
    +       * loopend:
    +       *   step = stepexpr
    +       *   endcond = endexpr
    +       *
    +       *   curvar = load var
    +       *   nextvar = curvar + step
    +       *   store nextvar -> var
    +       *   br endcond, loop, afterloop
    +       * afterloop: *)
    +
    +      let the_function = block_parent (insertion_block builder) in
    +
    +      (* Create an alloca for the variable in the entry block. *)
    +      let alloca = create_entry_block_alloca the_function var_name in
    +
    +      (* Emit the start code first, without 'variable' in scope. *)
    +      let start_val = codegen_expr start in
    +
    +      (* Store the value into the alloca. *)
    +      ignore(build_store start_val alloca builder);
    +
    +      (* Make the new basic block for the loop header, inserting after current
    +       * block. *)
    +      let loop_bb = append_block context "loop" the_function in
    +
    +      (* Insert an explicit fall through from the current block to the
    +       * loop_bb. *)
    +      ignore (build_br loop_bb builder);
    +
    +      (* Start insertion in loop_bb. *)
    +      position_at_end loop_bb builder;
    +
    +      (* Within the loop, the variable is bound to its alloca. If it
    +       * shadows an existing variable, we have to restore it, so save it
    +       * now. *)
    +      let old_val =
    +        try Some (Hashtbl.find named_values var_name) with Not_found -> None
    +      in
    +      Hashtbl.add named_values var_name alloca;
    +
    +      (* Emit the body of the loop.  This, like any other expr, can change the
    +       * current BB.  Note that we ignore the value computed by the body, but
    +       * don't allow an error *)
    +      ignore (codegen_expr body);
    +
    +      (* Emit the step value. *)
    +      let step_val =
    +        match step with
    +        | Some step -> codegen_expr step
    +        (* If not specified, use 1.0. *)
    +        | None -> const_float double_type 1.0
    +      in
    +
    +      (* Compute the end condition. *)
    +      let end_cond = codegen_expr end_ in
    +
    +      (* Reload, increment, and store back to the alloca. This handles the
    +       * case where the body of the loop mutates the variable. *)
    +      let cur_var = build_load alloca var_name builder in
    +      let next_var = build_fadd cur_var step_val "nextvar" builder in
    +      ignore(build_store next_var alloca builder);
    +
    +      (* Convert condition to a bool by comparing not-equal to 0.0. *)
    +      let zero = const_float double_type 0.0 in
    +      let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
    +
    +      (* Create the "after loop" block and insert it. *)
    +      let after_bb = append_block context "afterloop" the_function in
    +
    +      (* Insert the conditional branch at the end of the current block. *)
    +      ignore (build_cond_br end_cond loop_bb after_bb builder);
    +
    +      (* Any new code will be inserted in after_bb. *)
    +      position_at_end after_bb builder;
    +
    +      (* Restore the shadowed variable. *)
    +      begin match old_val with
    +      | Some old_val -> Hashtbl.add named_values var_name old_val
    +      | None -> ()
    +      end;
    +
    +      (* for expr always returns 0.0. *)
    +      const_null double_type
    +  | Ast.Var (var_names, body) ->
    +      let old_bindings = ref [] in
    +
    +      let the_function = block_parent (insertion_block builder) in
    +
    +      (* Register all variables and emit their initializer. *)
    +      Array.iter (fun (var_name, init) ->
    +        (* Emit the initializer before adding the variable to scope, this
    +         * prevents the initializer from referencing the variable itself, and
    +         * permits stuff like this:
    +         *   var a = 1 in
    +         *     var a = a in ...   # refers to outer 'a'. *)
    +        let init_val =
    +          match init with
    +          | Some init -> codegen_expr init
    +          (* If not specified, use 0.0. *)
    +          | None -> const_float double_type 0.0
    +        in
    +
    +        let alloca = create_entry_block_alloca the_function var_name in
    +        ignore(build_store init_val alloca builder);
    +
    +        (* Remember the old variable binding so that we can restore the binding
    +         * when we unrecurse. *)
    +        begin
    +          try
    +            let old_value = Hashtbl.find named_values var_name in
    +            old_bindings := (var_name, old_value) :: !old_bindings;
    +          with Not_found -> ()
    +        end;
    +
    +        (* Remember this binding. *)
    +        Hashtbl.add named_values var_name alloca;
    +      ) var_names;
    +
    +      (* Codegen the body, now that all vars are in scope. *)
    +      let body_val = codegen_expr body in
    +
    +      (* Pop all our variables from scope. *)
    +      List.iter (fun (var_name, old_value) ->
    +        Hashtbl.add named_values var_name old_value
    +      ) !old_bindings;
    +
    +      (* Return the body computation. *)
    +      body_val
    +
    +let codegen_proto = function
    +  | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) ->
    +      (* Make the function type: double(double,double) etc. *)
    +      let doubles = Array.make (Array.length args) double_type in
    +      let ft = function_type double_type doubles in
    +      let f =
    +        match lookup_function name the_module with
    +        | None -> declare_function name ft the_module
    +
    +        (* If 'f' conflicted, there was already something named 'name'. If it
    +         * has a body, don't allow redefinition or reextern. *)
    +        | Some f ->
    +            (* If 'f' already has a body, reject this. *)
    +            if block_begin f <> At_end f then
    +              raise (Error "redefinition of function");
    +
    +            (* If 'f' took a different number of arguments, reject. *)
    +            if element_type (type_of f) <> ft then
    +              raise (Error "redefinition of function with different # args");
    +            f
    +      in
    +
    +      (* Set names for all arguments. *)
    +      Array.iteri (fun i a ->
    +        let n = args.(i) in
    +        set_value_name n a;
    +        Hashtbl.add named_values n a;
    +      ) (params f);
    +      f
    +
    +(* Create an alloca for each argument and register the argument in the symbol
    + * table so that references to it will succeed. *)
    +let create_argument_allocas the_function proto =
    +  let args = match proto with
    +    | Ast.Prototype (_, args) | Ast.BinOpPrototype (_, args, _) -> args
    +  in
    +  Array.iteri (fun i ai ->
    +    let var_name = args.(i) in
    +    (* Create an alloca for this variable. *)
    +    let alloca = create_entry_block_alloca the_function var_name in
    +
    +    (* Store the initial value into the alloca. *)
    +    ignore(build_store ai alloca builder);
    +
    +    (* Add arguments to variable symbol table. *)
    +    Hashtbl.add named_values var_name alloca;
    +  ) (params the_function)
    +
    +let codegen_func the_fpm = function
    +  | Ast.Function (proto, body) ->
    +      Hashtbl.clear named_values;
    +      let the_function = codegen_proto proto in
    +
    +      (* If this is an operator, install it. *)
    +      begin match proto with
    +      | Ast.BinOpPrototype (name, args, prec) ->
    +          let op = name.[String.length name - 1] in
    +          Hashtbl.add Parser.binop_precedence op prec;
    +      | _ -> ()
    +      end;
    +
    +      (* Create a new basic block to start insertion into. *)
    +      let bb = append_block context "entry" the_function in
    +      position_at_end bb builder;
    +
    +      try
    +        (* Add all arguments to the symbol table and create their allocas. *)
    +        create_argument_allocas the_function proto;
    +
    +        let ret_val = codegen_expr body in
    +
    +        (* Finish off the function. *)
    +        let _ = build_ret ret_val builder in
    +
    +        (* Validate the generated code, checking for consistency. *)
    +        Llvm_analysis.assert_valid_function the_function;
    +
    +        (* Optimize the function. *)
    +        let _ = PassManager.run_function the_function the_fpm in
    +
    +        the_function
    +      with e ->
    +        delete_function the_function;
    +        raise e
    +
    +
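The save-and-restore handling of named_values in the For and Var cases above relies on how the standard library's Hashtbl treats repeated keys: Hashtbl.add hides an earlier binding rather than replacing it, and Hashtbl.remove makes the hidden binding visible again. The sketch below shows that behaviour in isolation. The `with_binding` helper is hypothetical and is not part of the tutorial; it is an alternative way to scope a binding, shown only to illustrate the semantics.

```ocaml
(* Hypothetical helper: bind [name] to [value] in [tbl] while [f] runs, then
 * drop the binding again, even if [f] raises. Because Hashtbl.add only
 * hides an earlier binding, Hashtbl.remove brings the outer one back. *)
let with_binding tbl name value f =
  Hashtbl.add tbl name value;
  let result =
    try f () with e ->
      Hashtbl.remove tbl name;
      raise e
  in
  Hashtbl.remove tbl name;
  result

let () =
  let named_values : (string, string) Hashtbl.t = Hashtbl.create 10 in
  Hashtbl.add named_values "a" "outer";

  (* Inside the scope, the inner binding shadows the outer one. *)
  let seen =
    with_binding named_values "a" "inner"
      (fun () -> Hashtbl.find named_values "a")
  in
  print_endline seen;

  (* After the scope ends, the outer binding is visible again. *)
  print_endline (Hashtbl.find named_values "a")
```

The listing instead remembers the old value and adds it back with Hashtbl.add, which gives the same lookups but leaves the extra bindings in the table.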
    + +
    toplevel.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Top-Level parsing and JIT Driver
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +open Llvm_executionengine
    +
    +(* top ::= definition | external | expression | ';' *)
    +let rec main_loop the_fpm the_execution_engine stream =
    +  match Stream.peek stream with
    +  | None -> ()
    +
    +  (* ignore top-level semicolons. *)
    +  | Some (Token.Kwd ';') ->
    +      Stream.junk stream;
    +      main_loop the_fpm the_execution_engine stream
    +
    +  | Some token ->
    +      begin
    +        try match token with
    +        | Token.Def ->
    +            let e = Parser.parse_definition stream in
    +            print_endline "parsed a function definition.";
    +            dump_value (Codegen.codegen_func the_fpm e);
    +        | Token.Extern ->
    +            let e = Parser.parse_extern stream in
    +            print_endline "parsed an extern.";
    +            dump_value (Codegen.codegen_proto e);
    +        | _ ->
    +            (* Evaluate a top-level expression into an anonymous function. *)
    +            let e = Parser.parse_toplevel stream in
    +            print_endline "parsed a top-level expr";
    +            let the_function = Codegen.codegen_func the_fpm e in
    +            dump_value the_function;
    +
    +            (* JIT and run the function, returning its result as a
    +             * generic value. *)
    +            let result = ExecutionEngine.run_function the_function [||]
    +              the_execution_engine in
    +
    +            print_string "Evaluated to ";
    +            print_float (GenericValue.as_float Codegen.double_type result);
    +            print_newline ();
    +        with Stream.Error s | Codegen.Error s ->
    +          (* Skip token for error recovery. *)
    +          Stream.junk stream;
    +          print_endline s;
    +      end;
    +      print_string "ready> "; flush stdout;
    +      main_loop the_fpm the_execution_engine stream
    +
    +
    + +
    toy.ml:
    +
    +
    +(*===----------------------------------------------------------------------===
    + * Main driver code.
    + *===----------------------------------------------------------------------===*)
    +
    +open Llvm
    +open Llvm_executionengine
    +open Llvm_target
    +open Llvm_scalar_opts
    +
    +let main () =
    +  ignore (initialize_native_target ());
    +
    +  (* Install standard binary operators.
    +   * 1 is the lowest precedence. *)
    +  Hashtbl.add Parser.binop_precedence '=' 2;
    +  Hashtbl.add Parser.binop_precedence '<' 10;
    +  Hashtbl.add Parser.binop_precedence '+' 20;
    +  Hashtbl.add Parser.binop_precedence '-' 20;
    +  Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
    +
    +  (* Prime the first token. *)
    +  print_string "ready> "; flush stdout;
    +  let stream = Lexer.lex (Stream.of_channel stdin) in
    +
    +  (* Create the JIT. *)
    +  let the_execution_engine = ExecutionEngine.create Codegen.the_module in
    +  let the_fpm = PassManager.create_function Codegen.the_module in
    +
    +  (* Set up the optimizer pipeline.  Start with registering info about how the
    +   * target lays out data structures. *)
    +  TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
    +
    +  (* Promote allocas to registers. *)
    +  add_memory_to_register_promotion the_fpm;
    +
    +  (* Do simple "peephole" optimizations and bit-twiddling optimizations. *)
    +  add_instruction_combination the_fpm;
    +
    +  (* Reassociate expressions. *)
    +  add_reassociation the_fpm;
    +
    +  (* Eliminate Common SubExpressions. *)
    +  add_gvn the_fpm;
    +
    +  (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
    +  add_cfg_simplification the_fpm;
    +
    +  ignore (PassManager.initialize the_fpm);
    +
    +  (* Run the main "interpreter loop" now. *)
    +  Toplevel.main_loop the_fpm the_execution_engine stream;
    +
    +  (* Print out all the generated code. *)
    +  dump_module Codegen.the_module
    +;;
    +
    +main ()
    +
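User-defined operators reach the same precedence table that main populates above. parse_prototype builds a function name from a prefix and the operator character, and codegen_func recovers the character from the last position of that name before installing its precedence. A minimal sketch of that round trip, using only the standard library, is shown below. The helper names `operator_name`, `operator_of_name`, and `install_binop` are hypothetical, and using Hashtbl.replace (so that a redefinition overwrites the old entry) is a choice made for this sketch; the listing uses Hashtbl.add.

```ocaml
let binop_precedence : (char, int) Hashtbl.t = Hashtbl.create 10

(* Build the function name for an operator, e.g. "binary" and '|'
 * give "binary|". *)
let operator_name prefix op = prefix ^ String.make 1 op

(* Recover the operator character: it is the last character of the name. *)
let operator_of_name name = name.[String.length name - 1]

(* Record the precedence of a user-defined binary operator. *)
let install_binop name prec =
  Hashtbl.replace binop_precedence (operator_of_name name) prec

let () =
  (* Standard operators, as installed by main. *)
  Hashtbl.add binop_precedence '<' 10;
  Hashtbl.add binop_precedence '+' 20;

  (* A user-defined operator with precedence 5. *)
  let name = operator_name "binary" '|' in
  install_binop name 5;
  Printf.printf "%s has precedence %d\n" name
    (Hashtbl.find binop_precedence '|')
```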
    +
    + +
    bindings.c
    +
    +
    +#include <stdio.h>
    +
    +/* putchard - putchar that takes a double and returns 0. */
    +extern double putchard(double X) {
    +  putchar((char)X);
    +  return 0;
    +}
    +
    +/* printd - printf that takes a double and prints it as "%f\n", returning 0. */
    +extern double printd(double X) {
    +  printf("%f\n", X);
    +  return 0;
    +}
    +
    +
    +
    + +Next: Conclusion and other useful LLVM tidbits +
    + + +
    +
    + Chris Lattner
    + The LLVM Compiler Infrastructure
    + Erick Tryzelaar
    + Last modified: $Date: 2010-05-28 10:07:41 -0700 (Fri, 28 May 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/tutorial/index.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/tutorial/index.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/tutorial/index.html (added) +++ www-releases/trunk/2.8/docs/tutorial/index.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,48 @@ + + + + LLVM Tutorial: Table of Contents + + + + + + + + +
    LLVM Tutorial: Table of Contents

    1. Kaleidoscope: Implementing a Language with LLVM
       1. Tutorial Introduction and the Lexer
       2. Implementing a Parser and AST
       3. Implementing Code Generation to LLVM IR
       4. Adding JIT and Optimizer Support
       5. Extending the language: control flow
       6. Extending the language: user-defined operators
       7. Extending the language: mutable variables / SSA construction
       8. Conclusion and other useful LLVM tidbits
    2. Kaleidoscope: Implementing a Language with LLVM in Objective Caml
       1. Tutorial Introduction and the Lexer
       2. Implementing a Parser and AST
       3. Implementing Code Generation to LLVM IR
       4. Adding JIT and Optimizer Support
       5. Extending the language: control flow
       6. Extending the language: user-defined operators
       7. Extending the language: mutable variables / SSA construction
       8. Conclusion and other useful LLVM tidbits
    3. Advanced Topics
       1. Writing an Optimization for LLVM
    + + + Added: www-releases/trunk/2.8/llvm-gcc-4.2-2.8-i686-linux.tgz URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/llvm-gcc-4.2-2.8-i686-linux.tgz?rev=115556&view=auto ============================================================================== Binary file - no diff available. Propchange: www-releases/trunk/2.8/llvm-gcc-4.2-2.8-i686-linux.tgz ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: www-releases/trunk/2.8/llvm-gcc-4.2-2.8-x86_64-apple-darwin10.tar.gz URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/llvm-gcc-4.2-2.8-x86_64-apple-darwin10.tar.gz?rev=115556&view=auto ============================================================================== Binary file - no diff available. Propchange: www-releases/trunk/2.8/llvm-gcc-4.2-2.8-x86_64-apple-darwin10.tar.gz ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: www-releases/trunk/2.8/llvm-gcc4.2-2.8-x86-mingw32.tar.bz2 URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/llvm-gcc4.2-2.8-x86-mingw32.tar.bz2?rev=115556&view=auto ============================================================================== Binary file - no diff available. 
Propchange: www-releases/trunk/2.8/llvm-gcc4.2-2.8-x86-mingw32.tar.bz2 ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream From isanbard at gmail.com Mon Oct 4 15:49:24 2010 From: isanbard at gmail.com (Bill Wendling) Date: Mon, 04 Oct 2010 20:49:24 -0000 Subject: [llvm-commits] [www-releases] r115556 [1/3] - in /www-releases/trunk/2.8: ./ docs/ docs/CommandGuide/ docs/CommandGuide/html/ docs/CommandGuide/man/ docs/CommandGuide/man/man1/ docs/CommandGuide/ps/ docs/HistoricalNotes/ docs/img/ docs/tutorial/ Message-ID: <20101004204926.859A72A6C12C@llvm.org> Author: void Date: Mon Oct 4 15:49:23 2010 New Revision: 115556 URL: http://llvm.org/viewvc/llvm-project?rev=115556&view=rev Log: Add in Linux-32, Mingw32, and Darwin binaries. Added: www-releases/trunk/2.8/clang+llvm-2.8-i686-linux.tgz (with props) www-releases/trunk/2.8/clang+llvm-2.8-x86_64-apple-darwin10.tar.gz (with props) www-releases/trunk/2.8/docs/ www-releases/trunk/2.8/docs/AliasAnalysis.html www-releases/trunk/2.8/docs/BitCodeFormat.html www-releases/trunk/2.8/docs/Bugpoint.html www-releases/trunk/2.8/docs/CFEBuildInstrs.html www-releases/trunk/2.8/docs/CMake.html www-releases/trunk/2.8/docs/CodeGenerator.html www-releases/trunk/2.8/docs/CodingStandards.html www-releases/trunk/2.8/docs/CommandGuide/ www-releases/trunk/2.8/docs/CommandGuide/FileCheck.pod www-releases/trunk/2.8/docs/CommandGuide/Makefile www-releases/trunk/2.8/docs/CommandGuide/bugpoint.pod www-releases/trunk/2.8/docs/CommandGuide/html/ www-releases/trunk/2.8/docs/CommandGuide/html/manpage.css www-releases/trunk/2.8/docs/CommandGuide/index.html www-releases/trunk/2.8/docs/CommandGuide/lit.pod www-releases/trunk/2.8/docs/CommandGuide/llc.pod www-releases/trunk/2.8/docs/CommandGuide/lli.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-ar.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-as.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-bcanalyzer.pod 
www-releases/trunk/2.8/docs/CommandGuide/llvm-config.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-diff.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-dis.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-extract.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-ld.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-link.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-nm.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-prof.pod www-releases/trunk/2.8/docs/CommandGuide/llvm-ranlib.pod www-releases/trunk/2.8/docs/CommandGuide/llvmc.pod www-releases/trunk/2.8/docs/CommandGuide/llvmgcc.pod www-releases/trunk/2.8/docs/CommandGuide/llvmgxx.pod www-releases/trunk/2.8/docs/CommandGuide/man/ www-releases/trunk/2.8/docs/CommandGuide/man/man1/ www-releases/trunk/2.8/docs/CommandGuide/manpage.css www-releases/trunk/2.8/docs/CommandGuide/opt.pod www-releases/trunk/2.8/docs/CommandGuide/ps/ www-releases/trunk/2.8/docs/CommandGuide/tblgen.pod www-releases/trunk/2.8/docs/CommandLine.html www-releases/trunk/2.8/docs/CompilerDriver.html www-releases/trunk/2.8/docs/CompilerDriverTutorial.html www-releases/trunk/2.8/docs/CompilerWriterInfo.html www-releases/trunk/2.8/docs/DebuggingJITedCode.html www-releases/trunk/2.8/docs/DeveloperPolicy.html www-releases/trunk/2.8/docs/ExceptionHandling.html www-releases/trunk/2.8/docs/ExtendedIntegerResults.txt www-releases/trunk/2.8/docs/ExtendingLLVM.html www-releases/trunk/2.8/docs/FAQ.html www-releases/trunk/2.8/docs/GCCFEBuildInstrs.html www-releases/trunk/2.8/docs/GarbageCollection.html www-releases/trunk/2.8/docs/GetElementPtr.html www-releases/trunk/2.8/docs/GettingStarted.html www-releases/trunk/2.8/docs/GettingStartedVS.html www-releases/trunk/2.8/docs/GoldPlugin.html www-releases/trunk/2.8/docs/HistoricalNotes/ www-releases/trunk/2.8/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeas.txt www-releases/trunk/2.8/docs/HistoricalNotes/2000-11-18-EarlyDesignIdeasResp.txt 
www-releases/trunk/2.8/docs/HistoricalNotes/2000-12-06-EncodingIdea.txt www-releases/trunk/2.8/docs/HistoricalNotes/2000-12-06-MeetingSummary.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-01-31-UniversalIRIdea.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-02-06-TypeNotationDebate.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp1.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp2.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-02-06-TypeNotationDebateResp4.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-02-09-AdveComments.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-02-09-AdveCommentsResponse.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-02-13-Reference-Memory.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-02-13-Reference-MemoryResponse.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-04-16-DynamicCompilation.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-05-18-ExceptionHandling.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-05-19-ExceptionResponse.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-06-01-GCCOptimizations.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-06-01-GCCOptimizations2.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-06-20-.NET-Differences.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-07-06-LoweringIRForCodeGen.txt www-releases/trunk/2.8/docs/HistoricalNotes/2001-09-18-OptimizeExceptions.txt www-releases/trunk/2.8/docs/HistoricalNotes/2002-05-12-InstListChange.txt www-releases/trunk/2.8/docs/HistoricalNotes/2002-06-25-MegaPatchInfo.txt www-releases/trunk/2.8/docs/HistoricalNotes/2003-01-23-CygwinNotes.txt www-releases/trunk/2.8/docs/HistoricalNotes/2003-06-25-Reoptimizer1.txt www-releases/trunk/2.8/docs/HistoricalNotes/2003-06-26-Reoptimizer2.txt www-releases/trunk/2.8/docs/HistoricalNotes/2007-OriginalClangReadme.txt www-releases/trunk/2.8/docs/HowToReleaseLLVM.html 
www-releases/trunk/2.8/docs/HowToSubmitABug.html www-releases/trunk/2.8/docs/LangRef.html www-releases/trunk/2.8/docs/Lexicon.html www-releases/trunk/2.8/docs/LinkTimeOptimization.html www-releases/trunk/2.8/docs/Makefile www-releases/trunk/2.8/docs/MakefileGuide.html www-releases/trunk/2.8/docs/Packaging.html www-releases/trunk/2.8/docs/Passes.html www-releases/trunk/2.8/docs/ProgrammersManual.html www-releases/trunk/2.8/docs/Projects.html www-releases/trunk/2.8/docs/ReleaseNotes.html www-releases/trunk/2.8/docs/SourceLevelDebugging.html www-releases/trunk/2.8/docs/SystemLibrary.html www-releases/trunk/2.8/docs/TableGenFundamentals.html www-releases/trunk/2.8/docs/TestingGuide.html www-releases/trunk/2.8/docs/UsingLibraries.html www-releases/trunk/2.8/docs/WritingAnLLVMBackend.html www-releases/trunk/2.8/docs/WritingAnLLVMPass.html www-releases/trunk/2.8/docs/doxygen.cfg.in www-releases/trunk/2.8/docs/doxygen.css www-releases/trunk/2.8/docs/doxygen.footer www-releases/trunk/2.8/docs/doxygen.header www-releases/trunk/2.8/docs/doxygen.intro www-releases/trunk/2.8/docs/img/ www-releases/trunk/2.8/docs/img/Debugging.gif (with props) www-releases/trunk/2.8/docs/img/libdeps.gif (with props) www-releases/trunk/2.8/docs/img/lines.gif (with props) www-releases/trunk/2.8/docs/img/objdeps.gif (with props) www-releases/trunk/2.8/docs/img/venusflytrap.jpg (with props) www-releases/trunk/2.8/docs/index.html www-releases/trunk/2.8/docs/llvm.css www-releases/trunk/2.8/docs/re_format.7 www-releases/trunk/2.8/docs/tutorial/ www-releases/trunk/2.8/docs/tutorial/LangImpl1.html www-releases/trunk/2.8/docs/tutorial/LangImpl2.html www-releases/trunk/2.8/docs/tutorial/LangImpl3.html www-releases/trunk/2.8/docs/tutorial/LangImpl4.html www-releases/trunk/2.8/docs/tutorial/LangImpl5-cfg.png (with props) www-releases/trunk/2.8/docs/tutorial/LangImpl5.html www-releases/trunk/2.8/docs/tutorial/LangImpl6.html www-releases/trunk/2.8/docs/tutorial/LangImpl7.html 
www-releases/trunk/2.8/docs/tutorial/LangImpl8.html www-releases/trunk/2.8/docs/tutorial/Makefile www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl1.html www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl2.html www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl3.html www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl4.html www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl5.html www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl6.html www-releases/trunk/2.8/docs/tutorial/OCamlLangImpl7.html www-releases/trunk/2.8/docs/tutorial/index.html www-releases/trunk/2.8/llvm-gcc-4.2-2.8-i686-linux.tgz (with props) www-releases/trunk/2.8/llvm-gcc-4.2-2.8-x86_64-apple-darwin10.tar.gz (with props) www-releases/trunk/2.8/llvm-gcc4.2-2.8-x86-mingw32.tar.bz2 (with props) Added: www-releases/trunk/2.8/clang+llvm-2.8-i686-linux.tgz URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/clang%2Bllvm-2.8-i686-linux.tgz?rev=115556&view=auto ============================================================================== Binary file - no diff available. Propchange: www-releases/trunk/2.8/clang+llvm-2.8-i686-linux.tgz ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: www-releases/trunk/2.8/clang+llvm-2.8-x86_64-apple-darwin10.tar.gz URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/clang%2Bllvm-2.8-x86_64-apple-darwin10.tar.gz?rev=115556&view=auto ============================================================================== Binary file - no diff available. 
Propchange: www-releases/trunk/2.8/clang+llvm-2.8-x86_64-apple-darwin10.tar.gz ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: www-releases/trunk/2.8/docs/AliasAnalysis.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/AliasAnalysis.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/AliasAnalysis.html (added) +++ www-releases/trunk/2.8/docs/AliasAnalysis.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1005 @@ + + + + LLVM Alias Analysis Infrastructure + + + + +
    + LLVM Alias Analysis Infrastructure +
    + +
      +
    1. Introduction
    2. AliasAnalysis Class Overview
    3. Writing a new AliasAnalysis Implementation
    4. Using alias analysis results
    5. Existing alias analysis implementations and clients
    6. Memory Dependence Analysis
    + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    Alias Analysis (aka Pointer Analysis) is a class of techniques which attempt +to determine whether or not two pointers ever can point to the same object in +memory. There are many different algorithms for alias analysis and many +different ways of classifying them: flow-sensitive vs flow-insensitive, +context-sensitive vs context-insensitive, field-sensitive vs field-insensitive, +unification-based vs subset-based, etc. Traditionally, alias analyses respond +to a query with a Must, May, or No alias response, +indicating that two pointers always point to the same object, might point to the +same object, or are known to never point to the same object.

    + +

    The LLVM AliasAnalysis +class is the primary interface used by clients and implementations of alias +analyses in the LLVM system. This class is the common interface between clients +of alias analysis information and the implementations providing it, and is +designed to support a wide range of implementations and clients (but currently +all clients are assumed to be flow-insensitive). In addition to simple alias +analysis information, this class exposes Mod/Ref information from those +implementations which can provide it, allowing for powerful analyses and +transformations to work well together.

    + +

    This document contains information necessary to successfully implement this +interface, use it, and to test both sides. It also explains some of the finer +points about what exactly results mean. If you feel that something is unclear +or should be added, please let me +know.

    + +
    + + + + + +
    + +

    The AliasAnalysis +class defines the interface that the various alias analysis implementations +should support. This class exports two important enums: AliasResult +and ModRefResult which represent the result of an alias query or a +mod/ref query, respectively.

    + +

    The AliasAnalysis interface exposes information about memory, +represented in several different ways. In particular, memory objects are +represented as a starting address and size, and function calls are represented +as the actual call or invoke instructions that perform the +call. The AliasAnalysis interface also exposes some helper methods +which allow you to get mod/ref information for arbitrary instructions.

    + +

    All AliasAnalysis interfaces require that in queries involving +multiple values, values which are not +constants are all defined within the +same function.

    + +
    + + + + +
    + +

    Most importantly, the AliasAnalysis class provides several methods +which are used to query whether or not two memory objects alias, whether +function calls can modify or read a memory object, etc. For all of these +queries, memory objects are represented as a pair of their starting address (a +symbolic LLVM Value*) and a static size.

    + +

    Representing memory objects as a starting address and a size is critically +important for correct Alias Analyses. For example, consider this (silly, but +possible) C code:

    + +
    +
    +int i;
    +char C[2];
    +char A[10]; 
    +/* ... */
    +for (i = 0; i != 10; ++i) {
    +  C[0] = A[i];          /* One byte store */
    +  C[1] = A[9-i];        /* One byte store */
    +}
    +
    +
    + +

    In this case, the basicaa pass will disambiguate the stores to +C[0] and C[1] because they are accesses to two distinct +locations one byte apart, and the accesses are each one byte. In this case, the +LICM pass can use store motion to remove the stores from the loop. In +contrast, the following code:

    + +
    +
    +int i;
    +char C[2];
    +char A[10]; 
    +/* ... */
    +for (i = 0; i != 10; ++i) {
    +  ((short*)C)[0] = A[i];  /* Two byte store! */
    +  C[1] = A[9-i];          /* One byte store */
    +}
    +
    +
    + +

    In this case, the two stores to C do alias each other, because the access to +the &C[0] element is a two byte access. If size information wasn't +available in the query, even the first case would have to conservatively assume +that the accesses alias.
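The role of the size in these two examples can be sketched in a few lines of C++. This is only an illustration, not LLVM code: the `Access` struct and the `overlaps` and `sizeExample` functions are hypothetical names, both accesses are assumed to be byte offsets into the same array `C`, and a `short` is assumed to be two bytes wide, as in the comments above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one memory access: a byte offset into a single
// base object plus a static size in bytes.
struct Access {
  std::uint64_t Offset;
  std::uint64_t Size;
};

// Two accesses overlap when their half-open byte ranges
// [Offset, Offset + Size) intersect.
inline bool overlaps(const Access &A, const Access &B) {
  return A.Offset < B.Offset + B.Size && B.Offset < A.Offset + A.Size;
}

// The stores from the two loops in the text, expressed as accesses into C.
inline void sizeExample() {
  Access ByteAt0 = {0, 1};  // C[0] = ...
  Access ByteAt1 = {1, 1};  // C[1] = ...
  Access ShortAt0 = {0, 2}; // ((short*)C)[0] = ...
  assert(!overlaps(ByteAt0, ByteAt1)); // first loop: disjoint stores
  assert(overlaps(ShortAt0, ByteAt1)); // second loop: the stores overlap
}
```

Without the `Size` field, the only safe answer for the first pair would also be "might overlap", which is the point the paragraph above makes.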

    + +
    + + + + +
    +

    The alias method is the primary interface used to determine whether +or not two memory objects alias each other. It takes two memory objects as +input and returns MustAlias, MayAlias, or NoAlias as appropriate.

    + +

    Like all AliasAnalysis interfaces, the alias method requires +that either the two pointer values be defined within the same function, or at +least one of the values is a constant.

    +
    + + + + +
    +

    The NoAlias response may be used when there is never an immediate dependence +between any memory reference based on one pointer and any memory +reference based on the other. The most obvious example is when the two +pointers point to non-overlapping memory ranges. Another is when the two +pointers are only ever used for reading memory. Another is when the memory is +freed and reallocated between accesses through one pointer and accesses through +the other -- in this case, there is a dependence, but it's mediated by the free +and reallocation.

    + +

    An exception to this is the +noalias keyword; the "irrelevant" +dependencies are ignored.

    + +

    The MayAlias response is used whenever the two pointers might refer to the +same object. If the two memory objects overlap, but do not start at the same +location, return MayAlias.

    + +

    The MustAlias response may only be returned if the two memory objects are +guaranteed to always start at exactly the same location. A MustAlias response +implies that the pointers compare equal.
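The three responses can be illustrated with a toy classifier. This is a sketch under stated assumptions, not the logic of any LLVM pass: `Location` and `classify` are hypothetical, each pointer is assumed to be described by an allocation id (negative when unknown), a byte offset and a size, and different ids are assumed to name distinct, non-overlapping allocations.

```cpp
#include <cassert>
#include <cstdint>

enum AliasResult { NoAlias, MayAlias, MustAlias };

// Hypothetical location: which allocation it points into (negative when
// unknown), the byte offset of its start, and its static size.
struct Location {
  int Object;
  std::uint64_t Offset;
  std::uint64_t Size;
};

inline AliasResult classify(const Location &A, const Location &B) {
  // Nothing is known about at least one pointer: be conservative.
  if (A.Object < 0 || B.Object < 0)
    return MayAlias;
  // Assumed: different ids name distinct, non-overlapping allocations.
  if (A.Object != B.Object)
    return NoAlias;
  // Same allocation, same start: the objects begin at the same location.
  if (A.Offset == B.Offset)
    return MustAlias;
  // Overlapping ranges with different starts are MayAlias, as stated above.
  bool Overlap =
      A.Offset < B.Offset + B.Size && B.Offset < A.Offset + A.Size;
  return Overlap ? MayAlias : NoAlias;
}
```

Note that the sketch returns MustAlias only when the starting offsets are equal, regardless of the sizes, which mirrors the rule that MustAlias is about the starting location.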

    + +
    + + + + +
    + +

    The getModRefInfo methods return information about whether the +execution of an instruction can read or modify a memory location. Mod/Ref +information is always conservative: if an instruction might read or write +a location, ModRef is returned.

    + +

    The AliasAnalysis class also provides a getModRefInfo +method for testing dependencies between function calls. This method takes two +call sites (CS1 & CS2), returns NoModRef if neither call writes to memory +read or written by the other, Ref if CS1 reads memory written by CS2, Mod if CS1 +writes to memory read or written by CS2, or ModRef if CS1 might read or write +memory written to by CS2. Note that this relation is not commutative.
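One convenient way to think about mod/ref results is as two independent bits, "may read" and "may write". The sketch below assumes that encoding purely for illustration; the numeric values and the helpers `combine`, `mayRead`, `mayWrite` and `loadSurvives` are hypothetical and are not taken from the LLVM headers.

```cpp
#include <cassert>

// Illustrative two-bit encoding: bit 0 = may read, bit 1 = may write.
enum ModRefResult { NoModRef = 0, Ref = 1, Mod = 2, ModRef = 3 };

// The conservative union of two effects is the bitwise OR of their bits.
inline ModRefResult combine(ModRefResult A, ModRefResult B) {
  return static_cast<ModRefResult>(A | B);
}

inline bool mayRead(ModRefResult R) { return (R & Ref) != 0; }
inline bool mayWrite(ModRefResult R) { return (R & Mod) != 0; }

// A previously loaded value stays valid across an instruction only if
// that instruction cannot write the location.
inline bool loadSurvives(ModRefResult EffectOnLocation) {
  return !mayWrite(EffectOnLocation);
}
```

Under this view, a client that cannot prove anything simply keeps both bits set, which is the conservative ModRef answer described above.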

    + +
    + + + + + +
    + +

    +Several other tidbits of information are often collected by various alias +analysis implementations and can be put to good use by various clients. +

    + +
    + + +
    + The pointsToConstantMemory method +
    + +
    + +

    The pointsToConstantMemory method returns true if and only if the +analysis can prove that the pointer only points to unchanging memory locations +(functions, constant global variables, and the null pointer). This information +can be used to refine mod/ref information: it is impossible for an unchanging +memory location to be modified.

    + +
    + + + + +
    + +

    These methods are used to provide very simple mod/ref information for +function calls. The doesNotAccessMemory method returns true for a +function if the analysis can prove that the function never reads or writes to +memory, or if the function only reads from constant memory. Functions with this +property are side-effect free and only depend on their input arguments, allowing +them to be eliminated if they form common subexpressions or be hoisted out of +loops. Many common functions behave this way (e.g., sin and +cos) but many others do not (e.g., acos, which modifies the +errno variable).

    + +

    The onlyReadsMemory method returns true for a function if analysis +can prove that (at most) the function only reads from non-volatile memory. +Functions with this property are side-effect free, only depending on their input +arguments and the state of memory when they are called. This property allows +calls to these functions to be eliminated and moved around, as long as there is +no store instruction that changes the contents of memory. Note that all +functions that satisfy the doesNotAccessMemory method also satisfy +onlyReadsMemory.
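The relationship between the two predicates, and how a client might use them, can be sketched as follows. The `CalleeBehavior` enum and `canHoistCall` are hypothetical simplifications introduced for this example; they are not the LLVM interface.

```cpp
#include <cassert>

// Hypothetical, simplified classification of a callee.
enum CalleeBehavior { AccessesNothing, ReadsOnly, Unknown };

inline bool doesNotAccessMemory(CalleeBehavior B) {
  return B == AccessesNothing;
}

// Every function that accesses no memory trivially only reads memory.
inline bool onlyReadsMemory(CalleeBehavior B) {
  return B == AccessesNothing || B == ReadsOnly;
}

// A call can leave a loop if its arguments are loop-invariant and either
// it never touches memory, or it only reads memory and nothing in the
// loop stores to memory.
inline bool canHoistCall(CalleeBehavior B, bool ArgsLoopInvariant,
                         bool LoopMayStore) {
  if (!ArgsLoopInvariant)
    return false;
  if (doesNotAccessMemory(B))
    return true;
  return onlyReadsMemory(B) && !LoopMayStore;
}
```

The `LoopMayStore` flag stands in for the condition that no store changes the contents of memory between the original and the moved call.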

    + +
    + + + + + +
    + +

    Writing a new alias analysis implementation for LLVM is quite +straightforward. There are already several implementations that you can use +for examples, and the following information should help fill in any details. +For examples, take a look at the various alias analysis +implementations included with LLVM.

    + +
    + + + + +
    + +

    The first step is to determine what type of LLVM pass you need to use for your Alias +Analysis. As is the case with most other analyses and transformations, the +answer should be fairly obvious from what type of problem you are trying to +solve:

    + +
      +
    1. If you require interprocedural analysis, it should be a Pass.
    2. If your analysis is function-local, subclass FunctionPass.
    3. If you don't need to look at the program at all, subclass ImmutablePass.
    + +

    In addition to the pass that you subclass, you should also inherit from the +AliasAnalysis interface, of course, and use the +RegisterAnalysisGroup template to register as an implementation of +AliasAnalysis.

    + +
    + + + + +
    + +

    Your subclass of AliasAnalysis is required to invoke two methods on +the AliasAnalysis base class: getAnalysisUsage and +InitializeAliasAnalysis. In particular, your implementation of +getAnalysisUsage should explicitly call into the +AliasAnalysis::getAnalysisUsage method in addition to +declaring any pass dependencies your pass has. Thus you should have something +like this:

    + +
    +
    +void getAnalysisUsage(AnalysisUsage &AU) const {
    +  AliasAnalysis::getAnalysisUsage(AU);
    +  // declare your dependencies here.
    +}
    +
    +
    + +

    Additionally, you must invoke the InitializeAliasAnalysis method +from your analysis run method (run for a Pass, +runOnFunction for a FunctionPass, or InitializePass +for an ImmutablePass). For example (as part of a Pass):

    + +
    +
    +bool run(Module &M) {
    +  InitializeAliasAnalysis(this);
    +  // Perform analysis here...
    +  return false;
    +}
    +
    +
    + +
    + + + + +
    + +

    All of the AliasAnalysis +virtual methods default to providing chaining to another +alias analysis implementation, which ends up returning conservatively correct +information (returning "May" Alias and "Mod/Ref" for alias and mod/ref queries +respectively). Depending on the capabilities of the analysis you are +implementing, you just override the interfaces you can improve.

    + +
    + + + + + + +
    + +

    With only two special exceptions (the basicaa and no-aa +passes) every alias analysis pass chains to another alias analysis +implementation (for example, the user can specify "-basicaa -ds-aa +-licm" to get the maximum benefit from both alias +analyses). The alias analysis class automatically takes care of most of this +for methods that you don't override. For methods that you do override, in code +paths that return a conservative MayAlias or Mod/Ref result, simply return +whatever the superclass computes. For example:

    + +
    +
    +AliasAnalysis::AliasResult alias(const Value *V1, unsigned V1Size,
    +                                 const Value *V2, unsigned V2Size) {
    +  if (...)
    +    return NoAlias;
    +  ...
    +
    +  // Couldn't determine a must or no-alias result.
    +  return AliasAnalysis::alias(V1, V1Size, V2, V2Size);
    +}
    +
    +
    + +

    In addition to analysis queries, you must make sure to unconditionally pass +LLVM update notification methods to the superclass as +well if you override them, which allows all alias analyses in a chain to be +updated.
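The chaining behaviour described in this section can be modelled without any LLVM headers. In the sketch below everything is hypothetical (`ToyAA`, `SameValueAA`, `ParityAA`, and the use of plain unsigned ids in place of `Value*`); it only shows the delegation pattern: answer when you can, otherwise defer to the base class, and always forward update notifications.

```cpp
#include <cassert>

enum AliasResult { NoAlias, MayAlias, MustAlias };

// Toy base class: every query and notification is forwarded to the next
// analysis in the chain; the end of the chain answers MayAlias.
class ToyAA {
public:
  explicit ToyAA(ToyAA *NextAA = nullptr) : Next(NextAA) {}
  virtual ~ToyAA() {}
  virtual AliasResult alias(unsigned P1, unsigned P2) {
    return Next ? Next->alias(P1, P2) : MayAlias;
  }
  virtual void deleteValue(unsigned V) {
    if (Next)
      Next->deleteValue(V);
  }

private:
  ToyAA *Next;
};

// Knows only that a value must alias itself.
class SameValueAA : public ToyAA {
public:
  explicit SameValueAA(ToyAA *NextAA = nullptr) : ToyAA(NextAA) {}
  AliasResult alias(unsigned P1, unsigned P2) override {
    if (P1 == P2)
      return MustAlias;
    return ToyAA::alias(P1, P2); // no precise answer: defer to the chain
  }
};

// Toy rule, purely for illustration: even and odd ids never alias.
class ParityAA : public ToyAA {
public:
  explicit ParityAA(ToyAA *NextAA = nullptr) : ToyAA(NextAA), Deleted(0) {}
  AliasResult alias(unsigned P1, unsigned P2) override {
    if ((P1 % 2) != (P2 % 2))
      return NoAlias;
    return ToyAA::alias(P1, P2);
  }
  void deleteValue(unsigned V) override {
    ++Deleted;             // update this analysis' own state...
    ToyAA::deleteValue(V); // ...and unconditionally notify the rest
  }
  unsigned Deleted;
};
```

Chaining `SameValueAA` in front of `ParityAA` in front of a plain `ToyAA` gives MustAlias for identical ids, NoAlias for ids of different parity, and the conservative MayAlias otherwise.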

    + +
    + + + + + +
    +

    +Alias analysis information is initially computed for a static snapshot of the +program, but clients will use this information to make transformations to the +code. All but the most trivial forms of alias analysis will need to have their +analysis results updated to reflect the changes made by these transformations. +

    + +

    +The AliasAnalysis interface exposes two methods which are used to +communicate program changes from the clients to the analysis implementations. +Various alias analysis implementations should use these methods to ensure that +their internal data structures are kept up-to-date as the program changes (for +example, when an instruction is deleted), and clients of alias analysis must be +sure to call these interfaces appropriately. +

    +
    + + +
    The deleteValue method
    + +
    +The deleteValue method is called by transformations when they remove an +instruction or any other value from the program (including values that do not +use pointers). Typically alias analyses keep data structures that have entries +for each value in the program. When this method is called, they should remove +any entries for the specified value, if they exist. +
    + + +
    The copyValue method
    + +
    +The copyValue method is used when a new value is introduced into the +program. There is no way to introduce a value into the program that did not +exist before (this doesn't make sense for a safe compiler transformation), so +this is the only way to introduce a new value. This method indicates that the +new value has exactly the same properties as the value being copied. +
    + + +
    The replaceWithNewValue method
    + +
    +This method is a simple helper method that is provided to make clients easier to +use. It is implemented by copying the old analysis information to the new +value, then deleting the old value. This method cannot be overridden by alias +analysis implementations. +
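As a rough illustration of what an implementation does with these notifications, the sketch below keeps one map entry per value. The class `ToyPointsTo`, its `record` and `objectOf` helpers, and the use of integer ids for values and objects are all assumptions made for this example.

```cpp
#include <cassert>
#include <map>

// Hypothetical analysis state: each value id maps to the id of the
// object it is known to point to.
class ToyPointsTo {
public:
  void record(int V, int Obj) { ObjectOf[V] = Obj; }

  // Returns the recorded object id, or -1 when nothing is known.
  int objectOf(int V) const {
    std::map<int, int>::const_iterator It = ObjectOf.find(V);
    return It == ObjectOf.end() ? -1 : It->second;
  }

  // The value is leaving the program: drop any entry for it.
  void deleteValue(int V) { ObjectOf.erase(V); }

  // The new value has exactly the same properties as the old one.
  void copyValue(int From, int To) {
    std::map<int, int>::const_iterator It = ObjectOf.find(From);
    if (It == ObjectOf.end())
      return;
    int Obj = It->second;
    ObjectOf[To] = Obj;
  }

  // Helper built from the two primitives: copy, then delete.
  void replaceWithNewValue(int Old, int New) {
    copyValue(Old, New);
    deleteValue(Old);
  }

private:
  std::map<int, int> ObjectOf;
};
```

An analysis that keeps no per-value state can ignore these notifications, but one that does must handle them or its tables will refer to values that no longer exist.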
    + + + + +
    + +

    From the LLVM perspective, the only thing you need to do to provide an +efficient alias analysis is to make sure that alias analysis queries are +serviced quickly. The actual calculation of the alias analysis results (the +"run" method) is only performed once, but many (perhaps duplicate) queries may +be performed. Because of this, try to move as much computation to the run +method as possible (within reason).

    + +
    + + + + +
    + +

    PassManager support for alternative AliasAnalysis implementations +has some issues.

    + +

    There is no way to override the default alias analysis. It would +be very useful to be able to do something like "opt -my-aa -O2" and +have it use -my-aa for all passes which need AliasAnalysis, but there +is currently no support for that, short of changing the source code +and recompiling. Similarly, there is also no way of setting a chain +of analyses as the default.

    + +

    There is no way for transform passes to declare that they preserve +AliasAnalysis implementations. The AliasAnalysis +interface includes deleteValue and copyValue methods +which are intended to allow a pass to keep an AliasAnalysis consistent, +however there's no way for a pass to declare in its +getAnalysisUsage that it does so. Some passes attempt to use +AU.addPreserved<AliasAnalysis>, however this doesn't +actually have any effect. + +

    AliasAnalysisCounter (-count-aa) and AliasDebugger +(-debug-aa) are implemented as ModulePass classes, so if your +alias analysis uses FunctionPass, it won't be able to use +these utilities. If you try to use them, the pass manager will +silently route alias analysis queries directly to +BasicAliasAnalysis instead.

    + +

    Similarly, the opt -p option introduces ModulePass +passes between each pass, which prevents the use of FunctionPass +alias analysis passes.

    + +
    + + + + + +
    + +

    There are several different ways to use alias analysis results. In order of +preference, these are...

    + +
    + + + + +
    + +

    The memdep pass uses alias analysis to provide high-level dependence +information about memory-using instructions. This will tell you which store +feeds into a load, for example. It uses caching and other techniques to be +efficient, and is used by Dead Store Elimination, GVN, and memcpy optimizations. +

    + +
    + + + + +
    + +

    Many transformations need information about alias sets that are active +in some scope, rather than information about pairwise aliasing. The AliasSetTracker class +is used to efficiently build these Alias Sets from the pairwise alias analysis +information provided by the AliasAnalysis interface.

    + +

    First you initialize the AliasSetTracker by using the "add" methods +to add information about various potentially aliasing instructions in the scope +you are interested in. Once all of the alias sets are completed, your pass +should simply iterate through the constructed alias sets, using the +AliasSetTracker begin()/end() methods.

    + +

    The AliasSets formed by the AliasSetTracker are guaranteed +to be disjoint, calculate mod/ref information and volatility for the set, and +keep track of whether or not all of the pointers in the set are Must aliases. +The AliasSetTracker also makes sure that sets are properly folded due to call +instructions, and can provide a list of pointers in each set.

    + +

    As an example user of this, the Loop +Invariant Code Motion pass uses AliasSetTrackers to calculate alias +sets for each loop nest. If an AliasSet in a loop is not modified, +then all load instructions from that set may be hoisted out of the loop. If any +alias sets are stored to and are must alias sets, then the stores may be +sunk to outside of the loop, promoting the memory location to a register for the +duration of the loop nest. Both of these transformations only apply if the +pointer argument is loop-invariant.
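The two LICM decisions described above reduce to simple predicates over per-set facts. The sketch below is a hypothetical restatement of that paragraph (the `LoopAliasSet` struct and both function names are invented here); it is not how the pass is actually written.

```cpp
#include <cassert>

// Hypothetical summary of one alias set within a loop.
struct LoopAliasSet {
  bool IsModified;       // some instruction in the loop stores to the set
  bool IsMustAlias;      // all pointers in the set must-alias each other
  bool PointerInvariant; // the pointer operand is loop-invariant
};

// Loads from an unmodified set may be hoisted out of the loop.
inline bool canHoistLoads(const LoopAliasSet &S) {
  return S.PointerInvariant && !S.IsModified;
}

// A stored-to must-alias set may be promoted to a register for the
// duration of the loop, with the store sunk to after the loop.
inline bool canPromoteToRegister(const LoopAliasSet &S) {
  return S.PointerInvariant && S.IsModified && S.IsMustAlias;
}
```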

    + +
    + + +
    + The AliasSetTracker implementation +
    + +
    + +

    The AliasSetTracker class is implemented to be as efficient as possible. It +uses the union-find algorithm to efficiently merge AliasSets when a pointer is +inserted into the AliasSetTracker that aliases multiple sets. The primary data +structure is a hash table mapping pointers to the AliasSet they are in.

    + +

    The AliasSetTracker class must maintain a list of all of the LLVM Value*'s +that are in each AliasSet. Since the hash table already has entries for each +LLVM Value* of interest, the AliasSets thread the linked list through these +hash-table nodes to avoid having to allocate memory unnecessarily, and to make +merging alias sets extremely efficient (the linked list merge is constant time). +
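The union-find idea mentioned above can be shown in isolation. The sketch below is a generic, index-based union-find and makes no attempt to reproduce the hash table or the threaded linked lists; `ToyAliasSets` and its methods are hypothetical names, and pointers are assumed to be numbered 0 to N-1.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Generic union-find over pointer indices: every pointer starts in its
// own set, and sets are merged when a new pointer aliases several of them.
class ToyAliasSets {
public:
  explicit ToyAliasSets(std::size_t N) : Parent(N) {
    std::iota(Parent.begin(), Parent.end(), std::size_t(0));
  }

  // Returns the representative of X's set, shortening the path as it goes.
  std::size_t find(std::size_t X) {
    while (Parent[X] != X) {
      Parent[X] = Parent[Parent[X]]; // path halving
      X = Parent[X];
    }
    return X;
  }

  // Merges the sets containing A and B.
  void merge(std::size_t A, std::size_t B) {
    std::size_t RootA = find(A);
    std::size_t RootB = find(B);
    if (RootA != RootB)
      Parent[RootA] = RootB;
  }

  bool sameSet(std::size_t A, std::size_t B) { return find(A) == find(B); }

private:
  std::vector<std::size_t> Parent;
};
```

Merging is a single pointer update once the two representatives are known, which is why folding sets together stays cheap even when many pointers have already been added.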

    + +

    You shouldn't need to understand these details if you are just a client of +the AliasSetTracker, but if you look at the code, hopefully this brief +description will help make sense of why things are designed the way they +are.

    + +
    + + + + +
    + +

    If neither of these utility classes is what your pass needs, you should use +the interfaces exposed by the AliasAnalysis class directly. Try to use +the higher-level methods when possible (e.g., use mod/ref information instead of +the alias method directly if possible) to get the +best precision and efficiency.

    + +
    + + + + + +
    + +

    If you're going to be working with the LLVM alias analysis infrastructure, +you should know what clients and implementations of alias analysis are +available. In particular, if you are implementing an alias analysis, you should +be aware of the clients that are useful +for monitoring and evaluating different implementations.

    + +
    + + + + +
    + +

    This section lists the various implementations of the AliasAnalysis +interface. With the exception of the -no-aa and +-basicaa implementations, all of these chain to other alias analysis implementations.

    + +
    + + + + +
    + +

    The -no-aa pass is just like what it sounds: an alias analysis that +never returns any useful information. This pass can be useful if you think that +alias analysis is doing something wrong and are trying to narrow down a +problem.

    + +
    + + + + +
    + +

    The -basicaa pass is the default LLVM alias analysis. It is an +aggressive local analysis that "knows" many important facts:

    + +
      +
    • Distinct globals, stack allocations, and heap allocations can never alias.
    • Globals, stack allocations, and heap allocations never alias the null pointer.
    • Different fields of a structure do not alias.
    • Indexes into arrays with statically differing subscripts cannot alias.
    • Many common standard C library functions never access memory or only read memory.
    • Pointers that obviously point to constant globals "pointToConstantMemory".
    • Function calls cannot modify or reference stack allocations if they never escape from the function that allocates them (a common case for automatic arrays).
    + +
    + + + + +
    + +

    This pass implements a simple context-sensitive mod/ref and alias analysis +for internal global variables that don't "have their address taken". If a +global does not have its address taken, the pass knows that no pointers alias +the global. This pass also keeps track of functions that it knows never access +memory or never read memory. This allows certain optimizations (e.g. GVN) to +eliminate call instructions entirely. +

    + +

    The real power of this pass is that it provides context-sensitive mod/ref +information for call instructions. This allows the optimizer to know that +calls to a function do not clobber or read the value of the global, allowing +loads and stores to be eliminated.

    + +

    Note that this pass is somewhat limited in its scope (it only supports +non-address-taken globals), but it is a very quick analysis.

    +
    + + + + +
    + +

    The -steens-aa pass implements a variation on the well-known +"Steensgaard's algorithm" for interprocedural alias analysis. Steensgaard's +algorithm is a unification-based, flow-insensitive, context-insensitive, and +field-insensitive alias analysis that is also very scalable (effectively linear +time).

    + +

    The LLVM -steens-aa pass implements a "speculatively +field-sensitive" version of Steensgaard's algorithm using the Data +Structure Analysis framework. This gives it substantially more precision than +the standard algorithm while maintaining excellent analysis scalability.

    + +

    Note that -steens-aa is available in the optional "poolalloc" +module; it is not part of the LLVM core.

    + +
    + + + + +
    + +

    The -ds-aa pass implements the full Data Structure Analysis +algorithm. Data Structure Analysis is a modular unification-based, +flow-insensitive, context-sensitive, and speculatively +field-sensitive alias analysis that is also quite scalable, usually at +O(n*log(n)).

    + +

    This algorithm is capable of responding to a full variety of alias analysis +queries, and can provide context-sensitive mod/ref information as well. The +only major facility not implemented so far is support for must-alias +information.

    + +

    Note that -ds-aa is available in the optional "poolalloc" +module; it is not part of the LLVM core.

    + +
    + + + + +
    + +

    The -scev-aa pass implements AliasAnalysis queries by +translating them into ScalarEvolution queries. This gives it a +more complete understanding of getelementptr instructions +and loop induction variables than other alias analyses have.

    + +
    + + + + +
    +LLVM includes several alias-analysis driven transformations which can be used +with any of the implementations above. +
    + + + + +
    + +

    The -adce pass, which implements Aggressive Dead Code Elimination, +uses the AliasAnalysis interface to delete calls to functions that do +not have side-effects and are not used.

    + +
    + + + + + +
    + +

    The -licm pass implements various Loop Invariant Code Motion related +transformations. It uses the AliasAnalysis interface for several +different transformations:

    + +
      +
    • It uses mod/ref information to hoist or sink load instructions out of loops if there are no instructions in the loop that modify the memory loaded.
    • It uses mod/ref information to hoist function calls out of loops that do not write to memory and are loop-invariant.
    • It uses alias information to promote memory objects that are loaded and stored to in loops to live in a register instead. It can do this if there are no may aliases to the loaded/stored memory location.
    + +
    + + + + +
    +

    +The -argpromotion pass promotes by-reference arguments to be passed in +by-value instead. In particular, if pointer arguments are only loaded from, it +passes in the value loaded instead of the address to the function. This pass +uses alias information to make sure that the value loaded from the argument +pointer is not modified between the entry of the function and any load of the +pointer.

    +
    + + + + +
    + +

    These passes use AliasAnalysis information to reason about loads and stores. +

    + +
    + + + + +
    + +

    These passes are useful for evaluating the various alias analysis +implementations. You can use them with commands like 'opt -ds-aa +-aa-eval foo.bc -disable-output -stats'.

    + +
    + + + + +
    + +

    The -print-alias-sets pass is exposed as part of the +opt tool to print out the Alias Sets formed by the AliasSetTracker class. This is useful if you're using +the AliasSetTracker class. To use it, use something like:

    + +
    +
    +% opt -ds-aa -print-alias-sets -disable-output
    +
    +
    + +
    + + + + + +
    + +

    The -count-aa pass is useful to see how many queries a particular +pass is making and what responses are returned by the alias analysis. As an +example,

    + +
    +
    +% opt -basicaa -count-aa -ds-aa -count-aa -licm
    +
    +
    + +

    will print out how many queries are made (and what responses are returned) by the +-licm pass (of the -ds-aa pass) and how many queries are made +of the -basicaa pass by the -ds-aa pass. This can be useful +when debugging a transformation or an alias analysis implementation.

    + +
    + + + + +
    + +

    The -aa-eval pass simply iterates through all pairs of pointers in a +function and asks an alias analysis whether or not the pointers alias. This +gives an indication of the precision of the alias analysis. Statistics are +printed indicating the percent of no/may/must aliases found (a more precise +algorithm will have a lower number of may aliases).
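The kind of measurement described above can be sketched as a loop over all unordered pairs of pointers. This is an illustration only: `PairCounts`, `evaluatePairs` and `percent` are hypothetical names, pointers are modelled as integers, and the alias analysis is passed in as an arbitrary callable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum AliasResult { NoAlias, MayAlias, MustAlias };

// Tally of the responses seen over all queried pairs.
struct PairCounts {
  unsigned No = 0;
  unsigned May = 0;
  unsigned Must = 0;
  unsigned total() const { return No + May + Must; }
};

// Queries every unordered pair of pointers once and counts the answers.
// Oracle is any callable taking two ints and returning an AliasResult.
template <typename Oracle>
PairCounts evaluatePairs(const std::vector<int> &Pointers, Oracle Alias) {
  PairCounts Counts;
  for (std::size_t I = 0; I < Pointers.size(); ++I) {
    for (std::size_t J = I + 1; J < Pointers.size(); ++J) {
      switch (Alias(Pointers[I], Pointers[J])) {
      case NoAlias:
        ++Counts.No;
        break;
      case MayAlias:
        ++Counts.May;
        break;
      case MustAlias:
        ++Counts.Must;
        break;
      }
    }
  }
  return Counts;
}

// Share of one response kind, as a percentage of all queries.
inline double percent(unsigned Part, unsigned Whole) {
  return Whole == 0 ? 0.0 : 100.0 * Part / Whole;
}
```

Comparing `percent(Counts.May, Counts.total())` across two oracles gives the rough precision comparison the text describes: the lower the share of may-alias answers, the more precise the analysis.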

    + +
    + + + + + +
    + +

    If you're just looking to be a client of alias analysis information, consider +using the Memory Dependence Analysis interface instead. MemDep is a lazy, +caching layer on top of alias analysis that is able to answer the question of +what preceding memory operations a given instruction depends on, either at an +intra- or inter-block level. Because of its laziness and caching +policy, using MemDep can be a significant performance win over accessing alias +analysis directly.

    + +
    + + + +
    +
    + Chris Lattner
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-08-30 16:47:24 -0700 (Mon, 30 Aug 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/BitCodeFormat.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/BitCodeFormat.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/BitCodeFormat.html (added) +++ www-releases/trunk/2.8/docs/BitCodeFormat.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1480 @@ + + + + + LLVM Bitcode File Format + + + +
    LLVM Bitcode File Format
    1. Abstract
    2. Overview
    3. Bitstream Format
      1. Magic Numbers
      2. Primitives
      3. Abbreviation IDs
      4. Blocks
      5. Data Records
      6. Abbreviations
      7. Standard Blocks
    4. Bitcode Wrapper Format
    5. LLVM IR Encoding
      1. Basics
      2. MODULE_BLOCK Contents
      3. PARAMATTR_BLOCK Contents
      4. TYPE_BLOCK Contents
      5. CONSTANTS_BLOCK Contents
      6. FUNCTION_BLOCK Contents
      7. TYPE_SYMTAB_BLOCK Contents
      8. VALUE_SYMTAB_BLOCK Contents
      9. METADATA_BLOCK Contents
      10. METADATA_ATTACHMENT Contents
    +
    +

    Written by Chris Lattner, + Joshua Haberman, + and Peter S. Housel. +

    +
    + + + + + +
    + +

    This document describes the LLVM bitstream file format and the encoding of +the LLVM IR into it.

    + +
    + + + + + +
    + +

    +What is commonly known as the LLVM bitcode file format (also, sometimes +anachronistically known as bytecode) is actually two things: a bitstream container format +and an encoding of LLVM IR into the container format.

    + +

    +The bitstream format is an abstract encoding of structured data, very +similar to XML in some ways. Like XML, bitstream files contain tags, and nested +structures, and you can parse the file without having to understand the tags. +Unlike XML, the bitstream format is a binary encoding, and unlike XML it +provides a mechanism for the file to self-describe "abbreviations", which are +effectively size optimizations for the content.

    + +

    LLVM IR files may be optionally embedded into a wrapper structure that makes it easy to embed extra data +along with LLVM IR files.

    + +

    This document first describes the LLVM bitstream format, describes the +wrapper format, then describes the record structure used by LLVM IR files. +

    + +
    + + + + + +
    + +

    +The bitstream format is literally a stream of bits, with a very simple +structure. This structure consists of the following concepts: +

    + +
    • A "magic number" that identifies the contents of the stream.
    • Encoding primitives like variable bit-rate integers.
    • Blocks, which define nested content.
    • Data Records, which describe entities within the file.
    • Abbreviations, which specify compression optimizations for the file.
    + +

    Note that the llvm-bcanalyzer tool can be +used to dump and inspect arbitrary bitstreams, which is very useful for +understanding the encoding.

    + +
    + + + + +
    + +

    The first two bytes of a bitcode file are 'BC' (0x42, 0x43). +The second two bytes are an application-specific magic number. Generic +bitcode tools can look at only the first two bytes to verify the file is +bitcode, while application-specific programs will want to look at all four.
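As a minimal sketch of the check described above (the function names `is_bitcode` and `app_magic` are made up for illustration and are not part of any LLVM API), a generic tool only needs the first two bytes, while an application-specific tool would also compare the next two:

```python
def is_bitcode(data: bytes) -> bool:
    """Return True if the buffer starts with the generic 'BC' magic (0x42, 0x43)."""
    return len(data) >= 2 and data[0] == 0x42 and data[1] == 0x43


def app_magic(data: bytes) -> bytes:
    """Return the two application-specific magic bytes that follow 'BC'."""
    if len(data) < 4 or not is_bitcode(data):
        raise ValueError("not a bitstream with a four-byte magic number")
    return data[2:4]
```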

    + +
    + + + + +
    + +

    +A bitstream literally consists of a stream of bits, which are read in order +starting with the least significant bit of each byte. The stream is made up of a +number of primitive values that encode a stream of unsigned integer values. +These integers are encoded in two ways: either as Fixed +Width Integers or as Variable Width +Integers. +

    + +
    + + + + +
    + +

    Fixed-width integer values have their low bits emitted directly to the file. + For example, a 3-bit integer value encodes 1 as 001. Fixed width integers + are used when there are a well-known number of options for a field. For + example, boolean values are usually encoded with a 1-bit wide integer. +
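A reader for fixed-width fields can be sketched as follows. This assumes the bit order stated earlier (bits are consumed starting with the least significant bit of each byte); the class name `BitReader` and its method are illustrative only.

```python
class BitReader:
    """Reads fixed-width fields from a byte buffer, least significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read_fixed(self, width: int) -> int:
        value = 0
        for i in range(width):
            byte = self.data[self.pos >> 3]      # byte holding the current bit
            bit = (byte >> (self.pos & 7)) & 1   # pick that bit, LSB first
            value |= bit << i                    # low bits of the value come first
            self.pos += 1
        return value
```

For instance, with the single byte 0b00001001, reading a 3-bit field yields 1 and the following 1-bit field yields 1.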

    + +
    + + + + +
    + +

    Variable-width integer (VBR) values encode values of arbitrary size, +optimizing for the case where the values are small. Given a 4-bit VBR field, +any 3-bit value (0 through 7) is encoded directly, with the high bit set to +zero. Values larger than N-1 bits emit their bits in a series of N-1 bit +chunks, where all but the last set the high bit.

    + +

    For example, the value 27 (0x1B) is encoded as 1011 0011 when emitted as a +vbr4 value. The first set of four bits indicates the value 3 (011) with a +continuation piece (indicated by a high bit of 1). The next word indicates a +value of 24 (011 << 3) with no continuation. The sum (3+24) yields the value +27. +
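The chunking rule can be sketched in a few lines. The helpers below (hypothetical names `encode_vbr` and `decode_vbr`) work on lists of N-bit chunks rather than on a real bit stream, which is enough to reproduce the vbr4 example for 27:

```python
def encode_vbr(value: int, width: int) -> list:
    """Split an unsigned value into width-bit chunks; all but the last set the high bit."""
    if value < 0 or width < 2:
        raise ValueError("VBR needs an unsigned value and a width of at least 2")
    payload_bits = width - 1
    mask = (1 << payload_bits) - 1
    chunks = []
    while True:
        chunk = value & mask
        value >>= payload_bits
        if value:
            chunks.append(chunk | (1 << payload_bits))  # continuation bit set
        else:
            chunks.append(chunk)                        # final chunk, high bit clear
            return chunks


def decode_vbr(chunks: list, width: int) -> int:
    """Reassemble a value from chunks produced by encode_vbr."""
    payload_bits = width - 1
    mask = (1 << payload_bits) - 1
    value, shift = 0, 0
    for chunk in chunks:
        value |= (chunk & mask) << shift
        shift += payload_bits
        if not chunk & (1 << payload_bits):
            break
    return value
```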

    + +
    + + + + +
    + +

    6-bit characters encode common characters into a fixed 6-bit field. They +represent the following characters with the following 6-bit values:

    + +
    +
    +'a' .. 'z' —  0 .. 25
    +'A' .. 'Z' — 26 .. 51
    +'0' .. '9' — 52 .. 61
    +       '.' — 62
    +       '_' — 63
    +
    +
    + +

    This encoding is only suitable for encoding characters and strings that +consist only of the above characters. It is completely incapable of encoding +characters not in the set.
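The table above maps directly onto a 64-character alphabet. A small sketch, with illustrative names that are not taken from LLVM:

```python
import string

# Index in this string is the 6-bit value: a-z, A-Z, 0-9, '.', '_'
CHAR6_ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + "._"


def is_char6(text: str) -> bool:
    """True if every character of text can be represented in char6."""
    return all(ch in CHAR6_ALPHABET for ch in text)


def encode_char6(text: str) -> list:
    """Map each character to its 6-bit value; raises ValueError outside the set."""
    return [CHAR6_ALPHABET.index(ch) for ch in text]


def decode_char6(values: list) -> str:
    """Inverse of encode_char6."""
    return "".join(CHAR6_ALPHABET[v] for v in values)
```

A writer would typically test a string with something like `is_char6` first and fall back to a wider element type when it fails.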

    + +
    + + + + +
    + +

    Occasionally, it is useful to emit zero bits until the bitstream is a +multiple of 32 bits. This ensures that the bit position in the stream can be +represented as a multiple of 32-bit words.
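The number of zero bits needed is simple modular arithmetic; a one-function sketch (the name `padding_to_32` is made up here):

```python
def padding_to_32(bit_pos: int) -> int:
    """Number of zero bits to emit so that bit_pos becomes a multiple of 32."""
    return -bit_pos % 32
```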

    + +
    + + + + + +
    + +

    +A bitstream is a sequential series of Blocks and +Data Records. Both of these start with an +abbreviation ID encoded as a fixed-bitwidth field. The width is specified by +the current block, as described below. The value of the abbreviation ID +specifies either a builtin ID (which have special meanings, defined below) or +one of the abbreviation IDs defined for the current block by the stream itself. +

    + +

    +The set of builtin abbrev IDs is: +

    + +
    • 0 - END_BLOCK: This abbrev ID marks the end of the current block.
    • 1 - ENTER_SUBBLOCK: This abbrev ID marks the beginning of a new block.
    • 2 - DEFINE_ABBREV: This defines a new abbreviation.
    • 3 - UNABBREV_RECORD: This ID specifies the definition of an unabbreviated record.
    + +

    Abbreviation IDs 4 and above are defined by the stream itself, and specify +an abbreviated record encoding.

    + +
    + + + + +
    + +

    +Blocks in a bitstream denote nested regions of the stream, and are identified by +a content-specific id number (for example, LLVM IR uses an ID of 12 to represent +function bodies). Block IDs 0-7 are reserved for standard blocks +whose meaning is defined by Bitcode; block IDs 8 and greater are +application specific. Nested blocks capture the hierarchical structure of the data +encoded in it, and various properties are associated with blocks as the file is +parsed. Block definitions allow the reader to efficiently skip blocks +in constant time if the reader wants a summary of blocks, or if it wants to +efficiently skip data it does not understand. The LLVM IR reader uses this +mechanism to skip function bodies, lazily reading them on demand. +

    + +

    +When reading and encoding the stream, several properties are maintained for the +block. In particular, each block maintains: +

    + +
    1. A current abbrev id width. This value starts at 2 at the beginning of the stream, and is set every time a block record is entered. The block entry specifies the abbrev id width for the body of the block.
    2. A set of abbreviations. Abbreviations may be defined within a block, in which case they are only defined in that block (neither subblocks nor enclosing blocks see the abbreviation). Abbreviations can also be defined inside a BLOCKINFO block, in which case they are defined in all blocks that match the ID that the BLOCKINFO block is describing.
    + +

    +As sub blocks are entered, these properties are saved and the new sub-block has +its own set of abbreviations, and its own abbrev id width. When a sub-block is +popped, the saved values are restored. +

    + +
    + + + + +
    + +

    [ENTER_SUBBLOCK, blockid_vbr8, newabbrevlen_vbr4, <align32bits>, blocklen_32]

    + +

    +The ENTER_SUBBLOCK abbreviation ID specifies the start of a new block +record. The blockid value is encoded as an 8-bit VBR identifier, and +indicates the type of block being entered, which can be +a standard block or an application-specific block. +The newabbrevlen value is a 4-bit VBR, which specifies the abbrev id +width for the sub-block. The blocklen value is a 32-bit aligned value +that specifies the size of the subblock in 32-bit words. This value allows the +reader to skip over the entire block in one jump. +
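To make the layout concrete, here is a self-contained sketch that writes and then reads an ENTER_SUBBLOCK header and skips the block body. `BitWriter`, `BitReader`, `read_block_header` and `skip_block` are hypothetical helpers written for this example; they are not the LLVM bitstream classes.

```python
ENTER_SUBBLOCK = 1  # builtin abbrev ID


class BitWriter:
    """Collects bits in stream order (least significant bit of each byte first)."""

    def __init__(self):
        self.bits = []

    def write_fixed(self, value, width):
        for i in range(width):
            self.bits.append((value >> i) & 1)

    def write_vbr(self, value, width):
        payload = width - 1
        while value >= (1 << payload):
            self.write_fixed((value & ((1 << payload) - 1)) | (1 << payload), width)
            value >>= payload
        self.write_fixed(value, width)

    def align32(self):
        while len(self.bits) % 32:
            self.bits.append(0)

    def to_bytes(self):
        self.align32()
        out = bytearray(len(self.bits) // 8)
        for i, bit in enumerate(self.bits):
            out[i >> 3] |= bit << (i & 7)
        return bytes(out)


class BitReader:
    def __init__(self, data):
        self.data = data
        self.pos = 0  # position in bits

    def read_fixed(self, width):
        value = 0
        for i in range(width):
            bit = (self.data[self.pos >> 3] >> (self.pos & 7)) & 1
            value |= bit << i
            self.pos += 1
        return value

    def read_vbr(self, width):
        payload = width - 1
        value, shift = 0, 0
        while True:
            chunk = self.read_fixed(width)
            value |= (chunk & ((1 << payload) - 1)) << shift
            if not chunk >> payload:   # high bit clear: last chunk
                return value
            shift += payload

    def align32(self):
        self.pos += -self.pos % 32


def read_block_header(reader, abbrev_width):
    """Read [ENTER_SUBBLOCK, blockid, newabbrevlen, <align32bits>, blocklen]."""
    if reader.read_fixed(abbrev_width) != ENTER_SUBBLOCK:
        raise ValueError("expected ENTER_SUBBLOCK")
    block_id = reader.read_vbr(8)
    new_abbrev_width = reader.read_vbr(4)
    reader.align32()
    length_words = reader.read_fixed(32)
    return block_id, new_abbrev_width, length_words


def skip_block(reader, length_words):
    """Jump over the block body without decoding it."""
    reader.pos += 32 * length_words
```

Because blocklen counts 32-bit words and the header ends on a 32-bit boundary, skipping is a single addition to the bit position, which is what lets a reader pass over a block it does not understand.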

    + +
    + + + + +
    + +

    [END_BLOCK, <align32bits>]

    + +

    +The END_BLOCK abbreviation ID specifies the end of the current block +record. Its end is aligned to 32-bits to ensure that the size of the block is +an even multiple of 32-bits. +

    + +
    + + + + + + +
    +

    +Data records consist of a record code and a number of (up to) 64-bit +integer values. The interpretation of the code and values is +application specific and may vary between different block types. +Records can be encoded either using an unabbrev record, or with an +abbreviation. In the LLVM IR format, for example, there is a record +which encodes the target triple of a module. The code is +MODULE_CODE_TRIPLE, and the values of the record are the +ASCII codes for the characters in the string. +

    + +
    + + + + +
    + +

    [UNABBREV_RECORD, code_vbr6, numops_vbr6, op0_vbr6, op1_vbr6, ...]

    + +

    +An UNABBREV_RECORD provides a default fallback encoding, which is both +completely general and extremely inefficient. It can describe an arbitrary +record by emitting the code and operands as VBRs. +

    + +

    +For example, emitting an LLVM IR target triple as an unabbreviated record +requires emitting the UNABBREV_RECORD abbrevid, a vbr6 for the +MODULE_CODE_TRIPLE code, a vbr6 for the length of the string, which is +equal to the number of operands, and a vbr6 for each character. Because there +are no letters with values less than 32, each letter would need to be emitted as +at least a two-part VBR, which means that each letter would require at least 12 +bits. This is not an efficient encoding, but it is fully general. +
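The cost argument can be checked with a short calculation. The sketch below (helper names are illustrative) counts the bits an unabbreviated record occupies, assuming an abbrev id width supplied by the caller:

```python
def vbr_bits(value: int, width: int) -> int:
    """Bits needed to emit an unsigned value as a VBR of the given chunk width."""
    payload = width - 1
    chunks = 1
    while value >= (1 << payload):
        value >>= payload
        chunks += 1
    return chunks * width


def unabbrev_record_bits(code: int, operands: list, abbrev_width: int) -> int:
    """Size of [UNABBREV_RECORD, code, numops, op0, op1, ...] in bits."""
    return (abbrev_width
            + vbr_bits(code, 6)
            + vbr_bits(len(operands), 6)
            + sum(vbr_bits(op, 6) for op in operands))
```

With an abbrev id width of 3, a four-letter string such as "abcd" under record code 2 comes to 3 + 6 + 6 + 4 × 12 = 63 bits.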

    + +
    + + + + +
    + +

    [<abbrevid>, fields...]

    + +

    +An abbreviated record is an abbreviation ID followed by a set of fields that are encoded according to the abbreviation definition. This allows records to be encoded significantly more densely than records encoded with the UNABBREV_RECORD type, and allows the abbreviation types to be specified in the stream itself, which allows the files to be completely self-describing. The actual encoding of abbreviations is defined below.

    + +

    The record code, which is the first field of an abbreviated record, +may be encoded in the abbreviation definition (as a literal +operand) or supplied in the abbreviated record (as a Fixed or VBR +operand value).

    + +
    + + + + +
    +

    +Abbreviations are an important form of compression for bitstreams. The idea is +to specify a dense encoding for a class of records once, then use that encoding +to emit many records. It takes space to emit the encoding into the file, but +the space is recouped (hopefully plus some) when the records that use it are +emitted. +

    + +

    +Abbreviations can be determined dynamically per client, per file. Because the +abbreviations are stored in the bitstream itself, different streams of the same +format can contain different sets of abbreviations according to the needs +of the specific stream. +As a concrete example, LLVM IR files usually emit an abbreviation +for binary operators. If a specific LLVM module contained no or few binary +operators, the abbreviation does not need to be emitted. +

    +
    + + + + +
    + +

    [DEFINE_ABBREV, numabbrevops_vbr5, abbrevop0, abbrevop1, ...]

    + +

    +A DEFINE_ABBREV record adds an abbreviation to the list of currently +defined abbreviations in the scope of this block. This definition only exists +inside this immediate block — it is not visible in subblocks or enclosing +blocks. Abbreviations are implicitly assigned IDs sequentially starting from 4 +(the first application-defined abbreviation ID). Any abbreviations defined in a +BLOCKINFO record for the particular block type +receive IDs first, in order, followed by any +abbreviations defined within the block itself. Abbreviated data records +reference this ID to indicate what abbreviation they are invoking. +

    + +

    +An abbreviation definition consists of the DEFINE_ABBREV abbrevid +followed by a VBR that specifies the number of abbrev operands, then the abbrev +operands themselves. Abbreviation operands come in three forms. They all start +with a single bit that indicates whether the abbrev operand is a literal operand +(when the bit is 1) or an encoding operand (when the bit is 0). +

    + +
    1. Literal operands: [1_1, litvalue_vbr8]. Literal operands specify that the value in the result is always a single specific value. This specific value is emitted as a vbr8 after the bit indicating that it is a literal operand.
    2. Encoding info without data: [0_1, encoding_3]. Operand encodings that do not have extra data are just emitted as their code.
    3. Encoding info with data: [0_1, encoding_3, value_vbr5]. Operand encodings that do have extra data are emitted as their code, followed by the extra data.
    + +

    The possible operand encodings are:

    + +
    • Fixed (code 1): The field should be emitted as a fixed-width value, whose width is specified by the operand's extra data.
    • VBR (code 2): The field should be emitted as a variable-width value, whose width is specified by the operand's extra data.
    • Array (code 3): This field is an array of values. The array operand has no extra data, but expects another operand to follow it, indicating the element type of the array. When reading an array in an abbreviated record, the first integer is a vbr6 that indicates the array length, followed by the encoded elements of the array. An array may only occur as the last operand of an abbreviation (except for the one final operand that gives the array's type).
    • Char6 (code 4): This field should be emitted as a char6-encoded value. This operand type takes no extra data. Char6 encoding is normally used as an array element type.
    • Blob (code 5): This field is emitted as a vbr6, followed by padding to a 32-bit boundary (for alignment) and an array of 8-bit objects. The array of bytes is further followed by tail padding to ensure that its total length is a multiple of 4 bytes. This makes it very efficient for the reader to decode the data without having to make a copy of it: it can use a pointer to the data in the mapped-in file and poke directly at it. A blob may only occur as the last operand of an abbreviation.
    + +

    +For example, target triples in LLVM modules are encoded as a record of the +form [TRIPLE, 'a', 'b', 'c', 'd']. Consider if the bitstream emitted +the following abbrev entry: +

    + +
    +
    +[0, Fixed, 4]
    +[0, Array]
    +[0, Char6]
    +
    +
    + +

    +When emitting a record with this abbreviation, the above entry would be emitted +as: +

    + +
    +

    +[4_abbrevwidth, 2_4, 4_vbr6, 0_6, 1_6, 2_6, 3_6]

    +
    + +

    These values are:

    + +
    1. The first value, 4, is the abbreviation ID for this abbreviation.
    2. The second value, 2, is the record code for TRIPLE records within LLVM IR file MODULE_BLOCK blocks.
    3. The third value, 4, is the length of the array.
    4. The rest of the values are the char6 encoded values for "abcd".
    + +

    +With this abbreviation, the triple is emitted with only 37 bits, assuming an abbrev id width of 3: 3 bits for the abbreviation ID, 4 for the record code, 6 for the array length, and 6 for each of the four characters. Without the abbreviation, significantly more space would be required to emit the target triple. Also, because the TRIPLE value is not emitted as a literal in the abbreviation, the abbreviation can also be used for any other string value.
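The bit count can be reproduced by listing each emitted field with its width. This is only a sketch for the specific abbreviation in the example (a Fixed(4) record code followed by an Array of Char6); the function names are invented here, and it assumes the string is shorter than 32 characters so that the vbr6 length fits in a single 6-bit chunk.

```python
import string

CHAR6_ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + "._"


def emit_with_abbrev(abbrev_id, abbrev_width, code, text):
    """Return (value, width-in-bits) pairs for [abbrev id, Fixed(4) code, Array(Char6)]."""
    if len(text) >= 32:
        raise ValueError("sketch assumes the vbr6 length fits in one chunk")
    fields = [(abbrev_id, abbrev_width), (code, 4), (len(text), 6)]
    fields += [(CHAR6_ALPHABET.index(ch), 6) for ch in text]
    return fields


def total_bits(fields):
    return sum(width for _, width in fields)
```

For the record [TRIPLE, 'a', 'b', 'c', 'd'] with abbreviation ID 4 and an abbrev id width of 3, the values come out as 4, 2, 4, 0, 1, 2, 3 and the widths sum to 37.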

    + +
    + + + + +
    + +

    +In addition to the basic block structure and record encodings, the bitstream +also defines specific built-in block types. These block types specify how the +stream is to be decoded or other metadata. In the future, new standard blocks +may be added. Block IDs 0-7 are reserved for standard blocks. +

    + +
    + + + + +
    + +

    +The BLOCKINFO block allows the description of metadata for other +blocks. The currently specified records are: +

    + +
    +
    +[SETBID (#1), blockid]
    +[DEFINE_ABBREV, ...]
    +[BLOCKNAME, ...name...]
    +[SETRECORDNAME, RecordID, ...name...]
    +
    +
    + +

    +The SETBID record (code 1) indicates which block ID is being +described. SETBID records can occur multiple times throughout the +block to change which block ID is being described. There must be +a SETBID record prior to any other records. +

    + +

    +Standard DEFINE_ABBREV records can occur inside BLOCKINFO +blocks, but unlike their occurrence in normal blocks, the abbreviation is +defined for blocks matching the block ID we are describing, not the +BLOCKINFO block itself. The abbreviations defined +in BLOCKINFO blocks receive abbreviation IDs as described +in DEFINE_ABBREV. +

    + +

    The BLOCKNAME record (code 2) can optionally occur in this block. The elements of +the record are the bytes of the string name of the block. llvm-bcanalyzer can use +this to dump out bitcode files symbolically.

    + +

    The SETRECORDNAME record (code 3) can also optionally occur in this block. The +first operand value is a record ID number, and the rest of the elements of the record are +the bytes for the string name of the record. llvm-bcanalyzer can use +this to dump out bitcode files symbolically.

    + +

    +Note that although the data in BLOCKINFO blocks is described as +"metadata," the abbreviations they contain are essential for parsing records +from the corresponding blocks. It is not safe to skip them. +

    + +
    + + + + + +
    + +

    +Bitcode files for LLVM IR may optionally be wrapped in a simple wrapper +structure. This structure contains a simple header that indicates the offset +and size of the embedded BC file. This allows additional information to be +stored alongside the BC file. The structure of this file header is: +

    + +
    +

    +[Magic_32, Version_32, Offset_32, Size_32, CPUType_32]

    +
    + +

    +Each of the fields is a 32-bit value stored in little endian form (as with the rest of the bitcode file fields). The Magic number is always 0x0B17C0DE and the version is currently always 0. The Offset field is the offset in bytes to the start of the bitcode stream in the file, and the Size field is the size in bytes of the stream. CPUType is a target-specific value that can be used to encode the CPU of the target.
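A reader for this header is a straightforward fixed-layout unpack. The sketch below uses Python's `struct` module; the function name `parse_wrapper` and the returned dictionary layout are choices made for this example only.

```python
import struct

WRAPPER_MAGIC = 0x0B17C0DE


def parse_wrapper(data: bytes) -> dict:
    """Parse the five little-endian 32-bit header fields and slice out the bitcode."""
    if len(data) < 20:
        raise ValueError("buffer too small for a wrapper header")
    magic, version, offset, size, cputype = struct.unpack_from("<5I", data, 0)
    if magic != WRAPPER_MAGIC:
        raise ValueError("not a bitcode wrapper")
    if offset + size > len(data):
        raise ValueError("embedded stream extends past the end of the file")
    return {
        "version": version,
        "cputype": cputype,
        "bitcode": data[offset:offset + size],
    }
```

Anything stored between the header and Offset, or after Offset + Size, is the additional information the wrapper makes room for.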

    + +
    + + + + + +
    + +

    +LLVM IR is encoded into a bitstream by defining blocks and records. It uses +blocks for things like constant pools, functions, symbol tables, etc. It uses +records for things like instructions, global variable descriptors, type +descriptions, etc. This document does not describe the set of abbreviations +that the writer uses, as these are fully self-described in the file, and the +reader is not allowed to build in any knowledge of this. +

    + +
    + + + + + + + +
    + +

    +The magic number for LLVM IR files is: +

    + +
    +

    +[0x0_4, 0xC_4, 0xE_4, 0xD_4]

    +
    + +

    +When combined with the bitcode magic number and viewed as bytes, this is +"BC 0xC0DE". +
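Because fields are emitted starting at the least significant bit of each byte, two consecutive 4-bit fields share a byte with the first one in the low nibble. A small sketch (the helper `pack_nibbles` is hypothetical) shows how the four 4-bit values become the bytes 0xC0 0xDE:

```python
def pack_nibbles(nibbles: list) -> bytes:
    """Pack 4-bit fields into bytes, first field in the low half of each byte."""
    if len(nibbles) % 2:
        raise ValueError("need an even number of 4-bit fields")
    out = bytearray()
    for low, high in zip(nibbles[0::2], nibbles[1::2]):
        out.append(low | (high << 4))
    return bytes(out)


# 'B', 'C', then the four application-specific 4-bit fields
LLVM_IR_MAGIC = b"BC" + pack_nibbles([0x0, 0xC, 0xE, 0xD])
```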

    + +
    + + + + +
    + +

    +Variable Width Integer encoding is an efficient way to encode arbitrary sized unsigned values, but is extremely inefficient for encoding signed values, as signed values are otherwise treated as maximally large unsigned values.

    + +

    +As such, signed VBR values of a specific width are emitted as follows: +

    + +
    • Positive values are emitted as VBRs of the specified width, but with their value shifted left by one.
    • Negative values are emitted as VBRs of the specified width, but the negated value is shifted left by one, and the low bit is set.
    + +

    +With this encoding, small positive and small negative values can both +be emitted efficiently. Signed VBR encoding is used in +CST_CODE_INTEGER and CST_CODE_WIDE_INTEGER records +within CONSTANTS_BLOCK blocks. +
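The sign transformation applied before the VBR chunking can be sketched as below. The function names are illustrative, and the sketch covers only the mapping between a signed value and the unsigned value that is then emitted as an ordinary VBR.

```python
def to_signed_vbr(value: int) -> int:
    """Map a signed value to the unsigned value that is emitted as a VBR."""
    if value >= 0:
        return value << 1            # low bit 0 marks a non-negative value
    return ((-value) << 1) | 1       # negate, shift, and set the low bit


def from_signed_vbr(encoded: int) -> int:
    """Inverse mapping: the low bit is the sign, the remaining bits the magnitude."""
    magnitude = encoded >> 1
    return -magnitude if encoded & 1 else magnitude
```

Small magnitudes of either sign stay small after the mapping (for example 5 becomes 10 and -5 becomes 11), so they still fit in a single VBR chunk.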

    + +
    + + + + + +
    + +

    +LLVM IR is defined with the following blocks: +

    + +
    • 8 - MODULE_BLOCK: This is the top-level block that contains the entire module, and describes a variety of per-module information.
    • 9 - PARAMATTR_BLOCK: This enumerates the parameter attributes.
    • 10 - TYPE_BLOCK: This describes all of the types in the module.
    • 11 - CONSTANTS_BLOCK: This describes constants for a module or function.
    • 12 - FUNCTION_BLOCK: This describes a function body.
    • 13 - TYPE_SYMTAB_BLOCK: This describes the type symbol table.
    • 14 - VALUE_SYMTAB_BLOCK: This describes a value symbol table.
    • 15 - METADATA_BLOCK: This describes metadata items.
    • 16 - METADATA_ATTACHMENT: This contains records associating metadata with function instruction values.
    + +
    + + + + +
    + +

    The MODULE_BLOCK block (id 8) is the top-level block for LLVM +bitcode files, and each bitcode file must contain exactly one. In +addition to records (described below) containing information +about the module, a MODULE_BLOCK block may contain the +following sub-blocks: +

    + + + +
    + + + + +
    + +

    [VERSION, version#]

    + +

    The VERSION record (code 1) contains a single value +indicating the format version. Only version 0 is supported at this +time.

    +
    + + + + +
    +

    [TRIPLE, ...string...]

    + +

    The TRIPLE record (code 2) contains a variable number of +values representing the bytes of the target triple +specification string.

    +
    + + + + +
    +

    [DATALAYOUT, ...string...]

    + +

    The DATALAYOUT record (code 3) contains a variable number of +values representing the bytes of the target datalayout +specification string.

    +
    + + + + +
    +

    [ASM, ...string...]

    + +

    The ASM record (code 4) contains a variable number of +values representing the bytes of module asm strings, with +individual assembly blocks separated by newline (ASCII 10) characters.

    +
    + + + + +
    +

    [SECTIONNAME, ...string...]

    + +

    The SECTIONNAME record (code 5) contains a variable number +of values representing the bytes of a single section name +string. There should be one SECTIONNAME record for each +section name referenced (e.g., in global variable or function +section attributes) within the module. These records can be +referenced by the 1-based index in the section fields of +GLOBALVAR or FUNCTION records.

    +
    + + + + +
    +

    [DEPLIB, ...string...]

    + +

    The DEPLIB record (code 6) contains a variable number of +values representing the bytes of a single dependent library name +string, one of the libraries mentioned in a deplibs +declaration. There should be one DEPLIB record for each +library name referenced.

    +
    + + + + +
    +

    [GLOBALVAR, pointer type, isconst, initid, linkage, alignment, section, visibility, threadlocal]

    + +

    The GLOBALVAR record (code 7) marks the declaration or +definition of a global variable. The operand fields are:

    + +
    • pointer type: The type index of the pointer type used to point to this global variable
    • isconst: Non-zero if the variable is treated as constant within the module, or zero if it is not
    • initid: If non-zero, the value index of the initializer for this variable, plus 1.
    • linkage: An encoding of the linkage type for this variable:
      • external: code 0
      • weak: code 1
      • appending: code 2
      • internal: code 3
      • linkonce: code 4
      • dllimport: code 5
      • dllexport: code 6
      • extern_weak: code 7
      • common: code 8
      • private: code 9
      • weak_odr: code 10
      • linkonce_odr: code 11
      • available_externally: code 12
      • linker_private: code 13
    • alignment: The logarithm base 2 of the variable's requested alignment, plus 1
    • section: If non-zero, the 1-based section index in the table of MODULE_CODE_SECTIONNAME entries.
    • visibility: If present, an encoding of the visibility of this variable:
      • default: code 0
      • hidden: code 1
      • protected: code 2
    • threadlocal: If present and non-zero, indicates that the variable is thread_local
    +
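The alignment field stores log2(alignment) + 1 rather than the alignment itself. A minimal sketch of the conversion follows; treating a field value of 0 as "no alignment requested" is an assumption of this example (suggested by the "plus 1" offset, but not stated in the text above), and the function names are made up.

```python
def encode_alignment(alignment: int) -> int:
    """Return log2(alignment) + 1, or 0 when no alignment is requested (assumed)."""
    if alignment == 0:
        return 0
    if alignment < 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return alignment.bit_length()    # equals log2(alignment) + 1 for powers of two


def decode_alignment(field: int) -> int:
    """Inverse of encode_alignment."""
    return 0 if field == 0 else 1 << (field - 1)
```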
    + + + + +
    + +

    [FUNCTION, type, callingconv, isproto, linkage, paramattr, alignment, section, visibility, gc]

    + +

    The FUNCTION record (code 8) marks the declaration or +definition of a function. The operand fields are:

    + +
    • type: The type index of the function type describing this function
    • callingconv: The calling convention number:
      • ccc: code 0
      • fastcc: code 8
      • coldcc: code 9
      • x86_stdcallcc: code 64
      • x86_fastcallcc: code 65
      • arm_apcscc: code 66
      • arm_aapcscc: code 67
      • arm_aapcs_vfpcc: code 68
    • isproto: Non-zero if this entry represents a declaration rather than a definition
    • linkage: An encoding of the linkage type for this function
    • paramattr: If nonzero, the 1-based parameter attribute index into the table of PARAMATTR_CODE_ENTRY entries.
    • alignment: The logarithm base 2 of the function's requested alignment, plus 1
    • section: If non-zero, the 1-based section index in the table of MODULE_CODE_SECTIONNAME entries.
    • visibility: An encoding of the visibility of this function
    • gc: If present and nonzero, the 1-based garbage collector index in the table of MODULE_CODE_GCNAME entries.
    +
    + + + + +
    + +

    [ALIAS, alias type, aliasee val#, linkage, visibility]

    + +

    The ALIAS record (code 9) marks the definition of an +alias. The operand fields are

    + +
    • alias type: The type index of the alias
    • aliasee val#: The value index of the aliased value
    • linkage: An encoding of the linkage type for this alias
    • visibility: If present, an encoding of the visibility of the alias
    +
    + + + + +
    +

    [PURGEVALS, numvals]

    + +

    The PURGEVALS record (code 10) resets the module-level +value list to the size given by the single operand value. Module-level +value list items are added by GLOBALVAR, FUNCTION, +and ALIAS records. After a PURGEVALS record is seen, +new value indices will start from the given numvals value.

    +
    + + + + +
    +

    [GCNAME, ...string...]

    + +

    The GCNAME record (code 11) contains a variable number of +values representing the bytes of a single garbage collector name +string. There should be one GCNAME record for each garbage +collector name referenced in function gc attributes within +the module. These records can be referenced by 1-based index in the gc +fields of FUNCTION records.

    +
    + + + + +
    + +

    The PARAMATTR_BLOCK block (id 9) contains a table of +entries describing the attributes of function parameters. These +entries are referenced by 1-based index in the paramattr field +of module block FUNCTION +records, or within the attr field of function block INST_INVOKE and INST_CALL records.

    + +

    Entries within PARAMATTR_BLOCK are constructed to ensure that each is unique (i.e., no two indices represent equivalent attribute lists).

    + +
    + + + + + +
    + +

    [ENTRY, paramidx0, attr0, paramidx1, attr1...]

    + +

    The ENTRY record (code 1) contains an even number of +values describing a unique set of function parameter attributes. Each +paramidx value indicates which set of attributes is +represented, with 0 representing the return value attributes, +0xFFFFFFFF representing function attributes, and other values +representing 1-based function parameters. Each attr value is a +bitmap with the following interpretation: +

    + +
    • bit 0: zeroext
    • bit 1: signext
    • bit 2: noreturn
    • bit 3: inreg
    • bit 4: sret
    • bit 5: nounwind
    • bit 6: noalias
    • bit 7: byval
    • bit 8: nest
    • bit 9: readnone
    • bit 10: readonly
    • bit 11: noinline
    • bit 12: alwaysinline
    • bit 13: optsize
    • bit 14: ssp
    • bit 15: sspreq
    • bits 16–31: align n
    • bit 32: nocapture
    • bit 33: noredzone
    • bit 34: noimplicitfloat
    • bit 35: naked
    • bit 36: inlinehint
    • bits 37–39: alignstack n, represented as the logarithm base 2 of the requested alignment, plus 1
    +
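Decoding an attr bitmap amounts to testing the single-bit flags and extracting the two multi-bit fields. The sketch below follows the bit positions listed above; the function name and return shape are invented for illustration, and the align field (bits 16 to 31) is returned as stored, since its internal representation is not spelled out here.

```python
ATTR_FLAGS = {
    0: "zeroext", 1: "signext", 2: "noreturn", 3: "inreg", 4: "sret",
    5: "nounwind", 6: "noalias", 7: "byval", 8: "nest", 9: "readnone",
    10: "readonly", 11: "noinline", 12: "alwaysinline", 13: "optsize",
    14: "ssp", 15: "sspreq", 32: "nocapture", 33: "noredzone",
    34: "noimplicitfloat", 35: "naked", 36: "inlinehint",
}


def decode_attrs(bitmap: int):
    """Return (flag names, raw align field, alignstack in bytes or 0 if unset)."""
    names = [name for bit, name in sorted(ATTR_FLAGS.items()) if (bitmap >> bit) & 1]
    align_field = (bitmap >> 16) & 0xFFFF          # bits 16-31, left undecoded
    stack_field = (bitmap >> 37) & 0x7             # bits 37-39: log2(alignment) + 1
    alignstack = 1 << (stack_field - 1) if stack_field else 0
    return names, align_field, alignstack
```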
    + + + + +
    + +

    The TYPE_BLOCK block (id 10) contains records which +constitute a table of type operator entries used to represent types +referenced within an LLVM module. Each record (with the exception of +NUMENTRY) generates a +single type table entry, which may be referenced by 0-based index from +instructions, constants, metadata, type symbol table entries, or other +type operator records. +

    + +

    Entries within TYPE_BLOCK are constructed to ensure that each entry is unique (i.e., no two indices represent structurally equivalent types).

    + +
    + + + + +
    + +

    [NUMENTRY, numentries]

    + +

    The NUMENTRY record (code 1) contains a single value which +indicates the total number of type code entries in the type table of +the module. If present, NUMENTRY should be the first record +in the block. +

    +
    + + + + +
    + +

    [VOID]

    + +

    The VOID record (code 2) adds a void type to the +type table. +

    +
    + + + + +
    + +

    [FLOAT]

    + +

    The FLOAT record (code 3) adds a float (32-bit +floating point) type to the type table. +

    +
    + + + + +
    + +

    [DOUBLE]

    + +

    The DOUBLE record (code 4) adds a double (64-bit +floating point) type to the type table. +

    +
    + + + + +
    + +

    [LABEL]

    + +

    The LABEL record (code 5) adds a label type to +the type table. +

    +
    + + + + +
    + +

    [OPAQUE]

    + +

    The OPAQUE record (code 6) adds an opaque type to +the type table. Note that distinct opaque types are not +unified. +

    +
    + + + + +
    + +

    [INTEGER, width]

    + +

    The INTEGER record (code 7) adds an integer type to the +type table. The single width field indicates the width of the +integer type. +

    +
    + + + + +
    + +

    [POINTER, pointee type, address space]

    + +

    The POINTER record (code 8) adds a pointer type to the +type table. The operand fields are

    + +
    • pointee type: The type index of the pointed-to type
    • address space: If supplied, the target-specific numbered address space where the pointed-to object resides. Otherwise, the default address space is zero.
    +
    + + + + +
    + +

    [FUNCTION, vararg, ignored, retty, ...paramty... ]

    + +

    The FUNCTION record (code 9) adds a function type to the +type table. The operand fields are

    + +
      +
    • vararg: Non-zero if the type represents a varargs function
    • + +
    • ignored: This value field is present for backward +compatibility only, and is ignored
    • + +
    • retty: The type index of the function's return type
    • + +
    • paramty: Zero or more type indices representing the +parameter types of the function
    • +
    + +
    + + + + +
    + +

    [STRUCT, ispacked, ...eltty...]

    + +

    The STRUCT record (code 10) adds a struct type to the +type table. The operand fields are

    + +
      +
    • ispacked: Non-zero if the type represents a packed structure
    • + +
    • eltty: Zero or more type indices representing the element +types of the structure
    • +
    +
    + + + + +
    + +

    [ARRAY, numelts, eltty]

    + +

    The ARRAY record (code 11) adds an array type to the type +table. The operand fields are

    + +
      +
    • numelts: The number of elements in arrays of this type
    • + +
    • eltty: The type index of the array element type
    • +
    +
    + + + + +
    + +

    [VECTOR, numelts, eltty]

    + +

    The VECTOR record (code 12) adds a vector type to the type +table. The operand fields are

    + +
      +
    • numelts: The number of elements in vectors of this type
    • + +
    • eltty: The type index of the vector element type
    • +
    +
    + + + + +
    + +

    [X86_FP80]

    + +

    The X86_FP80 record (code 13) adds an x86_fp80 (80-bit +floating point) type to the type table. +

    +
    + + + + +
    + +

    [FP128]

    + +

    The FP128 record (code 14) adds an fp128 (128-bit +floating point) type to the type table. +

    +
    + + + + +
    + +

    [PPC_FP128]

    + +

    The PPC_FP128 record (code 15) adds a ppc_fp128 +(128-bit floating point) type to the type table. +

    +
    + + + + +
    + +

    [METADATA]

    + +

    The METADATA record (code 16) adds a metadata +type to the type table. +
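
As a rough illustration of how the records above populate the type table, here is a minimal Python sketch. The record codes and operand orders are taken from the descriptions above; the (code, operands) tuple layout, the function name build_type_table and the tuple rendering of each type are assumptions made for this example, not part of LLVM's reader.

```python
# Illustrative model only. Record codes follow the text above; the
# (code, operands) input layout and the output tuples are assumptions.
NUMENTRY, VOID, FLOAT, DOUBLE, LABEL, OPAQUE, INTEGER, POINTER = range(1, 9)
FUNCTION, STRUCT, ARRAY, VECTOR = range(9, 13)
X86_FP80, FP128, PPC_FP128, METADATA = range(13, 17)

# Records that carry no operands and add a single primitive type.
SIMPLE = {VOID: "void", FLOAT: "float", DOUBLE: "double", LABEL: "label",
          OPAQUE: "opaque", X86_FP80: "x86_fp80", FP128: "fp128",
          PPC_FP128: "ppc_fp128", METADATA: "metadata"}


def build_type_table(records):
    """Turn (code, operands) records into a list of type entries.

    Type operands are kept as 0-based indices into the returned list,
    so an entry may refer to a type that appears later in the table.
    """
    table = []
    for code, ops in records:
        if code == NUMENTRY:
            continue  # NUMENTRY announces the table size; it adds no entry
        if code in SIMPLE:
            table.append((SIMPLE[code],))
        elif code == INTEGER:
            table.append(("int", ops[0]))
        elif code == POINTER:
            # The address space is optional and defaults to zero.
            addrspace = ops[1] if len(ops) > 1 else 0
            table.append(("pointer", ops[0], addrspace))
        elif code == FUNCTION:
            vararg, _ignored, retty, *params = ops
            table.append(("function", bool(vararg), retty, tuple(params)))
        elif code == STRUCT:
            table.append(("struct", bool(ops[0]), tuple(ops[1:])))
        elif code in (ARRAY, VECTOR):
            kind = "array" if code == ARRAY else "vector"
            table.append((kind, ops[0], ops[1]))
        else:
            raise ValueError(f"unknown TYPE_BLOCK record code {code}")
    return table
```

For example, the records for i32, a pointer to i32, and a function taking two such pointers and returning i32 produce three entries at indices 0, 1 and 2.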

    +
    + + + + +
    + +

    The CONSTANTS_BLOCK block (id 11) ... +

    + +
    + + + + + +
    + +

    The FUNCTION_BLOCK block (id 12) ... +

    + +

    In addition to the record types described below, a +FUNCTION_BLOCK block may contain the following sub-blocks: +

    + + + +
    + + + + + +
    + +

    The TYPE_SYMTAB_BLOCK block (id 13) contains entries which +map between module-level named types and their corresponding type +indices. +

    + +
    + + + + +
    + +

    [ENTRY, typeid, ...string...]

    + +

    The ENTRY record (code 1) contains a variable number of +values, with the first giving the type index of the designated type, +and the remaining values giving the character codes of the type +name. Each entry corresponds to a single named type. +
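
A minimal sketch of decoding one such record, assuming the operands arrive as a plain list of integers (the function name is hypothetical):

```python
def decode_type_symtab_entry(operands):
    """Split [typeid, ...string...] into (type index, type name)."""
    typeid, *chars = operands
    # The remaining values are the character codes of the type name.
    return typeid, "".join(chr(c) for c in chars)
```

Calling it on [2, 102, 111, 111] yields the pair (2, "foo"), i.e. type index 2 is named "foo".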

    +
    + + + + + +
    + +

    The VALUE_SYMTAB_BLOCK block (id 14) ... +

    + +
    + + + + + +
    + +

    The METADATA_BLOCK block (id 15) ... +

    + +
    + + + + + +
    + +

    The METADATA_ATTACHMENT block (id 16) ... +

    + +
    + + + +
    +
    + Chris Lattner
    +The LLVM Compiler Infrastructure
    +Last modified: $Date: 2010-08-27 21:09:24 -0700 (Fri, 27 Aug 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/Bugpoint.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/Bugpoint.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/Bugpoint.html (added) +++ www-releases/trunk/2.8/docs/Bugpoint.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,250 @@ + + + + LLVM bugpoint tool: design and usage + + + +
    + LLVM bugpoint tool: design and usage +
    + + + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    bugpoint narrows down the source of problems in LLVM tools and +passes. It can be used to debug three types of failures: optimizer crashes, +miscompilations by optimizers, or bad native code generation (including problems +in the static and JIT compilers). It aims to reduce large test cases to small, +useful ones. For example, if opt crashes while optimizing a +file, it will identify the optimization (or combination of optimizations) that +causes the crash, and reduce the file down to a small example which triggers the +crash.

    + +

    For detailed case scenarios, such as debugging opt, +llvm-ld, or one of the LLVM code generators, see the How To Submit a Bug Report document.

    + +
    + + + + + +
    + +

    bugpoint is designed to be a useful tool without requiring any +hooks into the LLVM infrastructure at all. It works with any and all LLVM +passes and code generators, and does not need to "know" how they work. Because +of this, it may appear to do stupid things or miss obvious +simplifications. bugpoint is also designed to trade off programmer +time for computer time in the compiler-debugging process; consequently, it may +take a long period of (unattended) time to reduce a test case, but we feel it +is still worth it. Note that bugpoint is generally very quick unless +debugging a miscompilation where each test of the program (which requires +executing it) takes a long time.

    + +
    + + + + +
    + +

    bugpoint reads each .bc or .ll file specified on +the command line and links them together into a single module, called the test +program. If any LLVM passes are specified on the command line, it runs these +passes on the test program. If any of the passes crash, or if they produce +malformed output (which causes the verifier to abort), bugpoint starts +the crash debugger.

    + +

    Otherwise, if the -output option was not specified, +bugpoint runs the test program with the C backend (which is assumed to +generate good code) to generate a reference output. Once bugpoint has +a reference output for the test program, it tries executing it with the +selected code generator. If the selected code generator crashes, +bugpoint starts the crash debugger on the +code generator. Otherwise, if the resulting output differs from the reference +output, it assumes the difference resulted from a code generator failure, and +starts the code generator debugger.

    + +

    Finally, if the output of the selected code generator matches the reference +output, bugpoint runs the test program after all of the LLVM passes +have been applied to it. If its output differs from the reference output, it +assumes the difference resulted from a failure in one of the LLVM passes, and +enters the miscompilation debugger. +Otherwise, there is no problem bugpoint can debug.
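
The decision sequence described in the three paragraphs above can be summarized in a short Python sketch. This only models the order of the checks; the function and parameter names are assumptions for illustration and do not correspond to bugpoint's source.

```python
def choose_debugger(passes_crash, codegen_crashes,
                    codegen_output, optimized_output, reference_output):
    """Return which debugger the description above would start, or None."""
    if passes_crash:
        # Covers both crashes and malformed output rejected by the verifier.
        return "crash debugger"
    if codegen_crashes:
        return "crash debugger (code generator)"
    if codegen_output != reference_output:
        return "code generator debugger"
    if optimized_output != reference_output:
        return "miscompilation debugger"
    return None  # nothing that bugpoint can debug
```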

    + +
    + + + + +
    + +

    If an optimizer or code generator crashes, bugpoint will try as hard +as it can to reduce the list of passes (for optimizer crashes) and the size of +the test program. First, bugpoint figures out which combination of +optimizer passes triggers the bug. This is useful when debugging a problem +exposed by opt, for example, because it runs over 38 passes.

    + +

    Next, bugpoint tries removing functions from the test program, to +reduce its size. Usually it is able to reduce a test program to a single +function, when debugging intraprocedural optimizations. Once the number of +functions has been reduced, it attempts to delete various edges in the control +flow graph, to reduce the size of the function as much as possible. Finally, +bugpoint deletes any individual LLVM instructions whose absence does +not eliminate the failure. At the end, bugpoint should tell you what +passes crash, give you a bitcode file, and give you instructions on how to +reproduce the failure with opt or llc.
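
To give a feel for this kind of reduction, here is a deliberately simple Python sketch that drops one item at a time and keeps the removal whenever the failure still reproduces. It is not bugpoint's actual algorithm; the still_fails callback stands in for re-running the tool and is an assumption of this example.

```python
def reduce_list(items, still_fails):
    """Greedily remove items that are not needed to reproduce a failure.

    `still_fails(candidate)` must return True when the failure still
    occurs with only `candidate` (for example, a shorter pass list).
    """
    kept = list(items)
    i = 0
    while i < len(kept):
        candidate = kept[:i] + kept[i + 1:]
        if still_fails(candidate):
            kept = candidate  # item i was not needed; retry the same index
        else:
            i += 1            # item i is required; keep it and move on
    return kept
```

The same loop shape applies whether the items are passes, functions, or individual instructions; only the cost of each still_fails call changes.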

    + +
    + + + + +
    + +

    The code generator debugger attempts to narrow down the amount of code that +is being miscompiled by the selected code generator. To do this, it takes the +test program and partitions it into two pieces: one piece which it compiles +with the C backend (into a shared object), and one piece which it runs with +either the JIT or the static LLC compiler. It uses several techniques to +reduce the amount of code pushed through the LLVM code generator, to reduce the +potential scope of the problem. After it is finished, it emits two bitcode +files (called "test" [to be compiled with the code generator] and "safe" [to be +compiled with the C backend], respectively), and instructions for reproducing +the problem. The code generator debugger assumes that the C backend produces +good code.

    + +
    + + + + +
    + +

    The miscompilation debugger works similarly to the code generator debugger. +It works by splitting the test program into two pieces, running the +optimizations specified on one piece, linking the two pieces back together, and +then executing the result. It attempts to narrow down the list of passes to +the one (or few) which are causing the miscompilation, then reduce the portion +of the test program which is being miscompiled. The miscompilation debugger +assumes that the selected code generator is working properly.
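
The splitting idea can be sketched as a bisection over the functions of the test program. The sketch below assumes, for simplicity, that exactly one function is miscompiled and that a hypothetical oracle miscompiles_when_optimized(subset) reports whether optimizing only that subset, relinking, and running the result differs from the reference output. The real tool does not rely on the single-culprit assumption.

```python
def find_miscompiled(functions, miscompiles_when_optimized):
    """Bisect `functions` down to the one whose optimization breaks the program.

    Assumes a single culprit: optimizing any subset that contains it
    makes the oracle return True, and any subset without it returns False.
    """
    suspects = list(functions)
    while len(suspects) > 1:
        half = suspects[:len(suspects) // 2]
        if miscompiles_when_optimized(half):
            suspects = half                  # culprit is in the first half
        else:
            suspects = suspects[len(half):]  # culprit is in the second half
    return suspects
```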

    + +
    + + + + + +
    + +bugpoint can be a remarkably useful tool, but it sometimes works in +non-obvious ways. Here are some hints and tips:

    + +

      +
    1. In the code generator and miscompilation debuggers, bugpoint only + works with programs that have deterministic output. Thus, if the program + outputs argv[0], the date, time, or any other "random" data, + bugpoint may misinterpret differences in these data, when output, + as the result of a miscompilation. Programs should be temporarily modified + to disable outputs that are likely to vary from run to run. + +
    2. In the code generator and miscompilation debuggers, debugging will go + faster if you manually modify the program or its inputs to reduce the + runtime, but still exhibit the problem. + +
    3. bugpoint is extremely useful when working on a new optimization: + it helps track down regressions quickly. To avoid relinking + bugpoint every time you change your optimization, have + bugpoint load it dynamically with the + -load option. + +
    4. bugpoint can generate a lot of output and run for a long period + of time. It is often useful to capture the output of the program to a file. + For example, in the C shell, you can run:

      + +
      +

      bugpoint ... |& tee bugpoint.log

      +
      + +

      to get a copy of bugpoint's output in the file + bugpoint.log, as well as on your terminal.

      + +
    5. bugpoint cannot debug problems with the LLVM linker. If + bugpoint crashes before you see its "All input ok" message, + you might try llvm-link -v on the same set of input files. If + that also crashes, you may be experiencing a linker bug. + +
    6. bugpoint is useful for proactively finding bugs in LLVM. + Invoking bugpoint with the -find-bugs option will cause + the list of specified optimizations to be randomized and applied to the + program. This process will repeat until a bug is found or the user + kills bugpoint. + +
    7. bugpoint does not understand the -O option + that is used to specify optimization level to opt. You + can use e.g.

      + +
      +

      opt -O2 -debug-pass=Arguments foo.bc -disable-output

      +
      + +

      to get a list of passes that are used with -O2 and + then pass this list to bugpoint.

      + +
    + +
    + + + +
    +
    + Chris Lattner
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/CFEBuildInstrs.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CFEBuildInstrs.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CFEBuildInstrs.html (added) +++ www-releases/trunk/2.8/docs/CFEBuildInstrs.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,29 @@ + + + + + + Building the LLVM C/C++ Front-End + + + +
    +This page has moved here. +
    + + + +
    +
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2008-02-13 17:46:10 +0100 (Wed, 13 Feb 2008) $ +
    + + + Added: www-releases/trunk/2.8/docs/CMake.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CMake.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CMake.html (added) +++ www-releases/trunk/2.8/docs/CMake.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,421 @@ + + + + Building LLVM with CMake + + + +
    + Building LLVM with CMake +
    + + + +
    +

    Written by Oscar Fuentes

    +
    + + + + + +
    + +

    CMake is a cross-platform + build-generator tool. CMake does not build the project, it generates + the files needed by your build tool (GNU make, Visual Studio, etc) for + building LLVM.

    + +

    If you are really anxious about getting a functional LLVM build, + go to the Quick start section. If you + are a CMake novice, start on Basic CMake + usage and then go back to the Quick + start once you know what you are + doing. The Options and variables section + is a reference for customizing your build. If you already have + experience with CMake, this is the recommended starting point. +

    + + + + + +
    + +

    Here we use the command-line, non-interactive CMake interface.

    + +
      + +
    1. Download + and install CMake. Version 2.6.2 is the minimum required.

      + +
    2. Open a shell. Your development tools must be reachable from this + shell through the PATH environment variable.

      + +
    3. Create a directory to contain the build. Building LLVM in + the source directory is not supported. cd to this + directory:

      +
      +

      mkdir mybuilddir

      +

      cd mybuilddir

      +
      + +
    4. Execute this command in the shell, + replacing path/to/llvm/source/root with the path to the + root of your LLVM source tree:

      +
      +

      cmake path/to/llvm/source/root

      +
      + +

    CMake will detect your development environment, perform a + series of tests and generate the files required for building + LLVM. CMake will use default values for all build + parameters. See the Options and variables + section for fine-tuning your build.

      + +

    This can fail if CMake can't detect your toolset, or if it + thinks that the environment is not sane enough. In this case, + make sure that the toolset that you intend to use is the only + one reachable from the shell and that the shell itself is the + correct one for your development environment. CMake will refuse + to build MinGW makefiles if you have a POSIX shell reachable + through the PATH environment variable, for instance. You can + force CMake to use a given build tool; see + the Usage section.

      + +
    + +
    + + + + + +
    + +

    This section explains basic aspects of CMake, mostly the + options you may need in day-to-day + usage.

    + +

    CMake comes with extensive documentation in the form of HTML + files and in the cmake executable itself. Execute cmake + --help for further help options.

    + +

    CMake needs to know which build tool it should generate + files for (GNU make, Visual Studio, Xcode, etc). If the tool is not specified on + the command line, CMake tries to guess it based on your + environment. Once it has identified the build tool, CMake uses the + corresponding Generator to create files for your build + tool. You can explicitly specify the generator with the command + line option -G "Name of the generator". To list the + available generators on your platform, execute

    + +
    +

    cmake --help

    +
    + +

    This will list the generator names at the end of the help + text. Generator names are case-sensitive. Example:

    + +
    +

    cmake -G "Visual Studio 8 2005" path/to/llvm/source/root

    +
    + +

    For a given development platform there can be more than one + adequate generator. If you use Visual Studio, "NMake Makefiles" + is a generator you can use for building with NMake. By default, + CMake chooses the most specific generator supported by your + development environment. If you want an alternative generator, + you must tell CMake with the -G option.

    + +

    TODO: explain variables and cache. Move explanation here from + #options section.

    + +
    + + + + + +
    + +

    Variables customize how the build will be generated. Options are + boolean variables, with possible values ON/OFF. Options and + variables are defined on the CMake command line like this:

    + +
    +

    cmake -DVARIABLE=value path/to/llvm/source

    +
    + +

    You can set a variable again after the initial CMake invocation to + change its value. You can also undefine a variable:

    + +
    +

    cmake -UVARIABLE path/to/llvm/source

    +
    + +

    Variables are stored in the CMake cache. This is a file + named CMakeCache.txt in the root of the build + directory. Do not hand-edit it.

    + +

    Variables are listed here with their type appended after a colon. It is + correct to write both the variable and the type on the CMake command + line:

    + +
    +

    cmake -DVARIABLE:TYPE=value path/to/llvm/source
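
The same NAME:TYPE=value shape is what CMake writes into CMakeCache.txt, so a build script can read the cache (rather than edit it) to check how a build directory was configured. The following Python sketch shows one way to do that; the function name is hypothetical, and it handles only plain NAME:TYPE=value lines plus the "//" and "#" comment lines found in the cache file.

```python
def parse_cache_lines(lines):
    """Map each cache variable name to a (type, value) pair."""
    entries = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the two comment styles used in the cache.
        if not line or line.startswith(("#", "//")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a NAME:TYPE=value line
        name, _, vtype = key.partition(":")
        entries[name] = (vtype, value)
    return entries
```

Typical use is parse_cache_lines(open("CMakeCache.txt")) from the root of the build directory.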

    +
    + +
    + + + + +
    + +

    Listed here are some frequently used CMake variables, + along with a brief explanation and LLVM-specific notes. For full + documentation, check the CMake docs or execute cmake + --help-variable VARIABLE_NAME.

    + +
    +
    CMAKE_BUILD_TYPE:STRING
    + +
    Sets the build type for make-based generators. Possible + values are Release, Debug, RelWithDebInfo and MinSizeRel. On + systems like Visual Studio the user sets the build type with the IDE + settings.
    + +
    CMAKE_INSTALL_PREFIX:PATH
    +
    Path where LLVM will be installed if "make install" is invoked + or the "INSTALL" target is built.
    + +
    LLVM_LIBDIR_SUFFIX:STRING
    +
    Extra suffix to append to the directory where libraries are to + be installed. On a 64-bit architecture, one could use + -DLLVM_LIBDIR_SUFFIX=64 to install libraries to /usr/lib64.
    + +
    CMAKE_C_FLAGS:STRING
    +
    Extra flags to use when compiling C source files.
    + +
    CMAKE_CXX_FLAGS:STRING
    +
    Extra flags to use when compiling C++ source files.
    + +
    BUILD_SHARED_LIBS:BOOL
    +
    Flag indicating whether shared libraries will be built. Its default + value is OFF. Shared libraries are not supported on Windows and + not recommended on other OSes.
    +
    + +
    + + + + +
    + +
    +
    LLVM_TARGETS_TO_BUILD:STRING
    +
    Semicolon-separated list of targets to build, or all for + building all targets. Case-sensitive. For Visual C++ it defaults + to X86. In all other cases it defaults to all. Example: + -DLLVM_TARGETS_TO_BUILD="X86;PowerPC;Alpha".
    + +
    LLVM_BUILD_TOOLS:BOOL
    +
    Build LLVM tools. Defaults to ON. Targets for building each tool + are generated in any case. You can build a tool separately by + invoking its target. For example, with a makefile-based system you can build llvm-as + by executing make llvm-as in the + root of your build directory.
    + +
    LLVM_BUILD_EXAMPLES:BOOL
    +
    Build LLVM examples. Defaults to OFF. Targets for building each + example are generated in any case. See documentation + for LLVM_BUILD_TOOLS above for more details.
    + +
    LLVM_ENABLE_THREADS:BOOL
    +
    Build with threads support, if available. Defaults to ON.
    + +
    LLVM_ENABLE_ASSERTIONS:BOOL
    +
    Enables code assertions. Defaults to OFF if and only if + CMAKE_BUILD_TYPE is Release.
    + +
    LLVM_ENABLE_PIC:BOOL
    +
    Add the -fPIC flag for the compiler command-line, if the + compiler supports this flag. Some systems, like Windows, do not + need this flag. Defaults to ON.
    + +
    LLVM_ENABLE_WARNINGS:BOOL
    +
    Enable all compiler warnings. Defaults to ON.
    + +
    LLVM_ENABLE_PEDANTIC:BOOL
    +
    Enable pedantic mode. This disables compiler-specific extensions, if + possible. Defaults to ON.
    + +
    LLVM_ENABLE_WERROR:BOOL
    +
    Stop and fail the build if a compiler warning is + triggered. Defaults to OFF.
    + +
    LLVM_BUILD_32_BITS:BOOL
    +
    Build 32-bit executables and libraries on 64-bit systems. This + option is available only on some 64-bit Unix systems. Defaults to + OFF.
    + +
    LLVM_TARGET_ARCH:STRING
    +
    LLVM target to use for native code generation. This is required + for JIT generation. It defaults to "host", meaning that it shall + pick the architecture of the machine where LLVM is being built. If + you are cross-compiling, set it to the target architecture + name.
    + +
    LLVM_TABLEGEN:STRING
    +
    Full path to a native TableGen executable (usually + named tblgen). This is intended for cross-compiling: if the + user sets this variable, no native TableGen will be created.
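
When a configuration is driven from a script, the variables above have to be turned into -D arguments. The Python sketch below is one hedged way to do that; the helper name cmake_command is hypothetical, and it relies only on conventions stated above (ON/OFF for boolean options, semicolon-separated lists).

```python
def cmake_command(source_root, variables):
    """Build an argument list for a cmake invocation.

    Booleans become ON/OFF and lists become semicolon-separated
    strings, matching the conventions described above.
    """
    args = ["cmake"]
    for name, value in variables.items():
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        elif isinstance(value, (list, tuple)):
            value = ";".join(value)
        args.append(f"-D{name}={value}")
    args.append(source_root)
    return args
```

The result is a list, so it can be handed to subprocess.run without shell quoting concerns around the semicolons.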
    +
    + +
    + + + + + +
    + +

    Testing is performed when the check target is built. For + instance, if you are using makefiles, execute this command while on + the top level of your build directory:

    + +
    +

    make check

    +
    + +

    Testing is not supported on Visual Studio.

    + +
    + + + + + +
    + +

    See this + wiki page for generic instructions on how to cross-compile + with CMake. It goes into detailed explanations and may seem + daunting, but it is not. On the wiki page there are several + examples including toolchain files. Go directly to + this + section for a quick solution.

    + +

    Also see the LLVM-specific variables + section for variables used when cross-compiling.

    + +
    + + + + + +
    + +

    The most difficult part of adding LLVM to the build of a project + is to determine the set of LLVM libraries corresponding to the set + of required LLVM features. What follows is an example of how to + obtain this information:

    + +
    +
    +    # A convenience variable:
    +    set(LLVM_ROOT "" CACHE PATH "Root of LLVM install.")
    +    # A bit of a sanity check:
    +    if( NOT EXISTS ${LLVM_ROOT}/include/llvm )
    +    message(FATAL_ERROR "LLVM_ROOT (${LLVM_ROOT}) is not a valid LLVM install")
    +    endif()
    +    # We incorporate the CMake features provided by LLVM:
    +    set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${LLVM_ROOT}/share/llvm/cmake")
    +    include(LLVM)
    +    # Now set the header and library paths:
    +    include_directories( ${LLVM_ROOT}/include )
    +    link_directories( ${LLVM_ROOT}/lib )
    +    # Let's suppose we want to build a JIT compiler with support for
    +    # binary code (no interpreter):
    +    llvm_map_components_to_libraries(REQ_LLVM_LIBRARIES jit native)
    +    # Finally, we link the LLVM libraries to our executable:
    +    target_link_libraries(mycompiler ${REQ_LLVM_LIBRARIES})
    +    
    +
    + +

    This assumes that LLVM_ROOT points to an install of LLVM. The + procedure also works for uninstalled builds, although we need to take + care to add an include_directories for the location of the + headers in the LLVM source directory (if we are building + out-of-source).

    + +
    + + + + + + + +
    + +

    Notes for specific compilers and/or platforms.

    + +
    + + + +
    +
    + Oscar Fuentes
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-08-09 03:59:36 +0100 (Mon, 9 Aug 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/CodeGenerator.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CodeGenerator.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CodeGenerator.html (added) +++ www-releases/trunk/2.8/docs/CodeGenerator.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,2169 @@ + + + + + The LLVM Target-Independent Code Generator + + + + +
    + The LLVM Target-Independent Code Generator +
    + +
      +
    1. Introduction + +
    2. +
    3. Target description classes + +
    4. +
    5. Machine code description classes + +
    6. +
    7. Target-independent code generation algorithms + +
    8. +
    9. Target-specific Implementation Notes +
    10. + +
    + + + +
    +

    Warning: This is a work in progress.

    +
    + + + + + +
    + +

    The LLVM target-independent code generator is a framework that provides a + suite of reusable components for translating the LLVM internal representation + to the machine code for a specified target—either in assembly form + (suitable for a static compiler) or in binary machine code format (usable for + a JIT compiler). The LLVM target-independent code generator consists of five + main components:

    + +
      +
    1. Abstract target description interfaces which + capture important properties about various aspects of the machine, + independently of how they will be used. These interfaces are defined in + include/llvm/Target/.
    2. + +
    3. Classes used to represent the machine code + being generated for a target. These classes are intended to be abstract + enough to represent the machine code for any target machine. These + classes are defined in include/llvm/CodeGen/.
    4. + +
    5. Target-independent algorithms used to implement + various phases of native code generation (register allocation, scheduling, + stack frame representation, etc). This code lives + in lib/CodeGen/.
    6. + +
    7. Implementations of the abstract target description + interfaces for particular targets. These machine descriptions make + use of the components provided by LLVM, and can optionally provide custom + target-specific passes, to build complete code generators for a specific + target. Target descriptions live in lib/Target/.
    8. + +
    9. The target-independent JIT components. The LLVM JIT is + completely target independent (it uses the TargetJITInfo + structure to interface with target-specific issues). The code for the + target-independent JIT lives in lib/ExecutionEngine/JIT.
    10. +
    + +

    Depending on which part of the code generator you are interested in working + on, different pieces of this will be useful to you. In any case, you should + be familiar with the target description + and machine code representation classes. If you + want to add a backend for a new target, you will need + to implement the target description classes for + your new target and understand the LLVM code + representation. If you are interested in implementing a + new code generation algorithm, it should only + depend on the target-description and machine code representation classes, + ensuring that it is portable.

    + +
    + + + + +
    + +

    The two pieces of the LLVM code generator are the high-level interface to the + code generator and the set of reusable components that can be used to build + target-specific backends. The two most important interfaces + (TargetMachine + and TargetData) are the only ones that are + required to be defined for a backend to fit into the LLVM system, but the + others must be defined if the reusable code generator components are going to + be used.

    + +

    This design has two important implications. The first is that LLVM can + support completely non-traditional code generation targets. For example, the + C backend does not require register allocation, instruction selection, or any + of the other standard components provided by the system. As such, it only + implements these two interfaces, and does its own thing. Another example of + a code generator like this is a (purely hypothetical) backend that converts + LLVM to the GCC RTL form and uses GCC to emit machine code for a target.

    + +

    This design also implies that it is possible to design and implement + radically different code generators in the LLVM system that do not make use + of any of the built-in components. Doing so is not recommended at all, but + could be required for radically different targets that do not fit into the + LLVM machine description model: FPGAs for example.

    + +
    + + + + +
    + +

    The LLVM target-independent code generator is designed to support efficient + and quality code generation for standard register-based microprocessors. + Code generation in this model is divided into the following stages:

    + +
      +
    1. Instruction Selection — This phase + determines an efficient way to express the input LLVM code in the target + instruction set. This stage produces the initial code for the program in + the target instruction set, then makes use of virtual registers in SSA + form and physical registers that represent any required register + assignments due to target constraints or calling conventions. This step + turns the LLVM code into a DAG of target instructions.
    2. + +
    3. Scheduling and Formation — + This phase takes the DAG of target instructions produced by the + instruction selection phase, determines an ordering of the instructions, + then emits the instructions + as MachineInstrs with that ordering. + Note that we describe this in the instruction + selection section because it operates on + a SelectionDAG.
    4. + +
    5. SSA-based Machine Code Optimizations — + This optional stage consists of a series of machine-code optimizations + that operate on the SSA-form produced by the instruction selector. + Optimizations like modulo-scheduling or peephole optimization work + here.
    6. + +
    7. Register Allocation — The target code + is transformed from an infinite virtual register file in SSA form to the + concrete register file used by the target. This phase introduces spill + code and eliminates all virtual register references from the program.
    8. + +
    9. Prolog/Epilog Code Insertion — Once + the machine code has been generated for the function and the amount of + stack space required is known (used for LLVM alloca's and spill slots), + the prolog and epilog code for the function can be inserted and "abstract + stack location references" can be eliminated. This stage is responsible + for implementing optimizations like frame-pointer elimination and stack + packing.
    10. + +
    11. Late Machine Code Optimizations — + Optimizations that operate on "final" machine code can go here, such as + spill code scheduling and peephole optimizations.
    12. + +
    13. Code Emission — The final stage + actually puts out the code for the current function, either in the target + assembler format or in machine code.
    14. +
    + +

    The code generator is based on the assumption that the instruction selector + will use an optimal pattern matching selector to create high-quality + sequences of native instructions. Alternative code generator designs based + on pattern expansion and aggressive iterative peephole optimization are much + slower. This design permits efficient compilation (important for JIT + environments) and aggressive optimization (used when generating code offline) + by allowing components of varying levels of sophistication to be used for any + step of compilation.

    + +

    In addition to these stages, target implementations can insert arbitrary + target-specific passes into the flow. For example, the X86 target uses a + special pass to handle the 80x87 floating point stack architecture. Other + targets with unusual requirements can be supported with custom passes as + needed.
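
As a purely conceptual model of the stage ordering and of how a target might slot an extra pass into the flow, consider the Python sketch below. The stage names come from the list above; the build_pipeline helper, the (after_stage, pass_name) convention, and the choice of where any particular target pass goes are assumptions of this example, not a description of LLVM's pass manager.

```python
# Stage names as listed above, in order.
STAGES = [
    "instruction selection",
    "scheduling and formation",
    "SSA-based machine code optimizations",
    "register allocation",
    "prolog/epilog code insertion",
    "late machine code optimizations",
    "code emission",
]


def build_pipeline(extra_passes=()):
    """Return the stage list with target-specific passes spliced in.

    `extra_passes` is a sequence of (after_stage, pass_name) pairs; each
    pass is placed directly after the named standard stage.
    """
    pipeline = []
    for stage in STAGES:
        pipeline.append(stage)
        pipeline.extend(name for after, name in extra_passes if after == stage)
    return pipeline
```

For instance, build_pipeline([("register allocation", "my target pass")]) returns the seven standard stages with the hypothetical pass placed immediately after register allocation.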

    + +
    + + + + +
    + +

    The target description classes require a detailed description of the target + architecture. These target descriptions often have a large amount of common + information (e.g., an add instruction is almost identical to a + sub instruction). In order to allow the maximum amount of + commonality to be factored out, the LLVM code generator uses + the TableGen tool to describe big + chunks of the target machine, which allows the use of domain-specific and + target-specific abstractions to reduce the amount of repetition.

    + +

    As LLVM continues to be developed and refined, we plan to move more and more + of the target description to the .td form. Doing so gives us a + number of advantages. The most important is that it makes it easier to port + LLVM because it reduces the amount of C++ code that has to be written, and + the surface area of the code generator that needs to be understood before + someone can get something working. Second, it makes it easier to change + things. In particular, if tables and other things are all emitted + by tblgen, we only need a change in one place (tblgen) to + update all of the targets to a new interface.

    + +
    + + + + + +
    + +

    The LLVM target description classes (located in the + include/llvm/Target directory) provide an abstract description of + the target machine independent of any particular client. These classes are + designed to capture the abstract properties of the target (such as the + instructions and registers it has), and do not incorporate any particular + pieces of code generation algorithms.

    + +

    All of the target description classes (except the + TargetData class) are designed to be + subclassed by the concrete target implementation, and have virtual methods + implemented. To get to these implementations, the + TargetMachine class provides accessors + that should be implemented by the target.

    + +
    + + + + +
    + +

    The TargetMachine class provides virtual methods that are used to + access the target-specific implementations of the various target description + classes via the get*Info methods (getInstrInfo, + getRegisterInfo, getFrameInfo, etc.). This class is + designed to be specialized by a concrete target implementation + (e.g., X86TargetMachine) which implements the various virtual + methods. The only required target description class is + the TargetData class, but if the code + generator components are to be used, the other interfaces should be + implemented as well.

    + +
    + + + + +
    + +

    The TargetData class is the only required target description class, + and it is the only class that is not extensible (you cannot derive a new + class from it). TargetData specifies information about how the + target lays out memory for structures, the alignment requirements for various + data types, the size of pointers in the target, and whether the target is + little-endian or big-endian.

    + +
    + + + + +
    + +

    The TargetLowering class is used by SelectionDAG based instruction + selectors primarily to describe how LLVM code should be lowered to + SelectionDAG operations. Among other things, this class indicates:

    + +
      +
    • an initial register class to use for various ValueTypes,
    • + +
    • which operations are natively supported by the target machine,
    • + +
    • the return type of setcc operations,
    • + +
    • the type to use for shift amounts, and
    • + +
    • various high-level characteristics, like whether it is profitable to turn + division by a constant into a multiplication sequence
    • +
    + +
    + + + + +
    + +

    The TargetRegisterInfo class is used to describe the register file + of the target and any interactions between the registers.

    + +

    Registers are represented in the code generator by + unsigned integers. Physical registers (those that actually exist in the + target description) are unique small numbers, and virtual registers are + generally large. Note that register #0 is reserved as a flag value.
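
    The numbering convention above can be sketched as follows. This is only an illustration: the namespace, the helper names, and the boundary value 1024 are assumptions made for the example, not the code generator's actual definitions.

```cpp
namespace sketch {

// Register #0 is reserved as a flag value meaning "no register".
constexpr unsigned NoRegister = 0;

// Assumed boundary between the small physical register numbers and the
// large virtual register numbers; 1024 is an illustrative choice.
constexpr unsigned FirstVirtualRegister = 1024;

// Physical registers are small, nonzero numbers.
constexpr bool isPhysicalRegister(unsigned Reg) {
  return Reg != NoRegister && Reg < FirstVirtualRegister;
}

// Virtual registers are everything at or above the boundary.
constexpr bool isVirtualRegister(unsigned Reg) {
  return Reg >= FirstVirtualRegister;
}

} // namespace sketch
```

    With this scheme, a pass can tell the two kinds of register apart from the number alone, without consulting any per-register table.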

    + +

    Each register in the processor description has an associated + TargetRegisterDesc entry, which provides a textual name for the + register (used for assembly output and debugging dumps) and a set of aliases + (used to indicate whether one register overlaps with another).

    + +

    In addition to the per-register description, the TargetRegisterInfo + class exposes a set of processor specific register classes (instances of the + TargetRegisterClass class). Each register class contains sets of + registers that have the same properties (for example, they are all 32-bit + integer registers). Each SSA virtual register created by the instruction + selector has an associated register class. When the register allocator runs, + it replaces virtual registers with a physical register in the set.

    + +

    The target-specific implementations of these classes are auto-generated from + a TableGen description of the + register file.

    + +
    + + + + +
    + +

    The TargetInstrInfo class is used to describe the machine + instructions supported by the target. It is essentially an array of + TargetInstrDescriptor objects, each of which describes one + instruction the target supports. Descriptors define things like the mnemonic + for the opcode, the number of operands, the list of implicit register uses + and defs, and whether the instruction has certain target-independent properties + (accesses memory, is commutable, etc), and they hold any target-specific + flags.

    + +
    + + + + +
    + +

    The TargetFrameInfo class is used to provide information about the + stack frame layout of the target. It holds the direction of stack growth, the + known stack alignment on entry to each function, and the offset to the local + area. The offset to the local area is the offset from the stack pointer on + function entry to the first location where function data (local variables, + spill locations) can be stored.

    + +
    + + + + +
    + +

    The TargetSubtarget class is used to provide information about the + specific chip set being targeted. A sub-target informs code generation of + which instructions are supported, instruction latencies and instruction + execution itinerary; i.e., which processing units are used, in what order, + and for how long.

    + +
    + + + + + +
    + +

    The TargetJITInfo class exposes an abstract interface used by the + Just-In-Time code generator to perform target-specific activities, such as + emitting stubs. If a TargetMachine supports JIT code generation, it + should provide one of these objects through the getJITInfo + method.

    + +
    + + + + + +
    + +

    At a high level, LLVM code is translated to a machine-specific + representation formed out of + MachineFunction, + MachineBasicBlock, + and MachineInstr instances (defined + in include/llvm/CodeGen). This representation is completely target + agnostic, representing instructions in their most abstract form: an opcode + and a series of operands. This representation is designed to support both an + SSA representation for machine code, as well as a register allocated, non-SSA + form.

    + +
    + + + + +
    + +

    Target machine instructions are represented as instances of the + MachineInstr class. This class is an extremely abstract way of + representing machine instructions. In particular, it only keeps track of an + opcode number and a set of operands.

    + +

    The opcode number is a simple unsigned integer that only has meaning to a + specific backend. All of the instructions for a target should be defined in + the *InstrInfo.td file for the target. The opcode enum values are + auto-generated from this description. The MachineInstr class does + not have any information about how to interpret the instruction (i.e., what + the semantics of the instruction are); for that you must refer to the + TargetInstrInfo class.

    + +

    The operands of a machine instruction can be of several different types: a + register reference, a constant integer, a basic block reference, etc. In + addition, a machine operand should be marked as a def or a use of the value + (though only registers are allowed to be defs).

    + +

    By convention, the LLVM code generator orders instruction operands so that + all register definitions come before the register uses, even on architectures + that are normally printed in other orders. For example, the SPARC add + instruction "add %i1, %i2, %i3" adds the "%i1" and "%i2" registers + and stores the result into the "%i3" register. In the LLVM code generator, + the operands should be stored as "%i3, %i1, %i2", with the + destination first.

    + +

    Keeping destination (definition) operands at the beginning of the operand + list has several advantages. In particular, the debugging printer will print + the instruction like this:

    + +
    +
    +%i3 = add %i1, %i2
    +
    +
    + +

    Also if the first operand is a def, it is easier to create + instructions whose only def is the first operand.
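
    To illustrate why the definitions-first convention is convenient, here is a small self-contained sketch of a printer that relies on it. The Operand and Instr types and the print function are hypothetical stand-ins invented for this example; they are not the MachineInstr API.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical operand: only a printable name and a def/use flag.
struct Operand {
  std::string Name; // e.g. "%i3"
  bool IsDef;       // true for a definition, false for a use
};

// Hypothetical instruction: an opcode name plus operands, defs first.
struct Instr {
  std::string Opcode;
  std::vector<Operand> Operands;
};

// Prints "defs = opcode uses". Because definitions precede uses, the
// printer only has to scan the leading run of def operands.
inline std::string print(const Instr &I) {
  std::ostringstream OS;
  std::size_t N = 0;
  while (N < I.Operands.size() && I.Operands[N].IsDef) {
    OS << (N ? ", " : "") << I.Operands[N].Name;
    ++N;
  }
  if (N)
    OS << " = ";
  OS << I.Opcode;
  for (std::size_t U = N; U < I.Operands.size(); ++U)
    OS << (U == N ? " " : ", ") << I.Operands[U].Name;
  return OS.str();
}

} // namespace sketch
```

    Storing the SPARC add from the text as the operand list "%i3" (def), "%i1", "%i2" and passing it to print yields the string "%i3 = add %i1, %i2".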

    + +
    + + + + +
    + +

    Machine instructions are created by using the BuildMI functions, + located in the include/llvm/CodeGen/MachineInstrBuilder.h file. The + BuildMI functions make it easy to build arbitrary machine + instructions. Usage of the BuildMI functions looks like this:

    + +
    +
    +// Create a 'DestReg = mov 42' (rendered in X86 assembly as 'mov DestReg, 42')
    +// instruction.  The '1' specifies how many operands will be added.
    +MachineInstr *MI = BuildMI(X86::MOV32ri, 1, DestReg).addImm(42);
    +
    +// Create the same instr, but insert it at the end of a basic block.
    +MachineBasicBlock &MBB = ...
    +BuildMI(MBB, X86::MOV32ri, 1, DestReg).addImm(42);
    +
    +// Create the same instr, but insert it before a specified iterator point.
    +MachineBasicBlock::iterator MBBI = ...
    +BuildMI(MBB, MBBI, X86::MOV32ri, 1, DestReg).addImm(42);
    +
    +// Create a 'cmp Reg, 0' instruction, no destination reg.
    +MI = BuildMI(X86::CMP32ri, 2).addReg(Reg).addImm(0);
    +// Create an 'sahf' instruction which takes no operands and stores nothing.
    +MI = BuildMI(X86::SAHF, 0);
    +
    +// Create a self looping branch instruction.
    +BuildMI(MBB, X86::JNE, 1).addMBB(&MBB);
    +
    +
    + +

    The key thing to remember with the BuildMI functions is that you + have to specify the number of operands that the machine instruction will + take. This allows for efficient memory allocation. Also note that + operands default to being uses of values, not definitions. If you need to + add a definition operand (other than the optional destination register), you + must explicitly mark it as such:

    + +
    +
    +MI.addReg(Reg, RegState::Define);
    +
    +
    + +
    + + + + +
    + +

    One important issue that the code generator needs to be aware of is the + presence of fixed registers. In particular, there are often places in the + instruction stream where the register allocator must arrange for a + particular value to be in a particular register. This can occur due to + limitations of the instruction set (e.g., the X86 can only do a 32-bit divide + with the EAX/EDX registers), or external factors like + calling conventions. In any case, the instruction selector should emit code + that copies a virtual register into or out of a physical register when + needed.

    + +

    For example, consider this simple LLVM example:

    + +
    +
    +define i32 @test(i32 %X, i32 %Y) {
    +  %Z = sdiv i32 %X, %Y
    +  ret i32 %Z
    +}
    +
    +
    + +

    The X86 instruction selector produces this machine code for the div + and ret (use "llc X.bc -march=x86 -print-machineinstrs" to + get this):

    + +
    +
    +;; Start of div
    +%EAX = mov %reg1024           ;; Copy X (in reg1024) into EAX
    +%reg1027 = sar %reg1024, 31
    +%EDX = mov %reg1027           ;; Sign extend X into EDX
    +idiv %reg1025                 ;; Divide by Y (in reg1025)
    +%reg1026 = mov %EAX           ;; Read the result (Z) out of EAX
    +
    +;; Start of ret
    +%EAX = mov %reg1026           ;; 32-bit return value goes in EAX
    +ret
    +
    +
    + +

    By the end of code generation, the register allocator has coalesced the + registers and deleted the resultant identity moves producing the following + code:

    + +
    +
    +;; X is in EAX, Y is in ECX
    +mov %EAX, %EDX
    +sar %EDX, 31
    +idiv %ECX
    +ret 
    +
    +
    + +

    This approach is extremely general (if it can handle the X86 architecture, it + can handle anything!) and allows all of the target specific knowledge about + the instruction stream to be isolated in the instruction selector. Note that + physical registers should have a short lifetime for good code generation, and + all physical registers are assumed dead on entry to and exit from basic + blocks (before register allocation). Thus, if you need a value to be live + across basic block boundaries, it must live in a virtual + register.

    + +
    + + + + +
    + +

    MachineInstr's are initially selected in SSA-form, and are + maintained in SSA-form until register allocation happens. For the most part, + this is trivially simple since LLVM is already in SSA form; LLVM PHI nodes + become machine code PHI nodes, and virtual registers are only allowed to have + a single definition.

    + +

    After register allocation, machine code is no longer in SSA-form because + there are no virtual registers left in the code.

    + +
    + + + + +
    + +

    The MachineBasicBlock class contains a list of machine instructions + (MachineInstr instances). It roughly + corresponds to the LLVM code input to the instruction selector, but there can + be a one-to-many mapping (i.e. one LLVM basic block can map to multiple + machine basic blocks). The MachineBasicBlock class has a + "getBasicBlock" method, which returns the LLVM basic block that it + comes from.

    + +
    + + + + +
    + +

    The MachineFunction class contains a list of machine basic blocks + (MachineBasicBlock instances). It + corresponds one-to-one with the LLVM function input to the instruction + selector. In addition to a list of basic blocks, + the MachineFunction contains a MachineConstantPool, + a MachineFrameInfo, a MachineFunctionInfo, and a + MachineRegisterInfo. See + include/llvm/CodeGen/MachineFunction.h for more information.

    + +
    + + + + + +
    + +

    This section documents the phases described in the + high-level design of the code generator. + It explains how they work and some of the rationale behind their design.

    + +
    + + + + +
    + +

    Instruction Selection is the process of translating LLVM code presented to + the code generator into target-specific machine instructions. There are + several well-known ways to do this in the literature. LLVM uses a + SelectionDAG based instruction selector.

    + +

    Portions of the DAG instruction selector are generated from the target + description (*.td) files. Our goal is for the entire instruction + selector to be generated from these .td files, though currently + there are still things that require custom C++ code.

    + +
    + + + + +
    + +

    The SelectionDAG provides an abstraction for code representation in a way + that is amenable to instruction selection using automatic techniques + (e.g. dynamic-programming based optimal pattern matching selectors). It is + also well-suited to other phases of code generation; in particular, + instruction scheduling (SelectionDAG's are very close to scheduling DAGs + post-selection). Additionally, the SelectionDAG provides a host + representation where a large variety of very-low-level (but + target-independent) optimizations may be + performed; ones which require extensive information about the instructions + efficiently supported by the target.

    + +

    The SelectionDAG is a Directed-Acyclic-Graph whose nodes are instances of the + SDNode class. The primary payload of the SDNode is its + operation code (Opcode) that indicates what operation the node performs and + the operands to the operation. The various operation node types are + described at the top of the include/llvm/CodeGen/SelectionDAGNodes.h + file.

    + +

    Although most operations define a single value, each node in the graph may + define multiple values. For example, a combined div/rem operation will + define both the quotient and the remainder. Many other situations require + multiple values as well. Each node also has some number of operands, which + are edges to the node defining the used value. Because nodes may define + multiple values, edges are represented by instances of the SDValue + class, which is a <SDNode, unsigned> pair, indicating the node + and result value being used, respectively. Each value produced by + an SDNode has an associated MVT (Machine Value Type) + indicating what the type of the value is.

    + +

    SelectionDAGs contain two different kinds of values: those that represent + data flow and those that represent control flow dependencies. Data values + are simple edges with an integer or floating point value type. Control edges + are represented as "chain" edges which are of type MVT::Other. + These edges provide an ordering between nodes that have side effects (such as + loads, stores, calls, returns, etc). All nodes that have side effects should + take a token chain as input and produce a new one as output. By convention, + token chain inputs are always operand #0, and chain results are always the + last value produced by an operation.
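
    The node, edge, and chain conventions described so far can be modeled with a few lines of C++. The sketch below is an assumption-laden toy: Node, Value, ValueType, and the helper functions are names invented here and only mirror the ideas of SDNode, SDValue, and MVT::Other.

```cpp
#include <vector>

namespace sketch {

// Toy value types; Other plays the role of the chain type.
enum class ValueType { i32, f32, Other };

struct Node;

// An edge names the node being used and which of its results is used.
struct Value {
  const Node *N;
  unsigned ResNo;
};

// A node may produce several results and use several operands.
struct Node {
  unsigned Opcode;                // hypothetical opcode number
  std::vector<Value> Operands;    // a chain input, if any, is operand #0
  std::vector<ValueType> Results; // a chain result, if any, comes last
};

// The type of an edge is the type of the result it refers to.
inline ValueType typeOf(const Value &V) { return V.N->Results[V.ResNo]; }

// True if the node's last result is a chain.
inline bool producesChain(const Node &N) {
  return !N.Results.empty() && N.Results.back() == ValueType::Other;
}

// True if the node's first operand is a chain.
inline bool takesChain(const Node &N) {
  return !N.Operands.empty() && typeOf(N.Operands[0]) == ValueType::Other;
}

} // namespace sketch
```

    In this model a load would be a node with a chain as operand #0 and two results, the loaded data followed by a new chain.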

    + +

    A SelectionDAG has designated "Entry" and "Root" nodes. The Entry node is + always a marker node with an Opcode of ISD::EntryToken. The Root + node is the final side-effecting node in the token chain. For example, in a + single basic block function it would be the return node.

    + +

    One important concept for SelectionDAGs is the notion of a "legal" vs. + "illegal" DAG. A legal DAG for a target is one that only uses supported + operations and supported types. On a 32-bit PowerPC, for example, a DAG with + a value of type i1, i8, i16, or i64 would be illegal, as would a DAG that + uses a SREM or UREM operation. The + legalize types and + legalize operations phases are + responsible for turning an illegal DAG into a legal DAG.

    + +
    + + + + +
    + +

    SelectionDAG-based instruction selection consists of the following steps:

    + +
      +
    1. Build initial DAG — This stage + performs a simple translation from the input LLVM code to an illegal + SelectionDAG.
    2. + +
    3. Optimize SelectionDAG — This + stage performs simple optimizations on the SelectionDAG to simplify it, + and recognize meta instructions (like rotates + and div/rem pairs) for targets that support these meta + operations. This makes the resultant code more efficient and + the select instructions from DAG phase + (below) simpler.
    4. + +
    5. Legalize SelectionDAG Types + — This stage transforms SelectionDAG nodes to eliminate any types + that are unsupported on the target.
    6. + +
    7. Optimize SelectionDAG — The + SelectionDAG optimizer is run to clean up redundancies exposed by type + legalization.
    8. + +
    9. Legalize SelectionDAG Ops — + This stage transforms SelectionDAG nodes to eliminate any operations that are + unsupported on the target.
    10. + +
    11. Optimize SelectionDAG — The + SelectionDAG optimizer is run to eliminate inefficiencies introduced by + operation legalization.
    12. + +
    13. Select instructions from DAG — + Finally, the target instruction selector matches the DAG operations to + target instructions. This process translates the target-independent input + DAG into another DAG of target instructions.
    14. + +
    15. SelectionDAG Scheduling and Formation + — The last phase assigns a linear order to the instructions in the + target-instruction DAG and emits them into the MachineFunction being + compiled. This step uses traditional prepass scheduling techniques.
    16. +
    + +

    After all of these steps are complete, the SelectionDAG is destroyed and the + rest of the code generation passes are run.

    + +

    One great way to visualize what is going on here is to take advantage of a + few LLC command line options. The following options pop up a window + displaying the SelectionDAG at specific times (if you only get errors printed + to the console while using this, you probably + need to configure your system + to add support for it).

    + +
      +
    • -view-dag-combine1-dags displays the DAG after being built, + before the first optimization pass.
    • + +
    • -view-legalize-dags displays the DAG before Legalization.
    • + +
    • -view-dag-combine2-dags displays the DAG before the second + optimization pass.
    • + +
    • -view-isel-dags displays the DAG before the Select phase.
    • + +
    • -view-sched-dags displays the DAG before Scheduling.
    • +
    + +

    The -view-sunit-dags option displays the Scheduler's dependency graph. + This graph is based on the final SelectionDAG, with nodes that must be + scheduled together bundled into a single scheduling-unit node, and with + immediate operands and other nodes that aren't relevant for scheduling + omitted.

    + +
    + + + + +
    + +

    The initial SelectionDAG is naïvely peephole expanded from the LLVM + input by the SelectionDAGLowering class in the + lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp file. The intent of + this pass is to expose as many low-level, target-specific details to the + SelectionDAG as possible. This pass is mostly hard-coded (e.g. an + LLVM add turns into an SDNode add while a + getelementptr is expanded into the obvious arithmetic). This pass + requires target-specific hooks to lower calls, returns, varargs, etc. For + these features, the TargetLowering + interface is used.

    + +
    + + + + +
    + +

    The Legalize phase is in charge of converting a DAG to only use the types + that are natively supported by the target.

    + +

    There are two main ways of converting values of unsupported scalar types to + values of supported types: converting small types to larger types + ("promoting"), and breaking up large integer types into smaller ones + ("expanding"). For example, a target might require that all f32 values are + promoted to f64 and that all i1/i8/i16 values are promoted to i32. The same + target might require that all i64 values be expanded into pairs of i32 + values. These changes can insert sign and zero extensions as needed to make + sure that the final code has the same behavior as the input.
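
    As a worked illustration of the two scalar strategies, the following sketch promotes an i8 value to i32 with a sign extension and expands an i64 addition into two i32 additions plus a carry. All names here are invented for the example, and the code only models the arithmetic; it is not how the legalizer itself is written.

```cpp
#include <cstdint>
#include <utility>

namespace sketch {

// "Promoting": sign-extend an i8 value so it can be handled as an i32.
inline std::int32_t promoteSigned(std::int8_t V) {
  return static_cast<std::int32_t>(V);
}

// "Expanding": represent one i64 as a (low, high) pair of i32 parts.
using Parts = std::pair<std::uint32_t, std::uint32_t>;

inline Parts expand(std::uint64_t V) {
  return {static_cast<std::uint32_t>(V), static_cast<std::uint32_t>(V >> 32)};
}

inline std::uint64_t combine(Parts P) {
  return (static_cast<std::uint64_t>(P.second) << 32) | P.first;
}

// A 64-bit add expressed with 32-bit operations only: add the low
// halves, detect the carry from wraparound, then add the high halves.
inline Parts addExpanded(Parts A, Parts B) {
  std::uint32_t Lo = A.first + B.first;
  std::uint32_t Carry = Lo < A.first ? 1u : 0u;
  std::uint32_t Hi = A.second + B.second + Carry;
  return {Lo, Hi};
}

} // namespace sketch
```

    Combining the result of addExpanded gives the same value as a native 64-bit addition, which is the "same behavior as the input" requirement stated above.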

    + +

    There are two main ways of converting values of unsupported vector types to + values of supported types: splitting vector types, multiple times if + necessary, until a legal type is found, and extending vector types by adding + elements to the end to round them out to legal types ("widening"). If a + vector gets split all the way down to single-element parts with no supported + vector type being found, the elements are converted to scalars + ("scalarizing").

    + +

    A target implementation tells the legalizer which types are supported (and + which register class to use for them) by calling the + addRegisterClass method in its TargetLowering constructor.

    + +
    + + + + +
    + +

    The Legalize phase is in charge of converting a DAG to only use the + operations that are natively supported by the target.

    + +

    Targets often have weird constraints, such as not supporting every operation + on every supported datatype (e.g. X86 does not support byte conditional moves + and PowerPC does not support sign-extending loads from a 16-bit memory + location). Legalize takes care of this by open-coding another sequence of + operations to emulate the operation ("expansion"), by promoting one type to a + larger type that supports the operation ("promotion"), or by using a + target-specific hook to implement the legalization ("custom").

    + +

    A target implementation tells the legalizer which operations are not + supported (and which of the above three actions to take) by calling the + setOperationAction method in its TargetLowering + constructor.
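
    The idea of recording a per-operation action can be sketched with a small lookup table. The class below is a toy model written for this example: its enums, its name, and its storage are assumptions, and it should not be read as the real TargetLowering interface.

```cpp
#include <map>
#include <utility>

namespace sketch {

// The legalization choices described above, plus "already legal".
enum class Action { Legal, Promote, Expand, Custom };

// Hypothetical subsets of operations and value types.
enum class Op { Add, SRem, Select };
enum class Type { i8, i32 };

class LoweringInfo {
  std::map<std::pair<Op, Type>, Action> Actions;

public:
  // Record how the legalizer should treat an unsupported operation.
  void setOperationAction(Op O, Type T, Action A) { Actions[{O, T}] = A; }

  // Anything never mentioned is assumed to be natively supported.
  Action getOperationAction(Op O, Type T) const {
    auto It = Actions.find({O, T});
    return It == Actions.end() ? Action::Legal : It->second;
  }
};

} // namespace sketch
```

    A target modeled this way would call setOperationAction once per unsupported (operation, type) pair in its constructor and leave everything else at the default.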

    + +

    Prior to the existence of the Legalize passes, we required that every target + selector support and handle every + operator and type even if they were not natively supported. The introduction + of the Legalize phases allows all of the canonicalization patterns to be + shared across targets, and makes it very easy to optimize the canonicalized + code because it is still in the form of a DAG.

    + +
    + + + + +
    + +

    The SelectionDAG optimization phase is run multiple times for code + generation, immediately after the DAG is built and once after each + legalization. The first run of the pass allows the initial code to be + cleaned up (e.g. performing optimizations that depend on knowing that the + operators have restricted type inputs). Subsequent runs of the pass clean up + the messy code generated by the Legalize passes, which allows Legalize to be + very simple (it can focus on making code legal instead of focusing on + generating good and legal code).

    + +

    One important class of optimizations performed is optimizing inserted sign + and zero extension instructions. We currently use ad-hoc techniques, but + could move to more rigorous techniques in the future. Here are some good + papers on the subject:

    + +

    "Widening + integer arithmetic"
    + Kevin Redwine and Norman Ramsey
    + International Conference on Compiler Construction (CC) 2004

    + +

    "Effective + sign extension elimination"
    + Motohiro Kawahito, Hideaki Komatsu, and Toshio Nakatani
    + Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design + and Implementation.

    + +
    + + + + +
    + +

    The Select phase is the bulk of the target-specific code for instruction + selection. This phase takes a legal SelectionDAG as input, pattern matches + the instructions supported by the target to this DAG, and produces a new DAG + of target code. For example, consider the following LLVM fragment:

    + +
    +
    +%t1 = fadd float %W, %X
    +%t2 = fmul float %t1, %Y
    +%t3 = fadd float %t2, %Z
    +
    +
    + +

    This LLVM code corresponds to a SelectionDAG that looks basically like + this:

    + +
    +
    +(fadd:f32 (fmul:f32 (fadd:f32 W, X), Y), Z)
    +
    +
    + +

    If a target supports floating point multiply-and-add (FMA) operations, one of + the adds can be merged with the multiply. On the PowerPC, for example, the + output of the instruction selector might look like this DAG:

    + +
    +
    +(FMADDS (FADDS W, X), Y, Z)
    +
    +
    + +

    The FMADDS instruction is a ternary instruction that multiplies its +first two operands and adds the third (as single-precision floating-point +numbers). The FADDS instruction is a simple binary single-precision +add instruction. To perform this pattern match, the PowerPC backend includes +the following instruction definitions:

    + +
    +
    +def FMADDS : AForm_1<59, 29,
    +                    (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRC, F4RC:$FRB),
    +                    "fmadds $FRT, $FRA, $FRC, $FRB",
    +                    [(set F4RC:$FRT, (fadd (fmul F4RC:$FRA, F4RC:$FRC),
    +                                           F4RC:$FRB))]>;
    +def FADDS : AForm_2<59, 21,
    +                    (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRB),
    +                    "fadds $FRT, $FRA, $FRB",
    +                    [(set F4RC:$FRT, (fadd F4RC:$FRA, F4RC:$FRB))]>;
    +
    +
    + +

    The portion of the instruction definition in bold indicates the pattern used + to match the instruction. The DAG operators + (like fmul/fadd) are defined in + the include/llvm/Target/TargetSelectionDAG.td file. " + F4RC" is the register class of the input and result values.
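
    To make the effect of the two patterns concrete, here is a hand-written toy matcher over a tiny expression tree. It is a sketch under stated assumptions: the Expr type, the helper names, and the greedy top-down strategy are invented for illustration and are not what the generated selector looks like. It accepts the multiply on either side of the add, since addition is commutative.

```cpp
#include <memory>
#include <string>
#include <utility>

namespace sketch {

// Toy expression tree: Op is "fadd", "fmul", or a leaf name such as "W".
struct Expr {
  std::string Op;
  std::shared_ptr<Expr> LHS, RHS; // both null for leaves
};
using ExprPtr = std::shared_ptr<Expr>;

inline ExprPtr leaf(const std::string &Name) {
  return std::make_shared<Expr>(Expr{Name, nullptr, nullptr});
}

inline ExprPtr node(const std::string &Op, ExprPtr L, ExprPtr R) {
  return std::make_shared<Expr>(Expr{Op, std::move(L), std::move(R)});
}

// Rewrites (fadd (fmul A, C), B) or (fadd B, (fmul A, C)) to
// "(FMADDS A, C, B)"; any other fadd becomes "(FADDS L, R)".
// Leaves print as their names; other operators print unchanged.
inline std::string selectInstrs(const ExprPtr &E) {
  if (!E->LHS)
    return E->Op;
  const ExprPtr &L = E->LHS, &R = E->RHS;
  if (E->Op == "fadd") {
    if (L->Op == "fmul" && L->LHS)
      return "(FMADDS " + selectInstrs(L->LHS) + ", " + selectInstrs(L->RHS) +
             ", " + selectInstrs(R) + ")";
    if (R->Op == "fmul" && R->LHS)
      return "(FMADDS " + selectInstrs(R->LHS) + ", " + selectInstrs(R->RHS) +
             ", " + selectInstrs(L) + ")";
    return "(FADDS " + selectInstrs(L) + ", " + selectInstrs(R) + ")";
  }
  return "(" + E->Op + " " + selectInstrs(L) + ", " + selectInstrs(R) + ")";
}

} // namespace sketch
```

    Applied to the tree for (fadd (fmul (fadd W, X), Y), Z), this toy matcher returns "(FMADDS (FADDS W, X), Y, Z)", the same shape as the selected DAG shown earlier.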

    + +

    The TableGen DAG instruction selector generator reads the instruction + patterns in the .td file and automatically builds parts of the + pattern matching code for your target. It has the following strengths:

    + +
      +
    • At compiler-compiler time, it analyzes your instruction patterns and tells + you if your patterns make sense or not.
    • + +
    • It can handle arbitrary constraints on operands for the pattern match. In + particular, it is straight-forward to say things like "match any immediate + that is a 13-bit sign-extended value". For examples, see the + immSExt16 and related tblgen classes in the PowerPC + backend.
    • + +
    • It knows several important identities for the patterns defined. For + example, it knows that addition is commutative, so it allows the + FMADDS pattern above to match "(fadd X, (fmul Y, Z))" as + well as "(fadd (fmul X, Y), Z)", without the target author having + to specially handle this case.
    • + +
    • It has a full-featured type-inferencing system. In particular, you should + rarely have to explicitly tell the system what type parts of your patterns + are. In the FMADDS case above, we didn't have to tell + tblgen that all of the nodes in the pattern are of type 'f32'. + It was able to infer and propagate this knowledge from the fact that + F4RC has type 'f32'.
    • + +
    • Targets can define their own (and rely on built-in) "pattern fragments". + Pattern fragments are chunks of reusable patterns that get inlined into + your patterns during compiler-compiler time. For example, the integer + "(not x)" operation is actually defined as a pattern fragment + that expands as "(xor x, -1)", since the SelectionDAG does not + have a native 'not' operation. Targets can define their own + short-hand fragments as they see fit. See the definition of + 'not' and 'ineg' for examples.
    • + +
    • In addition to instructions, targets can specify arbitrary patterns that + map to one or more instructions using the 'Pat' class. For example, the + PowerPC has no way to load an arbitrary integer immediate into a register + in one instruction. To tell tblgen how to do this, it defines: +
      +
      +
      +
      +// Arbitrary immediate support.  Implement in terms of LIS/ORI.
      +def : Pat<(i32 imm:$imm),
      +          (ORI (LIS (HI16 imm:$imm)), (LO16 imm:$imm))>;
      +
      +
      +
      + If none of the single-instruction patterns for loading an immediate into a + register match, this will be used. This rule says "match an arbitrary i32 + immediate, turning it into an ORI ('or a 16-bit immediate') and + an LIS ('load 16-bit immediate, where the immediate is shifted to + the left 16 bits') instruction". To make this work, the + LO16/HI16 node transformations are used to manipulate + the input immediate (in this case, take the high or low 16-bits of the + immediate).
    • + +
    • While the system does automate a lot, it still allows you to write custom + C++ code to match special cases if there is something that is hard to + express.
    • +
    + +
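
    The LIS/ORI rule for arbitrary immediates in the list above comes down to simple bit arithmetic, sketched here. The function names are hypothetical models of the HI16 and LO16 transformations and of the two instructions, and the sketch assumes that ORI does not sign-extend its 16-bit immediate.

```cpp
#include <cstdint>

namespace sketch {

// Models of the HI16 / LO16 node transformations.
constexpr std::uint32_t hi16(std::uint32_t Imm) { return Imm >> 16; }
constexpr std::uint32_t lo16(std::uint32_t Imm) { return Imm & 0xFFFFu; }

// LIS: load a 16-bit immediate shifted left by 16 bits.
constexpr std::uint32_t lis(std::uint32_t Imm16) { return Imm16 << 16; }

// ORI: or a register value with a 16-bit immediate (assumed to be
// zero-extended in this model).
constexpr std::uint32_t ori(std::uint32_t Reg, std::uint32_t Imm16) {
  return Reg | Imm16;
}

// The two-instruction sequence the pattern expands to.
constexpr std::uint32_t materialize(std::uint32_t Imm) {
  return ori(lis(hi16(Imm)), lo16(Imm));
}

} // namespace sketch
```

    For any 32-bit input, materialize returns the input unchanged, which is why the pair of instructions can stand in for a single load-immediate.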

    While it has many strengths, the system currently has some limitations, + primarily because it is a work in progress and is not yet finished:

    + +
      +
    • Overall, there is no way to define or match SelectionDAG nodes that define + multiple values (e.g. SMUL_LOHI, LOAD, CALL, + etc). This is the biggest reason that you currently still have + to write custom C++ code for your instruction selector.
    • + +
    • There is no great way to support matching complex addressing modes yet. + In the future, we will extend pattern fragments to allow them to define + multiple values (e.g. the four operands of the X86 + addressing mode, which are currently matched with custom C++ code). + In addition, we'll extend fragments so that a fragment can match multiple + different patterns.
    • + +
    • We don't automatically infer flags like isStore/isLoad yet.
    • + +
    • We don't automatically generate the set of supported registers and + operations for the Legalizer + yet.
    • + +
    • We don't have a way of tying in custom legalized nodes yet.
    • +
    + +

    Despite these limitations, the instruction selector generator is still quite + useful for most of the binary and logical operations in typical instruction + sets. If you run into any problems or can't figure out how to do something, + please let Chris know!

    + +
    + + + + +
    + +

    The scheduling phase takes the DAG of target instructions from the selection + phase and assigns an order. The scheduler can pick an order depending on + various constraints of the machines (e.g. ordering for minimal register pressure + or trying to cover instruction latencies). Once an order is established, the + DAG is converted to a list + of MachineInstrs and the SelectionDAG is + destroyed.
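
    At its simplest, assigning an order to a DAG is a topological sort. The sketch below shows only that core step, with nodes identified by index and a deliberately naive tie-break; the function name and data layout are assumptions for this example, and none of the register pressure or latency heuristics mentioned above are modeled.

```cpp
#include <cstddef>
#include <vector>

namespace sketch {

// Deps[i] lists the nodes whose results node i uses. Returns a linear
// order in which every node appears after all of its operands. Among
// ready nodes, the lowest-numbered one is picked; a real scheduler
// would apply its heuristics at exactly this point.
inline std::vector<std::size_t>
schedule(const std::vector<std::vector<std::size_t>> &Deps) {
  const std::size_t N = Deps.size();
  std::vector<bool> Done(N, false);
  std::vector<std::size_t> Order;
  while (Order.size() < N) {
    bool Progress = false;
    for (std::size_t I = 0; I < N; ++I) {
      if (Done[I])
        continue;
      bool Ready = true;
      for (std::size_t D : Deps[I])
        if (!Done[D]) {
          Ready = false;
          break;
        }
      if (Ready) {
        Done[I] = true;
        Order.push_back(I);
        Progress = true;
        break; // rescan so the lowest-numbered ready node goes next
      }
    }
    if (!Progress)
      break; // a cycle was found, so the input was not a DAG
  }
  return Order;
}

} // namespace sketch
```

    For a two-node graph in which node 0 uses the result of node 1, the returned order is node 1 followed by node 0.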

    + +

    Note that this phase is logically separate from the instruction selection + phase, but is tied to it closely in the code because it operates on + SelectionDAGs.

    + +
    + + + + +
    + +
      +
    1. Optional function-at-a-time selection.
    2. Auto-generate entire selector from .td file.
    + +
    + + + +

    To Be Written

    + + + + +
    + +

    Live Intervals are the ranges (intervals) where a variable is live. + They are used by some register allocator passes to + determine if two or more virtual registers which require the same physical + register are live at the same point in the program (i.e., they conflict). + When this situation occurs, one virtual register must be spilled.

    + +
    + + + + +
    + +

    The first step in determining the live intervals of variables is to calculate + the set of registers that are immediately dead after the instruction (i.e., + the instruction calculates the value, but it is never used) and the set of + registers that are used by the instruction, but are never used after the + instruction (i.e., they are killed). Live variable information is computed + for each virtual register and register allocatable physical + register in the function. This is done in a very efficient manner because it + uses SSA to sparsely compute lifetime information for virtual registers + (which are in SSA form) and only has to track physical registers within a + block. Before register allocation, LLVM can assume that physical registers + are only live within a single basic block. This allows it to do a single, + local analysis to resolve physical register lifetimes within each basic + block. If a physical register is not register allocatable (e.g., a stack + pointer or condition codes), it is not tracked.

    + +

    Physical registers may be live in to or out of a function. Live in values are + typically arguments in registers. Live out values are typically return values + in registers. Live in values are marked as such, and are given a dummy + "defining" instruction during live intervals analysis. If the last basic + block of a function is a return, then it's marked as using all live + out values in the function.

    + +

    PHI nodes need to be handled specially, because the calculation of + the live variable information from a depth first traversal of the CFG of the + function won't guarantee that a virtual register used by the PHI + node is defined before it's used. When a PHI node is encountered, + only the definition is handled, because the uses will be handled in other + basic blocks.

    + +

    For each PHI node of the current basic block, we simulate an + assignment at the end of the current basic block and traverse the successor + basic blocks. If a successor basic block has a PHI node and one of + the PHI node's operands is coming from the current basic block, then + the variable is marked as alive within the current basic block and all + of its predecessor basic blocks, until the basic block with the defining + instruction is encountered.

    + +
    + + + + +
    + +

    We now have the information available to perform the live intervals analysis and build the live intervals themselves. We start off by numbering the basic blocks and machine instructions. We then handle the "live-in" values. These are in physical registers, so the physical register is assumed to be killed by the end of the basic block. Live intervals for virtual registers are computed for some ordering of the machine instructions [1, N]. A live interval is an interval [i, j), where 1 <= i <= j < N, for which a variable is live.
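    As a rough illustration of how half-open intervals [i, j) are used to detect conflicts, here is a minimal sketch. The ToyInterval type and the overlaps function are hypothetical and much simpler than LLVM's own live interval classes.

```cpp
#include <cassert>

// A half-open interval [Start, End) over instruction numbers.
struct ToyInterval {
  unsigned Start;
  unsigned End;
};

// Two half-open intervals conflict when each one starts before the
// other one ends.
static bool overlaps(const ToyInterval &A, const ToyInterval &B) {
  return A.Start < B.End && B.Start < A.End;
}
```

    Because the end point is excluded, an interval that ends at instruction 5 does not conflict with one that starts at instruction 5.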

    + +

    More to come...

    + +
    + + + + +
    + +

    The Register Allocation problem consists in mapping a program + Pv, that can use an unbounded number of virtual registers, + to a program Pp that contains a finite (possibly small) + number of physical registers. Each target architecture has a different number + of physical registers. If the number of physical registers is not enough to + accommodate all the virtual registers, some of them will have to be mapped + into memory. These virtuals are called spilled virtuals.

    + +
    + + + + + +
    + +

    In LLVM, physical registers are denoted by integer numbers that normally + range from 1 to 1023. To see how this numbering is defined for a particular + architecture, you can read the GenRegisterNames.inc file for that + architecture. For instance, by + inspecting lib/Target/X86/X86GenRegisterNames.inc we see that the + 32-bit register EAX is denoted by 15, and the MMX register + MM0 is mapped to 48.

    + +

    Some architectures contain registers that share the same physical location. A + notable example is the X86 platform. For instance, in the X86 architecture, + the registers EAX, AX and AL share the first eight + bits. These physical registers are marked as aliased in LLVM. Given a + particular architecture, you can check which registers are aliased by + inspecting its RegisterInfo.td file. Moreover, the method + TargetRegisterInfo::getAliasSet(p_reg) returns an array containing + all the physical registers aliased to the register p_reg.

    + +

    Physical registers, in LLVM, are grouped in Register Classes. + Elements in the same register class are functionally equivalent, and can be + interchangeably used. Each virtual register can only be mapped to physical + registers of a particular class. For instance, in the X86 architecture, some + virtuals can only be allocated to 8 bit registers. A register class is + described by TargetRegisterClass objects. To discover if a virtual + register is compatible with a given physical, this code can be used:

    + +
    +
    +bool RegMapping_Fer::compatible_class(MachineFunction &mf,
    +                                      unsigned v_reg,
    +                                      unsigned p_reg) {
    +  assert(TargetRegisterInfo::isPhysicalRegister(p_reg) &&
    +         "Target register must be physical");
    +  const TargetRegisterClass *trc = mf.getRegInfo().getRegClass(v_reg);
    +  return trc->contains(p_reg);
    +}
    +
    +
    + +

    Sometimes, mostly for debugging purposes, it is useful to change the number of physical registers available in the target architecture. This must be done statically, inside the TargetRegisterInfo.td file. Just grep for RegisterClass, the last parameter of which is a list of registers. Commenting some of them out is one simple way to avoid them being used. A more polite way is to explicitly exclude some registers from the allocation order. See the definition of the GR8 register class in lib/Target/X86/X86RegisterInfo.td for an example of this.

    + +

    Virtual registers are also denoted by integer numbers. Contrary to physical + registers, different virtual registers never share the same number. The + smallest virtual register is normally assigned the number 1024. This may + change, so, in order to know which is the first virtual register, you should + access TargetRegisterInfo::FirstVirtualRegister. Any register whose + number is greater than or equal + to TargetRegisterInfo::FirstVirtualRegister is considered a virtual + register. Whereas physical registers are statically defined in + a TargetRegisterInfo.td file and cannot be created by the + application developer, that is not the case with virtual registers. In order + to create new virtual registers, use the + method MachineRegisterInfo::createVirtualRegister(). This method + will return a virtual register with the highest code.
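    The numbering scheme described above can be sketched as follows. This is a toy model, not the LLVM classes: the value 1024 is only the "normal" value quoted in the text, and ToyRegisterInfo is a hypothetical stand-in for the real register bookkeeping.

```cpp
#include <cassert>

// Assumption: 1024 is the usual first virtual register number; real code
// should read TargetRegisterInfo::FirstVirtualRegister instead.
static const unsigned FirstVirtualRegister = 1024;

static bool isVirtualRegister(unsigned Reg) {
  return Reg >= FirstVirtualRegister;
}

// Physical registers normally range from 1 up to FirstVirtualRegister - 1.
static bool isPhysicalRegister(unsigned Reg) {
  return Reg > 0 && Reg < FirstVirtualRegister;
}

// Toy allocator of fresh virtual register numbers.
class ToyRegisterInfo {
  unsigned NextVReg;

public:
  ToyRegisterInfo() : NextVReg(FirstVirtualRegister) {}

  // Each call returns a new register with the highest number so far.
  unsigned createVirtualRegister() { return NextVReg++; }
};
```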

    + +

    Before register allocation, the operands of an instruction are mostly virtual registers, although physical registers may also be used. In order to check if a given machine operand is a register, use the boolean function MachineOperand::isRegister(). To obtain the integer code of a register, use MachineOperand::getReg(). An instruction may define or use a register. For instance, ADD reg:1026 := reg:1025 reg:1024 defines the register 1026, and uses registers 1025 and 1024. Given a register operand, the method MachineOperand::isUse() informs if that register is being used by the instruction. The method MachineOperand::isDef() informs if that register is being defined.

    + +

    We will call physical registers present in the LLVM bitcode before register allocation pre-colored registers. Pre-colored registers are used in many different situations, for instance, to pass parameters of function calls, and to store results of particular instructions. There are two types of pre-colored registers: the ones implicitly defined, and those explicitly defined. Explicitly defined registers are normal operands, and can be accessed with MachineInstr::getOperand(int)::getReg(). In order to check which registers are implicitly defined by an instruction, use the TargetInstrInfo::get(opcode)::ImplicitDefs, where opcode is the opcode of the target instruction. One important difference between explicit and implicit physical registers is that the latter are defined statically for each instruction, whereas the former may vary depending on the program being compiled. For example, an instruction that represents a function call will always implicitly define or use the same set of physical registers. To read the registers implicitly used by an instruction, use TargetInstrInfo::get(opcode)::ImplicitUses. Pre-colored registers impose constraints on any register allocation algorithm. The register allocator must make sure that none of them are overwritten by the values of virtual registers while still alive.

    + +
    + + + + + +
    + +

    There are two ways to map virtual registers to physical registers (or to + memory slots). The first way, that we will call direct mapping, is + based on the use of methods of the classes TargetRegisterInfo, + and MachineOperand. The second way, that we will call indirect + mapping, relies on the VirtRegMap class in order to insert loads + and stores sending and getting values to and from memory.

    + +

    The direct mapping provides more flexibility to the developer of the register + allocator; however, it is more error prone, and demands more implementation + work. Basically, the programmer will have to specify where load and store + instructions should be inserted in the target function being compiled in + order to get and store values in memory. To assign a physical register to a + virtual register present in a given operand, + use MachineOperand::setReg(p_reg). To insert a store instruction, + use TargetInstrInfo::storeRegToStackSlot(...), and to insert a + load instruction, use TargetInstrInfo::loadRegFromStackSlot.

    + +

    The indirect mapping shields the application developer from the complexities + of inserting load and store instructions. In order to map a virtual register + to a physical one, use VirtRegMap::assignVirt2Phys(vreg, preg). In + order to map a certain virtual register to memory, + use VirtRegMap::assignVirt2StackSlot(vreg). This method will return + the stack slot where vreg's value will be located. If it is + necessary to map another virtual register to the same stack slot, + use VirtRegMap::assignVirt2StackSlot(vreg, stack_location). One + important point to consider when using the indirect mapping, is that even if + a virtual register is mapped to memory, it still needs to be mapped to a + physical register. This physical register is the location where the virtual + register is supposed to be found before being stored or after being + reloaded.
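    To make the bookkeeping of the indirect mapping concrete, here is a toy model of such a map. The method names mirror the ones mentioned above, but ToyVirtRegMap itself is a hypothetical sketch and does not reflect the actual implementation of VirtRegMap.

```cpp
#include <cassert>
#include <map>

// Toy model of the indirect mapping: virtual register -> physical
// register, and virtual register -> stack slot.
class ToyVirtRegMap {
  std::map<unsigned, unsigned> Virt2Phys;
  std::map<unsigned, int> Virt2StackSlot;
  int NextSlot;

public:
  ToyVirtRegMap() : NextSlot(0) {}

  void assignVirt2Phys(unsigned VReg, unsigned PReg) { Virt2Phys[VReg] = PReg; }

  // Allocate a fresh stack slot for VReg and return it.
  int assignVirt2StackSlot(unsigned VReg) {
    int Slot = NextSlot++;
    Virt2StackSlot[VReg] = Slot;
    return Slot;
  }

  // Map VReg to an already existing stack slot.
  void assignVirt2StackSlot(unsigned VReg, int Slot) { Virt2StackSlot[VReg] = Slot; }

  bool hasPhys(unsigned VReg) const { return Virt2Phys.count(VReg) != 0; }
  bool hasStackSlot(unsigned VReg) const { return Virt2StackSlot.count(VReg) != 0; }

  // Precondition: hasStackSlot(VReg).
  int getStackSlot(unsigned VReg) const { return Virt2StackSlot.find(VReg)->second; }
};
```

    Note that, as the text says, a spilled virtual register ends up with both kinds of mapping: a stack slot and a physical register used around the store or reload.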

    + +

    If the indirect strategy is used, after all the virtual registers have been mapped to physical registers or stack slots, it is necessary to use a spiller object to place load and store instructions in the code. Every virtual that has been mapped to a stack slot will be stored to memory after being defined and will be loaded before being used. The implementation of the spiller tries to recycle load/store instructions, avoiding unnecessary instructions. For an example of how to invoke the spiller, see RegAllocLinearScan::runOnMachineFunction in lib/CodeGen/RegAllocLinearScan.cpp.

    + +
    + + + + +
    + +

    With very rare exceptions (e.g., function calls), the LLVM machine code instructions are three address instructions. That is, each instruction is expected to define at most one register, and to use at most two registers. However, some architectures use two address instructions. In this case, the defined register is also one of the used registers. For instance, an instruction such as ADD %EAX, %EBX, in X86 is actually equivalent to %EAX = %EAX + %EBX.

    + +

    In order to produce correct code, LLVM must convert three address + instructions that represent two address instructions into true two address + instructions. LLVM provides the pass TwoAddressInstructionPass for + this specific purpose. It must be run before register allocation takes + place. After its execution, the resulting code may no longer be in SSA + form. This happens, for instance, in situations where an instruction such + as %a = ADD %b %c is converted to two instructions such as:

    + +
    +
    +%a = MOVE %b
    +%a = ADD %a %c
    +
    +
    + +

    Notice that, internally, the second instruction is represented as + ADD %a[def/use] %c. I.e., the register operand %a is both + used and defined by the instruction.
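    The rewrite shown above can be sketched on a toy instruction representation. This is not how TwoAddressInstructionPass is implemented; ToyInstr and toTwoAddress are hypothetical names, and the sketch assumes the destination is different from the second source operand.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy three-address instruction: Dst = Op Src1 Src2.
struct ToyInstr {
  std::string Op, Dst, Src1, Src2;
};

// Rewrite "Dst = Op Src1 Src2" so that the first source is tied to Dst.
// Assumption: Dst != Src2 (otherwise the MOVE would clobber Src2).
static std::vector<ToyInstr> toTwoAddress(const ToyInstr &I) {
  std::vector<ToyInstr> Out;
  if (I.Dst != I.Src1) {
    // Copy the first source into the destination register.
    ToyInstr Move = {"MOVE", I.Dst, I.Src1, ""};
    Out.push_back(Move);
  }
  // The destination register is now both used and defined.
  ToyInstr Tied = {I.Op, I.Dst, I.Dst, I.Src2};
  Out.push_back(Tied);
  return Out;
}
```

    Applied to %a = ADD %b %c, the sketch yields the MOVE followed by the ADD from the example, and the second definition of %a is what breaks SSA form.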

    + +
    + + + + +
    + +

    An important transformation that happens during register allocation is called + the SSA Deconstruction Phase. The SSA form simplifies many analyses + that are performed on the control flow graph of programs. However, + traditional instruction sets do not implement PHI instructions. Thus, in + order to generate executable code, compilers must replace PHI instructions + with other instructions that preserve their semantics.

    + +

    There are many ways in which PHI instructions can safely be removed from the + target code. The most traditional PHI deconstruction algorithm replaces PHI + instructions with copy instructions. That is the strategy adopted by + LLVM. The SSA deconstruction algorithm is implemented + in lib/CodeGen/PHIElimination.cpp. In order to invoke this pass, the + identifier PHIEliminationID must be marked as required in the code + of the register allocator.
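    The copy-based strategy can be illustrated with a small sketch. It is a simplification under stated assumptions: blocks are plain lists of instruction strings with no terminators, and ToyPhi, ToyFunction and lowerPhi are hypothetical names unrelated to the code in PHIElimination.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Block name -> list of instructions (as text). Terminators are omitted,
// so appending at the end of a block stands in for "before the branch".
typedef std::map<std::string, std::vector<std::string> > ToyFunction;

// Dst = PHI [Value, PredecessorBlock], ...
struct ToyPhi {
  std::string Dst;
  std::vector<std::pair<std::string, std::string> > Incoming;
};

// Replace a PHI by one copy at the end of each predecessor block.
static void lowerPhi(const ToyPhi &Phi, ToyFunction &F) {
  for (std::size_t i = 0; i < Phi.Incoming.size(); ++i) {
    const std::string &Value = Phi.Incoming[i].first;
    const std::string &Pred = Phi.Incoming[i].second;
    F[Pred].push_back(Phi.Dst + " = COPY " + Value);
  }
}
```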

    + +
    + + + + +
    + +

    Instruction folding is an optimization performed during register + allocation that removes unnecessary copy instructions. For instance, a + sequence of instructions such as:

    + +
    +
    +%EBX = LOAD %mem_address
    +%EAX = COPY %EBX
    +
    +
    + +

    can be safely substituted by the single instruction:

    + +
    +
    +%EAX = LOAD %mem_address
    +
    +
    + +

    Instructions can be folded with + the TargetRegisterInfo::foldMemoryOperand(...) method. Care must be + taken when folding instructions; a folded instruction can be quite different + from the original + instruction. See LiveIntervals::addIntervalsForSpills + in lib/CodeGen/LiveIntervalAnalysis.cpp for an example of its + use.
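    The LOAD/COPY example above can be expressed as a small peephole over a toy instruction list. This is a sketch only: it assumes the loaded register has no other uses, and ToyMemInstr and foldLoadCopy are hypothetical names, not part of the foldMemoryOperand interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: Dst = Op Src.
struct ToyMemInstr {
  std::string Op, Dst, Src;
};

// Fold "X = LOAD M; Y = COPY X" into "Y = LOAD M".
// Assumption: X is not used by any other instruction.
static std::vector<ToyMemInstr> foldLoadCopy(const std::vector<ToyMemInstr> &In) {
  std::vector<ToyMemInstr> Out;
  for (std::size_t i = 0; i < In.size(); ++i) {
    if (i + 1 < In.size() && In[i].Op == "LOAD" && In[i + 1].Op == "COPY" &&
        In[i + 1].Src == In[i].Dst) {
      ToyMemInstr Folded = {"LOAD", In[i + 1].Dst, In[i].Src};
      Out.push_back(Folded);
      ++i; // Skip the COPY that was folded away.
    } else {
      Out.push_back(In[i]);
    }
  }
  return Out;
}
```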

    + +
    + + + + + +
    + +

    The LLVM infrastructure provides the application developer with three + different register allocators:

    + +
      +
    • Linear Scan — The default allocator. This is the well-known linear scan register allocator. Whereas the Simple and Local algorithms use a direct mapping implementation technique, the Linear Scan implementation uses a spiller in order to place loads and stores.
    • + +
    • Fast — This register allocator is the default for debug + builds. It allocates registers on a basic block level, attempting to keep + values in registers and reusing registers as appropriate.
    • + +
    • PBQP — A Partitioned Boolean Quadratic Programming (PBQP) + based register allocator. This allocator works by constructing a PBQP + problem representing the register allocation problem under consideration, + solving this using a PBQP solver, and mapping the solution back to a + register assignment.
    • + +
    + +

    The type of register allocator used in llc can be chosen with the + command line option -regalloc=...:

    + +
    +
    +$ llc -regalloc=linearscan file.bc -o ln.s;
    +$ llc -regalloc=fast file.bc -o fa.s;
    +$ llc -regalloc=pbqp file.bc -o pbqp.s;
    +
    +
    + +
    + + + +

    To Be Written

    + + +

    To Be Written

    + + +

    To Be Written

    + + +

    To Be Written

    + + + +
    +

    For the JIT or .o file writer

    +
    + + + + + + +
    + +

    This section of the document explains features or design decisions that are + specific to the code generator for a particular target.

    + +
    + + + + +
    + +

    Tail call optimization, callee reusing the stack of the caller, is currently + supported on x86/x86-64 and PowerPC. It is performed if:

    + +
      +
    • Caller and callee have the calling convention fastcc or + cc 10 (GHC call convention).
    • + +
    • The call is a tail call - in tail position (ret immediately follows call + and ret uses value of call or is void).
    • + +
    • Option -tailcallopt is enabled.
    • + +
    • Platform specific constraints are met.
    • +
    + +

    x86/x86-64 constraints:

    + +
      +
    • No variable argument lists are used.
    • + +
    • On x86-64 when generating GOT/PIC code only module-local calls (visibility + = hidden or protected) are supported.
    • +
    + +

    PowerPC constraints:

    + +
      +
    • No variable argument lists are used.
    • + +
    • No byval parameters are used.
    • + +
    • On ppc32/64 GOT/PIC only module-local calls (visibility = hidden or protected) are supported.
    • +
    + +

    Example:

    + +

    Call as llc -tailcallopt test.ll.

    + +
    +
    +declare fastcc i32 @tailcallee(i32 inreg %a1, i32 inreg %a2, i32 %a3, i32 %a4)
    +
    +define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
    +  %l1 = add i32 %in1, %in2
    +  %tmp = tail call fastcc i32 @tailcallee(i32 inreg %in1, i32 inreg %in2, i32 %in1, i32 %l1)
    +  ret i32 %tmp
    +}
    +
    +
    + +

    Implications of -tailcallopt:

    + +

    To support tail call optimization in situations where the callee has more + arguments than the caller a 'callee pops arguments' convention is used. This + currently causes each fastcc call that is not tail call optimized + (because one or more of above constraints are not met) to be followed by a + readjustment of the stack. So performance might be worse in such cases.

    + +
    + + + +
    + +

    Sibling call optimization is a restricted form of tail call optimization. Unlike the tail call optimization described in the previous section, it can be performed automatically on any tail call when the -tailcallopt option is not specified.

    + +

    Sibling call optimization is currently performed on x86/x86-64 when the + following constraints are met:

    + +
      +
    • Caller and callee have the same calling convention. It can be either + c or fastcc. + +
    • The call is a tail call - in tail position (ret immediately follows call + and ret uses value of call or is void).
    • + +
    • Caller and callee have matching return type or the callee result is not + used. + +
    • If any of the callee arguments are being passed in stack, they must be + available in caller's own incoming argument stack and the frame offsets + must be the same. +
    + +

    Example:

    +
    +
    +declare i32 @bar(i32, i32)
    +
    +define i32 @foo(i32 %a, i32 %b, i32 %c) {
    +entry:
    +  %0 = tail call i32 @bar(i32 %a, i32 %b)
    +  ret i32 %0
    +}
    +
    +
    + +
    + + + +
    + +

    The X86 code generator lives in the lib/Target/X86 directory. This + code generator is capable of targeting a variety of x86-32 and x86-64 + processors, and includes support for ISA extensions such as MMX and SSE.

    + +
    + + + + +
    + +

    The following are the known target triples that are supported by the X86 + backend. This is not an exhaustive list, and it would be useful to add those + that people test.

    + +
      +
    • i686-pc-linux-gnu — Linux
    • + +
    • i386-unknown-freebsd5.3 — FreeBSD 5.3
    • + +
    • i686-pc-cygwin — Cygwin on Win32
    • + +
    • i686-pc-mingw32 — MingW on Win32
    • + +
    • i386-pc-mingw32msvc — MingW crosscompiler on Linux
    • + +
    • i686-apple-darwin* — Apple Darwin on X86
    • + +
    • x86_64-unknown-linux-gnu — Linux
    • +
    + +
    + + + + + +
    + +

    The following target-specific calling conventions are known to the backend:

    + +
      +
    • x86_StdCall — stdcall calling convention seen on Microsoft + Windows platform (CC ID = 64).
    • + +
    • x86_FastCall — fastcall calling convention seen on Microsoft + Windows platform (CC ID = 65).
    • +
    + +
    + + + + +
    + +

    The x86 has a very flexible way of accessing memory. It is capable of + forming memory addresses of the following expression directly in integer + instructions (which use ModR/M addressing):

    + +
    +
    +SegmentReg: Base + [1,2,4,8] * IndexReg + Disp32
    +
    +
    + +

    In order to represent this, LLVM tracks no fewer than 5 operands for each memory operand of this form. This means that the "load" form of 'mov' has the following MachineOperands in this order:

    + +
    +
    +Index:        0     |    1        2       3           4           5
    +Meaning:   DestReg, | BaseReg,  Scale, IndexReg, Displacement, Segment
    +OperandTy: VirtReg, | VirtReg, UnsImm, VirtReg,   SignExtImm,  PhysReg
    +
    +
    + +

    Stores, and all other instructions, treat the five memory operands in the same way and in the same order. If the segment register is unspecified (regno = 0), then no segment override is generated. "Lea" operations do not have a segment register specified, so they only have 4 operands for their memory reference.
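    The address expression above can be evaluated with a short sketch. It works on register values rather than register numbers, ignores the segment register, and assumes 32-bit wrap-around arithmetic; ToyX86Address and effectiveAddress are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Values of the memory operand components (segment omitted).
struct ToyX86Address {
  uint32_t Base;  // Value held in the base register.
  uint32_t Scale; // One of 1, 2, 4 or 8.
  uint32_t Index; // Value held in the index register.
  int32_t Disp;   // Sign-extended displacement.
};

// Base + Scale * Index + Disp, with 32-bit wrap-around.
static uint32_t effectiveAddress(const ToyX86Address &A) {
  return A.Base + A.Scale * A.Index + static_cast<uint32_t>(A.Disp);
}
```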

    + +
    + + + + +
    + +

    x86 has an experimental feature which provides + the ability to perform loads and stores to different address spaces + via the x86 segment registers. A segment override prefix byte on an + instruction causes the instruction's memory access to go to the specified + segment. LLVM address space 0 is the default address space, which includes + the stack, and any unqualified memory accesses in a program. Address spaces + 1-255 are currently reserved for user-defined code. The GS-segment is + represented by address space 256, while the FS-segment is represented by + address space 257. Other x86 segments have yet to be allocated address space + numbers.
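    The address space numbering described above amounts to a small lookup. The following sketch uses a hypothetical helper, segmentForAddressSpace, which is not an LLVM function; it returns an empty string for address spaces that have no segment override.

```cpp
#include <cassert>
#include <cstring>

// Map an LLVM address space number to the x86 segment it selects.
// Returns "" when no segment override applies.
static const char *segmentForAddressSpace(unsigned AddrSpace) {
  switch (AddrSpace) {
  case 256:
    return "gs";
  case 257:
    return "fs";
  default:
    return ""; // Address space 0 and user-defined spaces 1-255.
  }
}
```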

    + +

    While these address spaces may seem similar to TLS via the + thread_local keyword, and often use the same underlying hardware, + there are some fundamental differences.

    + +

    The thread_local keyword applies to global variables and + specifies that they are to be allocated in thread-local memory. There are + no type qualifiers involved, and these variables can be pointed to with + normal pointers and accessed with normal loads and stores. + The thread_local keyword is target-independent at the LLVM IR + level (though LLVM doesn't yet have implementations of it for some + configurations).

    + +

    Special address spaces, in contrast, apply to static types. Every + load and store has a particular address space in its address operand type, + and this is what determines which address space is accessed. + LLVM ignores these special address space qualifiers on global variables, + and does not provide a way to directly allocate storage in them. + At the LLVM IR level, the behavior of these special address spaces depends + in part on the underlying OS or runtime environment, and they are specific + to x86 (and LLVM doesn't yet handle them correctly in some cases).

    + +

    Some operating systems and runtime environments use (or may in the future + use) the FS/GS-segment registers for various low-level purposes, so care + should be taken when considering them.

    + +
    + + + + +
    + +

    An instruction name consists of the base name, a default operand size, and a character per operand with an optional special size. For example:

    + +
    +
    +ADD8rr      -> add, 8-bit register, 8-bit register
    +IMUL16rmi   -> imul, 16-bit register, 16-bit memory, 16-bit immediate
    +IMUL16rmi8  -> imul, 16-bit register, 16-bit memory, 8-bit immediate
    +MOVSX32rm16 -> movsx, 32-bit register, 16-bit memory
    +
    +
    + +
    + + + + +
    + +

    The PowerPC code generator lives in the lib/Target/PowerPC directory. The code generation is retargetable to several variations or subtargets of the PowerPC ISA, including ppc32, ppc64 and altivec.

    + +
    + + + + +
    + +

    LLVM follows the AIX PowerPC ABI, with two deviations. First, LLVM uses PC-relative (PIC) or static addressing for accessing global values, so no TOC (r2) is used. Second, r31 is used as a frame pointer to allow dynamic growth of a stack frame. LLVM takes advantage of having no TOC to provide space to save the frame pointer in the PowerPC linkage area of the caller frame. Other details of the PowerPC ABI can be found at PowerPC ABI. Note: This link describes the 32 bit ABI. The 64 bit ABI is similar except that space for GPRs is 8 bytes wide (not 4) and r13 is reserved for system use.

    + +
    + + + + +
    + +

    The size of a PowerPC frame is usually fixed for the duration of a function's invocation. Since the frame is fixed size, all references into the frame can be accessed via fixed offsets from the stack pointer. The exception to this is when dynamic alloca or variable sized arrays are present; in that case a base pointer (r31) is used as a proxy for the stack pointer and the stack pointer is free to grow or shrink. A base pointer is also used if llvm-gcc is not passed the -fomit-frame-pointer flag. The stack pointer is always aligned to 16 bytes, so that space allocated for altivec vectors will be properly aligned.

    + +

    An invocation frame is laid out as follows (low memory at top):

    + + + + + + + + + + + + + + + + + + + + + + + +
    Linkage

    Parameter area

    Dynamic area

    Locals area

    Saved registers area


    Previous Frame

    + +

    The linkage area is used by a callee to save special registers prior to allocating its own frame. Only three entries are relevant to LLVM. The first entry is the previous stack pointer (sp), aka link. This allows probing tools like gdb or exception handlers to quickly scan the frames in the stack. A function epilog can also use the link to pop the frame from the stack. The third entry in the linkage area is used to save the return address from the lr register. Finally, as mentioned above, the last entry is used to save the previous frame pointer (r31). The entries in the linkage area are the size of a GPR, thus the linkage area is 24 bytes long in 32 bit mode and 48 bytes in 64 bit mode.

    + +

    32 bit linkage area

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    0    Saved SP (r1)
    4    Saved CR
    8    Saved LR
    12   Reserved
    16   Reserved
    20   Saved FP (r31)
    + +

    64 bit linkage area

    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    0    Saved SP (r1)
    8    Saved CR
    16   Saved LR
    24   Reserved
    32   Reserved
    40   Saved FP (r31)
    + +

    The parameter area is used to store arguments being passed to a callee function. Following the PowerPC ABI, the first few arguments are actually passed in registers, with the space in the parameter area unused. However, if there are not enough registers or the callee is a thunk or vararg function, these register arguments can be spilled into the parameter area. Thus, the parameter area must be large enough to store all the parameters for the largest call sequence made by the caller. The size must also be minimally large enough to spill registers r3-r10. This allows callees blind to the call signature, such as thunks and vararg functions, enough space to cache the argument registers. Therefore, the parameter area is minimally 32 bytes (64 bytes in 64 bit mode). Also note that since the parameter area is a fixed offset from the top of the frame, a callee can access its spilt arguments using fixed offsets from the stack pointer (or base pointer).
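    The sizes quoted for the linkage and parameter areas follow directly from the GPR width. A minimal sketch of that arithmetic, using hypothetical helper names and assuming six linkage entries and eight argument registers (r3-r10) as described above:

```cpp
#include <cassert>

// The linkage area holds six entries, each the size of a GPR.
static unsigned linkageAreaSize(unsigned GPRBytes) { return 6 * GPRBytes; }

// The parameter area must at least hold the eight argument registers
// r3-r10 so that they can be spilled by the callee.
static unsigned minParameterAreaSize(unsigned GPRBytes) { return 8 * GPRBytes; }
```

    With 4-byte GPRs this gives 24 and 32 bytes, and with 8-byte GPRs 48 and 64 bytes, matching the figures in the text.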

    + +

    Combining the information about the linkage area, the parameter area and alignment, a stack frame is minimally 64 bytes in 32 bit mode and 128 bytes in 64 bit mode.

    + +

    The dynamic area starts out as size zero. If a function uses dynamic + alloca then space is added to the stack, the linkage and parameter areas are + shifted to top of stack, and the new space is available immediately below the + linkage and parameter areas. The cost of shifting the linkage and parameter + areas is minor since only the link value needs to be copied. The link value + can be easily fetched by adding the original frame size to the base pointer. + Note that allocations in the dynamic space need to observe 16 byte + alignment.

    + +

    The locals area is where the llvm compiler reserves space for local + variables.

    + +

    The saved registers area is where the llvm compiler spills callee + saved registers on entry to the callee.

    + +
    + + + + +
    + +

    The llvm prolog and epilog are the same as described in the PowerPC ABI, with the following exceptions. Callee saved registers are spilled after the frame is created. This allows the llvm epilog/prolog support to be common with other targets. The base pointer callee saved register r31 is saved in the TOC slot of linkage area. This simplifies allocation of space for the base pointer and makes it convenient to locate programmatically and during debugging.

    + +
    + + + + +
    + +

    TODO - More to come.

    + +
    + + + +
    +
    + Valid CSS + Valid HTML 4.01 + + Chris Lattner
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-08-31 15:01:07 -0700 (Tue, 31 Aug 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/CodingStandards.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CodingStandards.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CodingStandards.html (added) +++ www-releases/trunk/2.8/docs/CodingStandards.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1353 @@ + + + + + LLVM Coding Standards + + + +
    + LLVM Coding Standards +
    + +
      +
    1. Introduction
    2. Mechanical Source Issues
       1. Source Code Formatting
          1. Commenting
          2. Comment Formatting
          3. #include Style
          4. Source Code Width
          5. Use Spaces Instead of Tabs
          6. Indent Code Consistently
       2. Compiler Issues
          1. Treat Compiler Warnings Like Errors
          2. Write Portable Code
          3. Use of class/struct Keywords
    3. Style Issues
       1. The High Level Issues
          1. A Public Header File is a Module
          2. #include as Little as Possible
          3. Keep "internal" Headers Private
          4. Use Early Exits and 'continue' to Simplify Code
          5. Don't use "else" after a return
          6. Turn Predicate Loops into Predicate Functions
       2. The Low Level Issues
          1. Assert Liberally
          2. Do not use 'using namespace std'
          3. Provide a virtual method anchor for classes in headers
          4. Don't evaluate end() every time through a loop
          5. #include <iostream> is forbidden
          6. Avoid std::endl
          7. Use raw_ostream
       3. Microscopic Details
          1. Spaces Before Parentheses
          2. Prefer Preincrement
          3. Namespace Indentation
          4. Anonymous Namespaces
    4. See Also
    + +
    +

    Written by Chris Lattner

    +
    + + + + + + +
    + +

    This document attempts to describe a few coding standards that are being used +in the LLVM source tree. Although no coding standards should be regarded as +absolute requirements to be followed in all instances, coding standards can be +useful.

    + +

    This document intentionally does not prescribe fixed standards for religious +issues such as brace placement and space usage. For issues like this, follow +the golden rule:

    + +
    + +

    If you are adding a significant body of source to a +project, feel free to use whatever style you are most comfortable with. If you +are extending, enhancing, or bug fixing already implemented code, use the style +that is already being used so that the source is uniform and easy to +follow.

    + +
    + +

    The ultimate goal of these guidelines is to increase the readability and +maintainability of our common source base. If you have suggestions for topics to +be included, please mail them to Chris.

    + +
    + + + + + + + + + + + +
    + +

    Comments are one critical part of readability and maintainability. Everyone +knows they should comment, and so should you. When writing comments, write them as +English prose, which means they should use proper capitalization, punctuation, +etc. Although we all should probably +comment our code more than we do, there are a few very critical places where +documentation is very useful:

    + +File Headers + +

    Every source file should have a header on it that describes the basic +purpose of the file. If a file does not have a header, it should not be +checked into Subversion. Most source trees will probably have a standard +file header format. The standard format for the LLVM source tree looks like +this:

    + +
    +
    +//===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===//
    +//
    +//                     The LLVM Compiler Infrastructure
    +//
    +// This file is distributed under the University of Illinois Open Source
    +// License. See LICENSE.TXT for details.
    +//
    +//===----------------------------------------------------------------------===//
    +//
    +// This file contains the declaration of the Instruction class, which is the
    +// base class for all of the VM instructions.
    +//
    +//===----------------------------------------------------------------------===//
    +
    +
    + +

    A few things to note about this particular format: The "-*- C++ +-*-" string on the first line is there to tell Emacs that the source file +is a C++ file, not a C file (Emacs assumes .h files are C files by default). +Note that this tag is not necessary in .cpp files. The name of the file is also +on the first line, along with a very short description of the purpose of the +file. This is important when printing out code and flipping through lots of +pages.

    + +

    The next section in the file is a concise note that defines the license +that the file is released under. This makes it perfectly clear what terms the +source code can be distributed under and should not be modified in any way.

    + +

    The main body of the description does not have to be very long in most cases. +Here it's only two lines. If an algorithm is being implemented or something +tricky is going on, a reference to the paper where it is published should be +included, as well as any notes or "gotchas" in the code to watch out for.

    + +Class overviews + +

    Classes are one fundamental part of a good object oriented design. As such, +a class definition should have a comment block that explains what the class is +used for... if it's not obvious. If it's so completely obvious your grandma +could figure it out, it's probably safe to leave it out. Naming classes +something sane goes a long way towards avoiding writing documentation.

    + + +Method information + +

    Methods defined in a class (as well as any global functions) should also be +documented properly. A quick note about what it does and a description of the +borderline behaviour is all that is necessary here (unless something +particularly tricky or insidious is going on). The hope is that people can +figure out how to use your interfaces without reading the code itself... that is +the goal metric.

    + +

    Good things to talk about here are what happens when something unexpected +happens: does the method return null? Abort? Format your hard disk?
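As a minimal sketch of the kind of method comment described above, the following documents what the function does and how it behaves in the borderline cases. The function findChar is hypothetical and exists only for illustration.

```cpp
/// findChar - Return a pointer to the first occurrence of C in the
/// null-terminated string Str. Returns null if Str is null or if C does not
/// occur in Str. This function never aborts.
const char *findChar(const char *Str, char C) {
  if (!Str)
    return 0;

  for (; *Str; ++Str)
    if (*Str == C)
      return Str;

  return 0;
}
```

A caller can tell from the comment alone that a null result has to be handled, without reading the body.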

    + +
    + + + + +
    + +

    In general, prefer C++ style (//) comments. They take less space, +require less typing, don't have nesting problems, etc. There are a few cases +when it is useful to use C style (/* */) comments however:

    + +
      +
    1. When writing C code: Obviously if you are writing C code, use C style comments.
    2. When writing a header file that may be #included by a C source file.
    3. When writing a source file that is used by a tool that only accepts C style comments.
    + +

    To comment out a large block of code, use #if 0 and #endif. +These nest properly and are better behaved in general than C style comments.
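A small sketch of why #if 0 behaves better than a C style comment for disabling code. The function scaleValue is hypothetical. The disabled region contains both a C style comment and a nested #if 0, either of which would cause trouble inside an enclosing /* */ comment.

```cpp
// Hypothetical helper used only to illustrate disabling a block of code.
int scaleValue(int X) {
  int Result = X * 2;

#if 0
  /* This C style comment would end an enclosing C style comment early,
     but it is harmless inside an #if 0 region. */
  Result += 100;
#if 0
  Result += 1000;  // A nested disabled region is fine too.
#endif
  Result -= 1;
#endif

  return Result;
}
```

Everything between the outer #if 0 and its matching #endif is skipped, so the function simply doubles its argument.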

    + +
    + + + + +
    + +

    Immediately after the header file comment (and +include guards if working on a header file), the minimal list of #includes required by the +file should be listed. We prefer these #includes to be listed in this +order:

    + +
      +
    1. Main Module header
    2. Local/Private Headers
    3. llvm/*
    4. llvm/Analysis/*
    5. llvm/Assembly/*
    6. llvm/Bytecode/*
    7. llvm/CodeGen/*
    8. ...
    9. Support/*
    10. Config/*
    11. System #includes
    + +

    ... and each category should be sorted by name.

    + +

    The "Main Module Header" file applies to .cpp files +which implement an interface defined by a .h file. This #include +should always be included first regardless of where it lives on the file +system. By including a header file first in the .cpp files that implement the +interfaces, we ensure that the header does not have any hidden dependencies +which are not explicitly #included in the header, but should be. It is also a +form of documentation in the .cpp file to indicate where the interfaces it +implements are defined.

    + +
    + + + + +
    + +

    Write your code to fit within 80 columns of text. This helps those of us who +like to print out code and look at your code in an xterm without resizing +it.

    + +

    The longer answer is that there must be some limit to the width of the code +in order to reasonably allow developers to have multiple files side-by-side in +windows on a modest display. If you are going to pick a width limit, it is +somewhat arbitrary but you might as well pick something standard. Going with +90 columns (for example) instead of 80 columns wouldn't add any significant +value and would be detrimental to printing out code. Also many other projects +have standardized on 80 columns, so some people have already configured their +editors for it (vs something else, like 90 columns).

    + +

    This is one of many contentious issues in coding standards, but is not up +for debate.

    + +
    + + + + +
    + +

    In all cases, prefer spaces to tabs in source files. People have different +preferred indentation levels, and different styles of indentation that they +like... this is fine. What isn't is that different editors/viewers expand tabs +out to different tab stops. This can cause your code to look completely +unreadable, and it is not worth dealing with.

    + +

    As always, follow the Golden Rule above: follow the +style of existing code if you are modifying and extending it. If you like four +spaces of indentation, DO NOT do that in the middle of a chunk of code +with two spaces of indentation. Also, do not reindent a whole source file: it +makes for incredible diffs that are absolutely worthless.

    + +
    + + + + +
    + +

    Okay, your first year of programming you were told that indentation is +important. If you didn't believe and internalize this then, now is the time. +Just do it.

    + +
    + + + + + + + + + +
    + +

    If your code has compiler warnings in it, something is wrong: you aren't +casting values correctly, you have "questionable" constructs in your code, or +you are doing something legitimately wrong. Compiler warnings can cover up +legitimate errors in output and make dealing with a translation unit +difficult.

    + +

    It is not possible to prevent all warnings from all compilers, nor is it +desirable. Instead, pick a standard compiler (like gcc) that provides +a good thorough set of warnings, and stick to them. At least in the case of +gcc, it is possible to work around any spurious warnings by changing the +syntax of the code slightly. For example, a warning that annoys me occurs when +I write code like this:

    + +
    +
    +if (V = getValue()) {
    +  ...
    +}
    +
    +
    + +

    gcc will warn me that I probably want to use the == +operator, and that I probably mistyped it. In most cases, I haven't, and I +really don't want the spurious errors. To fix this particular problem, I +rewrite the code like this:

    + +
    +
    +if ((V = getValue())) {
    +  ...
    +}
    +
    +
    + +

    ...which shuts gcc up. Any gcc warning that annoys you can +be fixed by massaging the code appropriately.

    + +

    These are the gcc warnings that I prefer to enable: -Wall +-Winline -W -Wwrite-strings -Wno-unused

    + +
    + + + + +
    + +

    In almost all cases, it is possible and within reason to write completely +portable code. If there are cases where it isn't possible to write portable +code, isolate it behind a well defined (and well documented) interface.

    + +

    In practice, this means that you shouldn't assume much about the host +compiler, including its support for "high tech" features like partial +specialization of templates. If these features are used, they should only be +an implementation detail of a library which has a simple exposed API.

    + +
    + + + +
    + +

    In C++, the class and struct keywords can be used almost +interchangeably. The only difference is when they are used to declare a class: +class makes all members private by default while struct makes +all members public by default.

    + +

    Unfortunately, not all compilers follow the rules and some will generate +different symbols based on whether class or struct was used to +declare the symbol. This can lead to problems at link time.

    + +

    So, the rule for LLVM is to always use the class keyword, unless +all members are public and the type is a C++ "POD" type, in which case +struct is allowed.
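The rule above can be sketched as follows. The types Point and Counter are hypothetical. The advice to repeat the same keyword in forward declarations is an inference from the link-time problem described above, not something this document states.

```cpp
// A plain-old-data aggregate: every member is public and there are no
// constructors or virtual methods, so 'struct' is acceptable.
struct Point {
  int X;
  int Y;
};

// Anything with private state uses 'class'.
class Counter {
  unsigned Count;  // Private by default because this is a 'class'.
public:
  Counter() : Count(0) {}
  void increment() { ++Count; }
  unsigned getCount() const { return Count; }
};

// Forward declarations repeat the keyword used in the definition, so that a
// compiler which encodes the keyword in symbol names sees it consistently.
struct Point;
class Counter;

inline int sumCoordinates(const Point &P) { return P.X + P.Y; }
```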

    + +
    + + + + + + + + + + + + + + +
    + +

    C++ doesn't do too well in the modularity department. There is no real +encapsulation or data hiding (unless you use expensive protocol classes), but it +is what we have to work with. When you write a public header file (in the LLVM +source tree, they live in the top level "include" directory), you are defining a +module of functionality.

    + +

    Ideally, modules should be completely independent of each other, and their +header files should only include the absolute minimum number of headers +possible. A module is not just a class, a function, or a namespace: it's a collection +of these that defines an interface. This interface may be several +functions, classes or data structures, but the important issue is how they work +together.

    + +

    In general, a module should be implemented with one or more .cpp +files. Each of these .cpp files should include the header that defines +their interface first. This ensures that all of the dependencies of the module +header have been properly added to the module header itself, and are not +implicit. System headers should be included after user headers for a +translation unit.

    + +
    + + + + +
    + +

    #include hurts compile time performance. Don't do it unless you +have to, especially in header files.

    + +

    But wait, sometimes you need to have the definition of a class to use it, or +to inherit from it. In these cases go ahead and #include that header +file. Be aware however that there are many cases where you don't need to have +the full definition of a class. If you are using a pointer or reference to a +class, you don't need the header file. If you are simply returning a class +instance from a prototyped function or method, you don't need it. In fact, for +most cases, you simply don't need the definition of a class... and not +#include'ing speeds up compilation.
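A minimal single-file sketch of the forward-declaration technique described above. The class Widget and both functions are hypothetical, and the comments mark which part would normally live in a header and which in a .cpp file.

```cpp
// --- What would live in a header: only a forward declaration is needed. ---
class Widget;  // Hypothetical class; its full definition is not required here.

// Pointers and references to an incomplete type are fine in declarations.
int getWidgetSize(const Widget &W);
Widget *pickLarger(Widget *A, Widget *B);

// --- What would live in the .cpp file: the full definition is visible. ---
class Widget {
  int Size;
public:
  explicit Widget(int S) : Size(S) {}
  int getSize() const { return Size; }
};

int getWidgetSize(const Widget &W) { return W.getSize(); }

Widget *pickLarger(Widget *A, Widget *B) {
  return getWidgetSize(*A) >= getWidgetSize(*B) ? A : B;
}
```

Clients that only pass Widget pointers around never need to see the class definition, so they are not recompiled when it changes.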

    + +

    It is easy to go overboard on this recommendation, however. You +must include all of the header files that you are using -- you can +include them either directly +or indirectly (through another header file). To make sure that you don't +accidentally forget to include a header file in your module header, make sure to +include your module header first in the implementation file (as mentioned +above). This way there won't be any hidden dependencies that you'll find out +about later...

    + +
    + + + + +
    + +

    Many modules have a complex implementation that causes them to use more than +one implementation (.cpp) file. It is often tempting to put the +internal communication interface (helper classes, extra functions, etc) in the +public module header file. Don't do this.

    + +

    If you really need to do something like this, put a private header file in +the same directory as the source files, and include it locally. This ensures +that your private interface remains private and undisturbed by outsiders.

    + +

    Note however, that it's okay to put extra implementation methods in a public +class itself... just make them private (or protected), and all is well.

    + +
    + + + + +
    + +

    When reading code, keep in mind how much state and how many previous +decisions have to be remembered by the reader to understand a block of code. +Aim to reduce indentation where possible when it doesn't make it more difficult +to understand the code. One great way to do this is by making use of early +exits and the 'continue' keyword in long loops. As an example of using an early +exit from a function, consider this "bad" code:

    + +
    +
    +Value *DoSomething(Instruction *I) {
    +  if (!isa<TerminatorInst>(I) &&
    +      I->hasOneUse() && SomeOtherThing(I)) {
    +    ... some long code ....
    +  }
    +  
    +  return 0;
    +}
    +
    +
    + +

    This code has several problems if the body of the 'if' is large. First, when you're +looking at the top of the function, it isn't immediately clear that this +only does interesting things with non-terminator instructions, and only +applies to things with the other predicates. Second, it is relatively difficult +to describe (in comments) why these predicates are important because the if +statement makes it difficult to lay out the comments. Third, when you're deep +within the body of the code, it is indented an extra level. Finally, when +reading the top of the function, it isn't clear what the result is if the +predicate isn't true; you have to read to the end of the function to know that +it returns null.

    + +

    It is much preferred to format the code like this:

    + +
    +
    +Value *DoSomething(Instruction *I) {
    +  // Terminators never need 'something' done to them because, ... 
    +  if (isa<TerminatorInst>(I))
    +    return 0;
    +
    +  // We conservatively avoid transforming instructions with multiple uses
    +  // because goats like cheese.
    +  if (!I->hasOneUse())
    +    return 0;
    +
    +  // This is really just here for example.
    +  if (!SomeOtherThing(I))
    +    return 0;
    +    
    +  ... some long code ....
    +}
    +
    +
    + +

    This fixes these problems. A similar problem frequently happens in for +loops. A silly example is something like this:

    + +
    +
    +  for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {
    +    if (BinaryOperator *BO = dyn_cast<BinaryOperator>(II)) {
    +      Value *LHS = BO->getOperand(0);
    +      Value *RHS = BO->getOperand(1);
    +      if (LHS != RHS) {
    +        ...
    +      }
    +    }
    +  }
    +
    +
    + +

    When you have very, very small loops, this sort of structure is fine, but if +it exceeds 10-15 lines, it becomes difficult for people to read and +understand at a glance. +The problem with this sort of code is that it gets very nested very quickly, +meaning that the reader of the code has to keep a lot of context in their brain +to remember what is going on in the loop at any moment, because they don't know +if/when the if conditions will have elses etc. It is strongly preferred to +structure the loop like this:

    + +
    +
    +  for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {
    +    BinaryOperator *BO = dyn_cast<BinaryOperator>(II);
    +    if (!BO) continue;
    +    
    +    Value *LHS = BO->getOperand(0);
    +    Value *RHS = BO->getOperand(1);
    +    if (LHS == RHS) continue;
    +
    +    ...
    +  }
    +
    +
    + +

    This has all the benefits of using early exits from functions: it reduces +nesting of the loop, it makes it easier to describe why the conditions are true, +and it makes it obvious to the reader that there is no "else" coming up that +they have to push context into their brain for. If a loop is large, this can +be a big understandability win.
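The same loop structure can be sketched in a self-contained form. Here the struct Op is a hypothetical stand-in for a binary operator, and countInterestingOps is an illustrative name rather than anything in LLVM.

```cpp
#include <vector>

// Hypothetical stand-in for an operation with two operand ids.
struct Op {
  bool IsBinary;
  int LHS;
  int RHS;
};

// Count the binary operations whose two operands differ.
unsigned countInterestingOps(const std::vector<Op> &Ops) {
  unsigned NumFound = 0;
  for (std::vector<Op>::const_iterator I = Ops.begin(), E = Ops.end();
       I != E; ++I) {
    // Only binary operations are of interest.
    if (!I->IsBinary) continue;

    // Operations on identical operands are not counted.
    if (I->LHS == I->RHS) continue;

    ++NumFound;
  }
  return NumFound;
}
```

Each condition gets its own comment, and the interesting work stays at a single level of indentation.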

    + +
    + + + + +
    + +

    For reasons similar to those above (reduction of indentation and easier reading), + please do not use "else" or "else if" after something that interrupts + control flow like return, break, continue, goto, etc. For example, this is + "bad":

    + +
    +
    +  case 'J': {
    +    if (Signed) {
    +      Type = Context.getsigjmp_bufType();
    +      if (Type.isNull()) {
    +        Error = ASTContext::GE_Missing_sigjmp_buf;
    +        return QualType();
    +      } else {
    +        break;
    +      }
    +    } else {
    +      Type = Context.getjmp_bufType();
    +      if (Type.isNull()) {
    +        Error = ASTContext::GE_Missing_jmp_buf;
    +        return QualType();
    +      } else {
    +        break;
    +      }
    +    }
    +  }
    +  }
    +
    +
    + +

    It is better to write this something like:

    + +
    +
    +  case 'J':
    +    if (Signed) {
    +      Type = Context.getsigjmp_bufType();
    +      if (Type.isNull()) {
    +        Error = ASTContext::GE_Missing_sigjmp_buf;
    +        return QualType();
    +      }
    +    } else {
    +      Type = Context.getjmp_bufType();
    +      if (Type.isNull()) {
    +        Error = ASTContext::GE_Missing_jmp_buf;
    +        return QualType();
    +      }
    +    }
    +    break;
    +
    +
    + +

    Or better yet (in this case), as:

    + +
    +
    +  case 'J':
    +    if (Signed)
    +      Type = Context.getsigjmp_bufType();
    +    else
    +      Type = Context.getjmp_bufType();
    +    
    +    if (Type.isNull()) {
    +      Error = Signed ? ASTContext::GE_Missing_sigjmp_buf :
    +                       ASTContext::GE_Missing_jmp_buf;
    +      return QualType();
    +    }
    +    break;
    +
    +
    + +

    The idea is to reduce indentation and the amount of code you have to keep + track of when reading the code.
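A compact, self-contained sketch of the same idea, using a hypothetical helper named clampPercent. Each early return ends its branch, so nothing after it needs an "else".

```cpp
// Hypothetical helper: clamp a percentage and report whether the input was
// already within range.
int clampPercent(int Value, bool &WasValid) {
  if (Value < 0) {
    WasValid = false;
    return 0;
  }

  if (Value > 100) {
    WasValid = false;
    return 100;
  }

  WasValid = true;
  return Value;
}
```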

    + +
    + + + + +
    + +

    It is very common to write small loops that just compute a boolean + value. There are a number of ways that people commonly write these, but an + example of this sort of thing is:

    + +
    +
    +  bool FoundFoo = false;
    +  for (unsigned i = 0, e = BarList.size(); i != e; ++i)
    +    if (BarList[i]->isFoo()) {
    +      FoundFoo = true;
    +      break;
    +    }
    +    
    +  if (FoundFoo) {
    +    ...
    +  }
    +
    +
    + +

    This sort of code is awkward to write, and is almost always a bad sign. +Instead of this sort of loop, we strongly prefer to use a predicate function +(which may be static) that uses +early exits to compute the predicate. We prefer +the code to be structured like this: +

    + + +
    +
    +/// ListContainsFoo - Return true if the specified list has an element that is
    +/// a foo.
    +static bool ListContainsFoo(const std::vector<Bar*> &List) {
    +  for (unsigned i = 0, e = List.size(); i != e; ++i)
    +    if (List[i]->isFoo())
    +      return true;
    +  return false;
    +}
    +...
    +
    +  if (ListContainsFoo(BarList)) {
    +    ...
    +  }
    +
    +
    + +

    There are many reasons for doing this: it reduces indentation and factors out +code which can often be shared by other code that checks for the same predicate. +More importantly, it forces you to pick a name for the function, and +forces you to write a comment for it. In this silly example, this doesn't add +much value. However, if the condition is complex, this can make it a lot easier +for the reader to understand the code that queries for this predicate. Instead +of being faced with the in-line details of how we check to see if the BarList +contains a foo, we can trust the function name and continue reading with better +locality.

    + +
    + + + + + + + + + + +
    + +

    Use the "assert" macro to its fullest. Check all of your +preconditions and assumptions; you never know when a bug (not necessarily even +yours) might be caught early by an assertion, which reduces debugging time +dramatically. The "<cassert>" header file is probably already +included by the header files you are using, so it doesn't cost anything to use +it.

    + +

    To further assist with debugging, make sure to put some kind of error message +in the assertion statement (which is printed if the assertion is tripped). This +helps the poor debugger make sense of why an assertion is being made and +enforced, and hopefully what to do about it. Here is one complete example:

    + +
    +
    +inline Value *getOperand(unsigned i) { 
    +  assert(i < Operands.size() && "getOperand() out of range!");
    +  return Operands[i]; 
    +}
    +
    +
    + +

    Here are some examples:

    + +
    +
    +assert(Ty->isPointerType() && "Can't allocate a non pointer type!");
    +
    +assert((Opcode == Shl || Opcode == Shr) && "ShiftInst Opcode invalid!");
    +
    +assert(idx < getNumSuccessors() && "Successor # out of range!");
    +
    +assert(V1.getType() == V2.getType() && "Constant types must be identical!");
    +
    +assert(isa<PHINode>(Succ->front()) && "Only works on PHId BBs!");
    +
    +
    + +

    You get the idea...

    + +

    Please be aware when adding assert statements that not all compilers are aware of +the semantics of the assert. In some places, asserts are used to indicate a piece of +code that should not be reached. These are typically of the form:

    + +
    +
    +assert(0 && "Some helpful error message");
    +
    +
    + +

    When used in a function that returns a value, they should be followed with a return +statement and a comment indicating that this line is never reached. This will prevent +a compiler which is unable to deduce that the assert statement never returns from +generating a warning.

    + +
    +
    +assert(0 && "Some helpful error message");
    +// Not reached
    +return 0;
    +
    +
    + +
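Put together in a complete function, the pattern looks roughly like this. The enum ShapeKind and the function getNumCorners are hypothetical names used only for the sketch.

```cpp
#include <cassert>

enum ShapeKind { Circle, Square, Triangle };  // Hypothetical enum.

// Return the number of corners of a shape.
int getNumCorners(ShapeKind S) {
  switch (S) {
  case Circle:   return 0;
  case Square:   return 4;
  case Triangle: return 3;
  }

  assert(0 && "Unknown shape kind!");
  // Not reached
  return 0;
}
```

The trailing return keeps compilers quiet when they cannot tell that the assert never returns, and it keeps the function well defined in builds where assertions are compiled out.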
    + + + + +
    +

    In LLVM, we prefer to explicitly prefix all identifiers from the standard +namespace with an "std::" prefix, rather than rely on +"using namespace std;".

    + +

    In header files, adding a 'using namespace XXX' directive pollutes +the namespace of any source file that #includes the header. This is +clearly a bad thing.

    + +

    In implementation files (e.g. .cpp files), the rule is more of a stylistic +rule, but is still important. Basically, using explicit namespace prefixes +makes the code clearer, because it is immediately obvious what facilities +are being used and where they are coming from, and more portable, because +namespace clashes cannot occur between LLVM code and other namespaces. The +portability rule is important because different standard library implementations +expose different symbols (potentially ones they shouldn't), and future revisions +to the C++ standard will add more symbols to the std namespace. As +such, we never use 'using namespace std;' in LLVM.

    + +

    The exception to the general rule (i.e. it's not an exception for +the std namespace) is for implementation files. For example, all of +the code in the LLVM project implements code that lives in the 'llvm' namespace. +As such, it is ok, and actually clearer, for the .cpp files to have a 'using +namespace llvm' directive at their top, after the #includes. The +general form of this rule is that any .cpp file that implements code in any +namespace may use that namespace (and its parents'), but should not use any +others.
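A single-file sketch of the rule, assuming a hypothetical namespace mylib standing in for llvm. The implementation part uses its own namespace, while standard library names keep their explicit std:: prefix.

```cpp
#include <string>
#include <vector>

// --- Declarations as they might appear in a header. ---
namespace mylib {  // Hypothetical namespace standing in for 'llvm'.
  std::string joinNames(const std::vector<std::string> &Names);
}

// --- Implementation file: it implements code in 'mylib', so it may use it. ---
using namespace mylib;

// Standard library names keep their explicit std:: prefix.
std::string mylib::joinNames(const std::vector<std::string> &Names) {
  std::string Result;
  for (std::vector<std::string>::const_iterator I = Names.begin(),
       E = Names.end(); I != E; ++I) {
    if (!Result.empty())
      Result += ", ";
    Result += *I;
  }
  return Result;
}
```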

    + +
    + + + + +
    + +

    If a class is defined in a header file and has a v-table (either it has +virtual methods or it derives from classes with virtual methods), it must +always have at least one out-of-line virtual method in the class. Without +this, the compiler will copy the vtable and RTTI into every .o file +that #includes the header, bloating .o file sizes and +increasing link times.
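One common way to satisfy this rule is a small private virtual method that is declared in the header and defined in exactly one .cpp file. The sketch below shows the idea in a single file. The classes Node and LeafNode and the method name anchor are illustrative assumptions, and the exact effect on v-table emission depends on the compiler and its ABI.

```cpp
// --- Header: both classes have a v-table, so each declares one out-of-line
// virtual method. ---
class Node {  // Hypothetical class for the sketch.
public:
  virtual ~Node() {}
  virtual int getNumChildren() const = 0;
private:
  virtual void anchor();  // Defined out of line in one .cpp file.
};

class LeafNode : public Node {
public:
  virtual int getNumChildren() const { return 0; }
private:
  virtual void anchor();  // Defined out of line in one .cpp file.
};

// --- Implementation file: the out-of-line definitions. ---
void Node::anchor() {}
void LeafNode::anchor() {}
```

With the anchors in place, a compiler that ties v-table emission to an out-of-line virtual method has a single object file in which to put the v-table and RTTI for each class.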

    + +
    + + + + +
    + +

    Because C++ doesn't have a standard "foreach" loop (though it can be emulated +with macros and may be coming in C++0x) we end up writing a lot of loops that +manually iterate from begin to end on a variety of containers or through other +data structures. One common mistake is to write a loop in this style:

    + +
    +
    +  BasicBlock *BB = ...
    +  for (BasicBlock::iterator I = BB->begin(); I != BB->end(); ++I)
    +     ... use I ...
    +
    +
    + +

    The problem with this construct is that it evaluates "BB->end()" +every time through the loop. Instead of writing the loop like this, we strongly +prefer loops to be written so that they evaluate it once before the loop starts. +A convenient way to do this is like so:

    + +
    +
    +  BasicBlock *BB = ...
    +  for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I)
    +     ... use I ...
    +
    +
    + +

    The observant may quickly point out that these two loops may have different +semantics: if the container (a basic block in this case) is being mutated, then +"BB->end()" may change its value every time through the loop and the +second loop may not in fact be correct. If you actually do depend on this +behavior, please write the loop in the first form and add a comment indicating +that you did it intentionally.

    + +

    Why do we prefer the second form (when correct)? Writing the loop in the +first form has two problems: First it may be less efficient than evaluating it +at the start of the loop. In this case, the cost is probably minor: a few extra +loads every time through the loop. However, if the base expression is more +complex, then the cost can rise quickly. I've seen loops where the end +expression was actually something like: "SomeMap[x]->end()" and map +lookups really aren't cheap. By writing it in the second form consistently, you +eliminate the issue entirely and don't even have to think about it.

    + +

    The second (even bigger) issue is that writing the loop in the first form +hints to the reader that the loop is mutating the container (a fact that a +comment would handily confirm!). If you write the loop in the second form, it +is immediately obvious without even looking at the body of the loop that the +container isn't being modified, which makes it easier to read the code and +understand what it does.

    + +

    While the second form of the loop is a few extra keystrokes, we do strongly +prefer it.
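Both situations discussed above can be sketched with standard containers. The names BucketMap, sumBucket and processWorklist are hypothetical. The first function evaluates an expensive end expression once. The second intentionally re-evaluates its bound because the loop body grows the container, and says so in a comment.

```cpp
#include <map>
#include <vector>

typedef std::map<int, std::vector<int> > BucketMap;  // Hypothetical container.

// Sum the values in one bucket. The map lookup happens once, before the loop,
// and 'E' signals that the bucket is not modified while iterating.
int sumBucket(BucketMap &Buckets, int Key) {
  std::vector<int> &Bucket = Buckets[Key];
  int Sum = 0;
  for (std::vector<int>::iterator I = Bucket.begin(), E = Bucket.end();
       I != E; ++I)
    Sum += *I;
  return Sum;
}

// Halve each value greater than one and append the half to the worklist.
// Returns the number of items processed.
unsigned processWorklist(std::vector<int> &Worklist) {
  unsigned NumProcessed = 0;
  // Worklist.size() is intentionally re-evaluated every iteration because the
  // loop body appends to the worklist.
  for (unsigned i = 0; i != Worklist.size(); ++i) {
    int V = Worklist[i];  // Copy the value: push_back may reallocate.
    if (V > 1)
      Worklist.push_back(V / 2);
    ++NumProcessed;
  }
  return NumProcessed;
}
```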

    + +
    + + + + +
    + +

    The use of #include <iostream> in library files is +hereby forbidden. The primary reason for doing this is to +support clients using LLVM libraries as part of larger systems. In particular, +we statically link LLVM into some dynamic libraries. Even if LLVM isn't used, +the static c'tors are run whenever an application starts up that uses the dynamic +library. There are two problems with this:

    + +
      +
    1. The time to run the static c'tors impacts startup time of applications, a critical time for GUI apps.
    2. The static c'tors cause the app to pull many extra pages of memory off the disk: both the code for the static c'tors in each .o file and the small amount of data that gets touched. In addition, touched/dirty pages put more pressure on the VM system on low-memory machines.
    + +

    Note that using the other stream headers (<sstream> for +example) is not problematic in this regard (just <iostream>). +However, raw_ostream provides various APIs that are better performing for almost +every use than std::ostream style APIs, so you should just use it for new +code.

    + +

    New code should always +use raw_ostream for writing, or +the llvm::MemoryBuffer API for reading files.

    + +
    + + + + + +
    + +

    The std::endl modifier, when used with iostreams, outputs a newline +to the output stream specified. In addition to doing this, however, it also +flushes the output stream. In other words, these are equivalent:

    + +
    +
    +std::cout << std::endl;
    +std::cout << '\n' << std::flush;
    +
    +
    + +

    Most of the time, you probably have no reason to flush the output stream, so +it's better to use a literal '\n'.
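A small sketch of line-oriented output that uses '\n' rather than std::endl. It uses the standard string stream only so that the example is self-contained. The function names are hypothetical.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Write one name per line. A literal '\n' ends each line without forcing a
// flush of the stream after every element.
void printNames(std::ostream &OS, const std::vector<std::string> &Names) {
  for (std::vector<std::string>::const_iterator I = Names.begin(),
       E = Names.end(); I != E; ++I)
    OS << *I << '\n';
}

// Convenience wrapper that collects the output in a string.
std::string namesToString(const std::vector<std::string> &Names) {
  std::ostringstream OS;
  printNames(OS, Names);
  return OS.str();
}
```

If the output really must be visible at a particular point, a single explicit flush there states that intent more clearly than a flush on every line.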

    + +
    + + + + + +
    + +

    LLVM includes a lightweight, simple, and efficient stream implementation +in llvm/Support/raw_ostream.h which provides all of the common features +of std::ostream. All new code should use raw_ostream instead +of ostream.

    + +

    Unlike std::ostream, raw_ostream is not a template and can +be forward declared as class raw_ostream. Public headers should +generally not include the raw_ostream header, but use forward +declarations and constant references to raw_ostream instances.

    + +
    + + + + + + +

    This section describes preferred low-level formatting guidelines along with +reasoning on why we prefer them.

    + + + + +
    + +

    We prefer to put a space before an open parenthesis only in control flow +statements, but not in normal function call expressions and function-like +macros. For example, this is good:

    + +
    +
    +  if (x) ...
    +  for (i = 0; i != 100; ++i) ...
    +  while (llvm_rocks) ...
    +
    +  somefunc(42);
    +  assert(3 != 4 && "laws of math are failing me");
    +  
    +  a = foo(42, 92) + bar(x);
    +  
    +
    + +

    ... and this is bad:

    + +
    +
    +  if(x) ...
    +  for(i = 0; i != 100; ++i) ...
    +  while(llvm_rocks) ...
    +
    +  somefunc (42);
    +  assert (3 != 4 && "laws of math are failing me");
    +  
    +  a = foo (42, 92) + bar (x);
    +
    +
    + +

    The reason for doing this is not completely arbitrary. This style makes + control flow operators stand out more, and makes expressions flow better. The + function call operator binds very tightly as a postfix operator. Putting + a space after a function name (as in the last example) makes it appear that + the code might bind the arguments of the left-hand-side of a binary operator + with the argument list of a function and the name of the right side. More + specifically, it is easy to misread the "a" example as:

    + +
    +
    +  a = foo ((42, 92) + bar) (x);
    +
    +
    + +

    ... when skimming through the code. By avoiding a space in a function call, we +avoid this misinterpretation.

    + +
    + + + + +
    + +

    Hard and fast rule: Preincrement (++X) is never slower than +postincrement (X++) and could very well be a lot faster. Use +preincrement whenever possible.

    + +

    The semantics of postincrement include making a copy of the value being +incremented, returning it, and then preincrementing the "work value". For +primitive types, this isn't a big deal... but for iterators, it can be a huge +issue (for example, some iterators contain stack and set objects... +copying an iterator could invoke the copy constructors of these as well). In general, +get in the habit of always using preincrement, and you won't have a problem.
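    The cost described above can be made visible with a small sketch. CountingIter below is a hypothetical iterator (not an LLVM class) whose copy constructor bumps a counter, so the two helper functions can report how many copies each loop style makes. The exact count for postincrement depends on whether the compiler elides the copy on return, so only a lower bound is claimed.

```cpp
#include <cassert>
#include <vector>

/// CountingIter - Hypothetical iterator over ints that counts its own copies.
class CountingIter {
  const int *Ptr;
public:
  static int Copies;

  explicit CountingIter(const int *P) : Ptr(P) {}
  CountingIter(const CountingIter &RHS) : Ptr(RHS.Ptr) { ++Copies; }

  /// Preincrement: advance in place and return a reference; no copy is made.
  CountingIter &operator++() { ++Ptr; return *this; }

  /// Postincrement: the old value must be copied before advancing.
  CountingIter operator++(int) {
    CountingIter Old(*this);
    ++*this;
    return Old;
  }

  bool operator!=(const CountingIter &RHS) const { return Ptr != RHS.Ptr; }

  int operator*() const {
    assert(Ptr && "dereferencing a null iterator");
    return *Ptr;
  }
};

int CountingIter::Copies = 0;

/// copiesWithPreincrement - Walk V using ++I and report the copies made.
inline int copiesWithPreincrement(const std::vector<int> &V) {
  CountingIter::Copies = 0;
  CountingIter I(V.data()), E(V.data() + V.size());
  for (; I != E; ++I)
    (void)*I;
  return CountingIter::Copies;
}

/// copiesWithPostincrement - Walk V using I++ and report the copies made.
inline int copiesWithPostincrement(const std::vector<int> &V) {
  CountingIter::Copies = 0;
  CountingIter I(V.data()), E(V.data() + V.size());
  for (; I != E; I++)
    (void)*I;
  return CountingIter::Copies;
}
```

    With preincrement the loop makes no copies at all, while postincrement makes at least one copy per iteration. For an iterator that owns containers, each of those copies would duplicate the containers too.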

    + +
    + + + + +
    + +

    +In general, we strive to reduce indentation wherever possible. This is useful +because we want code to fit into 80 columns without +wrapping horribly, but also because it makes it easier to understand the code. +Namespaces are a funny thing: they are often large, and we often desire to put +lots of stuff into them (so they can be large). Other times they are tiny, +because they just hold an enum or something similar. In order to balance this, +we use different approaches for small versus large namespaces. +

    + +

    +If a namespace definition is small and easily fits on a screen (say, +less than 35 lines of code), then you should indent its body. Here's an +example: +

    + +
    +
    +namespace llvm {
    +  namespace X86 {
    +    /// RelocationType - An enum for the x86 relocation codes. Note that
    +    /// the terminology here doesn't follow x86 convention - word means
    +    /// 32-bit and dword means 64-bit.
    +    enum RelocationType {
    +      /// reloc_pcrel_word - PC relative relocation, add the relocated value to
    +      /// the value already in memory, after we adjust it for where the PC is.
    +      reloc_pcrel_word = 0,
    +
    +      /// reloc_picrel_word - PIC base relative relocation, add the relocated
    +      /// value to the value already in memory, after we adjust it for where the
    +      /// PIC base is.
    +      reloc_picrel_word = 1,
    +      
    +      /// reloc_absolute_word, reloc_absolute_dword - Absolute relocation, just
    +      /// add the relocated value to the value already in memory.
    +      reloc_absolute_word = 2,
    +      reloc_absolute_dword = 3
    +    };
    +  }
    +}
    +
    +
    + +

    Since the body is small, indenting adds value because it makes it very clear +where the namespace starts and ends, and it is easy to take the whole thing in +in one "gulp" when reading the code. If the blob of code in the namespace is +larger (as it typically is in a header in the llvm or clang namespaces), do not +indent the code, and add a comment indicating what namespace is being closed. +For example:

    + +
    +
    +namespace llvm {
    +namespace knowledge {
    +
    +/// Grokable - This class represents things that Smith can have an intimate
    +/// understanding of and contains the data associated with it.
    +class Grokable {
    +...
    +public:
    +  explicit Grokable() { ... }
    +  virtual ~Grokable() = 0;
    +  
    +  ...
    +
    +};
    +
    +} // end namespace knowledge
    +} // end namespace llvm
    +
    +
    + +

    Because the class is large, we don't expect that the reader can easily +understand the entire concept in a glance, and the end of the file (where the +namespaces end) may be a long ways away from the place they open. As such, +indenting the contents of the namespace doesn't add any value, and detracts from +the readability of the class. In these cases it is best to not indent +the contents of the namespace.

    + +
    + + + + +
    + +

    After talking about namespaces in general, you may be wondering about +anonymous namespaces in particular. +Anonymous namespaces are a great language feature that tells the C++ compiler +that the contents of the namespace are only visible within the current +translation unit, allowing more aggressive optimization and eliminating the +possibility of symbol name collisions. Anonymous namespaces are to C++ as +"static" is to C functions and global variables. While "static" is available +in C++, anonymous namespaces are more general: they can make entire classes +private to a file.

    + +

    The problem with anonymous namespaces is that they naturally want to +encourage indentation of their body, and they reduce locality of reference: if +you see a random function definition in a C++ file, it is easy to see if it is +marked static, but seeing if it is in an anonymous namespace requires scanning +a big chunk of the file.

    + +

    Because of this, we have a simple guideline: make anonymous namespaces as +small as possible, and only use them for class declarations. For example, this +is good:

    + +
    +
    +namespace {
    +  class StringSort {
    +  ...
    +  public:
    +    StringSort(...);
    +    bool operator<(const char *RHS) const;
    +  };
    +} // end anonymous namespace
    +
    +static void Helper() { 
    +  ... 
    +}
    +
    +bool StringSort::operator<(const char *RHS) const {
    +  ...
    +}
    +
    +
    +
    + +

    This is bad:

    + + +
    +
    +namespace {
    +class StringSort {
    +...
    +public:
    +  StringSort(...);
    +  bool operator<(const char *RHS) const;
    +};
    +
    +void Helper() { 
    +  ... 
    +}
    +
    +bool StringSort::operator<(const char *RHS) const {
    +  ...
    +}
    +
    +} // end anonymous namespace
    +
    +
    +
    + + +

    This is bad specifically because if you're looking at "Helper" in the middle +of a large C++ file, you have no immediate way to tell if it is local to +the file. When it is marked static explicitly, this is immediately obvious. +Also, there is no reason to enclose the definition of "operator<" in the +namespace just because it was declared there. +
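    For readers who want something they can compile, here is a filled-in version of the "good" layout. The bodies are invented for illustration: the LHS member, the isEmpty helper and the sortsBefore wrapper are assumptions, not code from the LLVM tree. Only the class declaration sits in the anonymous namespace; the helper is marked static at its definition, and the out-of-line operator is defined outside the namespace.

```cpp
#include <cassert>
#include <cstring>

namespace {
  /// StringSort - Compares a stored C string against others (illustrative).
  class StringSort {
    const char *LHS;
  public:
    explicit StringSort(const char *S) : LHS(S) {}
    bool operator<(const char *RHS) const;
  };
} // end anonymous namespace

/// isEmpty - File-local helper, visibly static right at its definition.
static bool isEmpty(const char *S) {
  return S[0] == '\0';
}

/// operator< - Defined outside the namespace it was declared in.
bool StringSort::operator<(const char *RHS) const {
  // The empty string sorts before every non-empty string.
  if (isEmpty(LHS))
    return !isEmpty(RHS);
  return std::strcmp(LHS, RHS) < 0;
}

/// sortsBefore - Hypothetical entry point that exercises the class.
bool sortsBefore(const char *A, const char *B) {
  assert(A && B && "null strings are not supported");
  return StringSort(A) < B;
}
```

    A reader who lands on isEmpty in the middle of the file can see from the static keyword alone that it is local to this translation unit.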

    + +
    + + + + + + + +
    + +

    A lot of these comments and recommendations have been culled from other +sources. Two particularly important books for our work are:

    + +
      + +
    1. Effective +C++ by Scott Meyers. Also +interesting and useful are "More Effective C++" and "Effective STL" by the same +author.
    2. + +
    3. Large-Scale C++ Software Design by John Lakos
    4. + +
    + +

    If you get some free time, and you haven't read them: do so, you might learn +something.

    + +
    + + + +
    +
    + Valid CSS + Valid HTML 4.01 + + Chris Lattner
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/CommandGuide/FileCheck.pod URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/FileCheck.pod?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/FileCheck.pod (added) +++ www-releases/trunk/2.8/docs/CommandGuide/FileCheck.pod Mon Oct 4 15:49:23 2010 @@ -0,0 +1,245 @@ + +=pod + +=head1 NAME + +FileCheck - Flexible pattern matching file verifier + +=head1 SYNOPSIS + +B I [I<--check-prefix=XXX>] [I<--strict-whitespace>] + +=head1 DESCRIPTION + +B reads two files (one from standard input, and one specified on the +command line) and uses one to verify the other. This behavior is particularly +useful for the testsuite, which wants to verify that the output of some tool +(e.g. llc) contains the expected information (for example, a movsd from esp or +whatever is interesting). This is similar to using grep, but it is optimized +for matching multiple different inputs in one file in a specific order. + +The I file specifies the file that contains the patterns to +match. The file to verify is always read from standard input. + +=head1 OPTIONS + +=over + +=item B<-help> + +Print a summary of command line options. + +=item B<--check-prefix> I + +FileCheck searches the contents of I for patterns to match. By +default, these patterns are prefixed with "CHECK:". If you'd like to use a +different prefix (e.g. because the same input file is checking multiple +different tool or options), the B<--check-prefix> argument allows you to specify +a specific prefix to match. + +=item B<--strict-whitespace> + +By default, FileCheck canonicalizes input horizontal whitespace (spaces and +tabs) which causes it to ignore these differences (a space will match a tab). +The --strict-whitespace argument disables this behavior. + +=item B<-version> + +Show the version number of this program. 
+ +=back + +=head1 EXIT STATUS + +If B verifies that the file matches the expected contents, it exits +with 0. Otherwise, if not, or if an error occurs, it will exit with a non-zero +value. + +=head1 TUTORIAL + +FileCheck is typically used from LLVM regression tests, being invoked on the RUN +line of the test. A simple example of using FileCheck from a RUN line looks +like this: + + ; RUN: llvm-as < %s | llc -march=x86-64 | FileCheck %s + +This syntax says to pipe the current file ("%s") into llvm-as, pipe that into +llc, then pipe the output of llc into FileCheck. This means that FileCheck will +be verifying its standard input (the llc output) against the filename argument +specified (the original .ll file specified by "%s"). To see how this works, +let's look at the rest of the .ll file (after the RUN line): + + define void @sub1(i32* %p, i32 %v) { + entry: + ; CHECK: sub1: + ; CHECK: subl + %0 = tail call i32 @llvm.atomic.load.sub.i32.p0i32(i32* %p, i32 %v) + ret void + } + + define void @inc4(i64* %p) { + entry: + ; CHECK: inc4: + ; CHECK: incq + %0 = tail call i64 @llvm.atomic.load.add.i64.p0i64(i64* %p, i64 1) + ret void + } + +Here you can see some "CHECK:" lines specified in comments. Now you can see +how the file is piped into llvm-as, then llc, and the machine code output is +what we are verifying. FileCheck checks the machine code output to verify that +it matches what the "CHECK:" lines specify. + +The syntax of the CHECK: lines is very simple: they are fixed strings that +must occur in order. FileCheck defaults to ignoring horizontal whitespace +differences (e.g. a space is allowed to match a tab) but otherwise, the contents +of the CHECK: line are required to match something in the test file exactly. + +One nice thing about FileCheck (compared to grep) is that it allows merging +test cases together into logical groups.
For example, because the test above +is checking for the "sub1:" and "inc4:" labels, it will not match unless there +is a "subl" in between those labels. If it existed somewhere else in the file, +that would not count: "grep subl" matches if subl exists anywhere in the +file. + + + +=head2 The FileCheck -check-prefix option + +The FileCheck -check-prefix option allows multiple test configurations to be +driven from one .ll file. This is useful in many circumstances, for example, +testing different architectural variants with llc. Here's a simple example: + + ; RUN: llvm-as < %s | llc -mtriple=i686-apple-darwin9 -mattr=sse41 \ + ; RUN: | FileCheck %s -check-prefix=X32 + ; RUN: llvm-as < %s | llc -mtriple=x86_64-apple-darwin9 -mattr=sse41 \ + ; RUN: | FileCheck %s -check-prefix=X64 + + define <4 x i32> @pinsrd_1(i32 %s, <4 x i32> %tmp) nounwind { + %tmp1 = insertelement <4 x i32> %tmp, i32 %s, i32 1 + ret <4 x i32> %tmp1 + ; X32: pinsrd_1: + ; X32: pinsrd $1, 4(%esp), %xmm0 + + ; X64: pinsrd_1: + ; X64: pinsrd $1, %edi, %xmm0 + } + +In this case, we're testing that we get the expected code for +both 32-bit and 64-bit code generation. + + + +=head2 The "CHECK-NEXT:" directive + +Sometimes you want to match lines and would like to verify that matches +happen on exactly consecutive lines with no other lines in between them. In +this case, you can use CHECK: and CHECK-NEXT: directives to specify this. If +you specified a custom check prefix, just use "-NEXT:".
For +example, something like this works as you'd expect: + + define void @t2(<2 x double>* %r, <2 x double>* %A, double %B) { + %tmp3 = load <2 x double>* %A, align 16 + %tmp7 = insertelement <2 x double> undef, double %B, i32 0 + %tmp9 = shufflevector <2 x double> %tmp3, + <2 x double> %tmp7, + <2 x i32> < i32 0, i32 2 > + store <2 x double> %tmp9, <2 x double>* %r, align 16 + ret void + + ; CHECK: t2: + ; CHECK: movl 8(%esp), %eax + ; CHECK-NEXT: movapd (%eax), %xmm0 + ; CHECK-NEXT: movhpd 12(%esp), %xmm0 + ; CHECK-NEXT: movl 4(%esp), %eax + ; CHECK-NEXT: movapd %xmm0, (%eax) + ; CHECK-NEXT: ret + } + +A CHECK-NEXT: directive rejects the input unless there is exactly one newline +between it and the previous directive. A CHECK-NEXT cannot be the first +directive in a file. + + + +=head2 The "CHECK-NOT:" directive + +The CHECK-NOT: directive is used to verify that a string doesn't occur +between two matches (or the first match and the beginning of the file). For +example, to verify that a load is removed by a transformation, a test like this +can be used: + + define i8 @coerce_offset0(i32 %V, i32* %P) { + store i32 %V, i32* %P + + %P2 = bitcast i32* %P to i8* + %P3 = getelementptr i8* %P2, i32 2 + + %A = load i8* %P3 + ret i8 %A + ; CHECK: @coerce_offset0 + ; CHECK-NOT: load + ; CHECK: ret i8 + } + + + +=head2 FileCheck Pattern Matching Syntax + +The CHECK: and CHECK-NOT: directives both take a pattern to match. For most +uses of FileCheck, fixed string matching is perfectly sufficient. For some +things, a more flexible form of matching is desired. To support this, FileCheck +allows you to specify regular expressions in matching strings, surrounded by +double braces: B<{{yourregex}}>. Because we want to use fixed string +matching for a majority of what we do, FileCheck has been designed to support +mixing and matching fixed string matching with regular expressions.
This allows +you to write things like this: + + ; CHECK: movhpd {{[0-9]+}}(%esp), {{%xmm[0-7]}} + +In this case, any offset from the ESP register will be allowed, and any xmm +register will be allowed. + +Because regular expressions are enclosed with double braces, they are +visually distinct, and you don't need to use escape characters within the double +braces like you would in C. In the rare case that you want to match double +braces explicitly from the input, you can use something ugly like +B<{{[{][{]}}> as your pattern. + + + +=head2 FileCheck Variables + +It is often useful to match a pattern and then verify that it occurs again +later in the file. For codegen tests, this can be useful to allow any register, +but verify that that register is used consistently later. To do this, FileCheck +allows named variables to be defined and substituted into patterns. Here is a +simple example: + + ; CHECK: test5: + ; CHECK: notw [[REGISTER:%[a-z]+]] + ; CHECK: andw {{.*}}[[REGISTER]] + +The first check line matches a regex (%[a-z]+) and captures it into +the variable "REGISTER". The second line verifies that whatever is in REGISTER +occurs later in the file after an "andw". FileCheck variable references are +always contained in [[ ]] pairs, are named, and their names can be +formed with the regex "[a-zA-Z_][a-zA-Z0-9_]*". If a colon follows the +name, then it is a definition of the variable; if not, it is a use. + +FileCheck variables can be defined multiple times, and uses always get the +latest value. Note that variables are all read at the start of a "CHECK" line +and are all defined at the end. This means that if you have something like +"CHECK: [[XYZ:.*]]x[[XYZ]]", the check line will read the previous +value of the XYZ variable and define a new one after the match is performed.
If +you need to do something like this you can probably take advantage of the fact +that FileCheck is not actually line-oriented when it matches, this allows you to +define two separate CHECK lines that match on the same line. + + + +=head1 AUTHORS + +Maintained by The LLVM Team (L). + +=cut Added: www-releases/trunk/2.8/docs/CommandGuide/Makefile URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/Makefile?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/Makefile (added) +++ www-releases/trunk/2.8/docs/CommandGuide/Makefile Mon Oct 4 15:49:23 2010 @@ -0,0 +1,103 @@ +##===- docs/CommandGuide/Makefile --------------------------*- Makefile -*-===## +# +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. +# +##===----------------------------------------------------------------------===## + +ifdef BUILD_FOR_WEBSITE +# This special case is for keeping the CommandGuide on the LLVM web site +# up to date automatically as the documents are checked in. It must build +# the POD files to HTML only and keep them in the src directories. It must also +# build in an unconfigured tree, hence the ifdef. To use this, run +# make -s BUILD_FOR_WEBSITE=1 inside the cvs commit script. +SRC_DOC_DIR= +DST_HTML_DIR=html/ +DST_MAN_DIR=man/man1/ +DST_PS_DIR=ps/ + +# If we are in BUILD_FOR_WEBSITE mode, default to the all target. +all:: html man ps + +clean: + rm -f pod2htm*.*~~ $(HTML) $(MAN) $(PS) + +# To create other directories, as needed, and timestamp their creation +%/.dir: + -mkdir $* > /dev/null + date > $@ + +else + +# Otherwise, if not in BUILD_FOR_WEBSITE mode, use the project info. +LEVEL := ../.. 
+include $(LEVEL)/Makefile.common + +SRC_DOC_DIR=$(PROJ_SRC_DIR)/ +DST_HTML_DIR=$(PROJ_OBJ_DIR)/ +DST_MAN_DIR=$(PROJ_OBJ_DIR)/ +DST_PS_DIR=$(PROJ_OBJ_DIR)/ + +endif + + +POD := $(wildcard $(SRC_DOC_DIR)*.pod) +HTML := $(patsubst $(SRC_DOC_DIR)%.pod, $(DST_HTML_DIR)%.html, $(POD)) +MAN := $(patsubst $(SRC_DOC_DIR)%.pod, $(DST_MAN_DIR)%.1, $(POD)) +PS := $(patsubst $(SRC_DOC_DIR)%.pod, $(DST_PS_DIR)%.ps, $(POD)) + +# The set of man pages we will not install +NO_INSTALL_MANS = $(DST_MAN_DIR)FileCheck.1 + +# The set of man pages that we will install +INSTALL_MANS = $(filter-out $(NO_INSTALL_MANS), $(MAN)) + +.SUFFIXES: +.SUFFIXES: .html .pod .1 .ps + +$(DST_HTML_DIR)%.html: %.pod $(DST_HTML_DIR)/.dir + pod2html --css=manpage.css --htmlroot=. \ + --podpath=. --noindex --infile=$< --outfile=$@ --title=$* + +$(DST_MAN_DIR)%.1: %.pod $(DST_MAN_DIR)/.dir + pod2man --release=CVS --center="LLVM Command Guide" $< $@ + +$(DST_PS_DIR)%.ps: $(DST_MAN_DIR)%.1 $(DST_PS_DIR)/.dir + groff -Tps -man $< > $@ + + +html: $(HTML) +man: $(MAN) +ps: $(PS) + +EXTRA_DIST := $(POD) index.html + +clean-local:: + $(Verb) $(RM) -f pod2htm*.*~~ $(HTML) $(MAN) $(PS) + +HTML_DIR := $(DESTDIR)$(PROJ_docsdir)/html/CommandGuide +MAN_DIR := $(DESTDIR)$(PROJ_mandir)/man1 +PS_DIR := $(DESTDIR)$(PROJ_docsdir)/ps + +install-local:: $(HTML) $(INSTALL_MANS) $(PS) + $(Echo) Installing HTML CommandGuide Documentation + $(Verb) $(MKDIR) $(HTML_DIR) + $(Verb) $(DataInstall) $(HTML) $(HTML_DIR) + $(Verb) $(DataInstall) $(PROJ_SRC_DIR)/index.html $(HTML_DIR) + $(Verb) $(DataInstall) $(PROJ_SRC_DIR)/manpage.css $(HTML_DIR) + $(Echo) Installing MAN CommandGuide Documentation + $(Verb) $(MKDIR) $(MAN_DIR) + $(Verb) $(DataInstall) $(INSTALL_MANS) $(MAN_DIR) + $(Echo) Installing PS CommandGuide Documentation + $(Verb) $(MKDIR) $(PS_DIR) + $(Verb) $(DataInstall) $(PS) $(PS_DIR) + +uninstall-local:: + $(Echo) Uninstalling CommandGuide Documentation + $(Verb) $(RM) -rf $(HTML_DIR) $(MAN_DIR) $(PS_DIR) + +printvars:: + 
$(Echo) "POD : " '$(POD)' + $(Echo) "HTML : " '$(HTML)' Added: www-releases/trunk/2.8/docs/CommandGuide/bugpoint.pod URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/bugpoint.pod?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/bugpoint.pod (added) +++ www-releases/trunk/2.8/docs/CommandGuide/bugpoint.pod Mon Oct 4 15:49:23 2010 @@ -0,0 +1,171 @@ +=pod + +=head1 NAME + +bugpoint - automatic test case reduction tool + +=head1 SYNOPSIS + +B [I] [I] [I] B<--args> +I + +=head1 DESCRIPTION + +B narrows down the source of problems in LLVM tools and passes. It +can be used to debug three types of failures: optimizer crashes, miscompilations +by optimizers, or bad native code generation (including problems in the static +and JIT compilers). It aims to reduce large test cases to small, useful ones. +For more information on the design and inner workings of B, as well as +advice for using bugpoint, see F in the LLVM +distribution. + +=head1 OPTIONS + +=over + +=item B<--additional-so> F + +Load the dynamic shared object F into the test program whenever it is +run. This is useful if you are debugging programs which depend on non-LLVM +libraries (such as the X or curses libraries) to run. + +=item B<--append-exit-code>=I<{true,false}> + +Append the test programs exit code to the output file so that a change in exit +code is considered a test failure. Defaults to false. + +=item B<--args> I + +Pass all arguments specified after -args to the test program whenever it runs. +Note that if any of the I start with a '-', you should use: + + bugpoint [bugpoint args] --args -- [program args] + +The "--" right after the B<--args> option tells B to consider any +options starting with C<-> to be part of the B<--args> option, not as options to +B itself. 
+ +=item B<--tool-args> I + +Pass all arguments specified after --tool-args to the LLVM tool under test +(B, B, etc.) whenever it runs. You should use this option in the +following way: + + bugpoint [bugpoint args] --tool-args -- [tool args] + +The "--" right after the B<--tool-args> option tells B to consider any +options starting with C<-> to be part of the B<--tool-args> option, not as +options to B itself. (See B<--args>, above.) + +=item B<--safe-tool-args> I + +Pass all arguments specified after B<--safe-tool-args> to the "safe" execution +tool. + +=item B<--gcc-tool-args> I + +Pass all arguments specified after B<--gcc-tool-args> to the invocation of +B. + +=item B<--opt-args> I + +Pass all arguments specified after B<--opt-args> to the invocation of B. + +=item B<--disable-{dce,simplifycfg}> + +Do not run the specified passes to clean up and reduce the size of the test +program. By default, B uses these passes internally when attempting to +reduce test programs. If you're trying to find a bug in one of these passes, +B may crash. + +=item B<--enable-valgrind> + +Use valgrind to find faults in the optimization phase. This will allow +bugpoint to find otherwise asymptomatic problems caused by memory +mis-management. + +=item B<-find-bugs> + +Continually randomize the specified passes and run them on the test program +until a bug is found or the user kills B. + +=item B<-help> + +Print a summary of command line options. + +=item B<--input> F + +Open F and redirect the standard input of the test program, whenever +it runs, to come from that file. + +=item B<--load> F + +Load the dynamic object F into B itself. This object should +register new optimization passes. Once loaded, the object will add new command +line options to enable various optimizations. 
To see the new complete list of +optimizations, use the B<-help> and B<--load> options together; for example: + + bugpoint --load myNewPass.so -help + +=item B<--mlimit> F + +Specifies an upper limit on memory usage of the optimization and codegen. Set +to zero to disable the limit. + +=item B<--output> F + +Whenever the test program produces output on its standard output stream, it +should match the contents of F (the "reference output"). If you +do not use this option, B will attempt to generate a reference output +by compiling the program with the "safe" backend and running it. + +=item B<--profile-info-file> F + +Profile file loaded by B<--profile-loader>. + +=item B<--run-{int,jit,llc,cbe,custom}> + +Whenever the test program is compiled, B should generate code for it +using the specified code generator. These options allow you to choose the +interpreter, the JIT compiler, the static native code compiler, the C +backend, or a custom command (see B<--exec-command>) respectively. + +=item B<--safe-{llc,cbe,custom}> + +When debugging a code generator, B should use the specified code +generator as the "safe" code generator. This is a known-good code generator +used to generate the "reference output" if it has not been provided, and to +compile portions of the program that are excluded from the testcase. +These options allow you to choose the +static native code compiler, the C backend, or a custom command +(see B<--exec-command>) respectively. The interpreter and the JIT backends +cannot currently be used as the "safe" backends. + +=item B<--exec-command> I + +This option defines the command to use with the B<--run-custom> and +B<--safe-custom> options to execute the bitcode testcase. This can +be useful for cross-compilation. + +=item B<--safe-path> I + +This option defines the path to the command to execute with the +B<--safe-{int,jit,llc,cbe,custom}> +option. + +=back + +=head1 EXIT STATUS + +If B succeeds in finding a problem, it will exit with 0.
Otherwise, +if an error occurs, it will exit with a non-zero value. + +=head1 SEE ALSO + +L + +=head1 AUTHOR + +Maintained by the LLVM Team (L). + +=cut Added: www-releases/trunk/2.8/docs/CommandGuide/html/manpage.css URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/html/manpage.css?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/html/manpage.css (added) +++ www-releases/trunk/2.8/docs/CommandGuide/html/manpage.css Mon Oct 4 15:49:23 2010 @@ -0,0 +1,256 @@ +/* Based on http://www.perldoc.com/css/perldoc.css */ + + at import url("../llvm.css"); + +body { font-family: Arial,Helvetica; } + +blockquote { margin: 10pt; } + +h1, a { color: #336699; } + + +/*** Top menu style ****/ +.mmenuon { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #ff6600; font-size: 10pt; + } +.mmenuoff { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #ffffff; font-size: 10pt; +} +.cpyright { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #ffffff; font-size: xx-small; +} +.cpyrightText { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #ffffff; font-size: xx-small; +} +.sections { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #336699; font-size: 11pt; +} +.dsections { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #336699; font-size: 12pt; +} +.slink { + font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; + color: #000000; font-size: 9pt; +} + +.slink2 { font-family: Arial,Helvetica; text-decoration: none; color: #336699; } + +.maintitle { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #336699; font-size: 18pt; +} +.dblArrow { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; 
+ color: #336699; font-size: small; +} +.menuSec { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #336699; font-size: small; +} + +.newstext { + font-family: Arial,Helvetica; font-size: small; +} + +.linkmenu { + font-family: Arial,Helvetica; color: #000000; font-weight: bold; + text-decoration: none; +} + +P { + font-family: Arial,Helvetica; +} + +PRE { + font-size: 10pt; +} +.quote { + font-family: Times; text-decoration: none; + color: #000000; font-size: 9pt; font-style: italic; +} +.smstd { font-family: Arial,Helvetica; color: #000000; font-size: x-small; } +.std { font-family: Arial,Helvetica; color: #000000; } +.meerkatTitle { + font-family: sans-serif; font-size: x-small; color: black; } + +.meerkatDescription { font-family: sans-serif; font-size: 10pt; color: black } +.meerkatCategory { + font-family: sans-serif; font-size: 9pt; font-weight: bold; font-style: italic; + color: brown; } +.meerkatChannel { + font-family: sans-serif; font-size: 9pt; font-style: italic; color: brown; } +.meerkatDate { font-family: sans-serif; font-size: xx-small; color: #336699; } + +.tocTitle { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #333333; font-size: 10pt; +} + +.toc-item { + font-family: Arial,Helvetica; font-weight: bold; + color: #336699; font-size: 10pt; text-decoration: underline; +} + +.perlVersion { + font-family: Arial,Helvetica; font-weight: bold; + color: #336699; font-size: 10pt; text-decoration: none; +} + +.podTitle { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #000000; +} + +.docTitle { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #000000; font-size: 10pt; +} +.dotDot { + font-family: Arial,Helvetica; font-weight: bold; + color: #000000; font-size: 9pt; +} + +.docSec { + font-family: Arial,Helvetica; font-weight: normal; + color: #333333; font-size: 9pt; +} +.docVersion { + font-family: Arial,Helvetica; 
font-weight: bold; text-decoration: none; + color: #336699; font-size: 10pt; +} + +.docSecs-on { + font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; + color: #ff0000; font-size: 10pt; +} +.docSecs-off { + font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; + color: #333333; font-size: 10pt; +} + +h2 { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #336699; font-size: medium; +} +h1 { + font-family: Verdana,Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #336699; font-size: large; +} + +DL { + font-family: Arial,Helvetica; font-weight: normal; text-decoration: none; + color: #333333; font-size: 10pt; +} + +UL > LI > A { + font-family: Arial,Helvetica; font-weight: bold; + color: #336699; font-size: 10pt; +} + +.moduleInfo { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #333333; font-size: 11pt; +} + +.moduleInfoSec { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #336699; font-size: 10pt; +} + +.moduleInfoVal { + font-family: Arial,Helvetica; font-weight: normal; text-decoration: underline; + color: #000000; font-size: 10pt; +} + +.cpanNavTitle { + font-family: Arial,Helvetica; font-weight: bold; + color: #ffffff; font-size: 10pt; +} +.cpanNavLetter { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #333333; font-size: 9pt; +} +.cpanCat { + font-family: Arial,Helvetica; font-weight: bold; text-decoration: none; + color: #336699; font-size: 9pt; +} + +.bttndrkblue-bkgd-top { + background-color: #225688; + background-image: url(/global/mvc_objects/images/bttndrkblue_bgtop.gif); +} +.bttndrkblue-bkgd-left { + background-color: #225688; + background-image: url(/global/mvc_objects/images/bttndrkblue_bgleft.gif); +} +.bttndrkblue-bkgd { + padding-top: 0px; + padding-bottom: 0px; + margin-bottom: 0px; + margin-top: 0px; + background-repeat: no-repeat; + 
background-color: #225688; + background-image: url(/global/mvc_objects/images/bttndrkblue_bgmiddle.gif); + vertical-align: top; +} +.bttndrkblue-bkgd-right { + background-color: #225688; + background-image: url(/global/mvc_objects/images/bttndrkblue_bgright.gif); +} +.bttndrkblue-bkgd-bottom { + background-color: #225688; + background-image: url(/global/mvc_objects/images/bttndrkblue_bgbottom.gif); +} +.bttndrkblue-text a { + color: #ffffff; + text-decoration: none; +} +a.bttndrkblue-text:hover { + color: #ffDD3C; + text-decoration: none; +} +.bg-ltblue { + background-color: #f0f5fa; +} + +.border-left-b { + background: #f0f5fa url(/i/corner-leftline.gif) repeat-y; +} + +.border-right-b { + background: #f0f5fa url(/i/corner-rightline.gif) repeat-y; +} + +.border-top-b { + background: #f0f5fa url(/i/corner-topline.gif) repeat-x; +} + +.border-bottom-b { + background: #f0f5fa url(/i/corner-botline.gif) repeat-x; +} + +.border-right-w { + background: #ffffff url(/i/corner-rightline.gif) repeat-y; +} + +.border-top-w { + background: #ffffff url(/i/corner-topline.gif) repeat-x; +} + +.border-bottom-w { + background: #ffffff url(/i/corner-botline.gif) repeat-x; +} + +.bg-white { + background-color: #ffffff; +} + +.border-left-w { + background: #ffffff url(/i/corner-leftline.gif) repeat-y; +} Added: www-releases/trunk/2.8/docs/CommandGuide/index.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/index.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/index.html (added) +++ www-releases/trunk/2.8/docs/CommandGuide/index.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,158 @@ + + + + LLVM Command Guide + + + + +
    + LLVM Command Guide +
    + +
    + +

    These documents are HTML versions of the man pages +for all of the LLVM tools. These pages describe how to use the LLVM commands +and what their options are. Note that these pages do not describe all of the +options available for all tools. To get a complete listing, pass the +-help (general options) or -help-hidden (general+debugging +options) arguments to the tool you are interested in.

    + +
    + + + + + +
    + +
      + +
    • llvm-as - + assemble a human-readable .ll file into bytecode
    • + +
    • llvm-dis - + disassemble a bytecode file into a human-readable .ll file
    • + +
    • opt - + run a series of LLVM-to-LLVM optimizations on a bytecode file
    • + +
    • llc - + generate native machine code for a bytecode file
    • + +
    • lli - + directly run a program compiled to bytecode using a JIT compiler or + interpreter
    • + +
    • llvm-link - + link several bytecode files into one
    • + +
    • llvm-ar - + archive bytecode files
    • + +
    • llvm-ranlib - + create an index for archives made with llvm-ar
    • + +
    • llvm-nm - + print out the names and types of symbols in a bytecode file
    • + +
    • llvm-prof - + format raw `llvmprof.out' data into a human-readable report
    • + +
    • llvm-ld - + general purpose linker with loadable runtime optimization support
    • + +
    • llvm-config - + print out LLVM compilation options, libraries, etc. as configured
    • + +
    • llvmc - + a generic customizable compiler driver
    • + +
    • llvm-diff - + structurally compare two modules
    • + +
    + +
    + + + + + +
    +
      + +
    • llvm-gcc - + GCC-based C front-end for LLVM + +
    • llvm-g++ - + GCC-based C++ front-end for LLVM
    • + +
    + +
    + + + + + + +
    + +
      + +
    • bugpoint - + automatic test-case reducer
    • + +
    • llvm-extract - + extract a function from an LLVM bytecode file
    • + +
    • llvm-bcanalyzer - + bytecode analyzer (analyzes the binary encoding itself, not the program it + represents)
    • + +
    +
    + + + + + +
    +
      + +
    • FileCheck - + Flexible file verifier used extensively by the testing harness
    • +
    • tblgen - + target description reader and generator
    • +
    • lit - + LLVM Integrated Tester, for running tests
    • + +
    +
    + + + +
    +
    + Valid CSS + Valid HTML 4.01 + + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-09-07 16:32:02 -0700 (Tue, 07 Sep 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/CommandGuide/lit.pod URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/lit.pod?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/lit.pod (added) +++ www-releases/trunk/2.8/docs/CommandGuide/lit.pod Mon Oct 4 15:49:23 2010 @@ -0,0 +1,354 @@ +=pod + +=head1 NAME + +lit - LLVM Integrated Tester + +=head1 SYNOPSIS + +B<lit> [I<options>] [I<tests>] + +=head1 DESCRIPTION + +B<lit> is a portable tool for executing LLVM and Clang style test suites, +summarizing their results, and providing indication of failures. B<lit> is +designed to be a lightweight testing tool with as simple a user interface as +possible. + +B<lit> should be run with one or more I<tests> to run specified on the command +line. Tests can be either individual test files or directories to search for +tests (see L<"TEST DISCOVERY">). + +Each specified test will be executed (potentially in parallel) and once all +tests have been run B<lit> will print summary information on the number of tests +which passed or failed (see L<"TEST STATUS RESULTS">). The B<lit> program will +execute with a non-zero exit code if any tests fail. + +By default B<lit> will use a succinct progress display and will only print +summary information for test failures. See L<"OUTPUT OPTIONS"> for options +controlling the B<lit> progress display and output. + +B<lit> also includes a number of options for controlling how tests are executed +(specific features may depend on the particular test format). See L<"EXECUTION +OPTIONS"> for more information. + +Finally, B<lit> also supports additional options for only running a subset of +the tests specified on the command line, see L<"SELECTION OPTIONS"> for +more information. + +Users interested in the B<lit> architecture or designing a B<lit> testing +implementation should see L<"LIT INFRASTRUCTURE">. + +=head1 GENERAL OPTIONS + +=over + +=item B<-h>, B<--help> + +Show the B<lit> help message.
+ +=item B<-j> I<N>, B<--threads>=I<N> + +Run I<N> tests in parallel. By default, this is automatically chosen to match +the number of detected available CPUs. + +=item B<--config-prefix>=I<NAME> + +Search for I<NAME.cfg> and I<NAME.site.cfg> when searching for test suites, +instead of I<lit.cfg> and I<lit.site.cfg>. + +=item B<--param> I<NAME>, B<--param> I<NAME>=I<VALUE> + +Add a user defined parameter I<NAME> with the given I<VALUE> (or the empty +string if not given). The meaning and use of these parameters is test suite +dependent. + +=back + +=head1 OUTPUT OPTIONS + +=over + +=item B<-q>, B<--quiet> + +Suppress any output except for test failures. + +=item B<-s>, B<--succinct> + +Show less output, for example don't show information on tests that pass. + +=item B<-v>, B<--verbose> + +Show more information on test failures, for example the entire test output +instead of just the test result. + +=item B<--no-progress-bar> + +Do not use the curses-based progress bar. + +=back + +=head1 EXECUTION OPTIONS + +=over + +=item B<--path>=I<PATH> + +Specify an additional I<PATH> to use when searching for executables in tests. + +=item B<--vg> + +Run individual tests under valgrind (using the memcheck tool). The +I<--error-exitcode> argument for valgrind is used so that valgrind failures will +cause the program to exit with a non-zero status. + +=item B<--vg-arg>=I<ARG> + +When I<--vg> is used, specify an additional argument to pass to valgrind itself. + +=item B<--time-tests> + +Track the wall time individual tests take to execute and include the results in +the summary output. This is useful for determining which tests in a test suite +take the most time to execute. Note that this option is most useful with I<-j +1>. + +=back + +=head1 SELECTION OPTIONS + +=over + +=item B<--max-tests>=I<N> + +Run at most I<N> tests and then terminate. + +=item B<--max-time>=I<N> + +Spend at most I<N> seconds (approximately) running tests and then terminate. + +=item B<--shuffle> + +Run the tests in a random order.
+ +=back + +=head1 ADDITIONAL OPTIONS + +=over + +=item B<--debug> + +Run B<lit> in debug mode, for debugging configuration issues and B<lit> itself. + +=item B<--show-suites> + +List the discovered test suites as part of the standard output. + +=item B<--no-tcl-as-sh> + +Run Tcl scripts internally (instead of converting to shell scripts). + +=item B<--repeat>=I<N> + +Run each test I<N> times. Currently this is primarily useful for timing tests; +other results are not collated in any reasonable fashion. + +=back + +=head1 EXIT STATUS + +B<lit> will exit with an exit code of 1 if there are any FAIL or XPASS +results. Otherwise, it will exit with the status 0. Other exit codes are used for +non-test related failures (for example a user error or an internal program +error). + +=head1 TEST DISCOVERY + +The inputs passed to B<lit> can be either individual tests, or entire +directories or hierarchies of tests to run. When B<lit> starts up, the first +thing it does is convert the inputs into a complete list of tests to run as part +of I<test discovery>. + +In the B<lit> model, every test must exist inside some I<test suite>. B<lit> +resolves the inputs specified on the command line to test suites by searching +upwards from the input path until it finds a I<lit.cfg> or I<lit.site.cfg> +file. These files serve as both a marker of test suites and as configuration +files which B<lit> loads in order to understand how to find and run the tests +inside the test suite. + +Once B<lit> has mapped the inputs into test suites it traverses the list of +inputs adding tests for individual files and recursively searching for tests in +directories. + +This behavior makes it easy to specify a subset of tests to run, while still +allowing the test suite configuration to control exactly how tests are +interpreted. In addition, B<lit> always identifies tests by the test suite they +are in, and their relative path inside the test suite. For appropriately +configured projects, this allows B<lit> to provide convenient and flexible +support for out-of-tree builds.
+ +=head1 TEST STATUS RESULTS + +Each test ultimately produces one of the following six results: + +=over + +=item B<PASS> + +The test succeeded. + +=item B<XFAIL> + +The test failed, but that is expected. This is used for test formats which allow +specifying that a test does not currently work, but wish to leave it in the test +suite. + +=item B<XPASS> + +The test succeeded, but it was expected to fail. This is used for tests which +were specified as expected to fail, but are now succeeding (generally because +the feature they test was broken and has been fixed). + +=item B<FAIL> + +The test failed. + +=item B<UNRESOLVED> + +The test result could not be determined. For example, this occurs when the test +could not be run, the test itself is invalid, or the test was interrupted. + +=item B<UNSUPPORTED> + +The test is not supported in this environment. This is used by test formats +which can report unsupported tests. + +=back + +Depending on the test format, tests may produce additional information about +their status (generally only for failures). See the L<Output|"OUTPUT OPTIONS"> +section for more information. + +=head1 LIT INFRASTRUCTURE + +This section describes the B<lit> testing architecture for users interested in +creating a new B<lit> testing implementation, or extending an existing one. + +B<lit> proper is primarily an infrastructure for discovering and running +arbitrary tests, and for exposing a single convenient interface to these +tests. B<lit> itself doesn't know how to run tests; rather, this logic is +defined by I<test suites>. + +=head2 TEST SUITES + +As described in L<"TEST DISCOVERY">, tests are always located inside a I<test suite>. Test suites serve to define the format of the tests they contain, the +logic for finding those tests, and any additional information to run the tests. + +B<lit> identifies test suites as directories containing I<lit.cfg> or +I<lit.site.cfg> files (see also B<--config-prefix>). Test suites are initially +discovered by recursively searching up the directory hierarchy for all the input +files passed on the command line.
You can use B<--show-suites> to display the +discovered test suites at startup. + +Once a test suite is discovered, its config file is loaded. Config files +themselves are Python modules which will be executed. When the config file is +executed, two important global variables are predefined: + +=over + +=item B<lit> + +The global B<lit> configuration object (a I<LitConfig> instance), which defines +the builtin test formats, global configuration parameters, and other helper +routines for implementing test configurations. + +=item B<config> + +This is the config object (a I<TestingConfig> instance) for the test suite, +which the config file is expected to populate. The following variables are also +available on the I<config> object, some of which must be set by the config and +others are optional or predefined: + +B<name> I<[required]> The name of the test suite, for use in reports and +diagnostics. + +B<test_format> I<[required]> The test format object which will be used to +discover and run tests in the test suite. Generally this will be a builtin test +format available from the I<lit.formats> module. + +B<test_source_root> The filesystem path to the test suite root. For out-of-dir +builds this is the directory that will be scanned for tests. + +B<test_exec_root> For out-of-dir builds, the path to the test suite root inside +the object directory. This is where tests will be run and temporary output files +placed. + +B<environment> A dictionary representing the environment to use when executing +tests in the suite. + +B<suffixes> For B<lit> test formats which scan directories for tests, this +variable is a list of suffixes to identify test files. Used by: I<ShTest>, +I<TclTest>. + +B<substitutions> For B<lit> test formats which substitute variables into a test +script, the list of substitutions to perform. Used by: I<ShTest>, I<TclTest>. + +B<unsupported> Mark an unsupported directory; all tests within it will be +reported as unsupported. Used by: I<ShTest>, I<TclTest>. + +B<parent> The parent configuration; this is the config object for the directory +containing the test suite, or None.
+ +B<on_clone> The config is actually cloned for every subdirectory inside a test +suite, to allow local configuration on a per-directory basis. The I<on_clone> +variable can be set to a Python function which will be called whenever a +configuration is cloned (for a subdirectory). The function should take three +arguments: (1) the parent configuration, (2) the new configuration (which the +I<on_clone> function will generally modify), and (3) the test path to the new +directory being scanned. + +=back + +=head2 TEST DISCOVERY + +Once test suites are located, B<lit> recursively traverses the source directory +(following I<test_source_root>) looking for tests. When B<lit> enters a +sub-directory, it first checks to see if a nested test suite is defined in that +directory. If so, it loads that test suite recursively, otherwise it +instantiates a local test config for the directory (see L<"LOCAL CONFIGURATION +FILES">). + +Tests are identified by the test suite they are contained within, and the +relative path inside that suite. Note that the relative path may not refer to an +actual file on disk; some test formats (such as I<GoogleTest>) define "virtual +tests" which have a path that contains both the path to the actual test file and +a subpath to identify the virtual test. + +=head2 LOCAL CONFIGURATION FILES + +When B<lit> loads a subdirectory in a test suite, it instantiates a local test +configuration by cloning the configuration for the parent directory -- the root +of this configuration chain will always be a test suite. Once the test +configuration is cloned B<lit> checks for a I<lit.local.cfg> file in the +subdirectory. If present, this file will be loaded and can be used to specialize +the configuration for each individual directory. This facility can be used to +define subdirectories of optional tests, or to change other configuration +parameters -- for example, to change the test format, or the suffixes which +identify test files.
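The per-directory cloning described above, together with the three-argument clone callback, can be modelled with a small stand-alone class. This is a sketch under stated assumptions rather than lit's own code: the class SketchConfig, the callback mark_optional and the example paths are hypothetical, and the hook attribute is assumed to be named on_clone.

```python
import copy


class SketchConfig:
    """Minimal stand-in for a per-directory test configuration (hypothetical)."""

    def __init__(self, name, suffixes, parent=None, on_clone=None):
        self.name = name
        self.suffixes = list(suffixes)
        self.parent = parent
        self.on_clone = on_clone  # assumed hook name
        self.unsupported = False

    def clone(self, path):
        """Create the local configuration for the subdirectory at `path`."""
        child = copy.copy(self)
        # Give the child its own list so local edits do not leak to the parent.
        child.suffixes = list(self.suffixes)
        child.parent = self
        if self.on_clone is not None:
            # Arguments: parent config, new config, path of the new directory.
            self.on_clone(self, child, path)
        return child


def mark_optional(parent, new, path):
    """Example callback: treat any directory named 'optional' as unsupported."""
    if path.endswith("optional"):
        new.unsupported = True


root = SketchConfig("demo-suite", [".ll"], on_clone=mark_optional)
sub = root.clone("tests/optional")
sub.suffixes.append(".c")  # a local change, as a local config file might make
```

Because each clone carries its own copy of the settings, a change made for one subdirectory stays local to it and to the directories below it, which is the behaviour the text attributes to local configuration files.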
+ +=head2 LIT EXAMPLE TESTS + +The B<lit> distribution contains several example implementations of test suites +in the I<ExampleTests> directory. + +=head1 SEE ALSO + +L<valgrind(1)> + +=head1 AUTHOR + +Written by Daniel Dunbar and maintained by the LLVM Team (L<http://llvm.org>). + +=cut Added: www-releases/trunk/2.8/docs/CommandGuide/llc.pod URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/llc.pod?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/llc.pod (added) +++ www-releases/trunk/2.8/docs/CommandGuide/llc.pod Mon Oct 4 15:49:23 2010 @@ -0,0 +1,193 @@ +=pod + +=head1 NAME + +llc - LLVM static compiler + +=head1 SYNOPSIS + +B<llc> [I<options>] [I<filename>] + +=head1 DESCRIPTION + +The B<llc> command compiles LLVM source inputs into assembly language for a +specified architecture. The assembly language output can then be passed through +a native assembler and linker to generate a native executable. + +The choice of architecture for the output assembly code is automatically +determined from the input file, unless the B<-march> option is used to override +the default. + +=head1 OPTIONS + +If I<filename> is - or omitted, B<llc> reads from standard input. Otherwise, it +will read from I<filename>. Inputs can be in either the LLVM assembly language +format (.ll) or the LLVM bitcode format (.bc). + +If the B<-o> option is omitted, then B<llc> will send its output to standard +output if the input is from standard input. If the B<-o> option specifies -, +then the output will also be sent to standard output. + +If no B<-o> option is specified and an input file other than - is specified, +then B<llc> creates the output filename by taking the input filename, +removing any existing F<.bc> extension, and adding a F<.s> suffix. + +Other B<llc> options are as follows: + +=head2 End-user Options + +=over + +=item B<-help> + +Print a summary of command line options. + +=item B<-O>=I<uint> + +Generate code at different optimization levels.
These correspond to the I<-O0>, +I<-O1>, I<-O2>, I<-O3>, and I<-O4> optimization levels used by B and +B. + +=item B<-mtriple>=I + +Override the target triple specified in the input file with the specified +string. + +=item B<-march>=I + +Specify the architecture for which to generate assembly, overriding the target +encoded in the input file. See the output of B for a list of +valid architectures. By default this is inferred from the target triple or +autodetected to the current architecture. + +=item B<-mcpu>=I + +Specify a specific chip in the current architecture to generate code for. +By default this is inferred from the target triple and autodetected to +the current architecture. For a list of available CPUs, use: +B /dev/null | llc -march=xyz -mcpu=help> + +=item B<-mattr>=I + +Override or control specific attributes of the target, such as whether SIMD +operations are enabled or not. The default set of attributes is set by the +current CPU. For a list of available attributes, use: +B /dev/null | llc -march=xyz -mattr=help> + +=item B<--disable-fp-elim> + +Disable frame pointer elimination optimization. + +=item B<--disable-excess-fp-precision> + +Disable optimizations that may produce excess precision for floating point. +Note that this option can dramatically slow down code on some systems +(e.g. X86). + +=item B<--enable-unsafe-fp-math> + +Enable optimizations that make unsafe assumptions about IEEE math (e.g. that +addition is associative) or may not work for all input ranges. These +optimizations allow the code generator to make use of some instructions which +would otherwise not be usable (such as fsin on X86). + +=item B<--enable-correct-eh-support> + +Instruct the B pass to insert code for correct exception handling +support. This is expensive and is by default omitted for efficiency. + +=item B<--stats> + +Print statistics recorded by code-generation passes. 
+ +=item B<--time-passes> + +Record the amount of time needed for each pass and print a report to standard +error. + +=item B<--load>=F + +Dynamically load F (a path to a dynamically shared object) that +implements an LLVM target. This will permit the target name to be used with the +B<-march> option so that code can be generated for that target. + +=back + +=head2 Tuning/Configuration Options + +=over + +=item B<--print-machineinstrs> + +Print generated machine code between compilation phases (useful for debugging). + +=item B<--regalloc>=I + +Specify the register allocator to use. The default I is I. +Valid register allocators are: + +=over + +=item I + +Very simple "always spill" register allocator + +=item I + +Local register allocator + +=item I + +Linear scan global register allocator + +=item I + +Iterative scan global register allocator + +=back + +=item B<--spiller>=I + +Specify the spiller to use for register allocators that support it. Currently +this option is used only by the linear scan register allocator. The default +I is I. Valid spillers are: + +=over + +=item I + +Simple spiller + +=item I + +Local spiller + +=back + +=back + +=head2 Intel IA-32-specific Options + +=over + +=item B<--x86-asm-syntax=att|intel> + +Specify whether to emit assembly code in AT&T syntax (the default) or intel +syntax. + +=back + +=head1 EXIT STATUS + +If B succeeds, it will exit with 0. Otherwise, if an error occurs, +it will exit with a non-zero value. + +=head1 SEE ALSO + +L + +=head1 AUTHORS + +Maintained by the LLVM Team (L). 
+ +=cut Added: www-releases/trunk/2.8/docs/CommandGuide/lli.pod URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/lli.pod?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/lli.pod (added) +++ www-releases/trunk/2.8/docs/CommandGuide/lli.pod Mon Oct 4 15:49:23 2010 @@ -0,0 +1,216 @@ +=pod + +=head1 NAME + +lli - directly execute programs from LLVM bitcode + +=head1 SYNOPSIS + +B [I] [I] [I] + +=head1 DESCRIPTION + +B directly executes programs in LLVM bitcode format. It takes a program +in LLVM bitcode format and executes it using a just-in-time compiler, if one is +available for the current architecture, or an interpreter. B takes all of +the same code generator options as L, but they are only effective when +B is using the just-in-time compiler. + +If I is not specified, then B reads the LLVM bitcode for the +program from standard input. + +The optional I specified on the command line are passed to the program as +arguments. + +=head1 GENERAL OPTIONS + +=over + +=item B<-fake-argv0>=I + +Override the C value passed into the executing program. + +=item B<-force-interpreter>=I<{false,true}> + +If set to true, use the interpreter even if a just-in-time compiler is available +for this architecture. Defaults to false. + +=item B<-help> + +Print a summary of command line options. + +=item B<-load>=I + +Causes B to load the plugin (shared object) named I and use +it for optimization. + +=item B<-stats> + +Print statistics from the code-generation passes. This is only meaningful for +the just-in-time compiler, at present. + +=item B<-time-passes> + +Record the amount of time needed for each code-generation pass and print it to +standard error. + +=item B<-version> + +Print out the version of B and exit without doing anything else. 
+ +=back + +=head1 TARGET OPTIONS + +=over + +=item B<-mtriple>=I + +Override the target triple specified in the input bitcode file with the +specified string. This may result in a crash if you pick an +architecture which is not compatible with the current system. + +=item B<-march>=I + +Specify the architecture for which to generate assembly, overriding the target +encoded in the bitcode file. See the output of B for a list of +valid architectures. By default this is inferred from the target triple or +autodetected to the current architecture. + +=item B<-mcpu>=I + +Specify a specific chip in the current architecture to generate code for. +By default this is inferred from the target triple and autodetected to +the current architecture. For a list of available CPUs, use: +B /dev/null | llc -march=xyz -mcpu=help> + +=item B<-mattr>=I + +Override or control specific attributes of the target, such as whether SIMD +operations are enabled or not. The default set of attributes is set by the +current CPU. For a list of available attributes, use: +B /dev/null | llc -march=xyz -mattr=help> + +=back + + +=head1 FLOATING POINT OPTIONS + +=over + +=item B<-disable-excess-fp-precision> + +Disable optimizations that may increase floating point precision. + +=item B<-enable-finite-only-fp-math> + +Enable optimizations that assumes only finite floating point math. That is, +there is no NAN or Inf values. + +=item B<-enable-unsafe-fp-math> + +Causes B to enable optimizations that may decrease floating point +precision. + +=item B<-soft-float> + +Causes B to generate software floating point library calls instead of +equivalent hardware instructions. 
+ +=back + +=head1 CODE GENERATION OPTIONS + +=over + +=item B<-code-model>=I + +Choose the code model from: + + default: Target default code model + small: Small code model + kernel: Kernel code model + medium: Medium code model + large: Large code model + +=item B<-disable-post-RA-scheduler> + +Disable scheduling after register allocation. + +=item B<-disable-spill-fusing> + +Disable fusing of spill code into instructions. + +=item B<-enable-correct-eh-support> + +Make the -lowerinvoke pass insert expensive, but correct, EH code. + +=item B<-jit-enable-eh> + +Exception handling should be enabled in the just-in-time compiler. + +=item B<-join-liveintervals> + +Coalesce copies (default=true). + +=item B<-nozero-initialized-in-bss> +Don't place zero-initialized symbols into the BSS section. + +=item B<-pre-RA-sched>=I + +Instruction schedulers available (before register allocation): + + =default: Best scheduler for the target + =none: No scheduling: breadth first sequencing + =simple: Simple two pass scheduling: minimize critical path and maximize processor utilization + =simple-noitin: Simple two pass scheduling: Same as simple except using generic latency + =list-burr: Bottom-up register reduction list scheduling + =list-tdrr: Top-down register reduction list scheduling + =list-td: Top-down list scheduler -print-machineinstrs - Print generated machine code + +=item B<-regalloc>=I + +Register allocator to use (default=linearscan) + + =bigblock: Big-block register allocator + =linearscan: linear scan register allocator =local - local register allocator + =simple: simple register allocator + +=item B<-relocation-model>=I + +Choose relocation model from: + + =default: Target default relocation model + =static: Non-relocatable code =pic - Fully relocatable, position independent code + =dynamic-no-pic: Relocatable external references, non-relocatable code + +=item B<-spiller> + +Spiller to use (default=local) + + =simple: simple spiller + =local: local spiller + +=item 
B<-x86-asm-syntax>=I + +Choose style of code to emit from X86 backend: + + =att: Emit AT&T-style assembly + =intel: Emit Intel-style assembly + +=back + +=head1 EXIT STATUS + +If B fails to load the program, it will exit with an exit code of 1. +Otherwise, it will return the exit code of the program it executes. + +=head1 SEE ALSO + +L + +=head1 AUTHOR + +Maintained by the LLVM Team (L). + +=cut Added: www-releases/trunk/2.8/docs/CommandGuide/llvm-ar.pod URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/llvm-ar.pod?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/llvm-ar.pod (added) +++ www-releases/trunk/2.8/docs/CommandGuide/llvm-ar.pod Mon Oct 4 15:49:23 2010 @@ -0,0 +1,406 @@ +=pod + +=head1 NAME + +llvm-ar - LLVM archiver + +=head1 SYNOPSIS + +B [-]{dmpqrtx}[Rabfikouz] [relpos] [count] [files...] + + +=head1 DESCRIPTION + +The B command is similar to the common Unix utility, C. It +archives several files together into a single file. The intent for this is +to produce archive libraries by LLVM bitcode that can be linked into an +LLVM program. However, the archive can contain any kind of file. By default, +B generates a symbol table that makes linking faster because +only the symbol table needs to be consulted, not each individual file member +of the archive. + +The B command can be used to I both SVR4 and BSD style archive +files. However, it cannot be used to write them. While the B command +produces files that are I identical to the format used by other C +implementations, it has two significant departures in order to make the +archive appropriate for LLVM. The first departure is that B only +uses BSD4.4 style long path names (stored immediately after the header) and +never contains a string table for long names. 
The second departure is that the +symbol table is formatted for efficient construction of an in-memory data +structure that permits rapid (red-black tree) lookups. Consequently, archives +produced with B<llvm-ar> usually won't be readable or editable with any +C<ar> implementation or useful for linking. Using the C<f> modifier to flatten +file names will make the archive readable by other C<ar> implementations +but not for linking because the symbol table format for LLVM is unique. If an +SVR4 or BSD style archive is used with the C<r> (replace) or C<q> (quick +update) operations, the archive will be reconstructed in LLVM format. This +means that the string table will be dropped (in deference to BSD 4.4 long names) +and an LLVM symbol table will be added (by default). The system symbol table +will be retained. + +Here's where B<llvm-ar> departs from previous C<ar> implementations: + +=over + +=item I<Symbol Table> + +Since B<llvm-ar> is intended to archive bitcode files, the symbol table +won't make much sense to anything but LLVM. Consequently, the symbol table's +format has been simplified. It consists simply of a sequence of pairs +of a file member index number as an LSB 4-byte integer and a null-terminated +string. + +=item I<Long Paths> + +Some C<ar> implementations (SVR4) use a separate file member to record long +path names (> 15 characters). B<llvm-ar> takes the BSD 4.4 and Mac OS X +approach which is to simply store the full path name immediately preceding +the data for the file. The path name is null terminated and may contain the +slash (/) character. + +=item I<Compression> + +B<llvm-ar> can compress the members of an archive to save space. The +compression used depends on what's available on the platform and what choices +the LLVM Compressor utility makes. It generally favors bzip2 but will select +between "no compression" or bzip2 depending on what makes sense for the +file's content. + +=item I<Directory Recursion> + +Most C<ar> implementations do not recurse through directories but simply +ignore directories if they are presented to the program in the F<files> +option.
B, however, can recurse through directory structures and +add all the files under a directory, if requested. + +=item I + +When B prints out the verbose table of contents (C option), it +precedes the usual output with a character indicating the basic kind of +content in the file. A blank means the file is a regular file. A 'Z' means +the file is compressed. A 'B' means the file is an LLVM bitcode file. An +'S' means the file is the symbol table. + +=back + +=head1 OPTIONS + +The options to B are compatible with other C implementations. +However, there are a few modifiers (F) that are not found in other +Cs. The options to B specify a single basic operation to +perform on the archive, a variety of modifiers for that operation, the +name of the archive file, and an optional list of file names. These options +are used to determine how B should process the archive file. + +The Operations and Modifiers are explained in the sections below. The minimal +set of options is at least one operator and the name of the archive. Typically +archive files end with a C<.a> suffix, but this is not required. Following +the F comes a list of F that indicate the specific members +of the archive to operate on. If the F option is not specified, it +generally means either "none" or "all" members, depending on the operation. + +=head2 Operations + +=over + +=item d + +Delete files from the archive. No modifiers are applicable to this operation. +The F options specify which members should be removed from the +archive. It is not an error if a specified file does not appear in the archive. +If no F are specified, the archive is not modified. + +=item m[abi] + +Move files from one location in the archive to another. The F, F, and +F modifiers apply to this operation. The F will all be moved +to the location given by the modifiers. If no modifiers are used, the files +will be moved to the end of the archive. If no F are specified, the +archive is not modified. 
+ +=item p[k] + +Print files to the standard output. The F modifier applies to this +operation. This operation simply prints the F indicated to the +standard output. If no F are specified, the entire archive is printed. +Printing bitcode files is ill-advised as they might confuse your terminal +settings. The F

    operation is used. This modifier defeats the default and allows the +bitcode members to be printed. + +=item [N] + +This option is ignored by B<llvm-ar> but provided for compatibility. + +=item [o] + +When extracting files, this option will cause B<llvm-ar> to preserve the +original modification times of the files it writes. + +=item [P] + +Use full path names when matching. + +=item [R] + +This modifier instructs the F<r> option to recursively process directories. +Without F<R>, directories are ignored and only those F<files> that refer to +files will be added to the archive. When F<R> is used, any directories specified +with F<files> will be scanned (recursively) to find files to be added to the +archive. Any file whose name begins with a dot will not be added. + +=item [u] + +When replacing existing files in the archive, only replace those files that have +a newer time stamp than the time stamp of the member in the archive. + +=item [z] + +When inserting or replacing any file in the archive, compress the file first. +This +modifier is safe to use when (previously) compressed bitcode files are added to +the archive; the compressed bitcode files will not be doubly compressed. + +=back + +=head2 Modifiers (generic) + +The modifiers below may be applied to any operation. + +=over + +=item [c] + +For all operations, B<llvm-ar> will always create the archive if it doesn't +exist. Normally, B<llvm-ar> will print a warning message indicating that the +archive is being created. Using this modifier turns off that warning. + +=item [s] + +This modifier requests that an archive index (or symbol table) be added to the +archive. This is the default mode of operation. The symbol table will contain +all the externally visible functions and global variables defined by all the +bitcode files in the archive. Using this modifier is more efficient than using +L<llvm-ranlib|llvm-ranlib> which also creates the symbol table. + +=item [S] + +This modifier is the opposite of the F<s> modifier. It instructs B<llvm-ar> to +not build the symbol table.
If both F<s> and F<S> are used, the last modifier to +occur in the options will prevail. + +=item [v] + +This modifier instructs B<llvm-ar> to be verbose about what it is doing. Each +editing operation taken against the archive will produce a line of output saying +what is being done. + +=back + +=head1 STANDARDS + +The B<llvm-ar> utility is intended to provide a superset of the IEEE Std 1003.2 +(POSIX.2) functionality for C<ar>. B<llvm-ar> can read both SVR4 and BSD4.4 (or +Mac OS X) archives. If the C modifier is given to the C or C operations +then B<llvm-ar> will write SVR4 compatible archives. Without this modifier, +B<llvm-ar> will write BSD4.4 compatible archives that have long names +immediately after the header and indicated using the "#1/ddd" notation for the +name in the header. + +=head1 FILE FORMAT + +The file format for LLVM Archive files is similar to that of BSD 4.4 or Mac OS X +archive files. In fact, except for the symbol table, the C<ar> commands on those +operating systems should be able to read LLVM archive files. The details of the +file format follow. + +Each archive begins with the archive magic number which is the eight printable +characters "!<arch>\n" where \n represents the newline character (0x0A). +Following the magic number, the file is composed of even length members that +begin with an archive header and end with a \n padding character if necessary +(to make the length even). Each file member is composed of a header (defined +below), an optional newline-terminated "long file name" and the contents of +the file. + +The fields of the header are described in the items below. All fields of the +header contain only ASCII characters, are left justified and are right padded +with space characters. + +=over + +=item name - char[16] + +This field of the header provides the name of the archive member. If the name is +longer than 15 characters or contains a slash (/) character, then this field +contains C<#1/nnn> where C<nnn> provides the length of the name and the C<#1/> +is literal. 
In this case, the actual name of the file is provided in the C<nnn> +bytes immediately following the header. If the name is 15 characters or fewer, it +is contained directly in this field and terminated with a slash (/) character. + +=item date - char[12] + +This field provides the date of modification of the file in the form of a +decimal encoded number that provides the number of seconds since the epoch +(since 00:00:00 Jan 1, 1970) per POSIX specifications. + +=item uid - char[6] + +This field provides the user id of the file encoded as a decimal ASCII string. +This field might not make much sense on non-Unix systems. On Unix, it is the +same value as the st_uid field of the stat structure returned by the stat(2) +operating system call. + +=item gid - char[6] + +This field provides the group id of the file encoded as a decimal ASCII string. +This field might not make much sense on non-Unix systems. On Unix, it is the +same value as the st_gid field of the stat structure returned by the stat(2) +operating system call. + +=item mode - char[8] + +This field provides the access mode of the file encoded as an octal ASCII +string. This field might not make much sense on non-Unix systems. On Unix, it +is the same value as the st_mode field of the stat structure returned by the +stat(2) operating system call. + +=item size - char[10] + +This field provides the size of the file, in bytes, encoded as a decimal ASCII +string. If the size field is negative (starts with a minus sign, 0x2D), then +the archive member is stored in compressed form. The first byte of the archive +member's data indicates the compression type used. A value of 0 (0x30) indicates +that no compression was used. A value of 2 (0x32) indicates that bzip2 +compression was used. + +=item fmag - char[2] + +This field is the archive file member magic number. Its content is always the +two characters back tick (0x60) and newline (0x0A). 
This provides some measure of +utility in identifying archive files that have been corrupted. + +=back + +The LLVM symbol table has the special name "#_LLVM_SYM_TAB_#". It is presumed +that no regular archive member file will want this name. The LLVM symbol table +is simply composed of a sequence of triplets: byte offset, length of symbol, +and the symbol itself. Symbols are not null or newline terminated. Here are +the details on each of these items: + +=over + +=item offset - vbr encoded 32-bit integer + +The offset item provides the offset into the archive file where the bitcode +member is stored that is associated with the symbol. The offset value is 0 +based at the start of the first "normal" file member. To derive the actual +file offset of the member, you must add the number of bytes occupied by the file +signature (8 bytes) and the symbol tables. The value of this item is encoded +using variable bit rate encoding to reduce the size of the symbol table. +Variable bit rate encoding uses the high bit (0x80) of each byte to indicate +if there are more bytes to follow. The remaining 7 bits in each byte carry bits +from the value. The final byte does not have the high bit set. + +=item length - vbr encoded 32-bit integer + +The length item provides the length of the symbol that follows. Like the +I<offset> item, the length is variable bit rate encoded. + +=item symbol - character array + +The symbol item provides the text of the symbol that is associated with the +I<offset>. The symbol is not terminated by any character. Its length is provided +by the I<length> field. Note that it is allowed (but unwise) to use non-printing +characters (even 0x00) in the symbol. This allows for multiple encodings of +symbol names. + +=back + +=head1 EXIT STATUS + +If B<llvm-ar> succeeds, it will exit with 0. A usage error results +in an exit code of 1. A hard (file system typically) error results in an +exit code of 2. Miscellaneous or unknown errors result in an +exit code of 3. 
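The variable bit rate encoding described for the offset and length items of the symbol table can be sketched as follows. This is only an illustration, not the archive writer's actual code: the names encodeVbr32 and decodeVbr32 are hypothetical, and since the text does not say which 7-bit group is written first, the sketch assumes the least significant group comes first.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `value` to `out` using 7 payload bits per byte.
// Assumption: the least significant 7-bit group is written first.
// Every byte except the last has the high bit (0x80) set.
void encodeVbr32(std::uint32_t value, std::vector<std::uint8_t> &out) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>(0x80 | (value & 0x7F)));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Decode one value starting at `pos` and advance `pos` past it.
// Returns false if the input ends before a byte without the high bit
// is seen, or if more than five bytes would be needed.
bool decodeVbr32(const std::vector<std::uint8_t> &in, std::size_t &pos,
                 std::uint32_t &value) {
  value = 0;
  for (unsigned shift = 0; shift < 35 && pos < in.size(); shift += 7) {
    std::uint8_t byte = in[pos++];
    value |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0)
      return true; // final byte: high bit clear
  }
  return false;
}
```

Under these assumptions a symbol length below 128 occupies a single byte, which is where the space saving in the symbol table comes from.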
+ +=head1 SEE ALSO + +L, ar(1) + +=head1 AUTHORS + +Maintained by the LLVM Team (L). + +=cut Added: www-releases/trunk/2.8/docs/CommandGuide/llvm-as.pod URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/llvm-as.pod?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/llvm-as.pod (added) +++ www-releases/trunk/2.8/docs/CommandGuide/llvm-as.pod Mon Oct 4 15:49:23 2010 @@ -0,0 +1,77 @@ +=pod + +=head1 NAME + +llvm-as - LLVM assembler + +=head1 SYNOPSIS + +B<llvm-as> [I<options>] [I<filename>] + +=head1 DESCRIPTION + +B<llvm-as> is the LLVM assembler. It reads a file containing human-readable +LLVM assembly language, translates it to LLVM bitcode, and writes the result +into a file or to standard output. + +If F<filename> is omitted or is C<->, then B<llvm-as> reads its input from +standard input. + +If an output file is not specified with the B<-o> option, then +B<llvm-as> sends its output to a file or standard output by following +these rules: + +=over + +=item * + +If the input is standard input, then the output is standard output. + +=item * + +If the input is a file that ends with C<.ll>, then the output file is of +the same name, except that the suffix is changed to C<.bc>. + +=item * + +If the input is a file that does not end with the C<.ll> suffix, then the +output file has the same name as the input file, except that the C<.bc> +suffix is appended. + +=back + +=head1 OPTIONS + +=over + +=item B<-f> + +Enable binary output on terminals. Normally, B<llvm-as> will refuse to +write raw bitcode output if the output stream is a terminal. With this option, +B<llvm-as> will write raw bitcode regardless of the output device. + +=item B<-help> + +Print a summary of command line options. + +=item B<-o> F<filename> + +Specify the output file name. If F<filename> is C<->, then B<llvm-as> +sends its output to standard output. + +=back + +=head1 EXIT STATUS + +If B<llvm-as> succeeds, it will exit with 0. Otherwise, if an error +occurs, it will exit with a non-zero value. 
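The three output-naming rules listed in the llvm-as DESCRIPTION above can be sketched as a small function. This is an illustration of the rules as written, not the tool's source: the name defaultOutputName is hypothetical, and "-" is used here to stand for standard input and standard output.

```cpp
#include <string>

// Hypothetical helper: derive the default output name from the input name
// when no -o option is given. "-" (or an empty name) means standard input,
// in which case the result "-" means standard output.
std::string defaultOutputName(const std::string &input) {
  if (input.empty() || input == "-")
    return "-";

  const std::string suffix = ".ll";
  const bool endsWithLl =
      input.size() >= suffix.size() &&
      input.compare(input.size() - suffix.size(), suffix.size(), suffix) == 0;

  if (endsWithLl) // foo.ll -> foo.bc
    return input.substr(0, input.size() - suffix.size()) + ".bc";

  return input + ".bc"; // foo.txt -> foo.txt.bc
}
```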
+ +=head1 SEE ALSO + +L, L + +=head1 AUTHORS + +Maintained by the LLVM Team (L). + +=cut Added: www-releases/trunk/2.8/docs/CommandGuide/llvm-bcanalyzer.pod URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/llvm-bcanalyzer.pod?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/llvm-bcanalyzer.pod (added) +++ www-releases/trunk/2.8/docs/CommandGuide/llvm-bcanalyzer.pod Mon Oct 4 15:49:23 2010 @@ -0,0 +1,315 @@ +=pod + +=head1 NAME + +llvm-bcanalyzer - LLVM bitcode analyzer + +=head1 SYNOPSIS + +B<llvm-bcanalyzer> [I<options>] [F<filename>] + +=head1 DESCRIPTION + +The B<llvm-bcanalyzer> command is a small utility for analyzing bitcode files. +The tool reads a bitcode file (such as generated with the B<llvm-as> tool) and +produces a statistical report on the contents of the bitcode file. The tool +can also dump a low level but human readable version of the bitcode file. +This tool is probably not of much interest or utility except for those working +directly with the bitcode file format. Most LLVM users can just ignore +this tool. + +If F<filename> is omitted or is C<->, then B<llvm-bcanalyzer> reads its input +from standard input. This is useful for combining the tool into a pipeline. +Output is written to the standard output. + +=head1 OPTIONS + +=over + +=item B<-nodetails> + +Causes B<llvm-bcanalyzer> to abbreviate its output by writing out only a module +level summary. The details for individual functions are not displayed. + +=item B<-dump> + +Causes B<llvm-bcanalyzer> to dump the bitcode in a human readable format. This +format is significantly different from LLVM assembly and provides details about +the encoding of the bitcode file. + +=item B<-verify> + +Causes B<llvm-bcanalyzer> to verify the module produced by reading the +bitcode. This ensures that the statistics generated are based on a consistent +module. + +=item B<-help> + +Print a summary of command line options. + +=back + +=head1 EXIT STATUS + +If B<llvm-bcanalyzer> succeeds, it will exit with 0. 
Otherwise, if an error +occurs, it will exit with a non-zero value, usually 1. + +=head1 SUMMARY OUTPUT DEFINITIONS + +The following items are always printed by llvm-bcanalyzer. They comprise the +summary output. + +=over + +=item B + +This just provides the name of the module for which bitcode analysis is being +generated. + +=item B + +The bitcode version (not LLVM version) of the file read by the analyzer. + +=item B + +The size, in bytes, of the entire bitcode file. + +=item B + +The size, in bytes, of the module block. Percentage is relative to File Size. + +=item B + +The size, in bytes, of all the function blocks. Percentage is relative to File +Size. + +=item B + +The size, in bytes, of the Global Types Pool. Percentage is relative to File +Size. This is the size of the definitions of all types in the bitcode file. + +=item B + +The size, in bytes, of the Constant Pool Blocks. Percentage is relative to File +Size. + +=item B + +The size, in bytes, of the Global Variable Definitions and their initializers. +Percentage is relative to File Size. + +=item B + +The size, in bytes, of all the instruction lists in all the functions. +Percentage is relative to File Size. Note that this value is also included in +the Function Bytes. + +=item B + +The size, in bytes, of all the compaction tables in all the functions. +Percentage is relative to File Size. Note that this value is also included in +the Function Bytes. + +=item B + +The size, in bytes, of all the symbol tables in all the functions. Percentage is +relative to File Size. Note that this value is also included in the Function +Bytes. + +=item B + +The size, in bytes, of the list of dependent libraries in the module. Percentage +is relative to File Size. Note that this value is also included in the Module +Global Bytes. + +=item B + +The total number of blocks of any kind in the bitcode file. + +=item B + +The total number of function definitions in the bitcode file. 
+ +=item B + +The total number of types defined in the Global Types Pool. + +=item B + +The total number of constants (of any type) defined in the Constant Pool. + +=item B + +The total number of basic blocks defined in all functions in the bitcode file. + +=item B + +The total number of instructions defined in all functions in the bitcode file. + +=item B + +The total number of long instructions defined in all functions in the bitcode +file. Long instructions are those taking greater than 4 bytes. Typically long +instructions are GetElementPtr with several indices, PHI nodes, and calls to +functions with large numbers of arguments. + +=item B + +The total number of operands used in all instructions in the bitcode file. + +=item B + +The total number of compaction tables in all functions in the bitcode file. + +=item B + +The total number of symbol tables in all functions in the bitcode file. + +=item B + +The total number of dependent libraries found in the bitcode file. + +=item B + +The total size of the instructions in all functions in the bitcode file. + +=item B + +The average number of bytes per instruction across all functions in the bitcode +file. This value is computed by dividing Total Instruction Size by Number Of +Instructions. + +=item B + +The maximum value used for a type's slot number. Larger slot number values take +more bytes to encode. + +=item B + +The maximum value used for a value's slot number. Larger slot number values take +more bytes to encode. + +=item B + +The average size of a Value definition (of any type). This is computed by +dividing File Size by the total number of values of any type. + +=item B + +The average size of a global definition (constants and global variables). + +=item B + +The average number of bytes per function definition. This is computed by +dividing Function Bytes by Number Of Functions. 
+ +=item B<# of VBR 32-bit Integers> + +The total number of 32-bit integers encoded using the Variable Bit Rate +encoding scheme. + +=item B<# of VBR 64-bit Integers> + +The total number of 64-bit integers encoded using the Variable Bit Rate encoding +scheme. + +=item B<# of VBR Compressed Bytes> + +The total number of bytes consumed by the 32-bit and 64-bit integers that use +the Variable Bit Rate encoding scheme. + +=item B<# of VBR Expanded Bytes> + +The total number of bytes that would have been consumed by the 32-bit and 64-bit +integers had they not been compressed with the Variable Bit Rate encoding +scheme. + +=item B + +The total number of bytes saved by using the Variable Bit Rate encoding scheme. +The percentage is relative to # of VBR Expanded Bytes. + +=back + +=head1 DETAILED OUTPUT DEFINITIONS + +The following definitions occur only if the -nodetails option was not given. +The detailed output provides additional information on a per-function basis. + +=over + +=item B + +The type signature of the function. + +=item B + +The total number of bytes in the function's block. + +=item B + +The number of basic blocks defined by the function. + +=item B + +The number of instructions defined by the function. + +=item B + +The number of instructions using the long instruction format in the function. + +=item B + +The number of operands used by all instructions in the function. + +=item B + +The number of bytes consumed by instructions in the function. + +=item B + +The average number of bytes consumed by the instructions in the function. This +value is computed by dividing Instruction Size by Instructions. + +=item B + +The average number of bytes used by the function per instruction. This value is +computed by dividing Byte Size by Instructions. Note that this is not the same +as Average Instruction Size. It computes a number relative to the total function +size not just the size of the instruction list. 
+ +=item B + +The total number of 32-bit integers found in this function (for any use). + +=item B + +The total number of 64-bit integers found in this function (for any use). + +=item B + +The total number of bytes in this function consumed by the 32-bit and 64-bit +integers that use the Variable Bit Rate encoding scheme. + +=item B + +The total number of bytes in this function that would have been consumed by +the 32-bit and 64-bit integers had they not been compressed with the Variable +Bit Rate encoding scheme. + +=item B + +The total number of bytes saved in this function by using the Variable Bit +Rate encoding scheme. The percentage is relative to # of VBR Expanded Bytes. + +=back + +=head1 SEE ALSO + +L, L + +=head1 AUTHORS + +Maintained by the LLVM Team (L). + +=cut Added: www-releases/trunk/2.8/docs/CommandGuide/llvm-config.pod URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CommandGuide/llvm-config.pod?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CommandGuide/llvm-config.pod (added) +++ www-releases/trunk/2.8/docs/CommandGuide/llvm-config.pod Mon Oct 4 15:49:23 2010 @@ -0,0 +1,131 @@ +=pod + +=head1 NAME + +llvm-config - Print LLVM compilation options + +=head1 SYNOPSIS + +B I

    + +
      +
    1. Introduction
    2. Quick Start Guide
       1. Boolean Arguments
       2. Argument Aliases
       3. Selecting an alternative from a set of possibilities
       4. Named alternatives
       5. Parsing a list of options
       6. Collecting options as a set of flags
       7. Adding freeform text to help output
    3. Reference Guide
       1. Positional Arguments
       2. Internal vs External Storage
       3. Option Attributes
       4. Option Modifiers
       5. Top-Level Classes and Functions
       6. Builtin parsers
    4. Extension Guide
       1. Writing a custom parser
       2. Exploiting external storage
       3. Dynamically adding command line options
    + +
    +

    Written by Chris Lattner

    +
    + + + + + +
    + +

    This document describes the CommandLine argument processing library. It will +show you how to use it, and what it can do. The CommandLine library uses a +declarative approach to specifying the command line options that your program +takes. By default, these option declarations implicitly hold the value parsed +for the option declared (of course this can be +changed).

    + +

    Although there are a lot of command line argument parsing libraries +out there in many different languages, none of them fit well with what I needed. +By looking at the features and problems of other libraries, I designed the +CommandLine library to have the following features:

    + +
      +
    1. Speed: The CommandLine library is very quick and uses few resources. The +parsing time of the library is directly proportional to the number of arguments +parsed, not the number of options recognized. Additionally, command line +argument values are captured transparently into user defined global variables, +which can be accessed like any other variable (and with the same +performance).

    2. Type Safe: As a user of CommandLine, you don't have to worry about +remembering the type of arguments that you want (is it an int? a string? a +bool? an enum?) and keep casting it around. Not only does this help prevent +error prone constructs, it also leads to dramatically cleaner source code.

    3. No subclasses required: To use CommandLine, you instantiate variables that +correspond to the arguments that you would like to capture, you don't subclass a +parser. This means that you don't have to write any boilerplate +code.

    4. Globally accessible: Libraries can specify command line arguments that are +automatically enabled in any tool that links to the library. This is possible +because the application doesn't have to keep a list of arguments to pass to +the parser. This also makes supporting dynamically +loaded options trivial.

    5. Cleaner: CommandLine supports enum and other types directly, meaning that +there is less error and more security built into the library. You don't have to +worry about whether your integral command line argument accidentally got +assigned a value that is not valid for your enum type.

    6. Powerful: The CommandLine library supports many different types of +arguments, from simple boolean flags to scalar arguments (strings, integers, enums, doubles), to lists of +arguments. This is possible because CommandLine is...

    7. Extensible: It is very simple to add a new argument type to CommandLine. +Simply specify the parser that you want to use with the command line option when +you declare it. Custom parsers are no problem.

    8. Labor Saving: The CommandLine library cuts down on the amount of grunt work +that you, the user, have to do. For example, it automatically provides a +-help option that shows the available command line options for your +tool. Additionally, it does most of the basic correctness checking for +you.

    9. Capable: The CommandLine library can handle lots of different forms of +options often found in real programs. For example, positional arguments, ls style grouping options (to allow processing 'ls +-lad' naturally), ld style prefix +options (to parse '-lmalloc -L/usr/lib'), and interpreter style options.
    + +

    This document will hopefully let you jump in and start using CommandLine in +your utility quickly and painlessly. Additionally it should be a simple +reference manual to figure out how stuff works. If it is failing in some area +(or you want an extension to the library), nag the author, Chris Lattner.

    + +
    + + + + + +
    + +

    This section of the manual runs through a simple CommandLine'ification of a +basic compiler tool. This is intended to show you how to jump into using the +CommandLine library in your own program, and show you some of the cool things it +can do.

    + +

    To start out, you need to include the CommandLine header file into your +program:

    + +
    +  #include "llvm/Support/CommandLine.h"
    +
    + +

    Additionally, you need to add this as the first line of your main +program:

    + +
    +int main(int argc, char **argv) {
    +  cl::ParseCommandLineOptions(argc, argv);
    +  ...
    +}
    +
    + +

    ... which actually parses the arguments and fills in the variable +declarations.

    + +

    Now that you are ready to support command line arguments, we need to tell the +system which ones we want, and what type of arguments they are. The CommandLine +library uses a declarative syntax to model command line arguments with the +global variable declarations that capture the parsed values. This means that +for every command line option that you would like to support, there should be a +global variable declaration to capture the result. For example, in a compiler, +we would like to support the Unix-standard '-o <filename>' option +to specify where to put the output. With the CommandLine library, this is +represented like this:

    + + +
    +cl::opt<string> OutputFilename("o", cl::desc("Specify output filename"), cl::value_desc("filename"));
    +
    + +

    This declares a global variable "OutputFilename" that is used to +capture the result of the "o" argument (first parameter). We specify +that this is a simple scalar option by using the "cl::opt" template (as opposed to the "cl::list" template), and tell the CommandLine library +that the data type that we are parsing is a string.

    + +

    The second and third parameters (which are optional) are used to specify what +to output for the "-help" option. In this case, we get a line that +looks like this:

    + +
    +USAGE: compiler [options]
    +
    +OPTIONS:
    +  -help             - display available options (-help-hidden for more)
    +  -o <filename>     - Specify output filename
    +
    + +

    Because we specified that the command line option should parse using the +string data type, the variable declared is automatically usable as a +real string in all contexts that a normal C++ string object may be used. For +example:

    + +
    +  ...
    +  std::ofstream Output(OutputFilename.c_str());
    +  if (Output.good()) ...
    +  ...
    +
    + +

    There are many different options that you can use to customize the command +line option handling library, but the above example shows the general interface +to these options. The options can be specified in any order, and are specified +with helper functions like cl::desc(...), so +there are no positional dependencies to remember. The available options are +discussed in detail in the Reference Guide.

    + +

    Continuing the example, we would like to have our compiler take an input +filename as well as an output filename, but we do not want the input filename to +be specified with a hyphen (i.e., not -filename.c). To support this +style of argument, the CommandLine library allows for positional arguments to be specified for the program. +These positional arguments are filled with command line parameters that are not +in option form. We use this feature like this:

    + +
    +cl::opt<string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::init("-"));
    +
    + +

    This declaration indicates that the first positional argument should be +treated as the input filename. Here we use the cl::init option to specify an initial value for the +command line option, which is used if the option is not specified (if you do not +specify a cl::init modifier for an option, then +the default constructor for the data type is used to initialize the value). +Command line options default to being optional, so if we would like to require +that the user always specify an input filename, we would add the cl::Required flag, and we could eliminate the +cl::init modifier, like this:

    + +
    +cl::opt<string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::Required);
    +
    + +

    Again, the CommandLine library does not require the options to be specified +in any particular order, so the above declaration is equivalent to:

    + +
    +cl::opt<string> InputFilename(cl::Positional, cl::Required, cl::desc("<input file>"));
    +
    + +

    By simply adding the cl::Required flag, +the CommandLine library will automatically issue an error if the argument is not +specified, which shifts all of the command line option verification code out of +your application into the library. This is just one example of how using flags +can alter the default behaviour of the library, on a per-option basis. By +adding one of the declarations above, the -help option synopsis is now +extended to:

    + +
    +USAGE: compiler [options] <input file>
    +
    +OPTIONS:
    +  -help             - display available options (-help-hidden for more)
    +  -o <filename>     - Specify output filename
    +
    + +

    ... indicating that an input filename is expected.

    + +
    + + + + +
    + +

    In addition to input and output filenames, we would like the compiler example +to support three boolean flags: "-f" to force writing binary output to +a terminal, "--quiet" to enable quiet mode, and "-q" for +backwards compatibility with some of our users. We can support these by +declaring options of boolean type like this:

    + +
    +cl::opt<bool> Force ("f", cl::desc("Enable binary output on terminals"));
    +cl::opt<bool> Quiet ("quiet", cl::desc("Don't print informational messages"));
    +cl::opt<bool> Quiet2("q", cl::desc("Don't print informational messages"), cl::Hidden);
    +
    + +

    This does what you would expect: it declares three boolean variables +("Force", "Quiet", and "Quiet2") to recognize these +options. Note that the "-q" option is specified with the "cl::Hidden" flag. This modifier prevents it +from being shown by the standard "-help" output (note that it is still +shown in the "-help-hidden" output).

    + +

    The CommandLine library uses a different parser +for different data types. For example, in the string case, the argument passed +to the option is copied literally into the content of the string variable... we +obviously cannot do that in the boolean case, however, so we must use a smarter +parser. In the case of the boolean parser, it allows no options (in which case +it assigns the value of true to the variable), or it allows the values +"true" or "false" to be specified, allowing any of the +following inputs:

    + +
    + compiler -f          # No value, 'Force' == true
    + compiler -f=true     # Value specified, 'Force' == true
    + compiler -f=TRUE     # Value specified, 'Force' == true
    + compiler -f=FALSE    # Value specified, 'Force' == false
    +
    + +

    ... you get the idea. The bool parser just turns +the string values into boolean values, and rejects things like 'compiler +-f=foo'. Similarly, the float, double, and int parsers work +like you would expect, using the 'strtol' and 'strtod' C +library calls to parse the string value into the specified data type.
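    To make the behaviour described above concrete, here is a minimal sketch of what such value parsers might look like. It is not the library's implementation: the helper names parseBoolValue and parseIntValue are hypothetical, and the bool helper accepts only the spellings shown in the examples above (the real parser may accept more).

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper: an empty value (plain "-f") means true; "true"/"TRUE"
// and "false"/"FALSE" are accepted; anything else (e.g. "foo") is rejected.
// Returns true on success and leaves `result` untouched on failure.
bool parseBoolValue(const std::string &arg, bool &result) {
  if (arg.empty() || arg == "true" || arg == "TRUE") {
    result = true;
    return true;
  }
  if (arg == "false" || arg == "FALSE") {
    result = false;
    return true;
  }
  return false;
}

// Hypothetical helper: convert with strtol and reject empty input, trailing
// garbage (e.g. "12abc"), and values that do not fit in an int.
bool parseIntValue(const std::string &arg, int &result) {
  if (arg.empty())
    return false;
  char *end = nullptr;
  errno = 0;
  long value = std::strtol(arg.c_str(), &end, 0);
  if (errno != 0 || *end != '\0' || value < INT_MIN || value > INT_MAX)
    return false;
  result = static_cast<int>(value);
  return true;
}
```

    A floating point helper would follow the same pattern with strtod in place of strtol.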

    + +

    With the declarations above, "compiler -help" emits this:

    + +
    +USAGE: compiler [options] <input file>
    +
    +OPTIONS:
    +  -f            - Enable binary output on terminals
    +  -o <filename> - Specify output filename
    +  -quiet        - Don't print informational messages
    +  -help         - display available options (-help-hidden for more)
    +
    + +

    and "compiler -help-hidden" prints this:

    + +
    +USAGE: compiler [options] <input file>
    +
    +OPTIONS:
    +  -f            - Enable binary output on terminals
    +  -o <filename> - Specify output filename
    +  -q            - Don't print informational messages
    +  -quiet        - Don't print informational messages
    +  -help         - display available options (-help-hidden for more)
    +
    + +

    This brief example has shown you how to use the 'cl::opt' class to parse simple scalar command line +arguments. In addition to simple scalar arguments, the CommandLine library also +provides primitives to support CommandLine option aliases, +and lists of options.

    + +
    + + + + +
    + +

    So far, the example works well, except for the fact that we need to check the +quiet condition like this now:

    + +
    +...
    +  if (!Quiet && !Quiet2) printInformationalMessage(...);
    +...
    +
    + +

    ... which is a real pain! Instead of defining two values for the same +condition, we can use the "cl::alias" class to make the "-q" +option an alias for the "-quiet" option, instead of providing +a value itself:

    + +
    +cl::opt<bool> Force ("f", cl::desc("Enable binary output on terminals"));
    +cl::opt<bool> Quiet ("quiet", cl::desc("Don't print informational messages"));
    +cl::alias     QuietA("q", cl::desc("Alias for -quiet"), cl::aliasopt(Quiet));
    +
    + +

    The third line (which is the only one we modified from above) defines a +"-q" alias that updates the "Quiet" variable (as specified by +the cl::aliasopt modifier) whenever it is +specified. Because aliases do not hold state, the only thing the program has to +query is the Quiet variable now. Another nice feature of aliases is +that they automatically hide themselves from the -help output +(although, again, they are still visible in the -help-hidden +output).

    + +

    Now the application code can simply use:

    + +
    +...
    +  if (!Quiet) printInformationalMessage(...);
    +...
    +
    + +

    ... which is much nicer! The "cl::alias" +can be used to specify an alternative name for any variable type, and has many +uses.
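    The idea that an alias holds no state of its own can be illustrated with a small self-contained sketch. This is not how the library is implemented: BoolOption and OptionRegistry are hypothetical names invented for the illustration, and the only point is that two names can be bound to the same storage.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a boolean option: it owns the parsed value.
struct BoolOption {
  bool value = false;
};

// Hypothetical registry mapping option names to the storage they update.
// An alias is just a second name bound to the same BoolOption.
class OptionRegistry {
  std::map<std::string, BoolOption *> byName;

public:
  void bind(const std::string &name, BoolOption &target) {
    byName[name] = &target;
  }

  // Called when "-name" is seen on the command line.
  // Returns false for an unknown option name.
  bool set(const std::string &name) {
    auto it = byName.find(name);
    if (it == byName.end())
      return false;
    it->second->value = true;
    return true;
  }
};
```

    Binding both "quiet" and "q" to the same BoolOption means the program only ever has to test one variable, which is the property the alias mechanism gives us.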

    + +
    + + + + +
    + +

    So far we have seen how the CommandLine library handles builtin types like +std::string, bool and int, but how does it handle +things it doesn't know about, like enums or 'int*'s?

    + +

    The answer is that it uses a table-driven generic parser (unless you specify +your own parser, as described in the Extension +Guide). This parser maps literal strings to whatever type is required, and +requires you to tell it what this mapping should be.
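    As a rough illustration of what "table-driven" means here, the following self-contained sketch maps literal strings to values of an arbitrary type. It is not the library's generic parser: the class name LiteralTable and its methods are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical table-driven parser: the user supplies (literal, value) pairs,
// and parsing is a lookup of the argument text in that table.
template <typename T>
class LiteralTable {
  struct Entry {
    std::string name;
    T value;
  };
  std::vector<Entry> entries;

public:
  // Register one literal string and the value it stands for.
  void add(const std::string &name, T value) {
    entries.push_back(Entry{name, value});
  }

  // Look up `arg`; on a match store the mapped value and return true.
  // Unknown literals are rejected and `result` is left unchanged.
  bool parse(const std::string &arg, T &result) const {
    for (const Entry &entry : entries) {
      if (entry.name == arg) {
        result = entry.value;
        return true;
      }
    }
    return false;
  }
};
```

    Because the table is the only place that knows the valid spellings, anything not listed is rejected automatically, which is what lets the library catch invalid values for us.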

    + +

    Let's say that we would like to add four optimization levels to our +optimizer, using the standard flags "-g", "-O1", +"-O2", and "-O3". We could easily implement this with boolean +options like above, but there are several problems with this strategy:

    + +
      +
    1. A user could specify more than one of the options at a time, for example, +"compiler -O3 -O2". The CommandLine library would not be able to +catch this erroneous input for us.
    2. We would have to test 4 different variables to see which ones are set.
    3. This doesn't map to the numeric levels that we want... so we cannot easily +see if some level >= "-O1" is enabled.
    + +

    To cope with these problems, we can use an enum value, and have the +CommandLine library fill it in with the appropriate level directly, which is +used like this:

    + +
    +enum OptLevel {
    +  g, O1, O2, O3
    +};
    +
    +cl::opt<OptLevel> OptimizationLevel(cl::desc("Choose optimization level:"),
    +  cl::values(
    +    clEnumVal(g , "No optimizations, enable debugging"),
    +    clEnumVal(O1, "Enable trivial optimizations"),
    +    clEnumVal(O2, "Enable default optimizations"),
    +    clEnumVal(O3, "Enable expensive optimizations"),
    +   clEnumValEnd));
    +
    +...
    +  if (OptimizationLevel >= O2) doPartialRedundancyElimination(...);
    +...
    +
    + +

    This declaration defines a variable "OptimizationLevel" of the +"OptLevel" enum type. This variable can be assigned any of the values +that are listed in the declaration (Note that the declaration list must be +terminated with the "clEnumValEnd" argument!). The CommandLine +library enforces +that the user can only specify one of the options, and it ensures that only valid +enum values can be specified. The "clEnumVal" macros ensure that the +command line arguments match the enum values. With this option added, our +help output now is:

    + +
    +USAGE: compiler [options] <input file>
    +
    +OPTIONS:
    +  Choose optimization level:
    +    -g          - No optimizations, enable debugging
    +    -O1         - Enable trivial optimizations
    +    -O2         - Enable default optimizations
    +    -O3         - Enable expensive optimizations
    +  -f            - Enable binary output on terminals
    +  -help         - display available options (-help-hidden for more)
    +  -o <filename> - Specify output filename
    +  -quiet        - Don't print informational messages
    +
    + +

    In this case, it is sort of awkward that flag names correspond directly to +enum names, because we probably don't want an enum definition named "g" +in our program. Because of this, we can alternatively write this example like +this:

    + +
    +enum OptLevel {
    +  Debug, O1, O2, O3
    +};
    +
    +cl::opt<OptLevel> OptimizationLevel(cl::desc("Choose optimization level:"),
    +  cl::values(
    +   clEnumValN(Debug, "g", "No optimizations, enable debugging"),
    +    clEnumVal(O1        , "Enable trivial optimizations"),
    +    clEnumVal(O2        , "Enable default optimizations"),
    +    clEnumVal(O3        , "Enable expensive optimizations"),
    +   clEnumValEnd));
    +
    +...
    +  if (OptimizationLevel == Debug) outputDebugInfo(...);
    +...
    +
    + +

    By using the "clEnumValN" macro instead of "clEnumVal", we +can directly specify the name that the flag should get. In general a direct +mapping is nice, but sometimes you can't or don't want to preserve the mapping, +which is when you would use it.
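    The table-driven mapping described above can be sketched without the library. The following is a minimal standalone illustration, not the CommandLine implementation: EnumEntry, OptLevelTable and parseOptLevel are hypothetical names, and the rule that a second level is rejected is an assumption that mirrors the "only one of the options" behavior described earlier.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum OptLevel { Debug, O1, O2, O3 };

// One row of the hypothetical mapping table: flag name, enum value, help text.
struct EnumEntry {
  const char *Name;
  OptLevel Value;
  const char *Help;
};

static const EnumEntry OptLevelTable[] = {
    {"g", Debug, "No optimizations, enable debugging"},
    {"O1", O1, "Enable trivial optimizations"},
    {"O2", O2, "Enable default optimizations"},
    {"O3", O3, "Enable expensive optimizations"},
};

// Maps each "-<name>" argument to its enum value through the table.
// Returns false for an unknown flag or when more than one level is given.
bool parseOptLevel(const std::vector<std::string> &Args, OptLevel &Level) {
  bool Seen = false;
  for (const std::string &Arg : Args) {
    const EnumEntry *Match = nullptr;
    for (const EnumEntry &E : OptLevelTable) {
      assert(E.Name && "every table row needs a flag name");
      if (Arg == std::string("-") + E.Name)
        Match = &E;
    }
    if (!Match || Seen)
      return false; // unknown flag, or a second level was requested
    Level = Match->Value;
    Seen = true;
  }
  return true;
}
```

    Note that the flag name ("g") and the enum name (Debug) are decoupled in the table, which is the same separation that clEnumValN provides.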

    + +
    + + + + +
    + +

    Another useful argument form is a named alternative style. We shall use this +style in our compiler to specify different debug levels that can be used. +Instead of each debug level being its own switch, we want to support the +following options, of which only one can be specified at a time: +"-debug_level=none", "-debug_level=quick", +"-debug_level=detailed". To do this, we use the exact same format as +our optimization level flags, but we also specify an option name. For this +case, the code looks like this:

    + +
    +enum DebugLev {
    +  nodebuginfo, quick, detailed
    +};
    +
    +// Enable Debug Options to be specified on the command line
    +cl::opt<DebugLev> DebugLevel("debug_level", cl::desc("Set the debugging level:"),
    +  cl::values(
    +    clEnumValN(nodebuginfo, "none", "disable debug information"),
    +     clEnumVal(quick,               "enable quick debug information"),
    +     clEnumVal(detailed,            "enable detailed debug information"),
    +    clEnumValEnd));
    +
    + +

    This definition defines an enumerated command line variable of type "enum +DebugLev", which works exactly the same way as before. The difference here +is just the interface exposed to the user of your program and the help output by +the "-help" option:

    + +
    +USAGE: compiler [options] <input file>
    +
    +OPTIONS:
    +  Choose optimization level:
    +    -g          - No optimizations, enable debugging
    +    -O1         - Enable trivial optimizations
    +    -O2         - Enable default optimizations
    +    -O3         - Enable expensive optimizations
    +  -debug_level  - Set the debugging level:
    +    =none       - disable debug information
    +    =quick      - enable quick debug information
    +    =detailed   - enable detailed debug information
    +  -f            - Enable binary output on terminals
    +  -help         - display available options (-help-hidden for more)
    +  -o <filename> - Specify output filename
    +  -quiet        - Don't print informational messages
    +
    + +

    Again, the only structural difference between the debug level declaration and +the optimization level declaration is that the debug level declaration includes +an option name ("debug_level"), which automatically changes how the +library processes the argument. The CommandLine library supports both forms so +that you can choose the form most appropriate for your application.

    + +
    + + + + +
    + +

    Now that we have the standard run-of-the-mill argument types out of the way, +let's get a little wild and crazy. Let's say that we want our optimizer to accept +a list of optimizations to perform, allowing duplicates. For example, we +might want to run: "compiler -dce -constprop -inline -dce -strip". In +this case, the order of the arguments and the number of appearances is very +important. This is what the "cl::list" +template is for. First, start by defining an enum of the optimizations that you +would like to perform:

    + +
    +enum Opts {
    +  // 'inline' is a C++ keyword, so name it 'inlining'
    +  dce, constprop, inlining, strip
    +};
    +
    + +

    Then define your "cl::list" variable:

    + +
    +cl::list<Opts> OptimizationList(cl::desc("Available Optimizations:"),
    +  cl::values(
    +    clEnumVal(dce               , "Dead Code Elimination"),
    +    clEnumVal(constprop         , "Constant Propagation"),
    +   clEnumValN(inlining, "inline", "Procedure Integration"),
    +    clEnumVal(strip             , "Strip Symbols"),
    +  clEnumValEnd));
    +
    + +

    This defines a variable that is conceptually of the type +"std::vector<enum Opts>". Thus, you can access it with standard +vector methods:

    + +
    +  for (unsigned i = 0; i != OptimizationList.size(); ++i)
    +    switch (OptimizationList[i])
    +       ...
    +
    + +

    ... to iterate through the list of options specified.

    + +

    Note that the "cl::list" template is +completely general and may be used with any data types or other arguments that +you can use with the "cl::opt" template. One +especially useful way to use a list is to capture all of the positional +arguments together if there may be more than one specified. In the case of a +linker, for example, the linker takes several '.o' files, and needs to +capture them into a list. This is naturally specified as:

    + +
    +...
    +cl::list<std::string> InputFilenames(cl::Positional, cl::desc("<Input files>"), cl::OneOrMore);
    +...
    +
    + +

    This variable works just like a "vector<string>" object. As +such, accessing the list is simple, just like above. In this example, we used +the cl::OneOrMore modifier to inform the +CommandLine library that it is an error if the user does not specify any +.o files on our command line. Again, this just reduces the amount of +checking we have to do.

    + +
    + + + + +
    + +

    Instead of collecting sets of options in a list, it is also possible to +gather information for enum values in a bit vector. The representation used by +the cl::bits class is an unsigned +integer. An enum value is represented by a 0/1 in the enum's ordinal value bit +position: 1 indicates that the enum was specified, 0 otherwise. As each +specified value is parsed, the resulting enum's bit is set in the option's bit +vector:

    + +
    +  bits |= 1 << (unsigned)enum;
    +
    + +

    Options that are specified multiple times are redundant. Any instances after +the first are discarded.
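    To make the representation concrete, here is a small standalone sketch of the bit-vector bookkeeping. It does not use the library: OptBits and addValue are hypothetical names, while isSet and getBits only borrow the names of the cl::bits methods described in this section.

```cpp
#include <cassert>

enum Opts { dce, constprop, inlining, strip };

// Hypothetical stand-in for the storage behind a cl::bits<Opts> option.
class OptBits {
  unsigned Bits = 0;

public:
  // Sets the bit at the enum's ordinal position; repeats change nothing.
  void addValue(Opts V) {
    assert(static_cast<unsigned>(V) < sizeof(unsigned) * 8 &&
           "enum ordinal must fit in the bit vector");
    Bits |= 1u << static_cast<unsigned>(V);
  }

  bool isSet(Opts V) const {
    return (Bits & (1u << static_cast<unsigned>(V))) != 0;
  }

  unsigned getBits() const { return Bits; }
};
```

    Adding dce twice and strip once leaves bits 0 and 3 set, which is why a repeated option is redundant in this representation.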

    + +

    Reworking the above list example, we could replace +cl::list with cl::bits:

    + +
    +cl::bits<Opts> OptimizationBits(cl::desc("Available Optimizations:"),
    +  cl::values(
    +    clEnumVal(dce               , "Dead Code Elimination"),
    +    clEnumVal(constprop         , "Constant Propagation"),
    +   clEnumValN(inlining, "inline", "Procedure Integration"),
    +    clEnumVal(strip             , "Strip Symbols"),
    +  clEnumValEnd));
    +
    + +

    To test to see if constprop was specified, we can use the +cl::bits::isSet function:

    + +
    +  if (OptimizationBits.isSet(constprop)) {
    +    ...
    +  }
    +
    + +

    It's also possible to get the raw bit vector using the +cl::bits::getBits function:

    + +
    +  unsigned bits = OptimizationBits.getBits();
    +
    + +

    Finally, if external storage is used, then the location specified must be of +type unsigned. In all other ways a cl::bits option is equivalent to a cl::list option.

    + +
    + + + + + +
    + +

    As our program grows and becomes more mature, we may decide to put summary +information about what it does into the help output. The help output is styled +to look similar to a Unix man page, providing concise information about +a program. Unix man pages, however, often have a description about what +the program does. To add this to your CommandLine program, simply pass a third +argument to the cl::ParseCommandLineOptions +call in main. This additional argument is then printed as the overview +information for your program, allowing you to include any additional information +that you want. For example:

    + +
    +int main(int argc, char **argv) {
    +  cl::ParseCommandLineOptions(argc, argv, " CommandLine compiler example\n\n"
    +                              "  This program blah blah blah...\n");
    +  ...
    +}
    +
    + +

    would yield the help output:

    + +
    +OVERVIEW: CommandLine compiler example
    +
    +  This program blah blah blah...
    +
    +USAGE: compiler [options] <input file>
    +
    +OPTIONS:
    +  ...
    +  -help             - display available options (-help-hidden for more)
    +  -o <filename>     - Specify output filename
    +
    + +
    + + + + + + +
    + +

    Now that you know the basics of how to use the CommandLine library, this +section will give you the detailed information you need to tune how command line +options work, as well as information on more "advanced" command line option +processing capabilities.

    + +
    + + + + +
    + +

    Positional arguments are those arguments that are not named, and are not +specified with a hyphen. Positional arguments should be used when an option is +specified by its position alone. For example, the standard Unix grep +tool takes a regular expression argument, and an optional filename to search +through (which defaults to standard input if a filename is not specified). +Using the CommandLine library, this would be specified as:

    + +
    +cl::opt<string> Regex   (cl::Positional, cl::desc("<regular expression>"), cl::Required);
    +cl::opt<string> Filename(cl::Positional, cl::desc("<input file>"), cl::init("-"));
    +
    + +

    Given these two option declarations, the -help output for our grep +replacement would look like this:

    + +
    +USAGE: spiffygrep [options] <regular expression> <input file>
    +
    +OPTIONS:
    +  -help - display available options (-help-hidden for more)
    +
    + +

    ... and the resultant program could be used just like the standard +grep tool.

    + +

    Positional arguments are sorted by their order of construction. This means +that command line options will be ordered according to how they are listed in a +.cpp file, but will not have an ordering defined if the positional arguments +are defined in multiple .cpp files. The fix for this problem is simply to +define all of your positional arguments in one .cpp file.

    + +
    + + + + + +
    + +

    Sometimes you may want to specify a value to your positional argument that +starts with a hyphen (for example, searching for '-foo' in a file). At +first, you will have trouble doing this, because it will try to find an argument +named '-foo', and will fail (and single quotes will not save you). +Note that the system grep has the same problem:

    + +
    +  $ spiffygrep '-foo' test.txt
    +  Unknown command line argument '-foo'.  Try: 'spiffygrep -help'
    +
    +  $ grep '-foo' test.txt
    +  grep: illegal option -- f
    +  grep: illegal option -- o
    +  grep: illegal option -- o
    +  Usage: grep -hblcnsviw pattern file . . .
    +
    + +

    The solution for this problem is the same for both your tool and the system +version: use the '--' marker. When the user specifies '--' on +the command line, it is telling the program that all options after the +'--' should be treated as positional arguments, not options. Thus, we +can use it like this:

    + +
    +  $ spiffygrep -- -foo test.txt
    +    ...output...
    +
    + +
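    The effect of the '--' marker can be sketched in a few lines. This is a standalone illustration under simplifying assumptions (every argument that starts with a dash before the marker is an option, and a lone "-" counts as positional); SplitArgs and splitAtDoubleDash are hypothetical names, not library API.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct SplitArgs {
  std::vector<std::string> Options;     // arguments that look like "-name"
  std::vector<std::string> Positionals; // everything else
};

// Classifies arguments, treating everything after "--" as positional.
SplitArgs splitAtDoubleDash(const std::vector<std::string> &Args) {
  SplitArgs Result;
  bool OptionsEnded = false;
  for (const std::string &Arg : Args) {
    if (!OptionsEnded && Arg == "--") {
      OptionsEnded = true; // the marker itself is consumed
      continue;
    }
    if (!OptionsEnded && Arg.size() > 1 && Arg[0] == '-')
      Result.Options.push_back(Arg);
    else
      Result.Positionals.push_back(Arg);
  }
  assert(Result.Options.size() + Result.Positionals.size() <= Args.size());
  return Result;
}
```

    With the arguments "-v -- -foo test.txt", only "-v" ends up in Options; "-foo" and "test.txt" are both positional.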
    + + + +
    +

    Sometimes an option can affect or modify the meaning of another option. For + example, consider gcc's -x LANG option. This tells + gcc to ignore the suffix of subsequent positional arguments and force + the file to be interpreted as if it contained source code in language + LANG. In order to handle this properly, you need to know the + absolute position of each argument, especially those in lists, so their + interaction(s) can be applied correctly. This is also useful for options like + -llibname which is actually a positional argument that starts with + a dash.

    +

    So, generally, the problem is that you have two cl::list variables + that interact in some way. To ensure the correct interaction, you can use the + cl::list::getPosition(optnum) method. This method returns the + absolute position (as found on the command line) of the optnum + item in the cl::list.

    +

    The idiom for usage is like this:

    + +
    +  static cl::list<std::string> Files(cl::Positional, cl::OneOrMore);
    +  static cl::list<std::string> Libraries("l", cl::ZeroOrMore);
    +
    +  int main(int argc, char**argv) {
    +    // ...
    +    std::vector<std::string>::iterator fileIt = Files.begin();
    +    std::vector<std::string>::iterator libIt  = Libraries.begin();
    +    unsigned libPos = 0, filePos = 0;
    +    while ( 1 ) {
    +      if ( libIt != Libraries.end() )
    +        libPos = Libraries.getPosition( libIt - Libraries.begin() );
    +      else
    +        libPos = 0;
    +      if ( fileIt != Files.end() )
    +        filePos = Files.getPosition( fileIt - Files.begin() );
    +      else
    +        filePos = 0;
    +
    +      if ( filePos != 0 && (libPos == 0 || filePos < libPos) ) {
    +        // Source File Is next
    +        ++fileIt;
    +      }
    +      else if ( libPos != 0 && (filePos == 0 || libPos < filePos) ) {
    +        // Library is next
    +        ++libIt;
    +      }
    +      else
    +        break; // we're done with the list
    +    }
    +  }
    + +

    Note that, for compatibility reasons, the cl::opt also supports an + unsigned getPosition() method that will provide the absolute position + of that option. You can apply the same approach as above with a + cl::opt and a cl::list option as you can with two lists.
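    The position-based interleaving can also be written as a plain merge. The sketch below is standalone and assumes the positions have already been collected (for instance via getPosition) into (position, value) pairs, each list sorted by position; PositionedArg and mergeByPosition are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// (absolute position on the command line, value)
typedef std::pair<unsigned, std::string> PositionedArg;

// Interleaves files and libraries in the order they appeared on the command
// line. Libraries are rendered with a "-l" prefix to tell them apart.
std::vector<std::string>
mergeByPosition(const std::vector<PositionedArg> &Files,
                const std::vector<PositionedArg> &Libs) {
  std::vector<std::string> Ordered;
  std::size_t F = 0, L = 0;
  while (F != Files.size() || L != Libs.size()) {
    bool TakeFile = L == Libs.size() ||
                    (F != Files.size() && Files[F].first < Libs[L].first);
    if (TakeFile)
      Ordered.push_back(Files[F++].second);
    else
      Ordered.push_back("-l" + Libs[L++].second);
  }
  assert(Ordered.size() == Files.size() + Libs.size());
  return Ordered;
}
```

    For a command line such as "a.o -lm b.o -lc", the merged result preserves that order, which is what a linker-like tool needs.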

    +
    + + + + +
    + +

    The cl::ConsumeAfter formatting option is +used to construct programs that use "interpreter style" option processing. With +this style of option processing, all arguments specified after the last +positional argument are treated as special interpreter arguments that are not +interpreted by the command line argument processor.

    + +

    As a concrete example, let's say we are developing a replacement for the +standard Unix Bourne shell (/bin/sh). To run /bin/sh, first +you specify options to the shell itself (like -x which turns on trace +output), then you specify the name of the script to run, then you specify +arguments to the script. These arguments to the script are parsed by the Bourne +shell command line option processor, but are not interpreted as options to the +shell itself. Using the CommandLine library, we would specify this as:

    + +
    +cl::opt<string> Script(cl::Positional, cl::desc("<input script>"), cl::init("-"));
    +cl::list<string>  Argv(cl::ConsumeAfter, cl::desc("<program arguments>..."));
    +cl::opt<bool>    Trace("x", cl::desc("Enable trace output"));
    +
    + +

    which automatically provides the help output:

    + +
    +USAGE: spiffysh [options] <input script> <program arguments>...
    +
    +OPTIONS:
    +  -help - display available options (-help-hidden for more)
    +  -x    - Enable trace output
    +
    + +

    At runtime, if we run our new shell replacement as `spiffysh -x test.sh +-a -x -y bar', the Trace variable will be set to true, the +Script variable will be set to "test.sh", and the +Argv list will contain ["-a", "-x", "-y", "bar"], because they +were specified after the last positional argument (which is the script +name).

    + +

    There are several limitations to when cl::ConsumeAfter options can +be specified. For example, only one cl::ConsumeAfter can be specified +per program, there must be at least one positional +argument specified, there must not be any cl::list +positional arguments, and the cl::ConsumeAfter option should be a cl::list option.
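    The "interpreter style" split can be illustrated with a standalone sketch. It hard-codes the spiffysh example (one boolean option "-x", one script name, then pass-through arguments) and is not how the library implements cl::ConsumeAfter; ShellArgs and parseShellArgs are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ShellArgs {
  bool Trace = false;
  std::string Script = "-"; // same default as cl::init("-") in the example
  std::vector<std::string> Argv;
};

// Returns false if an unknown option appears before the script name.
bool parseShellArgs(const std::vector<std::string> &Args, ShellArgs &Out) {
  std::size_t I = 0;
  // Options for the shell itself come before the script name.
  for (; I != Args.size() && Args[I].size() > 1 && Args[I][0] == '-'; ++I) {
    if (Args[I] != "-x")
      return false;
    Out.Trace = true;
  }
  // The first positional argument is the script.
  if (I != Args.size())
    Out.Script = Args[I++];
  // Everything after the script name is passed through untouched.
  assert(I <= Args.size());
  Out.Argv.assign(Args.begin() + I, Args.end());
  return true;
}
```

    The second "-x" in "spiffysh -x test.sh -a -x -y bar" lands in Argv rather than toggling Trace, because it follows the script name.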

    + +
    + + + + +
    + +

    By default, all command line options automatically hold the value that they +parse from the command line. This is very convenient in the common case, +especially when combined with the ability to define command line options in the +files that use them. This is called the internal storage model.

    + +

    Sometimes, however, it is nice to separate the command line option processing +code from the storage of the value parsed. For example, let's say that we have a +'-debug' option that we would like to use to enable debug information +across the entire body of our program. In this case, the boolean value +controlling the debug code should be globally accessible (in a header file, for +example) yet the command line option processing code should not be exposed to +all of these clients (requiring lots of .cpp files to #include +CommandLine.h).

    + +

    To do this, set up your .h file with your option, like this for example:

    + +
    +
    +// DebugFlag.h - Get access to the '-debug' command line option
    +//
    +
    +// DebugFlag - This boolean is set to true if the '-debug' command line option
    +// is specified.  This should probably not be referenced directly, instead, use
    +// the DEBUG macro below.
    +//
    +extern bool DebugFlag;
    +
    +// DEBUG macro - This macro should be used by code to emit debug information.
    +// If the '-debug' option is specified on the command line, and if this is a
    +// debug build, then the code specified as the option to the macro will be
    +// executed.  Otherwise it will not be.
    +#ifdef NDEBUG
    +#define DEBUG(X)
    +#else
    +#define DEBUG(X) do { if (DebugFlag) { X; } } while (0)
    +#endif
    +
    +
    + +

    This allows clients to blissfully use the DEBUG() macro, or the +DebugFlag explicitly if they want to. Now we just need to be able to +set the DebugFlag boolean when the option is set. To do this, we pass +an additional argument to our command line argument processor, and we specify +where to fill in with the cl::location +attribute:

    + +
    +
    +bool DebugFlag;                  // the actual value
    +static cl::opt<bool, true>       // The parser
    +Debug("debug", cl::desc("Enable debug output"), cl::Hidden, cl::location(DebugFlag));
    +
    +
    + +

    In the above example, we specify "true" as the second argument to +the cl::opt template, indicating that the +template should not maintain a copy of the value itself. In addition to this, +we specify the cl::location attribute, so +that DebugFlag is automatically set.
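    The idea behind external storage can be shown with a tiny standalone model: the option object keeps only a pointer to a variable owned elsewhere and writes through it when it matches. ExternalBoolOpt and handle are hypothetical names for illustration; this is not the cl::opt implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of an option that writes to caller-owned storage.
class ExternalBoolOpt {
  std::string Name;
  bool *Location; // points at the externally owned flag

public:
  ExternalBoolOpt(const std::string &N, bool &Storage)
      : Name(N), Location(&Storage) {}

  // Returns true if Arg names this option; the external flag is then set.
  bool handle(const std::string &Arg) {
    assert(Location && "external storage must be bound");
    if (Arg != "-" + Name)
      return false;
    *Location = true;
    return true;
  }
};
```

    Client code only needs to see the plain bool, so it does not have to include the option-processing header at all.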

    + +
    + + + + +
    + +

    This section describes the basic attributes that you can specify on +options.

    + +
      + +
    • The option name attribute (which is required for all options, except positional options) specifies what the option name is. +This option is specified in simple double quotes: + +
      +cl::opt<bool> Quiet("quiet");
      +
      + +
    • + +
    • The cl::desc attribute specifies a +description for the option to be shown in the -help output for the +program.
    • + +
    • The cl::value_desc attribute +specifies a string that can be used to fine tune the -help output for +a command line option. Look here for an +example.
    • + +
    • The cl::init attribute specifies an +initial value for a scalar option. If this attribute is +not specified then the command line option value defaults to the value created +by the default constructor for the type. Warning: If you specify both +cl::init and cl::location for an option, +you must specify cl::location first, so that when the +command-line parser sees cl::init, it knows where to put the +initial value. (You will get an error at runtime if you don't put them in +the right order.)
    • + +
    • The cl::location attribute specifies where +to store the value for a parsed command line option if using external storage. +See the section on Internal vs External Storage for more +information.
    • + +
    • The cl::aliasopt attribute +specifies which option a cl::alias option is +an alias for.
    • + +
    • The cl::values attribute specifies +the string-to-value mapping to be used by the generic parser. It takes a +clEnumValEnd terminated list of (option, value, description) triplets +that +specify the option name, the value mapped to, and the description shown in the +-help for the tool. Because the generic parser is used most +frequently with enum values, two macros are often useful: + +
        + +
      1. The clEnumVal macro is used as a +nice simple way to specify a triplet for an enum. This macro automatically +makes the option name be the same as the enum name. The first option to the +macro is the enum, the second is the description for the command line +option.
      2. The clEnumValN macro is used to +specify options where the option name doesn't equal the enum name. For +this macro, the first argument is the enum value, the second is the flag name, +and the third is the description.
      + +You will get a compile time error if you try to use cl::values with a parser +that does not support it.
    • + +
    • The cl::multi_val +attribute specifies that this option has multiple values +(example: -sectalign segname sectname sectvalue). This +attribute takes one unsigned argument - the number of values for the +option. This attribute is valid only on cl::list options (and +will fail with a compile error if you try to use it with other option +types). All of the usual modifiers may be used on +multi-valued options (except cl::ValueDisallowed, +obviously).
    • + +
    + +
    + + + + +
    + +

    Option modifiers are the flags and expressions that you pass into the +constructors for cl::opt and cl::list. These modifiers give you the ability to +tweak how options are parsed and how -help output is generated to fit +your application well.

    + +

    These options fall into five main categories:

    + +
      +
    1. Hiding an option from -help output
    2. Controlling the number of occurrences + required and allowed
    3. Controlling whether or not a value must be + specified
    4. Controlling other formatting options
    5. Miscellaneous option modifiers
    + +

    It is not possible to specify two options from the same category (you'll get +a runtime error) to a single option, except for options in the miscellaneous +category. The CommandLine library specifies defaults for all of these settings +that are the most useful in practice and the most common, which means that you +usually shouldn't have to worry about these.

    + +
    + + + + +
    + +

    The cl::NotHidden, cl::Hidden, and +cl::ReallyHidden modifiers are used to control whether or not an option +appears in the -help and -help-hidden output for the +compiled program:

    + +
      + +
    • The cl::NotHidden modifier +(which is the default for cl::opt and cl::list options) indicates the option is to appear +in both help listings.
    • + +
    • The cl::Hidden modifier (which is the +default for cl::alias options) indicates that +the option should not appear in the -help output, but should appear in +the -help-hidden output.
    • + +
    • The cl::ReallyHidden modifier +indicates that the option should not appear in any help output.
    • + +
    + +
    + + + + +
    + +

    This group of options is used to control how many times an option is allowed +(or required) to be specified on the command line of your program. Specifying a +value for this setting allows the CommandLine library to do error checking for +you.

    + +

    The allowed values for this option group are:

    + +
      + +
    • The cl::Optional modifier (which +is the default for the cl::opt and cl::alias classes) indicates that your program will +allow either zero or one occurrence of the option to be specified.
    • + +
    • The cl::ZeroOrMore modifier +(which is the default for the cl::list class) +indicates that your program will allow the option to be specified zero or more +times.
    • + +
    • The cl::Required modifier +indicates that the specified option must be specified exactly one time.
    • + +
    • The cl::OneOrMore modifier +indicates that the option must be specified at least one time.
    • + +
    • The cl::ConsumeAfter modifier is described in the Positional arguments section.
    • + +
    + +

    If an option is not specified, then the value of the option is equal to the +value specified by the cl::init attribute. If +the cl::init attribute is not specified, the +option value is initialized with the default constructor for the data type.

    + +

    If an option is specified multiple times for an option of the cl::opt class, only the last value will be +retained.
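    The four counting rules listed above reduce to a small predicate. The sketch below is standalone and only encodes the rules as this section states them; the enum and occurrencesOk are hypothetical names that happen to reuse the modifier names, and cl::ConsumeAfter is left out because it is not a counting rule.

```cpp
#include <cassert>

enum NumOccurrences { Optional, ZeroOrMore, Required, OneOrMore };

// Returns true if seeing the option Count times satisfies Mode.
bool occurrencesOk(NumOccurrences Mode, unsigned Count) {
  switch (Mode) {
  case Optional:
    return Count <= 1; // zero or one occurrence
  case ZeroOrMore:
    return true;       // any number of occurrences
  case Required:
    return Count == 1; // exactly one occurrence
  case OneOrMore:
    return Count >= 1; // at least one occurrence
  }
  assert(false && "unknown occurrence mode");
  return false;
}
```

    A parser would typically count occurrences while scanning the command line and run a check like this once per option at the end.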

    + +
    + + + + +
    + +

    This group of options is used to control whether or not the option allows a +value to be present. In the case of the CommandLine library, a value is either +specified with an equal sign (e.g. '-index-depth=17') or as a trailing +string (e.g. '-o a.out').

    + +

    The allowed values for this option group are:

    + +
      + +
    • The cl::ValueOptional modifier +(which is the default for bool typed options) specifies that it is +acceptable to have a value, or not. A boolean argument can be enabled just by +appearing on the command line, or it can have an explicit '-foo=true'. +If an option is specified with this mode, it is illegal for the value to be +provided without the equal sign. Therefore '-foo true' is illegal. To +get this behavior, you must use the cl::ValueRequired modifier.
    • + +
    • The cl::ValueRequired modifier +(which is the default for all other types except for unnamed alternatives using the generic parser) +specifies that a value must be provided. This mode informs the command line +library that if an option is not provided with an equal sign, then the next +argument provided must be the value. This allows things like '-o +a.out' to work.
    • + +
    • The cl::ValueDisallowed +modifier (which is the default for unnamed +alternatives using the generic parser) indicates that it is a runtime error +for the user to specify a value. This can be provided to disallow users from +providing options to boolean options (like '-foo=true').
    • + +
    + +

    In general, the default values for this option group work just like you would +want them to. As mentioned above, you can specify the cl::ValueDisallowed modifier to a boolean +argument to restrict your command line parser. These options are mostly useful +when extending the library.
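    The interplay between the equal-sign form and the three value modes can be sketched as follows. This is a standalone illustration of the rules described in this section, not library code; NameValue, splitOption, takesNextArgument and isValueError are hypothetical names, and the enum merely reuses the modifier names.

```cpp
#include <cassert>
#include <string>

enum ValueExpected { ValueOptional, ValueRequired, ValueDisallowed };

struct NameValue {
  std::string Name;
  std::string Value;
  bool HasValue; // true when the argument contained '='
};

// Splits "-name=value" (or "-name") into its parts.
NameValue splitOption(const std::string &Arg) {
  assert(!Arg.empty() && Arg[0] == '-' &&
         "caller passes only option-like arguments");
  std::string Body = Arg.substr(1);
  std::string::size_type Eq = Body.find('=');
  NameValue R;
  R.HasValue = Eq != std::string::npos;
  R.Name = Body.substr(0, Eq); // npos means "to the end of the string"
  if (R.HasValue)
    R.Value = Body.substr(Eq + 1);
  return R;
}

// True when the value has to come from the following argument ('-o a.out').
bool takesNextArgument(ValueExpected Mode, const NameValue &Opt) {
  return Mode == ValueRequired && !Opt.HasValue;
}

// True when an attached '=value' must be rejected.
bool isValueError(ValueExpected Mode, const NameValue &Opt) {
  return Mode == ValueDisallowed && Opt.HasValue;
}
```

    Under ValueOptional the next argument is never consumed, which is why '-foo true' does not work for a boolean option in that mode.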

    + +
    + + + + +
    + +

    The formatting option group is used to specify that the command line option +has special abilities and is otherwise different from other command line +arguments. As usual, you can only specify one of these arguments at most.

    + +
      + +
    • The cl::NormalFormatting +modifier (which is the default for all options) specifies that this option is +"normal".
    • + +
    • The cl::Positional modifier +specifies that this is a positional argument that does not have a command line +option associated with it. See the Positional +Arguments section for more information.
    • + +
    • The cl::ConsumeAfter modifier +specifies that this option is used to capture "interpreter style" arguments. See this section for more information.
    • + +
    • The cl::Prefix modifier specifies +that this option prefixes its value. With 'Prefix' options, the equal sign does +not separate the value from the option name specified. Instead, the value is +everything after the prefix, including any equal sign if present. This is useful +for processing odd arguments like -lmalloc and -L/usr/lib in a +linker tool or -DNAME=value in a compiler tool. Here, the +'l', 'D' and 'L' options are normal string (or list) +options, that have the cl::Prefix +modifier added to allow the CommandLine library to recognize them. Note that +cl::Prefix options must not have the +cl::ValueDisallowed modifier +specified.
    • + +
    • The cl::Grouping modifier is used +to implement Unix-style tools (like ls) that have lots of single letter +arguments, but only require a single dash. For example, the 'ls -labF' +command actually enables four different options, all of which are single +letters. Note that cl::Grouping +options cannot have values.
    • + +
    + +

    The CommandLine library does not restrict how you use the cl::Prefix or cl::Grouping modifiers, so it is possible to +specify ambiguous argument settings. In particular, it is possible to have multiple-letter +options that are prefix or grouping options, and they will still work as +designed.

    + +

    To do this, the CommandLine library uses a greedy algorithm to parse the +input option into (potentially multiple) prefix and grouping options. The +strategy basically looks like this:

    + +
    parse(string OrigInput) { + +
      +
    1. string input = OrigInput; +
    2. if (isOption(input)) return getOption(input).parse();    // Normal option +
    3. while (!isOption(input) && !input.empty()) input.pop_back();    // Remove the last letter +
    4. if (input.empty()) return error();    // No matching option +
    5. if (getOption(input).isPrefix())
      +  return getOption(input).parse(input);
      +
    6. while (!input.empty()) {    // Must be grouping options
      +  getOption(input).parse();
      +  OrigInput.erase(OrigInput.begin(), OrigInput.begin()+input.length());
      +  input = OrigInput;
      +  while (!isOption(input) && !input.empty()) input.pop_back();
      +}
      +
    7. if (!OrigInput.empty()) error();
    + +

    }
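
    The strategy above can be sketched as a small standalone function. This is only an illustration under stated assumptions: the Kind enum, the splitArgument name and the (name, value) result pairs are hypothetical and are not part of the CommandLine library, and the leading dash is assumed to have been removed already.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a registered option; not the CommandLine API.
enum class Kind { Normal, Prefix, Grouping };

// Splits one argument into the options it names, following the greedy
// strategy described above. Each match is stored as a (name, value) pair.
// Returns false on error.
bool splitArgument(const std::map<std::string, Kind> &Options,
                   std::string OrigInput,
                   std::vector<std::pair<std::string, std::string>> &Matched) {
  auto isOption = [&](const std::string &S) { return Options.count(S) != 0; };

  std::string Input = OrigInput;
  if (isOption(Input)) {              // Normal option: exact match
    Matched.emplace_back(Input, "");
    return true;
  }
  // Drop trailing letters until the longest known option name remains.
  while (!isOption(Input) && !Input.empty()) Input.pop_back();
  if (Input.empty()) return false;    // No matching option

  if (Options.at(Input) == Kind::Prefix) {
    // Everything after the prefix is the value.
    Matched.emplace_back(Input, OrigInput.substr(Input.size()));
    return true;
  }

  // Otherwise treat the argument as a run of grouping options.
  while (!Input.empty()) {
    Matched.emplace_back(Input, "");
    OrigInput.erase(0, Input.size());
    Input = OrigInput;
    while (!isOption(Input) && !Input.empty()) Input.pop_back();
  }
  return OrigInput.empty();           // Leftover letters mean an error
}
```

    With 'l' registered as a prefix option, "lmalloc" yields the single pair ("l", "malloc"); with 'a', 'b' and 'F' registered as grouping options, "abF" yields three separate matches.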

    +
    + +
    + + + + +
    + +

    The miscellaneous option modifiers are the only flags where you can specify +more than one flag from the set: they are not mutually exclusive. These flags +specify boolean properties that modify the option.

    + +
      + +
    • The cl::CommaSeparated modifier +indicates that any commas specified for an option's value should be used to +split the value up into multiple values for the option. For example, these two +options are equivalent when cl::CommaSeparated is specified: +"-foo=a -foo=b -foo=c" and "-foo=a,b,c". This option only +makes sense to be used in a case where the option is allowed to accept one or +more values (i.e. it is a cl::list option).
    • + +
    • The cl::PositionalEatsArgs modifier (which only applies to positional arguments, and only makes sense for lists) indicates that the positional argument should consume any strings after it (including strings that start with a "-") up until another recognized positional argument. For example, if you have two "eating" positional arguments, "pos1" and "pos2", the string "-pos1 -foo -bar baz -pos2 -bork" would cause the "-foo -bar baz" strings to be applied to the "-pos1" option and the "-bork" string to be applied to the "-pos2" option.
    • + +
    • The cl::Sink modifier is +used to handle unknown options. If there is at least one option with +cl::Sink modifier specified, the parser passes +unrecognized option strings to it as values instead of signaling an +error. As with cl::CommaSeparated, this modifier +only makes sense with a cl::list option.
    • + +
    + +

    So far, these are the only three miscellaneous option modifiers.
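
    To picture what cl::CommaSeparated does to a single value, here is a minimal sketch of the splitting step. The splitAtCommas helper is hypothetical and is not how the library implements the modifier; it only shows why "-foo=a,b,c" ends up equivalent to "-foo=a -foo=b -foo=c".

```cpp
#include <string>
#include <vector>

// Illustrative helper (not part of the CommandLine library): split one
// option value at commas, so that "a,b,c" becomes three separate values.
std::vector<std::string> splitAtCommas(const std::string &Value) {
  std::vector<std::string> Parts;
  std::string::size_type Start = 0;
  while (true) {
    std::string::size_type Comma = Value.find(',', Start);
    if (Comma == std::string::npos) {
      Parts.push_back(Value.substr(Start));   // last (or only) piece
      return Parts;
    }
    Parts.push_back(Value.substr(Start, Comma - Start));
    Start = Comma + 1;
  }
}
```

    Each resulting piece would then be handed to the option's parser as if it had been given as its own occurrence of the option.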

    + +
    + + + + +
    + +

    Some systems, such as certain variants of Microsoft Windows and some older Unices, have a relatively low limit on command-line length. It is therefore customary to use so-called 'response files' to circumvent this restriction. These files are mentioned on the command line (using the "@file" syntax). The program reads these files and inserts the contents into argv, thereby working around the command-line length limits. Response files are enabled by an optional fourth argument to cl::ParseEnvironmentOptions and cl::ParseCommandLineOptions.

    + +
    + + + + + +
    + +

    Despite all of the built-in flexibility, the CommandLine option library +really only consists of one function (cl::ParseCommandLineOptions) +and three main classes: cl::opt, cl::list, and cl::alias. This section describes these three +classes in detail.

    + +
    + + + + +
    + +

    The cl::ParseCommandLineOptions function is designed to be called +directly from main, and is used to fill in the values of all of the +command line option variables once argc and argv are +available.

    + +

    The cl::ParseCommandLineOptions function requires two parameters +(argc and argv), but may also take an optional third parameter +which holds additional extra text to emit when the +-help option is invoked, and a fourth boolean parameter that enables +response files.

    + +
    + + + + +
    + +

    The cl::ParseEnvironmentOptions function has mostly the same effects +as cl::ParseCommandLineOptions, +except that it is designed to take values for options from an environment +variable, for those cases in which reading the command line is not convenient or +desired. It fills in the values of all the command line option variables just +like cl::ParseCommandLineOptions +does.

    + +

    It takes four parameters: the name of the program (since argv may +not be available, it can't just look in argv[0]), the name of the +environment variable to examine, the optional +additional extra text to emit when the +-help option is invoked, and the boolean +switch that controls whether response files +should be read.

    + +

    cl::ParseEnvironmentOptions will break the environment +variable's value up into words and then process them using +cl::ParseCommandLineOptions. +Note: Currently cl::ParseEnvironmentOptions does not support +quoting, so an environment variable containing -option "foo bar" will +be parsed as three words, -option, "foo, and bar", +which is different from what you would get from the shell with the same +input.
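
    The quoting caveat is easy to reproduce with plain whitespace splitting. The sketch below is an assumption about the observable behaviour only, not the library's actual code; splitWords is a hypothetical helper.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Whitespace-only splitting with no quote handling, which is the behaviour
// described above for environment variable values.
std::vector<std::string> splitWords(const std::string &EnvValue) {
  std::istringstream In(EnvValue);
  std::vector<std::string> Words;
  std::string Word;
  while (In >> Word) Words.push_back(Word);
  return Words;
}
```

    Given the value -option "foo bar", this produces the three words -option, "foo and bar", with the quote characters left attached, whereas a shell would produce two words.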

    + +
    + + + + +
    + +

    The cl::SetVersionPrinter function is designed to be called +directly from main and before +cl::ParseCommandLineOptions. Its use is optional. It simply arranges +for a function to be called in response to the --version option instead +of having the CommandLine library print out the usual version string +for LLVM. This is useful for programs that are not part of LLVM but wish to use +the CommandLine facilities. Such programs should just define a small +function that takes no arguments and returns void and that prints out +whatever version information is appropriate for the program. Pass the address +of that function to cl::SetVersionPrinter to arrange for it to be +called when the --version option is given by the user.

    + +
    + + + +
    + +

    The cl::opt class is the class used to represent scalar command line +options, and is the one used most of the time. It is a templated class which +can take up to three arguments (all except for the first have default values +though):

    + +
    +namespace cl {
    +  template <class DataType, bool ExternalStorage = false,
    +            class ParserClass = parser<DataType> >
    +  class opt;
    +}
    +
    + +

    The first template argument specifies what underlying data type the command +line argument is, and is used to select a default parser implementation. The +second template argument is used to specify whether the option should contain +the storage for the option (the default) or whether external storage should be +used to contain the value parsed for the option (see Internal +vs External Storage for more information).

    + +

    The third template argument specifies which parser to use. The default value +selects an instantiation of the parser class based on the underlying +data type of the option. In general, this default works well for most +applications, so this option is only used when using a custom parser.

    + +
    + + + + +
    + +

    The cl::list class is the class used to represent a list of command +line options. It too is a templated class which can take up to three +arguments:

    + +
    +namespace cl {
    +  template <class DataType, class Storage = bool,
    +            class ParserClass = parser<DataType> >
    +  class list;
    +}
    +
    + +

    This class works exactly the same way as the cl::opt class, except that the second argument is the type of the external storage, not a boolean value. For this class, the marker type 'bool' is used to indicate that internal storage should be used.

    + +
    + + + + +
    + +

    The cl::bits class is the class used to represent a list of command +line options in the form of a bit vector. It is also a templated class which +can take up to three arguments:

    + +
    +namespace cl {
    +  template <class DataType, class Storage = bool,
    +            class ParserClass = parser<DataType> >
    +  class bits;
    +}
    +
    + +

    This class works exactly the same way as the cl::list class, except that the second argument must be of type unsigned if external storage is used.

    + +
    + + + + +
    + +

    The cl::alias class is a nontemplated class that is used to form +aliases for other arguments.

    + +
    +namespace cl {
    +  class alias;
    +}
    +
    + +

    The cl::aliasopt attribute should be used to specify which option this is an alias for. Alias arguments default to being Hidden, and use the aliased option's parser to do the conversion from string to data.

    + +
    + + + + +
    + +

    The cl::extrahelp class is a nontemplated class that allows extra +help text to be printed out for the -help option.

    + +
    +namespace cl {
    +  struct extrahelp;
    +}
    +
    + +

    To use extrahelp, simply construct one with a const char* parameter to the constructor. The text passed to the constructor will be printed at the bottom of the help message, verbatim. Note that multiple cl::extrahelp instances can be used, but this practice is discouraged. If your tool needs to print additional help information, put all that help into a single cl::extrahelp instance.

    +

    For example:

    +
    +  cl::extrahelp("\nADDITIONAL HELP:\n\n  This is the extra help\n");
    +
    +
    + + + + +
    + +

    Parsers control how the string value taken from the command line is +translated into a typed value, suitable for use in a C++ program. By default, +the CommandLine library uses an instance of parser<type> if the +command line option specifies that it uses values of type 'type'. +Because of this, custom option processing is specified with specializations of +the 'parser' class.

    + +

    The CommandLine library provides the following builtin parser specializations, which are sufficient for most applications. It can, however, also be extended to work with new data types and new ways of interpreting the same data. See Writing a Custom Parser for more details on this type of library extension.

    + +
      + +
    • The generic parser<t> parser can be used to map string values to any data type, through the use of the cl::values property, which specifies the mapping information. The most common use of this parser is for parsing enum values, which allows you to use the CommandLine library for all of the error checking to make sure that only valid enum values are specified (as opposed to accepting arbitrary strings). Despite this, however, the generic parser class can be used for any data type.
    • + +
    • The parser<bool> specialization +is used to convert boolean strings to a boolean value. Currently accepted +strings are "true", "TRUE", "True", "1", +"false", "FALSE", "False", and "0".
    • + +
    • The parser<boolOrDefault> + specialization is used for cases where the value is boolean, +but we also need to know whether the option was specified at all. boolOrDefault +is an enum with 3 values, BOU_UNSET, BOU_TRUE and BOU_FALSE. This parser accepts +the same strings as parser<bool>.
    • + +
    • The parser<string> +specialization simply stores the parsed string into the string value +specified. No conversion or modification of the data is performed.
    • + +
    • The parser<int> specialization +uses the C strtol function to parse the string input. As such, it will +accept a decimal number (with an optional '+' or '-' prefix) which must start +with a non-zero digit. It accepts octal numbers, which are identified with a +'0' prefix digit, and hexadecimal numbers with a prefix of +'0x' or '0X'.
    • + +
    • The parser<double> and parser<float> specializations use the standard C strtod function to convert floating point strings into floating point values. As such, a broad range of string formats is supported, including exponential notation (e.g. 1.7e15), and locales are handled properly.
    • + +
    + +
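
    The decimal, octal and hexadecimal forms accepted by parser<int> follow from letting strtol pick the base from the prefix. Here is a minimal sketch of that idea, assuming a base argument of 0; parseIntSketch is a hypothetical helper, not the library's parser, and it returns true on error to match the convention used by parse methods.

```cpp
#include <cstdlib>
#include <string>

// Illustrative helper: parse with strtol and base 0 so that the prefix
// selects decimal, octal ("0") or hexadecimal ("0x"/"0X").
// Returns true on error, false on success.
bool parseIntSketch(const std::string &Arg, int &Val) {
  char *End;
  long V = std::strtol(Arg.c_str(), &End, 0);
  if (Arg.empty() || *End != '\0')
    return true;                 // nothing to parse, or trailing junk
  Val = static_cast<int>(V);
  return false;
}
```

    Under this scheme "42" is read as decimal, "017" as octal and "0x1F" as hexadecimal, while a string such as "12abc" is rejected because characters remain after the number.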
    + + + + + +
    + +

    Although the CommandLine library has a lot of functionality built into it already (as discussed previously), one of its true strengths lies in its extensibility. This section discusses how the CommandLine library works under the covers and illustrates how to do some simple, common extensions.

    + +
    + + + + +
    + +

    One of the simplest and most common extensions is the use of a custom parser. +As discussed previously, parsers are the portion +of the CommandLine library that turns string input from the user into a +particular parsed data type, validating the input in the process.

    + +

    There are two ways to use a new parser:

    + +
      + +
    1. + +

      Specialize the cl::parser template for +your custom data type.

      + +

      This approach has the advantage that users of your custom data type will +automatically use your custom parser whenever they define an option with a value +type of your data type. The disadvantage of this approach is that it doesn't +work if your fundamental data type is something that is already supported.

      + +
    2. + +

      Write an independent class, using it explicitly from options that need +it.

      + +

      This approach works well in situations where you would like to parse an option using special syntax for a not-very-special data type. The drawback of this approach is that users of your parser have to be aware that they are using your parser instead of the builtin ones.

      + +
    + +

    To guide the discussion, we will discuss a custom parser that accepts file +sizes, specified with an optional unit after the numeric size. For example, we +would like to parse "102kb", "41M", "1G" into the appropriate integer value. In +this case, the underlying data type we want to parse into is +'unsigned'. We choose approach #2 above because we don't want to make +this the default for all unsigned options.

    + +

    To start out, we declare our new FileSizeParser class:

    + +
    +struct FileSizeParser : public cl::basic_parser<unsigned> {
    +  // parse - Return true on error.
    +  bool parse(cl::Option &O, const char *ArgName, const std::string &ArgValue,
    +             unsigned &Val);
    +};
    +
    + +

    Our new class inherits from the cl::basic_parser template class to fill in the default boilerplate code for us. We give it the data type that we parse into, the last argument to the parse method, so that clients of our custom parser know what object type to pass in to the parse method. (Here we declare that we parse into 'unsigned' variables.)

    + +

    For most purposes, the only method that must be implemented in a custom +parser is the parse method. The parse method is called +whenever the option is invoked, passing in the option itself, the option name, +the string to parse, and a reference to a return value. If the string to parse +is not well-formed, the parser should output an error message and return true. +Otherwise it should return false and set 'Val' to the parsed value. In +our example, we implement parse as:

    + +
    +bool FileSizeParser::parse(cl::Option &O, const char *ArgName,
    +                           const std::string &Arg, unsigned &Val) {
    +  const char *ArgStart = Arg.c_str();
    +  char *End;
    +
    +  // Parse integer part, leaving 'End' pointing to the first non-integer char
    +  Val = (unsigned)strtol(ArgStart, &End, 0);
    +
    +  while (1) {
    +    switch (*End++) {
    +    case 0: return false;   // No error
    +    case 'i':               // Ignore the 'i' in KiB if people use that
    +    case 'b': case 'B':     // Ignore B suffix
    +      break;
    +
    +    case 'g': case 'G': Val *= 1024*1024*1024; break;
    +    case 'm': case 'M': Val *= 1024*1024;      break;
    +    case 'k': case 'K': Val *= 1024;           break;
    +
    +    default:
    +      // Print an error message if unrecognized character!
    +      return O.error("'" + Arg + "' value invalid for file size argument!");
    +    }
    +  }
    +}
    +
    + +

    This function implements a very simple parser for the kinds of strings we are +interested in. Although it has some holes (it allows "123KKK" for +example), it is good enough for this example. Note that we use the option +itself to print out the error message (the error method always returns +true) in order to get a nice error message (shown below). Now that we have our +parser class, we can use it like this:

    + +
    +static cl::opt<unsigned, false, FileSizeParser>
    +MFS("max-file-size", cl::desc("Maximum file size to accept"),
    +    cl::value_desc("size"));
    +
    + +

    Which adds this to the output of our program:

    + +
    +OPTIONS:
    +  -help                 - display available options (-help-hidden for more)
    +  ...
    +  -max-file-size=<size> - Maximum file size to accept
    +
    + +

    And we can test that our parser works correctly now (the test program just prints out the max-file-size argument value):

    + +
    +$ ./test
    +MFS: 0
    +$ ./test -max-file-size=123MB
    +MFS: 128974848
    +$ ./test -max-file-size=3G
    +MFS: 3221225472
    +$ ./test -max-file-size=dog
    +-max-file-size option: 'dog' value invalid for file size argument!
    +
    + +

    It looks like it works. The error message that we get is nice and helpful, +and we seem to accept reasonable file sizes. This wraps up the "custom parser" +tutorial.
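
    The suffix loop can also be exercised outside the CommandLine library. The sketch below is a standalone copy of the same logic; parseFileSize is a hypothetical free function that returns true on error like the parse method above, but it does not report through cl::Option. It assumes a 32-bit unsigned, as the sample output above does.

```cpp
#include <cstdlib>
#include <string>

// Standalone copy of the suffix loop from FileSizeParser::parse.
// Returns true on error, false on success.
bool parseFileSize(const std::string &Arg, unsigned &Val) {
  char *End;

  // Parse the integer part, leaving End at the first non-integer char.
  Val = static_cast<unsigned>(std::strtol(Arg.c_str(), &End, 0));

  while (true) {
    switch (*End++) {
    case 0: return false;                       // end of string: no error
    case 'i':                                   // ignore the 'i' in KiB
    case 'b': case 'B':                         // ignore a B suffix
      break;
    case 'g': case 'G': Val *= 1024u * 1024u * 1024u; break;
    case 'm': case 'M': Val *= 1024u * 1024u;         break;
    case 'k': case 'K': Val *= 1024u;                 break;
    default:
      return true;                              // unrecognized character
    }
  }
}
```

    Calling it with "123MB" gives 128974848 and "dog" is rejected, matching the session above. The hole mentioned earlier is visible too: "123KKK" is accepted, and the repeated multiplication silently wraps around.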

    + +
    + + + + +
    +

    Several of the LLVM libraries define static cl::opt instances that + will automatically be included in any program that links with that library. + This is a feature. However, sometimes it is necessary to know the value of the + command line option outside of the library. In these cases the library does or + should provide an external storage location that is accessible to users of the + library. Examples of this include the llvm::DebugFlag exported by the + lib/Support/Debug.cpp file and the llvm::TimePassesIsEnabled + flag exported by the lib/VMCore/Pass.cpp file.

    + +

    TODO: complete this section

    + +
    + + + + +
    + +

    TODO: fill in this section

    + +
    + + + +
    +
    + Valid CSS + Valid HTML 4.01 + + Chris Lattner
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + + +
Added: www-releases/trunk/2.8/docs/CompilerDriver.html
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CompilerDriver.html?rev=115556&view=auto
==============================================================================
--- www-releases/trunk/2.8/docs/CompilerDriver.html (added)
+++ www-releases/trunk/2.8/docs/CompilerDriver.html Mon Oct 4 15:49:23 2010
@@ -0,0 +1,756 @@
+Customizing LLVMC: Reference Manual
    +

    Customizing LLVMC: Reference Manual

    + + + +
    +

    Written by Mikhail Glushenkov

    +
    +

    Introduction

    +

    LLVMC is a generic compiler driver, designed to be customizable and +extensible. It plays the same role for LLVM as the gcc program +does for GCC - LLVMC's job is essentially to transform a set of input +files into a set of targets depending on configuration rules and user +options. What makes LLVMC different is that these transformation rules +are completely customizable - in fact, LLVMC knows nothing about the +specifics of transformation (even the command-line options are mostly +not hard-coded) and regards the transformation structure as an +abstract graph. The structure of this graph is completely determined +by plugins, which can be either statically or dynamically linked. This +makes it possible to easily adapt LLVMC for other purposes - for +example, as a build tool for game resources.

    +

    Because LLVMC employs TableGen as its configuration language, you +need to be familiar with it to customize LLVMC.

    +
    +
    +

    Compiling with LLVMC

    +

    LLVMC tries hard to be as compatible with gcc as possible, +although there are some small differences. Most of the time, however, +you shouldn't be able to notice them:

    +
    +$ # This works as expected:
    +$ llvmc -O3 -Wall hello.cpp
    +$ ./a.out
    +hello
    +
    +

    One nice feature of LLVMC is that one doesn't have to distinguish between different compilers for different languages (think g++ vs. gcc): the right toolchain is chosen automatically based on input language names (which are, in turn, determined from file extensions). If you want to force files ending with ".c" to compile as C++, use the -x option, just as you would with gcc:

    +
    +$ # hello.c is really a C++ file
    +$ llvmc -x c++ hello.c
    +$ ./a.out
    +hello
    +
    +

    On the other hand, when using LLVMC as a linker to combine several C++ +object files you should provide the --linker option since it's +impossible for LLVMC to choose the right linker in that case:

    +
    +$ llvmc -c hello.cpp
    +$ llvmc hello.o
    +[A lot of link-time errors skipped]
    +$ llvmc --linker=c++ hello.o
    +$ ./a.out
    +hello
    +
    +

    By default, LLVMC uses llvm-gcc to compile the source code. It is also +possible to choose the clang compiler with the -clang option.

    +
    +
    +

    Predefined options

    +

    LLVMC has some built-in options that can't be overridden in the +configuration libraries:

    +
      +
    • -o FILE - Output file name.
    • +
    • -x LANGUAGE - Specify the language of the following input files +until the next -x option.
    • +
    • -load PLUGIN_NAME - Load the specified plugin DLL. Example: +-load $LLVM_DIR/Release/lib/LLVMCSimple.so.
    • +
    • -v - Enable verbose mode, i.e. print out all executed commands.
    • +
    • --save-temps - Write temporary files to the current directory and do not +delete them on exit. This option can also take an argument: the +--save-temps=obj switch will write files into the directory specified with +the -o option. The --save-temps=cwd and --save-temps switches are +both synonyms for the default behaviour.
    • +
    • --temp-dir DIRECTORY - Store temporary files in the given directory. This directory is deleted on exit unless --save-temps is specified. If --save-temps=obj is also specified, --temp-dir takes precedence.
    • +
    • --check-graph - Check the compilation for common errors like mismatched +output/input language names, multiple default edges and cycles. Because of +plugins, these checks can't be performed at compile-time. Exit with code zero +if no errors were found, and return the number of found errors +otherwise. Hidden option, useful for debugging LLVMC plugins.
    • +
    • --view-graph - Show a graphical representation of the compilation graph +and exit. Requires that you have dot and gv programs installed. Hidden +option, useful for debugging LLVMC plugins.
    • +
    • --write-graph - Write a compilation-graph.dot file in the current +directory with the compilation graph description in Graphviz format (identical +to the file used by the --view-graph option). The -o option can be +used to set the output file name. Hidden option, useful for debugging LLVMC +plugins.
    • +
    • -help, -help-hidden, --version - These options have +their standard meaning.
    • +
    +
    +
    +

    Compiling LLVMC plugins

    +

    It's easiest to start working on your own LLVMC plugin by copying the +skeleton project which lives under $LLVMC_DIR/plugins/Simple:

    +
    +$ cd $LLVMC_DIR/plugins
    +$ cp -r Simple MyPlugin
    +$ cd MyPlugin
    +$ ls
    +Makefile PluginMain.cpp Simple.td
    +
    +

    As you can see, our basic plugin consists of only two files (not counting the build script). Simple.td contains a TableGen description of the compilation graph; its format is documented in the following sections. PluginMain.cpp is just a helper file used to compile the auto-generated C++ code produced from the TableGen source. It can also contain hook definitions (see below).

    +

    The first thing that you should do is to change the LLVMC_PLUGIN +variable in the Makefile to avoid conflicts (since this variable +is used to name the resulting library):

    +
    +LLVMC_PLUGIN=MyPlugin
    +
    +

    It is also a good idea to rename Simple.td to something less +generic:

    +
    +$ mv Simple.td MyPlugin.td
    +
    +

    To build your plugin as a dynamic library, just cd to its source directory and run make. The resulting file will be called plugin_llvmc_$(LLVMC_PLUGIN).$(DLL_EXTENSION) (in our case, plugin_llvmc_MyPlugin.so). This library can then be loaded with the -load option. Example:

    +
    +$ cd $LLVMC_DIR/plugins/Simple
    +$ make
    +$ llvmc -load $LLVM_DIR/Release/lib/plugin_llvmc_Simple.so
    +
    +
    +
    +

    Compiling standalone LLVMC-based drivers

    +

    By default, the llvmc executable consists of a driver core plus several +statically linked plugins (Base and Clang at the moment). You can +produce a standalone LLVMC-based driver executable by linking the core with your +own plugins. The recommended way to do this is by starting with the provided +Skeleton example ($LLVMC_DIR/example/Skeleton):

    +
    +$ cd $LLVMC_DIR/example/
    +$ cp -r Skeleton mydriver
    +$ cd mydriver
    +$ vim Makefile
    +[...]
    +$ make
    +
    +

    If you're compiling LLVM with different source and object directories, then you +must perform the following additional steps before running make:

    +
    +# LLVMC_SRC_DIR = $LLVM_SRC_DIR/tools/llvmc/
    +# LLVMC_OBJ_DIR = $LLVM_OBJ_DIR/tools/llvmc/
    +$ cp $LLVMC_SRC_DIR/example/mydriver/Makefile \
    +  $LLVMC_OBJ_DIR/example/mydriver/
    +$ cd $LLVMC_OBJ_DIR/example/mydriver
    +$ make
    +
    +

    Another way to do the same thing is by using the following command:

    +
    +$ cd $LLVMC_DIR
    +$ make LLVMC_BUILTIN_PLUGINS=MyPlugin LLVMC_BASED_DRIVER_NAME=mydriver
    +
    +

    This works with both srcdir == objdir and srcdir != objdir, but assumes that the +plugin source directory was placed under $LLVMC_DIR/plugins.

    +

    Sometimes, you will want a 'bare-bones' version of LLVMC that has no +built-in plugins. It can be compiled with the following command:

    +
    +$ cd $LLVMC_DIR
    +$ make LLVMC_BUILTIN_PLUGINS=""
    +
    +
    +
    +

    Customizing LLVMC: the compilation graph

    +

    Each TableGen configuration file should include the common +definitions:

    +
    +include "llvm/CompilerDriver/Common.td"
    +
    +

    Internally, LLVMC stores information about possible source transformations in the form of a graph. Nodes in this graph represent tools, and edges between two nodes represent a transformation path. A special "root" node is used to mark entry points for the transformations. LLVMC also assigns a weight to each edge (more on this later) to choose between several alternative edges.

    +

    The definition of the compilation graph (see file +plugins/Base/Base.td for an example) is just a list of edges:

    +
    +def CompilationGraph : CompilationGraph<[
    +    Edge<"root", "llvm_gcc_c">,
    +    Edge<"root", "llvm_gcc_assembler">,
    +    ...
    +
    +    Edge<"llvm_gcc_c", "llc">,
    +    Edge<"llvm_gcc_cpp", "llc">,
    +    ...
    +
    +    OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
    +                                      (inc_weight))>,
    +    OptionalEdge<"llvm_gcc_cpp", "opt", (case (switch_on "opt"),
    +                                              (inc_weight))>,
    +    ...
    +
    +    OptionalEdge<"llvm_gcc_assembler", "llvm_gcc_cpp_linker",
    +        (case (input_languages_contain "c++"), (inc_weight),
    +              (or (parameter_equals "linker", "g++"),
    +                  (parameter_equals "linker", "c++")), (inc_weight))>,
    +    ...
    +
    +    ]>;
    +
    +

    As you can see, the edges can be either default or optional, where +optional edges are differentiated by an additional case expression +used to calculate the weight of this edge. Notice also that we refer +to tools via their names (as strings). This makes it possible to add +edges to an existing compilation graph in plugins without having to +know about all tool definitions used in the graph.

    +

    The default edges are assigned a weight of 1, and optional edges get a +weight of 0 + 2*N where N is the number of tests that evaluated to +true in the case expression. It is also possible to provide an +integer parameter to inc_weight and dec_weight - in this case, +the weight is increased (or decreased) by the provided value instead +of the default 2. It is also possible to change the default weight of +an optional edge by using the default clause of the case +construct.

    +

    When passing an input file through the graph, LLVMC picks the edge +with the maximum weight. To avoid ambiguity, there should be only one +default edge between two nodes (with the exception of the root node, +which gets a special treatment - there you are allowed to specify one +default edge per language).
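
    The weighting rule and the "pick the maximum" step can be sketched in a few lines. This is only a model of the rule as stated above: the EdgeSketch struct and both function names are hypothetical and do not correspond to LLVMC's internal classes, custom inc_weight/dec_weight values and default clauses are left out, and ties are resolved in favour of the first edge, which is an arbitrary choice made for the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical model of an outgoing edge; not LLVMC's internal classes.
struct EdgeSketch {
  std::string Target;
  bool IsDefault;     // default edges always have weight 1
  int TestsPassed;    // N: tests in the case expression that were true
};

// Default edges weigh 1; optional edges weigh 0 + 2*N.
int edgeWeight(const EdgeSketch &E) {
  return E.IsDefault ? 1 : 0 + 2 * E.TestsPassed;
}

// Returns the edge with the maximum weight, or nullptr if there are none.
const EdgeSketch *pickEdge(const std::vector<EdgeSketch> &Edges) {
  const EdgeSketch *Best = nullptr;
  for (const EdgeSketch &E : Edges)
    if (!Best || edgeWeight(E) > edgeWeight(*Best))
      Best = &E;
  return Best;
}
```

    For a node with a default edge to "llc" and an optional edge to "opt", the optional edge wins as soon as one of its tests is true (weight 2 against 1); when none is true its weight stays at 0 and the default edge is taken.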

    +

    When multiple plugins are loaded, their compilation graphs are merged +together. Since multiple edges that have the same end nodes are not +allowed (i.e. the graph is not a multigraph), an edge defined in +several plugins will be replaced by the definition from the plugin +that was loaded last. Plugin load order can be controlled by using the +plugin priority feature described above.
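
    One way to picture the merge is as a map keyed by the pair of end nodes, filled in load order. The sketch below is an assumption-laden illustration: PluginEdge, EdgeKey and mergeGraphs are hypothetical names, and the real driver merges full edge definitions rather than just a plugin name.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

using EdgeKey = std::pair<std::string, std::string>;  // (from, to)

// Hypothetical edge record: its end nodes and the plugin that defined it.
struct PluginEdge {
  std::string From, To;
  std::string DefinedBy;
};

// Plugins are given in load order; a later definition of the same
// (from, to) pair replaces the earlier one.
std::map<EdgeKey, std::string>
mergeGraphs(const std::vector<std::vector<PluginEdge>> &PluginsInLoadOrder) {
  std::map<EdgeKey, std::string> Merged;
  for (const auto &Plugin : PluginsInLoadOrder)
    for (const PluginEdge &E : Plugin)
      Merged[EdgeKey(E.From, E.To)] = E.DefinedBy;
  return Merged;
}
```

    Because the key is the node pair, the merged graph can never hold two edges between the same nodes, and the last plugin loaded decides which definition survives.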

    +

    To get a visual representation of the compilation graph (useful for +debugging), run llvmc --view-graph. You will need dot and +gsview installed for this to work properly.

    +
    +
    +

    Describing options

    +

    Command-line options that the plugin supports are defined by using an +OptionList:

    +
    +def Options : OptionList<[
    +(switch_option "E", (help "Help string")),
    +(alias_option "quiet", "q")
    +...
    +]>;
    +
    +

    As you can see, the option list is just a list of DAGs, where each DAG +is an option description consisting of the option name and some +properties. A plugin can define more than one option list (they are +all merged together in the end), which can be handy if one wants to +separate option groups syntactically.

    +
      +
    • Possible option types:

      +
      +
        +
      • switch_option - a simple boolean switch without arguments, for example +-O2 or -time. At most one occurrence is allowed.
      • +
      • parameter_option - option that takes one argument, for example +-std=c99. It is also allowed to use spaces instead of the equality +sign: -std c99. At most one occurrence is allowed.
      • +
      • parameter_list_option - same as the above, but more than one option occurrence is allowed.
      • +
      • prefix_option - same as the parameter_option, but the option name and +argument do not have to be separated. Example: -ofile. This can be also +specified as -o file; however, -o=file will be parsed incorrectly +(=file will be interpreted as option value). At most one occurrence is +allowed.
      • +
      • prefix_list_option - same as the above, but more than one occurrence of the option is allowed; example: -lm -lpthread.
      • +
      • alias_option - a special option type for creating aliases. Unlike other +option types, aliases are not allowed to have any properties besides the +aliased option name. Usage example: (alias_option "preprocess", "E")
      • +
      +
      +
    • +
    • Possible option properties:

      +
      +
        +
      • help - help string associated with this option. Used for -help +output.
      • +
      • required - this option must be specified exactly once (or, in case of +the list options without the multi_val property, at least +once). Incompatible with zero_or_one and one_or_more.
      • +
      • one_or_more - the option must be specified at least one time. Useful +only for list options in conjunction with multi_val; for ordinary lists +it is synonymous with required. Incompatible with required and +zero_or_one.
      • +
      • optional - the option can be specified zero or one times. Useful only +for list options in conjunction with multi_val. Incompatible with +required and one_or_more.
      • +
      • hidden - the description of this option will not appear in +the -help output (but will appear in the -help-hidden +output).
      • +
      • really_hidden - the option will not be mentioned in any help +output.
      • +
      • comma_separated - Indicates that any commas specified for an option's value should be used to split the value up into multiple values for the option. This property is valid only for list options. In conjunction with forward_value, it can be used to implement option forwarding in the style of gcc's -Wa,.
      • +
      • multi_val n - this option takes n arguments (can be useful in some +special cases). Usage example: (parameter_list_option "foo", (multi_val +3)); the command-line syntax is '-foo a b c'. Only list options can have +this attribute; you can, however, use the one_or_more, optional +and required properties.
      • +
      • init - this option has a default value, either a string (if it is a +parameter), or a boolean (if it is a switch; as in C++, boolean constants +are called true and false). List options can't have init +attribute. +Usage examples: (switch_option "foo", (init true)); (prefix_option +"bar", (init "baz")).
      • +
      • extern - this option is defined in some other plugin, see below.
      • +
      +
      +
    • +
    +
    +
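The comma_separated property described above splits an option's value on commas. A minimal C++ sketch of that splitting follows, for illustration only: splitCommaSeparated is a hypothetical helper, not part of LLVMC, and keeping empty fields is an assumption of this sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split an option value such as "a,b,c" into its
// comma-separated parts. Empty fields are kept as empty strings (an
// assumption of this sketch, not documented LLVMC behaviour).
std::vector<std::string> splitCommaSeparated(const std::string &Value) {
  std::vector<std::string> Parts;
  std::string::size_type Start = 0;
  while (true) {
    std::string::size_type Comma = Value.find(',', Start);
    if (Comma == std::string::npos) {
      // No more commas: the rest of the string is the last part.
      Parts.push_back(Value.substr(Start));
      break;
    }
    Parts.push_back(Value.substr(Start, Comma - Start));
    Start = Comma + 1;
  }
  return Parts;
}
```

With such a split, a single argument like -Wa,-foo,-bar would yield the values -foo and -bar, which forward_value could then pass on individually.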

    External options

    +

    Sometimes, when linking several plugins together, one plugin needs to +access options defined in some other plugin. Because of the way +options are implemented, such options must be marked as +extern. This is what the extern option property is +for. Example:

    +
    +...
    +(switch_option "E", (extern))
    +...
    +
    +

    If an external option has additional attributes besides 'extern', they are +ignored. See also the section on plugin priorities.

    +
    +
    +
    +

    Conditional evaluation

    +

    The 'case' construct is the main means by which programmability is +achieved in LLVMC. It can be used to calculate edge weights, program +actions and modify the shell commands to be executed. The 'case' +expression is designed after the similarly-named construct in +functional languages and takes the form (case (test_1), statement_1, +(test_2), statement_2, ... (test_N), statement_N). The statements +are evaluated only if the corresponding tests evaluate to true.

    +

    Examples:

    +
    +// Edge weight calculation
    +
    +// Increases edge weight by 5 if "-A" is provided on the
    +// command-line, and by 5 more if "-B" is also provided.
    +(case
    +    (switch_on "A"), (inc_weight 5),
    +    (switch_on "B"), (inc_weight 5))
    +
    +
    +// Tool command line specification
    +
    +// Evaluates to "cmdline1" if the option "-A" is provided on the
    +// command line; to "cmdline2" if "-B" is provided;
    +// otherwise to "cmdline3".
    +
    +(case
    +    (switch_on "A"), "cmdline1",
    +    (switch_on "B"), "cmdline2",
    +    (default), "cmdline3")
    +
    +

    Note the slight difference in 'case' expression handling in contexts +of edge weights and command line specification - in the second example +the value of the "B" switch is never checked when switch "A" is +enabled, and the whole expression always evaluates to "cmdline1" in +that case.
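The difference between the two contexts can be modelled in a few lines of C++. This is only a sketch of the semantics described above, not LLVMC's implementation; the clause types and function names are invented for illustration, and a test is simplified to "is this switch on?" with an empty name standing in for (default).

```cpp
#include <set>
#include <string>
#include <vector>

// Illustrative clause types (not LLVMC's own). An empty Switch name
// plays the role of (default).
struct WeightClause { std::string Switch; int Increment; };
struct CmdClause { std::string Switch; std::string CmdLine; };

static bool testHolds(const std::string &Switch,
                      const std::set<std::string> &Enabled) {
  return Switch.empty() || Enabled.count(Switch) != 0;
}

// Edge-weight context: every clause whose test holds contributes.
int evalWeight(const std::vector<WeightClause> &Clauses,
               const std::set<std::string> &Enabled) {
  int Weight = 0;
  for (const WeightClause &C : Clauses)
    if (testHolds(C.Switch, Enabled))
      Weight += C.Increment;
  return Weight;
}

// Command-line context: the first clause whose test holds wins and the
// remaining tests are never looked at.
std::string evalCmdLine(const std::vector<CmdClause> &Clauses,
                        const std::set<std::string> &Enabled) {
  for (const CmdClause &C : Clauses)
    if (testHolds(C.Switch, Enabled))
      return C.CmdLine;
  return std::string();
}
```

With both "A" and "B" enabled, the weight model adds both increments, while the command-line model stops at the first match.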

    +

    Case expressions can also be nested, i.e. the following is legal:

    +
    +(case (switch_on "E"), (case (switch_on "o"), ..., (default), ...),
    +      (default), ...)
    +
    +

    You should, however, try to avoid doing that because it hurts +readability. It is usually better to split tool descriptions and/or +use TableGen inheritance instead.

    +
      +
    • Possible tests are:
        +
      • switch_on - Returns true if a given command-line switch is provided by +the user. Can be given a list as argument, in that case (switch_on ["foo", +"bar", "baz"]) is equivalent to (and (switch_on "foo"), (switch_on +"bar"), (switch_on "baz")). +Example: (switch_on "opt").
      • +
      • any_switch_on - Given a list of switch options, returns true if any of +the switches is turned on. +Example: (any_switch_on ["foo", "bar", "baz"]) is equivalent to (or +(switch_on "foo"), (switch_on "bar"), (switch_on "baz")).
      • +
      • parameter_equals - Returns true if a command-line parameter equals +a given value. +Example: (parameter_equals "W", "all").
      • +
      • element_in_list - Returns true if a command-line parameter +list contains a given value. +Example: (element_in_list "l", "pthread").
      • +
      • input_languages_contain - Returns true if a given language +belongs to the current input language set. +Example: (input_languages_contain "c++").
      • +
      • in_language - Evaluates to true if the input file language is equal to +the argument. At the moment works only with cmd_line and actions (on +non-join nodes). +Example: (in_language "c++").
      • +
      • not_empty - Returns true if a given option (which should be either a +parameter or a parameter list) is set by the user. Like switch_on, can +be also given a list as argument. +Example: (not_empty "o").
      • +
      • any_not_empty - Returns true if not_empty returns true for any of +the options in the list. +Example: (any_not_empty ["foo", "bar", "baz"]) is equivalent to (or +(not_empty "foo"), (not_empty "bar"), (not_empty "baz")).
      • +
      • empty - The opposite of not_empty. Equivalent to (not (not_empty +X)). Provided for convenience. Can be given a list as argument.
      • +
      • any_empty - Returns true if not_empty returns false for any of +the options in the list. +Example: (any_empty ["foo", "bar", "baz"]) is equivalent to (not (and +(not_empty "foo"), (not_empty "bar"), (not_empty "baz"))).
      • +
      • single_input_file - Returns true if there was only one input file +provided on the command-line. Used without arguments: +(single_input_file).
      • +
      • multiple_input_files - Equivalent to (not (single_input_file)) (the +case of zero input files is considered an error).
      • +
      • default - Always evaluates to true. Should always be the last +test in the case expression.
      • +
      • and - A standard logical combinator that returns true iff all of +its arguments return true. Used like this: (and (test1), (test2), +... (testN)). Nesting of and and or is allowed, but not +encouraged.
      • +
      • or - A logical combinator that returns true iff any of its +arguments returns true. Example: (or (test1), (test2), ... (testN)).
      • +
      • not - Standard unary logical combinator that negates its +argument. Example: (not (or (test1), (test2), ... (testN))).
      • +
      +
    • +
    +
    +
    +

    Writing a tool description

    +

    As was said earlier, nodes in the compilation graph represent tools, +which are described separately. A tool definition looks like this +(taken from the include/llvm/CompilerDriver/Tools.td file):

    +
    +def llvm_gcc_cpp : Tool<[
    +    (in_language "c++"),
    +    (out_language "llvm-assembler"),
    +    (output_suffix "bc"),
    +    (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
    +    (sink)
    +    ]>;
    +
    +

    This defines a new tool called llvm_gcc_cpp, which is an alias for +llvm-g++. As you can see, a tool definition is just a list of +properties; most of them should be self-explanatory. The sink +property means that this tool should be passed all command-line +options that aren't mentioned in the option list.

    +

    The complete list of all currently implemented tool properties follows.

    +
      +
    • Possible tool properties:
        +
      • in_language - input language name. Can be either a string or a +list, in case the tool supports multiple input languages.
      • +
      • out_language - output language name. Multiple output languages are not +allowed.
      • +
      • output_suffix - output file suffix. Can also be changed +dynamically, see documentation on actions.
      • +
      • cmd_line - the actual command used to run the tool. You can +use $INFILE and $OUTFILE variables, output redirection +with >, hook invocations ($CALL), environment variables +(via $ENV) and the case construct.
      • +
      • join - this tool is a "join node" in the graph, i.e. it gets a +list of input files and joins them together. Used for linkers.
      • +
      • sink - all command-line options that are not handled by other +tools are passed to this tool.
      • +
      • actions - A single big case expression that specifies how +this tool reacts to command-line options (described in more detail +below).
      • +
      +
    • +
    +
    +

    Actions

    +

    A tool often needs to react to command-line options, and this is +precisely what the actions property is for. The next example +illustrates this feature:

    +
    +def llvm_gcc_linker : Tool<[
    +    (in_language "object-code"),
    +    (out_language "executable"),
    +    (output_suffix "out"),
    +    (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
    +    (join),
    +    (actions (case (not_empty "L"), (forward "L"),
    +                   (not_empty "l"), (forward "l"),
    +                   (not_empty "dummy"),
    +                             [(append_cmd "-dummy1"), (append_cmd "-dummy2")]))
    +    ]>;
    +
    +

    The actions tool property is implemented on top of the omnipresent +case expression. It associates one or more different actions +with given conditions - in the example, the actions are forward, +which forwards a given option unchanged, and append_cmd, which +appends a given string to the tool execution command. Multiple actions +can be associated with a single condition by using a list of actions +(used in the example to append some dummy options). The same case +construct can also be used in the cmd_line property to modify the +tool command line.

    +

    The "join" property used in the example means that this tool behaves +like a linker.

    +

    The list of all possible actions follows.

    +
      +
    • Possible actions:

      +
      +
        +
      • append_cmd - Append a string to the tool invocation command. +Example: (case (switch_on "pthread"), (append_cmd "-lpthread")).
      • +
      • error - Exit with error. +Example: (error "Mixing -c and -S is not allowed!").
      • +
      • warning - Print a warning. +Example: (warning "Specifying both -O1 and -O2 is meaningless!").
      • +
      • forward - Forward the option unchanged. +Example: (forward "Wall").
      • +
      • forward_as - Change the option's name, but forward the argument +unchanged. +Example: (forward_as "O0", "--disable-optimization").
      • +
      • forward_value - Forward only option's value. Cannot be used with switch +options (since they don't have values), but works fine with lists. +Example: (forward_value "Wa,").
      • +
      • forward_transformed_value - As above, but applies a hook to the +option's value before forwarding (see below). When +forward_transformed_value is applied to a list +option, the hook must have signature +std::string hooks::HookName (const std::vector<std::string>&). +Example: (forward_transformed_value "m", "ConvertToMAttr").
      • +
      • output_suffix - Modify the output suffix of this tool. +Example: (output_suffix "i").
      • +
      • stop_compilation - Stop compilation after this tool processes its +input. Used without arguments. +Example: (stop_compilation).
      • +
      +
      +
    • +
    +
    +
    +
    +
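For forward_transformed_value on a list option, the hook receives all of the option's values at once. The following C++ sketch shows a hook with the list signature given above; the name JoinWithCommas and the joined output format are assumptions made for illustration and are not taken from any LLVMC plugin.

```cpp
#include <string>
#include <vector>

namespace hooks {

// Hypothetical hook for a list option: joins the option's values into a
// single comma-separated string. Only the signature follows the text
// above; the name and the output format are illustrative.
std::string JoinWithCommas(const std::vector<std::string> &Values) {
  std::string Out;
  for (std::vector<std::string>::size_type I = 0; I != Values.size(); ++I) {
    if (I != 0)
      Out += ',';
    Out += Values[I];
  }
  return Out;
}

} // end namespace hooks
```

Such a hook would presumably be referenced by name from a tool description, for example (forward_transformed_value "m", "JoinWithCommas").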

    Language map

    +

    If you are adding support for a new language to LLVMC, you'll need to +modify the language map, which defines mappings from file extensions +to language names. It is used to choose the proper toolchain(s) for a +given input file set. Language map definition looks like this:

    +
    +def LanguageMap : LanguageMap<
    +    [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
    +     LangToSuffixes<"c", ["c"]>,
    +     ...
    +    ]>;
    +
    +

    For example, without those definitions the following command wouldn't work:

    +
    +$ llvmc hello.cpp
    +llvmc: Unknown suffix: cpp
    +
    +

    The language map entries are needed only for the tools that are linked from the +root node. Since a tool can't have multiple output languages, for inner nodes of +the graph the input and output languages should match. This is enforced at +compile-time.
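Conceptually, the language map is a lookup from file suffix to language name. A small C++ sketch of that lookup is given below; languageForFile is a hypothetical function written for this illustration, and returning an empty string for an unknown suffix is a simplification (llvmc reports an error, as shown above).

```cpp
#include <map>
#include <string>

// Illustrative lookup: maps the suffix of FileName to a language name
// using a suffix-to-language table. Returns an empty string when the
// file has no suffix or the suffix is not in the table.
std::string languageForFile(
    const std::map<std::string, std::string> &SuffixToLang,
    const std::string &FileName) {
  std::string::size_type Dot = FileName.rfind('.');
  if (Dot == std::string::npos)
    return std::string();
  std::map<std::string, std::string>::const_iterator It =
      SuffixToLang.find(FileName.substr(Dot + 1));
  return It == SuffixToLang.end() ? std::string() : It->second;
}
```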

    +
    +
    +

    Option preprocessor

    +

    It is sometimes useful to run error-checking code before processing the +compilation graph. For example, if optimization options "-O1" and "-O2" are +implemented as switches, we might want to output a warning if the user invokes +the driver with both of these options enabled.

    +

    The OptionPreprocessor feature is reserved specially for these +occasions. Example (adapted from the built-in Base plugin):

    +
    +def Preprocess : OptionPreprocessor<
    +(case (not (any_switch_on ["O0", "O1", "O2", "O3"])),
    +           (set_option "O2"),
    +      (and (switch_on "O3"), (any_switch_on ["O0", "O1", "O2"])),
    +           (unset_option ["O0", "O1", "O2"]),
    +      (and (switch_on "O2"), (any_switch_on ["O0", "O1"])),
    +           (unset_option ["O0", "O1"]),
    +      (and (switch_on "O1"), (switch_on "O0")),
    +           (unset_option "O0"))
    +>;
    +
    +

    Here, OptionPreprocessor is used to unset all spurious -O options so +that they are not forwarded to the compiler. If no optimization options are +specified, -O2 is enabled.

    +

    OptionPreprocessor is basically a single big case expression, which is +evaluated only once right after the plugin is loaded. The only allowed actions +in OptionPreprocessor are error, warning, and two special actions: +unset_option and set_option. As their names suggest, they can be used to +set or unset a given option. To set an option with set_option, use the +two-argument form: (set_option "parameter", VALUE). Here, VALUE can be +either a string, a string list, or a boolean constant.

    +

    For convenience, set_option and unset_option also work on lists. That +is, instead of [(unset_option "A"), (unset_option "B")] you can use +(unset_option ["A", "B"]). Obviously, (set_option ["A", "B"]) is valid +only if both A and B are switches.
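The effect of the Preprocess definition above can be restated in plain C++. This is a model of the example's logic only, written for illustration: normalizeOptLevel is a hypothetical function, and the set of strings stands in for the switches that are currently on.

```cpp
#include <set>
#include <string>

// Illustrative model of the Preprocess example: Opts holds the names of
// the optimization switches that are on. Afterwards only the highest
// requested level remains, or "O2" if none was given.
void normalizeOptLevel(std::set<std::string> &Opts) {
  const bool O0 = Opts.count("O0") != 0;
  const bool O1 = Opts.count("O1") != 0;
  const bool O2 = Opts.count("O2") != 0;
  const bool O3 = Opts.count("O3") != 0;

  if (!(O0 || O1 || O2 || O3)) {
    Opts.insert("O2");                 // (set_option "O2")
  } else if (O3 && (O0 || O1 || O2)) {
    Opts.erase("O0");                  // (unset_option ["O0", "O1", "O2"])
    Opts.erase("O1");
    Opts.erase("O2");
  } else if (O2 && (O0 || O1)) {
    Opts.erase("O0");                  // (unset_option ["O0", "O1"])
    Opts.erase("O1");
  } else if (O1 && O0) {
    Opts.erase("O0");                  // (unset_option "O0")
  }
}
```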

    +
    +
    +

    More advanced topics

    +
    +

    Hooks and environment variables

    +

    Normally, LLVMC executes programs from the system PATH. Sometimes, +this is not sufficient: for example, we may want to specify tool paths +or names in the configuration file. This can be easily achieved via +the hooks mechanism. To write your own hooks, just add their +definitions to the PluginMain.cpp or drop a .cpp file into +your plugin directory. Hooks should live in the hooks namespace +and have the signature std::string hooks::MyHookName ([const char* +Arg0 [, const char* Arg1 [, ...]]]). They can be used from the +cmd_line tool property:

    +
    +(cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")
    +
    +

    To pass arguments to hooks, use the following syntax:

    +
    +(cmd_line "$CALL(MyHook, 'Arg1', 'Arg2', 'Arg # 3')/path/to/file -o1 -o2")
    +
    +
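On the C++ side, a hook that takes arguments is an ordinary function in the hooks namespace with one const char* parameter per argument. The sketch below matches the three-argument $CALL(MyHook, ...) example; what it returns (the arguments joined with '/') is an arbitrary choice made for illustration.

```cpp
#include <string>

namespace hooks {

// Hypothetical three-argument hook. It joins its arguments with '/' to
// form a path prefix; a real hook can compute any string it likes.
std::string MyHook(const char *Arg1, const char *Arg2, const char *Arg3) {
  std::string Result(Arg1);
  Result += '/';
  Result += Arg2;
  Result += '/';
  Result += Arg3;
  return Result;
}

} // end namespace hooks
```

The returned string is substituted for the $CALL(...) expression in the command line.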

    It is also possible to use environment variables in the same manner:

    +
    +(cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")
    +
    +

    To change the command line string based on user-provided options use +the case expression (documented above):

    +
    +(cmd_line
    +  (case
    +    (switch_on "E"),
    +       "llvm-g++ -E -x c $INFILE -o $OUTFILE",
    +    (default),
    +       "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))
    +
    +
    +
    +

    How plugins are loaded

    +

    It is possible for LLVMC plugins to depend on each other. For example, +one can create edges between nodes defined in some other plugin. To +make this work, however, that plugin should be loaded first. To +achieve this, the concept of plugin priority was introduced. By +default, every plugin has priority zero; to specify the priority +explicitly, put the following line in your plugin's TableGen file:

    +
    +def Priority : PluginPriority<$PRIORITY_VALUE>;
    +// where $PRIORITY_VALUE is some integer > 0
    +
    +

    Plugins are loaded in order of their (increasing) priority, starting +with 0. Therefore, the plugin with the highest priority value will be +loaded last.
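The loading order amounts to sorting the plugins by priority. A minimal C++ sketch follows; PluginInfo and sortByPriority are hypothetical names, and keeping plugins of equal priority in their original relative order (via std::stable_sort) is an assumption of this sketch, since the text does not say how ties are broken.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative record for a plugin and its priority.
struct PluginInfo {
  std::string Name;
  int Priority;
};

// Orders plugins by increasing priority, so the plugin with the highest
// priority value ends up last. Ties keep their original order.
void sortByPriority(std::vector<PluginInfo> &Plugins) {
  std::stable_sort(Plugins.begin(), Plugins.end(),
                   [](const PluginInfo &A, const PluginInfo &B) {
                     return A.Priority < B.Priority;
                   });
}
```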

    +
    +
    +

    Debugging

    +

    When writing LLVMC plugins, it can be useful to get a visual view of +the resulting compilation graph. This can be achieved via the command +line option --view-graph. This command assumes that Graphviz and +Ghostview are installed. There is also a --write-graph option that +creates a Graphviz source file (compilation-graph.dot) in the +current directory.

    +

    Another useful llvmc option is --check-graph. It checks the +compilation graph for common errors like mismatched output/input +language names, multiple default edges and cycles. These checks can't +be performed at compile-time because the plugins can load code +dynamically. When invoked with --check-graph, llvmc doesn't +perform any compilation tasks and returns the number of encountered +errors as its status code.
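One of the checks mentioned above, cycle detection, can be sketched as a depth-first search over the list of edges. The code below is a generic illustration and makes no claim about how llvmc implements --check-graph; EdgeList, hasCycle and the helper are names invented here.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// An edge list of (from, to) tool names, as in a compilation graph.
typedef std::vector<std::pair<std::string, std::string> > EdgeList;
typedef std::map<std::string, std::vector<std::string> > SuccessorMap;

enum VisitState { Unvisited = 0, InProgress, Done };

// Depth-first search; reaching an InProgress node again means a cycle.
static bool dfsFindsCycle(const std::string &Node, const SuccessorMap &Succ,
                          std::map<std::string, VisitState> &State) {
  State[Node] = InProgress;
  SuccessorMap::const_iterator It = Succ.find(Node);
  if (It != Succ.end()) {
    for (const std::string &Next : It->second) {
      VisitState S = State[Next]; // missing entries default to Unvisited
      if (S == InProgress)
        return true;
      if (S == Unvisited && dfsFindsCycle(Next, Succ, State))
        return true;
    }
  }
  State[Node] = Done;
  return false;
}

bool hasCycle(const EdgeList &Edges) {
  SuccessorMap Succ;
  for (const std::pair<std::string, std::string> &E : Edges)
    Succ[E.first].push_back(E.second);
  std::map<std::string, VisitState> State;
  for (const SuccessorMap::value_type &Entry : Succ)
    if (State[Entry.first] == Unvisited &&
        dfsFindsCycle(Entry.first, Succ, State))
      return true;
  return false;
}
```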

    +
    +
    +

    Conditioning on the executable name

    +

    For now, the executable name (the value passed to the driver in argv[0]) is +accessible only in the C++ code (i.e. hooks). Use the following code:

    +
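    +#include <cstring>
    +#include <string>
    +
    +namespace llvmc {
    +extern const char* ProgramName;
    +}
    +
    +namespace hooks {
    +
    +std::string MyHook() {
    +  //...
    +  if (std::strcmp(llvmc::ProgramName, "mydriver") == 0) {
    +    //...
    +  }
    +  return std::string(); // placeholder; return the hook's real result
    +}
    +
    +} // end namespace hooks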
    +
    +

    In general, you're encouraged not to make the behaviour dependent on the +executable file name, and use command-line switches instead. See for example how +the Base plugin behaves when it needs to choose the correct linker options +(think g++ vs. gcc).

    +
    +
    +Mikhail Glushenkov
    +LLVM Compiler Infrastructure
    + +Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    +
    +
    + + Added: www-releases/trunk/2.8/docs/CompilerDriverTutorial.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CompilerDriverTutorial.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CompilerDriverTutorial.html (added) +++ www-releases/trunk/2.8/docs/CompilerDriverTutorial.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,126 @@ + + + + + + +Tutorial - Using LLVMC + + + +
    +

    Tutorial - Using LLVMC

    + + + +
    +

    Written by Mikhail Glushenkov

    +
    +

    Introduction

    +

    LLVMC is a generic compiler driver, which plays the same role for LLVM +as the gcc program does for GCC - the difference being that LLVMC +is designed to be more adaptable and easier to customize. Most of +LLVMC functionality is implemented via plugins, which can be loaded +dynamically or compiled in. This tutorial describes the basic usage +and configuration of LLVMC.

    +
    +
    +

    Compiling with LLVMC

    +

    In general, LLVMC tries to be command-line compatible with gcc as +much as possible, so most of the familiar options work:

    +
    +$ llvmc -O3 -Wall hello.cpp
    +$ ./a.out
    +hello
    +
    +

    This will invoke llvm-g++ under the hood (you can see which +commands are executed by using the -v option). For further help on +command-line LLVMC usage, refer to the llvmc --help output.

    +
    +
    +

    Using LLVMC to generate toolchain drivers

    +

    LLVMC plugins are written mostly using TableGen, so you need to +be familiar with it to get anything done.

    +

    Start by compiling example/Simple, which is a primitive wrapper for +gcc:

    +
    +$ cd $LLVM_DIR/tools/llvmc
    +$ cp -r example/Simple plugins/Simple
    +
    +  # NB: A less verbose way to compile standalone LLVMC-based drivers is
    +  # described in the reference manual.
    +
    +$ make LLVMC_BASED_DRIVER_NAME=mygcc LLVMC_BUILTIN_PLUGINS=Simple
    +$ cat > hello.c
    +[...]
    +$ mygcc hello.c
    +$ ./hello.out
    +Hello
    +
    +

    Here we link our plugin with the LLVMC core statically to form an executable +file called mygcc. It is also possible to build our plugin as a dynamic +library to be loaded by the llvmc executable (or any other LLVMC-based +standalone driver); this is described in the reference manual.

    +

    Contents of the file Simple.td look like this:

    +
    +// Include common definitions
    +include "llvm/CompilerDriver/Common.td"
    +
    +// Tool descriptions
    +def gcc : Tool<
    +[(in_language "c"),
    + (out_language "executable"),
    + (output_suffix "out"),
    + (cmd_line "gcc $INFILE -o $OUTFILE"),
    + (sink)
    +]>;
    +
    +// Language map
    +def LanguageMap : LanguageMap<[LangToSuffixes<"c", ["c"]>]>;
    +
    +// Compilation graph
    +def CompilationGraph : CompilationGraph<[Edge<"root", "gcc">]>;
    +
    +

    As you can see, this file consists of three parts: tool descriptions, +language map, and the compilation graph definition.

    +

    At the heart of LLVMC is the idea of a compilation graph: vertices in +this graph are tools, and edges represent a transformation path +between two tools (for example, assembly source produced by the +compiler can be transformed into executable code by an assembler). The +compilation graph is basically a list of edges; a special node named +root is used to mark graph entry points.
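Since the compilation graph is basically a list of edges, it can be modelled directly as such. The C++ sketch below collects every tool reachable from the special root node; the Edge type and reachableFromRoot are hypothetical names used only for this illustration.

```cpp
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// An edge (from, to) between two nodes of the compilation graph.
typedef std::pair<std::string, std::string> Edge;

// Breadth-first walk from "root": returns the names of all nodes that
// can be reached by following edges.
std::set<std::string> reachableFromRoot(const std::vector<Edge> &Edges) {
  std::set<std::string> Seen;
  std::queue<std::string> Work;
  Work.push("root");
  while (!Work.empty()) {
    std::string Node = Work.front();
    Work.pop();
    for (const Edge &E : Edges)
      // insert().second is true only the first time a node is seen.
      if (E.first == Node && Seen.insert(E.second).second)
        Work.push(E.second);
  }
  return Seen;
}
```

For the Simple.td example, which has the single edge from root to gcc, the result is just the gcc tool.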

    +

    Tool descriptions are represented as property lists: most properties +in the example above should be self-explanatory; the sink property +means that all options lacking an explicit description should be +forwarded to this tool.

    +

    The LanguageMap associates a language name with a list of suffixes +and is used for deciding which toolchain corresponds to a given input +file.

    +

    To learn more about LLVMC customization, refer to the reference +manual and plugin source code in the plugins directory.

    +
    +
    +Mikhail Glushenkov
    +LLVM Compiler Infrastructure
    + +Last modified: $Date: 2008-12-11 11:34:48 -0600 (Thu, 11 Dec 2008) $ +
    +
    + + Added: www-releases/trunk/2.8/docs/CompilerWriterInfo.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/CompilerWriterInfo.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/CompilerWriterInfo.html (added) +++ www-releases/trunk/2.8/docs/CompilerWriterInfo.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,263 @@ + + + + + Architecture/platform information for compiler writers + + + + + +
    + Architecture/platform information for compiler writers +
    + +
    +

    Note: This document is a work-in-progress. Additions and clarifications + are welcome.

    +
    + +
      +
    1. Hardware
      1. Alpha
      2. ARM
      3. Itanium
      4. MIPS
      5. PowerPC
      6. SPARC
      7. X86
      8. Other lists
    2. Application Binary Interface (ABI)
      1. Linux
      2. OS X
    3. Miscellaneous resources
    + +
    +

    Compiled by Misha Brukman

    +
    + + + + + + + + +
    + +
    + + + + + + + + + + + + + + + + + + + + +
    IBM - Official manuals and docs
    + + + + +
    Other documents, collections, notes
    + + + + + + + + + + + + +
    AMD - Official manuals and docs
    + + + + +
    Intel - Official manuals and docs
    + + + + +
    Other x86-specific information
    + + + + + + +
    + + + +
    + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + Misha Brukman
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/DebuggingJITedCode.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/DebuggingJITedCode.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/DebuggingJITedCode.html (added) +++ www-releases/trunk/2.8/docs/DebuggingJITedCode.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,152 @@ + + + + Debugging JITed Code With GDB + + + + +
    Debugging JITed Code With GDB
    +
      +
    1. Example usage
    2. Background
    +
    Written by Reid Kleckner
    + + + + +
    + +

    In order to debug code JITed by LLVM, you need GDB 7.0 or newer, which is +available on most modern distributions of Linux. The version of GDB that Apple +ships with Xcode has been frozen at 6.3 for a while. LLDB may be a better +option for debugging JITed code on Mac OS X. +

    + +

    Consider debugging the following code compiled with clang and run through +lli: +

    + +
    +#include <stdio.h>
    +
    +void foo() {
    +    printf("%d\n", *(int*)NULL);  // Crash here
    +}
    +
    +void bar() {
    +    foo();
    +}
    +
    +void baz() {
    +    bar();
    +}
    +
    +int main(int argc, char **argv) {
    +    baz();
    +}
    +
    + +

    Here are the commands to run that application under GDB and print the stack +trace at the crash: +

    + +
    +# Compile foo.c to bitcode.  You can use either clang or llvm-gcc with this
    +# command line.  Both require -fexceptions, or the calls are all marked
    +# 'nounwind' which disables DWARF exception handling info.  Custom frontends
    +# should avoid adding this attribute to JITed code, since it interferes with
    +# DWARF CFA generation at the moment.
    +$ clang foo.c -fexceptions -emit-llvm -c -o foo.bc
    +
    +# Run foo.bc under lli with -jit-emit-debug.  If you built lli in debug mode,
    +# -jit-emit-debug defaults to true.
    +$ $GDB_INSTALL/gdb --args lli -jit-emit-debug foo.bc
    +...
    +
    +# Run the code.
    +(gdb) run
    +Starting program: /tmp/gdb/lli -jit-emit-debug foo.bc
    +[Thread debugging using libthread_db enabled]
    +
    +Program received signal SIGSEGV, Segmentation fault.
    +0x00007ffff7f55164 in foo ()
    +
    +# Print the backtrace, this time with symbols instead of ??.
    +(gdb) bt
    +#0  0x00007ffff7f55164 in foo ()
    +#1  0x00007ffff7f550f9 in bar ()
    +#2  0x00007ffff7f55099 in baz ()
    +#3  0x00007ffff7f5502a in main ()
    +#4  0x00000000007c0225 in llvm::JIT::runFunction(llvm::Function*,
    +    std::vector<llvm::GenericValue,
    +    std::allocator<llvm::GenericValue> > const&) ()
    +#5  0x00000000007d6d98 in
    +    llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*,
    +    std::vector<std::string,
    +    std::allocator<std::string> > const&, char const* const*) ()
    +#6  0x00000000004dab76 in main ()
    +
    + +

    As you can see, GDB can correctly unwind the stack and has the appropriate +function names. +

    +
    + + + + +
    + +

    Without special runtime support, debugging dynamically generated code with +GDB (as well as most debuggers) can be quite painful. Debuggers generally read +debug information from the object file of the code, but for JITed code, there is +no such file to look for. +

    + +

    Depending on the architecture, this can impact the debugging experience in +different ways. For example, on most 32-bit x86 architectures, you can simply +compile with -fno-omit-frame-pointer for GCC and -disable-fp-elim for LLVM. +When GDB creates a backtrace, it can properly unwind the stack, but the stack +frames owned by JITed code have ??'s instead of the appropriate symbol name. +However, on Linux x86_64 in particular, GDB relies on the DWARF call frame +address (CFA) debug information to unwind the stack, so even if you compile +your program to leave the frame pointer untouched, GDB will usually be unable +to unwind the stack past any JITed code stack frames. +

    + +

    In order to communicate the necessary debug info to GDB, an interface for +registering JITed code with debuggers has been designed and implemented for +GDB and LLVM. At a high level, whenever LLVM generates new machine code, it +also generates an object file in memory containing the debug information. LLVM +then adds the object file to the global list of object files and calls a special +function (__jit_debug_register_code) marked noinline that GDB knows about. When +GDB attaches to a process, it puts a breakpoint in this function and loads all +of the object files in the global list. When LLVM calls the registration +function, GDB catches the breakpoint signal, loads the new object file from +LLVM's memory, and resumes the execution. In this way, GDB can get the +necessary debug information. +
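The registration steps above can be sketched in C++. The structure layouts and the two symbol names follow the JIT compilation interface documented in the GDB manual; registerObjectFile is a hypothetical helper written for this illustration, and a real JIT would also have to build the in-memory object file that the entry points to.

```cpp
#include <stdint.h>

extern "C" {

typedef enum {
  JIT_NOACTION = 0,
  JIT_REGISTER_FN,
  JIT_UNREGISTER_FN
} jit_actions_t;

// One node of the global list of in-memory object files.
struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr;
  uint64_t symfile_size;
};

struct jit_descriptor {
  uint32_t version;
  uint32_t action_flag; // holds a jit_actions_t value
  struct jit_code_entry *relevant_entry;
  struct jit_code_entry *first_entry;
};

// GDB places a breakpoint in this function; it must not be inlined.
void __attribute__((noinline)) __jit_debug_register_code() {}

// GDB looks this descriptor up by name. The version must be 1.
struct jit_descriptor __jit_debug_descriptor = {1, 0, 0, 0};

} // extern "C"

// Hypothetical helper: link Entry at the head of the global list, mark
// it as the entry being registered, and notify the debugger.
void registerObjectFile(jit_code_entry *Entry, const char *Addr,
                        uint64_t Size) {
  Entry->symfile_addr = Addr;
  Entry->symfile_size = Size;
  Entry->prev_entry = 0;
  Entry->next_entry = __jit_debug_descriptor.first_entry;
  if (Entry->next_entry)
    Entry->next_entry->prev_entry = Entry;
  __jit_debug_descriptor.first_entry = Entry;
  __jit_debug_descriptor.relevant_entry = Entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code();
}
```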

    + +

    At the time of this writing, LLVM only supports architectures that use ELF +object files and it only generates symbols and DWARF CFA information. However, +it would be easy to add more information to the object file, so we don't need to +coordinate with GDB to get better debug information. +

    +
    + + +
    +
    + Reid Kleckner
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-07-07 13:16:45 -0700 (Wed, 07 Jul 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/DeveloperPolicy.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/DeveloperPolicy.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/DeveloperPolicy.html (added) +++ www-releases/trunk/2.8/docs/DeveloperPolicy.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,615 @@ + + + + + LLVM Developer Policy + + + + +
    LLVM Developer Policy
    +
      +
    1. Introduction
    2. Developer Policies
      1. Stay Informed
      2. Making a Patch
      3. Code Reviews
      4. Code Owners
      5. Test Cases
      6. Quality
      7. Obtaining Commit Access
      8. Making a Major Change
      9. Incremental Development
      10. Attribution of Changes
    3. Copyright, License, and Patents
      1. Copyright
      2. License
      3. Patents
      4. Developer Agreements
    +
    Written by the LLVM Oversight Team
    + + + + +
    +

    This document contains the LLVM Developer Policy which defines the project's + policy towards developers and their contributions. The intent of this policy + is to eliminate miscommunication, rework, and confusion that might arise from + the distributed nature of LLVM's development. By stating the policy in clear + terms, we hope each developer can know ahead of time what to expect when + making LLVM contributions. This policy covers all llvm.org subprojects, + including Clang, LLDB, etc.

    +

    This policy is also designed to accomplish the following objectives:

    + +
      +
    1. Attract both users and developers to the LLVM project.
    2. Make life as simple and easy for contributors as possible.
    3. Keep the top of Subversion trees as stable as possible.
    + +

    This policy is aimed at frequent contributors to LLVM. People interested in + contributing one-off patches can do so in an informal way by sending them to + the + llvm-commits + mailing list and engaging another developer to see it through the + process.

    +
    + + + + +
    +

    This section contains policies that pertain to frequent LLVM developers. We + always welcome one-off patches from people who do not + routinely contribute to LLVM, but we expect more from frequent contributors + to keep the system as efficient as possible for everyone. Frequent LLVM + contributors are expected to meet the following requirements in order for + LLVM to maintain a high standard of quality.

    +

    + + + +
    +

    Developers should stay informed by reading at least the "dev" mailing list + for the projects they are interested in, such as + llvmdev for + LLVM, cfe-dev + for Clang, or lldb-dev + for LLDB. If you are doing anything more than just casual work on LLVM, it + is suggested that you also subscribe to the "commits" mailing list for the + subproject you're interested in, such as + llvm-commits, + cfe-commits, + or lldb-commits. + Reading the "commits" list and paying attention to changes being made by + others is a good way to see what other people are interested in and to watch + the flow of the project as a whole.

    + +

    We recommend that active developers register an email account with + LLVM Bugzilla and preferably subscribe to + the llvm-bugs + email list to keep track of bugs and enhancements occurring in LLVM. We + really appreciate people who are proactive at catching incoming bugs in their + components and dealing with them promptly.

    +
    + + + + +
    +

    When making a patch for review, the goal is to make it as easy for the + reviewer to read it as possible. As such, we recommend that you:

    + +
      +
    1. Make your patch against the Subversion trunk, not a branch, and not an old + version of LLVM. This makes it easy to apply the patch. For information + on how to check out SVN trunk, please see the Getting Started Guide.
    2. Similarly, patches should be submitted soon after they are generated. Old + patches may not apply correctly if the underlying code changes between the + time the patch was created and the time it is applied.
    3. Patches should be made with svn diff, or similar. If you use + a different tool, make sure it uses the diff -u format and + that it doesn't contain clutter which makes it hard to read.
    4. If you are modifying generated files, such as the top-level + configure script, please separate out those changes into + a separate patch from the rest of your changes.
    + +

    When sending a patch to a mailing list, it is a good idea to send it as an + attachment to the message, not embedded into the text of the + message. This ensures that your mailer will not mangle the patch when it + sends it (e.g. by making whitespace changes or by wrapping lines).

    + +

    For Thunderbird users: Before submitting a patch, please open + Preferences → Advanced → General → Config Editor, + find the key mail.content_disposition_type, and set its value to + 1. Without this setting, Thunderbird sends your attachment using + Content-Disposition: inline rather than Content-Disposition: + attachment. Apple Mail gamely displays such a file inline, making it + difficult to work with for reviewers using that program.

    +
    + + + +
    +

    LLVM has a code review policy. Code review is one way to increase the quality + of software. We generally follow these policies:

    + +
      +
    1. All developers are required to have significant changes reviewed before + they are committed to the repository.
    2. Code reviews are conducted by email, usually on the llvm-commits + list.
    3. Code can be reviewed either before it is committed or after. We expect + major changes to be reviewed before being committed, but smaller changes + (or changes where the developer owns the component) can be reviewed after + commit.
    4. The developer responsible for a code change is also responsible for making + all necessary review-related changes.
    5. Code review can be an iterative process, which continues until the patch + is ready to be committed.
    + +

    Developers should participate in code reviews as both reviewers and + reviewees. If someone is kind enough to review your code, you should return + the favor for someone else. Note that anyone is welcome to review and give + feedback on a patch, but only people with Subversion write access can approve + it.

    +
    + + + +
    + +

    The LLVM Project relies on two features of its process to maintain rapid + development in addition to the high quality of its source base: the + combination of code review plus post-commit review for trusted maintainers. + Having both is a great way for the project to take advantage of the fact that + most people do the right thing most of the time, and only commit patches + without pre-commit review when they are confident they are right.

    + +

    The trick to this is that the project has to guarantee that all patches that + are committed are reviewed after they go in: you don't want everyone to + assume someone else will review it, allowing the patch to go unreviewed. To + solve this problem, we have a notion of an 'owner' for a piece of the code. + The sole responsibility of a code owner is to ensure that a commit to their + area of the code is appropriately reviewed, either by themselves or by someone + else. The current code owners are:

    + +
      +
    1. Evan Cheng: Code generator and all targets.
    2. Doug Gregor: Clang Basic, Lex, Parse, and Sema Libraries.
    3. Anton Korobeynikov: Exception handling, debug information, and + Windows codegen.
    4. Ted Kremenek: Clang Static Analyzer.
    5. Chris Lattner: Everything not covered by someone else.
    6. Duncan Sands: llvm-gcc 4.2.
    + +

    Note that code owners are completely different from reviewers: anyone can + review a piece of code, and we welcome code review from anyone who is + interested. Code owners are the "last line of defense" to guarantee that all + patches that are committed are actually reviewed.

    + +

    Being a code owner is a somewhat unglamorous position, but it is incredibly + important for the ongoing success of the project. Because people get busy, + interests change, and unexpected things happen, code ownership is purely + opt-in, and anyone can choose to resign their "title" at any time. For now, + we do not have an official policy on how one gets elected to be a code + owner.

    +
    + + + +
    +

    Developers are required to create test cases for any bugs fixed and any new + features added. Some tips for getting your testcase approved:

    + +
      +
    1. All feature and regression test cases are added to the + llvm/test directory. The appropriate sub-directory should be + selected (see the Testing Guide for + details).
    2. Test cases should be written in LLVM assembly + language unless the feature or regression being tested requires + another language (e.g. the bug being fixed or feature being implemented is + in the llvm-gcc C++ front-end, in which case it must be written in + C++).
    3. Test cases, especially for regressions, should be reduced as much as + possible, by bugpoint or manually. It is + unacceptable to place an entire failing program into llvm/test as + this creates a time-to-test burden on all developers. Please keep + them short.
    + +

    Note that llvm/test and clang/test are designed for regression and small + feature tests only. More extensive test cases (e.g., entire applications, + benchmarks, etc) + should be added to the llvm-test test suite. The llvm-test suite is + for coverage (correctness, performance, etc) testing, not feature or + regression testing.

    +
    + + + +
    +

    The minimum quality standards that any change must satisfy before being + committed to the main development branch are:

    + +
      +
    1. Code must adhere to the LLVM Coding + Standards.
    2. Code must compile cleanly (no errors, no warnings) on at least one + platform.
    3. Bug fixes and new features should include a + testcase so we know if the fix/feature ever regresses in the + future.
    4. Code must pass the llvm/test test suite.
    5. The code must not cause regressions on a reasonable subset of llvm-test, + where "reasonable" depends on the contributor's judgement and the scope of + the change (more invasive changes require more testing). A reasonable + subset might be something like + "llvm-test/MultiSource/Benchmarks".
    + +

    Additionally, the committer is responsible for addressing any problems found + in the future that the change is responsible for. For example:

    + +
      +
    • The code should compile cleanly on all supported platforms.
    • The changes should not cause any correctness regressions in the + llvm-test suite and must not cause any major performance + regressions.
    • The change set should not cause performance or correctness regressions for + the LLVM tools.
    • The changes should not cause performance or correctness regressions in + code compiled by LLVM on all applicable targets.
    • You are expected to address any bugzilla + bugs that result from your change.
    + +

    We prefer for this to be handled before submission but understand that it + isn't possible to test all of this for every submission. Our build bots and + nightly testing infrastructure normally find these problems. A good rule of + thumb is to check the nightly testers for regressions the day after your + change. Build bots will directly email you if a group of commits that + included yours caused a failure. You are expected to check the build bot + messages to see if they are your fault and, if so, fix the breakage.

    + +

    Commits that violate these quality standards (e.g. are very broken) may be + reverted. This is necessary when the change blocks other developers from + making progress. The developer is welcome to re-commit the change after the + problem has been fixed.

    +
    + + + +
    + +

    We grant commit access to contributors with a track record of submitting high + quality patches. If you would like commit access, please send an email to + Chris with the following + information:

    + +
      +
    1. The user name you want to commit with, e.g. "hacker".
    2. The full name and email address you want messages to llvm-commits to come + from, e.g. "J. Random Hacker <hacker at yoyodyne.com>".
    3. A "password hash" of the password you want to use, e.g. "2ACR96qjUqsyM". + Note that you don't ever tell us what your password is; you just give it + to us in an encrypted form. To get this, run "htpasswd" (a utility that + comes with apache) in crypt mode (often enabled with "-d"), or find a web + page that will do it for you.
    + +

    Once you've been granted commit access, you should be able to check out an + LLVM tree with an SVN URL of "https://username at llvm.org/..." instead of the + normal anonymous URL of "http://llvm.org/...". The first time you commit + you'll have to type in your password. Note that you may get a warning from + SVN about an untrusted key; you can ignore this. To verify that your commit + access works, please do a test commit (e.g. change a comment or add a blank + line). Your first commit to a repository may require the autogenerated email + to be approved by a mailing list. This is normal, and will be done when + the mailing list owner has time.

    + +

    If you have recently been granted commit access, these policies apply:

    + +
      +
    1. You are granted commit-after-approval to all parts of LLVM. To get + approval, submit a patch to + llvm-commits. + When approved you may commit it yourself.
    2. You are allowed to commit patches without approval which you think are + obvious. This is clearly a subjective decision; we simply expect + you to use good judgement. Examples include: fixing build breakage, + reverting obviously broken patches, documentation/comment changes, any + other minor changes.
    3. You are allowed to commit patches without approval to those portions of + LLVM that you have contributed or maintain (i.e., have been assigned + responsibility for), with the proviso that such commits must not break the + build. This is a "trust but verify" policy and commits of this nature are + reviewed after they are committed.
    4. Multiple violations of these policies or a single egregious violation may + cause commit access to be revoked.
    + +

    In any case, your changes are still subject to code + review (either before or after they are committed, depending on the + nature of the change). You are encouraged to review other people's patches + as well, but you aren't required to.

    +
    + + + +
    +

    When a developer begins a major new project with the aim of contributing it + back to LLVM, s/he should inform the community with an email to + the llvmdev + email list, to the extent possible. The reason for this is to: + +

      +
    1. keep the community informed about future changes to LLVM,
    2. avoid duplication of effort by preventing multiple parties working on the + same thing and not knowing about it, and
    3. ensure that any technical issues around the proposed work are discussed + and resolved before any significant work is done.
    + +

    The design of LLVM is carefully controlled to ensure that all the pieces fit + together well and are as consistent as possible. If you plan to make a major + change to the way LLVM works or want to add a major new extension, it is a + good idea to get consensus with the development community before you start + working on it.

    + +

    Once the design of the new feature is finalized, the work itself should be + done as a series of incremental changes, not as a + long-term development branch.

    +
    + + + +
    +

    In the LLVM project, we do all significant changes as a series of incremental + patches. We have a strong dislike for huge changes or long-term development + branches. Long-term development branches have a number of drawbacks:

    + +
      +
    1. Branches must have mainline merged into them periodically. If the branch + development and mainline development occur in the same pieces of code, + resolving merge conflicts can take a lot of time.
    2. Other people in the community tend to ignore work on branches.
    3. Huge changes (produced when a branch is merged back onto mainline) are + extremely difficult to code review.
    4. Branches are not routinely tested by our nightly tester + infrastructure.
    5. Changes developed as monolithic large changes often don't work until the + entire set of changes is done. Breaking it down into a set of smaller + changes increases the odds that any of the work will be committed to the + main repository.
    + +

    To address these problems, LLVM uses an incremental development style and we + require contributors to follow this practice when making a large/invasive + change. Some tips:

    + +
      +
    • Large/invasive changes usually have a number of secondary changes that are + required before the big change can be made (e.g. API cleanup, etc). These + sorts of changes can often be done before the major change is done, + independently of that work.
    • The remaining inter-related work should be decomposed into unrelated sets + of changes if possible. Once this is done, define the first increment and + get consensus on what the end goal of the change is.
    • Each change in the set can be stand alone (e.g. to fix a bug), or part of + a planned series of changes that works towards the development goal.
    • Each change should be kept as small as possible. This simplifies your work + (into a logical progression), simplifies code review and reduces the + chance that you will get negative feedback on the change. Small increments + also facilitate the maintenance of a high quality code base.
    • Often, an independent precursor to a big change is to add a new API and + slowly migrate clients to use the new API. Each change to use the new API + is often "obvious" and can be committed without review. Once the new API + is in place and used, it is much easier to replace the underlying + implementation of the API. This implementation change is logically + separate from the API change.
    + +

    If you are interested in making a large change, and this scares you, please + make sure to first discuss the change/gather consensus + then ask about the best way to go about making the change.

    +
    + + + +
    +

    We believe in correct attribution of contributions to their contributors. + However, we do not want the source code to be littered with random + attributions "this code written by J. Random Hacker" (this is noisy and + distracting). In practice, the revision control system keeps a perfect + history of who changed what, and the CREDITS.txt file describes higher-level + contributions. If you commit a patch for someone else, please say "patch + contributed by J. Random Hacker!" in the commit message.

    + +

    Overall, please do not add contributor names to the source code.

    +
    + + + + + +
    +

    This section addresses the issues of copyright, license and patents for the + LLVM project. Currently, the University of Illinois is the LLVM copyright + holder and its license to LLVM users and developers is the + University of + Illinois/NCSA Open Source License.

    + +
    +

    NOTE: This section deals with + legal matters but does not provide legal advice. We are not lawyers; please + seek legal counsel from an attorney.

    +
    +
    + + + +
    +

    For consistency and ease of management, the project requires the copyright + for all LLVM software to be held by a single copyright holder: the University + of Illinois (UIUC).

    + +

    Although UIUC may eventually reassign the copyright of the software to + another entity (e.g. a dedicated non-profit "LLVM Organization") the intent + for the project is to always have a single entity hold the copyrights to LLVM + at any given time.

    + +

    We believe that having a single copyright holder is in the best interests of + all developers and users as it greatly reduces the managerial burden for any + kind of administrative or technical decisions about LLVM. The goal of the + LLVM project is to always keep the code open and licensed + under a very liberal license.

    +
    + + + +
    +

    We intend to keep LLVM perpetually open source and to use a liberal open + source license. The current license is the + University of + Illinois/NCSA Open Source License, which boils down to this:

    + +
      +
    • You can freely distribute LLVM.
    • You must retain the copyright notice if you redistribute LLVM.
    • Binaries derived from LLVM must reproduce the copyright notice (e.g. in + an included readme file).
    • You can't use our names to promote your LLVM derived products.
    • There's no warranty on LLVM at all.
    + +

    We believe this fosters the widest adoption of LLVM because it allows + commercial products to be derived from LLVM with few restrictions and + without a requirement for making any derived works also open source (i.e. + LLVM's license is not a "copyleft" license like the GPL). We suggest that you + read the License + if further clarification is needed.

    + +

    Note that the LLVM Project does distribute llvm-gcc, which is GPL. + This means that anything "linked" into llvm-gcc must itself be compatible + with the GPL, and must be releasable under the terms of the GPL. This + implies that any code linked into llvm-gcc and distributed to others may + be subject to the viral aspects of the GPL (for example, a proprietary + code generator linked into llvm-gcc must be made available under the GPL). + This is not a problem for code already distributed under a more liberal + license (like the UIUC license), and does not affect code generated by + llvm-gcc. It may be a problem if you intend to base commercial development + on llvm-gcc without redistributing your source code.

    + +

    We have no plans to change the license of LLVM. If you have questions or + comments about the license, please contact the + LLVM Oversight Group.

    +
    + + + +
    +

    To the best of our knowledge, LLVM does not infringe on any patents (we have + actually removed code from LLVM in the past that was found to infringe). + Having code in LLVM that infringes on patents would violate an important goal + of the project by making it hard or impossible to reuse the code for + arbitrary purposes (including commercial use).

    + +

    When contributing code, we expect contributors to notify us of any potential + for patent-related trouble with their changes. If you or your employer own + the rights to a patent and would like to contribute code to LLVM that relies + on it, we require that the copyright owner sign an agreement that allows any + other user of LLVM to freely use your patent. Please contact + the oversight group for more + details.

    +
    + + + +
    +

    With regards to the LLVM copyright and licensing, developers agree to assign + their copyrights to UIUC for any contribution made so that the entire + software base can be managed by a single copyright holder. This implies that + any contributions can be licensed under the license that the project + uses.

    + +

    When contributing code, you also affirm that you are legally entitled to + grant this copyright, personally or on behalf of your employer. If the code + belongs to some other entity, please raise this issue with the oversight + group before the code is committed.

    +
    + + +
    +
    + Valid CSS + Valid HTML 4.01 + Written by the + LLVM Oversight Group
    + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-09-01 17:09:17 -0700 (Wed, 01 Sep 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/ExceptionHandling.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/ExceptionHandling.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/ExceptionHandling.html (added) +++ www-releases/trunk/2.8/docs/ExceptionHandling.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,626 @@ + + + + Exception Handling in LLVM + + + + + + + +
    Exception Handling in LLVM
    + + + + +
    + +
    + +
    +

    Written by Jim Laskey

    +
    + + + + + + +
    + +

    This document is the central repository for all information pertaining to + exception handling in LLVM. It describes the format that LLVM exception + handling information takes, which is useful for those interested in creating + front-ends or dealing directly with the information. Further, this document + provides specific examples of what exception handling information is used for + in C/C++.

    + +
    + + + + +
    + +

    Exception handling for most programming languages is designed to recover from + conditions that rarely occur during general use of an application. To that + end, exception handling should not interfere with the main flow of an + application's algorithm by performing checkpointing tasks, such as saving the + current pc or register state.

    + +

    The Itanium ABI Exception Handling Specification defines a methodology for + providing outlying data in the form of exception tables without inlining + speculative exception handling code in the flow of an application's main + algorithm. Thus, the specification is said to add "zero-cost" to the normal + execution of an application.

    + +

    A more complete description of the Itanium ABI exception handling runtime + support can be found at + Itanium C++ ABI: + Exception Handling. A description of the exception frame format can be + found at + Exception + Frames, with details of the DWARF 3 specification at + DWARF 3 Standard. + A description for the C++ exception table formats can be found at + Exception Handling + Tables.

    + +
    + + + + +
    + +

    Setjmp/Longjmp (SJLJ) based exception handling uses LLVM intrinsics + llvm.eh.sjlj.setjmp and + llvm.eh.sjlj.longjmp to + handle control flow for exception handling.

    + +

    For each function which does exception processing, be it try/catch blocks + or cleanups, that function registers itself on a global frame list. When + exceptions are being unwound, the runtime uses this list to identify which + functions need processing.

    + +

    Landing pad selection is encoded in the call site entry of the function + context. The runtime returns to the function via + llvm.eh.sjlj.longjmp, where + a switch table transfers control to the appropriate landing pad based on + the index stored in the function context.

    + +

    In contrast to DWARF exception handling, which encodes exception regions + and frame information in out-of-line tables, SJLJ exception handling + builds and removes the unwind frame context at runtime. This results in + faster exception handling at the expense of slower execution when no + exceptions are thrown. As exceptions are, by their nature, intended for + uncommon code paths, DWARF exception handling is generally preferred to + SJLJ.
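    The frame-list scheme described above can be modelled in a few lines of
    portable C++. This is only a sketch built on the standard setjmp and
    longjmp facilities; it is not the code LLVM generates, and the names
    Frame, frame_list, raise_exception, may_fail and guarded are hypothetical.
    It shows why SJLJ costs something even when nothing is thrown: every
    guarded call registers and unregisters a frame.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical model of one entry on the global frame list.
struct Frame {
  std::jmp_buf context;    // where to resume if an exception reaches this frame
  Frame *previous;         // the next-outer registered frame
  volatile int call_site;  // index of the call in progress; selects the landing pad
};

static Frame *frame_list = nullptr;  // head of the global frame list
static int pending_exception = 0;    // the value being "thrown" in this model

// Model of raising an exception: jump back to the innermost registered frame.
[[noreturn]] void raise_exception(int value) {
  assert(frame_list != nullptr && "no handler registered");
  pending_exception = value;
  std::longjmp(frame_list->context, 1);
}

// A callee that "throws" on negative input.
int may_fail(int x) {
  if (x < 0) raise_exception(x);
  return x * 2;
}

int guarded(int x) {
  Frame frame;
  frame.previous = frame_list;
  frame.call_site = 0;
  frame_list = &frame;  // registration cost is paid even if nothing is thrown

  int result = 0;
  if (setjmp(frame.context) == 0) {
    frame.call_site = 1;  // record which call is about to run
    result = may_fail(x);
  } else {
    // Re-entered via longjmp: dispatch on the recorded call site,
    // much like the switch table mentioned in the text.
    switch (frame.call_site) {
      case 1: result = pending_exception; break;  // landing pad for call 1
      default: result = 0; break;
    }
  }
  frame_list = frame.previous;  // unregister the frame
  return result;
}
```

    Calling guarded(21) takes the normal path and returns 42; guarded(-5)
    re-enters through the landing pad and returns the "thrown" value. The
    model only uses trivially destructible locals, because longjmp does not
    run C++ destructors.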

    +
    + + + + +
    + +

    When an exception is thrown in LLVM code, the runtime does its best to find a + handler suited to processing the circumstance.

    + +

    The runtime first attempts to find an exception frame corresponding to + the function where the exception was thrown. If the programming language + (e.g. C++) supports exception handling, the exception frame contains a + reference to an exception table describing how to process the exception. If + the language (e.g. C) does not support exception handling, or if the + exception needs to be forwarded to a prior activation, the exception frame + contains information about how to unwind the current activation and restore + the state of the prior activation. This process is repeated until the + exception is handled. If the exception is not handled and no activations + remain, then the application is terminated with an appropriate error + message.

    + +

    Because different programming languages have different behaviors when + handling exceptions, the exception handling ABI provides a mechanism for + supplying personalities. An exception handling personality is defined + by way of a personality function (e.g. __gxx_personality_v0 + in C++), which receives the context of the exception, an exception + structure containing the exception object type and value, and a reference + to the exception table for the current function. The personality function + for the current compile unit is specified in a common exception + frame.

    + +

    The organization of an exception table is language dependent. For C++, an + exception table is organized as a series of code ranges defining what to do + if an exception occurs in that range. Typically, the information associated + with a range defines which types of exception objects (using C++ type + info) are handled in that range, and an associated action that + should take place. Actions typically pass control to a landing + pad.

    + +

    A landing pad corresponds to the code found in the catch portion of + a try/catch sequence. When execution resumes at a landing + pad, it receives the exception structure and a selector corresponding to + the type of exception thrown. The selector is then used to determine + which catch should actually process the exception.

    + +
    + + + + +
    + +

    At the time of this writing, only C++ exception handling support is available + in LLVM. So the remainder of this document will be somewhat C++-centric.

    + +

    From the C++ developer's perspective, exceptions are defined in terms of the + throw and try/catch statements. In this section + we will describe the implementation of LLVM exception handling in terms of + C++ examples.

    + +
    + + +
    + Throw +
    + +
    + +

    Languages that support exception handling typically provide a throw + operation to initiate the exception process. Internally, a throw operation + breaks down into two steps. First, a request is made to allocate exception + space for an exception structure. This structure needs to survive beyond the + current activation. This structure will contain the type and value of the + object being thrown. Second, a call is made to the runtime to raise the + exception, passing the exception structure as an argument.

    + +

    In C++, the allocation of the exception structure is done by + the __cxa_allocate_exception runtime function. The exception + raising is handled by __cxa_throw. The type of the exception is + represented using a C++ RTTI structure.
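    As a source-level illustration of the two steps above, consider the
    following sketch. The functions parse_digit and parse_digit_or_minus_one
    are hypothetical, and the lowering shown in the comments is approximate;
    the exact calls depend on the C++ runtime in use.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper used only for illustration.
int parse_digit(char c) {
  if (c < '0' || c > '9') {
    // With an Itanium-ABI C++ runtime, this statement is lowered roughly to:
    //   1. __cxa_allocate_exception(sizeof(std::invalid_argument))
    //   2. construction of the exception object in that storage
    //   3. __cxa_throw(storage, RTTI for std::invalid_argument, destructor)
    throw std::invalid_argument(std::string("not a digit: ") + c);
  }
  return c - '0';
}

// Returns the digit, or -1 if parse_digit raised an exception.
int parse_digit_or_minus_one(char c) {
  try {
    return parse_digit(c);
  } catch (const std::invalid_argument &e) {
    // The exception object outlived the activation that threw it.
    assert(std::string(e.what()).find("not a digit") == 0);
    return -1;
  }
}
```

    The exception object is still readable inside the handler even though
    parse_digit has already been unwound, which is why the runtime allocates
    it outside the throwing function's stack frame.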

    + +
    + + + + +
    + +

    A call within the scope of a try statement can potentially raise an + exception. In those circumstances, the LLVM C++ front-end replaces the call + with an invoke instruction. Unlike a call, the invoke has + two potential continuation points: where to continue when the call succeeds + as per normal; and where to continue if the call raises an exception, either + by a throw or the unwinding of a throw.

    + +

    The place where an invoke continues after + an exception is called a landing pad. LLVM landing pads are + conceptually alternative function entry points where an exception structure + reference and a type info index are passed in as arguments. The landing pad + saves the exception structure reference and then proceeds to select the catch + block that corresponds to the type info of the exception object.

    + +

    Two LLVM intrinsic functions are used to convey information about the landing + pad to the back end.

    + +
      +
    1. llvm.eh.exception takes no + arguments and returns a pointer to the exception structure. This only + returns a sensible value if called after an invoke has branched + to a landing pad. Due to code generation limitations, it must currently + be called in the landing pad itself.
    2. llvm.eh.selector takes a minimum + of three arguments. The first argument is the reference to the exception + structure. The second argument is a reference to the personality function + to be used for this try/catch sequence. Each of the + remaining arguments is either a reference to the type info for + a catch statement, a filter + expression, or the number zero (0) representing + a cleanup. The exception is tested against the + arguments sequentially from first to last. The result of + the llvm.eh.selector is a + positive number if the exception matched a type info, a negative number if + it matched a filter, and zero if it matched a cleanup. If nothing is + matched, the behaviour of the program + is undefined. This only returns a sensible + value if called after an invoke has branched to a landing pad. + Due to codegen limitations, it must currently be called in the landing pad + itself. If a type info matched, then the selector value is the index of + the type info in the exception table, which can be obtained using the + llvm.eh.typeid.for + intrinsic.
    + +

    Once the landing pad has the type info selector, the code branches to the + code for the first catch. The catch then checks the value of the type info + selector against the index of type info for that catch. Since the type info + index is not known until all the type info have been gathered in the backend, + the catch code will call the + llvm.eh.typeid.for intrinsic + to determine the index for a given type info. If the catch fails to match + the selector then control is passed on to the next catch. Note: Since the + landing pad will not be used if there is no match in the list of type info on + the call to llvm.eh.selector, then + neither the last catch nor catch all need to perform the check + against the selector.
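    The sequential matching described above is visible at the source level:
    the catch clauses of a try are tested in order and the first compatible
    one runs. The sketch below is a hypothetical example (Thrown, throw_kind
    and classify are invented names); the comments relate each clause to the
    selector comparison, without claiming to show the exact code the front-end
    emits.

```cpp
#include <cassert>
#include <stdexcept>

enum class Thrown { Nothing, Int, Runtime, Logic };

// Throws a different exception type depending on the request.
void throw_kind(Thrown kind) {
  switch (kind) {
    case Thrown::Int:     throw 42;
    case Thrown::Runtime: throw std::overflow_error("overflow");
    case Thrown::Logic:   throw std::logic_error("logic");
    case Thrown::Nothing: break;
  }
  assert(kind == Thrown::Nothing);  // only reached when nothing was thrown
}

// Returns which catch clause handled the exception (0 if none was thrown).
int classify(Thrown kind) {
  try {
    throw_kind(kind);  // inside a try, this call becomes an invoke
    return 0;          // normal continuation
  } catch (int) {
    return 1;          // selector equals the type info index for int
  } catch (const std::runtime_error &) {
    return 2;          // also matches types derived from runtime_error
  } catch (...) {
    return 3;          // catch-all: no selector check is needed here
  }
}
```

    Here std::overflow_error is handled by the second clause because it
    derives from std::runtime_error, and std::logic_error falls through to
    the catch-all.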

    + +

    Finally, the entry and exit of catch code is bracketed with calls + to __cxa_begin_catch and __cxa_end_catch.

    + +
      +
    • __cxa_begin_catch takes an exception structure reference as an + argument and returns the value of the exception object.
    • __cxa_end_catch takes no arguments. This function:

      1. Locates the most recently caught exception and decrements its handler + count,
      2. Removes the exception from the "caught" stack if the handler count + goes to zero, and
      3. Destroys the exception if the handler count goes to zero, and the + exception was not re-thrown by throw.

      Note: a rethrow from within the catch may replace this call with + a __cxa_rethrow.
    + +
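The three steps listed for __cxa_end_catch can be sketched as a toy model. This is only an illustration of the bookkeeping, not the real C++ ABI runtime: the names ToyException and ToyRuntime, and the choice to push an exception onto the stack only when its handler count becomes one, are assumptions made for the example.

```cpp
#include <vector>

// Toy model of the handler-count bookkeeping; all names are hypothetical.
struct ToyException {
  int handlerCount = 0;   // number of handlers currently active for this exception
  bool rethrown = false;  // set when a handler re-throws with `throw;`
  bool destroyed = false; // set once the exception object would be destroyed
};

struct ToyRuntime {
  std::vector<ToyException *> caught; // the "caught" stack, most recent last

  // Stands in for __cxa_begin_catch: bump the handler count and, on the
  // first handler, push the exception onto the caught stack.
  void beginCatch(ToyException *e) {
    if (++e->handlerCount == 1)
      caught.push_back(e);
  }

  // Stands in for __cxa_end_catch, following the three steps in the text.
  void endCatch() {
    if (caught.empty())
      return;
    ToyException *e = caught.back(); // 1. locate the most recently caught exception
    if (--e->handlerCount == 0) {    //    and decrement its handler count
      caught.pop_back();             // 2. remove it from the "caught" stack
      if (!e->rethrown)
        e->destroyed = true;         // 3. destroy it unless it was re-thrown
    }
  }
};
```

In this sketch an exception that was marked as re-thrown leaves the stack when its count reaches zero but is not destroyed, which matches the last condition in the list.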
    + + + + +
    + +

    To handle destructors and cleanups in try code, control may not run + directly from a landing pad to the first catch. Control may actually flow + from the landing pad to clean up code and then to the first catch. Since the + required clean up for each invoke in a try may be different + (e.g. intervening constructor), there may be several landing pads for a given + try. If cleanups need to be run, an i32 0 should be passed as the + last llvm.eh.selector argument. + However, when using DWARF exception handling with C++, an i8* null + must be passed instead.

    + +
    + + + + +
    + +

    C++ allows the specification of which exception types can be thrown from a + function. To represent this a top level landing pad may exist to filter out + invalid types. To express this in LLVM code the landing pad will + call llvm.eh.selector. The + arguments are a reference to the exception structure, a reference to the + personality function, the length of the filter expression (the number of type + infos plus one), followed by the type infos themselves. + llvm.eh.selector will return a + negative value if the exception does not match any of the type infos. If no + match is found then a call to __cxa_call_unexpected should be made, + otherwise _Unwind_Resume. Each of these functions requires a + reference to the exception structure. Note that the most general form of an + llvm.eh.selector call can contain + any number of type infos, filter expressions and cleanups (though having more + than one cleanup is pointless). The LLVM C++ front-end can generate such + llvm.eh.selector calls due to + inlining creating nested exception handling scopes.

    + +
    + + + + +
    + +

    The semantics of the invoke instruction require that any exception that + unwinds through an invoke call should result in a branch to the invoke's + unwind label. However such a branch will only happen if the + llvm.eh.selector matches. Thus in + order to ensure correct operation, the front-end must only generate + llvm.eh.selector calls that are + guaranteed to always match whatever exception unwinds through the invoke. + For most languages it is enough to pass zero, indicating the presence of + a cleanup, as the + last llvm.eh.selector argument. + However for C++ this is not sufficient, because the C++ personality function + will terminate the program if it detects that unwinding the exception only + results in matches with cleanups. For C++ a null i8* should be + passed as the last llvm.eh.selector + argument instead. This is interpreted as a catch-all by the C++ personality + function, and will always match.

    + +
    + + + + +
    + +

    LLVM uses several intrinsic functions (name prefixed with "llvm.eh") to + provide exception handling information at various points in generated + code.

    + +
    + + + + +
    + +
    +  i8* %llvm.eh.exception()
    +
    + +

    This intrinsic returns a pointer to the exception structure.

    + +
    + + + + +
    + +
    +  i32 %llvm.eh.selector(i8*, i8*, i8*, ...)
    +
    + +

    This intrinsic is used to compare the exception with the given type infos, + filters and cleanups.

    + +

    llvm.eh.selector takes a minimum of + three arguments. The first argument is the reference to the exception + structure. The second argument is a reference to the personality function to + be used for this try catch sequence. Each of the remaining arguments is + either a reference to the type info for a catch statement, + a filter expression, or the number zero + representing a cleanup. The exception is tested + against the arguments sequentially from first to last. The result of + the llvm.eh.selector is a positive + number if the exception matched a type info, a negative number if it matched + a filter, and zero if it matched a cleanup. If nothing is matched, the + behaviour of the program is undefined. If a type + info matched then the selector value is the index of the type info in the + exception table, which can be obtained using the + llvm.eh.typeid.for intrinsic.
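The sign convention of the result can be illustrated with a small model. It is a sketch under simplifying assumptions, not the intrinsic's implementation: type infos are represented by plain positive integers (standing in for the values llvm.eh.typeid.for would return), matching is by equality only, a filter is taken to match when the thrown type is absent from its list, and -1 is an arbitrary negative value. The names Kind, Clause and toySelector are hypothetical.

```cpp
#include <algorithm>
#include <vector>

enum class Kind { Catch, Filter, Cleanup };

// One trailing argument of the selector call in this toy model.
struct Clause {
  Kind kind;
  int typeIndex;            // Catch: positive index of the type info
  std::vector<int> allowed; // Filter: type indices the function may throw
};

// Tests the thrown type against the clauses from first to last.
// Returns false when nothing matches (undefined behaviour in the real
// intrinsic); otherwise stores the selector value in `result`.
bool toySelector(const std::vector<Clause> &clauses, int thrown, int &result) {
  for (const Clause &c : clauses) {
    switch (c.kind) {
    case Kind::Catch:
      if (c.typeIndex == thrown) {
        result = c.typeIndex; // positive: matched a type info
        return true;
      }
      break;
    case Kind::Filter:
      if (std::find(c.allowed.begin(), c.allowed.end(), thrown) ==
          c.allowed.end()) {
        result = -1; // negative: matched a filter
        return true;
      }
      break;
    case Kind::Cleanup:
      result = 0; // zero: matched a cleanup
      return true;
    }
  }
  return false;
}
```

Because the clauses are tested in order, a cleanup placed last acts as a fallback for any exception that no earlier catch or filter matched.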

    + +
    + + + + +
    + +
    +  i32 %llvm.eh.typeid.for(i8*)
    +
    + +

    This intrinsic returns the type info index in the exception table of the + current function. This value can be used to compare against the result + of llvm.eh.selector. The single + argument is a reference to a type info.

    + +
    + + + + +
    + +
    +  i32 %llvm.eh.sjlj.setjmp(i8*)
    +
    + +

    The SJLJ exception handling uses this intrinsic to force register saving for + the current function and to store the address of the following instruction + for use as a destination address by + llvm.eh.sjlj.longjmp. The buffer format and the overall + functioning of this intrinsic are compatible with the GCC + __builtin_setjmp implementation, allowing code built with the + two compilers to interoperate.

    + +

    The single parameter is a pointer to a five word buffer in which the calling + context is saved. The front end places the frame pointer in the first word, + and the target implementation of this intrinsic should place the destination + address for a + llvm.eh.sjlj.longjmp in the + second word. The following three words are available for use in a + target-specific manner.
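The buffer layout described above can be written down as a struct. This is a sketch only: the struct and field names are hypothetical, and it assumes that a "word" is the size of a pointer on the target.

```cpp
#include <cstddef>

// Hypothetical view of the five word buffer; only the layout comes from the text.
struct ToySjLjBuffer {
  void *framePointer;      // word 0: stored by the front end
  void *destination;       // word 1: destination address, stored by the target
  void *targetSpecific[3]; // words 2-4: available for target-specific use
};

// Compile-time checks that the struct really is five pointer-sized words.
static_assert(sizeof(ToySjLjBuffer) == 5 * sizeof(void *),
              "buffer must be exactly five words");
static_assert(offsetof(ToySjLjBuffer, destination) == sizeof(void *),
              "destination address must be the second word");
```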

    + +
    + + + + +
    + +
    +  void %llvm.eh.sjlj.longjmp(i8*)
    +
    + +

    The llvm.eh.sjlj.longjmp + intrinsic is used to implement __builtin_longjmp() for SJLJ + style exception handling. The single parameter is a pointer to a + buffer populated by + llvm.eh.sjlj.setjmp. The frame pointer and stack pointer + are restored from the buffer, then control is transferred to the + destination address.

    + +
    + + + +
    + +
    +  i8* %llvm.eh.sjlj.lsda()
    +
    + +

    Used for SJLJ based exception handling, the + llvm.eh.sjlj.lsda intrinsic returns the address of the Language + Specific Data Area (LSDA) for the current function. The SJLJ front-end code + stores this address in the exception handling function context for use by the + runtime.

    + +
    + + + + +
    + +
    +  void %llvm.eh.sjlj.callsite(i32)
    +
    + +

    For SJLJ based exception handling, the + llvm.eh.sjlj.callsite intrinsic identifies the callsite value + associated with the following invoke instruction. This is used to ensure + that landing pad entries in the LSDA are generated in the matching order.

    + +
    + + + + +
    + +

    There are two tables that are used by the exception handling runtime to + determine which actions should take place when an exception is thrown.

    + +
    + + + + +
    + +

    An exception handling frame eh_frame is very similar to the unwind + frame used by dwarf debug info. The frame contains all the information + necessary to tear down the current frame and restore the state of the prior + frame. There is an exception handling frame for each function in a compile + unit, plus a common exception handling frame that defines information common + to all functions in the unit.

    + +

    Todo - Table details here.

    + +
    + + + + +
    + +

    An exception table contains information about what actions to take when an + exception is thrown in a particular part of a function's code. There is one + exception table per function, except that leaf routines and functions that + only call non-throwing functions do not need an exception table.

    + +

    Todo - Table details here.

    + +
    + + +
    + ToDo +
    + +
    + +
      + +
    1. Testing/Testing/Testing.
    2. + +
    + +
    + + + +
    +
    + Valid CSS + Valid HTML 4.01 + + Chris Lattner
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-28 10:07:41 -0700 (Fri, 28 May 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/ExtendedIntegerResults.txt URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/ExtendedIntegerResults.txt?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/ExtendedIntegerResults.txt (added) +++ www-releases/trunk/2.8/docs/ExtendedIntegerResults.txt Mon Oct 4 15:49:23 2010 @@ -0,0 +1,133 @@ +//===----------------------------------------------------------------------===// +// Representing sign/zero extension of function results +//===----------------------------------------------------------------------===// + +Mar 25, 2009 - Initial Revision + +Most ABIs specify that functions which return small integers do so in a +specific integer GPR. This is an efficient way to go, but raises the question: +if the returned value is smaller than the register, what do the high bits hold? + +There are three (interesting) possible answers: undefined, zero extended, or +sign extended. The number of bits in question depends on the data-type that +the front-end is referencing (typically i1/i8/i16/i32). + +Knowing the answer to this is important for two reasons: 1) we want to be able +to implement the ABI correctly. If we need to sign extend the result according +to the ABI, we really really do need to do this to preserve correctness. 2) +this information is often useful for optimization purposes, and we want the +mid-level optimizers to be able to process this (e.g. eliminate redundant +extensions). + +For example, let's pretend that X86 requires the caller to properly extend the +result of a return (I'm not sure this is the case, but the argument doesn't +depend on this). 
Given this, we should compile this: + +int a(); +short b() { return a(); } + +into: + +_b: + subl $12, %esp + call L_a$stub + addl $12, %esp + cwtl + ret + +An optimization example is that we should be able to eliminate the explicit +sign extension in this example: + +short y(); +int z() { + return ((int)y() << 16) >> 16; +} + +_z: + subl $12, %esp + call _y + ;; movswl %ax, %eax -> not needed because eax is already sext'd + addl $12, %esp + ret + +//===----------------------------------------------------------------------===// +// What we have right now. +//===----------------------------------------------------------------------===// + +Currently, these sorts of things are modelled by compiling a function to return +the small type and a signext/zeroext marker is used. For example, we compile +Z into: + +define i32 @z() nounwind { +entry: + %0 = tail call signext i16 (...)* @y() nounwind + %1 = sext i16 %0 to i32 + ret i32 %1 +} + +and b into: + +define signext i16 @b() nounwind { +entry: + %0 = tail call i32 (...)* @a() nounwind ; [#uses=1] + %retval12 = trunc i32 %0 to i16 ; [#uses=1] + ret i16 %retval12 +} + +This has some problems: 1) the actual precise semantics are really poorly +defined (see PR3779). 2) some targets might want the caller to extend, some +might want the callee to extend 3) the mid-level optimizer doesn't know the +size of the GPR, so it doesn't know that %0 is sign extended up to 32-bits +here, and even if it did, it could not eliminate the sext. 4) the code +generator has historically assumed that the result is extended to i32, which is +a problem on PIC16 (and is also probably wrong on alpha and other 64-bit +targets). + +//===----------------------------------------------------------------------===// +// The proposal +//===----------------------------------------------------------------------===// + +I suggest that we have the front-end fully lower out the ABI issues here to +LLVM IR. 
This makes it 100% explicit what is going on and means that there is +no cause for confusion. For example, the cases above should compile into: + +define i32 @z() nounwind { +entry: + %0 = tail call i32 (...)* @y() nounwind + %1 = trunc i32 %0 to i16 + %2 = sext i16 %1 to i32 + ret i32 %2 +} +define i32 @b() nounwind { +entry: + %0 = tail call i32 (...)* @a() nounwind + %retval12 = trunc i32 %0 to i16 + %tmp = sext i16 %retval12 to i32 + ret i32 %tmp +} + +In this model, no functions will return an i1/i8/i16 (and on a x86-64 target +that extends results to i64, no i32). This solves the ambiguity issue, allows us +to fully describe all possible ABIs, and now allows the optimizers to reason +about and eliminate these extensions. + +The one thing that is missing is the ability for the front-end and optimizer to +specify/infer the guarantees provided by the ABI to allow other optimizations. +For example, in the y/z case, since y is known to return a sign extended value, +the trunc/sext in z should be eliminable. + +This can be done by introducing new sext/zext attributes which mean "I know +that the result of the function is sign extended at least N bits". Given this, +and given that it is stuck on the y function, the mid-level optimizer could +easily eliminate the extensions etc with existing functionality. + +The major disadvantage of doing this sort of thing is that it makes the ABI +lowering stuff even more explicit in the front-end, and that we would like to +eventually move to having the code generator do more of this work. However, +the sad truth of the matter is that this is a) unlikely to happen anytime in +the near future, and b) this is no worse than we have now with the existing +attributes. + +C compilers fundamentally have to reason about the target in many ways. +This is ugly and horrible, but a fact of life. 
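The claim that the trunc/sext pair in z is eliminable when y already returns a sign extended value can be checked with a small model. This is a sketch, not compiler code: sext16 and isSext16 are hypothetical helper names, and the arithmetic is done on unsigned values so that no shift or narrowing conversion depends on implementation-defined behaviour.

```cpp
#include <cstdint>

// Models "trunc i32 to i16" followed by "sext i16 to i32" on the raw bits.
inline int32_t sext16(uint32_t v) {
  uint32_t low = v & 0xFFFFu; // trunc: keep only the low 16 bits
  // sext: replicate bit 15 into the upper half
  return (low & 0x8000u) ? static_cast<int32_t>(low) - 0x10000
                         : static_cast<int32_t>(low);
}

// True when v is already sign extended from 16 bits, i.e. when the
// trunc/sext pair would return v unchanged and can be dropped.
inline bool isSext16(int32_t v) {
  return sext16(static_cast<uint32_t>(v)) == v;
}
```

For any value where isSext16 holds, applying sext16 is the identity, which is the property the proposed attribute would let the mid-level optimizer rely on.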
+ Added: www-releases/trunk/2.8/docs/ExtendingLLVM.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/ExtendingLLVM.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/ExtendingLLVM.html (added) +++ www-releases/trunk/2.8/docs/ExtendingLLVM.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,391 @@ + + + + Extending LLVM: Adding instructions, intrinsics, types, etc. + + + + + +
    + Extending LLVM: Adding instructions, intrinsics, types, etc. +
    + +
      +
    1. Introduction and Warning
    2. +
    3. Adding a new intrinsic function
    4. +
    5. Adding a new instruction
    6. +
    7. Adding a new SelectionDAG node
    8. +
    9. Adding a new type +
        +
      1. Adding a new fundamental type
      2. +
      3. Adding a new derived type
      4. +
    10. +
    + +
    +

    Written by Misha Brukman, + Brad Jones, Nate Begeman, + and Chris Lattner

    +
    + + + + + +
    + +

    During the course of using LLVM, you may wish to customize it for your +research project or for experimentation. At this point, you may realize that +you need to add something to LLVM, whether it be a new fundamental type, a new +intrinsic function, or a whole new instruction.

    + +

    When you come to this realization, stop and think. Do you really need to +extend LLVM? Is it a new fundamental capability that LLVM does not support in +its current incarnation or can it be synthesized from existing LLVM +elements? If you are not sure, ask on the LLVM-dev list. The +reason is that extending LLVM will get involved as you need to update all the +different passes that you intend to use with your extension, and there are +many LLVM analyses and transformations, so it may be quite a bit of +work.

    + +

    Adding an intrinsic function is far easier than +adding an instruction, and is transparent to optimization passes. If your added +functionality can be expressed as a +function call, an intrinsic function is the method of choice for LLVM +extension.

    + +

    Before you invest a significant amount of effort into a non-trivial +extension, ask on the list if what you are +looking to do can be done with already-existing infrastructure, or if maybe +someone else is already working on it. You will save yourself a lot of time and +effort by doing so.

    + +
    + + + + + +
    + +

    Adding a new intrinsic function to LLVM is much easier than adding a new +instruction. Almost all extensions to LLVM should start as an intrinsic +function and then be turned into an instruction if warranted.

    + +
      +
    1. llvm/docs/LangRef.html: + Document the intrinsic. Decide whether it is code generator specific and + what the restrictions are. Talk to other people about it so that you are + sure it's a good idea.
    2. + +
    3. llvm/include/llvm/Intrinsics*.td: + Add an entry for your intrinsic. Describe its memory access characteristics + for optimization (this controls whether it will be DCE'd, CSE'd, etc). Note + that any intrinsic using the llvm_int_ty type for an argument will + be deemed by tblgen as overloaded and the corresponding suffix + will be required on the intrinsic's name.
    4. + +
    5. llvm/lib/Analysis/ConstantFolding.cpp: If it is possible to + constant fold your intrinsic, add support to it in the + canConstantFoldCallTo and ConstantFoldCall functions.
    6. + +
    7. llvm/test/Regression/*: Add test cases for your intrinsic to the + test suite
    8. +
    + +

    Once the intrinsic has been added to the system, you must add code generator +support for it. Generally you must do the following steps:

    + +
    +
    Add support to the C backend in lib/Target/CBackend/
    + +
    Depending on the intrinsic, there are a few ways to implement this. For + most intrinsics, it makes sense to add code to lower your intrinsic in + LowerIntrinsicCall in lib/CodeGen/IntrinsicLowering.cpp. + Second, if it makes sense to lower the intrinsic to an expanded sequence of + C code in all cases, just emit the expansion in visitCallInst in + Writer.cpp. If the intrinsic has some way to express it with GCC + (or any other compiler) extensions, it can be conditionally supported based + on the compiler compiling the CBE output (see llvm.prefetch for an + example). Third, if the intrinsic really has no way to be lowered, just + have the code generator emit code that prints an error message and calls + abort if executed.
    + +
    Add support to the .td file for the target(s) of your choice in + lib/Target/*/*.td.
    + +
    This is usually a matter of adding a pattern to the .td file that matches + the intrinsic, though it may obviously require adding the instructions you + want to generate as well. There are lots of examples in the PowerPC and X86 + backend to follow.
    +
    + +
    + + + + + +
    + +

    As with intrinsics, adding a new SelectionDAG node to LLVM is much easier +than adding a new instruction. New nodes are often added to help represent +instructions common to many targets. These nodes often map to an LLVM +instruction (add, sub) or intrinsic (byteswap, population count). In other +cases, new nodes have been added to allow many targets to perform a common task +(converting between floating point and integer representation) or capture more +complicated behavior in a single node (rotate).

    + +
      +
    1. include/llvm/CodeGen/SelectionDAGNodes.h: + Add an enum value for the new SelectionDAG node.
    2. +
    3. lib/CodeGen/SelectionDAG/SelectionDAG.cpp: + Add code to print the node to getOperationName. If your new node + can be evaluated at compile time when given constant arguments (such as an + add of a constant with another constant), find the getNode method + that takes the appropriate number of arguments, and add a case for your node + to the switch statement that performs constant folding for nodes that take + the same number of arguments as your new node.
    4. +
    5. lib/CodeGen/SelectionDAG/LegalizeDAG.cpp: + Add code to legalize, + promote, and expand the node as necessary. At a minimum, you will need + to add a case statement for your node in LegalizeOp which calls + LegalizeOp on the node's operands, and returns a new node if any of the + operands changed as a result of being legalized. It is likely that not all + targets supported by the SelectionDAG framework will natively support the + new node. In this case, you must also add code in your node's case + statement in LegalizeOp to Expand your node into simpler, legal + operations. The case for ISD::UREM for expanding a remainder into + a divide, multiply, and a subtract is a good example.
    6. +
    7. lib/CodeGen/SelectionDAG/LegalizeDAG.cpp: + If targets may support the new node being added only at certain sizes, you + will also need to add code to your node's case statement in + LegalizeOp to Promote your node's operands to a larger size, and + perform the correct operation. You will also need to add code to + PromoteOp to do this as well. For a good example, see + ISD::BSWAP, + which promotes its operand to a wider size, performs the byteswap, and then + shifts the correct bytes right to emulate the narrower byteswap in the + wider type.
    8. +
    9. lib/CodeGen/SelectionDAG/LegalizeDAG.cpp: + Add a case for your node in ExpandOp to teach the legalizer how to + perform the action represented by the new node on a value that has been + split into high and low halves. This case will be used to support your + node with a 64 bit operand on a 32 bit target.
    10. +
    11. lib/CodeGen/SelectionDAG/DAGCombiner.cpp: + If your node can be combined with itself, or other existing nodes in a + peephole-like fashion, add a visit function for it, and call that function + from . There are several good examples for simple combines you + can do; visitFABS and visitSRL are good starting places. +
    12. +
    13. lib/Target/PowerPC/PPCISelLowering.cpp: + Each target has an implementation of the TargetLowering class, + usually in its own file (although some targets include it in the same + file as the DAGToDAGISel). The default behavior for a target is to + assume that your new node is legal for all types that are legal for + that target. If this target does not natively support your node, then + tell the target to either Promote it (if it is supported at a larger + type) or Expand it. This will cause the code you wrote in + LegalizeOp above to decompose your new node into other legal + nodes for this target.
    14. +
    15. lib/Target/TargetSelectionDAG.td: + Most current targets supported by LLVM generate code using the DAGToDAG + method, where SelectionDAG nodes are pattern matched to target-specific + nodes, which represent individual instructions. In order for the targets + to match an instruction to your new node, you must add a def for that node + to the list in this file, with the appropriate type constraints. Look at + add, bswap, and fadd for examples.
    16. +
    17. lib/Target/PowerPC/PPCInstrInfo.td: + Each target has a tablegen file that describes the target's instruction + set. For targets that use the DAGToDAG instruction selection framework, + add a pattern for your new node that uses one or more target nodes. + Documentation for this is a bit sparse right now, but there are several + decent examples. See the patterns for rotl in + PPCInstrInfo.td.
    18. +
    19. TODO: document complex patterns.
    20. +
    21. llvm/test/Regression/CodeGen/*: Add test cases for your new node + to the test suite. llvm/test/Regression/CodeGen/X86/bswap.ll is + a good example.
    22. +
    + +
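The three legalization actions mentioned in the steps above (Expand, Promote, and splitting a value into high and low halves) can be illustrated on plain integers. This is a sketch of the arithmetic only, not SelectionDAG code: all function names are hypothetical, and the choice of a 64 bit add for the split case is an assumption made for the example, since the text does not name a specific operation.

```cpp
#include <cstdint>

// Expand: an unsigned remainder rewritten as a divide, a multiply and a
// subtract, as in the ISD::UREM example.
inline uint32_t expandURem(uint32_t a, uint32_t b) {
  return a - (a / b) * b;
}

// A 32 bit byteswap, standing in for the natively supported wider operation.
inline uint32_t bswap32(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Promote: a 16 bit byteswap done in the wider type, then shifted right so
// the swapped bytes land in the low half, as in the ISD::BSWAP example.
inline uint16_t promoteBSwap16(uint16_t v) {
  return static_cast<uint16_t>(bswap32(static_cast<uint32_t>(v)) >> 16);
}

// Split: a 64 bit value represented as two 32 bit halves.
struct Halves {
  uint32_t lo;
  uint32_t hi;
};

// A 64 bit add performed on the halves, propagating the carry by hand.
inline Halves expandAdd64(Halves a, Halves b) {
  Halves r;
  r.lo = a.lo + b.lo;                        // wraps modulo 2^32
  uint32_t carry = (r.lo < a.lo) ? 1u : 0u;  // wrapped iff the sum is smaller
  r.hi = a.hi + b.hi + carry;
  return r;
}
```

For instance, promoteBSwap16(0x1234) computes bswap32(0x00001234), which is 0x34120000, and the shift by 16 leaves 0x3412.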
    + + + + + +
    + +

    WARNING: adding instructions changes the bitcode +format, and it will take some effort to maintain compatibility with +the previous version. Only add an instruction if it is absolutely +necessary.

    + +
      + +
    1. llvm/include/llvm/Instruction.def: + add a number for your instruction and an enum name
    2. + +
    3. llvm/include/llvm/Instructions.h: + add a definition for the class that will represent your instruction
    4. + +
    5. llvm/include/llvm/Support/InstVisitor.h: + add a prototype for a visitor to your new instruction type
    6. + +
    7. llvm/lib/AsmParser/Lexer.l: + add a new token to parse your instruction from assembly text file
    8. + +
    9. llvm/lib/AsmParser/llvmAsmParser.y: + add the grammar on how your instruction can be read and what it will + construct as a result
    10. + +
    11. llvm/lib/Bitcode/Reader/Reader.cpp: + add a case for your instruction and how it will be parsed from bitcode
    12. + +
    13. llvm/lib/VMCore/Instruction.cpp: + add a case for how your instruction will be printed out to assembly
    14. + +
    15. llvm/lib/VMCore/Instructions.cpp: + implement the class you defined in + llvm/include/llvm/Instructions.h
    16. + +
    17. Test your instruction
    18. + +
    19. llvm/lib/Target/*: + Add support for your instruction to code generators, or add a lowering + pass.
    20. + +
    21. llvm/test/Regression/*: add your test cases to the test suite.
    22. + +
    + +

    Also, you need to implement (or modify) any analyses or passes that you want +to understand this new instruction.

    + +
    + + + + + + +
    + +

    WARNING: adding new types changes the bitcode +format, and will break compatibility with currently-existing LLVM +installations. Only add new types if it is absolutely necessary.

    + +
    + + + + +
    + +
      + +
    1. llvm/include/llvm/Type.h: + add enum for the new type; add static Type* for this type
    2. + +
    3. llvm/lib/VMCore/Type.cpp: + add mapping from TypeID => Type*; + initialize the static Type*
    4. + +
    5. llvm/lib/AsmReader/Lexer.l: + add ability to parse in the type from text assembly
    6. + +
    7. llvm/lib/AsmReader/llvmAsmParser.y: + add a token for that type
    8. + +
    + +
    + + + + +
    + +
      +
    1. llvm/include/llvm/Type.h: + add enum for the new type; add a forward declaration of the type + also
    2. + +
    3. llvm/include/llvm/DerivedTypes.h: + add a new class to represent the new type in the class hierarchy; add a forward + declaration to the TypeMap value type
    4. + +
    5. llvm/lib/VMCore/Type.cpp: + add support for derived type to: +
      +
      +std::string getTypeDescription(const Type &Ty,
      +  std::vector<const Type*> &TypeStack)
      +bool TypesEqual(const Type *Ty, const Type *Ty2,
      +  std::map<const Type*, const Type*> & EqTypes)
      +
      +
      + add necessary member functions for type, and factory methods
    6. + +
    7. llvm/lib/AsmReader/Lexer.l: + add ability to parse in the type from text assembly
    8. + +
    9. llvm/lib/BitCode/Writer/Writer.cpp: + modify void BitcodeWriter::outputType(const Type *T) to serialize + your type
    10. + +
    11. llvm/lib/BitCode/Reader/Reader.cpp: + modify const Type *BitcodeReader::ParseType() to read your data + type
    12. + +
    13. llvm/lib/VMCore/AsmWriter.cpp: + modify +
      +
      +void calcTypeName(const Type *Ty,
      +                  std::vector<const Type*> &TypeStack,
      +                  std::map<const Type*,std::string> &TypeNames,
      +                  std::string & Result)
      +
      +
      + to output the new derived type +
    14. + + +
    + +
    + + + +
    +
    + Valid CSS + Valid HTML 4.01 + + The LLVM Compiler Infrastructure +
    + Last modified: $Date: 2010-05-06 17:28:04 -0700 (Thu, 06 May 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/FAQ.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/FAQ.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/FAQ.html (added) +++ www-releases/trunk/2.8/docs/FAQ.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,938 @@ + + + + + LLVM: Frequently Asked Questions + + + + +
    + LLVM: Frequently Asked Questions +
    + +
      +
    1. License +
        +
      1. Why are the LLVM source code and the front-end distributed under + different licenses?
      2. + +
      3. Does the University of Illinois Open Source License really qualify as an + "open source" license?
      4. + +
      5. Can I modify LLVM source code and redistribute the modified source?
      6. + +
      7. Can I modify LLVM source code and redistribute binaries or other tools + based on it, without redistributing the source?
      8. +
    2. + +
    3. Source code +
        +
      1. In what language is LLVM written?
      2. + +
      3. How portable is the LLVM source code?
      4. +
    4. + +
    5. Build Problems +
        +
      1. When I run configure, it finds the wrong C compiler.
      2. + +
      3. The configure script finds the right C compiler, but it uses + the LLVM linker from a previous build. What do I do?
      4. + +
      5. When creating a dynamic library, I get a strange GLIBC error.
      6. + +
      7. I've updated my source tree from Subversion, and now my build is trying + to use a file/directory that doesn't exist.
      8. + +
      9. I've modified a Makefile in my source tree, but my build tree keeps + using the old version. What do I do?
      10. + +
      11. I've upgraded to a new version of LLVM, and I get strange build + errors.
      12. + +
      13. I've built LLVM and am testing it, but the tests freeze.
      14. + +
      15. Why do test results differ when I perform different types of + builds?
      16. + +
      17. Compiling LLVM with GCC 3.3.2 fails, what should I do?
      18. + +
      19. Compiling LLVM with GCC succeeds, but the resulting tools do not work, + what can be wrong?
      20. + +
      21. When I use the test suite, all of the C Backend tests fail. What is + wrong?
      22. + +
      23. After Subversion update, rebuilding gives the error "No rule to make + target".
      24. + +
      25. The llvmc program gives me errors/doesn't + work.
      26. + +
      27. When I compile LLVM-GCC with srcdir == objdir, + it fails. Why?
      28. +
    6. + +
    7. Source Languages +
        +
      1. What source languages are supported?
      2. + +
      3. I'd like to write a self-hosting LLVM compiler. How + should I interface with the LLVM middle-end optimizers and back-end code + generators?
      4. + +
      5. What support is there for higher level source + language constructs for building a compiler?
      6. + +
      7. I don't understand the GetElementPtr + instruction. Help!
      8. +
      + +
    8. Using the GCC Front End +
        +
      1. When I compile software that uses a configure script, the configure + script thinks my system has all of the header files and libraries it is + testing for. How do I get configure to work correctly?
      2. + +
      3. When I compile code using the LLVM GCC front end, it complains that it + cannot find libcrtend.a?
      4. + +
      5. How can I disable all optimizations when compiling code using the LLVM + GCC front end?
      6. + +
      7. Can I use LLVM to convert C++ code to C + code?
      8. + +
      9. Can I compile C or C++ code to + platform-independent LLVM bitcode?
      10. +
      +
    9. + +
    10. Questions about code generated by the GCC front-end +
        +
      1. What is this llvm.global_ctors and + _GLOBAL__I__tmp_webcompile... stuff that happens when I + #include <iostream>?
      2. + +
      3. Where did all of my code go??
      4. + +
      5. What is this "undef" thing that shows up in + my code?
      6. + +
      7. Why does instcombine + simplifycfg turn + a call to a function with a mismatched calling convention into "unreachable"? + Why not make the verifier reject it?
      8. +
      +
    11. +
    + +
    +

    Written by The LLVM Team

    +
    + + + +
    + License +
    + + +
    +

    Why are the LLVM source code and the front-end distributed under different + licenses?

    +
    + +
    +

    The C/C++ front-ends are based on GCC and must be distributed under the GPL. + Our aim is to distribute LLVM source code under a much less + restrictive license, in particular one that does not compel users who + distribute tools based on modifying the source to redistribute the modified + source code as well.

    +
    + +
    +

    Does the University of Illinois Open Source License really qualify as an + "open source" license?

    +
    + +
    +

    Yes, the license + is certified by + the Open Source Initiative (OSI).

    +
    + +
    +

    Can I modify LLVM source code and redistribute the modified source?

    +
    + +
    +

    Yes. The modified source distribution must retain the copyright notice and follow the three bulleted conditions listed in the LLVM license.

    +
    + +
    +

    Can I modify LLVM source code and redistribute binaries or other tools based + on it, without redistributing the source?

    +
    + +
    +

    Yes. This is why we distribute LLVM under a less restrictive license than + GPL, as explained in the first question above.

    +
    + + + + + +
    +

    In what language is LLVM written?

    +
    + +
    +

    All of the LLVM tools and libraries are written in C++ with extensive use of + the STL.

    +
    + +
    +

    How portable is the LLVM source code?

    +
    + +
    +

    The LLVM source code should be portable to most modern UNIX-like operating systems. Most of the code is written in standard C++ with operating system services abstracted to a support library. The tools required to build and test LLVM have been ported to a plethora of platforms.

    + +

    Some porting problems may exist in the following areas:

    + +
      +
    • The GCC front end code is not as portable as the LLVM suite, so it may not + compile as well on unsupported platforms.
    • + +
    • The LLVM build system relies heavily on UNIX shell tools, like the Bourne + Shell and sed. Porting to systems without these tools (MacOS 9, Plan 9) + will require more effort.
    • +
    + +
    + + + + + +
    +

    When I run configure, it finds the wrong C compiler.

    +
    + +
    +

    The configure script attempts to locate first gcc and then + cc, unless it finds compiler paths set in CC + and CXX for the C and C++ compiler, respectively.

    + +

    If configure finds the wrong compiler, either adjust your + PATH environment variable or set CC and CXX + explicitly.

    + +
    + +
    +

    The configure script finds the right C compiler, but it uses the + LLVM linker from a previous build. What do I do?

    +
    + +
    +

    The configure script uses the PATH to find executables, so + if it's grabbing the wrong linker/assembler/etc, there are two ways to fix + it:

    + +
      +
    1. Adjust your PATH environment variable so that the correct program appears first in the PATH. This works, but may be inconvenient if you need the other programs to come first in your PATH for other work.

    2. + +
    3. Run configure with an alternative PATH that is correct. In a Bourne-compatible shell, the syntax would be:

      + +
      +% PATH=[the path without the bad program] ./configure ...
      +
      + +

      This is still somewhat inconvenient, but it allows configure + to do its work without having to adjust your PATH + permanently.

    4. +
    +
    + +
    +

    When creating a dynamic library, I get a strange GLIBC error.

    +
    + +
    +

    Under some operating systems (e.g. Linux), libtool does not work correctly if GCC was compiled with the --disable-shared option. To work around this, install your own version of GCC that has shared libraries enabled by default.

    +
    + +
    +

    I've updated my source tree from Subversion, and now my build is trying to + use a file/directory that doesn't exist.

    +
    + +
    +

    You need to re-run configure in your object directory. When new Makefiles + are added to the source tree, they have to be copied over to the object tree + in order to be used by the build.

    +
    + +
    +

    I've modified a Makefile in my source tree, but my build tree keeps using the + old version. What do I do?

    +
    + +
    +

    If the Makefile already exists in your object tree, you can just run the + following command in the top level directory of your object tree:

    + +
    +% ./config.status <relative path to Makefile>
    +
    + +

    If the Makefile is new, you will have to modify the configure script to copy + it over.

    +
    + +
    +

    I've upgraded to a new version of LLVM, and I get strange build errors.

    +
    + +
    + +

    Sometimes, changes to the LLVM source code alter how the build system works. Changes in libtool, autoconf, or header file dependencies are especially prone to this sort of problem.

    + +

    The best thing to try is to remove the old files and re-build. In most + cases, this takes care of the problem. To do this, just type make + clean and then make in the directory that fails to build.

    +
    + +
    +

    I've built LLVM and am testing it, but the tests freeze.

    +
    + +
    +

    This is most likely occurring because you built a profile or release + (optimized) build of LLVM and have not specified the same information on the + gmake command line.

    + +

    For example, if you built LLVM with the command:

    + +
    +% gmake ENABLE_PROFILING=1
    +
    + +

    ...then you must run the tests with the following commands:

    + +
    +% cd llvm/test
    +% gmake ENABLE_PROFILING=1
    +
    +
    + +
    +

    Why do test results differ when I perform different types of builds?

    +
    + +
    +

    The LLVM test suite is dependent upon several features of the LLVM tools and + libraries.

    + +

    First, the debugging assertions in code are not enabled in optimized or + profiling builds. Hence, tests that used to fail may pass.

    + +

    Second, some tests may rely upon debugging options or behavior that are only available in the debug build. These tests will fail in an optimized or profile build.

    +
    + +
    +

    Compiling LLVM with GCC 3.3.2 fails, what should I do?

    +
    + +
    +

    This is a bug in + GCC, and affects projects other than LLVM. Try upgrading or downgrading + your GCC.

    +
    + +
    +

    Compiling LLVM with GCC succeeds, but the resulting tools do not work, what + can be wrong?

    +
    + +
    +

    Several versions of GCC are known to miscompile the LLVM codebase. Please check your compiler version (gcc --version) to find out whether it is broken. If so, your only option is to upgrade GCC to a known good version.

    +
    + +
    +

    After Subversion update, rebuilding gives the error "No rule to make + target".

    +
    + +
    +

    If the error is of the form:

    + +
    +gmake[2]: *** No rule to make target `/path/to/somefile', needed by
    +`/path/to/another/file.d'.
    +Stop. +
    + +

    This may occur anytime files are moved within the Subversion repository or + removed entirely. In this case, the best solution is to erase all + .d files, which list dependencies for source files, and rebuild:

    + +
    +% cd $LLVM_OBJ_DIR
    +% rm -f `find . -name \*\.d` 
    +% gmake 
    +
    + +

    In other cases, it may be necessary to run make clean before + rebuilding.

    +
    + + + +
    +

    llvmc is experimental and isn't really supported. We suggest + using llvm-gcc instead.

    +
    + + + +
    +

    The GNUmakefile in the top-level directory of LLVM-GCC is a special + Makefile used by Apple to invoke the build_gcc script after + setting up a special environment. This has the unfortunate side-effect that + trying to build LLVM-GCC with srcdir == objdir in a "non-Apple way" invokes + the GNUmakefile instead of Makefile. Because the + environment isn't set up correctly to do this, the build fails.

    + +

    People not building LLVM-GCC the "Apple way" need to build LLVM-GCC with + srcdir != objdir, or simply remove the GNUmakefile entirely.

    + +

    We regret the inconvenience.

    +
    + + + + + + +
    +

    LLVM currently has full support for C and C++ source languages. These are available through a special version of GCC that LLVM calls the C Front End.

    + +

    There is an incomplete version of a Java front end available in the + java module. There is no documentation on this yet so you'll need to + download the code, compile it, and try it.

    + +

    The PyPy developers are working on integrating LLVM into the PyPy backend so that the PyPy language can be translated to LLVM.

    +
    + + + +
    +

    Your compiler front-end will communicate with LLVM by creating a module in + the LLVM intermediate representation (IR) format. Assuming you want to write + your language's compiler in the language itself (rather than C++), there are + 3 major ways to tackle generating LLVM IR from a front-end:

    + +
      +
    • Call into the LLVM libraries code using your language's FFI + (foreign function interface). + +
        +
      • for: best tracks changes to the LLVM IR, .ll syntax, and .bc + format
      • + +
      • for: enables running LLVM optimization passes without an emit/parse overhead
      • + +
      • for: adapts well to a JIT context
      • + +
      • against: lots of ugly glue code to write
      • +
    • + +
    • Emit LLVM assembly from your compiler's native language. +
        +
      • for: very straightforward to get started
      • + +
      • against: the .ll parser is slower than the bitcode reader + when interfacing to the middle end
      • + +
      • against: you'll have to re-engineer the LLVM IR object model + and asm writer in your language
      • + +
      • against: it may be harder to track changes to the IR
      • +
    • + +
    • Emit LLVM bitcode from your compiler's native language. + +
        +
      • for: can use the more-efficient bitcode reader when + interfacing to the middle end
      • + +
      • against: you'll have to re-engineer the LLVM IR object + model and bitcode writer in your language
      • + +
      • against: it may be harder to track changes to the IR
      • +
    • +
    + +

    If you go with the first option, the C bindings in include/llvm-c should help + a lot, since most languages have strong support for interfacing with C. The + most common hurdle with calling C from managed code is interfacing with the + garbage collector. The C interface was designed to require very little memory + management, and so is straightforward in this regard.
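    As a rough illustration of the second option above (emitting LLVM assembly as text), the sketch below builds the text of a tiny .ll module by string concatenation. It is written in C++ only to give a concrete example; the helper name emit_constant_function is hypothetical and not part of LLVM, and a real front-end would have to model far more of the IR than a single constant-returning function.

```cpp
#include <string>

// Hypothetical helper (not an LLVM API): returns the text of an LLVM
// assembly function that takes no arguments and returns a constant i32.
std::string emit_constant_function(const std::string &name, int value) {
    std::string ir;
    ir += "define i32 @" + name + "() {\n";
    ir += "  ret i32 " + std::to_string(value) + "\n";
    ir += "}\n";
    return ir;
}
```

    The returned string can be written to a .ll file and handed to the LLVM tools. Nothing here links against the LLVM libraries, which is what makes this route easy to start with, at the cost of re-creating the IR model and writer yourself.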

    +
    + + + +
    +

    Currently, there isn't much. LLVM supports an intermediate representation which is useful for code representation but will not support the high level (abstract syntax tree) representation needed by most compilers. There are no facilities for lexical or semantic analysis. There is, however, a mostly implemented configuration-driven compiler driver which simplifies the task of running optimizations, linking, and executable generation.

    +
    + + + + + + + + +
    +

    When I compile software that uses a configure script, the configure script + thinks my system has all of the header files and libraries it is testing for. + How do I get configure to work correctly?

    +
    + +
    +

    The configure script is getting things wrong because the LLVM linker allows + symbols to be undefined at link time (so that they can be resolved during JIT + or translation to the C back end). That is why configure thinks your system + "has everything."

    + +

    To work around this, perform the following steps:

    + +
      +
    1. Make sure the CC and CXX environment variables contain the full path to the LLVM GCC front end.
    2. + +
    3. Make sure that the regular C compiler is first in your PATH.
    4. + +
    5. Add the string "-Wl,-native" to your CFLAGS environment variable.
    6. +
    + +

    This will allow the llvm-ld linker to create a native code executable instead of a shell script that runs the JIT. Creating native code requires standard linkage, which in turn will allow the configure script to find out if code is not linking on your system because the feature isn't available on your system.

    +
    + +
    +

    When I compile code using the LLVM GCC front end, it complains that it cannot + find libcrtend.a. +

    +
    + +
    +

    The only way this can happen is if you haven't installed the runtime + library. To correct this, do:

    + +
    +% cd llvm/runtime
    +% make clean ; make install-bytecode
    +
    +
    + +
    +

    How can I disable all optimizations when compiling code using the LLVM GCC + front end?

    +
    + +
    +

    Passing "-Wa,-disable-opt -Wl,-disable-opt" will disable *all* cleanup and + optimizations done at the llvm level, leaving you with the truly horrible + code that you desire.

    +
    + + + + +
    +

    Can I use LLVM to convert C++ code to C code?

    Yes, you can use LLVM to convert code from any language LLVM supports to C. Note that the generated C code will be very low level (all loops are lowered to gotos, etc) and not very pretty (comments are stripped, original source formatting is totally lost, variables are renamed, expressions are regrouped), so this may not be what you're looking for. Also, there are several limitations noted below.

    + +

    Use commands like this:

    + +
      +
    1. Compile your program with llvm-g++:

      + +
      +% llvm-g++ -emit-llvm x.cpp -o program.bc -c
      +
      + +

      or:

      + +
      +% llvm-g++ a.cpp -c -emit-llvm
      +% llvm-g++ b.cpp -c -emit-llvm
      +% llvm-ld a.o b.o -o program
      +
      + +

      This will generate program and program.bc. The .bc + file is the LLVM version of the program all linked together.

    2. + +
    3. Convert the LLVM code to C code, using the LLC tool with the C + backend:

      + +
      +% llc -march=c program.bc -o program.c
      +
    4. + +
    5. Finally, compile the C file:

      + +
      +% cc program.c -lstdc++
      +
    6. + +
    + +

    Using LLVM does not eliminate the need for C++ library support. If you use + the llvm-g++ front-end, the generated code will depend on g++'s C++ support + libraries in the same way that code generated from g++ would. If you use + another C++ front-end, the generated code will depend on whatever library + that front-end would normally require.

    + +

    If you are working on a platform that does not provide any C++ libraries, you + may be able to manually compile libstdc++ to LLVM bitcode, statically link it + into your program, then use the commands above to convert the whole result + into C code. Alternatively, you might compile the libraries and your + application into two different chunks of C code and link them.

    + +

    Note that, by default, the C back end does not support exception handling. + If you want/need it for a certain program, you can enable it by passing + "-enable-correct-eh-support" to the llc program. The resultant code will use + setjmp/longjmp to implement exception support that is relatively slow, and + not C++-ABI-conforming on most platforms, but otherwise correct.
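    To give a feel for what setjmp/longjmp-style exception handling looks like in general, here is a minimal hand-written sketch. It is not the output of the C back end, and the names current_handler, checked_divide and divide_or are made up for illustration; it only shows the control-flow pattern that such a scheme relies on.

```cpp
#include <csetjmp>

// Hypothetical handler slot; a fuller scheme would keep a stack of these,
// one per active "try" region.
static std::jmp_buf current_handler;

// "Throws" by jumping back to the most recent setjmp point.
static void checked_divide(int num, int den, int *out) {
    if (den == 0)
        std::longjmp(current_handler, 1);  // plays the role of `throw`
    *out = num / den;
}

// Returns the quotient, or `fallback` if the "exception" was raised.
int divide_or(int num, int den, int fallback) {
    int result = fallback;
    if (setjmp(current_handler) == 0) {    // plays the role of `try`
        checked_divide(num, den, &result);
        return result;
    }
    return fallback;                       // plays the role of `catch`
}
```

    Note that setjmp has to save the register state every time the protected region is entered, whether or not anything is ever "thrown", which is one reason this style of exception support is relatively slow.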

    + +

    Also, there are a number of other limitations of the C backend that cause it + to produce code that does not fully conform to the C++ ABI on most + platforms. Some of the C++ programs in LLVM's test suite are known to fail + when compiled with the C back end because of ABI incompatibilities with + standard C++ libraries.

    +
    + + + +
    +

    Can I compile C or C++ code to platform-independent LLVM bitcode?

    No. C and C++ are inherently platform-dependent languages. The most obvious example of this is the preprocessor. A very common way that C code is made portable is by using the preprocessor to include platform-specific code. In practice, information about other platforms is lost after preprocessing, so the result is inherently dependent on the platform that the preprocessing was targeting.

    + +

    Another example is sizeof. It's common for sizeof(long) to + vary between platforms. In most C front-ends, sizeof is expanded to + a constant immediately, thus hard-wiring a platform-specific detail.
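    The two effects described so far can be seen in a small sketch. The function names target_os and long_size are hypothetical; the predefined macros used are the common compiler-provided ones. Only one branch of the #if survives preprocessing, and sizeof(long) becomes a plain constant, so the IR produced from this code already has a single platform baked in.

```cpp
#include <cstddef>
#include <string>

// Exactly one of these return statements reaches the front end; the
// others are discarded by the preprocessor before any IR exists.
std::string target_os() {
#if defined(_WIN32)
    return "windows";
#elif defined(__APPLE__)
    return "darwin";
#elif defined(__linux__)
    return "linux";
#else
    return "other";
#endif
}

// Folded to a constant by the front end: typically 4 on 32-bit targets
// and on 64-bit Windows, and 8 on 64-bit UNIX-like targets.
std::size_t long_size() {
    return sizeof(long);
}
```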

    + +

    Also, since many platforms define their ABIs in terms of C, and since LLVM is + lower-level than C, front-ends currently must emit platform-specific IR in + order to have the result conform to the platform ABI.

    +
    + + + + + + +
    +

    What is this llvm.global_ctors and _GLOBAL__I__tmp_webcompile... stuff that happens when I #include <iostream>?

    If you #include the <iostream> header into a C++ translation unit, the file will probably use the std::cin/std::cout/... global objects. However, C++ does not guarantee an order of initialization between static objects in different translation units, so if a static ctor/dtor in your .cpp file used std::cout, for example, the object would not necessarily be automatically initialized before your use.

    + +

    To make std::cout and friends work correctly in these scenarios, the STL that we use declares a static object that gets created in every translation unit that includes <iostream>. This object has a static constructor and destructor that initialize and destroy the global iostream objects before they could possibly be used in the file. The code that you see in the .ll file corresponds to the constructor and destructor registration code.
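    The pattern described above can be sketched in a few lines. This is a simplified stand-in, not the library's actual code: the namespace demo and the names init_count, stream_ready, StreamInit and EarlyUser are all hypothetical (in the standard library, the class std::ios_base::Init plays the corresponding role). The point is that a per-translation-unit static object, defined before any user code, guarantees the shared state is set up before the user's own static constructors run.

```cpp
namespace demo {

// Hypothetical stand-ins for the library's global stream state.
int init_count = 0;         // number of live guard objects
bool stream_ready = false;  // true while the "stream" is usable

// One of these is constructed per translation unit that includes the header.
struct StreamInit {
    StreamInit()  { if (init_count++ == 0) stream_ready = true; }
    ~StreamInit() { if (--init_count == 0) stream_ready = false; }
};

// In a real header, this is the static object every including file receives.
static StreamInit stream_init_guard;

// A user-defined static object whose constructor relies on the stream.
// It is defined after the guard, so the guard's constructor runs first.
struct EarlyUser {
    bool saw_ready;
    EarlyUser() : saw_ready(stream_ready) {}
};
static EarlyUser early_user;

}  // namespace demo
```

    Each such guard object needs a constructor call at startup and a destructor call at exit, and registering those calls is what shows up as llvm.global_ctors entries in the generated code.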

    + +

    If you would like to make it easier to understand the LLVM code + generated by the compiler in the demo page, consider using printf() + instead of iostreams to print values.

    +
    + + + + + +
    +

    Where did all of my code go??

    If you are using the LLVM demo page, you may often wonder what happened to all of the code that you typed in. Remember that the demo script is running the code through the LLVM optimizers, so if your code doesn't actually do anything useful, it might all be deleted.

    + +

    To prevent this, make sure that the code is actually needed. For example, if + you are computing some expression, return the value from the function instead + of leaving it in a local variable. If you really want to constrain the + optimizer, you can read from and assign to volatile global + variables.
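    A small sketch of both suggestions follows; the names sink, sum_to and store_sum are hypothetical. Returning the computed value makes it observable to callers, and storing it to a volatile global gives the optimizer a side effect it is not allowed to remove.

```cpp
// Hypothetical global; `volatile` means every store to it must be kept.
volatile int sink = 0;

// Returning the result keeps the computation live for any caller.
int sum_to(int n) {
    int acc = 0;
    for (int i = 1; i <= n; ++i)
        acc += i;
    return acc;
}

// The store to the volatile global cannot be deleted, even though
// nothing in this file ever reads `sink` back.
void store_sum(int n) {
    sink = sum_to(n);
}
```

    The optimizer may still simplify the loop itself (for example into a closed-form expression), so this keeps the result alive rather than preserving the code exactly as written.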

    +
    + + + + + +
    +

    What is this "undef" thing that shows up in my code?

    undef is the LLVM way of representing a value that is not defined. You can get these if you do not initialize a variable before you use it. For example, the C function:

    + +
    +int X() { int i; return i; }
    +
    + +

    Is compiled to "ret i32 undef" because "i" never has a + value specified for it.

    +
    + + + + + +
    +

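    Why does instcombine + simplifycfg turn a call to a function with a mismatched calling convention into "unreachable"? Why not make the verifier reject it?

    This is a common problem run into by authors of front-ends that are using custom calling conventions: you need to make sure to set the right calling convention on both the function and on each call to the function. For example, this code: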

    + +
    +define fastcc void @foo() {
    +        ret void
    +}
    +define void @bar() {
    +        call void @foo()
    +        ret void
    +}
    +
    + +

    Is optimized to:

    + +
    +define fastcc void @foo() {
    +	ret void
    +}
    +define void @bar() {
    +	unreachable
    +}
    +
    + +

    ... with "opt -instcombine -simplifycfg". This often bites people because "all their code disappears". Setting the calling convention on the caller and callee is required for indirect calls to work, so people often ask why not make the verifier reject this sort of thing.

    + +

    The answer is that this code has undefined behavior, but it is not illegal. If we made it illegal, then every transformation that could potentially create this would have to ensure that it doesn't, and there is valid code that can create this sort of construct (in dead code). The sorts of things that can cause this to happen are fairly contrived, but we still need to accept them. Here's an example:

    + +
    +define fastcc void @foo() {
    +        ret void
    +}
    +define internal void @bar(void()* %FP, i1 %cond) {
    +        br i1 %cond, label %T, label %F
    +T:  
    +        call void %FP()
    +        ret void
    +F:
    +        call fastcc void %FP()
    +        ret void
    +}
    +define void @test() {
    +        %X = or i1 false, false
    +        call void @bar(void()* @foo, i1 %X)
    +        ret void
    +} 
    +
    + +

    In this example, "test" always passes @foo/false into bar, which ensures that it is dynamically called with the right calling convention (thus, the code is perfectly well defined). If you run this through the inliner, you get this (the explicit "or" is there so that the inliner doesn't dead code eliminate a bunch of stuff):

    + +
    +define fastcc void @foo() {
    +	ret void
    +}
    +define void @test() {
    +	%X = or i1 false, false
    +	br i1 %X, label %T.i, label %F.i
    +T.i:
    +	call void @foo()
    +	br label %bar.exit
    +F.i:
    +	call fastcc void @foo()
    +	br label %bar.exit
    +bar.exit:
    +	ret void
    +}
    +
    + +

    Here you can see that the inlining pass made an undefined call to @foo with + the wrong calling convention. We really don't want to make the inliner have + to know about this sort of thing, so it needs to be valid code. In this case, + dead code elimination can trivially remove the undefined code. However, if %X + was an input argument to @test, the inliner would produce this: +

    + +
    +define fastcc void @foo() {
    +	ret void
    +}
    +
    +define void @test(i1 %X) {
    +	br i1 %X, label %T.i, label %F.i
    +T.i:
    +	call void @foo()
    +	br label %bar.exit
    +F.i:
    +	call fastcc void @foo()
    +	br label %bar.exit
    +bar.exit:
    +	ret void
    +}
    +
    + +

    The interesting thing about this is that %X must be false for the code to be well-defined, but no amount of dead code elimination will be able to delete the broken call as unreachable. However, since instcombine/simplifycfg turns the undefined call into unreachable, we end up with a branch on a condition that goes to unreachable: a branch to unreachable can never happen, so "-inline -instcombine -simplifycfg" is able to produce:

    + +
    +define fastcc void @foo() {
    +	ret void
    +}
    +define void @test(i1 %X) {
    +F.i:
    +	call fastcc void @foo()
    +	ret void
    +}
    +
    + +
    + + + +
    +
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-28 10:07:41 -0700 (Fri, 28 May 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/GCCFEBuildInstrs.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/GCCFEBuildInstrs.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/GCCFEBuildInstrs.html (added) +++ www-releases/trunk/2.8/docs/GCCFEBuildInstrs.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,279 @@ + + + + + + Building the LLVM GCC Front-End + + + +
    + Building the LLVM GCC Front-End +
    + +
      +
    1. Building llvm-gcc from Source
    2. +
    3. Building the Ada front-end
    4. +
    5. Building the Fortran front-end
    6. +
    7. License Information
    8. +
    + +
    +

    Written by the LLVM Team

    +
    + + +

    Building llvm-gcc from Source

    + + +
    + +

    This section describes how to acquire and build llvm-gcc 4.2, which is based on the GCC 4.2.1 front-end. Supported languages are Ada, C, C++, Fortran, Objective-C and Objective-C++. Note that the instructions for building these front-ends are completely different from (and much easier than!) those for building llvm-gcc3 in the past.

    + +
      +
    1. Retrieve the appropriate llvm-gcc-4.2-version.source.tar.gz + archive from the LLVM web + site.

      + +

      It is also possible to download the sources of the llvm-gcc front end from a read-only mirror using subversion. To check out the 4.2 code for the first time, use:

      + +
      +
      +svn co http://llvm.org/svn/llvm-project/llvm-gcc-4.2/trunk dst-directory
      +
      +
      + +

      After that, the code can be updated in the destination directory using:

      + +
      +
      svn update
      +
      + +

      The mirror is brought up to date every evening.

    2. + +
    3. Follow the directions in the top-level README.LLVM file for + up-to-date instructions on how to build llvm-gcc. See below for building + with support for Ada or Fortran. +
    + +
    + + +

    Building the Ada front-end

    + + +
    +

    Building with support for Ada amounts to following the directions in the top-level README.LLVM file, adding ",ada" to EXTRALANGS, for example: EXTRALANGS=,ada

    + +

    There are some complications however:

    + +
      +
    1. The only platform for which the Ada front-end is known to build is 32-bit Intel x86 running Linux. It is unlikely to build for other systems without some work.

    2. +
    3. The build requires having a compiler that supports Ada, C and C++. + The Ada front-end is written in Ada so an Ada compiler is needed to + build it. Compilers known to work with the + LLVM 2.7 release + are gcc-4.2 and the + 2005, 2006 and 2007 versions of the + GNAT GPL Edition. + GNAT GPL 2008, gcc-4.3 and later will not work. + The LLVM parts of llvm-gcc are written in C++ so a C++ compiler is + needed to build them. The rest of gcc is written in C. + Some linux distributions provide a version of gcc that supports all + three languages (the Ada part often comes as an add-on package to + the rest of gcc). Otherwise it is possible to combine two versions + of gcc, one that supports Ada and C (such as the + 2007 GNAT GPL Edition) + and another which supports C++, see below.

    4. +
    5. Because the Ada front-end is experimental, it is wise to build the + compiler with checking enabled. This causes it to run much slower, but + helps catch mistakes in the compiler (please report any problems using + LLVM bugzilla).

    6. +
    7. The Ada front-end fails to + bootstrap, due to lack of LLVM support for + setjmp/longjmp style exception handling (used + internally by the compiler), so you must specify + --disable-bootstrap.

    8. +
    + +

    Supposing appropriate compilers are available, llvm-gcc with Ada support can + be built on an x86-32 linux box using the following recipe:

    + +
      +
    1. Download the LLVM source + and unpack it:

      + +
      +wget http://llvm.org/releases/2.7/llvm-2.7.tgz
      +tar xzf llvm-2.7.tgz
      +mv llvm-2.7 llvm
      +
      + +

      or check out the + latest version from subversion:

      + +
      svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
      + +
    2. + +
    3. Download the + llvm-gcc-4.2 source + and unpack it:

      + +
      +wget http://llvm.org/releases/2.7/llvm-gcc-4.2-2.7.source.tgz
      +tar xzf llvm-gcc-4.2-2.7.source.tgz
      +mv llvm-gcc-4.2-2.7.source llvm-gcc-4.2
      +
      + +

      or check out the + latest version from subversion:

      + +
      +svn co http://llvm.org/svn/llvm-project/llvm-gcc-4.2/trunk llvm-gcc-4.2
      +
      +
    4. + +
    5. Make a build directory llvm-objects for llvm and make it the + current directory:

      + +
      +mkdir llvm-objects
      +cd llvm-objects
      +
      +
    6. + +
    7. Configure LLVM (here it is configured to install into /usr/local):

      + +
      +../llvm/configure --prefix=/usr/local --enable-optimized --enable-assertions
      +
      + +

      If you have a multi-compiler setup and the C++ compiler is not the + default, then you can configure like this:

      + +
      +CXX=PATH_TO_C++_COMPILER ../llvm/configure --prefix=/usr/local --enable-optimized --enable-assertions
      +
      + +

      To compile without checking (not recommended), replace + --enable-assertions with --disable-assertions.

      + +
    8. + +
    9. Build LLVM:

      + +
      +make
      +
      +
    10. + +
    11. Install LLVM (optional):

      + +
      +make install
      +
      +
    12. + +
    13. Make a build directory llvm-gcc-4.2-objects for llvm-gcc and make it the + current directory:

      + +
      +cd ..
      +mkdir llvm-gcc-4.2-objects
      +cd llvm-gcc-4.2-objects
      +
      +
    14. + +
    15. Configure llvm-gcc (here it is configured to install into /usr/local). + The --enable-checking flag turns on sanity checks inside the compiler. + To turn off these checks (not recommended), replace --enable-checking + with --disable-checking. + Additional languages can be appended to the --enable-languages switch, + for example --enable-languages=ada,c,c++.

      + +
      +../llvm-gcc-4.2/configure --prefix=/usr/local --enable-languages=ada,c \
      +                          --enable-checking --enable-llvm=$PWD/../llvm-objects \
      +			  --disable-bootstrap --disable-multilib
      +
      + +

      If you have a multi-compiler setup, then you can configure like this:

      + +
      +export CC=PATH_TO_C_AND_ADA_COMPILER
      +export CXX=PATH_TO_C++_COMPILER
      +../llvm-gcc-4.2/configure --prefix=/usr/local --enable-languages=ada,c \
      +                          --enable-checking --enable-llvm=$PWD/../llvm-objects \
      +			  --disable-bootstrap --disable-multilib
      +
      +
    16. + +
    17. Build and install the compiler:

      + +
      +make
      +make install
      +
      +
    18. +
    + +
    + + +

    Building the Fortran front-end

    + + +
    +

    To build with support for Fortran, follow the directions in the top-level README.LLVM file, adding ",fortran" to EXTRALANGS, for example:

    + +
    +EXTRALANGS=,fortran
    +
    + +
    + + +

    License Information

    + + +
    +

    The LLVM GCC frontend is licensed to you under the GNU General Public License and the GNU Lesser General Public License. Please see the files COPYING and COPYING.LIB for more details.

    + +

    More information is available in the FAQ.

    +
    + + + +
    +
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-08-31 12:40:21 -0700 (Tue, 31 Aug 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/GarbageCollection.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/GarbageCollection.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/GarbageCollection.html (added) +++ www-releases/trunk/2.8/docs/GarbageCollection.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1387 @@ + + + + + Accurate Garbage Collection with LLVM + + + + + +
    + Accurate Garbage Collection with LLVM +
    + +
      +
    1. Introduction + +
    2. + +
    3. Getting started + +
    4. + +
    5. Core support + +
    6. + +
    7. Compiler plugin interface + +
    8. + +
    9. Implementing a collector runtime + +
    10. + +
    11. References
    12. + +
    + +
    +

    Written by Chris Lattner and + Gordon Henriksen

    +
    + + + + + +
    + +

    Garbage collection is a widely used technique that frees the programmer from having to know the lifetimes of heap objects, making software easier to produce and maintain. Many programming languages rely on garbage collection for automatic memory management. There are two primary forms of garbage collection: conservative and accurate.

    + +

    Conservative garbage collection often does not require any special support from either the language or the compiler: it can handle non-type-safe programming languages (such as C/C++) and does not require any special information from the compiler. The Boehm collector is an example of a state-of-the-art conservative collector.

    + +

    Accurate garbage collection requires the ability to identify all pointers in the program at run-time (which requires that the source-language be type-safe in most cases). Identifying pointers at run-time requires compiler support to locate all places that hold live pointer variables at run-time, including the processor stack and registers.

    + +

    Conservative garbage collection is attractive because it does not require any special compiler support, but it does have problems. In particular, because the conservative garbage collector cannot know that a particular word in the machine is a pointer, it cannot move live objects in the heap (preventing the use of compacting and generational GC algorithms) and it can occasionally suffer from memory leaks due to integer values that happen to point to objects in the program. In addition, some aggressive compiler transformations can break conservative garbage collectors (though these seem rare in practice).
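    The ambiguity at the heart of conservative collection can be sketched as follows. This is an illustration under simplifying assumptions, not how any particular collector is implemented; HeapRange, might_be_pointer and count_possible_roots are hypothetical names. Lacking type information, the scanner has to treat every word whose value falls inside the heap as a potential pointer, so an integer that merely looks like a heap address keeps an object alive, and the object cannot be moved because the "pointer" cannot safely be rewritten.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical heap bounds, as a conservative collector might track them.
struct HeapRange {
    std::uintptr_t begin;
    std::uintptr_t end;  // one past the last heap byte
};

// Without type information, any word that lands inside the heap range
// has to be treated as a possible pointer.
bool might_be_pointer(std::uintptr_t word, const HeapRange &heap) {
    return word >= heap.begin && word < heap.end;
}

// Counts the words in a scanned region (for example a stack frame)
// that would keep some heap object alive.
std::size_t count_possible_roots(const std::uintptr_t *words, std::size_t n,
                                 const HeapRange &heap) {
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (might_be_pointer(words[i], heap))
            ++hits;
    return hits;
}
```

    An accurate collector avoids this guesswork because the compiler tells it exactly which stack slots and registers hold pointers.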

    + +

    Accurate garbage collectors do not suffer from any of these problems, but +they can suffer from degraded scalar optimization of the program. In particular, +because the runtime must be able to identify and update all pointers active in +the program, some optimizations are less effective. In practice, however, the +locality and performance benefits of using aggressive garbage collection +techniques dominate any low-level losses.

    + +

    This document describes the mechanisms and interfaces provided by LLVM to +support accurate garbage collection.

    + +
    + + + + +
    + +

    LLVM's intermediate representation provides garbage +collection intrinsics that offer support for a broad class of +collector models. For instance, the intrinsics permit:

    + +
      +
    • semi-space collectors
    • mark-sweep collectors
    • generational collectors
    • reference counting
    • incremental collectors
    • concurrent collectors
    • cooperative collectors
    + +

    We hope that the primitive support built into the LLVM IR is sufficient to +support a broad class of garbage collected languages including Scheme, ML, Java, +C#, Perl, Python, Lua, Ruby, other scripting languages, and more.

    + +

    However, LLVM does not itself provide a garbage collector; this should +be part of your language's runtime library. LLVM provides a framework for +compile time code generation plugins. The role of these +plugins is to generate code and data structures which conform to the binary +interface specified by the runtime library. This is similar to the +relationship between LLVM and DWARF debugging info, for example. The +difference primarily lies in the lack of an established standard in the domain +of garbage collection, hence the plugins.

    + +

    The aspects of the binary interface with which LLVM's GC support is +concerned are:

    + +
      +
    • Creation of GC-safe points within code where collection is allowed to + execute safely.
    • Computation of the stack map. For each safe point in the code, object + references within the stack frame must be identified so that the + collector may traverse and perhaps update them.
    • Write barriers when storing object references to the heap. These are + commonly used to optimize incremental scans in generational + collectors.
    • Emission of read barriers when loading object references. These are + useful for interoperating with concurrent collectors.
    + +

    There are additional areas that LLVM does not directly address:

    + +
      +
    • Registration of global roots with the runtime.
    • Registration of stack map entries with the runtime.
    • The functions used by the program to allocate memory, trigger a + collection, etc.
    • Computation or compilation of type maps, or registration of them with + the runtime. These are used to crawl the heap for object + references.
    + +

    In general, LLVM's support for GC does not include features which can be +adequately addressed with other features of the IR and does not specify a +particular binary interface. On the plus side, this means that you should be +able to integrate LLVM with an existing runtime. On the other hand, it leaves +a lot of work for the developer of a novel language. However, it's easy to get +started quickly and scale up to a more sophisticated implementation as your +compiler matures.

    + +
    + + + + + +
    + +

    Using a GC with LLVM implies many things, for example:

    + +
      +
    • Write a runtime library or find an existing one which implements a GC + heap.
        +
      1. Implement a memory allocator.
      2. Design a binary interface for the stack map, used to identify + references within a stack frame on the machine stack.*
      3. Implement a stack crawler to discover functions on the call stack.*
      4. Implement a registry for global roots.
      5. Design a binary interface for type maps, used to identify references + within heap objects.
      6. Implement a collection routine bringing together all of the above.
    • +
    • Emit compatible code from your compiler.
        +
      • Initialization in the main function.
      • +
      • Use the gc "..." attribute to enable GC code generation + (or F.setGC("...")).
      • +
      • Use @llvm.gcroot to mark stack roots.
      • +
      • Use @llvm.gcread and/or @llvm.gcwrite to + manipulate GC references, if necessary.
      • +
      • Allocate memory using the GC allocation routine provided by the + runtime library.
      • +
      • Generate type maps according to your runtime's binary interface.
      • +
    • +
    • Write a compiler plugin to interface LLVM with the runtime library.*
        +
      • Lower @llvm.gcread and @llvm.gcwrite to appropriate + code sequences.*
      • +
      • Compile LLVM's stack map to the binary form expected by the + runtime.
      • +
    • +
    • Load the plugin into the compiler. Use llc -load or link the + plugin statically with your language's compiler.*
    • +
    • Link program executables with the runtime.
    • +
    + +

    To help with several of these tasks (those indicated with a *), LLVM +includes a highly portable, built-in ShadowStack code generator. It is compiled +into llc and works even with the interpreter and C backends.

    + +
    + + + + +
    + +

    To turn the shadow stack on for your functions, first call:

    + +
    F.setGC("shadow-stack");
    + +

    for each function your compiler emits. Since the shadow stack is built into +LLVM, you do not need to load a plugin.

    + +

    Your compiler must also use @llvm.gcroot as documented. +Don't forget to create a root for each intermediate value that is generated +when evaluating an expression. In h(f(), g()), the result of +f() could easily be collected if evaluating g() triggers a +collection.

    + +

    There's no need to use @llvm.gcread and @llvm.gcwrite over +plain load and store for now. You will need them when +switching to a more advanced GC.

    + +
    + + + + +
    + +

    The shadow stack doesn't imply a memory allocation algorithm. A semispace +collector or building atop malloc are great places to start, and can +be implemented with very little code.

    + +

    When it comes time to collect, however, your runtime needs to traverse the +stack roots, and for this it needs to integrate with the shadow stack. Luckily, +doing so is very simple. (This code is heavily commented to help you +understand the data structure, but there are only 20 lines of meaningful +code.)

    + +
    + +
    /// @brief The map for a single function's stack frame. One of these is
    +///        compiled as constant data into the executable for each function.
    +/// 
    +/// Storage of metadata values is elided if the %metadata parameter to
    +/// @llvm.gcroot is null.
    +struct FrameMap {
    +  int32_t NumRoots;    //< Number of roots in stack frame.
    +  int32_t NumMeta;     //< Number of metadata entries. May be < NumRoots.
    +  const void *Meta[0]; //< Metadata for each root.
    +};
    +
    +/// @brief A link in the dynamic shadow stack. One of these is embedded in the
    +///        stack frame of each function on the call stack.
    +struct StackEntry {
    +  StackEntry *Next;    //< Link to next stack entry (the caller's).
    +  const FrameMap *Map; //< Pointer to constant FrameMap.
    +  void *Roots[0];      //< Stack roots (in-place array).
    +};
    +
    +/// @brief The head of the singly-linked list of StackEntries. Functions push
    +///        and pop onto this in their prologue and epilogue.
    +/// 
    +/// Since there is only a global list, this technique is not threadsafe.
    +StackEntry *llvm_gc_root_chain;
    +
    +/// @brief Calls Visitor(root, meta) for each GC root on the stack.
    +///        root and meta are exactly the values passed to
    +///        @llvm.gcroot.
    +/// 
    +/// Visitor could be a function to recursively mark live objects. Or it
    +/// might copy them to another heap or generation.
    +/// 
    +/// @param Visitor A function to invoke for every GC root on the stack.
    +void visitGCRoots(void (*Visitor)(void **Root, const void *Meta)) {
    +  for (StackEntry *R = llvm_gc_root_chain; R; R = R->Next) {
    +    unsigned i = 0;
    +    
    +    // For roots [0, NumMeta), the metadata pointer is in the FrameMap.
    +    for (unsigned e = R->Map->NumMeta; i != e; ++i)
    +      Visitor(&R->Roots[i], R->Map->Meta[i]);
    +    
    +    // For roots [NumMeta, NumRoots), the metadata pointer is null.
    +    for (unsigned e = R->Map->NumRoots; i != e; ++i)
    +      Visitor(&R->Roots[i], NULL);
    +  }
    +}
    + + + + +
    + +

    Unlike many GC algorithms which rely on a cooperative code generator to +compile stack maps, this algorithm carefully maintains a linked list of stack +roots [Henderson2002]. This so-called "shadow stack" +mirrors the machine stack. Maintaining this data structure is slower than using +a stack map compiled into the executable as constant data, but has a significant +portability advantage because it requires no special support from the target +code generator, and does not require tricky platform-specific code to crawl +the machine stack.
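To make the push and pop of the shadow stack concrete, here is a hand-written sketch of what a function's prologue and epilogue do. It is an illustration under stated assumptions, not the code LLVM emits: `FrameMap2`, `StackEntry2`, `ShadowFrame` and `RootChain` are hypothetical stand-ins, and the root array is fixed at two slots so the sketch stays standard C++.

```cpp
#include <cassert>
#include <cstdint>

// Fixed-size stand-ins for the FrameMap/StackEntry layouts (assumption: two
// root slots per frame, no metadata).
struct FrameMap2 {
  std::int32_t NumRoots;
  std::int32_t NumMeta;
};

struct StackEntry2 {
  StackEntry2 *Next;    // the caller's entry
  const FrameMap2 *Map; // constant per-function map
  void *Roots[2];       // the root slots themselves
};

static StackEntry2 *RootChain = nullptr; // plays the role of llvm_gc_root_chain

// RAII helper standing in for the generated prologue (push) and epilogue (pop).
class ShadowFrame {
  StackEntry2 Entry;
public:
  explicit ShadowFrame(const FrameMap2 &Map) {
    Entry.Next = RootChain;
    Entry.Map = &Map;
    Entry.Roots[0] = nullptr; // roots start out null
    Entry.Roots[1] = nullptr;
    RootChain = &Entry;
  }
  ~ShadowFrame() { RootChain = Entry.Next; }
  void *&root(int I) {
    assert(I >= 0 && I < 2);
    return Entry.Roots[I];
  }
};

// Walks the chain and counts non-null roots, as a mark phase might visit them.
static int countLiveRoots() {
  int N = 0;
  for (StackEntry2 *E = RootChain; E; E = E->Next)
    for (std::int32_t I = 0; I != E->Map->NumRoots; ++I)
      if (E->Roots[I])
        ++N;
  return N;
}

// Callee with one live root; the "collection" runs while both frames are live.
static int innerFunction() {
  static const FrameMap2 Map = {2, 0};
  static int Obj;
  ShadowFrame Frame(Map);
  Frame.root(0) = &Obj;
  return countLiveRoots();
}

// Caller with two live roots.
static int outerFunction() {
  static const FrameMap2 Map = {2, 0};
  static int A, B;
  ShadowFrame Frame(Map);
  Frame.root(0) = &A;
  Frame.root(1) = &B;
  return innerFunction();
}
```

Calling `outerFunction()` visits the caller's two roots and the callee's one. Every call pays for the push and pop, and the single global chain is shared by all threads.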

    + +

    The tradeoff for this simplicity and portability is:

    + +
      +
    • High overhead per function call.
    • Not thread-safe.
    + +

    Still, it's an easy way to get started. After your compiler and runtime are +up and running, writing a plugin will allow you to take +advantage of more advanced GC features of LLVM +in order to improve performance.

    + +
    + + + + + +
    + +

    This section describes the garbage collection facilities provided by the +LLVM intermediate representation. The exact behavior +of these IR features is specified by the binary interface implemented by a +code generation plugin, not by this document.

    + +

    These facilities are limited to those strictly necessary; they are not +intended to be a complete interface to any garbage collector. A program will +need to interface with the GC library using the facilities provided by that +library.

    + +
    + + + + +
    + define ty @name(...) gc "name" { ... +
    + +
    + +

    The gc function attribute is used to specify the desired GC style +to the compiler. Its programmatic equivalent is the setGC method of +Function.

    + +

    Setting gc "name" on a function triggers a search for a +matching code generation plugin "name"; it is that plugin which defines +the exact nature of the code generated to support GC. If none is found, the +compiler will raise an error.

    + +

    Specifying the GC style on a per-function basis allows LLVM to link together +programs that use different garbage collection algorithms (or none at all).

    + +
    + + + + +
    + void @llvm.gcroot(i8** %ptrloc, i8* %metadata) +
    + +
    + +

    The llvm.gcroot intrinsic is used to inform LLVM that a stack +variable references an object on the heap and is to be tracked for garbage +collection. The exact impact on generated code is specified by a compiler plugin.

    + +

    A compiler which uses mem2reg to raise imperative code using alloca +into SSA form need only add a call to @llvm.gcroot for those variables +which are pointers into the GC heap.

    + +

    It is also important to mark intermediate values with llvm.gcroot. +For example, consider h(f(), g()). Beware leaking the result of +f() in the case that g() triggers a collection.

    + +

    The first argument must be a value referring to an alloca instruction +or a bitcast of an alloca. The second contains a pointer to metadata that +should be associated with the pointer, and must be a constant or global +value address. If your target collector uses tags, use a null pointer for +metadata.

    + +

    The %metadata argument can be used to avoid requiring heap objects +to have 'isa' pointers or tag bits. [Appel89, Goldberg91, Tolmach94] If +specified, its value will be tracked along with the location of the pointer in +the stack frame.

    + +

    Consider the following fragment of Java code:

    + +
    +       {
    +         Object X;   // A null-initialized reference to an object
    +         ...
    +       }
    +
    + +

    This block (which may be located in the middle of a function or in a loop +nest), could be compiled to this LLVM code:

    + +
    +Entry:
    +   ;; In the entry block for the function, allocate the
    +   ;; stack space for X, which is an LLVM pointer.
    +   %X = alloca %Object*
    +   
    +   ;; Tell LLVM that the stack space is a stack root.
    +   ;; Java has type-tags on objects, so we pass null as metadata.
    +   %tmp = bitcast %Object** %X to i8**
    +   call void @llvm.gcroot(i8** %tmp, i8* null)
    +   ...
    +
    +   ;; "CodeBlock" is the block corresponding to the start
    +   ;;  of the scope above.
    +CodeBlock:
    +   ;; Java null-initializes pointers.
    +   store %Object* null, %Object** %X
    +
    +   ...
    +
    +   ;; As the pointer goes out of scope, store a null value into
    +   ;; it, to indicate that the value is no longer live.
    +   store %Object* null, %Object** %X
    +   ...
    +
    + +
    + + + + +
    + +

    Some collectors need to be informed when the mutator (the program that needs +garbage collection) either reads a pointer from or writes a pointer to a field +of a heap object. The code fragments inserted at these points are called +read barriers and write barriers, respectively. The amount of +code that needs to be executed is usually quite small and not on the critical +path of any computation, so the overall performance impact of the barrier is +tolerable.

    + +

    Barriers often require access to the object pointer rather than the +derived pointer (which is a pointer to the field within the +object). Accordingly, these intrinsics take both pointers as separate arguments +for completeness. In this snippet, %object is the object pointer, and +%derived is the derived pointer:

    + +
    +    ;; An array type.
    +    %class.Array = type { %class.Object, i32, [0 x %class.Object*] }
    +    ...
    +
    +    ;; Load the object pointer from a gcroot.
    +    %object = load %class.Array** %object_addr
    +
    +    ;; Compute the derived pointer.
    +    %derived = getelementptr %class.Array* %object, i32 0, i32 2, i32 %n
    + +

    LLVM does not enforce this relationship between the object and derived +pointer (although a plugin might). However, it would be +an unusual collector that violated it.

    + +

    The use of these intrinsics is naturally optional if the target GC does +not require the corresponding barrier. Such a GC plugin will replace the intrinsic +calls with the corresponding load or store instruction if they +are used.

    + +
    + + + + +
    +void @llvm.gcwrite(i8* %value, i8* %object, i8** %derived) +
    + +
    + +

    For write barriers, LLVM provides the llvm.gcwrite intrinsic +function. It has exactly the same semantics as a non-volatile store to +the derived pointer (the third argument). The exact code generated is specified +by a compiler plugin.

    + +

    Many important algorithms require write barriers, including generational +and concurrent collectors. Additionally, write barriers could be used to +implement reference counting.
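As an illustration of what a generational collector's write barrier might do, the sketch below mirrors the argument order of llvm.gcwrite (value, object, derived pointer). The `Obj` layout, the `InOldGeneration` flag and `RememberedSet` are assumptions made for the example; a real plugin defines its own representation.

```cpp
#include <cassert>
#include <set>

// Hypothetical object header, for illustration only.
struct Obj {
  bool InOldGeneration;
  Obj *Field;
};

// Old-generation objects that may point into the young generation. A minor
// collection would treat these as additional roots.
static std::set<Obj *> RememberedSet;

// Performs the store and records old-to-young pointers on the side.
static void writeBarrier(Obj *Value, Obj *Object, Obj **Derived) {
  assert(Object && Derived);
  if (Object->InOldGeneration && Value && !Value->InOldGeneration)
    RememberedSet.insert(Object);
  *Derived = Value; // the plain store that the barrier wraps
}
```

The barrier needs the object pointer, not just the derived pointer, because the generation test is made on the containing object.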

    + +
    + + + + +
    +i8* @llvm.gcread(i8* %object, i8** %derived)
    +
    + +
    + +

    For read barriers, LLVM provides the llvm.gcread intrinsic function. +It has exactly the same semantics as a non-volatile load from the +derived pointer (the second argument). The exact code generated is specified by +a compiler plugin.

    + +

    Read barriers are needed by fewer algorithms than write barriers, and may +have a greater performance impact since pointer reads are more frequent than +writes.
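One possible use of a read barrier is following forwarding pointers in a collector that relocates objects while the program runs. The sketch below is an assumption-laden illustration with a hypothetical `Node` layout; it mirrors the argument order of llvm.gcread (object, derived pointer) but is not what any particular plugin generates.

```cpp
#include <cassert>

// Hypothetical object layout with a forwarding pointer.
struct Node {
  Node *Forward; // non-null once the collector has moved the object
  int Payload;
  Node *Next;
};

// Loads through the derived pointer, then redirects to the relocated copy if
// the loaded object has been moved.
inline Node *readBarrier(Node *Object, Node **Derived) {
  assert(Object && Derived);
  Node *Loaded = *Derived; // the plain load that the barrier wraps
  if (Loaded && Loaded->Forward) {
    Loaded = Loaded->Forward; // use the new copy
    *Derived = Loaded;        // update the field so later reads skip this step
  }
  return Loaded;
}
```

Every pointer load pays for the extra test, which is the cost the paragraph above refers to.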

    + +
    + + + + + +
    + +

    User code specifies which GC code generation to use with the gc +function attribute or, equivalently, with the setGC method of +Function.

    + +

    To implement a GC plugin, it is necessary to subclass +llvm::GCStrategy, which can be accomplished in a few lines of +boilerplate code. LLVM's infrastructure provides access to several important +algorithms. For an uncontroversial collector, all that remains may be to +compile LLVM's computed stack map to assembly code (using the binary +representation expected by the runtime library). This can be accomplished in +about 100 lines of code.

    + +

    This is not the appropriate place to implement a garbage collected heap or a +garbage collector itself. That code should exist in the language's runtime +library. The compiler plugin is responsible for generating code which +conforms to the binary interface defined by the library, most essentially the +stack map.

    + +

    To subclass llvm::GCStrategy and register it with the compiler:

    + +
    // lib/MyGC/MyGC.cpp - Example LLVM GC plugin
    +
    +#include "llvm/CodeGen/GCStrategy.h"
    +#include "llvm/CodeGen/GCMetadata.h"
    +#include "llvm/Support/Compiler.h"
    +
    +using namespace llvm;
    +
    +namespace {
    +  class LLVM_LIBRARY_VISIBILITY MyGC : public GCStrategy {
    +  public:
    +    MyGC() {}
    +  };
    +  
    +  GCRegistry::Add<MyGC>
    +  X("mygc", "My bespoke garbage collector.");
    +}
    + +

    This boilerplate collector does nothing. More specifically:

    + +
      +
    • llvm.gcread calls are replaced with the corresponding + load instruction.
    • llvm.gcwrite calls are replaced with the corresponding + store instruction.
    • No safe points are added to the code.
    • The stack map is not compiled into the executable.
    + +

    Using the LLVM makefiles (like the sample +project), this code can be compiled as a plugin using a simple +makefile:

    + +
    # lib/MyGC/Makefile
    +
    +LEVEL := ../..
    +LIBRARYNAME = MyGC
    +LOADABLE_MODULE = 1
    +
    +include $(LEVEL)/Makefile.common
    + +

    Once the plugin is compiled, code using it may be compiled using llc +-load=MyGC.so (though MyGC.so may have some other +platform-specific extension):

    + +
    $ cat sample.ll
    +define void @f() gc "mygc" {
    +entry:
    +        ret void
    +}
    +$ llvm-as < sample.ll | llc -load=MyGC.so
    + +

    It is also possible to statically link the collector plugin into tools, such +as a language-specific compiler front-end.

    + +
    + + + + +
    + +

    GCStrategy provides a range of features through which a plugin +may do useful work. Some of these are callbacks, some are algorithms that can +be enabled, disabled, or customized. This matrix summarizes the supported (and +planned) features and correlates them with the collection techniques which +typically require them.

    + +
    Algorithm | Done | shadow stack | refcount | mark-sweep | copying | incremental | threaded | concurrent
    stack map
    initialize roots
    derived pointers | NO | ✘* ✘*
    custom lowering
    gcroot
    gcwrite
    gcread
    safe points
    in calls
    before calls
    for loops | NO
    before escape
    emit code at safe points | NO
    output
    assembly
    JIT | NO
    obj | NO
    live analysis | NO
    register map | NO
    +
    * Derived pointers only pose a + hazard to copying collectors.
    +
    in gray denotes a feature which + could be utilized if available.
    +
    + +

    To be clear, the collection techniques above are defined as:

    + +
    +
    Shadow Stack
    +
    The mutator carefully maintains a linked list of stack roots.
    +
    Reference Counting
    +
    The mutator maintains a reference count for each object and frees an + object when its count falls to zero.
    +
    Mark-Sweep
    +
    When the heap is exhausted, the collector marks reachable objects starting + from the roots, then deallocates unreachable objects in a sweep + phase.
    +
    Copying
    +
    As reachability analysis proceeds, the collector copies objects from one + heap area to another, compacting them in the process. Copying collectors + enable highly efficient "bump pointer" allocation and can improve locality + of reference.
    +
    Incremental
    +
    (Including generational collectors.) Incremental collectors generally have + all the properties of a copying collector (regardless of whether the + mature heap is compacting), but bring the added complexity of requiring + write barriers.
    +
    Threaded
    +
    Denotes a multithreaded mutator; the collector must still stop the mutator + ("stop the world") before beginning reachability analysis. Stopping a + multithreaded mutator is a complicated problem. It generally requires + highly platform specific code in the runtime, and the production of + carefully designed machine code at safe points.
    +
    Concurrent
    +
    In this technique, the mutator and the collector run concurrently, with + the goal of eliminating pause times. In a cooperative collector, + the mutator further aids with collection should a pause occur, allowing + collection to take advantage of multiprocessor hosts. The "stop the world" + problem of threaded collectors is generally still present to a limited + extent. Sophisticated marking algorithms are necessary. Read barriers may + be necessary.
    +
    + +
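The "bump pointer" allocation mentioned under Copying can be sketched in a few lines. `BumpRegion` is a hypothetical class for illustration; it assumes the region's start address is already suitably aligned and leaves out the collection that a real runtime would trigger on exhaustion.

```cpp
#include <cassert>
#include <cstddef>

// A fixed region standing in for one semispace.
class BumpRegion {
  char *Cur;
  char *End;
public:
  BumpRegion(char *Begin, std::size_t Size) : Cur(Begin), End(Begin + Size) {
    assert(Begin != nullptr);
  }

  // Allocation is a bounds check plus a pointer increment. Returns null when
  // the region is exhausted, which is where a collection would be triggered.
  void *allocate(std::size_t Size) {
    const std::size_t Align = alignof(std::max_align_t);
    Size = (Size + Align - 1) & ~(Align - 1); // round up to keep alignment
    if (Size > static_cast<std::size_t>(End - Cur))
      return nullptr;
    void *Result = Cur;
    Cur += Size;
    return Result;
  }

  std::size_t remaining() const { return static_cast<std::size_t>(End - Cur); }
};
```

This only works when free space is contiguous, which is what compaction by a copying collector provides.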

    As the matrix indicates, LLVM's garbage collection infrastructure is already +suitable for a wide variety of collectors, but does not currently extend to +multithreaded programs. This will be added in the future as there is +interest.

    + +
    + + + + +
    + +

    LLVM automatically computes a stack map. One of the most important features +of a GCStrategy is to compile this information into the executable in +the binary representation expected by the runtime library.

    + +

    The stack map consists of the location and identity of each GC root in +each function in the module. For each root:

    + +
      +
    • RootNum: The index of the root.
    • StackOffset: The offset of the object relative to the frame + pointer.
    • RootMetadata: The value passed as the %metadata + parameter to the @llvm.gcroot intrinsic.
    + +

    Also, for the function as a whole:

    + +
      +
    • getFrameSize(): The overall size of the function's initial + stack frame, not accounting for any dynamic allocation.
    • roots_size(): The count of roots in the function.
    + +
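On the runtime side, a collector could combine these per-root fields with a frame pointer found by its stack crawler to locate each root slot. The sketch below assumes a hypothetical `RootRecord` structure that a plugin might emit. LLVM does not define this layout, and the sign and base of the offset depend on the target.

```cpp
#include <cassert>

// Hypothetical runtime-side record mirroring the per-root fields listed above.
struct RootRecord {
  int Num;              // RootNum
  int StackOffset;      // offset from the frame pointer, in bytes
  const void *Metadata; // value passed as %metadata, or null
};

// Given the frame pointer of an activation, returns the address of the stack
// slot holding the root described by R.
inline void **rootSlot(char *FramePointer, const RootRecord &R) {
  assert(FramePointer);
  return reinterpret_cast<void **>(FramePointer + R.StackOffset);
}
```

A moving collector would both read and rewrite `*rootSlot(FP, R)`, which is why the stack map records the slot's location, not the pointer value.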

    To access the stack map, use GCFunctionInfo::roots_begin() and +roots_end() from the GCMetadataPrinter:

    + +
    for (iterator I = begin(), E = end(); I != E; ++I) {
    +  GCFunctionInfo *FI = *I;
    +  unsigned FrameSize = FI->getFrameSize();
    +  size_t RootCount = FI->roots_size();
    +
    +  for (GCFunctionInfo::roots_iterator RI = FI->roots_begin(),
    +                                      RE = FI->roots_end();
    +                                      RI != RE; ++RI) {
    +    int RootNum = RI->Num;
    +    int RootStackOffset = RI->StackOffset;
    +    Constant *RootMetadata = RI->Metadata;
    +  }
    +}
    + +

    If the llvm.gcroot intrinsic is eliminated before code generation by +a custom lowering pass, LLVM will compute an empty stack map. This may be useful +for collector plugins which implement reference counting or a shadow stack.

    + +
    + + + + + +
    + +
    MyGC::MyGC() {
    +  InitRoots = true;
    +}
    + +

    When set, LLVM will automatically initialize each root to null upon +entry to the function. This prevents the GC's sweep phase from visiting +uninitialized pointers, which will almost certainly cause it to crash. This +initialization occurs before custom lowering, so the two may be used +together.

    + +

    Since LLVM does not yet compute liveness information, there is no means of +distinguishing an uninitialized stack root from an initialized one. Therefore, +this feature should be used by all GC plugins. It is enabled by default.

    + +
    + + + + + +
    + +

    For GCs which use barriers or unusual treatment of stack roots, these +flags allow the collector to perform arbitrary transformations of the LLVM +IR:

    + +
    class MyGC : public GCStrategy {
    +public:
    +  MyGC() {
    +    CustomRoots = true;
    +    CustomReadBarriers = true;
    +    CustomWriteBarriers = true;
    +  }
    +  
    +  virtual bool initializeCustomLowering(Module &M);
    +  virtual bool performCustomLowering(Function &F);
    +};
    + +

    If any of these flags are set, then LLVM suppresses its default lowering for +the corresponding intrinsics and instead calls +performCustomLowering.

    + +

    LLVM's default action for each intrinsic is as follows:

    + +
      +
    • llvm.gcroot: Leave it alone. The code generator must see it + or the stack map will not be computed.
    • llvm.gcread: Substitute a load instruction.
    • llvm.gcwrite: Substitute a store instruction.
    + +

    If CustomReadBarriers or CustomWriteBarriers are specified, +then performCustomLowering must eliminate the +corresponding barriers.

    + +

    performCustomLowering must comply with the same restrictions as FunctionPass::runOnFunction. +Likewise, initializeCustomLowering has the same semantics as Pass::doInitialization(Module&).

    + +

    The following can be used as a template:

    + +
    #include "llvm/Module.h"
    +#include "llvm/IntrinsicInst.h"
    +
    +bool MyGC::initializeCustomLowering(Module &M) {
    +  return false;
    +}
    +
    +bool MyGC::performCustomLowering(Function &F) {
    +  bool MadeChange = false;
    +  
    +  for (Function::iterator BB = F.begin(), E = F.end(); BB != E; ++BB)
    +    for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; )
    +      if (IntrinsicInst *CI = dyn_cast<IntrinsicInst>(II++))
    +        if (Function *F = CI->getCalledFunction())
    +          switch (F->getIntrinsicID()) {
    +          case Intrinsic::gcwrite:
    +            // Handle llvm.gcwrite.
    +            CI->eraseFromParent();
    +            MadeChange = true;
    +            break;
    +          case Intrinsic::gcread:
    +            // Handle llvm.gcread.
    +            CI->eraseFromParent();
    +            MadeChange = true;
    +            break;
    +          case Intrinsic::gcroot:
    +            // Handle llvm.gcroot.
    +            CI->eraseFromParent();
    +            MadeChange = true;
    +            break;
    +          }
    +  
    +  return MadeChange;
    +}
    + +
    + + + + + +
    + +

    LLVM can compute four kinds of safe points:

    + +
    namespace GC {
    +  /// PointKind - The type of a collector-safe point.
    +  /// 
    +  enum PointKind {
    +    Loop,    //< Instr is a loop (backwards branch).
    +    Return,  //< Instr is a return instruction.
    +    PreCall, //< Instr is a call instruction.
    +    PostCall //< Instr is the return address of a call.
    +  };
    +}
    + +

    A collector can request any combination of the four by setting the +NeededSafePoints mask:

    + +
    MyGC::MyGC() {
    +  NeededSafePoints = 1 << GC::Loop
    +                   | 1 << GC::Return
    +                   | 1 << GC::PreCall
    +                   | 1 << GC::PostCall;
    +}
    + +

    It can then use the following routines to access safe points.

    + +
    for (iterator I = begin(), E = end(); I != E; ++I) {
    +  GCFunctionInfo *MD = *I;
    +  size_t PointCount = MD->size();
    +
    +  for (GCFunctionInfo::iterator PI = MD->begin(),
    +                                PE = MD->end(); PI != PE; ++PI) {
    +    GC::PointKind PointKind = PI->Kind;
    +    unsigned PointNum = PI->Num;
    +  }
    +}
    +
    + +

    Almost every collector requires PostCall safe points, since these +correspond to the moments when the function is suspended during a call to a +subroutine.

    + +

    Threaded programs generally require Loop safe points to guarantee +that the application will reach a safe point within a bounded amount of time, +even if it is executing a long-running loop which contains no function +calls.

    + +

    Threaded collectors may also require Return and PreCall +safe points to implement "stop the world" techniques using self-modifying code, +where it is important that the program not exit the function without reaching a +safe point (because only the topmost function has been patched).
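The NeededSafePoints mask is an ordinary bit set indexed by PointKind. The sketch below uses a local copy of the enumeration in a hypothetical `sketch` namespace so that it compiles without LLVM headers; `maskFor` and `needsSafePoint` are illustrative helpers, not LLVM API.

```cpp
#include <cassert>

namespace sketch {
// Local copy of the four safe point kinds, in the same order as GC::PointKind.
enum PointKind { Loop, Return, PreCall, PostCall };

// Builds the mask bit for one kind, following the 1 << Kind convention.
inline unsigned maskFor(PointKind K) {
  assert(K >= Loop && K <= PostCall);
  return 1u << K;
}

// Returns true if Mask requests safe points of kind K.
inline bool needsSafePoint(unsigned Mask, PointKind K) {
  return (Mask & maskFor(K)) != 0;
}
} // namespace sketch
```

A single-threaded collector would typically request only PostCall; a threaded one would add Loop, and possibly Return and PreCall, for the reasons given above.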

    + +
    + + + + + +
    + +

    LLVM allows a plugin to print arbitrary assembly code before and after the +rest of a module's assembly code. At the end of the module, the GC can compile +the LLVM stack map into assembly code. (At the beginning, this information is not +yet computed.)

    + +

    Since AsmWriter and CodeGen are separate components of LLVM, a separate +abstract base class and registry are provided for printing assembly code: +GCMetadataPrinter and GCMetadataPrinterRegistry. The AsmWriter +will look for such a subclass if the GCStrategy sets +UsesMetadata:

    + +
    MyGC::MyGC() {
    +  UsesMetadata = true;
    +}
    + +

    This separation allows JIT-only clients to be smaller.

    + +

    Note that LLVM does not currently have analogous APIs to support code +generation in the JIT, nor using the object writers.

    + +
    // lib/MyGC/MyGCPrinter.cpp - Example LLVM GC printer
    +
    +#include "llvm/CodeGen/GCMetadataPrinter.h"
    +#include "llvm/Support/Compiler.h"
    +
    +using namespace llvm;
    +
    +namespace {
    +  class LLVM_LIBRARY_VISIBILITY MyGCPrinter : public GCMetadataPrinter {
    +  public:
    +    virtual void beginAssembly(std::ostream &OS, AsmPrinter &AP,
    +                               const TargetAsmInfo &TAI);
    +  
    +    virtual void finishAssembly(std::ostream &OS, AsmPrinter &AP,
    +                                const TargetAsmInfo &TAI);
    +  };
    +  
    +  GCMetadataPrinterRegistry::Add<MyGCPrinter>
    +  X("mygc", "My bespoke garbage collector.");
    +}
    + +

    The collector should use AsmPrinter and TargetAsmInfo to +print portable assembly code to the std::ostream. The collector itself +contains the stack map for the entire module, and may access the +GCFunctionInfo using its own begin() and end() +methods. Here's a realistic example:

    + +
    #include "llvm/CodeGen/AsmPrinter.h"
    +#include "llvm/Function.h"
    +#include "llvm/Target/TargetMachine.h"
    +#include "llvm/Target/TargetData.h"
    +#include "llvm/Target/TargetAsmInfo.h"
    +
    +void MyGCPrinter::beginAssembly(std::ostream &OS, AsmPrinter &AP,
    +                                const TargetAsmInfo &TAI) {
    +  // Nothing to do.
    +}
    +
    +void MyGCPrinter::finishAssembly(std::ostream &OS, AsmPrinter &AP,
    +                                 const TargetAsmInfo &TAI) {
    +  // Set up for emitting addresses.
    +  const char *AddressDirective;
    +  int AddressAlignLog;
    +  if (AP.TM.getTargetData()->getPointerSize() == sizeof(int32_t)) {
    +    AddressDirective = TAI.getData32bitsDirective();
    +    AddressAlignLog = 2;
    +  } else {
    +    AddressDirective = TAI.getData64bitsDirective();
    +    AddressAlignLog = 3;
    +  }
    +  
    +  // Put this in the data section.
    +  AP.SwitchToDataSection(TAI.getDataSection());
    +  
    +  // For each function...
    +  for (iterator FI = begin(), FE = end(); FI != FE; ++FI) {
    +    GCFunctionInfo &MD = **FI;
    +    
    +    // Emit this data structure:
    +    // 
    +    // struct {
    +    //   int32_t PointCount;
    +    //   struct {
    +    //     void *SafePointAddress;
    +    //     int32_t StackFrameSize;
    +    //     int32_t LiveCount;
    +    //     int32_t LiveOffsets[LiveCount];
    +    //   } Points[PointCount];
    +    // } __gcmap_<FUNCTIONNAME>;
    +    
    +    // Align to address width.
    +    AP.EmitAlignment(AddressAlignLog);
    +    
    +    // Emit the symbol by which the stack map entry can be found.
    +    std::string Symbol;
    +    Symbol += TAI.getGlobalPrefix();
    +    Symbol += "__gcmap_";
    +    Symbol += MD.getFunction().getName();
    +    if (const char *GlobalDirective = TAI.getGlobalDirective())
    +      OS << GlobalDirective << Symbol << "\n";
    +    OS << Symbol << ":\n";  // Symbol already carries the global prefix.
    +    
    +    // Emit PointCount.
    +    AP.EmitInt32(MD.size());
    +    AP.EOL("safe point count");
    +    
    +    // And each safe point...
    +    for (GCFunctionInfo::iterator PI = MD.begin(),
    +                                     PE = MD.end(); PI != PE; ++PI) {
    +      // Align to address width.
    +      AP.EmitAlignment(AddressAlignLog);
    +      
    +      // Emit the address of the safe point.
    +      OS << AddressDirective
    +         << TAI.getPrivateGlobalPrefix() << "label" << PI->Num;
    +      AP.EOL("safe point address");
    +      
    +      // Emit the stack frame size.
    +      AP.EmitInt32(MD.getFrameSize());
    +      AP.EOL("stack frame size");
    +      
    +      // Emit the number of live roots in the function.
    +      AP.EmitInt32(MD.live_size(PI));
    +      AP.EOL("live root count");
    +      
    +      // And for each live root...
    +      for (GCFunctionInfo::live_iterator LI = MD.live_begin(PI),
    +                                            LE = MD.live_end(PI);
    +                                            LI != LE; ++LI) {
    +        // Print its offset within the stack frame.
    +        AP.EmitInt32(LI->StackOffset);
    +        AP.EOL("stack offset");
    +      }
    +    }
    +  }
    +}
    +
    + +
    + + + + + + +
    + +

    [Appel89] Runtime Tags Aren't Necessary. Andrew +W. Appel. Lisp and Symbolic Computation 19(7):703-705, July 1989.

    + +

    [Goldberg91] Tag-free garbage collection for +strongly typed programming languages. Benjamin Goldberg. ACM SIGPLAN +PLDI'91.

    + +

    [Tolmach94] Tag-free garbage collection using +explicit type parameters. Andrew Tolmach. Proceedings of the 1994 ACM +conference on LISP and functional programming.

    + +

    [Henderson2002] +Accurate Garbage Collection in an Uncooperative Environment. +Fergus Henderson. International Symposium on Memory Management 2002.

    + +
    + + + + +
    +
    + Valid CSS + Valid HTML 4.01 + + Chris Lattner
    + LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-05-11 13:16:09 -0700 (Tue, 11 May 2010) $ +
    + + + Added: www-releases/trunk/2.8/docs/GetElementPtr.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/GetElementPtr.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/GetElementPtr.html (added) +++ www-releases/trunk/2.8/docs/GetElementPtr.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,725 @@ + + + + + The Often Misunderstood GEP Instruction + + + + + +
    + The Often Misunderstood GEP Instruction +
    + +
      +
    1. Introduction
    2. Address Computation
       1. Why is the extra 0 index required?
       2. What is dereferenced by GEP?
       3. Why can you index through the first pointer but not subsequent ones?
       4. Why don't GEP x,0,0,1 and GEP x,1 alias?
       5. Why do GEP x,1,0,0 and GEP x,1 alias?
       6. Can GEP index into vector elements?
       7. What effect do address spaces have on GEPs?
       8. How is GEP different from ptrtoint, arithmetic, and inttoptr?
       9. I'm writing a backend for a target which needs custom lowering for GEP. How do I do this?
       10. How does VLA addressing work with GEPs?
    3. Rules
       1. What happens if an array index is out of bounds?
       2. Can array indices be negative?
       3. Can I compare two values computed with GEPs?
       4. Can I do GEP with a different pointer type than the type of the underlying object?
       5. Can I cast an object's address to integer and add it to null?
       6. Can I compute the distance between two objects, and add that value to one address to compute the other address?
       7. Can I do type-based alias analysis on LLVM IR?
       8. What happens if a GEP computation overflows?
       9. How can I tell if my front-end is following the rules?
    4. Rationale
       1. Why is GEP designed this way?
       2. Why do struct member indices always use i32?
       3. What's an uglygep?
    5. Summary
    + +
    +

    Written by: Reid Spencer.

    +
    + + + + + + +
    +

    This document seeks to dispel the mystery and confusion surrounding LLVM's + GetElementPtr (GEP) instruction. + Questions about the wily GEP instruction are + probably the most frequently occurring questions once a developer gets down to + coding with LLVM. Here we lay out the sources of confusion and show that the + GEP instruction is really quite simple. +

    +
    + + + + +
    +

    When people are first confronted with the GEP instruction, they tend to relate it to known concepts from other programming paradigms, most notably C array indexing and field selection. GEP closely resembles C array indexing and field selection; however, it is a little different, and this leads to the following questions.

    +
    + + + +
    +

    Quick answer: The first index steps through the first operand.

    +

    The confusion with the first index usually arises from thinking about + the GetElementPtr instruction as if it was a C index operator. They aren't the + same. For example, when we write, in "C":

    + +
    +
    +AType *Foo;
    +...
    +X = &Foo->F;
    +
    +
    + +

    it is natural to think that there is only one index, the selection of the field F. However, in this example, Foo is a pointer. That pointer must be indexed explicitly in LLVM. C, on the other hand, indexes through it transparently. To arrive at the same address location as the C code, you would provide the GEP instruction with two index operands. The first operand indexes through the pointer; the second operand indexes the field F of the structure, just as if you wrote:

    + +
    +
    +X = &Foo[0].F;
    +
    +
    + +

    Sometimes this question gets rephrased as:

    +

    Why is it okay to index through the first pointer, but + subsequent pointers won't be dereferenced?

    +

    The answer is simply that memory does not have to be accessed to perform the computation. The first operand to the GEP instruction must be a value of a pointer type. The value of the pointer is provided directly to the GEP instruction as an operand without any need for accessing memory. It must, therefore, be indexed, and that requires an index operand. Consider this example:

    + +
    +
    +struct munger_struct {
    +  int f1;
    +  int f2;
    +};
    +void munge(struct munger_struct *P) {
    +  P[0].f1 = P[1].f1 + P[2].f2;
    +}
    +...
    +struct munger_struct Array[3];
    +...
    +munge(Array);
    +
    +
    + +

    In this "C" example, the front end compiler (llvm-gcc) will generate three + GEP instructions for the three indices through "P" in the assignment + statement. The function argument P will be the first operand of each + of these GEP instructions. The second operand indexes through that pointer. + The third operand will be the field offset into the + struct munger_struct type, for either the f1 or + f2 field. So, in LLVM assembly the munge function looks + like:

    + +
    +
    +void %munge(%struct.munger_struct* %P) {
    +entry:
    +  %tmp = getelementptr %struct.munger_struct* %P, i32 1, i32 0
    +  %tmp5 = load i32* %tmp
    +  %tmp6 = getelementptr %struct.munger_struct* %P, i32 2, i32 1
    +  %tmp7 = load i32* %tmp6
    +  %tmp8 = add i32 %tmp7, %tmp5
    +  %tmp9 = getelementptr %struct.munger_struct* %P, i32 0, i32 0
    +  store i32 %tmp8, i32* %tmp9
    +  ret void
    +}
    +
    +
    + +

    In each case the first operand is the pointer through which the GEP + instruction starts. The same is true whether the first operand is an + argument, allocated memory, or a global variable.

    +

    To make this clear, let's consider a more obtuse example:

    + +
    +
    +%MyVar = uninitialized global i32
    +...
    +%idx1 = getelementptr i32* %MyVar, i64 0
    +%idx2 = getelementptr i32* %MyVar, i64 1
    +%idx3 = getelementptr i32* %MyVar, i64 2
    +
    +
    + +

    These GEP instructions are simply making address computations from the + base address of MyVar. They compute, as follows (using C syntax): +

    + +
    +
    +idx1 = (char*) &MyVar + 0
    +idx2 = (char*) &MyVar + 4
    +idx3 = (char*) &MyVar + 8
    +
    +
    + +

    Since the type i32 is known to be four bytes long, the indices + 0, 1 and 2 translate into memory offsets of 0, 4, and 8, respectively. No + memory is accessed to make these computations because the address of + %MyVar is passed directly to the GEP instructions.
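The offset arithmetic above can be sketched in C. This is only an illustration, not LLVM code: the helper name gep_byte_offset is hypothetical, and it assumes the element type is a 4-byte int32_t standing in for i32.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the byte offset that a single-index GEP over i32
   adds to its base address. Only arithmetic is performed; no memory is
   read or written. */
static ptrdiff_t gep_byte_offset(int64_t index) {
  return (ptrdiff_t)(index * (int64_t)sizeof(int32_t));
}
```

Calling gep_byte_offset with 0, 1 and 2 gives the offsets used for idx1, idx2 and idx3.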

    +

    The obtuse part of this example is in the cases of %idx2 and + %idx3. They result in the computation of addresses that point to + memory past the end of the %MyVar global, which is only one + i32 long, not three i32s long. While this is legal in LLVM, + it is inadvisable because any load or store with the pointer that results + from these GEP instructions would produce undefined results.

    +
    + + + + +
    +

    Quick answer: there are no superfluous indices.

    +

    This question arises most often when the GEP instruction is applied to a global variable, which is always of pointer type. For example, consider this:

    + +
    +
    +%MyStruct = uninitialized global { float*, i32 }
    +...
    +%idx = getelementptr { float*, i32 }* %MyStruct, i64 0, i32 1
    +
    +
    + +

    The GEP above yields an i32* by indexing the i32 typed + field of the structure %MyStruct. When people first look at it, they + wonder why the i64 0 index is needed. However, a closer inspection + of how globals and GEPs work reveals the need. Becoming aware of the following + facts will dispel the confusion:

    +
      +
    1. The type of %MyStruct is not { float*, i32 } but rather { float*, i32 }*. That is, %MyStruct is a pointer to a structure containing a pointer to a float and an i32.
    2. Point #1 is evidenced by noticing the type of the first operand of the GEP instruction (%MyStruct) which is { float*, i32 }*.
    3. The first index, i64 0, is required to step over the global variable %MyStruct. Since the first argument to the GEP instruction must always be a value of pointer type, the first index steps through that pointer. A value of 0 means 0 elements offset from that pointer.
    4. The second index, i32 1, selects the second field of the structure (the i32).
    +
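The two indices in the points above can be mirrored in C. This is a hedged sketch rather than anything LLVM generates: the struct name MyStructT and both helper functions are hypothetical, chosen to match the { float*, i32 } layout.

```c
#include <stddef.h>
#include <stdint.h>

/* C counterpart of { float*, i32 }. */
struct MyStructT {
  float *ptr;
  int32_t value;
};

/* Address computed by "getelementptr ..., i64 0, i32 1": step over zero
   whole structures from the base pointer, then select field 1. Nothing
   is loaded from memory. */
static int32_t *field1_address(struct MyStructT *base) {
  return &base[0].value;
}

/* The same computation spelled out as byte arithmetic. */
static char *field1_address_bytes(struct MyStructT *base) {
  return (char *)base + 0 * sizeof(struct MyStructT)
         + offsetof(struct MyStructT, value);
}
```

Both functions return the same address; the first index contributes the "0 * sizeof" term and the second contributes the field offset.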
    + + + +
    +

    Quick answer: nothing.

    +

    The GetElementPtr instruction dereferences nothing. That is, it doesn't + access memory in any way. That's what the Load and Store instructions are for. + GEP is only involved in the computation of addresses. For example, consider + this:

    + +
    +
    +%MyVar = uninitialized global { [40 x i32 ]* }
    +...
    +%idx = getelementptr { [40 x i32]* }* %MyVar, i64 0, i32 0, i64 0, i64 17
    +
    +
    + +

    In this example, we have a global variable, %MyVar that is a + pointer to a structure containing a pointer to an array of 40 ints. The + GEP instruction seems to be accessing the 18th integer of the structure's + array of ints. However, this is actually an illegal GEP instruction. It + won't compile. The reason is that the pointer in the structure must + be dereferenced in order to index into the array of 40 ints. Since the + GEP instruction never accesses memory, it is illegal.

    +

    In order to access the 18th integer in the array, you would need to do the + following:

    + +
    +
    +%idx = getelementptr { [40 x i32]* }* %MyVar, i64 0, i32 0
    +%arr = load [40 x i32]** %idx
    +%idx2 = getelementptr [40 x i32]* %arr, i64 0, i64 17
    +
    +
    + +

    In this case, we have to load the pointer in the structure with a load + instruction before we can index into the array. If the example was changed + to:

    + +
    +
    +%MyVar = uninitialized global { [40 x i32 ] }
    +...
    +%idx = getelementptr { [40 x i32] }* %MyVar, i64 0, i32 0, i64 17
    +
    +
    + +

    then everything works fine. In this case, the structure does not contain a + pointer and the GEP instruction can index through the global variable, + into the first field of the structure and access the 18th i32 in the + array there.
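The difference between the two layouts can be sketched in C. The struct and function names here are hypothetical and exist only for illustration; the point is that the first form has to read memory while the second is pure address arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Counterpart of { [40 x i32]* }: the struct holds a pointer to the array. */
struct WithPointer {
  int32_t (*arr)[40];
};

/* Counterpart of { [40 x i32] }: the array is stored inside the struct. */
struct Embedded {
  int32_t arr[40];
};

/* Needs a load: s->arr must be read from memory before it can be indexed. */
static int32_t *elem17_via_pointer(struct WithPointer *s) {
  int32_t (*arr)[40] = s->arr; /* this is the load */
  return &(*arr)[17];
}

/* Pure address arithmetic: base + offsetof(arr) + 17 * sizeof(int32_t). */
static int32_t *elem17_embedded(struct Embedded *s) {
  return &s->arr[17];
}
```

Only elem17_embedded corresponds to a single GEP; elem17_via_pointer corresponds to the GEP, load, GEP sequence.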

    +
    + + + +
    +

    Quick Answer: They compute different address locations.

    +

    If you look at the first indices in these GEP + instructions you find that they are different (0 and 1), therefore the address + computation diverges with that index. Consider this example:

    + +
    +
    +%MyVar = global { [10 x i32 ] }
    +%idx1 = getelementptr { [10 x i32 ] }* %MyVar, i64 0, i32 0, i64 1
    +%idx2 = getelementptr { [10 x i32 ] }* %MyVar, i64 1
    +
    +
    + +

    In this example, idx1 computes the address of the second integer + in the array that is in the structure in %MyVar, that is + MyVar+4. The type of idx1 is i32*. However, + idx2 computes the address of the next structure after + %MyVar. The type of idx2 is { [10 x i32] }* and its + value is equivalent to MyVar + 40 because it indexes past the ten + 4-byte integers in MyVar. Obviously, in such a situation, the + pointers don't alias.

    + +
    + + + +
    +

    Quick Answer: They compute the same address location.

    +

    These two GEP instructions will compute the same address because indexing + through the 0th element does not change the address. However, it does change + the type. Consider this example:

    + +
    +
    +%MyVar = global { [10 x i32 ] }
    +%idx1 = getelementptr { [10 x i32 ] }* %MyVar, i64 1, i32 0, i64 0
    +%idx2 = getelementptr { [10 x i32 ] }* %MyVar, i64 1
    +
    +
    + +

    In this example, the value of %idx1 is %MyVar+40 and + its type is i32*. The value of %idx2 is also + MyVar+40 but its type is { [10 x i32] }*.
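Both aliasing answers can be checked with ordinary C address arithmetic. This is a sketch under assumptions: struct Wrapper and the three helpers are hypothetical names, and int32_t is assumed to be laid out like i32 with no padding in the struct.

```c
#include <stddef.h>
#include <stdint.h>

/* C counterpart of { [10 x i32] }. */
struct Wrapper {
  int32_t arr[10];
};

/* GEP x,0,0,1: the second integer of the first structure. */
static char *gep_0_0_1(struct Wrapper *x) { return (char *)&x[0].arr[1]; }

/* GEP x,1: the structure following the first one. */
static char *gep_1(struct Wrapper *x) { return (char *)&x[1]; }

/* GEP x,1,0,0: the first integer of the structure following the first one. */
static char *gep_1_0_0(struct Wrapper *x) { return (char *)&x[1].arr[0]; }
```

Given an array of two Wrapper values, gep_0_0_1 lands 4 bytes past the base, while gep_1 and gep_1_0_0 both land 40 bytes past it. The last two differ only in the pointer type they would carry in LLVM IR.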

    +
    + + + + +
    +

    Indexing into vector elements with GEP hasn't always been forcefully disallowed, though it's not recommended. It leads to awkward special cases in the optimizers, and fundamental inconsistency in the IR. In the future, it will probably be outright disallowed.

    + +
    + + + + +
    +

    Address spaces have no effect on GEPs, except that the address space qualifier on the first operand pointer type always matches the address space qualifier on the result type.

    + +
    + + + + +
    +

    GEP is very similar to a ptrtoint, arithmetic, inttoptr sequence; there are only subtle differences.

    + +

    With ptrtoint, you have to pick an integer type. One approach is to pick i64; + this is safe on everything LLVM supports (LLVM internally assumes pointers + are never wider than 64 bits in many places), and the optimizer will actually + narrow the i64 arithmetic down to the actual pointer size on targets which + don't support 64-bit arithmetic in most cases. However, there are some cases + where it doesn't do this. With GEP you can avoid this problem. + +

    Also, GEP carries additional pointer aliasing rules. It's invalid to take a + GEP from one object, address into a different separately allocated + object, and dereference it. IR producers (front-ends) must follow this rule, + and consumers (optimizers, specifically alias analysis) benefit from being + able to rely on it. See the Rules section for more + information.

    + +

    And, GEP is more concise in common cases.

    + +

    However, for the underlying integer computation implied, there + is no difference.
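The equivalence of the underlying integer computation can be illustrated in C. This is a sketch only: the function names are hypothetical, and it assumes a flat address space in which uintptr_t round-trips pointer values, which is common but not guaranteed by the C standard.

```c
#include <stdint.h>

/* GEP-style: typed pointer arithmetic. */
static int32_t *advance_typed(int32_t *base, int64_t index) {
  return base + index;
}

/* ptrtoint, arithmetic, inttoptr: the same integer computation done by
   hand. Assumes a flat address space where uintptr_t round-trips
   pointers. */
static int32_t *advance_integer(int32_t *base, int64_t index) {
  uintptr_t addr = (uintptr_t)base;
  addr += (uintptr_t)index * sizeof(int32_t);
  return (int32_t *)addr;
}
```

The two functions compute the same address; what the integer form gives up is the aliasing information that GEP carries.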

    + +
    + + + + +
    +

    You don't need custom lowering for GEP in a backend. The integer computation implied by a GEP is target-independent. Typically what you'll need to do is make your backend pattern-match expression trees involving ADD, MUL, etc., which are what GEP is lowered into. This has the advantage of letting your code work correctly in more cases.

    + +

    GEP does use target-dependent parameters for the size and layout of data + types, which targets can customize.

    + +

    If you require support for addressing units which are not 8 bits, you'll + need to fix a lot of code in the backend, with GEP lowering being only a + small piece of the overall picture.

    + +
    + + + + +
    +

    GEPs don't natively support VLAs. LLVM's type system is entirely static, + and GEP address computations are guided by an LLVM type.

    + +

    VLA indices can be implemented as linearized indices. For example, an expression like X[a][b][c] must be effectively lowered into a form like X[a*m+b*n+c], so that it appears to the GEP as a single-dimensional array reference.

    + +

    This means if you want to write an analysis which understands array + indices and you want to support VLAs, your code will have to be + prepared to reverse-engineer the linearization. One way to solve this + problem is to use the ScalarEvolution library, which always presents + VLA and non-VLA indexing in the same manner.
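The linearization described above can be sketched in C. The function name linear_index and the parameters d1 and d2 (the two inner dimensions, known only at run time) are assumptions made for illustration; with m = d1 * d2 and n = d2 the result is the a*m+b*n+c form.

```c
#include <stddef.h>

/* Linearized index for X[a][b][c] in an array whose two inner dimensions
   are d1 and d2 (run-time values). With m = d1 * d2 and n = d2 this is
   the a*m + b*n + c form. */
static size_t linear_index(size_t a, size_t b, size_t c,
                           size_t d1, size_t d2) {
  size_t m = d1 * d2; /* elements spanned by one step of the outer index */
  size_t n = d2;      /* elements spanned by one step of the middle index */
  return a * m + b * n + c;
}
```

A front end would compute this value with ordinary integer instructions and pass it to the GEP as a single index.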

    +
    + + + + + + + + +
    +

    There are two senses in which an array index can be out of bounds.

    + +

    First, there's the array type which comes from the (static) type of + the first operand to the GEP. Indices greater than the number of elements + in the corresponding static array type are valid. There is no problem with + out of bounds indices in this sense. Indexing into an array only depends + on the size of the array element, not the number of elements.

    + +

    A common example of how this is used is arrays where the size is not known. + It's common to use array types with zero length to represent these. The + fact that the static type says there are zero elements is irrelevant; it's + perfectly valid to compute arbitrary element indices, as the computation + only depends on the size of the array element, not the number of + elements. Note that zero-sized arrays are not a special case here.

    + +

    This sense is unconnected with the inbounds keyword. The inbounds keyword is designed to describe low-level pointer arithmetic overflow conditions, rather than high-level array indexing rules.

    Analysis passes which wish to understand array indexing should not + assume that the static array type bounds are respected.

    + +

    The second sense of being out of bounds is computing an address that's + beyond the actual underlying allocated object.

    + +

    With the inbounds keyword, the result value of the GEP is + undefined if the address is outside the actual underlying allocated + object and not the address one-past-the-end.

    + +

    Without the inbounds keyword, there are no restrictions + on computing out-of-bounds addresses. Obviously, performing a load or + a store requires an address of allocated and sufficiently aligned + memory. But the GEP itself is only concerned with computing addresses.

    + +
    + + + +
    +

    Yes, array indices can be negative. This is basically a special case of array indices being out of bounds.

    + +
    + + + +
    +

    Yes, you can compare two values computed with GEPs. If both addresses are within the same allocated object, or one-past-the-end, you'll get the comparison result you expect. If either is outside of it, integer arithmetic wrapping may occur, so the comparison may not be meaningful.

    + +
    + + + +
    +

    Yes, a GEP may use a pointer type different from the type of the underlying object. There are no restrictions on bitcasting a pointer value to an arbitrary pointer type. The types in a GEP serve only to define the parameters for the underlying integer computation. They need not correspond with the actual type of the underlying object.

    + +

    Furthermore, loads and stores don't have to use the same types as the type of the underlying object. Types in this context serve only to specify memory size and alignment. Beyond that they are merely a hint to the optimizer indicating how the value will likely be used.

    + +
    + + + +
    +

    You can compute an address by casting an object's address to an integer and adding it to null, but if you use GEP to do the add, you can't use that pointer to actually access the object, unless the object is managed outside of LLVM.

    + +

    The underlying integer computation is sufficiently defined; null has a + defined value -- zero -- and you can add whatever value you want to it.

    + +

    However, it's invalid to access (load from or store to) an LLVM-aware + object with such a pointer. This includes GlobalVariables, Allocas, and + objects pointed to by noalias pointers.

    + +

    If you really need this functionality, you can do the arithmetic with + explicit integer instructions, and use inttoptr to convert the result to + an address. Most of GEP's special aliasing rules do not apply to pointers + computed from ptrtoint, arithmetic, and inttoptr sequences.

    + +
    + + + +
    +

    As with arithmetic on null, you can use GEP to compute an address from the distance between two objects, but you can't use that pointer to actually access the object if you do, unless the object is managed outside of LLVM.

    + +

    Also as above, ptrtoint and inttoptr provide an alternative way to do this which does not have this restriction.

    + +
    + + + +
    +

    You can't do type-based alias analysis using LLVM's built-in type system, + because LLVM has no restrictions on mixing types in addressing, loads or + stores.

    + +

    It would be possible to add special annotations to the IR, probably using + metadata, to describe a different type system (such as the C type system), + and do type-based aliasing on top of that. This is a much bigger + undertaking though.

    + +
    + + + + +
    +

    If the GEP has the inbounds keyword, the result value is + undefined.

    + +

    Otherwise, the result value is the result from evaluating the implied + two's complement integer computation. However, since there's no + guarantee of where an object will be allocated in the address space, + such values have limited meaning.
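The wrapping behaviour without inbounds can be modelled in C with unsigned arithmetic. This is a hedged model, not LLVM's implementation: the name gep_wrapping is hypothetical, addresses are assumed to be 64 bits wide, and the element type is assumed to be a 4-byte int32_t.

```c
#include <stdint.h>

/* Non-inbounds GEP over i32 elements, modelled as wrapping 64-bit
   arithmetic. Unsigned arithmetic in C wraps modulo 2^64, which gives
   the same bit pattern as two's complement arithmetic. Assumes 64-bit
   addresses for illustration. */
static uint64_t gep_wrapping(uint64_t base, int64_t index) {
  return base + (uint64_t)index * (uint64_t)sizeof(int32_t);
}
```

A base four bytes below the top of the address space plus one element wraps around to zero, and a negative index simply subtracts.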

    + +
    + + + + +
    +

    There is currently no checker for the getelementptr rules. The only way to tell whether your front-end follows them is to manually check each place in your front-end where GetElementPtr operators are created.

    + +

    It's not possible to write a checker which could find all rule + violations statically. It would be possible to write a checker which + works by instrumenting the code with dynamic checks though. Alternatively, + it would be possible to write a static checker which catches a subset of + possible problems. However, no such checker exists today.

    + +
    + + + + + + + + +
    +

    The design of GEP has the following goals, in rough unofficial + order of priority:

    +
      +
    • Support C, C-like languages, and languages which can be conceptually lowered into C (this covers a lot).
    • Support optimizations such as those that are common in C compilers. In particular, GEP is a cornerstone of LLVM's pointer aliasing model.
    • Provide a consistent method for computing addresses so that address computations don't need to be a part of load and store instructions in the IR.
    • Support non-C-like languages, to the extent that it doesn't interfere with other goals.
    • Minimize target-specific information in the IR.
    +
    + + + +
    +

    The specific type i32 is probably just a historical artifact, however it's + wide enough for all practical purposes, so there's been no need to change it. + It doesn't necessarily imply i32 address arithmetic; it's just an identifier + which identifies a field in a struct. Requiring that all struct indices be + the same reduces the range of possibilities for cases where two GEPs are + effectively the same but have distinct operand types.

    + +
    + + + + +
    +

    Some LLVM optimizers operate on GEPs by internally lowering them into more primitive integer expressions, which allows them to be combined with other integer expressions and/or split into multiple separate integer expressions. If they've made non-trivial changes, translating back into LLVM IR can involve reverse-engineering the structure of the addressing in order to fit it into the static type of the original first operand. It isn't always possible to fully reconstruct this structure; sometimes the underlying addressing doesn't correspond with the static type at all. In such cases the optimizer instead will emit a GEP with the base pointer cast to a simple address-unit pointer, using the name "uglygep". This isn't pretty, but it's just as valid, and it's sufficient to preserve the pointer aliasing guarantees that GEP provides.

    + +
    + + + + + +
    +

    In summary, here are some things to always remember about the GetElementPtr instruction:

    +
      +
    1. The GEP instruction never accesses memory; it only provides pointer computations.
    2. The first operand to the GEP instruction is always a pointer and it must be indexed.
    3. There are no superfluous indices for the GEP instruction.
    4. Trailing zero indices are superfluous for pointer aliasing, but not for the types of the pointers.
    5. Leading zero indices are not superfluous for pointer aliasing nor the types of the pointers.
    +
    + + + +
    +
    + Valid CSS + Valid HTML 4.01 + The LLVM Compiler Infrastructure
    + Last modified: $Date: 2010-08-27 21:09:24 -0700 (Fri, 27 Aug 2010) $ +
    + + Added: www-releases/trunk/2.8/docs/GettingStarted.html URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/2.8/docs/GettingStarted.html?rev=115556&view=auto ============================================================================== --- www-releases/trunk/2.8/docs/GettingStarted.html (added) +++ www-releases/trunk/2.8/docs/GettingStarted.html Mon Oct 4 15:49:23 2010 @@ -0,0 +1,1679 @@ + + + + + Getting Started with LLVM System + + + + +
    + Getting Started with the LLVM System +
    + + + +
    +

    Written by: + John Criswell, + Chris Lattner, + Misha Brukman, + Vikram Adve, and + Guochun Shi. +

    +
    + + + + + + +
    + +

    Welcome to LLVM! In order to get started, you first need to know some +basic information.

    + +

    First, LLVM comes in two pieces. The first piece is the LLVM suite. This +contains all of the tools, libraries, and header files needed to use the low +level virtual machine. It contains an assembler, disassembler, bitcode +analyzer and bitcode optimizer. It also contains a test suite that can be +used to test the LLVM tools and the GCC front end.

    + +

    The second piece is the GCC front end. This component provides a version of +GCC that compiles C and C++ code into LLVM bitcode. Currently, the GCC front +end uses the GCC parser to convert code to LLVM. Once +compiled into LLVM bitcode, a program can be manipulated with the LLVM tools +from the LLVM suite.

    + +

    +There is a third, optional piece called llvm-test. It is a suite of programs +with a testing harness that can be used to further test LLVM's functionality +and performance. +

    + +
    + + + + + +
    + +

    Here's the short story for getting up and running quickly with LLVM:

    + +
      +
    1. Read the documentation.
    2. Read the documentation.
    3. Remember that you were warned twice about reading the documentation.
    4. Install the llvm-gcc-4.2 front end if you intend to compile C or C++ (see Install the GCC Front End for details):
       1. cd where-you-want-the-C-front-end-to-live
       2. gunzip --stdout llvm-gcc-4.2-version-platform.tar.gz | tar -xvf -
       3. install-binutils-binary-from-MinGW (Windows only)
       4. Note: If the binary extension is ".bz" use bunzip2 instead of gunzip.
       5. Note: On Windows, use 7-Zip or a similar archiving tool.
       6. Add llvm-gcc's "bin" directory to your PATH environment variable.
    5. Get the LLVM Source Code
       • With the distributed files (or use SVN):
         1. cd where-you-want-llvm-to-live
         2. gunzip --stdout llvm-version.tar.gz | tar -xvf -
    6. [Optional] Get the Test Suite Source Code
       • With the distributed files (or use SVN):
         1. cd where-you-want-llvm-to-live
         2. cd llvm/projects
         3. gunzip --stdout llvm-test-version.tar.gz | tar -xvf -
    7. Configure the LLVM Build Environment
       1. cd where-you-want-to-build-llvm
       2. /path/to/llvm/configure [options]
          Some common options:
          • --prefix=directory
            Specify for directory the full pathname of where you want the LLVM tools and libraries to be installed (default /usr/local).
          • --with-llvmgccdir=directory
            Optionally, specify for directory the full pathname of the C/C++ front end installation to use with this LLVM configuration. If not specified, the PATH will be searched. This is only needed if you want to run the testsuite or do some special kinds of LLVM builds.
          • --enable-spec2000=directory
            Enable the SPEC2000 benchmarks for testing. The SPEC2000 benchmarks should be available in directory.
    8. Build the LLVM Suite:
       1. gmake -k |& tee gnumake.out   # this is csh or tcsh syntax
       2. If you get an "internal compiler error (ICE)" or test failures, see below.
    + +

    Consult the Getting Started with LLVM section for +detailed information on configuring and compiling LLVM. See Setting Up Your Environment for tips that simplify +working with the GCC front end and LLVM tools. Go to Program +Layout to learn about the layout of the source code tree.

    + +
    + + + + + +
    + +

    Before you begin to use the LLVM system, review the requirements given below. Knowing ahead of time what hardware and software you will need may save you some trouble.

    + +
    + + + + +
    + +

    LLVM is known to work on the following platforms:

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    OS | Arch | Compilers
    AuroraUX | x86 [1] | GCC
    Linux | x86 [1] | GCC
    Linux | amd64 | GCC
    Solaris | V9 (Ultrasparc) | GCC
    FreeBSD | x86 [1] | GCC
    MacOS X [2] | PowerPC | GCC
    MacOS X [2,9] | x86 | GCC
    Cygwin/Win32 | x86 [1,8,11] | GCC 3.4.X, binutils 2.20
    MinGW/Win32 | x86 [1,6,8,10] | GCC 3.4.X, binutils 2.20
    + +

    LLVM has partial support for the following platforms:

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    OS | Arch | Compilers
    Windows | x86 [1] | Visual Studio 2005 SP1 or higher [4,5]
    AIX [3,4] | PowerPC | GCC
    Linux [3,5] | PowerPC | GCC
    Linux [7] | Alpha | GCC
    Linux [7] | Itanium (IA-64) | GCC
    HP-UX [7] | Itanium (IA-64) | HP aCC
    + +

    Notes:

    + + + +

    Note that you will need about 1-3 GB of space for a full LLVM build in Debug +mode, depending on the system (it is so large because of all the debugging +information and the fact that the libraries are statically linked into multiple +tools). If you do not need many of the tools and you are space-conscious, you +can pass ONLY_TOOLS="tools you need" to make. The Release build +requires considerably less space.

    The LLVM suite may compile on other platforms, but it is not guaranteed to do so. If compilation is successful, the LLVM utilities should be able to assemble, disassemble, analyze, and optimize LLVM bitcode. Code generation should work as well, although the generated native code may not work on your platform.

    The GCC front end is not very portable at the moment. If you want to get it to work on another platform, you can download a copy of the source and try to compile it on your platform.

    Compiling LLVM requires that you have several software packages installed. The table below lists those required packages. The Package column is the usual name for the software package that LLVM depends on. The Version column provides "known to work" versions of the package. The Notes column describes how LLVM uses the package and provides other details.

    Package        Version        Notes
    GNU Make       3.79, 3.79.1   Makefile/build processor
    GCC            3.4.2          C/C++ compiler [1]
    TeXinfo        4.5            For building the CFE
    SVN            ≥1.3           Subversion access to LLVM [2]
    DejaGnu        1.4.2          Automated test suite [3]
    tcl            8.3, 8.4       Automated test suite [3]
    expect         5.38.0         Automated test suite [3]
    perl           ≥5.6.0         Nightly tester, utilities
    GNU M4         1.4            Macro processor for configuration [4]
    GNU Autoconf   2.60           Configuration script builder [4]
    GNU Automake   1.9.6          aclocal macro generator [4]
    libtool        1.5.22         Shared library manager [4]

    Notes:


    Additionally, your compilation host is expected to have the usual plethora of Unix utilities. Specifically:

    • ar - archive library builder
    • bzip2* - bzip2 command for distribution generation
    • bunzip2* - bunzip2 command for distribution checking
    • chmod - change permissions on a file
    • cat - output concatenation utility
    • cp - copy files
    • date - print the current date/time
    • echo - print to standard output
    • egrep - extended regular expression search utility
    • find - find files/dirs in a file system
    • grep - regular expression search utility
    • gzip* - gzip command for distribution generation
    • gunzip* - gunzip command for distribution checking
    • install - install directories/files
    • mkdir - create a directory
    • mv - move (rename) files
    • ranlib - symbol table builder for archive libraries
    • rm - remove (delete) files and directories
    • sed - stream editor for transforming output
    • sh - Bourne shell for make build scripts
    • tar - tape archive for distribution generation
    • test - test things in file system
    • unzip* - unzip command for distribution checking
    • zip* - zip command for distribution generation

    LLVM is very demanding of the host C++ compiler, and as such tends to expose bugs in the compiler. In particular, several versions of GCC crash when trying to compile LLVM. We routinely use GCC 3.3.3, 3.4.0, and Apple 4.0.1 successfully (however, see the important notes below). Other versions of GCC will probably work as well. The GCC versions listed here are known not to work. If you are using one of these versions, please try to upgrade your GCC to something more recent. If you run into a problem with a version of GCC not listed here, please let us know. Use the "gcc -v" command to find out which version of GCC you are using.

    GCC versions prior to 3.0: GCC 2.96.x and before had several problems in the STL that effectively prevent them from compiling LLVM.

    GCC 3.2.2 and 3.2.3: These versions of GCC fail to compile LLVM with a bogus template error. This was fixed in later GCCs.

    GCC 3.3.2: This version of GCC suffers from a serious bug that causes it to crash in the "convert_from_eh_region_ranges_1" GCC function.

    Cygwin GCC 3.3.3: The version of GCC 3.3.3 commonly shipped with Cygwin does not work. Please upgrade to a newer version if possible.

    SuSE GCC 3.3.3: The version of GCC 3.3.3 shipped with SuSE 9.1 (and possibly others) does not compile LLVM correctly (it appears that exception handling is broken in some cases). Please download the FSF 3.3.3 or upgrade to a newer version of GCC.

    GCC 3.4.0 on linux/x86 (32-bit): GCC miscompiles portions of the code generator, causing an infinite loop in the llvm-gcc build when built with optimizations enabled (i.e. a release build).

    GCC 3.4.2 on linux/x86 (32-bit): GCC miscompiles portions of the code generator at -O3, as with 3.4.0. However, GCC 3.4.2 (unlike 3.4.0) correctly compiles LLVM at -O2. A workaround is to build release builds of LLVM with "make ENABLE_OPTIMIZED=1 OPTIMIZE_OPTION=-O2 ..."

    GCC 3.4.x on X86-64/amd64: GCC miscompiles portions of LLVM.

    GCC 3.4.4 (CodeSourcery ARM 2005q3-2): This compiler miscompiles LLVM when building with optimizations enabled. It appears to work with "make ENABLE_OPTIMIZED=1 OPTIMIZE_OPTION=-O1" or with a debug build.

    IA-64 GCC 4.0.0: The IA-64 version of GCC 4.0.0 is known to miscompile LLVM.

    Apple Xcode 2.3: GCC crashes when compiling LLVM at -O3 (which is the default with ENABLE_OPTIMIZED=1). To work around this, build with "ENABLE_OPTIMIZED=1 OPTIMIZE_OPTION=-O2".

    GCC 4.1.1: GCC fails to build LLVM with template concept check errors when compiling some files. At the time of this writing, GCC mainline (4.2) did not share the problem.

    GCC 4.1.1 on X86-64/amd64: GCC miscompiles portions of LLVM when compiling LLVM itself into 64-bit code. LLVM will appear to mostly work but will be buggy, e.g. failing portions of its testsuite.

    GCC 4.1.2 on OpenSUSE: Segfaults during the libstdc++ build, and on x86_64 platforms compiling md5.c produces a mangled constant.

    GCC 4.1.2 (20061115 (prerelease) (Debian 4.1.1-21)) on Debian: Appears to miscompile parts of LLVM 2.4. One symptom is ValueSymbolTable complaining about symbols remaining in the table on destruction.

    GCC 4.1.2 20071124 (Red Hat 4.1.2-42): Suffers from the same symptoms as the previous one. It appears to work with ENABLE_OPTIMIZED=0 (the default).

    Cygwin GCC 4.3.2 20080827 (beta) 2: Users reported various problems related to link errors when using this GCC version.

    Debian GCC 4.3.2 on X86: Crashes building some files in LLVM 2.6.

    GCC 4.3.3 (Debian 4.3.3-10) on ARM: Miscompiles parts of LLVM 2.6 when optimizations are turned on. The symptom is an infinite loop in FoldingSetImpl::RemoveNode while running the code generator.

    GNU ld 2.16.X: Some 2.16.X versions of the ld linker will produce very long warning messages complaining that some ".gnu.linkonce.t.*" symbol was defined in a discarded section. You can safely ignore these messages as they are erroneous and the linkage is correct. These messages disappear with ld 2.17.

    GNU binutils 2.17: Binutils 2.17 contains a bug which causes huge link times (minutes instead of seconds) when building LLVM. We recommend upgrading to a newer version (2.17.50.0.4 or later).

    GNU Binutils 2.19.1 Gold: This version of Gold contained a bug which causes intermittent failures when building LLVM with position independent code. The symptom is an error about cyclic dependencies. We recommend upgrading to a newer version of Gold.
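    To make the "gcc -v" check concrete, here is a minimal Python sketch that extracts the version number from that command's output and compares it with the entries in this section that are reported as broken on every platform. The function names and the KNOWN_BROKEN set are assumptions made for this illustration. Entries that depend on a particular OS, architecture, or vendor build (for example 3.4.0 on linux/x86 or the Cygwin and SuSE builds of 3.3.3) are deliberately left out, so a negative answer here does not mean a compiler is safe.

```python
import re

# Versions this section lists as broken without any platform qualifier.
KNOWN_BROKEN = {(3, 2, 2), (3, 2, 3), (3, 3, 2), (4, 1, 1)}


def parse_gcc_version(gcc_v_output):
    """Extract (major, minor, patch) from the text printed by `gcc -v`.

    A missing patch level (as in "gcc version 2.96 ...") is treated as 0.
    """
    match = re.search(r"gcc version (\d+)\.(\d+)(?:\.(\d+))?", gcc_v_output)
    if match is None:
        raise ValueError("no 'gcc version' line found")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_known_broken(version):
    """True for releases before 3.0 or in the platform-independent list."""
    return version < (3, 0, 0) or version in KNOWN_BROKEN
```

    Note that gcc writes its version banner to standard error, so a caller that runs the compiler needs to capture that stream rather than standard output.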

    The remainder of this guide is meant to get you up and running with LLVM and to give you some basic information about the LLVM environment.

    The later sections of this guide describe the general layout of the LLVM source tree, a simple example using the LLVM tool chain, and links to find more information about LLVM or to get help via e-mail.

    Throughout this manual, the following names are used to denote paths specific to the local system and working environment. These are not environment variables you need to set but just strings used in the rest of this document. In any of the examples below, simply replace each of these names with the appropriate pathname on your local system. All these paths are absolute:

    SRC_ROOT
        This is the top level directory of the LLVM source tree.

    OBJ_ROOT
        This is the top level directory of the LLVM object tree (i.e. the tree where object files and compiled programs will be placed). It can be the same as SRC_ROOT.

    LLVMGCCDIR
        This is where the LLVM GCC Front End is installed.

        For the pre-built GCC front end binaries, the LLVMGCCDIR is llvm-gcc/platform/llvm-gcc.

    In order to compile and use LLVM, you may need to set some environment variables.

    LLVM_LIB_SEARCH_PATH=/path/to/your/bitcode/libs
        [Optional] This environment variable helps LLVM linking tools find the locations of your bitcode libraries. It is provided only as a convenience since you can specify the paths using the -L options of the tools and the C/C++ front-end will automatically use the bitcode files installed in its lib directory.
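    For scripted builds, one way to supply LLVM_LIB_SEARCH_PATH without changing the login environment is to set it only for the child process. The sketch below is a minimal illustration under that assumption; the helper name env_with_bitcode_libs and the directory used in the usage note are hypothetical.

```python
import os


def env_with_bitcode_libs(lib_dir, base_env=None):
    """Return a copy of the environment with LLVM_LIB_SEARCH_PATH set.

    The caller's environment (or `base_env`, if given) is copied rather
    than modified, so the setting only affects processes started with it.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LLVM_LIB_SEARCH_PATH"] = lib_dir
    return env
```

    The returned dictionary can be passed as the env argument of subprocess.run when invoking an LLVM linking tool, for example env_with_bitcode_libs("/path/to/your/bitcode/libs").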

    If you have the LLVM distribution, you will need to unpack it before you can begin to compile it. LLVM is distributed as a set of two files: the LLVM suite and the LLVM GCC front end compiled for your platform. There is an additional test suite that is optional. Each file is a TAR archive that is compressed with the gzip program.

    The files are as follows, with x.y marking the version number:

    llvm-x.y.tar.gz
        Source release for the LLVM libraries and tools.

    llvm-test-x.y.tar.gz
        Source release for the LLVM test suite.

    llvm-gcc-4.2-x.y.source.tar.gz
        Source release of the llvm-gcc-4.2 front end. See README.LLVM in the root directory for build instructions.

    llvm-gcc-4.2-x.y-platform.tar.gz
        Binary release of the llvm-gcc-4.2 front end for a specific platform.
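    The naming scheme above can be sketched in Python as follows, together with a helper that unpacks one gzip-compressed TAR archive using the standard tarfile module. The function names are assumptions made for this illustration, and the platform string is whatever appears in the name of the binary release you downloaded; this guide does not specify its format.

```python
import tarfile


def archive_names(version, platform):
    """Return the four archive file names for release `version` (x.y)."""
    return {
        "llvm": f"llvm-{version}.tar.gz",
        "llvm-test": f"llvm-test-{version}.tar.gz",
        "llvm-gcc source": f"llvm-gcc-4.2-{version}.source.tar.gz",
        "llvm-gcc binary": f"llvm-gcc-4.2-{version}-{platform}.tar.gz",
    }


def unpack(archive_path, dest_dir="."):
    """Unpack one gzip-compressed TAR archive into `dest_dir`."""
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)
```

    Only unpack archives obtained from a source you trust, since extractall writes every path stored in the archive.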

    If you have access to our Subversion repository, you can get a fresh copy of the entire source code. All you need to do is check it out from Subversion as follows:

    • cd where-you-want-llvm-to-live
    • Read-Only: svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
    • Read-Write: svn co https://user@llvm.org/svn/llvm-project/llvm/trunk llvm

    This will create an 'llvm' directory in the current directory and fully populate it with the LLVM source code, Makefiles, test directories, and local copies of documentation files.

    If you want to get a specific release (as opposed to the most recent revision), you can check it out from the 'tags' directory (instead of 'trunk'). The following releases are located in the following subdirectories of the 'tags' directory:

    • Release 2.6: RELEASE_26
    • Release 2.5: RELEASE_25
    • Release 2.4: RELEASE_24
    • Release 2.3: RELEASE_23
    • Release 2.2: RELEASE_22
    • Release 2.1: RELEASE_21
    • Release 2.0: RELEASE_20
    • Release 1.9: RELEASE_19
    • Release 1.8: RELEASE_18
    • Release 1.7: RELEASE_17
    • Release 1.6: RELEASE_16
    • Release 1.5: RELEASE_15
    • Release 1.4: RELEASE_14
    • Release 1.3: RELEASE_13
    • Release 1.2: RELEASE_12
    • Release 1.1: RELEASE_11
    • Release 1.0: RELEASE_1
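    The mapping from release number to tag directory in the list above can be sketched as a small Python helper. The names release_tag and checkout_command are assumptions made for this illustration, and TAGS_URL is derived by substituting 'tags' for 'trunk' in the read-only checkout URL, as the text describes, rather than being quoted from this guide.

```python
# Assumed location of the 'tags' directory: the read-only trunk URL with
# 'trunk' replaced by 'tags'.
TAGS_URL = "http://llvm.org/svn/llvm-project/llvm/tags"


def release_tag(version):
    """Map a release such as '2.6' to its tag directory name."""
    major, minor = version.split(".")
    if (major, minor) == ("1", "0"):
        # The list gives RELEASE_1 for 1.0, without the minor digit.
        return "RELEASE_1"
    return f"RELEASE_{major}{minor}"


def checkout_command(version, dest="llvm"):
    """Return the argv list for a read-only checkout of one release."""
    return ["svn", "co", f"{TAGS_URL}/{release_tag(version)}", dest]
```

    The 1.0 release is handled as a special case because its tag does not follow the two-digit pattern used by every other release in the list.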

    If you would like to get the LLVM test suite (a separate package as of 1.4), you can get it from the Subversion repository:

    % cd llvm/projects
    % svn co http://llvm.org/svn/llvm-project/test-suite/trunk llvm-test

    By placing it in the llvm/projects directory, it will be automatically configured by the LLVM configure script as well as automatically updated when you run svn update.

    If you would like to get the GCC front end source code, you can also get it and build it yourself. Please follow these instructions to successfully get and build the LLVM GCC front-end.

    Before configuring and compiling the LLVM suite (or if you want to use just the LLVM GCC front end) you can optionally extract the front end from the binary distribution. It is used for running the llvm-test testsuite and for compiling C/C++ programs. Note that you can optionally build llvm-gcc yourself after building the main LLVM repository.

    To install the GCC front end, do the following (on Windows, use an archival tool +like 7-zip that u

    operation never modifies the archive.

=item q[Rfz]

Quickly append files to the end of the archive. The F<R>, F<f>, and F<z>
modifiers apply to this operation. This operation quickly adds the
F<files> to the archive without checking for duplicates that should be
removed first. If no F<files> are specified, the archive is not modified.
Because of the way that B<llvm-ar> constructs the archive file, it's dubious
whether the F<q> operation is any faster than the F<r> operation.

=item r[Rabfuz]

Replace or insert file members. The F<R>, F<a>, F<b>, F<f>, F<u>, and F<z>
modifiers apply to this operation. This operation will replace existing
F<files> or insert them at the end of the archive if they do not exist. If no
F<files> are specified, the archive is not modified.

=item t[v]

Print the table of contents. Without any modifiers, this operation just prints
the names of the members to the standard output. With the F<v> modifier,
B<llvm-ar> also prints out the file type (B=bitcode, Z=compressed, S=symbol
table, blank=regular file), the permission mode, the owner and group, the
size, and the date. If any F<files> are specified, the listing is only for
those files. If no F<files> are specified, the table of contents for the
whole archive is printed.

=item x[oP]

Extract archive members back to files. The F<o> modifier applies to this
operation. This operation retrieves the indicated F<files> from the archive
and writes them back to the operating system's file system. If no
F<files> are specified, the entire archive is extracted.

=back

=head2 Modifiers (operation specific)

The modifiers below are specific to certain operations. See the Operations
section (above) to determine which modifiers are applicable to which operations.

=over

=item [a]

When inserting or moving member files, this option specifies the destination of
the new files as being C<a>fter the F<relpos> member. If F<relpos> is not found,
the files are placed at the end of the archive.

=item [b]

When inserting or moving member files, this option specifies the destination of
the new files as being C<b>efore the F<relpos> member. If F<relpos> is not
found, the files are placed at the end of the archive. This modifier is
identical to the F<i> modifier.

=item [f]

Normally, B<llvm-ar> stores the full path name to a file as presented to it on
the command line. With this option, truncated (15 characters max) names are
used. This ensures name compatibility with older versions of C<ar> but may also
thwart correct extraction of the files (duplicates may overwrite). If used with
the F<R> option, the directory recursion will be performed but the file names
will all be C<f>lattened to simple file names.

=item [i]

A synonym for the F<b> option.

=item [k]

Normally, B<llvm-ar> will not print the contents of bitcode files when the
F